Dean applies his spin to Rafe's ideas with the goal of creating 3-4 paragraphs of text usable in the proposal.

The process through which researchers design research programs, that is, sequences of related experiments, is largely undocumented. New clinical investigators are exposed to the individual components of research (e.g., human genetics, molecular medicine, statistical methods), but they are not trained in research program design. Experienced researchers acquire this design knowledge through trial and error, and through interaction with other researchers. We submit that there are principles for designing sequences of related experiments, and that these principles can be systematically and accurately conveyed to new clinical investigators. We propose a course in research program (strategic) and experimental (tactical) design to benefit new clinical investigators.

How do we teach the art one researcher has learned to other researchers? What are these principles? Classical statistical experimental design rests on a number of principles, many of which were originally enumerated by Fisher:
  1. Use replication to provide a valid estimate of the variability of subjects' responses.
  2. Use randomization to balance over uncontrolled and/or unknown factors.
  3. Use blocking to control known sources of variation that are not of primary interest.
  4. Use factorial treatment designs, which have the advantage of learning about the effects of multiple factors in a single, albeit more complex, experiment.
When feasible, multi-factor experiments can result in a net savings of subjects and cost, and provide the ability to study interactions between factors or conditions. (A short sketch of principles 2 and 3 follows.)
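
To make principles 2 and 3 concrete, here is a minimal sketch, ours rather than Fisher's, with made-up block and treatment names, of how a randomized complete block design might be generated: each block receives every treatment once, in random order.

    import random

    def randomized_block_design(blocks, treatments, seed=0):
        """Assign every treatment once within each block, in random order.

        Blocking (principle 3) removes known nuisance variation;
        shuffling within the block (principle 2) guards against the rest.
        """
        rng = random.Random(seed)
        plan = {}
        for block in blocks:
            order = list(treatments)   # one replicate of each treatment per block
            rng.shuffle(order)         # randomize the order within this block
            plan[block] = order
        return plan

    # Four litters (blocks), three treatments; every litter sees all three.
    print(randomized_block_design(["litter1", "litter2", "litter3", "litter4"],
                                  ["control", "doseA", "doseB"]))

Within-block shuffling is what lets block-to-block differences (litter, day, technician) drop out of the treatment comparisons.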

Like the design of experiments, the design of sequences of experiments can be distilled to key principles. These principles follow Fisher's ideas, but extend them in important ways (like what, Dean?). These principles include the following:
  1. Perform the sequence of experiments to provide valid estimates of the sources and magnitude of variability (e.g., measurement repeatability and reproducibility, intra- and inter-subject variability, effects of environmental factors).
  2. Use randomization to control for relationships that aren't yet understood or thought of.
  3. Use blocking and factorial experiments...
  4. Define criteria (decision analysis?) for when an experiment is done, for both positive and negative outcomes. (A toy stopping-rule sketch follows this list.)
  5. When possible, validate the results of one technology using a different, complementary technology.
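
As a toy illustration of principle 4, assuming binary outcomes, a Beta(1, 1) prior, and arbitrary efficacy/futility thresholds (all our choices here, not prescriptions), a stopping check at an interim look might read:

    from scipy.stats import beta

    def stopping_check(successes, n, p0=0.3, eff=0.95, fut=0.05, a=1, b=1):
        """Toy stopping rule: with a Beta(a, b) prior on the response rate,
        compute the posterior probability that the rate exceeds the
        uninteresting level p0, then stop for efficacy if that probability
        is high and for futility if it is low."""
        posterior = beta(a + successes, b + n - successes)
        p_above = 1 - posterior.cdf(p0)        # Pr(rate > p0 | data)
        if p_above >= eff:
            return "stop: positive outcome"
        if p_above <= fut:
            return "stop: negative outcome"
        return "continue"

    # Interim look after 20 subjects with 4 responders:
    print(stopping_check(successes=4, n=20))   # "continue" (Pr is roughly 0.2)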

Maybe we shouldn't be too specific here. Instead, using Yu's suggestions below, show examples of experimental sequences. Can we abstract from these?

more later from Dean =========================================

Rafe offers the following, for what it is worth: I'm not exactly sure what this means, but I think it means that there is a need for improving the more strategic components of research direction. As statisticians, we can and ought to comment on more than just individual study designs; we should be consulted when constellations of studies are under consideration. Figuring out how to get more than one answer from an individual trial, or when additional questions will burden the trial past the point of productivity, is an asset we bring to the table.

As usual, a big problem surrounding this issue is the simple dearth of statisticians. For example, there are approximately 120 faculty and 2 statisticians (one PhD and one MS) to cover all their research. Some of the research falls under the cancer center, so the numbers aren't quite 120:2... yet even if half is covered by the cancer center, it is still 60:2. Overwhelming odds, even for a statistician.

A model used in pharma has at least one statistician on each project team. Often there are statisticians devoted to the phase I, the phase II/III, and the phase IV (post-marketing) components of a compound. Further, there are typically at least 2 SAS-type programmers to support the programming tasks. A project team includes a number of clinical research people, and folks from regulatory, data management, commercial, medical writing, safety, and others. Of course, these sorts of collaborations are nearly required by law, as the FDA looks for expertise in all these areas.

The ratios of people who want to do studies to the people they need to do quality work are way, way, way, way, way, way out of balance. We need substantially more people in the support roles, statistics being one of them.

We, as statisticians, could spend all of our time just instructing and never do any real work ourselves, and there still wouldn't be enough of us. Why do the investigator-types think that the work we do is something they can do? I don't run around thinking I can do surgery just because I can pull a sliver out of my kids' fingers.

Good statistics is hard, so hard that sometimes we, the experts, are not even sure what to do. We need more people. For everything. For instruction. For programming. For study planning. For analysis. Everything. People to help the investigator types realize that we need their expertise in the science at issue, not spreadsheets that take us longer to decipher than good design would have cost them in the first place. The biggest place we can add value is getting the word out that, without training and experience, nonstatisticians just aren't going to be able to do what needs to be done.

Rafe offers some more (2006.02.03), supposing now that this actually might be considered a blog: After talking with Frank, the issues here with experiment sequencing seem to be focused around a number of learning tasks. The way I now understand the problem, if Dr. Experienced Researcher spends 30 years doing translational research, pushing discoveries from the test tube out to the patient, then Dr. Experienced Researcher gets really good at knowing what experiments to do first and then second and then third, and so on, all the while conditioning the design parameters of the next study on the totality of information gained up to that point. Since Dr. Experienced Researcher has been doing this for 30 years, Dr. Experienced Researcher is good at it and can do it quickly and easily and with good results, i.e., Dr. Experienced Researcher knows how to demonstrate effects and how to build the story line from biochemical model on the chalkboard to a substance in a person. Age and experience bring wisdom. (A toy sketch of this conditioning appears below.)
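
For instance, a minimal beta-binomial sketch, with invented experiment results, of how the posterior from one experiment becomes the prior when designing the next:

    def update(prior_a, prior_b, successes, n):
        """Beta-binomial update: fold one experiment's results into the prior.
        The posterior after experiment k serves as the prior when designing
        experiment k+1."""
        return prior_a + successes, prior_b + n - successes

    a, b = 1, 1                                        # vague prior, nothing known yet
    for successes, n in [(3, 10), (12, 25), (30, 60)]: # invented results
        a, b = update(a, b, successes, n)
        print(f"after n={n}: posterior mean response rate = {a / (a + b):.2f}")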

Enter Dr. New Researcher. Dr. New Researcher hasn't done this before and does not have 30 years of experience like Dr. Experienced Researcher. How do we systematically, accurately, and easily convey Dr. Experienced Researcher's wisdom and knowledge to Dr. New Researcher? How do we teach the art one researcher has learned to other researchers?

Experimental design rests on a number of principles, some of which were originally enumerated by Fisher:
  1. Performing the experiment so that it will provide a valid estimate of the underlying variability of subjects' responses.
  2. Using randomization to be able to obtain these estimates of variability and to balance on uncontrolled variables.
  3. Using blocking to balance out known sources of variation that are not of primary interest.
  4. Utilizing factorial experimentation, which has the advantage of learning about the effects of multiple factors in a single, more complex experiment, rather than devoting a separate experiment (in different subjects or animals) to each experimental manipulation. When feasible, this can often result in a net savings of subjects and cost, and can add the ability to study synergism or interaction between two treatments or other conditions. (A small factorial simulation follows.)
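
A small simulation, with made-up effect sizes, of why the factorial layout pays: one 2x2 experiment estimates both main effects and the interaction, which separate one-factor-at-a-time experiments of the same total size cannot do.

    import itertools, random

    def simulate_factorial(n_per_cell=10, effect_a=1.0, effect_b=0.5,
                           interaction=0.8, sd=1.0, seed=0):
        """Simulate a 2x2 factorial experiment and estimate all three effects
        from the four cell means."""
        rng = random.Random(seed)
        m = {}
        for a, b in itertools.product([0, 1], repeat=2):
            mu = effect_a * a + effect_b * b + interaction * a * b
            ys = [rng.gauss(mu, sd) for _ in range(n_per_cell)]
            m[a, b] = sum(ys) / n_per_cell
        est_a = (m[1, 0] + m[1, 1] - m[0, 0] - m[0, 1]) / 2  # A, averaged over B
        est_b = (m[0, 1] + m[1, 1] - m[0, 0] - m[1, 0]) / 2  # B, averaged over A
        est_ab = (m[1, 1] - m[1, 0]) - (m[0, 1] - m[0, 0])   # interaction
        return est_a, est_b, est_ab

    print(simulate_factorial())   # 40 subjects total yield all three estimates

The interaction contrast in the last line is exactly the quantity that two single-factor experiments, run separately, would miss entirely.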

Like the design of experiments, the design of sequences of experiments ought to have principles to drive it. I might conjecture that these principles lie along the same lines as Fisher's principles of experimental design, but there might be others as well. So, perhaps the principles might be (I'm making this up as I go along):
  1. Perform the sequence of experiments so that it will provide a valid estimate of the sources of variability in how the discovered technology relates to its surroundings, be those surroundings cell walls, or hormones, or receptors, or DNA sequences, or whatever. (A rough variance-components sketch follows this list.)
  2. Use randomization to control for relationships that aren't yet understood or thought of.
  3. Use blocking and factorial experiments...
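
Continuing principle 1, here is a rough sketch (simulated data, method-of-moments estimates, parameter values all invented) of splitting total variability into between-subject and within-subject components from repeated measurements:

    import random

    def variance_components(n_subjects=30, n_reps=4, sd_between=2.0,
                            sd_within=1.0, seed=0):
        """Simulate repeated measurements and split the variance into
        between-subject and within-subject pieces (one-way random effects,
        method-of-moments estimates)."""
        rng = random.Random(seed)
        data = []
        for _ in range(n_subjects):
            level = rng.gauss(0, sd_between)          # this subject's true level
            data.append([rng.gauss(level, sd_within)  # noisy repeat measures
                         for _ in range(n_reps)])
        subj_means = [sum(row) / n_reps for row in data]
        grand = sum(subj_means) / n_subjects
        ms_within = (sum((y - sum(row) / n_reps) ** 2
                         for row in data for y in row)
                     / (n_subjects * (n_reps - 1)))
        ms_between = (n_reps * sum((mn - grand) ** 2 for mn in subj_means)
                      / (n_subjects - 1))
        return (ms_between - ms_within) / n_reps, ms_within  # between, within

    print(variance_components())   # should land near (4.0, 1.0)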

Hmmm, the farther down this road I go, the more sequences of experiments look like experiments themselves.

Ok, Rafe, get to a point: The process researchers go through when designing sequences seems largely undocumented. If I were put in charge of figuring this out, I think I would:
  1. Dedicate some resources to documenting the process or processes (or lack thereof) that different researchers use. Interview. Watch in the lab. Get groups together to talk. Find out what is the same. Find out what is different. Find out how much is based on documentable facts and how much is based on tradition.
  2. Devote course time in the appropriate programs (the places where people turn into Dr. New Researcher) to lectures by Dr. Experienced Researcher. Have them present what worked. Encourage them to present what didn't.
  3. Devote course time to things other than technical issues. Strategic and tactical level learning is rarely part of the curricula of technically oriented departments.
  4. (Here's something novel.) Find ways to bring the old people back in now and then to consult. John Tukey used to spend a day a week at Merck for something like 30 years. What was the value of that experience? Instead of all these "brilliant new investigator" awards, how about a "brilliant and wise old investigator" award?
  5. Match up the new folks with old folks, both within and across disciplines. Collaboration across disciplines and generations has got to be of benefit.
  6. Encourage (and reward?) at least the internal publication, somehow, of negative results or "look at the really dumb thing I did" papers. We see in the literature all the "good" papers, some of which aren't very good at all. How bad are the bad ones? Imagine a Journal of Big Screw-ups. Really. Imagine the learning that would take place if people read that. We all know now that if you launch a shuttle you should inspect (and repair) the tiles before trying to land. The consistent promotion of only "successful" research reinforces young researchers' tendency to repeat the errors of their elders. (I'm starting to like this idea. Imagine a seminar on "Goofs I Made in the Lab". How many people would leave there saying, "Wow, I'm sure glad I went. I was planning to use that same reagent in my alpha-protease knockout mice next week. Now I won't have to waste a year with them becoming insulin resistant!")

Ok, I'm going to get some lunch now.

Yu Shyr adds: Rafe and Dean, here are some of my thoughts:
  1. The main reason for performing a sequence of experiments is to help investigators do better science by controlling experimental biases and using resources wisely.
  2. The focus of the sequence may be on:
  • a. Primary and secondary objectives of the study. For example, hypothesis generating vs. hypothesis testing. Is it an in vitro, in vivo, clinical trial, or population-based study?
  • b. Study design. For example, parallel design, crossover design, factorial design, etc.
  • c. Stages of the design. For example, single-stage, two-stage, multi-stage, or Bayesian approach.
  • d. The estimation of the sources of variability. For example, intra- vs. inter-.
  • e. The quality control and reproducibility issues of the study. For example, lab-to-lab, day-to-day, machine-to-machine, or researcher-to-researcher variability.
  • f. Sample size estimation based on (a) to (e).
  • g. The randomization and blinding issues of the study. For example, stratified permuted block randomization vs. adaptive randomization. Single or double blind? Or an open-label study? (A toy randomization sketch follows this list.)
  • h. Monitoring study progress. For example, decisions about early termination, extending a study, and conditional power calculations.
  • i. Statistical data analysis.
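
As a small illustration of item (g), a toy generator for stratified permuted block randomization might look like the following (treatment labels, block size, and strata are all arbitrary choices for the example):

    import random

    def stratified_permuted_blocks(strata, treatments=("A", "B"),
                                   block_size=4, n_blocks=3, seed=0):
        """Stratified permuted block randomization: within each stratum,
        build blocks containing each treatment equally often and shuffle
        within the block, so assignments stay balanced even if accrual
        stops partway through."""
        rng = random.Random(seed)
        per_treatment = block_size // len(treatments)
        schedule = {}
        for stratum in strata:
            seq = []
            for _ in range(n_blocks):
                block = list(treatments) * per_treatment  # balanced block
                rng.shuffle(block)                        # permute within block
                seq.extend(block)
            schedule[stratum] = seq
        return schedule

    # Example strata: site crossed with disease stage
    print(stratified_permuted_blocks(["site1/early", "site1/late",
                                      "site2/early", "site2/late"]))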