Lesson 1: Introduction to Design of Experiments

Overview

In this course we will cover essentially the entire textbook - all of the concepts and designs included. We will have plenty of examples to look at and experience to draw from.

Please note: the main topics listed in the syllabus follow the chapters in the book.

A word of advice regarding the analyses: the prerequisites for this course are STAT 501 - Regression Methods and STAT 502 - Analysis of Variance. However, the focus of the course is on the design and not on the analysis. Thus, one can successfully complete this course without these prerequisites - with just STAT 500 - Applied Statistics, for instance - but it will require much more work, and you will have less appreciation of the subtleties involved in the analyses. You might say the course is more conceptual than math oriented.

 

 Text Reference: Montgomery, D. C. (2019). Design and Analysis of Experiments, 10th Edition, John Wiley & Sons. ISBN 978-1-119-59340-9

What is the Scientific Method?

Do you remember learning about this back in high school or junior high even? What were those steps again?

Decide what phenomenon you wish to investigate. Specify how you can manipulate the factor and hold all other conditions fixed, to ensure that these extraneous conditions aren't influencing the response you plan to measure.

Then measure your chosen response variable at several (at least two) settings of the factor under study. If changing the factor causes the phenomenon to change, then you conclude that there is indeed a cause-and-effect relationship at work.

How many factors are involved when you do an experiment? Some say two - perhaps this is a comparative experiment? Perhaps there is a treatment group and a control group? If you have a treatment group and a control group then, in this case, you probably only have one factor with two levels.

How many of you have baked a cake? What are the factors involved in ensuring a successful cake? Factors might include preheating the oven, baking time, ingredients, amount of moisture, baking temperature, and so on. What else? You probably follow a recipe, so there are many additional factors that control the ingredients - i.e., a mixture. In other words, someone did the experiment in advance! What parts of the recipe did they vary to make the recipe a success? Probably many factors: temperature and moisture, various ratios of ingredients, and the presence or absence of many additives. Now, should one keep all the factors involved in the experiment at a constant level and just vary one to see what would happen? This is a strategy that works but is not very efficient, and it is one of the concepts that we will address in this course.

Objectives

Upon completion of this lesson, you should be able to:

  • understand the issues and principles of Design of Experiments (DOE),
  • understand that experimentation is a process,
  • list the guidelines for designing experiments, and
  • recognize the key historical figures in DOE.

1.1 - A Quick History of the Design of Experiments (DOE)

The textbook we are using brings an engineering perspective to the design of experiments. We will bring in other contexts and examples from other fields of study, including agriculture (where much of the early research was done), education, and nutrition. Surprisingly, the service industry has begun using design of experiments as well.

All experiments are designed experiments; it is just that some are poorly designed and some are well-designed.

Engineering Experiments

If we had infinite time and resource budgets there probably wouldn't be a big fuss made over designing experiments. In production and quality control we want to control the error and learn as much as we can about the process or the underlying theory with the resources at hand. From an engineering perspective we're trying to use experimentation for the following purposes:

  • reduce time to design/develop new products & processes
  • improve performance of existing processes
  • improve reliability and performance of products
  • achieve product & process robustness
  • perform evaluation of materials, design alternatives, setting component & system tolerances, etc.

We always want to fine-tune or improve the process. In today's global world this drive for competitiveness affects all of us both as consumers and producers.

Robustness is a concept that enters into statistics at several points. At the analysis stage, robustness refers to a technique that isn't overly influenced by bad data. Even if there is an outlier or bad data, you still want to get the right answer. Robustness also applies to the process itself: regardless of who or what is involved in the process, it is still going to work. We will come back to this notion of robustness later in the course (Lesson 12).

[Figure 1-1: General model of a process or system. Inputs enter the process together with controllable factors \(x_1, x_2, \ldots, x_p\) and uncontrollable factors \(z_1, z_2, \ldots\); the process produces an output.]

Every experimental design has inputs. Back to the cake baking example: we have our ingredients, such as flour, sugar, milk, and eggs. Regardless of the quality of these ingredients, we still want our cake to come out successfully. In every experiment there are inputs, and in addition there are factors (such as time of baking, temperature, geometry of the cake pan, etc.), some of which you can control and others that you can't. The experimenter must think about the factors that affect the outcome. We also talk about the output, the yield, or the response to your experiment. For the cake, the output might be measured as texture, flavor, height, or size.

Four Eras in the History of DOE

Here's a quick timeline:

  • The agricultural origins, 1918 – 1940s
    • R. A. Fisher & his co-workers
    • Profound impact on agricultural science
    • Factorial designs, ANOVA
  • The first industrial era, 1951 – late 1970s
    • Box & Wilson, response surfaces
    • Applications in the chemical & process industries
  • The second industrial era, late 1970s – 1990
    • Quality improvement initiatives in many companies
    • CQI and TQM were important ideas and became management goals
    • Taguchi and robust parameter design, process robustness
  • The modern era, beginning circa 1990, when economic competitiveness and globalization are driving all sectors of the economy to be more competitive.
Note: A lot of what we are going to learn in this course goes back to what Sir Ronald Fisher developed in the UK in the first half of the 20th century. He really laid the foundation for statistics and for the design of experiments. He and his colleague Frank Yates developed many of the concepts and procedures that we use today. Basic concepts such as orthogonal designs and Latin squares began there in the '20s through the '40s. World War II also had an impact on statistics, giving rise to sequential analysis as a method to improve the accuracy of long-range artillery guns.

Immediately following World War II, the first industrial era marked another resurgence in the use of DOE. It was at this time that Box and Wilson (1951) wrote the key paper on response surface designs, thinking of the output as a response function and trying to find the optimum conditions for this function. George Box died early in 2013. An interesting fact here: he married Fisher's daughter! He worked in the chemical industry in England early in his career and then came to America, where he worked at the University of Wisconsin for most of his career.

The Second Industrial Era - or the Quality Revolution

W. Edwards Deming

The importance of statistical quality control was taken to Japan in the 1950s by W. Edwards Deming. This started what Montgomery calls the second industrial era, sometimes called the quality revolution. After the Second World War, Japanese products were of terrible quality: they were cheaply made and not very good. In the 1960s their quality started improving. The Japanese car industry adopted statistical quality control procedures and conducted experiments, which started this new era. Total Quality Management (TQM) and Continuous Quality Improvement (CQI) are management techniques that came out of this statistical quality revolution - statistical quality control and design of experiments.

Taguchi, a Japanese engineer, independently developed and published many of the techniques that were later brought to the West, using what he referred to as orthogonal arrays. In the West, these were referred to as fractional factorial designs. The two are very similar, and we will discuss both in this course. He also came up with the concepts of robust parameter design and process robustness.

The Modern Era

Around 1990, Six Sigma, a new way of packaging CQI, became popular. It has since grown into a branded methodology, adopted by many of the large manufacturing companies, that uses statistics to make decisions based on quality and feedback loops. It incorporates a lot of previous statistical and management techniques.

Clinical Trials

Montgomery omits from this brief history a major area in which the design of experiments evolved - clinical trials. This evolution took place in the 1960s; previously, medical advances were based on anecdotal data: a doctor would examine six patients, write a paper about them, and publish it. The serious biases resulting from these kinds of anecdotal studies became recognized, and the outcome was a move toward making the randomized, double-blind clinical trial the gold standard for approval of any new product, medical device, or procedure. The scientific application of statistical procedures became very important.


1.2 - The Basic Principles of DOE

Randomization

This is an essential component of any experiment that is going to have validity. If you are doing a comparative experiment where you have two treatments - a treatment and a control, for instance - you need to include in your experimental process the assignment of those treatments to the experimental units by some random process. You need a deliberate process to eliminate potential biases from the conclusions, and random assignment is a critical step.
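As a minimal illustration (not part of the course materials), here is a short Python sketch of random assignment; the unit labels and group sizes are hypothetical.

```python
import random

random.seed(42)  # fixed seed so this illustration is reproducible

# 20 hypothetical experimental units (labels are made up for the example)
units = [f"unit_{i}" for i in range(1, 21)]

random.shuffle(units)          # randomize the order of the units
treatment_group = units[:10]   # first 10 shuffled units receive the treatment
control_group   = units[10:]   # remaining 10 units serve as the control

print("Treatment:", treatment_group)
print("Control:  ", control_group)
```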

Replication

Replication is in some sense the heart of all of statistics. To make this point, remember what the standard error of the mean is: it is the square root of the estimate of the variance of the sample mean, i.e., \(\sqrt{\dfrac{s^2}{n}}\). The width of the confidence interval is determined by this statistic. Our estimates of the mean become less variable as the sample size increases.

Replication is the basic issue behind every method we will use in order to get a handle on how precise our estimates are at the end. We always want to estimate or control the uncertainty in our results. We achieve this estimate through replication. Another way we can achieve short confidence intervals is by reducing the error variance itself. However, when that isn't possible, we can reduce the error in our estimate of the mean by increasing n.
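To make the effect of replication concrete, here is a small numeric sketch of the standard error \(\sqrt{\dfrac{s^2}{n}}\); the sample variance value is invented purely for illustration.

```python
import math

s2 = 4.0  # an assumed sample variance, purely for illustration

for n in (4, 16, 64, 256):
    se = math.sqrt(s2 / n)        # standard error of the mean, sqrt(s^2 / n)
    ci_width = 2 * 1.96 * se      # approximate width of a 95% confidence interval
    print(f"n = {n:3d}   SE = {se:.3f}   approx. 95% CI width = {ci_width:.3f}")
```

Quadrupling the sample size halves the standard error, and with it the width of the confidence interval.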

Another way to reduce the size, or the length, of the confidence interval is to reduce the error variance - which brings us to blocking.

Blocking

Blocking is a technique for including other factors in our experiment which contribute to undesirable variation. Much of the focus in this class will be on creatively using various blocking techniques to control sources of variation that will reduce error variance. For example, in human studies, the gender of the subjects is often an important factor. Age is another factor affecting the response. Age and gender are often considered nuisance factors, which contribute to variability and make it difficult to assess systematic effects of a treatment. By using these as blocking factors, you can avoid biases that might occur due to differences in how subjects are allocated to the treatments, and you can account for some of the noise in the experiment. We want the unknown error variance at the end of the experiment to be as small as possible. Our goal is usually to find out something about a treatment factor (or a factor of primary interest), but in addition to this, we want to include any blocking factors that will explain variation.
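As a rough sketch of how blocking and randomization work together, the following assigns treatments at random within each block so the nuisance factor stays balanced across treatments; the subject labels, block sizes, and age grouping are assumptions made for the example.

```python
import random

random.seed(7)

# Two hypothetical blocks of subjects, grouped by a nuisance factor (age group)
blocks = {
    "under_40": ["S01", "S02", "S03", "S04", "S05", "S06"],
    "over_40":  ["S07", "S08", "S09", "S10", "S11", "S12"],
}

assignment = {}
for block_name, subjects in blocks.items():
    random.shuffle(subjects)            # randomize within the block only
    half = len(subjects) // 2
    for s in subjects[:half]:
        assignment[s] = "treatment"     # half of each block gets the treatment
    for s in subjects[half:]:
        assignment[s] = "control"       # the other half serves as the control

print(assignment)
```

Because each block contributes equally to both treatment groups, any difference between age groups cannot masquerade as a treatment effect.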

Multi-factor Designs

We will spend at least half of this course talking about multi-factor experimental designs: \(2^k\) designs, \(3^k\) designs, response surface designs, etc. The point of all of these multi-factor designs is contrary to the scientific method as described above, where everything is held constant except one factor, which is varied. The one-factor-at-a-time method is a very inefficient way of making scientific advances. It is much better to design an experiment that simultaneously includes combinations of multiple factors that may affect the outcome. Then you learn not only about the primary factors of interest but also about these other factors. These may be blocking factors, which deal with nuisance parameters, or they may simply help you understand the interactions or relationships between the factors that influence the response.
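For a concrete picture of "combinations of multiple factors", here is a minimal sketch that enumerates the runs of a \(2^3\) full factorial design; the factor names and levels are invented, loosely echoing the cake example.

```python
from itertools import product

# Three hypothetical two-level factors (low level, high level)
factors = {
    "temperature_F": [325, 375],
    "bake_time_min": [30, 40],
    "pan_type":      ["glass", "metal"],
}

# A 2^3 full factorial design: all 8 combinations of the factor levels
runs = list(product(*factors.values()))
for i, levels in enumerate(runs, start=1):
    settings = ", ".join(f"{name}={level}" for name, level in zip(factors, levels))
    print(f"run {i}: {settings}")
```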

Confounding

Confounding is something that is usually considered bad! Here is an example. Let's say we are doing a medical study with drugs A and B. We put 10 subjects on drug A and 10 on drug B. If we categorize our subjects by gender, how should we allocate our drugs to our subjects? Let's make it easy and say that there are 10 male and 10 female subjects. A balanced way of doing this study would be to put five males on drug A and five males on drug B, five females on drug A and five females on drug B. This is a perfectly balanced experiment such that if there is a difference between male and female at least it will equally influence the results from drug A and the results from drug B.

An alternative scenario might occur if patients were randomly assigned treatments as they came in the door. At the end of the study, they might realize that drug A had only been given to the male subjects and drug B was only given to the female subjects. We would call this design totally confounded. This refers to the fact that if you analyze the difference between the average response of the subjects on A and the average response of the subjects on B, this is exactly the same as the average response on males and the average response on females. You would not have any reliable conclusion from this study at all. The difference between the two drugs A and B, might just as well be due to the gender of the subjects since the two factors are totally confounded.
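A tiny numeric sketch of the totally confounded scenario (the response values are invented): the comparison of drug A with drug B uses exactly the same two groups of subjects as the comparison of males with females, so the two effects cannot be separated.

```python
from statistics import mean

# Invented responses under the totally confounded allocation:
# every subject on drug A happened to be male, every subject on drug B female.
drug_A_males   = [5.1, 4.8, 5.4, 5.0, 4.9]
drug_B_females = [6.2, 6.0, 5.8, 6.1, 6.4]

drug_effect   = mean(drug_A_males) - mean(drug_B_females)   # "A vs. B" difference
gender_effect = mean(drug_A_males) - mean(drug_B_females)   # "male vs. female" difference

# Both differences are computed from exactly the same groups of subjects,
# so the drug effect cannot be separated from the gender effect.
print(drug_effect, gender_effect)
```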

Confounding is something we typically want to avoid, but when we are building complex experiments we can sometimes use confounding to our advantage. We will confound things we are not interested in, in order to have more efficient experiments for the things we are interested in. This will come up in multiple-factor experiments later on. We may be interested in main effects but not interactions, so we will confound the interactions in order to reduce the sample size, and thus the cost of the experiment, while still having good information on the main effects.


1.3 - Steps for Planning, Conducting and Analyzing an Experiment

The practical steps needed for planning and conducting an experiment include: recognizing the goal of the experiment, choice of factors, choice of response, choice of the design, analysis and then drawing conclusions. This pretty much covers the steps involved in the scientific method.

  1. Recognition and statement of the problem
  2. Choice of factors, levels, and ranges
  3. Selection of the response variable(s)
  4. Choice of design
  5. Conducting the experiment
  6. Statistical analysis
  7. Drawing conclusions, and making recommendations

What this course will deal with primarily is the choice of the design. This focus includes all the related issues about how we handle these factors in conducting our experiments.

Factors

We usually talk about "treatment" factors, which are the factors of primary interest to you. In addition to treatment factors, there are nuisance factors which are not your primary focus, but you have to deal with them. Sometimes these are called blocking factors, mainly because we will try to block on these factors to prevent them from influencing the results.

There are other ways that we can categorize factors:

Experimental vs. Classification Factors

Experimental Factors
These are factors that you can specify (and set the levels of) and then assign at random as the treatment to the experimental units. Examples would be temperature, level of an additive, fertilizer amount per acre, etc.
Classification Factors
These can't be changed or assigned; they come as labels on the experimental units. The age and sex of the participants are classification factors which can't be changed or randomly assigned. But you can select individuals from these groups randomly.

Quantitative vs. Qualitative Factors

Quantitative Factors
You can assign any specified level of a quantitative factor. Examples: percent or pH level of a chemical.
Qualitative Factors
These factors have categories of different types. Examples might be species of a plant or animal, a brand in the marketing field, or gender - these are not ordered or continuous but are arranged perhaps in sets.

Try It!

Think about your own field of study and jot down several of the factors that are pertinent in your own research area. Into what categories do these fall?

Get statistical thinking involved early when you are preparing to design an experiment! Getting well into an experiment before you have considered these implications can be disastrous. Think and experiment sequentially. Experimentation is a process where what you know informs the design of the next experiment, and what you learn from it becomes the knowledge base to design the next.

