A blind or blinded experiment is an experiment in which information about the test is masked (withheld) from the participant, to reduce or eliminate bias, until after a trial outcome is known. It is understood that bias may be intentional or subconscious, so no dishonesty is implied by blinding. If both tester and subject are blinded, the trial is called a double-blind experiment.
Blind testing is used wherever items are to be compared without influences from testers' preferences or expectations, for example in clinical trials to evaluate the effectiveness of medicinal drugs and procedures without placebo effect, observer bias, or conscious deception; and comparative testing of commercial products to objectively assess user preferences without being influenced by branding and other properties not being tested.
Blinding can be imposed on researchers, technicians, or subjects. The opposite of a blind trial is an open trial. Blind experiments are an important tool of the scientific method in many fields of research: medicine, psychology and the social sciences, natural sciences such as physics and biology, applied sciences such as market research, and many others. In some disciplines, such as medicinal drug testing, blind experiments are considered essential.
In some cases, while blind experiments would be useful, they are impractical or unethical. An example comes from developmental psychology: although it would be informative to raise children under arbitrary experimental conditions, such as on a remote island with a fabricated enculturation, doing so would violate ethics and human rights.
The terms blind (adjective) or to blind (transitive verb) when used in this sense are figurative extensions of the literal idea of blindfolding someone. The terms masked or to mask may be used for the same concept; this is commonly the case in ophthalmology, where the word 'blind' is often used in the literal sense.
The French Academy of Sciences originated the first recorded blind experiments in 1784: the Academy set up a commission to investigate the claims of animal magnetism proposed by Franz Mesmer. Headed by Benjamin Franklin and Antoine Lavoisier, the commission carried out experiments asking mesmerists to identify objects that had previously been filled with "vital fluid", including trees and flasks of water. The subjects were unable to do so. The commission went on to examine claims involving the curing of "mesmerized" patients. These patients showed signs of improved health, but the commission attributed this to the fact that these patients believed they would get better--the first scientific suggestion of the now well-known placebo effect.
In 1799, the British chemist Humphry Davy performed another early blind experiment. In studying the effects of nitrous oxide (laughing gas) on human physiology, Davy deliberately did not tell his subjects what concentration of the gas they were breathing, or whether they were breathing ordinary air.
Blind experiments went on to be used outside of purely scientific settings. In 1817, a committee of scientists and musicians compared a Stradivarius violin to one with a guitar-like design made by the naval engineer François Chanot. A well-known violinist played each instrument while the committee listened in the next room to avoid prejudice.
One of the first essays advocating a blinded approach to experiments in general came from Claude Bernard in the latter half of the 19th century, who recommended splitting any scientific experiment between the theorist who conceives the experiment and a naive (and preferably uneducated) observer who registers the results without foreknowledge of the theory or hypothesis being tested. This suggestion contrasted starkly with the prevalent Enlightenment-era attitude that scientific observation can only be objectively valid when undertaken by a well-educated, informed scientist.
Double-blind methods came into particular prominence in the mid-20th century.
Single-blind describes experiments where information that could introduce bias or otherwise skew the result is withheld from the participants, while the experimenter remains in full possession of the facts.
In a single-blind experiment, the individual subjects do not know whether they are so-called "test" subjects or members of an "experimental control" group. Single-blind experimental design is used either where the experimenters must know the full facts (for example, when comparing sham to real surgery) and so cannot themselves be blind, or where the experimenters will not introduce further bias and so need not be blind. However, there is a risk that subjects are influenced by interaction with the researchers, an effect known as experimenter's bias. Single-blind trials are especially risky in psychology and social science research, where the experimenter has an expectation of what the outcome should be, and may consciously or subconsciously influence the behavior of the subject.
A classic example of a single-blind test is the Pepsi Challenge. A tester, often a marketing person, prepares two sets of cups of cola labeled "A" and "B". One set of cups is filled with Pepsi, while the other is filled with Coca-Cola. The tester knows which soda is in which cup but is not supposed to reveal that information to the subjects. Volunteer subjects are encouraged to try the two cups of soda and are polled on which one they prefer. One of the problems with a single-blind test like this is that the tester can unintentionally give subconscious cues which influence the subjects. In addition, it is possible the tester could intentionally introduce bias by preparing the separate sodas differently (e.g., by putting more ice in one cup or by pushing one cup closer to the subject). If the tester is a marketing person employed by the company which is producing the challenge, there is always the possibility of a conflict of interest where the marketing person is aware that future income will be based on the results of the test.
Double-blind describes an especially stringent way of conducting an experiment which attempts to eliminate subjective, unrecognized biases carried by an experiment's subjects (usually human) and conductors. Double-blind studies were first used in 1907 by W. H. R. Rivers and H. N. Webber in the investigation of the effects of caffeine.
In most cases, double-blind experiments are regarded as achieving a higher standard of scientific rigor than single-blind or non-blind experiments.
In these double-blind experiments, neither the participants nor the researchers know which participants belong to the control group and which to the test group. Only after all data have been recorded (and, in some cases, analyzed) do the researchers learn which participants were which. Performing an experiment in double-blind fashion can greatly lessen the power of preconceived notions or physical cues (e.g., placebo effect, observer bias, experimenter's bias) to distort the results (by making researchers or participants behave differently than they would in everyday life). Random assignment of test subjects to the experimental and control groups is a critical part of any double-blind research design. The key that identifies the subjects and which group they belonged to is kept by a third party, and is not revealed to the researchers until the study is over.
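The random assignment and third-party key described above can be illustrated with a minimal Python sketch. The names allocate and SealedKey, the four-digit code range, and the rule that the key opens only after the data are locked are all assumptions made for illustration; they do not describe any particular trial-management system.

```python
import random


def allocate(participant_ids, seed=None):
    """Randomly split participants into two arms of (nearly) equal size.

    Returns (blinded_codes, key):
      blinded_codes: participant id -> opaque code, visible to researchers
      key:           opaque code -> arm, held only by the third party
    Note: `random` is not cryptographically secure; a real trial would
    use a vetted randomization service. This is an illustration only.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    arms = ["treatment", "control"] * (len(ids) // 2)
    if len(ids) % 2:  # odd head count: the extra slot is drawn at random
        arms.append(rng.choice(["treatment", "control"]))
    rng.shuffle(arms)
    codes = rng.sample(range(1000, 10000), len(ids))  # unique opaque codes
    blinded_codes = dict(zip(ids, codes))
    key = dict(zip(codes, arms))
    return blinded_codes, key


class SealedKey:
    """Hypothetical third party that refuses to reveal the key early."""

    def __init__(self, key):
        self._key = dict(key)
        self._locked = False

    def lock_data(self):
        """Called once all outcome data have been recorded."""
        self._locked = True

    def unblind(self):
        if not self._locked:
            raise PermissionError("data must be locked before unblinding")
        return dict(self._key)


blinded, key = allocate(["p1", "p2", "p3", "p4"], seed=2024)
third_party = SealedKey(key)
# Researchers work only with `blinded`; the arms stay hidden until
# third_party.lock_data() has been called.
```

Researchers in this sketch record outcomes against the opaque codes alone, so neither they nor the participants can tell the arms apart until the key is released.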
Double-blind methods can be applied to any experimental situation in which there is a possibility that the results will be affected by conscious or unconscious bias on the part of researchers, participants, or both. For example, in animal studies, both the carer of the animals and the assessor of the results have to be blinded; otherwise the carer might treat control subjects differently and alter the results.
Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, on the grounds that software may not introduce the kind of direct bias that arises between researcher and subject. However, the development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be built into an apparently simple machine-based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.
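As a rough illustration of the ABX idea, the following Python sketch keeps the identity of X inside the software (the "third party") and scores the subject's answers against it. The function names run_abx and p_value and the respond callback are hypothetical; the exact one-sided binomial test is one common way to score such a test, assumed here rather than taken from the text above.

```python
import random
from math import comb


def run_abx(n_trials, respond, rng=None):
    """Run n_trials ABX trials and return the number of correct answers.

    `respond(trial_index)` stands in for the subject: it must return
    "A" or "B" and is never shown the hidden identity of X.
    """
    rng = rng or random.Random()
    hidden = [rng.choice("AB") for _ in range(n_trials)]  # the secret key
    answers = [respond(i) for i in range(n_trials)]
    return sum(h == a for h, a in zip(hidden, answers))


def p_value(correct, n_trials):
    """Probability of at least `correct` hits by pure guessing (p = 1/2)."""
    tail = sum(comb(n_trials, k) for k in range(correct, n_trials + 1))
    return tail / 2 ** n_trials


# A subject who always answers "A" is guessing with respect to X.
hits = run_abx(16, lambda i: "A", rng=random.Random(7))
print(hits, p_value(hits, 16))
```

Under these assumptions, 12 correct answers out of 16 correspond to a guessing probability of 2517/65536, or about 0.038.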
A triple-blind study is an extension of the double-blind design; the committee monitoring response variables is not told the identity of the groups. The committee is simply given data for groups A and B. A triple-blind study has the theoretical advantage of allowing the monitoring committee to evaluate the response variable results more objectively. This assumes that appraisal of efficacy and harm, as well as requests for special analyses, may be biased if group identity is known. However, in a trial where the monitoring committee has an ethical responsibility to ensure participant safety, such a design may be counterproductive since in this case monitoring is often guided by the constellation of trends and their directions. In addition, by the time many monitoring committees receive data, often any emergency situation has long passed.
Double-blinding is relatively easy to achieve in drug studies, by formulating the investigational drug and the control (either a placebo or an established drug) to have identical appearance (color, taste, etc.). Patients are randomly assigned to the control or experimental group and given random numbers by a study coordinator, who also encodes the drugs with matching random numbers. Neither the patients nor the researchers monitoring the outcome know which patient is receiving which treatment, until the study is over and the random code is revealed.
Effective blinding can be difficult to achieve where the treatment is notably effective (indeed, studies have been suspended in cases where the tested drug combinations were so effective that it was deemed unethical to continue withholding the findings from the control group, and the general population), or where the treatment is very distinctive in taste or has unusual side-effects that allow the researcher or the subject to guess which group they were assigned to. It is also difficult to use the double-blind method to compare surgical and non-surgical interventions (although sham surgery, involving a simple incision, might be ethically permitted). A good clinical protocol will foresee these potential problems to ensure blinding is as effective as possible. It has also been argued that even in a double-blind experiment, general attitudes of the experimenter such as skepticism or enthusiasm towards the tested procedure can be subconsciously transferred to the test subjects.
Evidence-based medicine practitioners prefer blinded randomised controlled trials (RCTs), where that is a possible experimental design. These are high on the hierarchy of evidence; only a meta-analysis of several well-designed RCTs is considered more reliable.
Modern nuclear physics and particle physics experiments often involve large numbers of data analysts working together to extract quantitative data from complex datasets. In particular, the analysts want to report accurate systematic error estimates for all of their measurements; this is difficult or impossible if one of the errors is observer bias. To remove this bias, the experimenters devise blind analysis techniques, where the experimental result is hidden from the analysts until they have agreed--based on properties of the data set other than the final value--that the analysis techniques are fixed.
One example of a blind analysis occurs in neutrino experiments, like the Sudbury Neutrino Observatory, where the experimenters wish to report the total number N of neutrinos seen. The experimenters have preexisting expectations about what this number should be, and these expectations must not be allowed to bias the analysis. Therefore, the experimenters are allowed to see an unknown fraction f of the dataset. They use these data to understand the backgrounds, signal-detection efficiencies, detector resolutions, etc. However, since no one knows the "blinding fraction" f, no one has preexisting expectations about the meaningless neutrino count N' = N × f in the visible data; therefore, the analysis does not introduce any bias into the final number N which is reported. Another blinding scheme is used in B meson analyses in experiments like BaBar and CDF; here, the crucial experimental parameter is a correlation between certain particle energies and decay times--which require an extremely complex and painstaking analysis--and particle charge signs, which are fairly trivial to measure. Analysts are allowed to work with all the energy and decay data, but are forbidden from seeing the sign of the charge, and thus are unable to see the correlation (if any). At the end of the experiment, the correct charge signs are revealed; the analysis software is run once (with no subjective human intervention), and the resulting numbers are published. Searches for rare events, like electron neutrinos in MiniBooNE or proton decay in Super-Kamiokande, require a different class of blinding schemes.
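The hidden-fraction scheme can be sketched in a few lines of Python. This is a toy model only, not the software of any real experiment: the class name BlindnessBox, the range from which f is drawn, and the rule that the box opens exactly once are all assumptions made for illustration.

```python
import random


class BlindnessBox:
    """Toy model of hidden-fraction blinding (illustrative assumptions only)."""

    def __init__(self, events, rng=None, f_range=(0.3, 0.9)):
        rng = rng or random.Random()
        self._events = list(events)
        # Secret blinding fraction f; the range is an arbitrary assumption.
        self._f = rng.uniform(*f_range)
        n_visible = round(self._f * len(self._events))
        # Analysts see a random subset whose size is roughly N' = N * f.
        self._visible = rng.sample(self._events, n_visible)
        self._opened = False

    def visible_events(self):
        """Data the analysts may use to tune cuts and estimate backgrounds."""
        return list(self._visible)

    def open(self, frozen_analysis):
        """Run the agreed analysis once on the full dataset."""
        if self._opened:
            raise RuntimeError("the box has already been opened")
        self._opened = True
        return frozen_analysis(self._events)


box = BlindnessBox(range(1000), rng=random.Random(0))
n_prime = len(box.visible_events())  # uninformative without knowing f
total = box.open(len)                # the unblinded count N
```

Because the analysts in this sketch never learn f, the visible count tells them nothing about N until the box is opened.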
The "hidden" part of the experiment--the fraction f for SNO, the charge-sign database for CDF--is usually called the "blindness box". At the end of the analysis period, one is allowed to "unblind the data" and "open the box".
In a police photo lineup, an officer shows a group of photos to a witness or crime victim and asks him or her to pick out the suspect. This is essentially a single-blind test of the witness's memory, and may be subject to subtle or overt influence by the officer. There is a growing movement in law enforcement to move to a double-blind procedure in which the officer who shows the photos to the witness does not know which photo is of the suspect.
In recruiting musicians to perform in orchestras and similar ensembles, blind auditions are now routinely held: the musicians perform behind a screen so that their physical appearance and gender cannot prejudice the listener judging the performance.
Shortly after the start of the Cold War [...] double-blind reviews became the norm for conducting scientific medical research, as well as the means by which peers evaluated scholarship, both in science and in history.