1. Random errors in epidemiological studies are unpredictable fluctuations in measurements. They occur because measuring devices always show some deviation in their readings and because experimenters themselves may be inconsistent. The results of any investigation therefore carry some degree of error. It is the task of the investigator to manage this error so that the true result is not missed. First, the confidence level is selected; a 95% confidence level is usually accepted. Lower levels are considered unreliable, and higher levels are difficult to attain because the required sample size becomes too large. Next, the sample size is estimated; it must be sufficient to represent the population. Finally, the mean and its confidence interval are calculated at the chosen confidence level. As a result, when a sound statistical approach is applied, random errors are kept to a reasonable level.
Other factors that can reduce error are an efficient study design and repeated measurement. An adequate study design means that the research plan must exclude methodological bias (misuse of terms, selection bias, recall bias, interviewer bias). Repeating the measurement expands the data set so that the results come as close to the true values as possible.
2. The null hypothesis is a working hypothesis that assumes the experimental data do not differ from the true data or from the comparison group. In practice, experimental data often differ from the real world, and one experimental group is rarely equal to another. The null hypothesis suggests these differences are just random errors (at the given confidence level). In other words, the null hypothesis assumes that the deviations between observations are not significant and that the two distributions are similar.
If the null hypothesis is true, the factor under study carries no excess risk of causing the effect. Thus, the risk ratio and the odds ratio are both close to 1.0, and the risk difference is close to zero because the risks in the two groups are identical.
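The behaviour of these three measures under the null hypothesis can be illustrated with a minimal Python sketch. The function name `two_by_two_measures` and the counts are hypothetical, chosen only so that both groups have the same risk.

```python
def two_by_two_measures(a, b, c, d):
    """Return (risk ratio, odds ratio, risk difference) for a 2x2 table.

    a, b: exposed group with and without the outcome
    c, d: comparison group with and without the outcome
    """
    risk_exposed = a / (a + b)
    risk_control = c / (c + d)
    risk_ratio = risk_exposed / risk_control
    odds_ratio = (a * d) / (b * c)
    risk_difference = risk_exposed - risk_control
    return risk_ratio, odds_ratio, risk_difference


# Hypothetical counts: 20 of 100 people are affected in each group,
# so the two risks are identical, as the null hypothesis assumes.
rr, odds, rd = two_by_two_measures(20, 80, 20, 80)
print(rr, odds, rd)  # 1.0 1.0 0.0
```

With real data the values would rarely be exactly 1.0 and 0; the null hypothesis only claims that any departure is due to random error.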
3. The alternative hypothesis suggests that the experimental results differ from the real-world data or from the comparison group. It assumes that the difference between the data sets is not due to random error. In experiments, the alternative hypothesis states what we expect to find in the given study. This is because experimenters are usually interested in finding a difference between variables (for example, a statistically significant improvement from a prescribed drug) rather than in showing them to be equal.
4. The p-value is a calculated quantity that indicates the statistical significance of a result. It is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true; it is not the probability that the null hypothesis itself is true. The smaller the p-value, the less compatible the calculated ratio, index, or number is with the null hypothesis. When the p-value is above 0.1 (>10%), the null hypothesis is not rejected, so there is no statistically significant evidence for the alternative hypothesis. When the p-value is below 0.05 (<5%), there is strong evidence for the alternative hypothesis. A p-value between 0.05 and 0.1 suggests weak evidence. The p-value may range from a very small number (very close to zero, but not zero, indicating strong evidence for the alternative hypothesis) to 1 (indicating no evidence for it).
5. It is known that observations carry some degree of error. Such deviations cannot be ignored. Thus, researchers repeat the experiment or take numerous measurements and calculate the mean. If the data follow a normal distribution, the standard deviation (SD) is also calculated. It is important to note that the sample mean does not exactly equal the real-world value; it is only probably close to it. To estimate the confidence interval, we extend the mean in both directions by 1.96 standard errors (SE = SD/√n, for the 95% confidence level), so that the interval is expected to contain the true value in 95% of repeated samples.
The p-value only states how compatible the given statistical calculation is with the null hypothesis. The confidence interval is therefore a more practical and demonstrative approach. It gives the researcher a range within which to interpret the obtained result: the confidence interval shows the range of hypotheses that are compatible with the data.
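As an illustration of the confidence interval of a mean, here is a minimal Python sketch. The measurements are hypothetical, and the function name `mean_confidence_interval` is introduced only for this example. It uses the normal multiplier 1.96 with the standard error of the mean (SD divided by the square root of the sample size); for a sample this small a t-distribution multiplier would normally be preferred, so the interval shown is an approximation.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% CI of the mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample standard deviation
    se = sd / math.sqrt(n)             # standard error of the mean
    return mean, mean - z * se, mean + z * se


# Hypothetical repeated measurements of the same quantity.
measurements = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
mean, low, high = mean_confidence_interval(measurements)
print(f"mean = {mean:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Taking more measurements shrinks the standard error and therefore narrows the interval, which is why repeating the measurement reduces the effect of random error.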
1. The risk ratio is calculated according to the formula:
RR = [a / (a + b)] / [c / (c + d)]
Where a and b refer to the exposed group (the Southwark Company in this case, because this is the company of interest), and c and d refer to the control group (the Lambeth Company, taken as the reference company). Here a and c are the numbers with a positive outcome (individuals who suffered from cholera), and b and d are the numbers with a negative outcome (exposed to the water but unaffected by cholera).
RR = (844 / 167,654) / (18 / 19,133) = 0.00503 / 0.00094 ≈ 5.4
The ratio is high, suggesting that the Southwark Company water carries a substantially higher risk than the Lambeth Company water.
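The arithmetic can be checked with a short Python sketch. It assumes, as the calculation above does, that 167,654 and 19,133 are the total numbers of people supplied by each company (that is, a + b and c + d); the variable names are chosen only for this example.

```python
# Figures as used in the calculation above: cases and people supplied.
southwark_cases, southwark_total = 844, 167_654
lambeth_cases, lambeth_total = 18, 19_133

risk_southwark = southwark_cases / southwark_total   # a / (a + b)
risk_lambeth = lambeth_cases / lambeth_total         # c / (c + d)
risk_ratio = risk_southwark / risk_lambeth

print(f"risk (Southwark) = {risk_southwark:.5f}")
print(f"risk (Lambeth)   = {risk_lambeth:.5f}")
print(f"risk ratio       = {risk_ratio:.2f}")
```

Keeping the unrounded risks until the final division avoids the rounding error that appears when the two risks are rounded first.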
2. To check whether this risk ratio is strong enough to support a cause-effect relationship between cholera deaths and water supply, the p-value of the risk ratio must be calculated. I used MedCalc software (MedCalc, 2014) to do this.
The null hypothesis assumes that the obtained RR does not differ from the neutral RR of 1.0. The alternative hypothesis suggests that the obtained RR does differ from 1.0. According to the free online calculator, the z statistic (the number of standard errors separating the observed RR from the null value) is 7 and the p-value is <0.0001. Thus, a difference this large would be very unlikely (probability below 0.01%) if the true RR were 1.0. Moreover, the 95% confidence interval of the RR is 3.4 to 8.7, which does not include 1.0, so at the 95% confidence level the RR is not compatible with the null hypothesis. The statistical calculations strongly suggest that there is a cause-effect relationship between cholera deaths and the type of water supply.
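For readers who want to reproduce a test of this kind without the calculator, the following Python sketch uses the common log-scale method for a risk ratio: the standard error of ln(RR) is sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d)), and z is ln(RR) divided by that standard error. Whether this matches the calculator's computation in every detail is an assumption, so the confidence limits it prints may differ slightly from those quoted above. The function name `risk_ratio_test` is introduced only for this example, and the group totals are taken as in the RR calculation.

```python
import math


def risk_ratio_test(a, n_exposed, c, n_control, z_crit=1.96):
    """Return (RR, z, two-sided p, CI lower, CI upper) using the log method.

    a, c: numbers with the outcome; n_exposed, n_control: group totals.
    """
    rr = (a / n_exposed) / (c / n_control)
    se_log = math.sqrt(1 / a - 1 / n_exposed + 1 / c - 1 / n_control)
    z = math.log(rr) / se_log
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    lower = math.exp(math.log(rr) - z_crit * se_log)
    upper = math.exp(math.log(rr) + z_crit * se_log)
    return rr, z, p, lower, upper


rr, z, p, lower, upper = risk_ratio_test(844, 167_654, 18, 19_133)
print(f"RR = {rr:.2f}, z = {z:.1f}, p = {p:.2g}")
print(f"95% CI = ({lower:.1f}, {upper:.1f})")
```

The interval is built on the logarithmic scale because the sampling distribution of ln(RR) is closer to normal than that of RR itself, which is why the limits are not symmetric around the RR.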
3. Evidently, the story of cholera in London in 1854 is a beautiful example of biological plausibility. However, this is easy to see only with modern knowledge of microbiology and infection. In the nineteenth century, it must have taken remarkable powers of observation for Dr. John Snow to find a link between water suppliers and lethal diarrhea.
4. According to the temporality principle, the effect occurs after the cause. Vibrio cholerae is known to contaminate water. It must therefore be demonstrated that cholera develops after exposure to the water. For example, suppose one family member drank the water and contracted cholera. It would then need to be shown that another family member develops the disease only after drinking the same water.
5. Biological plausibility is supported by the fact that cholera deaths decreased after the water supply was improved. According to the plausibility principle, exposure to a biological factor produces a certain effect. In the 1854 cholera outbreak, the cause (polluted water) was eliminated and the effect (cholera) was reduced. This scenario is a reverse example of plausibility.