PAPP102 - S08: Internal evaluation of demographic data

A typology of kinds of error (continued)

We can arrange the four classes of data collection error described on the previous page into a 2 x 2 matrix, giving a typology of errors.

The four internal cells show how the four errors can be classified; the four paragraphs after the table illustrate each cell in turn.

	Random error	Systematic error	Undermines
Measurement error	(Lack of) reliability	Measurement or reporting bias	Internal validity
Sampling error	Statistical error	Sampling and selection bias	External validity
Undermines	Precision	Accuracy	Overall validity

Random measurement error ((lack of) reliability): The person performing the measurement might make random errors in measuring the subject. For example, they might misread the measurement off the tape-measure or other instrument being employed.

Systematic measurement error (measurement or reporting bias): The person performing the measurement might make the same error repeatedly in measuring subjects, for example, by not insisting that the subject removes their shoes before being measured.

Random sampling error (statistical error): The subjects chosen for measurement may, purely by chance, be taller or shorter than the population of adult males in the country.

Systematic sampling error (sampling and selection bias): Either because of the way they were chosen for inclusion in the study, or because of the characteristics of those who agreed to be measured, the subjects in your sample might not fairly reflect the heights of adult males in the country.

From the last column of the table, we can see that the overall validity (that is, the ability to draw correct inferences from our data) is affected by each of these kinds of error. Thus, validity may be compromised internally because, within the sample drawn, random or systematic errors in performing the measurement may result in the data not adequately reflecting the actual properties (in the example, height) of the subjects. Equally, validity might be compromised externally because the subjects selected for measurement may be significantly different (either as a result of random or systematic error) from the underlying population.
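The difference between the two threats to external validity can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population of heights, its mean and spread, the 170 cm cut-off used to mimic selection bias, and all function names are hypothetical and do not come from the course material.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: 20,000 adult male heights in cm (assumed normal).
population = [rng.gauss(172.0, 7.0) for _ in range(20_000)]
population_mean = statistics.fmean(population)

# Assumed selection mechanism: only men taller than 170 cm can be recruited.
eligible = [h for h in population if h > 170.0]

def random_sample_mean(n):
    """Mean height of a simple random sample of size n from the population."""
    return statistics.fmean(rng.sample(population, n))

def selected_sample_mean(n):
    """Mean height of a sample of size n drawn only from the eligible group."""
    return statistics.fmean(rng.sample(eligible, n))

# Repeat each sampling scheme 200 times with samples of 100 men.
random_means = [random_sample_mean(100) for _ in range(200)]
selected_means = [selected_sample_mean(100) for _ in range(200)]

# Average error of each scheme relative to the population mean.
random_error = statistics.fmean(random_means) - population_mean
selection_error = statistics.fmean(selected_means) - population_mean
# Spread of the random-sample means around their own average.
random_spread = statistics.stdev(random_means)

print(f"population mean:               {population_mean:.2f} cm")
print(f"random samples, average error: {random_error:+.2f} cm")
print(f"random samples, spread:        {random_spread:.2f} cm")
print(f"selected samples, avg error:   {selection_error:+.2f} cm")
```

Under these assumptions, individual random samples miss the population mean by chance, but their errors average out close to zero across repeated samples. The selected samples, by contrast, are too tall every time, and drawing more or larger samples would not remove that error.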

Another way of looking at the ability of the sample to produce valid inferences and generalisations is to consider the role of random and systematic error in affecting the estimates. From the last row of the table, we can observe that random error affects the precision of the estimates derived. In terms of measurement error, this may result from each adult male in the sample being measured only once, so that the effects of randomly misreading the height are not picked up by comparison with a second measurement taken at the same time. Systematic error (what is frequently termed ‘bias’), on the other hand, affects the accuracy of the data collected. For example, not requiring subjects to remove their shoes, or using a stretched tape-measure, would result in the height measured not being an accurate measure of the subject’s height.
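The contrast between precision and accuracy can be sketched numerically. The following simulation is a minimal illustration, not a prescribed method: the true height, the size of the random misreading, the 2.5 cm shoe bias, and all names are assumptions chosen for the example.

```python
import random
import statistics

TRUE_HEIGHT_CM = 175.0  # assumed true height of a single subject
NOISE_SD_CM = 1.0       # assumed size of the random misreading
SHOE_BIAS_CM = 2.5      # assumed constant error from measuring with shoes on
N_SESSIONS = 2000       # number of simulated measurement sessions

rng = random.Random(42)  # fixed seed so the run is repeatable

def reading(bias_cm=0.0):
    """One reading: true height plus constant bias plus random misreading."""
    return TRUE_HEIGHT_CM + bias_cm + rng.gauss(0.0, NOISE_SD_CM)

def summarise(values):
    """Return (mean error relative to the true height, standard deviation)."""
    return statistics.fmean(values) - TRUE_HEIGHT_CM, statistics.stdev(values)

# Scenario 1: the subject is measured once per session, shoes off.
single = [reading() for _ in range(N_SESSIONS)]
# Scenario 2: two readings per session are averaged, shoes off.
paired = [(reading() + reading()) / 2 for _ in range(N_SESSIONS)]
# Scenario 3: two readings per session are averaged, shoes on.
shoes_on = [(reading(SHOE_BIAS_CM) + reading(SHOE_BIAS_CM)) / 2
            for _ in range(N_SESSIONS)]

single_err, single_sd = summarise(single)
paired_err, paired_sd = summarise(paired)
shoes_err, shoes_sd = summarise(shoes_on)

for label, err, sd in [("single reading", single_err, single_sd),
                       ("mean of two", paired_err, paired_sd),
                       ("mean of two, shoes on", shoes_err, shoes_sd)]:
    print(f"{label:22s} mean error {err:+.2f} cm, spread {sd:.2f} cm")
```

In this sketch, averaging a second reading narrows the spread (better precision) while the mean error stays near zero. With shoes on, the mean error stays close to the assumed 2.5 cm however many readings are averaged (poor accuracy).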

The sixth part of this module examines the kinds of errors that might be introduced into the data after it has been collected. The final part suggests a variety of diagnostic investigations and techniques that might be applied to assess the quality of demographic data.