Using DHS Data (cont.)
Recode files
You will have noticed that all the file types described on the previous page are "recode files". The name refers to the fact that the data collected in the field is recoded from its original hierarchical format into the standardised flat datasets available for analysis.
The data is recoded for several reasons:
- A standardised format allows easy comparison between countries. For example, in every individual recode file, v201 is always the number of children ever born to the woman interviewed.
- Dates for key events need to be imputed when they are missing or invalid. Providing the imputed dates in the datasets means that researchers do not need to create their own imputation schemes, and results should remain consistent.
- Variables may need to be combined into a form suitable for analysis. For example, a question may be asked multiple times of different respondents, and these replies are combined for the recode files.
- Anthropometric indices are calculated from height and weight data and included as variables in the recode files.
Source: ICF International 2013.
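To make the first reason concrete, here is a minimal sketch of how a standardised variable name lets the same code run on data from any country. The two record lists are made-up stand-ins for rows of individual recode files, the function name is hypothetical, and only the meaning of v201 is taken from the text above. The averages are unweighted, so this illustrates the idea rather than a proper survey estimate.

```python
from statistics import mean

# Made-up records standing in for rows of two countries' individual
# recode files. Only the variable name v201 comes from the text.
country_a = [{"v201": 2}, {"v201": 4}, {"v201": 0}]
country_b = [{"v201": 1}, {"v201": 3}, {"v201": 5}]


def mean_children_ever_born(records):
    """Unweighted average of v201 over the records that report it."""
    values = [r["v201"] for r in records if r.get("v201") is not None]
    return mean(values)


# The same function works for both countries because v201 has the
# same meaning in every individual recode file.
for name, records in [("Country A", country_a), ("Country B", country_b)]:
    print(name, mean_children_ever_born(records))
```

Without the standard names, each country would need its own lookup of which column holds the number of children ever born.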
When you begin to analyse a dataset, it is vital to familiarise yourself with both the DHS recode manual and the original questionnaire from which the data comes. Both are available on the Measure DHS website. On opening the dataset you will see thousands of variables with brief, often confusing names and labels, and the recode manual is essential for understanding exactly what each variable is. The recode manual on its own, however, may not be enough to understand the dataset fully. Questionnaires may differ from country to country (although they contain essentially the same information), so total standardisation is not possible. Each dataset therefore consists of both standard sections and country-specific sections, and the original questionnaires are essential for understanding the country-specific sections.
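One practical way to start finding your way around the variables is to search their labels by keyword before turning to the manual. The sketch below assumes you already have a mapping from variable names to labels. The three entries are an illustrative excerpt only, the function name is hypothetical, and any label should be checked against the recode manual and the questionnaire.

```python
# Illustrative excerpt of a variable-name -> label mapping. A real
# dataset has thousands of entries; check labels against the manual.
labels = {
    "v012": "Respondent's current age",
    "v025": "Type of place of residence",
    "v201": "Total children ever born",
}


def find_variables(labels, keyword):
    """Return the variables whose label contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    return {
        name: label
        for name, label in labels.items()
        if keyword in label.lower()
    }


print(find_variables(labels, "children"))
```

If the dataset is in Stata format and pandas is installed, a mapping of this kind can probably be obtained with `pandas.read_stata(path, iterator=True).variable_labels()`, where `path` is the file you downloaded. A label search only narrows the candidates, so it does not replace reading the manual and questionnaire.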