Gauging quality of ‘age-data’: Evidence from Population Census and Survey
Age statistics are an integral component of population data. Most socio-demographic characteristics of a population, such as literacy, marital and socio-economic status, fertility and mortality, vary with age structure. Age is thus a pivotal socio-demographic variable, central to population censuses and surveys. The age-sex structure of a population is a critical research domain in many demographic and epidemiological studies. Beyond demographic and health concerns, age-related data are required for age-specific analysis of several measures that serve planning, technical, scientific and commercial purposes [1].
The quality of age data is affected by a variety of discrepancies and misstatements in both censuses and surveys. Age data are primarily affected by preference for, or avoidance of, particular ages and/or digits, driven by the socio-cultural norms prevalent in a society. Such discrepancies can result in misclassification bias and erroneous demographic and health assessments. This is a recurring issue that impedes proper planning and decision-making, especially in developing countries [2]. It is therefore important to understand the extent of age misreporting in Indian census and survey data.
Prevailing scenario
The age data collected in censuses or demographic and health surveys suffer from various irregularities: digit preference or aversion, deliberate misstatement, ignorance of one's correct age, and misunderstanding of the question. Digit preference is one of the most frequently encountered human biases in age reporting, wherein individuals tend to report their age incorrectly, favouring ages ending in the digits ‘0’ or ‘5’. This leads to heaping or stacking of age observations at certain ‘preferred’ ages. Often during a census or household survey, the head of the household reports approximate ages for all other members and does not place much emphasis on reporting correct ages, leading to an inaccurate population age distribution.
Indices to assess quality of age-data
Whipple’s Index (WI), Myers’ Index (MI) and the UN Age-sex Accuracy (UNASA) Index, also known as the UN Joint Score (UNJS), are three commonly used indices for measuring the accuracy of reported age data. While Whipple’s index assesses digit preference for ages ending in the digits 0 and 5, Myers’ blended index measures the extent of digit preference for ages ending in each of the digits 0 to 9. Together with the UNJS, which assesses the accuracy of age data in 5-year groups, these two indices comprehensively capture the extent of age misreporting in demographic data.
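As a rough illustration of how the two digit-preference indices are computed, the Python sketch below implements one common textbook formulation of each. The function names, the input format (a mapping from single-year age to population count) and the age ranges (23 to 62 for Whipple’s index, 10 to 89 for Myers’ index) are assumptions made for this example and are not taken from the studies cited here; published implementations differ in details such as the age range used and whether the Myers’ deviations are halved.

```python
from typing import Mapping


def whipple_index(pop_by_age: Mapping[int, float]) -> float:
    """Whipple's index over the conventional age range 23-62.

    Compares the population reported at ages ending in 0 or 5 with
    one-fifth of the total population in the same age range.
    """
    ages = range(23, 63)  # ages 23..62 inclusive (assumed convention)
    total = sum(pop_by_age.get(a, 0) for a in ages)
    if total == 0:
        raise ValueError("no population reported in ages 23-62")
    heaped = sum(pop_by_age.get(a, 0) for a in ages if a % 5 == 0)
    return 100.0 * heaped / (total / 5.0)


def myers_index(pop_by_age: Mapping[int, float],
                min_age: int = 10, max_age: int = 89) -> float:
    """Myers' blended index for terminal digits 0-9.

    For each digit d, two sums are taken over ages ending in d: one
    starting at min_age and one starting ten years later. They are
    blended with weights (d + 1) and (9 - d). The index is half the
    sum of absolute deviations of each digit's share from 10 percent.
    """
    blended = []
    for d in range(10):
        s1 = sum(pop_by_age.get(a, 0)
                 for a in range(min_age + d, max_age + 1, 10))
        s2 = sum(pop_by_age.get(a, 0)
                 for a in range(min_age + 10 + d, max_age + 1, 10))
        blended.append((d + 1) * s1 + (9 - d) * s2)
    total = sum(blended)
    if total == 0:
        raise ValueError("no population reported in the chosen age range")
    shares = [100.0 * b / total for b in blended]
    return 0.5 * sum(abs(s - 10.0) for s in shares)


if __name__ == "__main__":
    # Hypothetical toy data: 100 people at every age, plus 50 extra
    # at each age ending in 0 or 5 to mimic heaping.
    toy = {a: 100 + (50 if a % 5 == 0 else 0) for a in range(0, 100)}
    print("Whipple's index:", round(whipple_index(toy), 1))
    print("Myers' index:", round(myers_index(toy), 1))
```

Under this formulation, Whipple’s index equals 100 when ages ending in 0 and 5 are reported no more often than other ages, and 500 when every reported age ends in 0 or 5. Myers’ index, as computed here, is close to 0 when no terminal digit is favoured and reaches 90 when all ages are reported with the same terminal digit.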
Key observations: Census and NFHS-4
Overall, at the national level, age reporting is more accurate in Census-2011 data than in NFHS-4 data (as indicated by the UN Joint Score). Prominent variations exist among states in the collection and reporting of age data, with certain states and UTs (Lakshadweep, Kerala, Mizoram, Sikkim and Himachal Pradesh) reporting relatively more accurate age data than others (Table below). As for sex differences, males reported age more accurately than females in states with ‘good quality’ age data. In other states, both males and females reported age inaccurately.
Table: Whipple’s (WI), Myers’ (MI) Index by sex and UN Joint Score (UNJS) for India and States/UTs for Census 2011 and NFHS-4 (2015–16) data
Recommendations and way forward
The age data in censuses and surveys require critical intervention to restore their quality and to facilitate effective planning and implementation of government schemes and programmes. To improve the quality of age data, it is paramount to improve the data collection process and the quality of respondent-reported information. The following measures should be considered for obtaining good quality age data:
· Collection of date of birth along with age, as it reduces digit preference and age heaping [4]
· Recruitment of enumerators with the required educational qualifications, followed by high-quality training, to ensure good quality of collected data.
· Use of ‘local calendars’ relating to local events and festivals so as to facilitate better recording of age information
· Regular awareness-raising by the government at the ground level to motivate people to register birth details at the hospital/PHC itself [4].
· Assessing, quantifying and reporting errors in the collection of age data, so that data users are aware of the utility and fitness of the data and a data quality culture is maintained.
References:
1. Office of the Registrar General and Census Commissioner, 2001. Census of India 2001. Age structure and marital status. (Retrieved through: https://censusindia.gov.in/census_and_you/age_structure_and_marital_status.aspx)
2. Agrawal, G. and Khanduja, P., 2015. Influence of literacy on India’s tendency for age misreporting: evidence from Census 2011. Journal of Population and Social Studies, 23(1), pp.47–56.
3. Jayachandran AA., Stover J., and Gupta YP., 2020. Age Reporting in Indian Census: an exploration into quality of Census data.
4. Yadav, A., Vishwakarma, M. and Chauhan, S., 2020. The quality of age data: Comparison between two recent Indian censuses 2001–2011. Clinical Epidemiology and Global Health, 8(2), pp.371–376.
Written by: ICMR-NIMS NDQF Team
Disclaimer: This blog solely provides general information related to data quality. The above expressed views are purely provided on the basis of studied relevant literature. These views have no relation to ICMR-NIMS, New Delhi or any other institution.