How reliable is crowdsourced data on diets?

From: CGIAR Initiative on Digital Innovation
Published on: 13.06.24
Impact Area: Nutrition, health & food security

A study in Rwanda shows that it is possible to collect population-level and near real-time data on people’s diets via mobile phone survey, at nearly the same level of accuracy as traditional enumerated survey techniques. Top photo credit: IFPRI.

Read the study, ‘Validating high frequency deployment of the Diet Quality Questionnaire’, here. This work was funded by the CGIAR Initiative on Digital Innovation and the Rockefeller Foundation.

Ask anyone how healthily they eat, and you may have cause to doubt what they say. People may not remember well, may misunderstand the question, or may tell you what they think you want to hear. This poses a major dilemma for researchers trying to gather data on diet quality across a population, data that decision-makers need in order to take action.

In 2022, a team of CGIAR researchers developed a method to use mobile phones to survey individuals in Rwanda, applying the Diet Quality Questionnaire (DQQ), which uses simple yes/no questions to identify which food groups are eaten. The answers are used to generate indicators of diet quality. Over 52 weeks, they created a dataset of over 80,000 entries at a fraction of the cost it would take to send trained staff, or enumerators, to conduct the surveys in person.

Because of doubts about self-reported data collected over mobile phone, where participation was motivated by a small monetary reward, as well as potential bias in enumerator surveys, the next step was to test how reliable the data are.

Learn more about this work in our webinar.

The team identified a sample of 300 people who had previously participated in the DQQ survey. As a baseline, they sent trained enumerators to observe and weigh all the food eaten in the household that day, and these observations were used to calculate the DQQ indicators. On the second day, the group was split into two halves: one received the DQQ survey via mobile phone and the other via a trained enumerator. In both cases, respondents reported their food consumption on the day they had been observed.

To understand how the method of survey delivery affects the results, the three data collection techniques were compared against each other. The mobile phone survey results agreed with the observed metrics for 88% of the DQQ questions, compared to 94% for the enumerator-administered survey. The source of inaccuracy in the mobile phone survey was twice as likely to be false positives (where respondents incorrectly report consumption). Using a standardized technique to filter out low-quality responses, for example those from respondents who answered with a low level of agreement, it was possible to raise the accuracy of the mobile phone survey to 90% and that of the enumerator survey to 95%.
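To make the comparison concrete, here is a minimal sketch of how agreement between observed and self-reported answers could be scored. It is an illustration only, not the study's actual method: the function name `compare_responses` and the sample answers are hypothetical, and each answer is assumed to be a simple yes/no (True/False) for one DQQ question.

```python
from typing import Dict, List


def compare_responses(observed: List[bool], reported: List[bool]) -> Dict[str, float]:
    """Score reported yes/no answers against observed consumption.

    Hypothetical helper: assumes one True/False value per DQQ question,
    in the same order in both lists.
    """
    if len(observed) != len(reported):
        raise ValueError("observed and reported must have the same length")
    if not observed:
        raise ValueError("at least one question is required")

    agree = false_positives = false_negatives = 0
    for obs, rep in zip(observed, reported):
        if obs == rep:
            agree += 1
        elif rep:
            # Reported eating a food group that was not observed.
            false_positives += 1
        else:
            # Did not report a food group that was observed.
            false_negatives += 1

    return {
        "agreement": agree / len(observed),
        "false_positives": false_positives,
        "false_negatives": false_negatives,
    }


# Made-up answers for eight questions (not data from the study).
observed = [True, False, True, False, False, True, False, True]
reported = [True, True, True, False, False, True, True, False]
result = compare_responses(observed, reported)
print(result)
```

A respondent-level agreement score like this one could also serve as a filter, for example by dropping respondents whose agreement falls below a chosen threshold, though the threshold and the filtering technique used in the study are not described here.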

“We were pleasantly surprised with how robust the self-reported data was,” said Rhys Manners, lead author of the study and data scientist with the International Institute of Tropical Agriculture (IITA). “The study demonstrates that self-reported data collection is an agile and rapidly deployable complement to enumerated data collection efforts. However, we need to continue to improve the system, making it more usable and accessible, and improve data quality indicators to flag low-quality responses.”

Younger people were slightly more likely than older generations to provide accurate results via the mobile survey, while the reverse was true for the enumerator survey. Wealthier people also responded slightly better to enumerator surveys.

The study showed that enumerator and mobile phone surveys each face potential, though different, issues with bias and inaccuracy, but that both methods can generally be trusted to provide reliable results. The big differentiator between the two is cost: at just US $0.70 per mobile survey, including an economic incentive, this method costs just 4% of what enumerator-administered surveys cost when deployed alone. Costs are more similar if the enumerated DQQ is done as part of wider surveying (e.g., the Gallup World Poll).
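As a rough back-of-the-envelope check on these figures, the sketch below works out what a standalone enumerator-administered survey would cost if US $0.70 is 4% of that amount. The function name `implied_enumerator_cost` is hypothetical, and the result is only an implication of the two rounded figures quoted above, not a cost reported by the study.

```python
def implied_enumerator_cost(mobile_cost_usd: float, cost_share: float) -> float:
    """Return the enumerator cost implied by a mobile cost and its share.

    Hypothetical helper: cost_share is the mobile cost expressed as a
    fraction of the enumerator cost (0.04 for 4%).
    """
    if not 0 < cost_share <= 1:
        raise ValueError("cost_share must be in the range (0, 1]")
    return mobile_cost_usd / cost_share


# Figures quoted in the article: US $0.70 per mobile survey, 4% of the
# cost of an enumerator-administered survey deployed alone.
cost = implied_enumerator_cost(0.70, 0.04)
print(f"Implied enumerator cost per survey: about US ${cost:.2f}")
```

Under these assumptions the implied figure is roughly US $17.50 per enumerator-administered survey. Because both inputs are rounded, this should be read as an order-of-magnitude estimate.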

Where solutions exist, crowdsourcing offers real potential for large-scale and near-real-time data collection, filling essential gaps for researchers and policymakers. Experiences from a project to replicate this process in Guatemala also highlight the importance of human-centered design to ensure that survey methods fit the local context.