Food data used to find cause of outbreak

Data on what we like to buy in supermarkets may help identify the source of food-related illness outbreaks.
11 July 2014




Computer scientists have shown that food distribution data collected by supermarkets can be used to rapidly identify the source of an outbreak, saving lives and money.

Every year, millions of people end up locked to lavatory seats for longer than they'd like owing to something they ate. The global health price tag associated with these episodes exceeds US$9 billion, and up to US$75 billion of wholesome food is thrown away needlessly in efforts to stem outbreaks.

Tracking down the sources of these food-borne illnesses, so that culprit food items can be removed from the shelves, is extremely difficult. It usually involves public health officials spending weeks carrying out detailed interviews with victims and conducting laboratory tests to try to isolate the bugs responsible.

Now, computer scientists at IBM's public health research department in San Jose, California, have discovered a way to use food sales information collected by supermarkets, together with reports of disease outbreaks, to pinpoint the sources of food poisoning much more quickly.

The system takes the huge swathes of data logged by retailers about what they are selling, and where, and cross-references this with reports of disease outbreaks. With fewer than 10 outbreak reports logged, the team found, the food source responsible could be narrowed down, with high certainty, to a shopping basket's worth of possibilities.
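To make the cross-referencing idea concrete, here is a minimal sketch in Python. It is not the IBM team's code: the products, regions, sales figures and the scoring rule (each case is assumed to fall in a region with probability proportional to that product's sales there) are all invented for illustration.

```python
import math

# Hypothetical sales data: units sold per product in each region (invented numbers).
SALES = {
    "sprouts":  {"north": 900, "south": 50,  "east": 50},
    "lettuce":  {"north": 300, "south": 400, "east": 300},
    "sausages": {"north": 100, "south": 100, "east": 800},
}

def log_likelihood(region_sales, case_regions, smoothing=1.0):
    """Log-probability of the case locations, assuming cases fall in
    proportion to where the product was sold (an assumption of this sketch)."""
    # Smoothing keeps a region with zero sales from producing log(0).
    total = sum(region_sales.values()) + smoothing * len(region_sales)
    return sum(
        math.log((region_sales[region] + smoothing) / total)
        for region in case_regions
    )

def rank_products(sales, case_regions):
    """Order products from most to least consistent with the reported cases."""
    scores = {
        product: log_likelihood(region_sales, case_regions)
        for product, region_sales in sales.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Five hypothetical illness reports, each tagged with the region it came from.
cases = ["north", "north", "south", "north", "north"]
ranking = rank_products(SALES, cases)
print(ranking)
```

A product sold mostly where people fell ill rises to the top of the list, while one sold mostly elsewhere sinks, which is how a handful of reports can shrink the list of suspects.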

Given how relatively easy it is to do what the IBM team have done, it's surprising that no one else had thought to marshal the data this way.

"It's not rocket science," admits project manager James Kaufmann. But the subtlety of the system lies in the fact that sales of different foodstuffs vary enough from one region to another for the process to pick up on the relationship between consumption and an outbreak. So far the team have tested their system, which they describe in a paper in PLoS Computational Biology, using real retail data from Germany together with fictitious outbreak events, to prove that it works.
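The kind of test described here, with fictitious outbreaks drawn against known sales patterns, can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the products, their regional sales shares, the scoring rule and the trial counts are invented, and this is not the procedure from the team's paper.

```python
import math
import random

# Hypothetical share of each product's sales by region (invented for illustration).
SHARES = {
    "sprouts":  {"north": 0.70, "south": 0.15, "east": 0.15},
    "lettuce":  {"north": 0.34, "south": 0.33, "east": 0.33},
    "sausages": {"north": 0.15, "south": 0.15, "east": 0.70},
}

def best_match(case_regions):
    """Return the product whose sales shares best explain the case locations."""
    def score(product):
        return sum(math.log(SHARES[product][region]) for region in case_regions)
    return max(SHARES, key=score)

def hit_rate(true_source, n_cases, trials=500, seed=0):
    """Fraction of simulated outbreaks in which the true source ranks first."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    regions = list(SHARES[true_source])
    weights = [SHARES[true_source][region] for region in regions]
    hits = 0
    for _ in range(trials):
        # A fictitious outbreak: cases land where the contaminated product sells.
        cases = rng.choices(regions, weights=weights, k=n_cases)
        if best_match(cases) == true_source:
            hits += 1
    return hits / trials

for n in (2, 10, 50):
    print(n, hit_rate("sprouts", n))
```

In this toy setup the true source is identified more reliably as cases accumulate, and the more the regional sales patterns of the products differ, the fewer cases are needed.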

To make it work comprehensively will require the collaboration of retailers across the board. So will the industry buy in? It looks likely. "The big retailers are very positive about this," says Kaufmann. "They are as much the victims as the end-consumers, so anything that helps them to spot problem foodstuffs early and minimise their costs is good for them."

