Throughout the project, meat from the different farms will be presented to tasting panels. The characteristics of the meat are also determined in the laboratory. Spoolder: “We carry out an extensive chemical and physical analysis of the meat. We want to link this to the origin of the meat. For example, we’re looking for isotopes that show whether a pig ate Spanish or Polish grass.”
In general, many different types of data are collected from the various European countries: assessments from animal welfare questionnaires that researchers complete on farms, data from the taste panels, and data from laboratory studies. “It must also be possible to connect all of this data: after all, you want to check whether the meat from a German organic pig farm has a different taste and composition than, for example, conventional pork.”
All of this data must also comply with the General Data Protection Regulation (GDPR), which means that farmers’ personal data may not be visible. Spoolder: “We try to guarantee anonymity as much as possible. Each farmer is given a number and a country name, but only the researchers in the country in question know which farmer is behind this code.”
WUR coordinates the project and is also building the data warehouse where all the data will be collected. How do you tackle that? Wouter Hoenderdaal, database developer at Wageningen Food Safety Research, is working on this. Hoenderdaal: “The project is still in the start-up phase, but the process that precedes data collection is at least as important. It’s important that everyone measures the same thing and submits that data in the same way. That’s why we send all researchers a specific format in which to enter their data.”
In the meat quality project, it is also important that the data can be connected. Hoenderdaal: “Part of the animal goes to the laboratory, another part of the same animal goes to the tasting panels. So we need to set up an airtight coding system that lets you trace where a meat sample came from: which animal, which farm, which region and which country?”

Two parts

The data warehouse consists of two parts, plus a kind of portal. The latter is a file system into which the researchers themselves can upload their raw data; they only get access rights to their own folder. Hoenderdaal: “All files are also password protected, so user X can only get into their own folder and can only read their own files there.”
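The article does not specify what the “airtight coding system” Hoenderdaal mentions actually looks like. As a purely illustrative sketch in Python (the language the team uses for its loading scripts), a sample code could pack country, farm, animal and sample number into one string that can be parsed back to its origin; the format below is invented for illustration:

```python
# Hypothetical sample-coding sketch: the real project's format is not public.
# A code such as 'DE-F0042-A013-S07' traces a sample back to country, farm,
# animal and sample number.

def build_sample_code(country: str, farm: int, animal: int, sample: int) -> str:
    """Compose a traceable sample code, e.g. 'DE-F0042-A013-S07'."""
    return f"{country}-F{farm:04d}-A{animal:03d}-S{sample:02d}"

def parse_sample_code(code: str) -> dict:
    """Recover country, farm, animal and sample number from a code."""
    country, farm, animal, sample = code.split("-")
    return {
        "country": country,
        "farm": int(farm[1:]),
        "animal": int(animal[1:]),
        "sample": int(sample[1:]),
    }

code = build_sample_code("DE", 42, 13, 7)   # 'DE-F0042-A013-S07'
origin = parse_sample_code(code)
```

Because the code is deterministic in both directions, a laboratory result and a taste-panel score carrying the same string can always be joined back to the same animal.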
The actual data warehouse consists of a development database and a production database. Hoenderdaal: “We build and test in the development database, and once we find that everything is correct there, all the data is pushed to the production database.” The researchers have no access to the development and production databases, only to the file system. This prevents the database from being polluted with unusable data or, worse, partially erased by an inattentive researcher. “We built this database in Postgres, an open-source relational database that stores the data in a structured way.”
The transfer of data from the file system to the development database is automated. “We write scripts in Python so that the researchers’ files automatically end up in the right place in the database. The scripts can’t prevent a faulty file from being uploaded to the file system, but they can detect it and stop it from entering the database. In this way we prevent incorrect data from being loaded. We build everything to be foolproof; after all, not all researchers are equally tech-savvy.”
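The team’s actual scripts are not shown in the article, but the idea — checking an uploaded file against the agreed format before it may enter the database — can be sketched in a few lines of Python. The column names and rules below are invented for illustration:

```python
import csv
import io

# Illustrative required format; the project's real column list is not public.
REQUIRED_COLUMNS = {"sample_code", "ph", "drip_loss_pct"}

def validate_upload(text: str) -> list[str]:
    """Return a list of problems found in a CSV upload.

    An empty list means the file may be loaded into the database;
    a non-empty list means the file is rejected and reported back.
    """
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    errors = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            float(row["ph"])
        except ValueError:
            errors.append(f"line {line_no}: ph is not numeric")
    return errors

good = "sample_code,ph,drip_loss_pct\nDE-F0042-A013-S07,5.6,2.1\n"
bad = "sample_code,ph,drip_loss_pct\nDE-F0042-A013-S07,abc,2.1\n"
```

A loader built this way never silently fixes a file: it either accepts it whole or reports exactly which lines a researcher must correct, which matches the “foolproof” goal Hoenderdaal describes.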
Researchers must be able to search the production database so they can compare their own data with that of others, yet they cannot access this database. How do Hoenderdaal and his colleagues solve this? “We hope that they will mostly want standard datasets that combine certain data. We can prepare those in advance in a secure folder. If a researcher has a very specific question, we will compile a custom dataset for them.”
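As a hedged illustration of such a “standard dataset”, the following Python sketch joins hypothetical lab results and taste-panel scores on a shared sample code and writes the combined rows as CSV text, ready to be placed in a researcher’s secured folder. All names and values are invented:

```python
import csv
import io

# Invented example data, keyed on the shared sample code.
lab = {"DE-F0042-A013-S07": {"ph": 5.6}}
panel = {"DE-F0042-A013-S07": {"tenderness": 7}}

# Join the two sources on sample code and render one combined CSV.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["sample_code", "ph", "tenderness"])
for code in sorted(lab.keys() & panel.keys()):
    writer.writerow([code, lab[code]["ph"], panel[code]["tenderness"]])

combined = buf.getvalue()  # this text would be dropped into the secure folder
```

Preparing such exports centrally keeps the researchers out of the production database entirely: they only ever read finished files from their own folder.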
What are the dangers of this type of international data exchange? Hoenderdaal: “Language can cause problems. The working language is English, which means that translation from a researcher’s mother tongue into English can introduce errors. The researchers have now built in a check themselves: they first translate an English text into German and then back again. If the back-translated English text matches the original, they know it’s okay.”
A second danger has to do with the system: a relational database like Postgres is good at storing structured data, but less suited to unstructured data such as PDFs or text snippets. Hoenderdaal: “For example, you can get structured data about a particular meat sample from the lab, but possibly also scans. After all, not everything can be captured in structured data. We still need to develop something to link this unstructured data to the structured data. So there is a lot to learn in this project.”
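One common way to link unstructured files to structured rows — not necessarily the solution the team will choose — is an attachment table that stores only each file’s location and type next to the sample code, while the PDF or scan itself stays on the file system. A sketch using an in-memory SQLite database as a stand-in for Postgres (all identifiers and paths are invented):

```python
import sqlite3

# In-memory SQLite stands in for Postgres; the schema idea is the same.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sample (
    code TEXT PRIMARY KEY,   -- traceable sample code
    ph   REAL                -- example of a structured lab value
);
CREATE TABLE attachment (
    id INTEGER PRIMARY KEY,
    sample_code TEXT REFERENCES sample(code),
    path TEXT,               -- where the PDF/scan lives on the file system
    kind TEXT                -- e.g. 'lab-scan', 'report'
);
""")
con.execute("INSERT INTO sample VALUES ('DE-F0042-A013-S07', 5.6)")
con.execute(
    "INSERT INTO attachment (sample_code, path, kind) VALUES (?, ?, ?)",
    ("DE-F0042-A013-S07", "/uploads/de/scan_013.pdf", "lab-scan"),
)

# Join structured values with their unstructured attachments.
rows = con.execute("""
    SELECT s.code, a.path
    FROM sample s
    JOIN attachment a ON a.sample_code = s.code
""").fetchall()
```

Keeping only paths and metadata in the database avoids bloating the relational store with binary blobs, while the join still ties every scan to its sample, farm and country.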
If Hans Spoolder has his way, the meat quality project will form the basis for a large European database on the origin of meat. Spoolder: “A European database of this kind already exists for wine. The company Oritain is building a database for beef and lamb. They are interested in our data on chickens and pigs.” Meat traceability is important to prevent meat fraud. Think of the horse meat scandal, but also of meat labeled as organic when it actually comes from intensive farming.
Spoolder: “Detecting meat fraud is a side track in our project. We don’t have the budget to expand it further, but eventually we can contribute to an international meat database.”