Abstract
The exhaustive automatic detection of symptoms in social media posts is made difficult by colloquial expressions, misspellings and inflected word forms. Detecting self-reported symptoms is of major importance for emerging diseases such as Covid-19. In this study, we aimed to (1) develop an algorithm based on fuzzy matching to detect symptoms in tweets, (2) establish a comprehensive list of Covid-19-related symptoms and (3) evaluate fuzzy matching for Covid-19-related symptom detection in French tweets. The Covid-19-related symptom list was built by aggregating different data sources. French Covid-19-related tweets were automatically extracted using a dedicated data broker during the first wave of the pandemic in France. The fuzzy matching parameters were fine-tuned using all symptoms from MedDRA and then evaluated on a subset of 5000 Covid-19-related tweets in French for the detection of symptoms from our Covid-19-related list. Fuzzy matching improved detection, adding 42% more correct matches with a precision of 81%.
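The abstract does not specify the matching algorithm or its tuned parameters, so the following is only a minimal sketch of what fuzzy matching of symptom terms against a tweet could look like. It assumes a sliding window over tokens and the similarity ratio from Python's standard `difflib.SequenceMatcher` as a stand-in for the authors' similarity measure. The helper names `normalize` and `fuzzy_find`, the accent-stripping step and the 0.8 threshold are illustrative assumptions, not values from the paper.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and strip accents so 'Fièvre' and 'fievre' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fuzzy_find(terms, text, threshold=0.8):
    """Return (term, matched_window, score) for token windows that resemble a term.

    threshold is an arbitrary illustrative value, not the tuned parameter
    reported in the study.
    """
    tokens = re.findall(r"\w+", normalize(text))
    hits = []
    for term in terms:
        term_norm = normalize(term)
        n = len(term_norm.split())  # compare against windows of the same token count
        for i in range(len(tokens) - n + 1):
            window = " ".join(tokens[i:i + n])
            score = SequenceMatcher(None, term_norm, window).ratio()
            if score >= threshold:
                hits.append((term, window, round(score, 2)))
    return hits


if __name__ == "__main__":
    tweet = "J'ai de la fievre et des maux de tete depuis hier"
    for hit in fuzzy_find(["fièvre", "mal de tête"], tweet):
        print(hit)
```

In this sketch, the unaccented spelling "fievre" and the inflected plural "maux de tete" are both matched to their reference terms, which exact string lookup would miss. Lowering the threshold trades precision for additional matches, which is the balance the evaluation in the abstract measures.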
| Original language | English (US) |
|---|---|
| Title of host publication | Public Health and Informatics |
| Subtitle of host publication | Proceedings of MIE 2021 |
| Publisher | IOS Press |
| Pages | 896-900 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781643681856 |
| ISBN (Print) | 9781643681849 |
| State | Published - Jul 1 2021 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2021 European Federation for Medical Informatics (EFMI) and IOS Press. All rights reserved.
Keywords
- Content analysis
- Covid-19
- Fuzzy matching
- Social media
- Symptoms