Abstract
The shortage of high-quality clinical databases restricts downstream medical diagnostics. Clinical databases are often limited to controlled, non-natural environments, are restricted by privacy constraints, and require complex scoring procedures that ultimately introduce rater bias. Social media includes massive amounts of information on subjects through streams of text, audio, and video data that are accessible and currently underutilized for medical research. In this work, we propose a method for utilizing this information by constructing databases for medical condition assessment. To this end, we have created SMDC (Social Medical Data Constructor), a utility based on medical expert requirements. Data features and non-confidential demographic information are extracted online, and labels are derived using data mining techniques. We examine the feasibility of the suggested technology with ADHD recognition from a database extracted from YouTube clips, using self-tagging as ADHD labels. The database respects privacy and copyright limitations, and no personally identifying information is collected. To validate the database, we show a high correlation between the derived model's predictions and expert labeling (r = 0.68), and compatibility between six known ADHD motor biomarker features of hyperactivity and those derived using our database. Furthermore, we extracted kinematic features from the video clips and reached ADHD recognition accuracy of 83% and 81% for females and males, respectively. The suggested technology has the potential to assess natural, real-life behavioral properties of the medical condition, and may further be of use as a pre-training phase, allowing fine-tuning on actual clinical data with minimal data requirements.
| Original language | English |
|---|---|
| Pages (from-to) | 164725-164736 |
| Number of pages | 12 |
| Journal | IEEE Access |
| Volume | 12 |
| State | Published - 2024 |
Bibliographical note
Publisher Copyright: © 2013 IEEE.
Keywords
- ADHD
- databases
- machine learning
- medical diagnosis
- social networks