<p dir="ltr">In collaboration with Medicines Discovery Catapult (MDC), we developed a comprehensive database of depression-related clinical trials, including information on the availability of additional biological samples such as blood or DNA. We systematically extracted structured and unstructured data from two major clinical trial registries: the US registry ClinicalTrials.gov (https://clinicaltrials.gov/, accessed through the Aggregated Analysis of ClinicalTrials.gov (AACT) database) and the EU Clinical Trials Register (<a href="https://www.clinicaltrialsregister.eu/" target="_blank">https://www.clinicaltrialsregister.eu/</a>). From clinical trial records and resulting publications, we extracted summary data on interventions, conditions, drugs, sample sizes, trial phase, status, start and end dates, participant demographics, and sponsors. To identify trials likely to include blood samples or genetic information, we applied a semantic similarity approach, using vector-based natural language processing to score trial records based on textual indicators. This methodology prioritised trials most likely to contain genetic information by measuring the similarity between predefined query terms and clinical trial text fields.</p>
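<p dir="ltr">The scoring step can be illustrated with a minimal sketch. It is not the pipeline used to build the database: the description above does not name the embedding model, the query terms, or the text fields, so everything below is an assumption. Simple term-count vectors stand in for learned embeddings, and the query terms, trial identifiers, and trial texts are hypothetical. The sketch only shows the general idea of ranking trial records by cosine similarity to predefined query terms.</p>

```python
import math
import re
from collections import Counter

# Hypothetical query terms; the actual predefined terms are not given in the source.
QUERY_TERMS = ["blood sample", "DNA", "genetic", "genotyping", "biobank"]


def vectorise(text):
    """Turn text into a sparse term-count vector.

    Assumption: a bag-of-words vector stands in for the learned text
    embedding that a real semantic similarity pipeline would use.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_trial(text, query_terms=QUERY_TERMS):
    """Score a trial text field by its highest similarity to any query term."""
    trial_vec = vectorise(text)
    return max(cosine(vectorise(query), trial_vec) for query in query_terms)


def rank_trials(trials):
    """Return (trial_id, score) pairs sorted from most to least similar."""
    scored = [(trial_id, score_trial(text)) for trial_id, text in trials.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up records for illustration only; not taken from either registry.
EXAMPLE_TRIALS = {
    "TRIAL-A": "Blood samples will be collected for DNA extraction and genotyping.",
    "TRIAL-B": "Participants complete a weekly mood questionnaire.",
}

if __name__ == "__main__":
    for trial_id, score in rank_trials(EXAMPLE_TRIALS):
        print(trial_id, round(score, 3))
```

<p dir="ltr">Taking the maximum over query terms means that a single strong textual indicator is enough to prioritise a trial. Exact word matching misses variants such as "sample" and "samples", which is one reason a real pipeline would use learned embeddings instead.</p>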