P-682 Predicting the Number of Oocytes Retrieved from Controlled Ovarian Hyperstimulation with Machine Learning

Authors: J Chambost, C Jacques, T Ferrand, C Hickman, P He, A Reigner, T Freour
Year of publication: 2022
Subject:
Source: Human Reproduction. 37
ISSN: 1460-2350, 0268-1161
DOI: 10.1093/humrep/deac107.631
Description:
Study question: Can machine learning predict the number of oocytes retrieved from controlled ovarian hyperstimulation (COH) using a third-party dataset without the need for a data transfer?
Summary answer: Three machine learning models were successfully trained through the Substra infrastructure to predict the number of oocytes retrieved from COH. No data transfer took place.
What is known already: COH is a critical stage of in-vitro fertilization cycles. Because ovarian response varies widely both between and within individuals, clinicians need to choose suitable and cost-effective ovarian stimulation protocols that retrieve as many mature oocytes as possible while minimizing the risk of complications such as ovarian hyperstimulation syndrome. A number of previous studies have identified factors that influence the number of oocytes retrieved during COH and built predictive models on them. Many of these studies are, however, limited in that they consider only a small number of variables in isolation.
Study design, size, duration: This study was a retrospective analysis of 14,415 cycles performed at a single centre between 2009 and 2020. The analysis was carried out by an external data analysis team using the Substra framework. Substra enabled the data analysis team to send computer code to run securely on the centre’s on-premises server. A high level of data security was thus achieved, as the data did not leave the centre at any point during the study.
Participants/materials, setting, methods: The Light Gradient Boosting Machine algorithm was used to produce three predictive models: one that directly predicted the number of oocytes retrieved, and two that predicted which of a set of bins, provided by two clinicians, the number of oocytes retrieved fell into. The resulting models were evaluated on a held-out test set. In addition, the models themselves were analyzed to identify the parameters that had the biggest impact on their predictions.
Main results and the role of chance: On average, the model that directly predicted the number of oocytes retrieved deviated from the ground truth by 3.80 oocytes. The model that predicted the first clinician’s bins deviated by 0.73 bins, whereas the model for the second clinician deviated by 0.63 bins. For all models, performance was best within the first and third quartiles of the target variable, with the models underpredicting extreme values of the target variable (no oocytes and large numbers of oocytes retrieved). Nevertheless, the erroneous predictions made for these extreme cases were still within the vicinity of the true value. Overall, all three models agreed on the importance of each feature, which was estimated using Shapley Additive Explanation (SHAP) values. The feature with the highest mean absolute SHAP value (and thus the highest importance) was serum E2 before triggering, followed by the duration of gonadotropin treatment and antral follicle count. Of the other hormonal features, baseline FSH, AMH and E2 levels were similarly important and baseline LH was least important. The treatment characteristic with the highest SHAP value was the duration of treatment, with longer periods being associated with a higher number of oocytes retrieved.
Limitations, reasons for caution: The models produced in this study were trained on a cohort from a single centre. They should thus not be used in clinical practice until trained and evaluated on a larger cohort more representative of the general population.
Wider implications of the findings: The predictive models developed may be useful in clinical practice, assisting clinicians in optimizing COH protocols for individual patients. Our work also demonstrates the promise of using the Substra framework for allowing external researchers to provide clinically relevant insights on sensitive fertility data in a fully secure, trustworthy manner.
Trial registration number: Not applicable
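The average deviations reported in the main results can be read as mean absolute errors, although the abstract does not name the metric explicitly. The following is a minimal sketch under that assumption. The bin edges, oocyte counts and predictions are hypothetical values chosen for illustration; they are not the clinicians' bins or data from the study.

```python
from bisect import bisect_left


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def to_bin(count, upper_edges):
    """Map an oocyte count to an ordinal bin index.

    upper_edges holds the inclusive upper bound of every bin except the
    last, open-ended one (hypothetical edges, not taken from the study).
    """
    return bisect_left(upper_edges, count)


# Hypothetical bins: 0-3, 4-9, 10-15, 16 or more oocytes.
BIN_UPPER_EDGES = [3, 9, 15]

# Hypothetical held-out cycles.
true_counts = [0, 5, 12, 20]
predicted_counts = [2, 6, 10, 14]   # output of a count-regression model
predicted_bins = [0, 1, 2, 2]       # output of a bin-prediction model

true_bins = [to_bin(c, BIN_UPPER_EDGES) for c in true_counts]

count_mae = mean_absolute_error(true_counts, predicted_counts)
bin_mae = mean_absolute_error(true_bins, predicted_bins)

print(f"Mean deviation in oocytes: {count_mae:.2f}")
print(f"Mean deviation in bins: {bin_mae:.2f}")
```

Treating bin indices as ordinal numbers is what makes a statement such as "deviated by 0.73 bins" meaningful: a prediction one bin away from the truth costs 1, two bins away costs 2, and so on.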
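The feature ranking described in the main results rests on the mean absolute SHAP value of each feature. The sketch below shows only that aggregation step, assuming a matrix of per-cycle SHAP values has already been computed by an explainer. The feature names and numbers are hypothetical and do not reproduce the study's results.

```python
def rank_features_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value, highest first.

    shap_rows: one list of SHAP values per cycle, one value per feature.
    Returns a list of (feature_name, mean_abs_shap) pairs.
    """
    if not shap_rows:
        raise ValueError("shap_rows must not be empty")
    n_rows = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_rows) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature names and SHAP values for three cycles.
feature_names = ["e2_before_trigger", "stimulation_days", "antral_follicle_count"]
shap_rows = [
    [2.0, -1.0, 0.5],
    [-3.0, 1.5, -0.5],
    [1.0, -0.5, 1.0],
]

ranking = rank_features_by_mean_abs_shap(shap_rows, feature_names)
for name, importance in ranking:
    print(f"{name}: {importance:.2f}")
```

Taking the absolute value before averaging matters because a feature that pushes some predictions up and others down would otherwise appear unimportant when its signed contributions cancel out.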
Database: OpenAIRE