
Statistical approaches enabling technology-specific assay interference prediction from large screening data sets

dc.contributor.author: Palmacci, Vincenzo
dc.contributor.author: Hirte, Steffen
dc.contributor.author: Hernández González, Jorge Enrique [UNESP]
dc.contributor.author: Montanari, Floriane
dc.contributor.author: Kirchmair, Johannes
dc.contributor.institution: University of Vienna
dc.contributor.institution: Universidade de São Paulo (USP)
dc.contributor.institution: Bayer AG
dc.contributor.institution: Universidade Estadual Paulista (UNESP)
dc.date.accessioned: 2025-04-29T19:33:56Z
dc.date.issued: 2024-06-01
dc.description.abstract: High-throughput screening (HTS) technologies allow the biological testing of hundreds of thousands of compounds per day. Typically, a substantial proportion of the initial hits obtained by HTS are artifacts caused by assay interference. Therefore, global and technology-specific in silico models for identifying and predicting compounds interfering with biological assays have been developed. The global models benefit from training on large screening data sets, while the specialized models benefit from training on assay technology-specific experimental data. In this work, we develop and explore strategies for generating better predictors of technology-specific assay interference by utilizing the large bioactivity data matrices global models are trained on and employing partially new compound labeling approaches to maintain the assay technology awareness of specialized models. We demonstrate the utility of the statistically derived interference labels in machine learning using fluorescence-based assay interference as a representative example. Our random forest and multi-layer perceptron classifiers showed improved performance compared to existing models, achieving Matthews correlation coefficients (MCCs) of up to 0.47 on holdout data and up to 0.45 on an external test set. These results demonstrate that accurate assay-specific interference labels can be derived from large bioactivity data matrices, enabling the development of new machine-learning models without the need for further experimental data. [en]
dc.description.affiliation: Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna
dc.description.affiliation: Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna
dc.description.affiliation: Department of Machine Learning Research, Bayer AG
dc.description.affiliation: Department of Physics, Sao Paulo State University, Rua Cristóvão Colombo 2265, São José do Rio Preto, CEP
dc.description.affiliation: Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna
dc.description.affiliationUnesp: Department of Physics, Sao Paulo State University, Rua Cristóvão Colombo 2265, São José do Rio Preto, CEP
dc.identifier: http://dx.doi.org/10.1016/j.ailsci.2024.100099
dc.identifier.citation: Artificial Intelligence in the Life Sciences, v. 5.
dc.identifier.doi: 10.1016/j.ailsci.2024.100099
dc.identifier.issn: 2667-3185
dc.identifier.scopus: 2-s2.0-85195410933
dc.identifier.uri: https://hdl.handle.net/11449/304131
dc.language.iso: eng
dc.relation.ispartof: Artificial Intelligence in the Life Sciences
dc.source: Scopus
dc.subject: Assay interfering compounds
dc.subject: Biological assays
dc.subject: Fluorescence
dc.subject: High-throughput screening
dc.subject: Machine learning
dc.title: Statistical approaches enabling technology-specific assay interference prediction from large screening data sets [en]
dc.type: Artigo (Article) [pt]
dspace.entity.type: Publication
unesp.author.orcid: 0000-0003-2667-5877 0000-0003-2667-5877[5]
unesp.campus: Universidade Estadual Paulista (UNESP), Instituto de Biociências, Letras e Ciências Exatas, São José do Rio Preto [pt]
