Natural Language Processing to Extract Information from Portuguese-Language Medical Records

dc.contributor.authorda Rocha, Naila Camila [UNESP]
dc.contributor.authorBarbosa, Abner Macola Pacheco [UNESP]
dc.contributor.authorSchnr, Yaron Oliveira [UNESP]
dc.contributor.authorMachado-Rugolo, Juliana [UNESP]
dc.contributor.authorde Andrade, Luis Gustavo Modelli [UNESP]
dc.contributor.authorCorrente, José Eduardo
dc.contributor.authorde Arruda Silveira, Liciana Vaz [UNESP]
dc.contributor.institutionUniversidade Estadual Paulista (UNESP)
dc.contributor.institutionFundação para o Desenvolvimento Médico e Hospitalar (FAMESP)
dc.date.accessioned2023-07-29T12:48:26Z
dc.date.available2023-07-29T12:48:26Z
dc.date.issued2023-01-01
dc.description.abstractStudies that use medical records are often impeded because much of the information is recorded in narrative fields. However, recent studies have used artificial intelligence to extract and process secondary health data from electronic medical records. The aim of this study was to develop a neural network that uses data from unstructured medical records to capture information regarding symptoms, diagnoses, medications, conditions, exams, and treatment. Data from 30,000 medical records of patients hospitalized in the Clinical Hospital of the Botucatu Medical School (HCFMB), São Paulo, Brazil, were obtained and used to create a corpus of 1,200 clinical texts. A natural language algorithm was used for text extraction and convolutional neural networks for pattern recognition, and the model was evaluated with goodness-of-fit indices. The results showed good accuracy, considering the complexity of the model, with an F-score of 63.9% and a precision of 72.7%. The patient condition class reached a precision of 90.3% and the medication class reached 87.5%. The proposed neural network will facilitate the detection of relationships between diseases and symptoms, as well as prevalence and incidence, in addition to the identification of clinical conditions, disease evolution, and the effects of prescribed medications.en
dc.description.affiliationDepartment of Biostatistics Institute of Biosciences Universidade Estadual Paulista (UNESP)
dc.description.affiliationMedical School Universidade Estadual Paulista (UNESP)
dc.description.affiliationHealth Technology Assessment Center (Clinical Hospital of the Botucatu Medical School)
dc.description.affiliationResearch Support Office Fundação para o Desenvolvimento Médico e Hospitalar (FAMESP)
dc.description.affiliationUnespDepartment of Biostatistics Institute of Biosciences Universidade Estadual Paulista (UNESP)
dc.description.affiliationUnespMedical School Universidade Estadual Paulista (UNESP)
dc.description.affiliationUnespHealth Technology Assessment Center (Clinical Hospital of the Botucatu Medical School)
dc.identifierhttp://dx.doi.org/10.3390/data8010011
dc.identifier.citationData, v. 8, n. 1, 2023.
dc.identifier.doi10.3390/data8010011
dc.identifier.issn2306-5729
dc.identifier.scopus2-s2.0-85146769393
dc.identifier.urihttp://hdl.handle.net/11449/246711
dc.language.isoeng
dc.relation.ispartofData
dc.sourceScopus
dc.subjectmedical records
dc.subjectnamed entity recognition
dc.subjectneural networks
dc.titleNatural Language Processing to Extract Information from Portuguese-Language Medical Recordsen
dc.typeArticle
unesp.author.orcid0000-0002-1684-2574[1]
unesp.author.orcid0000-0003-3668-8911[2]
unesp.author.orcid0000-0003-3984-4959[4]
unesp.author.orcid0000-0001-8931-5495[7]
