Comparison of the performance of multiple imputation models in filling gaps in hourly and daily meteorological series from two locations in the state of São Paulo-Brazil

dc.contributor.author: Maziero, Luana Possari
dc.contributor.author: Rodrigues, Sérgio Augusto [UNESP]
dc.contributor.author: Pai, Alexandre Dal [UNESP]
dc.contributor.author: Cremasco, Camila Pires [UNESP]
dc.contributor.author: Gabriel Filho, Luís Roberto Almeida [UNESP]
dc.contributor.institution: CEETEPS
dc.contributor.institution: Universidade Estadual Paulista (UNESP)
dc.date.accessioned: 2025-04-29T20:04:07Z
dc.date.issued: 2024-04-01
dc.description.abstract [en]: The presence of missing values (missings) in data series is a common issue that needs to be adequately addressed to ensure the validity of certain statistical methods and, in turn, to minimize biases that might affect study outcomes and conclusions. Various methods can be applied depending on the dataset characteristics and the amount of data lost. This study aimed to evaluate the performance of internal multiple imputation approaches, 'pmm' and 'midastouch', for sets of meteorological variables with daily and hourly frequencies. The first set was collected in the municipality of Botucatu, and the second in Tupã, both in São Paulo State, Brazil. These datasets comprise information on global solar radiation, wind speed, air temperature, maximum air temperature, minimum air temperature, relative air humidity, maximum relative humidity, and minimum relative humidity for the period from March 20, 2018, to March 19, 2021, gathered by the São Paulo State University - UNESP (Botucatu–SP) and the Brazilian Institute of Meteorology–INMET (Tupã–SP). Analysis of the missing values revealed that the time series from Botucatu–SP had 1.4% data loss, whereas Tupã–SP had 7%. Given the amount of missing data, imputation was performed using the 'pmm' and 'midastouch' methods, implemented through the R software. Results indicate that both procedures offer satisfactory performance in imputing values for continuous variables, with superior performance for hourly frequency data. The greater level of detail in hourly data enables a better understanding of the associated nuances and uncertainties.
dc.description.affiliation: Etec Prof. Massuyuki Kawano CEETEPS, SP
dc.description.affiliation: Departamento de Bioprocessos e Biotecnologia UNESP
dc.description.affiliation: Departamento de Engenharia de Biossistemas UNESP, SP
dc.description.affiliation: Departamento de Gestão Desenvolvimento e Tecnologia UNESP, SP
dc.description.affiliationUnesp: Departamento de Bioprocessos e Biotecnologia UNESP
dc.description.affiliationUnesp: Departamento de Engenharia de Biossistemas UNESP, SP
dc.description.affiliationUnesp: Departamento de Gestão Desenvolvimento e Tecnologia UNESP, SP
dc.format.extent: 1815-1823
dc.identifier: http://dx.doi.org/10.1007/s40808-023-01863-7
dc.identifier.citation: Modeling Earth Systems and Environment, v. 10, n. 2, p. 1815-1823, 2024.
dc.identifier.doi: 10.1007/s40808-023-01863-7
dc.identifier.issn: 2363-6211
dc.identifier.issn: 2363-6203
dc.identifier.scopus: 2-s2.0-85172990485
dc.identifier.uri: https://hdl.handle.net/11449/305752
dc.language.iso: eng
dc.relation.ispartof: Modeling Earth Systems and Environment
dc.source: Scopus
dc.subject: Database reconstruction
dc.subject: Meteorological data
dc.subject: Missing data
dc.subject: Time series
dc.title [en]: Comparison of the performance of multiple imputation models in filling gaps in hourly and daily meteorological series from two locations in the state of São Paulo-Brazil
dc.type: Article
dspace.entity.type: Publication
unesp.author.orcid: 0000-0002-9040-3182 [1]
unesp.author.orcid: 0000-0002-2091-2141 [2]
unesp.author.orcid: 0000-0002-1283-901X [3]
unesp.author.orcid: 0000-0003-2465-1361 [4]
unesp.author.orcid: 0000-0002-7269-2806 [5]
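
Illustrative note on the 'pmm' method named in the abstract: predictive mean matching fills each gap with a value actually observed in a "donor" record whose regression prediction is close to the prediction for the incomplete record. The Python sketch below is a simplified illustration only, not the authors' code, which the abstract says was run in R. It assumes a single predictor, an ordinary least-squares fit, and k = 3 donors. The function names and the sample values are hypothetical, and the step in which full multiple imputation redraws the regression coefficients for each imputation is omitted for brevity. The 'midastouch' variant is not shown.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, k=3, rng=None):
    """Fill None entries of ys by simplified predictive mean matching."""
    rng = rng or random.Random()
    observed = [i for i, y in enumerate(ys) if y is not None]
    intercept, slope = fit_line([xs[i] for i in observed],
                                [ys[i] for i in observed])
    predicted = [intercept + slope * x for x in xs]
    filled = list(ys)
    for i, y in enumerate(ys):
        if y is None:
            # Pick the k observed rows whose prediction is closest to row i's.
            donors = sorted(observed,
                            key=lambda j: abs(predicted[j] - predicted[i]))[:k]
            # Impute the *observed* value of one randomly chosen donor.
            filled[i] = ys[rng.choice(donors)]
    return filled


def multiple_impute(xs, ys, m=5, k=3, seed=0):
    """Return m completed copies of ys, one per imputation."""
    rng = random.Random(seed)
    return [pmm_impute(xs, ys, k, rng) for _ in range(m)]


# Made-up illustrative values, NOT data from the study.
temperature = [18.0, 20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 32.0]
humidity = [90.0, 85.0, None, 74.0, 70.0, None, 61.0, 55.0]
completed = multiple_impute(temperature, humidity, m=5, k=3, seed=1)
```

Because every imputed value is borrowed from an observed record, the completed series cannot contain values outside the observed range. The m completed copies would then be analysed separately and their results pooled.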