Comparison of the performance of multiple imputation models in filling gaps in hourly and daily meteorological series from two locations in the state of São Paulo-Brazil
Type
Article
Abstract
Missing values in data series are a common problem that must be addressed adequately to ensure the validity of certain statistical methods and, in turn, to minimize biases that might affect study outcomes and conclusions. Various methods can be applied depending on the dataset characteristics and the amount of data lost. This study aimed to evaluate the performance of two internal multiple imputation approaches, 'pmm' and 'midastouch', for sets of meteorological variables with daily and hourly frequencies. The first set was collected in the municipality of Botucatu, and the second in Tupã, both in São Paulo State, Brazil. These datasets comprise information on global solar radiation, wind speed, air temperature, maximum air temperature, minimum air temperature, relative air humidity, maximum relative humidity, and minimum relative humidity for the period from March 20, 2018, to March 19, 2021, gathered by the São Paulo State University - UNESP (Botucatu–SP) and the Brazilian Institute of Meteorology–INMET (Tupã–SP). Analysis of the missing values revealed that the time series from Botucatu–SP had 1.4% data loss, whereas the series from Tupã–SP had 7%. Given the amount of missing data, imputation was performed using the 'pmm' and 'midastouch' methods, implemented in R. The results indicate that both procedures perform satisfactorily when imputing values for continuous variables, with better performance on hourly data. The greater level of detail in hourly data enables a better understanding of the associated nuances and uncertainties.
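As a rough illustration of the idea behind 'pmm' (predictive mean matching), the sketch below fills gaps in one variable by borrowing observed values from "donor" records with similar predicted values, and repeats the process to produce several completed datasets. It is not the code used in the study, which was written in R. The function names, the single-predictor regression, and the sample values are all hypothetical. The sketch is also simplified: implementations of predictive mean matching usually draw the regression coefficients at random before matching, and 'midastouch', a PMM variant that weights donors by distance, is not shown.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def pmm_impute(x, y, k=3, rng=None):
    """Return a copy of y with each None replaced by a donor's observed value.

    Simplified PMM: the regression coefficients are the plain OLS estimates,
    not a random draw (an assumption made to keep the sketch short).
    """
    rng = rng or random.Random()
    observed = [i for i, value in enumerate(y) if value is not None]
    a, b = fit_line([x[i] for i in observed], [y[i] for i in observed])
    filled = list(y)
    for i, value in enumerate(y):
        if value is None:
            target = a + b * x[i]
            # The k observed cases whose predictions are closest to the target.
            donors = sorted(observed, key=lambda j: abs((a + b * x[j]) - target))[:k]
            filled[i] = y[rng.choice(donors)]
    return filled


def multiple_impute(x, y, m=5, k=3, seed=0):
    """Produce m completed copies of y, one per imputation round."""
    rng = random.Random(seed)
    return [pmm_impute(x, y, k=k, rng=rng) for _ in range(m)]


# Hypothetical hourly values: solar radiation (predictor) and air temperature
# with two gaps. These numbers are made up for illustration only.
radiation = [0.0, 50.0, 200.0, 450.0, 600.0, 480.0, 220.0, 40.0]
temperature = [18.0, 19.5, None, 26.0, 28.5, None, 23.0, 19.0]

completed = multiple_impute(radiation, temperature, m=5, k=3, seed=42)
```

Because every imputed value is copied from an observed record, the filled series stays within the range of values actually measured, which is one reason donor-based methods suit bounded variables such as relative humidity.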
Keywords
Database reconstruction, Meteorological data, Missing data, Time series
Language
English
Citation
Modeling Earth Systems and Environment, v. 10, n. 2, p. 1815-1823, 2024.





