A configurable strategy for extraction, transformation and load to support data propagation on active data warehouses


This work presents ETL-PoCon, a strategy for executing Extraction, Transformation and Load (ETL) processes in active Data Warehouses (DW) under a configurable policy. Its original contribution is a considerable reduction in the number of data transfers to the active DW while maintaining a satisfactory level of data freshness. The reduction is obtained through configurable data propagation policies based on the relevance of the data to the information stored in the DW. The strategy was implemented on a database concerning health workers that contains more than seventy thousand records of occupational accidents. Experiments show that ETL-PoCon significantly reduces the overload on the systems involved in the active DW environment: every result showed a reduction of more than 60% in the number of DW refreshes.
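
The abstract does not detail how a propagation policy is expressed, so the following is only a minimal sketch of the general idea of relevance-based propagation, not the ETL-PoCon implementation. It assumes a hypothetical policy that assigns an integer relevance weight to each source table and triggers a DW refresh once the accumulated weight of pending changes reaches a threshold. All names (`Change`, `PropagationPolicy`, `PolicyDrivenLoader`) and the weights themselves are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Change:
    """A captured source-side change (hypothetical structure)."""
    table: str
    payload: dict


@dataclass
class PropagationPolicy:
    """Assumed policy shape: a relevance weight per table and a refresh threshold."""
    weights: Dict[str, int]
    threshold: int
    default_weight: int = 0


@dataclass
class PolicyDrivenLoader:
    """Buffers changes and refreshes the DW only when the policy says so."""
    policy: PropagationPolicy
    # Stand-in for the transform-and-load step into the DW.
    load: Callable[[List[Change]], None]
    pending: List[Change] = field(default_factory=list)
    accumulated: int = 0
    refresh_count: int = 0

    def on_change(self, change: Change) -> None:
        # Buffer the change and add its relevance to the running total.
        self.pending.append(change)
        self.accumulated += self.policy.weights.get(
            change.table, self.policy.default_weight
        )
        # Highly relevant changes reach the threshold quickly; minor ones wait.
        if self.accumulated >= self.policy.threshold:
            self.flush()

    def flush(self) -> None:
        # Propagate all buffered changes in a single DW refresh.
        if not self.pending:
            return
        self.load(self.pending)
        self.pending = []
        self.accumulated = 0
        self.refresh_count += 1
```

Under these assumptions, a policy such as `PropagationPolicy({"accident": 10, "address": 1}, threshold=10)` would propagate an accident record immediately, together with any buffered low-relevance changes, while address updates alone would be batched ten at a time. This is one way a policy could trade a small delay on low-relevance data for fewer refreshes.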



Active data warehouse, Data warehouse, ETL, Near real-time data warehouse

How to cite

Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings, p. 204-209.