Logotipo do repositório
 

Publicação:
Hadoop cluster deployment: A methodological approach

Carregando...
Imagem de Miniatura

Orientador

Coorientador

Pós-graduação

Curso de graduação

Título da Revista

ISSN da Revista

Título de Volume

Editor

Tipo

Artigo

Direito de acesso

Acesso abertoAcesso Aberto

Resumo

For a long time, data has been treated as a general problem because it just represents fractions of an event without any relevant purpose. However, the last decade has been just about information and how to get it. Seeking meaning in data and trying to solve scalability problems, many frameworks have been developed to improve data storage and its analysis. As a framework, Hadoop was presented as a powerful tool to deal with large amounts of data. However, it still causes doubts about how to deal with its deployment and if there is any reliable method to compare the performance of distinct Hadoop clusters. This paper presents a methodology based on benchmark analysis to guide the Hadoop cluster deployment. The experiments employed The Apache Hadoop and the Hadoop distributions of Cloudera, Hortonworks, and MapR, analyzing the architectures on local and on clouding-using centralized and geographically distributed servers. The results show the methodology can be dynamically applied on a reliable comparison among different architectures. Additionally, the study suggests that the knowledge acquired can be used to improve the data analysis process by understanding the Hadoop architecture.

Descrição

Palavras-chave

Benchmark methodology, Big Data, Computational models, Hadoop

Idioma

Inglês

Como citar

Information (Switzerland), v. 9, n. 6, 2018.

Itens relacionados

Financiadores

Unidades

Departamentos

Cursos de graduação

Programas de pós-graduação