Cloud computing is an important component of today's tech-savvy society. It creates a new paradigm for information exchange without requiring investment in new infrastructure or new software licenses. It extends the traditional ways of accessing application software, system software, storage, and other resources by delivering them over the internet. In the last few years, cloud computing has grown from a promising business concept into one of the fastest-growing segments of the IT industry.
Cloud computing has rapidly transformed how businesses and governments function by providing services such as Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Applications running on the cloud, known as big data applications, generate huge amounts of data. These applications need appropriate frameworks and techniques to store, aggregate, and retrieve that data.
Consequently, the objective of this research paper is divided into two parts. The first is to identify the traditional methods of data aggregation and optimization, and then to evaluate data aggregation that minimizes the link associations between Mappers and Reducers in order to reduce data traffic in the network. The second is to present a viable solution to these problems using the high-level scripting language Pig on the Apache Hadoop framework. This paper proposes data aggregation and optimization for big data applications within a cloud environment.
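The idea of reducing Mapper-to-Reducer traffic through aggregation can be illustrated with a small sketch. The Python code below is not the paper's Pig implementation; it is a hypothetical, in-memory word-count simulation that assumes map-side aggregation (in the style of a MapReduce combiner) as the aggregation step. All function names and the sample input are illustrative assumptions. It counts how many key-value pairs would cross the network with and without aggregation.

```python
from collections import Counter, defaultdict

def map_words(split):
    """Mapper: emit one (word, 1) pair per word in an input split."""
    return [(word, 1) for line in split for word in line.split()]

def combine(pairs):
    """Map-side aggregation: sum counts per key before the shuffle."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def reduce_counts(shuffled):
    """Reducer: sum all values received for each key."""
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals)

def run(splits, use_combiner):
    """Simulate one job; return the result and the number of shuffled pairs."""
    shuffled = []
    for split in splits:
        pairs = map_words(split)
        if use_combiner:
            pairs = combine(pairs)  # fewer pairs leave each mapper
        shuffled.extend(pairs)      # stands in for network transfer
    return reduce_counts(shuffled), len(shuffled)

# Hypothetical input: two splits, each handled by one mapper.
splits = [["cloud data cloud", "data data"], ["cloud pig", "pig pig hadoop"]]
plain_result, plain_traffic = run(splits, use_combiner=False)
combined_result, combined_traffic = run(splits, use_combiner=True)
print(plain_traffic, combined_traffic)
```

Both runs produce the same final counts, but the aggregated run sends fewer pairs to the reducer. In this toy input the saving comes only from repeated keys within a split, so the benefit in a real job would depend on how the data is distributed.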
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License