An auto-scaling framework for analyzing big data in the cloud environment

Jannapureddy, Rachana, Vien, Quoc-Tuan, Shah, Purav and Trestian, Ramona (2019) An auto-scaling framework for analyzing big data in the cloud environment. Applied Sciences, 9 (7). pp. 1-16. ISSN 2076-3417 [Article] (doi:10.3390/app9071417)

PDF - Published version (with publisher's formatting)
Available under License Creative Commons Attribution 4.0.


Processing big data on traditional computing infrastructure is challenging because the large volume of data entails high computational complexity. Recently, Apache Hadoop has emerged as a distributed computing infrastructure for dealing with big data. Adapting Hadoop to adjust its computing resources dynamically based on the real-time workload is itself a demanding task, so a cluster is conventionally pre-configured with enough resources to handle the peak data load. However, this can waste a considerable amount of computing resources when usage levels are much lower than the preset load. In consideration of this, this paper investigates an auto-scaling framework in a cloud environment that aims to minimise the cost of resource use by automatically adjusting the number of virtual nodes according to the real-time data load. A cost-effective auto-scaling (CEAS) framework is first proposed for an Amazon Web Services (AWS) Cloud environment. The proposed CEAS framework allows the computing resources of a Hadoop cluster to be scaled, either reducing resource use when the workload is low or scaling up the resources to speed up data processing and analysis so that it completes within an adequate time. To validate the effectiveness of the proposed framework, a case study is provided in which real-time sentiment analysis is applied to tweets about universities, analysing the reviews and tweets that people post on social media. Such a dynamic scaling method offers a reference for improving Twitter data analysis in a more cost-effective and flexible way.
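
The scaling idea described in the abstract can be illustrated with a minimal Python sketch. This is not the CEAS algorithm from the paper: the function names, the notion of a fixed per-node capacity, and the node limits are all hypothetical and chosen only to show how a node count might follow the real-time load rather than the peak load.

```python
import math


def target_node_count(workload, capacity_per_node, min_nodes=1, max_nodes=10):
    """Return the node count needed for the current workload.

    All parameters are illustrative assumptions, not values from the paper:
    `workload` and `capacity_per_node` share an arbitrary unit
    (for example, tweets per minute).
    """
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    needed = math.ceil(workload / capacity_per_node)
    # Clamp to the allowed cluster size so the cluster never drops below
    # min_nodes or grows beyond max_nodes.
    return max(min_nodes, min(max_nodes, needed))


def scaling_action(current_nodes, workload, capacity_per_node,
                   min_nodes=1, max_nodes=10):
    """Return (action, node_delta) for a hypothetical scaling controller."""
    target = target_node_count(workload, capacity_per_node,
                               min_nodes, max_nodes)
    if target > current_nodes:
        return ("scale_up", target - current_nodes)
    if target < current_nodes:
        return ("scale_down", current_nodes - target)
    return ("hold", 0)


if __name__ == "__main__":
    # A low workload releases nodes; a high workload adds them.
    print(scaling_action(current_nodes=5, workload=100, capacity_per_node=200))
    print(scaling_action(current_nodes=2, workload=900, capacity_per_node=200))
```

A real controller would likely also need cooldown periods and a workload metric taken from the cluster itself; those details are outside what the abstract states.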

Item Type: Article
Additional Information: Article number = 1417
Research Areas: A. > School of Science and Technology > Computer and Communications Engineering
Item ID: 26349
Notes on copyright: © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Depositing User: Quoc-Tuan Vien
Date Deposited: 04 Apr 2019 11:28
Last Modified: 13 Jun 2022 22:49