Abstract
Big Data processing requires high-performance solutions as data volumes continue to grow across industries. Traditional computing techniques handle very large datasets inefficiently because of processing and memory constraints. This work applies distributed AI algorithms on HPC platforms to improve Big Data processing performance. Distributed Random Forest and Deep Neural Network models were evaluated on multi-core CPUs and GPU clusters. Memory optimization and cache reuse were employed to minimize data access latency. Experiments on synthetic healthcare and financial datasets show marked improvements in processing time, prediction accuracy, and power consumption. These results indicate that distributed AI strategies combined with HPC are effective for scalable, high-performance Big Data analysis.