Hybrid Encryption Algorithm for Big Data Security in the Hadoop Distributed File System
Abstract
Big data refers to large volumes of structured and unstructured data. Recent technological developments have enabled many companies to handle massive data sets and to anticipate future trends and requirements. The Hadoop distributed file system (HDFS) was introduced for efficient big data processing. However, HDFS does not encrypt data by default, which exposes it to serious security threats. Encryption algorithms can enhance data security, but conventional algorithms perform poorly on large files. This research aims to secure big data using a novel hybrid encryption algorithm that combines ciphertext-policy attribute-based encryption (CP-ABE) with the advanced encryption standard (AES). The proposed model is compared with the traditional encryption algorithms DES, 3DES, and Blowfish in terms of throughput, encryption time, decryption time, and efficiency. It achieves a maximum efficiency of 96.5%, with an encryption time of 7.12 min and a decryption time of 6.51 min, outperforming the conventional algorithms.
Keywords
big data security, Hadoop, data encryption and decryption, Hadoop distributed file system (HDFS)