CURRENT STATE AND DEVELOPMENT PROSPECTS OF HETEROGENEOUS STREAMING DATA PROCESSING METHODS

Authors

DOI:

https://doi.org/10.32782/IT/2023-1-13

Keywords:

heterogeneous data streams, stream processing, real-time analytics, edge computing, Internet of Things (IoT), machine learning, quantum computing

Abstract

Heterogeneous streaming data processing is a rapidly expanding field of research and development in data processing and analytics. The proliferation of diverse data sources, including social media, sensor networks, and Internet of Things (IoT) devices, has made streaming data increasingly heterogeneous in data type, format, and velocity, which complicates processing and analyzing real-time data for actionable insights and calls for advanced techniques and algorithms. Data streams can consist of various data types, such as text, images, videos, sensor readings, and social media posts, each with its own characteristics and structure. They can arrive as structured, semi-structured, or unstructured data, each of which may require a different processing approach. The velocity at which data streams are generated also varies, from high-velocity streams that demand real-time processing to low-velocity streams that allow batch processing. Addressing this heterogeneity requires robust techniques that can handle diverse data types, formats, and velocities while still ensuring accurate and meaningful real-time analysis. This review analyses current research and publications on heterogeneous streaming data processing. The challenges and opportunities of processing diverse data streams in real time are discussed. The latest research in this area is reviewed, including advancements in stream processing frameworks, machine learning algorithms, edge computing, IoT, AI, and quantum computing.
The purpose of the research is to provide an overview of the current state and development prospects of heterogeneous streaming data processing. The paper presents the main research material, including key findings and insights from recent studies, and concludes with prospects for further research and innovation in this field, highlighting the need to address challenges such as data heterogeneity, data velocity, concept drift, privacy and security, and explainability, as well as the potential of quantum computing for real-time data processing.

Published

2023-06-20