The Spatiotemporal Data Fusion (STDF) Approach: IoT-Based Data Fusion Using Big Data Analytics

Bibliographic Details
Main authors: Dina Fawzy, Sherin Moussa, Nagwa Badr
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
Online access: https://doaj.org/article/8ecd0a37dcaa44df852715187a5f0335
Description
Summary: Enormous volumes of heterogeneous sensory data are generated in the Internet of Things (IoT) for various applications. These big data are characterized by additional IoT-related features, including trustworthiness, timing and spatial features. This reveals more perspectives to consider during processing, posing vast challenges to traditional data fusion methods at different fusion levels for collection and analysis. In this paper, an IoT-based spatiotemporal data fusion (STDF) approach for low-level data in–data out fusion is proposed for real-time spatial IoT source aggregation. It grants optimum performance by leveraging traditional data fusion methods based on big data analytics while exclusively maintaining the data expiry, trustworthiness, and spatial and temporal IoT data perspectives, in addition to volume and velocity. It applies cluster sampling for data reduction upon data acquisition from all IoT sources. For each source, it combines k-means clustering for spatial analysis with Tiny AGgregation (TAG) for temporal aggregation to maintain spatiotemporal data fusion at the processing server. STDF is validated via a public IoT data stream simulator. The experiments examine diverse IoT processing challenges in different datasets, reducing the data size by 95% and decreasing the processing time by 80%, with an accuracy of up to 90% on the largest dataset used.
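
To make the spatial and temporal steps named in the summary more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes readings arrive as `(x, y, timestamp, value)` tuples, uses a plain Lloyd's k-means to group readings spatially, and approximates TAG-style temporal aggregation by merging `(sum, count)` partial states for an average within fixed-length epochs. The function names (`kmeans`, `merge`, `fuse`), the epoch length, and the sample readings are all hypothetical, and the cluster sampling, data expiry and trustworthiness handling described in the summary are omitted.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


def merge(a, b):
    """Merge two AVG partial states of the form (sum, count)."""
    return (a[0] + b[0], a[1] + b[1])


def fuse(readings, k, epoch_seconds):
    """Average readings per (spatial cluster, time epoch).

    readings: list of (x, y, timestamp, value) tuples (assumed layout).
    Returns a dict mapping (cluster_label, epoch_index) to the mean value.
    """
    points = [(r[0], r[1]) for r in readings]
    _, labels = kmeans(points, k)
    partial = defaultdict(lambda: (0.0, 0))
    for (_, _, t, v), label in zip(readings, labels):
        key = (label, int(t // epoch_seconds))
        # Each reading contributes the partial state (value, 1).
        partial[key] = merge(partial[key], (v, 1))
    # Evaluate the final aggregate from each merged partial state.
    return {key: s / n for key, (s, n) in partial.items()}


# Hypothetical readings: two spatial groups, two 60-second epochs.
readings = [
    (0, 0, 0, 10.0), (0, 1, 5, 20.0), (1, 0, 65, 30.0),
    (10, 10, 1, 100.0), (10, 11, 2, 200.0), (11, 10, 70, 300.0),
]
fused = fuse(readings, k=2, epoch_seconds=60)
```

Keeping the aggregate as a mergeable `(sum, count)` pair, rather than a running average, is what allows partial results from different sources to be combined in any order before the final division. Cluster labels are arbitrary integers, so downstream code should not rely on which group receives label 0.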