Efficient IoT Data Management for Geological Disasters Based on Big Data-Turbocharged Data Lake Architecture
Multi-source Internet of Things (IoT) data archived in institutions’ repositories are increasingly being made openly available so that scientists, developers, and decision makers can access them via web services, which promotes research on geohazard prevention. In this paper, we design a...
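Main Authors: | Xiaohui Huang, Junqing Fan, Ze Deng, Jining Yan, Jiabao Li, Lizhe Wang |
---|---|
Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
Subjects: | geohazards; IoT data; data management; data lake; distributed computing; Geography (General); G1-922 |
Online Access: | https://doaj.org/article/bd0761398bbd4511add7af33986e393a |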
id | oai:doaj.org-article:bd0761398bbd4511add7af33986e393a |
---|---|
record_format | dspace |
spelling | oai:doaj.org-article:bd0761398bbd4511add7af33986e393a; 2021-11-25T17:52:54Z; Efficient IoT Data Management for Geological Disasters Based on Big Data-Turbocharged Data Lake Architecture; DOI 10.3390/ijgi10110743; ISSN 2220-9964; https://doaj.org/article/bd0761398bbd4511add7af33986e393a; 2021-11-01T00:00:00Z; https://www.mdpi.com/2220-9964/10/11/743; https://doaj.org/toc/2220-9964; Xiaohui Huang; Junqing Fan; Ze Deng; Jining Yan; Jiabao Li; Lizhe Wang; MDPI AG; article; geohazards; IoT data; data management; data lake; distributed computing; Geography (General); G1-922; EN; ISPRS International Journal of Geo-Information, Vol 10, Iss 11, p 743 (2021) |
institution | DOAJ |
collection | DOAJ |
language | EN |
topic | geohazards; IoT data; data management; data lake; distributed computing; Geography (General); G1-922 |
description | Multi-source Internet of Things (IoT) data archived in institutions’ repositories are increasingly being made openly available so that scientists, developers, and decision makers can access them via web services, which promotes research on geohazard prevention. In this paper, we design and implement a big data-turbocharged system for effective IoT data management following the data lake architecture. We first propose a multi-threaded parallel data ingestion method to ingest IoT data from institutions’ data repositories in parallel. Next, we design storage strategies for both ingested and processed IoT data to store them in a scalable, reliable storage environment. We also build a distributed cache layer to enable fast access to IoT data. Then, we provide users with a unified, SQL-based interactive environment for IoT data exploration that leverages the processing ability of Apache Spark. In addition, we design a standards-based metadata model to describe ingested IoT data and thus support IoT dataset discovery. Finally, we implement a prototype system and conduct experiments on real IoT data repositories to evaluate the efficiency of the proposed system. |
format | article |
author | Xiaohui Huang, Junqing Fan, Ze Deng, Jining Yan, Jiabao Li, Lizhe Wang |
author_facet | Xiaohui Huang, Junqing Fan, Ze Deng, Jining Yan, Jiabao Li, Lizhe Wang |
author_sort | Xiaohui Huang |
title | Efficient IoT Data Management for Geological Disasters Based on Big Data-Turbocharged Data Lake Architecture |
title_sort | efficient iot data management for geological disasters based on big data-turbocharged data lake architecture |
publisher | MDPI AG |
publishDate | 2021 |
url | https://doaj.org/article/bd0761398bbd4511add7af33986e393a |