BIcenter: A collaborative Web ETL solution based on a reflective software approach

The continuous growth of new sources of information has led to an unprecedented increase in the data collected. The dimensionality and heterogeneity of these data requires efficient strategies for searching, accessing and integrating from multiple repositories. The techniques underlying this goal ar...

Bibliographic Details
Main authors: João R. Almeida, Leonardo Coelho, José L. Oliveira
Format: article
Language: EN
Published: Elsevier 2021
Subjects:
ETL
Online access: https://doaj.org/article/b4d64f15c22f43d5890e80cf15ba9106
id oai:doaj.org-article:b4d64f15c22f43d5890e80cf15ba9106
record_format dspace
spelling oai:doaj.org-article:b4d64f15c22f43d5890e80cf15ba9106
2021-11-28T04:34:36Z
BIcenter: A collaborative Web ETL solution based on a reflective software approach
2352-7110
10.1016/j.softx.2021.100892
https://doaj.org/article/b4d64f15c22f43d5890e80cf15ba9106
2021-12-01T00:00:00Z
http://www.sciencedirect.com/science/article/pii/S2352711021001497
https://doaj.org/toc/2352-7110
SoftwareX, Vol 16, Iss , Pp 100892- (2021)
institution DOAJ
collection DOAJ
language EN
topic ETL
Collaborative pipeline
Pentaho Data Integration
Computer software
QA76.75-76.765
description The continuous growth of new sources of information has led to an unprecedented increase in the data collected. The dimensionality and heterogeneity of these data requires efficient strategies for searching, accessing and integrating from multiple repositories. The techniques underlying this goal are usually known as Extraction, Transformation and Loading (ETL) pipelines, which aim to organise dispersed data into a common structure. However, despite their popularity and widespread use, these pipelines present a few drawbacks in specific scenarios. In clinical research, for instance, it is quite common to engage multiple researchers, institutions and datasets so that the study findings can have higher impact. This implies cooperation between several entities to design the workflow, even when these entities do not have permission to work directly with the source data, due to privacy and regulatory issues. Furthermore, extending the pipeline to other data sources requires adding new concepts and rules over time, which implies continuous updating of the ETL scripts. This paper presents a collaborative web-based ETL application that allows users to design, share and execute ETL pipelines, across multiple centres. The system is supported by a user-friendly interface in which non-technical users can build the ETL pipelines without the need to grasp the ETL details, and most importantly, without having direct access to the data.
format article
author João R. Almeida
Leonardo Coelho
José L. Oliveira
title BIcenter: A collaborative Web ETL solution based on a reflective software approach
publisher Elsevier
publishDate 2021
url https://doaj.org/article/b4d64f15c22f43d5890e80cf15ba9106