ETL Semantic Model for Big Data Aggregation, Integration, and Representation

Document Type: Original Research Articles.

Authors

1 Faculty of Computers and Information Systems, C.S. Dept., Kafr El-Sheikh University, Kafr El-Sheikh 33511, Egypt

2 Faculty of Computers and Information Systems, C.S. Dept., Mansoura University, Mansoura 35516, Egypt

Abstract

The semantic web brings new benefits to many research topics in big data. It maintains large amounts of data semantically and gives meaning to unstructured data content. Big data refers to massive, large-scale collections of datasets in different formats. Semantic and structural heterogeneity remain the biggest problems in aggregating, integrating, and storing big data. In this paper, we solve both the column redundancy produced by semantic heterogeneity and the problem of structural heterogeneity by developing and implementing a new ETL model based on semantic and ontology technologies. Geospatial data is used as a case study because its integration is complex and usually suffers from the variety of its sources and from problems in representing the resulting big data. The results show that the model resolves heterogeneity across several data sources and improves data integration and representation.
 

Keywords

Main Subjects