We focused on building a scalable data pipeline that extracts, transforms, and loads (ETL) data into a centralized data warehouse.
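A minimal sketch of that ETL flow, assuming a local SQLite file stands in for the warehouse and `sales.csv` is a hypothetical source extract:

```python
import sqlite3
import pandas as pd

def extract(path: str) -> pd.DataFrame:
    # Pull raw records from a source file (a hypothetical CSV here).
    return pd.read_csv(path)

def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Clean and reshape: drop incomplete rows, normalize column names.
    df = df.dropna()
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    return df

def load(df: pd.DataFrame, conn: sqlite3.Connection, table: str) -> None:
    # Append the cleaned records to the warehouse table.
    df.to_sql(table, conn, if_exists="append", index=False)

if __name__ == "__main__":
    conn = sqlite3.connect("warehouse.db")  # stand-in for the real warehouse
    load(transform(extract("sales.csv")), conn, "sales")
    conn.close()
```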
We processed a variety of data formats, including CSV, XML, and JSON, using SQL, Python, Hadoop, and Apache Spark to handle data at scale.
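In Python, each of those formats can be parsed into the same tabular structure before further processing; the file names below are placeholders rather than the project's actual sources:

```python
import pandas as pd

# Parse each source format into a DataFrame (read_xml requires lxml, pandas >= 1.3).
orders_csv = pd.read_csv("orders.csv")
orders_xml = pd.read_xml("orders.xml")
orders_json = pd.read_json("orders.json")

# Assuming the three sources share the same columns, they can be
# combined and handled uniformly from here on.
orders = pd.concat([orders_csv, orders_xml, orders_json], ignore_index=True)
print(orders.head())
```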
We used Hadoop to store and manage large datasets efficiently, while Apache Spark let us process and analyze that data in parallel.
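A sketch of how that split between storage and processing might look in PySpark, with Spark reading from and writing back to HDFS; the paths, column names, and aggregation are illustrative assumptions, not the project's actual layout:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-aggregation").getOrCreate()

# Hadoop (HDFS) holds the raw data; Spark reads and processes it in parallel.
events = spark.read.json("hdfs:///data/raw/events/")

# Distributed aggregation: daily event counts per category.
daily_counts = (
    events
    .withColumn("event_date", F.to_date("timestamp"))
    .groupBy("event_date", "category")
    .agg(F.count("*").alias("event_count"))
)

# Write the curated result back to HDFS for downstream use.
daily_counts.write.mode("overwrite").parquet("hdfs:///data/curated/daily_counts/")
spark.stop()
```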
By leveraging these big data technologies, we transformed raw data into actionable insights.
Finally, we visualized these insights using Python and Power BI, making the results easy to interpret and useful for decision-making.
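The Power BI reports are built interactively, but on the Python side a chart of the pipeline's output might look roughly like the sketch below; the Parquet export and column names are assumptions for illustration:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical export of the curated daily aggregates (requires pyarrow or fastparquet).
daily = pd.read_parquet("daily_counts.parquet")

# One line per category, event counts over time.
pivot = daily.pivot_table(index="event_date", columns="category",
                          values="event_count", aggfunc="sum")

pivot.plot(kind="line", figsize=(10, 5))
plt.title("Daily events by category")
plt.xlabel("Date")
plt.ylabel("Events")
plt.tight_layout()
plt.savefig("daily_events.png")
```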