Work Details

Description:

In this project, I cleaned and validated a large-scale retail dataset using Python (pandas) so that it is reliable and ready for analysis and dashboards.

Key Steps Performed:

Checked and validated the data type of every column (date, numeric, and categorical).

Converted important fields such as OrderDate to datetime format.

Corrected the data types of UnitPrice, UnitCost, and Quantity so that they are stored as numeric values.
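
The type fixes above could look roughly like the sketch below. It is only an illustration, not the project's actual notebook code: the sample rows are made up, and the choice of errors="coerce" (turning unparseable entries into NaT/NaN instead of raising) is an assumption. Only the column names OrderDate, UnitPrice, UnitCost, and Quantity come from the description.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the description.
raw = pd.DataFrame({
    "OrderDate": ["2023-01-05", "2023-02-17", "not a date"],
    "UnitPrice": ["19.99", "5.50", "12.00"],
    "UnitCost": ["10.00", "2.75", "abc"],
    "Quantity": ["3", "1", "7"],
})


def fix_types(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with OrderDate as datetime and the price/quantity columns as numbers."""
    out = df.copy()
    # Assumption: unparseable dates become NaT rather than raising an error.
    out["OrderDate"] = pd.to_datetime(out["OrderDate"], errors="coerce")
    for col in ["UnitPrice", "UnitCost", "Quantity"]:
        # Assumption: unparseable numbers become NaN rather than raising an error.
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out


typed = fix_types(raw)
print(typed.dtypes)
```

Coercing instead of raising keeps the conversion from failing on a single bad cell in a million-row file; the resulting NaT/NaN values can then be counted and reviewed separately.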

Handled data quality issues:

Removed invalid discount values, keeping only discounts between 0 and 1.

Converted negative quantities to absolute values.

Verified that there are no duplicate records.
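
A minimal sketch of these data quality steps follows. Several details are assumptions made for illustration: the column name Discount, the OrderID column, the sample values, and the decision to drop whole rows with out-of-range discounts (the description does not say whether rows were dropped or the values were blanked).

```python
import pandas as pd

# Hypothetical sample; "Discount" and "OrderID" are assumed column names.
orders = pd.DataFrame({
    "OrderID": [1, 2, 3, 4],
    "Discount": [0.10, 1.50, -0.20, 0.00],
    "Quantity": [2, 5, 1, -3],
})


def clean_quality(df: pd.DataFrame) -> pd.DataFrame:
    """Keep rows whose discount lies in [0, 1] and make quantities non-negative."""
    # Series.between is inclusive on both ends by default.
    out = df[df["Discount"].between(0, 1)].copy()
    # Negative quantities are replaced by their absolute value.
    out["Quantity"] = out["Quantity"].abs()
    return out


cleaned = clean_quality(orders)

# Count fully duplicated rows; zero means no duplicate records remain.
duplicate_count = int(cleaned.duplicated().sum())
print(cleaned)
print("duplicate rows:", duplicate_count)
```

Taking the absolute value treats a negative quantity as a sign-entry error. If negative quantities represent returns in a given dataset, they would need separate handling instead.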

Analyzed dataset structure:

Dataset size: 1,050,000 rows × 12 columns.

Confirmed data consistency using .info() and .describe().
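
The structure check can be reproduced along these lines. This is a sketch on a tiny made-up frame, not the real 1,050,000-row dataset, and the three columns shown are just a subset chosen for illustration.

```python
import pandas as pd

# Hypothetical three-row sample standing in for the full dataset.
sample = pd.DataFrame({
    "OrderDate": pd.to_datetime(["2023-01-05", "2023-02-17", "2023-03-01"]),
    "UnitPrice": [19.99, 5.50, 12.00],
    "Quantity": [3, 1, 7],
})

# shape gives (rows, columns); the real dataset is described as 1,050,000 x 12.
n_rows, n_cols = sample.shape
print(f"{n_rows} rows x {n_cols} columns")

# info() prints each column's dtype and non-null count.
sample.info()

# describe() summarises the numeric columns (count, mean, std, min, quartiles, max).
summary = sample[["UnitPrice", "Quantity"]].describe()
print(summary)
```

Scanning the min and max rows of the summary is a quick way to spot leftover out-of-range values, such as a negative quantity that slipped through cleaning.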

Outcome:

A clean, reliable, and analysis-ready dataset suitable for:

Sales performance analysis

Profit & cost analysis

Dashboard creation (Power BI / Tableau / Looker Studio)

Tools Used:

Python (Pandas) • Jupyter Notebook • Data Cleaning • Exploratory Data Analysis (EDA)
