1. Define the Goal / Question
Start by identifying the problem you want to solve or the insight you need.
Example: “What are the top 5 products driving sales this year?”
2. Collect the Data
Gather raw data from different sources (databases, spreadsheets, APIs, surveys, sensors, etc.).
Example: Sales data from Excel, customer info from a CRM, website traffic from Google Analytics.
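As a rough illustration of combining sources, the sketch below parses a small sales export and attaches customer details from a CRM-style lookup. The data, the field names (customer_id, product, amount), and the helper names load_sales and join_with_crm are all hypothetical. A real project would read from actual files or APIs instead of inline text.

```python
import csv
import io

# Hypothetical sales export (stands in for an Excel or CSV file).
SALES_CSV = """customer_id,product,amount
C1,Widget,120.0
C2,Gadget,80.0
C1,Gadget,40.0
"""

# Hypothetical CRM lookup (stands in for a CRM export or API response).
CRM = {
    "C1": {"name": "Acme Corp", "country": "USA"},
    "C2": {"name": "Globex", "country": "Canada"},
}


def load_sales(text):
    """Parse CSV text into a list of dicts with numeric amounts."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["amount"] = float(row["amount"])
        rows.append(row)
    return rows


def join_with_crm(sales, crm):
    """Attach customer fields to each sale; unknown customers get None."""
    joined = []
    for sale in sales:
        customer = crm.get(sale["customer_id"], {})
        joined.append({
            **sale,
            "name": customer.get("name"),
            "country": customer.get("country"),
        })
    return joined


records = join_with_crm(load_sales(SALES_CSV), CRM)
```

Joining on a shared key such as a customer ID is what lets data from separate systems be analyzed together in later steps.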
3. Clean the Data
Raw data often contains errors, duplicates, or missing values. Cleaning reduces the chance that these problems distort the analysis.
Steps include:
Removing duplicates
Handling missing values (filling, removing, or estimating)
Correcting inconsistencies (e.g., “USA” vs “U.S.”)
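The three cleaning steps above can be sketched in a few lines. This is a minimal example on made-up rows: it assumes order_id identifies a duplicate, fills missing amounts with the median of the known ones (one of several reasonable choices), and maps country spellings through a hypothetical COUNTRY_ALIASES table.

```python
from statistics import median

# Hypothetical alias table mapping spelling variants to one canonical label.
COUNTRY_ALIASES = {"usa": "USA", "u.s.": "USA", "us": "USA"}

raw_rows = [
    {"order_id": 1, "country": "USA", "amount": 100.0},
    {"order_id": 1, "country": "USA", "amount": 100.0},  # duplicate order
    {"order_id": 2, "country": "U.S.", "amount": None},  # missing amount
    {"order_id": 3, "country": "usa", "amount": 50.0},
]


def clean(rows):
    # 1. Remove duplicates, keeping the first row seen for each order_id.
    seen = set()
    unique = []
    for row in rows:
        if row["order_id"] not in seen:
            seen.add(row["order_id"])
            unique.append(row)

    # 2. Estimate missing amounts with the median of the known amounts.
    known = [r["amount"] for r in unique if r["amount"] is not None]
    fill_value = median(known)

    # 3. Correct inconsistent country labels via the alias table.
    cleaned = []
    for row in unique:
        country = row["country"].strip()
        cleaned.append({
            "order_id": row["order_id"],
            "country": COUNTRY_ALIASES.get(country.lower(), country),
            "amount": row["amount"] if row["amount"] is not None else fill_value,
        })
    return cleaned


cleaned_rows = clean(raw_rows)
```

Whether to fill, remove, or estimate a missing value depends on the data. The median is used here only because it is less sensitive to extreme values than the mean.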
4. Explore the Data (EDA – Exploratory Data Analysis)
Use summary statistics and visualizations to understand how the data is distributed and to spot anything unusual.
Examples:
Summary statistics (mean, median, standard deviation)
Visualizations (charts, histograms, box plots)
Detecting outliers or trends
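For instance, summary statistics and a simple outlier check can be computed with Python's standard library. The sketch below uses invented daily sales figures and the common 1.5 × IQR rule for flagging outliers. That rule is a convention chosen for illustration, not the only option.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical daily sales figures; 900 is an obvious anomaly.
daily_sales = [120, 130, 125, 140, 135, 128, 900, 132]


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
    }


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


summary = summarize(daily_sales)
outliers = iqr_outliers(daily_sales)
```

Here the mean (226.25) sits far above the median (131) because of the single extreme value, which is exactly the kind of gap that exploration is meant to surface.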
5. Transform & Model the Data
Depending on the goal:
Aggregate (e.g., sum of sales per month)
Create new features (e.g., profit = revenue - cost)
Apply statistical models or machine learning (e.g., regression, clustering, classification)
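A minimal sketch of these three ideas, using made-up orders: it adds a profit feature, aggregates revenue per month, and fits a straight-line trend with ordinary least squares. The helper names (add_profit, monthly_totals, fit_line) are illustrative, and a real project would more likely use a library such as pandas or scikit-learn.

```python
from collections import defaultdict

# Hypothetical order records.
orders = [
    {"month": "2024-01", "revenue": 100.0, "cost": 60.0},
    {"month": "2024-01", "revenue": 50.0, "cost": 30.0},
    {"month": "2024-02", "revenue": 200.0, "cost": 120.0},
    {"month": "2024-03", "revenue": 250.0, "cost": 150.0},
]


def add_profit(rows):
    """Create a new feature: profit = revenue - cost."""
    return [{**r, "profit": r["revenue"] - r["cost"]} for r in rows]


def monthly_totals(rows, field):
    """Aggregate: sum the given field per month, ordered by month."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["month"]] += r[field]
    return dict(sorted(totals.items()))


def fit_line(ys):
    """Least-squares fit of ys against x = 0, 1, 2, ...; returns (slope, intercept)."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (
        sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
        / sum((x - x_mean) ** 2 for x in xs)
    )
    return slope, y_mean - slope * x_mean


enriched = add_profit(orders)
monthly_revenue = monthly_totals(enriched, "revenue")
monthly_profit = monthly_totals(enriched, "profit")
slope, intercept = fit_line(list(monthly_revenue.values()))
```

With this sample data the fitted slope is the average month-over-month change in revenue. Three points are far too few for a trustworthy model, so treat the fit as a demonstration of the mechanics only.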
6. Interpret the Results
Translate the findings into meaningful insights.
Example: “80% of revenue comes from 20% of customers” (Pareto Principle).
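A claim like this can be checked directly. The sketch below computes the share of revenue that comes from the top 20% of customers. The revenue figures are invented so that the result is easy to verify, and top_share is a hypothetical helper name.

```python
import math

# Hypothetical revenue per customer (totals 1000).
revenue_by_customer = {
    "A": 500.0, "B": 300.0, "C": 50.0, "D": 40.0, "E": 30.0,
    "F": 25.0, "G": 20.0, "H": 15.0, "I": 10.0, "J": 10.0,
}


def top_share(revenue, top_fraction=0.2):
    """Fraction of total revenue contributed by the top `top_fraction` of customers."""
    amounts = sorted(revenue.values(), reverse=True)
    top_n = max(1, math.ceil(len(amounts) * top_fraction))
    return sum(amounts[:top_n]) / sum(amounts)


share = top_share(revenue_by_customer)
```

In this sample the top two of ten customers account for 80% of revenue. Real data rarely splits exactly 80/20, so the computed share is worth reporting as measured.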
7. Communicate the Insights
Present results clearly with dashboards, reports, or presentations.
Use charts, graphs, or storytelling to make it easy to understand.
Tools: Power BI, Tableau, Excel, Python, R, etc.
8. Make Decisions & Take Action
The ultimate goal of data analysis is action.
Example: “Focus marketing budget on top 5 products to increase sales efficiency.”
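To support a decision like this one, the ranking itself has to be computed. The following sketch totals sales per product and returns the top five. The product names and amounts are made up, and top_products is an illustrative helper.

```python
from collections import Counter

# Hypothetical (product, sale amount) records.
sales = [
    ("Widget", 300.0), ("Gadget", 250.0), ("Widget", 100.0),
    ("Doohickey", 200.0), ("Gizmo", 150.0),
    ("Thingamajig", 90.0), ("Whatsit", 20.0),
]


def top_products(records, n=5):
    """Sum sales per product and return the n largest as (product, total) pairs."""
    totals = Counter()
    for product, amount in records:
        totals[product] += amount
    return totals.most_common(n)


top5 = top_products(sales)
for product, total in top5:
    print(f"{product}: {total:.2f}")
```

The output is a ranked list that can feed directly into a budget discussion, with the lowest-selling product left out of the top five.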
In short:
Data analysis = Ask → Collect → Clean → Explore → Model → Interpret → Present → Act