
## Project Overview

This project performs data analysis and visualization in the R programming language. The script carries out three main tasks:

1. **K-Means Clustering**: The script runs k-means clustering on the provided dataset with a user-chosen number of clusters. It clusters on `age` and total spending (the `total` column) and prints the clustering results.

2. **Association Rule Mining**: The script reads transactional data and generates association rules using the `apriori` algorithm based on user-defined support and confidence levels.

3. **Data Visualization**: The script visualizes the dataset with pie charts, bar plots, a scatter plot, and a box plot.

## Prerequisites

Ensure you have the following R libraries installed:

- `arules`

- `dplyr`

You can install these packages using the following commands:

```R
install.packages("arules")
install.packages("dplyr")
```

## Usage Instructions

1. **Reading the Dataset**:

The script will prompt you to enter the path of your CSV file containing the dataset. Ensure your CSV file includes columns such as `age`, `total`, `customer`, `paymentType`, and `city`.
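The loading step can be sketched as follows. This is a minimal illustration, not the script's actual code: `load_dataset` and `required_cols` are hypothetical names, the exact set of required columns is an assumption based on the list above, and the demo rows are made up. In the real script the path would come from the `Enter your file:` prompt rather than a temporary file.

```r
# Assumed set of required columns (taken from the list above).
required_cols <- c("age", "total", "customer", "paymentType", "city")

# Hypothetical helper: read the CSV and fail early if a column is missing.
load_dataset <- function(path) {
  data <- read.csv(path, stringsAsFactors = FALSE)
  missing_cols <- setdiff(required_cols, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  data
}

# Demo with a small made-up file so the sketch runs on its own.
demo_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    customer    = c("A", "B", "C"),
    age         = c(25, 40, 25),
    total       = c(100, 250, 50),
    paymentType = c("Cash", "Credit", "Cash"),
    city        = c("Cairo", "Giza", "Cairo")
  ),
  demo_path,
  row.names = FALSE
)

dataset <- load_dataset(demo_path)
str(dataset)
```

Checking the columns up front gives a clear error message instead of a failure later in the clustering or plotting steps.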

2. **K-Means Clustering**:

- The script will ask you to enter the number of clusters (between 2 and 4).

- The script will perform clustering twice:

  - First, it groups data by `age` and sums the `total` spending.

  - Second, it directly uses the `age` and `total` columns for clustering.

- The clustering results will be printed.
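The two clustering passes described above might look roughly like this. It is a sketch under stated assumptions: the data are made up, `validate_k` is a hypothetical helper, and the grouping uses base R `aggregate()` to keep the example dependency-free, whereas the script itself lists `dplyr` as a prerequisite and may group differently.

```r
set.seed(42)  # k-means starts from random centers; fix the seed for repeatability

# Made-up data standing in for the real dataset.
dataset <- data.frame(
  age   = c(22, 22, 25, 30, 30, 37, 45, 50, 55, 60),
  total = c(120, 80, 300, 150, 90, 700, 650, 200, 900, 850)
)

# Hypothetical validator mirroring the "between 2 and 4" prompt.
validate_k <- function(k) {
  k <- suppressWarnings(as.integer(k))
  if (is.na(k) || k < 2 || k > 4) {
    stop("Please enter a number between 2 & 4")
  }
  k
}
k <- validate_k("3")

# Pass 1: sum spending per age, then cluster the aggregated table.
by_age <- aggregate(total ~ age, data = dataset, FUN = sum)
fit_grouped <- kmeans(by_age[, c("age", "total")], centers = k, nstart = 10)
by_age$cluster <- fit_grouped$cluster

# Pass 2: cluster the raw rows directly on age and total.
fit_raw <- kmeans(dataset[, c("age", "total")], centers = k, nstart = 10)
dataset$cluster <- fit_raw$cluster

print(by_age)
print(table(dataset$cluster))
```

Because `total` is on a much larger scale than `age`, it dominates the distance computation here; scaling both columns first (for example with `scale()`) is a common option, though nothing in the description indicates whether the script does so.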

3. **Association Rule Mining**:

- The script will prompt you to enter the path of your transactional data file.

- When prompted, enter the minimum support and the minimum confidence, each between 0.001 and 1.

- The script will generate and display the association rules.
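A minimal sketch of the rule-mining step, using `read.transactions()`, `apriori()`, and `inspect()` from `arules`. The file layout (one comma-separated basket per line), the sample items, and the `validate_threshold` helper are assumptions made for illustration; the script's real input format may differ.

```r
library(arules)

# Made-up baskets, one comma-separated transaction per line (assumed format).
tx_path <- tempfile(fileext = ".csv")
writeLines(c(
  "bread,milk",
  "bread,butter,milk",
  "butter,milk",
  "bread,butter",
  "bread,butter,milk"
), tx_path)

transactions <- read.transactions(tx_path, format = "basket", sep = ",")

# Hypothetical validator mirroring the 0.001-to-1 range from the prompts.
validate_threshold <- function(x, name) {
  x <- suppressWarnings(as.numeric(x))
  if (is.na(x) || x < 0.001 || x > 1) {
    stop(name, " must be between 0.001 and 1")
  }
  x
}
min_support    <- validate_threshold("0.4", "support")
min_confidence <- validate_threshold("0.6", "confidence")

# Mine rules with at least two items (one on each side of the rule).
rules <- apriori(
  transactions,
  parameter = list(support = min_support, confidence = min_confidence, minlen = 2),
  control = list(verbose = FALSE)
)

inspect(rules)
```

Lower support values admit rarer item combinations and can produce many more rules, so on a real dataset it is usually easier to start high and lower the threshold gradually.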

4. **Data Visualization**:

The script generates the following visualizations:

- Pie chart of payment types.

- Pie chart and bar plot of total spending by age.

- Scatter plot of age and total spending.

- Pie chart and bar plot of total spending per city.

- Box plot of the distribution of total spending.
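The plots listed above can be approximated with base R graphics. This is a sketch, not the script's own plotting code: the data are made up, the titles are placeholders, and the output goes to a temporary PDF only so the example runs without a display. Drop the `pdf()` and `dev.off()` calls to see the plots on screen.

```r
# Made-up data standing in for the real dataset.
dataset <- data.frame(
  age         = c(22, 25, 30, 37, 45, 50),
  total       = c(120, 300, 150, 700, 650, 200),
  paymentType = c("Cash", "Credit", "Cash", "Credit", "Cash", "Cash"),
  city        = c("Cairo", "Giza", "Cairo", "Alexandria", "Giza", "Cairo")
)

# Totals per age and per city, used by the pie charts and bar plots.
total_by_age  <- tapply(dataset$total, dataset$age, sum)
total_by_city <- tapply(dataset$total, dataset$city, sum)

out <- tempfile(fileext = ".pdf")
pdf(out)  # one plot per page

pie(table(dataset$paymentType), main = "Payment types")

pie(total_by_age, main = "Total spending by age")
barplot(total_by_age, main = "Total spending by age",
        xlab = "Age", ylab = "Total spending")
plot(dataset$age, dataset$total, main = "Age vs. total spending",
     xlab = "Age", ylab = "Total spending")

pie(total_by_city, main = "Total spending per city")
barplot(total_by_city, main = "Total spending per city",
        xlab = "City", ylab = "Total spending")

boxplot(dataset$total, main = "Distribution of total spending",
        ylab = "Total spending")

invisible(dev.off())
```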

## Example Workflow

1. **Run the Script**:

```R
source('your_script_name.R')
```

2. **Enter the Path to Your Dataset**:

```
Enter your file: /path/to/your/dataset.csv
```

3. **Enter the Number of Clusters**:

```
Please enter a number between 2 & 4
Enter your number of clusters: 3
```

4. **Enter the Path to Your Transactional Data**:

```
Enter your file: /path/to/your/transactions.csv
```

5. **Enter Support and Confidence Levels**:

```
Enter the support: 0.01
Enter the confidence: 0.8
```
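Prompts like the ones in this workflow are typically built on `readline()` with a validation loop. The sketch below shows one way to do that; `ask_number` and `fake_read` are hypothetical names, and the scripted replies stand in for a user typing at the console. Note that `readline()` only waits for input in an interactive session.

```r
# Hypothetical helper: keep asking until the reply is a number in [lower, upper].
# `read` defaults to readline(); it is a parameter so the loop can be exercised
# without a live console.
ask_number <- function(prompt, lower, upper, read = readline) {
  repeat {
    reply <- suppressWarnings(as.numeric(read(prompt)))
    if (!is.na(reply) && reply >= lower && reply <= upper) {
      return(reply)
    }
    message("Please enter a number between ", lower, " & ", upper)
  }
}

# Scripted replies standing in for a user typing "7" and then "3".
replies <- c("7", "3")
fake_read <- function(prompt) {
  reply <- replies[1]
  replies <<- replies[-1]
  reply
}

k <- ask_number("Enter your number of clusters: ", 2, 4, read = fake_read)
print(k)
```

In an interactive session the same helper would be called as `ask_number("Enter the support: ", 0.001, 1)`, relying on the default `readline`.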

## Output

The script will output the following to the console:

- Clustering results and tables.

- Generated association rules.

The script will also generate and display the following plots:

- Pie chart for payment types.

- Pie chart, bar plot, and scatter plot for age and total spending.

- Pie chart and bar plot for total spending per city.

- Box plot for the distribution of total spending.
