Medical Insurance Cost Prediction using Random Forest Regressor
Objective
To build a machine learning model that predicts medical insurance costs based on various features using a Random Forest Regressor. The model will be deployed as a web application using Flask.
Features and Methodology
- Exploring the Dataset:
  - Understanding the dataset and its features.
  - Checking for missing values and data types.
- Converting Categorical Values to Numerical:
  - Using techniques like one-hot encoding to convert categorical features into numerical values for the machine learning model.
- Plotting Heatmap:
  - Visualizing the correlation between dependent and independent features using a heatmap.
- Data Visualization:
  - Creating plots to visualize relationships between different features.
  - Plotting skewness and kurtosis to understand data distribution.
- Data Preparation:
  - Splitting the dataset into training and testing sets.
  - Scaling the data if necessary.
- Prediction Using Different Models:
  - Linear Regression
  - Support Vector Regressor (SVR)
  - Ridge Regressor
  - Random Forest Regressor
- Hyperparameter Tuning:
  - Performing hyperparameter tuning to improve model performance.
- Model Comparison:
  - Plotting graphs to compare the performance of different models.
- Model Deployment:
  - Preparing the model for deployment.
  - Deploying the model using Flask.
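The steps above can be sketched end to end as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names (`age`, `bmi`, `smoker`, `region`, `charges`) are assumptions rather than the confirmed schema of the Kaggle dataset, and the hyperparameter grid is an arbitrary small example. Scaling is applied only to SVR and Ridge, since tree ensembles do not generally need it.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data; the column names are assumptions, not the real dataset.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(30, 6, size=n),
    "smoker": rng.choice(["yes", "no"], size=n, p=[0.2, 0.8]),
    "region": rng.choice(["north", "south", "east", "west"], size=n),
})
df["charges"] = (
    250 * df["age"] + 300 * df["bmi"]
    + 20000 * (df["smoker"] == "yes")
    + rng.normal(0, 2000, size=n)
)

# Exploration: missing values, data types, skewness and kurtosis.
print(df.isna().sum())
print(df.dtypes)
numeric = df.select_dtypes(include="number")
print(numeric.skew())
print(numeric.kurt())

# One-hot encode the categorical columns (drop_first avoids redundant columns).
encoded = pd.get_dummies(df, columns=["smoker", "region"], drop_first=True, dtype=float)

# Correlation matrix; this is what a heatmap (e.g. seaborn.heatmap) would display.
corr = encoded.corr()
print(corr["charges"].sort_values(ascending=False))

# Train/test split.
X = encoded.drop(columns="charges")
y = encoded["charges"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Fit the four model families and record the test-set R² of each.
models = {
    "Linear Regression": LinearRegression(),
    "SVR": make_pipeline(StandardScaler(), SVR()),
    "Ridge": make_pipeline(StandardScaler(), Ridge()),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Hyperparameter tuning for the random forest with a small illustrative grid.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, 8, None]}
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid, cv=3, scoring="r2")
search.fit(X_train, y_train)
scores["Random Forest (tuned)"] = r2_score(y_test, search.predict(X_test))

# Model comparison, best first.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: R² = {score:.3f}")
```

The `scores` dictionary can be passed directly to a bar plot for the model comparison step. Scores on this synthetic data say nothing about the results on the real dataset.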
Results
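The Random Forest Regressor achieved a reported accuracy of 86% when predicting medical insurance costs. Because this is a regression task, the figure most likely refers to the R² score on the test set rather than to classification accuracy.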
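For context, scikit-learn regressors return the coefficient of determination (R²) from their `score` method, which is often loosely called "accuracy". Assuming that is the metric behind the 86% figure (the source does not say), the sketch below shows how R² is computed. The function name `r_squared` and the sample values are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up costs and predictions, for illustration only.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3900.0]
print(round(r_squared(actual, predicted), 3))  # 0.986
```

An R² of 0.86 would mean the model explains about 86% of the variance in the costs of the evaluation set.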
Dataset
The dataset used for this project can be downloaded from Kaggle.
Installation Steps
- Install Python 3.7.0.
- Install the dependencies:
    python -m pip install --user -r requirements.txt
- Run the application:
    python app.py
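The contents of `app.py` are not shown here. As a rough sketch of what a Flask entry point serving a trained regressor might look like, the example below exposes one JSON endpoint. The `/predict` route, the `features` and `predicted_cost` field names, the `create_app` factory, and the `MeanModel` placeholder are all hypothetical and chosen only for illustration.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app around any object exposing predict(rows)."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Expect a JSON body such as {"features": [19, 27.9, 1.0]}.
        payload = request.get_json(silent=True) or {}
        features = payload.get("features")
        if not isinstance(features, list) or not features:
            return jsonify(error="expected a JSON body with a 'features' list"), 400
        # The model receives a single-row batch and returns one prediction.
        prediction = model.predict([features])[0]
        return jsonify(predicted_cost=float(prediction))

    return app


class MeanModel:
    """Placeholder standing in for the trained regressor (hypothetical)."""

    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]


# Exercise the endpoint in-process with Flask's test client.
client = create_app(MeanModel()).test_client()
response = client.post("/predict", json={"features": [1, 2, 3]})
print(response.status_code, response.get_json())

# In a real app.py one would load the saved model and finish with something like:
# create_app(model).run()
```

Passing the model into a factory function keeps the route logic independent of how the model is stored, so the same endpoint can be tested with a stand-in like `MeanModel` and served with the real regressor.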