Predictive Analysis of Health Risk Indicators Using Machine Learning Models and Classification

Date

2-14-2024

Student

Ethan St. John

Faculty Mentor

Mohammad Alam, Mathematical, Computing & Information Sciences

Submission Type

Conference Proceeding

Location

1:30-1:40pm | Houston Cole Library, 11th Floor

Description

In this research project, we aim to perform predictive analysis on the Diabetes Health Indicators Dataset. This dataset contains healthcare statistics and lifestyle survey information about individuals, along with their diabetes diagnosis. It consists of 27 features, including demographics, lab test results, and survey responses. The target variable for classification is each patient's health status, categorized as diabetic, pre-diabetic, or healthy. To conduct predictive analysis on this dataset, we will explore machine learning algorithms suitable for both categorical and integer feature types. We will preprocess the data by handling missing values, encoding categorical variables, and scaling numerical features if necessary. For the classification task, we will evaluate algorithms such as logistic regression, decision trees, random forests, and support vector machines. We will assess each model with evaluation metrics such as accuracy, precision, recall, and F1-score, carrying out the analysis in both Microsoft Machine Learning Studio and the R programming language. To enhance the predictive power of our models, we may also apply feature selection techniques to identify the most relevant features for classification, which helps reduce dimensionality and improve model efficiency. The results of the predictive analysis provide insight into the factors that contribute to diabetes and pre-diabetes and help identify individuals at risk. This information can inform preventive healthcare interventions and personalized treatment strategies. Overall, this research project aims to leverage the Diabetes Health Indicators Dataset to develop accurate predictive models for classifying individuals' health status based on demographic, lab test, and survey features. The findings of this study have the potential to contribute to the field of healthcare analytics and improve patient care and management.
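
The evaluation metrics named in the description (accuracy, precision, recall, and F1-score) can be illustrated for a three-class target with a minimal sketch. This is not the authors' workflow, which used R and Microsoft Machine Learning Studio; it is a plain Python illustration. The function name `classification_report`, the class label strings, the macro-averaging choice, and the toy labels are all assumptions made for the example, and the toy labels are invented and do not represent results from the study.

```python
from collections import Counter

# Hypothetical label strings for the three health statuses named in the description.
CLASSES = ["healthy", "pre-diabetic", "diabetic"]


def classification_report(y_true, y_pred, classes=CLASSES):
    """Return accuracy plus per-class and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Count each (actual, predicted) pair; this is a sparse confusion matrix.
    pairs = Counter(zip(y_true, y_pred))

    report = {}
    for c in classes:
        tp = pairs[(c, c)]
        fp = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}

    correct = sum(n for (t, p), n in pairs.items() if t == p)
    report["accuracy"] = correct / len(y_true)

    # Macro average: unweighted mean over classes (an assumption; the
    # description does not say how per-class scores are combined).
    report["macro"] = {
        m: sum(report[c][m] for c in classes) / len(classes)
        for m in ("precision", "recall", "f1")
    }
    return report


# Invented toy labels for illustration only; not data from the study.
y_true = ["healthy", "healthy", "healthy", "pre-diabetic",
          "pre-diabetic", "diabetic", "diabetic", "diabetic"]
y_pred = ["healthy", "healthy", "pre-diabetic", "pre-diabetic",
          "healthy", "diabetic", "diabetic", "pre-diabetic"]

report = classification_report(y_true, y_pred)
print(report["accuracy"])
print(report["macro"])
```

Reporting per-class scores alongside accuracy matters when one class is much rarer than the others, since a model can reach high accuracy while rarely identifying the rare class correctly.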

Keywords

student research, computing

Rights

This content is the property of Jacksonville State University and is intended for non-commercial use. Video and images may be copied for personal use, research, teaching or any "fair use" as defined by copyright law. Users are asked to acknowledge Jacksonville State University. For more information, please contact digitalcommons@jsu.edu.
