Predictive Analysis of Health Risk Indicators Using Machine Learning Models and Classification
Date
2-14-2024
Faculty Mentor
Mohammad Alam, Mathematical, Computing & Information Sciences
Submission Type
Conference Proceeding
Location
1:30-1:40pm | Houston Cole Library, 11th Floor
Description
In this research project, we aim to perform predictive analysis on the Diabetes Health Indicators Dataset. This dataset contains healthcare statistics and lifestyle survey information about individuals, along with their diabetes diagnosis. The dataset consists of 27 features, including demographics, lab test results, and survey responses. The target variable for classification is the health status of each patient, categorized as diabetic, pre-diabetic, or healthy. To conduct predictive analysis on this dataset, we will explore various machine learning algorithms suitable for both categorical and integer feature types. We will preprocess the data by handling missing values, encoding categorical variables, and scaling numerical features if necessary. For the classification task, we will evaluate the performance of algorithms such as logistic regression, decision trees, random forests, and support vector machines. We will carry out the analysis in both Microsoft Machine Learning Studio and the R programming language, and assess each model with evaluation metrics such as accuracy, precision, recall, and F1-score. To enhance the predictive power of our models, we may also apply feature selection techniques to identify the most relevant features for classification, which helps reduce dimensionality and improve model efficiency. The results of the predictive analysis provide insights into the factors that contribute to diabetes and pre-diabetes and help identify individuals at risk. This information can be used for preventive healthcare interventions and personalized treatment strategies. Overall, this research project aims to leverage the Diabetes Health Indicators Dataset to develop accurate predictive models for classifying individuals' health status based on demographic, lab test, and survey features.
The findings of this study have the potential to contribute to the field of healthcare analytics and improve patient care and management.
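As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy and per-class precision, recall, and F1-score for a three-class problem. It is written in Python rather than the R or Microsoft Machine Learning Studio workflow the project describes, and the function name `per_class_metrics`, the label strings, and the ten toy predictions are all hypothetical; none of the values come from the study.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return (accuracy, report) where report maps each class label to
    its precision, recall, and F1-score."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, report


# Hypothetical toy labels, for illustration only (not data from the study).
y_true = ["healthy"] * 4 + ["prediabetic"] * 2 + ["diabetic"] * 4
y_pred = ["healthy", "healthy", "healthy", "diabetic",
          "healthy", "prediabetic",
          "diabetic", "diabetic", "prediabetic", "diabetic"]

accuracy, report = per_class_metrics(y_true, y_pred)
print(f"accuracy: {accuracy:.2f}")
for label, scores in report.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

Reporting the metrics per class, rather than accuracy alone, matters when one class such as pre-diabetic is much rarer than the others, because a model can reach high accuracy while rarely identifying the minority class.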
Keywords
student research, computing
Rights
This content is the property of Jacksonville State University and is intended for non-commercial use. Video and images may be copied for personal use, research, teaching or any "fair use" as defined by copyright law. Users are asked to acknowledge Jacksonville State University. For more information, please contact digitalcommons@jsu.edu.
Recommended Citation
St. John, Ethan, "Predictive Analysis of Health Risk Indicators Using Machine Learning Models and Classification" (2024). JSU Student Symposium 2024. 17.
https://digitalcommons.jsu.edu/ce_jsustudentsymp_2024/17