Future Directions
Using the NSCH 2007, we successfully predicted (1) epilepsy diagnosis in children and (2) quality of life in children with epilepsy. To further improve the prediction of epilepsy diagnosis and quality of life, future research should incorporate additional datasets and classification methods. Possible directions include the following:
- Validate our classification model using more recent NSCH survey datasets – The NSCH 2011/2012, also conducted by the CDC, assesses epilepsy diagnosis as well as additional demographic and social characteristics of children and family members. Given that our model accurately classifies epilepsy status and quality of life in the NSCH 2007, we could test whether prediction accuracy remains high on this newer dataset.
- Add interaction terms – To increase the accuracy of our current model and uncover interactions between different features, we could add interaction terms to our predictor set. This may help us identify whether different features are important to diagnosis and quality of life for different demographic groups, and thus may be relevant to public health policies.
- Try additional machine learning classifiers – To answer our questions, we implemented simple classifiers including Logistic Regression, LDA, QDA, SVM, and Random Forest. Future steps could include other algorithms, such as neural networks and additional ensemble methods (AdaBoost, Bagging, Mixture of Experts), to see whether we can achieve higher accuracy.
- Use richer medical datasets – In our project, we used a publicly available survey-based dataset, which asked simple questions about disease status and demographics. Richer datasets, such as longitudinal medical datasets, may contain more information about patients’ health conditions, comorbid diseases, family history, history of medication use, and disease severity. Using these datasets may therefore better suit the goal of predicting future epilepsy status and impairment in patients.
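
As a rough illustration of the interaction-term idea listed above, the sketch below expands a feature matrix with the product of every pair of features. It is only a minimal example under stated assumptions: the helper name `add_interaction_terms` and the column names are hypothetical, the values are made up rather than taken from the NSCH, and this is not the feature construction used in our project.

```python
from itertools import combinations


def add_interaction_terms(rows, feature_names):
    """Append the product of every pair of features to each row.

    rows: list of equal-length lists of numeric feature values.
    feature_names: list of column names, one per feature.
    Returns (expanded_rows, expanded_names).
    """
    # All unordered pairs of column indices, e.g. (0, 1), (0, 2), (1, 2).
    pairs = list(combinations(range(len(feature_names)), 2))
    expanded_names = list(feature_names) + [
        f"{feature_names[i]}*{feature_names[j]}" for i, j in pairs
    ]
    expanded_rows = [
        list(row) + [row[i] * row[j] for i, j in pairs] for row in rows
    ]
    return expanded_rows, expanded_names


# Hypothetical columns and made-up values (not actual NSCH variables or data).
names = ["age_years", "is_insured", "low_income"]
rows = [
    [10, 1, 0],
    [4, 0, 1],
]
expanded_rows, expanded_names = add_interaction_terms(rows, names)
print(expanded_names)
print(expanded_rows)
```

With p original features this adds p(p-1)/2 columns, so for a wide survey dataset it would probably be necessary to restrict the pairs to those of substantive interest (for example, demographic group crossed with a clinical feature) or to regularize the resulting model. Libraries such as scikit-learn offer a comparable transformation through `PolynomialFeatures` with `interaction_only=True`.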