The significance of this project (TDG:T0282) lies in its potential to enhance the quality assurance and enhancement (QA&E) processes for reasonably sized undergraduate (UG) programs at EdUHK. Here, a student is regarded as at-risk if they are predicted to receive a CGPA of 2.5 or lower in Year 3. By identifying at-risk students early, program leaders can implement targeted interventions to support struggling students and improve their chances of success. Predicting students' CGPAs requires data on a large set of variables. Note that student data are anonymised using alphanumeric pseudo-IDs that mask the identities of students.
We have developed a web application that facilitates the analysis and allows program leaders to view results directly. As soon as second-year results are available each August, our system identifies at-risk students, using comprehensive, anonymized datasets to ensure accuracy and privacy. Identified students are listed in late September, allowing program leaders to immediately initiate targeted interventions.
Project Timeline
The project timeline, extending from 2024 to 2026, outlines the key data collection points and result delivery phases for three student cohorts. The initial phase in 2024 will introduce semi-automated reporting for the 2022 cohort, with results made available through a secure drive and password-protected access for each program report. For the 2023 and 2024 cohorts, results will be delivered via a fully automated web application, streamlining the process for ease of access and efficiency. From the 2025 cohort onwards, as EdUHK transitions to double-degree programs, cohorts will be analyzed under a separate CRAC project aimed at further enhancing educational strategies and outcomes.
Our methodology takes a systematic approach to proactively identifying at-risk students within four- or five-year undergraduate programs at EdUHK. Using data collected from the university, we fit lasso regression models once students have completed their second year of study. These models forecast which students are likely to graduate with a GPA below 2.50, enabling early and targeted interventions that improve student outcomes and the overall effectiveness of the programs.
To refine the accuracy of our predictions, we combine lasso regression with Youden's index, which we use to adjust our identification thresholds. This method balances sensitivity and specificity, thereby reducing the risk of both false positives and false negatives. The approach allows for more precise, timely assessments, giving academic staff the insights needed to implement effective support strategies for students at critical junctures in their academic careers.
The predictive model focuses on two primary outcomes:
To achieve these goals, we follow a three-step process:
We use the trained lasso regression model \( f(x) \) to predict the Year 3 GPA for each student:
\[ \text{Y3.GPA}_{\text{prediction}} = y_{\text{pred}} = f(x) = \sum_{j=1}^{M} w_j x_j + b \]
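To make the prediction step concrete, the sketch below fits the weights \( w_j \) and intercept \( b \) by coordinate descent with soft-thresholding, which is one standard way to solve the lasso problem. It is an illustration only: the project's actual fitting routine, predictor variables, and penalty strength are not described here, so the function names (`fit_lasso`, `predict`), the two toy predictors (Year 1 and Year 2 GPA), the data, and `lam=0.01` are all assumptions.

```python
def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam; weak predictors end up exactly at 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def predict(x, w, b):
    """f(x) = sum_j w_j * x_j + b for one student's feature vector x."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_lasso(X, y, lam, n_iter=200):
    """Minimise (1/2n) * sum((y - f(x))^2) + lam * sum(|w_j|) by coordinate descent."""
    n, m = len(X), len(X[0])
    w = [0.0] * m
    b = sum(y) / n
    for _ in range(n_iter):
        for j in range(m):
            rho = 0.0
            z = 0.0
            for xi, yi in zip(X, y):
                # Residual with feature j's own contribution added back.
                partial = yi - predict(xi, w, b) + w[j] * xi[j]
                rho += xi[j] * partial
                z += xi[j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
        # The intercept is not penalised: set it to the mean residual.
        b = sum(yi - predict(xi, w, 0.0) for xi, yi in zip(X, y)) / n
    return w, b


# Invented toy data: [Year 1 GPA, Year 2 GPA] -> Year 3 GPA (assumption).
X_train = [[2.1, 2.3], [3.4, 3.5], [2.8, 2.6], [3.0, 3.2], [2.4, 2.2]]
y_train = [2.2, 3.5, 2.7, 3.1, 2.3]

w, b = fit_lasso(X_train, y_train, lam=0.01)
print("weights:", w, "intercept:", b)
print("predicted Y3 GPA:", predict([2.5, 2.4], w, b))
```

Larger values of `lam` push more weights to exactly zero, which is why lasso suits settings where data on a large set of variables is available but only some are informative.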
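We apply Youden's index to find the optimal cut-off for classifying students as at-risk. For each candidate cut-off \( c \), we count the true positives (TP), false negatives (FN), true negatives (TN), and false positives (FP) that result, and choose the cut-off that maximizes sensitivity plus specificity minus one. This optimizes our ability to accurately identify students who need intervention: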
\[ \text{optimal cut-off} = \arg\max_{c} \left[ \frac{TP}{TP + FN} + \frac{TN}{TN + FP} - 1 \right] \]
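The cut-off search can be sketched as a simple scan over candidate thresholds. This is a minimal illustration under stated assumptions: the function name `youden_cutoff`, the choice of midpoints between consecutive predictions as candidates, and the toy data are hypothetical and not taken from the project's implementation.

```python
def youden_cutoff(y_pred, is_at_risk):
    """Return (cut_off, J) maximising sensitivity + specificity - 1.

    y_pred     : predicted Year 3 GPAs
    is_at_risk : True where the student actually turned out to be at-risk
    A student is flagged when their prediction is below the cut-off.
    """
    if not any(is_at_risk) or all(is_at_risk):
        raise ValueError("need at least one at-risk and one not-at-risk student")

    # Candidate cut-offs: midpoints between consecutive distinct predictions.
    values = sorted(set(y_pred))
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]

    best_c, best_j = None, -1.0
    for c in candidates:
        tp = fn = tn = fp = 0
        for pred, actual in zip(y_pred, is_at_risk):
            flagged = pred < c
            if flagged and actual:
                tp += 1
            elif not flagged and actual:
                fn += 1
            elif flagged and not actual:
                fp += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if j > best_j:  # keep the first cut-off reaching the highest J
            best_c, best_j = c, j
    return best_c, best_j


# Invented toy data (assumption): predictions and observed outcomes.
preds = [1.9, 2.1, 2.3, 2.4, 2.6, 2.9, 3.2, 3.5]
actual = [True, True, True, False, True, False, False, False]
cut_off, j = youden_cutoff(preds, actual)
print("optimal cut-off:", cut_off, "Youden's J:", j)
```

Because the cut-off is chosen on predicted values rather than fixed at the CGPA boundary, it can sit above or below that boundary depending on how the model's errors are distributed.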
Based on the optimal cut-off and the predicted Y3 GPA, we classify students as 'at-risk' if their predicted GPA falls below the cut-off, and 'not at-risk' otherwise.
\[ \text{at-risk}_{\text{prediction}} = y_{\text{pred label}} = \begin{cases} \text{"At-risk"} & \text{if } y_{\text{pred}} < \text{optimal cut-off} \\ \text{"not at-risk"} & \text{if } y_{\text{pred}} \geq \text{optimal cut-off} \end{cases} \]
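The labelling rule amounts to a single comparison per student. The short sketch below applies it to a few students keyed by pseudo-ID; the pseudo-IDs, predicted values, and the cut-off of 2.6 are made-up illustrations, and `classify` is a hypothetical helper name.

```python
def classify(y_pred, optimal_cut_off):
    """Label one student from their predicted Year 3 GPA."""
    return "At-risk" if y_pred < optimal_cut_off else "not at-risk"


# Hypothetical pseudo-IDs and predictions (assumptions for illustration).
predictions = {"A7K2Q": 2.31, "M4X9B": 2.60, "Z1P8D": 3.42}
optimal_cut_off = 2.6  # would come from the Youden's index step

labels = {sid: classify(gpa, optimal_cut_off) for sid, gpa in predictions.items()}
for sid, label in labels.items():
    print(sid, label)
```

Note that a prediction exactly equal to the cut-off is labelled 'not at-risk', matching the \( \geq \) branch of the rule.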
After predictions are completed, the report for each specific program becomes available for program leaders to review, as illustrated in the workflow diagram (Figure 2) below.
The methodology has been applied to 7 five-year BEd programmes, 5 four-year BA or BSocSc programmes, and 2 two-year degree programmes at EdUHK, using data spanning from the 2012/13 academic year to 2023. Our predictive analytics have achieved:
We also collected feedback from the leaders of the 7 programs regarding their interventions. The table below summarizes the interventions adopted.
Intervention | Count | Percentage |
---|---|---|
Academic advising provided to individual students identified as potentially at-risk | 6 out of 7 | 85.7% |
Extra learning material provided to at-risk students | 2 out of 7 | 28.6% |
Peer tutors or a TA provided to (all) students | 2 out of 7 | 28.6% |
Supplementary classes organized | 2 out of 7 | 28.6% |
After-class learning support, including informal learning activities, talks, internships, etc., to incentivize learning | 1 out of 7 | 14.3% |