A Two-Stage, Object-Centric Deep Learning Framework for Robust Exam Cheating Detection
Proposed a two-stage deep learning framework using YOLOv8n and RexNet-150, achieving 95% accuracy in cheating detection.
Key Findings
Methodology
This paper proposes a two-stage object-centric deep learning framework for exam cheating detection. First, the YOLOv8n model is used to localize students in exam-room images. Each detected region is cropped and preprocessed, then classified by a fine-tuned RexNet-150 model as either normal or cheating behavior. The system is trained on a dataset compiled from 10 independent sources with a total of 273,897 samples, achieving 0.95 accuracy, 0.94 recall, 0.96 precision, and 0.95 F1-score.
Key Results
- Result 1: The system was trained on 273,897 samples, achieving 0.95 accuracy, 0.94 recall, 0.96 precision, and 0.95 F1-score, a gain of 13 percentage points over a baseline accuracy of 0.82 in video-based cheating detection.
- Result 2: With an average inference time of 13.9 ms per sample, the proposed approach demonstrates robustness and scalability for deployment in large-scale environments.
- Result 3: Ablation studies confirmed the significant improvement in detection accuracy of the two-stage method over the full-frame approach.
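The headline numbers above can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in the results; it assumes samples are processed one at a time, which this summary does not state explicitly.

```python
# Figures quoted in the results above.
accuracy = 0.95
baseline_accuracy = 0.82
ms_per_sample = 13.9

# Absolute gain (percentage points) versus relative gain (percent of baseline).
absolute_gain = accuracy - baseline_accuracy
relative_gain = absolute_gain / baseline_accuracy

# Sequential throughput implied by the average latency (an assumption:
# no batching or parallelism is modeled here).
throughput = 1000.0 / ms_per_sample

print(f"absolute gain: {absolute_gain * 100:.1f} percentage points")
print(f"relative gain: {relative_gain * 100:.1f}%")
print(f"throughput: {throughput:.1f} samples/s")
```

This prints a gain of 13.0 percentage points, a relative gain of 15.9%, and roughly 71.9 samples per second.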
Significance
This study is significant for both academia and industry. It not only improves the accuracy of exam cheating detection but also addresses the transparency and complexity issues of traditional methods. By combining YOLOv8n and RexNet-150, the framework provides an efficient and scalable solution that can operate in real time in resource-limited environments. Additionally, the system addresses ethical concerns by ensuring that results are delivered privately to students, avoiding public shaming.
Technical Contribution
Technical contributions include: 1) Proposing an object-centric two-stage framework that significantly improves detection accuracy; 2) Creating a large-scale standardized dataset that serves as a benchmark for future models; 3) Conducting detailed ablation studies and model comparisons, demonstrating the framework's superior performance over traditional approaches and establishing a new state-of-the-art.
Novelty
This study is the first to combine YOLOv8n and RexNet-150 for exam cheating detection, proposing an object-centric detection approach that overcomes the background noise interference issues of full-frame methods. Compared to existing methods, this framework simplifies the architecture and enhances detection efficiency and accuracy.
Limitations
- Limitation 1: The current method relies on static frames, lacking temporal continuity, which may fail to distinguish between brief innocent actions and prolonged cheating behavior.
- Limitation 2: By focusing only on faces and upper bodies, it may miss evidence of cheating on the desk, such as phones or notes.
- Limitation 3: Inconsistencies in dataset annotations may affect the model's robustness.
Future Work
Future research directions include: 1) Expanding Region of Interest (ROI) extraction to include more of the upper body, hands, and immediate desk area to capture a more complete visual narrative of potential cheating acts; 2) Exploring multi-class classification to identify specific types of cheating; 3) Improving data quality and annotation strategies to enhance system robustness and accuracy.
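To make the first direction concrete, one possible way to widen an ROI is to grow each detected person box sideways and downward before cropping. This is only an illustrative sketch: `expand_box` and its margin values are hypothetical and are not taken from the paper.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def expand_box(box: Box, img_w: int, img_h: int,
               side_margin: float = 0.15, bottom_margin: float = 0.40) -> Box:
    """Grow a person box sideways and downward, clamped to the image.

    Margins are fractions of the box width and height. The defaults are
    illustrative values, not settings reported in the paper.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    new_x1 = max(0, round(x1 - side_margin * w))
    new_x2 = min(img_w, round(x2 + side_margin * w))
    new_y2 = min(img_h, round(y2 + bottom_margin * h))  # reach toward the desk
    return (new_x1, y1, new_x2, new_y2)


print(expand_box((100, 50, 200, 250), img_w=640, img_h=480))
```

Larger crops pull some background back in, so any such margin would need to be validated against the accuracy gains that cropping provides in the first place.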
AI Executive Summary
Exam cheating detection is a critical component of academic integrity, and with the proliferation of remote and hybrid learning, ensuring fairness and transparency in assessments has become increasingly important. Traditional invigilation relies on human observation, which is inefficient and prone to errors. While some AI-powered monitoring systems have been deployed, many lack transparency or require complex multi-layered architectures.
This paper proposes an improved two-stage framework that integrates object detection and behavioral analysis. First, the YOLOv8n model is used to localize students in exam-room images. Each detected region is cropped and preprocessed, then classified by a fine-tuned RexNet-150 model as either normal or cheating behavior. The system is trained on a dataset compiled from 10 independent sources with a total of 273,897 samples, achieving 0.95 accuracy, 0.94 recall, 0.96 precision, and 0.95 F1-score.
The core technical principle of this framework is to eliminate background noise through object detection and focus on the behavioral analysis of each examinee. This allows the system to more accurately identify cheating behaviors while reducing false positives. Experimental results demonstrate a significant improvement in detection accuracy over traditional full-frame methods.
This study not only achieves technical breakthroughs but also addresses ethical concerns by ensuring that results are delivered privately to students, avoiding public shaming. Furthermore, the system's efficiency and scalability make it suitable for deployment in large-scale environments.
Despite significant progress, the current method has limitations, such as lacking temporal continuity and missing evidence on the desk. Future research will focus on expanding ROI extraction and improving data quality and annotation strategies to further enhance system robustness and accuracy.
Deep Analysis
Background
Exam cheating detection is a critical component of academic integrity. With the proliferation of remote and hybrid learning, ensuring fairness and transparency in assessments has become increasingly important. Traditional invigilation relies on human observation, which is inefficient and prone to errors. While some AI-powered monitoring systems have been deployed, many lack transparency or require complex multi-layered architectures. Recent advances in deep learning have provided new possibilities for automated cheating detection. In particular, the development of object detection and behavioral analysis technologies has made it possible to detect cheating behaviors in complex multi-person exam environments. However, existing methods still face challenges in handling background noise and data scarcity.
Core Problem
Exam cheating not only seriously undermines the value of learning outcomes but also poses risks to the credibility of educational institutions. Consequently, there is a pressing need for robust, scalable solutions to support proctors in monitoring exams. Existing AI-based proctoring systems face significant hurdles in handling background noise and differentiating individual behaviors. Additionally, the field is hampered by the scarcity and fragmentation of high-quality public datasets, which impedes the development and fair evaluation of generalizable models.
Innovation
This paper proposes a novel two-stage object-centric framework for cheating detection.
- First, the YOLOv8n model is used to localize students in exam-room images, eliminating background noise.
- Then, each detected region is cropped and preprocessed, and a fine-tuned RexNet-150 model classifies the behavior as normal or cheating.
- By decoupling the complex task of scene understanding into two distinct and manageable sub-problems, the framework significantly improves detection accuracy.
- Additionally, a large-scale standardized dataset is created, serving as a benchmark for future models.
Methodology
- Use the YOLOv8n model to detect human-like objects in exam-room images and generate bounding boxes.
- Apply cropping and preprocessing steps to extract robust Regions of Interest (ROIs).
- Forward these ROIs to the RexNet-150 model, which distinguishes between cheating and non-cheating behaviors.
- Finally, draw the predicted labels and bounding boxes back onto the original image, completing the workflow.
- The dataset was collected from 10 open sources, cleaned, standardized, and split into training, validation, and test sets.
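The detect, crop, and classify workflow above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the detector and classifier are passed in as plain callables (standing in for YOLOv8n and RexNet-150), images are represented as nested lists instead of tensors, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
Image = List[List[int]]          # stand-in: rows of grayscale pixel values


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def detect_then_classify(
    image: Image,
    detect_people: Callable[[Image], Sequence[Box]],  # stage 1 (YOLOv8n in the paper)
    classify_roi: Callable[[Image], str],             # stage 2 (RexNet-150 in the paper)
) -> List[Tuple[Box, str]]:
    """Run stage 1 once per image, then stage 2 once per detected person."""
    results = []
    for box in detect_people(image):
        roi = crop(image, box)
        results.append((box, classify_roi(roi)))
    return results


# Toy stand-ins so the sketch runs without any model weights.
def fake_detector(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 0, 4, 2)]


def fake_classifier(roi: Image) -> str:
    mean = sum(map(sum, roi)) / sum(len(row) for row in roi)
    return "cheating" if mean > 128 else "normal"


frame = [[0, 0, 255, 255],
         [0, 0, 255, 255]]
print(detect_then_classify(frame, fake_detector, fake_classifier))
```

Because the classifier only ever sees one cropped person at a time, anything outside the box, including other students, cannot influence its decision; this is the background-noise argument made for the object-centric design.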
Experiments
Experiments were conducted within a Kaggle Notebook environment using a single NVIDIA RTX 3080 GPU. The software stack was built on PyTorch version 2.1. The RexNet-150 model was trained for 10 epochs using the Adam optimizer with a learning rate of 0.0003. The dataset was split into training, validation, and test sets, comprising 80%, 10%, and 10%, respectively. Ablation studies confirmed the significant improvement in detection accuracy of the two-stage method over the full-frame approach.
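As a rough illustration of the 80%/10%/10% split, the sketch below shuffles sample indices with a fixed seed and slices them. The summary does not say whether the authors split randomly, by source, or with stratification, so the random shuffle, the seed, and the rounding rule are all assumptions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_80_10_10(samples: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle deterministically, then slice into train/validation/test."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10   # integer arithmetic avoids float rounding
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]   # remainder goes to the test set
    return train, val, test


train, val, test = split_80_10_10(range(273_897))
print(len(train), len(val), len(test))
```

With 273,897 samples this rounding yields 219,117 / 27,389 / 27,391 items; the paper's exact partition sizes may differ.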
Results
Experimental results show that the system was trained on 273,897 samples, achieving 0.95 accuracy, 0.94 recall, 0.96 precision, and 0.95 F1-score, a gain of 13 percentage points over a baseline accuracy of 0.82 in video-based cheating detection. With an average inference time of 13.9 ms per sample, the proposed approach demonstrates robustness and scalability for deployment in large-scale environments. Ablation studies confirmed the significant improvement in detection accuracy of the two-stage method over the full-frame approach.
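All four reported metrics derive from confusion-matrix counts. The sketch below shows the standard formulas; the counts are hypothetical, since this summary does not include the paper's confusion matrix. The final line checks that the reported F1-score is consistent with the reported precision and recall.

```python
from typing import Dict


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Binary metrics with 'cheating' treated as the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Hypothetical counts, chosen only to exercise the formulas.
print(classification_metrics(tp=90, fp=10, fn=30, tn=870))

# Consistency check on the reported figures: precision 0.96, recall 0.94.
print(round(f1_from(0.96, 0.94), 2))
```

The last line prints 0.95, matching the reported F1-score.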
Applications
The system is applicable to educational institutions requiring large-scale proctoring, especially in remote or hybrid learning environments. Its efficiency and scalability allow it to operate in real time in resource-limited environments. Additionally, results are delivered privately to each student, which protects student privacy and avoids public shaming.
Limitations & Outlook
The current method relies on static frames, lacking temporal continuity, which may fail to distinguish between brief innocent actions and prolonged cheating behavior. Additionally, by focusing only on faces and upper bodies, it may miss evidence of cheating on the desk, such as phones or notes. Inconsistencies in dataset annotations may affect the model's robustness. Future research will focus on expanding ROI extraction and improving data quality and annotation strategies to further enhance system robustness and accuracy.
Plain Language Accessible to non-experts
Imagine you are in a large classroom taking an exam, and the teacher is at the front invigilating. Traditionally, the teacher needs to observe each student to ensure no one is cheating. This is like being in a large kitchen where the chef needs to watch every pot to make sure nothing burns or boils over. But this is very difficult because there are too many pots to watch. Now, imagine there is a smart assistant that can automatically identify each pot and alert the chef which pot needs attention. This is how the cheating detection system proposed in this paper works. It uses a technology called YOLOv8n to identify each student in the exam room, just like the smart assistant identifies each pot. Then, it uses another technology called RexNet-150 to analyze each student's behavior to determine if they are cheating. This way, the system can help the teacher invigilate more effectively, just like the smart assistant helps the chef manage the kitchen better.
ELI14 Explained like you're 14
Hey there! Have you ever wondered how teachers catch cheating during exams? Traditionally, teachers have to keep an eye on every student to make sure no one is cheating. It's like when you're playing a game and you have to keep track of multiple tasks to make sure everything is going well. But that's hard, right? Now, there's a new technology that can help teachers. It's like having a super assistant in the game that can automatically identify each task and tell you which one needs attention. This system uses a technology called YOLOv8n to identify each student in the exam room, just like the super assistant identifies each task. Then, it uses another technology called RexNet-150 to analyze each student's behavior to determine if they are cheating. This way, teachers can invigilate more easily, just like you have a super assistant in the game. Isn't that cool?
Glossary
YOLOv8n
YOLOv8n is the nano (smallest) variant of the YOLOv8 object detection family, designed to identify objects in images quickly and at low computational cost.
Used in the paper to localize students in exam-room images.
RexNet-150
RexNet-150 is a deep learning model for image classification, known for its efficient feature representation capabilities.
Used in the paper to analyze student behavior and determine if cheating is occurring.
F1-Score
The F1-Score is the harmonic mean of precision and recall. It is often preferred over accuracy when classes are imbalanced, because it is low whenever either precision or recall is low.
Used to evaluate the performance of the cheating detection system.
Recall
Recall is the proportion of actual positive cases that the model correctly identifies.
Used to evaluate the system's effectiveness in detecting cheating behavior.
Precision
Precision is the proportion of positive predictions that are actually correct.
Used to evaluate the system's reliability in reducing false alarms.
Ablation Study
An ablation study is an experimental method that evaluates the impact of removing or replacing certain parts of a model on its overall performance.
Used to confirm the superiority of the two-stage method over the full-frame approach.
Object Detection
Object detection is a computer vision technique used to identify target objects in images and mark their locations.
Used to localize students in exam-room images.
Behavioral Analysis
Behavioral analysis involves observing and analyzing individual behavior patterns to identify anomalies or specific behaviors.
Used to determine if student behavior is normal or cheating.
Dataset
A dataset is a collection of data samples used to train and evaluate machine learning models.
The paper uses a dataset compiled from 10 independent sources.
Inference Time
Inference time refers to the time it takes for a model to generate output results from input data.
Used to evaluate the system's performance in real-time applications.
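An average per-sample latency such as the 13.9 ms figure is typically obtained by timing many calls and dividing by the count. The sketch below shows that pattern with a toy workload; `mean_latency_ms` is a hypothetical helper, and the paper's actual measurement procedure is not described in this summary.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def mean_latency_ms(fn: Callable[[T], object], samples: Iterable[T], warmup: int = 3) -> float:
    """Average wall-clock time per call to fn, in milliseconds."""
    items = list(samples)
    for item in items[:warmup]:   # warm-up calls are not timed
        fn(item)
    start = time.perf_counter()
    for item in items:
        fn(item)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(items)


# Toy workload standing in for a model forward pass.
latency = mean_latency_ms(lambda n: sum(range(n)), [10_000] * 200)
print(f"{latency:.3f} ms per sample")
```

When the model runs on a GPU, the device usually has to be synchronized before the timer is read, otherwise asynchronous kernel launches make the measurement look faster than it is.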
Open Questions Unanswered questions from this research
- Question 1: How can Region of Interest (ROI) extraction be expanded to capture more complete cheating behavior without increasing computational complexity? Current methods primarily focus on faces and upper bodies, potentially missing evidence on the desk.
- Question 2: How can temporal continuity be integrated without affecting system performance to distinguish between brief innocent actions and prolonged cheating behavior?
- Question 3: How can data quality and annotation strategies be improved to enhance system robustness and accuracy? Inconsistencies in existing dataset annotations may affect model robustness.
- Question 4: How can multi-class classification be achieved without increasing system complexity to identify specific types of cheating?
- Question 5: How can system transparency and interpretability be improved without compromising student privacy?
Applications
Immediate Applications
Remote Exam Monitoring
The system can be used in remote exam environments to help educational institutions monitor student behavior in real-time, ensuring fairness and transparency in assessments.
Hybrid Learning Environments
In hybrid learning environments, the system can be used for large-scale proctoring, reducing the burden of manual invigilation and improving efficiency.
Academic Integrity Maintenance
By detecting and preventing exam cheating, the system helps educational institutions maintain academic integrity and enhance their credibility.
Long-term Vision
Intelligent Education Systems
The system can be part of intelligent education systems, providing real-time behavioral analysis and feedback to help students reflect and improve.
Cross-Domain Applications
The technology can be extended to other domains requiring behavioral monitoring, such as security surveillance and employee behavior analysis, providing broader social value.
Abstract
Academic integrity continues to face the persistent challenge of examination cheating. Traditional invigilation relies on human observation, which is inefficient, costly, and prone to errors at scale. Although some existing AI-powered monitoring systems have been deployed and trusted, many lack transparency or require multi-layered architectures to achieve the desired performance. To overcome these challenges, we propose an improvement over a simple two-stage framework for exam cheating detection that integrates object detection and behavioral analysis using well-known technologies. First, the state-of-the-art YOLOv8n model is used to localize students in exam-room images. Each detected region is cropped and preprocessed, then classified by a fine-tuned RexNet-150 model as either normal or cheating behavior. The system is trained on a dataset compiled from 10 independent sources with a total of 273,897 samples, achieving 0.95 accuracy, 0.94 recall, 0.96 precision, and 0.95 F1-score, a 13% increase over a baseline accuracy of 0.82 in video-based cheating detection. In addition, with an average inference time of 13.9 ms per sample, the proposed approach demonstrates robustness and scalability for deployment in large-scale environments. Beyond the technical contribution, the AI-assisted monitoring system also addresses ethical concerns by ensuring that final outcomes are delivered privately to individual students after the examination, for example, via personal email. This prevents public exposure or shaming and offers students an opportunity to reflect on their behavior. For further improvement, it is possible to incorporate additional factors, such as audio data and consecutive frames, to achieve greater accuracy. This study provides a foundation for developing real-time, scalable, ethical, and open-source solutions.