PROGRAM FOR EXTRACTING NUMERICAL DATA FROM A VIDEO STREAM TO DETERMINE THE STATE OF HUMAN DROWSINESS FROM THE DYNAMICS OF FACIAL EXPRESSION
Abstract:
The article addresses the development and use of a program for extracting numerical data from a video stream and determining a person's drowsiness from the dynamics of their facial expressions. The program is built on computer vision technology, in particular the MediaPipe library, which provides algorithms for recognizing faces and hands in images and includes a Face Mesh module that extracts up to 468 key facial landmarks from an image. This makes it possible to automate and simplify the analysis of facial expression dynamics for determining a person's drowsiness. The program, developed in Python 3.8, extracts the following quantitative features from a video stream: the average EAR (Eye Aspect Ratio), the average MAR (Mouth Aspect Ratio), the number of blinks, the presence of yawns, the duration of a single blink, the blink frequency, and the average number of frames over which one blink is recorded. An algorithm for processing frames from the video stream and the results of its operation are presented. To verify the correctness of the extracted data and their suitability for determining drowsiness from facial expression dynamics, a neural network model was trained. The SUST Driver Drowsiness Dataset from the publicly available Kaggle platform was selected for training; it contains about 2,200 video files of drivers in states of wakefulness and drowsiness. For the primary study, experts selected 200 video files from the dataset, 100 corresponding to drowsiness and 100 to wakefulness. Using the developed program, numerical features were extracted from each video file and saved to a single CSV file. To build the neural network model, the data were split into a training set (80%) and a test set (20%). The final classification accuracy of the neural network was 80.49%.
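The EAR and MAR features mentioned above follow widely used geometric definitions over facial landmarks. Below is a minimal, hedged sketch of how they could be computed once landmark coordinates have been obtained (e.g. from MediaPipe Face Mesh); the landmark ordering used here is an illustrative assumption, not the article's actual index mapping:

```python
import math

def dist(p, q):
    """Euclidean distance between two (x, y) landmark points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def eye_aspect_ratio(eye):
    """EAR for six eye landmarks, assumed ordered p1..p6:
    p1/p4 are the horizontal eye corners, p2/p3 the upper lid,
    p5/p6 the lower lid. Low EAR indicates a closed eye."""
    p1, p2, p3, p4, p5, p6 = eye
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

def mouth_aspect_ratio(mouth):
    """MAR for four mouth landmarks, assumed ordered
    (left corner, top lip, right corner, bottom lip).
    High MAR indicates a wide-open mouth, e.g. a yawn."""
    left, top, right, bottom = mouth
    return dist(top, bottom) / dist(left, right)
```

With coordinates from a face-landmark detector, averaging these per-frame ratios over a video yields the mean EAR and MAR features the abstract describes.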
Therefore, the numerical data extracted from the video stream by the developed program are informative and allow adequate models for determining human drowsiness to be built. Using EAR, MAR, the number of blinks, the presence of yawns, the duration of a single blink, the blink frequency, and the average number of frames per blink ensures reliable detection of drowsiness from the dynamics of facial expression. Thus, the program is an effective tool for collecting data, training intelligent models, and developing automated systems for monitoring human drowsiness.
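The blink-related features (number of blinks, average frames per blink) can be derived from the per-frame EAR sequence. A hypothetical sketch follows, using a simple threshold rule; the threshold of 0.2 and the minimum run length are illustrative assumptions, not the article's actual parameters:

```python
def count_blinks(ear_values, threshold=0.2, min_frames=2):
    """Count blinks in a per-frame EAR sequence.

    A blink is taken to be a run of at least `min_frames` consecutive
    frames with EAR below `threshold`. Returns (blink_count,
    average_frames_per_blink).
    """
    blinks = 0
    run_lengths = []
    run = 0
    for ear in ear_values:
        if ear < threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
                run_lengths.append(run)
            run = 0
    # Close out a blink that runs to the end of the sequence.
    if run >= min_frames:
        blinks += 1
        run_lengths.append(run)
    avg_frames = sum(run_lengths) / blinks if blinks else 0.0
    return blinks, avg_frames
```

Dividing the blink count by the video duration then gives the blink frequency, and the run lengths multiplied by the frame interval give per-blink durations.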

Keywords:
VIDEO STREAM, HUMAN DROWSINESS STATE, FACIAL EXPRESSION DYNAMICS, MEDIAPIPE, MAR, EAR, FEATURE EXTRACTION PROGRAM, NEURAL NETWORK