A ROBUST FRAMEWORK FOR DRIVER FATIGUE DETECTION FROM EEG SIGNALS USING ENHANCEMENT OF MODIFIED Z-SCORE AND MULTIPLE MACHINE LEARNING ARCHITECTURES Part 1
Aug 07, 2023
ABSTRACT: Physiological signals, such as electroencephalogram (EEG), are used to observe a driver’s brain activities. A portable EEG system provides several advantages, including ease of operation, cost-effectiveness, portability, and few physical restrictions. However, EEG signals can be challenging to analyze because they often contain various artifacts, including muscle activity, eye blinks, and unwanted noise. This study used independent component analysis (ICA) to remove such artifacts from the raw EEG data of 12 young, physically fit male participants between the ages of 19 and 24 who took part in a driving simulation. Driver fatigue state detection was then carried out using multichannel EEG signals obtained from O1, O2, Fp1, Fp2, P3, P4, F3, and F4. An enhanced modified z-score was applied to features extracted with a time-frequency domain continuous wavelet transform (CWT) to improve the reliability of driver fatigue classification. The proposed methodology offers several advantages. First, multichannel EEG analysis improves the accuracy of sleep stage detection, which is vital for accurate driver fatigue detection. Second, an enhanced modified z-score in feature extraction is more robust than conventional z-score techniques, making it more effective for removing outlier values and improving classification accuracy. Third, the proposed approach employs multiple machine learning classifiers, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Artificial Neural Networks (ANNs) that utilize Long Short-Term Memory (LSTM), and Support Vector Machines (SVM). Five classifiers were evaluated through 5-fold cross-validation. The results indicate that the proposed framework detects driver fatigue with an average accuracy of 96.07%. Among the classifiers, the ANN classifier achieved the highest accuracy of 99.65%, and the SVM classifier ranked second with an accuracy of 97.89%. Based on receiver operating characteristic (ROC) and area under the curve (AUC) analysis, all the classifiers performed well, with an average AUC value of 0.95. This study’s contribution lies in presenting a comprehensive and effective framework that can accurately detect driver fatigue from EEG signals.
KEYWORDS: driver fatigue; electroencephalogram (EEG); z-score; deep learning
1. INTRODUCTION
According to statistics from the World Health Organization, roughly 127,000 individuals lose their lives in traffic accidents every year, and nearly one-third of those casualties are teenagers and young adults [1]. Fatigued driving contributes to these fatalities, accounting for more than ten thousand deaths by a conservative estimate. Recently, warning systems have been proposed for some autonomous vehicles to prevent road accidents caused by driver fatigue. Such a system sounds an alarm in the vehicle after prolonged driving, prompting the driver to stop and take a break.
Physiological signals such as electroencephalograms (EEG) are used to observe a driver's brain activities. A portable EEG system provides several advantages over other electroencephalography systems, including ease of operation, cost-effectiveness, portability, and few physical restrictions [2]. The presence of artifacts in EEG signals, such as muscle activity, eye blinking, and unwanted noise, can pose a significant challenge for analysis. Therefore, the current paper proposes using an independent component analysis (ICA) technique to eliminate such noise from the raw EEG signal. Numerous studies have suggested that an essential component of precise sleep stage detection is the analysis of multichannel EEGs [3]. Consequently, the present study considers multichannel EEG signals obtained from O1, O2, Fp1, Fp2, P3, P4, F3, and F4 for detecting driver fatigue states.

Features extracted in the time-frequency domain with a continuous wavelet transform (CWT), combined with an enhanced modified z-score, improved the accuracy of driver fatigue classification. Choosing the most informative features is important for obtaining good results. The Morlet mother wavelet is commonly used in conventional CWT techniques because of its computational efficiency: it involves fewer computations, most of which are performed through the fast Fourier transform, and it requires less code [4].
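To make the Morlet-based CWT concrete, the following is a minimal sketch and not the authors' implementation. It uses direct time-domain convolution instead of the FFT-based computation mentioned above, purely to keep the example short. The parameter w0 = 6, the truncation of the wavelet at four scales, the 128 Hz sampling rate, the synthetic 10 Hz test signal, and all function names are assumptions made for illustration.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet wavelet at dimensionless time t (correction term omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def scale_for_frequency(freq_hz, w0=6.0):
    """Morlet scale (in seconds) whose equivalent Fourier frequency is freq_hz."""
    return (w0 + math.sqrt(2.0 + w0 * w0)) / (4.0 * math.pi * freq_hz)


def cwt_magnitude(signal, freqs_hz, fs, w0=6.0):
    """Return one row of |W| per requested frequency, by direct convolution."""
    dt = 1.0 / fs
    n = len(signal)
    rows = []
    for f in freqs_hz:
        s = scale_for_frequency(f, w0)
        half = math.ceil(4.0 * s / dt)  # assumption: truncate at +/- 4 scales
        # Conjugated, scaled wavelet sampled on its truncated support.
        kernel = [morlet(k * dt / s, w0).conjugate() * math.sqrt(dt / s)
                  for k in range(-half, half + 1)]
        row = []
        for i in range(n):
            acc = 0j
            for j, w in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < n:  # samples outside the record count as zero
                    acc += signal[idx] * w
            row.append(abs(acc))
        rows.append(row)
    return rows


def mean_energy(rows):
    """Mean squared magnitude per frequency row, usable as a simple feature."""
    return [sum(v * v for v in row) / len(row) for row in rows]


# Synthetic example (hypothetical data): a 10 Hz sine sampled at 128 Hz for 2 s.
fs = 128
signal = [math.sin(2.0 * math.pi * 10.0 * i / fs) for i in range(2 * fs)]
freqs = [5.0, 10.0, 20.0]
energies = mean_energy(cwt_magnitude(signal, freqs, fs))
```

In this sketch the row analysed at 10 Hz carries by far the most energy, which is the behaviour a time-frequency feature is expected to show for a narrow-band oscillation. An FFT-based implementation would give the same coefficients at lower cost for long recordings.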
In the field of data analysis and quality control, the identification of outliers is a crucial step in ensuring the accuracy and validity of statistical analyses. The z-score is a widely used method for detecting outliers in datasets, but it is itself sensitive to extreme values and is therefore not considered robust when such outliers are present. The modified z-score, which is less sensitive to outliers, was introduced to address this issue and has become a popular method for outlier detection in various applications. In recent years, the modified z-score has also been applied to feature extraction in machine learning and signal processing, where removing outlier values is crucial for accurate and robust analysis. This paper presents an enhancement of the modified z-score method for feature extraction in signal processing, specifically in driver fatigue detection using EEG signals.
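The difference between the two scores can be illustrated with a short sketch. It implements the standard modified z-score, M_i = 0.6745 (x_i - median) / MAD, with the commonly used cutoff of 3.5. It does not implement the enhanced variant proposed in this paper, which is not specified in this section. The feature values and function names are hypothetical.

```python
from statistics import mean, median, pstdev


def z_scores(values):
    """Conventional z-score: distance from the mean in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def modified_z_scores(values):
    """Standard modified z-score based on the median and the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; the modified z-score is undefined")
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, threshold=3.5):
    """Mark values whose absolute modified z-score exceeds the threshold."""
    return [abs(m) > threshold for m in modified_z_scores(values)]


# Hypothetical feature values with one extreme entry.
features = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 55.0]
conventional = z_scores(features)
flags = flag_outliers(features)
```

With these values the extreme entry inflates both the mean and the standard deviation, so its conventional z-score stays below 3 and it would escape a typical cutoff. The median and MAD are barely affected, so the modified z-score flags it clearly.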
Our proposed method has several strengths. First, using multichannel EEG analysis improves the accuracy of sleep stage detection, which is vital for accurate driver fatigue detection. Second, our use of enhanced modified z-score in feature extraction is more robust than conventional z-score techniques, making it more effective for removing outlier values and improving classification accuracy. Third, our approach utilizes various machine learning classifiers, providing a comprehensive and accurate method for driver fatigue detection.
This paper presents a methodology for the precise identification of distinct levels of driver drowsiness using diverse machine learning classifiers, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Artificial Neural Networks (ANNs) that incorporate Long Short-Term Memory (LSTM), and Support Vector Machines (SVM). A modified z-score technique was also introduced to enhance the statistical features used for classification, which significantly improved the accuracy of the proposed method. To evaluate the effectiveness of this approach, a 5-fold cross-validation strategy was employed to distinguish between driver fatigue and normal states.
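The 5-fold cross-validation strategy can be sketched as follows. This is an illustrative outline under stated assumptions, not the evaluation code used in this study: the shuffling seed, the toy threshold classifier, and the one-dimensional synthetic features are all hypothetical stand-ins for the real EEG features and classifiers.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k shuffled, non-overlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_validate(samples, labels, fit, predict, k=5):
    """Average test accuracy over k folds for a given fit/predict pair."""
    accuracies = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


def fit_threshold(xs, ys):
    """Toy classifier (assumption): midpoint between the two class means."""
    fatigued = [x for x, y in zip(xs, ys) if y == 1]
    alert = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(fatigued) / len(fatigued) + sum(alert) / len(alert)) / 2.0


def predict_threshold(threshold, x):
    """Label 1 (fatigue) above the threshold, 0 (normal) otherwise."""
    return 1 if x > threshold else 0


# Hypothetical, well-separated one-dimensional features for two states.
samples = [0.1 * i for i in range(10)] + [5.0 + 0.1 * i for i in range(10)]
labels = [0] * 10 + [1] * 10
accuracy = cross_validate(samples, labels, fit_threshold, predict_threshold)
```

Each sample is used for testing exactly once and for training in the other four folds, so the averaged accuracy reflects performance on data the model did not see during fitting.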
2. RELATED WORKS
Outlier detection is critical in various fields, including environmental monitoring, geology, epidemiology, and data mining. The modified z-score is a frequently employed technique for detecting outliers, which considers the weighted mean of adjacent data points to estimate the anticipated value of each point. Aggarwal et al. proposed a modified z-score method for detecting spatial outliers in datasets with spatial autocorrelation [5]. The technique improves the accuracy and robustness of the z-score test using a trimmed mean instead of the usual arithmetic mean. The study evaluated the method on simulated and real-world datasets and showed promising results in detecting spatial outliers. Although the modified z-score method proposed by Aggarwal et al. effectively detects spatial outliers with spatial autocorrelation, it may not perform well in datasets without spatial autocorrelation. Additionally, using a trimmed mean instead of the arithmetic mean may result in the loss of valuable information from the dataset.
Sandbhor et al. investigated the importance of detecting outliers in data mining and their effect on the quality and output of prediction models [6]. The study’s primary objective was to determine the most effective approach for detecting outliers in neural networks (NN) to forecast real estate values. The authors assessed several univariate outlier detection methods, such as Tukey’s Standard Deviation (SD), median, z-score, median absolute deviation (MAD), and modified z-score, on a set of 3,094 instances of property sales data. Based on the findings, it can be concluded that for this particular problem, the median technique proved to be the most efficient approach for detecting outliers. Although Sandbhor et al. found that the median technique was the most efficient approach for detecting outliers in neural networks for real estate value prediction, it is important to note that this conclusion may not necessarily apply to other types of datasets or outlier detection techniques. Moreover, the study only evaluated univariate outlier detection methods and did not consider multivariate techniques, which may be more effective in certain applications.
Leite et al. conducted a study to evaluate the effectiveness of the modified z-score as an indicator for identifying changes in entropy-based features to detect faults in bearings [7]. The research involved using 12 entropy-based features across the time, frequency, and time-frequency domains, in addition to three different entropy measures, namely Shannon entropy, Rényi entropy, and Jensen-Rényi divergence. The proposed technique was applied to two real bearing datasets obtained from run-to-failure experiments. Furthermore, three bearings with different defects were examined to verify the performance of the entropy-based features. The results demonstrated that the modified z-score is a robust method for detecting changes in entropy-based features, highlighting its potential for early detection of anomalies in the vibration signals of bearings. This finding suggests that the proposed technique can be effectively utilized for fault diagnosis in bearings. However, the method was evaluated only on these two run-to-failure datasets and three defective bearings.
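For reference, the Shannon and Rényi entropies named above can be computed from an amplitude histogram as in the sketch below. The histogram estimator, the bin count, and the function names are assumptions chosen for illustration. The twelve features and the Jensen-Rényi divergence used by Leite et al. are not reproduced here.

```python
import math
from collections import Counter


def histogram_probabilities(signal, n_bins=8):
    """Estimate amplitude probabilities with an equal-width histogram."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant signals
    counts = Counter(min(int((x - lo) / width), n_bins - 1) for x in signal)
    return [c / len(signal) for c in counts.values()]


def shannon_entropy(probs):
    """Shannon entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha=2.0):
    """Renyi entropy of order alpha in bits; alpha = 1 falls back to Shannon."""
    if alpha == 1.0:
        return shannon_entropy(probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)


# Four equally likely amplitude bins give 2 bits for both measures.
probs = histogram_probabilities([0.0, 1.0, 2.0, 3.0], n_bins=4)
h_shannon = shannon_entropy(probs)
h_renyi = renyi_entropy(probs, alpha=2.0)
```

A feature such as `h_shannon`, tracked over successive signal windows, is the kind of quantity whose changes a modified z-score can then monitor.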

Although outlier detection is a powerful tool for identifying unique data points, several limitations must be considered. For instance, in some cases, there may not be a clear definition of what constitutes an outlier, making it challenging to determine which data points to flag. Moreover, outlier detection methods may produce false positives or negatives, leading to incorrect conclusions and recommendations. Furthermore, choosing the appropriate outlier detection method for a specific dataset or problem can be complex, and there is no one-size-fits-all solution. Additionally, while outlier detection can identify anomalous data points, it may not always address the underlying cause of the outlier or provide a solution to the problem. Therefore, to get the most out of outlier detection, careful consideration of the goals and context of the analysis is essential. It is also important to use outlier detection in conjunction with other analytical tools and techniques to gain a more comprehensive understanding of the data and to develop effective solutions that address the root cause of any identified anomalies.
Several techniques have been suggested to identify the underlying mechanisms of fatigue in EEG signals. Among them, one method entails computing distinct types of entropies as feature sets based on a solitary channel [8]. Quintero-Rincon has presented a straightforward and efficient method for identifying driver fatigue in real-time systems using a single-channel EEG signal [9]. The algorithm selects the most significant channel and extracts four feature parameters to detect fatigue using an ensemble bagged decision trees classifier. By utilizing data obtained from the Jiangxi University of Technology database, the proposed approach achieves an accuracy of 92.7% with a 1.8-second time delay. However, it is important to note that the study evaluated the method on a specific dataset, and further research may be needed to determine its effectiveness on other datasets and under different conditions. Additionally, the time delay of 1.8 seconds may not be practical for real-time monitoring in some situations, and it is important to consider the potential impact on driver safety if there is a delay in detecting fatigue.
In another study, Jing et al. aimed to detect driving fatigue in low-voltage and hypoxia plateau environments using subjective and objective monitoring methods [10]. EEG signals from real-time driving tests were subjected to nonlinear and linear analyses to assess the signal trend during awake, critical, and fatigue states. The (α+θ)/β and (α+β)/θ energy features were identified as potential markers of driving fatigue in these environments, providing a basis for the development of a driving fatigue warning system. However, the study was limited to field driving fatigue tests in a specific environment, and further research is needed to validate the findings in other environments and driving conditions.
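The two energy ratios can be computed from band powers as in the following sketch. The band limits (theta 4 to 8 Hz, alpha 8 to 13 Hz, beta 13 to 30 Hz) are conventional values assumed here, since this section does not state the limits used by Jing et al. The plain discrete Fourier transform, the function names, and the synthetic signal are likewise illustrative assumptions.

```python
import cmath
import math

# Assumed conventional EEG band limits in Hz (not taken from the cited study).
BANDS = {"theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def band_powers(signal, fs):
    """Sum periodogram power per band using a direct DFT."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * fs / n
        band = next((name for name, (lo, hi) in BANDS.items()
                     if lo <= freq < hi), None)
        if band is None:
            continue  # skip bins outside the bands of interest
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        powers[band] += abs(coeff) ** 2 / n
    return powers


def fatigue_ratios(powers):
    """Return the (alpha+theta)/beta and (alpha+beta)/theta energy ratios."""
    alpha, beta, theta = powers["alpha"], powers["beta"], powers["theta"]
    return {
        "(alpha+theta)/beta": (alpha + theta) / beta,
        "(alpha+beta)/theta": (alpha + beta) / theta,
    }


# Synthetic signal (hypothetical): theta, alpha and beta components at 128 Hz.
fs = 128
signal = [
    1.0 * math.sin(2.0 * math.pi * 6.0 * i / fs)     # theta, amplitude 1.0
    + 2.0 * math.sin(2.0 * math.pi * 10.0 * i / fs)  # alpha, amplitude 2.0
    + 0.5 * math.sin(2.0 * math.pi * 20.0 * i / fs)  # beta, amplitude 0.5
    for i in range(2 * fs)
]
ratios = fatigue_ratios(band_powers(signal, fs))
```

Because power scales with the squared amplitude, the synthetic components have relative powers of 4 : 16 : 1 for theta, alpha, and beta, so the two ratios come out at 20 and 4.25 respectively.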
Additionally, Zhang et al. proposed an innovative approach known as clustering on brain networks (CBNs) to improve the performance of driver fatigue detection [11]. The CBNs approach employs a clustering algorithm to identify spatial nodes with unique connectivity features from EEG data. The wavelet entropy features obtained from these nodes are then transformed into spatiotemporal images and examined using an image edge detection technique to differentiate between various stages of fatigue. This method reduces signal interference and detects fatigue before the onset of subjective feelings, making it a potentially useful tool for early warning and accident prevention. The research also showed that EEG indicators in the time and frequency domains are limited for reliable driver fatigue detection because of signal mixing. However, the study had a limited sample size and lacked both a comparison with existing methods and validation in real-world driving scenarios. In another study, an intelligent system for automated driver fatigue detection utilizing EEG signals was proposed [12]. This system comprises a feature generation network that utilizes texture descriptors and a hybrid feature selection method to enhance detection accuracy. The framework achieved a classification accuracy of 97.29% for detecting fatigue using EEG signals, highlighting its potential for efficient driver fatigue detection. However, it relied on traditional machine learning algorithms, which may limit its ability to adapt to complex and dynamic driving environments.
Another study introduced a novel approach for efficiently detecting driver fatigue using EEG signals [13]. The method employs a new channel selection algorithm based on correlation coefficients, an ensemble classifier using random subspace k-nearest neighbors (k-NN), and power spectral density (PSD) for feature extraction. The approach achieved an accuracy of 99.99% for identifying driver fatigue using EEG signals in a 0.5-second time window. However, due to its high computational complexity, a k-NN-based ensemble classifier may not be suitable for real-time applications. In another study, Hwang et al. proposed a subject-independent EEG-based driver fatigue state classification model that addresses individual performance gaps [14]. The authors utilized an adversarial training approach to induce the misclassification of subject labels in the classification model. Additionally, they incorporated an Inter-subject Feature Distance Minimization (IFDM) technique to minimize performance discrepancies between individuals. Their method enabled training on EEG datasets with limited subject labels and was evaluated on the SEED-VIG dataset, resulting in superior accuracy and decreased individual performance variability when classifying drowsiness. However, one major drawback remains: EEG signals differ greatly between individuals, making it challenging to build a unified model that performs well for everyone.
The studies reviewed propose various methods for detecting driver fatigue using EEG signals, ranging from single-channel feature extraction to more complex machine-learning models. One common approach involves using power spectral density and various entropy measures as feature sets, while others utilize clustering algorithms and image edge detection to distinguish different stages of fatigue. Several studies also address individual performance gaps and subject variability by employing adversarial training strategies and component-specific batch normalization. These studies demonstrate the potential of EEG-based driver fatigue detection for early warning and accident prevention, achieving high accuracies and providing new possibilities for extracting more information from complex EEG data. However, the methods vary in computational complexity, the number of channels required, and the level of subject independence achieved, suggesting that further research is needed to identify the most efficient and effective approach for practical applications.

Wilaiprasitporn et al. proposed a deep learning approach that combines CNN and RNN to identify individuals using affective EEG data [15]. Their study used the Database for Emotion Analysis using Physiological Signals (DEAP) and showed that the proposed method outperforms an SVM baseline system with a Correct Recognition Rate (CRR) of up to 99.90-100%. Recent research suggests that CNN-GRU models outperform CNN-LSTM models in identifying individuals using EEG data from the brain’s frontal region, and they are effective at countering the impact of affective states. However, the proposed method relies on EEG signals, which may require specialized equipment as well as expertise in data collection and analysis. Qin et al. proposed a deep-learning model that combines CNN and LSTM to extract vein features from raw images for finger-vein biometrics [16]. The proposed model uses supervised encoding to eliminate binary vein texture, resulting in significantly improved verification accuracy when evaluated on a publicly available finger vein database. However, deep learning models are prone to overfitting, that is, learning the training data too well and failing to generalize to new data. Techniques such as regularisation and dropout can help prevent overfitting.
Mondal et al. developed a multitask learning framework using a CNN and a bidirectional long short-term memory (Bi-LSTM) model to analyze surgical workflows from video data [17]. Their framework included a joint distribution loss function for concurrent tool usage during phase identification. The proposed method demonstrated excellent tool and phase identification performance compared to previous approaches when evaluated on the Cholec80 dataset. However, the limitation of this study was that it was only evaluated on a single dataset, and it is unclear how well the proposed approach would generalize to other surgical datasets. Hu et al. proposed the Deep Complex Convolution Recurrent Network (DCCRN), a network architecture that can handle both CNN and RNN structures and replicate complex-valued operations [18]. In the Interspeech 2020 Deep Noise Suppression (DNS) challenge, DCCRN outperformed previous networks based on objective and subjective metrics and obtained the top rank for the real-time track and the second rank for the non-real-time track based on Mean Opinion Score (MOS). The proposed DCCRN network with 3.7M parameters proved highly effective in this task. However, the study focused on speech enhancement in clean environments and did not consider noisy or reverberant conditions common in real-world scenarios.
Researchers proposed a machine learning model that utilized CNN, U-net architecture, RNN, and LSTM architecture to create structural topology configurations that fulfilled minimum compliance and deformation criteria under various load conditions and volume fraction limitations. The model was trained using randomly generated finite element simulation data and a strategy to remove elements during training. The model outperformed traditional methods regarding time, cost, and practicality when applied to two-dimensional and three-dimensional cantilever-beam structural topology designs. This data-driven approach can speed up preliminary structural design procedures [19]. However, the study’s limitations include the need for training data and the lack of validation on real-world applications. Later, other researchers focused on improving solar radiation estimation models in agriculture meteorology due to limited data availability and low data quality [20]. Several neural network models (SVM, Extreme Learning Machine, CNN, and LSTM) were developed and tested in Southern Spain using different input variable configurations. Performance was analyzed using various statistical indices. One limitation of this study is that it only focused on using temperature and relative humidity as input variables for solar radiation estimation. Other climatic variables that can affect solar radiation, such as atmospheric pressure, cloud cover, and wind speed, were not included in this study. Incorporating these variables could potentially improve the accuracy of solar radiation estimation.
The previous works discussed different deep learning approaches for various applications, including affective EEG-based person identification, finger-vein biometrics, surgical workflow analysis, speech enhancement, and structural topology design. The proposed models showed significant accuracy, efficiency, and applicability improvements over previous methods. Different deep learning architectures, such as CNNs, RNNs, and LSTM, extracted features from raw data, such as EEG signals, video data, and simulation data. The models were evaluated on different datasets and achieved state-of-the-art results regarding recognition rate, mean average precision, and mean opinion score. Additionally, deep learning models were used to improve solar radiation estimation models in agriculture meteorology.
In conclusion, outlier detection is a valuable tool for identifying anomalies in data. However, its limitations must be carefully considered, such as the lack of a clear definition for what constitutes an outlier, the possibility of false positives or false negatives, and the challenge of choosing the appropriate method for a specific dataset or problem. EEG-based driver fatigue detection has shown great potential for early warning and accident prevention using various deep learning methods, achieving high accuracies and extracting more information from complex EEG data. Moreover, deep learning has significantly improved accuracy, efficiency, and applicability for various applications, such as affective EEG-based person identification, finger-vein biometrics, surgical workflow analysis, speech enhancement, and structural topology design. Further research is needed to identify the most efficient and effective approach for practical applications in outlier detection and deep learning.