How Deep Learning is changing machine learning AI in EEG data processing

By the Bitbrain team
April 23, 2020

EEG systems capture information about many different aspects of our cognition, behavior, and emotions. The technology not only helps to study the brain, but also has applications in health, in affective and emotional EEG monitoring, and in human improvement. However, EEG data is not easy to interpret: it has a lot of noise, varies significantly between individuals and, even for the same person, changes substantially over time. In this post, we will discuss how AI and machine learning are used to process EEG data and how new trends, in particular deep learning, are changing the way EEG will be analyzed in the future.

Why do we need machine learning for EEG data?

EEG signals record the electrical activity of the brain using EEG electrodes placed on the scalp. They are noisy, have artifacts, and, above all, they are not the type of signals people are used to dealing with (images, charts, etc.). Doctors, neuroscientists, and biomedical engineers usually train for years to understand and extract meaningful information from EEG data.

Even in these cases, the raw recorded data needs to be processed before specialists look at it. Temporal and spatial filtering is usually applied, as well as artefact rejection procedures, even if the participant stays still during recording. This processed EEG can then be visually inspected to detect anomalies (e.g. an epileptic episode), changes in the mental state (e.g. sleep phases) or to study grand average responses of groups of people.

Visual inspection is a long, expensive, and tedious process. It does not scale up well and cannot be transferred to BCI applications. AI and machine learning tools are the perfect companion to automate, extend, and improve EEG data analysis. Indeed, BCI systems such as spellers or brain-controlled devices are based on decoding pipelines that make extensive use of different machine learning algorithms.

Pre-deep learning era: Signal processing, EEG feature extraction, and classification

Before the deep learning revolution, the standard EEG pipeline combined techniques from signal processing and machine learning to enhance the signal to noise ratio, deal with EEG artefacts, extract features, and interpret or decode signals. Figure 1 shows the most common pipeline when processing EEG. 

Figure 1: The three important steps when processing EEG: 1) Pre-processing deals with noise, artefacts, and SNR enhancement; 2) feature extraction further processes the signal to create meaningful descriptors for the decoding task at hand; and 3) decoding uses classification/regression models to transform the EEG features into high-level signals such as letters in a speller, directions of motion, affective or cognitive states or clinical markers.

EEG signal - Preprocessing

From a computational point of view, the raw EEG signal is simply a discrete-time multivariate (i.e. with multiple dimensions) time series. The number of EEG channels determines the dimension of each point of the time series: each point contains the samples acquired from all channels at the same time instant. The number of points in the time series depends on the recorded time and the sampling rate (e.g. 256 Hz). These raw signals are rarely used directly, since they may contain DC offsets and drifts, electromagnetic noise, and artifacts that need to be filtered out. Signal processing is used in the first steps to remove noise, filter out artifacts, or isolate an improved version of the signal of interest. Noise and artifacts are such an important part of the analysis of EEG signals that a whole body of literature has studied and continues to study this problem. You can learn about this in our dedicated post All about EEG artifacts and filtering tools.
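To make the data layout concrete, here is a minimal sketch in Python with NumPy. It simulates a raw recording as a channels-by-samples array and removes a constant offset and a linear drift with a per-channel least-squares fit. The recording length, channel count, noise levels, and the helper name `remove_offset_and_drift` are assumptions chosen for illustration; real recordings usually need more elaborate artifact handling than this.

```python
import numpy as np

sampling_rate_hz = 256   # example rate mentioned in the text
duration_s = 10          # assumed recording length
n_channels = 8           # assumed number of electrodes

# Number of points in the time series = recorded time x sampling rate.
n_samples = duration_s * sampling_rate_hz

# Simulated raw EEG: one row per channel, one column per time point.
rng = np.random.default_rng(0)
t = np.arange(n_samples) / sampling_rate_hz
raw = rng.normal(0.0, 10.0, size=(n_channels, n_samples))
raw += 50.0      # constant DC offset on every channel
raw += 2.0 * t   # slow linear drift on every channel


def remove_offset_and_drift(eeg, times):
    """Subtract a per-channel least-squares line (offset plus linear drift)."""
    design = np.vstack([np.ones_like(times), times]).T        # (n_samples, 2)
    coeffs, *_ = np.linalg.lstsq(design, eeg.T, rcond=None)   # (2, n_channels)
    return eeg - (design @ coeffs).T


clean = remove_offset_and_drift(raw, t)

# A single point of the multivariate time series: all channels at t = 1 s.
sample_at_1s = clean[:, sampling_rate_hz]
```

Indexing a column, as in the last line, returns one multivariate point, while indexing a row returns the full time course of one channel.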

Once the signal is clean, it is time to enhance and uncover the brain patterns and neural correlates of interest. In many cases, the brain processes under study are located in a particular frequency band, such as the P300 evoked response that occurs in the Theta band (4-7 Hz) or the modulation of the sensorimotor or mu rhythms, which occur between 8 and 15 Hz. The simplest processing is to use frequency filters, such as low-pass or band-pass filters, to isolate the bands of interest and remove those frequencies of no interest. Figure 2 shows the spectrum of EEG activity and the most common bands used when analyzing EEG correlates. 

Figure 2: Left: EEG spectrum for two different conditions: focused vs. distracted. Based on it, one may select the frequency range shaded in grey to distinguish these two conditions. Right: EEG activity filtered on the most common bands. The Gamma band (30-140 Hz) also shows activity correlated with cognitive processes and is altered in cognitive disorders.
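The band isolation described above can be sketched with a very simple band-pass filter. The version below zeroes the FFT bins outside the band of interest, which is an assumption made for brevity: it stands in for the FIR or IIR filters normally used in practice, and it only works offline on a complete recording. The function name `fft_bandpass` and the test signal (a 10 Hz rhythm plus 50 Hz interference) are illustrative.

```python
import numpy as np


def fft_bandpass(eeg, sampling_rate_hz, low_hz, high_hz):
    """Zero every frequency bin outside [low_hz, high_hz] for each channel."""
    n_samples = eeg.shape[-1]
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sampling_rate_hz)
    spectrum = np.fft.rfft(eeg, axis=-1)
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[..., ~keep] = 0.0
    return np.fft.irfft(spectrum, n=n_samples, axis=-1)


sampling_rate_hz = 256
t = np.arange(4 * sampling_rate_hz) / sampling_rate_hz   # 4 seconds

mu_rhythm = np.sin(2 * np.pi * 10 * t)             # 10 Hz, inside 8-15 Hz
interference = 0.5 * np.sin(2 * np.pi * 50 * t)    # 50 Hz, outside the band
signal = (mu_rhythm + interference)[np.newaxis, :]  # shape (1 channel, samples)

# Keep only the 8-15 Hz range mentioned for the sensorimotor rhythms.
filtered = fft_bandpass(signal, sampling_rate_hz, low_hz=8.0, high_hz=15.0)
```

With this synthetic input the 50 Hz component is removed and the 10 Hz rhythm is preserved, because both frequencies fall exactly on FFT bins. On real data, sharp cut-offs in the frequency domain can introduce ringing, which is one reason dedicated filter designs are usually preferred.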

EEG Signal Feature extraction

After pre-processing, it is time to extract meaningful features from the cleaned EEG data. In the pre-deep learning era, feature extraction was based on ad-hoc methods for the brain process of interest, ranging from hand-crafted features to more sophisticated techniques such as linear and nonlinear spatial filtering. The latter range from generic methods, such as principal component analysis and independent component analysis, to more EEG-specific ones, such as common spatial patterns (CSP) (Blankertz, 2007) and variants (Ang, 2008) for power features and xDAWN (Rivet, 2009) for temporal ones. Figure 3 shows one of the simplest feature extraction methods, which basically subsamples the cleaned EEG signals directly in the temporal or frequency domain.

Figure 3. An example of the simplest feature extraction. Time domain (left): a one second window of theta band filtered EEG signal is subsampled and the corresponding values stacked on a feature vector. Frequency domain (right): the power spectrum of the EEG over a given window is computed and the power on a given frequency range is stacked into another feature vector. These vectors can be used independently or combined for further processing.
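The two feature types in Figure 3 can be sketched as follows. This is a minimal illustration, assuming the input window has already been cleaned and band-filtered. The function names, the subsampling step, and the choice to summarize the band by its mean periodogram power per channel are assumptions, not the exact procedure behind the figure.

```python
import numpy as np


def time_domain_features(window, step):
    """Keep every `step`-th sample of each channel and stack into one vector."""
    return window[:, ::step].ravel()


def band_power_features(window, sampling_rate_hz, low_hz, high_hz):
    """Mean periodogram power per channel within [low_hz, high_hz]."""
    n_samples = window.shape[-1]
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sampling_rate_hz)
    power = np.abs(np.fft.rfft(window, axis=-1)) ** 2 / n_samples
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    return power[:, in_band].mean(axis=1)


sampling_rate_hz = 256
rng = np.random.default_rng(1)
window = rng.normal(size=(4, sampling_rate_hz))   # 4 channels, one second

temporal = time_domain_features(window, step=16)                     # 4 x 16 values
spectral = band_power_features(window, sampling_rate_hz, 4.0, 7.0)   # 4 values

# The two vectors can be used independently or combined, as in the caption.
features = np.concatenate([temporal, spectral])
```

Subsampling is only safe after low-pass or band-pass filtering. Otherwise, frequencies above half the new sampling rate alias into the retained samples.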

The extracted features are usually tailored to the specific application, such as finding differences between experimental conditions (e.g. levels of attention, responses to mismatched actions), distinguishing between a group of predefined classes (e.g. a speller), predicting behavior (e.g. by anticipating motion in neurorehabilitation), or finding anomalies with respect to a normative database (e.g. QEEG or seizures). Current state-of-the-art techniques include Riemannian geometry-based classifiers, filter banks, and adaptive classifiers, used to handle, with varying levels of success, the challenges of EEG data (Perronnet, 2016; Lotte, 2018).

EEG Data Decoding

Once features are ready, it is time to use the information to automatically decode EEG. The most common approach is supervised learning. This uses a set of examples known as the training dataset to learn a model that can classify, predict, or identify the EEG patterns based on the extracted features. A large variety of methods exist. The most common are classification methods, which assign an EEG pattern to one of a set of predefined classes, and regression methods, which transform the EEG pattern into another signal such as a motion direction. Commonly used methods include simple linear methods (LDA for classification and multiple linear regression), kernel methods such as SVMs, random forests, neural networks (see the deep learning section below), or more sophisticated combinations of methods.

Whatever the method, the supervised approach needs a labeled training dataset. This dataset is used to train and evaluate the method, normally using cross-validation. There are some important considerations for EEG decoders due to the non-stationary and subject-dependent nature of EEG: 1) the features extracted for one person at a certain point in time may not be well-suited for the same person later on; and 2) the features for a particular participant may be different from the features for another participant. In technical terms, the distribution of the features changes, and the models need to be retrained on an updated training dataset. Initially, decoders were participant and session-specific, i.e. a dedicated training set was acquired for each participant and session. In practical terms, this has a big impact on the effort required to build and train these models and on their deployment outside laboratory settings. Calibrating each participant is an expensive and tedious process!
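As a rough illustration of supervised decoding with cross-validation, the sketch below trains a two-class LDA on synthetic feature vectors and estimates its accuracy with k-fold cross-validation. Everything here is an assumption for illustration: the data is random rather than EEG, the helper names are made up, and a small shrinkage term is added to keep the covariance matrix invertible.

```python
import numpy as np


def fit_lda(X, y, shrinkage=1e-3):
    """Two-class LDA with a shared covariance; returns weights and bias."""
    mean0, mean1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    centered = np.vstack([X[y == 0] - mean0, X[y == 1] - mean1])
    cov = centered.T @ centered / (len(X) - 2)
    cov += shrinkage * np.eye(X.shape[1])    # keeps the covariance invertible
    w = np.linalg.solve(cov, mean1 - mean0)
    b = -0.5 * w @ (mean0 + mean1)           # assumes equal class priors
    return w, b


def predict_lda(model, X):
    w, b = model
    return (X @ w + b > 0).astype(int)


def cross_validate(X, y, n_folds=5, seed=0):
    """Mean accuracy over shuffled k-fold splits."""
    order = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(order, n_folds)
    scores = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        model = fit_lda(X[train], y[train])
        scores.append(np.mean(predict_lda(model, X[test]) == y[test]))
    return float(np.mean(scores))


# Synthetic stand-in for extracted features: two classes with shifted means.
rng = np.random.default_rng(42)
n_per_class, n_features = 100, 6
X = np.vstack([rng.normal(0.0, 1.0, (n_per_class, n_features)),
               rng.normal(1.0, 1.0, (n_per_class, n_features))])
y = np.repeat([0, 1], n_per_class)

accuracy = cross_validate(X, y)
```

Shuffled folds are a simplification. Because EEG features drift over time and differ across people, splitting by session or by participant gives a more honest estimate of how a decoder would behave on new recordings.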

To overcome this limitation, several methods are available. Current techniques aim at minimizing this calibration process and attempt to design robust methods that work over time and across participants (López-Larraz, 2018).

There is one last point that deserves discussion. Up to now, we have assumed that we know exactly at what point in time we have the relevant EEG information. Although this is the case for many applications (e.g. an EEG speller), in many other BCI and neurotech applications, this assumption does not hold. Consider, for instance, detecting an epileptic seizure at home, or detecting the intention of moving a limb during a neurorehabilitation session. In these setups, it is necessary to process the EEG online, in an asynchronous manner. This adds a further challenge to the decoding task: it is not enough to distinguish the patterns of interest, but one also has to deal with background EEG.

All the previous processing has to be extended or adapted to obtain such asynchronous decoding. The simplest way is to use a sliding window where we compute our output for each window independently (see Figure 4 for a motion decoding example). During training, background EEG is labeled as “rest”, while the onset of motion is extracted using some calibration protocol such as EMG activity or buttons. The same supervised learning algorithms can then be applied to learn the decoder. The latter can then be used over a sliding window to provide continuous decoding.

Figure 4: Example of online detection using a sliding window. During training, windows are extracted from the training examples to train the classifiers. In real-time, decoding is done for each window independently. The output, in this case, is a probability along time that can be further smoothed, if required.
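The sliding-window scheme can be sketched as below. The decoder here is a hypothetical placeholder (`toy_decoder` maps mean signal power to a pseudo-probability) standing in for a trained classifier. The window length, step, smoothing width, and simulated data are likewise assumptions.

```python
import numpy as np


def sliding_windows(eeg, window_len, step):
    """Yield (start_index, window) pairs over a (channels, samples) array."""
    for start in range(0, eeg.shape[1] - window_len + 1, step):
        yield start, eeg[:, start:start + window_len]


def toy_decoder(window):
    """Hypothetical decoder: maps mean signal power to a pseudo-probability."""
    power = np.mean(window ** 2)
    return 1.0 / (1.0 + np.exp(-(power - 2.0)))


def moving_average(values, width):
    """Causal smoothing: each output averages the last `width` values."""
    values = np.asarray(values, dtype=float)
    smoothed = np.empty_like(values)
    for i in range(len(values)):
        smoothed[i] = values[max(0, i - width + 1):i + 1].mean()
    return smoothed


sampling_rate_hz = 256
rng = np.random.default_rng(3)
eeg = rng.normal(size=(2, 10 * sampling_rate_hz))   # 2 channels, 10 seconds
eeg[:, 5 * sampling_rate_hz:] *= 2.0                # simulated event: higher power

# One-second windows, advanced by a quarter of a second, decoded independently.
starts, probs = zip(*[(start, toy_decoder(window))
                      for start, window in sliding_windows(eeg, sampling_rate_hz, 64)])
smoothed = moving_average(probs, width=4)
```

The smoothing is causal, so it could run in real time, at the cost of delaying the detection by a few windows.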

Deep-Learning for EEG

Deep learning has radically changed machine learning in many domains (e.g. computer vision, speech, reinforcement learning, etc.) by providing general purpose and flexible models that can work with raw data and learn the appropriate transformations for a problem at hand. These models can use large amounts of EEG data to directly learn features and capture the data structure in an efficient way that can be then transferred and/or adapted to different tasks. This end-to-end learning ability fits perfectly with the requirements of EEG analysis, where multiple interdependent processes are key and, until recently, were carefully designed for each different purpose. 

Deep learning EEG challenges

EEG data has its own challenges.

  1. First, data collection is still expensive, time-consuming, and restricted to a small number of teams working mainly in research laboratories. Medical data is not usually available due to personal data regulation, and data collected by companies is also private for the same reason. Consequently, the corpus of data is by no means close in size to that of other domains such as computer vision or speech recognition. Many public EEG datasets only have a small number of participants, on the order of tens (see Google Dataset Search for EEG datasets and the BNCI database for BCI datasets). Some fields, such as sleep and epilepsy, do have larger public datasets with thousands of participants. For epilepsy, the Temple University Hospital dataset has over 23,000 sessions from over 13,500 patients, for a total of over 1.8 years of recording (Obeid, 2016). In sleep, the Massachusetts General Hospital Sleep Laboratory has over 10,000 participants with 80,000 hours of recordings, which is over 9 years.
  2. Second, the amount of information is limited due to a low signal to noise ratio and depends heavily on the data collection protocol. This limits, even more, the amount of available data and makes the transfer between protocols and participants more difficult. 
  3. Third, models developed for images and speech have been studied for many years and, although technically generic, they are not necessarily the most appropriate ones for EEG. This also includes many good strategies used for training the models that cannot be so efficiently implemented in EEG, such as data augmentation techniques for images (Hartmann, 2018).

How can deep learning be used for EEG decoding?

These challenges have not stopped researchers and practitioners from using deep learning, and the last 10 years have seen a fast increase in results across all the fields related to EEG data. A very interesting review of more than 100 papers (Roy, 2019) sheds light on the current state of the art. Figure 5 shows how the main fields of application of EEG data analysis have tried deep learning and which deep models are the most common. There is still no clear dominant architecture. Many of those applied to EEG have been directly borrowed from previous applications such as computer vision. As a result, convolutional neural networks (CNNs) are the most common architecture, while autoencoders and recurrent networks are also used often.

Figure 5: Statistics on DL applied to EEG data copied from (Roy, 2019): Number of publications per domain per year (left) and type of architectures used (right). 

In most cases, the deep learning methods perform feature extraction and decoding simultaneously (see Figure 1) and they use the same supervised approach described above for the pre-deep learning pipeline. In many cases, the pre-processing is simplified, for example, by computing power features or segmenting the input data. Interestingly, some deep models have shown end-to-end decoding performance that improves on previous methods while dealing directly with common EEG issues such as eye movements (eye opening and closing, blinking, etc.), artifacts, or background EEG.
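To illustrate what "feature extraction and decoding simultaneously" can mean, here is a forward pass of a tiny convolutional decoder written with NumPy. It is a sketch under strong assumptions: the layer sequence (temporal convolution, spatial filtering, log-power, linear readout) is a simplified illustration and not the architecture of any cited paper, and the weights are random rather than trained. In a real model, all four parameter sets would be learned jointly from labeled data.

```python
import numpy as np


def forward(eeg, temporal_kernels, spatial_filters, readout_w, readout_b):
    """Temporal convolution -> spatial filter -> log-power -> softmax."""
    # 1) Temporal convolution: every kernel slides along each channel.
    filtered = np.stack([
        np.stack([np.convolve(channel, kernel, mode="valid") for channel in eeg])
        for kernel in temporal_kernels
    ])                                                   # (kernels, channels, time)
    # 2) Spatial filtering: one weighted sum over channels per kernel.
    mixed = np.einsum("kc,kct->kt", spatial_filters, filtered)   # (kernels, time)
    # 3) Log of mean power: a learnable analogue of band-power features.
    features = np.log(np.mean(mixed ** 2, axis=1) + 1e-12)       # (kernels,)
    # 4) Linear readout followed by a softmax over the classes.
    logits = readout_w @ features + readout_b
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


rng = np.random.default_rng(7)
n_channels, n_samples, n_kernels, kernel_len, n_classes = 8, 512, 4, 25, 3

eeg = rng.normal(size=(n_channels, n_samples))   # stand-in for one raw EEG trial
params = dict(
    temporal_kernels=rng.normal(size=(n_kernels, kernel_len)),
    spatial_filters=rng.normal(size=(n_kernels, n_channels)),
    readout_w=rng.normal(size=(n_classes, n_kernels)),
    readout_b=np.zeros(n_classes),
)

class_probs = forward(eeg, **params)   # one probability per class
```

Steps 1 to 3 play the role of the hand-designed frequency filters, spatial filters, and power features of the classical pipeline. The difference is that here they are parameters that training can adjust for the task.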

What are the results of Deep Learning EEG decoding? And, how do results compare to previous methods? 

The authors of the meta-review in Roy (2019) computed the median improvement in accuracy to be around 5.4%, consistently across all the domains shown in Figure 5. Although they also point out some reproducibility concerns, the results show that, despite the challenges mentioned above, deep learning improves decoding results - in many cases, with minimal or no pre-processing. One interesting consequence of using data-hungry deep learning techniques is that the standard participant/session-specific setup has been replaced by a more ecological one where all sessions and participants contribute to training the decoders. To give a more detailed view, we highlight results in four different applications that are relevant for understanding the current state of the art:

  • Mental task decoding: Perhaps the most successful deep learning models are Convolutional Neural Networks. In this first example, we will look at how they can be used to decode end-to-end mental tasks from EEG. End-to-end decoding exploits deep architectures by learning maps directly from the raw data and obtaining high-level representations, in this case, the mental imagination of the user. Convolutional networks were designed in computer vision and they can be seen as shift-invariant spatial filters over the image. They learn local features that are then used to create higher-level features in deeper layers of the neural network.

    This same idea can be used to create time and frequency filters that automatically learn features from the raw EEG, and then learn more complex features from there. Figure 6 shows the ConvNet architecture proposed in (Schirrmeister, 2017). The results show that an end-to-end mapping can perform as well as filter bank common spatial patterns (FBCSP), the current state-of-the-art method developed specifically for motor imagery (Ang, 2008). While FBCSP is designed to use spectral power modulations, the features used by ConvNets are not fixed a priori. For those interested in detailed technical implementation aspects, (Schirrmeister, 2017) provides deep insight into how recent developments in regularization and normalization can make a big difference when training your model.

Figure 6: ConvNet, a deep learning architecture based on CNNs for end-to-end decoding of motor imagery. The raw data enters at the first layer (top) and then creates higher-level features that capture brain patterns to distinguish between left/right hand and foot motor imagery. Image borrowed from (Schirrmeister, 2017).


  • Sleep EEG decoding: The SLEEPNET model (Biswal, 2017) has been trained and evaluated on the largest sleep physiology database assembled to date, consisting of polysomnography (PSG) recordings from over 10,000 patients from the Massachusetts General Hospital (MGH) Sleep Laboratory. SLEEPNET implements a recurrent network and achieves human-level annotation performance on an independent test set of 1,000 patients, with an average accuracy of 85.76% and an algorithm-expert inter-rater agreement (IRA) of κ = 79.46%, comparable to expert-expert inter-rater agreement. This represents a 10% increase in accuracy over non-deep learning methods. For those interested, Cohen’s kappa κ is used in sleep studies to measure the agreement between different annotations of the same sleep recording made by medical doctors, who typically reach an inter-rater agreement of around 65-75%. Figure 7 below shows an example of annotated sleep EEG and predicted states.

Figure 7: Image from (Biswal, 2017). From top to bottom: Raw EEG data and spectrogram, human labels, and predicted ones. 

  • Decoding affective states: One of the most common databases for decoding affective states is DEAP (Koelstra, 2011). It consists of 40-minute recordings of EEG and other biosensors from a total of 32 subjects while they watched music videos (see Figure 8 for some examples). It has been widely used to evaluate deep learning techniques with different architectures (autoencoders, CNNs, recurrent networks, and hybrid approaches) and with different preprocessing pipelines that vary the number of channels and use raw data or simple features (e.g. PSD). Current results obtain accuracies between 70 and 80% (see (Craik, 2019) for a complete review and (Li, 2020) for recent results and comparisons with non-deep learning approaches).

Figure 8: Example of music videos and distribution of labels according to the users. Image copied from the DEAP dataset webpage.

  • Epilepsy detection: The last example concerns detecting patterns of clinical interest in brain activity that might be useful in diagnosing brain disorders, in particular, brain patterns related to epilepsy. In contrast with the previous examples, the system described in (Golmohammadi, 2019) uses a hybrid approach (see Figure 9) that combines deep learning with dynamic hidden Markov models and statistical language modeling techniques to capture expert knowledge and include it in the system. In addition, it has dedicated feature extraction tailored to detect three brain patterns that occur during epileptic episodes (spike and sharp waves, periodic lateralized discharges, and generalized periodic discharges) and three patterns to model artefacts, eye movements, and background noise. The model was trained and evaluated on the TUH EEG Corpus (Obeid, 2016), which is the largest publicly available corpus of clinical EEG recordings in the world. It achieved a sensitivity above 90% while maintaining a false alarm rate below 5%, which may be enough for clinical practice.

Figure 9: The complete pipeline includes feature extraction and three steps. Only the second one uses deep learning to process the outputs of the HMM. The third step imposes some contextual restrictions modeled by experts. 
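The figures quoted in the examples above (accuracy, Cohen's kappa, sensitivity, and false alarm rate) can all be computed from a list of reference labels and a list of predictions. The sketch below shows one way to do so on small made-up label sequences. The definition of false alarm rate used here, the fraction of background windows flagged as events, is an assumption; the cited works may define it differently, for example per unit of recording time.

```python
import numpy as np


def accuracy(y_true, y_pred):
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


def cohens_kappa(a, b):
    """Agreement between two annotators, corrected for chance agreement."""
    a, b = np.asarray(a), np.asarray(b)
    labels = np.union1d(a, b)
    observed = np.mean(a == b)
    expected = sum(np.mean(a == label) * np.mean(b == label) for label in labels)
    return float((observed - expected) / (1.0 - expected))


def sensitivity_and_false_alarm_rate(y_true, y_pred):
    """Binary case: 1 marks an event window, 0 marks background."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    sensitivity = np.mean(y_pred[y_true == 1] == 1)
    false_alarm_rate = np.mean(y_pred[y_true == 0] == 1)
    return float(sensitivity), float(false_alarm_rate)


# Made-up sleep stage annotations for ten epochs.
expert = ["W", "W", "N1", "N2", "N2", "N2", "N3", "N3", "REM", "REM"]
model = ["W", "W", "N2", "N2", "N2", "N2", "N3", "N2", "REM", "REM"]
stage_accuracy = accuracy(expert, model)
stage_kappa = cohens_kappa(expert, model)

# Made-up event detection labels for ten windows.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
detected = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, far = sensitivity_and_false_alarm_rate(truth, detected)
```

Kappa is lower than raw accuracy because it discounts the agreement two annotators would reach by chance given how often each uses every label. This is why it is preferred when some classes, such as a dominant sleep stage, are much more frequent than others.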

The previous examples show that deep learning techniques are now present in all EEG decoding applications and represent the current state of the art. There are still many open questions, such as which models work best, and whether EEG-specific models and algorithms are needed.

For those interested in the technical details of how the different networks have been used with EEG, we recommend consulting some very complete reviews (Roy, 2019; Craik, 2019) that provide references to the appropriate works. Most of the results have been obtained using public datasets, and code is available in the corresponding repositories (see, for instance, the braindecode GitHub repository for complete deep learning decoding using CNNs (Schirrmeister, 2017)).

A final note of caution

Beware of the hype! The increasing number of EEG experiments or studies claiming better results with deep learning has not been free of controversy. Reproducibility and, when possible, comparison against well-established baselines are a must, and their absence should be taken into account when evaluating any claims. Interestingly, Roy (2019) points out that only 7% of the reported results provide both the software and the datasets required to evaluate and replicate the method (19% provide the software and 54% the datasets). Sometimes there are legitimate reasons for not providing source code and datasets, such as privacy in medical records, or the need to exploit the dataset or the code for your own research before making it public. Nevertheless, these good practices are becoming more common and, in some cases, are required for publication. They are always a good indicator of the quality of the work and a good starting point for your own projects.


  • Perronnet, L., Lécuyer, A., Lotte, F., Clerc, M., & Barillot, C. (2016). Brain-Computer Interfaces 1: Foundations and Methods.
  • Lotte, F., Bougrain, L., Cichocki, A., Clerc, M., Congedo, M., Rakotomamonjy, A., & Yger, F. (2018). A review of classification algorithms for EEG-based brain–computer interfaces: a 10 year update. Journal of neural engineering, 15(3), 031005.
  • Rivet, B., Souloumiac, A., Attina, V., & Gibert, G. (2009). xDAWN algorithm to enhance evoked potentials: application to brain–computer interface. IEEE Transactions on Biomedical Engineering, 56(8), 2035-2043.
  • Blankertz, B., Tomioka, R., Lemm, S., Kawanabe, M., & Muller, K. R. (2007). Optimizing spatial filters for robust EEG single-trial analysis. IEEE Signal processing magazine, 25(1), 41-56.
  • Ang, K. K., Chin, Z. Y., Zhang, H., & Guan, C. (2008, June). Filter bank common spatial pattern (FBCSP) in brain-computer interface. In 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence) (pp. 2390-2397). IEEE.
  • Roy, Y., Banville, H., Albuquerque, I., Gramfort, A., Falk, T. H., & Faubert, J. (2019). Deep learning-based electroencephalography analysis: a systematic review. Journal of neural engineering, 16(5), 051001.
  • Craik, A., He, Y., & Contreras-Vidal, J. L. (2019). Deep learning for electroencephalogram (EEG) classification tasks: a review. Journal of neural engineering, 16(3), 031001.
  • Schirrmeister, R. T., Springenberg, J. T., Fiederer, L. D. J., Glasstetter, M., Eggensperger, K., Tangermann, M., ... & Ball, T. (2017). Deep learning with convolutional neural networks for EEG decoding and visualization. Human brain mapping, 38(11), 5391-5420.
  • Golmohammadi, M., Harati Nejad Torbati, A. H., Lopez de Diego, S., Obeid, I., & Picone, J. (2019). Automatic analysis of EEGs using big data and hybrid deep learning architectures. Frontiers in human neuroscience, 13, 76.
  • Biswal, S., Kulas, J., Sun, H., Goparaju, B., Westover, M. B., Bianchi, M. T., & Sun, J. (2017). SLEEPNET: automated sleep staging system via deep learning. arXiv preprint arXiv:1707.08262.
  • Obeid, I., & Picone, J. (2016). The temple university hospital EEG data corpus. Frontiers in neuroscience, 10, 196.
  • Koelstra, S., Muhl, C., Soleymani, M., Lee, J. S., Yazdani, A., Ebrahimi, T., ... & Patras, I. (2011). Deap: A database for emotion analysis; using physiological signals. IEEE transactions on affective computing, 3(1), 18-31.
  • Li, X., Zhao, Z., Song, D., Zhang, Y., Pan, J., Wu, L., ... & Wang, D. (2020). Latent Factor Decoding of Multi-Channel EEG for Emotion Recognition Through Autoencoder-Like Neural Networks. Frontiers in Neuroscience, 14, 87.
  • López-Larraz, E., Ibáñez, J., Trincado-Alonso, F., Monge-Pereira, E., Pons, J. L., & Montesano, L. (2018). Comparing recalibration strategies for electroencephalography-based decoders of movement intention in neurological patients with motor disability. International journal of neural systems, 28(07), 1750060.
  • Hartmann, K. G., Schirrmeister, R. T., & Ball, T. (2018). EEG-GAN: Generative adversarial networks for electroencephalograhic (EEG) brain signals. arXiv preprint arXiv:1806.01875.
