NeuroBrowse (2): drafting a machine learning model

While working on the NeuroBrowse project with the start-up Mensia Technologies, we were asked, among other things, to use machine learning to automatically detect anomalies in EEG recordings.

[projectinfo languages="python,html/css,javascript" team="5" duration="6 months"]

As mentioned in my first article, the NeuroBrowse project was separated into two parts: first, the development of a prototype of the web application; second, an application of basic machine learning principles to the detection of anomalies in EEGs.

Indeed, machine learning seems to be the 'in' thing today and it is no surprise most current engineering programs offer courses on this topic. At the risk of being a little too simplistic, I would define it as a two-phased process that starts by 'inputting known data into a computer and teaching it what magic is', then 'giving the same computer unknown data and letting it work the aforementioned magic to predict things about it'. Sometimes, you don't even have real information on the first dataset and you trust the computer to find out what magic is on its own; that is the difference between supervised and unsupervised machine learning.

Let me give an example to make things more concrete: imagine you are a real estate agent and you are asked to price a house. How would you go about doing that? You'd probably look at your past experience and at some references to establish a common pattern that links the price of houses to some of their characteristics, like their surface area, their location, their style, etc. From that, you could list these criteria for the house you are to value and predict a possible price. In fact, that is pretty much how a machine learning model would do it, too.

You could also apply this technique to decide on the roles of players in a football team: by comparing their builds (weight and height), you'd be able to classify them into different groups that correspond to different positions on the field.

A quick overview of the machine learning part

Since a more thorough explanation of the app’s features is given in the other article, I will just focus on the machine learning part in this one.

Our goal in the NeuroBrowse project was to identify the best way to detect anomalies in an EEG epoch. It was an exploratory process that started with a recent article on automatic bad channel detection that Mensia had not yet had time to look into. Following some of the ideas offered in this source, we were able to design a simple logistic regression model to check for probable anomalies in an EEG. Even though it is far from perfect, our research provides Mensia with some information for their future tests. In NeuroBrowse, the model is used to predict whether electrodes in the part of the EEG you are currently viewing could be problematic.

EEG anomalies prediction (in red: the possibly bad channels, in green: the okay ones)

Since the web application itself relies on the Django framework, coded in Python, we decided to stick with this programming language; therefore, our model was developed with the well-known scikit-learn library. This very powerful and yet approachable library implements everything we needed for this basic incursion into the world of machine learning. Indeed, we only examined simple models and standard procedures, such as training/test set division, K-fold validation, accuracy or AUC computation…
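To make this concrete, here is a minimal sketch of these standard procedures in scikit-learn: train a model on one part of the data and compute accuracy and AUC on the rest. The synthetic dataset is just a stand-in for the real EEG features, which I cannot reproduce here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for the real EEG features
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Accuracy uses hard predictions; AUC uses the predicted probabilities
acc = accuracy_score(y_te, model.predict(X_te))
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"accuracy={acc:.2f}, AUC={auc:.2f}")
```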

How does machine learning work?

Just a quick disclaimer: I don't pretend to really explain the fundamentals of machine learning here – it is way too vast a subject to be treated so lightly! I will just give you a few pointers on how I think machine learning can be approached.

Broadly speaking, the design of a model can usually be divided into three steps: studying the data to extract relevant features and choose the kind of model to implement; training the model on one part of the data; and validating it on the rest. You can also add an intermediary optimization process before training, by fine-tuning the model's specific parameters. Here, it is important to distinguish between the instance of your model (which holds several variables that are tweaked to fit your specific data more accurately) and the model itself (which relies on a set of inherent parameters called 'hyperparameters' that are the same for all instances and determine the model's general behavior).
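The parameter/hyperparameter distinction can be illustrated with scikit-learn. In this hypothetical snippet, `C` is a hyperparameter fixed before training, while `coef_` holds the variables learned from the data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy data: 200 observations, 4 features
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# C is a hyperparameter: chosen before training, part of the model's definition
model = LogisticRegression(C=1.0)

# coef_ and intercept_ are fitted parameters: learned from this specific dataset
model.fit(X, y)
print(model.coef_.shape)  # one weight per feature
```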

Schematic representation of a machine learning model design process

Choosing the right features and the right model is not easy and there is not always only one good solution. You often have to mix experience with lucky guesses to get something truly adapted to your problem.

When observing your data, you may identify particular characteristics that help you select a model. In particular, a basic selection criterion is whether your data is quantitative, meaning in numerical form, or qualitative. Or you can be limited in computing time or memory and be forced to choose a simpler model (for example, logistic regression requires much less storage space than neural networks!).

Now, it is essential to handle your data well, too. At first, I personally had the feeling that having a huge amount of data was great because it meant more information. The problem is, if you don't use it correctly, you are more likely to end up with a nonsensical model than anything else.

To begin with, you have to be aware of which part of the data you exploit where: if you decide to take all of your data to train your model, and re-use the same data to check it, then you'll get amazing results, because you are just showing your model what it already knows. It is sort of cheating by learning all the answers by heart. To avoid that, you divide your data into several sets that are each used at a different step: in general, you have a training dataset and a testing dataset, but you can also have a supplementary validation dataset for hyperparameter tuning. By cutting your data this way, you are more likely to avoid the problem of 'overfitting', where your model performs insanely well, but only on your own data.
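In scikit-learn, this splitting is a one-liner; a second split can carve a validation set out of the training data. The proportions below are just illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# Hold out 20% as the final test set; the model never sees it during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Optionally carve a validation set out of the remaining training data
# (here 25% of it) for hyperparameter tuning
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```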

Another important analysis to make is how balanced your training set is. In our case study, we wanted to identify 'good' and 'bad' epochs in EEGs: we needed roughly the same number of 'good' and 'bad' examples to train our model, or else it would just learn about one type of data and never predict the other side of the coin.
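One simple way to balance a dataset (there are others, such as oversampling or class weights) is to undersample the majority class. A small sketch with made-up labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Imbalanced toy dataset: 90 'good' (0) epochs and 10 'bad' (1) epochs
y = np.array([0] * 90 + [1] * 10)
X = rng.normal(size=(100, 3))

good = np.flatnonzero(y == 0)
bad = np.flatnonzero(y == 1)

# Undersample the majority class down to the size of the minority class
keep = np.concatenate([rng.choice(good, size=len(bad), replace=False), bad])
X_bal, y_bal = X[keep], y[keep]

print(np.bincount(y_bal))  # balanced class counts
```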

To train your model even better, you can use the k-fold method to exploit your data to the fullest: instead of splitting your data into train/test datasets once, you do it 'k' times and train with different data each time.
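In scikit-learn, k-fold cross-validation is handled by `cross_val_score`, which returns one score per fold:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

# 5-fold cross-validation: 5 different train/test partitions, hence 5 scores
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores, scores.mean())
```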

Common data splitting methods

What about NeuroBrowse?

For NeuroBrowse, we worked on supervised classification with 6 reference files in which two-second time blocks had been labelled 'correct' or 'abnormal'. After balancing the dataset, we had over 50,000 observations to train and test our model.

From the reference article and after discussing it with our tutor at Mensia Technologies, David Ojeda, we identified several features to represent a two-second epoch of an EEG channel. Some only consider the information of that channel (the 'one-channel features'); others also take into account the values of the nearby channels (the 'multi-channel features').

One-channel features:
- Hurst exponent
- R² determination coefficient (for δ, θ, α, β and γ frequency ranges)
- Normalized amplitude

Multi-channel features:
- Average correlation coefficient
- Normalized variance

Statistical analysis showed that not all of these features are equally relevant for EEG anomaly detection; however, we preferred to keep the entire set because it did give us a better model. Therefore, when a NeuroBrowse user asks for an analysis of the EEG epoch he is currently viewing, the application first computes these features for each channel and then gives the resulting numbers to the model to crunch and predict from.
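To give an idea of what such a feature computation looks like, here is a minimal NumPy sketch for two of the multi-channel features from the list above (normalized variance and average correlation coefficient). This is an illustrative reconstruction, not the project's actual code, and the `epoch_features` name is hypothetical:

```python
import numpy as np

def epoch_features(epoch):
    """Per-channel features for one 2-second epoch of shape (n_channels, n_samples)."""
    variances = epoch.var(axis=1)
    norm_var = variances / variances.mean()        # normalized variance
    corr = np.corrcoef(epoch)                      # channel-to-channel correlation matrix
    n = epoch.shape[0]
    # Mean absolute correlation with the other channels (excluding self-correlation)
    avg_corr = (np.abs(corr).sum(axis=1) - 1.0) / (n - 1)
    return np.column_stack([norm_var, avg_corr])   # one feature row per channel

demo = np.random.default_rng(0).normal(size=(8, 512))  # 8 channels, 512 samples
features = epoch_features(demo)
print(features.shape)  # one (norm_var, avg_corr) pair per channel
```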

Our research was directed at epoch classification: our model has to predict a binary variable that can take either the 'good' or the 'bad' value. So we searched for machine learning models that take quantitative values as inputs and output a 'yes/no' type of answer.

Even though many exist, we decided to focus on 4 of them: logistic regression, k-nearest neighbours, support vector machines and neural networks. We had two objectives in doing so: learning more about the mathematical concepts behind each of these models, and comparing their different complexity/efficiency ratios. As explained by our tutor:

‘It is better to have a simple and light model. No use in taking an overly complex model just to gain a little precision.’ (D. Ojeda)
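A comparison like ours can be sketched in a few lines with scikit-learn by cross-validating each of the four model families on the same data. Again, the synthetic dataset stands in for the real EEG features:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbours": KNeighborsClassifier(),
    "support vector machine": SVC(),
    "neural network": MLPClassifier(max_iter=1000, random_state=0),
}

# Same 5-fold cross-validation for every model, for a fair comparison
results = {}
for name, model in models.items():
    results[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {results[name]:.3f}")
```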

In the end, given that our models gave similar results (meaning they were all about as accurate in their predictions on the testing dataset), we chose to implement logistic regression in NeuroBrowse because it is very simple, very light and easy to interpret: basically, once trained, its coefficients directly indicate the importance of each of our features in determining whether an epoch is 'correct' or 'abnormal'.
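This interpretability is easy to see in practice: after fitting, each coefficient weighs one feature, and its magnitude reflects that feature's influence on the decision. A toy example:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 4 toy features, only 2 of which actually carry information
X, y = make_classification(n_samples=500, n_features=4,
                           n_informative=2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X, y)

# One learned weight per feature; larger |weight| = more influence
for i, coef in enumerate(model.coef_[0]):
    print(f"feature {i}: {coef:+.2f}")
```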

Although we did not optimize the hyperparameters of our model, we already obtained satisfactory results with 75% accuracy, which means that when given new testing data, it gets the right answer 3 times out of 4. This value is not great, but it is better than a random pick, so it is still interesting for a first trial.

Issues & perspectives

It's worth noting that our supervised classification relied on labels produced by another algorithm, based on Riemannian geometry, that the start-up already used. So, in a way, our model's predictions depend on the performance of this first algorithm.

To improve our model, we could fine-tune the hyperparameters or add other features. For instance, this article suggests looking at the kurtosis or at some patterns in the EEG spectrum.

    1. Mensia Technologies' website.
    2. scikit-learn.
    3. F. Villers, "Analyse de données : Classification supervisée ou Analyse discriminante," 2017. [Course notes for MAIN4 – Polytech Sorbonne].
    4. V. Tuyisenge, L. Trebaul, M. Bhattacharjee, B. Chanteloup-Foret, C. Saubat-Guigui, I. Mindruta, S. Rheims, L. Maillard, P. Kahane, D. Taussig, and O. David, "Automatic bad channel detection in intracranial electroencephalographic recordings using ensemble machine learning," Clinical Neurophysiology, Dec. 2017.
    5. A. Delorme, T. Sejnowski, and S. Makeig, "Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis," NeuroImage, vol. 34, no. 4, pp. 1443–1449, Feb. 15th, 2007.
    6. S. Ray, "Essentials of Machine Learning Algorithms (with Python and R Codes)," Sept. 2017.
