7. Artifact correction (ICA)

Dealing with artifacts - ICA (or ‘help, my data looks a mess’) #


Intro

Preamble: The most important thing to mention here is that there is no substitute for good-quality data. You must ensure that the data you collect is the best it possibly can be, because nothing you do during pre-processing can compensate for bad data. That said, even the best data will almost always contain periods of noise. Why? Because participants are only human. Give them a break in a testing session and they will almost inevitably move more than you thought humanly possible in the space of 30 seconds.

Most of this noise is impossible to deal with. Muscle movements, eye movements, coughs and yawns are simply orders of magnitude larger than the brainwaves we’re interested in measuring. Most of this data will therefore have to be rejected (see 11. Artifact rejection for more info). But this will (by definition) reduce the number of trials you have per condition, and trials = power, and with great power comes great papers, or so the old saying goes.

Enter Independent Component Analysis (ICA), a means to ‘remove’ the blinks from your data. However, there are some downsides (see FAQ), so don’t be fooled into thinking that ICA is a magic cure with no repercussions. Nonetheless, done properly, ICA can offer you a means through which blinks can be dealt with without losing swathes of your precious data.

What does ICA do? In overly simplified terms, ICA separates out the different sources (e.g., muscle movements, eye blinks, other eye movements, brain activity) that contribute to your EEG data. Separating the signal into these functionally distinct sources enables us to subtract unwanted sources out (e.g., blinks), leaving beautifully clean EEG data. However, ICA was never originally intended for EEG, and EEG data violates some of its assumptions (most researchers know this but use it anyway as an imperfect fix), so it is important to have a solid understanding of the pitfalls of ICA before you subject your data to this technique. I thoroughly recommend reading around before you charge on in. Because ICA changes our data in ways that aren’t always easy to determine, you should use it for blinks with a degree of caution, for sideways eye movements with even more caution, and for not much else (unless you’re very confident in what you’re doing).
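To make the source-separation idea concrete, here is a toy MATLAB sketch (nothing to do with real EEG; all signals and the mixing matrix are made up). Recorded channels are modelled as a linear mixture of underlying sources, and ‘removing a blink’ amounts to reconstructing the data without the blink source’s contribution:

```matlab
% Toy illustration of the ICA idea (hypothetical signals, not real EEG).
% Channels are modelled as a linear mixture of sources: X = A * S.
sources = [sin(0.1 * (1:1000));               % 'brain' source (made up)
           double(mod(1:1000, 200) < 10)];    % 'blink' source: brief pulses
A = [1.0 0.8;                                 % toy mixing matrix: how much of
     0.5 1.2];                                % each source reaches each channel
X = A * sources;                              % the recorded 2-channel mixture

% Suppose ICA recovered A and S. Rebuild the data keeping only source 1,
% i.e. the mixture with the 'blink' source subtracted out:
keep = 1;
X_clean = A(:, keep) * sources(keep, :);
```

In practice you never know A or S in advance; ICA estimates them from the data, which is why the quality of the decomposition matters so much.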

Stage 1: Preparing your data for ICA artifact correction

So you’ve decided to go ahead with ICA. The first thing you need to do is clean your data. This is because (assuming you’re using your whole dataset for ICA training) you need to help the algorithm successfully identify blinks as opposed to other sources of noise. Importantly, the number of independent components is (by necessity) always equal to the number of channels in your dataset. Because of this, you don’t want a 20-second coughing fit that your participant had halfway through the session to ’take up’ 15-odd components. There are two main approaches to preparing your data for ICA. One is to create a whole new dataset on which to run ICA and then transfer the ICA weights into your original dataset. The second is to prepare your existing dataset for ICA, and simply run it on that. The latter is outlined in the video below, whilst I show you how to do the former in the last video in this series.
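As a rough sketch of what this preparation might look like in an EEGLAB script (the filter cutoff and sample indices below are purely illustrative, not recommendations for your data):

```matlab
% Sketch only: typical preparation before ICA in EEGLAB.
% Parameter values are illustrative - adjust them for your own data.
EEG = pop_eegfiltnew(EEG, 1, []);    % high-pass filter (e.g. 1 Hz); this
                                     % often improves ICA decompositions
% Cut out grossly noisy stretches (e.g. that coughing fit) by sample
% range; these indices are hypothetical:
EEG = pop_select(EEG, 'nopoint', [120500 140500]);
EEG = eeg_checkset(EEG);             % sanity-check the dataset structure
```

Cleaning like this is exactly what stops a single noisy episode from dominating several components.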


Video

Stage 2: Running ICA

Now that you’ve cleaned the data, you’re ready to run ICA. The video below will guide you through decomposing your data with ICA in EEGLAB.
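If you prefer to script this step, a minimal sketch of the EEGLAB call might look like the following (the EOG channel indices are hypothetical; swap in whichever channels your montage uses):

```matlab
% Sketch: running the runica (Infomax) decomposition in EEGLAB.
% The channel indices excluding EOG are hypothetical - adjust for
% your own montage.
eegChans = setdiff(1:EEG.nbchan, [31 32]);   % e.g. exclude two EOG channels
EEG = pop_runica(EEG, 'icatype', 'runica', 'extended', 1, ...
                 'chanind', eegChans);
EEG = eeg_checkset(EEG);
```

Excluding the EOG channels from the decomposition matches the write-up template later on this page.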


Video

Stage 3: Selecting ICA components

Working out what’s a blink. This is probably one of the ERP pre-processing stages that demands the most experience, but there are a number of resources to help you become familiarised with the process of identifying ocular activity, such as this incredibly helpful UCSD Tutorial. The video below will guide you through how to select and remove eye-movement components, but depending on your actual dataset the output from this can be confusing. Always err on the side of caution, and if you’re not sure you should remove a component, don’t.
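For reference, the inspection-and-removal workflow shown in the video can also be scripted along these lines (the component numbers here are hypothetical; they will differ for every decomposition):

```matlab
% Sketch: inspect component properties, then remove only the ones you
% are confident are ocular. Component numbers are hypothetical.
pop_selectcomps(EEG, 1:20);            % topographic maps for components 1-20
pop_prop(EEG, 0, 1);                   % detailed properties of component 1
                                       % (0 = component, not channel)
blinkComps = [1 3];                    % whatever you identified as ocular
EEG = pop_subcomp(EEG, blinkComps, 1); % 1 = plot before/after and confirm
```

The before/after plot from pop_subcomp is a useful final check that removal affected the blinks and not much else.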


Video

Stage 4: When ICA doesn’t work

Sometimes your ICA decomposition produces something unpleasant to the eyes. In this case, you might wonder what you can possibly do to improve it. Creating a separate dataset on which to run ICA can sometimes be the solution (or perhaps this is the approach you’ve decided to run with from the outset). Doing this means that you can treat the ICA dataset in a manner that you wouldn’t the original dataset, because the consequent distortions that occur won’t be carried across to the original dataset when you import the ICA weights. This can often mean that you end up with better ICA components and faster decomposition. Follow the video below to learn how to import ICA values from this separate dataset.
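The weight transfer itself amounts to copying a handful of fields between dataset structures. A sketch, assuming a separate cleaned dataset in a variable called icaEEG (the variable name is hypothetical; both datasets must share the same channels):

```matlab
% Sketch: copy ICA results from a separate, more aggressively cleaned
% dataset (icaEEG) back into the original dataset (EEG). Both datasets
% must have the same channels in the same order.
EEG.icaweights  = icaEEG.icaweights;
EEG.icasphere   = icaEEG.icasphere;
EEG.icawinv     = icaEEG.icawinv;     % inverse weights (scalp maps)
EEG.icachansind = icaEEG.icachansind; % channels the ICA was run on
EEG = eeg_checkset(EEG);              % recompute anything derived
```

Because only the weights travel across, any heavy-handed cleaning applied to the ICA dataset never touches the data you will actually analyse.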


Video

Coming soon #



Script

Script #5 (download).

Script #5 (view).

It’s important to note that data cleaning can only be achieved via manual selection of noisy data. This means that from this stage onwards we cannot continue using Dataset #1 as in previous scripts.


Dataset

To clean your data and run ICA via the user interface, the example dataset (used in the video above) can be downloaded here.


Write-up

Ocular correction was conducted using Independent Component Analysis (ICA) following visual inspection and cleaning of the data. ICA used the RUNICA algorithm with EOG electrodes excluded, and resulted in an average of [ENTER AVERAGE] components removed per participant [ENTER RANGE].


Activity

Have a go at cleaning the dataset provided in ERPLAB. Then, visit the UCSD ICLabel Tutorial, where you can practice labelling the source of components based on studying their EEGLAB output.


Write-up

Independent Component Analysis (ICA) for artifact correction was conducted using the runica algorithm in EEGLAB. All channels were included in the ICA except for EOG electrodes and electrodes that were subsequently interpolated (see Supplementary File X for the full list of excluded electrodes for each participant). Ocular components were selected via visual inspection of both their characteristics (topographical map; time/trial graph and power/frequency graph) and the impact of their removal on the data. On average, X components were removed per participant (SD = X).