Document Type : Original Article


1 Department of E. C. E, M. S. Ramaiah University of Applied Sciences, Bangalore, India

2 Research division, Relecura Inc., Bangalore, India


Despite decades of significant research, task-based functional MRI cannot reliably predict individual differences in cognition. Furthermore, searching for methods with greater predictive ability alone is insufficient. To understand the links between cognition and the brain, we need to clarify how these techniques use brain input to create predictions. In this study, we have applied the Interpretable Machine Learning (IML) framework to decode cognition from fMRI data and to find the significant instants of the voxel time course. We compared the ability of three predictive IML models to decode cognitive states: an explainable boosting machine (EBM), a decision tree (DT) classifier, and logistic regression (LR). Furthermore, the classification accuracy of Support Vector Machine (SVM) and Gaussian Naïve Bayes (GNB) classifiers is reported for cognitive state classification. The standard StarPlus fMRI dataset with two cognitive tasks has been used in this study. Initially, a few voxels are selected using a clustering-based maximum margin feature engineering framework. Then, the IML models are built with the selected voxels from the fMRI data. Classification accuracies of 80%, 82%, 80%, 93.7%, and 82% are achieved using the EBM, DT, LR, SVM, and GNB classifiers, respectively. Moreover, the IML classifiers EBM, DT, and LR can identify the significant instants of voxels.

Graphical Abstract

Identification of Significant Instants of Voxels for Cognitive State Classification Using Interpretable Machine Learning Models



Since the early 1990s, task-specific functional magnetic resonance imaging (fMRI) has been a popular tool for neuroscientists. From a neuroscience and neuroimaging standpoint, fMRI can be used to non-invasively interpret human perception and semantic information in the cerebral cortex [1]. Researchers have effectively decoded visual cues associated with neuronal activity in the human brain from fMRI data [2, 3]. Collecting fMRI signals and image samples is challenging because fMRI studies are expensive and the experimental procedure is complex. Therefore, the quantity of fMRI signals and images is limited. The difficulties with brain decoding can be summed up as follows: fMRI signals are contaminated with noise, the modelled mapping between brain activity and visual stimuli is limited, and there is a dearth of data that can be used to relate brain activity to visual stimuli [4]. fMRI signals have higher complexity and a smaller sample size than the paired images. A model trained on a small number of high-dimensional data samples readily suffers from the curse of dimensionality, and standard approaches easily overfit on such limited datasets.

The collection and recording of fMRI signals of brain activity take considerable time, and the decoding of brain activity is typically restricted to particular cognitive regions humans can understand. Functional MRI provides three-dimensional brain images over a specified time and assesses brain activity indirectly. Active brain areas appear in the acquired images as a result of the underlying neuronal activity. In contrast to less active brain regions, active brain regions contain a large amount of oxygenated hemoglobin, so the ratio of oxygenated to deoxygenated hemoglobin is used to evaluate brain activity. Functional MRI has become a popular approach for locating the brain regions active in cognition, emotion, and action. Studies of brain connectivity networks have investigated methods to annotate or decode the cognitive state of the brain using the dependencies between brain regions.

For the investigation of brain-related activities, including the classification of cognitive states [5] and the functional connectivity [6] of brain areas, several machine learning algorithms have been applied to fMRI data. Owing to machine intelligence techniques in computer vision, significant progress has been achieved in image analysis in recent years. Deriving brain-based predictions of individual variations in cognitive ability is one objective of task-based fMRI [7]. However, the objective of obtaining a strong and predictable link between the brain and cognition through task-based fMRI is still mostly unmet [8]. Moreover, if we are unable to describe how a method uses data from various brain regions to create predictions, finding a method with stronger predictive ability may not be sufficient.

To predict individual differences in cognition from task-specific fMRI data with improved predictive capacity, we intend to: 1) select algorithms that extract information across brain areas and 2) describe how these algorithms draw on this information to generate predictions using the Interpretable Machine Learning (IML) framework. Researchers have proposed many different ideas and theories of interpretability. Interpretability has been described in terms of model fidelity, model transparency, model comprehension, and model trust [9]. The fidelity of the ML model and its explanation, i.e. the ML model should explain why it is generating a prediction or making a suggestion, is a key component of interpretability. This is frequently an important element of "user trust" [10]. The semantics of the features ought to be comprehensible at the feature level. In some ML models, such as regression and decision trees, the explanation is a component of the model itself. IML models can often be as accurate as black box models.

One of the IML models considered in this work is EBM, which is a generalized additive model (GAM). GAMs are the gold standard for intelligibility when only low-dimensional terms are considered. GAMs have the form

$$g(E[y]) = \beta_0 + \sum_{j} f_j(x_j)$$

where g is the link function, which adapts the GAM to classification or regression, and f_j is the feature function learned for feature x_j. Compared with typical GAMs, EBM has a few advantages. First, EBM uses ML techniques such as gradient boosting and bagging to learn the feature function f_j of each feature. The boosting procedure is carefully restricted to train on a single feature at a time in round-robin fashion with a very small learning rate, so that the order of the features does not matter. Cycling through the features in this way reduces the effects of co-linearity and learns the best feature function for each feature, which shows how each individual feature contributes to the prediction. Finally, EBM can automatically detect and include pairwise interaction terms.
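The round-robin training idea can be illustrated with a deliberately simplified sketch. It fits an additive model, an intercept plus one piecewise-constant function per feature, by updating one feature at a time with a small learning rate. This is not the InterpretML implementation: the equal-width binning, the squared-error loss, the function name `fit_cyclic_additive`, and the synthetic grid data are all assumptions made for illustration.

```python
def fit_cyclic_additive(X, y, n_bins=4, rounds=200, lr=0.05):
    """Fit y ~ intercept + sum_j f_j(x_j) with piecewise-constant f_j.

    Every boosting step updates a single feature, cycling round-robin.
    """
    n, k = len(X), len(X[0])
    lo = [min(row[j] for row in X) for j in range(k)]
    hi = [max(row[j] for row in X) for j in range(k)]

    def bin_of(j, v):
        # Equal-width binning (an illustrative choice).
        if hi[j] == lo[j]:
            return 0
        b = int((v - lo[j]) / (hi[j] - lo[j]) * n_bins)
        return min(max(b, 0), n_bins - 1)

    bins = [[bin_of(j, row[j]) for j in range(k)] for row in X]
    intercept = sum(y) / n
    f = [[0.0] * n_bins for _ in range(k)]  # f[j][b]: score of feature j in bin b
    pred = [intercept] * n

    for _ in range(rounds):
        for j in range(k):  # round-robin over the features
            sums = [0.0] * n_bins
            counts = [0] * n_bins
            for i in range(n):
                b = bins[i][j]
                sums[b] += y[i] - pred[i]  # residual of the current model
                counts[b] += 1
            # Small step towards the mean residual of each bin.
            step = [lr * sums[b] / counts[b] if counts[b] else 0.0
                    for b in range(n_bins)]
            for b in range(n_bins):
                f[j][b] += step[b]
            for i in range(n):
                pred[i] += step[bins[i][j]]
    return intercept, f


# Synthetic grid: the target depends on feature 0 only.
grid = [i / 19 for i in range(20)]
X = [[a, b] for a in grid for b in grid]
y = [1.0 if row[0] > 0.5 else 0.0 for row in X]

intercept, f = fit_cyclic_additive(X, y)
print("intercept:", intercept)
print("f_0:", [round(v, 3) for v in f[0]])
print("f_1:", [round(v, 3) for v in f[1]])
```

In this toy setting the learned function for feature 0 carries the whole effect and the function for feature 1 stays flat, which is the kind of per-feature reading that makes an additive model easy to inspect.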

EBMs further enhance accuracy while preserving interpretability. EBMs are highly intelligible because plotting each feature function allows one to see and comprehend how each feature contributes to the final prediction. Owing to the additive nature of EBM, each feature contributes to the prediction in a modular manner, which makes it easier to understand the role of each feature in the prediction.
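As a small illustration of this modularity, the sketch below takes an additive model whose per-feature functions are already known and reads off each feature's contribution to one prediction, together with an overall importance computed as the mean absolute contribution. The three shape functions, the intercept, and the sample values are hypothetical and hand-written for the example; they are not learned from fMRI data. A logit link is assumed for the two-class case.

```python
import math

# Hypothetical shape functions for three time instants; each maps a
# feature value to an additive score on the log-odds scale.
shape_functions = {
    "t4": lambda v: 0.8 * v,
    "t11": lambda v: -0.5 * v,
    "t38": lambda v: 1.5 if v > 0.0 else -1.5,
}
intercept = 0.1  # hypothetical intercept


def explain(sample):
    """Return the per-feature contributions and the predicted probability."""
    contributions = {name: f(sample[name]) for name, f in shape_functions.items()}
    log_odds = intercept + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))  # inverse of the logit link
    return contributions, probability


def overall_importance(samples):
    """Mean absolute contribution of each feature over a set of samples."""
    totals = {name: 0.0 for name in shape_functions}
    for sample in samples:
        contributions, _ = explain(sample)
        for name, value in contributions.items():
            totals[name] += abs(value)
    return {name: total / len(samples) for name, total in totals.items()}


samples = [
    {"t4": 1.0, "t11": 2.0, "t38": 0.5},
    {"t4": -1.0, "t11": 0.0, "t38": -0.5},
]
contribs, prob = explain(samples[0])
importance = overall_importance(samples)
ranked = sorted(importance, key=importance.get, reverse=True)
print(contribs, round(prob, 3))
print(ranked)
```

Because the score is a plain sum, the contributions of one sample can be listed and ranked directly, and averaging their absolute values over many samples gives one overall importance value per feature.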

In this work, we have used IML models such as explainable boosting, decision tree, and logistic regression to decode the cognitive states from fMRI data and to find the significant instants present in the voxel time course. The rest of the article is organised as follows: the earlier related work is covered in section II. Section III explains the proposed IML classifier model for cognitive state classification. In section IV, a short description of the StarPlus fMRI data is given. The results of the proposed IML technique using the StarPlus fMRI data are elaborated in section V, and conclusions are presented in section VI.

Related work

Multi-voxel pattern analysis (MVPA), in conjunction with machine learning (ML), has recently gained popularity as a technique for determining cognitive states. The effectiveness of brain decoding has been significantly enhanced by using learning models to decipher the fMRI signals that record brain activity [11]. Although a decoding framework based on MVPA has been established, it is difficult to interpret, specifically for linear kernels. Furthermore, this method is vulnerable to image flaws caused by eye movement and other artifacts. Likewise, it is crucial to consider the hemodynamic response, the rate of neurovascular coupling, and the signal-to-noise ratio of the fMRI findings. In addition, the hardware's processing speed and the algorithm's efficiency should be weighed against the latency of the brain's neurovascular coupling [12]. Although the results of current ML-based decoding models are adequate [13], ML models that recreate the associated stimuli from fMRI data still face several difficulties in producing a higher-precision decoding model.

In general, the sample size of fMRI data is small compared to the dimensionality of the data. The quantity and reliability of the training samples determine the effectiveness of ML-based models. When numerous brain activities are recorded along with the various images corresponding to them, image reconstruction techniques and quality may be enhanced [14]. However, the duration of the experiment should be balanced against its effectiveness. Selecting the essential features among the contributing factors is crucial. The capacity of the decoding model to extract key features from neuroimaging data must be further improved in order to choose the features that matter most for image reconstruction. Decoding models can borrow from the axiomatic attribution [15] and visual attention [16] techniques used in computer vision to identify the voxels that are most important for decoding visual stimuli.

One of the primary goals of neuroscience is to understand the interconnections in the brain network that underlies human cognition. However, since gathering and capturing fMRI signals of human brain activity takes a long time, the current decoding of brain activity is typically restricted to certain cognitive regions humans can understand [17]. Most current learning-based studies cannot simultaneously consider the functional dependence and the temporal dynamics across various brain areas. To make use of the interdependence between brain areas, graph convolution networks (GCN) have been used to decode cognitive states. In a GCN, the extracted representations include temporal dynamic information and the functional dependencies among the regions of the brain [18]. Other recent learning techniques, including random forests, boosted trees, kernelized SVMs, bagged trees, neural nets, and combinations of these, are extremely accurate but unfortunately difficult to understand. It is still difficult to employ any of these techniques in mission-critical applications such as healthcare, in particular because it is typically unethical to alter the care provided to patients in order to gather datasets. In the case study of pneumonia risk prediction, the IML model reveals unexpected patterns in the dataset that, in the past, would have made it impossible for complex learned models to be applied in this field [19].

In this work, we describe the application of IML models to the cognitive state classification problem and find the significant time instants present in the voxel time course. This class of models, in our opinion, represents a substantial advancement in developing highly accurate and understandable models. The proposed IML-based technique is elaborated in the next section.

Proposed IML-based cognitive state classification

This study aims to develop Interpretable Machine Learning (IML) models for cognitive state classification. The study is built on three glass box models: the Explainable Boosting Machine (EBM), the Decision Tree (DT) classifier, and Logistic Regression (LR). Likewise, we have developed two black box models, the Support Vector Machine (SVM) and Gaussian Naïve Bayes (GNB), for decoding the cognitive states. A schematic of the proposed IML-based cognitive state classification is displayed in Figure 1.

Figure 1: The proposed IML-based cognitive state classification

The proposed framework is a four-step approach for decoding the cognitive states and finding the significant instants of the selected voxels, with proper explanations, using fMRI data. Each step is described below.

Step 1: Select the specific number of brain regions or Regions of Interest (ROIs) from the fMRI data.

In general, fMRI data consist of many ROIs. Therefore, it is often necessary to select ROIs before developing a model. Since the current study concerns cognitive task classification, we select the required number of ROIs for each cognitive task in this step.

Step 2: Select a few voxels from the selected ROIs.

The fMRI data comprise ROIs, and each ROI consists of a few hundred voxels. It is computationally challenging to run ML models with all the existing voxels as features. Hence, it is necessary to select a few voxels from the pool of voxels in the selected ROIs. In this step, we choose a minimum number of voxels using a clustering-based maximum margin framework [20]. The framework initially partitions the voxels into clusters and then tries to find the maximum margin among the voxels for a given pair of tasks [21].

Step 3: Build interpretable Machine Learning (IML) models for cognitive state classification.

The voxels selected from Step 2 are used to form feature vectors for cognitive task classification. Instead of simply building a classifier model for the required task, it is always recommended to find the explanations from the classifiers while performing classification.

Let C = {(vi, yi)} represent the training dataset, where vi = (vi1, vi2, vi3, …, vik) denotes the feature vector of the ith sample with k features, and yi is the response (target). Here, vij denotes the jth time instant or data point of the ith sample in the feature space. EBM, one of the IML models considered in this work, explains the obtained classification accuracy in terms of the time instants or data points present in the feature vector.

In this context, we have considered three glass box models, namely the EBM, DT, and LR classifiers, and two black box models, namely SVM and GNB, for cognitive state classification.

Step 4: Determine the significant instants of the selected voxels.

Interpretability is crucial for determining how ML algorithms draw conclusions from the data. Similarly, while decoding the cognitive states using fMRI data from the selected voxels, interpretation is essential for understanding the fMRI dataset and the operation of the ML algorithms. In this step, we determine the significant instants of the selected voxels using the results obtained from the IML models, namely the EBM, DT, and LR classifiers.

The performance of the proposed IML-based cognitive task classification approach is examined on the StarPlus fMRI dataset [22]. A short description of the StarPlus fMRI dataset is given in the next section.

StarPlus fMRI data

StarPlus fMRI data [22] have been used to evaluate the performance of the proposed technique. The StarPlus data offer readily available fMRI data for the classification and investigation of the cognitive states of the human brain. The dataset was produced by researchers at Carnegie Mellon University. Since it is publicly available, it has been widely used for analysis. A set of trials is created from the captured brain volumes. In each trial, subjects were shown a sentence and a picture and were asked to decide whether the sentence correctly described the picture. In the first phase, the subject was shown one of two sentences: "The Star is above the Plus" or "The Star is below the Plus." The sentence disappears from the screen after four seconds, and a blank screen is shown for the next four seconds. A picture stimulus is then displayed for four more seconds. Brain images are recorded every 0.5 seconds during the experiment. In the second phase, the experiment is repeated, but this time the order of the picture and sentence stimuli is switched. Each subject in the dataset has about 5000 voxels, which are assigned to 25 ROIs. Following the literature, seven of the 25 ROIs are considered for the analysis. These seven ROIs, {LDLPFC, LIPS, CALC, LOPER, LT, LIPL, and LTRIA}, are used for the cognitive state classification.

Results and Discussion

Although Machine Learning classifiers have been used for cognitive state classification, the interpretability of the obtained results has not been clearly explored. Explainability or interpretability is especially important when handling medical datasets. In this work, we have developed Interpretable Machine Learning classifiers for cognitive state classification using fMRI data. In this context, we have developed glass box models, namely the EBM, DT, and LR classifiers, and black box models, namely SVM and GNB. The framework begins with the selection of ROIs from the fMRI data. It proceeds to select relevant voxels from the pool of ROIs using the clustering-based maximum margin criterion.

The IML models are applied to the StarPlus fMRI dataset. The dataset consists of 25 ROIs, of which seven ROIs are considered (as per the description given on the website) for cognitive state classification. The dataset comprises fMRI data from six participants. The dataset has 40 samples for each task (picture and sentence tasks). The ROI pool consists of approximately 500 voxels, out of which four voxels are selected using the feature selection criterion. Each voxel time course has 16 data points, so the feature vector has 64 data points. The classification of cognitive tasks is performed in a train-test fashion, where 75% of the data is used for training and 25% for testing.
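The data layout described above can be sketched as follows: four voxel time courses of 16 data points are concatenated into one 64-element feature vector per sample, and the 80 samples (40 per task) are split 75%/25%. The values are synthetic placeholders rather than StarPlus measurements, and the random shuffling with a fixed seed is an assumption, since the text does not state how the split was drawn.

```python
import random

N_VOXELS = 4      # voxels kept after feature selection
N_INSTANTS = 16   # data points in each voxel time course
N_PER_TASK = 40   # samples per cognitive task


def make_feature_vector(voxel_time_courses):
    """Concatenate the per-voxel time courses into one flat feature vector."""
    assert len(voxel_time_courses) == N_VOXELS
    vector = []
    for course in voxel_time_courses:
        assert len(course) == N_INSTANTS
        vector.extend(course)
    return vector


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic placeholder samples for the two tasks.
samples, labels = [], []
for label in ("picture", "sentence"):
    for _ in range(N_PER_TASK):
        courses = [[rng.gauss(0.0, 1.0) for _ in range(N_INSTANTS)]
                   for _ in range(N_VOXELS)]
        samples.append(make_feature_vector(courses))
        labels.append(label)

# 75% / 25% train-test split on shuffled sample indices.
indices = list(range(len(samples)))
rng.shuffle(indices)
n_train = int(0.75 * len(indices))
train_idx, test_idx = indices[:n_train], indices[n_train:]

print(len(samples[0]), len(train_idx), len(test_idx))
```

With these sizes, each sample has 64 features, and the split leaves 60 samples for training and 20 for testing.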

Figure 2 shows the overall importance of the data points (features), in terms of their mean absolute score, for cognitive state classification using the EBM classifier. The EBM is a generalized additive model with automatic interaction detection. EBMs can often be as accurate as modern black box models while remaining much easier to understand, and they are very fast at prediction time. EBMs are highly intelligible because it is possible to visualize and comprehend how each feature contributes to the final prediction. Since each voxel time course in the StarPlus fMRI data has 16 data points and four voxels are selected for cognitive state classification, the feature vector of each sample has 64 features (data points t0 to t63). From Figure 2, it is observed that the EBM considers the data points t38, t33, t60, t52, t34, t11, t4, t36, t19, t59, t56, t61, t21, t47, and t53 for cognitive state classification. In the actual notation of the voxel time series, t38 is the 7th instant of voxel V3, t33 is the 2nd instant of voxel V3, t60 is the 13th instant of voxel V4, and t52 is the 5th instant of voxel V4. Similarly, the other features/data points are identified from the obtained results.
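The conversion from a feature index to a voxel and time instant is simple integer arithmetic. The sketch below assumes that the four voxel time courses are concatenated in order (V1 first, then V2, V3, and V4) with 16 data points each and that feature indices start at t0; the helper name `locate` is illustrative.

```python
INSTANTS_PER_VOXEL = 16


def locate(feature_index, instants_per_voxel=INSTANTS_PER_VOXEL):
    """Map a 0-based feature index t_k to 1-based (voxel, instant) numbers."""
    voxel, instant = divmod(feature_index, instants_per_voxel)
    return voxel + 1, instant + 1


for k in (38, 33, 60, 52):
    voxel, instant = locate(k)
    print(f"t{k}: instant {instant} of voxel V{voxel}")
```

Under this layout, t0 to t15 belong to V1, t16 to t31 to V2, t32 to t47 to V3, and t48 to t63 to V4.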

Figures 3, 4, and 5 present the top three feature scores and densities obtained from the EBM. From the plots, it is observed that these features have high discrimination power for the classification task.

The data points have good scores for discriminating the classes. For example, Figure 6 shows the explanation for classifying class 1 (picture task) as class 1 (picture task), with an absolute score of 89.3. Similarly, Figure 7 presents the explanation for predicting class 2 (sentence task) as class 2 (sentence task), with an absolute score of 99.9, and Figure 8 shows the prediction of class 1 as class 2. Figure 9 shows the tree diagram and the features considered for the classification task by the decision tree classifier. The tree finds t61, t30, t59, t29, and t35 as key nodes. The tree is rooted at feature t61 and splits into two branches on features t30 and t59. The tree makes its decision according to the purity of the specified branch; low impurity is preferred when making a decision. The other IML model considered in this study is LR. Figure 10 presents the significant features considered by the LR classifier while predicting the class of the objects.
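To make the notion of branch impurity concrete, the sketch below computes the Gini impurity of a set of class labels and the weighted impurity of a threshold split on one feature. Gini is used here as a common choice of criterion; the text does not state which impurity measure the decision tree used, and the four feature values and labels are toy data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure node, 0.5 for a balanced two-class node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(values, labels, threshold):
    """Weighted Gini impurity of the two branches of a threshold split."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Toy values of one feature and the class (1 = picture, 2 = sentence).
values = [0.1, 0.2, 0.8, 0.9]
labels = [1, 1, 2, 2]

print(gini(labels))                         # impurity before splitting
print(split_impurity(values, labels, 0.5))  # split that separates the classes
print(split_impurity(values, labels, 0.15)) # split that leaves a mixed branch
```

A tree-growing procedure of this kind would prefer the threshold with the lowest weighted impurity, here the one that separates the two classes completely.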

The cognitive state classification accuracy of both the glass box and black box models is presented in Table 1. From the obtained results, it is observed that the SVM classifier achieves a classification accuracy of 93.7%. The IML models achieve acceptable accuracy for cognitive task classification. In addition, the IML models explain how they arrive at their predictions, which helps to identify the significant instants of voxels present in the fMRI data. All the simulations are carried out in Python using the InterpretML library [23].

The study’s contributions include: (1) The proposed cognitive state classification framework can achieve acceptable classification accuracy with four voxels when applied to the StarPlus fMRI dataset. (2) The applied IML classifiers (EBM, DT, and LR) help to identify the significant instants of voxels while achieving acceptable classification accuracy, which is very useful when handling mission-critical healthcare datasets.


This study presents Interpretable Machine Learning (IML) models for classifying cognitive tasks. In a classification task, it is essential to know which features contribute most to the classification. Standard black box ML models, such as SVM and GNB, do not provide the overall significance of the features while classifying the objects. Such information is crucial when predicting the class of objects from healthcare datasets such as fMRI. The current study identifies the significant instants of voxels using IML models applied to a cognitive fMRI dataset. The IML models in this work include the Explainable Boosting Machine, Decision Tree, and Logistic Regression classifiers. Before applying the models to the dataset, four voxels are selected from a pool of ROIs using a clustering-based maximum margin voxel selection framework. The models are examined on the standard StarPlus fMRI dataset. The IML models provide the overall importance of the features while classifying a pair of cognitive tasks. The classification was performed in a train-test fashion, with 75% of the data for training and 25% for testing. The IML models achieve an average classification accuracy of about 80%, whereas the black box SVM classifier achieves a classification accuracy of 93.7%. The applied IML models help to find the significant instants of voxels in the cognitive fMRI dataset with acceptable classification accuracy.


The authors would like to thank Dr. S. Malathi, M. S. Ramaiah University of Applied Sciences, Bangalore, for her insightful advice and support.


This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Authors' contributions

All authors contributed to data analysis, drafting, and revising of the paper and agreed to be responsible for all the aspects of this work.

Conflict of Interest

The authors declare that they have no conflict of interest.



Siva Ramakrishna Jeevakala

Hariharan Ramasangu



Siva Ramakrishna. J, Hariharan Ramasangu. Identification of Significant Instants of Voxels for Cognitive State Classification Using Interpretable Machine Learning Models. J. Med. Chem. Sci., 2023, 6(6) 1291-1301


[1]. Haynes J.D., Rees G., Decoding mental states from brain activity in humans, Nature reviews neuroscience, 2006, 7:523
[2]. Haxby J.V., Gobbini M.I., Furey M.L., Ishai A., Schouten J.L., Pietrini P., Distributed and overlapping representations of faces and objects in ventral temporal cortex, Science, 2001, 293:2425
[3]. Kamitani Y., Tong F., Decoding the visual and subjective contents of the human brain, Nature neuroscience, 2005, 8:679
[4]. Du C., Du C., Huang L., He H., Reconstructing perceived images from human brain activities with Bayesian deep multiview learning, IEEE transactions on neural networks and learning systems, 2018, 30:2310
[5]. Mitchell T.M., Hutchinson R., Just M.A., Niculescu R.S., Pereira F., Wang X., Classifying instantaneous cognitive states from fMRI data, In AMIA annual symposium proceedings, American Medical Informatics Association, 2003, 2003:465
[6]. Ryali S., Chen T., Padmanabhan A., Cai W., Menon V., Development and validation of consensus clustering-based framework for brain segmentation using resting fMRI, Journal of neuroscience methods, 2015, 240:128
[7]. Dubois J., Adolphs R., Building a Science of Individual Differences from fMRI, Trends Cogn Sci., 2016, 20:425
[8]. Elliott M.L., Knodt A.R., Ireland D., Morris M.L., Poulton R., Ramrakha S., Sison M.L., Moffitt T.E., Caspi A., Hariri A.R., What is the test-retest reliability of common task-functional MRI measures? New empirical evidence and a meta-analysis, Psychological Science, 2020, 31:792
[9]. Lipton Z.C., The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery, Queue, 2018, 16:31
[10]. Ribeiro M.T., Singh S., Guestrin C., "Why should i trust you?" Explaining the predictions of any classifier, In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 2016, 1135
[11]. Zhang Y., Tetrel L., Thirion B., Bellec P., Functional annotation of human cognitive states using deep graph convolution, NeuroImage, 2020, 231:117847
[12]. Sitaram R., Caria A., Birbaumer N., Hemodynamic brain–computer interfaces for communication and rehabilitation, Neural networks, 2009, 22:1320
[13]. Du C., Li J., Huang L., He H., Brain encoding and decoding in fMRI with bidirectional deep generative models, Engineering, 2019, 5:948
[14]. Hayashi R., Kawata H., Image reconstruction from neural activity recorded from monkey inferior temporal cortex using generative adversarial networks, In 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), IEEE, 2018, 105
[15]. Sundararajan M., Taly A., Yan Q., Axiomatic attribution for deep networks, In International conference on machine learning, 2017, 70:3319
[16]. Xu K., Ba J., Kiros R., Cho K., Courville A., Salakhudinov R., Zemel R., Bengio Y., Show, attend and tell: Neural image caption generation with visual attention, In International conference on machine learning, 2015, 37:2048
[17]. Zhang Y., Tetrel L., Thirion B., Bellec P., Functional annotation of human cognitive states using deep graph convolution, NeuroImage, 2020, 231:117847
[18]. Gadgil S., Zhao Q., Pfefferbaum A., Sullivan E.V., Adeli E., Pohl K.M., Spatio-temporal graph convolution for resting-state fmri analysis, In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, Cham, 2020, 528
[19]. Caruana R., Lou Y., Gehrke J., Koch P., Sturm M., Elhadad N., Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission, In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, 2015, 172
[20]. Ramakrishna J.S., Ramasangu H., Classification of cognitive state using clustering based maximum margin feature selection framework, In 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE, 2017, 1092
[21]. Yang S., Hou C., Nie F., Wu Y., Unsupervised maximum margin feature selection via L2,1-norm minimization, Neural Computing and Applications, 2012, 21:1791
[22]. Just M., StarPlus fMRI data
[23]. Interpret ML Team, Welcome to the Much Anticipated Interpret Documentation! 2022