TY - JOUR
T1 - Assessing expert reliability in determining intracranial EEG channel quality and introducing the automated bad channel detection algorithm
AU - Hattab, Tariq
AU - König, Seth D.
AU - Carlson, Danielle C.
AU - Hayes, Rebecca F.
AU - Sha, Zhiyi
AU - Park, Michael C.
AU - Kahn, Lora
AU - Patel, Sima
AU - McGovern, Robert A.
AU - Henry, Thomas
AU - Khan, Fawad
AU - Herman, Alexander B.
AU - Darrow, David P.
N1 - Publisher Copyright:
© 2024 The Author(s). Published by IOP Publishing Ltd.
PY - 2024/8/1
Y1 - 2024/8/1
N2 - Objective. To evaluate the inter- and intra-rater reliability of bad channel identification among neurologists, EEG technologists, and naïve research personnel, and to compare their performance with the automated bad channel detection (ABCD) algorithm. Approach. Six neurologists, ten EEG technologists, and six naïve research personnel (22 raters in total) were asked to rate 1440 real intracranial EEG channels as good or bad. Intra- and inter-rater kappa statistics were calculated for each group. We then compared each group to the ABCD algorithm, which uses spectral- and temporal-domain features to classify channels as good or bad. Main results. Analysis of the participants’ channel ratings revealed variable intra-rater reliability within each group, with no significant differences across groups. Inter-rater reliability was moderate among neurologists and EEG technologists but minimal among naïve participants. Neurologists demonstrated slightly higher consistency in their ratings than EEG technologists. Both groups occasionally misclassified flat channels as good, and participants generally focused on low-frequency content in their assessments. The ABCD algorithm, in contrast, relied more on high-frequency content. A logistic regression model showed a linear relationship between the algorithm’s ratings and user responses for predominantly good channels, but less so for channels rated as bad. Sensitivity and specificity analyses further highlighted differences in rating patterns among the groups, with neurologists showing higher sensitivity and naïve personnel higher specificity. Significance. Our study reveals bias in human assessments of intracranial electroencephalography (iEEG) data quality and the tendency of even experienced professionals to overlook certain bad channels, highlighting the need for standardized, unbiased methods. The ABCD algorithm, which outperformed human raters, suggests the potential of automated solutions for more reliable iEEG interpretation and seizure characterization, offering an approach free from human bias.
AB - Objective. To evaluate the inter- and intra-rater reliability of bad channel identification among neurologists, EEG technologists, and naïve research personnel, and to compare their performance with the automated bad channel detection (ABCD) algorithm. Approach. Six neurologists, ten EEG technologists, and six naïve research personnel (22 raters in total) were asked to rate 1440 real intracranial EEG channels as good or bad. Intra- and inter-rater kappa statistics were calculated for each group. We then compared each group to the ABCD algorithm, which uses spectral- and temporal-domain features to classify channels as good or bad. Main results. Analysis of the participants’ channel ratings revealed variable intra-rater reliability within each group, with no significant differences across groups. Inter-rater reliability was moderate among neurologists and EEG technologists but minimal among naïve participants. Neurologists demonstrated slightly higher consistency in their ratings than EEG technologists. Both groups occasionally misclassified flat channels as good, and participants generally focused on low-frequency content in their assessments. The ABCD algorithm, in contrast, relied more on high-frequency content. A logistic regression model showed a linear relationship between the algorithm’s ratings and user responses for predominantly good channels, but less so for channels rated as bad. Sensitivity and specificity analyses further highlighted differences in rating patterns among the groups, with neurologists showing higher sensitivity and naïve personnel higher specificity. Significance. Our study reveals bias in human assessments of intracranial electroencephalography (iEEG) data quality and the tendency of even experienced professionals to overlook certain bad channels, highlighting the need for standardized, unbiased methods. The ABCD algorithm, which outperformed human raters, suggests the potential of automated solutions for more reliable iEEG interpretation and seizure characterization, offering an approach free from human bias.
KW - algorithm
KW - bad channels
KW - inter-rater reliability
KW - intracranial EEG
KW - intra-rater reliability
UR - http://www.scopus.com/inward/record.url?scp=85199715851&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85199715851&partnerID=8YFLogxK
U2 - 10.1088/1741-2552/ad60f6
DO - 10.1088/1741-2552/ad60f6
M3 - Article
C2 - 38981500
AN - SCOPUS:85199715851
SN - 1741-2560
VL - 21
JO - Journal of Neural Engineering
JF - Journal of Neural Engineering
IS - 4
M1 - 046028
ER -