Prediction of tuberculosis clusters in the riverine municipalities of the Brazilian Amazon with machine learning

Luis Silva, Luise Gomes da Motta, Lynn Eberly

Research output: Contribution to journal › Article › peer-review


Objective: Tuberculosis (TB) is the second most deadly infectious disease globally and poses a significant burden in Brazil and its Amazonian region. This study focused on the “riverine municipalities” and hypothesized the presence of TB clusters in the area. We also aimed to train a machine learning model to differentiate municipalities classified as hot spots from non-hot spots using disease surveillance variables as predictors.

Methods: Data on the incidence of TB from 2019 to 2022 in the riverine municipalities were collected from the Brazilian Health Ministry Informatics Department. Moran’s I was used to assess global spatial autocorrelation, while the Getis-Ord Gi* method was employed to detect high- and low-incidence clusters. A Random Forest machine learning model was trained using surveillance variables related to TB cases to distinguish hot spot from non-hot spot municipalities.

Results: Our analysis revealed distinct geographical clusters with high and low TB incidence following a west-to-east distribution pattern. The Random Forest classification model used six surveillance variables to predict hot vs. non-hot spots and achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.81.

Conclusion: Higher percentages of recurrent cases, deaths due to TB, antibiotic regimen changes, new cases, and cases with smoking history were the best predictors of hot spots. This prediction method can be used to identify the municipalities at the highest risk of being hot spots for the disease, giving policymakers an evidence-based tool to direct resource allocation for disease control in the riverine municipalities.
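As a rough illustration of the two spatial statistics named in the Methods, the following is a minimal sketch in plain Python. It assumes a hypothetical chain of five municipalities with made-up incidence values and binary contiguity weights; the data, the weighting scheme, and the function names are illustrative assumptions and are not taken from the study, which does not state its software or weights in the abstract.

```python
from math import sqrt

def morans_i(values, weights):
    """Global Moran's I for values x_i and spatial weights w_ij (no self-weights)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

def getis_ord_gi_star(values, weights):
    """Gi* z-scores; each row of weights must include the self-weight w_ii."""
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        w = weights[i]
        sw = sum(w)
        sw2 = sum(x * x for x in w)
        num = sum(w[j] * values[j] for j in range(n)) - mean * sw
        den = s * sqrt((n * sw2 - sw ** 2) / (n - 1))
        scores.append(num / den)
    return scores

# Hypothetical incidence values for five municipalities ordered west to east.
incidence = [90, 80, 50, 20, 10]
n = len(incidence)

# Binary contiguity: each municipality neighbours the ones next to it in the chain.
w_adj = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
# Gi* also counts the municipality itself, so add the diagonal.
w_star = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]

moran = morans_i(incidence, w_adj)
gi_star = getis_ord_gi_star(incidence, w_star)

print("Moran's I:", round(moran, 3))
print("Gi* z-scores:", [round(z, 3) for z in gi_star])
```

In this toy layout, a positive Moran's I indicates that similar incidence values sit next to each other, and positive or negative Gi* z-scores mark candidate high- or low-incidence clusters. A real analysis would derive the weights from municipal boundaries and assess significance, for example by permutation.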

Original language: English (US)
Article number: e240024
Journal: Revista Brasileira de Epidemiologia
State: Published - 2024

Bibliographical note

Publisher Copyright:
© 2024 | Epidemio is a publication of Associação Brasileira de Saúde Coletiva-ABRASCO.


  • Amazon
  • Epidemiology
  • Machine learning
  • Ribeirinhos
  • Spatial analysis
  • Tuberculosis

