Seminar 6: AI and Big Data for Environmental Health Surveillance
1. What types of data are most valuable for predicting health risks related to climate?
The most valuable data types for predicting climate-related health risks are those that capture the environmental drivers of disease, the population exposed, and the health outcomes that result, with the greatest predictive power emerging from integrated analysis across these domains. Climate data, including temperature, precipitation, humidity, and derived indices such as degree days above a threshold, are fundamental because they directly influence pathogen survival, vector biology, and transmission dynamics. The scoping review reports that for every 1°C rise in temperature, malaria cases increase 10-20% and dengue cases increase 8-10%, while rainfall variation drives 30-50% increases in malaria risk and 20-30% increases in dengue transmission. Satellite-derived environmental data, including vegetation indices (NDVI) for vector habitat identification, land surface temperature for heat mapping, and aerosol optical depth for air quality estimation, provide spatial coverage where ground monitoring is sparse. Population data, including density, demographics, mobility patterns, and vulnerability indicators, identify who is exposed and who is most susceptible. Health surveillance data documenting disease incidence, syndromic indicators, and healthcare utilization provide the outcome measures needed to train and validate predictive models. Genomic and pathogen data are increasingly valuable for tracking variants and emergence. The Global Burden of Disease study shows what integrated analysis of environmental and health data can quantify: ambient PM2.5 air pollution alone was responsible for 4.2% of global DALYs and 4.7 million deaths in 2021. The most powerful predictions come from machine learning models that identify complex, non-linear relationships across these diverse data streams.
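Two of the derived indices named above, degree days above a threshold and NDVI, can be illustrated with a minimal Python sketch. The function names, the 25°C threshold, and the sample values are assumptions chosen for illustration only; they do not come from the scoping review or from any particular surveillance system. The NDVI formula is the standard one, (NIR - Red) / (NIR + Red).

```python
from typing import Optional, Sequence


def degree_days_above(daily_mean_temps_c: Sequence[float], threshold_c: float) -> float:
    """Sum of daily exceedances above a temperature threshold (in degree days).

    Days at or below the threshold contribute zero.
    """
    return sum(max(0.0, t - threshold_c) for t in daily_mean_temps_c)


def ndvi(nir: float, red: float) -> Optional[float]:
    """Normalized Difference Vegetation Index from near-infrared and red reflectance.

    Returns None when both bands are zero, because the ratio is undefined.
    """
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


# Hypothetical one-week temperature series (degrees C) and an assumed threshold.
week_temps_c = [20.0, 26.0, 30.0, 24.5, 27.0, 25.0, 31.0]
assumed_threshold_c = 25.0  # illustrative only, not an epidemiological cutoff

print(degree_days_above(week_temps_c, assumed_threshold_c))

# Hypothetical reflectance values for a single vegetated pixel.
print(ndvi(nir=0.5, red=0.1))
```

In practice, features like these would be computed per location and time window and then joined with population and health surveillance records before model training.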