Hassan Ajulo, Theophilus Emeto, Faith Alele, Oyelola Adegboye
Under revision at Geographical Analysis | 2025
Abstract
This study proposes a geographically and temporally weighted random forest (GTWRF) model for explanatory spatiotemporal analysis of COVID-19 outcomes. GTWRF extends the conventional geographically weighted random forest by incorporating spatial and temporal dependencies through an adaptive Gaussian weighting scheme. The model integrates three spatiotemporal distance (STD) functions, including one novel formulation that prioritises observations closer in space and time. To capture major shifts in transmission dynamics, the analysis was structured around three data-driven temporal periods derived from observed COVID-19 waves between 2020 and 2023. GTWRF was then applied to US county-level COVID-19 incidence using four composite indicators (epidemiological, demographic, socioeconomic, and environmental) to identify region- and period-specific drivers of incidence. GTWRF with the new STD function achieved the lowest out-of-bag mean absolute error and root mean square error. Across periods, the epidemiological indicator was the leading driver of incidence, while the secondary driver shifted over time: demographics were most influential in Period 1, environment in Period 2, and socioeconomic factors in Period 3. Regional vulnerabilities persisted, particularly in the South and West. These results show that GTWRF can characterise geographically varying, period-specific drivers of incidence and can support targeted interventions, surveillance prioritisation, and resource allocation in future outbreaks.
Keywords: Spatiotemporal, geospatial-temporal, machine learning, random forest, composite indicators, COVID-19, pandemic preparedness
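To make the weighting idea in the abstract concrete, the following minimal sketch computes adaptive Gaussian weights from a combined spatiotemporal distance. It is an illustration under stated assumptions, not the authors' implementation. The distance form sqrt(lam * d_S^2 + mu * d_T^2) is a commonly used geographically and temporally weighted formulation and is not the novel STD function proposed in the paper, which the abstract does not specify. The function names, the scale parameters `lam` and `mu`, the neighbour count `k`, and the synthetic data are all hypothetical.

```python
import numpy as np

def spatiotemporal_distance(xy, t, i, lam=1.0, mu=1.0):
    """Distance from observation i to every observation.

    Assumed form: sqrt(lam * d_S^2 + mu * d_T^2). This is a generic
    choice for illustration, not the paper's novel STD function.
    """
    d_s2 = np.sum((xy - xy[i]) ** 2, axis=1)  # squared spatial distance
    d_t2 = (t - t[i]) ** 2                    # squared temporal distance
    return np.sqrt(lam * d_s2 + mu * d_t2)

def adaptive_gaussian_weights(dist, k):
    """Gaussian kernel whose bandwidth adapts to local data density.

    The bandwidth h is the distance to the k-th nearest neighbour of the
    focal observation (sorted index 0 is the focal observation itself).
    """
    h = np.sort(dist)[k]
    return np.exp(-0.5 * (dist / h) ** 2)

# Synthetic example: 50 locations observed in one of three periods.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(50, 2))          # hypothetical coordinates
t = rng.integers(1, 4, size=50).astype(float)   # hypothetical period labels 1..3

d = spatiotemporal_distance(xy, t, i=0, lam=1.0, mu=25.0)
w = adaptive_gaussian_weights(d, k=10)
```

In a sketch like this, the focal observation receives weight 1 and weights decay smoothly with combined distance, so observations that are far away in either space or time contribute little. One plausible use, again an assumption rather than the paper's stated procedure, is to pass such weights as case weights when fitting a local random forest around each focal observation.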