Methods
Case reports in NSW, Australia from the beginning of the epidemic in
January 2020 to the peak of the epidemic at the end of March 2020 were
accessed (NSW Government, 2020a). Case reports in which the source of
infection was determined to be locally acquired, and in which a date of
notification and postcode of residence were reported, were selected for
inclusion in the study. A time-series of case reports was then
created. Based on the reported postcode, for all reported cases in the
time-series the closest weather observation station reporting rainfall,
temperature and humidity for the period January to March 2020 was
identified (NSW Government, 2020b). Daily observations of the following
meteorological recordings were downloaded: rainfall (mm), and
temperature (°C) and relative humidity (%) recorded at 9am and at 3pm
(Australian Government Bureau of Meteorology, 2020). The median values
for each day for all selected weather observation stations were
estimated, and time-series of median rainfall, and 9am and 3pm
temperature and relative humidity were created. Two additional
time-series were then created by determining the daily difference
between 9am and 3pm temperature, and between 9am and 3pm relative
humidity. Thus, 7 predictor time-series were available for modelling.
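
The construction of the seven predictor series can be sketched as follows. This is a minimal illustration in Python, not the authors' code (the analysis was run in R). The record layout, the station identifiers, the sample values, and the function names are all hypothetical. Two points are assumptions because the text does not state them: the differences are taken on the daily medians, and their direction is 3pm minus 9am.

```python
from statistics import median

# Hypothetical station records:
# (date, station_id, rain_mm, temp_9am, temp_3pm, rh_9am, rh_3pm)
records = [
    ("2020-03-01", "A", 0.0, 20.0, 26.0, 80.0, 55.0),
    ("2020-03-01", "B", 2.0, 18.0, 24.0, 90.0, 65.0),
    ("2020-03-01", "C", 1.0, 19.0, 27.0, 85.0, 60.0),
    ("2020-03-02", "A", 5.0, 21.0, 23.0, 95.0, 85.0),
    ("2020-03-02", "B", 3.0, 19.0, 22.0, 92.0, 80.0),
]

FIELDS = ("rain_mm", "temp_9am", "temp_3pm", "rh_9am", "rh_3pm")


def daily_medians(rows):
    """Return {date: {field: median across stations}} in date order."""
    by_day = {}
    for date, _station, *values in rows:
        by_day.setdefault(date, []).append(values)
    result = {}
    for date in sorted(by_day):
        columns = zip(*by_day[date])  # one tuple of station values per field
        result[date] = {name: median(col) for name, col in zip(FIELDS, columns)}
    return result


def add_differences(series):
    """Add the two derived series (assumed direction: 3pm minus 9am)."""
    for day in series.values():
        day["temp_diff"] = day["temp_3pm"] - day["temp_9am"]
        day["rh_diff"] = day["rh_3pm"] - day["rh_9am"]
    return series


predictors = add_differences(daily_medians(records))
```

Each day then carries seven values: median rainfall, 9am and 3pm temperature, 9am and 3pm relative humidity, and the two daily differences.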
A correlation matrix was used to select meteorological variables to
avoid multicollinearity in the analysis. Variables with a correlation
coefficient <0.6 were retained. Each remaining variable was
included in a univariate generalized additive model (GAM) with the daily
number of reported cases as the dependent variable. Variables with a
p value <0.1 in univariate analyses were then included
in a multivariate GAM, and the best fitting model based on Akaike
Information Criterion (AIC) was selected using a backward algorithm.
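
The correlation screening step might look like the sketch below. It is an illustration under stated assumptions, not the authors' procedure: the text does not say which member of a highly correlated pair was dropped, or whether the 0.6 threshold was applied to the absolute value of the coefficient. Here a variable is kept only if its absolute Pearson correlation with every variable already kept is below the threshold, so the input order decides which one survives. The function names and sample values are hypothetical, and the GAM fitting and AIC-based backward selection are not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def screen(series, threshold=0.6):
    """Greedy filter (assumed rule): keep a variable only if |r| < threshold
    against every variable that has already been kept."""
    kept = []
    for name in series:
        if all(abs(pearson(series[name], series[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical daily median series
series = {
    "temp_9am": [18.0, 19.0, 21.0, 22.0, 20.0, 17.0],
    "temp_3pm": [24.0, 25.5, 27.0, 28.5, 26.0, 23.0],
    "rain_mm": [0.0, 5.0, 0.0, 1.0, 4.0, 0.0],
}
kept = screen(series)
```

In this toy input the 3pm temperature tracks the 9am temperature closely and is dropped, while rainfall is retained.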
COVID-19 cases were assumed to follow a negative binomial distribution
given that the variance of the daily reported cases was greater than
the mean (overdispersion). Meteorological variables were analyzed using a 14-day
exponential moving average (EMA), based on the assumed incubation period
of SARS-CoV-2. Natural splines of two degrees of freedom were also
included to account for additional short-term trends. A sensitivity
analysis was performed by changing the EMA window from 14 days to 10 and
to 21 days. R software (version 3.5.3,
http://cran.r-project.org; R Foundation for Statistical Computing,
Vienna, Austria) was used to perform all the statistical analyses and
visualization.
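
Two of the steps described above, the exponential moving average with its sensitivity windows and the variance-versus-mean check that motivated the negative binomial distribution, can be illustrated with a short sketch. It is written in Python for illustration only and is not the authors' R code. The smoothing factor 2 / (window + 1) is a common convention and is an assumption here, as is seeding the average with the first observation. The function names and sample data are hypothetical.

```python
from statistics import mean, variance


def ema(values, window):
    """Exponential moving average with assumed alpha = 2 / (window + 1),
    seeded with the first observation."""
    alpha = 2.0 / (window + 1)
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


def is_overdispersed(counts):
    """True when the sample variance exceeds the sample mean."""
    return variance(counts) > mean(counts)


# Hypothetical daily median 3pm temperatures with a step change
temps = [20.0] * 5 + [26.0] * 5

# Main analysis window (14 days) plus the two sensitivity windows
ema_by_window = {w: ema(temps, w) for w in (10, 14, 21)}

# Hypothetical daily case counts
cases = [0, 1, 0, 3, 8, 2, 15, 4, 22, 9]
overdispersed = is_overdispersed(cases)
```

A shorter window gives more weight to recent days, so the 10-day series moves towards the new temperature level faster than the 14-day and 21-day series.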