I am a postdoc at the School of Public Health, Imperial College London, where I work primarily with Dr. Samir Bhatt and Dr. Seth Flaxman. I did my Ph.D. at the Research School of Computer Science, The Australian National University, under the supervision of Professor Lexing Xie and Dr. Marian-Andrei Rizoiu. I have also worked with Professor Wray Buntine at Monash University.
My current research focuses on developing flexible and scalable models for understanding spatiotemporal data, for example epidemics (COVID-19, malaria, HIV) and crime. I am currently funded by the Gates Foundation to work on mapping and modeling malaria. For my doctorate, I built models for understanding the evolution of popularity in social media. My work focused on algorithms for modeling point processes, using both classical machine learning techniques and modern deep learning networks, mainly recurrent networks.
PhD in Machine Learning, 2019
The Australian National University
MS(Hons.) in Artificial Intelligence, 2014
The Australian National University
BE in Computer Engineering, 2009
Maharashtra Institute of Technology
Cases of SARS-CoV-2 infection in Manaus, Brazil, resurged in late 2020, despite previously high levels of infection. Genome sequencing of viruses sampled in Manaus between November 2020 and January 2021 revealed the emergence and circulation of a novel SARS-CoV-2 variant of concern. Lineage P.1 acquired 17 mutations, including a trio in the spike protein (K417T, E484K and N501Y) associated with increased binding to the human ACE2 receptor. Molecular clock analysis shows that P.1 emergence occurred around mid-November 2020 and was preceded by a period of faster molecular evolution. Using a two-category dynamical model that integrates genomic and mortality data, we estimate that P.1 may be 1.7–2.4-fold more transmissible, and that previous (non-P.1) infection provides 54–79% of the protection against infection with P.1 that it provides against non-P.1 lineages. Enhanced global genomic surveillance of variants of concern, which may exhibit increased transmissibility and/or immune evasion, is critical to accelerate pandemic responsiveness.
The SARS-CoV-2 lineage B.1.1.7, designated Variant of Concern 202012/01 (VOC) by Public Health England [1], originated in the UK in late summer to early autumn 2020 [2]. Whole genome SARS-CoV-2 sequence data collected from community-based diagnostic testing shows an unprecedentedly rapid expansion of the B.1.1.7 lineage during autumn 2020, suggesting a selective advantage. We find that changes in VOC frequency inferred from genetic data correspond closely to changes inferred by S-gene target failures (SGTF) in community-based diagnostic PCR testing. Analysis of trends in SGTF and non-SGTF case numbers in local areas across England shows that the VOC has higher transmissibility than non-VOC lineages, even if the VOC has a different latent period or generation time. The SGTF data indicate a transient shift in the age composition of reported cases, with a larger share of under-20-year-olds among reported VOC than non-VOC cases. Time-varying reproduction numbers for the VOC and co-circulating lineages were estimated using SGTF and genomic data. The best supported models did not indicate a substantial difference in VOC transmissibility among different age groups. There is a consensus among all analyses that the VOC has a substantial transmission advantage with a 50% to 100% higher reproduction number.
Following initial declines, in mid-2020 a resurgence in transmission of novel coronavirus disease (COVID-19) occurred in the US and Europe. As COVID-19 disease control efforts are re-intensified, understanding the age demographics driving transmission and how these affect the loosening of interventions is crucial. We analyze aggregated, age-specific mobility trends from more than 10 million individuals in the US and link these mechanistically to age-specific COVID-19 mortality data. We estimate that as of October 2020, individuals aged 20-49 are the only age groups sustaining resurgent SARS-CoV-2 transmission with reproduction numbers well above one, and that at least 65 of 100 COVID-19 infections originate from individuals aged 20-49 in the US. Targeting interventions – including transmission-blocking vaccines – to adults aged 20-49 is an important consideration in halting resurgent epidemics and preventing COVID-19-attributable deaths.
As of 1st June 2020, the US Centers for Disease Control and Prevention reported 104,232 confirmed or probable COVID-19-related deaths in the US. This was more than twice the number of deaths reported in the next most severely impacted country. We jointly model the US epidemic at the state level, using publicly available death data within a Bayesian hierarchical semi-mechanistic framework. For each state, we estimate the number of individuals that have been infected, the number of individuals that are currently infectious and the time-varying reproduction number (the average number of secondary infections caused by an infected person). We use changes in mobility to capture the impact that non-pharmaceutical interventions and other behaviour changes have on the rate of transmission of SARS-CoV-2. We estimate that Rt was only below one in 23 states on 1st June. We also estimate that 3.7% [3.4%–4.0%] of the total population of the US had been infected, with wide variation between states, and approximately 0.01% of the population was infectious. We demonstrate good 3-week model forecasts of deaths with low error and good coverage of our credible intervals.
There are large differences in patterns of per-capita deaths in different countries that are difficult to reconcile with herd immunity arguments but are easily explained by the timing and stringency of interventions. Seroprevalence studies also provide an independent source of information that is highly consistent with mortality data. The herd immunity argument is therefore at odds with both mortality and seroprevalence data, whereas the intervention argument provides a parsimonious explanation for both. Although the impacts of current control interventions on transmission need to be balanced against their short-term and long-term economic and health impacts on society, epidemiological data suggest that no country has yet seen infection rates sufficient to prevent a second wave of transmission, should controls or behavioural precautions be relaxed without compensatory measures in place.
Following the emergence of a novel coronavirus (SARS-CoV-2) and its spread outside of China, Europe has experienced large epidemics. In response, many European countries have implemented unprecedented non-pharmaceutical interventions such as closure of schools and national lockdowns. We study the impact of major interventions across 11 European countries for the period from the start of COVID-19 until the 4th of May 2020, when lockdowns started to be lifted. Our model calculates backwards from observed deaths to estimate transmission that occurred several weeks prior, allowing for the time lag between infection and death. We use partial pooling of information between countries with both individual and shared effects on the reproduction number. Pooling allows more information to be used, helps overcome data idiosyncrasies, and enables more timely estimates. Our model relies on fixed estimates of some epidemiological parameters such as the infection fatality rate, does not include importation or subnational variation, and assumes that changes in the reproduction number are an immediate response to interventions rather than gradual changes in behavior. Amidst the ongoing pandemic, we rely on death data that is incomplete, with systematic biases in reporting, and subject to future consolidation. We estimate that, for all the countries we consider, current interventions have been sufficient to drive the reproduction number Rt below 1 (probability Rt < 1.0 is 99.9%) and achieve epidemic control. We estimate that, across all 11 countries, between 12 and 15 million individuals have been infected with SARS-CoV-2 up to 4th May, representing between 3.2% and 4.0% of the population. Our results show that major non-pharmaceutical interventions and lockdown in particular have had a large effect on reducing transmission. Continued intervention should be considered to keep transmission of SARS-CoV-2 under control.
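The link between infections and later deaths described in the abstract above can be illustrated with a minimal forward sketch. This is not the paper's Bayesian model: it assumes a discrete renewal equation for infections and a fixed infection-to-death delay distribution, and the names `simulate_infections`, `expected_deaths`, `gen_pmf`, `delay_pmf`, and `ifr` are hypothetical, introduced only for illustration. Inference would run this forward map in reverse, adjusting Rt until expected deaths match observed deaths.

```python
def simulate_infections(seed, rt, gen_pmf):
    """Discrete renewal equation (illustrative assumption).

    infections[t] = rt[t] * sum_s infections[t - s] * gen_pmf[s - 1],
    where gen_pmf[s - 1] is the probability that the generation
    interval is s days.
    """
    infections = [float(seed)]
    for t in range(1, len(rt)):
        force = 0.0
        for s, g in enumerate(gen_pmf, start=1):
            if t - s >= 0:
                force += infections[t - s] * g
        infections.append(rt[t] * force)
    return infections


def expected_deaths(infections, ifr, delay_pmf):
    """Expected daily deaths: IFR times infections convolved with the delay.

    delay_pmf[lag] is the probability that a fatal infection leads to
    death exactly `lag` days after infection.
    """
    deaths = []
    for t in range(len(infections)):
        total = 0.0
        for lag, p in enumerate(delay_pmf):
            if t - lag >= 0:
                total += infections[t - lag] * p
        deaths.append(ifr * total)
    return deaths


# Toy inputs (made up for illustration, not estimates from the paper).
rt = [2.0, 2.0, 2.0, 0.5]          # reproduction number per day
infections = simulate_infections(seed=10, rt=rt, gen_pmf=[1.0])
deaths = expected_deaths(infections, ifr=0.01, delay_pmf=[0.0, 0.0, 1.0])
```

In this toy setup the drop in Rt on the last day shows up in infections immediately, but it would only appear in deaths two days later, which is the time lag the model has to account for.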
Predicting popularity, or the total volume of information outbreaks, is an important subproblem for understanding collective behavior in networks. Each of the two main types of recent approaches to the problem, feature-driven and generative models, has desired qualities and clear limitations. This paper bridges the gap between these solutions with a new hybrid approach and a new performance benchmark. We model each social cascade with a marked Hawkes self-exciting point process, and estimate the content virality, memory decay, and user influence. We then learn a predictive layer for popularity prediction using a collection of cascade history. To our surprise, a Hawkes process with a predictive overlay outperforms recent feature-driven and generative approaches on existing tweet data and a new public benchmark on news tweets. We also found that a basic set of user features and event time summary statistics performs competitively in both classification and regression tasks, and that adding point process information to the feature set further improves predictions. From these observations, we argue that future work on popularity prediction should compare across feature-driven and generative modeling approaches in both classification and regression tasks.
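As a rough illustration of the marked Hawkes process mentioned above, the sketch below evaluates the event rate of a cascade at time t. It assumes a power-law memory kernel, a common choice for retweet cascades; the exact parameterisation is an assumption here, not taken from the abstract. The parameter names `kappa` (content virality), `beta` (user influence exponent), `theta` (memory decay), and `c` (a small offset that keeps the kernel bounded) are illustrative.

```python
def event_intensity(t, events, kappa, beta, c, theta):
    """Rate of new events at time t for a marked self-exciting process.

    events: list of (t_i, m_i) pairs, where t_i is the event time and
            m_i is its mark (for example, the poster's follower count).
    Each past event contributes kappa * m_i**beta * (t - t_i + c)**-(1 + theta),
    so influential users excite more and old events fade as a power law.
    """
    total = 0.0
    for t_i, m_i in events:
        if t_i < t:  # only events strictly in the past contribute
            total += kappa * (m_i ** beta) * (t - t_i + c) ** (-(1.0 + theta))
    return total


# Toy cascade (made-up numbers): one tweet at time 0 by a user with mark 100.
rate = event_intensity(1.0, [(0.0, 100.0)], kappa=1.0, beta=0.5, c=1.0, theta=1.0)
```

Fitting `kappa`, `beta`, and `theta` per cascade gives the kind of point-process quantities that a predictive layer could then take as input.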
I was a teaching instructor for the following courses at The Australian National University: