Nowcasting of Covid-19 Cases with Alternative Data?

At present, the novel coronavirus SARS-CoV-2 has a firm grip on large parts of the world. In individual countries, the number of cases is either still rising or has stabilised, in some cases only after drastic measures.
However, these case numbers are distorted and often up to two weeks out of date. The reasons are manifold and range from overloaded authorities and slow systems and tests to cyber attacks. In addition, many people probably only call a doctor once they have been ill for a while.

This raises the question of how far it is possible to obtain more up-to-date data and perhaps even to “predict” the number of cases. “Alternative data” could offer one way of doing so. The term, used especially in the financial sector, refers to data from outside the actual financial markets, such as social media posts, Google search interest or satellite images.

In the present case, we investigated the extent to which Google searches can be helpful. Google Trends shows the search interest in arbitrary terms and topics over time and by country. This makes it easy to see how interest in a topic is developing in a particular country.

The obvious first choice would be the term “coronavirus”; however, its search interest is distorted by the enormous amount of media coverage and would probably provide little new information.
Instead, the term “fever” was analysed. Fever is a main symptom of Covid-19 and is googled far less often than “coronavirus”. One might even assume that many people who google “fever” actually have a fever themselves or have relatives who do.

For four countries (Italy, Spain, Germany and the USA), we investigated to what extent Google search interest in “fever” is a leading indicator of reported coronavirus case numbers.
To do this, we first obtained the search interest from Google Trends for each country. We compared it with the case numbers provided by Johns Hopkins University; from these we derived the new cases per day and scaled them (dividing by the maximum and multiplying by 100).
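The preparation of the case numbers can be sketched in a few lines of Python. This is only an illustration and not the code used for the analysis: the function names and the sample numbers are made up, and clamping negative daily differences to zero is an assumption added here. Scaling to a maximum of 100 puts the case curve on the same 0 to 100 range in which Google Trends reports search interest.

```python
def daily_new_cases(cumulative):
    """Turn cumulative case counts into new cases per day.

    The result starts with the second day, since the first day has no
    predecessor to subtract. Negative differences (later corrections of
    the reported totals) are clamped to 0; that is an assumption made here.
    """
    return [max(0, today - yesterday)
            for yesterday, today in zip(cumulative, cumulative[1:])]


def scale_to_100(values):
    """Divide by the maximum and multiply by 100."""
    peak = max(values)
    if peak == 0:
        # An all-zero series has nothing to scale.
        return [0.0 for _ in values]
    return [v / peak * 100 for v in values]


# Made-up cumulative counts, for illustration only
cumulative = [0, 2, 5, 9, 9]
new_cases = daily_new_cases(cumulative)   # [2, 3, 4, 0]
scaled = scale_to_100(new_cases)          # [50.0, 75.0, 100.0, 0.0]
```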
The individual curves for each country could then be plotted. As it turned out, the Google Trends curves always led the case curves by at least a few days, a pattern that was visible for many countries (see charts below). In addition, interest in “fever” currently seems to have declined again in many places. It would, of course, be negligent to draw hasty conclusions from this!
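The article reads the lead time off the charts. One possible way to put a number on it, shown here purely as an assumption and not as the method used above, is to shift the search curve against the case curve and keep the shift with the highest Pearson correlation. The function names and the sample series below are hypothetical; both series must have the same length and one value per day.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no information
    return cov / sqrt(var_x * var_y)


def best_lead(search, cases, max_lag):
    """Return (lag, correlation) for the lag in 0..max_lag days at which
    search interest on day t correlates best with cases on day t + lag."""
    best = (0, pearson(search, cases))
    for lag in range(1, max_lag + 1):
        # Drop the last `lag` days of search and the first `lag` days of cases
        r = pearson(search[:-lag], cases[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Made-up series: the case curve is the search curve delayed by three days
search = [0, 1, 4, 9, 6, 3, 1, 0, 0, 0]
cases = [0, 0, 0, 0, 1, 4, 9, 6, 3, 1]
lag, r = best_lead(search, cases, max_lag=5)
print(lag)  # 3
```

A high correlation at some lag only says that the two curves have a similar shape. With short series and a single peak per country, such a number should be read with the same caution as the charts themselves.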

Overall, approaches such as this could help to improve the available information and put decisions on a broader footing.


Dr. Dimitrios Geromichalos
Founder / CEO
RiskDataScience GmbH
Nördliche Münchner Straße 47, 82031 Grünwald
Phone: +4989322096365
Twitter: @riskdatascience