Covid and the rise of the data geek! 

During the COVID-19 pandemic the public, and data users, have become used to accessing nearly live local data every day. There is a new appetite to understand the intricacies and limitations of complex datasets, and the work of data scientists. At the same time many important datasets have been disrupted and there will be challenges in 2021 for continuity in reporting and using data to assess and predict local need. Duncan MacKenzie (Partnership Lead, Data Cymru) reflects on some gains and some pains in 2020, a year when we all became data geeks.

Back in the summer, during what we now know was the lull between lockdowns one and two, we tried to book a holiday, an attempt to get away to some sunshine and sea. We managed to predict with uncanny accuracy the move of countries out of our safe ‘travel corridors’. It felt like we were on first-name terms with easyJet customer services as we transferred our flights from one European destination to another! What is even more difficult to remember is that the threshold for safe travel corridors at the time was a national rate of 20 cases of COVID-19 per 100,000 population.

At the time of writing (mid-December 2020), for my home (Middle Super Output Area [MSOA]) in the north of Cardiff, our latest seven-day rolling incidence rate was 503 cases per 100,000 population (for 4 December to 10 December 2020). This data is available on our COVID-19 dashboard, developed in response to the pandemic. It gives users access to various datasets that display COVID-19 related data for Wales.
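For readers who want to see the arithmetic behind a figure like this, here is a minimal sketch of a seven-day rolling incidence rate per 100,000 population. The function name, the daily case counts and the population figure are all invented for illustration; they are not taken from our dashboard or from any real MSOA, and the dashboard's own calculation may differ in detail.

```python
def rolling_incidence_per_100k(daily_cases, population, window=7):
    """Return the rolling case rate per 100,000 people for each full window.

    daily_cases: list of daily case counts, oldest first.
    population:  resident population of the area (hypothetical here).
    window:      number of days summed for each rate (seven by default).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if window <= 0:
        raise ValueError("window must be positive")

    rates = []
    # Slide the window along the series one day at a time.
    for end in range(window, len(daily_cases) + 1):
        window_total = sum(daily_cases[end - window:end])
        rates.append(window_total * 100_000 / population)
    return rates


# Hypothetical small area: eight days of counts and an invented population.
daily_cases = [5, 6, 4, 7, 5, 4, 4, 6]
population = 7_000

rates = rolling_incidence_per_100k(daily_cases, population)
print([round(rate, 1) for rate in rates])  # [500.0, 514.3]
```

Note how, in a small area, a few dozen cases in a week is enough to produce a rate in the hundreds per 100,000, which is worth bearing in mind when comparing small-area rates with national thresholds.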

So not only does this show how strong a hold the virus currently has, but also how quickly the public got used to seeing and interpreting data about the pandemic and how we, as data users, became used to accessing live, or nearly live, data at a small geography in a dynamic format. Our dashboard is just one example of many where organisations are showing innovative ways of disseminating information and providing the public with access to a wide range of complex data and information in an attempt to allow them to understand the impact of the pandemic.

Good (and less good) examples of dissemination

The Office for National Statistics have been very proactive in mapping out the course of the pandemic and its economic and societal impact; Carl Baker from the House of Commons Library has done some nice things with cartogram mapping; and John Burn-Murdoch and the Financial Times have been consistently innovative with their outputs. The Government’s main output is also a good example of how a wide range of data and information can be disseminated clearly and with concise explanations.

These examples use a range of software and dissemination techniques, but all display a clarity and flexibility that makes the data easy to understand and suitable for a wide range of users.

One example that has been perhaps slightly less successful is the Downing Street briefings. Although the intention of providing the public directly with relevant, important and up-to-date data accompanied by explanation from experts is absolutely the right thing to do, some slides were cluttered and too busy. In this instance, television is a static medium where the user has no control over what to look at or how it is presented (compare this to other disseminations). A golden rule of dissemination is to keep charts simple and focused on one message, so as not to overwhelm the user.

Downing Street do publish the slides they use in the press conferences on their website, which is a good move, but this does highlight the fact that it is very difficult to produce a dissemination for one medium that will also work for another without editing.

Keeping up with the public appetite for data

It has also been refreshing to see the public appetite to understand quite complex datasets and the intricacies and failings of data that are clear to those of us more used to handling data. What is the difference between mortality rates, weekly deaths and recorded deaths? How can a death be recorded as COVID-19 related on a death certificate but have nothing to do with COVID-19? And why are bank holidays so disruptive to time series data that is displayed weekly?

Another trend has been the ability to create, publish and update dashboards at a pace that would not have been expected in other circumstances.

As the developers of the vaccines have discovered, things that used to take a long time to develop possibly only took a long time because there was lots of dead time in the process. Eliminate this and processes can be speeded up without impacting on quality, QA and review time.

A good example of this is the ONS dashboard on coronavirus insights, which was developed using a five-day development sprint. While sprints are not a new idea, their prevalence may well increase as organisations become more responsive and agile in finding solutions to issues.

With open data sources and APIs now more commonplace, the ability to access and include live data in dissemination is increasing. But maintaining and checking daily updates is time-consuming in the sense that it is an everyday job now. Despite the automation involved, there is still a need for checking and ‘hands on’ tasks. Will users come to expect updates on weekends, bank holidays and over Christmas as the “norm” in the future? This is exemplified by the release of a holiday publishing schedule for the main Government COVID-19 dashboard.

That said, with remote working now the new “norm”, perhaps this will become less intrusive on work/home life balance if such tasks can be done from home in a matter of minutes.


The data impact of COVID-19 in Wales

In Wales, we have seen a temporary pause to all face-to-face interviewing for social surveys since March 2020. This was a blow, as the National Survey for Wales is a rich source of quantitative data not available from other sources. Adaptations have since been made so that survey interviews could restart in other formats (and other COVID-19 specific surveys are also in the field now – another example of how quickly new projects can be developed and launched).

This represents a big change in methodology for our surveys and the new formats will also reach different groups of people, which means the estimates produced from these surveys will be impacted. Statisticians and users will be grappling with the impact of these changes to methodology and estimates for some time to come.

Other datasets have been equally hard hit. Much data collection has been suspended since the start of lockdown. Others, such as educational achievement, will have little or no continuity during this period, as the move from examinations to assessed grades represents a substantial change in methodology.

All of this presents a specific data problem for researchers in Wales. Analysts working in Regional Partnership Boards (RPBs) and Public Services Boards (PSBs) have a statutory requirement to produce assessments by April 2021 (PSBs to produce a well-being assessment and RPBs a population needs assessment). They will need to decide what data they can use to understand the issues currently facing their communities, what trend data can be used to understand recent change (given the impact of COVID-19 on datasets), and how they can predict what trends will occur in the future (the well-being assessments have to include an element of understanding of future trends and issues that public service providers will face).

All these issues will make these assessments more difficult than any previously undertaken, and that is before the availability of resources and technical skills for analysis, engagement and consultation is considered.

However, it feels like analysts and the wider research community in Wales (and beyond) have shown a huge degree of flexibility, innovation and responsiveness during this pandemic so far, and that these qualities will stand them in good stead for future challenges.

AUTHOR BIO: Duncan Mackenzie is Data Cymru’s strategic lead for Partnership Support. He works with partnerships and other organisations in Wales, helping them with their data and information needs. He has previously worked for the Wales Audit Office, Welsh Government and Merthyr Tydfil County Borough Council. Before moving to Cardiff with his family, he managed contact databases for the Greater London Authority and the Corporation of London. When not talking about statistics at work, he is a qualified basketball statistician, providing in-game live stats for the Cardiff Met Archers WBBL team.