
covid-policy-tracker's Introduction

Oxford Covid-19 Government Response Tracker (OxCGRT)




Final OxCGRT dataset available, June 2023
This repository contains old data. The OxCGRT stopped publishing real-time updates at the end of 2022. A final version of the OxCGRT dataset is available at https://github.com/OxCGRT/covid-policy-dataset
We recommend people only continue to use this OxCGRT/covid-policy-tracker repository if they need to access old or historical versions of the OxCGRT dataset.



The Oxford Covid-19 Government Response Tracker (OxCGRT) collected information on which pandemic response measures were enacted by governments, and when. This is a project from the Blavatnik School of Government. More information on the OxCGRT is available on the school's website: https://www.bsg.ox.ac.uk/covidtracker. This README contains information about using the database.

This repository is where we published OxCGRT data in real time across 2020-2023. We recommend people use the final version of the dataset (published in the OxCGRT/covid-policy-dataset repository linked above) as this has more jurisdictions, and more consistent data formats between files. This repository remains accessible primarily so people can access historical versions of the dataset. The previous version of this README has been moved to old_README.md.

Citing OxCGRT data

Our data is made available free to use for any purpose under a Creative Commons CC BY 4.0 license (see our license, and a summary of CC BY 4.0 at Creative Commons). This means you must give appropriate credit and link back to our original work. Here are three suggested ways to cite our work:

  • Recommended reference for academic publications: Thomas Hale, Noam Angrist, Rafael Goldszmidt, Beatriz Kira, Anna Petherick, Toby Phillips, Samuel Webster, Emily Cameron-Blake, Laura Hallas, Saptarshi Majumdar, and Helen Tatlow. (2021). “A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker).” Nature Human Behaviour. https://doi.org/10.1038/s41562-021-01079-8
  • Short credit for media use (CC BY 4.0 License): Oxford COVID-19 Government Response Tracker, Blavatnik School of Government, University of Oxford.
  • Full credit for media use (CC BY 4.0 License): Thomas Hale, Anna Petherick, Toby Phillips, Jessica Anania, Bernardo Andretti de Mello, Noam Angrist, Roy Barnes, Thomas Boby, Emily Cameron-Blake, Alice Cavalieri, Martina Di Folco, Benjamin Edwards, Lucy Ellen, Jodie Elms, Rodrigo Furst, Liz Gomes Ribeiro, Kaitlyn Green, Rafael Goldszmidt, Laura Hallas, Nadya Kamenkovich, Beatriz Kira, Sandhya Laping, Maria Luciano, Saptarshi Majumdar, Thayslene Marques Oliveira, Radhika Nagesh, Annalena Pott, Luyao Ren, Julia Sampaio, Helen Tatlow, Will Torness, Adam Wade, Samuel Webster, Andrew Wood, Hao Zha, Yuxi Zhang. Oxford COVID-19 Government Response Tracker, Blavatnik School of Government, University of Oxford.

covid-policy-tracker's People

Contributors

actions-user, bernardoandretti, laurahallas, saptahash, tatlowhelen, tobyphillips, totalamateurhour


covid-policy-tracker's Issues

Question regarding data

Hiya!

I am very sorry for this very basic question.

I've read the documentation and have surely missed it, but I cannot make sense of the dot ('.') value in the data. For example, in the case of Argentina for July 16, in the latest pull (34775b1).

I found information about the null values in the database but nothing about the dots. Can I assume that this is just data that has not been processed yet?

Thank you!
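For anyone hitting the same question, a minimal Python sketch for locating these placeholder cells in the CSV. The column and values below are a made-up sample shaped like OxCGRT_latest.csv, not real data:

```python
import csv
import io

def find_placeholder_cells(csv_text, placeholder="."):
    """Return (row_index, column_name) pairs whose cell equals the placeholder."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for i, row in enumerate(reader):
        for col, val in row.items():
            if val == placeholder:
                hits.append((i, col))
    return hits

# Tiny synthetic sample in the shape of OxCGRT_latest.csv (values are made up):
sample = "CountryCode,Date,C1_School closing\nARG,20200716,.\nARG,20200717,2\n"
print(find_placeholder_cells(sample))  # -> [(0, 'C1_School closing')]
```

This makes it easy to count how many cells carry the placeholder before deciding how to treat them.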

Issues with C1 school closing for UK

Hi,

Fantastic work!

I am using the covid-policy-tracker data and found some discrepancies in C1 school closing for the UK. For example, when C1_Flag = 1, C1_School Closing has two distinct entries (e.g., 2 and 3) while other variables (i.e. date) stay the same. It does not seem to come from different regions, because there is no information under "subregion". Do you know how to interpret this and which value to take at the national level? Thanks.

Best,
Jiayao

Possible additional measures

Thanks for putting this together. I read that you welcome feedback and will consider additional measures. Advice on:

-self isolation e.g. stay at home for x days if you have a fever
-social distancing e.g. keep two meters from people not in your household

would be nice to have. I couldn't see a proxy; I imagine both would be lumped with public info campaigns, but apologies in case I missed something.

The README file in this repo has some bad links - [404:NotFound]


Status code [404:NotFound] - Link: https://github.com/OxCGRT/covid-policy-tracker/blob/master/www.bsg.ox.ac.uk
Status code [404:NotFound] - Link: https://github.com/OxCGRT/covid-policy-tracker/blob/master/data/OxCGRT_US_states_temp.csv

This was found by a new experimental hobby project that I have just created: https://github.com/MrCull/GitHub-Repo-ReadMe-Dead-Link-Finder
If this has been in any way helpful then please consider giving the above Repo a Star.

lockdown on weekends, how to code?

If a policy has been introduced which requires lockdown only on weekends, is this policy coded for only once, or is the stringency coded differently for every weekend compared to every weekday? (Ukraine)

CZE C8_International travel controls value

Hello,
The value for the CZE C8 criterion has been 4 (i.e. a total travel ban) for the last few days. However, there is no total ban for CZE.
According to IATA: https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm
"Published 16.06.2020

  1. Passengers are not allowed to enter.
    This does not apply to:
    -nationals of Croatia and their family members.
    -nationals of Austria, Czechia, Estonia, Germany, Hungary, Latvia, Lithuania, Poland, Slovakia and Slovenia."

Israel data missing/out of date?

Since 9/20, the NPIs for Israel have not been updated in the data feed. This is important because Israel recently went into a second lockdown, but this is not reflected in the data.

E3 missing

There is no information on E3 in this document.

Data removed

The csv data that was here before has been replaced with some html markup

The data from Brazil seems corrupt

If one looks at the data from November on across all the federal units of Brazil, one can see that it is identical when normalizing by population. This is not possible, which suggests the data has been calculated by multiplying the total cases in Brazil by the population share of each federal unit.
I would like to add a picture here, but it is not possible; maybe you should look at it yourselves anyway.

The "flag=0" hides national measures (example with France)

Hi,

The "flag=0" used for regional measures sometimes seems to mask national measures.

For example, here in France, from 29/02 to 13/03 the value should be 1, and then 2 between 13/03 and 17/03. But because of the "flag = 0", this gradation of the national measures does not seem to appear in the database.

Is there a way to remedy it? We would need this kind of fine information for our study.

Thanks for all the amazing work !

Health care policies investments in USD

According to your documentation, the announced short-term spending on the healthcare system (H4) and announced public spending on COVID-19 vaccine development are in USD. Which exchange rate are you using? In addition, the short-term spending seems to be "long term" rather than "short term", judging by the values announced by governments in LAC.

Data Quality Issues

Dear OxCGRT Team:

Thank you again for your tremendous data collection efforts and for putting this repo online. This is a very good idea. I maintain a related R package {tidycovid19}. Given the timeliness and importance of these data, I have been looking into the data quality of this project in the past. See here for an admittedly dated blog post. I would like to mention three important issues that I believe addressing would improve the internal consistency of your data. All points are substantiated with some R code below.

  1. In this repo you are not providing the notes to the data (the CSV link on your webpage does). As the notes provide the reference for a certain measurement, I think that they are absolutely essential for researchers to assess the quality of your data. I would definitely include them here.

  2. Zero values compared to missing values. Based on the code below, only 7 % of your zero values are supported by references, while more than 90 % of the other measures are. Yet, zero measures are the most frequent data value in your analysis (44 % of the non-NA cases). I would strongly encourage you to reconsider all zero measures that you do not have references for. For me it seems as if your zeros are in some cases informed statements about certain measures not being present whereas in most cases they indicate nothing different from NA.

  3. Organization of data by country-day observations. This is causing me a lot of headaches. Another high-quality repository for government intervention data provided by ACAPS provides its data in country-response structure, meaning that a government response characterizes an observation. This is also how regulatory intervention data is normally stored. It makes much more sense from a data collection standpoint and allows one to validate your data much quicker. For example, in order to assess how many of your interventions are actually supported by references, one first has to assess changes over time, assume that these are driven by government interventions, and then calculate the according statistics. See the code below for what I mean. Transforming your data to a country-response structure reduces your 143,874 country-response type-day observations to 8,184 country-response observations without any loss of information. When you focus on changes in the actual measures (and not in the references that are not included on Github), this number reduces even more to 4,310. For this focused sample it is much easier to provide data quality assurance and you can still produce any view/slice and cut of the data as you like.

Minor things that you might want to clean up

  • There is a trailing comma in your CSV
  • I would remove the confirmed cases or confirmed deaths data as these are not an integral part of your data collection. Alternatively, I would provide a reference to the data source or explain how you collected that data.

Thank you for listening and again for your contributions to open science!

Joachim

PS: Code follows

library(tidyverse)
library(lubridate)
github_url <- "https://raw.githubusercontent.com/OxCGRT/covid-policy-tracker/master/data/OxCGRT_latest.csv"
web_url <- "https://ocgptweb.azurewebsites.net/CSVDownload"

# drop the trailing empty column produced by the trailing comma
github_data <- read_csv(github_url) %>% select(-X27)
web_data <- read_csv(web_url) %>% select(-X40)

web_data %>% select(-ends_with("Notes")) %>% all_equal(github_data)

# [1] TRUE

# Web data is equal to Github data but contains Notes

# Reorganize to long to ease comparisons

df <- web_data
names(df)[c(seq(from = 4, by = 3, length.out = 7), 32, 34)] <- paste0("S", c(1:7, 12, 13), "_Measure")

long_dta <- df %>% select(1:23, 32:35) %>%
  # S7, S12, S13 have no "IsGeneral" value. I attach NA vars for consistency
  mutate(S7_IsGeneral = NA,
         S12_IsGeneral = NA,
         S13_IsGeneral = NA) %>%
  pivot_longer(4:30, names_pattern = "(.*)_(.*)", names_to = c("Type", ".value")) %>%
  mutate(Date = ymd(Date)) %>%
  arrange(CountryName, Type, Date)

nrow(long_dta)

# [1] 143874

# Give speaking names to Type

mat <- str_split_fixed(names(web_data)[4:35], "_", 2)
df <- tibble(
  Type = mat[, 1],
  ResponseType = mat[, 2]
) %>%
  filter(
    ResponseType != "Notes",
    ResponseType != "IsGeneral"
  )

long_dta <- long_dta %>% left_join(df, by = "Type") %>% select(-Type) %>%
  select(CountryName, CountryCode, ResponseType, Date, everything())

# Focus on obs that are either the first or that contain changes from prior date

gov_resp <- long_dta %>%
  group_by(CountryCode, ResponseType) %>%
  filter(
    (row_number() == 1 &
       (!is.na(IsGeneral) | !is.na(Measure) | !is.na(Notes)))  |
      (is.na(lag(IsGeneral)) & !is.na(IsGeneral)) |
      (is.na(lag(Measure)) & !is.na(Measure)) |
      (is.na(lag(Notes)) & !is.na(Notes)) |
      (!is.na(lag(IsGeneral)) & is.na(IsGeneral)) |
      (!is.na(lag(Measure)) & is.na(Measure)) |
      (!is.na(lag(Notes)) & is.na(Notes)) |
      (lag(IsGeneral) != IsGeneral) |
      (lag(Measure) != Measure) |
      (lag(Notes) != Notes)
  ) %>%
  ungroup()

nrow(gov_resp)

# [1] 8184

# 8,184 observations that reflect changes in the data - impressive

# But, unfortunately, many of those only reflect changes that are just driven
# by notes (that are sometimes sticky, sometimes not) that are marginally
# changed or omitted after the initial day. See for example in the raw long data:

long_dta[72:76,]

# # A tibble: 5 x 7
# CountryName CountryCode ResponseType  Date       Measure IsGeneral Notes
# <chr>       <chr>       <chr>         <date>       <dbl>     <dbl> <chr>
# 1 Afghanistan AFG         School closi… 2020-03-12       0        NA  NA
# 2 Afghanistan AFG         School closi… 2020-03-13       0        NA  NA
# 3 Afghanistan AFG         School closi… 2020-03-14       2         1 "\n\nOn March 14, 2020, the…
# 4 Afghanistan AFG         School closi… 2020-03-15       2         1 "On March 14, 2020, the Afg…
# 5 Afghanistan AFG         School closi… 2020-03-16       2         1 "On March 14, 2020, the Afg…

# compared to

long_dta[161:165,]

# # A tibble: 5 x 7
# CountryName CountryCode ResponseType   Date       Measure IsGeneral Notes
# <chr>       <chr>       <chr>          <date>       <dbl>     <dbl> <chr>
# 1 Afghanistan AFG         Testing frame… 2020-02-21       0        NA  NA
# 2 Afghanistan AFG         Testing frame… 2020-02-22       0        NA  NA
# 3 Afghanistan AFG         Testing frame… 2020-02-23       1        NA "'The Ministry of Public H…
# 4 Afghanistan AFG         Testing frame… 2020-02-24       1        NA  NA
# 5 Afghanistan AFG         Testing frame… 2020-02-25       1        NA  NA

# To identify "real" government interventions, I focus on measure changes
# and discard changes that are just driven by the notes.

gov_resp %>%
  group_by(CountryCode, ResponseType) %>%
  filter((row_number() == 1 ) |
           (lag(Measure) != Measure) |
           (lag(IsGeneral) != IsGeneral) |
           (is.na(Measure) & !is.na(lag(Measure))) |
           (!is.na(Measure) & is.na(lag(Measure))) |
           (is.na(IsGeneral) & !is.na(lag(IsGeneral))) |
           (!is.na(IsGeneral) & is.na(lag(IsGeneral)))) %>%
  mutate(NotesThere = !is.na(Notes)) %>%
  ungroup() -> gov_resp

nrow(gov_resp)

# [1] 4310 - still impressive

gov_resp %>%
  group_by(ResponseType) %>%
  summarise(
    N = n(),
    PctNotes = sum(NotesThere)/n()
  ) %>% arrange(-N)

# A tibble: 9 x 3
# ResponseType                          N PctNotes
# <chr>                             <int>    <dbl>
# 1 International travel controls       527    0.488
# 2 Restrictions on internal movement   515    0.447
# 3 Workplace closing                   495    0.432
# 4 Cancel public events                493    0.426
# 5 School closing                      485    0.416
# 6 Public information campaigns        464    0.379
# 7 Testing framework                   463    0.404
# 8 Close public transport              453    0.342
# 9 Contact tracing                     415    0.345
# Compare this across type of measure changes

gov_resp %>%
  group_by(Measure) %>%
  summarise(
    N = n(),
    PctNotes = sum(NotesThere)/n()
  ) %>% arrange(Measure)

# A tibble: 5 x 3
# Measure     N PctNotes
# <dbl> <int>    <dbl>
# 1       0  1414   0.0686
# 2       1   642   0.924
# 3       2   973   0.941
# 4       3   169   0.923
# 5      NA  1112   0.0108

# While most of the non-zero measures are supported by references, only about 7% of the
# zeros are. Yet they are by far the most frequent observation type in the data.
# My hunch: most of the zeros are actually not backed by actual "events" or
# clear evidence that there is no such event.

# Share of zero measures

sum(gov_resp$Measure == 0, na.rm = TRUE)/sum(!is.na(gov_resp$Measure))

# [1] 0.4421513

Column name casing

Hi, I noticed that the last commit to the latest data updated column names from UpperCase to lowercase.

Was this planned or is it a mistake? We are changing our workflow to be more flexible about this, but I wanted to point it out as others may face issues.
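One defensive pattern (a Python sketch, not from the OxCGRT docs) is to resolve column names case-insensitively so a casing change upstream does not break downstream code:

```python
def get_column(columns, wanted):
    """Resolve a column name case-insensitively, so code keeps working
    whether the feed ships 'CountryName' or 'countryname'."""
    lookup = {c.lower(): c for c in columns}
    return lookup[wanted.lower()]

# Works in both directions:
print(get_column(["countryname", "date"], "CountryName"))  # -> countryname
print(get_column(["CountryName", "Date"], "countryname"))  # -> CountryName
```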

US states data missing September 12-20?

The two tables that include subnational index values for US states ("data/..latest" and "data/..latest_withnotes") have a gap (no data) from September 12th to 20th. Was there a pause in generating these, or were they lost? I believe this was already the case several weeks ago. Sorry if I am missing something here. And thanks for your outstanding work!

Brazil data mistakes

There appears to be a mistake in processing Brazil's data on the API. The confirmed cases and deaths numbers bounce between a high and a low value.

For example, 2020-03-27 shows 2915 cases and 77 deaths, while the next day 2020-03-28 shows just 4 cases and 0 deaths. If only this were true.

This is also a problem for a few other values of countries, including Ecuador and Lithuania.

French data are not up to date (13/05/2020)

Hello, very good job, we will use it soon for pedagogic and research purposes!
thanks a lot
but unfortunately ...

By the way, I think the French data are not up to date: many restrictions have ended or become less intense in France over the last 3 days (schools, transport, ...) as of 13/05/2020.

please have a look to : http://www.leparisien.fr/societe/coronavirus-dernier-jour-de-confinement-en-france-suivez-notre-direct-10-05-2020-8313930.php

best regards +
take care
Henri

July entries

Since the 5th of July, most columns' data have not been updated.

Cases and deaths are not there from 6 August onwards?

Hello!

I use the confirmed cases and confirmed deaths data as well, but I see that the last date with data is the 6th of August. Is it just me, or will there be no more case and death data? I also couldn't find an announcement about it.

Thanks!
Meltem

Missing data in V2 of API

Hello,
Did some format analyses of the data coming from the DATE/COUNTRY API and have some concerns.
There are many entries that are missing data:

  • policy_value_display_field (one or several policies are missing this field, but others have it in some files)

  • stringencyData: sometimes the response contains only stringencyData:{"msg":"Data unavailable"}. This causes the response to miss important values like date_value and country_code, and makes processing much harder.

  • Also, in many records the policy action parameter "is_general" still uses the OLD key "isgeneral". I think the S99 policy is not converted(?)
    DNK_2020-04-29:{"policyActions":[{"policy_type_code":"S99","policy_type_display":"No data. Data may be inferred for last 7 days.","policyvalue":1,"isgeneral":true,"notes":null}],"stringencyData":{"date_value":"2020-04-29","country_code":"DNK","confirmed":8851,"deaths":434,"stringency_actual":null,"stringency":81.47}}

These are only syntax errors, in my opinion. I have not been able to check the quality of the values; first I need help making the data uniform in order to proceed.
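A minimal Python sketch of defensive parsing for one DATE/COUNTRY record, tolerating the {"msg":"Data unavailable"} payload and the legacy "isgeneral" key. The field names follow the JSON excerpt above; everything else is an assumption:

```python
def parse_day(record):
    """Extract the fields needed from one DATE/COUNTRY API record,
    tolerating a stringencyData payload of {"msg": "Data unavailable"}
    and normalising the legacy 'isgeneral' key to 'is_general'."""
    s = record.get("stringencyData") or {}
    if "msg" in s:  # e.g. {"msg": "Data unavailable"}
        s = {}
    actions = []
    for a in record.get("policyActions", []):
        a = dict(a)
        if "isgeneral" in a and "is_general" not in a:
            a["is_general"] = a.pop("isgeneral")
        actions.append(a)
    return {
        "country_code": s.get("country_code"),
        "date_value": s.get("date_value"),
        "stringency": s.get("stringency"),
        "policy_actions": actions,
    }

# Shaped like the DNK_2020-04-29 excerpt above:
rec = {"policyActions": [{"policy_type_code": "S99", "isgeneral": True}],
       "stringencyData": {"msg": "Data unavailable"}}
out = parse_day(rec)
print(out["country_code"], out["policy_actions"][0]["is_general"])  # -> None True
```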

Regarding the "RANGE" API: in general it misses "policyActions", which makes this API unusable if one wants the policyActions as well.

The CSV file
Keys are different between the API and the CSV, which makes them incompatible. I do not know whether this is intentional.

Attaching list of countries and dates for which data is, in my opinion, missing or wrong in DATE/COUNTRY API

I also have a couple of questions regarding data update interval

  1. Today is the 2nd of May. When will data for the 2nd of May be available through the DATE/COUNTRY API?
  2. If a "source" today (2nd of May) changes values for previous days (let's say for the 30th of April), will this change be reflected in your data for that country on the 30th of April?
    MissingDataInDailyObservations.txt
    isgeneral_OnOldFormat.txt

Thanks for helping out!
Regards
Ljuba

Question on NPI data for China

First, thank you for the awesome work you have been doing and kudos to all who are making this possible.
I am trying to understand whether the NPI data you have for China is for the whole country or is mostly based on interventions taken in Wuhan. Indeed, I would expect the indicators to be lower in many categories, as many interventions have been lifted in most Chinese regions other than Wuhan.

Can you share your perspective on how to interpret these indicators? Is it fair to say that for China you are capturing the strictest measures applied in any region/state/province?

Thanks again

Pierre

How are closures of bars/restaurants coded?

Hi,

thanks for this fantastic work! It's not clear from your documentation how a "private gathering", a "public event" or a "workplace" are defined. When bars or restaurants are required to close, does this affect the three above-mentioned categories, or just a subset? Same question when bars are required to close at certain hours, or when there are restrictions on the occupancy.

Policy Measures in US States don't seem correct

Thanks for putting this package together!

Looking at Georgia I notice that all policy variables have increased over time without any decrease. Georgia has relaxed a number of restrictions and this relaxation does not seem to be accounted for.

Could you please shine some light on this?

[Country Charts] No Cases vs Government Response Index?

Tried a search for cases vs government response in the image folders, but images for all countries reflect deaths vs government response index. Is there any way we can find images of cases vs government response? Thanks

Possible to see chart's code ?

Hello OxCGRT team,

First of all, thank you for the amazing work you are doing. I'm using your data for some academic research and it has been a great source.

I'm trying to study some correlations and causality. I wondered if it was possible to see the code used to make the charts, in order to customize them for specific studies.

Thanks !
Leo

Outdated information for France

Some policy information has been out of date for France for several months. For instance, your database states that:

  • Public events are cancelled
  • Gatherings are restricted to 10 people or fewer
  • It is recommended not to leave the house
  • It is recommended not to travel between cities

This is not the case.

You seem to be relying on an outdated webpage. According to your notes, your source is: https://web.archive.org/web/20200701055822/https://www.gouvernement.fr/en/coronavirus-covid-19
However, as we can see in the URL, this webpage was captured on July 1st. But the main problem is that the current webpage itself has not been updated since June 22nd, as indicated at the top! This doesn't mean nothing changed, but rather that the government didn't manage to maintain an up-to-date English webpage.

StringencyIndexForDisplay data missing/NaN for last 2 days for over 100 countries

First of all, thank you very much for the fantastic job you are doing. I am using it for "my" dashboard at https://got-data-for.me/wmap to visualise the global situation (dropdown string_idx).

When updating the data for that dashboard, as my data pipeline is not yet fully up (there is always a little bit that needs fixing) I do encounter frequent missing data points for StringencyIndexForDisplay, often around 30...50 countries, missing yesterday's data. I am usually padding this myself.

Since the day before yesterday, though, things seem to have changed substantially. Many countries lack data for two days, and there are countries with NaN in StringencyIndexForDisplay, so padding has become very challenging recently.

It appears as if something has changed and data quality is affected, unfortunately I cannot point you to the source of it. The data from 2020-06-01 was the last one I managed to successfully pad.

Keep up your excellent work!

Data chart incorrect, please correct

Hi,
great fan of the data & chart, however it is incorrect for Mexico. According to the online data, Mexico's stringency index was 82.4 in May, but the chart shows only 62.4 (one major horizontal line lower, 20 points below the actual data).
Please correct it, I was going to use this chart in a journal article but cannot as it is incorrect.

Thanks in advance for the correction!
K. Lengyel

Change in historical values

Hello.

I downloaded the file data/OxCGRT_latest.csv at 9:20AM on March 6th, 2020.

I downloaded the same file at 12:04PM on March 6th, 2020.

I noticed that the historical values for the StringencyIndexForDisplay values have changed.

9:20AM Download
Date CountryName CountryCode StringencyIndex StringencyIndexForDisplay
14231 20200111 Portugal PRT 0.00 0.00
14232 20200112 Portugal PRT 0.00 0.00
14233 20200113 Portugal PRT 0.00 0.00
14234 20200114 Portugal PRT 0.00 0.00
14235 20200115 Portugal PRT 0.00 0.00
14236 20200116 Portugal PRT 0.00 0.00
14237 20200117 Portugal PRT 0.00 0.00
14238 20200118 Portugal PRT 0.00 0.00
14239 20200119 Portugal PRT 0.00 0.00
14240 20200120 Portugal PRT 0.00 0.00
14241 20200121 Portugal PRT 0.00 0.00
14242 20200122 Portugal PRT 0.00 0.00
14243 20200123 Portugal PRT 0.00 0.00
14244 20200124 Portugal PRT 0.00 0.00
14245 20200125 Portugal PRT 0.00 0.00
14246 20200126 Portugal PRT 11.11 11.11
14247 20200127 Portugal PRT 11.11 11.11
14248 20200128 Portugal PRT 11.11 11.11
14249 20200129 Portugal PRT 11.11 11.11
14250 20200130 Portugal PRT 11.11 11.11

12:07PM Download
Date CountryName CountryCode StringencyIndex StringencyIndexForDisplay
14231 20200111 Portugal PRT NaN NaN
14232 20200112 Portugal PRT NaN NaN
14233 20200113 Portugal PRT NaN NaN
14234 20200114 Portugal PRT NaN NaN
14235 20200115 Portugal PRT NaN NaN
14236 20200116 Portugal PRT NaN NaN
14237 20200117 Portugal PRT NaN NaN
14238 20200118 Portugal PRT NaN NaN
14239 20200119 Portugal PRT NaN NaN
14240 20200120 Portugal PRT NaN NaN
14241 20200121 Portugal PRT NaN NaN
14242 20200122 Portugal PRT NaN NaN
14243 20200123 Portugal PRT NaN NaN
14244 20200124 Portugal PRT NaN NaN
14245 20200125 Portugal PRT NaN NaN
14246 20200126 Portugal PRT NaN NaN
14247 20200127 Portugal PRT NaN NaN
14248 20200128 Portugal PRT NaN NaN
14249 20200129 Portugal PRT NaN NaN
14250 20200130 Portugal PRT NaN NaN

I use Python's pd.read_csv() to download the file directly. I downloaded the file every 30 minutes thereafter and get the same results as the 12:07PM download.
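A small Python sketch of the comparison being described: diff two snapshots of the same CSV and report which key tuples silently changed. The column names come from the table above; the two-row sample is synthetic:

```python
def changed_rows(old, new, keys=("CountryCode", "Date"),
                 value="StringencyIndexForDisplay"):
    """Compare two snapshots of the same CSV (as lists of dicts) and return
    the key tuples whose `value` field differs between downloads."""
    old_map = {tuple(row[k] for k in keys): row[value] for row in old}
    diffs = []
    for row in new:
        key = tuple(row[k] for k in keys)
        if key in old_map and old_map[key] != row[value]:
            diffs.append(key)
    return diffs

# Synthetic example in the shape of the 9:20AM vs 12:07PM downloads:
am = [{"CountryCode": "PRT", "Date": "20200111", "StringencyIndexForDisplay": "0.00"}]
pm = [{"CountryCode": "PRT", "Date": "20200111", "StringencyIndexForDisplay": "NaN"}]
print(changed_rows(am, pm))  # -> [('PRT', '20200111')]
```

Running this against periodic downloads makes silent revisions of historical values easy to detect.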

I wanted to either raise the awareness of the potential corrupted data or to understand that this was an intended change that took place on your side.

Thank you so much!

Z

Fiscal stimulus data contains duplicates, clearly wrong data and missing data

This is a great resource that I've been heavily leaning on for some of my own work, but I wanted to make you aware of some data quality issues with the values in the E3 indicator (fiscal measures).

1. Existence of duplicates

There seem to be a number of duplicate entries on consecutive days. For example, Cape Verde has the same value entered 16 times on consecutive days (and it seems to me to be an incorrect value as well; more on this in my next point).

2. Wrong data / incorrect format

There are a number of records where the fiscal value seems to be implausibly small.

For example, the dataset seems to be saying that Bolivia injected a measly 19 cents into its economy on the 9th of April.


This issue seems to occur for a number of countries throughout the dataset on this measure, including Turkey, Guam, Pakistan, Ireland, Cape Verde, and Morocco. I suspect whoever entered the data assumed it should be entered as millions of dollars instead of the complete integer.

There is also a possible typo for Slovenia, for which the dataset has $217 billion entered as fiscal stimulus, which is about 400% of their GDP. I suspect a zero or two have been inadvertently added.
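A quick magnitude check along these lines, as a Python sketch: flag positive fiscal values below a threshold as likely unit errors (entered in millions rather than whole dollars). The column name and threshold here are illustrative, not from the codebook:

```python
def suspicious_fiscal_values(rows, value_key="E3_Fiscal measures", floor=1000.0):
    """Flag entries whose positive value falls below `floor` USD; such values
    were plausibly entered in millions rather than whole dollars.
    `value_key` and `floor` are illustrative assumptions."""
    flags = []
    for row in rows:
        try:
            v = float(row[value_key])
        except (KeyError, TypeError, ValueError):
            continue
        if 0 < v < floor:
            flags.append((row.get("CountryCode"), v))
    return flags

# Synthetic rows echoing the Bolivia and Slovenia examples above:
rows = [{"CountryCode": "BOL", "E3_Fiscal measures": "0.19"},
        {"CountryCode": "SVN", "E3_Fiscal measures": "217000000000"}]
print(suspicious_fiscal_values(rows))  # -> [('BOL', 0.19)]
```

A complementary check against GDP would catch the too-large values as well.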

3. Missing data.

The figures for Canada seem to be incorrect. The IMF World Economic Outlook reports fiscal stimulus of around 9% of GDP (about $205 billion). However, the sum of the figures in E3 here falls way short (about 1.8 billion). I suspect that either data has been missed, or that the fiscal stimulus announced by Canada doesn't meet the specification of your E3 indicator?

Thanks again though for the resource.
