The WCota dataset is pulled from the GitHub repository maintained by researcher Wesley Cota (W. Cota, “Monitoring the number of COVID-19 cases and deaths in Brazil at municipal and federative units level”, SciELOPreprints:362, 2020). It reports the numbers of confirmed cases and deaths caused by SARS-CoV-2 infection, aggregated at the state and municipal levels and compiled from data released by the Ministry of Health and the State Health Departments.
The author gathers data from publicly available reports of the state and municipal health secretariats before the cases are registered in the Brazilian Ministry of Health database. This makes data on COVID-19 available in real time, since registering cases from state and municipal secretariats in the Brazilian single system takes a long time.
In addition, the data provided by the Ministry of Health is updated slowly and infrequently, the site often goes down, and the data is unstructured.
We collected the data from February 1st, 2020 onward. The data can be updated daily. Links to publications that use the WCota data, or that provide other publicly accessible locations of the data, can be found in (Jorge et al., 2021, 35).
The collected, cleaned, and formatted data is freely available from WCota and can be accessed and downloaded from (WCota, 2020) under the Creative Commons Attribution-ShareAlike (CC BY-SA 4.0) licence.
Python code for downloading data from the WCota project is available in our GitHub directory; see the details on GitHub. PAMEpi uses the files named cases-brazil-cities-time, which contain the time series of new cases and deaths from COVID-19.
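As a rough illustration of how a cases-brazil-cities-time file might be read once it has been downloaded, the sketch below uses only the Python standard library. It is not the PAMEpi download code: the function names `load_wcota` and `new_cases_series` are hypothetical, and the file is assumed to be a UTF-8 CSV (optionally gzip-compressed) whose header uses the original WCota field names.

```python
import csv
import gzip
from pathlib import Path


def load_wcota(path):
    """Read a cases-brazil-cities-time file into a list of dicts.

    Assumption: the file is a UTF-8 CSV, gzip-compressed when its name
    ends in ".gz", with the original WCota column names in the header.
    """
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


def new_cases_series(rows, ibge_id):
    """Return (date, newCases) pairs for one municipality, in file order."""
    wanted = str(ibge_id)
    return [
        (row["date"], int(row["newCases"]))
        for row in rows
        if row["ibgeID"] == wanted
    ]
```

For example, `new_cases_series(load_wcota("cases-brazil-cities-time.csv.gz"), 3550308)` would return the daily new-case series for São Paulo, assuming a file with that name exists in the working directory.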
The WCota dataset has 12 columns and had a size of 311 MB at the last update, on May 11th, 2022. Code with more details about the variables, data processing, and analysis methods is available in our GitHub directory.
The WCota dataset depends on the quality of the information reported by the state and municipal health secretariats. When files are provided as PDFs or images, tabulating the data in real time can be compromised. Additionally, deaths and cases of COVID-19 are tabulated according to the date on which the data was collected.
Therefore, the resulting epidemiological curve can lag by one to seven weeks relative to the date of the first symptoms or the date of the laboratory test of the case (Observatório COVID-19, 2020). Still, the dataset is considered an excellent source for following the course of the pandemic in real time.
epi_week | date | state | city | ibgeID | newDeaths | deaths | newCases | totalCases | deaths_per_100k_inhabitants | totalCases_per_100k_inhabitants | deaths_by_totalCases |
---|---|---|---|---|---|---|---|---|---|---|---|
12 | 2020-03-16 | SP | São Paulo/SP | 3550308 | 0 | 0 | 83 | 145 | 0 | 1170 | 0 |
12 | 2020-03-20 | CE | Fortaleza/CE | 2304400 | 0 | 0 | 46 | 63 | 0 | 2330 | 0 |
12 | 2020-03-20 | DF | Brasília/DF | 5300108 | 0 | 0 | 45 | 87 | 0 | 2812 | 0 |
12 | 2020-03-18 | RJ | Rio de Janeiro/RJ | 3304557 | 0 | 0 | 32 | 55 | 0 | 812 | 0 |
12 | 2020-03-18 | BA | Salvador/BA | 2927408 | 0 | 0 | 12 | 17 | 0 | 586 | 0 |
12 | 2020-03-20 | PR | Curitiba/PR | 4106902 | 0 | 0 | 10 | 27 | 0 | 1375 | 0 |
12 | 2020-03-18 | RS | Porto Alegre/RS | 4314902 | 0 | 0 | 9 | 15 | 0 | 1005 | 0 |
12 | 2020-03-19 | MG | Belo Horizonte/MG | 3106200 | 0 | 0 | 8 | 18 | 0 | 711 | 0 |
12 | 2020-03-19 | ES | Vila Velha/ES | 3205200 | 0 | 0 | 7 | 7 | 0 | 1376 | 0 |
12 | 2020-03-17 | PE | Recife/PE | 2611606 | 0 | 0 | 6 | 13 | 0 | 783 | 0 |
Original field name | epi_week | date | state | city | ibgeID | newDeaths | deaths | newCases | totalCases | deaths_per_100k_inhabitants | totalCases_per_100k_inhabitants | deaths_by_totalCases |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Field name given by PAMEpi | sem_ntf | date | uf | mun_name | code_full_mun_ibge | newDeaths | deaths | newCases | totalCases | deaths_per_100k_inhabitants | totalCases_per_100k_inhabitants | deaths_by_totalCases |
Field label | epidemiological week | Date | State | County | code IBGE | deaths in the day | Accumulated deaths | Confirmations on the day | Confirmed accum. | Accumulated deaths/100k inhab. | Accum./100k inhab. | deaths/confirmed |
Type | Number | Date YYYY-MM-DD | String | String | Number | Number | Number | Number | Number | Number | Number | Number |
Original categories | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized |
Categories given by PAMEpi | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized | Uncategorized |
Description | Epidemiological week number. | Data release date in YYYY-MM-DD format. | Abbreviation of the federative unit (for example, SP), or 'TOTAL' when referring to the entire country | Full name of the municipality in City/State format. It can have the value 'CASO SEM LOCALIZAÇÃO DEFINIDA/UF', referring to cases in the federative unit whose municipality is not defined. | Unique identifier of the municipality provided by the Brazilian Institute of Geography and Statistics (IBGE) | Difference between the number of deaths on the corresponding date and on the previous one | Cumulative number of deaths on that date | Difference between the number of cases on the corresponding date and on the previous one | Cumulative number of cases on that date | Number of deaths per 100,000 inhabitants for that location | Number of cases per 100,000 inhabitants for that location | Ratio between the number of deaths and cases (deaths/totalCases) |
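The renaming and the derived fields described in the table above can be sketched in a few lines of Python. This is an illustrative sketch rather than the PAMEpi processing code. The helper names are hypothetical, the first daily value is assumed to equal the first cumulative total, and the ratio is assumed to be 0 when there are no confirmed cases.

```python
# Original WCota field names mapped to the names given by PAMEpi,
# taken from the table above; fields not listed keep their names.
PAMEPI_NAMES = {
    "epi_week": "sem_ntf",
    "state": "uf",
    "city": "mun_name",
    "ibgeID": "code_full_mun_ibge",
}


def rename_fields(row):
    """Return a copy of one record with the PAMEpi field names applied."""
    return {PAMEPI_NAMES.get(key, key): value for key, value in row.items()}


def daily_from_cumulative(totals):
    """Rebuild daily counts (newCases, newDeaths) from cumulative totals.

    Each value is the difference from the previous date; the first value
    is taken to be the first total itself (an assumption of this sketch).
    """
    daily = []
    previous = 0
    for total in totals:
        daily.append(total - previous)
        previous = total
    return daily


def deaths_by_total_cases(deaths, total_cases):
    """Ratio deaths/totalCases, defined here as 0.0 when there are no cases."""
    return deaths / total_cases if total_cases else 0.0


# Illustrative cumulative totals, not taken from the dataset.
print(daily_from_cumulative([17, 29, 29, 40]))  # [17, 12, 0, 11]
```

Comparing the rebuilt daily counts with the reported newCases and newDeaths columns is one simple way to spot dates on which a secretariat revised its cumulative totals.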