Dataset Introduction
Translated with www.DeepL.com/Translator (free version)
https://github.com/swsoyee/2019-ncov-japan/blob/master/50_Data/byDate.csv
Data source: New Coronavirus Latest Status Map and Number of Infected Persons in Japan (hereafter NewsDigest). In the middle of that page, click 表で見る to view the statistical table of the day. Note that the figures on this site are actual patient counts, and relapses are not counted as new cases (whereas the Ministry of Health, Labour and Welfare (MHLW) and each municipality count relapses as new cases).
- A scheduled script on AWS automatically fetches this project's data set and, on each run, compares it against the values in the NewsDigest table at that moment.
- If there is a difference, a data-update Pull Request is automatically submitted to the project.
- When GitHub Actions detects that the Pull Request contains data changes, it starts data preprocessing to generate the data format required by the website and automatically commits the update. The branch is then automatically merged into the master branch and the data-update branch is deleted.
- The CD step automatically deploys to the production environment, completing the live data update of the website.
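The comparison step in the workflow above can be sketched as follows. This is a minimal illustration, not the project's actual script: the function names `diff_counts` and `needs_pull_request`, the dictionary layout, and all numbers are assumptions made for the example.

```python
from typing import Dict


def diff_counts(stored: Dict[str, int], fetched: Dict[str, int]) -> Dict[str, int]:
    """Return per-region differences (fetched - stored), omitting unchanged regions.

    `stored` stands for the values already in the project's data set and
    `fetched` for the values read from the source table (both hypothetical).
    """
    changes = {}
    for region in sorted(set(stored) | set(fetched)):
        delta = fetched.get(region, 0) - stored.get(region, 0)
        if delta != 0:
            changes[region] = delta
    return changes


def needs_pull_request(stored: Dict[str, int], fetched: Dict[str, int]) -> bool:
    """A data-update Pull Request is only worth opening when something changed."""
    return bool(diff_counts(stored, fetched))


# Hypothetical values, for illustration only.
stored = {"Tokyo": 120, "Osaka": 80, "Hokkaido": 15}
fetched = {"Tokyo": 125, "Osaka": 80, "Hokkaido": 15, "Okinawa": 2}

changes = diff_counts(stored, fetched)
print(changes)
print(needs_pull_request(stored, fetched))
```

Only the regions whose values changed end up in the result, which keeps an automatically generated Pull Request small and easy to review.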
Because the automated process only picks up the current day's differences in the table, any past values that NewsDigest revises based on official data have to be verified daily and corrected manually. The real-time confirmed-case items used in this project are basically identical to the values published by NewsDigest. If there are differences, the data for this project needs to be revised.
- Number of infected persons in the upper right corner (note: this value includes the confirmed cases from the Diamond Princess).
- Epidemic map tab - daily new cases in the new-cases map (display mode: simple) (the spike bars on the map).
- Epidemic map tab - number of infected persons in the cumulative map (display mode: simple) (color blocks).
- Epidemic map tab - new (bubbles) and cumulative (color blocks) infections in the dynamic map (display mode: detailed).
- Epidemic map tab - new cases, number of infected, infection trend, confirmed cases per 100,000, Rt, and doubling time in the infection table on the right (hidden by default; scheduled to be retired or reworked).
- Epidemic map tab - the denominator of the detailed pie chart in the recovery/death table on the right.
- Epidemic map tab - the denominator in the calculation of the current total number of patients below the map.
- Multidimensional comparison tab - the quick-report values in the comparison curve against Ministry of Health, Labour and Welfare (MHLW) data.
- Infected person heat map tab - the daily new cases heat map and the doubling-time heat map.
- Effective reproduction number tab - Rt graphs.
- Common footer - the number of infected persons at the bottom.
- Everything under the infected persons trend tab.
https://github.com/swsoyee/2019-ncov-japan/blob/master/50_Data/death.csv
See 1.2.
See 1.3.
See 1.4.
- Number of deaths in the upper right corner (note: this value includes the deaths from the Diamond Princess).
- Epidemic map tab - deaths and deaths per million in the recovery/death table on the right (expected to change to deaths per 100,000, consistent with the infection table).
- Epidemic map tab - calculation of the current total number of patients below the map (deaths are excluded from the denominator).
- Multidimensional comparison tab - the quick-report values in the comparison curve against Ministry of Health, Labour and Welfare (MHLW) data.
- Common footer - the number of deaths at the bottom.
- Everything under the infected persons trend tab.
Base data: https://github.com/swsoyee/2019-ncov-japan/blob/master/50_Data/MHLW/summary.csv
Consultation hotline data: https://github.com/swsoyee/2019-ncov-japan/blob/master/50_Data/MHLW/callCenter.csv
In its report publication data, the Ministry of Health, Labour and Welfare (MHLW) has published a uniformly formatted PDF summary report every day since May 9, so data after May 9 can be obtained from the [list](https://github.com/swsoyee/2019-ncov-japan/blob/master/50_Data/MHLW/summaryUrlList.csv). Before that date, the standards for the confirmed-case, PCR-test, and other data that MHLW collected from each region were uneven, and it was difficult to aggregate the data under a completely uniform standard. This project has followed all of the data consistently almost from the beginning of the epidemic, through MHLW's changes to its data classification criteria, changes from public to non-public data, and various annotations. For that reason, all data reflect only the status at the time of publication, and past data are not revised.
For example, in the following day's summary the Ministry of Health, Labour and Welfare may add a note saying that yesterday's values were incorrect for some reason and that today's values already incorporate the correction. Since the Ministry does not revise the PDF files it has already published, this project likewise does not apply additional processing to past data.
As a result, suppose the number of people tested published yesterday was `+1000`, today's correction puts yesterday's true number at `+500`, and the number of people tested published today is `+300`. The website will then show the number of people tested for today as `-200`.
Actual: yesterday 500 + today 300 = 800
Published: yesterday 1000 + today 300 - correction to yesterday 500 = 800 (website shows +1000, then -200)
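The arithmetic above can be checked with a short sketch. It assumes the website derives each day's increment as the difference between consecutive published cumulative totals. The starting total `base` and the helper name `displayed_increments` are hypothetical and used only for illustration.

```python
from typing import List


def displayed_increments(cumulative: List[int]) -> List[int]:
    """Daily increments as the difference between consecutive cumulative totals."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


base = 10_000  # hypothetical cumulative total before yesterday

# Yesterday's PDF overstated the increase (+1000 instead of the true +500).
published_yesterday = base + 1000
# Today's PDF silently includes the correction (+500) plus today's real increase (+300).
published_today = base + 500 + 300

increments = displayed_increments([base, published_yesterday, published_today])
print(increments)       # the two daily values shown on the website
print(sum(increments))  # net two-day increase
```

The individual daily values look odd, but their sum still matches the true two-day increase of 800, which is why the project leaves the published figures untouched.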
- Every day around 6:00 p.m. JST, the Ministry of Health, Labour and Welfare's report publication information page is updated with a news item titled "Current Status of the New Coronavirus Infection and the Ministry of Health, Labour and Welfare's Response to it (Reiwa x year, x day edition)".
- Obtain the PDF address of "[Attachment 1] Status of test-positive persons in each prefecture (domestic cases, excluding airport quarantine and charter flight cases)" on this page and add it to the [list](https://github.com/swsoyee/2019-ncov-japan/blob/master/50_Data/MHLW/summaryUrlList.csv).
- Run the R script FetchSummary.R, and the data will be automatically added to the base data file summary.csv.
- Since the PDF file does not contain the quarantine values, update the quarantine values manually according to that day's news page.
- Run CreateTable.R to preprocess the data and generate the various tables displayed on the website, which reduces the rendering load on the website.
- Only the values in the PDF tables published on the current day are collected. They are not adjusted to reflect corrections to yesterday's data that appear in the notes.
- The PDF table has a separate `Confirmed` column, but in the values provided by the Ministry of Health, Labour and Welfare itself there are cases where `Positive` < `In treatment` + `Recovered` + `Deceased` + `Confirmed`, so this project does not use the `Confirmed` values directly and calculates them itself.
  - If `positive` - (`in-treatment` + `recovered` + `dead`) < 0, the shortfall is assumed to be people who have not actually recovered: it is subtracted from `recovered`, and `confirmed` is set to 0.
  - If `positive` - (`in-treatment` + `recovered` + `dead`) > 0, `confirmed` is set to that value.

  Therefore, the `confirmed` value on this site may differ from the value published by the Ministry of Health, Labour and Welfare.
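The rule above can be written as a small function. This is a sketch of the described logic, not the project's R code: the `Row` type, its field names, and `derive_confirmed` are hypothetical, and "subtracted from recovered" is read here as reducing `recovered` by the size of the shortfall so that the categories add up to `positive`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Row:
    """One prefecture's values from the daily PDF table (hypothetical layout)."""
    positive: int
    in_treatment: int
    recovered: int
    dead: int


def derive_confirmed(row: Row) -> Tuple[int, int]:
    """Return (adjusted_recovered, confirmed) following the rule described above."""
    remainder = row.positive - (row.in_treatment + row.recovered + row.dead)
    if remainder < 0:
        # Shortfall: treat it as people who have not actually recovered.
        return row.recovered + remainder, 0
    # Surplus (or exact match): the remainder becomes the confirmed value.
    return row.recovered, remainder


# Hypothetical rows, for illustration only.
shortfall = derive_confirmed(Row(positive=100, in_treatment=40, recovered=50, dead=15))
surplus = derive_confirmed(Row(positive=100, in_treatment=40, recovered=50, dead=5))
print(shortfall, surplus)
```

In both branches the adjusted categories sum to `positive`, which is the property the recalculation is meant to restore.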
- Number of people tested and number of people recovered in the upper right corner (note: these values include the Diamond Princess cases).
- Epidemic map tab - number of currently infected persons (color blocks) in the new map (display mode: simple).
- Epidemic map tab - number of people with serious illness (color blocks) in the serious illness map (display mode: simple).
- Epidemic map tab - all data in the testing table on the right.
- Epidemic map tab - recovery data in the recovery/death table on the right and the calculation of the pie chart. (Note: the pie chart mixes real-time data with Ministry of Health, Labour and Welfare (MHLW) data. Strictly speaking, the definition of confirmed cases and the data collection times differ between the two data sets, so adding or subtracting them directly is not recommended. Since no better data set is available, mixing them is a compromise. The error is negligible when the total number of confirmed cases is large enough.)
- Epidemic map tab - calculation of the current total number of patients below the map (the recoveries and deaths used in the numerator).
- Multidimensional comparison tab - the quick-report values in the comparison curve against MHLW data.
- Multidimensional comparison tab - the two histograms based on MHLW data and the radar chart on the right (the indicators need to be redesigned).
- Common footer - the number of PCR tests and the number of recoveries at the bottom.
- Everything under the PCR tests trend tab.
- Everything under the recovered persons trend tab.
- Everything under the consultation hotline tab.