Historical data using the API #1
Comments
@themonk911 Thank you for getting in touch. Could you please elaborate on the metrics you were trying to extract? It might be helpful if you share your |
Hi, yeah, I'm trying to get granular data for all ltla and utla areas, for all recorded data points (I'm part of the https://github.com/GoogleCloudPlatform/covid-19-open-data project). It seems that by default I only get a couple of days' data. My example below includes only one area for brevity.
|
Hi @themonk911 So, you can get the data for either area type. What you need is as follows (for Adur):

from uk_covid19 import Cov19API

adur_data = [
'areaType=ltla',
'areaName=Adur'
]
cases_and_deaths = {
"date": "date",
"areaName": "areaName",
"areaCode": "areaCode",
"cases": {
"new": "newCasesBySpecimenDate",
"total": "cumCasesBySpecimenDate",
},
"deaths": {
"new": "newDeathsByDeathDate",
"total": "cumDeathsByDeathDate"
}
}
api = Cov19API(filters=adur_data, structure=cases_and_deaths)
data = api.get_json()
print(data)

This would return 136 records. If you need all the data for the ltla area type:

all_ltla = [
'areaType=ltla'
]
cases_and_deaths = {
"date": "date",
"areaName": "areaName",
"areaCode": "areaCode",
"cases": {
"new": "newCasesBySpecimenDate",
"total": "cumCasesBySpecimenDate",
},
"deaths": {
"new": "newDeathsByDeathDate",
"total": "cumDeathsByDeathDate"
}
}
api = Cov19API(filters=all_ltla, structure=cases_and_deaths)
data = api.get_json()
print(data)

This one would return 50,000+ records (50+ pages), so it might take a while if the data isn't cached. Feel free to change the structure, e.g.:

{
"date": "date",
"areaName": "areaName",
"areaCode": "areaCode",
"newCasesBySpecimenDate": "newCasesBySpecimenDate",
"cumCasesBySpecimenDate": "cumCasesBySpecimenDate",
"newDeathsByDeathDate": "newDeathsByDeathDate",
"cumDeathsByDeathDate": "cumDeathsByDeathDate"
}

Hope this helps. Re your involvement with the Open Data project, keep an eye out for our R and JavaScript SDKs. You might find them useful too. They'll be released soon. |
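For anyone loading the nested cases/deaths structure into a table, here is a minimal sketch of flattening the records. It assumes the JSON response keeps its records under a "data" key and that each record mirrors the requested structure. The helper name `flatten`, the area code, and all sample values are made-up placeholders, not real figures, and nothing here calls the API.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], parent: str = "") -> Dict[str, Any]:
    """Collapse nested dicts into one level, joining keys with underscores."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}_{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical payload shaped like the nested structure in this thread.
# The area code and all numbers are placeholders, not real data.
sample_response = {
    "data": [
        {"date": "2020-08-02", "areaName": "Adur", "areaCode": "PLACEHOLDER",
         "cases": {"new": 1, "total": 10}, "deaths": {"new": 0, "total": 2}},
        {"date": "2020-08-01", "areaName": "Adur", "areaCode": "PLACEHOLDER",
         "cases": {"new": 3, "total": 9}, "deaths": {"new": None, "total": 2}},
    ]
}

# Flatten every record and order the rows oldest first.
rows = sorted((flatten(r) for r in sample_response["data"]),
              key=lambda row: row["date"])

for row in rows:
    print(row)
```

The flat rows can then be written out with `csv.DictWriter` or handed to a dataframe library. Requesting the flat structure shown above avoids this step altogether.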
Thanks, much appreciated :) |
It might be worth listing the metrics compatible with each areaType in the documentation. |
@themonk911 I would have if it were static. Even so, we have over 100 metrics for 619 area names in 7 area types, and they change regularly as we add new metrics or alter existing ones. We use some metrics to calculate others, and those would be useless to end users because some of them may not even be available for the constituents of a specific area type.

Just to give you an idea, we uploaded over 124k records to the database today (it increases every day), which consisted of over 58 million lines of data. This was produced in a massive pipeline with >20 sources and 100s of data bricks. We QA this whole dataset every single day before we release it.

Rule of thumb: we use the API to populate the website, so your best reference is the website. If a specific metric isn't displayed there, it's probably because we don't have data for that metric in that area type / name. Having said that, I'll try to draft something that covers the most important metrics as soon as I find a bit of time. |
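Until such a list exists, one rough way to see which requested metrics carry data for an area type is to scan the records you already fetched. This is only a sketch: it assumes unavailable metrics come back as null (None) or are absent from the record, which this thread does not confirm. The helper `empty_metrics`, the metric name `hypotheticalMetric`, and the sample values are invented for illustration.

```python
from typing import Any, Dict, Iterable, List


def empty_metrics(records: Iterable[Dict[str, Any]], metrics: List[str]) -> List[str]:
    """Return the metrics that are missing or None in every record.

    With no records at all, every metric is reported as empty.
    """
    materialised = list(records)
    return [
        metric for metric in metrics
        if all(record.get(metric) is None for record in materialised)
    ]


# Placeholder records; "hypotheticalMetric" is an invented name and the
# numbers are not real figures.
sample_records = [
    {"date": "2020-08-02", "newCasesBySpecimenDate": 1,
     "cumDeathsByDeathDate": None, "hypotheticalMetric": None},
    {"date": "2020-08-01", "newCasesBySpecimenDate": 3,
     "cumDeathsByDeathDate": 2, "hypotheticalMetric": None},
]

requested = ["newCasesBySpecimenDate", "cumDeathsByDeathDate", "hypotheticalMetric"]
print(empty_metrics(sample_records, requested))
```

A metric that is empty in one sample of records may still be populated for other areas or dates, so treat the result as a hint rather than a definitive list.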
Thanks very much for publishing this API! I couldn't work out from the docs what the best way to get all historical records for a given resource is. Could you provide some guidance?