Commit
* Add preprocessing script For hive partitioned dataset
* first version of database creation also update disease search terms cols
* new script
* Update main.py
* Update main.py
* add log output for skipping
* format preproc
* add postproc, more efficient storage
* several fixes in main script, begin postproc
* larger postproc chunk size
* Update postproc.py
* update readme, big updates for main and postproc
  - better input data
  - renamed database_flat
  - more complex query with optional proximity
  - renamed "municipality" to "location"
* Update .gitignore
* Create location_search_terms.xlsx
* switch to much faster iteration
  it is less memory-efficient
* add query amsterdam all diseases all years
  note: we need word boundaries in disease mention! tuberculosis == tering, and this is a common word ending ("spijsvertering", "godslastering")
* query for amsterdam, dordrecht, and groningen
* update gröningen regex character error
* add word boundaries to disease search terms
* move initialresults to archive
* move maps to archive in preparation of splitting out analyses to separate git repo
* move two more scripts to archive
* add more scripts to archive, update readme
* Correct path in api harvest for api key
* update readme for analysis split
* try turning off unicode support for (much) faster performance
* Update .gitignore
* fix postproc uncertainty non-coverage
* move to csv for search terms
* Update query_db.py
* add github release badge
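The note about word boundaries in the commit message can be illustrated with a minimal sketch. It assumes Python's standard `re` module; the commit does not say which regex engine the project's scripts use. The bounded pattern is the Tuberculosis entry from the search terms CSV added in this commit; the sample sentence is made up for illustration.

```python
import re

# Search term without word boundaries, as before the fix described in the commit.
unbounded = re.compile(r"tering")
# Tuberculosis pattern as it appears in the search terms CSV of this commit.
bounded = re.compile(r"\b(tering|verteringsziekte)\b")

# "spijsvertering" and "godslastering" are the false positives named in the commit note.
# The sentence containing "tering" as a standalone word is a hypothetical example.
samples = ["spijsvertering", "godslastering", "hij leed aan de tering"]

for sample in samples:
    # Print whether each pattern finds a match anywhere in the sample.
    print(sample, bool(unbounded.search(sample)), bool(bounded.search(sample)))
```

The unbounded pattern matches all three samples, while the bounded one matches only the sentence in which "tering" stands as its own word.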
1 parent 0409b08, commit 0937ede
Showing 101 changed files with 1,504 additions and 48 deletions.
File renamed without changes.
@@ -0,0 +1,11 @@
Label,Disease,Type ,Regex
Typhus,Typhoid fever; Paratyphoid fever,Food- and water-borne infectious diseases ,\b(ty(ph|f)(us|euz\w*)|febris\s?typhoidea|kwaadaardige\s?koorts)\b
Dysentery,Diarrhoea; Dysentery; Acute diseases of the digestive system,Food- and water-borne infectious diseases ,\b(diarrhoea|dysenter\w*|rood\s?loop|buik\s?loop|bloed\s?gang)\b
Cholera,Cholera (including: Asiatic cholera; Cholera nostras) ,Food- and water-borne infectious diseases ,\b(choler\w*|krim\s?koorts)\b
Smallpox,Smallpox,Airborne infectious diseases,\b(pokken|variola)\b
ScarletFever,Scarlet fever,Airborne infectious diseases,\b(rood\s?vonk|scarlatina|scharlaken\s?koorts)\b
Measles,Measles,Airborne infectious diseases,\b(mazelen|rood\s?ziekte|rubeola|rubella)\b
Tuberculosis,"Respiratory tuberculosis (incl: Tuberculosis of the lung and larynx, haemoptysis)",Airborne infectious diseases,\b(tering|verteringsziekte)\b
Diphteria,Croup; Diphtheria,Airborne infectious diseases,\b((c|k)roup|angina\s?diphtheri\w*|diphtheri\w*|difteritis)\b
Influenza,Acute respiratory disease (including influenza),Airborne infectious diseases,\b(griep|influenza)\b
Malaria,Malaria (including: intermittent fever; pernicious fever),Other infectious diseases (mixed aetiology),\b(malaria|moeras\s?koorts|polder\s?koorts)\b
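As a rough sketch of how a search terms file like this could be consumed, the snippet below reads three of the rows with Python's `csv` module and compiles each `Regex` column into a pattern. The function names `load_patterns` and `diseases_mentioned`, the case-insensitive flag, and the sample sentences are assumptions for illustration; the repository's own scripts may load and apply the terms differently.

```python
import csv
import io
import re

# Three rows copied from the CSV in this diff; the full file has ten disease rows.
CSV_TEXT = r'''Label,Disease,Type ,Regex
Smallpox,Smallpox,Airborne infectious diseases,\b(pokken|variola)\b
Tuberculosis,"Respiratory tuberculosis (incl: Tuberculosis of the lung and larynx, haemoptysis)",Airborne infectious diseases,\b(tering|verteringsziekte)\b
Influenza,Acute respiratory disease (including influenza),Airborne infectious diseases,\b(griep|influenza)\b
'''


def load_patterns(text):
    """Map each Label to a compiled regex (hypothetical helper, not from the repo)."""
    reader = csv.DictReader(io.StringIO(text))
    # re.IGNORECASE is an assumption; the source does not state how case is handled.
    return {row["Label"]: re.compile(row["Regex"], re.IGNORECASE) for row in reader}


def diseases_mentioned(text, patterns):
    """Return the labels whose pattern matches somewhere in the text."""
    return [label for label, pattern in patterns.items() if pattern.search(text)]


patterns = load_patterns(CSV_TEXT)
# Made-up sentences: one mentions two diseases, one contains only a false-positive candidate.
print(diseases_mentioned("De griep en de pokken heersen te Amsterdam.", patterns))
print(diseases_mentioned("Slechte spijsvertering.", patterns))
```

Because the Tuberculosis description contains a comma inside quotes, a CSV-aware reader is used here rather than splitting each line on commas.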
Binary file not shown.