##What the script does The script fetches Yelp's venues for big cities.
For each city, it uses a list of locations, e.g.:
- for Paris, the list of the GPS locations of the centers of each arrondissement and more (for Grand Paris)
For one city, all the venues are temporarily stored in the Venue table of the dedicated SQLite database and one venue is inserted only if it doesn't exist already. This way there is no duplicate for one city.
When all the requests are done for a city, the script writes the json file (YYYY-MM-DD.json), empties the Venue table and goes to the next city.
If for some reason, a request fails (e.g. Yelp stops answering) the script logs in a special file (yelp-back.json) the list of locations the script still needs to request. These locations are used in place of the full list of locations used by default. This is a convenient way to enable the script to restart from where it stopped.
This script could be run daily.
##Files ###Specific files
- YelpCapture.py
- YelpCaptor.py
- YelpDatabaseManager.py
###Common files
- Captor.py
- FileManager.py
- ExitLogger.py
- LoggerBuilder.py
###Additional Python modules
- requests
- requests_oauthlib
###Logs
- ./logs/yelp.log for complete logs
- ./logs/yelp-exit.json for time and exit status of the last execution
- ./logs/yelp-back.json for parameters to use when the script is relaunched after error
###Other files and directories
- ./yelp/: and the sub-folders for each city where the JSON files will be written
- ./yelp/locations.json
- ./yelp/yelp.db: the SQLite database
##How to use the script
-
Copy the files
-
Create the logs folder
-
Create the yelp folder and its sub-folders for each city
-
Cretae the yelp/locations.json file and write a JSON dictionary where the key is the name of the city and the value is a list of locations (GPS lat/long coordinates), e.g.:
{ {"Paris":["48.8618, 2.3477","48.8644, 2.3385","48.8605, 2.3380"]} }
-
Create the yelp/yelp.db SQLite database and the Venue table:
CREATE TABLE Venue ( id TEXT(50,0) NOT NULL PRIMARY KEY, json TEXT NOT NULL );
-
Run the script:
python YelpCapture.py
##Exit status and Errors ###Exit status
-
0 in case of success
-
1 in case of an InitError (problem with the ./yelp/locations.json file or with the ./yelp/ folder or sub-folders)
-
2 in case of a RequestException (e.g. network problem, HTTP error, timeout, too many redirections, etc.) different from any Foursquare API Error
-
3 in case of a Yelp API Error (see on Yelp for more information)
-
4 in case of a problem with the database (e.g. database / table / field not found)
-
5 in case of another type of Exception
###Errors InitError: File ./yelp/locations.json file is missing => You have forgotten to write the ./yelp/locations.json file
InitError: The ./yelp/locations.json file does not contain any correct JSON object
=> See {How to use the script} to verify that your ./yelp/locations.json file is correctly written and check that you use " instead of ' for your keys and values.
SQLite3 OperationalError: no such table: Venue
=> See {How to use the script} to create your yelp/yelp.db
##Good to know
###Limits 10000 calls/day => Yelp