1 parent b49f2af, commit f2c0d3a
Showing 8 changed files with 189 additions and 0 deletions.
15 changes: 15 additions & 0 deletions
_freeze/docs/geocode/forward-geocoding/execute-results/html.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
{ | ||
"hash": "d8ef2099ddcfaf861ce33943369845df", | ||
"result": { | ||
"engine": "knitr", | ||
"markdown": "---\ntitle: Forward Geocoding\n--- \n\n\nForward geocoding is the process of taking an address or place information and identifying its location on the globe. \n\nTo geocode addresses, the `{arcgisgeocode}` package provides the function `find_address_candidates()`. This function geocodes a single address at a time and returns up to 50 address candidates (ranked by a score). \n\nThere are two ways in which you can provide address information: \n\n1. Provide the entire address as a string via the `single_line` argument\n2. Provide parts of the address using the arguments `address`, `city`, `region`, `postal` etc. \n\n# Single line address geocoding \n\nIt can be tough to parse out addresses into their components. Using the `single_line` argument is a very flexible way of geocoding addresses. Doing utilizes the ArcGIS World Geocoder's address parsing capabilities. \n\nFor example, we can geocode the same location using 3 decreasingly specific addresses.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(arcgisgeocode)\n\naddresses <- c(\n \"380 New York Street Redlands, California, 92373, USA\",\n \"Esri Redlands\",\n \"ESRI CA\"\n)\n\nlocs <- find_address_candidates(\n addresses,\n max_locations = 1L\n)\n\nlocs$geometry\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nGeometry set for 3 features \nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -117.1948 ymin: 34.05726 xmax: -117.1948 ymax: 34.05726\nGeodetic CRS: WGS 84\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nPOINT (-117.1948 34.05726)\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nPOINT (-117.1957 34.05609)\nPOINT (-117.1957 34.05609)\n```\n\n\n:::\n:::\n\n\nIn each case, it finds the correct address! \n\n# Geocoding from a dataframe \n\nMost commonly, you will need to geocode addresses from a column in a data.frame. It is important to note that the `find_address_candidates()` function does not work well in a `dplyr::mutate()` function call. 
Particularly because it is possible to return more than 1 address at a time. \n\nLet's read in a csv of bike stores in Tacoma, WA. To use `find_address_candidates()` with a data.frame, it is recommended to create a unique identifier of the row positions. \n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(dplyr)\n\nfp <- \"https://www.arcgis.com/sharing/rest/content/items/9a9b91179ac44db1b689b42017471ae6/data\"\n\nbike_stores <- readr::read_csv(fp) |>\n mutate(id = row_number())\n\nbike_stores\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 10 × 3\n store_name original_address id\n <chr> <chr> <int>\n 1 Cascadia Wheel Co. 3320 N Proctor St, Tacoma, WA 984… 1\n 2 Puget Sound Bike and Ski Shop between 3206 N. 15th and 1414, N … 2\n 3 Takoma Bike & Ski 3010 6th Ave, Tacoma, WA 98406 3\n 4 Trek Bicycle Tacoma University Place 3550 Market Pl W Suite 102, Unive… 4\n 5 Opalescent Cyclery 814 6th Ave, Tacoma, WA 98405 5\n 6 Sound Bikes 108 W Main, Puyallup, WA 98371 6\n 7 Trek Bicycle Tacoma North End 3009 McCarver St, Tacoma, WA 98403 7\n 8 Second Cycle 1205 M.L.K. Jr Way, Tacoma, WA 98… 8\n 9 Penny bike co. 6419 24th St NE, Tacoma, WA 98422 9\n10 Spider's Bike, Ski & Tennis Lab 3608 Grandview St, Gig Harbor, WA… 10\n```\n\n\n:::\n:::\n\n\n\nTo geocode addresses from a data.frame, you can use `dplyr::reframe()`. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\nbike_stores |>\n reframe(\n find_address_candidates(original_address)\n )\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 13 × 62\n input_id result_id loc_name status score match_addr long_label short_label\n <int> <int> <chr> <chr> <dbl> <chr> <chr> <chr> \n 1 1 NA World M 100 3320 N Proct… 3320 N Pr… 3320 N Pro…\n 2 2 NA World M 97.6 N 15th St & … N 15th St… N 15th St …\n 3 2 NA World M 97.3 1414 N Alder… 1414 N Al… 1414 N Ald…\n 4 2 NA World M 94.7 S 15th St & … S 15th St… S 15th St …\n 5 2 NA World M 84.4 3206 N 15th … 3206 N 15… 3206 N 15t…\n 6 3 NA World M 100 3010 6th Ave… 3010 6th … 3010 6th A…\n 7 4 NA World M 100 3550 Market … 3550 Mark… 3550 Marke…\n 8 5 NA World M 100 814 6th Ave,… 814 6th A… 814 6th Ave\n 9 6 NA World M 100 108 W Main, … 108 W Mai… 108 W Main \n10 7 NA World M 100 3009 McCarve… 3009 McCa… 3009 McCar…\n11 8 NA World M 100 1205 Martin … 1205 Mart… 1205 Marti…\n12 9 NA World M 97.9 6419 24th St… 6419 24th… 6419 24th …\n13 10 NA World M 100 3608 Grandvi… 3608 Gran… 3608 Grand…\n# ℹ 54 more variables: addr_type <chr>, type_field <chr>, place_name <chr>,\n# place_addr <chr>, phone <chr>, url <chr>, rank <dbl>, add_bldg <chr>,\n# add_num <chr>, add_num_from <chr>, add_num_to <chr>, add_range <chr>,\n# side <chr>, st_pre_dir <chr>, st_pre_type <chr>, st_name <chr>,\n# st_type <chr>, st_dir <chr>, bldg_type <chr>, bldg_name <chr>,\n# level_type <chr>, level_name <chr>, unit_type <chr>, unit_name <chr>,\n# sub_addr <chr>, st_addr <chr>, block <chr>, sector <chr>, nbrhd <chr>, …\n```\n\n\n:::\n:::\n\n\nNotice how there are multiple results for each `input_id`. This is because the `max_locations` argument was not specified. 
To ensure only the best match is returned set `max_locations = 1`\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ngeocoded <- bike_stores |>\n reframe(\n find_address_candidates(original_address, max_locations = 1)\n ) |>\n # reframe drops the sf class, must be added\n sf::st_as_sf()\n\ngeocoded\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nSimple feature collection with 10 features and 61 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -122.5871 ymin: 47.19164 xmax: -122.294 ymax: 47.32301\nGeodetic CRS: WGS 84\n# A tibble: 10 × 62\n input_id result_id loc_name status score match_addr long_label short_label\n <int> <int> <chr> <chr> <dbl> <chr> <chr> <chr> \n 1 1 NA World M 100 3320 N Proct… 3320 N Pr… 3320 N Pro…\n 2 2 NA World M 97.6 N 15th St & … N 15th St… N 15th St …\n 3 3 NA World M 100 3010 6th Ave… 3010 6th … 3010 6th A…\n 4 4 NA World M 100 3550 Market … 3550 Mark… 3550 Marke…\n 5 5 NA World M 100 814 6th Ave,… 814 6th A… 814 6th Ave\n 6 6 NA World M 100 108 W Main, … 108 W Mai… 108 W Main \n 7 7 NA World M 100 3009 McCarve… 3009 McCa… 3009 McCar…\n 8 8 NA World M 100 1205 Martin … 1205 Mart… 1205 Marti…\n 9 9 NA World M 97.9 6419 24th St… 6419 24th… 6419 24th …\n10 10 NA World M 100 3608 Grandvi… 3608 Gran… 3608 Grand…\n# ℹ 54 more variables: addr_type <chr>, type_field <chr>, place_name <chr>,\n# place_addr <chr>, phone <chr>, url <chr>, rank <dbl>, add_bldg <chr>,\n# add_num <chr>, add_num_from <chr>, add_num_to <chr>, add_range <chr>,\n# side <chr>, st_pre_dir <chr>, st_pre_type <chr>, st_name <chr>,\n# st_type <chr>, st_dir <chr>, bldg_type <chr>, bldg_name <chr>,\n# level_type <chr>, level_name <chr>, unit_type <chr>, unit_name <chr>,\n# sub_addr <chr>, st_addr <chr>, block <chr>, sector <chr>, nbrhd <chr>, …\n```\n\n\n:::\n:::\n\n\nWith this result, you can now join the address fields back onto the `bike_stores` data.frame using a `left_join()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nleft_join(\n bike_stores,\n geocoded,\n by = 
c(\"id\" = \"input_id\")\n) |>\n # left_join keeps the class of the first table\n # must add sf class back on\n sf::st_as_sf()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nSimple feature collection with 10 features and 63 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -122.5871 ymin: 47.19164 xmax: -122.294 ymax: 47.32301\nGeodetic CRS: WGS 84\n# A tibble: 10 × 64\n store_name original_address id result_id loc_name status score match_addr\n <chr> <chr> <int> <int> <chr> <chr> <dbl> <chr> \n 1 Cascadia W… 3320 N Proctor … 1 NA World M 100 3320 N Pr…\n 2 Puget Soun… between 3206 N.… 2 NA World M 97.6 N 15th St…\n 3 Takoma Bik… 3010 6th Ave, T… 3 NA World M 100 3010 6th …\n 4 Trek Bicyc… 3550 Market Pl … 4 NA World M 100 3550 Mark…\n 5 Opalescent… 814 6th Ave, Ta… 5 NA World M 100 814 6th A…\n 6 Sound Bikes 108 W Main, Puy… 6 NA World M 100 108 W Mai…\n 7 Trek Bicyc… 3009 McCarver S… 7 NA World M 100 3009 McCa…\n 8 Second Cyc… 1205 M.L.K. Jr … 8 NA World M 100 1205 Mart…\n 9 Penny bike… 6419 24th St NE… 9 NA World M 97.9 6419 24th…\n10 Spider's B… 3608 Grandview … 10 NA World M 100 3608 Gran…\n# ℹ 56 more variables: long_label <chr>, short_label <chr>, addr_type <chr>,\n# type_field <chr>, place_name <chr>, place_addr <chr>, phone <chr>,\n# url <chr>, rank <dbl>, add_bldg <chr>, add_num <chr>, add_num_from <chr>,\n# add_num_to <chr>, add_range <chr>, side <chr>, st_pre_dir <chr>,\n# st_pre_type <chr>, st_name <chr>, st_type <chr>, st_dir <chr>,\n# bldg_type <chr>, bldg_name <chr>, level_type <chr>, level_name <chr>,\n# unit_type <chr>, unit_name <chr>, sub_addr <chr>, st_addr <chr>, …\n```\n\n\n:::\n:::", | ||
"supporting": [], | ||
"filters": [ | ||
"rmarkdown/pagebreak.lua" | ||
], | ||
"includes": {}, | ||
"engineDependencies": {}, | ||
"preserve": {}, | ||
"postProcess": true | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
{ | ||
"hash": "08f4c4ce2f08106e03a5a393817ad0bb", | ||
"result": { | ||
"engine": "knitr", | ||
"markdown": "---\ntitle: Overview\n---\n\n\nAddresses represent a physical place. They're meant to be interpreted by people and help guide navigation of the built environment. Addresses represent a geographical place but lack geographic data.\n\nThe package `{arcgisgeocode}` enables you to search for an address (geocode), reverse geocode, find candidate matches, get suggestions, and batch geocode. Geocoding is the process of converting text to an address and a location.\n\n- **Address geocoding**, also known as forward geocoding, is the process of converting text for an address to a complete address with a location.\n- **Place geocoding** is the process of searching for addresses for businesses, administrative locations, and geographic features.\n- **Reverse geocoding** is the process of converting a point to an address or place.\n- **Batch geocoding**, also known as bulk geocoding, is the process of converting a list of addresses or place names to a set of complete addresses with locations.\n\n# Licensing considerations\n\nMany features of the ArcGIS World Geocoder are provided for free such as forward geocoding, reverse geocoding, and place search. However, **storing results is not free**. Additionally, the bulk geocoding functionality requires a developer account or available credits. \n\nIn order to store results, each function has an argument `for_storage` which should be set to `TRUE` if you intend to store the results. \n\nTo learn more about free and paid geocoding operations refer to the [storage parameter documentation](https://developers.arcgis.com/documentation/mapping-apis-and-services/geocoding/services/geocoding-service/#storage-parameter).\n\n| Function | Description | Free |\n| -------- | ----------- | ---- |\n| `find_address_candidates()` | Finds up to 50 location candidates based on a provided address. _This function is vectorized_ to work with many addresses at a time. 
| ✅ |\n| `reverse_geocode()` | Returns an address based on the provided coordinate. _This function is vectorized_ to work with many locations at a time. | ✅ |\n| `suggest_places()` | Returns possible POI information based on a location and a search phrase. This function is not vectorized. | ✅ |\n| `geocoded_addresses()` | Bulk geocodes addresses returning a single location per address. Use this for highly performant and scalable address geocoding. | ❌ |\n\n\n# Get started\n\nTo start geocoding with the R-ArcGIS Bridge, install the R package from CRAN. \n\n\n\n\n```r\n# install from CRAN\ninstall.packages(\"arcgisgeocode\")\n```\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Load the library\nlibrary(arcgisgeocode)\n```\n:::\n\n\n## Geocode an address\n\nPerform single address geocoding using the `find_address_candidates()` function. Limit the number of results using the `max_locations` argument. \n\n\n::: {.cell}\n\n```{.r .cell-code}\nloc <- find_address_candidates(\n \"501 Edgewood Ave SE, Atlanta, GA 30312\", max_locations = 1\n)\n\nloc[, 1:8]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nSimple feature collection with 1 feature and 8 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -84.37108 ymin: 33.75396 xmax: -84.37108 ymax: 33.75396\nGeodetic CRS: WGS 84\n input_id result_id loc_name status score\n1 1 NA World M 100\n match_addr\n1 501 Edgewood Ave SE, Atlanta, Georgia, 30312\n long_label short_label\n1 501 Edgewood Ave SE, Atlanta, GA, 30312, USA 501 Edgewood Ave SE\n geometry\n1 POINT (-84.37108 33.75396)\n```\n\n\n:::\n:::\n\n\n## Reverse geocode \n\nFrom a location, find its corresponding address using `reverse_geocode()`. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\nreverse_geocode(c(-84.371, 33.753))\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\nRegistered S3 method overwritten by 'jsonify':\n method from \n print.json jsonlite\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\nSimple feature collection with 1 feature and 22 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -84.37103 ymin: 33.75322 xmax: -84.37103 ymax: 33.75322\nGeodetic CRS: WGS 84\n match_addr\n1 39 Daniel St SE, Atlanta, Georgia, 30312\n long_label short_label addr_type\n1 39 Daniel St SE, Atlanta, GA, 30312, USA 39 Daniel St SE PointAddress\n type_field place_name add_num address block sector neighborhood\n1 39 39 Daniel St SE \n district city metro_area subregion region region_abbr territory\n1 Atlanta Fulton County Georgia GA \n postal postal_ext country_name country_code geometry\n1 30312 1907 United States USA POINT (-84.37103 33.75322)\n```\n\n\n:::\n:::\n", | ||
"supporting": [], | ||
"filters": [ | ||
"rmarkdown/pagebreak.lua" | ||
], | ||
"includes": {}, | ||
"engineDependencies": {}, | ||
"preserve": {}, | ||
"postProcess": true | ||
} | ||
} |
Empty file.
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
--- | ||
title: Forward Geocoding | ||
--- | ||
|
||
Forward geocoding is the process of taking an address or place information and identifying its location on the globe. | ||
|
||
To geocode addresses, the `{arcgisgeocode}` package provides the function `find_address_candidates()`. It accepts one or more addresses, geocodes them one at a time, and returns up to 50 candidates per address, ranked by score. | ||
|
||
There are two ways in which you can provide address information: | ||
|
||
1. Provide the entire address as a string via the `single_line` argument. | ||
2. Provide parts of the address using arguments such as `address`, `city`, `region`, and `postal`. | ||
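
As a minimal sketch of the second approach, the Esri headquarters address can be passed as separate components. The argument names are the ones listed above; how the address is split into parts here is an assumption made for illustration, and the call needs network access to the geocoding service.

```r
library(arcgisgeocode)

# Sketch only: the component split below is an illustrative assumption
loc <- find_address_candidates(
  address = "380 New York Street",
  city = "Redlands",
  region = "California",
  postal = "92373",
  max_locations = 1L
)

# Inspect the geometry of the top candidate
loc$geometry
```

Supplying components can help when your data already stores street, city, and postal code in separate columns, since nothing has to be pasted together first.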
|
||
# Single line address geocoding | ||
|
||
It can be tough to parse addresses into their components. The `single_line` argument is a flexible way of geocoding addresses because it relies on the ArcGIS World Geocoder's address parsing capabilities. | ||
|
||
For example, we can geocode the same location using three progressively less specific addresses. | ||
|
||
```{r} | ||
library(arcgisgeocode) | ||
addresses <- c( | ||
  "380 New York Street Redlands, California, 92373, USA", | ||
  "Esri Redlands", | ||
  "ESRI CA" | ||
) | ||
locs <- find_address_candidates( | ||
  addresses, | ||
  max_locations = 1L | ||
) | ||
locs$geometry | ||
``` | ||
|
||
In each case, it finds the correct address! | ||
|
||
# Geocoding from a dataframe | ||
|
||
Most commonly, you will need to geocode addresses from a column in a data.frame. Note that `find_address_candidates()` does not work well inside a `dplyr::mutate()` call, because it can return more than one candidate per input address. | ||
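
The row-count mismatch can be reproduced offline. In the sketch below, `fake_candidates()` is a hypothetical stand-in that is not part of `{arcgisgeocode}`; it only imitates a geocoder that returns two candidate rows for one of the inputs. `dplyr::reframe()` (available in dplyr 1.1.0 and later) is shown as the alternative because it allows any number of output rows.

```r
library(dplyr)

# Hypothetical stand-in for a geocoder: returns two candidate rows for the
# second input and one row for every other input
fake_candidates <- function(x) {
  n_rows <- ifelse(seq_along(x) == 2, 2L, 1L)
  data.frame(
    input_id = rep(seq_along(x), times = n_rows),
    match_addr = rep(toupper(x), times = n_rows)
  )
}

stores <- tibble(addr = c("a st", "b st", "c st"))

# mutate() expects one result row per input row, so 4 rows for 3 inputs errors
mutate_result <- try(
  mutate(stores, res = fake_candidates(addr)),
  silent = TRUE
)
inherits(mutate_result, "try-error")

# reframe() accepts a result with any number of rows
reframe_result <- reframe(stores, fake_candidates(addr))
nrow(reframe_result)
```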
|
||
Let's read in a CSV of bike stores in Tacoma, WA. To use `find_address_candidates()` with a data.frame, it is recommended to create a unique identifier from the row positions. | ||
|
||
```{r message = FALSE} | ||
library(dplyr) | ||
fp <- "https://www.arcgis.com/sharing/rest/content/items/9a9b91179ac44db1b689b42017471ae6/data" | ||
bike_stores <- readr::read_csv(fp) |> | ||
  mutate(id = row_number()) | ||
bike_stores | ||
``` | ||
|
||
|
||
To geocode addresses from a data.frame, you can use `dplyr::reframe()`. | ||
|
||
```{r} | ||
bike_stores |> | ||
  reframe( | ||
    find_address_candidates(original_address) | ||
  ) | ||
``` | ||
|
||
Notice that a single `input_id` can have multiple results. This is because the `max_locations` argument was not specified. To ensure only the best match is returned, set `max_locations = 1`. | ||
|
||
|
||
```{r} | ||
geocoded <- bike_stores |> | ||
  reframe( | ||
    find_address_candidates(original_address, max_locations = 1) | ||
  ) |> | ||
  # reframe drops the sf class, so it must be added back | ||
  sf::st_as_sf() | ||
geocoded | ||
``` | ||
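
If several candidates per address have already been retrieved, one alternative is to keep the top-scoring row per `input_id` after the fact. This is only a sketch on a hypothetical candidate table with made-up scores; it runs offline and does not call the geocoding service.

```r
library(dplyr)

# Hypothetical candidate table: input 2 has four candidates
candidates <- tibble(
  input_id = c(1L, 2L, 2L, 2L, 2L, 3L),
  score = c(100, 97.6, 97.3, 94.7, 84.4, 100)
)

# Keep the single highest-scoring candidate for each input row
best <- candidates |>
  group_by(input_id) |>
  slice_max(score, n = 1, with_ties = FALSE) |>
  ungroup()

best
```

Setting `max_locations = 1` in the request is the simpler route; filtering afterwards is mainly useful when you want to inspect the alternatives before discarding them.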
|
||
With this result, you can now join the address fields back onto the `bike_stores` data.frame using a `left_join()`. | ||
|
||
```{r} | ||
left_join( | ||
  bike_stores, | ||
  geocoded, | ||
  by = c("id" = "input_id") | ||
) |> | ||
  # left_join keeps the class of the first table | ||
  # must add sf class back on | ||
  sf::st_as_sf() | ||
``` |
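
The join assumes exactly one candidate per `id`. As a hedged sketch, the `relationship` argument of `left_join()` (dplyr 1.1.0 and later) can make that assumption explicit, so that a leftover duplicate candidate raises an error instead of silently adding rows. The tables below are hypothetical and the example runs offline.

```r
library(dplyr)

# Hypothetical inputs: three stores and one candidate per store
stores <- tibble(id = 1:3, store_name = c("A", "B", "C"))
candidates <- tibble(input_id = 1:3, score = c(100, 97.6, 100))

# Succeeds because every id matches exactly one candidate
joined <- left_join(
  stores,
  candidates,
  by = c("id" = "input_id"),
  relationship = "one-to-one"
)

# A second candidate for input 2 violates the one-to-one expectation
dup_candidates <- bind_rows(candidates, tibble(input_id = 2L, score = 94.7))
dup_result <- try(
  left_join(
    stores,
    dup_candidates,
    by = c("id" = "input_id"),
    relationship = "one-to-one"
  ),
  silent = TRUE
)
inherits(dup_result, "try-error")
```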