Geocoding basic address lines using the Open Route Service API

This post was originally written during my training period at The Data School explaining a technique I used on a client project.

For a recent client project I was trying to visualise salary data on a map as a means of exploring salaries in different locations. The problem I had was that I only had a city/area name for each of my data sets, as can be seen below. Of course Tableau can pick up these names and plot them as cities but I was finding that some of them were not accurate. Having no geographic coordinates also meant that I had no means of matching these locations to a shapefile to make a slightly more interesting map. I wanted to try and create something that was slightly better than some incorrectly positioned circles. 

Initial data structure

After some asking around at The Data School, I was introduced to the Open Route Service API. This allows you to input various lines of address data, such as street address, city, county etc and the API will provide you with some geographic coordinates for address you enter. Another great feature of the API is that you can reverse geocode, so you can input a bunch of latitude and longitudes and the API will return the closest matched address to your points. I’m sure these are features that many people would find useful. 

So how did I go about preparing this data using Alteryx. The Open Route Service is free to sign up to. Once you have signed up to the service, you can generate yourself an API key and start calling the API. To geocode your data, you need to use the ‘forward geocode service’ with the URL endpoint…&text=…&size=1&

Overview of the Alteryx workflow

There is extensive documentation on the service on the different parameters that can be used to structure the URL, but in this instance I’m populating the URL with my city followed by United Kingdom (e.g. Bristol, United Kingdom). I have used the formula tool to structure my URL using the different fields, this can be seen below. The API will make several matches to locations if they are quite vague like mine, so using size=1 returns just the first record for each city. ensures the search is only made in the UK. 

Using the formula tool to populate the URL

The next stage of the workflow makes the call to the API through the download tool. In this instance I had to use the ‘wait a second’ macro, which is part of the 2016 CREW macro pack and can be downloaded here. This adds in a delay of your choice for each row to be passed onto the next tool. I found that I was calling the API too much without this in place, so not all of my rows were being geocoded. With a smaller dataset you may find that it’s okay, however, mine only contains 130 rows and was still getting blocked. On the free plan of Open Route Service you can make 40 calls a minute, from this you can start to work out how many seconds delay you’d need on your workflow. 

Once the data is downloaded, I have followed a fairly standard approach to parsing out the relevant data. I’ve used the JSON parse tool so start to bring some structure to my data, and the text to columns tool to split the JSON headers into separate fields. You can see below that in this data I now have some geographic coordinates (circled in blue). 

The API call giving some longitude and latitudes (in blue)

The next stages of the workflow involve assigning a label to each location of my original dataset using the multi-row formula tool, which can be seen below. I’ve then split my data into two streams, those that contain the geographic coordinates and those that have the more detailed address information. I’ve used the cross-tab tool to pivot the data into its correct structure, grouping by the label I had previously assigned to each address point. Now we can start to see some data that will be useful to use.

Multi-row formula to assign labels to each location
Data ready to use

The final stage involved joining the data back to my original file, to bring through information regarding salaries. Last but not least I used the create points tool to generate some centroids for my latitude and longitudes to check the API had placed generate the points in the right location. 

From here, I went on to join my data to a hex map of the UK and came up with the visualisation below. You can explore it on my Tableau Public here. I found this the Open Route Service to be very helpful and will definitely be using it again in the future. Hopefully you manage to find this useful in your own work as well! 

Leave a Reply