I’ve decided to start a series of short blog posts documenting my experience of using Tableau Prep Builder and Alteryx Designer to complete numerous data prep challenges. I feel that blogging about the experiences will be a useful resource when comparing the two tools as, while being very powerful in their own right, they both have their strengths and weaknesses when directly compared. With Prep Builder still being relatively in its infancy and new releases coming thick and fast, I think that this will be a good process to show where features could be developed and see how the tool improves over time.
The data prep tasks will largely come from Preppin’ Data (link) and Alteryx Weekly Challenges (link). These sets of challenges are great resources for practising and developing your skills with either of the tools and something that I would highly recommend. For this series of blogs, I’m going to kick off with Preppin’ Data Week 32 (link).
The input data for the task can be seen above. The objectives were:
- To split the address field into property number, town, postcode and address.
- Put all the products and sales into one column
- Clean the product fields.
A couple of simple tasks but some challenges which are very common when cleaning up data. After completing those steps we should end up with the following output.
The full workflow can be seen below but these are the steps I took:
- In the initial cleaning step, I used regex to extract the property number, town, postcode and country. I also used regex to replace the ‘-’ in the two product fields. This was a little bit regex overkill as there were definitely simpler ways to go about this (see below).
- I then split the data into two streams, one containing product 1 and sales field, the other containing product 2 and sales 1. The fields were renamed product and sales in both streams.
- Finally, the two streams were unioned back together and we have our final output, simple as that!
In prep builder, you can pack a lot into each step which I personally quite like, as it’s very clear to keep track of what you have done in the ‘changes’ panel. Some people may prefer splitting each task out into individual steps but I find that’s down to personal preference.
When completing the challenge in Alteryx, I decided to go down a slightly different path and not use regex. This actually was a lot more straightforward and maybe something I should have considered when I completed the task the first time around.
- First I used the replace function to get rid of the ‘-’ in the products. As previously mentioned, using regex to do this was probably slightly overkill.
- Next, the text to column tool was used to split the address field into four. I used the data cleansing tool to get rid of the letters in the property number field and the trim function to remove white space from the rest of the address fields.
- Finally, like the prep workflow, the data was split into two streams, renamed and then unioned back together.
Although slightly different approaches were taken, in my opinion, Prep Builder slightly edges this challenge due to fewer steps required to complete the task. I could have also gone for the regex approach in Alteryx, for which I would have only required one formula tool, this way there would have been fewer stages. However, had I gone for the split method in Prep, I wouldn’t have been required to remove the white space as the built-in split function is very clean! The process was also made easier in Prep with the ability to drop and rename fields at any point in the workflow.
A fairly simple challenge to kick things off, a more rigorous test is in store next time!