Prep Builder vs Alteryx: Alteryx Weekly Challenge 196

Two of my goals for this year are to make this comparison blog a more regular feature and to complete more Alteryx weekly challenges. So after completing the first challenge of 2020, I figured I might as well bring the two together and try to complete the challenge in Prep Builder.

The Challenge

Your task is to create a workflow that determines the quarter where your company has the highest profit margin with its clients. Quarters aren’t defined in the traditional sense though. Quarters are defined by the length of the relationship with your company. 

For example, if customer A’s first contract started 4/1/2018, your company defines Q1 with this client as 4/1/2018, Q2 as 7/1/2018, Q5 as 4/1/2019, Q9 as 4/1/2020 etc’. 

A link to the challenge can be found here

There were two input files provided. One held information about the contracts of clients, this included a client ID, contract date and the order volume for that month. The other included costs. For each year there is a quarterly premium, a per-unit fee for customers and a per-unit production cost. See below. 

Costs
Contracts

The output required was a table which contained the costs, revenue and profit margin for each quarter. This was sorted to show the quarters with the highest profit margins. See below. 

Final Output

Alteryx

The first step I took was to parse out the client ID, contract date and order volume, as they were all contained within one field. This is easily achieved using the text to columns tool to split on the | delimiter, I followed this up with the data cleansing and select tools to remove unwanted whitespace and remove the additional field. 

The next stage was to generate the quarters for each client. First I used the Date Time tool to convert the contract dates from a string to a date-time value. I then sorted by client ID and contract date ascending. Finally, I used the multi-row formula tool to add an additional quarter number for each new contract date for each client. 

As you may be aware, there is no such equivalent of the multi-row formula tool in Tableau Prep so I knew this would require a slightly different technique, more on that later. To finish off this step I parsed out the year from the contract date using a simple formula tool. The data was now looking ready to calculate some profit margins from the costs data. 

I joined in the costs data on year and then calculated the cost and revenue for each contract date using the following formulas: 

Cost = [Order Volume] * [Per unit Production Cost]

Revenue = [Quarterly Premium] + ([Order Volume] * [Per Unit Fee for Customers])

The costs and revenue for each quarter were then summed up using the summarize tool and calculated the quarterly profit margins were calculated with the following formula: 

Profit Margin = ([Revenue] – [Costs]) / [Revenue]

Finally, I sorted the data by the profit margin to reveal the quarters with the highest profit margin. Again, a simple step like this proves rather troublesome in Prep Builder. Nothing too taxing from the Alteryx version, the full solution can be seen below. 

Alteryx Workflow

Tableau Prep Builder

I really enjoyed taking on this challenge in Tableau Prep as I knew it wasn’t going to be a like for like solution and I was going to have to think outside the box a little. 

The first step was no different from Alteryx, splitting out the three fields using custom split, a little bit of renaming and dropping fields and converting the contract date by simply changing the data type. I like how Prep Builder can automatically make this change without the user having to specify the structure of the date-time field as you do in Alteryx. Of course, this is a more simple example with the date field already prepared in a nice format, with more complicated data it’s not such an easy transition to make. 

With no equivalent multi-row formula or the ability to sort the actual data in Prep, I had to come up with a different method of determining the quarters for each client. To do this, I took the minimum contract date for each client and then joined this back onto the original data with the join conditions 

Client = Client 

Min Contract Date <= Contract Date

I then calculated the date difference in months between the minimum contract date and the actual contract date. This was followed by a calculation that reads: 

Quarter = ([Months from first contract] + 3)/3 

Now I had my quarters, I calculated my year field with the YEAR formula and I was ready to join in my costs. 

Prep logic for generating quarters

Although I feel that of the two techniques used, the multi-row formula method is the easier of the two and this can’t be achieved (YET…), in Prep. The specific method I have used actually can’t be achieved in Alteryx either, as you cannot set specific join conditions in the Alteryx join tool. As mentioned, I would’ve gone for the multi-row technique if this was at my disposal in Tableau Prep, however, I think it’s worth pointing out the power of the join conditions in Prep. I am also aware that with all the tools in Alteryx it’s usually not a feature that is widely missed. 

Anyway, onto the next stages of the challenge. The next stages were very similar to those taken in Alteryx. The costs data was joined in on the year field, and the cost and revenue for each contract was calculated. This was then aggregated to a quarter level and the profit margins were calculated. 

The final stage was another that was going to be more difficult in Tableau Prep. It’s possible to sort a specific field in the profile pane, however, you cannot sort the data source itself. Of course, you could output your data to Tableau Desktop and created a sorted table very quickly but I decided to add a rank to the data to try and replicate the Alteryx output as best I could. (Even when you’ve made a rank field you can’t sort the data, wait for 2020.1…). The steps I took can be seen in the image below. 

Generating a rank in Prep

The first step was to join the data to itself on the quarterly profit margin field, using the join condition: 

Quarterly profit margin <= quarterly profit margin

This will expand the data set but is the first step to creating a rank. The next step is to group by the original profit margin field, and count the profit margins which were less than or equal to that value. This provides you with a rank field, as the highest profit margin will have only joined to itself, the second will have joined to itself and the one above to give a count of two and so on. 

Finally, the ranked quarterly profit margins can be joined back to the original data. As I mentioned earlier, the data still won’t be sorted in order, but at least there is a new rank field to try and mimic the Alteryx output. Rank is currently not a feature in Tableau Prep however, it is coming very soon so keep your eyes peeled. 

I think this was a great challenge to highlight that whilst most data prep tasks can be completed in Prep, there are still areas for improvement. Updates and new features for Tableau Prep are coming thick and fast so I’m really excited to see where the product is by the end of this year. LODs and rank should be coming soon in 2020.1 and multi-row formula might not be on the too distant horizon… an exciting time ahead!

Full Tableau Prep Flow

Leave a Reply