How to create a chart from data in wide format in Tableau - formatting

Background and objective
I have data that is organised as follows:
|---------------------|------------------|---------------------|------------------|
| Place | Time_Car | Time_Jet | Time_Heli |
|---------------------|------------------|---------------------|------------------|
| LocX | 34 | 12 | 14 |
|---------------------|------------------|---------------------|------------------|
| LocY | 24 | 8 | 10 |
|---------------------|------------------|---------------------|------------------|
In Tableau, I want to be able to plot the average of the maximum times for each Destination column, for each type of transport.
So,
For Time_heli, the maximum time to LocX is 8 (out of 7 and 8)
The maximum time to LocY is 5
The average of those two is 6.5
This produces something like this after repeating for Time_Car and Time_Jet:
I am using Tableau version 10.5.4 on Windows 8.
Approach
To do this, I get the maximum time for each Place by creating a new field for each type of transport:
{FIXED Place,Option: MAX([Time_Car])}
{FIXED Place,Option: MAX([Time_Jet])}
{FIXED Place,Option: MAX([Time_Heli])}
However, this doesn't let me plot the data in a single chart in Tableau.
How can I achieve the chart above?
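For illustration only, the calculation described above can be sketched outside Tableau in plain Python. The sketch first unpivots the wide table into one record per (Place, transport) pair, which is the shape that lets all transport types share one axis, and then averages the per-Place maxima. The function names `unpivot` and `average_of_maxima` are made up for this example, and the rows are the ones shown in the table (with one row per Place, the maximum is simply that row's value).

```python
from collections import defaultdict
from statistics import mean

# Rows copied from the table in the question (wide format: one column per transport).
wide_rows = [
    {"Place": "LocX", "Time_Car": 34, "Time_Jet": 12, "Time_Heli": 14},
    {"Place": "LocY", "Time_Car": 24, "Time_Jet": 8, "Time_Heli": 10},
]


def unpivot(rows, id_col="Place"):
    """Turn each wide row into one (place, transport, time) record per Time_* column."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col != id_col:
                long_rows.append((row[id_col], col, value))
    return long_rows


def average_of_maxima(long_rows):
    """For each transport, take the max time per place, then average those maxima."""
    maxima = defaultdict(dict)  # transport -> {place: max time seen so far}
    for place, transport, time in long_rows:
        current = maxima[transport].get(place)
        maxima[transport][place] = time if current is None else max(current, time)
    return {t: mean(by_place.values()) for t, by_place in maxima.items()}


long_rows = unpivot(wide_rows)
result = average_of_maxima(long_rows)
print(result)
```

Once the data is in the long shape, the transport type becomes a single dimension and the time a single measure, so one chart can show all three.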

Related

Comparing every row in table with the master row

I have a Redshift table with single VARCHAR column named "Test" and several float columns. The "Test" column has unique values, one of them is "Control", others are not hardcoded.
The table has ~10 rows (not static) and ~10 columns.
I need to generate a Looker report that shows the original data and the difference between the corresponding float columns in "Control" and the other tests.
Input Example:
Test | Metric_1 | Metric_2
----------------------------
Control| 10 | 100
A | 12 | 120
B | 8 | 80
The desirable report:
| Control | A | A-Control | B | B-Control
|---------|----|-----------|---|-----------
Metric_1 | 10 | 12 | 2 | 8 | -2
Metric_2 | 100 | 120| 20 | 80| -20
To calculate the difference between each row and "Control", I tried:
SELECT T.test,
T.metric_1 - Control.metric_1 AS DIFF1,
T.metric_2 - Control.metric_2 AS DIFF2,
...
FROM T, (SELECT * FROM T WHERE test='Control') AS Control
I can do part of the work in Looker (it can transpose) and part in SQL, but I still cannot figure out how to build this report.
You could pivot on the test dimension, which builds part of the report:
| Control | A | B |
|---------|----|---|
Metric_1 | 10 | 12 | 8 |
Metric_2 | 100 | 120| 80|
Then operate on top of these results using table calculations.
You can use the functions pivot_where() or pivot_index().
For example, pivot_where(test = 'A', metric) - pivot_where(test = 'Control', metric)
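To make the target layout concrete, here is a minimal plain-Python sketch of the same idea (it is not Looker code): pivot the tests into columns, then subtract the Control value from each test. The helper name `build_report` is hypothetical, and the data is the input example from the question.

```python
# Input rows from the example table; "Control" is the baseline row.
rows = {
    "Control": {"Metric_1": 10, "Metric_2": 100},
    "A": {"Metric_1": 12, "Metric_2": 120},
    "B": {"Metric_1": 8, "Metric_2": 80},
}


def build_report(rows, baseline="Control"):
    """Pivot tests into columns and add a '<test>-Control' difference column."""
    control = rows[baseline]
    report = {}
    for metric, control_value in control.items():
        line = {baseline: control_value}
        for test, metrics in rows.items():
            if test == baseline:
                continue
            line[test] = metrics[metric]
            line[f"{test}-{baseline}"] = metrics[metric] - control_value
        report[metric] = line
    return report


report = build_report(rows)
for metric, line in report.items():
    print(metric, line)
```

Each difference column plays the role of one `pivot_where(...) - pivot_where(test = 'Control', ...)` table calculation.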

SQL-design issue - Ordering worker for work teams on a weekly basis

I'm making a web solution using ASP.net MVC6 and Azure SQL-db.
My goal is to build a system for ordering work teams on a weekly basis, and it must be possible to display the work order six weeks ahead from today's date. Each work order is connected to a project. A manager should be able to choose a project and then order different kinds of workers (disciplines), entering the manpower needed for each discipline for the next six weeks. A discipline can be carpenter, painter, bricklayer, etc.
Each project can have any number of disciplines assigned, so it's not possible to hard-code them into the table structure. You can't hard-code the weeks either, as week 2 in 2016 is different from week 2 in 2017.
A workorder can look like this:
Project A
Discipline | Week 1 | Week 2 | Week 3 | Week 4 | Week 5 | Week 6
Carpenter | 4 | 3 | 0 | 0 | 3 | 0
Painter | 0 | 0 | 2 | 3 | 3 | 3
Next project can look like this:
Project B
Discipline | Week 44 | Week 45 | Week 46 | Week 47 | Week 48 | Week 49
Carpenter | 4 | 3 | 0 | 0 | 3 | 0
Painter | 0 | 0 | 2 | 3 | 3 | 3
Bricklayer| 1 | 2 | 1 | 5 | 3 | 0
Carpentry | 4 | 3 | 0 | 0 | 3 | 0
As you can see, the week numbers and the number of disciplines may vary from project to project. I can't seem to wrap my head around how to design the SQL tables to store these values efficiently.
Can anyone review this issue and point me in the right direction? Thanks.
EDIT:
The problem is really not how to store the data but how to query it. You never know for how many weeks each discipline has registered data, and you don't know how many disciplines are registered on each project. In addition, for week 2 you may have registered the manpower needs for carpenters but not for painters. I could make a query for each discipline, but I would prefer one query that returns the complete grid.
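One possible way to picture the "single query, complete grid" requirement is sketched below, using Python's built-in sqlite3 module rather than Azure SQL. Everything here is an assumption made for illustration: the table name `work_order_line`, its columns, the year 2016 for Project A, and the helper `load_grid`. The idea is one row per (project, discipline, year, week), a single range query, and zero-filling of missing weeks in application code. For simplicity the window is assumed not to cross a year boundary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical long-format table: one row per project, discipline and week.
conn.execute(
    """
    CREATE TABLE work_order_line (
        project    TEXT    NOT NULL,
        discipline TEXT    NOT NULL,
        year       INTEGER NOT NULL,
        week       INTEGER NOT NULL,
        headcount  INTEGER NOT NULL,
        PRIMARY KEY (project, discipline, year, week)
    )
    """
)
# Only the non-zero cells of "Project A" are stored; the year is an assumption.
conn.executemany(
    "INSERT INTO work_order_line VALUES (?, ?, ?, ?, ?)",
    [
        ("Project A", "Carpenter", 2016, 1, 4),
        ("Project A", "Carpenter", 2016, 2, 3),
        ("Project A", "Carpenter", 2016, 5, 3),
        ("Project A", "Painter", 2016, 3, 2),
        ("Project A", "Painter", 2016, 4, 3),
        ("Project A", "Painter", 2016, 5, 3),
        ("Project A", "Painter", 2016, 6, 3),
    ],
)


def load_grid(conn, project, year, first_week, weeks=6):
    """Run one query for the whole window; weeks without a row default to 0."""
    last_week = first_week + weeks - 1
    cursor = conn.execute(
        "SELECT discipline, week, headcount FROM work_order_line "
        "WHERE project = ? AND year = ? AND week BETWEEN ? AND ? "
        "ORDER BY discipline, week",
        (project, year, first_week, last_week),
    )
    grid = {}
    for discipline, week, headcount in cursor:
        row = grid.setdefault(discipline, [0] * weeks)
        row[week - first_week] = headcount
    return grid


grid = load_grid(conn, "Project A", 2016, first_week=1)
for discipline, row in grid.items():
    print(discipline, row)
```

A discipline with no rows at all in the window does not appear in this grid; listing those as well would need a separate table of the disciplines assigned to each project.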

How to transpose cell data by section in Open Refine?

I have a data table that looks like this:
Name | Date-Freq | Date-Amount | Date-Freq | Date-Amount
A | 4 | 3000 | 8 | 9000
B | 5 | 4000 | 9 | 7000
C | 6 | 5000 | 10 | 8000
and I want it to look like this:
Name | Date | Freq | Amount
A | July 2014 | 4 | 3000
A | Aug 2014 | 8 | 9000
B | July 2014 | 5 | 4000
B | Aug 2014 | 9 | 7000
C | July 2014 | 6 | 5000
C | Aug 2014 | 10 | 8000
What is the best way to do something like this? Should I just create two new columns?
What you want to accomplish will require a lot of steps (faceting, adding new columns, etc.), but yes, you can accomplish it with OpenRefine.
You will need to use a combination of:
1. Always work in Records mode (not Rows mode) for any kind of merging work.
2. Transpose cells across columns into rows (into one MERGE column with prepend).
3. Move your Name column to the beginning.
4. Fill down on your Name column (because it is now blank in some cells after the transpose; you might need to fill down again later, after any further transpose or merge).
5. Use a custom text facet with value.startsWith("Amount") etc.
6. Use "Add new column based on" to create new columns from the MERGE column.
7. Move columns as necessary to do Step 2 again.
8. Repeat steps as necessary.
Here is an example OpenRefine project showing the beginning of what happens after the first set of the above steps (use Undo/Redo to step through it; it doesn't show the facets, though):
OpenRefine Project with Transpose across cells into one column
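To make the target reshaping concrete, independent of the OpenRefine steps, here is a rough plain-Python sketch of the same wide-to-long transformation. It assumes the real column headers carry the date before a hyphen (for example "July 2014-Freq"); the question only shows "Date-Freq" placeholders, so those header names and the helper `wide_to_long` are assumptions.

```python
# Hypothetical headers: the question shows "Date-Freq"/"Date-Amount" placeholders,
# so the month labels used here are an assumption.
header = ["Name", "July 2014-Freq", "July 2014-Amount", "Aug 2014-Freq", "Aug 2014-Amount"]
table = [
    ["A", 4, 3000, 8, 9000],
    ["B", 5, 4000, 9, 7000],
    ["C", 6, 5000, 10, 8000],
]


def wide_to_long(header, table):
    """Emit one (Name, Date, Freq, Amount) row per date found in the column headers."""
    long_rows = []
    for record in table:
        name = record[0]
        by_date = {}  # date -> {"Freq": ..., "Amount": ...}, in first-seen order
        for col, value in zip(header[1:], record[1:]):
            date, field = col.rsplit("-", 1)
            by_date.setdefault(date, {})[field] = value
        for date, fields in by_date.items():
            long_rows.append((name, date, fields["Freq"], fields["Amount"]))
    return long_rows


long_rows = wide_to_long(header, table)
for row in long_rows:
    print(row)
```

Splitting on the last hyphen keeps the field name (Freq or Amount) separate from the date, which is roughly what the text facet and "Add new column based on" steps do by hand.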

Find a subset of numbers that equals to the target weighted average and target sum

There is a SQL Server table containing 1 million rows. Sample data is shown below.
The Percentage column is computed as (Y/X) * 100.
+----+--------+-------------+-----+-----+-------------+
| ID | Amount | Percentage | X | Y | Z |
+----+--------+-------------+-----+-----+-------------+
| 1 | 10 | 9.5 | 100 | 9.5 | 95 |
| 2 | 20 | 9.5 | 100 | 9.5 | 190 |
| 3 | 40 | 5 | 100 | 5 | 200 |
| 4 | 50 | 5.555555556 | 90 | 5 | 277.7777778 |
| 5 | 70 | 8.571428571 | 70 | 6 | 600 |
| 6 | 100 | 9.230769231 | 65 | 6 | 923.0769231 |
| 7 | 120 | 7.058823529 | 85 | 6 | 847.0588235 |
| 8 | 60 | 10.52631579 | 95 | 10 | 631.5789474 |
| 9 | 80 | 10 | 100 | 10 | 800 |
| 10 | 95 | 10 | 100 | 10 | 950 |
+----+--------+-------------+-----+-----+-------------+
Now I need to find rows whose Amount values add up to a given Amount and whose weighted average matches a given Percentage.
For example, if the target Amount = 365 and the target Percentage = 9.84, then from the given dataset the rows with ID = 1, 2, 6, 8, 9, 10 form a subset that matches the given targets.
Amount = 10+20+100+60+80+95
= 365
Percentage = Sum of (product of Amount and Percentage)/Sum of (Amount)
(I am using Z column to store the products of Amount and Percentage to make the calculations easier)
= ((10*9.5)+(20*9.5)+(100*9.23077)+(60*10.5264)+(80*10)+(95*10))/ (10+20+100+60+80+95)
= 9.834673618
So rows 1, 2, 6, 8, 9, 10 match the given target sum and target weighted average.
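The arithmetic above can be checked with a short script. This is only a verification sketch: the rows are copied from the sample table, Percentage is recomputed as Y / X * 100 to avoid rounding, and the helper name `subset_stats` is made up for the example.

```python
# ID -> (Amount, X, Y), copied from the sample table.
rows = {
    1: (10, 100, 9.5),
    2: (20, 100, 9.5),
    3: (40, 100, 5),
    4: (50, 90, 5),
    5: (70, 70, 6),
    6: (100, 65, 6),
    7: (120, 85, 6),
    8: (60, 95, 10),
    9: (80, 100, 10),
    10: (95, 100, 10),
}


def subset_stats(rows, ids):
    """Return (sum of Amount, Amount-weighted average Percentage) for the given IDs."""
    total_amount = sum(rows[i][0] for i in ids)
    # Z = Amount * Percentage, with Percentage = Y / X * 100
    total_z = sum(rows[i][0] * (rows[i][2] / rows[i][1] * 100) for i in ids)
    return total_amount, total_z / total_amount


total, weighted_avg = subset_stats(rows, [1, 2, 6, 8, 9, 10])
print(total, weighted_avg)
```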
The proposed algorithm should work on the 1 million rows. The main objective is to match the weighted average (Percentage) while keeping the Amount as close as possible to the target Amount.
I found a few questions on Stack Overflow about matching a target sum, but my problem is to match two target attributes: the sum and the weighted average.
Which algorithm can be used to achieve this?
Since the target "Percentage" is only approximate (therefore not an actual constraint), let's try removing it and find a solution for Amount. This can only make the problem easier.
What's left is the Subset Sum Problem, which is NP-complete. There are simple exponential-time solutions, and sneaky pseudo-polynomial-time solutions, but I don't think any of them will be practical for a table with 10^6 rows.
If this is an academic exercise, I suggest you write up the cleverest pseudo-polynomial-time solution you can come up with. If it's a task in the real world, I suggest you go back to the person who gave it to you, explain that an exact solution is impractical, and negotiate for an approximate solution.
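For reference, the textbook pseudo-polynomial approach to Subset Sum looks roughly like the sketch below. It handles the Amount target only (the Percentage target is ignored, as discussed above), assumes non-negative integer amounts, and takes time proportional to the number of rows times the target. The function name `subset_with_sum` is made up for this example, and this is a sketch of the classic dynamic programme, not a solution tuned for a million rows.

```python
def subset_with_sum(amounts, target):
    """Return indices of one subset of `amounts` summing to `target`, or None.

    Classic pseudo-polynomial dynamic programme over reachable sums.
    Assumes non-negative integer amounts.
    """
    # reached[s] = (previous sum, index of the item added) for the first way found to reach s
    reached = {0: None}
    for index, amount in enumerate(amounts):
        # Iterate over a snapshot so each item is used at most once.
        for s in list(reached):
            new_sum = s + amount
            if new_sum <= target and new_sum not in reached:
                reached[new_sum] = (s, index)
    if target not in reached:
        return None
    # Walk the back-pointers from the target down to 0 to recover the subset.
    chosen = []
    s = target
    while reached[s] is not None:
        s, index = reached[s]
        chosen.append(index)
    return sorted(chosen)


# Amount column of the ten sample rows (index 0 is ID 1).
amounts = [10, 20, 40, 50, 70, 100, 120, 60, 80, 95]
subset = subset_with_sum(amounts, 365)
print(subset)
```

The subset it returns is just the first one found that hits the sum, so it will not necessarily be the one with the desired weighted average.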

Django field widget doesn't show appropriate attribute

I'm using Django and this is a question on how to organize your models, or equivalently, organize tables in SQL.
At the moment I have a table where each row contains a primary key, a "value" (a float multiple of 0.01) and an "amount" (integer). This is how I need to store this data.
However, I need to serve it differently. I need to sum the "amount"s over rows with the same "value".
Example, my table is
| id | value | amount |
| 1 | 1.2 | 10 |
| 2 | 1.2 | 27 |
| 3 | 1.2 | 4 |
| 4 | 1.3 | 21 |
| 5 | 1.3 | 1 |
| 6 | 1.4 | 5 |
| 7 | 1.4 | 9 |
For my app I need to serve this as
| value | amount |
| 1.2 | 41 |
| 1.3 | 22 |
| 1.4 | 14 |
Now my question is: what is the best way to do this? Should I generate the second table from the first every time I need to serve it? Or should I add a new model to my app that gets updated every time my current model gets updated, so that it contains redundant information but gets the job done faster?
EDIT:
qb = Order.objects.filter(
models.Q(status='B')|models.Q(status='K')
).filter(
side='L', market__pk=self.pk
).order_by(
'-value'
).values('value').annotate(amount_sum=Sum('amount'))
The output is
[{'amount_sum': 22, 'value': Decimal('1.3')}, {'amount_sum': 41, 'value': Decimal('1.2')}]
from django.db.models import Sum
MyTable.objects.values('value').annotate(amount_sum=Sum('amount'))
This returns a QuerySet of dictionaries, each containing value and amount_sum. You can name amount_sum whatever you like.
Django doc for Sum
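To show what that grouping computes without needing a Django project, here is a small plain-Python sketch of the same aggregation over the example rows. It is only an illustration of the result shape; in the real app the database does this work through the ORM query above. The helper name `sum_amount_by_value` is hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# (id, value, amount) rows from the example table.
rows = [
    (1, Decimal("1.2"), 10),
    (2, Decimal("1.2"), 27),
    (3, Decimal("1.2"), 4),
    (4, Decimal("1.3"), 21),
    (5, Decimal("1.3"), 1),
    (6, Decimal("1.4"), 5),
    (7, Decimal("1.4"), 9),
]


def sum_amount_by_value(rows):
    """Group rows by value and sum their amounts, like values('value').annotate(Sum('amount'))."""
    totals = defaultdict(int)
    for _id, value, amount in rows:
        totals[value] += amount
    return [{"value": value, "amount_sum": totals[value]} for value in sorted(totals)]


summary = sum_amount_by_value(rows)
for entry in summary:
    print(entry)
```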