How to generate this report?

I'm trying to set up a report based on several tables.
I have a table Actual that looks like this:
+--------+------+
| status | date |
+--------+------+
| 5 | 7/10 |
| 8 | 7/9 |
| 8 | 7/11 |
| 5 | 7/18 |
+--------+------+
Table Targets looks like this:
+--------+-------------+--------+------------+
| status | weekEndDate | target | cumulative |
+--------+-------------+--------+------------+
| 5 | 7/12 | 4 | 45 |
| 5 | 7/19 | 5 | 50 |
| 8 | 7/12 | 4 | 45 |
| 8 | 7/19 | 5 | 50 |
+--------+-------------+--------+------------+
Grouping the Actual records by which Targets.weekEndDate they fall under, I have the following aggregate query GroupActual:
+-------------+------------+--------------+--------+------------+
| weekEndDate | status | weeklyTarget | actual | cumulative |
+-------------+------------+--------------+--------+------------+
| 7/12 | 5 | 4 | 1 | 45 |
| 7/12 | 8 | 4 | 2 | 45 |
| 7/19 | 5 | 5 | 1 | 50 |
| 7/19 | 8 | 5 | | 50 |
+-------------+------------+--------------+--------+------------+
I'm trying to create this report:
+--------+------------+------+------+
| status | category | 7/12 | 7/19 | ...etc for every weekEndDate entry in Targets
+--------+------------+------+------+
| 5 | actual | 1 | 1 |
| 5 | target | 4 | 5 |
| 5 | cumulative | 45 | 50 |
+--------+------------+------+------+
| 8 | actual | 2 | |
| 8 | target | 4 | 5 |
| 8 | cumulative | 45 | 50 |
+--------+------------+------+------+
I can use a crosstab query to make the date columns, but I'm not sure how to have rows for "actual", "target", and "cumulative". They aren't values in the same table, which means (I think) that a crosstab query won't be useful for this breakdown. Should I try to change GroupActual so that it puts the data in the shape I'm looking for? Kind of confused as to where to go next with this...
EDIT: I've made some headway on the crosstabs as per PowerUser's solution, but I'm having trouble with the one for Target. I modified the wizard's generated SQL in an attempt to get what I want but it's not working out. I used a version of GroupActual that only has the weekEndDate,status, and weeklyTarget columns; here's the SQL:
TRANSFORM weeklyTarget
SELECT status
FROM TargetStatus_forCrosstab_Target
GROUP BY status,weeklyTarget
PIVOT Format([weekEndDate],"Short Date");

You're almost there. The problem is that you can't do this all in a single crosstab. You need to make 3 crosstabs (one for 'actual', one for 'target', and one for 'cumulative'), then make a Union query to combine them all.
Additional Tip: In your individual crosstabs, add a Sort column. Your 'actual' crosstab will have a Sort value of 1, 'Target' will have a Sort value of 2, and 'Cumulative' will have 3. That way, when you union them together, you can get them all in the right order.
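To illustrate the shape of the result (three pivots stacked on top of each other, ordered by a Sort value), here is a minimal Python sketch. It is not Access SQL and not necessarily how you would build it in practice; the row values are sample data consistent with the Targets table and the desired report, and the names group_actual, CATEGORIES, and build_report are made up for the example.

```python
# Sample GroupActual rows: (weekEndDate, status, weeklyTarget, actual, cumulative).
# None marks a week that has no actual records yet.
group_actual = [
    ("7/12", 5, 4, 1, 45),
    ("7/12", 8, 4, 2, 45),
    ("7/19", 5, 5, 1, 50),
    ("7/19", 8, 5, None, 50),
]

# (sort key, category label, index of the source column in a row).
# The sort key plays the role of the Sort column suggested above.
CATEGORIES = [(2, "target", 2), (1, "actual", 3), (3, "cumulative", 4)]


def build_report(rows):
    """Pivot each measure by weekEndDate, then stack the pivots per status."""
    # Assumption: the date strings sort correctly as text for this sample.
    weeks = sorted({r[0] for r in rows})
    statuses = sorted({r[1] for r in rows})
    lookup = {(r[1], r[0]): r for r in rows}
    report = []
    for status in statuses:
        for _sort, label, col in sorted(CATEGORIES):
            values = [
                lookup[(status, week)][col] if (status, week) in lookup else None
                for week in weeks
            ]
            report.append((status, label, *values))
    return weeks, report


weeks, report = build_report(group_actual)
print(("status", "category", *weeks))
for line in report:
    print(line)
```

Each category corresponds to one crosstab, and sorting on the first element of CATEGORIES is what the Sort column does once the three crosstabs are unioned.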

Related

Transform table from sequential identifier to real with attributes

I changed the context a bit, but it's basically the same issue.
Imagine we are in a never-ending tunnel, shaped like a circle. We split the circle into sections numbered 1 to 10, and we'll call each section a slot (sl). There are 2 groups (gr) of living things walking in the tunnel. Each group has 2 bands, each of which has a name and global hitpoints (hp). Every group is walking forward (although the bands might change order). If a group is at slot #10 and moves forward, it will be at slot #1. We snapshot their information every day. All the data gathered is stored in a table with this structure:
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
| day_id | gr_1_sl_1_id | gr_1_sl_1_name | gr_1_sl_1_hp | gr_1_sl_2_id | gr_1_sl_2_name | gr_1_sl_2_hp | gr_2_sl_1_id | gr_2_sl_1_name | gr_2_sl_1_hp | gr_2_sl_2_id | gr_2_sl_2_name | gr_2_sl_2_hp |
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
| 1 | 3 | orc | 100 | 4 | goblin | 10 | 10 | human | 50 | 1 | dwarf | 25 |
| 2 | 6 | goblin | 7 | 7 | orc | 76 | 2 | human | 60 | 3 | dwarf | 28 |
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
As you can see, the column names are sequential (slot 1, slot 2 within each group), while the data holds the actual slot value. What I want is to have the information shaped this way instead:
+---------+-------+-------+-----------+---------+
| id_game | gr_id | sl_id | band_name | band_hp |
+---------+-------+-------+-----------+---------+
| 1 | 1 | 3 | orc | 100 |
| 1 | 1 | 4 | goblin | 10 |
| 1 | 2 | 10 | human | 50 |
| 1 | 2 | 1 | dwarf | 25 |
| 2 | 1 | 6 | goblin | 7 |
| 2 | 1 | 7 | orc | 76 |
| 2 | 2 | 2 | human | 60 |
| 2 | 2 | 3 | dwarf | 28 |
+---------+-------+-------+-----------+---------+
I have this information in Power BI, although I can create views in SQL Server if need be. I have tried many things; the closest I got was unpivoting and parsing the original columns to get day_id, gr_id, sl_id, attributes and values. The attributes and values columns basically hold name and hp with their corresponding value (I changed hp into a string), but then I'm stuck and not sure what to do next.
Does anyone have any ideas? Keep in mind that I oversimplified the problem; there are more groups, more slots, more bands and more statistics (e.g. attack and defense rating, etc.)
You seem to want to unpivot the table. In SQL Server, I recommend using apply:
-- one VALUES row per (group, sequential slot) set of columns
select t.day_id as id_game, v.gr_id, v.sl_id, v.band_name, v.band_hp
from t cross apply
(values (1, gr_1_sl_1_id, gr_1_sl_1_name, gr_1_sl_1_hp),
(1, gr_1_sl_2_id, gr_1_sl_2_name, gr_1_sl_2_hp),
(2, gr_2_sl_1_id, gr_2_sl_1_name, gr_2_sl_1_hp),
(2, gr_2_sl_2_id, gr_2_sl_2_name, gr_2_sl_2_hp)
) v(gr_id, sl_id, band_name, band_hp);
In other databases, you can do something similar with union all.
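Since the real table has more groups, slots and statistics, listing every column set by hand gets tedious. As a rough sketch of the same unpivot done generically, here is a small Python version that parses the column names instead. It only assumes the gr_<group>_sl_<n>_<attribute> naming pattern shown in the question; wide_rows, COLUMN, and unpivot are names invented for the example.

```python
import re

# Wide snapshot rows keyed by column name, copied from the question's sample data.
wide_rows = [
    {"day_id": 1,
     "gr_1_sl_1_id": 3, "gr_1_sl_1_name": "orc", "gr_1_sl_1_hp": 100,
     "gr_1_sl_2_id": 4, "gr_1_sl_2_name": "goblin", "gr_1_sl_2_hp": 10,
     "gr_2_sl_1_id": 10, "gr_2_sl_1_name": "human", "gr_2_sl_1_hp": 50,
     "gr_2_sl_2_id": 1, "gr_2_sl_2_name": "dwarf", "gr_2_sl_2_hp": 25},
    {"day_id": 2,
     "gr_1_sl_1_id": 6, "gr_1_sl_1_name": "goblin", "gr_1_sl_1_hp": 7,
     "gr_1_sl_2_id": 7, "gr_1_sl_2_name": "orc", "gr_1_sl_2_hp": 76,
     "gr_2_sl_1_id": 2, "gr_2_sl_1_name": "human", "gr_2_sl_1_hp": 60,
     "gr_2_sl_2_id": 3, "gr_2_sl_2_name": "dwarf", "gr_2_sl_2_hp": 28},
]

# Assumed naming pattern: gr_<group>_sl_<sequence>_<attribute>.
COLUMN = re.compile(r"gr_(\d+)_sl_(\d+)_(\w+)$")


def unpivot(row):
    """Turn one wide row into (id_game, gr_id, sl_id, band_name, band_hp) tuples."""
    bands = {}
    for col, value in row.items():
        match = COLUMN.match(col)
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            bands.setdefault(key, {})[match.group(3)] = value
    return [
        (row["day_id"], gr, band["id"], band["name"], band["hp"])
        for (gr, _seq), band in sorted(bands.items())
    ]


long_rows = [out for row in wide_rows for out in unpivot(row)]
for out in long_rows:
    print(out)
```

Extra statistics (attack, defense, ...) would show up as further keys in each band dictionary, so only the output tuple needs to change.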

How to add conditional count based on multiple columns

I'm trying to summarise a T-SQL output that looks a little like this:
+---------+---------+-----+-------+
| perf_no | section | row | seat |
+---------+---------+-----+-------+
| 7128 | 6 | A | 4 |
| 7128 | 6 | A | 5 |
| 7128 | 6 | A | 7 |
| 7128 | 6 | A | 9 |
| 7128 | 6 | A | 28 |
| 7129 | 6 | A | 29 |
| 7129 | 6 | A | 8 |
| 7129 | 6 | A | 9 |
| 7129 | 8 | A | 6 |
| 7129 | 8 | B | 3 |
| 7129 | 8 | B | 4 |
+---------+---------+-----+-------+
Comparing one row to the row(s) below, if the perf_no, section, and row values are the same, and the difference between the seat values is 1, then I want to consider them a group, and count the number of rows in that group.
To give you a real world example, these are seats in a theatre! I'm trying to summarise what seats are available.
Using the table above to illustrate:
rows 1 & 2 show that seats 4 & 5 in section 6, row A for performance 7128 are available. So that's 2 seats together
row 3 shows that seat 7 in section 6, row A for performance 7128 is available on its own. So that's a single seat (1)
rows 5 & 6 have the same section and row, and the seats are consecutive, but you can see the performance is different. So that's a single seat too.
So the output for the table above would look a little like...
(I've left in the spaces just so visually you can see the groupings more easily - obviously the final version will have none)
+---------+---------+----------+-------+
| perf_no | section | seat_row | total |
+---------+---------+----------+-------+
| 7128 | 6 | A | 2 |
| | | | |
| 7128 | 6 | A | 1 |
| 7128 | 6 | A | 1 |
| 7128 | 6 | A | 1 |
| 7129 | 6 | A | 1 |
| 7129 | 6 | A | 2 |
| | | | |
| 7129 | 8 | A | 1 |
| 7129 | 8 | B | 2 |
+---------+---------+----------+-------+
I've been trying to use some conditional case statements to not much avail. Any assistance very gratefully received!
This is a type of gaps-and-islands problem. You can generate a grouping by subtracting a sequence (generated by row_number()) from the seat:
select perf_no, section, row, count(*) as num_seats,
min(seat) as first_seat, max(seat) as last_seat
from (select t.*,
row_number() over (partition by perf_no, section, row order by seat) as seqnum
from t
) t
group by perf_no, section, row, (seat - seqnum);
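To see why subtracting the sequence number works, here is a small Python sketch of the same idea on the sample data. It is only an illustration under the assumption that a seat appears at most once per performance, section and row; seats and seat_blocks are names made up for the example. Within a run of consecutive seats, seat minus seqnum stays constant, so that difference identifies the group.

```python
from collections import defaultdict

# (perf_no, section, row, seat), copied from the question's sample table.
seats = [
    (7128, 6, "A", 4), (7128, 6, "A", 5), (7128, 6, "A", 7), (7128, 6, "A", 9),
    (7128, 6, "A", 28), (7129, 6, "A", 29), (7129, 6, "A", 8), (7129, 6, "A", 9),
    (7129, 8, "A", 6), (7129, 8, "B", 3), (7129, 8, "B", 4),
]


def seat_blocks(rows):
    """Return (perf_no, section, row, num_seats, first_seat, last_seat) per block."""
    # Equivalent of "partition by perf_no, section, row".
    partitions = defaultdict(list)
    for perf_no, section, row, seat in rows:
        partitions[(perf_no, section, row)].append(seat)

    blocks = []
    for key in sorted(partitions):
        islands = defaultdict(list)
        # Equivalent of row_number() over (... order by seat).
        for seqnum, seat in enumerate(sorted(partitions[key]), start=1):
            islands[seat - seqnum].append(seat)  # constant within a run
        for run in islands.values():
            blocks.append((*key, len(run), run[0], run[-1]))
    return blocks


blocks = seat_blocks(seats)
for block in blocks:
    print(block)
```

For performance 7128, section 6, row A the differences are 3, 3, 4, 5 and 23, which gives one block of two seats (4 and 5) and three single seats.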

Selecting all rows in a master table and summing columns in multiple detail tables

I have a master table (Project List) along with several sub tables that are joined on one common field (RecNum). I need to get totals for all of the sub tables, by column and am not sure how to do it. This is a sample of the table design. There are more columns in each table (I need to pull * from "Project List") but I'm showing a sampling of the column names and values to get an idea of what to do.
Project List
| RecNum | Project Description |
| 6 | Sample description |
| 7 | Another sample |
WeekA
| RecNum | UserName | Day1Reg | Day1OT | Day2Reg | Day2OT | Day3Reg | Day3OT |
| 6 | JustMe | 1 | 2 | 3 | 4 | 5 | 6 |
| 6 | NotMe | 1 | 2 | 3 | 4 | 5 | 6 |
| 7 | JustMe | | | | | | |
| 7 | NotMe | | | | | | |
WeekB
| RecNum | UserName | Day1Reg | Day1OT | Day2Reg | Day2OT | Day3Reg | Day3OT |
| 6 | JustMe | 7 | 8 | 1 | 2 | 3 | 4 |
| 6 | NotMe | 7 | 8 | 1 | 2 | 3 | 4 |
| 7 | JustMe | | | | | | |
| 7 | NotMe | | | | | | |
So the first query should return the complete totals for both users, like this:
| RecNum | Project Description | sumReg | sumOT |
| 6 | Sample description | 40 | 52 |
| 7 | Another sample | 0 | 0 |
The second query should return the totals for just a specified user, (WHERE UserName = 'JustMe') like this:
| RecNum | Project Description | sumReg | sumOT |
| 6 | Sample description | 20 | 26 |
| 7 | Another sample | 0 | 0 |
Multiple parallel tables with the same structure is usually a sign of poor database design. The data should really be all in one table, with additional columns specifying the week.
You can, however, use union all to bring the data together. The following is an example of a query:
select pl.recNum, pl.ProjectDescription,
sum(Day1Reg + Day2Reg + Day3Reg) as reg,
sum(Day1OT + Day2OT + Day3OT) as ot
from ProjectList pl join
(select * from weekA union all
select * from weekB
) w
on pl.recNum = w.recNum
group by pl.recNum, pl.ProjectDescription;
In practice, you should not use select * with union all; list the columns out explicitly. You can add appropriate where clauses or conditional aggregation to get the results you want in any particular case.
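For a runnable illustration of the union all approach, here is a sketch that loads the sample data into an in-memory SQLite database from Python. It is not T-SQL and makes a few assumptions of its own: the table and column names are written without spaces, a left join plus coalesce is used so that projects with only empty hours report 0 rather than NULL, and the optional user filter is a hypothetical parameter named :user.

```python
import sqlite3

DAY_COLS = "Day1Reg, Day1OT, Day2Reg, Day2OT, Day3Reg, Day3OT"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProjectList (RecNum INTEGER, ProjectDescription TEXT)")
conn.executemany("INSERT INTO ProjectList VALUES (?, ?)",
                 [(6, "Sample description"), (7, "Another sample")])

# Hours per user for project 6; project 7 has rows with empty (NULL) hours.
week_hours = {"WeekA": (1, 2, 3, 4, 5, 6), "WeekB": (7, 8, 1, 2, 3, 4)}
for table, hours in week_hours.items():
    conn.execute(f"CREATE TABLE {table} (RecNum INTEGER, UserName TEXT, {DAY_COLS})")
    for user in ("JustMe", "NotMe"):
        conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                     (6, user, *hours))
        conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                     (7, user, *([None] * 6)))

QUERY = f"""
SELECT pl.RecNum, pl.ProjectDescription,
       COALESCE(SUM(COALESCE(w.Day1Reg, 0) + COALESCE(w.Day2Reg, 0)
                    + COALESCE(w.Day3Reg, 0)), 0) AS sumReg,
       COALESCE(SUM(COALESCE(w.Day1OT, 0) + COALESCE(w.Day2OT, 0)
                    + COALESCE(w.Day3OT, 0)), 0) AS sumOT
FROM ProjectList pl
LEFT JOIN (SELECT RecNum, UserName, {DAY_COLS} FROM WeekA
           UNION ALL
           SELECT RecNum, UserName, {DAY_COLS} FROM WeekB) w
       ON pl.RecNum = w.RecNum
      AND (:user IS NULL OR w.UserName = :user)
GROUP BY pl.RecNum, pl.ProjectDescription
ORDER BY pl.RecNum
"""


def project_totals(user=None):
    """Totals for all users, or for one user when a name is given."""
    return conn.execute(QUERY, {"user": user}).fetchall()


print(project_totals())          # [(6, 'Sample description', 40, 52), (7, 'Another sample', 0, 0)]
print(project_totals("JustMe"))  # [(6, 'Sample description', 20, 26), (7, 'Another sample', 0, 0)]
```

Putting the user filter in the join condition rather than in a where clause keeps projects without matching rows in the result.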

How to sum the values of a measure in MDX?

I am not sure how to put this, but I am trying to sum the values of a measure using MDX.
My MDX is as follows :
select {[CompanyDimension].[Foo],
[CompanyDimension].[Bar],
[CompanyDimension].[CDK]} on columns,
TopCount([${SLRDimension}].Children,
10,
[Measures].[ProjectCountMeasure]) on rows
from [Foo_Cube]
where ([FAreaDimension].[Admin])
For this expression, I am getting following output :
+----------------------------------------------------------------------+
| | CompanyDimension.NameHierarchy |
+----------------------------------------------------------------------+
| SLRDimension | Foo | Bar | CDK
+----------------------------------------------------------------------+
| Development | 1 | 1 | 6
| Testing | | | 3
| Implementation | | 1 | 5
| Reports | 1 | | 4
| Planning | 1 | | 5
| Reporting | | | 1
| Coding | | | 2
| Performance | | | 1
| Designed | | 1 |
| Designing | | | 2
+----------------------------------------------------------------------+
Now I want to get the sum of the values in each row. For example, in the first row, for Development, I want a single value of 8 instead of the three values 1, 1, and 6.
I am a newbie to the MDX world, so I do not know how to do this. Please help!
I want the final values to be as follows:
+----------------------------------------------------------------------+
| | CompanyDimension.NameHierarchy |
+----------------------------------------------------------------------+
| SLRDimension | Sum
+----------------------------------------------------------------------+
| Development | 8
| Testing | 3
| Implementation | 6
| Reports | 5
| Planning | 6
| Reporting | 1
| Coding | 2
| Performance | 1
| Designed | 1
| Designing | 2
+----------------------------------------------------------------------+
Using the Pentaho sample data SteelWheelsSales cube as basis, this is similar to what you have now:
SELECT NON EMPTY {[Customers].[All Customers]} ON ROWS,
NON EMPTY {[Markets].[APAC],[Markets].[EMEA]} ON COLUMNS
FROM [SteelWheelsSales]
and this is what you want:
SELECT NON EMPTY {[Customers].[All Customers]} ON ROWS,
NON EMPTY {[Measures].[Quantity]} ON COLUMNS
FROM [SteelWheelsSales]
WHERE {[Markets].[APAC],[Markets].[EMEA]}
Notice how I replaced the columns with the measure I want to see, and how I moved the markets to the WHERE clause.

Quickly calculating running totals in sql server using set based operations

I have some data that looks like this:
+---+--------+-------------+---------------+--------------+
| | A | B | C | D |
+---+--------+-------------+---------------+--------------+
| 1 | row_id | disposal_id | excess_weight | total_weight |
| 2 | 1 | 1 | 0 | 30 |
| 3 | 2 | 1 | 10 | 30 |
| 4 | 3 | 1 | 0 | 30 |
| 5 | 4 | 2 | 5 | 50 |
| 6 | 5 | 2 | 0 | 50 |
| 7 | 6 | 2 | 15 | 50 |
| 8 | 7 | 2 | 5 | 50 |
| 9 | 8 | 2 | 5 | 50 |
+---+--------+-------------+---------------+--------------+
And I am transforming it to look like this:
+---+--------+-------------+---------------+--------------+
| | A | B | C | D |
+---+--------+-------------+---------------+--------------+
| 1 | row_id | disposal_id | excess_weight | total_weight |
| 2 | 1 | 1 | 0 | 30 |
| 3 | 2 | 1 | 10 | 30 |
| 4 | 3 | 1 | 0 | 20 |
| 5 | 4 | 2 | 5 | 50 |
| 6 | 5 | 2 | 0 | 45 |
| 7 | 6 | 2 | 15 | 45 |
| 8 | 7 | 2 | 5 | 30 |
| 9 | 8 | 2 | 5 | 25 |
+---+--------+-------------+---------------+--------------+
Basically, I need to update the total_weight column by subtracting the sum of the excess_weights from previous rows in the table which belong to the same disposal_id.
I'm currently using a cursor because it's faster than the other solutions I've tried (CTE, triangular join, cross apply). My cursor solution keeps a running total that is reset to zero for each new disposal_id, increments it by the excess weight, and performs updates when needed; it runs in about 40 seconds. The other solutions I've tried took anywhere from 3 to 5 minutes, and I'm wondering whether there is a relatively performant way to do this using set-based operations.
I've spent a lot of time optimizing such queries and ended up with two performant options: either store precalculated running totals, as described in Denormalizing to enforce business rules: Running Totals, or calculate them on the client, which is also fast and easy.
The other solution you probably already tried is to do something like the answers found here
Unless you are using Oracle, which has decent aggregates for cumulative sum, you're better off using a cursor. At best, you're going to have to rejoin the table to itself or use another method for what should be an O(n) operation. In general, the set-based solutions for problems like these are messy or really messy.
'Previous rows' implies an ordering, so no: no set-based operations there.
Oracle's LEAD and LAG are built for this, but SQL Server forces you into triangular joins... which I suppose you have investigated.
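As a rough sketch of the "calculate them on the client" option mentioned in one of the answers above, here is the single O(n) pass written in Python on the sample data. It assumes the rows can be ordered by disposal_id and then row_id; rows and adjust_totals are names invented for the example, and writing the results back to the table is left out.

```python
# (row_id, disposal_id, excess_weight, total_weight), copied from the question.
rows = [
    (1, 1, 0, 30), (2, 1, 10, 30), (3, 1, 0, 30),
    (4, 2, 5, 50), (5, 2, 0, 50), (6, 2, 15, 50), (7, 2, 5, 50), (8, 2, 5, 50),
]


def adjust_totals(rows):
    """Subtract the excess weight of earlier rows in the same disposal."""
    result = []
    running = 0
    current = None
    for row_id, disposal_id, excess, total in sorted(rows, key=lambda r: (r[1], r[0])):
        if disposal_id != current:
            # New disposal: reset the running total, as the cursor does.
            current, running = disposal_id, 0
        result.append((row_id, disposal_id, excess, total - running))
        running += excess  # only affects the rows that follow
    return result


adjusted = adjust_totals(rows)
for row in adjusted:
    print(row)
```

This is the same logic as the cursor described in the question, just run outside the database, so each row is visited once.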