SQL - Return dataset pivoted on one column with multiple aggregates - sql

Using TSQL 2012 here, essentially I have a dataset that looks like the following:
Period  Values  OtherValues  SiteName  MoreColumns
1       12      45           Site 1    34
2       34      6            Site 1    346
2       56      79           Site 1    345
3       3       78           Site 1    67
3       4       67           Site 1    8
What I would like to return is a dataset that groups on site and sums all the other columns based on the period they're against.
Site    P1V  P2V  P3V  P1OtherV  P2OtherV  P3OtherV
Site 1  12   90   7    45        85        145
I know I can do it with a case in the style of:
SELECT SUM(CASE WHEN Period = '1' THEN [Values] END) AS P1Values,
SUM(CASE WHEN Period = '2' THEN [Values] END) AS P2Values,
SUM(CASE WHEN Period = '3' THEN [Values] END) AS P3Values
.....
But surely there's a more elegant solution for this? The dataset should return three sums (for each period) for 7 columns, so in total 21 sums, with the potential to grow.

I would use an UNPIVOT/PIVOT sequence:
SELECT * FROM
(
-- Build the target column name, e.g. 1 + SapValues -> P1SapValues
SELECT 'P' + CAST(Period AS varchar(10)) + Col AS Period, SiteName, Value FROM
Src UNPIVOT (Value FOR Col IN (SapValues,OtherValues)) U
) U
PIVOT (SUM(Value) FOR Period IN (P1SapValues,P2SapValues,P3SapValues,
P1OtherValues,P2OtherValues,P3OtherValues)) P

You can do it using PIVOT and building the source table with UNION ALL:
WITH T1 AS
(
SELECT 'P' + CAST(Period AS varchar(10)) + 'SapValues' as Period, SapValues as Value, SiteName FROM T
UNION ALL
SELECT 'P' + CAST(Period AS varchar(10)) + 'OtherValues' as Period, OtherValues as Value, SiteName FROM T
)
SELECT *
FROM T1
PIVOT
(
Sum(Value)
FOR Period IN ([P1SapValues],[P2SapValues],[P3SapValues],
[P1OtherValues],[P2OtherValues],[P3OtherValues])
) AS PivotTable;
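For comparison, the CASE approach from the question can be written as conditional aggregation, with each CASE placed inside its SUM. The following is only a sketch: it inlines the sample rows from the question through a VALUES constructor (the CTE name Src and the bracketed [Values] column are assumptions), and it covers two of the value columns.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
-- Src and the column names are assumptions based on the question's table.
WITH Src AS
(
    SELECT *
    FROM (VALUES
        (1, 12, 45, 'Site 1'),
        (2, 34,  6, 'Site 1'),
        (2, 56, 79, 'Site 1'),
        (3,  3, 78, 'Site 1'),
        (3,  4, 67, 'Site 1')
    ) AS v (Period, [Values], OtherValues, SiteName)
)
SELECT SiteName,
       -- One SUM per (period, column) pair; rows from other periods yield NULL
       -- and are ignored by SUM.
       SUM(CASE WHEN Period = 1 THEN [Values]    END) AS P1V,
       SUM(CASE WHEN Period = 2 THEN [Values]    END) AS P2V,
       SUM(CASE WHEN Period = 3 THEN [Values]    END) AS P3V,
       SUM(CASE WHEN Period = 1 THEN OtherValues END) AS P1OtherV,
       SUM(CASE WHEN Period = 2 THEN OtherValues END) AS P2OtherV,
       SUM(CASE WHEN Period = 3 THEN OtherValues END) AS P3OtherV
FROM Src
GROUP BY SiteName;
```

With the sample rows this returns the single row shown in the question (12, 90, 7, 45, 85, 145 for Site 1). It is more verbose than PIVOT, but every output column is spelled out explicitly.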

Related

Sql getting MAX and MIN values based on two columns for the ids from two others

I'm having difficulties figuring a query out, would someone be able to assist me with this?
Problem: 4 columns that represent results for the 2 separate tests. One of them taken in UK and another in US. Both of them are the same test and I need to find the highest and lowest score for the test taken in both countries. I also need to avoid using subqueries and temporary tables. Would appreciate theoretical ideas and actual solutions for the problem.
The table looks like this:
ResultID Test_UK Test_US Test_UK_Score Test_US_Score
1 1 2 48 11
2 4 1 21 24
3 3 1 55 71
4 5 6 18 78
5 7 4 19 49
6 1 3 23 69
7 5 2 98 35
8 6 7 41 47
The desired results I'm looking for:
TestID HighestScore LowestScore
1 71 23
2 35 11
3 69 55
4 49 21
5 98 18
6 78 41
7 47 19
I tried implementing a CASE comparison, but I still ended up with a subquery to pull out the final results. I also tried UNION, but that ends up in a subquery again. As far as I can tell it should be a CASE WHEN ... THEN query, but I can't come up with the logic for it, as it requires matching the IDs of the tests.
Thank you!
What I've tried and got the best results (still wrong)
select v.TestID,
max(case when Test_US_Score > Test_UK_Score then Test_UK_Score else null end) MaxS,
min(case when Test_UK_Score > Test_US_Score then Test_US_Score else null end) MinS
FROM ResultsDB rDB CROSS APPLY
(VALUES (Test_UK, 1), (Test_US, 0)
) V(testID, amount)
GROUP BY v.TestID
Extra
The answer provided by M. Kanarkowski is a perfect solution. I'm no expert on CTEs and am a bit confused: how would it be possible to adapt this query to return the ResultID of the rows where the min and max were found?
something like this:
TestID Result_ID_Max Result_ID_Min
1 3 6
2 7 1
3 6 3
Extra 2
The desired results of the query would be something like this.
The two last columns represent the IDs of the rows from the original table where the max and min values were found.
TestID HighestScore LowestScore Result_ID_Of_Max Result_ID_Of_Min
1 71 23 3 6
2 35 11 7 1
3 69 55 6 3
For example, you can use UNION ALL to put the results from both countries together and then just pick the maximum and the minimum for each test.
with cte as (
select Test_UK as TestID, Test_UK_Score as score from yourTable
union all
select Test_US as TestID, Test_US_Score as score from yourTable
)
select
TestID
,max(score) as HighestScore
,min(score) as LowestScore
from cte
group by TestID
order by TestID
Extra:
I assumed that you want to have the additional columns alongside the previous result. If not, just take the above select and replace Test_UK_Score and Test_US_Score with ResultID.
with cte as (
select Test_UK as TestID, Test_UK_Score as score, ResultID from yourTable
union all
select Test_US as TestID, Test_US_Score as score, ResultID from yourTable
)
select
TestID
,max(score) as HighestScore
,min(score) as LowestScore
,max(ResultID) as Result_ID_Max
,min(ResultID) as Result_ID_Min
from cte
group by TestID
order by TestID
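Note that max(ResultID) and min(ResultID) return the largest and smallest ResultID in each group, which is not necessarily the row holding the highest or lowest score. One possible way to get the ResultID of those rows is to rank the unioned rows with ROW_NUMBER(). This is a sketch that assumes SQL Server (the question's own attempt uses CROSS APPLY) and inlines the sample data; the names ranked, rn_max and rn_min are made up for illustration.

```sql
-- Sample data from the question, inlined so the query runs on its own.
WITH yourTable AS
(
    SELECT *
    FROM (VALUES
        (1, 1, 2, 48, 11),
        (2, 4, 1, 21, 24),
        (3, 3, 1, 55, 71),
        (4, 5, 6, 18, 78),
        (5, 7, 4, 19, 49),
        (6, 1, 3, 23, 69),
        (7, 5, 2, 98, 35),
        (8, 6, 7, 41, 47)
    ) AS v (ResultID, Test_UK, Test_US, Test_UK_Score, Test_US_Score)
),
cte AS
(
    -- Same unpivoting step as above: one row per (test, score, source row).
    SELECT Test_UK AS TestID, Test_UK_Score AS score, ResultID FROM yourTable
    UNION ALL
    SELECT Test_US AS TestID, Test_US_Score AS score, ResultID FROM yourTable
),
ranked AS
(
    -- rn_max = 1 marks the row with the highest score per test,
    -- rn_min = 1 the row with the lowest; ties go to the smaller ResultID.
    SELECT TestID, score, ResultID,
           ROW_NUMBER() OVER (PARTITION BY TestID ORDER BY score DESC, ResultID) AS rn_max,
           ROW_NUMBER() OVER (PARTITION BY TestID ORDER BY score ASC,  ResultID) AS rn_min
    FROM cte
)
SELECT TestID,
       MAX(score) AS HighestScore,
       MIN(score) AS LowestScore,
       MAX(CASE WHEN rn_max = 1 THEN ResultID END) AS Result_ID_Of_Max,
       MAX(CASE WHEN rn_min = 1 THEN ResultID END) AS Result_ID_Of_Min
FROM ranked
GROUP BY TestID
ORDER BY TestID;
```

For TestID 1, 2 and 3 this gives (71, 23, 3, 6), (35, 11, 7, 1) and (69, 55, 6, 3), which matches the rows listed under "Extra 2" in the question.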

Count distinct values of a Column based on Distinct values of First Column

I am dealing with a huge volume of traffic data. I want to identify the vehicles which have changed their lanes. I'm using Microsoft Access with VB.Net.
Traffic Data:
Vehicle_ID Lane_ID Frame_ID Distance
1 2 12 100
1 2 13 103
1 2 14 105
2 1 16 130
2 1 17 135
2 2 18 136
3 1 19 140
3 2 20 141
I have tried to select the distinct Vehicle_ID values and then count(distinct Lane_ID).
I can list the distinct Vehicle_ID values, but the query then counts all Lane_ID rows instead of the distinct Lane_ID values.
SELECT
Distinct Vehicle_ID, count(Lane_ID)
FROM Table1
GROUP BY Vehicle_ID
Shown Result:
Vehicle_ID Lane Count
1 3
2 3
3 2
Correct Result:
Vehicle_ID Lane Count
1 1
2 2
3 2
Further to that, I would like to get all rows for every Vehicle_ID that has changed lanes (all data, including the previous lane and the new lane). The output would look something like:
Vehicle_ID Lane_ID Frame_ID Distance
2 1 17 135
2 2 18 136
3 1 19 140
3 2 20 141
Access does not support COUNT(DISTINCT columnname) so do this:
SELECT t.Vehicle_ID, COUNT(t.Lane_ID) AS [Lane Count]
FROM (
SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1
) AS t
GROUP BY t.Vehicle_ID
So, to identify the vehicles which have changed their lanes, you need to add this to the above query:
HAVING COUNT(t.Lane_ID) > 1
SELECT
Table1.Vehicle_ID,
cTable1.LANE_COUNT
FROM Table1
INNER JOIN (
SELECT dTable1.Vehicle_ID, COUNT(*) AS LANE_COUNT FROM (
SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1
) AS dTable1
GROUP BY dTable1.Vehicle_ID
) AS cTable1 ON cTable1.Vehicle_ID = Table1.Vehicle_ID
I think you should do it step by step:
1. Select the distinct Vehicle_ID and Lane_ID combinations (dTable1).
2. Count the distinct combinations per vehicle (cTable1).
3. Join the counts back to the actual table.
If you want vehicles that have changed their lanes, then you can do:
SELECT Vehicle_ID,
IIF(MIN(Lane_ID) = MAX(Lane_ID), 0, 1) as change_lane_flag
FROM Table1
GROUP BY Vehicle_ID;
I think this is as good as counting the number of distinct lanes, because neither approach counts actual "lane changes". A distinct count would return 2 for the following rows even though the vehicle changes lanes multiple times:
2 1 16 130
2 1 17 135
2 2 18 136
2 1 16 140
2 1 17 145
2 2 18 146
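For the second part of the question (returning every row for the vehicles that changed lanes), the same MIN/MAX comparison can serve as a filter. This is a sketch in Access-compatible SQL that assumes the Table1 layout from the question; the alias t1 is made up for illustration.

```sql
SELECT t1.Vehicle_ID, t1.Lane_ID, t1.Frame_ID, t1.Distance
FROM Table1 AS t1
WHERE t1.Vehicle_ID IN (
    -- Vehicles seen in more than one lane: their smallest and largest
    -- Lane_ID differ.
    SELECT Vehicle_ID
    FROM Table1
    GROUP BY Vehicle_ID
    HAVING MIN(Lane_ID) <> MAX(Lane_ID)
)
ORDER BY t1.Vehicle_ID, t1.Frame_ID;
```

With the sample data this keeps all rows for vehicles 2 and 3 and drops vehicle 1, which stays in lane 2 throughout. It returns every frame for those vehicles, not only the frames around the change itself.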

Re-Organize Access Table by converting Rows to Columns

I'm pretty new to access and SQL and need some help re-organizing a table. I have the following table (sorry for the table below - having trouble posting):
ID GroupID Distance Code Start_Finish
1 44 7 A S1
2 44 14 A F1
3 45 12 B S1
4 45 16 B F1
5 45 31 C S2
6 45 36 C F2
7 45 81 B S3
8 45 88 B F3
And need for the table to be transformed into:
GroupID Code Start_Distance Finish_Distance
44 A 7 14
45 B 12 16
45 C 31 36
45 B 81 88
try something like this
Select GroupID, Code, min(distance) as Start_distance, max(distance) as Finish_distance
from Table
group by GroupID, Code
If the min and max functions don't give you what you need, try it with First() and Last() instead.
Oops - just noticed you have 2 different entries in the output for GroupID 45 Code B - is that a requirement? With that data structure and requirement, the problem gets much more difficult.
Now I see the final column in the 1st table - I think that can be used to get the output you want:
Select GroupID, Code, mid(start_finish,2) as T, min(distance) as Start_distance, max(distance) as Finish_distance
from Table
group by GroupID, Code, mid(start_finish,2)
You can use conditional aggregation for this.
select GroupID
, CODE
, max(case when Left(Start_Finish, 1) = 'S' then Distance end) as Start_Distance
, max(case when Left(Start_Finish, 1) = 'F' then Distance end) as Finish_Distance
from SomeTable
group by GroupID
, CODE
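Access SQL has no CASE expression, so in Access the conditional aggregation above would need IIf instead. The sketch below assumes the table is called SomeTable, as in the answer. It also adds the numeric part of Start_Finish to the GROUP BY, so that the two separate B segments of GroupID 45 stay on separate rows.

```sql
SELECT GroupID,
       Code,
       -- IIf returns Distance for the matching rows and Null otherwise;
       -- Max ignores the Nulls.
       Max(IIf(Left(Start_Finish, 1) = 'S', Distance, Null)) AS Start_Distance,
       Max(IIf(Left(Start_Finish, 1) = 'F', Distance, Null)) AS Finish_Distance
FROM SomeTable
GROUP BY GroupID, Code, Mid(Start_Finish, 2);
```

With the sample rows, S1/F1, S2/F2 and S3/F3 each form their own group, which gives the four rows requested in the question (44 A 7 14, 45 B 12 16, 45 C 31 36, 45 B 81 88), though not necessarily in that order.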

Assigning a value of data for each record having the same condition in SQL Server 2008

I have a table in SQL Server 2008 like:
Period Name Value
1 A 10
2 A 20
3 A 30
4 A 40
1 B 50
2 B 80
3 B 70
4 B 60
I need to write a select query that includes a new column, MainValue, which contains for every row the Value where Period = 4 for that row's Name.
Example:
Period Name Value MainValue
1 A 10 40
2 A 20 40
3 A 30 40
4 A 40 40
1 B 50 60
2 B 80 60
3 B 70 60
4 B 60 60
How can I provide this? I tried the one below, but it is not working as I want.
Select
*,
(select Value where Period = 4) as MainValue
from myTable;
Any help would be appreciated.
Try this:
SELECT Period, Name, Value,
MAX(CASE WHEN Period=4 THEN Value END) OVER (PARTITION BY Name) AS MainValue
FROM mytable
The query uses a window function with a condition applied over Name partitions: the function returns the Value corresponding to Period=4 inside each partition.
You can do this a number of ways. A correlated sub-query as the column, a cross apply to a correlated query, or a cte. I personally like the cte approach. It would look something like this.
with MainValues as
(
select Name
, Value
from SomeTable
where Period = 4
)
select st.*
, mv.Value as MainValue
from SomeTable st
join MainValues mv on st.Name = mv.Name
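The correlated sub-query variant mentioned above is close to what the question attempted: the inner select needs its own FROM clause and a condition tying it to the outer row's Name. This is a sketch with the sample data inlined through a VALUES constructor (available from SQL Server 2008); the aliases t and m are made up for illustration.

```sql
-- Sample data from the question, inlined so the query runs on its own.
WITH myTable AS
(
    SELECT *
    FROM (VALUES
        (1, 'A', 10), (2, 'A', 20), (3, 'A', 30), (4, 'A', 40),
        (1, 'B', 50), (2, 'B', 80), (3, 'B', 70), (4, 'B', 60)
    ) AS v (Period, Name, Value)
)
SELECT t.Period,
       t.Name,
       t.Value,
       -- For each outer row, look up the Period 4 value of the same Name.
       (SELECT m.Value
        FROM myTable AS m
        WHERE m.Name = t.Name
          AND m.Period = 4) AS MainValue
FROM myTable AS t
ORDER BY t.Name, t.Period;
```

This returns MainValue 40 for every A row and 60 for every B row. It assumes at most one Period 4 row per Name; with more than one, the sub-query would raise an error.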

SQL Server 2012 buckets based on running total

For SQL Server 2012, I am trying to assign rows to sequential buckets based on the maximum bucket size (100 in the sample below) and the running total of a column. Most of the solutions I found partition by a column whose value is known to change, e.g. partition by department id. In this situation, however, all I have is a sequential id and a size. The closest solution I have found is discussed in the thread below for SQL Server 2008. I tried it, but its performance is very slow for large row sets, much worse than a cursor-based solution. https://dba.stackexchange.com/questions/45179/how-can-i-write-windowing-query-which-sums-a-column-to-create-discrete-buckets
This table can contain up to 10 Million rows. With SQL Server 2012 supporting SUM OVER and LAG and LEAD functions, wondering if someone can suggest a solution based on 2012.
CREATE TABLE raw_data (
id INT PRIMARY KEY
, size INT NOT NULL
);
INSERT INTO raw_data
(id, size)
VALUES
( 1, 96) -- new bucket here, maximum bucket size is 100
, ( 2, 10) -- and here
, ( 3, 98) -- and here
, ( 4, 20)
, ( 5, 50)
, ( 6, 15)
, ( 7, 97)
, ( 8, 96) -- and here
;
--Expected output
--bucket_size is for illustration only, actual needed output is bucket only
id size bucket_size bucket
-----------------------------
1 100 100 1
2 10 10 2
3 98 98 3
4 20 85 4
5 50 85 4
6 15 85 4
7 97 98 5
8 1 98 5
TIA
You can achieve this quite easily in SQL Server 2012 using a window function and framing. The syntax looks quite complex, but the concept is simple - sum all the previous rows up to and including the current one. The cumulative_bucket_size column in this example is for demonstration purposes, as it is part of the equation used to derive the bucket number:
DECLARE @Bucket_Size AS INT;
SET @Bucket_Size = 100;
SELECT
id,
size,
SUM(size) OVER (
PARTITION BY 1 ORDER BY id ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_bucket_size,
1 + SUM(size) OVER (
PARTITION BY 1 ORDER BY id ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) / @Bucket_Size AS bucket
FROM
raw_data
The PARTITION BY clause is optional, but would be useful if you had different "bucket sets" for column groupings. I have added it here for completeness.
Results:
id size cumulative_bucket_size bucket
------------------------------------------
1 96 96 1
2 10 106 2
3 98 204 3
4 20 224 3
5 50 274 3
6 15 289 3
7 97 386 4
8 96 482 5
You can read more about windows framing in the following article:
https://www.simple-talk.com/sql/learn-sql-server/window-functions-in-sql-server-part-2-the-frame/
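Since the PARTITION BY clause is optional here, the same running total can be written more compactly. The following sketch inlines the sample rows from the question so that it runs on its own, and uses the shorthand frame ROWS UNBOUNDED PRECEDING, which is equivalent to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

```sql
DECLARE @Bucket_Size INT = 100;

-- Sample rows from the question's INSERT, inlined for a self-contained run.
WITH raw_data AS
(
    SELECT *
    FROM (VALUES
        (1, 96), (2, 10), (3, 98), (4, 20),
        (5, 50), (6, 15), (7, 97), (8, 96)
    ) AS v (id, size)
)
SELECT id,
       size,
       -- Running total over all rows up to and including the current one.
       SUM(size) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS cumulative_bucket_size,
       -- Integer division of the running total gives the bucket number.
       1 + SUM(size) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) / @Bucket_Size AS bucket
FROM raw_data
ORDER BY id;
```

This produces the same eight rows as the results listed above. Note that the bucket number is derived from the overall running total, so the sum does not restart when a bucket fills up; this is why ids 3 to 6 all land in bucket 3 here, whereas the expected output in the question puts id 3 in its own bucket.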
Before you can use the running total method to assign bucket numbers, you need to generate that bucket_size column, because the numbers would be produced based on that column.
Based on your expected output, the bucket ranges are
1..10
11..85
86..100
You could use a simple CASE expression like this to generate a bucket_size column like in your example:
CASE
WHEN size <= 10 THEN 10
WHEN size <= 85 THEN 85
ELSE 100
END
Then you would use LAG() to determine if a row starts a new sequence of sizes belonging to the same bucket:
CASE bucket_size
WHEN LAG(bucket_size) OVER (ORDER BY id) THEN 0
ELSE 1
END
These two calculations could be done in the same (sub)query with the help of CROSS APPLY:
SELECT
d.id,
d.size,
x.bucket_size, -- for illustration only
is_new_seq = CASE x.bucket_size
WHEN LAG(x.bucket_size) OVER (ORDER BY d.id) THEN 0
ELSE 1
END
FROM dbo.raw_data AS d
CROSS APPLY
(
SELECT
CASE
WHEN size <= 10 THEN 10
WHEN size <= 85 THEN 85
ELSE 100
END
) AS x (bucket_size)
The above query would produce this output:
id size bucket_size is_new_seq
-- ---- ----------- ----------
1 96 100 1
2 10 10 1
3 98 100 1
4 20 85 1
5 50 85 0
6 15 85 0
7 97 100 1
8 96 100 0
Now use that result as a derived table and apply SUM() OVER to is_new_seq to produce the bucket numbers, like this:
SELECT
id,
size,
bucket = SUM(is_new_seq) OVER (ORDER BY id)
FROM
(
SELECT
d.id,
d.size,
is_new_seq = CASE x.bucket_size
WHEN LAG(x.bucket_size) OVER (ORDER BY d.id) THEN 0
ELSE 1
END
FROM dbo.raw_data AS d
CROSS APPLY
(
SELECT
CASE
WHEN size <= 10 THEN 10
WHEN size <= 85 THEN 85
ELSE 100
END
) AS x (bucket_size)
) AS s
;