What would be a good way to pivot this data? - sql

I have data in the following granularity:
CityID | Name | Post_Science | Pre_Science | Post_Reading | Pre_Reading | Post_Writing | Pre_Writing
123 | Bob | 2.0 | 1.0 | 2.0 | 4.0 | 1.0 | 1.0
I'll refer to these <Post/Pre>_XXXXXX columns as Labels. Basically, these column names with the 'Pre' or 'Post' prefix stripped map to a Label in another table.
I want to pivot the data in a way so that the pre and post values of the same Label are in the same row, for each group of CityID, Name, Label. So it would look like this:
CityID | Name | Pre Category | Post Category | Label
123 | Bob | 1.0 | 2.0 | Science
123 | Bob | 4.0 | 2.0 | Reading
123 | Bob | 1.0 | 1.0 | Writing
The Label comes from a separate table via a join. Hopefully that doesn't confuse anyone; if it does, ignore that column for now.
There are many more of these categories; Science, Reading, and Writing are just a few I picked for the example.
I've thought of two options to get the data into this format:
1. Unpivot all the data into one long list of values grouped by CityID, Name, Label. Then parse the Label name and pivot the pre and post values of each category back into one row.
2. Do a bunch of unions: select all the Science rows in one statement, all the Reading rows in another, and union them. There are about 50 pairings, so 50 union statements.
I'm imagining the first option is cleaner than the latter. Any other options though?

This is unpivoting, and I strongly recommend APPLY:
select t.CityId, t.Name, v.*
from t cross apply
     (values (t.Post_Science, t.Pre_Science, 'Science'),
             (t.Post_Reading, t.Pre_Reading, 'Reading'),
             (t.Post_Writing, t.Pre_Writing, 'Writing')
     ) v(postcategory, precategory, label);
UNPIVOT is very particular syntax that does exactly one thing. APPLY introduces lateral joins, which are very powerful for this and many other purposes.
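The question mentions that the real Label values come from a separate table via a join; extending the APPLY pattern, that join is straightforward. A sketch, where the Labels table and its Code/LabelName columns are assumed names, not from the original schema:

```sql
-- Join the unpivoted rows to a (hypothetical) Labels lookup table.
-- The string literal in each VALUES row acts as the join key.
select t.CityId, t.Name, v.precategory, v.postcategory, l.LabelName
from t cross apply
     (values (t.Post_Science, t.Pre_Science, 'Science'),
             (t.Post_Reading, t.Pre_Reading, 'Reading'),
             (t.Post_Writing, t.Pre_Writing, 'Writing')
     ) v(postcategory, precategory, label)
join Labels l
  on l.Code = v.label;
```

Because the VALUES clause already produces one row per Label, the join multiplies nothing; it just decorates each row with the Label text.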

Clearly Gordon's solution will perform better, but if you have many or variable columns, here is an option that will dynamically UNPIVOT your data without resorting to dynamic SQL.
Example
Select A.CityID
      ,A.Name
      ,PreCat  = max(case when Item like 'Pre%'  then Value end)
      ,PostCat = max(case when Item like 'Post%' then Value end)
      ,Label   = substring(Item, charindex('_', Item + '_') + 1, 50)
From  YourTable A
Cross Apply (values (cast((Select A.* for XML RAW) as xml))) B(XMLData)
Cross Apply (
              Select Item  = xAttr.value('local-name(.)', 'varchar(100)')
                    ,Value = xAttr.value('.', 'varchar(max)')
              From  B.XMLData.nodes('//@*') xNode(xAttr)
              Where xAttr.value('local-name(.)', 'varchar(100)') not in ('CityID','Name','Other-Columns','To-Exclude')
            ) C
Group By A.CityID
        ,A.Name
        ,substring(Item, charindex('_', Item + '_') + 1, 50)
Returns
CityID | Name | PreCat | PostCat | Label
123 | Bob | 4.0 | 2.0 | Reading
123 | Bob | 1.0 | 2.0 | Science
123 | Bob | 1.0 | 1.0 | Writing


SQL Join with SUM and Group By

I have an application in MVC5 C#. I want to add a feature where the admin can see how many parking spaces each parking lot has, along with how many spots are currently taken vs available.
Let's say, I have two tables (Lots and Staff).
the 'Lots' Table is the name of the existing parking lots and a count of how many spaces each lot has:
Name | Count
A | 200
B | 450
C | 375
The 'Staff' Table contains each person's ID, and an int column for each lot. These columns are either 1 (employee is assigned to this lot) or 0 (employee is not assigned to this lot):
StaffID | LotA | LotB | LotC
7264 | 0 | 1 | 0
2266 | 0 | 0 | 1
3344 | 1 | 0 | 0
4444 | 0 | 1 | 0
In the above scenario, the desired output would be:
Lot | Total | Used | Vacant
A | 200 | 1 | 199
B | 450 | 2 | 448
C | 375 | 1 | 374
I have only done simple joins in the past, and I am struggling to figure out whether it is best to use COUNT with a WHERE column = 1, or a SUM. I also choke on the GROUP BY and get lost.
I have tried similar variations of the following (with no success):
SELECT Lots.Name,
       Lots.Count,
       SUM(Staff.LotA) AS A_Occupied,
       SUM(Staff.LotB) AS B_Occupied,
       SUM(Staff.LotC) AS C_Occupied
FROM Lots
CROSS JOIN Staff
GROUP BY Lots.Name
In theory, I expected this to yield a listing of the lots, how many spaces in each lot, and how many occupied in each lot, grouped by the lot name.
This generates an error in the GROUP BY because I am selecting columns that are not included in the GROUP BY, but that just confuses me even more about how to group for my needs.
In general, I have a bad habit of overcomplicating things, so I apologize ahead of time. I am not a 'classically trained coder'. I did my best to organize the question and make it understandable.
Firstly, your design would work better if it were properly normalised: your Staff table should have a single column representing the Lot used by each staff member (if you had 100 lots, would you have 100 columns?). Then there would be no need to pivot the data from columns into the rows you need.
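As a sketch of that normalised approach (the StaffLots table and its column names are illustrative, not part of the original schema), the report then needs no pivoting at all:

```sql
-- One row per staff/lot assignment instead of one column per lot.
create table StaffLots (
    StaffID int         not null,
    LotName varchar(10) not null
);

-- Occupancy report: count assignments per lot, subtract from capacity.
select l.[Name]                       as Lot,
       l.[Count]                      as Total,
       count(sl.StaffID)              as Used,
       l.[Count] - count(sl.StaffID)  as Vacant
from Lots l
left join StaffLots sl
  on sl.LotName = l.[Name]
group by l.[Name], l.[Count];
```

A left join is used so that lots with no assigned staff still appear, with Used = 0.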
With your current schema, you could use a cross apply to pivot the columns into rows which can then be correlated to each Lot name:
select l.[name] as Lot, l.[count] as Total, q.Qty as Used, l.[count] - q.Qty as Vacant
from lots l
cross apply (
    select Sum(LotA) A, Sum(LotB) B, Sum(LotC) C
    from Staff s
) s
cross apply (
    select Qty from (
        select * from (values ('A', s.A), ('B', s.B), ('C', s.C)) v (Lot, Qty)
    ) x
    where x.Lot = l.[name]
) q;
Here is a layman's solution. Hope it helps.
with parking as (
    select 'A' as lot, sum(case when lota > 0 then 1 else 0 end) as lots from staff
    union all
    select 'B' as lot, sum(case when lotb > 0 then 1 else 0 end) as lots from staff
    union all
    select 'C' as lot, sum(case when lotc > 0 then 1 else 0 end) as lots from staff
)
select a.name, a.count, b.lots, a.count - b.lots as places
from lots a join parking b on a.name = b.lot

how to merge two columns and make a new column in sql

I have to merge two columns into one column that has different values.
I wrote this code but can't get further:
SELECT Title, ...
FROM BuyItems
input:
| title | UK | US | total |
|-------|-----|-----|-------|
| coca | 3 | 5 | 8 |
| cake | 2 | 0 | 2 |
output:
|title |Origin | Total|
|------|-------|------|
|coca | UK | 3 |
|coca | US | 5 |
|cake | UK | 2 |
You can use CROSS APPLY and a table value constructor to do this:
-- EXAMPLE DATA START
WITH BuyItems AS
( SELECT x.title, x.UK, x.US
FROM (VALUES ('coca', 3, 5), ('cake', 2, 0)) x (title, UK, US)
)
-- EXAMPLE DATA END
SELECT bi.Title, upvt.Origin, upvt.Total
FROM BuyItems AS bi
CROSS APPLY (VALUES ('UK', bi.UK), ('US', bi.US)) upvt (Origin, Total)
WHERE upvt.Total <> 0;
Alternatively, you can use the UNPIVOT function:
-- EXAMPLE DATA START
WITH BuyItems AS
( SELECT x.title, x.UK, x.US
FROM (VALUES ('coca', 3, 5), ('cake', 2, 0)) x (title, UK, US)
)
-- EXAMPLE DATA END
SELECT upvt.Title, upvt.Origin, upvt.Total
FROM BuyItems AS bi
UNPIVOT (Total FOR Origin IN (UK, US)) AS upvt
WHERE upvt.Total <> 0;
My preference is usually for the former, as it is much more flexible: you can use explicit casting to combine columns of different types, or unpivot multiple columns at once. UNPIVOT works just fine here, and there is no reason not to use it, but since UNPIVOT works in limited scenarios and CROSS APPLY/VALUES works in all of them, I go for the latter by default.
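As a sketch of that flexibility, suppose the table also had forecast figures per country (the UK_Forecast/US_Forecast columns below are hypothetical, not from the question). CROSS APPLY/VALUES can unpivot both pairs in one pass, casting as it goes, which a single UNPIVOT cannot do:

```sql
-- Unpivot two related column pairs at once, with an explicit cast.
-- UK_Forecast/US_Forecast are assumed columns for illustration only.
SELECT bi.Title, upvt.Origin, upvt.Total, upvt.Forecast
FROM BuyItems AS bi
CROSS APPLY (VALUES
    ('UK', bi.UK, CAST(bi.UK_Forecast AS decimal(10,2))),
    ('US', bi.US, CAST(bi.US_Forecast AS decimal(10,2)))
) upvt (Origin, Total, Forecast);
```

Each VALUES row becomes one output row per source row, so related columns stay paired without any re-joining.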
Use apply:
select v.*
from t cross apply
     (values (t.title, 'UK', uk),
             (t.title, 'US', us)
     ) v(title, origin, total)
where v.total > 0;
This is a simple unpivot:
SELECT YT.title,
       V.Origin,
       V.Total
FROM dbo.YourTable YT
CROSS APPLY (VALUES ('UK', YT.UK),
                    ('US', YT.US)) V (Origin, Total);

Unpack all arrays in a JSON column SQL Server 2019

Say I have a table Schema.table with these columns
id | json_col
on the forms e.g
id=1
json_col ={"names":["John","Peter"],"ages":["31","40"]}
The lengths of names and ages are always equal but might vary from id to id (size is at least 1 but no upper limit).
How do we get an "exploded" table, with one row per name/age pair, e.g.
id | names | ages
---+-------+------
1  | John  | 31
1  | Peter | 40
2  | Jim   | 17
3  | Foo   | 2
...
I have tried OPENJSON and CROSS APPLY, but the following gives every combination of names and ages, which is not correct, so I would need to do a lot of filtering afterwards:
SELECT *
FROM Schema.table
CROSS APPLY OPENJSON(Schema.table,'$.names')
CROSS APPLY OPENJSON(Schema.table,'$.ages')
Here's my suggestion:
DECLARE @tbl TABLE(id INT, json_col NVARCHAR(MAX));
INSERT INTO @tbl VALUES(1, N'{"names":["John","Peter"],"ages":["31","40"]}')
                      ,(2, N'{"names":["Jim"],"ages":["17"]}');

SELECT t.id
      ,B.[key] AS ValueIndex
      ,B.[value] AS PersonName
      ,JSON_VALUE(A.ages, CONCAT('$[', B.[key], ']')) AS PersonAge
FROM @tbl t
CROSS APPLY OPENJSON(t.json_col)
            WITH (names NVARCHAR(MAX) AS JSON
                 ,ages  NVARCHAR(MAX) AS JSON) A
CROSS APPLY OPENJSON(A.names) B;
The idea in short:
- We use OPENJSON with a WITH clause to read names and ages into new JSON variables.
- We use one more OPENJSON to "explode" the names array.
- As the key is the value's position within the array, we can use JSON_VALUE() to read the corresponding age value by its position.
One general remark: if this JSON is under your control, you should change it to an entity-centred approach (an array of objects). Position-dependent storage like this can be quite error-prone... Try something like
{"persons":[{"name":"John","age":"31"},{"name":"Peter","age":"40"}]}
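A sketch of reading that suggested array-of-objects shape with OPENJSON ... WITH (the variable name is illustrative):

```sql
-- Each array element maps to one row; no positional matching needed.
DECLARE @j nvarchar(max) =
    N'{"persons":[{"name":"John","age":"31"},{"name":"Peter","age":"40"}]}';

SELECT p.[name], p.age
FROM OPENJSON(@j, '$.persons')
     WITH ([name] nvarchar(100) '$.name',
           age    int           '$.age') p;
```

With this shape a name and its age travel together in one object, so the second CROSS APPLY and the JSON_VALUE positional lookup disappear entirely.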
Conditional aggregation along with CROSS APPLY might be used:
SELECT id,
       MAX(CASE WHEN RowKey = 'names' THEN [value] END) AS names,
       MAX(CASE WHEN RowKey = 'ages'  THEN [value] END) AS ages
FROM
(
    SELECT id, Q0.[value] AS RowArray, Q0.[key] AS RowKey
    FROM tab
    CROSS APPLY OPENJSON(JsonCol) AS Q0
) r
CROSS APPLY OPENJSON(r.RowArray) v
GROUP BY id, v.[key]
ORDER BY id, v.[key]
id | names | ages
---+-------+------
1  | John  | 31
1  | Peter | 40
2  | Jim   | 17
3  | Foo   | 2
The first argument to OPENJSON must be a JSON value (here, the json_col column), not the table itself.

Using dynamic unpivot with columns with different types

I have a table with around 100 columns named F1, F2, ... F100.
I want to query the data row-wise, like this:
F1: someVal1
F2: someVal2
...
I am doing all this inside a stored procedure, so I am generating the SQL dynamically.
I have successfully generated the following SQL:
select CAST(valname as nvarchar(max)), CAST(valvalue as nvarchar(max)) from tbl_name unpivot
(
valvalue for valname in ([form_id], [F1],[F2],[F3],[F4],[F5],[F6],[F7],[F8],[F9],[F10],[F11],[F12],[F13],[F14],[F15],[F16],[F17],[F18],[F19],[F20],[F21],[F22],[F23],[F24],[F25],[F26],[F27],[F28],[F29],[F30],[F31],[F32],[F33],[F34],[F35],[F36],[F37],[F38],[F39],[F40],[F41],[F42],[F43],[F44],[F45],[F46],[F47],[F48],[F49],[F50],[F51],[F52],[F53],[F54],[F55],[F56],[F57],[F58],[F59],[F60],[F61],[F62],[F63],[F64],[F65],[F66],[F67],[F68],[F69],[F70],[F71],[F72],[F73],[F74],[F75],[F76],[F77],[F78],[F79],[F80],[F81],[F82],[F83],[F84],[F85])
) u
But on executing this query, I get this exception:
The type of column "F3" conflicts with the type of other columns
specified in the UNPIVOT list.
I guess this is because F3 is varchar(100) while form_id, F1 and F2 are varchar(50). According to my understanding, I shouldn't be getting this error because I am casting all the results to nvarchar(max) in the select statement.
This table has all kinds of columns like datetime, smallint and int.
Also, all the columns of this table except one have the SQL_Latin1_General_CP1_CI_AS collation.
What is the fix for this error?
The fix is to CAST all the columns to the same type (and length) in a subquery, and then UNPIVOT the subquery, instead of casting in the outer select:
select valname, valvalue
from (
    SELECT CAST([form_id] as nvarchar(max)) form_id,
           CAST([F1] as nvarchar(max)) F1,
           CAST([F2] as nvarchar(max)) F2,
           CAST([F3] as nvarchar(max)) F3,
           CAST([F4] as nvarchar(max)) F4,
           ....
    FROM tbl_name
) t1 unpivot
(
    valvalue for valname in ([form_id], [F1],[F2],[F3],[F4],[F5],[F6],[F7],[F8],[F9],[F10],[F11],[F12],[F13],[F14],[F15],[F16],[F17],[F18],[F19],[F20],[F21],[F22],[F23],[F24],[F25],[F26],[F27],[F28],[F29],[F30],[F31],[F32],[F33],[F34],[F35],[F36],[F37],[F38],[F39],[F40],[F41],[F42],[F43],[F44],[F45],[F46],[F47],[F48],[F49],[F50],[F51],[F52],[F53],[F54],[F55],[F56],[F57],[F58],[F59],[F60],[F61],[F62],[F63],[F64],[F65],[F66],[F67],[F68],[F69],[F70],[F71],[F72],[F73],[F74],[F75],[F76],[F77],[F78],[F79],[F80],[F81],[F82],[F83],[F84],[F85])
) u
In the simplest way, I would use CROSS APPLY with VALUES to do the unpivot:
SELECT *
FROM People CROSS APPLY (VALUES
    (CAST([form_id] as nvarchar(max))),
    (CAST([F1] as nvarchar(max))),
    (CAST([F2] as nvarchar(max))),
    (CAST([F3] as nvarchar(max))),
    (CAST([F4] as nvarchar(max))),
    ....
) v (valvalue)
Here is a sample of using CROSS APPLY with VALUES to unpivot.
There are several different types in the People table, so we cast each column to varchar(max) to give the columns a common type.
CREATE TABLE People
(
    IntVal int,
    StringVal varchar(50),
    DateVal date
);

INSERT INTO People VALUES (1, 'Jim',  '2017-01-01');
INSERT INTO People VALUES (2, 'Jane', '2017-01-02');
INSERT INTO People VALUES (3, 'Bob',  '2017-01-03');
Query 1:
SELECT *
FROM People CROSS APPLY (VALUES
(CAST(IntVal AS VARCHAR(MAX))),
(CAST(StringVal AS VARCHAR(MAX))),
(CAST(DateVal AS VARCHAR(MAX)))
) v (valvalue)
Results:
| IntVal | StringVal | DateVal | valvalue |
|--------|-----------|------------|------------|
| 1 | Jim | 2017-01-01 | 1 |
| 1 | Jim | 2017-01-01 | Jim |
| 1 | Jim | 2017-01-01 | 2017-01-01 |
| 2 | Jane | 2017-01-02 | 2 |
| 2 | Jane | 2017-01-02 | Jane |
| 2 | Jane | 2017-01-02 | 2017-01-02 |
| 3 | Bob | 2017-01-03 | 3 |
| 3 | Bob | 2017-01-03 | Bob |
| 3 | Bob | 2017-01-03 | 2017-01-03 |
Note: when you use UNPIVOT, you need to make sure the unpivoted columns' data types are the same.
Many ways a cat can skin you, or vice versa.
Jokes apart, what D-Shih suggested is where you should start, and it may well get you home and dry.
In the majority of cases:
Essentially, the UNPIVOT operation combines values from multiple columns into a single column, so all of those columns must share one data type. Starting with a CAST is the best way forward, as it makes the data types identical (preferably a string type like varchar or nvarchar). It's also a good idea to use the same length for all unpivoted columns, in addition to the same type.
In other cases:
If this still does not solve the problem, you need to look deeper and check whether the ANSI_PADDING setting is ON or OFF across the columns of the table. In recent versions of SQL Server it is ON by default, but some developers may set ANSI_PADDING to OFF for certain columns. If you have a mixed setup like this, it's best to move the data to another table created with ANSI_PADDING ON; the same UNPIVOT query should then work on that table.
Check ANSI_Padding status:
SELECT name
      ,CASE is_ansi_padded
            WHEN 1 THEN 'ANSI_Padding_On'
            ELSE 'ANSI_Padding_Off'
       END AS [ANSI_Padding_Check]
FROM sys.all_columns
WHERE object_id = object_id('yourschema.yourtable')
Many situations are better suited to CROSS APPLY with VALUES. It all depends on you, the jockey, to choose horses for courses.
Cheers.

Add Value column using another column as Key

Hopefully the table itself states the problem. Essentially, given the Type column on the left, is it possible to add a unique code/value column, treating Type as a key and assigning codes based on the order in which the types appear?
Type | Code
-----------
ADA | 1
ADA | 1
BIM | 2
BIM | 2
CUR | 3
BIM | 2
DEQ | 4
ADA | 1
... | ...
We can't simply hard-code the conversion, as there is an arbitrary number of Types each time.
You can use dense_rank():
select type, dense_rank() over (order by type) as code
from t;
However, I would advise you to create another table and to use that:
create table Types as (
    select row_number() over (order by type) as TypeId, type
    from t
    group by type
);
Then, join that in:
select t.type, tt.TypeId
from t join
     types tt
     on t.type = tt.type;