Joining two tables and getting rows as columns - sql

I have two tables. One is a transaction table keyed by user id, with the following details.
user_id  Product_id  timestamp  transaction_id
123      A_1                    ID1
123      A_2                    ID1
124      A_1                    ID2
125      A_2
There is also a mapping from each product_id to the division the product belongs to.
Mapping:
Product_id Division
A_1 Grocery
A_2 Electronics and so on
I need a final table with one record per user id and the number of items bought in each division as separate columns. For example:
User_ID Grocery Electronics
123 1 1
124 1 0
I did something like this:
select user_id, case (when Division ="Grocery" then count(product_id) else 0) end as Grocery
when Division="Electronics" then count(product_id) else 0) end as Electronics
from
( select user_id, a.product_id, b.division from transact as a
left join
mapping b
on a.product_id=b.product_id
)
group by user_id
Does this sound good?

When you use conditional aggregation, the case is the argument to the aggregation function:
select user_id,
sum(case when m.Division = 'Grocery' then 1 else 0 end) as Grocery,
sum(case when m.Division = 'Electronics' then 1 else 0 end) as Electronics
from transact t left join
mapping m
on t.product_id = m.product_id
group by user_id
In addition:
Your table aliases should be abbreviations for the table. Makes the code easier to understand.
You don't need a subquery.
Use single quotes for string constants. This is the SQL standard.
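To see the conditional aggregation above run end to end, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in engine, which is an assumption, since the question does not name a database. Table and column names follow the question; the in-memory database and the `rows` variable exist only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the asker's real one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transact (user_id INTEGER, product_id TEXT, transaction_id TEXT);
    CREATE TABLE mapping  (product_id TEXT, division TEXT);
    INSERT INTO transact VALUES
        (123, 'A_1', 'ID1'), (123, 'A_2', 'ID1'),
        (124, 'A_1', 'ID2'), (125, 'A_2', NULL);
    INSERT INTO mapping VALUES ('A_1', 'Grocery'), ('A_2', 'Electronics');
""")

# The CASE expression is the argument of SUM, so each row contributes 1 or 0
# to the column of its division.
rows = conn.execute("""
    SELECT t.user_id,
           SUM(CASE WHEN m.division = 'Grocery'     THEN 1 ELSE 0 END) AS Grocery,
           SUM(CASE WHEN m.division = 'Electronics' THEN 1 ELSE 0 END) AS Electronics
    FROM transact t
    LEFT JOIN mapping m ON t.product_id = m.product_id
    GROUP BY t.user_id
    ORDER BY t.user_id
""").fetchall()

for row in rows:
    print(row)
```

User 125 appears with 0 for Grocery and 1 for Electronics, because the sample data contains one A_2 row for that user.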

This is a dynamic version of Gordon Linoff's answer, which uses conditional aggregation. It works on SQL Server.
DECLARE @sql1 VARCHAR(4000) = ''
DECLARE @sql2 VARCHAR(4000) = ''
DECLARE @sql3 VARCHAR(4000) = ''
SELECT @sql1 =
'SELECT
t.user_id
'
SELECT @sql2 = @sql2 +
' , SUM(CASE WHEN m.division = ''' + division + ''' THEN 1 ELSE 0 END) AS [' + division + ']' + CHAR(10)
FROM(
SELECT DISTINCT division FROM Mapping
)t
ORDER BY division
SELECT @sql3 =
'FROM Transact t
LEFT JOIN Mapping m
ON m.product_id = t.product_id
GROUP BY t.user_id'
PRINT (@sql1 + @sql2 + @sql3)
EXEC (@sql1 + @sql2 + @sql3)
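The same idea (read the distinct divisions, then generate one SUM(CASE ...) column per division) can be sketched outside T-SQL. The following is only an illustration in Python with sqlite3 as an assumed stand-in engine; `sql_literal`, `sql_identifier` and `build_pivot_sql` are hypothetical helper names, not part of any library.

```python
import sqlite3

def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"

def sql_identifier(name):
    """Quote a column alias, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def build_pivot_sql(conn):
    """Generate one SUM(CASE ...) column per distinct division."""
    divisions = [row[0] for row in conn.execute(
        "SELECT DISTINCT division FROM mapping ORDER BY division")]
    select_list = ["t.user_id AS user_id"]
    for division in divisions:
        select_list.append(
            "SUM(CASE WHEN m.division = {} THEN 1 ELSE 0 END) AS {}".format(
                sql_literal(division), sql_identifier(division)))
    return ("SELECT " + ",\n       ".join(select_list) + "\n"
            "FROM transact t\n"
            "LEFT JOIN mapping m ON m.product_id = t.product_id\n"
            "GROUP BY t.user_id\n"
            "ORDER BY t.user_id")

# Sample data from the question, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transact (user_id INTEGER, product_id TEXT, transaction_id TEXT);
    CREATE TABLE mapping  (product_id TEXT, division TEXT);
    INSERT INTO transact VALUES
        (123, 'A_1', 'ID1'), (123, 'A_2', 'ID1'),
        (124, 'A_1', 'ID2'), (125, 'A_2', NULL);
    INSERT INTO mapping VALUES ('A_1', 'Grocery'), ('A_2', 'Electronics');
""")

sql = build_pivot_sql(conn)
cursor = conn.execute(sql)
headers = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(sql)
print(headers)
print(rows)
```

Adding a new division to the mapping table adds a column on the next run without touching the query text. The quoting helpers matter because division names come from data and are spliced into the SQL string.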

As I understand it, this can also be achieved using PIVOT:
declare @transact table (user_id int,product_id varchar(5),transaction_id varchar(5))
insert into @transact (user_id,product_id,transaction_id)values (123,'A_1','ID1'),(123,'A_2','ID1'),(124,'A_1','ID2'),(125,'A_2',NULL)
declare @mapping table (product_id varchar(5),division varchar(20))
insert into @mapping (product_id,division)values ('A_1','Grocery'),('A_2','Electronics')
select * from (
select t.user_id,tt.division as Division,tt.product_id As product_id from @transact t
left join @mapping tt
on t.product_id = tt.product_id)T
PIVOT(COUNT(product_id)FOR division IN ([Grocery],[Electronics]))P

Related

Create a pivot of same columns in to 1 row

I'm using SQL Server. I have a query that returns the data for all the fields. The main thing is that one field can belong to multiple records; the RecordID differentiates them.
I've a data set like this.
My current query:
Select fd.FieldName ,FV.FieldID, Data , R.RecordID from FieldValues FV
Inner Join Records R on R.RecordID = FV.RecordID
Inner Join Forms F On f.FormID = R.FormID
Inner join Fields fd on fd.FieldID = fv.FieldID
Where R.RecordID IN (45,46)
I need to produce one row per RecordID, with each field as its own column, like this.
Service Name  Location      city      VendorCode  RecordID
Raj           ABC LOCATION  ABC CITY  32          45
BEN           ABC LOCATION  ABC CITY  --          46
The above is my desired output.
I've tried with pivot but have not succeeded.
If you would rather not deal with a dynamic pivot and you know the keys of the rows you want to convert into columns, you can use standard SQL with MAX and CASE WHEN:
select
max(case fd.FieldName when 'SelectService' then Data else null end) as ServiceName,
max(case fd.FieldName when 'EnterYourLocation' then Data else null end) as Location,
max(case fd.FieldName when 'City' then Data else null end) as city,
max(case fd.FieldName when 'VendorCodeOption' then Data else null end) as VendorCode,
R.RecordId
from FieldValues FV
Inner Join Records R on R.RecordID = FV.RecordID
Inner Join Forms F On f.FormID = R.FormID
Inner join Fields fd on fd.FieldID = fv.FieldID
where R.RecordID IN (45,46)
group by R.RecordId
This is the solution with a dynamic PIVOT, but the joins in the inner query still need to be adjusted to match your tables:
declare @columns varchar(max) set @columns = ''
select @columns = coalesce(@columns + '[' + cast(col as varchar(MAX)) + '],', '')
FROM ( select FieldName as col from FieldValues group by FieldName ) m
set @columns = left(@columns,LEN(@columns)-1)
DECLARE @SQLString nvarchar(max);
set @SQLString = '
select * from
( select RecordId, FieldName, Data from FieldValues) m
PIVOT
( MAX(Data)
FOR FieldName in (' + @columns + ')
) AS PVT'
EXECUTE sp_executesql @SQLString

Trying to Sum up Cross-Tab Data in SQL

I have a table where every ID has one or more places, and each place comes with a count. Places can be repeated within IDs. It is stored in rows like so:
ID ColumnName DataValue
1 place1 ABC
1 count1 5
2 place1 BEC
2 count1 12
2 place2 CDE
2 count2 6
2 place3 BEC
2 count3 9
3 place1 BBC
3 count1 5
3 place2 BBC
3 count2 4
Ultimately, I want a table where every possible place name is its own column, and the count per place per ID is summed up, like so:
ID ABC BEC CDE BBC
1 5 0 0 0
2 0 21 6 0
3 0 0 0 9
I don't know the best way to go about this. There are around 50 different possible place names, so specifically listing them out in a query isn't ideal. I know I ultimately have to pivot the data, but I don't know if I should do it before or after I sum up the counts. And whether it's before or after, I haven't been able to figure out how to go about summing it up.
Any ideas/help would be greatly appreciated. At this point, I'm having a hard time finding where to even start. I've seen a few posts with similar problems, but nothing quite as convoluted as this.
EDIT:
Right now I'm working with this to pivot the table, but this leaves me with columns named place1, place2, .... count1, count2,...
and I don't know how to appropriately sum up the counts and make new columns with the place names.
DECLARE @cols NVARCHAR(MAX), @query NVARCHAR(MAX);
SET @cols = STUFF(
(
SELECT DISTINCT
','+QUOTENAME(c.[ColumnName])
FROM #temp c FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)'), 1, 1, '');
SET @query = 'SELECT [ID], '+@cols+' from (SELECT [ID],
[DataValue] AS [amount],
[ColumnName] AS [category]
FROM #temp
)x pivot (max(amount) for category in ('+@cols+')) p';
EXECUTE (@query);
Your table structure is pretty bad. You'll need to normalize your data before you can attempt to pivot it. Try this:
;WITH IDs AS
(
SELECT DISTINCT
id
,ColId = RIGHT(ColumnName, LEN(ColumnName) - 5)
,Place = datavalue
FROM #temp
WHERE ISNUMERIC(datavalue) = 0
)
,Counts AS
(
SELECT DISTINCT
id
,ColId = RIGHT(ColumnName, LEN(ColumnName) - 5)
,Cnt = CAST(datavalue AS INT)
FROM #temp
WHERE ISNUMERIC(datavalue) = 1
)
SELECT
piv.id
,ABC = ISNULL(piv.ABC, 0)
,BEC = ISNULL(piv.BEC, 0)
,CDE = ISNULL(piv.CDE, 0)
,BBC = ISNULL(piv.BBC, 0)
FROM (SELECT i.id, i.Place, c.Cnt FROM IDs i JOIN Counts c ON c.id = i.id AND c.ColId = i.ColId) src
PIVOT ( SUM(Cnt)
FOR Place IN ([ABC], [BEC], [CDE], [BBC])
) piv;
Doing it with dynamic SQL would yield the following:
SET @query =
';WITH IDs AS
(
SELECT DISTINCT
id
,ColId = RIGHT(ColumnName, LEN(ColumnName) - 5)
,Place = datavalue
FROM #temp
WHERE ISNUMERIC(datavalue) = 0
)
,Counts AS
(
SELECT DISTINCT
id
,ColId = RIGHT(ColumnName, LEN(ColumnName) - 5)
,Cnt = CAST(datavalue AS INT)
FROM #temp
WHERE ISNUMERIC(datavalue) = 1
)
SELECT [ID], '+@cols+'
FROM
(
SELECT i.id, i.Place, c.Cnt
FROM IDs i
JOIN Counts c ON c.id = i.id AND c.ColId = i.ColId
) src
PIVOT
(SUM(Cnt) FOR Place IN ('+@cols+')) piv;';
EXECUTE (@query);
Try this out:
SELECT id,
COALESCE(ABC, 0) AS ABC,
COALESCE(BBC, 0) AS BBC,
COALESCE(BEC, 0) AS BEC,
COALESCE(CDE, 0) AS CDE
FROM
(SELECT id,
MIN(CASE WHEN columnname LIKE 'place%' THEN datavalue END) AS col,
CAST(MIN(CASE WHEN columnname LIKE 'count%' THEN datavalue END) AS INT) AS val
FROM t
GROUP BY id, RIGHT(columnname, 1)
) src
PIVOT
(SUM(val)
FOR col in ([ABC], [BBC], [BEC], [CDE])) pvt
Tested here: http://rextester.com/XUTJ68690
In the src query, you need to re-format your data, so that you have a unique id and place in each row. From there a pivot will work.
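The reshaping step described above (one row per id and place, with its count) can be sketched on the question's sample data. This is only an illustration in Python with sqlite3 as an assumed stand-in engine. It pairs placeN with countN through a self-join on the numeric suffix, rather than the MIN(CASE ...) grouping used in the answer, and the table name `t` follows that answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, columnname TEXT, datavalue TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, "place1", "ABC"), (1, "count1", "5"),
    (2, "place1", "BEC"), (2, "count1", "12"),
    (2, "place2", "CDE"), (2, "count2", "6"),
    (2, "place3", "BEC"), (2, "count3", "9"),
    (3, "place1", "BBC"), (3, "count1", "5"),
    (3, "place2", "BBC"), (3, "count2", "4"),
])

# Match each placeN row to the countN row with the same id and suffix
# (everything after the 5-character prefix), then sum per id and place.
pairs = conn.execute("""
    SELECT p.id,
           p.datavalue AS place,
           SUM(CAST(c.datavalue AS INTEGER)) AS total
    FROM t AS p
    JOIN t AS c
      ON c.id = p.id
     AND c.columnname LIKE 'count%'
     AND substr(c.columnname, 6) = substr(p.columnname, 6)
    WHERE p.columnname LIKE 'place%'
    GROUP BY p.id, p.datavalue
    ORDER BY p.id, p.datavalue
""").fetchall()

for row in pairs:
    print(row)
```

Once the data has this (id, place, total) shape, either PIVOT or one SUM(CASE ...) column per place produces the wide table. Taking the whole suffix instead of only the last character keeps place10 and count10 paired correctly.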
If the count is always immediately after the place, the following query will generate a data set for pivoting.
The result data set before pivoting has the following columns:
id, placename, count
select placeTable.id, placeTable.datavalue, countTable.datavalue
from
(select *, row_number() over (order by id, %%physloc%%) as rownum
from test
where isnumeric(datavalue) = 1
) as countTable
join
(select *, row_number() over (order by id, %%physloc%%) as rownum
from test
where isnumeric(datavalue) <> 1
) as placeTable
on countTable.id = placeTable.id and
countTable.rownum = placeTable.rownum
Tested on SQL Fiddle (SQL Server): http://sqlfiddle.com/#!6/701c91/18
Here is another approach, using the PIVOT operator with dynamic SQL:
declare @Col varchar(2000) = '',
@Query varchar(2000) = ''
set @Col = stuff(
(select ','+QUOTENAME(DataValue)
from table where isnumeric(DataValue) = 0
group by DataValue for xml path('')),1,1,'')
set @Query = 'select id, '+@Col+' from
(
select id, DataValue,
cast((case when isnumeric(DataValue) = 1 then DataValue else lead(DataValue) over (order by id) end) as int) Value
from table
) as a
PIVOT
(
sum(Value) for DataValue in ('+@Col+')
)pvt'
EXECUTE (@Query)
Note: I used the lead() function to read the next row's value whenever the current row holds a character string, so that each place name is replaced by the numeric count that follows it.
Result :
id ABC BBC BEC CDE
1 5 NULL NULL NULL
2 NULL NULL 21 6
3 NULL 9 NULL NULL

Convert row to column when data are not numbers

I have a Question table, which has an unknown number of questions (first table in the figure).
I also have an AnswerSheet table, which records each student's answer to each question (second table in the figure).
Create table Question
(
Id int,
Text nvarchar(50),
PRIMARY KEY (Id)
)
Create table AnswerSheet
(
StudentId int,
QuestionId int,
Answer nvarchar(50),
PRIMARY KEY (StudentId,QuestionId),
FOREIGN KEY (QuestionId) REFERENCES Question (Id)
)
insert into Question
values(1,'What''s your age'),
(2,'What''s your gender'),
(3,'When do you go home'),
....
insert into AnswerSheet
values(500,1,'20'),
(500,2,'Male'),
(500,3,'5:00pm'),
(501,1,'50'),
(502,2,'I don''t know##'),
....
How do I write a SQL query to generate a table like this?
StudentId What's your age What's your gender When do you go home ...
--------- ---------------- ------------------- -------------------
500 20 Male 5:00pm ...
501 50 NULL NULL
502 NULL I don''t know## NULL ...
I feel PIVOT is promising, but I'm not sure how to use it, especially since PIVOT requires an aggregation function and my data are not numbers.
Assuming you wanted to go Dynamic
Example
Declare @SQL varchar(max) = Stuff((Select ',' + QuoteName(Text) From Question Order by ID For XML Path('')),1,1,'')
Select @SQL = '
Select *
From (
Select StudentID
,Col = B.Text
,Value = A.Answer
From AnswerSheet A
Join Question B on A.QuestionID=B.ID
) A
Pivot (max(Value) For [Col] in (' + @SQL + ') ) p'
Exec(@SQL);
Returns
StudentID What's your age What's your gender When do you go home
500 20 Male 5:00pm
501 50 NULL NULL
502 NULL I don't know## NULL
If it Helps, the Generated SQL Looks Like This
Select *
From (
Select StudentID
,Col = B.Text
,Value = A.Answer
From AnswerSheet A
Join Question B on A.QuestionID=B.ID
) A
Pivot (max(Value) For [Col] in ([What's your age],[What's your gender],[When do you go home]) ) p
I know this question already has an accepted answer, but I hope this approach helps others.
You can achieve your goal without PIVOT, simply by using GROUP BY, as follows:
Select b.StudentId,
Min(Case a.text When 'What''s your age' Then b.answer End) 'What''s your age',
Min(Case a.text When 'What''s your gender' Then b.answer End) 'What''s your gender',
Min(Case a.text When 'When do you go home' Then b.answer End) 'When do you go home'
from Question a inner join AnswerSheet b
on a.id = b.Questionid
Group By StudentId
Since you mentioned an unknown number of questions, here is the dynamic version:
DECLARE @DynamicQuestions VARCHAR(8000)
SELECT @DynamicQuestions = Stuff(
(SELECT N' Min(Case a.text When''' + replace (Text,'''','''''')
+ ''' Then b.answer End) '''
+ replace (Text,'''','''''') + ''','
FROM Question FOR XML PATH(''),TYPE)
.value('text()[1]','nvarchar(max)'),1,1,N'')
select @DynamicQuestions =
left(@DynamicQuestions,len(@DynamicQuestions)-1) -- for Removing last comma
exec ('Select b.StudentId, '+ @DynamicQuestions +
' from Question a inner join AnswerSheet b
on a.id = b.Questionid
Group By StudentId' )
Result:-
StudentId What's your age What's your gender When do you go home
500 20 Male 5:00pm
501 50 NULL NULL
502 NULL I don't know## NULL
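As a cross-check that MIN over a CASE expression works on text answers, here is a minimal sketch in Python with sqlite3 as an assumed stand-in engine. Unlike the T-SQL above, it binds the question texts as parameters instead of splicing escaped literals into the string, which is a swapped-in technique. Only the column aliases are quoted into the SQL text. The elided rows ("....") in the question are not filled in; only the rows shown there are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Question (Id INTEGER PRIMARY KEY, Text TEXT);
    CREATE TABLE AnswerSheet (
        StudentId INTEGER, QuestionId INTEGER, Answer TEXT,
        PRIMARY KEY (StudentId, QuestionId),
        FOREIGN KEY (QuestionId) REFERENCES Question (Id));
    INSERT INTO Question VALUES
        (1, 'What''s your age'),
        (2, 'What''s your gender'),
        (3, 'When do you go home');
    INSERT INTO AnswerSheet VALUES
        (500, 1, '20'), (500, 2, 'Male'), (500, 3, '5:00pm'),
        (501, 1, '50'), (502, 2, 'I don''t know##');
""")

questions = [row[0] for row in conn.execute(
    "SELECT Text FROM Question ORDER BY Id")]

# One MIN(CASE ...) column per question; the text to match is a bound
# parameter (?), and the alias is a double-quoted identifier.
select_list = ["a.StudentId AS StudentId"] + [
    'MIN(CASE q.Text WHEN ? THEN a.Answer END) AS "{}"'.format(
        text.replace('"', '""'))
    for text in questions]
sql = ("SELECT " + ", ".join(select_list) +
       " FROM AnswerSheet a JOIN Question q ON q.Id = a.QuestionId"
       " GROUP BY a.StudentId ORDER BY a.StudentId")

cursor = conn.execute(sql, questions)
headers = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(headers)
for row in rows:
    print(row)
```

MIN (or MAX) is only there because grouping requires an aggregate. Each student has at most one answer per question, so the aggregate simply picks that single non-NULL value.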

creating a pivot table using sql

I am trying to create a pivot table in sql but am having difficulties. Here is my problem: I have a column in my database called 'statusreason', and I need to provide a sum of each statusreason for the past week. My set is as follows:
I need to pivot this table so that it appears like the following:
There are a number of statusreasons that are not represented in the above table, since they did not occur in the past week.
The query used to generate the result set is:
select inv.statusreason
, count(inv.statusreason) as 'StatusCount'
from invoicetbl inv (nolock)
inner join trucktbl tru (nolock) on inv.tru_key = tru.tru_key
where inv.client_key = 123
and inv.createdate > getdate() - 7
group by inv.statusreason
If this isn't enough information, please advise what I could add to improve the question.
Thank you for any assistance you can provide.
Since you want to convert your rows of data into columns, you need to PIVOT the data. This can be done a number of ways.
If you have a limited number of values that you are going to be returning, then you can use an aggregate function with a CASE expression:
select
count(case when statusreason = 181 then 1 end) [181],
count(case when statusreason = 20 then 1 end) [20],
count(case when statusreason = 212 then 1 end) [212],
count(case when statusreason = 232 then 1 end) [232]
from
(
select inv.statusreason
from invoicetbl inv (nolock)
inner join trucktbl tru (nolock)
on inv.tru_key = tru.tru_key
where inv.client_key = 123
and inv.createdate > getdate() - 7
) d;
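The COUNT(CASE ...) form relies on COUNT skipping NULLs: a CASE with no ELSE yields NULL for non-matching rows, so a status that never occurs still gets a column with 0. A minimal sketch of that behaviour, in Python with sqlite3 as an assumed stand-in engine and a few made-up statusreason rows (the real invoicetbl data is not shown in the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoicetbl (statusreason INTEGER)")
# Hypothetical sample rows; status 232 deliberately never occurs.
conn.executemany("INSERT INTO invoicetbl VALUES (?)",
                 [(181,), (181,), (20,), (212,)])

# CASE without ELSE returns NULL for non-matching rows, and COUNT ignores NULLs.
counts = conn.execute("""
    SELECT COUNT(CASE WHEN statusreason = 181 THEN 1 END) AS "181",
           COUNT(CASE WHEN statusreason = 20  THEN 1 END) AS "20",
           COUNT(CASE WHEN statusreason = 212 THEN 1 END) AS "212",
           COUNT(CASE WHEN statusreason = 232 THEN 1 END) AS "232"
    FROM invoicetbl
""").fetchone()

print(counts)
```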
Or you can use the PIVOT function:
select [181], [20], [212], [232]
from
(
select inv.statusreason
from invoicetbl inv (nolock)
inner join trucktbl tru (nolock)
on inv.tru_key = tru.tru_key
where inv.client_key = 123
and inv.createdate > getdate() - 7
) d
pivot
(
count(statusreason)
for statusreason in ([181], [20], [212], [232])
) p;
If you have an unknown number of values that will be returned, then you will want to look at using dynamic SQL. This creates a sql string that will then be executed.
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(statusreasons )
from statusreasontbl
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT ' + @cols + '
from
(
select inv.statusreason
from invoicetbl inv (nolock)
inner join trucktbl tru (nolock)
on inv.tru_key = tru.tru_key
where inv.client_key = 123
and inv.createdate > getdate() - 7
) x
pivot
(
count(statusreason)
for statusreason in (' + @cols + ')
) p '
execute sp_executesql @query;

Dynamic Multi-Column SQL

I have two tables with structures like this:
VelocityBase
Aisle | ItemId | ConfigId | InventSizeId | InventColorId | InventLocationId | DataAreaId | VelocityCategory
VelocitySalesCount
ItemId | ConfigId | InventSizeId | InventColorId | InventLocationId | DataAreaId | Sales
Every row in the Base table represents a SKU and the sum of the related SalesCount records' "Sales" fields determines the "Picks". This query works:
SELECT Aisle, COUNT(*) as '# SKUs',
SUM(Sales) as '# Picks',
SUM(CASE WHEN VelocityCategory = 'Hot' THEN 1 ELSE 0 END) as 'Hot SKUs',
SUM(CASE WHEN VelocityCategory = 'Hot' THEN SALES ELSE 0 END) as 'Hot Picks',
SUM(CASE WHEN VelocityCategory = 'Warm' THEN 1 ELSE 0 END) as 'Warm SKUs',
SUM(CASE WHEN VelocityCategory = 'Warm' THEN SALES ELSE 0 END) as 'Warm Picks',
SUM(CASE WHEN VelocityCategory = 'Cold' THEN 1 ELSE 0 END) as 'Cold SKUs',
SUM(CASE WHEN VelocityCategory = 'Cold' THEN SALES ELSE 0 END) as 'Cold Picks'
FROM [dbo].[VelocityBase] Base
LEFT OUTER JOIN [dbo].[VelocitySalesCount] SalesCount
ON Base.ItemId = SalesCount.ItemId
AND Base.ConfigId = SalesCount.ConfigId
AND Base.InventSizeId = SalesCount.InventSizeId
AND Base.InventColorId = SalesCount.InventColorId
AND Base.InventLocationId = SalesCount.InventLocationId
AND SalesCount.DataAreaId = Base.DataAreaId
GROUP BY Aisle
ORDER BY Aisle
However, the columns are hard coded. What I would like is that the "Hot", "Warm", "Cold", etc be generated based on what values are present in the database for this column. That way if a user added a row that had "Lukewarm" as the VelocityCategory, two new columns would appear with that data.
I'm not sure if something like SQL to generate SQL or maybe a PIVOT function would do the trick.
Thanks in advance!
EDIT:
I'm narrowing in. I've got the Sum of the Sales figures using this:
DECLARE @SQLStatement NVARCHAR(4000)
,@PivotValues NVARCHAR(4000);
SET @PivotValues = '';
SELECT @PivotValues = @PivotValues + ',' + QUOTENAME(VelocityCategory)
FROM
(
SELECT DISTINCT VelocityCategory
FROM dbo.VelocityBase
) src;
SET @PivotValues = SUBSTRING(@PivotValues,2,4000);
SELECT @SQLStatement =
'SELECT pvt.*
FROM
(
SELECT Aisle, VelocityCategory, Sales
FROM VelocityBase Base
LEFT OUTER JOIN [dbo].[VelocitySalesCount] SalesCount
ON Base.ItemId = SalesCount.ItemId
AND Base.ConfigId = SalesCount.ConfigId
AND Base.InventSizeId = SalesCount.InventSizeId
AND Base.InventColorId = SalesCount.InventColorId
AND Base.InventLocationId = SalesCount.InventLocationId
AND SalesCount.DataAreaId = Base.DataAreaId
) VelocityBase
PIVOT ( Sum(Sales) FOR VelocityCategory IN ('+@PivotValues+') ) pvt';
EXECUTE sp_executesql @SQLStatement;
Thanks for the link to the previous question which got me this far.
I usually do not use PIVOT, just "usual" dynamic SQL like this:
DECLARE @sSQL NVARCHAR(MAX)= '' ,
@sSQLSum NVARCHAR(MAX)= '' ,
@sSQlBegin NVARCHAR(MAX)= '
SELECT Aisle, COUNT(*) As ''# SKUs'',
SUM(Sales) As ''# Picks'',
' ,
@sSQLEnd NVARCHAR(MAX)= 'FROM [Dbo].[VelocityBase] Base
LEFT OUTER JOIN [Dbo].[VelocitySalesCount] SalesCount
ON Base.ItemId = SalesCount.ItemId
AND Base.ConfigId = SalesCount.ConfigId
AND Base.InventSizeId = SalesCount.InventSizeId
AND Base.InventColorId = SalesCount.InventColorId
AND Base.InventLocationId = SalesCount.InventLocationId
AND SalesCount.DataAreaId = Base.DataAreaId
GROUP BY Aisle
ORDER BY Aisle' ;
WITH c AS ( SELECT DISTINCT
VelocityCategory N
FROM Dbo.VelocityBase
)
-- The generated SQL compares the real VelocityCategory column;
-- the alias c exists only in this CTE, not in the final query.
SELECT @sSQLSum = @sSQLSum + 'SUM(CASE WHEN VelocityCategory=''' + c.N
+ ''' THEN 1 ELSE 0 END ) AS ''' + c.N + ' SKUs'',' + CHAR(13)
+ 'SUM(CASE WHEN VelocityCategory=''' + c.N
+ ''' THEN SALES ELSE 0 END ) AS ''' + c.N + ' Sales'',' + CHAR(13)
FROM c
IF(LEN(@sSQLSum))>0
SET @sSQLSum = LEFT(@sSQLSum, ( LEN(@sSQLsum) - 2 ))
SET @sSQL = @sSQlBegin + @sSQLSum + CHAR(13) + @sSQLEnd
EXEC (@sSQL)
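The two-columns-per-category generation can also be sketched outside T-SQL. The example below is only an illustration in Python with sqlite3 as an assumed stand-in engine. The schema is deliberately simplified to a single ItemId join key instead of the six-column key in the question, the sample rows are made up, and the hypothetical `ident` helper is not part of any library.

```python
import sqlite3

def ident(name):
    """Quote a column alias, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema: one join key instead of six.
    CREATE TABLE VelocityBase (Aisle TEXT, ItemId TEXT, VelocityCategory TEXT);
    CREATE TABLE VelocitySalesCount (ItemId TEXT, Sales INTEGER);
    INSERT INTO VelocityBase VALUES
        ('A1', 'sku1', 'Hot'), ('A1', 'sku2', 'Cold'), ('A2', 'sku3', 'Lukewarm');
    INSERT INTO VelocitySalesCount VALUES ('sku1', 10), ('sku2', 1), ('sku3', 4);
""")

categories = [row[0] for row in conn.execute(
    "SELECT DISTINCT VelocityCategory FROM VelocityBase ORDER BY VelocityCategory")]

# Fixed columns first, then a SKU-count column and a picks column per category.
select_list = ["b.Aisle AS Aisle",
               'COUNT(*) AS "# SKUs"',
               'SUM(s.Sales) AS "# Picks"']
params = []
for cat in categories:
    select_list.append(
        "SUM(CASE WHEN b.VelocityCategory = ? THEN 1 ELSE 0 END) AS "
        + ident(cat + " SKUs"))
    select_list.append(
        "SUM(CASE WHEN b.VelocityCategory = ? THEN s.Sales ELSE 0 END) AS "
        + ident(cat + " Picks"))
    params.extend([cat, cat])  # one bound value per generated column

sql = ("SELECT " + ",\n       ".join(select_list) + "\n"
       "FROM VelocityBase b\n"
       "LEFT JOIN VelocitySalesCount s ON s.ItemId = b.ItemId\n"
       "GROUP BY b.Aisle\n"
       "ORDER BY b.Aisle")

cursor = conn.execute(sql, params)
headers = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(headers)
for row in rows:
    print(row)
```

With the made-up "Lukewarm" row present, two extra columns appear without any change to the query-building code, which is the behaviour the question asks for.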
Unless you generate the query dynamically, I don't think there's a way to generate what you want.
Your problem could be solved easily if your tables were normalized. For instance, the VelocityBase table should have a VelocityCategoryID column instead of a VelocityCategory column. This new column should be a foreign key to a new table called VelocityCategory (or something like that) then your query for this calculation becomes almost trivial.