Create table of sum of events in SQL table - sql

I'm trying to pivot a table from the format
| ID | access date |
--------------
| 1 | 08.10|
| 1 | 08.10|
| 4 | 08.10|
| 2 | 02.09|
To
|ID | 02.09 | 03.09 | 04.09 | ....
| 1 | 4 | 0 | 2 |
| 2 | 1 | 2 | 5 |
| 3 |
.
.
.
I've tried using the PIVOT function but since I have a lot of different dates I don't want to type out the query
SELECT *
FROM (
SELECT [Sequence of events] as ID
,[Submission Date] as access_date
FROM [database_name].[dbo].[Event Logging]
) AS SOURCE_TABLE
PIVOT( SUM(ID) for access_date IN ("08.01", "09.01", "10.01"....)
) as pvt_table
I'm very new to SQL so I'd appreciate some insight into how to solve this problem.

This answer does not solve the problem your way; it suggests a different design.
I would create two tables. The first, DATE_DB, stores DATEID and DATE and looks like this:
| DATEID | DATE |
| 1 | 01.01|
| 2 | 02.02|
....
Then in second table I store data like this:
| ID | DATEID | VALUE |
| 1 | 2 | 10 |
| 2 | 2 | 3 |
| 3 | 3 | 4 |
| 4 | 2 | 5 |
In the second table, the ID column serves only as the primary key and carries no other meaning. With the tables laid out like this, you can JOIN them:
SELECT DATE_DB.DATE, SECONDTABLE.VALUE
FROM SECONDTABLE
LEFT JOIN DATE_DB ON SECONDTABLE.DATEID = DATE_DB.DATEID
ORDER BY DATE_DB.DATE
which returns a result like this:
| DATE | VALUE |
| 02.01 | 10 |
| 02.01 | 3 |
| 02.01 | 5 |
| 03.01 | 4 |

Try it like this; you need dynamic SQL. Note that the script is untested. Also, when naming columns, avoid spaces: use either CamelCase or underscores to separate words.
Finally, this is for SQL Server, since you didn't tag a DBMS and your code looks like SQL Server.
declare @cols nvarchar(max)
select @cols = STUFF((SELECT DISTINCT ',' + QUOTENAME([Submission Date])
from [database_name].[dbo].[Event Logging]
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
declare @sql nvarchar(max);
set @sql = '
SELECT *
FROM (
SELECT [Sequence of events] as ID
,[Submission Date] as access_date
FROM [database_name].[dbo].[Event Logging]
) AS SOURCE_TABLE
PIVOT( SUM(ID) for access_date IN (' + @cols + ')
) as pvt_table';
-- print (@sql)
execute (@sql)
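For anyone who wants to try the dynamic-column idea without a SQL Server instance, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not the T-SQL above: SQLite has no PIVOT, so conditional aggregation (SUM(CASE ...)) stands in for it. The table name event_log, its columns, and the sample rows are assumptions made for illustration. It counts accesses per ID and date, which is what the expected output in the question seems to describe.

```python
import sqlite3

def pivot_counts(conn):
    """Return (header, rows): one row per id, one count column per distinct date."""
    dates = [r[0] for r in conn.execute(
        "SELECT DISTINCT access_date FROM event_log ORDER BY access_date")]
    # Column names cannot be bound as parameters, so they are quoted by hand
    # (embedded double quotes are doubled), similar in spirit to QUOTENAME.
    # The date values used in the comparisons are passed as bound parameters.
    cols = "".join(
        ', SUM(CASE WHEN access_date = ? THEN 1 ELSE 0 END) AS "{}"'.format(
            d.replace('"', '""'))
        for d in dates)
    sql = "SELECT id{} FROM event_log GROUP BY id ORDER BY id".format(cols)
    rows = conn.execute(sql, dates).fetchall()
    return ["id"] + dates, rows

# Hypothetical sample data taken from the table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (id INTEGER, access_date TEXT)")
conn.executemany("INSERT INTO event_log VALUES (?, ?)",
                 [(1, "08.10"), (1, "08.10"), (4, "08.10"), (2, "02.09")])

header, rows = pivot_counts(conn)
print(header)
for row in rows:
    print(row)
```

The list of dates is read first and the statement is assembled afterwards, which is the same two-step shape as the @cols / @sql script: discover the column list, then build and run the query.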

Related

How to pivot sql data, and squash results into non-null rows by Date per ID

Not a good title to the post, but hopefully it'll catch some eyes.
I have a very complex situation in T-SQL that I am unable to accomplish. I'm hoping someone with expertise knows an elegant and fast solution so that my performance is not impacted. I'm dealing with billions of rows.
PREFACE
I have a table called Customers with a unique ID. Those customers have Files, Files have Properties, and each Property Name corresponds to a single Value.
Tables:
Customers
Files
Property (contains both Name and Value)
The Customer ID is present in all of these tables, as are audit fields such as UpdatedDtm and CreationDtm.
USE CASE
I need to join all customers to their files (filtering for a few) and then tie every file to their properties (again filtering these). This is easy but results in lots of rows, one for each customer x file x property.
I know that the property names will never change, and I want to return just a select few, so I used a pivot, which produced a nice table, but it fell apart once I started doing more complex queries.
THE PROBLEM
First, the properties have a DateTime for when they were altered (UpdatedDtm), and I need to return everything altered within 1 hour of the creation date (CreationDtm) in the File table.
This trims down my list of potential properties, but now I have a table with a ROW_NUMBER() per ID and no good way to pivot and select the first value that isn't null while still preserving the number of columns for the table definition. This is important because I'm using dynamic SQL and placing the result in an indexed temp table with a composite key on CustomerID and FileName.
BEFORE PIVOT
| UpdatedDtm | CustomerID | FileName | Property | Value |
| ---------- | ---------- | ---------- | -------- | -------------- |
| 1/1/2015 | 1 | FileOne | Size | NULL |
| 1/1/2015 | 1 | FileOne | Format | JPG |
| 1/7/2015 | 1 | FileOne | Size | 88KB |
| 1/7/2015 | 1 | FileOne | Format | JPG |
| 1/7/2015 | 1 | FileOne | Comment | NULL |
| 1/11/2015 | 1 | FileOne | Comment | NULL |
| 1/1/2015 | 1 | FileTwo | Size | 91KB |
| 1/1/2015 | 1 | FileTwo | Format | PNG |
| 1/11/2015 | 1 | FileTwo | Comment | NULL |
| 1/2/2015 | 2 | FileThree | Size | 74KB |
| 1/2/2015 | 2 | FileThree | Format | XLS |
| 1/2/2015 | 2 | FileThree | State | Open |
| 1/7/2015 | 2 | FileThree | State | Closed |
| 1/10/2015 | 2 | FileThree | Comment | NULL |
| 1/1/2015 | 3 | FileFour | Size | 2KB |
| 1/2/2015 | 3 | FileFour | Size | 10KB |
| 1/3/2015 | 3 | FileFour | Size | 13KB |
| 1/4/2015 | 3 | FileFour | Size | 21KB |
| 1/5/2015 | 3 | FileFour | Size | 27KB |
| 1/6/2015 | 3 | FileFour | Size | 32KB |
| 1/7/2015 | 3 | FileFour | Size | 39KB |
| 1/8/2015 | 3 | FileFour | Size | 44KB |
| 1/1/2015 | 3 | FileFour | Format | TXT |
| 1/1/2015 | 3 | FileFour | Comment | NULL |
Please don't ask me why the database is setup this way or to change the schema. That is set in stone and out of my control. I need to be able to solve the use case as described.
AFTER PIVOT (Expectation)
| CustomerID | FileName | Size | Format | State | Comment |
| ---------- | ---------- | ---- | ------ | ------ | ------- |
| 1 | FileOne | 88KB | JPG | NULL | NULL |
| 1 | FileTwo | 91KB | PNG | NULL | NULL |
| 2 | FileThree | 74KB | XLS | Closed | NULL |
| 3 | FileFour | 44KB | TXT | NULL | NULL |
I have included some NULL values and missing values to show that I need to preserve the same columns regardless of whether they have data, but I also need to squash the data down to the first non-null value within my date range.
CODE (My attempt)
IF Object_id('tempdb..#FilesQuery') IS NOT NULL DROP TABLE #FilesQuery;
CREATE TABLE #FilesQuery (
SeqNum int,
CustomerID numeric(16,0),
FileName varchar(64),
PropertyName varchar(64),
PropertyValue varchar(64)
)
INSERT INTO #FilesQuery
SELECT
CASE WHEN P.[Value] IS NOT NULL
THEN ROW_NUMBER() OVER (partition by C.CustomerID order by UpdatedDtm)
ELSE 0
END as SeqNum,
C.CustomerID
,F.Name as FileName
,P.Name as PropertyName
,P.Value as PropertyValue
FROM Customers C
INNER JOIN Files F ON F.CustomerID = C.CustomerID
LEFT JOIN Properties P
ON P.CustomerID = C.CustomerID
AND P.FileID = F.FileID
WHERE F.FileName IN ('FileOne','FileTwo','FileThree','FileFour')
AND P.Name IN ('Size','Format','State','Comment')
--PIVOT
DECLARE @cols AS nvarchar(MAX)
SELECT @cols = STUFF(
(SELECT DISTINCT ',' + QUOTENAME(PropertyName)
FROM #FilesQuery fq
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'')
DECLARE @dynSql AS nvarchar(MAX)
SET @dynSql = '
SELECT DISTINCT *
FROM (
SELECT
fq.CustomerID,
fq.FileName,
fq.PropertyName,
fq.PropertyValue
FROM #FilesQuery fq
) SRC
PIVOT (
Max([PropertyValue])
FOR PropertyName IN (' + @cols + ')
) PVT
'
IF Object_id('tempdb..#Results') IS NOT NULL DROP TABLE #Results;
CREATE TABLE #Results (
CustomerID varchar(16) NOT NULL,
FileName varchar(64) NOT NULL,
FileSize varchar(64) NULL,
FileFormat varchar(64) NULL,
FileState varchar(64) NULL,
FileComment varchar(64) NULL,
CONSTRAINT pk_CustDoc PRIMARY KEY (CustomerID,FileName)
)
-- EXEC needs parentheses to run a string variable rather than a procedure name
INSERT INTO #Results EXEC (@dynSql);
I'm sorry this code isn't complete, it is the working section I have. The other tries I made resulted in bad data pulls.
I tried using SeqNum and a combination of case statements to try and select the first non-null value for each row so that the data was all on one line, but it ended up being more like.
FileOne NULL NULL Open NULL
FileOne NULL JPG NULL NULL
and so on...
I've been struggling to solve this special case for a while and am about to scrap it and do something procedural with looping, but that would kill my query time and performance.
Anyone have a good solution? Am I over-thinking things?
You should filter your data before you PIVOT, and you will get your desired results. Here is a CTE version that shows the steps:
;WITH cteDefineRowPrecedence AS (
SELECT *
,ROW_NUMBER() OVER (PARTITION BY CustomerId, FileName, Property ORDER BY
CASE WHEN Value IS NOT NULL THEN 0 ELSE 1 END
,UpdatedDtm DESC) as RowNum
FROM
#Table
)
, cteDesiredRows AS (
SELECT
CustomerId
,FileName
,Property
,Value
FROM
cteDefineRowPrecedence t
WHERE
t.RowNum = 1
AND t.Value IS NOT NULL
)
SELECT *
FROM
cteDesiredRows t
PIVOT (
MAX(Value)
FOR Property IN (Size,[Format],[State],Comment)
) p
ORDER BY
CustomerId
,FileName
And here is a nested query version that will make it easier to embed/put in your dynamic sql....
SELECT *
FROM
(
SELECT CustomerId, FileName, Property, Value
FROM
(SELECT *
,ROW_NUMBER() OVER (PARTITION BY CustomerId, FileName, Property ORDER BY
CASE WHEN Value IS NOT NULL THEN 0 ELSE 1 END
,UpdatedDtm DESC) as RowNum
FROM
#Table) r
WHERE
r.RowNum = 1
AND r.Value IS NOT NULL
) t
PIVOT (
MAX(Value)
FOR Property IN (Size,[Format],[State],Comment)
) p
ORDER BY
CustomerId
,FileName
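The selection rule used by both queries (ignore NULL values, keep the newest remaining row per customer, file and property, then pivot) can be checked on a few of the sample rows. This is a minimal Python sketch of that logic, not a replacement for the T-SQL. The function name latest_non_null_pivot is hypothetical, and the dates are written in ISO form (an assumption) so that they compare correctly as text.

```python
PROPERTIES = ["Size", "Format", "State", "Comment"]  # fixed column list, as in the PIVOT

def latest_non_null_pivot(rows):
    """rows: (updated, customer_id, file_name, prop, value) tuples.
    Returns {(customer_id, file_name): {prop: newest non-null value or None}}."""
    best = {}  # (customer, file, prop) -> (updated, value) of the newest non-null row
    for updated, cust, fname, prop, value in rows:
        if value is None:
            continue  # same effect as the "Value IS NOT NULL" filter
        key = (cust, fname, prop)
        if key not in best or updated > best[key][0]:
            best[key] = (updated, value)
    pivoted = {}
    for (cust, fname, prop), (_, value) in best.items():
        # Every output row starts with all properties set to None,
        # so the column set stays the same whether or not data exists.
        row = pivoted.setdefault((cust, fname), dict.fromkeys(PROPERTIES))
        if prop in row:
            row[prop] = value
    return pivoted

# A subset of the BEFORE PIVOT rows from the question.
sample = [
    ("2015-01-01", 1, "FileOne", "Size", None),
    ("2015-01-01", 1, "FileOne", "Format", "JPG"),
    ("2015-01-07", 1, "FileOne", "Size", "88KB"),
    ("2015-01-11", 1, "FileOne", "Comment", None),
    ("2015-01-02", 2, "FileThree", "State", "Open"),
    ("2015-01-07", 2, "FileThree", "State", "Closed"),
]
pivoted = latest_non_null_pivot(sample)
for key in sorted(pivoted):
    print(key, pivoted[key])
```

As with the queries, a file whose properties are all NULL produces no output row at all; if such files must still appear, they would have to be joined back in from the Files table.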
You might need to add a WHERE condition inside the CTE definition to restrict the date/time range to what you want.
WITH CTE AS (
SELECT DISTINCT
CustomerID
, FileName
, Property
, Value
FROM
<table_name>
)
SELECT *
FROM
CTE
PIVOT (MAX(Value) FOR Property IN ([Size], [Format], [State], [Comment])) p

SQL server: Transpose Rows to Columns (n:m relationship)

After trying it myself for some hours now I need to ask for help. I only did some basic SQL until now.
I want to solve the following:
(I have translated a couple of things for you to understand the context)
I have three tables:
Workers (Mitarbeiter in German - mitID)
| mitID | Name | FamName | DOB | abtIDref |
|-------|--------|---------|------------|----------|
| 1 | Frank | Sinatra | 12.12.1915 | 1 |
| 2 | Robert | Downey | 4.4.1965 | 2 |
INFO: abtIDref is a 1:n relation to the workplace, but it is not involved here
Skills (Faehigkeiten in German - faeID)
| faeID | Descr | time | cost |
|-------|-------|------|------|
| 1 | HV | 2 | 0 |
| 2 | PEV | 1 | 0 |
| 3 | Drive | 8 | 250 |
| 4 | Nex | 20 | 1200 |
Link-List
| linkID | mitIDref | feaIDref | when |
|--------|----------|----------|------------|
| 1 | 2 | 1 | 27.07.2014 |
| 2 | 2 | 2 | 01.01.2016 |
| 3 | 2 | 3 | 20.01.2016 |
| 4 | 1 | 3 | 05.06.2015 |
| 5 | 1 | 4 | 02.11.2015 |
The desired result is:
| mitID | Name | FamName | DOB | abtIDref | HV | PEV | Drive | Nex |
|-------|--------|---------|------------|----------|-----------|------------|------------|------------|
| 1 | Frank | Sinatra | 12.12.1915 | 1 | | | 05.06.2015 | 02.11.2015 |
| 2 | Robert | Downey | 4.4.1965 | 2 | 27.07.2014 | 01.01.2016 | 20.01.2016 | |
Alternatively, it could be:
| mitID | Name | FamName | DOB | abtIDref | HV | PEV | Drive | Nex |
|-------|--------|---------|------------|----------|----|-----|-------|-----|
| 1 | Frank | Sinatra | 12.12.1915 | 1 | | | x | x |
| 2 | Robert | Downey | 4.4.1965 | 2 | x | x | x | |
The goal is that users/admins can add new skills, and anyone can see from this result list whether a person has a given skill.
What I tried:
I've come across multiple examples of dynamic SQL and the PIVOT function, but I don't know how to use them in my case, because I don't need an aggregate function like AVG() or MIN().
I tried it like this:
DECLARE @columns AS VARCHAR(MAX);
DECLARE @sql AS VARCHAR(MAX);
select @columns = substring((Select DISTINCT ',' + QUOTENAME(faeID) FROM mdb_Fähigkeiten FOR XML PATH ('')),2, 1000);
SELECT @sql = 'SELECT * FROM mdb_Mitarbeiter
PIVOT
(
MAX(Value)
FOR mitID IN( ' + @columns + ' )
);';
execute(@sql);
And a second approach was:
declare @collist nvarchar(max)
SET @collist = stuff((select distinct ',' + QUOTENAME(Question)
FROM #t1 -- your table here
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,1,'')
select @collist
declare @q nvarchar(max)
set @q = '
select *
from (
select
Vorname, Bezeichnung, faeIDref
from (
select #t1.*, #t2.Answer, #t2.parent
from #t1
inner join #t2 on #t1.QID = #t2.QID
) as x
) as source
pivot (
max(Answer)
for Question in (' + @collist + ')
) as pvt
'
exec (@q)
But to be honest, I don't understand the functions I found.
I hope you can give me some guidance on what I have to change to achieve this (or whether I can achieve it at all).
I believe the query below is what you are looking for. Adjust the column and table names as needed to fit your database.
DECLARE @sql AS NVARCHAR(MAX)
DECLARE @cols AS NVARCHAR(MAX)
SELECT @cols= ISNULL(@cols + ',','') + QUOTENAME(Descr)
FROM Faehigkeiten ORDER BY faeID
SET @sql = N'
SELECT mitID, Name, FamName, DOB, abtIDref, ' + @cols + '
FROM (
SELECT mitID, Name, FamName, DOB, abtIDref, [when], descr
FROM Mitarbeiter m
JOIN [Link-List] l ON m.mitID = l.mitIDref
JOIN Faehigkeiten f ON f.faeID = l.feaIDref
) a
PIVOT(MAX([when]) FOR descr IN (' + @cols + ')) p'
EXEC sp_executesql @sql

Creating New Table by select query with sequential column names

I have a table that contains column names like this;
+--------------------------------------------------------------------+-----------+
| BankTable | |
+--------------------------------------------------------------------+-----------+
| Id | BANK1 | BANK2 | BRANCH1 | BRANCH2 | IBAN1 | IBAN2 |
+----+-----------+-----------+-------------+-------------+-----------+-----------+
| 1 | BANK1_ID1 | BANK2_ID1 | BRANCH1_ID1 | BRANCH2_ID1 | IBAN1_ID1 | IBAN2_ID1 |
+----+-----------+-----------+-------------+-------------+-----------+-----------+
| 2 | BANK1_ID2 | BANK2_ID2 | BRANCH1_ID2 | BRANCH1_ID2 | IBAN1_ID2 | IBAN2_ID2 |
+----+-----------+-----------+-------------+-------------+-----------+-----------+
How can i write a query that returns the result like this;
+------------------------------------------+
| BANK |
+------------------------------------------+
| ID | BANK | BRANCH | IBAN |
+----+-----------+-------------+-----------+
| 1 | BANK1_ID1 | BRANCH1_ID1 | IBAN1_ID1 |
+----+-----------+-------------+-----------+
| 2 | BANK2_ID2 | BRANCH1_ID2 | IBAN2_ID2 |
+----+-----------+-------------+-----------+
P.S. I am selecting by the Id column, so the query result contains one row every time.
Any help appreciated.
SOLUTION
I don't know if it's a good approach, but I solved this based on @Giorgos Betsos's answer. Here is how I fixed the problem.
SELECT BANK, BRANCH, IBAN
FROM (
SELECT BANK1, BANK2, BRANCH1, BRANCH2, IBAN1, IBAN2
FROM BankTable
WHERE ID = your_id_here
) AS src
UNPIVOT (
BANK FOR Col IN(BANK1, BANK2)
) AS unpvt1
UNPIVOT (
BRANCH FOR Col1 IN(BRANCH1, BRANCH2)
) AS unpvt2
UNPIVOT (
IBAN FOR Col2 IN(IBAN1, IBAN2)
) AS unpvt3
WHERE RIGHT(Col, 1) = RIGHT(Col1, 1)
AND RIGHT(Col, 1) = RIGHT(Col2, 1)
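The pairing condition in this solution (the RIGHT(Col, 1) comparisons keep only BANK, BRANCH and IBAN values that share the same numeric suffix) can be illustrated without a database. Below is a minimal Python sketch, assuming one row is available as a dict and that the column names end in a numeric suffix; unpivot_groups is a hypothetical helper name, not part of any library.

```python
import re

def unpivot_groups(row, bases=("BANK", "BRANCH", "IBAN")):
    """Turn BANK1, BRANCH1, IBAN1, BANK2, ... into one dict per numeric suffix."""
    suffixes = set()
    for name in row:
        # Split a column name such as "BRANCH2" into base "BRANCH" and suffix "2".
        m = re.fullmatch(r"([A-Z]+)(\d+)", name)
        if m and m.group(1) in bases:
            suffixes.add(int(m.group(2)))
    # Columns missing for a given suffix come back as None.
    return [{base: row.get("{}{}".format(base, n)) for base in bases}
            for n in sorted(suffixes)]

# The Id = 1 row from the BankTable example in the question.
row = {"Id": 1,
       "BANK1": "BANK1_ID1", "BANK2": "BANK2_ID1",
       "BRANCH1": "BRANCH1_ID1", "BRANCH2": "BRANCH2_ID1",
       "IBAN1": "IBAN1_ID1", "IBAN2": "IBAN2_ID1"}
groups = unpivot_groups(row)
for g in groups:
    print(g)
```

One design difference worth noting: the sketch compares the whole numeric suffix, whereas RIGHT(Col, 1) looks only at the last character, so the SQL version would match BANK10 with BRANCH20 if the table ever grew past nine column groups.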
You can use UNPIVOT for this:
SELECT Bank
FROM (
SELECT Id, BANK1, BANK2, BANK3, BANK4, BANK5
FROM BankTable
WHERE id = 1) AS src
UNPIVOT (
Bank FOR Col IN([BANK1], [BANK2], [BANK3], [BANK4], [BANK5])) AS unpvt
If there are more columns, you can use a dynamic SQL query as below:
Query
declare @sql as varchar(max);
select @sql =stuff(
(select 'union all select [' + column_name + '] as Bank
from BankTable where Id = 1 '
from information_schema.columns
where table_name = 'BankTable'
and column_name like 'BANK[0-9]%'
for xml path('')), 1, 9, '');
execute(@sql);
Result
+-----------+
| Bank |
+-----------+
| BANK1_ID1 |
| BANK2_ID1 |
| BANK3_ID1 |
| BANK4_ID1 |
| BANK5_ID1 |
+-----------+

Get rows of table as columns in second table

I apologize in advance if this question has been asked before.
I have two tables. Table one has three columns (CustomerID, SequenceNum, Value); table two has a large number of columns. I would like to fill in the columns of table two with the values of table one by column, not by row.
An example:
------------------------------------
| CustomerID | SequenceNum | Value |
------------------------------------
| 1 | 1 | A |
------------------------------------
| 1 | 2 | B |
------------------------------------
| 1 | 3 | C |
------------------------------------
| 2 | 1 | Q |
------------------------------------
| 2 | 2 | R |
------------------------------------
| 3 | 1 | X |
------------------------------------
becomes
---------------------------------------------------------------------
| CustomerID | PrimaryVal | OtherVal1 | OtherVal2 | OtherVal3 | ... |
---------------------------------------------------------------------
| 1 | A | B | C | NULL | ... |
---------------------------------------------------------------------
| 2 | Q | R | NULL | NULL | ... |
----------------------------------------------------------------------
| 3 | X | NULL | NULL | NULL | ... |
---------------------------------------------------------------------
In essence, each unique CustomerID in table one will have a single row in table two. Each SequenceNum of a particular CustomerID will fill in a column in table two under PrimaryVal, OtherVal1, OtherVal2, etc. A row with a SequenceNum equal to 1 will fill the PrimaryVal field, and 2-18 (the maximum sequence length is 18) will fill OtherVal#.
The main problem I see is the variable amount of values in a sequence. Some sequences may only contain 1 row, some will fill up all 18 spots, and anything in between.
Any advice on how to solve this problem would be greatly appreciated. Thank you.
Given that you know there are at most 18 columns, I would take the normal pivot route.
select *
from Customer
pivot (max(Value) for SequenceNum in
([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18])
) as Pivoted
I've been lazy here and not aliased the columns but you can if you need to.
This can be done with a dynamic pivot. The first STUFF select (or any other GROUP_CONCAT hack) determines the columns needed, based on the values of SequenceNum; the dynamic pivot then assigns the values to these columns.
You'll need to choose an aggregate for the pivot (I've used MIN), although if there are no duplicate (CustomerID, SequenceNum) tuples, the choice is fairly arbitrary:
DECLARE
@cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX);
SET @cols = STUFF((SELECT distinct ',' + QUOTENAME(SequenceNum)
FROM Table1
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'');
set @query = N'SELECT CustomerID, ' + @cols + N' from
Table1
pivot
(
min(Value)
for SequenceNum in (' + @cols + N')
) p ';
execute(@query);
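Because the target table has a fixed shape (PrimaryVal plus OtherVal1 through OtherVal17), the column names can also be derived from the sequence number instead of being discovered from the data. Here is a minimal Python sketch of that mapping, assuming SequenceNum runs from 1 to 18 with no duplicates per customer; the helper names column_name and pivot_sequences are hypothetical.

```python
MAX_SEQ = 18  # maximum sequence length stated in the question

def column_name(seq):
    """SequenceNum 1 -> PrimaryVal, 2..18 -> OtherVal1..OtherVal17."""
    return "PrimaryVal" if seq == 1 else "OtherVal{}".format(seq - 1)

def pivot_sequences(rows):
    """rows: (customer_id, sequence_num, value). Unfilled positions stay None."""
    table = {}
    for cust, seq, value in rows:
        if not 1 <= seq <= MAX_SEQ:
            raise ValueError("SequenceNum out of range: {}".format(seq))
        # Each customer gets all 18 columns up front, pre-filled with None.
        row = table.setdefault(
            cust, {column_name(s): None for s in range(1, MAX_SEQ + 1)})
        row[column_name(seq)] = value
    return table

# The example rows from the question.
rows = [(1, 1, "A"), (1, 2, "B"), (1, 3, "C"),
        (2, 1, "Q"), (2, 2, "R"),
        (3, 1, "X")]
table = pivot_sequences(rows)
for cust in sorted(table):
    print(cust, table[cust]["PrimaryVal"], table[cust]["OtherVal1"], table[cust]["OtherVal2"])
```

Every customer ends up with the same 18 keys regardless of how long their sequence is, which corresponds to the NULL padding in the expected output table.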

Dynamic field content as Row Sql

I have the following dataset on a sql database
----------------------------------
| ID | NAME | AGE | STATUS |
-----------------------------------
| 1ASDF | Brenda | 21 | Single |
-----------------------------------
| 2FDSH | Ging | 24 | Married|
-----------------------------------
| 3SDFD | Judie | 18 | Widow |
-----------------------------------
| 4GWWX | Sophie | 21 | Married|
-----------------------------------
| 5JDSI | Mylene | 24 | Single |
-----------------------------------
I want to query that dataset so that I get this structure in my result:
--------------------------------------
| AGE | SINGLE | MARRIED | WIDOW |
--------------------------------------
| 21 | 1 | 1 | 0 |
--------------------------------------
| 24 | 1 | 1 | 0 |
--------------------------------------
| 18 | 0 | 0 | 1 |
--------------------------------------
And the status column can be dynamic so there will be more columns to come.
Is this possible?
Since you are using SQL Server, you can use the PIVOT table operator like this:
SELECT *
FROM
(
SELECT Age, Name, Status FROM tablename
) AS t
PIVOT
(
COUNT(Name)
FOR Status IN(Single, Married, Widow)
) AS p;
To do it dynamically you have to use dynamic sql like this:
DECLARE @cols AS NVARCHAR(MAX);
DECLARE @query AS NVARCHAR(MAX);
select @cols = STUFF((SELECT distinct ',' +
QUOTENAME(status)
FROM tablename
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
, 1, 1, '');
SELECT @query = '
SELECT *
FROM
(
SELECT Age, Name, Status FROM tablename
) AS t
PIVOT
(
COUNT(Name)
FOR Status IN( ' +@cols + ')
) AS p;';
execute(@query);