Transforming Data From 2 Tables into 1 - sql

I'm working with a database that allows the storage of "Custom Property" fields with each record in an "Item" table. This is done by having preset fields called [CustomString00] through [CustomString199], [CustomNumber00] through [CustomNumber199] and [CustomDate00] through [CustomDate199] in the Item table. There is another table called the "CustomProperty" table that assigns the name to each custom field and the column to use in the Item table. Here is how it looks.
Item:
| Id | CustomString00| ... | CustomString199 | CustomNumber00 | ... | CustomNumber199 | CustomDate00 | ... | CustomDate199 |
| 1 | 'IN REPAIR' | ... | NULL | 78.4 | ... | NULL | 2017-03-04 | ... | NULL |
| 2 | 'FINISHED' | ... | NULL | 68.5 | ... | NULL | 2017-03-05 | ... | NULL |
| 3 | 'WIP' | ... | NULL | NULL | ... | NULL | 2017-03-07 | ... | NULL |
CustomProperty:
| Name | Type| ColumnName |
| 'Status' | 0 | 'CustomString00' |
| 'Temperature' | 1 | 'CustomNumber00' |
| 'Made Date' | 2 | 'CustomDate00' |
For each Custom Property that is defined, there will be a record in the CustomProperty table that indicates its data type and which column to use for that property. Currently, there can be up to 200 Custom Properties defined for each type, i.e., 200 text, 200 date and 200 numeric. The user defines the Custom Properties as they need them. If a user is only using 55 custom properties in total, then a lot of the fields in the Item table will not be used.
I would like to create a view that is more 'friendly' so that our users can create their own reports to show these properties. This view would use these two tables to create a new table that looked like this:
| Id | Status | Temperature | Made Date |
| 1 | 'IN REPAIR' | 78.4 | 2017-03-04 |
| 2 | 'FINISHED' | 68.5 | 2017-03-05 |
| 3 | 'WIP' | NULL | 2017-03-07 |
This view should show a column for each property that is defined in the CustomProperty table. For this example, there are only 3 Custom Properties defined, so 3 fields are shown in this view. If all 600 Custom Properties were defined, then there would be 600 fields in this view. If there is a value stored for that Custom Property in the Item table, then that value is shown. If there is no value, then a NULL is shown for that property (as shown in Temperature for Item 3).
Using Dynamic SQL I've got some results, but not what I'm looking for. I've made a query that Unpivots the Custom Property fields and returns a result of Items like this:
| Id | CPName | CPTextValue | CPNumberValue | CPDateValue |
| 1 | 'Status' | 'IN REPAIR' | NULL | NULL |
| 1 | 'Temperature' | NULL | 78.4 | NULL |
| 1 | 'Made Date' | NULL | NULL | 2017-03-04 |
| 2 | 'Status' | 'FINISHED ' | NULL | NULL |
| 2 | 'Temperature' | NULL | 68.5 | NULL |
| 2 | 'Made Date' | NULL | NULL | 2017-03-05 |
| 3 | 'Status' | 'WIP' | NULL | NULL |
| 3 | 'Made Date' | NULL | NULL | 2017-03-07 |
My query is getting pretty complicated, so I'm wondering if I'm taking the wrong approach. Here is what I've done so far.
DECLARE @textcolsUnpivot AS NVARCHAR(MAX),
@datecolsUnpivot AS NVARCHAR(MAX),
@numbercolsUnpivot AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
-- Type values follow the CustomProperty sample above: 0 = text, 1 = number, 2 = date
select @textcolsUnpivot
= stuff((select ','+quotename(columnname)
from customproperty
where custompropertytype = 0
order by columnname
for xml path('')), 1, 1, '')
select @datecolsUnpivot
= stuff((select ','+quotename(columnname)
from customproperty
where custompropertytype = 2
order by columnname
for xml path('')), 1, 1, '')
select @numbercolsUnpivot
= stuff((select ','+quotename(columnname)
from customproperty
where custompropertytype = 1
order by columnname
for xml path('')), 1, 1, '')
set @query
= 'select id, CPName, CPTextValue, NULL as CPDateValue, NULL as CPNumberValue from
(select id, CPTextValue, CPCol from item
unpivot
(
CPTextValue
for CPCol in ('+ @textcolsUnpivot +')
) unpiv ) as pv
inner join
(select columnname, name as CPName, custompropertytype from customproperty) as cp
on cp.columnname = pv.CPCol
union
select id, CPName, NULL, CPDateValue, NULL from
(select id, CPDateValue, CPCol from item
unpivot
(
CPDateValue
for CPCol in ('+ @datecolsUnpivot +')
) unpiv ) as pv
inner join
(select columnname, name as CPName, custompropertytype from customproperty) as cp
on cp.columnname = pv.CPCol
union
select id, CPName, NULL, NULL, CPNumberValue from
(select id, CPNumberValue, CPCol from item
unpivot
(
CPNumberValue
for CPCol in ('+ @numbercolsUnpivot +')
) unpiv ) as pv
inner join
(select columnname, name as CPName, custompropertytype from customproperty) as cp
on cp.columnname = pv.CPCol
'
exec sp_executesql @query;
For additional clarification, the schemas of the tables are:
Item:
Id - pk, (it's actually a GUID, but I'm using an int for this example.), not null
CustomString00 through CustomString199 - nvarchar(max), null
CustomDate00 through CustomDate199 - datetime, null
CustomNumber00 through CustomNumber199 - float, null
CustomProperty:
Name - nvarchar(100),not null
Type - int, not null
ColumnName - nvarchar(50), not null
If I were to continue with my current approach, I think I would now need to PIVOT the results of my previous query to put them in the form that I'm looking for. Is this correct?
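One possibly simpler route, sketched here under assumptions rather than offered as the definitive answer: because every custom property maps to exactly one column of Item, the friendly result may not need UNPIVOT followed by PIVOT at all. It might be enough to build a select list that aliases each mapped column to its property name. The temp tables #Item and #CustomProperty below are hypothetical, cut-down stand-ins for the real tables (one column per type), and the column is called Type as in the schema listing.

```sql
-- Hypothetical, reduced stand-ins for the Item and CustomProperty tables.
CREATE TABLE #Item (
    Id             INT PRIMARY KEY,
    CustomString00 NVARCHAR(MAX) NULL,
    CustomNumber00 FLOAT NULL,
    CustomDate00   DATETIME NULL
);
CREATE TABLE #CustomProperty (
    Name       NVARCHAR(100) NOT NULL,
    [Type]     INT NOT NULL,
    ColumnName NVARCHAR(50) NOT NULL
);

INSERT INTO #Item VALUES
    (1, N'IN REPAIR', 78.4, '20170304'),
    (2, N'FINISHED',  68.5, '20170305'),
    (3, N'WIP',       NULL, '20170307');
INSERT INTO #CustomProperty VALUES
    (N'Status',      0, N'CustomString00'),
    (N'Temperature', 1, N'CustomNumber00'),
    (N'Made Date',   2, N'CustomDate00');

DECLARE @select NVARCHAR(MAX), @query NVARCHAR(MAX);

-- Build "[CustomString00] AS [Status], [CustomNumber00] AS [Temperature], ..."
SELECT @select = STUFF((
    SELECT ', ' + QUOTENAME(ColumnName) + ' AS ' + QUOTENAME(Name)
    FROM #CustomProperty
    ORDER BY [Type], ColumnName
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '');

SET @query = N'SELECT Id, ' + @select + N' FROM #Item ORDER BY Id;';

-- Temp tables created in this session are visible inside the dynamic batch.
EXEC sp_executesql @query;
```

With the sample rows this should return one row per item with the columns Id, Status, Temperature and Made Date, and NULL wherever no value is stored. Against the real tables, the same generated SELECT could be wrapped in a CREATE VIEW statement and regenerated whenever a custom property is added or renamed.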

Related

Create table of sum of events in SQL table

I'm trying to pivot a table from the format
| ID | access date |
--------------
| 1 | 08.10|
| 1 | 08.10|
| 4 | 08.10|
| 2 | 02.09|
To
|ID | 02.09 | 03.09 | 04.09 | ....
| 1 | 4 | 0 | 2 |
| 2 | 1 | 2 | 5 |
| 3 |
.
.
.
I've tried using the PIVOT function but since I have a lot of different dates I don't want to type out the query
SELECT *
FROM (
SELECT [Sequence of events] as ID
,[Submission Date] as access_date
FROM [database_name].[dbo].[Event Logging]
) AS SOURCE_TABLE
PIVOT( SUM(ID) for access_date IN ("08.01", "09.01", "10.01"....)
) as pvt_table
I'm very new to SQL so I'd appreciate some insight into how to solve this problem.
This answer does not solve the problem your way; it suggests a different approach.
What I would do is create 2 tables. The first one would be called DATE_DB, where I would store DATEID and DATE, and it would look like this:
| DATEID | DATE |
| 1 | 01.01|
| 2 | 02.02|
....
Then in the second table I would store data like this:
| ID | DATEID | VALUE |
| 1 | 2 | 10 |
| 2 | 2 | 3 |
| 3 | 3 | 4 |
| 4 | 2 | 5 |
In the second table, the ID column serves only as the primary key. With tables like these, you can JOIN them like this:
SELECT DATE_DB.DATE, SECONDTABLE.VALUE
FROM SECONDTABLE
LEFT JOIN DATE_DB ON SECONDTABLE.DATEID = DATE_DB.DATEID
ORDER BY DATE_DB.DATE
which will display result like this:
| DATE | VALUE |
| 02.01 | 10 |
| 02.01 | 3 |
| 02.01 | 5 |
| 03.01 | 4 |
Try it like this. You need dynamic SQL. Note that the script isn't tested. Also, when naming your columns, try not to use spaces; use either CamelCase or underscores to separate words.
One last thing: this is for SQL Server, as you didn't tag anything and your code looks like SQL Server.
declare @cols nvarchar(max)
select @cols = STUFF((SELECT DISTINCT ',' + QUOTENAME([Submission Date])
from [database_name].[dbo].[Event Logging]
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
declare @sql nvarchar(max);
set @sql = '
SELECT *
FROM (
SELECT [Sequence of events] as ID
,[Submission Date] as access_date
FROM [database_name].[dbo].[Event Logging]
) AS SOURCE_TABLE
PIVOT( SUM(ID) for access_date IN (' + @cols + ')
) as pvt_table';
-- print (@sql)
execute (@sql)
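Since the goal in the question is a count of accesses per ID and date, a variant that aggregates a helper column instead of summing the ID may be closer to what is wanted. This is only a sketch: #EventLog and its column names are hypothetical stand-ins for the [Event Logging] table, loaded with the four sample rows from the question.

```sql
-- Hypothetical stand-in for the event log, using the question's sample rows.
CREATE TABLE #EventLog (ID INT NOT NULL, AccessDate VARCHAR(5) NOT NULL);
INSERT INTO #EventLog VALUES (1, '08.10'), (1, '08.10'), (4, '08.10'), (2, '02.09');

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- One bracketed column name per distinct date, e.g. [02.09],[08.10]
SELECT @cols = STUFF((
    SELECT DISTINCT ',' + QUOTENAME(AccessDate)
    FROM #EventLog
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '');

-- ID stays out of the aggregate, so PIVOT groups by it and returns one row per ID.
SET @sql = N'
SELECT ID, ' + @cols + N'
FROM (SELECT ID, AccessDate, 1 AS Hit FROM #EventLog) AS src
PIVOT (COUNT(Hit) FOR AccessDate IN (' + @cols + N')) AS pvt
ORDER BY ID;';

EXEC sp_executesql @sql;
```

For the sample rows this should give ID 1 a count of 2 under 08.10, ID 2 a count of 1 under 02.09, ID 4 a count of 1 under 08.10, and 0 everywhere else.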

SQL: Remove duplicates based on a criterion

I'm mostly new to SQL, so I don't know much about the advanced options it provides. I currently work with MS SQL Server 2016 (Developer edition).
I have the following result:
| Type | Role | GUID |
|--------|--------|--------------------------------------|
| B | 0 | ABC |
| B | 0 | KLM |
| A | 0 | CDE |
| A | 0 | EFG |
| A | 1 | CDE |
| B | 1 | ABC |
| B | 1 | GHI |
| B | 1 | IJK |
| B | 1 | KLM |
From the following SELECT :
SELECT DISTINCT
Type,
Role,
GUID
I'm looking to count the GUIDs under these constraints:
-> If there are multiple rows with the same GUID, only count the row with "Role" set to 1; otherwise, count the one with "Role" set to 0.
-> If there is only one row, count it as "Role 0" or "Role 1" according to its own Role value.
My objective is to get the following result :
| Type | Role | COUNT(GUID) |
|--------|--------|--------------------------------------|
| A | 0 | 1 | => counted EFG as there was no other row with a "Role" set to 1
| A | 1 | 1 | => counted CDE with "Role" set to 1, but the row with "Role" set to 0 is ignored
| B | 1 | 4 |
Your query is not implementing the logic you mention. Here is a method that uses subqueries and window functions:
select type, role, count(*)
from (select t.*,
count(*) over (partition by GUID) as guid_cnt
from t
) t
where (guid_cnt > 1 and role = 1) or
guid_cnt = 1
group by type, role;
The subquery gets the count of rows that match a GUID. The outer where then uses that for filtering according to your conditions.
Note: role is not a good choice for a column name. It is a keyword and may be reserved in the future.
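To check the window-function approach against the sample data, here is a self-contained sketch. The table variable @t and the alias GuidCount are made up for the illustration, and the filter keeps a GUID that appears only once whatever its role, which is what the second rule in the question asks for.

```sql
-- Sample rows from the question, loaded into a hypothetical table variable.
DECLARE @t TABLE ([Type] CHAR(1), [Role] INT, [GUID] VARCHAR(3));
INSERT INTO @t ([Type], [Role], [GUID]) VALUES
    ('B', 0, 'ABC'), ('B', 0, 'KLM'), ('A', 0, 'CDE'),
    ('A', 0, 'EFG'), ('A', 1, 'CDE'), ('B', 1, 'ABC'),
    ('B', 1, 'GHI'), ('B', 1, 'IJK'), ('B', 1, 'KLM');

SELECT [Type], [Role], COUNT(*) AS GuidCount
FROM (
    -- guid_cnt = number of rows sharing the same GUID
    SELECT t.*, COUNT(*) OVER (PARTITION BY [GUID]) AS guid_cnt
    FROM @t AS t
) AS x
WHERE (guid_cnt > 1 AND [Role] = 1)  -- duplicated GUID: keep only the Role 1 row
   OR guid_cnt = 1                   -- single GUID: keep it with its own Role
GROUP BY [Type], [Role]
ORDER BY [Type], [Role];
-- Returns (A, 0, 1), (A, 1, 1), (B, 1, 4)
```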
A NOT EXISTS could be used for this.
For example:
declare @T table ([Type] char(1), [Role] int, [GUID] varchar(3));
insert into @T ([Type], [Role], [GUID]) values
('A',0,'CDE'),
('A',0,'EFG'),
('A',1,'CDE'),
('B',0,'ABC'),
('B',0,'KLM'),
('B',1,'ABC'),
('B',1,'GHI'),
('B',1,'IJK'),
('B',1,'KLM');
select [Type], [Role], COUNT(DISTINCT [GUID]) as TotalUniqueGuid
from @T t
where not exists (
select 1
from @T t1
where t.[Type] = t1.[Type]
and t.[Role] = 0 and t1.[Role] > 0
and t.[GUID] = t1.[GUID]
)
group by [Type], [Role];
Returns:
Type Role TotalUniqueGuid
A 0 1
A 1 1
B 1 4

How to pivot sql data, and squash results into non-null rows by Date per ID

Not a good title to the post, but hopefully it'll catch some eyes.
I have a very complex situation in T-SQL that I am unable to accomplish. I'm hoping someone with expertise knows an elegant and fast solution so that my performance is not impacted. I'm dealing with billions of rows.
PREFACE
I have a table called Customers with a unique ID. Those customers have Files, Files have Properties, and each Property Name corresponds to a single Value.
Tables:
Customers
Files -
Property - contains both Name and Value
The Customer ID is present in all of these tables, as are audit fields such as UpdatedDtm and CreationDtm.
USE CASE
I need to join all customers to their files (filtering for a few) and then tie every file to their properties (again filtering these). This is easy but results in lots of rows, one for each customer x file x property.
I know that the property names will never change, and I want to return just a select few, so I used a pivot, which resulted in a nice table, but it fell apart after I started doing more complex queries.
THE PROBLEM
First, the properties have a DateTime for when they were altered (UpdatedDtm), and I need to return everything altered within 1 hour of the creation date (CreationDtm) in the File table.
This trims down my list of potential properties, but now I have a table with a ROW_NUMBER() per ID and no good way to pivot and select the first value that isn't null while still preserving the number of columns for the table definition. This is important because I'm using Dynamic SQL and placing the result in an indexed temp table with a Composite Key on CustomerID and FileName.
BEFORE PIVOT
| UpdatedDtm | CustomerID | FileName | Property | Value |
| ---------- | ---------- | ---------- | -------- | -------------- |
| 1/1/2015 | 1 | FileOne | Size | NULL |
| 1/1/2015 | 1 | FileOne | Format | JPG |
| 1/7/2015 | 1 | FileOne | Size | 88KB |
| 1/7/2015 | 1 | FileOne | Format | JPG |
| 1/7/2015 | 1 | FileOne | Comment | NULL |
| 1/11/2015 | 1 | FileOne | Comment | NULL |
| 1/1/2015 | 1 | FileTwo | Size | 91KB |
| 1/1/2015 | 1 | FileTwo | Format | PNG |
| 1/11/2015 | 1 | FileTwo | Comment | NULL |
| 1/2/2015 | 2 | FileThree | Size | 74KB |
| 1/2/2015 | 2 | FileThree | Format | XLS |
| 1/2/2015 | 2 | FileThree | State | Open |
| 1/7/2015 | 2 | FileThree | State | Closed |
| 1/10/2015 | 2 | FileThree | Comment | NULL |
| 1/1/2015 | 3 | FileFour | Size | 2KB |
| 1/2/2015 | 3 | FileFour | Size | 10KB |
| 1/3/2015 | 3 | FileFour | Size | 13KB |
| 1/4/2015 | 3 | FileFour | Size | 21KB |
| 1/5/2015 | 3 | FileFour | Size | 27KB |
| 1/6/2015 | 3 | FileFour | Size | 32KB |
| 1/7/2015 | 3 | FileFour | Size | 39KB |
| 1/8/2015 | 3 | FileFour | Size | 44KB |
| 1/1/2015 | 3 | FileFour | Format | TXT |
| 1/1/2015 | 3 | FileFour | Comment | NULL |
Please don't ask me why the database is setup this way or to change the schema. That is set in stone and out of my control. I need to be able to solve the use case as described.
AFTER PIVOT (Expectation)
| CustomerID | FileName | Size | Format | State | Comment |
| ---------- | ---------- | ---- | ------ | ------ | ------- |
| 1 | FileOne | 88KB | JPG | NULL | NULL |
| 1 | FileTwo | 91KB | PNG | NULL | NULL |
| 2 | FileThree | 74KB | XLS | Closed | NULL |
| 3 | FileFour | 44KB | TXT | NULL | NULL |
I have included some NULL values and missing values to show that I need to preserve the same columns regardless of whether they have data, but I also need to squash the data down to the first non-null value within my date range.
CODE (My attempt)
IF Object_id('tempdb..#FilesQuery') IS NOT NULL DROP TABLE #FilesQuery;
CREATE TABLE #FilesQuery (
SeqNum int,
CustomerID numeric(16,0),
FileName varchar(64),
PropertyName varchar(64),
PropertyValue varchar(64)
)
INSERT INTO #FilesQuery
SELECT
CASE WHEN P.[Value] IS NOT NULL
THEN ROW_NUMBER() OVER (partition by C.CustomerID order by UpdatedDtm)
ELSE 0
END as SeqNum,
C.CustomerID
,F.Name as FileName
,P.Name as PropertyName
,P.Value as PropertyValue
FROM Customers C
INNER JOIN Files F ON F.CustomerID = C.CustomerID
LEFT JOIN Properties P
ON P.CustomerID = C.CustomerID
AND P.FileID = F.FileID
WHERE F.FileName IN ('FileOne','FileTwo','FileThree','FileFour')
AND P.Name IN ('Size','Format','State','Comment')
--PIVOT
DECLARE @cols AS nvarchar(MAX)
SELECT @cols = STUFF(
(SELECT DISTINCT ',' + QUOTENAME(PropertyName)
FROM #FilesQuery fq
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'')
DECLARE @dynSql AS nvarchar(MAX)
SET @dynSql = '
SELECT DISTINCT *
FROM (
SELECT
fq.CustomerID,
fq.FileName,
fq.PropertyName,
fq.PropertyValue
FROM #FilesQuery fq
) SRC
PIVOT (
Max([PropertyValue])
FOR PropertyName IN (' + @cols + ')
) PVT
'
IF Object_id('tempdb..#Results') IS NOT NULL DROP TABLE #Results;
CREATE TABLE #Results (
CustomerID varchar(16) NOT NULL,
FileName varchar(64) NOT NULL,
FileSize varchar(64) NULL,
FileFormat varchar(64) NULL,
FileState varchar(64) NULL,
FileComment varchar(64) NULL,
CONSTRAINT pk_CustDoc PRIMARY KEY (CustomerID,FileName)
)
INSERT INTO #Results EXEC sp_executesql @dynSql;
I'm sorry this code isn't complete, it is the working section I have. The other tries I made resulted in bad data pulls.
I tried using SeqNum and a combination of case statements to try and select the first non-null value for each row so that the data was all on one line, but it ended up being more like.
FileOne NULL NULL Open NULL
FileOne NULL JPG NULL NULL
and so on...
I've been struggling to solve this special case for a while and am about to scrap it and do something procedural with looping, but that would kill my query time and performance.
Anyone have a good solution? Am I over-thinking things?
You should filter your data before you PIVOT, and you will get your desired results. Here is a CTE version to show you the steps to get what you want.
;WITH cteDefineRowPrecedence AS (
SELECT *
,ROW_NUMBER() OVER (PARTITION BY CustomerId, FileName, Property ORDER BY
CASE WHEN Value IS NOT NULL THEN 0 ELSE 1 END
,UpdatedDtm DESC) as RowNum
FROM
#Table
)
, cteDesiredRows AS (
SELECT
CustomerId
,FileName
,Property
,Value
FROM
cteDefineRowPrecedence t
WHERE
t.RowNum = 1
AND t.Value IS NOT NULL
)
SELECT *
FROM
cteDesiredRows t
PIVOT (
MAX(Value)
FOR Property IN (Size,[Format],[State],Comment)
) p
ORDER BY
CustomerId
,FileName
And here is a nested query version that will make it easier to embed/put in your dynamic sql....
SELECT *
FROM
(
SELECT CustomerId, FileName, Property, Value
FROM
(SELECT *
,ROW_NUMBER() OVER (PARTITION BY CustomerId, FileName, Property ORDER BY
CASE WHEN Value IS NOT NULL THEN 0 ELSE 1 END
,UpdatedDtm DESC) as RowNum
FROM
#Table) r
WHERE
r.RowNum = 1
AND r.Value IS NOT NULL
) t
PIVOT (
MAX(Value)
FOR Property IN (Size,[Format],[State],Comment)
) p
ORDER BY
CustomerId
,FileName
You might need to add a WHERE condition inside the CTE definition to restrict the date/time range to what you want.
WITH CTE AS (
SELECT DISTINCT
CustomerID
, FileName
, Property
, Value
FROM
<table_name>
)
SELECT *
FROM
CTE
PIVOT (MAX(Value) FOR Property IN ([Size], [Format], [State], [Comment])) p
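As a rough illustration of restricting the date/time range before pivoting, the sketch below keeps only property rows updated within one hour of the file's creation, then picks the latest non-null value per property. The table variable @Props, its column layout, and the timestamps are all invented for the example; in the real schema CreationDtm lives on the File table and would come in through a join.

```sql
-- Hypothetical flattened rows: one per customer x file x property change.
DECLARE @Props TABLE (
    CreationDtm DATETIME, UpdatedDtm DATETIME,
    CustomerID INT, FileName VARCHAR(64),
    Property VARCHAR(64), Value VARCHAR(64)
);
INSERT INTO @Props VALUES
    ('2015-01-01T08:00:00', '2015-01-01T08:05:00', 1, 'FileOne', 'Size',   NULL),
    ('2015-01-01T08:00:00', '2015-01-01T08:30:00', 1, 'FileOne', 'Size',   '88KB'),
    ('2015-01-01T08:00:00', '2015-01-01T08:10:00', 1, 'FileOne', 'Format', 'JPG'),
    ('2015-01-01T08:00:00', '2015-01-01T12:00:00', 1, 'FileOne', 'State',  'Closed'); -- outside the 1-hour window

WITH InWindow AS (
    SELECT CustomerID, FileName, Property, Value,
           ROW_NUMBER() OVER (
               PARTITION BY CustomerID, FileName, Property
               ORDER BY CASE WHEN Value IS NOT NULL THEN 0 ELSE 1 END, -- non-null values first
                        UpdatedDtm DESC                                -- then most recent
           ) AS RowNum
    FROM @Props
    WHERE UpdatedDtm <= DATEADD(HOUR, 1, CreationDtm)
)
SELECT CustomerID, FileName, [Size], [Format], [State], [Comment]
FROM (
    SELECT CustomerID, FileName, Property, Value
    FROM InWindow
    WHERE RowNum = 1
) AS src
PIVOT (MAX(Value) FOR Property IN ([Size], [Format], [State], [Comment])) AS p
ORDER BY CustomerID, FileName;
-- Returns one row: 1, FileOne, 88KB, JPG, NULL, NULL
```

Because the column list in the IN clause is fixed, the result keeps the same four property columns even when a property has no value in the window, which matters when the output is inserted into a predefined temp table.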

Creating New Table by select query with sequential column names

I have a table that contains column names like this;
+--------------------------------------------------------------------+-----------+
| BankTable | |
+--------------------------------------------------------------------+-----------+
| Id | BANK1 | BANK2 | BRANCH1 | BRANCH2 | IBAN1 | IBAN2 |
+----+-----------+-----------+-------------+-------------+-----------+-----------+
| 1 | BANK1_ID1 | BANK2_ID1 | BRANCH1_ID1 | BRANCH2_ID1 | IBAN1_ID1 | IBAN2_ID1 |
+----+-----------+-----------+-------------+-------------+-----------+-----------+
| 2 | BANK1_ID2 | BANK2_ID2 | BRANCH1_ID2 | BRANCH1_ID2 | IBAN1_ID2 | IBAN2_ID2 |
+----+-----------+-----------+-------------+-------------+-----------+-----------+
How can I write a query that returns a result like this:
+------------------------------------------+
| BANK |
+------------------------------------------+
| ID | BANK | BRANCH | IBAN |
+----+-----------+-------------+-----------+
| 1 | BANK1_ID1 | BRANCH1_ID1 | IBAN1_ID1 |
+----+-----------+-------------+-----------+
| 2 | BANK2_ID2 | BRANCH1_ID2 | IBAN2_ID2 |
+----+-----------+-------------+-----------+
P.s: I am writing select query by Id column. BTW query result contains one row every time.
Any help appreciated.
SOLUTION
I don't know if it's a good approach, but I solved this based on @Giorgos Betsos's answer. Here is how I fixed the problem.
SELECT BANK, BRANCH, IBAN
FROM (
SELECT BANK1, BANK2, BRANCH1, BRANCH2, IBAN1, IBAN2
FROM BankTable
WHERE ID = your_id_here
) AS src
UNPIVOT (
BANK FOR Col IN(BANK1, BANK2)
) AS unpvt1
UNPIVOT (
BRANCH FOR Col1 IN(BRANCH1, BRANCH2)
) AS unpvt2
UNPIVOT (
IBAN FOR Col2 IN(IBAN1, IBAN2)
) AS unpvt3
WHERE RIGHT(Col, 1) = RIGHT(Col1, 1)
AND RIGHT(Col, 1) = RIGHT(Col2, 1)
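As an alternative to chaining three UNPIVOT operators and matching suffixes afterwards, a CROSS APPLY over a VALUES list can emit one row per numbered column group directly. This is a different technique from the UNPIVOT used above, shown only as a sketch; the table variable @BankTable is a hypothetical stand-in with made-up sample values.

```sql
-- Hypothetical stand-in for BankTable with two numbered column groups.
DECLARE @BankTable TABLE (
    Id INT,
    BANK1 VARCHAR(20),   BANK2 VARCHAR(20),
    BRANCH1 VARCHAR(20), BRANCH2 VARCHAR(20),
    IBAN1 VARCHAR(20),   IBAN2 VARCHAR(20)
);
INSERT INTO @BankTable VALUES
    (1, 'BANK1_ID1', 'BANK2_ID1', 'BRANCH1_ID1', 'BRANCH2_ID1', 'IBAN1_ID1', 'IBAN2_ID1'),
    (2, 'BANK1_ID2', 'BANK2_ID2', 'BRANCH1_ID2', 'BRANCH2_ID2', 'IBAN1_ID2', 'IBAN2_ID2');

-- Each VALUES row pairs the columns that share a suffix, so no
-- RIGHT(Col, 1) comparison is needed afterwards.
SELECT b.Id, v.BANK, v.BRANCH, v.IBAN
FROM @BankTable AS b
CROSS APPLY (VALUES
    (b.BANK1, b.BRANCH1, b.IBAN1),
    (b.BANK2, b.BRANCH2, b.IBAN2)
) AS v (BANK, BRANCH, IBAN)
WHERE b.Id = 1;
-- Returns two rows for Id 1: the *1 columns, then the *2 columns
```

Unlike UNPIVOT, this form also keeps rows whose values are NULL.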
You can use UNPIVOT for this:
SELECT Bank
FROM (
SELECT Id, BANK1, BANK2, BANK3, BANK4, BANK5
FROM BankTable
WHERE id = 1) AS src
UNPIVOT (
Bank FOR Col IN([BANK1], [BANK2], [BANK3], [BANK4], [BANK5])) AS unpvt
If there are more columns, you can use a dynamic SQL query as below:
Query
declare @sql as varchar(max);
select @sql =stuff(
(select 'union all select [' + column_name + '] as Bank
from BankTable where Id = 1 '
from information_schema.columns
where table_name = 'BankTable'
and column_name like 'BANK[0-9]%'
for xml path('')), 1, 9, '');
execute(@sql);
Result
+-----------+
| Bank |
+-----------+
| BANK1_ID1 |
| BANK2_ID1 |
| BANK3_ID1 |
| BANK4_ID1 |
| BANK5_ID1 |
+-----------+

Dynamic field content as Row Sql

I have the following dataset on a sql database
----------------------------------
| ID | NAME | AGE | STATUS |
-----------------------------------
| 1ASDF | Brenda | 21 | Single |
-----------------------------------
| 2FDSH | Ging | 24 | Married|
-----------------------------------
| 3SDFD | Judie | 18 | Widow |
-----------------------------------
| 4GWWX | Sophie | 21 | Married|
-----------------------------------
| 5JDSI | Mylene | 24 | Single |
-----------------------------------
I want to query that dataset so that i can have this structure in my result
--------------------------------------
| AGE | SINGLE | MARRIED | WIDOW |
--------------------------------------
| 21 | 1 | 1 | 0 |
--------------------------------------
| 24 | 1 | 1 | 0 |
--------------------------------------
| 18 | 0 | 0 | 1 |
--------------------------------------
And the status column can be dynamic so there will be more columns to come.
Is this possible?
Since you are using SQL Server, you can use the PIVOT table operator like this:
SELECT *
FROM
(
SELECT Age, Name, Status FROM tablename
) AS t
PIVOT
(
COUNT(Name)
FOR Status IN(Single, Married, Widow)
) AS p;
To do it dynamically you have to use dynamic sql like this:
DECLARE @cols AS NVARCHAR(MAX);
DECLARE @query AS NVARCHAR(MAX);
select @cols = STUFF((SELECT distinct ',' +
QUOTENAME(status)
FROM tablename
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
, 1, 1, '');
SELECT @query = '
SELECT *
FROM
(
SELECT Age, Name, Status FROM tablename
) AS t
PIVOT
(
COUNT(Name)
FOR Status IN( ' +@cols + ')
) AS p;';
execute(@query);
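When the set of statuses is known in advance, the same counts can also be produced without PIVOT, using conditional aggregation. This is a sketch of that alternative technique, not a replacement for the dynamic version; the table variable @People is a hypothetical stand-in for the dataset in the question.

```sql
-- Sample rows from the question in a hypothetical table variable.
DECLARE @People TABLE (ID VARCHAR(5), Name VARCHAR(20), Age INT, Status VARCHAR(10));
INSERT INTO @People VALUES
    ('1ASDF', 'Brenda', 21, 'Single'),
    ('2FDSH', 'Ging',   24, 'Married'),
    ('3SDFD', 'Judie',  18, 'Widow'),
    ('4GWWX', 'Sophie', 21, 'Married'),
    ('5JDSI', 'Mylene', 24, 'Single');

-- One SUM(CASE ...) per status; each row contributes 1 to exactly one column.
SELECT Age,
       SUM(CASE WHEN Status = 'Single'  THEN 1 ELSE 0 END) AS Single,
       SUM(CASE WHEN Status = 'Married' THEN 1 ELSE 0 END) AS Married,
       SUM(CASE WHEN Status = 'Widow'   THEN 1 ELSE 0 END) AS Widow
FROM @People
GROUP BY Age
ORDER BY Age;
-- Returns (18, 0, 0, 1), (21, 1, 1, 0), (24, 1, 1, 0)
```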