I have the following table layout. Each row is unique; there will never be more than one instance of the same Id, Name, and Line combination.
Id Name Line
1 A Z
2 B Y
3 C X
3 C W
4 D W
I would like to query the data so that the Line field becomes a column. If the value exists, a 1 is applied in the field data, otherwise a 0. e.g.
Id Name Z Y X W
1 A 1 0 0 0
2 B 0 1 0 0
3 C 0 0 1 1
4 D 0 0 0 1
The field names W, X, Y, Z are just examples of field values, so I can't apply an operator to explicitly check, for example, 'X', 'Y', or 'Z'. These could change at any time and are not restricted to a finite set of values. The column names in the result set should reflect the unique field values as columns.
Any idea how I can accomplish this?
It's a standard pivot query.
If 1 represents a boolean indicator - use:
SELECT t.id,
t.name,
MAX(CASE WHEN t.line = 'Z' THEN 1 ELSE 0 END) AS Z,
MAX(CASE WHEN t.line = 'Y' THEN 1 ELSE 0 END) AS Y,
MAX(CASE WHEN t.line = 'X' THEN 1 ELSE 0 END) AS X,
MAX(CASE WHEN t.line = 'W' THEN 1 ELSE 0 END) AS W
FROM TABLE t
GROUP BY t.id, t.name
If 1 represents the number of records with that value for the group, use:
SELECT t.id,
t.name,
SUM(CASE WHEN t.line = 'Z' THEN 1 ELSE 0 END) AS Z,
SUM(CASE WHEN t.line = 'Y' THEN 1 ELSE 0 END) AS Y,
SUM(CASE WHEN t.line = 'X' THEN 1 ELSE 0 END) AS X,
SUM(CASE WHEN t.line = 'W' THEN 1 ELSE 0 END) AS W
FROM TABLE t
GROUP BY t.id, t.name
Edited following update in question
SQL Server does not support dynamic pivoting.
To do this, you could either use dynamic SQL to generate a query along the following lines:
SELECT
Id ,Name,
ISNULL(MAX(CASE WHEN Line='Z' THEN 1 END),0) AS Z,
ISNULL(MAX(CASE WHEN Line='Y' THEN 1 END),0) AS Y,
ISNULL(MAX(CASE WHEN Line='X' THEN 1 END),0) AS X,
ISNULL(MAX(CASE WHEN Line='W' THEN 1 END),0) AS W
FROM T
GROUP BY Id ,Name
Or, an alternative which I have read about but not actually tried: set up an Access database with a linked table pointing at the SQL Server table, and leverage the Access Transform function by querying the Access database from SQL Server!
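For reference, the Access crosstab query would look roughly like this (untested; the linked-table name AccessLinkedTable is just a placeholder):
TRANSFORM Count(Line)
SELECT Id, Name
FROM AccessLinkedTable
GROUP BY Id, Name
PIVOT Line;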
Here is the dynamic version
Test table
create table #test(id int,name char(1),line char(1))
insert #test values(1 , 'A','Z')
insert #test values(2 , 'B','Y')
insert #test values(3 , 'C','X')
insert #test values(4 , 'C','W')
insert #test values(5 , 'D','W')
insert #test values(5 , 'D','W')
insert #test values(5 , 'D','P')
Now run this
declare @names nvarchar(4000)
SELECT @names =''
SELECT @names = @names + line +', '
FROM (SELECT distinct line from #test) x
SELECT @names = LEFT(@names,(LEN(@names) -1))
exec('
SELECT *
FROM(
SELECT DISTINCT Id, Name,Line
FROM #test
) AS pivTemp
PIVOT
( COUNT(Line)
FOR Line IN (' + @names +' )
) AS pivTable ')
Now add one row to the table and run the query above again, and you will see the new B column appear:
insert #test values(5 , 'D','B')
Caution: of course, all the usual problems with dynamic SQL apply. You could use sp_executesql, but since the query does not take parameters that way, there is really no point.
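For completeness, here is a sketch of the same dynamic pivot built with QUOTENAME and sp_executesql against the #test table above; the column list still cannot be passed as a parameter, but QUOTENAME at least guards against awkward characters in the line values:
declare @cols nvarchar(4000), @sql nvarchar(4000)

-- build a quoted, comma-separated column list from the distinct line values
select @cols = coalesce(@cols + ', ', '') + quotename(line)
from (select distinct line from #test) x

set @sql = N'
SELECT *
FROM (SELECT DISTINCT Id, Name, Line FROM #test) AS pivTemp
PIVOT ( COUNT(Line) FOR Line IN (' + @cols + N') ) AS pivTable'

exec sp_executesql @sql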
Assuming you have a finite number of values for Line that you could enumerate:
declare @MyTable table (
Id int,
Name char(1),
Line char(1)
)
insert into @MyTable
(Id, Name, Line)
select 1,'A','Z'
union all
select 2,'B','Y'
union all
select 3,'C','X'
union all
select 3,'C','W'
union all
select 4,'D','W'
SELECT Id, Name, Z, Y, X, W
FROM (SELECT Id, Name, Line
FROM @MyTable) up
PIVOT (count(Line) FOR Line IN (Z, Y, X, W)) AS pvt
ORDER BY Id
As you are using SQL Server, you could possibly use the PIVOT operator intended for this purpose.
If you're doing this for a SQL Server Reporting Services (SSRS) report, or could possibly switch to using one, then stop now and go throw a Matrix control onto your report. Poof! You're done! Happy as a clam with your data pivoted.
Here's a rather exotic approach (using sample data from the old Northwind database). It's adapted from the version here, which no longer worked due to the deprecation of DBCC RENAMECOLUMN and the addition of PIVOT as a keyword.
set nocount on
create table Sales (
AccountCode char(5),
Category varchar(10),
Amount decimal(8,2)
)
--Populate table with sample data
insert into Sales
select customerID, 'Emp'+CAST(EmployeeID as varchar(10)), sum(Freight)
from Northwind.dbo.orders
group by customerID, EmployeeID
create unique clustered index Sales_AC_C
on Sales(AccountCode,Category)
--Create table to hold data column names and positions
select A.Category,
count(distinct B.Category) AS Position
into #columns
from Sales A join Sales B
on A.Category >= B.Category
group by A.Category
create unique clustered index #columns_P on #columns(Position)
create unique index #columns_C on #columns(Category)
--Generate first column of Pivot table
select distinct AccountCode into Pivoted from Sales
--Find number of data columns to be added to Pivoted table
declare @datacols int
select @datacols = max(Position) from #columns
--Add data columns one by one in the correct order
declare @i int
set @i = 0
while @i < @datacols begin
set @i = @i + 1
--Add next data column to Pivoted table
select P.*, isnull((
select Amount
from Sales S join #columns C
on C.Position = @i
and C.Category = S.Category
where P.AccountCode = S.AccountCode),0) AS X
into PivotedAugmented
from Pivoted P
--Name new data column correctly
declare @c sysname
select @c = Category
from #columns
where Position = @i
exec sp_rename '[dbo].[PivotedAugmented].[X]', @c, 'COLUMN'
--Replace Pivoted table with new table
drop table Pivoted
select * into Pivoted from PivotedAugmented
drop table PivotedAugmented
end
select * from Pivoted
go
drop table Pivoted
drop table #columns
drop table Sales
No matter where I place my WITH statement inside the SQL query, the keyword on the next line always shows an error, 'Incorrect syntax near keyword'. I also tried adding a semicolon.
; WITH Commercial_subset AS
(
SELECT DISTINCT
PRDID_Clean, Value, [Year]
FROM
Reporting_db_SPKPI.DBO.[tbl_RCCP_commercial]
WHERE
MEASURE = 'Unit Growth Rate'
)
--error appears at truncate
TRUNCATE TABLE Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_dup]
Example 1:
[Example 1][1]
Example 2:
[Example 2][2]
What am I missing?
[1]: https://i.stack.imgur.com/lkfVd.png
[2]: https://i.stack.imgur.com/tZRnG.png
My final code, after getting suggestions in the comments:
--Ensure the correct database is selected for creating the views
USE Reporting_db_SPKPI
--Create the table where new values will be appended
Insert into Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_dup]
Select *, Replace(productID,'-','') as ProductID_clean from Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR]
GO
--Create a subset as view which will be used for join later
Create or Alter View QRY_Commerical_Subset AS
Select Distinct PRDID_Clean, Value, [Year] From Reporting_db_SPKPI.DBO.[tbl_RCCP_commercial] where MEASURE = 'Unit Growth Rate'
Go
--Create a view with distinct list of all SKUs
CREATE OR ALTER VIEW QRY_RCCP_TEMP AS
SELECT
PRODUCTID, ROW_NUMBER() Over (ORDER BY ProductID) AS ID
FROM (
SELECT
DISTINCT A.ProductID_clean ProductID
FROM
Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_dup] A
LEFT JOIN
Reporting_db_SPKPI.DBO.QRY_Commerical_Subset B ON A.ProductID_clean = B.PRDID_Clean
WHERE
B.PRDID_Clean IS NOT NULL --and A.filename = 'Capacity Planning_INS_Springhill' --DYNAMIC VARIABLE HERE
and Cast(A.SnapshotDate as date) =
(SELECT Max(Cast(SnapshotDate as date)) FROM reporting_db_spkpi.dbo.tbl_RCCP_3_NR)
) T
GO
SET NOCOUNT ON
-- For every product id from the distinct list, iterate the following code
DECLARE @I INT = 1
WHILE @I <= (SELECT MAX(ID) FROM QRY_RCCP_TEMP)
BEGIN
DECLARE @PRODUCT NVARCHAR(50) = (SELECT PRODUCTID FROM QRY_RCCP_TEMP WHERE ID = @I)
DROP TABLE Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_temp]
--Retrieve last 12 months of value from NR and add it to a temp table in increasing order of their months. These 12 data points will be baseline
SELECT
Top 12 A.*,
Case When B.[Value] is Null then 0 else CAST(B.[Value] as float) End GROWTH
INTO
Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_temp]
FROM
Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_dup] A
LEFT JOIN
--using the view here
QRY_Commerical_Subset B ON B.PRDID_Clean = A.ProductID_clean AND B.[YEAR] = YEAR(A.[MONTH])+1
WHERE
A.PRODUCTID= @PRODUCT
AND Cast(A.SnapshotDate AS DATE) = (SELECT Max(Cast(SnapshotDate AS DATE)) FROM reporting_db_spkpi.dbo.[tbl_RCCP_3_NR_dup])
Order by
[Month] desc
-- Generate 3 years of data
DECLARE @J INT = 1
WHILE @J<=3
BEGIN
--Calculate next year's value
UPDATE Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_temp]
SET
[Value] = [Value]*(1+ GROWTH),
[MONTH] = DATEADD(YEAR,1,[Month]),
MonthCode= 'F' + CAST(CAST(SUBSTRING(MonthCode,2,LEN(MonthCode)) AS INT) + 12 AS NVARCHAR(10))
--Add it to the NR table.
Insert into Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_dup]
(ProductID, MonthCode, Value, Month, FileName,
LastModifiedDate, SnapshotDate, Quarter, IsError, ErrorDescription)
Select
ProductID, MonthCode, Value, Month, FileName,
LastModifiedDate, SnapshotDate, Quarter, IsError, ErrorDescription
from
Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_temp]
--Update growth rate for next year
UPDATE Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_temp]
SET GROWTH = Case When B.[Value] is Null then 0 else CAST(B.[Value] as float) End
FROM Reporting_db_SPKPI.DBO.QRY_Commerical_Subset B
WHERE B.PRDID_Clean = ProductID_clean AND [YEAR] = YEAR([MONTH])+1
SET @J=@J+1
END
SET @I=@I+1
END
DROP VIEW QRY_RCCP_TEMP
DROP VIEW QRY_Commerical_Subset
The WITH is a Common Table Expression, aka CTE.
And a CTE is like a template for a sub-query.
For example this join of the same sub-query:
SELECT *
FROM (
select distinct bar
from table1
where foo = 'baz'
) AS foo1
JOIN (
select distinct bar
from table1
where foo = 'baz'
) AS foo2
ON foo1.bar > foo2.bar
Can be written as
WITH CTE_FOO AS (
select distinct bar
from table1
where foo = 'baz'
)
SELECT *
FROM CTE_FOO AS foo1
JOIN CTE_FOO AS foo2
ON foo1.bar > foo2.bar
It's meant to be used with a SELECT.
Not with a TRUNCATE TABLE or DROP TABLE.
(It can be used with an UPDATE though)
As such, treat the TRUNCATE as a separate statement.
TRUNCATE TABLE Reporting_db_SPKPI.DBO.[tbl_RCCP_3_NR_dup];
WITH Commercial_subset AS
(
SELECT DISTINCT
PRDID_Clean, Value, [Year]
FROM
Reporting_db_SPKPI.DBO.[tbl_RCCP_commercial]
WHERE
MEASURE = 'Unit Growth Rate'
)
SELECT *
FROM Commercial_subset;
By the way, the reason many people write a CTE with a leading ; is that the WITH clause raises an error if the previous statement wasn't terminated with a ;. It's just a small trick to avoid that error.
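A minimal illustration of that trick (the c CTE and its values are made up):
SELECT 1 AS previous_result    -- previous statement not terminated with ;
WITH c AS (SELECT 2 AS n)      -- fails: "Incorrect syntax near the keyword 'WITH'"
SELECT n FROM c;

SELECT 1 AS previous_result
;WITH c AS (SELECT 2 AS n)     -- the leading ; terminates the previous statement
SELECT n FROM c;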
In SQL Server 2008, I have data like this (Case: varchar(20), Time: time):
Case Time
-------------
D1 18:44
D2 19:12
C1 21:20
F2 21:05
...
What I would like to do is count cases per hour. The result should include all cases.
Expected result:
.... Column18 Column19 Column20 Column21 ...
1 1 0 2
where Column18 refers to the cases between 18:00 and 18:59, and the same logic applies to the others. I have Column0 through Column23, one column per hour...
What I am doing is:
Select
...
, Column18 = sum(CASE WHEN Time like '18:%' THEN 1 ELSE 0 END)
, Column19 = sum(CASE WHEN Time like '19:%' THEN 1 ELSE 0 END)
, Column20 = sum(CASE WHEN Time like '20:%' THEN 1 ELSE 0 END)
, Column21 = sum(CASE WHEN Time like '21:%' THEN 1 ELSE 0 END)
...
from
mytable
Even though my query works, it is long and repetitive, so it does not seem professional to me. I wonder if there is any better way to handle this situation. Any advice would be appreciated.
We can go with a dynamic pivot:
declare @ColString varchar(1000)=''
;with cte as(
select 0 as X
union all
select x+1 as X
from cte where X <23
)
select @ColString = @ColString + ',[Column' + cast(X as varchar) + ']' from cte
select @ColString = stuff(@ColString,1,1,'')
declare @DynamicQuery varchar(3000)=''
select @DynamicQuery =
'select *
from (
select [case],''Column''+cast(datepart(hh,[time]) as varchar) as [time]
from #xyz
) src
pivot
(
count([case]) for [Time] in ('+ @ColString + ')
) piv'
exec (@DynamicQuery)
Input data -
create table #xyz ([Case] varchar(10),[Time] time(0))
insert into #xyz
select 'D1','18:44' union all
select 'D2','19:12' union all
select 'C1','21:20' union all
select 'F2','21:05'
Your query is basically fine, but I strongly discourage you from using string functions on date/time columns.
datepart() is definitely one solution:
Select ...,
Column18 = sum(CASE WHEN datepart(hour, Time) = 18 THEN 1 ELSE 0 END),
Column19 = sum(CASE WHEN datepart(hour, Time) = 19 THEN 1 ELSE 0 END)
Direct comparison is more verbose, but more flexible:
select . . .,
sum(case when time >= '18:00' and time < '19:00' then 1 else 0 end) as column18,
sum(case when time >= '19:00' and time < '20:00' then 1 else 0 end) as column19,
Note that this uses as. SQL Server supports the syntax alias =. However, other databases do not use such syntax, so I prefer to stick with the ANSI-standard method of defining aliases.
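Side by side, the two alias styles look like this (using the mytable name from the question):
select sum(case when datepart(hour, Time) = 18 then 1 else 0 end) as column18   -- ANSI style
from mytable;

select column18 = sum(case when datepart(hour, Time) = 18 then 1 else 0 end)    -- T-SQL only
from mytable;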
Putting the values on rows instead of columns is probably the more "typical" solution:
select datepart(hour, time) as hr, count(*)
from t
group by datepart(hour, time)
order by hr;
As written, this will not return hours with zero counts.
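If the missing hours matter, one option is to left join against a fixed 0-23 list; a sketch, again assuming the table is named t:
select h.hr, count(t.Time) as cnt
from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
             (12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23)) as h(hr)
left join t on datepart(hour, t.Time) = h.hr   -- count(t.Time) yields 0 for empty hours
group by h.hr
order by h.hr;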
Here is the simplest answer I could come up with. Thanks a lot for all the advice. It looks way better now:
create table #temp (CaseID varchar(20),TheTime time)
insert into #temp values ('A1','03:56')
insert into #temp values ('A2','03:12')
insert into #temp values ('B2','03:21')
insert into #temp values ('C1','05:12')
insert into #temp values ('B3','06:00')
insert into #temp values ('B4','07:14')
insert into #temp values ('B5','07:18')
insert into #temp values ('D1','18:44')
insert into #temp values ('D2','19:54')
insert into #temp values ('C2','21:12')
insert into #temp values ('F4','21:50')
select *
from (
select CaseID, DATEPART(hour,TheTime) as HourOfDay
from #temp
) t
PIVOT
(
Count(CaseID)
FOR HourOfDay IN ([00],[01],[02],[03],[04],[05],[06],[07],[08],
[09],[10],[11],[12],[13],[14],[15],[16],[17],
[18],[19],[20],[21],[22],[23])
) AS PivotTable
I have a table with 3 columns.
One of them is [Code]. I have many records in this table.
I want to select records whose [Code] values are numbers close to a given value, such as 10.
For example, if I select records that have [Code] = 9, then also select records that have [Code] = 8, and so on...
This is what I implemented based on your thought.
If you want rows near a record (row id) rather than near a value, you only need to change the condition from a.data to a.rid.
declare @t table (data int)
insert into @t values(1), (2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(50),(51),(52)
declare @value int = 11 , @getDatToValue int = 2
select * from
(
select * , ROW_NUMBER( ) over(order by data) rid
from @t
)
a
where
a.data between (@value - @getDatToValue) and (@value + @getDatToValue)
Imagine the following two tables:
create table MainTable (
MainId integer not null, -- This is the index
Data varchar(100) not null
)
create table OtherTable (
MainId integer not null, -- MainId, Name combined are the index.
Name varchar(100) not null,
Status tinyint not null
)
Now I want to select all the rows from MainTable, while combining all the rows that match each MainId from OtherTable into a single field in the result set.
Imagine the data:
MainTable:
1, 'Hi'
2, 'What'
OtherTable:
1, 'Fish', 1
1, 'Horse', 0
2, 'Fish', 0
I want a result set like this:
MainId, Data, Others
1, 'Hi', 'Fish=1,Horse=0'
2, 'What', 'Fish=0'
What is the most elegant way to do this?
(Don't worry about the comma being in front or at the end of the resulting string.)
There is no really elegant way to do this in Sybase. Here is one method, though:
select
mt.MainId,
mt.Data,
Others = stuff((
max(case when seqnum = 1 then ','+Name+'='+cast(status as varchar(255)) else '' end) +
max(case when seqnum = 2 then ','+Name+'='+cast(status as varchar(255)) else '' end) +
max(case when seqnum = 3 then ','+Name+'='+cast(status as varchar(255)) else '' end)
), 1, 1, '')
from MainTable mt
left outer join
(select
ot.*,
row_number() over (partition by MainId order by status desc) as seqnum
from OtherTable ot
) ot
on mt.MainId = ot.MainId
group by
mt.MainId, mt.Data
That is, it enumerates the values in the second table. It then does conditional aggregation to get each value, using the stuff() function to handle the extra comma. The above works for the first three values. If you want more, then you need to add more clauses.
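If this were SQL Server rather than Sybase, the list could be built without enumerating values, for example with the FOR XML PATH trick (or STRING_AGG on 2017+); a sketch against the same two tables:
select mt.MainId,
       mt.Data,
       stuff((select ',' + ot.Name + '=' + cast(ot.Status as varchar(3))
              from OtherTable ot
              where ot.MainId = mt.MainId
              order by ot.Status desc
              for xml path('')), 1, 1, '') as Others   -- stuff() strips the leading comma
from MainTable mt;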
Well, here is how I implemented it in Sybase 13.x. This code has the advantage of not being limited to a fixed number of Names.
create proc
as
declare
@MainId int,
@Name varchar(100),
@Status tinyint
create table #OtherTable (
MainId int not null,
CombStatus varchar(250) not null
)
declare OtherCursor cursor for
select
MainId, Name, Status
from
Others
open OtherCursor
fetch OtherCursor into @MainId, @Name, @Status
while (@@sqlstatus = 0) begin -- run until there are no more
if exists (select 1 from #OtherTable where MainId = @MainId) begin
update #OtherTable
set CombStatus = CombStatus + ','+@Name+'='+convert(varchar, @Status)
where
MainId = @MainId
end else begin
insert into #OtherTable (MainId, CombStatus)
select
MainId = @MainId,
CombStatus = @Name+'='+convert(varchar, @Status)
end
fetch OtherCursor into @MainId, @Name, @Status
end
close OtherCursor
select
mt.MainId,
mt.Data,
ot.CombStatus
from
MainTable mt
left join #OtherTable ot
on mt.MainId = ot.MainId
But it does have the disadvantage of using a cursor and a working table, which can - at least with a lot of data - make the whole process slow.
How can I find subsets of data over multiple rows in SQL?
I want to count the number of occurrences of a string (or number) before another string is found and then count the number of times this string occurs before another one is found.
All these strings can be in random order.
This is what I want to achieve:
I have one table with one column (columnx) with data like this:
A
A
B
C
A
B
B
The result I want from the query should be like this:
2 A
1 B
1 C
1 A
2 B
Is this even possible in SQL, or would it be easier just to write a little C# app to do this?
Since, as per your comment, you can add a column that will unambiguously define the order in which the columnx values go, you can try the following query (provided the SQL product you are using supports CTEs and ranking functions):
WITH marked AS (
SELECT
columnx,
sortcolumn,
grp = ROW_NUMBER() OVER ( ORDER BY sortcolumn)
- ROW_NUMBER() OVER (PARTITION BY columnx ORDER BY sortcolumn)
FROM data
)
SELECT
columnx,
COUNT(*)
FROM marked
GROUP BY
columnx,
grp
ORDER BY
MIN(sortcolumn)
;
You can see the method in work on SQL Fiddle.
If sortcolumn is an auto-increment integer column that is guaranteed to have no gaps, you can replace the first ROW_NUMBER() expression with just sortcolumn. But, I guess, that cannot be guaranteed in general. Besides, you might indeed want to sort on a timestamp instead of an integer.
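For reference, a minimal SQL Server-flavoured setup the query above would run against (the data table and sortcolumn column are assumed from the query itself):
CREATE TABLE data (
    sortcolumn int IDENTITY(1,1) PRIMARY KEY,   -- defines the order of the values
    columnx char(1) NOT NULL
);
INSERT INTO data (columnx)
VALUES ('A'),('A'),('B'),('C'),('A'),('B'),('B');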
I don't think you can do it with a single select.
You can use a cursor:
create table my_Strings
(
my_string varchar(50)
)
insert into my_strings values('A'),('A'),('B'),('C'),('A'),('B'),('B') -- this row-constructor syntax requires SQL Server 2008 or later
--select my_String from my_strings
declare @temp_result table(
string varchar(50),
nr int)
declare @myString varchar(50)
declare @myLastString varchar(50)
declare @nr int
set @myLastString='A' --set this with the value of your FIRST string on the table
set @nr=0
DECLARE string_cursor CURSOR
FOR
SELECT my_string as aux_column FROM my_strings
OPEN string_cursor
FETCH NEXT FROM string_cursor into @myString
WHILE @@FETCH_STATUS = 0 BEGIN
if (@myString = @myLastString) begin
set @nr=@nr+1
set @myLastString=@myString
end else begin
insert into @temp_result values (@myLastString, @nr)
set @myLastString=@myString
set @nr=1
end
FETCH NEXT FROM string_cursor into @myString
END
insert into @temp_result values (@myLastString, @nr)
CLOSE string_cursor;
DEALLOCATE string_cursor;
select * from @temp_result
Result:
A 2
B 1
C 1
A 1
B 2
Try this:
;with sample as (
select 'A' as columnx
union all
select 'A'
union all
select 'B'
union all
select 'C'
union all
select 'A'
union all
select 'B'
union all
select 'B'
), data
as (
select columnx,
Row_Number() over(order by (select 0)) id
from sample
) , CTE as (
select * ,
Row_Number() over(order by (select 0)) rno from data
) , result as (
SELECT d.*
, ( SELECT MAX(ID)
FROM CTE c
WHERE NOT EXISTS (SELECT * FROM CTE
WHERE rno = c.rno-1 and columnx = c.columnx)
AND c.ID <= d.ID) AS g
FROM data d
)
SELECT columnx,
COUNT(1) cnt
FROM result
GROUP BY columnx,
g
Result:
columnx cnt
A 2
B 1
C 1
A 1
B 2