I have the following table.
Key | Count | Amount
----| ----- | ------
1 | 2 | 10
1 | 2 | 15
2 | 5 | 1
2 | 5 | 2
2 | 5 | 3
2 | 5 | 50
2 | 5 | 20
3 | 3 | 5
3 | 3 | 4
3 | 3 | 5
Sorry, I couldn't figure out how to format the above as a table.
I'm running this on SQL Server Management Studio 2012.
I'd like the STDEVP of the Amount column for each key, but if the number of records for a key is less than some value 'x' (there will never be more than x records for a given key), I want to pad the calculation with zeros to make up the remainder.
For example, if 'x' is 6:
for key 1, I need stdevp(10,15,0,0,0,0)
for key 2, I need stdevp(1,2,3,50,20,0)
for key 3, I need stdevp(5,4,5,0,0,0)
I just need to be able to add zeros to the calculation. I could insert records to my table, but that seems rather tedious.
This seems complicated -- padding data for each key. Here is one approach:
with xs as (
      -- recursive CTE producing n = 1..6, one row per padded slot
      select 1 as n
      union all
      select n + 1
      from xs
      where xs.n < 6
     )
-- [key] is bracketed because KEY is a reserved word in T-SQL
select k.[key], stdevp(coalesce(t.amount, 0)) as padded_stdevp
from xs cross join
     (select distinct [key] from t) k left join
     (select t.*, row_number() over (partition by [key] order by [key]) as seqnum
      from t
     ) t
     on t.[key] = k.[key] and t.seqnum = xs.n
group by k.[key];
The idea is that the cross join generates 6 rows for each key. Then the left join brings in available rows, up to the maximum.
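As a cross-check on the padded result, the same value can be computed without generating the extra rows, because padding zeros add nothing to either the sum or the sum of squares. The sketch below is not the approach above but an arithmetic alternative based on the identity variance = mean of squares minus square of the mean, with the padded count as the divisor. The table variable @t and the variable @x are names introduced here for illustration.

```sql
-- Sample data copied from the question; @t stands in for the real table.
declare @t table ([key] int, [count] int, amount int);
insert into @t values
    (1,2,10),(1,2,15),
    (2,5,1),(2,5,2),(2,5,3),(2,5,50),(2,5,20),
    (3,3,5),(3,3,4),(3,3,5);

-- @x is the padded row count per key (6 in the question's example).
declare @x int = 6;

-- Population standard deviation over @x values, where the missing
-- values are zeros: sqrt( sum(a^2)/@x - (sum(a)/@x)^2 ).
-- For key 1 this is sqrt(325/6 - (25/6)^2), roughly 6.07.
select [key],
       sqrt(sum(square(cast(amount as float))) / @x
            - square(sum(cast(amount as float)) / @x)) as padded_stdevp
from @t
group by [key]
order by [key];
```

One caveat with this form: floating-point rounding can in principle push the expression under the square root slightly below zero when all padded values are equal, so the row-padding query is the safer choice if that case matters.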
Consider below table
Table
ActivityId  Flag  Type
----------  ----  ----
1           N
2           N
3           Y     EXT
4           Y
5           Y
6           N
7           Y     INT
8           Y
9           N
10          N
11          N
12          Y     EXT
13          N
14          N
15          N
16          Y     EXT
17          Y
18          Y     INT
19          Y
20          Y     EXT
21          Y
22          N
23          N
The first record always has Flag = N; after that, any sequence of Flag = Y or Flag = N can occur. Every time the flag changes from N to Y, the Type field is either EXT or INT. The following Y records (before the next N) might have Type = EXT, INT, or NULL, and this is not important.
I want to calculate Cycle No for this sequence of N/Y. The first cycle starts when Flag = N (the first record always has Flag = N) and ends when the flag changes to Y and Type = EXT. The next cycle then starts when the flag changes to N and ends when the flag becomes Y and Type = EXT. This is repeated until all records are processed. The result for the above table is:
Result
ActivityId  Flag  Type  Cycle No
----------  ----  ----  --------
1           N           1
2           N           1
3           Y     EXT   1
4           Y
5           Y
6           N           2
7           Y     INT   2
8           Y           2
9           N           2
10          N           2
11          N           2
12          Y     EXT   2
13          N           3
14          N           3
15          N           3
16          Y     EXT   3
17          Y
18          Y     INT
19          Y
20          Y     EXT
21          Y
22          N           4
23          N           4
I am using SQL Server 2008 R2 (no LAG/LEAD).
Can you please help me find the SQL query to calculate Cycle No?
I've got a solution. It's not pretty, but through stepwise refinement I get to your result.
The solution is in three steps:
1. Isolate the ActivityId for all the possible start-cycle and end-cycle rows.
2. Filter out all the dud start events.
3. Number the cycles, and find the interval of ActivityId for each cycle.
First I select all start and end cycle events:
with tab as
(select * from (values
(1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))
,CTE1 as
( select
ROW_NUMBER() over (order by t1.ActivityId) rn
,t1.ActivityId
,case when t1.Flag='N' then 'Start' else 'End' end Cycle
from tab t1
where t1.Flag='N' or (t1.Flag='Y' and t1.[Type]='EXT')
)
select * from cte1
This returns
rn ActivityId Cycle
1 1 Start
2 2 Start
3 3 End
4 6 Start
5 9 Start
6 10 Start
7 11 Start
8 12 End
9 13 Start
10 14 Start
11 15 Start
12 16 End
13 20 End
14 22 Start
15 23 Start
The problem now is that while we are sure when a cycle ends (when Flag is Y and Type is EXT), we are not sure when it starts. Rows 1 and 2 both denote a possible start event. Luckily, apart from the very first row, only a start event that directly follows an End event is to be counted. Since we do not have LAG or LEAD, we must join the CTE with itself to look at the previous row:
,CTE2 as
(
select ROW_NUMBER() over (order by a1.activityid) rn
,a1.ActivityId
,a1.Cycle
,a2.Cycle PrevCycle
from CTE1 a1 left join CTE1 a2 on a1.rn=a2.rn+1
where
a2.Cycle is null -- First Cycle
or
( a2.Cycle is not null
and
(
(a1.Cycle='End' and a2.Cycle='Start') -- End of cycle
or
(a1.Cycle='Start'
and a2.Cycle='End') -- next cycles
)
)
)
select * from cte2
This returns
rn ActivityId Cycle PrevCycle
1 1 Start NULL
2 3 End Start
3 6 Start End
4 12 End Start
5 13 Start End
6 16 End Start
7 22 Start End
I keep the first start event, since we always begin with one, and then keep only the End events that directly follow a start event.
Finally, the remaining start events are kept only if the previous event was an End event.
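The previous-row lookup that CTE2 relies on can be isolated as a small pattern: number the rows, then left join the numbered set to itself with an offset of one. The sketch below shows only that pattern on four of the event rows from CTE1; the names @events, numbered, cur, and prev are introduced here for illustration.

```sql
-- A few of the start/end events from the CTE1 output above.
declare @events table (ActivityId int, Cycle varchar(5));
insert into @events values (1,'Start'),(2,'Start'),(3,'End'),(6,'Start');

with numbered as
(
    -- Gap-free sequence number, so "previous row" is always rn - 1
    -- even when ActivityId values are not contiguous.
    select row_number() over (order by ActivityId) as rn
          ,ActivityId
          ,Cycle
    from @events
)
select cur.ActivityId
      ,cur.Cycle
      ,prev.Cycle as PrevCycle   -- NULL for the first row, like LAG()
from numbered cur
left join numbered prev
       on prev.rn = cur.rn - 1
order by cur.ActivityId;
```

On SQL Server 2012 or later the same column could be written as lag(Cycle) over (order by ActivityId); the self-join is the equivalent for 2008 R2.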
Now we can find the start and end of each cycle, and number them:
,cte3 as
(
select ROW_NUMBER() over (order by b1.ActivityId) CycleNumber
,b1.ActivityId StartId,b2.ActivityId EndId
from cte2 b1 left join cte2 b2
on b1.rn=b2.rn-1
where b1.Cycle='Start'
)
select * from cte3
Which gives us what we need:
CycleNumber StartId EndId
1 1 3
2 6 12
3 13 16
4 22 NULL
Now we just need to join this back on our table:
select
a.ActivityId,a.Flag,a.[Type],CycleNumber
from tab a
left join cte3 b on a.ActivityId between b.StartId and isnull(b.EndId,a.ActivityId)
This gives the result you were looking for.
This is just a quick and dirty solution; with a little TLC you can probably tidy it up and reduce the number of steps.
The full solution is here:
with tab as
(select * from (values
(1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))
,CTE1 as
( select
ROW_NUMBER() over (order by t1.ActivityId) rn
,t1.ActivityId
,case when t1.Flag='N' then 'Start' else 'End' end Cycle
from tab t1
where t1.Flag='N' or (t1.Flag='Y' and t1.[Type]='EXT')
)
,CTE2 as
(
select ROW_NUMBER() over (order by a1.activityid) rn
,a1.ActivityId
,a1.Cycle
,a2.Cycle PrevCycle
from CTE1 a1 left join CTE1 a2 on a1.rn=a2.rn+1
where
a2.Cycle is null -- First Cycle
or
( a2.Cycle is not null
and
(
(a1.Cycle='End' and a2.Cycle='Start') -- End of cycle
or
(a1.Cycle='Start'
and a2.Cycle='End') -- next cycles
)
)
)
,cte3 as
(
select ROW_NUMBER() over (order by b1.ActivityId) CycleNumber
,b1.ActivityId StartId,b2.ActivityId EndId
from cte2 b1 left join cte2 b2
on b1.rn=b2.rn-1
where b1.Cycle='Start'
)
select
a.ActivityId,a.Flag,a.[Type],CycleNumber
from tab a
left join cte3 b on a.ActivityId between b.StartId and isnull(b.EndId,a.ActivityId)
If you are happy with recursion, this can be achieved rather simply with a bit of comparison logic to the preceding row when ordered by your ActivityId:
declare @t table(ActivityId int,Flag nvarchar(1),TypeValue nvarchar(3));
insert into @t values(1 ,'N',null),(2 ,'N',null),(3 ,'Y','EXT'),(4 ,'Y',null),(5 ,'Y',null),(6 ,'N',null),(7 ,'Y','INT'),(8 ,'Y',null),(9 ,'N',null),(10,'N',null),(11,'N',null),(12,'Y','EXT'),(13,'N',null),(14,'N',null),(15,'N',null),(16,'Y','EXT'),(17,'Y',null),(18,'Y','INT'),(19,'Y',null),(20,'Y','EXT'),(21,'Y',null),(22,'N',null),(23,'N',null);
with rn as -- Derived table purely to guarantee incremental row number. If you can guarantee your ActivityId values are incremental start to finish, this isn't required.
( select row_number() over (order by ActivityId) as rn
,ActivityId
,Flag
,TypeValue
from @t
),d as
( select rn -- Recursive CTE that compares the current row to the one previous.
,ActivityId
,Flag
,TypeValue
,cast(1 as decimal(10,5)) as CycleNo
from rn
where rn = 1
union all
select rn.rn
,rn.ActivityId
,rn.Flag
,rn.TypeValue
,cast(
case when d.Flag = 'Y' and d.TypeValue = 'EXT' and d.CycleNo >= 1
then case when rn.Flag = 'N'
then d.CycleNo + 1
else (d.CycleNo + 1) * 0.0001 -- This part keeps track of the cycle number in fractional values, which can be removed by converting the final result to INT.
end
else case when rn.Flag = 'N' and d.CycleNo < 1
then d.CycleNo * 10000
else d.CycleNo
end
end
as decimal(10,5)) as CycleNo
from rn
inner join d
on d.rn = rn.rn - 1
)
select ActivityId
,Flag
,TypeValue
,cast(CycleNo as int) as CycleNo
from d
order by ActivityId;
Output:
+------------+------+-----------+---------+
| ActivityId | Flag | TypeValue | CycleNo |
+------------+------+-----------+---------+
| 1 | N | NULL | 1 |
| 2 | N | NULL | 1 |
| 3 | Y | EXT | 1 |
| 4 | Y | NULL | 0 |
| 5 | Y | NULL | 0 |
| 6 | N | NULL | 2 |
| 7 | Y | INT | 2 |
| 8 | Y | NULL | 2 |
| 9 | N | NULL | 2 |
| 10 | N | NULL | 2 |
| 11 | N | NULL | 2 |
| 12 | Y | EXT | 2 |
| 13 | N | NULL | 3 |
| 14 | N | NULL | 3 |
| 15 | N | NULL | 3 |
| 16 | Y | EXT | 3 |
| 17 | Y | NULL | 0 |
| 18 | Y | INT | 0 |
| 19 | Y | NULL | 0 |
| 20 | Y | EXT | 0 |
| 21 | Y | NULL | 0 |
| 22 | N | NULL | 4 |
| 23 | N | NULL | 4 |
+------------+------+-----------+---------+
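One practical point about the recursive CTE: it adds one row per recursion level, and SQL Server stops a recursive CTE after 100 levels by default, so on a table with more than about 100 rows the query would fail unless a MAXRECURSION hint is added to the final select. The sketch below demonstrates the hint on a hypothetical counter CTE (the name n is introduced here); the same option clause would go after "order by ActivityId" in the query above.

```sql
-- Counts 1..500, which needs 499 recursion levels.
-- Without the hint this fails at the default limit of 100.
with n as
(
    select 1 as i
    union all
    select i + 1
    from n
    where i < 500
)
select count(*) as row_count
from n
option (maxrecursion 0);   -- 0 removes the limit; any value up to 32767 is allowed
```

Removing the limit is only safe when the recursion is guaranteed to terminate, as it is here and in the cycle query, where each step moves to the next row number.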
I have:
items which are described by a set of ids (GroupType, ID, Name)
VALUES table which gets populated with factor values on each date so that an item gets only a certain set of factors with values per date.
FACTORS table containing static descriptions of the factors.
Looking for:
I want to create a temporary table with a matrix showing factor values for each item per date, so that one can see in a user-friendly way which factors were populated on a given date (with the corresponding values).
Values
Date GroupType ID Name FactorId Value
01/01/2013 1 1 A 1 10
01/01/2013 1 1 A 2 8
01/01/2013 1 1 A 3 12
01/01/2013 1 2 B 3 5
01/01/2013 1 2 B 4 6
02/01/2013 1 1 A 1 7
02/01/2013 1 1 A 2 6
02/01/2013 1 2 B 3 9
02/01/2013 1 2 B 4 9
Factors
FactorId FactorName
1 Factor1
2 Factor2
3 Factor3
4 Factor4
. .
. .
. .
temporary table Factor Values Matrix
Date        Group  ID  Name  Factor1  Factor2  Factor3  Factor4  Factor...
01/01/2013  1      1   A     10       8        12
01/01/2013  1      2   B                       5        6
02/01/2013  1      1   A     7        6
02/01/2013  1      2   B                       9        9
Any help is greatly appreciated!
This type of data transformation is known as a PIVOT, which takes values from rows and converts them into columns.
In SQL Server 2005+, the PIVOT operator performs this rotation of data.
Static Pivot:
If the set of factors is fixed and known in advance, you can hard-code the FactorNames as columns by using a static pivot.
select date, grouptype, id, name, Factor1, Factor2, Factor3, Factor4
from
(
select v.date,
v.grouptype,
v.id,
v.name,
f.factorname,
v.value
from [values] v
left join factors f
on v.factorid = f.factorid
-- where v.date between date1 and date2
) src
pivot
(
max(value)
for factorname in (Factor1, Factor2, Factor3, Factor4)
) piv;
Dynamic Pivot:
In your case, you stated that you are going to have an unknown number of values. If so, then you will need to use dynamic SQL to generate a SQL string that will be executed at run-time:
DECLARE @cols AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX)
-- Build a comma-separated, bracket-quoted list of the factor names
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(FactorName)
                      from factors
                      FOR XML PATH(''), TYPE
                     ).value('.', 'NVARCHAR(MAX)')
                     ,1,1,'')
-- Splice the column list into the select list and the PIVOT IN clause
set @query = 'SELECT date, grouptype, id, name,' + @cols + ' from
(
select v.date,
v.grouptype,
v.id,
v.name,
f.factorname,
v.value
from [values] v
left join factors f
on v.factorid = f.factorid
-- where v.date between date1 and date2
) x
pivot
(
max(value)
for factorname in (' + @cols + ')
) p '
execute(@query)
Both of these versions generate the same result:
| DATE | GROUPTYPE | ID | NAME | FACTOR1 | FACTOR2 | FACTOR3 | FACTOR4 |
------------------------------------------------------------------------------
| 2013-01-01 | 1 | 1 | A | 10 | 8 | 12 | (null) |
| 2013-01-01 | 1 | 2 | B | (null) | (null) | 5 | 6 |
| 2013-02-01 | 1 | 1 | A | 7 | 6 | (null) | (null) |
| 2013-02-01 | 1 | 2 | B | (null) | (null) | 9 | 9 |
If you want to filter the results based on a date range, then you will just need to add a WHERE clause to the above queries.
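For the dynamic version, one way to add that date filter without concatenating the dates into the SQL string is to pass them as parameters through sp_executesql. The following is a minimal sketch under stated assumptions: the temp tables #vals and #factors, their sample rows, and the parameter names @date1 and @date2 are hypothetical stand-ins for the real tables and range, and temp tables are used because they stay visible inside the dynamic batch.

```sql
-- Hypothetical stand-ins for the Values and Factors tables.
create table #factors (FactorId int, FactorName nvarchar(50));
create table #vals ([Date] date, GroupType int, ID int, Name nvarchar(10),
                    FactorId int, Value int);

insert into #factors values
    (1,N'Factor1'),(2,N'Factor2'),(3,N'Factor3'),(4,N'Factor4');
insert into #vals values
    ('20130101',1,1,N'A',1,10),('20130101',1,1,N'A',2,8),
    ('20130101',1,2,N'B',3,5), ('20130301',1,2,N'B',4,6);

declare @cols nvarchar(max), @query nvarchar(max);

-- Column list for the pivot, quoted so odd factor names stay valid identifiers.
select @cols = stuff((select ',' + quotename(FactorName)
                      from #factors
                      order by FactorId
                      for xml path(''), type
                     ).value('.', 'nvarchar(max)'), 1, 1, '');

-- Only the column list is concatenated; the dates stay as parameters.
set @query = N'select [Date], GroupType, ID, Name, ' + @cols + N'
from (select v.[Date], v.GroupType, v.ID, v.Name, f.FactorName, v.Value
      from #vals v
      join #factors f on f.FactorId = v.FactorId
      where v.[Date] between @date1 and @date2) src
pivot (max(Value) for FactorName in (' + @cols + N')) p
order by [Date], ID;';

exec sp_executesql @query,
     N'@date1 date, @date2 date',
     @date1 = '20130101', @date2 = '20130131';

drop table #vals;
drop table #factors;
```

With these sample rows the March row falls outside the range, so only the two January items are returned.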
It looks like you are simply trying to pivot the rows into columns. I think this does what you want:
-- t stands for the Values table; each Factor column picks up Value, not Name
select Date, GroupType, ID, Name,
       max(case when factorid = 1 then value end) as Factor1,
       max(case when factorid = 2 then value end) as Factor2,
       max(case when factorid = 3 then value end) as Factor3,
       max(case when factorid = 4 then value end) as Factor4
from t
group by Date, GroupType, ID, Name