How do I write an SQL query that counts, for each row, how many columns hold each value?
I have this:
emp_no  d1  d2  d3  d4  d5  d6  d7  d8  d9  d10  date
------------------------------------------------------------
1002    2   2   2   26  26  4   4   53  53  53   2021-03-31
1003    4   4   4   26  26  2   26  26  26  26   2021-03-31
1002    2   2   2   26  26  4   4   26  26  26   2021-04-30
I want the result like this:
emp_no  2  4  26  51  53  date
--------------------------------------
1002    3  2  2   0   3   2021-03-31
1003    1  3  6   0   0   2021-03-31
1002    3  2  5   0   0   2021-04-30
I tried to UNPIVOT the data, but how can I pivot it back?
Do I create a view with the unpivoted data and then re-pivot the aggregated data?
SELECT EMP_NO, TS_MTH_YR, TSS_D
FROM (
SELECT EMP_NO, TS_MTH_YR, [D1], [D2], [D3], [D4], [D5], [D6], [D7], [D8], [D9], [D10]
FROM TSS_MONTHLY_TS
) AS TSS
UNPIVOT (
TSS_D FOR TSS_DAYS IN ([D1], [D2], [D3], [D4], [D5], [D6], [D7], [D8], [D9], [D10])
) AS TS
As I mentioned in the comments, you'll need to both unpivot and then repivot your data here. One method would therefore be the below:
WITH YourTable AS(
SELECT emp_no ,d1 ,d2 ,d3 ,d4 ,d5 ,d6 ,d7 ,d8 ,d9 ,d10 , CONVERT(date,date) AS date --That's not confusing
FROM (VALUES(1002,2,2,2,26,26,4,4 ,53 ,53 ,53 ,'2021-03-31'),
(1003,4,4,4,26,26,2,26, 26, 26, 26,'2021-03-31'),
(1002,2,2,2,26,26,4,4 ,26 ,26 ,26 ,'2021-04-30'))V(emp_no ,d1 ,d2 ,d3 ,d4 ,d5 ,d6 ,d7 ,d8 ,d9 ,d10 ,date))
SELECT YT.emp_no,
COUNT(CASE V.Val WHEN 2 THEN 1 END) AS [2],
COUNT(CASE V.Val WHEN 4 THEN 1 END) AS [4],
COUNT(CASE V.Val WHEN 26 THEN 1 END) AS [26],
COUNT(CASE V.Val WHEN 51 THEN 1 END) AS [51],
COUNT(CASE V.Val WHEN 53 THEN 1 END) AS [53],
YT.[date]
FROM YourTable YT
CROSS APPLY (VALUES('d1',YT.d1),
('d2',YT.d2),
('d3',YT.d3),
('d4',YT.d4),
('d5',YT.d5),
('d6',YT.d6),
('d7',YT.d7),
('d8',YT.d8),
('d9',YT.d9),
('d10',YT.d10))V(Col,Val)
GROUP BY YT.emp_no,
YT.[date];
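For anyone who wants to check the unpivot-then-count idea without a SQL Server instance, here is a minimal sketch in Python using the built-in sqlite3 module. SQLite is only a stand-in: it has no CROSS APPLY or UNPIVOT, so the unpivot is done with UNION ALL instead. The table name ts and the column name dt are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = [f"d{i}" for i in range(1, 11)]          # d1 .. d10
col_defs = ", ".join(c + " INT" for c in cols)
conn.execute(f"CREATE TABLE ts (emp_no INT, {col_defs}, dt TEXT)")
rows = [
    (1002, 2, 2, 2, 26, 26, 4, 4, 53, 53, 53, "2021-03-31"),
    (1003, 4, 4, 4, 26, 26, 2, 26, 26, 26, 26, "2021-03-31"),
    (1002, 2, 2, 2, 26, 26, 4, 4, 26, 26, 26, "2021-04-30"),
]
placeholders = ", ".join("?" * 12)              # 12 columns in total
conn.executemany(f"INSERT INTO ts VALUES ({placeholders})", rows)

# Unpivot: one SELECT per day column, glued together with UNION ALL.
unpivot = " UNION ALL ".join(
    f"SELECT emp_no, dt, {c} AS val FROM ts" for c in cols
)

# Re-pivot: one conditional COUNT per value of interest.
wanted = [2, 4, 26, 51, 53]
counts = ", ".join(
    f'COUNT(CASE val WHEN {v} THEN 1 END) AS "{v}"' for v in wanted
)

sql = (
    f"SELECT emp_no, {counts}, dt FROM ({unpivot}) AS u "
    "GROUP BY emp_no, dt ORDER BY dt, emp_no"
)
result = conn.execute(sql).fetchall()
for row in result:
    print(row)
```

The list of values to count (wanted) is fixed here, just as the CASE expressions are fixed in the T-SQL above; a value that never occurs, such as 51, simply counts as 0.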
I have a view based on this query:
SELECT CONVERT(VARCHAR(10), date, 103) AS date,
eventid, name, time, pts
FROM results
WHERE DATEPART(yy, date) = 2019;
This provides a data set such as this:
Date        EventID  Name             Time   Points
24/04/2019  10538    Fred Flintstone  22:27  10
24/04/2019  10538    Barney Rubble    22:50  9
24/04/2019  10538    Micky Mouse      23:17  8
24/04/2019  10538    Yogi Bear        23:54  7
24/04/2019  10538    Donald Duck      24:07  6
01/05/2019  10541    Barney Rubble    21:58  10
01/05/2019  10541    Fred Flintstone  22:00  9
01/05/2019  10541    Donald Duck      23:39  8
01/05/2019  10541    Yogi Bear        23:43  7
12/06/2019  10569    Fred Flintstone  22:06  10
12/06/2019  10569    Barney Rubble    22:22  9
12/06/2019  10569    Micky Mouse      23:05  8
12/06/2019  10569    Donald Duck      23:55  7
I need an output row for each name listing the pts per round and a total in the form:
Name             24/04/2019  01/05/2019  12/06/2019  total
Fred Flintstone  10          9           10          29
Barney Rubble    9           10          9           28
Yogi Bear        7           7           7           21
Micky Mouse      8                       8           16
Donald Duck      6           8                       14
There could be up to 16 non-consecutive event dates for the year.
Nothing wrong with PIVOT but, for me, the easiest and most performant way to do this is a Cross Tab. The syntax is less verbose, more portable, and easier to understand.
First, some DDL and easily consumable sample data. <<< Learn how to do this; it will get you better answers more quickly.
SET NOCOUNT ON;
SET DATEFORMAT dmy; -- I need this because I'm American
-- DDL and easily consumable sample data
CREATE TABLE #Results
(
[Date] DATE,
EventId INT,
[Name] VARCHAR(40), -- if indexed, go as narrow as possible
[Time] TIME,
Points INT,
INDEX uq_poc_results CLUSTERED([Name],[EventId]) -- a covering index is vital for a query like this
); -- note: ^^^ this is a bad clustered index candidate; I went this route for simplicity
INSERT #Results VALUES
('4/04/2019', 10538, 'Fred Flintstone', '22:27',10),
('24/04/2019',10538, 'Barney Rubble', '22:50',9),
('24/04/2019',10538, 'Micky Mouse ', '23:17',8),
('24/04/2019',10538, 'Yogi Bear', '23:54',7),
('24/04/2019',10538, 'Donald Duck', '23:07',6),
('01/05/2019',10541, 'Barney Rubble', '21:58',10),
('01/05/2019',10541, 'Fred Flintstone', '22:00',9),
('01/05/2019',10541, 'Donald Duck', '23:39',8),
('01/05/2019',10541, 'Yogi Bear', '23:43',7),
('12/06/2019',10569, 'Fred Flintstone', '22:06',10),
('12/06/2019',10569, 'Barney Rubble', '22:22',9),
('12/06/2019',10569, 'Micky Mouse', '23:05',8),
('12/06/2019',10569, 'Donald Duck', '23:55',7);
Note that I created a clustered index on (Name,EventId). In the real world I would use a non-clustered index that covers the columns you need. If you have a lot of rows then you will want that index.
Basic Cross-Tab
SELECT [Name] = r.[Name],
[24/04/2019] = MAX(CASE r.[Date] WHEN '24/04/2019' THEN r.Points ELSE 0 END),
[01/05/2019] = MAX(CASE r.[Date] WHEN '01/05/2019' THEN r.Points ELSE 0 END),
[12/06/2019] = MAX(CASE r.[Date] WHEN '12/06/2019' THEN r.Points ELSE 0 END)
FROM #Results AS r
GROUP BY r.[Name];
Results:
Name 24/04/2019 01/05/2019 12/06/2019
-------------------- ------------ ------------ ------------
Barney Rubble 9 10 9
Donald Duck 6 8 7
Fred Flintstone 0 9 10
Micky Mouse 8 0 8
Yogi Bear 7 7 0
To get the total we can wrap this logic in a subquery and add the columns like this:
SELECT
[Name] = piv.N,
[24/04/2019] = piv.D1,
[01/05/2019] = piv.D2,
[12/06/2019] = piv.D3,
Total = piv.D1+piv.D2+piv.D3
FROM
(
SELECT r.[Name],
MAX(CASE r.[Date] WHEN '24/04/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '01/05/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '12/06/2019' THEN r.Points ELSE 0 END)
FROM #Results AS r
GROUP BY r.[Name]
) AS piv(N,D1,D2,D3);
Returns:
Name 24/04/2019 01/05/2019 12/06/2019 Total
------------------- ----------- ----------- ----------- -------
Barney Rubble 9 10 9 28
Donald Duck 6 8 7 21
Fred Flintstone 0 9 10 19
Micky Mouse 8 0 8 16
Yogi Bear 7 7 0 14
Not only does this get you what you need with very little SQL, you also benefit from pre-aggregation inside the subquery. A huge benefit of this approach over PIVOT is that you can do multiple aggregations in one query. Below are two examples of how to use this approach for multiple aggregations: the first uses a standard GROUP BY twice, the other uses window aggregate functions (... OVER (PARTITION BY ... ORDER BY ...)):
--==== Traditional Approach
SELECT
[Name] = piv.N,
[24/04/2019] = MAX(piv.D1),
[01/05/2019] = MAX(piv.D2),
[12/06/2019] = MAX(piv.D3),
Total = MAX(f.Ttl),
Avg1 = AVG(piv.D1), -- 1st date (24/04/2019)
Avg2 = AVG(piv.D2), -- 2nd date...
Avg3 = AVG(piv.D3), -- 3rd date...
TotalAvg = AVG(f.Ttl) ,
Mn = MIN(f.Ttl) ,
Mx = MAX(f.Ttl)
FROM
(
SELECT r.[Name],
MAX(CASE r.[Date] WHEN '24/04/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '01/05/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '12/06/2019' THEN r.Points ELSE 0 END)
FROM #Results AS r
GROUP BY r.[Name]
) AS piv(N,D1,D2,D3)
CROSS APPLY (VALUES(piv.D1+piv.D2+piv.D3)) AS f(Ttl)
GROUP BY piv.N;
--==== Leveraging Window Aggregates
SELECT
[Name] = piv.N,
[24/04/2019] = piv.D1,
[01/05/2019] = piv.D2,
[12/06/2019] = piv.D3,
Total = f.Ttl,
Avg1 = AVG(piv.D1) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)), -- 1st date (24/04/2019)
Avg2 = AVG(piv.D2) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)), -- 2nd date...
Avg3 = AVG(piv.D3) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)), -- 3rd date...
TotalAvg = AVG(f.Ttl) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)),
Mn = MIN(f.Ttl) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)),
Mx = MAX(f.Ttl) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL))
FROM
(
SELECT r.[Name],
MAX(CASE r.[Date] WHEN '24/04/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '01/05/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '12/06/2019' THEN r.Points ELSE 0 END)
FROM #Results AS r
GROUP BY r.[Name]
) AS piv(N,D1,D2,D3)
CROSS APPLY (VALUES(piv.D1+piv.D2+piv.D3)) AS f(Ttl);
Both Return:
Name 24/04/2019 01/05/2019 12/06/2019 Total Avg1 Avg2 Avg3 TotalAvg Mn Mx
----------------- ----------- ----------- ----------- ------ ------ ------ ------ ---------- ------ ------
Barney Rubble 9 10 9 28 9 10 9 28 28 28
Donald Duck 6 8 7 21 6 8 7 21 21 21
Fred Flintstone 0 9 10 19 0 9 10 19 19 19
Micky Mouse 8 0 8 16 8 0 8 16 16 16
Yogi Bear 7 7 0 14 7 7 0 14 14 14
To handle the columns dynamically you need to have a look at:
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs by Jeff Moden.
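As a rough illustration of what "dynamic" means here (this is not the method from the linked article, just a minimal sketch): read the distinct dates first, then generate one conditional aggregate per date. The sketch uses Python's built-in sqlite3 module as a stand-in for SQL Server. The table name results, the column name event_date, the ISO date strings, and the generated aliases d0, d1, ... are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (event_date TEXT, name TEXT, pts INT)")
conn.executemany("INSERT INTO results VALUES (?, ?, ?)", [
    ("2019-04-24", "Fred Flintstone", 10), ("2019-04-24", "Barney Rubble", 9),
    ("2019-04-24", "Micky Mouse", 8), ("2019-04-24", "Yogi Bear", 7),
    ("2019-04-24", "Donald Duck", 6),
    ("2019-05-01", "Barney Rubble", 10), ("2019-05-01", "Fred Flintstone", 9),
    ("2019-05-01", "Donald Duck", 8), ("2019-05-01", "Yogi Bear", 7),
    ("2019-06-12", "Fred Flintstone", 10), ("2019-06-12", "Barney Rubble", 9),
    ("2019-06-12", "Micky Mouse", 8), ("2019-06-12", "Donald Duck", 7),
])

# Step 1: discover the pivot columns from the data itself.
dates = [d for (d,) in conn.execute(
    "SELECT DISTINCT event_date FROM results ORDER BY event_date")]

# Step 2: build one conditional aggregate per date. The date values are
# bound as parameters; only the generated aliases are spliced into the SQL.
cases = ", ".join(
    f"MAX(CASE WHEN event_date = ? THEN pts ELSE 0 END) AS d{i}"
    for i in range(len(dates))
)
sql = (
    f"SELECT name, {cases}, SUM(pts) AS total "
    "FROM results GROUP BY name ORDER BY total DESC, name"
)

# Step 3: run the generated cross tab.
table = conn.execute(sql, dates).fetchall()
print(["name"] + dates + ["total"])
for row in table:
    print(row)
```

In T-SQL the same three steps are usually done by building the statement as a string and running it with sp_executesql; the up to 16 event dates from the question would just produce 16 generated columns.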
I have a table with the columns Age, Period and Year. The column Age always starts with 0 and doesn't have a fixed maximum value (I used 'Age' 0 to 30 in this example but the range could also be 0 to 100 etc.), the values Period and Year only appear in certain rows at certain ages.
However, the Age at which the values for Period and Year appear varies, so the solution should be dynamic. What is the best way to fill in the NULL values with the correct Period and Year?
I am using SQL Server.
Age Period Year
-----------------
0 NULL NULL
1 NULL NULL
2 NULL NULL
3 NULL NULL
4 NULL NULL
5 NULL NULL
6 NULL NULL
7 NULL NULL
8 NULL NULL
9 NULL NULL
10 NULL NULL
11 NULL NULL
12 NULL NULL
13 NULL NULL
14 NULL NULL
15 NULL NULL
16 NULL NULL
17 NULL NULL
18 NULL NULL
19 NULL NULL
20 NULL NULL
21 46 2065
22 NULL NULL
23 NULL NULL
24 NULL NULL
25 NULL NULL
26 51 2070
27 NULL NULL
28 NULL NULL
29 NULL NULL
30 NULL NULL
The result should look like this: the numbers for Period and Year should count up and down from the last known values for Period and Year.
Age Period Year
-----------------
0 25 2044
1 26 2045
2 27 2046
3 28 2047
4 29 2048
5 30 2049
6 31 2050
7 32 2051
8 33 2052
9 34 2053
10 35 2054
11 36 2055
12 37 2056
13 38 2057
14 39 2058
15 40 2059
16 41 2060
17 42 2061
18 43 2062
19 44 2063
20 45 2064
21 46 2065
22 47 2066
23 48 2067
24 49 2068
25 50 2069
26 51 2070
27 52 2071
28 53 2072
29 54 2073
30 55 2074
Here is an UPDATE to my question, as I didn't specify my requirements in enough detail:
The solution should be able to handle different combinations of Age, Period and Year. My starting point will always be a known Age, Period and Year combination. However, the combination Age = 21, Period = 46 and Year = 2065 (or 26|51|2070 as the second combination) in my example is not static. The value at Age = 21 could be anything, e.g. Period = 2 and Year = 2021. Whatever the combination (Age, Period, Year) is, the solution should fill in the gaps and finish the sequence, counting up and down from the known values for Period and Year. If a Period value in the sequence becomes negative, the solution should return NULL, if possible.
It seems Period and Year always differ from Age by a fixed offset (25 and 2044 in your sample data),
so
select age, isnull(period,age +25) Period, isnull(year,age+2044) year
from yourtable
or the standard function coalesce (as suggested by Gordon Linoff)
select age, coalesce(period,age +25) Period, coalesce(year,age+2044) year
from yourtable
Table creation code
create table yourtable ( AGE int , Period int, Year int )
insert into yourtable
Select 0 AS AGE , null As Period , null As Year UNION all
Select 1 AS AGE , null As Period , null As Year UNION all
Select 2 AS AGE , null As Period , null As Year UNION all
Select 3 AS AGE , null As Period , null As Year UNION all
Select 4 AS AGE , null As Period , null As Year UNION all
Select 5 AS AGE , null As Period , null As Year UNION all
Select 6 AS AGE , null As Period , null As Year UNION all
Select 7 AS AGE , null As Period , null As Year UNION all
Select 8 AS AGE , null As Period , null As Year UNION all
Select 9 AS AGE , null As Period , null As Year UNION all
Select 10 AS AGE , null As Period , null As Year UNION all
Select 11 AS AGE , null As Period , null As Year UNION all
Select 12 AS AGE , null As Period , null As Year UNION all
Select 13 AS AGE , null As Period , null As Year UNION all
Select 14 AS AGE , null As Period , null As Year UNION all
Select 15 AS AGE , null As Period , null As Year UNION all
Select 16 AS AGE , null As Period , null As Year UNION all
Select 17 AS AGE , null As Period , null As Year UNION all
Select 18 AS AGE , null As Period , null As Year UNION all
Select 19 AS AGE , null As Period , null As Year UNION all
Select 20 AS AGE , null As Period , null As Year UNION all
Select 21 AS AGE ,46 As Period ,2065 As Year UNION all
Select 22 AS AGE , null As Period , null As Year UNION all
Select 23 AS AGE , null As Period , null As Year UNION all
Select 24 AS AGE , null As Period , null As Year UNION all
Select 25 AS AGE , null As Period , null As Year UNION all
Select 26 AS AGE , 51 As Period ,2070 As Year UNION all
Select 27 AS AGE , null As Period , null As Year UNION all
Select 28 AS AGE , null As Period , null As Year UNION all
Select 29 AS AGE , null As Period , null As Year UNION all
Select 30 AS AGE , null As Period , null As Year
Steps
We need one row with non-NULL values for Period and Year.
Using that row's Age, work out the value of both columns at Age 0.
Then add each row's Age to those values to fill the whole table.
Code to fill in the sequence
;with tmp as
(select top 1 * from yourtable where Period is not null and year is not null)
update yourtable
set Period = (tmp.Period - tmp.age) + yourtable.age
, year = (tmp.year - tmp.age) + yourtable.age
from yourtable , tmp
OR
Declare @Year int ,@Period int
-- offsets of Year and Period from Age, taken from one known row
select top 1 @Year = year - age ,@Period = Period - age
from yourtable where Period is not null and year is not null
update yourtable
set Period = @Period + age
,Year = @Year + age
You finally want three sequences with different start values. Then you simply need to calculate an offset and add it to age:
with cte as
(
select age
,max(period - age) over () + age as period -- adjusted period
,max(yr - age) over () + age as yr -- adjusted yr
from #yourtable
)
select age
-- If a Period value sequence becomes negative the solutions should return NULL
,case when period >0 then period end as period
,yr
from cte
See fiddle
-- Some logic like the below should work in this case: take one known row as the origin and calculate the other missing values from it. Good luck!
declare @knownage int, @knownperiod int, @knownyear int;
declare @age int, @period int, @yr int, @maxage int;
-- pick the origin row
select top 1 @knownage = age, @knownperiod = period, @knownyear = [year]
from table1
where period is not null and [year] is not null
order by age;
select @maxage = max(age) from table1;
-- walk down from the origin to age 0
select @age = @knownage, @period = @knownperiod, @yr = @knownyear;
while (@age > 0)
begin
set @age = @age - 1;
set @period = @period - 1;
set @yr = @yr - 1;
update table1 set period = @period, [year] = @yr where age = @age;
end
-- now walk up from the origin to the highest age
select @age = @knownage, @period = @knownperiod, @yr = @knownyear;
while (@age < @maxage)
begin
set @age = @age + 1;
set @period = @period + 1;
set @yr = @yr + 1;
update table1 set period = @period, [year] = @yr where age = @age;
end
Is the process to first calculate the increments (age -> period and age -> year) then simply add those increments to the age values?
This assumes the differences between age and period, and age and year, are consistent across rows (just not filled in sometimes).
As such, you could use the following to first calculate the increments (PeriodInc, YrInc) and then select the values with the increments added (noting that if period goes negative, it gets NULL).
; WITH PeriodInc AS (SELECT TOP 1 Period - Age AS PeriodInc FROM #yourtable WHERE Period IS NOT NULL),
YrInc AS (SELECT TOP 1 Yr - Age AS YrInc FROM #yourtable WHERE Yr IS NOT NULL)
SELECT Age,
CASE WHEN (Age + PeriodInc) >= 0 THEN (Age + PeriodInc) ELSE NULL END AS Period,
Age + YrInc AS Yr
FROM #yourtable
CROSS JOIN PeriodInc
CROSS JOIN YrInc
Here is a DB_Fiddle with the code
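The offset idea used in the answers above can be tried out quickly with a minimal sketch in Python's built-in sqlite3 module (SQLite standing in for SQL Server). The helper name fill, the table name t and the column name yr are assumptions made for this example, and it relies on all known rows sharing the same offset from Age.

```python
import sqlite3

def fill(known, max_age=30):
    """Fill Period/Year for ages 0..max_age from the known {age: (period, yr)} rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (age INT, period INT, yr INT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [(a, *known.get(a, (None, None))) for a in range(max_age + 1)],
    )
    sql = """
    WITH off AS (
        -- MAX ignores NULLs, so only the known rows contribute
        SELECT MAX(period - age) AS p_off, MAX(yr - age) AS y_off FROM t
    )
    SELECT age,
           CASE WHEN age + p_off >= 0 THEN age + p_off END AS period,
           age + y_off AS yr
    FROM t CROSS JOIN off
    ORDER BY age
    """
    return conn.execute(sql).fetchall()

# The sample from the question: known rows at Age 21 and Age 26.
filled = fill({21: (46, 2065), 26: (51, 2070)})
# The case from the update: Period = 2 and Year = 2021 at Age 21,
# so Period would go negative for the youngest ages.
shifted = fill({21: (2, 2021)})

print(filled[0], filled[30])
print(shifted[18], shifted[19])
```

A negative Period comes back as NULL (None in Python) because the CASE expression has no ELSE branch.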
This solution takes 4 inputs:
@list_length -- (integer) the number of rows to generate (up to 12^5=248,832)
@start_age -- (integer) beginning age
@start_period -- (integer) beginning period
@start_year -- (integer) beginning year
For any combination of inputs this code generates the requested output. If either the Period or the Year is calculated to be negative then it is converted to NULL. The current limit on the list length could be increased to whatever is necessary. The technique of creating a row_number using cross joined rows is known to be very fast when generating large sequences. Above about 500 rows it's always faster than a recursion based CTE. At small row numbers there's little to no performance difference between the two techniques.
Here are the code and output to match the example data.
Inputs
declare
@list_length int=31,
@start_age int=21,
@start_period int=46,
@start_year int=2065;
Code
with
n(n) as (select * from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(n)),
tally_cte(n) as (
select row_number() over (order by (select null))
from n n1 cross join n n2 cross join n n3 cross join n n4 cross join n n5)
select p.Age,
case when p.[Period]<0 then null else p.[Period] end [Period],
case when p.[Year]<0 then null else p.[Year] end [Year]
from tally_cte t
cross apply
(select (t.n-1) [Age], (t.n-1)+(@start_period-@start_age) [Period],
(t.n-1)+(@start_year-@start_age) [Year]) p
where t.n<=@list_length;
Output
Age Period Year
0 25 2044
1 26 2045
2 27 2046
3 28 2047
4 29 2048
5 30 2049
6 31 2050
7 32 2051
8 33 2052
9 34 2053
10 35 2054
11 36 2055
12 37 2056
13 38 2057
14 39 2058
15 40 2059
16 41 2060
17 42 2061
18 43 2062
19 44 2063
20 45 2064
21 46 2065
22 47 2066
23 48 2067
24 49 2068
25 50 2069
26 51 2070
27 52 2071
28 53 2072
29 54 2073
30 55 2074
Suppose both the Period and the Year are less than the start Age. When the calculated values are negative the value is replaced with a NULL.
Inputs
declare
@list_length int=100,
@start_age int=10,
@start_period int=5,
@start_year int=8;
Output
Age Period Year
0 NULL NULL
1 NULL NULL
2 NULL 0
3 NULL 1
4 NULL 2
5 0 3
6 1 4
7 2 5
8 3 6
9 4 7
10 5 8
11 6 9
12 7 10
...
99 94 97
Imo this is a flexible and efficient way to meet all of the requirements. Please let me know if there are any issues.
This reads like a gaps-and-islands problem, where "empty" rows are the gaps and non-empty rows are the islands.
You want to fill the gaps. Your question is a bit tricky, because you do not clearly describe how to proceed when a gap row has both preceding and following islands - and what to do if they are not consistent.
Let me assume that you want to derive the value from the following island if there is one available, and fall back on the preceding island.
Here is an approach using lateral joins to retrieve the next and preceding non-empty row:
select t.age,
coalesce(t.period, n.period - n.diff, p.period - p.diff) period,
coalesce(t.year, n.year - n.diff, p.year - p.diff) year
from mytable t
outer apply (
select top (1) t1.*, t1.age - t.age diff
from mytable t1
where t1.age > t.age and t1.period is not null and t1.year is not null
order by t1.age
) n
outer apply (
select top (1) t1.*, t1.age - t.age diff
from mytable t1
where t1.age < t.age and t1.period is not null and t1.year is not null
order by t1.age desc
) p
order by t.age
Actually, this would probably be more efficiently performed with window functions. We can implement the very same logic by building groups of records with window counts, then doing the computation within the groups:
select
age,
coalesce(
period,
max(period) over(partition by grp2) - max(age) over(partition by grp2) + age,
max(period) over(partition by grp1) - min(age) over(partition by grp1) + age
) period,
coalesce(
year,
max(year) over(partition by grp2) - max(age) over(partition by grp2) + age,
max(year) over(partition by grp1) - min(age) over(partition by grp1) + age
) year
from (
select t.*,
count(period) over(order by age) grp1,
count(period) over(order by age desc) grp2
from mytable t
) t
order by age
Demo on DB Fiddle - both queries yield:
age | period | year
--: | -----: | ---:
0 | 25 | 2044
1 | 26 | 2045
2 | 27 | 2046
3 | 28 | 2047
4 | 29 | 2048
5 | 30 | 2049
6 | 31 | 2050
7 | 32 | 2051
8 | 33 | 2052
9 | 34 | 2053
10 | 35 | 2054
11 | 36 | 2055
12 | 37 | 2056
13 | 38 | 2057
14 | 39 | 2058
15 | 40 | 2059
16 | 41 | 2060
17 | 42 | 2061
18 | 43 | 2062
19 | 44 | 2063
20 | 45 | 2064
21 | 46 | 2065
22 | 47 | 2066
23 | 48 | 2067
24 | 49 | 2068
25 | 50 | 2069
26 | 51 | 2070
27 | 52 | 2071
28 | 53 | 2072
29 | 54 | 2073
30 | 55 | 2074
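To see where this differs from a single global offset, here is a minimal sketch of the window-function version run against two known rows that deliberately disagree. It uses Python's built-in sqlite3 module as a stand-in for SQL Server and assumes an SQLite build with window functions (3.25 or later); the table name t, the column name yr and the sample values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (age INT, period INT, yr INT)")
# Two islands whose offsets from age are inconsistent on purpose.
known = {2: (10, 2000), 5: (20, 2020)}
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(a, *known.get(a, (None, None))) for a in range(7)],
)

sql = """
SELECT age,
       COALESCE(period,
                -- derive from the following island, if any
                MAX(period) OVER (PARTITION BY grp2) - MAX(age) OVER (PARTITION BY grp2) + age,
                -- otherwise fall back on the preceding island
                MAX(period) OVER (PARTITION BY grp1) - MIN(age) OVER (PARTITION BY grp1) + age
       ) AS period,
       COALESCE(yr,
                MAX(yr) OVER (PARTITION BY grp2) - MAX(age) OVER (PARTITION BY grp2) + age,
                MAX(yr) OVER (PARTITION BY grp1) - MIN(age) OVER (PARTITION BY grp1) + age
       ) AS yr
FROM (SELECT age, period, yr,
             COUNT(period) OVER (ORDER BY age)      AS grp1,
             COUNT(period) OVER (ORDER BY age DESC) AS grp2
      FROM t) AS g
ORDER BY age
"""
filled = conn.execute(sql).fetchall()
for row in filled:
    print(row)
```

Ages 0 to 2 count back from the island at age 2, ages 3 and 4 count back from the island at age 5, and age 6, which has no following island, counts forward from age 5.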
You can also use a recursive CTE (it handles any variation of the data in the table, except a table that has no populated period and year at all):
WITH cte AS ( -- get any filled period and year
SELECT TOP 1 period - age delta,
[year]-period start_year
FROM tablename
WHERE period is not null and [year] is not null
), seq AS ( --get min and max age values
SELECT MIN(age) as min_age, MAX(age) as max_age
FROM tablename
), go_recursive AS (
SELECT min_age age,
min_age+delta period ,
start_year+min_age+delta year,
max_age
FROM seq
CROSS JOIN cte --That will generate the initial first row
UNION ALL
SELECT age + 1,
period +1,
year + 1,
max_age
FROM go_recursive
WHERE age < max_age --This part increments the data from first row
)
SELECT age,
period,
[year]
FROM go_recursive
OPTION (MAXRECURSION 0)
-- If you know there are some limit of rows in that kind of tables
--use this row count instead 0
Following the previous question
I have this query:
SELECT Acc.DocTLItem.TLRef ,
Acc.DocTLItem.Debit AS deb,
Acc.DocTLItem.Credit AS cred,
info.MiladiToShamsi(Acc.DocTLItem.StartDocDate) Date,
Acc.TL.TLCode ,
Acc.DocTLItem.DocTLHeaderRef ,
Acc.DocTLHeader.Num
FROM Acc.DocTLItem
INNER JOIN Acc.TL ON Acc.DocTLItem.TLRef = Acc.TL.Id
INNER JOIN Acc.DocTLHeader ON Acc.DocTLItem.DocTLHeaderRef = Acc.DocTLHeader.Id
ORDER BY ( CASE WHEN debit > 0 THEN 0 ELSE 1 END ) ,
Acc.TL.TLCode ,
debit
Result:
TLRef deb cred Date TLCode DocTLHeaderRef Num
--------------------------------------------------------------------------
44 1 0 1396/09/12 111 16 2
44 1 0 1396/09/21 111 18 4
28 13 0 1396/09/11 982 15 1
28 10 0 1396/09/19 982 17 3
44 0 10 1396/09/19 111 17 3
44 0 1 1396/09/21 111 18 4
44 0 9 1396/09/11 111 15 1
44 0 1 1396/09/12 111 16 2
How can I group by Date and then sort by Date?
I need to generate a result set like the one below, where debits come first and rows are then ordered by the TLCode column, all within each Date group.
Expected result:
TLRef deb cred Date TLCode DocTLHeaderRef Num
--------------------------------------------------------------------------------
44 1 0 1396/09/12 111 16 2
28 13 0 1396/09/11 982 15 1
28 10 0 1396/09/19 982 17 3
44 0 9 1396/09/11 111 15 1
44 0 1 1396/09/12 111 16 2
44 0 10 1396/09/19 111 17 3
Sum 24 20
44 1 0 1396/09/21 111 18 4
44 0 1 1396/09/21 111 18 4
Sum 1 1
Maybe the following query block can help you.
It works in 4 steps:
--1. Create a temporary table that we can take as base table (#TMP)
Select *
INTO #TMP
From
(
Select 44 as TLRef, 1 as deb, 0 as cred, '1396/09/12' as Date, 111 as TLCode, 16 as DocTLHeaderRef, 2 as Num Union All
Select 44 as TLRef, 1 as deb, 0 as cred, '1396/09/21' as Date, 111 as TLCode, 18 as DocTLHeaderRef, 4 as Num Union All
Select 28 as TLRef, 13 as deb, 0 as cred, '1396/09/11' as Date, 982 as TLCode, 15 as DocTLHeaderRef, 1 as Num Union All
Select 28 as TLRef, 10 as deb, 0 as cred, '1396/09/19' as Date, 982 as TLCode, 17 as DocTLHeaderRef, 3 as Num Union All
Select 44 as TLRef, 0 as deb, 10 as cred, '1396/09/19' as Date, 111 as TLCode, 17 as DocTLHeaderRef, 3 as Num Union All
Select 44 as TLRef, 0 as deb, 1 as cred, '1396/09/21' as Date, 111 as TLCode, 18 as DocTLHeaderRef, 4 as Num Union All
Select 44 as TLRef, 0 as deb, 9 as cred, '1396/09/11' as Date, 111 as TLCode, 15 as DocTLHeaderRef, 1 as Num Union All
Select 44 as TLRef, 0 as deb, 1 as cred, '1396/09/12' as Date, 111 as TLCode, 16 as DocTLHeaderRef, 2 as Num
) X
--2. Group table by "Date" and select sum of "deb", "cred" columns and insert result in another temporary table (#TMP2)
Select null as TLRef, SUM(deb) as deb, SUM(cred) as cred, Date, null as TLCode, null as DocTLHeaderRef, null as Num
INTO #TMP2
From #TMP
GROUP BY Date
--3. Union both tables to resulting table gets both detail and grouped data.
Select *
From
(
Select *, 0 as IsDetail From #TMP
Union All
Select *, 1 as IsDetail From #TMP2
) X
Order By Date,IsDetail
--4. Drop both temporary table
DROP TABLE #TMP
DROP TABLE #TMP2
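The same detail-plus-subtotal pattern can be written as a single statement without temporary tables. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server; the table name items and the column names tl_ref, doc_date and is_total are assumptions made for the example, and only four of the original columns are kept to stay short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (tl_ref INT, deb INT, cred INT, doc_date TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", [
    (44, 1, 0, "1396/09/12"), (44, 1, 0, "1396/09/21"),
    (28, 13, 0, "1396/09/11"), (28, 10, 0, "1396/09/19"),
    (44, 0, 10, "1396/09/19"), (44, 0, 1, "1396/09/21"),
    (44, 0, 9, "1396/09/11"), (44, 0, 1, "1396/09/12"),
])

sql = """
-- detail rows, marked with is_total = 0
SELECT tl_ref, deb, cred, doc_date, 0 AS is_total FROM items
UNION ALL
-- one subtotal row per date, marked with is_total = 1
SELECT NULL, SUM(deb), SUM(cred), doc_date, 1 FROM items GROUP BY doc_date
-- within each date: details first (debits before credits), then the subtotal
ORDER BY doc_date, is_total, deb DESC
"""
report = conn.execute(sql).fetchall()
for row in report:
    print(row)
```

The marker column is what keeps each Sum row directly under the detail rows of its own date.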
You can try this for sorting.
;WITH CTE AS (
SELECT Acc.DocTLItem.TLRef ,
Acc.DocTLItem.Debit AS deb,
Acc.DocTLItem.Credit AS cred,
info.MiladiToShamsi(Acc.DocTLItem.StartDocDate) Date,
Acc.TL.TLCode ,
Acc.DocTLItem.DocTLHeaderRef ,
Acc.DocTLHeader.Num,
ROW_NUMBER() OVER(PARTITION BY Acc.DocTLItem.Debit, Acc.DocTLItem.Credit, Acc.TL.TLCode ORDER BY Acc.DocTLItem.StartDocDate ) AS RN
FROM Acc.DocTLItem
INNER JOIN Acc.TL ON Acc.DocTLItem.TLRef = Acc.TL.Id
INNER JOIN Acc.DocTLHeader ON Acc.DocTLItem.DocTLHeaderRef = Acc.DocTLHeader.Id
)
SELECT * FROM CTE
ORDER BY
RN,
( CASE WHEN deb > 0 THEN 0 ELSE 1 END ) ,
TLCode ,
[Date],
deb
I'm trying to set a flag based on a condition in my table below:
p_id mon_year e_id flag
---- --------- ----- -----
1 2011/11 20 0
1 2011/11 21 1
1 2012/01 22 1
1 2012/02 23 0
1 2012/02 24 0
1 2012/02 25 1
2 2011/11 28 0
2 2011/11 29 1
2 2012/01 30 1
Grouping by p_id and mon_year, the flag should be set for the last e_id in the month.
I'm confused about how to achieve this.
I tried using ROW_NUMBER with PARTITION BY to separate out the values, but I'm still stuck.
The output I got from the ROW_NUMBER query is below:
p_id mon_year e_id row
---- --------- ----- -----
1 2011/11 20 1
1 2011/11 21 2
1 2012/01 22 1
1 2012/02 23 1
1 2012/02 24 2
1 2012/02 25 3
2 2011/11 28 1
2 2011/11 29 2
2 2012/01 30 1
The maximum row value within each group should set the flag column, but I can't work out how to achieve it. Any help would be useful.
Thanks!
I think this is what you're going for. . . The output exactly matches your example:
declare @t table (p_id int, [year] int, [month] int, [day] int)
insert @t select 1, 2011, 11, 20
union select 1, 2011, 11, 21
union select 1, 2012, 01, 22
union select 1, 2012, 02, 23
union select 1, 2012, 02, 24
union select 1, 2012, 02, 25
union select 2, 2011, 11, 28
union select 2, 2011, 11, 29
union select 2, 2012, 01, 30
select p_id, [year], [month], [day]
, case when r=1 then 1 else 0 end flag
from
(
select p_id, [year], [month], [day]
, row_number() over (partition by p_id, [year], [month] order by [day] desc) r
from @t
) x
order by p_id, [year], [month], [day]
Output:
p_id year month day flag
1 2011 11 20 0
1 2011 11 21 1
1 2012 1 22 1
1 2012 2 23 0
1 2012 2 24 0
1 2012 2 25 1
2 2011 11 28 0
2 2011 11 29 1
2 2012 1 30 1
Try ordering by descending. In that way, you don't have to look for maximum ROW_NUMBER but when ROW_NUMBER is 1 ;)
Something like this (I didn't completely understand what you want to achieve, so this is probably not 100% accurate):
WITH r_MyTable
AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY p_id, mon_year ORDER BY e_id DESC) AS GroupRank
FROM MyTable
)
UPDATE r_MyTable
SET flag = CASE WHEN GroupRank = 1 THEN 1 ELSE 0 END;
You can use MAX on e_id to get the last value for each month; the code is below:
IF OBJECT_ID('tempdb..#tmptest') IS NOT NULL
DROP TABLE #tmptest
SELECT
*
INTO
#tmptest
FROM
(
SELECT '1' p_id, '2011/11' mon_year, '20' e_id, '0' flag UNION ALL
SELECT '1', '2011/11', '21', '1' UNION ALL
SELECT '1', '2012/01', '22', '1' UNION ALL
SELECT '1', '2012/02', '23', '0' UNION ALL
SELECT '1', '2012/02', '24', '0' UNION ALL
SELECT '1', '2012/02', '25', '1' UNION ALL
SELECT '2', '2011/11', '28', '0' UNION ALL
SELECT '2', '2011/11', '29', '1' UNION ALL
SELECT '2', '2012/01', '30', '1'
) as tmp
SELECT
tmptest.*
FROM
(
SELECT
MAX(e_id) e_id
,p_id
,mon_year
FROM
#tmptest
GROUP BY
p_id,mon_year
) tblLastValueEID
INNER JOIN
#tmptest tmptest
ON
tmptest.p_id = tblLastValueEID.p_id
AND
tmptest.mon_year = tblLastValueEID.mon_year
AND
tmptest.e_id = tblLastValueEID.e_id
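For comparison, here is a minimal sketch that computes the flag directly with ROW_NUMBER instead of joining back to a MAX subquery. It uses Python's built-in sqlite3 module as a stand-in for SQL Server and assumes an SQLite build with window functions (3.25 or later); the table name t is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (p_id INT, mon_year TEXT, e_id INT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, "2011/11", 20), (1, "2011/11", 21), (1, "2012/01", 22),
    (1, "2012/02", 23), (1, "2012/02", 24), (1, "2012/02", 25),
    (2, "2011/11", 28), (2, "2011/11", 29), (2, "2012/01", 30),
])

sql = """
SELECT p_id, mon_year, e_id,
       -- numbering each (p_id, mon_year) group from the highest e_id down
       -- means the last row of the month is always number 1
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY p_id, mon_year
                                    ORDER BY e_id DESC) = 1
            THEN 1 ELSE 0 END AS flag
FROM t
ORDER BY p_id, mon_year, e_id
"""
flagged = conn.execute(sql).fetchall()
for row in flagged:
    print(row)
```

Ordering descending avoids having to look up the maximum row number per group, which is the point made in the earlier answer.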
I'm stuck on a seemingly easy query, and I haven't managed to get it working over the last few hours.
I have a table files that holds file names and some values like the number of records in the file, the DATE of creation (create_date), the DATE of processing (processing_date) and so on. There can be multiple files for a create date at different hours, and it is likely that they will not get processed on the same day they were created; in fact it can take three days or longer for them to get processed.
So let's assume I have these rows, as an example:
create_date | processing_date
------------------------------
2012-09-10 11:10:55.0 | 2012-09-11 18:00:18.0
2012-09-10 15:20:18.0 | 2012-09-11 13:38:19.0
2012-09-10 19:30:48.0 | 2012-09-12 10:59:00.0
2012-09-11 08:19:11.0 | 2012-09-11 18:14:44.0
2012-09-11 22:31:42.0 | 2012-09-21 03:51:09.0
What I want in a single query is create_date truncated to the day as the grouping column, with 11 additional columns counting the differences in days between the processing_date and the create_date, so that the result roughly looks like this:
create_date | diff0days | diff1days | diff2days | ... | diff10days
------------------------------------------------------------------------
2012-09-10 | 0 2 1 ... 0
2012-09-11 | 1 0 0 ... 1
and so on, I hope you get the point :)
I have tried this and so far it works getting a single aggregated column for a create_date with a difference of - for example - 3:
SELECT TRUNC(f.create_date, 'DD') as created, count(1) FROM files f WHERE TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD') = 3 GROUP BY TRUNC(f.create_date, 'DD')
I tried combining the single queries and I tried sub-queries, but that didn't help or at least my knowledge about SQL is not sufficient.
What I need is a hint so that I can include the various differences as columns, like shown above. How could I possibly achieve this?
That's basically the pivoting problem:
SELECT TRUNC(f.create_date, 'DD') as created
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 0 then 1 else 0 end) as diff0days
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 1 then 1 else 0 end) as diff1days
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 2 then 1 else 0 end) as diff2days
, ...
FROM files f
GROUP BY
TRUNC(f.create_date, 'DD')
SELECT CreateDate,
sum(CASE WHEN DateDiff(day, CreateDate, ProcessDate) = 1 THEN 1 ELSE 0 END) AS Diff1,
sum(CASE WHEN DateDiff(day, CreateDate, ProcessDate) = 2 THEN 1 ELSE 0 END) AS Diff2,
...
FROM table
GROUP BY CreateDate
ORDER BY CreateDate
As you are using Oracle 11g, you can also get the desired result with a pivot query.
Here is an example:
-- sample of data from your question
create table Your_table(create_date, processing_date) as
(
select '2012-09-10', '2012-09-11' from dual union all
select '2012-09-10', '2012-09-11' from dual union all
select '2012-09-10', '2012-09-12' from dual union all
select '2012-09-11', '2012-09-11' from dual union all
select '2012-09-11', '2012-09-21' from dual
)
;
Table created
with t2 as(
select create_date
, processing_date
, to_date(processing_date, 'YYYY-MM-DD')
- to_date(create_date, 'YYYY-MM-DD') dif
from your_table
)
select create_date
, max(diff0) diff0
, max(diff1) diff1
, max(diff2) diff2
, max(diff3) diff3
, max(diff4) diff4
, max(diff5) diff5
, max(diff6) diff6
, max(diff7) diff7
, max(diff8) diff8
, max(diff9) diff9
, max(diff10) diff10
from (select *
from t2
pivot(
count(dif)
for dif in ( 0 diff0
, 1 diff1
, 2 diff2
, 3 diff3
, 4 diff4
, 5 diff5
, 6 diff6
, 7 diff7
, 8 diff8
, 9 diff9
, 10 diff10
)
) pd
) res
group by create_date
;
Result:
Create_Date  Diff0  Diff1  Diff2  Diff3  Diff4  Diff5  Diff6  Diff7  Diff8  Diff9  Diff10
-----------------------------------------------------------------------------------------
2012-09-10   0      2      1      0      0      0      0      0      0      0      0
2012-09-11   1      0      0      0      0      0      0      0      0      0      1
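The conditional-aggregation variant from the earlier answers can be checked against the question's five sample rows with a minimal sketch in Python's built-in sqlite3 module. SQLite is only a stand-in for Oracle here: date() plays the role of TRUNC(..., 'DD') and julianday() supplies the day arithmetic. The 11 bucket columns are generated in a loop rather than typed out, and the fractional seconds from the sample are dropped for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (create_date TEXT, processing_date TEXT)")
conn.executemany("INSERT INTO files VALUES (?, ?)", [
    ("2012-09-10 11:10:55", "2012-09-11 18:00:18"),
    ("2012-09-10 15:20:18", "2012-09-11 13:38:19"),
    ("2012-09-10 19:30:48", "2012-09-12 10:59:00"),
    ("2012-09-11 08:19:11", "2012-09-11 18:14:44"),
    ("2012-09-11 22:31:42", "2012-09-21 03:51:09"),
])

# One conditional SUM per day difference, 0 through 10.
buckets = ", ".join(
    f"SUM(CASE WHEN diff = {d} THEN 1 ELSE 0 END) AS diff{d}days"
    for d in range(11)
)
sql = f"""
SELECT created, {buckets}
FROM (SELECT date(create_date) AS created,
             -- whole days between the two calendar dates
             CAST(julianday(date(processing_date))
                  - julianday(date(create_date)) AS INTEGER) AS diff
      FROM files) AS f
GROUP BY created
ORDER BY created
"""
report = conn.execute(sql).fetchall()
for row in report:
    print(row)
```

Computing the difference once in the inner query keeps each bucket expression short, and ELSE 0 makes empty buckets come back as 0 rather than NULL.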