Suppose I had the following table:
UserId AttributeId DateStart
1 3 1/1/2020
1 4 1/9/2020
1 3 2/2/2020
2 3 3/5/2020
2 3 4/1/2020
2 3 5/1/2020
For each unique UserId/AttributeId pair, it is assumed that the DateEnd is the day prior to the next DateStart for that pair, otherwise it is null (or some default like crazy far into the future - 12/31/3000).
Applying this operation to the above table would yield:
UserId AttributeId DateStart DateEnd
1 3 1/1/2020 2/1/2020
1 4 1/9/2020 <null>
1 3 2/2/2020 <null>
2 3 3/5/2020 3/31/2020
2 3 4/1/2020 4/30/2020
2 3 5/1/2020 <null>
What T-SQL, executing in SQL Server 2008 R2, would accomplish this?
I have updated the query. Please try this:
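SELECT UserId, AttributeId, DateStart, MIN(DateEnd) AS DateEnd
FROM
(
    -- pair each row with every later row for the same UserId/AttributeId
    SELECT X.UserId, X.AttributeId, X.DateStart, DATEADD(DD, -1, Y.DateStart) AS DateEnd
    FROM TAB X LEFT JOIN TAB Y
    ON (X.UserId = Y.UserId) AND (X.AttributeId = Y.AttributeId)
    AND (X.DateStart < Y.DateStart)
) T
-- MIN keeps the nearest later start; rows with no later start stay NULL
GROUP BY UserId, AttributeId, DateStart
ORDER BY DateStart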
You are describing lead(). Note that lead() was introduced in SQL Server 2012, so this will not run on 2008 R2:
select t.*,
dateadd(day, -1, lead(dateStart) over (partition by userId, attributeId order by dateStart)) as dateEnd
from t;
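To check the self-join logic without a SQL Server instance, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. It is not T-SQL: SQLite's date(x, '-1 day') stands in for DATEADD, the dates are assumed to be M/D/YYYY and are rewritten as ISO strings, and the names end_dates and SAMPLE are made up for the example.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# (assumes the original dates are month/day/year).
SAMPLE = [
    (1, 3, "2020-01-01"),
    (1, 4, "2020-01-09"),
    (1, 3, "2020-02-02"),
    (2, 3, "2020-03-05"),
    (2, 3, "2020-04-01"),
    (2, 3, "2020-05-01"),
]


def end_dates(rows):
    """Return (UserId, AttributeId, DateStart, DateEnd) for each input row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tab (UserId INT, AttributeId INT, DateStart TEXT)")
    con.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
    sql = """
        SELECT x.UserId, x.AttributeId, x.DateStart,
               -- nearest later start for the same pair, minus one day;
               -- NULL when there is no later start
               date(MIN(y.DateStart), '-1 day') AS DateEnd
        FROM tab AS x
        LEFT JOIN tab AS y
               ON y.UserId = x.UserId
              AND y.AttributeId = x.AttributeId
              AND y.DateStart > x.DateStart
        GROUP BY x.UserId, x.AttributeId, x.DateStart
        ORDER BY x.UserId, x.DateStart
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in end_dates(SAMPLE):
    print(row)
```

The join plus MIN needs no window functions, which is why this shape also suits SQL Server 2008 R2.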
Now I have a table in redshift like this:
Table Project_team
Employee_ID Employee_Name Start_date Ranking Is_leader Is_Parttime_Staff
Emp001 John 2014-04-01 1 No No
Emp002 Mary 2015-02-01 2 No Yes
Emp003 Terry 2015-02-15 3 Yes No
Emp004 Peter 2016-02-05 4 No No
Emp004 Morris 2016-05-01 5 No No
Initially there is no ranking for staff.
What I do is to use the rank() function like this:
RANK() over (partition by Employee_ID,Employee_Name order by Start_date) as page_seq
However, I now want to adjust the ranking based on each employee's status. If the employee is a leader, he or she should be ranked first. If he or she is part-time staff, he or she should be ranked last. The table should look something like this:
Employee_ID Employee_Name Start_date Ranking Is_leader Is_Parttime_Staff
Emp003 Terry 2015-02-15 1 Yes No
Emp001 John 2014-04-01 2 No No
Emp004 Peter 2016-02-05 3 No No
Emp004 Morris 2016-05-01 4 No No
Emp002 Mary 2015-02-01 5 No Yes
I tried to use a CASE expression to do this, like
Case when Is_leader = true then Ranking = 1 else RANK() over (partition by Employee_ID,Employee_Name order by Start_date) End as page_seq.
However, it does not work.
How can I change the ranking based on conditions in other columns?
Many thanks!
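Use dense_rank() with CASE expressions in the ORDER BY, so that leaders sort first and rows with parmanent = 'yes' sort last:
select *, dense_rank() over(order by case when leader='yes' then 1 else 0 end desc, case when parmanent='yes' then 1 else 0 end) as employeerank
from cte1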
output:
id name leader parmanent employeerank
1 A yes no 1
3 C no no 2
2 B no yes 3
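Applied to the Project_team data from the question, the same idea might look like the sketch below. It runs on SQLite through Python rather than on Redshift, so treat it as a check of the logic only. It assumes the flags are stored as the strings 'Yes'/'No', and it adds Start_date as a third sort key (not part of the answer above) so that the ordinary staff get distinct ranks. The name rank_team is made up for the example, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Rows from the question: id, name, start date, leader flag, part-time flag.
TEAM = [
    ("Emp001", "John", "2014-04-01", "No", "No"),
    ("Emp002", "Mary", "2015-02-01", "No", "Yes"),
    ("Emp003", "Terry", "2015-02-15", "Yes", "No"),
    ("Emp004", "Peter", "2016-02-05", "No", "No"),
    ("Emp004", "Morris", "2016-05-01", "No", "No"),
]


def rank_team(rows):
    """Return (Employee_Name, Ranking): leaders first, part-timers last."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Project_team (Employee_ID TEXT, Employee_Name TEXT, "
        "Start_date TEXT, Is_leader TEXT, Is_Parttime_Staff TEXT)"
    )
    con.executemany("INSERT INTO Project_team VALUES (?, ?, ?, ?, ?)", rows)
    sql = """
        SELECT Employee_Name,
               DENSE_RANK() OVER (
                   ORDER BY CASE WHEN Is_leader = 'Yes' THEN 0 ELSE 1 END,
                            CASE WHEN Is_Parttime_Staff = 'Yes' THEN 1 ELSE 0 END,
                            Start_date   -- tie-breaker within each status group
               ) AS Ranking
        FROM Project_team
        ORDER BY Ranking
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for name, ranking in rank_team(TEAM):
    print(ranking, name)
```

The CASE expressions go inside the window's ORDER BY instead of wrapping the window function in a CASE. That way every row still gets a rank from the same ordering.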
Feels like it should be simple but my mind has gone blank so would appreciate any help!
Let's say I have this dataset
Date sale_id salesperson Missed_payment_this_month
01/01/2016 1001 John 1
01/01/2016 1002 Bob 0
01/01/2016 1003 Bob 0
01/01/2016 1004 John N/A
01/02/2016 1001 John 1
01/02/2016 1002 Bob 1
01/02/2016 1003 Bob 0
01/02/2016 1004 John 1
01/03/2016 1001 John 1
01/03/2016 1002 Bob 0
01/03/2016 1003 Bob 0
01/03/2016 1004 John 1
I want to add these two columns to the end. They count the missed payments in previous months, by sale_id and by salesperson.
Previous_missed_payment_by_sale_id Previous_missed_payment_by_salesperson
0 0
0 0
0 0
0 0
1 1
0 0
0 0
0 1
2 3
1 1
0 1
1 3
The sale_id column is OK, but getting the count per salesperson either gives me a GROUP BY error or adds extra rows. I need to keep the number of rows constant.
My best guess that returns extra columns:
select t1.Date, t1.sale_id, t1.salesperson
,sum(case when t2.Missed_payment_this_month = '1' then 1 else 0 end) previous_missed_sales_id
,sum(case when t2.Missed_payment_this_month = '1' then 1 else 0 end) OVER (PARTITION by t1.salesperson) previous_missed_salesperson
from [dbo].[simple_join_table2] t1
inner join [dbo].[simple_join_table2] t2 on
(t2.[Date] < t1.[Date] AND t1.[sale_id] = t2.[sale_id])
group by t1.Date, t1.sale_id, t1.salesperson
,case when t2.Missed_payment_this_month = '1' then 1 else 0 end
this is the output:
Date sale_id salesperson previous_missed_sales_id previous_missed_salesperson
01/02/2016 1002 Bob 0 1
01/02/2016 1003 Bob 0 1
01/03/2016 1002 Bob 0 1
01/03/2016 1002 Bob 1 1
01/03/2016 1003 Bob 0 1
01/02/2016 1001 John 1 3
01/02/2016 1004 John 0 3
01/03/2016 1001 John 2 3
01/03/2016 1004 John 0 3
01/03/2016 1004 John 1 3
Is this possible without another subquery? Another way to put it: I'm trying to mimic the SUMX and EARLIER functions of Power Pivot.
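If you are on 2012+, use windowed aggregates. Previous = sum of all previous rows including the current date - sum for the current date. When ORDER BY is specified without an explicit frame, the SQL Server default window is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so rows sharing the current valuesDate are included in the running sum.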
with [simple_join_table2] as(
-- sample data
select cast(valuesDate as Date) valuesDate, sale_id, salesperson, Missed_payment_this_month
from (
values
('20160101',1001,'John', 1)
,('20160101',1002,'Bob ', 0)
,('20160101',1003,'Bob ', 0)
,('20160101',1004,'John',null)
,('20160201',1001,'John', 1)
,('20160201',1002,'Bob ', 1)
,('20160201',1003,'Bob ', 0)
,('20160201',1004,'John', 1)
,('20160301',1001,'John', 1)
,('20160301',1002,'Bob ', 0)
,('20160301',1003,'Bob ', 0)
,('20160301',1004,'John', 1)
) t(valuesDate, sale_id, salesperson, Missed_payment_this_month)
)
select valuesDate, sale_id, salesperson, Missed_payment_this_month,
-- isnull() keeps an N/A (null) month from turning the whole result into null
byidprevmonth = sum(isnull(Missed_payment_this_month, 0)) over(partition by sale_id order by valuesDate)
- sum(isnull(Missed_payment_this_month, 0)) over(partition by valuesDate, sale_id),
bypersonprevmonth = sum(isnull(Missed_payment_this_month, 0)) over(partition by salesperson order by valuesDate)
- sum(isnull(Missed_payment_this_month, 0)) over(partition by valuesDate, salesperson)
from [simple_join_table2]
order by salesperson, valuesDate
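The "running total minus current date" idea can be checked against the expected columns from the question with a small sketch. It uses SQLite through Python instead of SQL Server, spells out the RANGE frame explicitly, and treats N/A as 0 via COALESCE. The dates follow the monthly reading used in the sample data above. The names SALES, QUERY and previous_missed are made up for the example, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample data: (date, sale_id, salesperson, missed); None stands for N/A.
SALES = [
    ("2016-01-01", 1001, "John", 1), ("2016-01-01", 1002, "Bob", 0),
    ("2016-01-01", 1003, "Bob", 0), ("2016-01-01", 1004, "John", None),
    ("2016-02-01", 1001, "John", 1), ("2016-02-01", 1002, "Bob", 1),
    ("2016-02-01", 1003, "Bob", 0), ("2016-02-01", 1004, "John", 1),
    ("2016-03-01", 1001, "John", 1), ("2016-03-01", 1002, "Bob", 0),
    ("2016-03-01", 1003, "Bob", 0), ("2016-03-01", 1004, "John", 1),
]

QUERY = """
SELECT d, sale_id, salesperson,
       -- running total up to and including this date, minus this date's total
       SUM(COALESCE(missed, 0)) OVER (
           PARTITION BY sale_id ORDER BY d
           RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
     - SUM(COALESCE(missed, 0)) OVER (PARTITION BY d, sale_id)
       AS prev_by_sale_id,
       SUM(COALESCE(missed, 0)) OVER (
           PARTITION BY salesperson ORDER BY d
           RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
     - SUM(COALESCE(missed, 0)) OVER (PARTITION BY d, salesperson)
       AS prev_by_salesperson
FROM sales
ORDER BY d, sale_id
"""


def previous_missed(rows):
    """Return one output row per input row, with the two 'previous' counts."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sales (d TEXT, sale_id INT, salesperson TEXT, missed INT)"
    )
    con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


for row in previous_missed(SALES):
    print(row)
```

RANGE matters for the salesperson column: both of a salesperson's rows on the same date are peers, so each sees the full total through that date before the current date's total is subtracted. With a ROWS frame the two peers would see different running totals.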
I have table like this:
ID Region CreatedDate Value
--------------------------------
1 USA 2016-01-01 5
2 USA 2016-02-02 10
3 Canada 2016-02-02 2
4 USA 2016-02-03 7
5 Canada 2016-03-03 3
6 Canada 2016-03-04 10
7 USA 2016-03-04 1
8 Cuba 2016-01-01 4
I need to sum the Value column, grouped by Region and by the year and month of CreatedDate. The result will be
Region Year Month SumOfValue
--------------------------------
USA 2016 1 5
USA 2016 2 17
USA 2016 3 1
Canada 2016 2 2
Canada 2016 3 13
Cuba 2016 1 4
But I want to replace every repeated value in the Region column with an empty string, keeping it only in the first row for each region. The final result must be:
Region  Year  Month  SumOfValue
--------------------------------
USA     2016  1      5
        2016  2      17
        2016  3      1
Canada  2016  2      2
        2016  3      13
Cuba    2016  1      4
Thank you for any solution. It would be an advantage if the solution also blanked repeated values in the Year column.
You need to use SUM and GROUP BY to get the SumOfValue. For the formatting, you can use ROW_NUMBER:
WITH Cte AS(
SELECT
Region,
[Year] = YEAR(CreatedDate),
[Month] = MONTH(CreatedDate),
SumOfValue = SUM(Value),
Rn = ROW_NUMBER() OVER(PARTITION BY Region ORDER BY YEAR(CreatedDate), MONTH(CreatedDate))
FROM #tbl
GROUP BY
Region, YEAR(CreatedDate), MONTH(CreatedDate)
)
SELECT
Region = CASE WHEN Rn = 1 THEN c.Region ELSE '' END,
[Year],
[Month],
SumOfValue
FROM Cte c
ORDER BY
c.Region, Rn
Although this can be done in TSQL, I suggest you do the formatting on the application side.
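As a rough illustration of doing the formatting on the application side, here is a sketch in Python. It assumes the grouped rows have already been fetched and are sorted so that rows for the same region are adjacent. The names REGION_TOTALS and blank_repeats are made up for the example, and the optional blank_year flag covers the request about the Year column.

```python
# Grouped totals as they might come back from the SUM/GROUP BY query,
# in the order shown in the question: (region, year, month, sum_of_value).
REGION_TOTALS = [
    ("USA", 2016, 1, 5),
    ("USA", 2016, 2, 17),
    ("USA", 2016, 3, 1),
    ("Canada", 2016, 2, 2),
    ("Canada", 2016, 3, 13),
    ("Cuba", 2016, 1, 4),
]


def blank_repeats(rows, blank_year=False):
    """Replace a repeated region (and optionally year) with an empty string.

    Only the first row of each run of equal regions keeps its value, so the
    input must already be sorted with equal regions next to each other.
    """
    out = []
    prev_region = prev_year = None
    for region, year, month, total in rows:
        same_region = region == prev_region
        same_year = same_region and year == prev_year
        out.append((
            "" if same_region else region,
            "" if (blank_year and same_year) else year,
            month,
            total,
        ))
        prev_region, prev_year = region, year
    return out


for row in blank_repeats(REGION_TOTALS):
    print(row)
```

Keeping the query result unformatted also means the row order can be whatever the application needs, including the order shown in the question.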
I have a table GAMES with this information:
Id_Game Id_Player1 Id_Player2 Week
--------------------------------------
1211 Peter John 2
1215 John Louis 13
1216 Louis Peter 17
I would like to get a list of the last week when each player has played, and the number of games, which should be this:
Id_Player Week numberGames
-----------------------------
Peter 17 2
John 13 2
Louis 17 2
But instead I get this one (note Peter's week):
Id_Player Week numberGames
-----------------------------
Peter 2 2
John 13 2
Louis 17 2
What I do is this:
SELECT Id_Player,
MAX(Week) AS Week,
COUNT(*) as numberGames
FROM ((SELECT Id_Player1 as Id_Player, Week
FROM Games)
UNION ALL
(SELECT Id_Player2 as Id_Player, Week
FROM Games)) AS g2
GROUP BY Id_Player;
Could anyone help me to find the mistake?
What is the datatype of the Week column? If Week is a varchar, you would get this behavior: MAX then compares the values as strings, and '2' sorts after '17'.
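The effect is easy to reproduce. The sketch below uses SQLite through Python, with Week declared as TEXT to stand in for varchar, and runs the query from the question with and without a cast to integer. The function name last_week_per_player is made up for the example. With a purely text column John would also come out as '2', which differs from the output in the question, so this only illustrates the mechanism and does not confirm the diagnosis.

```python
import sqlite3

GAMES = [
    (1211, "Peter", "John", "2"),
    (1215, "John", "Louis", "13"),
    (1216, "Louis", "Peter", "17"),
]


def last_week_per_player(cast_to_int):
    """Return {player: (last_week, number_of_games)} from the Games table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Games (Id_Game INT, Id_Player1 TEXT, "
        "Id_Player2 TEXT, Week TEXT)"  # Week stored as text on purpose
    )
    con.executemany("INSERT INTO Games VALUES (?, ?, ?, ?)", GAMES)
    # Either compare weeks as strings or convert them to integers first.
    week_expr = "CAST(Week AS INTEGER)" if cast_to_int else "Week"
    sql = f"""
        SELECT Id_Player, MAX({week_expr}) AS Week, COUNT(*) AS numberGames
        FROM (SELECT Id_Player1 AS Id_Player, Week FROM Games
              UNION ALL
              SELECT Id_Player2 AS Id_Player, Week FROM Games) AS g2
        GROUP BY Id_Player
    """
    result = {player: (week, games) for player, week, games in con.execute(sql)}
    con.close()
    return result


print(last_week_per_player(cast_to_int=False)["Peter"])  # string comparison
print(last_week_per_player(cast_to_int=True)["Peter"])   # numeric comparison
```

If the column does turn out to be a varchar, casting inside MAX (or changing the column type to an integer) should give the expected week.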