Select records from database in specific date period - sql

Ok I have this example table:
+-------+--------+-----------+
| users | groups | startDate |
+-------+--------+-----------+
| Foo | A | 1 Aug 18 |
| Foo | B | 1 Jan 18 |
| Boo | C | 1 Jan 18 |
| Doo | B | 1 Jan 18 |
| Loo | B | 1 Sep 18 |
+-------+--------+-----------+
I want to select the Group B users whose "startDate" is not later than today and who have no record for another group with a more recent "startDate" that is also not later than today. The correct result should be:
+-------+--------+-----------+
| users | groups | startDate |
+-------+--------+-----------+
| Doo | B | 1 Jan 18 |
+-------+--------+-----------+
I tried the following code but didn't get what I need:
DECLARE @StartDate date = '2018-08-01';
DECLARE @GroupID varchar(1) = 'B';
WITH CurrentUsers AS (
    SELECT users, groups, startDate,
           ROW_NUMBER() OVER (PARTITION BY users
                              ORDER BY CASE WHEN startDate > @StartDate THEN 0 ELSE 1 END,
                                       ABS(DATEDIFF(DAY, @StartDate, startDate)) ASC) AS RowNum
    FROM usersTable
)
SELECT users FROM CurrentUsers WHERE groups = @GroupID AND RowNum = 1

If I understand correctly, you seem to want:
select users
from usersTable
group by users
having sum(case when groups = @GroupID then 1 else 0 end) > 0 and -- in Group B
max(startdate) < @StartDate;
EDIT:
The above is based on a misunderstanding. You want the users whose active group as of today is the given group. I think you want:
WITH CurrentUsers AS (
    SELECT users, groups, startDate,
           ROW_NUMBER() OVER (PARTITION BY users
                              ORDER BY startDate DESC
                             ) AS seqnum
    FROM usersTable
    WHERE startDate <= @StartDate
)
SELECT users
FROM CurrentUsers
WHERE groups = @GroupID AND seqnum = 1;
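If you want to check this without a SQL Server instance, here is a minimal sketch that runs the same "latest row per user as of a date" query against the sample data. It assumes Python's built-in sqlite3 module as a stand-in (SQLite 3.25+ for window functions), ISO-formatted date strings, and named parameters `:start_date` and `:group_id` in place of the T-SQL variables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "groups" is quoted because GROUPS is a keyword in newer SQLite versions.
conn.execute('CREATE TABLE usersTable (users TEXT, "groups" TEXT, startDate TEXT)')
conn.executemany(
    "INSERT INTO usersTable VALUES (?, ?, ?)",
    [
        ("Foo", "A", "2018-08-01"),
        ("Foo", "B", "2018-01-01"),
        ("Boo", "C", "2018-01-01"),
        ("Doo", "B", "2018-01-01"),
        ("Loo", "B", "2018-09-01"),
    ],
)

query = """
WITH CurrentUsers AS (
    SELECT users, "groups", startDate,
           ROW_NUMBER() OVER (PARTITION BY users
                              ORDER BY startDate DESC) AS seqnum
    FROM usersTable
    WHERE startDate <= :start_date      -- ignore rows that start in the future
)
SELECT users
FROM CurrentUsers
WHERE "groups" = :group_id AND seqnum = 1  -- the latest row must be in the group
"""
rows = conn.execute(query, {"start_date": "2018-08-01", "group_id": "B"}).fetchall()
print(rows)  # [('Doo',)]
```

Foo drops out because the latest row on or before the date is Group A, and Loo drops out because the only row starts after the date.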

Related

SQL - get rid of the nested aggregate select

There is a table Payment, which for example tracks the amount of money user puts into account, simplified as
===================================
Id | UserId | Amount | PayDate |
===================================
1 | 42 | 11 | 01.02.99 |
2 | 42 | 31 | 05.06.99 |
3 | 42 | 21 | 04.11.99 |
4 | 24 | 12 | 05.11.99 |
What I need is a result that also shows the balance before each payment, e.g.:
===============================================
Id | UserId | Amount | PayDate | Balance |
===============================================
1 | 42 | 11 | 01.02.99 | 0 |
2 | 42 | 31 | 05.06.99 | 11 |
3 | 42 | 21 | 04.11.99 | 42 |
4 | 24 | 12 | 05.11.99 | 0 |
Currently the select statement looks something like
SELECT
Id,
UserId,
Amount,
PayDate,
(SELECT sum(amount) FROM Payments nestedp
WHERE nestedp.UserId = outerp.UserId AND
nestedp.PayDate < outerp.PayDate) as Balance
FROM
Payments outerp
How can I rewrite this select to get rid of the nested aggregate selection? The database in question is SQL Server 2019.
You can use a CTE with some custom logic to handle this type of problem.
WITH PaymentCte
AS (
SELECT ROW_NUMBER() OVER (
PARTITION BY UserId ORDER BY Id
) AS RowId
,Id
,UserId
,PayDate
,Amount
,SUM(Amount) OVER (
PARTITION BY UserId ORDER BY Id
) AS Balance
FROM Payment
)
SELECT X.Id
,X.UserId
,X.Amount
,X.PayDate
,Y.Balance
FROM PaymentCte x
INNER JOIN PaymentCte y ON x.userId = y.UserId
AND X.RowId = Y.RowId + 1
UNION
SELECT X.Id
,X.UserId
,X.Amount
,X.PayDate
,0 AS Balance
FROM PaymentCte x
WHERE X.RowId = 1
This provides the desired output
You can try the following using lag with a cumulative sum
with b as (
select * , isnull(lag(amount) over (partition by userid order by id),0) Amt
from t
)
select Id, UserId, Amount, PayDate,
Sum(Amt) over (partition by userid order by id) Balance
from b
order by Id
Thanks to the other participants' leads, I came up with a query that seems to work:
SELECT
Id,
UserId,
Amount,
PayDate,
COALESCE(sum(Amount) over (partition by UserId
order by PayDate
rows between unbounded preceding and 1 preceding), 0) as Balance
FROM
Payments
ORDER BY
UserId, PayDate
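For what it's worth, the window-frame version can be checked quickly outside SQL Server. This is only a sketch under a few assumptions: Python's sqlite3 module as a stand-in (SQLite 3.25+), the sample dates read as dd.mm.yy and stored as ISO strings, and the output ordered by Id so it lines up with the table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Payments (Id INTEGER, UserId INTEGER, Amount INTEGER, PayDate TEXT)"
)
conn.executemany(
    "INSERT INTO Payments VALUES (?, ?, ?, ?)",
    [
        (1, 42, 11, "1999-02-01"),
        (2, 42, 31, "1999-06-05"),
        (3, 42, 21, "1999-11-04"),
        (4, 24, 12, "1999-11-05"),
    ],
)

query = """
SELECT Id, UserId, Amount, PayDate,
       COALESCE(SUM(Amount) OVER (
                    PARTITION BY UserId
                    ORDER BY PayDate
                    -- every earlier payment of the same user, excluding the current row
                    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                ), 0) AS Balance
FROM Payments
ORDER BY Id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 42, 11, '1999-02-01', 0)
# (2, 42, 31, '1999-06-05', 11)
# (3, 42, 21, '1999-11-04', 42)
# (4, 24, 12, '1999-11-05', 0)
```

The frame is empty for each user's first payment, so SUM returns NULL there and COALESCE turns it into 0.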

How to transform a range of records to the values of the record after that range in SQL?

I am trying to replace some bad input records within a specific date range with correct records. However, I'm not sure if there is an efficient way to do so. Therefore my question is how to transform a (static) range of records to the values of the record after that range in SQL? Below you will find an example to clarify what I try to achieve.
In this example you can see that customer number 1 belongs to group number 0 (None) in the period from 25-06-2020 to 29-06-2020. From 30-06-2020 to 05-07-2020 this group number changes from 0 to 11 for customer number 1. This static period contains the wrong records, and should be changed to the values that are valid on 06-07-2020 (group number == 10). Is there a way to do this?
If I understand correctly, you can use window functions to get the data on that particular date and case logic to assign it to the specific date range:
select t.*,
(case when date >= '2020-07-01' and date <= '2020-07-05'
then max(case when date = '2020-07-06' then group_number end) over (partition by customer_number)
else group_number
end) as imputed_group_number,
(case when date >= '2020-07-01' and date <= '2020-07-05'
then max(case when date = '2020-07-06' then role end) over (partition by customer_number)
else role
end) as imputed_role
from t;
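If you want to update the values, you can join back to the same table (written here in PostgreSQL-style UPDATE ... FROM as an assumption, since the question does not name the database) and restrict the update to the bad date range:
update t
set group_number = tt.group_number,
role = tt.role
from t tt
where tt.customer_number = t.customer_number and tt.date = '2020-07-06'
and t.date >= '2020-07-01' and t.date <= '2020-07-05'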
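To see the SELECT variant of this answer in action, here is a small sketch on a subset of the rows for customer 1. Assumptions: Python's sqlite3 module as a stand-in database (SQLite 3.25+), ISO date strings, and hypothetical parameter names `:bad_from`, `:bad_to` and `:good_date`. I set `:bad_from` to 2020-06-30 as in the question text, whereas the query above starts at 2020-07-01, so adjust it to whichever range is right for your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (date TEXT, customer_number INTEGER, group_number INTEGER, role TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("2020-06-25", 1, 0, "None"),
        ("2020-06-29", 1, 0, "None"),
        ("2020-06-30", 1, 11, "Leader"),
        ("2020-07-01", 1, 11, "Leader"),
        ("2020-07-06", 1, 10, "Member"),
    ],
)

query = """
SELECT t.date,
       CASE WHEN t.date BETWEEN :bad_from AND :bad_to
            -- pick up the value that is valid on the reference date
            THEN MAX(CASE WHEN t.date = :good_date THEN group_number END)
                 OVER (PARTITION BY customer_number)
            ELSE group_number
       END AS imputed_group_number
FROM t
ORDER BY t.date
"""
params = {"bad_from": "2020-06-30", "bad_to": "2020-07-05", "good_date": "2020-07-06"}
rows = conn.execute(query, params).fetchall()
print(rows)
# [('2020-06-25', 0), ('2020-06-29', 0), ('2020-06-30', 10),
#  ('2020-07-01', 10), ('2020-07-06', 10)]
```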
I think that window function first_value() does what you want:
select
date,
customer_number,
first_value(group_number) over(partition by customer_number order by date) group_number,
first_value(role) over(partition by customer_number order by date) role
from mytable
You can do the following as an example. Here I have chosen the criterion that a row with role='Leader' is a bad record, so the next available group_number and role are applied to it; see the columns group_number1 and role1.
I have used a smaller subset of the rows you have in your excel example.
select date1
,customer_number
,group_number
,case when role='Leader' then
(select t1.group_number
from t t1
where t1.date1>t.date1
and t1.role<>'Leader'
order by t1.date1 asc
limit 1
)
else group_number
end as group_number1
,role
,case when role='Leader' then
(select t1.role
from t t1
where t1.date1>t.date1
and t1.role<>'Leader'
order by t1.date1 asc
limit 1
)
else role
end as role1
from t
order by 1
+------------+-----------------+--------------+---------------+--------+--------+
| DATE1 | CUSTOMER_NUMBER | GROUP_NUMBER | GROUP_NUMBER1 | ROLE | ROLE1 |
+------------+-----------------+--------------+---------------+--------+--------+
| 2020-06-25 | 1 | 0 | 0 | None | None |
| 2020-06-26 | 1 | 0 | 0 | None | None |
| 2020-06-27 | 1 | 0 | 0 | None | None |
| 2020-06-28 | 1 | 0 | 0 | None | None |
| 2020-06-29 | 1 | 0 | 0 | None | None |
| 2020-06-30 | 1 | 11 | 10 | Leader | Member |
| 2020-07-01 | 1 | 11 | 10 | Leader | Member |
| 2020-07-06 | 1 | 10 | 10 | Member | Member |
+------------+-----------------+--------------+---------------+--------+--------+
db fiddle link
https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=c95d12ced067c1df94947848b5a94c14

How can I query on the same table & get result in 2 columns

I have this table with the following data
+-------+-----------+-------+-------+
| Owner | closeDate | stage | value |
+-------+-----------+-------+-------+
| Abc | 1-1-2017 | won | 1000 |
| Abc | 31-1-2017 | won | 2000 |
| Abc | 3-1-2017 | lost | 1000 |
| Abc | 1-2-2017 | won | 5000 |
| Def | 1-2-2017 | won | 3000 |
| Def | 28-2-2017 | won | 4000 |
+-------+-----------+-------+-------+
I am aiming for a result like this, which totals the value for each owner per month, counting only the won stage:
+-------+----------+----------+
| Owner | JanValue | FebValue |
+-------+----------+----------+
| Abc | 3000 | 5000 |
| Def | 0 | 7000 |
+-------+----------+----------+
I have tried this query, but it returns the months as separate rows instead of columns:
SELECT Owner, sum(value) ,datename(month, closedate) as 'month'
FROM Table1
where closedate between '2017/01/1' and '2017/01/31' and stage='won'
GROUP BY Owner,datename(month, closedate)
UNION ALL
SELECT Owner, sum(value) ,datename(month, closedate) as 'month'
FROM Table1
where closedate between '2017/02/1' and '2017/02/28' and stage='won'
GROUP BY Owner,datename(month, closedate)
You are looking for a pivot query, this time involving the month of the close date:
SELECT
Owner,
SUM(CASE WHEN DATEPART(month, closeDate) = 1 THEN value ELSE 0 END) AS JanValue,
SUM(CASE WHEN DATEPART(month, closeDate) = 2 THEN value ELSE 0 END) AS FebValue,
...
FROM Table1
WHERE
stage = 'won' AND
DATEPART(year, closeDate) = 2017
GROUP BY
Owner;
Note that this approach gets stretched a bit thin when you want to consider having a monthly report across many years. In that case, you might want to use dynamic SQL to do the pivot. But, in such a case having so many months across columns would not be the most readable output IMO.
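Here is a quick, hedged way to try the conditional-aggregation pivot on the sample rows. It assumes Python's sqlite3 module as a stand-in for SQL Server, so `strftime` replaces `DATEPART`, and the dates are stored as ISO strings. The `ELSE 0` makes an owner with no wins in a month show 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (Owner TEXT, closeDate TEXT, stage TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [
        ("Abc", "2017-01-01", "won", 1000),
        ("Abc", "2017-01-31", "won", 2000),
        ("Abc", "2017-01-03", "lost", 1000),
        ("Abc", "2017-02-01", "won", 5000),
        ("Def", "2017-02-01", "won", 3000),
        ("Def", "2017-02-28", "won", 4000),
    ],
)

query = """
SELECT Owner,
       -- one SUM(CASE ...) per output column
       SUM(CASE WHEN strftime('%m', closeDate) = '01' THEN value ELSE 0 END) AS JanValue,
       SUM(CASE WHEN strftime('%m', closeDate) = '02' THEN value ELSE 0 END) AS FebValue
FROM Table1
WHERE stage = 'won' AND strftime('%Y', closeDate) = '2017'
GROUP BY Owner
ORDER BY Owner
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('Abc', 3000, 5000), ('Def', 0, 7000)]
```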
Try this using PIVOT:
SELECT *
FROM
(
SELECT
Owner,
CloseDate = DATENAME(month, CAST(CloseDate AS DATE)),
value
FROM Table1
) T
PIVOT
(
SUM(value)
FOR CloseDate IN
(
[January],[February],[March],[April],[May],[June],[July],[August],[September],[October],[November],[December]
)
) Pvt
The query above does not filter on stage. You can add that condition in the inner select:
SELECT
Owner,
CloseDate = DATENAME(month,CAST(CloseDate AS DATE)),
value
FROM Table1
where <Your Conditions>

Counting rows until where clause condition is satisfied

I have a table of data which contains attributes like body, offer_id and created_at. With the rows in chronological order, I need to count, for each offer_id, the rows up to and including the first one whose 'body' satisfies my 'where' clause, i.e.
created at | offer id | body
---------------------------------------------
Jan | 12 | does not satisfy
Feb | 12 | does not satisfy
Mar | 12 | satisfies
Jan | 13 | does not satisfy
Feb | 13 | satisfies
Jan | 14 | does not satisfy
Feb | 14 | satisfies
Mar | 14 | does not satisfy
Apr | 14 | does not satisfy
Expected output:
offer_id | count
---------|------
12 | 3
13 | 2
14 | 2
First - you need to generate a number for every record inside its offer window:
select t.*, row_number() over (partition by t.offer_ID order by t.created_at) as rn
from t
it will result in something like:
created at | offer id | body | rn
---------------------------------------------
Jan | 12 | does not satisfy | 1
Feb | 12 | does not satisfy | 2
Mar | 12 | satisfies | 3
Jan | 13 | does not satisfy | 1
Feb | 13 | satisfies | 2
Jan | 14 | does not satisfy | 1
Feb | 14 | satisfies | 2
Mar | 14 | does not satisfy | 3
Apr | 14 | does not satisfy | 4
from this subquery you can get a minimal rn (first record that satisfies the condition):
with sub as (
select t.*, row_number() over (partition by t.offer_ID order by t.created_at) as rn
from t)
select offer_ID, min(rn)
from sub
where (satisfies)
group by offer_ID
straight as an arrow
select t.offer_id, count(*)
from mytable t
where not exists
(
select 1 from mytable tt
where tt.offer_id = t.offer_id
and tt.created_at < t.created_at
and tt.body = 'satisfies'
)
group by t.offer_id
Have you tried something like this?
select count(*)
from mytable
where "satisfies"
Or, if you want to count only the different offer_id:
select count(distinct offer_id)
from mytable
where "satisfies"
Or, finally:
select count(offer_id)
from mytable
where "satisfies"
group by offer_id
Is this what you need? If not, give me more details! ;)
One way to count the number that don't satisfy the condition is to use a cumulative sum:
select offer_id, count(*)
from (select t.*,
sum(case when <condition> then 1 else 0 end) over
(partition by offer_id order by created_at) as num
from t
) t
where num = 0
group by offer_id;
However, this is one less than the number you have. So, instead:
select offer_id,
(sum(case when num = 0 then 1 else 0 end) +
max(case when num = 1 then 1 else 0 end)
)
from (select t.*,
sum(case when <condition> then 1 else 0 end) over
(partition by offer_id order by created_at) as num
from t
) t
where num in (0, 1)
group by offer_id;
If you just want the count of offer_id , you can use the below
select offer_id, count(*) as count_1 from table_name
where <<your condition>>
group by offer_id
If my understanding is wrong, please share a detailed description on what exactly you require.
You can break the task in two parts:
For each offer ID find the record/date that first satisfies the condition.
Count all records per offer ID until that found record/date.
With a subquery in SELECT:
select
offer_id,
(
select count(*)
from mytable m
where m.offer_id = mfit.offer_id
and m.created_at <= min(mfit.created_at)
) as cnt
from mytable mfit
where <condition>
group by offer_id
or a subquery in FROM:
select
mfit.offer_id,
count(*) as cnt
from
(
select offer_id, min(created_at) as min_date
from mytable
where <condition>
group by offer_id
) mfit
join mytable m on m.offer_id = mfit.offer_id and m.created_at <= mfit.min_date
group by mfit.offer_id;
Here is another query using an analytic function. Analytic functions have the advantage that you read the table just once and get different aggregations on-the-fly. The idea is to have a running total per offer_id with a one for a record matching your condition plus a count per offer_id. This looks as follows:
created at | offer id | body | s | c
---------------------------------------------------
Jan | 13 | does not satisfy | 0 | 1
Feb | 13 | satisfies | 1 | 2
Jan | 14 | does not satisfy | 0 | 1
Feb | 14 | satisfies | 1 | 2
Mar | 14 | does not satisfy | 1 | 3
Apr | 14 | does not satisfy | 1 | 4
May | 14 | satisfies | 2 | 5
Jun | 14 | does not satisfy | 2 | 6
Jul | 14 | does not satisfy | 2 | 7
Aug | 14 | satisfies | 3 | 8
So we are simply looking for the min(c) for s = 1.
select offer_id, min(c) as cnt
from
(
select
offer_id,
sum(case when <condition> then 1 else 0 end)
over (partition by offer_id order by created_at) as s,
count(*) over (partition by offer_id order by created_at) as c
from mytable
) data
where s = 1
group by offer_id
order by offer_id;
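The running-total idea above can be tried on the question's sample data with a small sketch. Assumptions on my side: Python's sqlite3 module as the database (SQLite 3.25+), the condition written as `body = 'satisfies'`, and the month names replaced by ISO dates in an arbitrary year so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (created_at TEXT, offer_id INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("2020-01-01", 12, "does not satisfy"),
        ("2020-02-01", 12, "does not satisfy"),
        ("2020-03-01", 12, "satisfies"),
        ("2020-01-01", 13, "does not satisfy"),
        ("2020-02-01", 13, "satisfies"),
        ("2020-01-01", 14, "does not satisfy"),
        ("2020-02-01", 14, "satisfies"),
        ("2020-03-01", 14, "does not satisfy"),
        ("2020-04-01", 14, "does not satisfy"),
    ],
)

query = """
SELECT offer_id, MIN(c) AS cnt
FROM (
    SELECT offer_id,
           -- s: how many matching rows have been seen so far
           SUM(CASE WHEN body = 'satisfies' THEN 1 ELSE 0 END)
               OVER (PARTITION BY offer_id ORDER BY created_at) AS s,
           -- c: position of the row within its offer
           COUNT(*) OVER (PARTITION BY offer_id ORDER BY created_at) AS c
    FROM mytable
) data
WHERE s = 1
GROUP BY offer_id
ORDER BY offer_id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(12, 3), (13, 2), (14, 2)]
```

Note that an offer with no matching row at all never reaches s = 1 and therefore does not appear in the result.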

SQL Query to get 1 latest record per member based on a latest date

I have a table like this:
---------------------------------------------------
| MemberID | IntrCode | InstruReply | CreatedDate | ...other 2 more columns
---------------------------------------------------
| 6 | 1 | Activated | 26 FEB 2014 |
| 7 | 2 | Cancelled | 25 FEB 2014 |
| 6 | 2 | Cancelled | 15 FEB 2014 |
| 7 | 1 | Activated | 03 FEB 2014 |
---------------------------------------------------
Now, based on CreatedDate and IntrCode, I need a query that returns the results as follows, with IntrCode passed as a parameter.
When @IntrCode = 1, I need only the MemberIDs whose latest record (by CreatedDate) is active.
PS: please note member 7 is cancelled when checking latest (CreatedDate).
Output
---------------------------------------------------
| MemberID | IntrCode | InstruReply | CreatedDate |
---------------------------------------------------
| 6 | 1 | Activated | 26 FEB 2014 |
---------------------------------------------------
I wrote the query below, but I can't show the other columns. (I appreciate all your help.)
SELECT MemberID, MAX(CreatedDate) AS LatestDate FROM MyTable GROUP BY MemberID
You can use a CTE and the ROW_NUMBER function:
With CTE As
(
SELECT t.*,
RN = ROW_NUMBER()OVER(PARTITION BY MemberID Order By CreatedDate DESC)
FROM MyTable t
WHERE IntrCode = @IntrCode
)
SELECT MemberID, IntrCode, InstruReply, CreatedDate
FROM CTE
WHERE RN = 1
Try this
Method 1:
SELECT * FROM
(
SELECT *,ROW_NUMBER()OVER(PARTITION BY MemberID Order By CreatedDate DESC) RN
FROM MyTable WHERE InstruReply = 'Activated' AND IntrCode = @IntrCode
) AS T
WHERE RN = 1
Method 2 :
SELECT * FROM
(
Select MemberID,max(CreatedDate) as LatestDate from MyTable group by MemberID
) As s INNER Join MyTable T ON T.MemberID = S.MemberID AND T.CreatedDate = s.LatestDate
WHERE T.InstruReply = 'Activated' AND T.IntrCode = @IntrCode
Output
---------------------------------------------------
| MemberID | IntrCode | InstruReply | CreatedDate |
---------------------------------------------------
| 6 | 1 | Activated | 26 FEB 2014 |
---------------------------------------------------
This way you can select whole row for each member with latest date.
SELECT * FROM MyTable t1
WHERE NOT EXISTS (SELECT *
FROM MyTable t2
WHERE t2.CreatedDate > t1.CreatedDate
AND t1.MemberID = t2.MemberID)
AND IntrCode = @IntrCode
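A small sketch to check the NOT EXISTS approach against the four sample rows. It assumes Python's sqlite3 module as a stand-in for SQL Server, ISO date strings, and a named parameter `:intr_code` in place of the T-SQL variable. Because the latest row per member is picked before the IntrCode filter applies, member 7 (whose latest row is the cancellation) is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable "
    "(MemberID INTEGER, IntrCode INTEGER, InstruReply TEXT, CreatedDate TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        (6, 1, "Activated", "2014-02-26"),
        (7, 2, "Cancelled", "2014-02-25"),
        (6, 2, "Cancelled", "2014-02-15"),
        (7, 1, "Activated", "2014-02-03"),
    ],
)

query = """
SELECT MemberID, IntrCode, InstruReply, CreatedDate
FROM MyTable t1
WHERE NOT EXISTS (SELECT 1
                  FROM MyTable t2
                  WHERE t2.MemberID = t1.MemberID
                    AND t2.CreatedDate > t1.CreatedDate)  -- no later row for this member
  AND IntrCode = :intr_code
"""
rows = conn.execute(query, {"intr_code": 1}).fetchall()
print(rows)  # [(6, 1, 'Activated', '2014-02-26')]
```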
;with TempData as (Select MemberId, IntrCode ,InstruReply,CreatedDate , MemberCount =ROW_NUMBER()
over(PARTITION By MemberId Order By CreatedDate desc)
From MyTable
)
Select *
From TempData
Where MemberCount =1
The query you wrote is simply missing a WHERE clause which will help you to filter the data you need:
SELECT MemberID, MAX(CreatedDate) AS LatestDate
FROM MyTable
WHERE IntrCode = @IntrCode
AND InstruReply = 'Activated'
GROUP BY MemberID