SQL sum between dates - sql

I need to sum values over the intersections of overlapping date ranges.
Sample of source data:
person item   start_date end_date   value
a      apple  08.03.2018 29.03.2018 3
a      apple  01.01.2019 08.08.2021 2
a      apple  01.01.2019 09.10.2021 5
a      pen    10.10.2021 30.10.2021 2
a      cup    08.03.2018 20.03.2018 8
a      cup    15.03.2018 20.03.2019 2
b      pen    10.10.2021 30.10.2021 2
b      pen    10.10.2021 30.10.2021 6
b      orange 10.11.2021 10.11.2022 3
b      orange 20.11.2021 20.12.2021 2
Expected result:
person item   start_date end_date   value
a      apple  08.03.2018 29.03.2018 3
a      apple  01.01.2019 08.08.2021 7
a      apple  09.08.2021 09.10.2021 5
a      pen    10.10.2021 30.10.2021 2
a      cup    08.03.2018 14.03.2018 8
a      cup    15.03.2018 20.03.2018 10
a      cup    21.03.2018 20.03.2019 2
b      pen    10.10.2021 30.10.2021 8
b      orange 10.11.2021 19.11.2021 3
b      orange 20.11.2021 20.12.2021 5
b      orange 21.12.2021 10.11.2022 3
I use code like this, but it is too simple and the results are wrong:
select
person
,item
,Min([start_date]) as [start_date]
,Max([end_date]) as [end_date]
,Sum([value]) as [value]
FROM table
Group by person, item
I tried to use the LAG() function, but I'm lost.

I have no access to Synapse, but assuming it's compatible with SQL Server...
The inner query builds the date ranges, creating additional boundary dates for overlapping periods where needed. The main query then just sums the values.
select person, item, range_from, range_to,
(select sum(value) from test
where person = r.person
and item = r.item
and range_from between start_date and end_date) value
from (
select
be,
person,
item,
date range_from,
lead(date,1) over(partition by person, item order by date,be) range_to
from (
select 1 be, person, item, start_date date from test
union
select 2, person, item, end_date from test
union
select 2, person, item, dateadd(day,-1,start_date) from test a
where exists (select * from test where a.person = person and a.item = item and a.start_date > start_date and a.start_date < end_date)
union
select 1, person, item, dateadd(day,1,end_date) from test b
where exists (select * from test where b.person = person and b.item = item and b.end_date > start_date and b.end_date < end_date)
) k
) r where r.be = 1 order by r.person, r.item, r.range_from
column be contains:
1 - for period start
2 - for period end
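The boundary idea can also be checked outside the database. Below is a minimal Python sketch of a simplified variant, not the query above: it assumes end dates are inclusive, treats every start date and every day after an end date as a cut point, and sums the values of the periods covering each resulting segment. The names `d`, `split_and_sum`, `rows` and `segments` are made up for this illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)


def d(text):
    """Parse a dd.mm.yyyy string as used in the question's sample data."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def split_and_sum(rows):
    """rows: (person, item, start_date, end_date, value); end dates are inclusive."""
    groups = defaultdict(list)
    for person, item, start, end, value in rows:
        groups[(person, item)].append((start, end, value))

    result = []
    for (person, item), periods in sorted(groups.items()):
        # Every start date, and the day after every end date, begins a new segment.
        cuts = sorted({s for s, _, _ in periods} | {e + ONE_DAY for _, e, _ in periods})
        for seg_start, next_start in zip(cuts, cuts[1:]):
            seg_end = next_start - ONE_DAY
            # No period starts or ends inside a segment, so testing seg_start is enough.
            covering = [v for s, e, v in periods if s <= seg_start <= e]
            if covering:  # skip gaps that no period covers
                result.append((person, item, seg_start, seg_end, sum(covering)))
    return result


rows = [
    ("a", "apple", d("08.03.2018"), d("29.03.2018"), 3),
    ("a", "apple", d("01.01.2019"), d("08.08.2021"), 2),
    ("a", "apple", d("01.01.2019"), d("09.10.2021"), 5),
    ("a", "pen", d("10.10.2021"), d("30.10.2021"), 2),
    ("a", "cup", d("08.03.2018"), d("20.03.2018"), 8),
    ("a", "cup", d("15.03.2018"), d("20.03.2019"), 2),
    ("b", "pen", d("10.10.2021"), d("30.10.2021"), 2),
    ("b", "pen", d("10.10.2021"), d("30.10.2021"), 6),
    ("b", "orange", d("10.11.2021"), d("10.11.2022"), 3),
    ("b", "orange", d("20.11.2021"), d("20.12.2021"), 2),
]

segments = split_and_sum(rows)
for seg in segments:
    print(*seg)
```

On the question's sample data this yields the same eleven segments as the expected result, though grouped by item name rather than in the original row order. Adjacent segments that happen to have equal sums are not merged.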

Related

How to implement multiple joins on different fields based on different functions in SQL?

I have few tables as below. And, I need to fetch the records on the basis of each maximum level and latest level (ordered by date) for each ID and Type column. I'm using SQL Server to run the query. So far, I have tried the following SQL query:
select f.ID,x.MAX_LEVEL,f.TYPE, f.DATE
from (
select ID
,TYPE
, MAX(LEVEL) as MAX_LEVEL
from TABLEA
GROUP BY ID, TYPE
) as x
,
(
select ID
,TYPE
, MAX(DATE) as MAX_DATETIME
from TABLEA
GROUP BY ID, TYPE
) as y
inner join TABLEA as f
on f.ID = x.ID and f.LEVEL = x.MAX_LEVEL
inner join TABLEA as g
on f.ID = y.ID and g.DATE = y.MAX_DATETIME
and f.DATE > DATEADD(day, -1, GETDATE())
TABLEA
ID TYPE LEVEL DATE
1 ELECTRIC 2 01/06/2019
1 GAS 2 01/06/2019
2 ELECTRIC 2 01/06/2019
3 ELECTRIC 3 01/06/2019
3 ELECTRIC 3 01/06/2019
1 GAS 3 05/06/2019
1 GAS 5 13/06/2019
2 ELECTRIC 5 07/06/2019
3 GAS 5 08/06/2019
6 ELECTRIC 3 02/06/2019
2 ELECTRIC 3 04/06/2019
3 ELECTRIC 3 05/06/2019
2 GAS 10 06/06/2019
2 GAS 3 11/06/2019
3 ELECTRIC 3 11/06/2019
1 ELECTRIC 5 01/06/2019
1 GAS 3 02/06/2019
6 ELECTRIC 5 01/06/2019
1 ELECTRIC 5 10/06/2019
Expected Result:
ID TYPE MAX_LEVEL LATEST_LEVEL
1 ELECTRIC 5 5
1 GAS 5 3
2 ELECTRIC 5 5
2 GAS 10 3
3 ELECTRIC 3 3
3 GAS 5 5
6 ELECTRIC 5 3
Any thoughts, how could I achieve this?
If you are using SQL Server, you can try this:
SELECT ID, TYPE, MAX(T1.[LEVEL]) AS MAX_LEVEL, X.LEVEL AS LATEST_LEVEL
FROM TABLEA T1
OUTER APPLY (SELECT TOP 1 [LEVEL] FROM TABLEA T2 WHERE T2.ID = T1.ID AND T2.TYPE = T1.TYPE ORDER BY T2.[DATE] DESC) X
GROUP BY ID, TYPE, X.[LEVEL]
ORDER BY ID, TYPE
Unfortunately, SQL Server doesn't have a "first" or "last" aggregation function. But it does have first_value() and last_value() as window functions. So, one method is:
select distinct t.id, t.type,
max(t.level) over (partition by id, type) as max_level,
first_value(t.level) over (partition by id, type order by date desc) as latest_level
from t;
Another alternative is using window functions in a subquery:
select id, type, max(level) as max_level,
max(case when seqnum = 1 then level end) as latest_level
from (select t.*,
row_number() over (partition by id, type order by date desc) as seqnum
from t
) t
group by id, type;
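What all three queries compute, one maximum and one most-recent level per (ID, TYPE) group, can be sketched in plain Python to make the intent explicit. This is only an illustration on a subset of the question's rows; `max_and_latest` and `levels` are hypothetical names, and when two rows share the latest date the first one seen wins.

```python
from datetime import date

# (id, type, level, date): a subset of the question's TABLEA
rows = [
    (2, "GAS", 10, date(2019, 6, 6)),
    (2, "GAS", 3, date(2019, 6, 11)),
    (6, "ELECTRIC", 3, date(2019, 6, 2)),
    (6, "ELECTRIC", 5, date(2019, 6, 1)),
]


def max_and_latest(rows):
    """Return {(id, type): (max_level, latest_level)}."""
    summary = {}  # (id, type) -> [max_level, latest_date, latest_level]
    for id_, type_, level, day in rows:
        entry = summary.setdefault((id_, type_), [level, day, level])
        entry[0] = max(entry[0], level)
        if day > entry[1]:  # strictly later date replaces the latest level
            entry[1], entry[2] = day, level
    return {key: (e[0], e[2]) for key, e in summary.items()}


levels = max_and_latest(rows)
print(levels)
```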

Is there a way to find active users in SQL?

I'm trying to find the total count of active users in a database. "Active" users here are defined as those who have registered an event on the selected day or later than the selected day. So if a user registered an event on days 1, 2 and 5, they are counted as "active" throughout days 1, 2, 3, 4 and 5.
My original dataset looks like this (note that this is a sample - the real dataset will run to up to 365 days, and has around 1000 users).
Day ID
0 1
0 2
0 3
0 4
0 5
1 1
1 2
2 1
3 1
4 1
4 2
As you can see, all 5 IDs are active on Day 0, and 2 IDs (1 and 2) are active until Day 4, so I'd like the finished table to look like this:
Day Count
0 5
1 2
2 2
3 2
4 2
I've tried using the following query:
select Day as days, sum(case when Day <= days then 1 else 0 end)
from df
But it gives incorrect output (it only counts users who were active on each specific day).
I'm at a loss as to what I could try next. Does anyone have any ideas? Many thanks in advance!
I think I would just use generate_series():
select gs.d, count(*)
from (select id, min(day) as min_day, max(day) as max_day
from t
group by id
) t cross join lateral
generate_series(t.min_day, t.max_day, 1) gs(d)
group by gs.d
order by gs.d;
If you want to count everyone as active from day 1 -- but not all have a value on day 1 -- then use 1 instead of min_day.
Here is a db<>fiddle.
A bit verbose, but this should do:
with dt as (
select 0 d, 1 id
union all
select 0 d, 2 id
union all
select 0 d, 3 id
union all
select 0 d, 4 id
union all
select 0 d, 5 id
union all
select 1 d, 1 id
union all
select 1 d, 2 id
union all
select 2 d, 1 id
union all
select 3 d, 1 id
union all
select 4 d, 1 id
union all
select 4 d, 2 id
)
, active_periods as (
select id
, min(d) min_d
, max(d) max_d
from dt
group by id
)
, days as (
select distinct d
from dt
)
select d.d
, count(ap.id)
from days d
join active_periods ap on d.d between ap.min_d and ap.max_d
group by 1
order by 1 asc
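The CTE-and-join approach can be tried without a database server. The sketch below runs an equivalent query in SQLite through Python's `sqlite3` module; the table name `df` and columns `day`/`id` follow the question, while the use of SQLite is an assumption made only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (day INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO df VALUES (?, ?)",
    [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5),
     (1, 1), (1, 2), (2, 1), (3, 1), (4, 1), (4, 2)],
)

query = """
WITH active_periods AS (
    -- each user counts as active from their first to their last event day
    SELECT id, MIN(day) AS min_d, MAX(day) AS max_d
    FROM df
    GROUP BY id
),
days AS (
    SELECT DISTINCT day AS d FROM df
)
SELECT d.d, COUNT(ap.id)
FROM days d
JOIN active_periods ap ON d.d BETWEEN ap.min_d AND ap.max_d
GROUP BY d.d
ORDER BY d.d
"""

counts = conn.execute(query).fetchall()
print(counts)
```

Note that the `days` CTE only contains days on which at least one event was registered, so a day with no events at all would be missing from the output; a generated series of days avoids that.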
You need to count by day.
select
day,
count(*)
from df
GROUP BY
day

How to query to get only rows where a change took place? (changes can go back and forth)

I'm working with a table that has dozens of rows per customer, each with a date and several columns representing various statuses. I'm only interested in pulling the rows where a change took place in one particular column (specifically 0 to 1 or 1 to 0, see status column below).
I can't simply use row_number() over (partition by customer_id, status order by date) because the status can go back and forth between 0 and 1.
Here's a sample of what I'm trying to do (note that there are two different Customer IDs in this example):
Original Table
Row CustomerID Status Date
1 ABC 0 3/12/2013
2 ABC 0 3/31/2013
3 ABC 1 4/13/2013
4 ABC 1 4/15/2013
5 ABC 1 5/17/2013
6 ABC 0 6/25/2013
7 ABC 0 6/28/2013
8 XYZ 0 8/2/2013
9 XYZ 1 5/10/2013
10 XYZ 0 5/18/2013
11 XYZ 1 8/23/2013
12 XYZ 1 9/7/2013
Desired Query Output
Customer ID Status Date
ABC 1 4/13/2013
ABC 0 6/25/2013
XYZ 1 5/10/2013
XYZ 0 5/18/2013
XYZ 1 8/23/2013
You were on the right track with ROW_NUMBER. It can be especially helpful in joining the table to itself in cases such as yours.
The following should get you what you're looking for:
WITH CTE AS (
SELECT Row,
CustomerID,
Status,
Date,
ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY Row) AS N
FROM OriginalTable
)
SELECT A.CustomerID,
A.Status,
A.Date
FROM CTE A
JOIN CTE B
ON A.N = B.N+1
AND A.CustomerID = B.CustomerID
WHERE A.Status <> B.Status
ORDER BY
A.Row
select distinct b.CustomerID, b.status, min(b.date)
From customer a, customer b
where a.CustomerID = b.CustomerID and a.status <> b.status and a.date < b.date
group by b.CustomerID, b.status, a.date;
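The comparison that the ROW_NUMBER self-join performs, each row against the same customer's previous row, can be sketched in Python as a single pass. This assumes the rows arrive in Row order, as in the ORDER BY Row of the first answer; `status_changes` and `changes` are illustrative names.

```python
# (row, customer_id, status, date) from the question, in Row order
rows = [
    (1, "ABC", 0, "3/12/2013"), (2, "ABC", 0, "3/31/2013"),
    (3, "ABC", 1, "4/13/2013"), (4, "ABC", 1, "4/15/2013"),
    (5, "ABC", 1, "5/17/2013"), (6, "ABC", 0, "6/25/2013"),
    (7, "ABC", 0, "6/28/2013"), (8, "XYZ", 0, "8/2/2013"),
    (9, "XYZ", 1, "5/10/2013"), (10, "XYZ", 0, "5/18/2013"),
    (11, "XYZ", 1, "8/23/2013"), (12, "XYZ", 1, "9/7/2013"),
]


def status_changes(rows):
    """Keep only rows whose status differs from the same customer's previous row."""
    previous = {}  # customer_id -> status on that customer's previous row
    changes = []
    for row, customer, status, day in rows:
        if customer in previous and previous[customer] != status:
            changes.append((row, customer, status, day))
        previous[customer] = status
    return changes


changes = status_changes(rows)
for change in changes:
    print(*change)
```

A customer's first row is never reported, because there is nothing to compare it with; this matches the desired output in the question.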

How to combine Date periods in SQL and create a composite timeline

In an Oracle table, each family (unique ID) can have several people (unique ID) in different relationships over a date range. I would like to build a timeline giving the FamilyType, based on the combination of relationships, for each time period. An example for one particular family is given below.
P1|-----Head---------------------------------------|
P2|--Partner--------------|
P3|---Child----------------------|
P4|---Child------------|
|=Single=|=Couple=|=Family=======|=SingleParent==|
Table has columns
FamilyId, PersonId, Relationship, StartDate, EndDate
Each | is a date (no time portion). The data guarantees that on a given date:
* There will always be one person who is Head.
* There can be 0 or 1 Partner.
* There can be 0 or more Children.
The rules are:
* If there is only a Head, the FamilyType is Single
* If there is a Head and a Partner, the FamilyType is Couple
* If there is a Head, a Partner and 1 or more Children, the FamilyType is Family
* If there is a Head and 1 or more Children, the FamilyType is SingleParent
People can join or leave from a family on any date.
And people can change relationships. So following scenarios are possible
P1|----------Head--------------------|
P2|----partner------------|---Head--------|
P3|---Child----------------------|
P4|--Child-----------------|
|=Single=|=Couple=|=Family=======|=SingleParent==|
P1|----------Head--------------------|
P2|----partner------------|---Head--------|
P3|---Child----------------------|
P4|--Child-----------------|
p5|---Partner-----|
|=Single=|=Couple=|=Family=======================|
How can this be done using SQL in Oracle 11gR2 (SQL only, no procedural code)? I am trying to evaluate whether this is best done in SQL or C#. Out of curiosity, an answer specific to SQL Server 2012 would also be good to have.
The result should be rows with StartDate, EndDate and FamilyType.
you could do something like this:
with family_ranges(familyid, min_start, max_end, curr_date)
as (select familyid,
min(startdate),
max(enddate),
to_number(to_char(min(startdate), 'j'))
from family
group by familyid
union all
select familyid, min_start, max_end, curr_date+1
from family_ranges
where curr_date < to_number(to_char(max_end,'j')))
select familyid, min(curr_date) fromdate, max(curr_date) todate, state
from (select familyid, to_date(curr_date,'j') curr_date,
case when head = 'Y' and partner = 'Y' and child = 'Y' then 'Family'
when head = 'Y' and partner = 'Y' then 'Couple'
when head = 'Y' and child = 'Y' then 'SingleParent'
when head = 'Y' then 'Single'
end state
from (select f.familyid, d.curr_date, f.relationship
from family_ranges d
inner join family f
on f.familyid = d.familyid
and to_date(d.curr_date,'j') between f.startdate and f.enddate)
pivot (
max('Y')
for relationship in ('Head' as head, 'Partner' as partner, 'Child' as child)
))
group by familyid, state
order by familyid, fromdate;
Forgive the nonsense with the date->Julian conversion. It's to work around a bug in 11.2.0.1-3 where date arithmetic fails with factored subqueries.
The factored subquery part gets us a list of dates that the family spans. We then join it back to family to work out who was in the family on each day.
select f.familyid, d.curr_date, f.relationship
from family_ranges d
inner join family f
on f.familyid = d.familyid
and to_date(d.curr_date,'j') between f.startdate and f.enddate;
Now we pivot this to get a simple Y/N list:
SQL> with family_ranges(familyid, min_start, max_end, curr_date)
as (select familyid,
min(startdate),
max(enddate),
to_number(to_char(min(startdate), 'j'))
from family
group by familyid
union all
select familyid, min_start, max_end, curr_date+1
from family_ranges
where curr_date < to_number(to_char(max_end,'j')))
select familyid, to_date(curr_date,'j') curr_date, head, partner, child
from (select f.familyid, d.curr_date, f.relationship
from family_ranges d
inner join family f
on f.familyid = d.familyid
and to_date(d.curr_date,'j') between f.startdate and f.enddate)
pivot (
max('Y')
for relationship in ('Head' as head, 'Partner' as partner, 'Child' as child)
);
FAMILYID CURR_DATE H P C
---------- --------- - - -
1 09-NOV-12 Y
1 11-NOV-12 Y
1 13-NOV-12 Y
1 23-NOV-12 Y
2 23-NOV-12 Y
2 28-NOV-12 Y Y
2 29-NOV-12 Y Y
1 30-NOV-12 Y Y
1 01-DEC-12 Y Y
1 03-DEC-12 Y Y
2 18-DEC-12 Y Y Y
2 20-DEC-12 Y Y Y
Then it's a simple CASE expression to get your required string from the rules, and a GROUP BY to get the date ranges.
SQL> with family_ranges(familyid, min_start, max_end, curr_date)
as (select familyid,
min(startdate),
max(enddate),
to_number(to_char(min(startdate), 'j'))
from family
group by familyid
union all
select familyid, min_start, max_end, curr_date+1
from family_ranges
where curr_date < to_number(to_char(max_end,'j')))
select familyid, min(curr_date) fromdate, max(curr_date) todate, state
from (select familyid, to_date(curr_date,'j') curr_date,
case when head = 'Y' and partner = 'Y' and child = 'Y' then 'Family'
when head = 'Y' and partner = 'Y' then 'Couple'
when head = 'Y' and child = 'Y' then 'SingleParent'
when head = 'Y' then 'Single'
end state
from (select f.familyid, d.curr_date, f.relationship
from family_ranges d
inner join family f
on f.familyid = d.familyid
and to_date(d.curr_date,'j') between f.startdate and f.enddate)
pivot (
max('Y')
for relationship in ('Head' as head, 'Partner' as partner, 'Child' as child)
))
group by familyid, state
order by familyid, fromdate;
FAMILYID FROMDATE TODATE STATE
---------- --------- --------- ------------
1 05-NOV-12 24-NOV-12 Single
1 25-NOV-12 14-DEC-12 Couple
1 15-DEC-12 24-JAN-13 Family
1 25-JAN-13 13-FEB-13 SingleParent
2 05-NOV-12 24-NOV-12 Single
2 25-NOV-12 14-DEC-12 Couple
2 15-DEC-12 13-FEB-13 Family
fiddle: http://sqlfiddle.com/#!4/484b5/1
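The same three steps (expand each family into one row per day, classify each day from the relationships present, collapse days into ranges) can be sketched in Python for comparison with the C# option mentioned in the question. The rows in `family` are hypothetical, chosen so that the result matches the family 1 output above; `family_type` and `timeline` are illustrative names, and a Head is assumed to be present on every day, as the question guarantees.

```python
from datetime import date, timedelta
from itertools import groupby

# Hypothetical rows for one family: (person, relationship, start, end), inclusive dates
family = [
    ("P1", "Head", date(2012, 11, 5), date(2013, 2, 13)),
    ("P2", "Partner", date(2012, 11, 25), date(2013, 1, 24)),
    ("P3", "Child", date(2012, 12, 15), date(2013, 2, 13)),
]


def family_type(relationships):
    """Apply the question's rules; assumes a Head is always present."""
    has_partner = "Partner" in relationships
    has_child = "Child" in relationships
    if has_partner and has_child:
        return "Family"
    if has_partner:
        return "Couple"
    if has_child:
        return "SingleParent"
    return "Single"


def timeline(rows):
    first = min(start for _, _, start, _ in rows)
    last = max(end for _, _, _, end in rows)
    days = [first + timedelta(days=n) for n in range((last - first).days + 1)]
    # Step 1 and 2: one label per day, from the relationships active on that day.
    labelled = [
        (day, family_type({rel for _, rel, s, e in rows if s <= day <= e}))
        for day in days
    ]
    # Step 3: groupby only merges consecutive days that share a label.
    result = []
    for state, run in groupby(labelled, key=lambda pair: pair[1]):
        run = list(run)
        result.append((run[0][0], run[-1][0], state))
    return result


periods = timeline(family)
for start, end, state in periods:
    print(start, end, state)
```

One design difference worth noting: because `groupby` works on consecutive runs, a family that is Single, then Couple, then Single again produces three ranges, whereas grouping by state alone would fold the two Single periods into one range.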

How do I build a SQL query to show two columns of different date ranges?

I'm trying to build a query from an Oracle 11g database to use in a report. I need to use two tables, CONTACT and CONTACT_EXT, to get the data from, and compare the total number of contacts over two date ranges. The tables are joined on matching IDs.
CONTACT:
ID | DATE
----------
1 12/12/2010
2 12/11/2010
3 14/09/2011
CONTACT_EXT
ID | TYPE
----------
1 MAIL
2 FAX
3 FAX
So, for example, if I set period A to be between 01/01/2010 and 12/12/2010 and period B to be between 01/01/2011 and 11/11/2011, the result would be:
TYPE | PERIOD A | PERIOD B | TOTAL
MAIL 1 0 1
FAX 1 1 2
SQL> create table contact (id,cdate)
as
select 1, date '2010-12-12' from dual union all
select 2, date '2010-11-12' from dual union all
select 3, date '2011-09-14' from dual
/
Table created.
SQL> create table contact_ext (id,type)
as
select 1, 'MAIL' from dual union all
select 2, 'FAX' from dual union all
select 3, 'FAX' from dual
/
Table created.
SQL> select ce.type
, count(case when c.cdate between date '2010-01-01' and date '2010-12-12' then 1 end) period_a
, count(case when c.cdate between date '2011-01-01' and date '2011-11-11' then 1 end) period_b
, count(*) total
from contact c
inner join contact_ext ce on (c.id = ce.id)
group by ce.type
/
TYPE PERIOD_A PERIOD_B TOTAL
---- ---------- ---------- ----------
FAX 1 1 2
MAIL 1 0 1
2 rows selected.
Regards,
Rob.
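The conditional-count pattern in the session above (COUNT over a CASE that yields NULL outside the period) is not Oracle-specific. As a rough, self-contained check, the sketch below runs the same idea in SQLite via Python's `sqlite3`; storing dates as ISO `YYYY-MM-DD` text is an assumption of this example, made because such strings compare correctly as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (id INTEGER, cdate TEXT);
CREATE TABLE contact_ext (id INTEGER, type TEXT);
INSERT INTO contact VALUES (1, '2010-12-12'), (2, '2010-11-12'), (3, '2011-09-14');
INSERT INTO contact_ext VALUES (1, 'MAIL'), (2, 'FAX'), (3, 'FAX');
""")

query = """
SELECT ce.type,
       -- COUNT ignores NULL, so only rows inside the period are counted
       COUNT(CASE WHEN c.cdate BETWEEN ? AND ? THEN 1 END) AS period_a,
       COUNT(CASE WHEN c.cdate BETWEEN ? AND ? THEN 1 END) AS period_b,
       COUNT(*) AS total
FROM contact c
JOIN contact_ext ce ON c.id = ce.id
GROUP BY ce.type
ORDER BY ce.type
"""

period_a = ("2010-01-01", "2010-12-12")
period_b = ("2011-01-01", "2011-11-11")
report = conn.execute(query, period_a + period_b).fetchall()
print(report)
```

As in the original query, `total` counts every contact of that type, including any that fall outside both periods; if the total should only cover the two periods, it would have to be computed as `period_a + period_b` instead.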
Just do a self join:
select type,period_a,period_b,period_a+period_b as total
from(
select type,count(1) as period_a
from contact_ext
left join contact
using(id)
where date>='20100101' and date<='20101212'
group by 1
)a
join(
select type,count(1) as period_b
from contact_ext
left join contact
using(id)
where date>='20110101' and date<='20111111'
group by 1
)b
using(type);
Ans 1: In the WHERE clause, use an OR condition: the date is between 01/01/2010 and 12/12/2010 (period A) OR between 01/01/2011 and 11/11/2011 (period B).
Ans 2: You can UNION two separate SELECT statements, one for period A and one for period B.