How to find overlapping time slices of several key-value elements - sql

I would like to find out if I have overlapping time slices that have the same id and the same name.
In the following example, the entries with id=2 and name=c overlap.
The entries with id=1 are there only to demonstrate the good case.
Given table:
+---+------+-------+------------+--------------+
|id | name | value | validFrom | validTo |
+---+------+-------+------------+--------------+
|1 | a | 12 | 2019-01-01 | 9999-12-31 |
|1 | b | 34 | 2019-01-01 | 2019-10-31 |
|1 | b | 35 | 2019-11-01 | 9999-12-31 |
|1 | c | 13 | 2019-01-01 | 2025-12-31 |
|2 | a | 49 | 2019-01-01 | 9999-12-31 |
|2 | b | 99 | 2019-01-01 | 2034-12-31 |
|2 | c | 75 | 2019-01-01 | 2019-10-31 |
|2 | c | 84 | 2019-10-28 | 9999-12-31 |
|n | ... | ... | ... | ... |
+---+------+-------+------------+--------------+
expected output:
+---+------+
|id | name |
+---+------+
|2 | c |
+---+------+
Thanks for your help in advance!

You can get the overlapping rows using exists:
select t.*
from t
where exists (select 1
from t t2
where t2.id = t.id and
t2.name = t.name and
t2.value <> t.value and
t2.validTo > t.validFrom and
t2.validFrom < t.validTo
);
If you just want the id/name combinations:
select distinct t.id, t.name
from t
where exists (select 1
from t t2
where t2.id = t.id and
t2.name = t.name and
t2.value <> t.value and
t2.validTo > t.validFrom and
t2.validFrom < t.validTo
);
You can also do this with a cumulative max:
select t.*
from (select t.*,
max(validTo) over (partition by id, name
order by validFrom
rows between unbounded preceding and 1 preceding
) as prev_validTo
from t
) t
where prev_validTo >= validFrom;
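As a quick check, here is a minimal sketch that runs two of the queries above against the sample data from the question. It uses Python's sqlite3 module purely as a stand-in engine (an assumption, the question does not name a database) and stores the dates as ISO strings, which compare correctly as text. The variable names overlaps and later_rows are only for illustration.

```python
import sqlite3

rows = [
    (1, "a", 12, "2019-01-01", "9999-12-31"),
    (1, "b", 34, "2019-01-01", "2019-10-31"),
    (1, "b", 35, "2019-11-01", "9999-12-31"),
    (1, "c", 13, "2019-01-01", "2025-12-31"),
    (2, "a", 49, "2019-01-01", "9999-12-31"),
    (2, "b", 99, "2019-01-01", "2034-12-31"),
    (2, "c", 75, "2019-01-01", "2019-10-31"),
    (2, "c", 84, "2019-10-28", "9999-12-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer, name text, value integer, validFrom text, validTo text)"
)
conn.executemany("insert into t values (?, ?, ?, ?, ?)", rows)

# EXISTS version: id/name combinations that have at least one overlapping pair.
overlaps = conn.execute("""
    select distinct t.id, t.name
    from t
    where exists (select 1
                  from t t2
                  where t2.id = t.id and
                        t2.name = t.name and
                        t2.value <> t.value and
                        t2.validTo > t.validFrom and
                        t2.validFrom < t.validTo)
    order by t.id, t.name
""").fetchall()

# Cumulative-max version: rows that start before an earlier slice has ended.
later_rows = conn.execute("""
    select id, name, value
    from (select t.*,
                 max(validTo) over (partition by id, name
                                    order by validFrom
                                    rows between unbounded preceding and 1 preceding
                                   ) as prev_validTo
          from t
         ) t
    where prev_validTo >= validFrom
""").fetchall()

print(overlaps)    # [(2, 'c')]
print(later_rows)  # [(2, 'c', 84)]
```

Two things to keep in mind: the EXISTS version relies on value to tell the two rows apart, so two overlapping rows with the same value would not be reported, and its strict comparisons do not count slices that merely share a boundary day, whereas the cumulative-max version (>=) does.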

Related

Calculate a column value backwards over a series of previous rows/RECURSIVE/CONNECTED BY

I need your help. I guess/hope there is a function for that. I found "CONNECT BY" and "WITH RECURSIVE AS ..." but they don't seem to solve my problem.
GIVEN TABLES:
Table A
+------+------------+----------+
| id | prev_id | date |
+------------------------------+
| 1 | | 20200101 |
| 23 | 1 | 20200104 |
| 34 | 23 | 20200112 |
| 41 | 34 | 20200130 |
+------------------------------+
Table B
+------+-----------+
| ref_id | key |
+------------------+
| 41 | abc |
+------------------+
(always points to the latest entry in table "A"; it is updated in place, with no history)
Join Statement:
SELECT
id, prev_id, key, date
FROM A
LEFT OUTER JOIN B ON B.ref_id = A.id
GIVEN psql result set:
+------+------------+----------+-----------+
| id | prev_id | key | date |
+------------------------------+-----------+
| 1 | | | 20200101 |
| 23 | 1 | | 20200104 |
| 34 | 23 | | 20200112 |
| 41 | 34 | abc | 20200130 |
+------------------------------+-----------+
DESIRED output:
+------+------------+----------+-----------+
| id | prev_id | key | date |
+------------------------------+-----------+
| 1 | | abc | 20200101 |
| 23 | 1 | abc | 20200104 |
| 34 | 23 | abc | 20200112 |
| 41 | 34 | abc | 20200130 |
+------------------------------+-----------+
The rows of the result set are connected by columns 'id' and 'prev_id'.
I want to calculate the "key" column in a reasonable time.
Keep in mind, this is a very simplified example. Normally there are a lot more rows and different keys and ids.
I understand that you want to start from each row in tableb and carry its key back along the chain of prev_id links in tablea. Here is one approach using a recursive query:
with recursive cte as (
select a.id, a.prev_id, a.date, b.key
from tablea a
inner join tableb b on b.ref_id = a.id
union all
select a.id, a.prev_id, a.date, c.key
from cte c
inner join tablea a on a.id = c.prev_id
)
select * from cte
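To see the recursion at work, here is a small sketch that loads the question's sample rows and runs the same recursive query. SQLite (through Python's sqlite3 module) is used only as a convenient stand-in for psql, and the variable name chain is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tablea (id integer, prev_id integer, date text);
    create table tableb (ref_id integer, key text);
    insert into tablea values (1, null, '20200101'), (23, 1, '20200104'),
                              (34, 23, '20200112'), (41, 34, '20200130');
    insert into tableb values (41, 'abc');
""")

chain = conn.execute("""
    with recursive cte as (
        -- anchor: the row each key points at (the latest entry)
        select a.id, a.prev_id, a.date, b.key
        from tablea a
        inner join tableb b on b.ref_id = a.id
        union all
        -- step: follow prev_id backwards and keep the same key
        select a.id, a.prev_id, a.date, c.key
        from cte c
        inner join tablea a on a.id = c.prev_id
    )
    select id, prev_id, key, date from cte order by id
""").fetchall()

for row in chain:
    print(row)
# (1, None, 'abc', '20200101')
# (23, 1, 'abc', '20200104')
# (34, 23, 'abc', '20200112')
# (41, 34, 'abc', '20200130')
```

The recursion stops on its own once it reaches a row whose prev_id is NULL, because the join then finds no further match.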

Finding the Third Most Recent Value of a Field for Each ID in Each Date in SQL

I have a similar table as below:
+----+----------+-------+
| ID | Date | Value |
+----+----------+-------+
| A | 20200620 | 150 |
+----+----------+-------+
| A | 20200621 | 130 |
+----+----------+-------+
| A | 20200622 | 140 |
+----+----------+-------+
| A | 20200623 | 200 |
+----+----------+-------+
| B | 20200622 | 300 |
+----+----------+-------+
| B | 20200623 | 350 |
+----+----------+-------+
| B | 20200624 | 400 |
+----+----------+-------+
| B | 20200625 | 150 |
+----+----------+-------+
I need to add a column that shows, for each ID and each date, the value from two business days prior to that date (for example, for A on '20200623' it should show the value from '20200621'). The output should look similar to the table below:
+----+----------+-------+--------------------------+
| ID | Date | Value | Value_AsoF_TwoDaysBefore |
+----+----------+-------+--------------------------+
| A | 20200620 | 150 | NULL |
+----+----------+-------+--------------------------+
| A | 20200621 | 130 | NULL |
+----+----------+-------+--------------------------+
| A | 20200622 | 140 | 150 |
+----+----------+-------+--------------------------+
| A | 20200623 | 200 | 130 |
+----+----------+-------+--------------------------+
| B | 20200622 | 300 | NULL |
+----+----------+-------+--------------------------+
| B | 20200623 | 350 | NULL |
+----+----------+-------+--------------------------+
| B | 20200624 | 400 | 300 |
+----+----------+-------+--------------------------+
| B | 20200625 | 150 | 350 |
+----+----------+-------+--------------------------+
Could you please let me know a way to do that? I appreciate any help.
Try the option below, which uses ROW_NUMBER and a self-join:
WITH CTE
AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date) RN
FROM your_table
)
SELECT A.ID, A.Date, A.Value, B.Value AS Value_AsoF_TwoDaysBefore
FROM CTE A
LEFT JOIN CTE B ON A.RN = B.RN + 2 AND A.ID = B.ID
Assuming you have a row for every date, use lag():
select t.*,
lag(value, 2) over (partition by id order by date) as Value_AsoF_TwoDaysBefore
from t;
If you don't have a value for every day, then use a left join:
select t.*, tprev.value as Value_AsoF_TwoDaysBefore
from t left join
t tprev
on tprev.id = t.id and tprev.date = dateadd(day, -2, t.date) ;
Note: Both of these return NULL for the missing values rather than -. NULLs make much more sense in SQL.
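The difference between the two approaches is easiest to see by running them side by side. The sketch below does that with Python's sqlite3 module as a stand-in engine. Two assumptions to flag: the dates are stored in ISO form (2020-06-20 rather than 20200620) so that SQLite's date() function can do the arithmetic that dateadd() does in SQL Server, and the names by_rows and by_days are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id text, date text, value integer)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("A", "2020-06-20", 150), ("A", "2020-06-21", 130),
    ("A", "2020-06-22", 140), ("A", "2020-06-23", 200),
    ("B", "2020-06-22", 300), ("B", "2020-06-23", 350),
    ("B", "2020-06-24", 400), ("B", "2020-06-25", 150),
])

# lag(): looks two ROWS back within each id, whatever the dates are.
by_rows = conn.execute("""
    select id, date, value,
           lag(value, 2) over (partition by id order by date) as two_before
    from t
    order by id, date
""").fetchall()

# self-join: looks exactly two CALENDAR DAYS back, NULL if that day is missing.
by_days = conn.execute("""
    select t.id, t.date, t.value, tprev.value as two_before
    from t left join
         t tprev
         on tprev.id = t.id and tprev.date = date(t.date, '-2 days')
    order by t.id, t.date
""").fetchall()

print([r[3] for r in by_rows])
# [None, None, 150, 130, None, None, 300, 350]
print(by_rows == by_days)  # True, because no day is missing in this sample
```

With gaps in the data the two results diverge: lag() would still return the value from two rows earlier, while the join would return NULL for a date whose counterpart two days before is absent.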

Running sum with max and min cap in SQL Server

I have a table that looks like this
|ID1| ID2| Date |count |
+---+----+------------+------+
|1 | 1 | 2019-07-24 | 3 |
|1 | 1 | 2019-07-25 | 3 |
|1 | 1 | 2019-07-26 | 3 |
|1 | 1 | 2019-07-27 | 1 |
|1 | 1 | 2019-07-28 | -3 |
|1 | 2 | 2019-07-24 | 1 |
|1 | 2 | 2019-07-25 | -3 |
|1 | 2 | 2019-07-26 | 3 |
|1 | 2 | 2019-07-27 | 3 |
|1 | 2 | 2019-07-28 | 3 |
I am interested in calculating the running sum with a min cap of 0 and a max cap of 8. Resulting table would look like this.
|ID1| ID2| Date |count |runningSum|
+---+----+------------+------+----------+
|1 | 1 | 2019-07-24 | 3 | 3 |
|1 | 1 | 2019-07-25 | 3 | 6 |
|1 | 1 | 2019-07-26 | 3 | 8 |
|1 | 1 | 2019-07-27 | 1 | 8 |
|1 | 1 | 2019-07-28 | -3 | 5 |
|1 | 2 | 2019-07-24 | 1 | 1 |
|1 | 2 | 2019-07-25 | -3 | 0 |
|1 | 2 | 2019-07-26 | 3 | 3 |
|1 | 2 | 2019-07-27 | 3 | 6 |
|1 | 2 | 2019-07-28 | 3 | 8 |
I know that Oracle has many different solutions to this problem, like the one described in number 7 here:
https://blog.jooq.org/2016/04/25/10-sql-tricks-that-you-didnt-think-were-possible/. Does anything as simple as this exist for Microsoft SQL Server?
Note that I am not allowed to create tables, temporary tables or table variables.
EDIT I am using Azure Datawarehouse where recursive CTE and cursor statements are not available. Are there really not any other ways to solve this problem in SQL Server?
I don't think you can do this with window functions, alas. The problem is that the caps introduce a state change, so you have to process the rows incrementally to get the value for a given row.
A recursive CTE does iteration, so it can do what you want:
with t as (
select t.*,
row_number() over (partition by id1, id2 order by date) as seqnum
from <yourtable> t
),
cte as (
select id1, id2, date, count,
(case when count < 0 then 0
when count > 8 then 8
else count
end) as runningsum,
seqnum
from t
where seqnum = 1
union all
select cte.id1, cte.id2, t.date, t.count,
(case when t.count + cte.runningsum < 0 then 0
when t.count + cte.runningsum > 8 then 8
else t.count + cte.runningsum
end) as runningsum, t.seqnum
from cte join
t
on t.seqnum = cte.seqnum + 1 and
t.id1 = cte.id1 and t.id2 = cte.id2
)
select *
from cte
order by id1, id2, date;
Note that very similar code will work in Oracle 12C, which supports recursive CTEs. In earlier versions of Oracle, you can use connect by.
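For reference, here is a sketch of the same recursive approach run on the question's sample data. It uses SQLite through Python's sqlite3 module as a stand-in engine, not SQL Server or Azure. The table name counts and the column name cnt (instead of count) are assumptions made for the example, as is the variable name capped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table counts (id1 integer, id2 integer, date text, cnt integer)")
conn.executemany("insert into counts values (?, ?, ?, ?)", [
    (1, 1, "2019-07-24", 3), (1, 1, "2019-07-25", 3), (1, 1, "2019-07-26", 3),
    (1, 1, "2019-07-27", 1), (1, 1, "2019-07-28", -3),
    (1, 2, "2019-07-24", 1), (1, 2, "2019-07-25", -3), (1, 2, "2019-07-26", 3),
    (1, 2, "2019-07-27", 3), (1, 2, "2019-07-28", 3),
])

capped = conn.execute("""
    with recursive t as (
        select c.*,
               row_number() over (partition by id1, id2 order by date) as seqnum
        from counts c
    ),
    cte as (
        -- first row of each (id1, id2) group, clamped to the range 0..8
        select id1, id2, date, cnt,
               (case when cnt < 0 then 0
                     when cnt > 8 then 8
                     else cnt
                end) as runningsum,
               seqnum
        from t
        where seqnum = 1
        union all
        -- next row: add its count to the previous capped sum, then clamp again
        select t.id1, t.id2, t.date, t.cnt,
               (case when t.cnt + cte.runningsum < 0 then 0
                     when t.cnt + cte.runningsum > 8 then 8
                     else t.cnt + cte.runningsum
                end) as runningsum,
               t.seqnum
        from cte join
             t
             on t.seqnum = cte.seqnum + 1 and
                t.id1 = cte.id1 and t.id2 = cte.id2
    )
    select id1, id2, date, cnt, runningsum
    from cte
    order by id1, id2, date
""").fetchall()

print([r[4] for r in capped])
# [3, 6, 8, 8, 5, 1, 0, 3, 6, 8]
```

The clamping happens at every step, which is why the value for 2019-07-28 in the first group is 5 (8 - 3) rather than 7 (the uncapped sum).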

Count of records between min date range and other date

I'm trying to get the count of records of users who appear between a certain date range, specifically the min(date) for each unique user and that min(date) + 14 days. I've checked this link
SQL HAVING BETWEEN a date range
but it's not what I'm looking for. Here's an example of what I'm working with and what I've tried to do
+----+------------+
| ID | ServiceDt |
+----+------------+
| 10 | 2017-03-02 |
| 10 | 2017-03-05 |
| 10 | 2017-03-06 |
| 10 | 2017-03-14 |
| 10 | 2017-03-27 |
| 11 | 2017-03-10 |
| 11 | 2017-03-19 |
| 11 | 2017-04-02 |
| 11 | 2017-04-14 |
| 11 | 2017-04-23 |
| .. | .. |
The query is:
SELECT ID, COUNT(ServiceDt) AS date_count
FROM (
SELECT ID, ServiceDt
FROM tbl
GROUP BY ID, ServiceDt
HAVING ServiceDt BETWEEN MIN(ServiceDt) AND DATEADD(day, +14, MIN(ServiceDt))
) AS R1
GROUP BY ID
When I do the above query I get the following result.
+----+------------+
| ID | date_count |
+----+------------+
| 10 | 5 |
| 11 | 5 |
| .. | .. |
I also tried using CONVERT(date, ...), but I get the same resulting table above. I want the result to be
+----+------------+
| ID | date_count |
+----+------------+
| 10 | 4 |
| 11 | 2 |
| .. | .. |
Can someone please guide me on what I can do to get my desired output? Thanks.
Use window functions:
select id, count(*)
from (select t.*, min(servicedt) over (partition by id) as min_sd
from tbl t
) t
where servicedt <= dateadd(day, 14, min_sd)
group by id;
Another option is to use cross apply() to get the first ServiceDt for each id and use that in your where clause.
select id, count(*) as date_count
from t
cross apply (
select top 1
i.ServiceDt
from t i
where i.Id = t.Id
order by i.ServiceDt
) x
where t.ServiceDt <= dateadd(day,14,x.ServiceDt)
group by id
rextester demo: http://rextester.com/WXA46698
returns:
+----+------------+
| id | date_count |
+----+------------+
| 10 | 4 |
| 11 | 2 |
+----+------------+
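The window-function version can be checked the same way. This is only a sketch: it runs in SQLite via Python's sqlite3 module, so dateadd(day, 14, min_sd) is replaced by SQLite's date(min_sd, '+14 days'), and the variable name counts is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (ID integer, ServiceDt text)")
conn.executemany("insert into tbl values (?, ?)", [
    (10, "2017-03-02"), (10, "2017-03-05"), (10, "2017-03-06"),
    (10, "2017-03-14"), (10, "2017-03-27"),
    (11, "2017-03-10"), (11, "2017-03-19"), (11, "2017-04-02"),
    (11, "2017-04-14"), (11, "2017-04-23"),
])

counts = conn.execute("""
    select id, count(*) as date_count
    from (select t.*,
                 min(ServiceDt) over (partition by ID) as min_sd  -- first date per ID
          from tbl t
         ) t
    where ServiceDt <= date(min_sd, '+14 days')
    group by id
    order by id
""").fetchall()

print(counts)  # [(10, 4), (11, 2)]
```

The original query returned 5 for every ID because it grouped by ServiceDt as well, so MIN(ServiceDt) was computed per single date and every row fell inside its own range.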

How to query two tables grouping by date and counting date registries for each table

I am using Firebird 2.5 and I have two tables:
|----------TABLE_1---------| |----------TABLE_2---------|
| DATE_1 | SOME_INFO_1 | | DATE_2 | SOME_INFO_2 |
| 2015-05-01 | Brazil | | 2015-04-10 | Bread |
| 2015-06-23 | Paraguai | | 2015-05-01 | Air |
| 2015-05-01 | Chile | | 2015-05-01 | Water |
| 2015-05-01 | Argentina |
I want to group these tables by date, in such a way that I can count how many records I have per date in each table. This is the expected result:
|----------------RESULT_TABLE----------------|
| DATE | COUNT_TABLE_1 | COUNT_TABLE 2 |
| 2015-04-10 | 0 | 1 |
| 2015-05-01 | 3 | 2 |
| 2015-06-23 | 1 | 0 |
I am trying this:
SELECT
a.date_1,
COUNT(a.date_1) AS count_table_1,
COUNT(b.date_2) AS count_table_2
FROM
table_1 a
LEFT OUTER JOIN
table_2 b ON b.date_2 = a.date_1
GROUP BY
a.date_1, b.date_2
ORDER BY
a.date_1 ASC, b.date_2 ASC
And I am getting this as result:
|----------------RESULT_TABLE----------------|
| DATE | COUNT_TABLE_1 | COUNT_TABLE 2 |
| 2014-04-10 | 0 | 1 |
| 2015-05-01 | 6 | 6 |
| 2015-06-23 | 1 | 0 |
I am feeling that my SQL is a mess, but I can't solve it.
First, aggregate the results in each table and then join using the date column.
select coalesce(t1.date_1, t2.date_2) as dt, coalesce(t1_count, 0) as count_table_1, coalesce(t2_count, 0) as count_table_2
from
(SELECT date_1, count(*) as t1_count from table_1 group by date_1) t1
full outer join
(SELECT date_2, count(*) as t2_count from table_2 group by date_2) t2
on t1.date_1 = t2.date_2
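The idea of counting each table on its own before combining can be tried out with the sketch below. Note that it swaps in a different technique: instead of a full outer join it tags the rows of each table, stacks them with UNION ALL, and sums the tags per date, because the engine used here (SQLite via Python's sqlite3 module, a stand-in for Firebird) only gained FULL OUTER JOIN in recent versions. The names from_1, from_2, u and counts are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table_1 (date_1 text, some_info_1 text);
    create table table_2 (date_2 text, some_info_2 text);
    insert into table_1 values ('2015-05-01', 'Brazil'), ('2015-06-23', 'Paraguai'),
                               ('2015-05-01', 'Chile'), ('2015-05-01', 'Argentina');
    insert into table_2 values ('2015-04-10', 'Bread'), ('2015-05-01', 'Air'),
                               ('2015-05-01', 'Water');
""")

counts = conn.execute("""
    select dt,
           sum(from_1) as count_table_1,
           sum(from_2) as count_table_2
    from (select date_1 as dt, 1 as from_1, 0 as from_2 from table_1
          union all
          select date_2, 0, 1 from table_2
         ) u
    group by dt
    order by dt
""").fetchall()

for row in counts:
    print(row)
# ('2015-04-10', 0, 1)
# ('2015-05-01', 3, 2)
# ('2015-06-23', 1, 0)
```

Either way, the key point is the same as in the answer: the rows of the two tables are never joined to each other one by one, so the 3 x 2 = 6 blow-up from the original query cannot happen.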