postgres: How to select rows that have every value in a range?

I have a table, test_table that looks like this:
analysis_date | test_num
------------------------+--------------------
2001-01-01 | 1
2001-01-01 | 2
2001-01-01 | 3
2001-01-02 | 1
2001-01-02 | 2
2001-01-02 | 3
2001-01-03 | 1
2001-01-03 | 2
2001-01-03 | 8
I only want to select rows in which the analysis date has a value for test_num 1, 2, AND 3. The query should not return rows for analysis_date 2001-01-03, as the row for test_num = 3 is missing:
analysis_date | test_num
------------------------+--------------------
2001-01-01 | 1
2001-01-01 | 2
2001-01-01 | 3
2001-01-02 | 1
2001-01-02 | 2
2001-01-02 | 3
I am aware of the BETWEEN query, but that doesn't guarantee that all the values within the range exist.

You may try
SELECT * FROM test_table
WHERE analysis_date
IN ( SELECT analysis_date
FROM test_table where test_num IN (1,2,3)
group by analysis_date having
count(DISTINCT test_num) = 3
)
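As a quick check of the GROUP BY / HAVING approach above, here is a minimal sketch that runs the same query against the question's sample data. It assumes SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL; the query text itself is unchanged apart from an added ORDER BY.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (analysis_date TEXT, test_num INTEGER)")
rows = [
    ("2001-01-01", 1), ("2001-01-01", 2), ("2001-01-01", 3),
    ("2001-01-02", 1), ("2001-01-02", 2), ("2001-01-02", 3),
    ("2001-01-03", 1), ("2001-01-03", 2), ("2001-01-03", 8),
]
conn.executemany("INSERT INTO test_table VALUES (?, ?)", rows)

# Keep only dates on which all three of the wanted test numbers occur.
query = """
SELECT analysis_date, test_num
FROM test_table
WHERE analysis_date IN (
    SELECT analysis_date
    FROM test_table
    WHERE test_num IN (1, 2, 3)
    GROUP BY analysis_date
    HAVING COUNT(DISTINCT test_num) = 3
)
ORDER BY analysis_date, test_num
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The date 2001-01-03 is dropped because only two of the three wanted values appear on it. Note that the outer query returns every row for a qualifying date, including any extra test numbers that date might have.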

You can use exists also
select *
from test_table t
where exists ( select 1
from test_table
where test_num in (1,2,3)
and analysis_date = t.analysis_date
group by analysis_date
having count(distinct test_num) = 3
)
as an alternative to @Kaushik's answer.

The fastest method if you want the original rows might be three exists:
select t.*
from test_table t
where exists (select 1
from test_table tt
where tt.analysis_date = t.analysis_date and
tt.test_num = 1
) and
exists (select 1
from test_table tt
where tt.analysis_date = t.analysis_date and
tt.test_num = 2
) and
exists (select 1
from test_table tt
where tt.analysis_date = t.analysis_date and
tt.test_num = 3
);
In particular, this can make use of an index on (analysis_date, test_num).
That said, I think I prefer window functions. Assuming you have no duplicate tests on the same date:
select t.*
from (select tt.*,
count(*) filter (where test_num in (1, 2, 3)) over (partition by analysis_date) as cnt
from test_table tt
) t
where cnt = 3;

Related

Get only rows where data is the max value

I have a table like this:
treatment | patient_id
3 | 1
3 | 1
3 | 1
2 | 1
2 | 1
1 | 1
2 | 2
2 | 2
1 | 2
I need to get only rows on max(treatment) like this:
treatment | patient_id
3 | 1
3 | 1
3 | 1
2 | 2
2 | 2
The patient_id 1 the max(treatment) is 3
The patient_id 2 the max(treatment) is 2
You can for example join on the aggregated table using the maximal value:
select t.*
from tmp t
inner join (
select max(a) max_a, b
from tmp
group by b
) it on t.a = it.max_a and t.b = it.b;
Try this:
WITH list AS
( SELECT patient_id, max(treatment) AS treatment_max
FROM your_table
GROUP BY patient_id
)
SELECT *
FROM your_table AS t
INNER JOIN list AS l
ON t.patient_id = l.patient_id
AND t.treatment = l.treatment_max
You can use rank:
with u as
(select *, rank() over(partition by patient_id order by treatment desc) r
from table_name)
select treatment, patient_id
from u
where r = 1;
Use a correlated subquery:
select t1.* from table_name t1
where t1.treatment=( select max(treatment) from table_name t2 where t1.patient_id=t2.patient_id
)
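To see the correlated-subquery version in action, here is a small sketch on the question's data. It assumes SQLite via Python's sqlite3 module, and the table name `treatments` is hypothetical (the answers above use several different placeholder names).

```python
import sqlite3

# Assumption: SQLite stands in for the asker's database; "treatments" is a made-up name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE treatments (treatment INTEGER, patient_id INTEGER)")
data = [(3, 1), (3, 1), (3, 1), (2, 1), (2, 1), (1, 1), (2, 2), (2, 2), (1, 2)]
conn.executemany("INSERT INTO treatments VALUES (?, ?)", data)

# For each row, compare its treatment with the maximum for the same patient.
query = """
SELECT t1.treatment, t1.patient_id
FROM treatments t1
WHERE t1.treatment = (
    SELECT MAX(t2.treatment)
    FROM treatments t2
    WHERE t2.patient_id = t1.patient_id
)
ORDER BY t1.patient_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Duplicate rows at the maximum are all kept, which matches the desired output in the question.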

SQL return second max date for each id, date and channel

I have the following table:
id channel_id date
1 | 1 | 2017-01-10
1 | 2 | 2018-02-05
1 | 1 | 2019-03-07
1 | 2 | 2020-03-15
2 | 1 | 2018-01-17
2 | 1 | 2019-07-20
2 | 1 | 2020-01-10
For each row, I want the most recent earlier date for the same id, in two separate columns: one for the previous maximum date where channel_id equals 1 and another for the previous maximum date where channel_id equals 2. The desired output is:
id channel_id date prev_date_channel_id1 prev_date_channel_id2
1 | 1 | 2017-01-10 | NULL | NULL |
1 | 2 | 2018-02-05 | 2017-01-10 | NULL |
1 | 1 | 2019-03-07 | 2017-01-10 | 2018-02-05 |
1 | 2 | 2020-03-15 | 2019-03-07 | 2018-02-05 |
2 | 1 | 2018-01-17 | NULL | NULL |
2 | 1 | 2019-07-20 | 2018-01-17 | NULL |
2 | 1 | 2020-01-10 | 2019-07-20 | NULL |
I wrote the query below. It returns what I want but takes too much time. I'd appreciate any optimization suggestions!
SELECT
a.id,
a.date,
MAX(c.date) AS prev_date_channel_id1,
MAX(d.date) AS prev_date_channel_id2
FROM
table a
LEFT JOIN
table c ON a.id=c.id AND a.date>c.date AND c.channel_id=1
LEFT JOIN
table d ON a.id=d.id AND a.date>d.date AND d.channel_id=2
GROUP BY a.id, a.date
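Use a cumulative conditional max over the preceding rows, one per channel. (A plain lag() would return the immediately preceding date regardless of channel, which is not what the expected output shows.)
select t.*,
max(case when channel_id = 1 then date end) over
(partition by id
order by date
rows between unbounded preceding and 1 preceding
) as prev_date_channel_id1,
max(case when channel_id = 2 then date end) over
(partition by id
order by date
rows between unbounded preceding and 1 preceding
) as prev_date_channel_id2
from t;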
I think there's an error in your "expected output" for the value of prev_date_channel_id1 on the last row (it should be 2019-07-20).
In any case, with appropriate indexing an outer apply top 1 construct might serve you better:
create table t
(
id int,
channel_id int,
[date] date,
constraint pk_t primary key clustered (id, channel_id, [date])
);
insert t values
(1, 1, '2017-01-10'),
(1, 2, '2018-02-05'),
(1, 1, '2019-03-07'),
(1, 2, '2020-03-15'),
(2, 1, '2018-01-17'),
(2, 1, '2019-07-20'),
(2, 1, '2020-01-10');
select t1.id,
t1.channel_id,
t1.[date],
prev_date_channel_id1 = c1.dt,
prev_date_channel_id2 = c2.dt
from t t1
outer apply (
select top 1 [date]
from t
where id = t1.id
and channel_id = 1
and [date] < t1.[date]
order by date desc
) c1(dt)
outer apply (
select top 1 [date]
from t
where id = t1.id
and channel_id = 2
and [date] < t1.[date]
order by date desc
) c2(dt)
order by t1.id, t1.[date];
Or possibly faster still, especially with the key changed to constraint pk_t primary key clustered (id, [date], [channel_id]):
select t1.id,
t1.channel_id,
t1.[date],
prev_date_channel_id1 = prev.c1,
prev_date_channel_id2 = prev.c2
from t t1
outer apply (
select c1 = max(iif(channel_id = 1, [date], null)),
c2 = max(iif(channel_id = 2, [date], null))
from t
where id = t1.id
and [date] < t1.[date]
) prev
Assuming you have an index on those three columns, you can use subqueries:
SELECT [T0].[id],
[T0].[channel_id],
[T0].[date],
[prev_date_channel_id1] = (
SELECT MAX([T1].[date])
FROM [t] [T1]
WHERE [T1].[id] = [T0].[id]
AND [T1].[date] < [T0].[date]
AND [T1].[channel_id] = 1
),
[prev_date_channel_id2] = (
SELECT MAX([T1].[date])
FROM [t] [T1]
WHERE [T1].[id] = [T0].[id]
AND [T1].[date] < [T0].[date]
AND [T1].[channel_id] = 2
)
FROM [t] [T0];
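The scalar-subquery form is the most portable of the variants above, so it is easy to check against the expected output from the question. The sketch below assumes SQLite via Python's sqlite3 module rather than SQL Server, so the bracket quoting and `alias = expr` syntax are replaced with standard quoting and `AS`; dates are stored as ISO strings, which compare correctly as text.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; ISO date strings sort chronologically.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (id INTEGER, channel_id INTEGER, "date" TEXT)')
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, 1, "2017-01-10"), (1, 2, "2018-02-05"),
    (1, 1, "2019-03-07"), (1, 2, "2020-03-15"),
    (2, 1, "2018-01-17"), (2, 1, "2019-07-20"), (2, 1, "2020-01-10"),
])

# One scalar subquery per channel: latest earlier date for the same id.
query = """
SELECT t0.id, t0.channel_id, t0."date",
       (SELECT MAX(t1."date") FROM t t1
         WHERE t1.id = t0.id AND t1."date" < t0."date"
           AND t1.channel_id = 1) AS prev_date_channel_id1,
       (SELECT MAX(t1."date") FROM t t1
         WHERE t1.id = t0.id AND t1."date" < t0."date"
           AND t1.channel_id = 2) AS prev_date_channel_id2
FROM t t0
ORDER BY t0.id, t0."date"
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Missing previous dates come back as None (SQL NULL), as in the desired output.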

single column value in multiple columns

ID|Class | Number
--+------+---------
1 | 1 | 58.2
2 | 1 | 85.4
3 | 2 | 28.2
4 | 2 | 55.4
The desired result would be:
Column1 |Number | Column2 | Number
--------+-------+---------+---------
1 | 58.2 | 2 |28.2
1 | 85.4 | 2 |55.4
What would be the required SQL?
You can use row_number() and aggregate:
select 1 as column1, max(case when class = 1 then number end) as number1,
2 as column2, max(case when class = 2 then number end) as number2
from (select t.*,
row_number() over (partition by class order by id) as seqnum
from t
) t
group by seqnum
order by seqnum;
The aggregation groups on the per-class row number, so the n-th row of class 1 and the n-th row of class 2 end up in the same output row.
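Here is a minimal sketch of the row_number()-and-aggregate idea, run on the question's four rows. It assumes SQLite 3.25 or later (for window functions) via Python's sqlite3 module, and it groups on the per-class row number so that the first rows of each class share one output row, the second rows share the next, and so on. The output column names are illustrative.

```python
import sqlite3

# Assumption: SQLite >= 3.25 (window function support) stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, class INTEGER, number REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(1, 1, 58.2), (2, 1, 85.4), (3, 2, 28.2), (4, 2, 55.4)])

# Number the rows within each class, then pivot the classes side by side.
query = """
SELECT 1 AS column1,
       MAX(CASE WHEN class = 1 THEN number END) AS number1,
       2 AS column2,
       MAX(CASE WHEN class = 2 THEN number END) AS number2
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY class ORDER BY id) AS seqnum
      FROM t) AS s
GROUP BY seqnum
ORDER BY seqnum
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If one class has more rows than the other, the shorter side simply yields NULL in the unmatched output rows.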
try this
SELECT 1 AS Column1,t2.Number,2 AS Column2,t1.Number
FROM
(
SELECT *
FROM test11
) t2
INNER JOIN
(
SELECT *
FROM test11
) t1
ON t1.Class = t2.Class
WHERE t1.ID < t2.ID
ORDER BY t1.ID DESC

Oracle SQL - update 2 columns in row with the oldest date

I am attempting to update 2 columns in a row. The row that should be updated is the row that has the oldest duedate.
The table chorecompletion is described as:
Name Null? Type
----------------------------------------- -------- ----------------------------
CHOREID NOT NULL NUMBER(38)
GROUPID NOT NULL NUMBER(38)
DUEDATE NOT NULL DATE
COMPLETEDDATE DATE
COMPLETEDBY NUMBER(38)
This query returns the row that I want to update
select *
from
(
select choreid, duedate, row_number()
over (partition by choreid order by duedate) as rn
from chorecompletion where choreid = 12 and groupid = 6
)
where rn = 1;
Where I could use some help is how to use this query in my update statement, specifically in the where clause.
My current attempt:
update chorecompletion
set completeddate = sysdate, completedby=1
where --How to get the result of the previous query here?
Any help on my logic would be hugely appreciated.
Example desired result:
Before Update:
CHOREID GROUPID DUEDATE COMPLETEDDATE COMPLETEDBY
-------------------------------------------------------------------
12 6 2018-11-1
12 6 2018-10-1
After Update
CHOREID GROUPID DUEDATE COMPLETEDDATE COMPLETEDBY
-------------------------------------------------------------------
12 6 2018-11-1
12 6 2018-10-1 2018-09-30 1
Something like this?
SQL> create table test
2 (choreid number,
3 groupid number,
4 duedate date,
5 completeddate date,
6 completedby number
7 );
Table created.
SQL> insert into test
2 select 12, 6, date '2018-01-11', null, null from dual union all
3 select 12, 6, date '2018-01-10', null, null from dual;
2 rows created.
SQL> update test t set
2 t.completeddate = sysdate,
3 t.completedby = 1
4 where t.duedate = (select min(t1.duedate)
5 from test t1
6 where t1.choreid = t.choreid
7 and t1.groupid = t.groupid)
8 and t.choreid = 12
9 and t.groupid = 6;
1 row updated.
SQL> select * From test;
CHOREID GROUPID DUEDATE COMPLETEDD COMPLETEDBY
---------- ---------- ---------- ---------- -----------
12 6 2018-01-11
12 6 2018-01-10 2018-09-30 1
SQL>
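The same correlated MIN(duedate) update can be tried outside Oracle. The sketch below assumes SQLite via Python's sqlite3 module, so DATE('now') replaces sysdate and dates are stored as ISO strings; the table layout follows the question.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; DATE('now') replaces sysdate.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE chorecompletion (
    choreid INTEGER, groupid INTEGER, duedate TEXT,
    completeddate TEXT, completedby INTEGER)""")
conn.executemany("INSERT INTO chorecompletion VALUES (?, ?, ?, ?, ?)", [
    (12, 6, "2018-11-01", None, None),
    (12, 6, "2018-10-01", None, None),
])

# Update only the row whose duedate is the minimum for its chore and group.
cur = conn.execute("""
UPDATE chorecompletion
SET completeddate = DATE('now'), completedby = 1
WHERE choreid = 12
  AND groupid = 6
  AND duedate = (SELECT MIN(c2.duedate)
                 FROM chorecompletion c2
                 WHERE c2.choreid = chorecompletion.choreid
                   AND c2.groupid = chorecompletion.groupid)
""")
updated = cur.rowcount
after = conn.execute(
    "SELECT duedate, completedby FROM chorecompletion ORDER BY duedate"
).fetchall()
print(updated, after)
```

One caveat with matching on the minimum date: if two rows share the oldest duedate, both are updated. A ROWID-based approach avoids that by picking exactly one row.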
You can use a MERGE statement and join on the ROWID pseudo-column so that you can correlate directly to the matched row:
Oracle 11g R2 Schema Setup:
CREATE TABLE chorecompletion ( choreid, groupid, duedate, completeddate, completedby ) AS
SELECT 12, 6, DATE '2018-09-29', CAST( null AS DATE ), CAST( null AS NUMBER ) FROM DUAL UNION ALL
SELECT 12, 6, DATE '2018-09-30', null, null FROM DUAL;
Query 1:
MERGE INTO chorecompletion dst
USING (
SELECT ROWID AS rid
FROM (
SELECT *
FROM chorecompletion
WHERE choreid = 12
AND groupid = 6
ORDER BY duedate ASC
)
WHERE ROWNUM = 1
) src
ON ( src.RID = dst.ROWID )
WHEN MATCHED THEN
UPDATE
SET completeddate = sysdate,
completedby = 1
Results:
1 Row Updated.
Query 2:
SELECT * FROM chorecompletion
Results:
| CHOREID | GROUPID | DUEDATE | COMPLETEDDATE | COMPLETEDBY |
|---------|---------|----------------------|----------------------|-------------|
| 12 | 6 | 2018-09-29T00:00:00Z | 2018-09-30T18:42:45Z | 1 |
| 12 | 6 | 2018-09-30T00:00:00Z | (null) | (null) |
Query 3: You could also use an UPDATE statement with the ROWID pseudo-column:
UPDATE chorecompletion dst
SET completeddate = sysdate,
completedby = 2
WHERE ROWID = (
SELECT ROWID
FROM (
SELECT ROW_NUMBER() OVER ( PARTITION BY choreid ORDER BY duedate ) rn
FROM chorecompletion
WHERE choreid = 12
AND groupid = 6
ORDER BY duedate ASC
)
WHERE rn = 1
)
Results:
1 Row Updated.
Query 4:
SELECT * FROM chorecompletion
Results:
| CHOREID | GROUPID | DUEDATE | COMPLETEDDATE | COMPLETEDBY |
|---------|---------|----------------------|----------------------|-------------|
| 12 | 6 | 2018-09-29T00:00:00Z | 2018-09-30T18:42:45Z | 2 |
| 12 | 6 | 2018-09-30T00:00:00Z | (null) | (null) |
You can use a correlated subquery. If I understand correctly:
update chorecompletion
set completeddate = (select min(duedate)
from chorecompletion cc
where cc.choreid = chorecompletion.choreid
)
where choreid = 12 and groupid = 6

display 3 or more consecutive rows(Sql)

I have a table with below data
+------+------------+-----------+
| id | date1 | people |
+------+------------+-----------+
| 1 | 2017-01-01 | 10 |
| 2 | 2017-01-02 | 109 |
| 3 | 2017-01-03 | 150 |
| 4 | 2017-01-04 | 99 |
| 5 | 2017-01-05 | 145 |
| 6 | 2017-01-06 | 1455 |
| 7 | 2017-01-07 | 199 |
| 8 | 2017-01-08 | 188 |
+------+------------+-----------+
Now what I am trying to do is display runs of 3 or more consecutive rows where people was >= 100, like this:
+------+------------+-----------+
| id | date1 | people |
+------+------------+-----------+
| 5 | 2017-01-05 | 145 |
| 6 | 2017-01-06 | 1455 |
| 7 | 2017-01-07 | 199 |
| 8 | 2017-01-08 | 188 |
+------+------------+-----------+
Can anyone help me write this query for an Oracle database? I am able to display rows which are above 100, but not in a consecutive way.
Table creation(reducing typing time for people who will be helping)
CREATE TABLE stadium
( id int
, date1 date, people int
);
Insert into stadium values (
1,TO_DATE('2017-01-01','YYYY-MM-DD'),10);
Insert into stadium values
(2,TO_DATE('2017-01-02','YYYY-MM-DD'),109);
Insert into stadium values(
3,TO_DATE('2017-01-03','YYYY-MM-DD'),150);
Insert into stadium values(
4,TO_DATE('2017-01-04','YYYY-MM-DD'),99);
Insert into stadium values(
5,TO_DATE('2017-01-05','YYYY-MM-DD'),145);
Insert into stadium values(
6,TO_DATE('2017-01-06','YYYY-MM-DD'),1455);
Insert into stadium values
(7,TO_DATE('2017-01-07','YYYY-MM-DD'),199);
Insert into stadium values(
8,TO_DATE('2017-01-08','YYYY-MM-DD'),188);
Thanks in advance for the help
Assuming you mean >= 100, there are a couple of ways. One method just uses lead() and lag(). But a simple method defines each group >= 100 by the number of values < 100 before it. After filtering out the rows below 100, it uses count(*) to find the size of each run of consecutive values:
select s.*
from (select s.*, count(*) over (partition by grp) as num100pl
from (select s.*,
sum(case when people < 100 then 1 else 0 end) over (order by date1) as grp
from stadium s
) s
where people >= 100
) s
where num100pl >= 3;
You can use the following sql script to get the desired output.
WITH partitioned AS (
SELECT s.*, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
FROM stadium s
WHERE people >= 100
),
counted AS (
SELECT p.*, COUNT(*) OVER (PARTITION BY grp) AS cnt
FROM partitioned p
)
select id, date1, people
from counted
where cnt >= 3
I'm assuming that both the id and date columns are sequential and correspond to each other (there will need to be additional ROW_NUMBER() if the ids are not sequential with the dates, and more complex logic included if the dates are not necessarily sequential).
SELECT
*
FROM
(
SELECT
*
,COUNT(date) OVER (PARTITION BY sequential_group_num) AS num_days_in_sequence
FROM
(
SELECT
*
,(id - ROW_NUMBER() OVER (ORDER BY date)) AS sequential_group_num
FROM
stadium
WHERE
people >= 100
) AS subquery1
) AS subquery2
WHERE
num_days_in_sequence >= 3
That produces the following output:
id date people sequential_group_num num_days_in_sequence
----------- ---------- ----------- -------------------- --------------------
5 2017-01-05 145 2 4
6 2017-01-06 1455 2 4
7 2017-01-07 199 2 4
8 2017-01-08 188 2 4
By using correlated subqueries on the neighboring ids, we can display the consecutive rows like this:
SELECT id, date1, people FROM stadium a WHERE people >= 100
AND (SELECT people FROM stadium b WHERE b.id = a.id + 1) >= 100
AND (SELECT people FROM stadium c WHERE c.id = a.id + 2) >= 100
OR people >= 100
AND (SELECT people FROM stadium e WHERE e.id = a.id - 1) >= 100
AND (SELECT people FROM stadium f WHERE f.id = a.id + 1) >= 100
OR people >= 100
AND (SELECT people FROM stadium g WHERE g.id = a.id - 1) >= 100
AND (SELECT people FROM stadium h WHERE h.id = a.id - 2) >= 100
order by id;
select distinct
t1.*
from
stadium t1
join
stadium t2
join
stadium t3
where
t1.people >= 100
and t2.people >= 100
and t3.people >= 100
and
(
(t1.id + 1 = t2.id
and t2.id + 1 = t3.id)
or
(
t2.id + 1 = t1.id
and t1.id + 1 = t3.id
)
or
(
t2.id + 1 = t3.id
and t3.id + 1 = t1.id
)
)
order by
id;
SQL script:
SELECT DISTINCT SS.*
FROM STADIUM SS
INNER JOIN
(SELECT S1.ID
FROM STADIUM S1
WHERE 3 = (
SELECT COUNT(1)
FROM STADIUM S2
WHERE (S2.ID=S1.ID OR S2.ID=S1.ID+1 OR S2.ID=S1.ID+2)
AND S2.PEOPLE >= 100
)) AS SS2
ON SS.ID>=SS2.ID AND SS.ID<SS2.ID+3
select *
from(
select * , count(*) over (partition by grp) as total
from
(select * , Sum(case when people < 100 then 1 else 0 end) over (order by date) as grp
from stadium) T -- inner Query 1
where people >=100 )S--inner query 2
where total >=3 --outer query
I wrote the following solution for this similar leetcode problem:
with groupVisitsOver100 as (
select *,
sum(
case
when people < 100 then 1
else 0
end
) over (order by date1) as visitGroups
from stadium
),
filterUnder100 as (
select
*
from groupVisitsOver100
where people >= 100
),
countGroupsSize as (
select
*,
count(*) over (partition by visitGroups) as groupsSize
from filterUnder100
)
select id, date1, people from countGroupsSize where groupsSize >= 3 order by date1
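The running-sum grouping used in several of the answers above can be checked end to end on the question's data. This is a minimal sketch assuming SQLite 3.25 or later (for window functions) via Python's sqlite3 module in place of Oracle; the CTE and column names (`flagged`, `sized`, `grp`, `grp_size`) are illustrative.

```python
import sqlite3

# Assumption: SQLite >= 3.25 (window function support) stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stadium (id INTEGER, date1 TEXT, people INTEGER)")
conn.executemany("INSERT INTO stadium VALUES (?, ?, ?)", [
    (1, "2017-01-01", 10), (2, "2017-01-02", 109), (3, "2017-01-03", 150),
    (4, "2017-01-04", 99), (5, "2017-01-05", 145), (6, "2017-01-06", 1455),
    (7, "2017-01-07", 199), (8, "2017-01-08", 188),
])

# grp counts the low-attendance rows seen so far, so it stays constant
# across each unbroken run of rows with people >= 100.
query = """
WITH flagged AS (
    SELECT id, date1, people,
           SUM(CASE WHEN people < 100 THEN 1 ELSE 0 END)
               OVER (ORDER BY date1) AS grp
    FROM stadium
),
sized AS (
    SELECT id, date1, people,
           COUNT(*) OVER (PARTITION BY grp) AS grp_size
    FROM flagged
    WHERE people >= 100
)
SELECT id, date1, people
FROM sized
WHERE grp_size >= 3
ORDER BY date1
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Rows 2 and 3 form a run of only two, so they are excluded; rows 5 through 8 form a run of four and are returned. Filtering on people >= 100 before counting matters, because the low-attendance row that starts each group would otherwise be counted as part of the run.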