Display 3 or more consecutive rows (SQL)

I have a table with the data below:
+------+------------+-----------+
| id | date1 | people |
+------+------------+-----------+
| 1 | 2017-01-01 | 10 |
| 2 | 2017-01-02 | 109 |
| 3 | 2017-01-03 | 150 |
| 4 | 2017-01-04 | 99 |
| 5 | 2017-01-05 | 145 |
| 6 | 2017-01-06 | 1455 |
| 7 | 2017-01-07 | 199 |
| 8 | 2017-01-08 | 188 |
+------+------------+-----------+
Now what I am trying to do is display runs of 3 or more consecutive rows where people >= 100, like this:
+------+------------+-----------+
| id | date1 | people |
+------+------------+-----------+
| 5 | 2017-01-05 | 145 |
| 6 | 2017-01-06 | 1455 |
| 7 | 2017-01-07 | 199 |
| 8 | 2017-01-08 | 188 |
+------+------------+-----------+
Can anyone help me write this query for an Oracle database? I am able to display the rows where people is above 100, but not only the consecutive ones.
Table creation (to save typing for people who will be helping):
CREATE TABLE stadium
( id int
, date1 date, people int
);
Insert into stadium values (
1,TO_DATE('2017-01-01','YYYY-MM-DD'),10);
Insert into stadium values
(2,TO_DATE('2017-01-02','YYYY-MM-DD'),109);
Insert into stadium values(
3,TO_DATE('2017-01-03','YYYY-MM-DD'),150);
Insert into stadium values(
4,TO_DATE('2017-01-04','YYYY-MM-DD'),99);
Insert into stadium values(
5,TO_DATE('2017-01-05','YYYY-MM-DD'),145);
Insert into stadium values(
6,TO_DATE('2017-01-06','YYYY-MM-DD'),1455);
Insert into stadium values
(7,TO_DATE('2017-01-07','YYYY-MM-DD'),199);
Insert into stadium values(
8,TO_DATE('2017-01-08','YYYY-MM-DD'),188);
Thanks in advance for the help

Assuming you mean >= 100, there are a couple of ways. One method just uses lead() and lag(). But a simple method defines each group >= 100 by the number of values < 100 before it. Then it uses count(*) to find the size of the consecutive values:
select s.*
from (select s.*, count(*) over (partition by grp) as num100pl
from (select s.*,
sum(case when people < 100 then 1 else 0 end) over (order by date1) as grp
from stadium s
) s
where people >= 100
) s
where num100pl >= 3;
Here is a SQL Fiddle showing that the syntax works.
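The lead() and lag() method mentioned above is not spelled out, so here is a minimal sketch of one way it could look. It is run through Python's sqlite3 module purely so it is self-contained; using SQLite instead of Oracle, and the column aliases prev1, prev2, next1 and next2, are assumptions for illustration. A row is kept when it starts, sits in the middle of, or ends a run of three rows with people >= 100.

```python
import sqlite3

# In-memory SQLite stands in for the Oracle table (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stadium (id INTEGER, date1 TEXT, people INTEGER)")
conn.executemany("INSERT INTO stadium VALUES (?, ?, ?)", [
    (1, "2017-01-01", 10), (2, "2017-01-02", 109), (3, "2017-01-03", 150),
    (4, "2017-01-04", 99), (5, "2017-01-05", 145), (6, "2017-01-06", 1455),
    (7, "2017-01-07", 199), (8, "2017-01-08", 188),
])

query = """
SELECT id, date1, people
FROM (SELECT s.*,
             LAG(people, 1)  OVER (ORDER BY date1) AS prev1,
             LAG(people, 2)  OVER (ORDER BY date1) AS prev2,
             LEAD(people, 1) OVER (ORDER BY date1) AS next1,
             LEAD(people, 2) OVER (ORDER BY date1) AS next2
      FROM stadium s) t
WHERE people >= 100
  AND (   (prev2 >= 100 AND prev1 >= 100)   -- row ends a run of three
       OR (prev1 >= 100 AND next1 >= 100)   -- row is in the middle of one
       OR (next1 >= 100 AND next2 >= 100))  -- row starts a run of three
ORDER BY date1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Missing neighbours come back as NULL from lag() and lead(), so the comparisons on them are not true and the edge rows need no special handling.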

You can use the following sql script to get the desired output.
WITH partitioned AS (
SELECT s.*, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
FROM stadium s
WHERE people >= 100
),
counted AS (
SELECT p.*, COUNT(*) OVER (PARTITION BY grp) AS cnt
FROM partitioned p
)
SELECT id, date1, people
FROM counted
WHERE cnt >= 3

I'm assuming that both the id and date columns are sequential and correspond to each other (there will need to be additional ROW_NUMBER() if the ids are not sequential with the dates, and more complex logic included if the dates are not necessarily sequential).
SELECT
*
FROM
(
SELECT
*
,COUNT(date) OVER (PARTITION BY sequential_group_num) AS num_days_in_sequence
FROM
(
SELECT
*
,(id - ROW_NUMBER() OVER (ORDER BY date)) AS sequential_group_num
FROM
stadium
WHERE
people >= 100
) AS subquery1
) AS subquery2
WHERE
num_days_in_sequence >= 3
That produces the following output:
id date people sequential_group_num num_days_in_sequence
----------- ---------- ----------- -------------------- --------------------
5 2017-01-05 145 2 4
6 2017-01-06 1455 2 4
7 2017-01-07 199 2 4
8 2017-01-08 188 2 4
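To see why id - ROW_NUMBER() identifies a run, it may help to look at the intermediate values. The sketch below assumes sequential ids, as the answer does, and uses Python's sqlite3 module as a stand-in database (an assumption made only so the example is self-contained). Within a run of consecutive ids both id and the row number grow by one per row, so their difference stays constant; a gap in the ids bumps the difference.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stadium (id INTEGER, date1 TEXT, people INTEGER)")
conn.executemany("INSERT INTO stadium VALUES (?, ?, ?)", [
    (1, "2017-01-01", 10), (2, "2017-01-02", 109), (3, "2017-01-03", 150),
    (4, "2017-01-04", 99), (5, "2017-01-05", 145), (6, "2017-01-06", 1455),
    (7, "2017-01-07", 199), (8, "2017-01-08", 188),
])

# Number only the qualifying rows, then subtract that number from the id.
group_rows = conn.execute("""
    SELECT id,
           people,
           ROW_NUMBER() OVER (ORDER BY id)      AS rn,
           id - ROW_NUMBER() OVER (ORDER BY id) AS grp
    FROM stadium
    WHERE people >= 100
    ORDER BY id
""").fetchall()

for row in group_rows:
    print(row)  # (id, people, rn, grp)
```

Ids 2 and 3 share one group value and ids 5 to 8 share another, so counting rows per group gives the length of each run.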

Using correlated subqueries, we can display the consecutive rows like this:
SELECT id, date1, people FROM stadium a WHERE people >= 100
AND (SELECT people FROM stadium b WHERE b.id = a.id + 1) >= 100
AND (SELECT people FROM stadium c WHERE c.id = a.id + 2) >= 100
OR people >= 100
AND (SELECT people FROM stadium e WHERE e.id = a.id - 1) >= 100
AND (SELECT people FROM stadium f WHERE f.id = a.id + 1) >= 100
OR people >= 100
AND (SELECT people FROM stadium g WHERE g.id = a.id - 1) >= 100
AND (SELECT people FROM stadium h WHERE h.id = a.id - 2) >= 100
order by id;

select distinct
t1.*
from
stadium t1
join
stadium t2
join
stadium t3
where
t1.people >= 100
and t2.people >= 100
and t3.people >= 100
and
(
(t1.id + 1 = t2.id
and t2.id + 1 = t3.id)
or
(
t2.id + 1 = t1.id
and t1.id + 1 = t3.id
)
or
(
t2.id + 1 = t3.id
and t3.id + 1 = t1.id
)
)
order by
id;

SQL script:
SELECT DISTINCT SS.*
FROM STADIUM SS
INNER JOIN
(SELECT S1.ID
FROM STADIUM S1
WHERE 3 = (
SELECT COUNT(1)
FROM STADIUM S2
WHERE (S2.ID=S1.ID OR S2.ID=S1.ID+1 OR S2.ID=S1.ID+2)
AND S2.PEOPLE >= 100
)) AS SS2
ON SS.ID>=SS2.ID AND SS.ID<SS2.ID+3

select *
from (
select *, count(*) over (partition by grp) as total
from
(select *, sum(case when people < 100 then 1 else 0 end) over (order by date1) as grp
from stadium) T -- inner query 1: running count of rows below 100
where people >= 100) S -- inner query 2: keep only rows at or above 100
where total >= 3 -- outer query: keep runs of 3 or more rows
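The running sum is what assigns a group number to each run, and printing it makes the role of the people >= 100 filter clearer. This is a small sketch using Python's sqlite3 module as a stand-in database (an assumption; the table and data are the ones from the question, with the date column named date1).

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stadium (id INTEGER, date1 TEXT, people INTEGER)")
conn.executemany("INSERT INTO stadium VALUES (?, ?, ?)", [
    (1, "2017-01-01", 10), (2, "2017-01-02", 109), (3, "2017-01-03", 150),
    (4, "2017-01-04", 99), (5, "2017-01-05", 145), (6, "2017-01-06", 1455),
    (7, "2017-01-07", 199), (8, "2017-01-08", 188),
])

# grp goes up by one at every row below 100, so each such row opens a new group.
running = conn.execute("""
    SELECT id, people,
           SUM(CASE WHEN people < 100 THEN 1 ELSE 0 END)
               OVER (ORDER BY date1) AS grp
    FROM stadium
    ORDER BY date1
""").fetchall()

groups = [(row[0], row[2]) for row in running]
print(groups)
```

Each row below 100 carries the same group number as the run that follows it, which is why those rows have to be filtered out before counting the group size.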

I wrote the following solution for this similar leetcode problem:
with groupVisitsOver100 as (
select *,
sum(
case
when people < 100 then 1
else 0
end
) over (order by date1) as visitGroups
from stadium
),
filterUnder100 as (
select
*
from groupVisitsOver100
where people >= 100
),
countGroupsSize as (
select
*,
count(*) over (partition by visitGroups) as groupsSize
from filterUnder100
)
select id, date1, people from countGroupsSize where groupsSize >= 3 order by date1

Related

SQL return second max date for each id, date and channel

I have the following table:
id channel_id date
1 | 1 | 2017-01-10
1 | 2 | 2018-02-05
1 | 1 | 2019-03-07
1 | 2 | 2020-03-15
2 | 1 | 2018-01-17
2 | 1 | 2019-07-20
2 | 1 | 2020-01-10
For each row, I want to return the latest earlier date for the same id, in two separate columns, one per channel_id: one column holds the previous maximum date where channel_id = 1, and the other holds the previous maximum date where channel_id = 2. The output I want is below:
id channel_id date prev_date_channel_id1 prev_date_channel_id2
1 | 1 | 2017-01-10 | NULL | NULL |
1 | 2 | 2018-02-05 | 2017-01-10 | NULL |
1 | 1 | 2019-03-07 | 2017-01-10 | 2018-02-05 |
1 | 2 | 2020-03-15 | 2019-03-07 | 2018-02-05 |
2 | 1 | 2018-01-17 | NULL | NULL |
2 | 1 | 2019-07-20 | 2018-01-17 | NULL |
2 | 1 | 2020-01-10 | 2019-07-20 | NULL |
I wrote the query below, which returns what I want but takes too much time. I'd appreciate any optimization suggestions!
SELECT
a.id,
a.date,
MAX(c.date) AS prev_date_channel_id1,
MAX(d.date) AS prev_date_channel_id2
FROM
table a
LEFT JOIN
table c ON a.id=c.id AND a.date>c.date AND c.channel_id=1
LEFT JOIN
table d ON a.id=d.id AND a.date>d.date AND d.channel_id=2
GROUP BY a.id, a.date
Use a cumulative conditional max over the preceding rows, one per channel (a plain lag() would return the previous date regardless of channel):
select t.*,
max(case when channel_id = 1 then date end) over
(partition by id
order by date
rows between unbounded preceding and 1 preceding
) as prev_date_channel_id1,
max(case when channel_id = 2 then date end) over
(partition by id
order by date
rows between unbounded preceding and 1 preceding
) as prev_date_channel_id2
from t;
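As a check against the expected output in the question, here is a sketch of a conditional max over the preceding rows, computed once per channel. It runs on Python's sqlite3 module rather than SQL Server (an assumption made only so it is self-contained), and the named window w is an illustrative choice.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, channel_id INTEGER, date TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, 1, "2017-01-10"), (1, 2, "2018-02-05"), (1, 1, "2019-03-07"),
    (1, 2, "2020-03-15"), (2, 1, "2018-01-17"), (2, 1, "2019-07-20"),
    (2, 1, "2020-01-10"),
])

# The frame stops one row before the current row, so only earlier dates count.
prev_dates = conn.execute("""
    SELECT id, channel_id, date,
           MAX(CASE WHEN channel_id = 1 THEN date END) OVER w AS prev_date_channel_id1,
           MAX(CASE WHEN channel_id = 2 THEN date END) OVER w AS prev_date_channel_id2
    FROM t
    WINDOW w AS (PARTITION BY id ORDER BY date
                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
    ORDER BY id, date
""").fetchall()

for row in prev_dates:
    print(row)
```

The ROWS frame assumes the dates are unique within an id; with ties, rows sharing a date would see each other in an unspecified order.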
I think there's an error in your "expected output" for the value of prev_date_channel_id1 on the last row (it should be 2019-07-20).
In any case, with appropriate indexing an outer apply top 1 construct might serve you better:
create table t
(
id int,
channel_id int,
[date] date,
constraint pk_t primary key clustered (id, channel_id, [date])
);
insert t values
(1, 1, '2017-01-10'),
(1, 2, '2018-02-05'),
(1, 1, '2019-03-07'),
(1, 2, '2020-03-15'),
(2, 1, '2018-01-17'),
(2, 1, '2019-07-20'),
(2, 1, '2020-01-10');
select t1.id,
t1.channel_id,
t1.[date],
prev_date_channel_id1 = c1.dt,
prev_date_channel_id2 = c2.dt
from t t1
outer apply (
select top 1 [date]
from t
where id = t1.id
and channel_id = 1
and [date] < t1.[date]
order by date desc
) c1(dt)
outer apply (
select top 1 [date]
from t
where id = t1.id
and channel_id = 2
and [date] < t1.[date]
order by date desc
) c2(dt)
order by t1.id, t1.[date];
Or possibly faster still, especially with the key changed to constraint pk_t primary key clustered (id, [date], [channel_id]):
select t1.id,
t1.channel_id,
t1.[date],
prev_date_channel_id1 = prev.c1,
prev_date_channel_id2 = prev.c2
from t t1
outer apply (
select c1 = max(iif(channel_id = 1, [date], null)),
c2 = max(iif(channel_id = 2, [date], null))
from t
where id = t1.id
and [date] < t1.[date]
) prev
Assuming you have an index on those three columns, you can use subqueries:
SELECT [T0].[id],
[T0].[channel_id],
[T0].[date],
[prev_date_channel_id1] = (
SELECT MAX([T1].[date])
FROM [t] [T1]
WHERE [T1].[id] = [T0].[id]
AND [T1].[date] < [T0].[date]
AND [T1].[channel_id] = 1
),
[prev_date_channel_id2] = (
SELECT MAX([T1].[date])
FROM [t] [T1]
WHERE [T1].[id] = [T0].[id]
AND [T1].[date] < [T0].[date]
AND [T1].[channel_id] = 2
)
FROM [t] [T0];

single column value in multiple columns

ID|Class | Number
--+------+---------
1 | 1 | 58.2
2 | 1 | 85.4
3 | 2 | 28.2
4 | 2 | 55.4
The desired result would be:
Column1 |Number | Column2 | Number
--------+-------+---------+---------
1 | 58.2 | 2 |28.2
1 | 85.4 | 2 |55.4
What would be the required SQL?
You can use row_number() and conditional aggregation:
select 1 as column1, max(case when class = 1 then number end) as number1,
2 as column2, max(case when class = 2 then number end) as number2
from (select t.*,
row_number() over (partition by class order by id) as seqnum
from t
) t
group by seqnum;
The row number pairs the n-th row of class 1 with the n-th row of class 2, and grouping by it puts each pair into one output row.
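One way to check the pairing against the sample data is sketched below: rows are numbered within each class, and grouping by that number puts the n-th row of each class into one output row. It uses Python's sqlite3 module as a stand-in database, and the output names number1 and number2 are assumptions for illustration (the question repeats the name Number).

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, class INTEGER, number REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, 1, 58.2), (2, 1, 85.4), (3, 2, 28.2), (4, 2, 55.4),
])

# seqnum restarts at 1 for every class; each seqnum becomes one output row.
paired = conn.execute("""
    SELECT 1 AS column1,
           MAX(CASE WHEN class = 1 THEN number END) AS number1,
           2 AS column2,
           MAX(CASE WHEN class = 2 THEN number END) AS number2
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY class ORDER BY id) AS seqnum
          FROM t) AS ranked
    GROUP BY seqnum
    ORDER BY seqnum
""").fetchall()

for row in paired:
    print(row)
```

If one class has more rows than the other, the extra rows come out with NULL in the shorter class's column.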
try this
SELECT 1 AS Column1,t2.Number,2 AS Column2,t1.Number
FROM
(
SELECT *
FROM test11
) t2
INNER JOIN
(
SELECT *
FROM test11
) t1
ON t1.Class = t2.Class
WHERE t1.ID < t2.ID
ORDER BY t1.ID DESC
Demo in db<>fiddle

Select except where different in SQL

I need a bit of help with a SQL query.
Imagine I've got the following table
id | date | price
1 | 1999-01-01 | 10
2 | 1999-01-01 | 10
3 | 2000-02-02 | 15
4 | 2011-03-03 | 15
5 | 2011-04-04 | 16
6 | 2011-04-04 | 20
7 | 2017-08-15 | 20
What I need is all dates where only one price is present.
In this example I need to get rid of rows 5 and 6 (because there are two different prices for the same date) and either row 1 or row 2 (because they're duplicates).
How do I do that?
select date,
count(distinct price) as prices -- included to test
from MyTable
group by date
having count(distinct price) = 1 -- distinct for the duplicate pricing
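A quick way to try the grouping on the sample data is sketched below with Python's sqlite3 module as a stand-in database (an assumption). Returning MIN(price) alongside the date is an addition for illustration; it is safe here because the HAVING clause guarantees a single distinct price per surviving date.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER, date TEXT, price INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", [
    (1, "1999-01-01", 10), (2, "1999-01-01", 10), (3, "2000-02-02", 15),
    (4, "2011-03-03", 15), (5, "2011-04-04", 16), (6, "2011-04-04", 20),
    (7, "2017-08-15", 20),
])

# Dates with exactly one distinct price; duplicates collapse into one row.
single_price_dates = conn.execute("""
    SELECT date, MIN(price) AS price
    FROM MyTable
    GROUP BY date
    HAVING COUNT(DISTINCT price) = 1
    ORDER BY date
""").fetchall()

for row in single_price_dates:
    print(row)
```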
The following should work with any DBMS
SELECT id, date, price
FROM TheTable o
WHERE NOT EXISTS (
SELECT *
FROM TheTable i
WHERE i.date = o.date
AND (
i.price <> o.price
OR (i.price = o.price AND i.id < o.id)
)
)
;
JohnHC's answer is more readable and delivers the information the OP asked for ("[...] I need all the dates [...]").
My answer, though less readable at first, is more general (it allows for more complex tie-breaking criteria) and is also capable of returning the full row (with id and price, not just date).
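To illustrate the full-row behaviour, the sketch below runs the NOT EXISTS query against the sample data from the question, using Python's sqlite3 module as a stand-in database (an assumption). The tie-break i.id < o.id keeps the lowest id among duplicate rows.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TheTable (id INTEGER, date TEXT, price INTEGER)")
conn.executemany("INSERT INTO TheTable VALUES (?, ?, ?)", [
    (1, "1999-01-01", 10), (2, "1999-01-01", 10), (3, "2000-02-02", 15),
    (4, "2011-03-03", 15), (5, "2011-04-04", 16), (6, "2011-04-04", 20),
    (7, "2017-08-15", 20),
])

# A row survives only if no row on the same date has a different price,
# and no duplicate of it has a lower id.
kept = conn.execute("""
    SELECT id, date, price
    FROM TheTable o
    WHERE NOT EXISTS (
        SELECT *
        FROM TheTable i
        WHERE i.date = o.date
          AND (i.price <> o.price
               OR (i.price = o.price AND i.id < o.id))
    )
    ORDER BY id
""").fetchall()

for row in kept:
    print(row)
```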
;WITH CTE_1(ID ,DATE,PRICE)
AS
(
SELECT 1 , '1999-01-01',10 UNION ALL
SELECT 2 , '1999-01-01',10 UNION ALL
SELECT 3 , '2000-02-02',15 UNION ALL
SELECT 4 , '2011-03-03',15 UNION ALL
SELECT 5 , '2011-04-04',16 UNION ALL
SELECT 6 , '2011-04-04',20 UNION ALL
SELECT 7 , '2017-08-15',20
)
,CTE2
AS
(
SELECT A.*
FROM CTE_1 A
INNER JOIN
CTE_1 B
ON A.DATE=B.DATE AND A.PRICE!=B.PRICE
)
SELECT * FROM CTE_1 WHERE ID NOT IN (SELECT ID FROM CTE2)

T-sql rank for max and min value

I need help with a t-sql query.
I have a table with this structure:
id | OverallRank | FirstRank | SecondRank | Nrank..
1 | 10 | 20 | 30 | 5
2 | 15 | 24 | 12 | 80
3 | 10 | 40 | 37 | 12
I need a query that produces this kind of result:
When id: 1
id | OverallRank | BestRankLabel | BestRankValue | WorstRankLabel | WorstRankValue
1 | 10 | SecondRank | 30 | Nrank | 5
When id: 2
id | OverallRank | BestRankLabel | BestRankValue | WorstRankLabel | WorstRankValue
2 | 15 | FirstRank | 24 | SecondRank | 12
How can I do it?
Thanks in advance
with cte(id, RankValue,RankName) as (
SELECT id, RankValue,RankName
FROM
(SELECT id, OverallRank, FirstRank, SecondRank, Nrank
FROM ##input) p
UNPIVOT
(RankValue FOR RankName IN
(OverallRank, FirstRank, SecondRank, Nrank)
)AS unpvt)
select t1.id, max(case when RankName = 'OverallRank' then RankValue else null end) as OverallRank,
max(case when t1.RankValue = t2.MaxRankValue then RankName else null end) as BestRankName,
MAX(t2.MaxRankValue) as BestRankValue,
max(case when t1.RankValue = t3.MinRankValue then RankName else null end) as WorstRankName,
MAX(t3.MinRankValue) as WorstRankValue
from cte as t1
left join (select id, MAX(RankValue) as MaxRankValue from cte group by id) as t2 on t1.id = t2.id
left join (select id, min(RankValue) as MinRankValue from cte group by id) as t3 on t1.id = t3.id
group by t1.id
This works with your test data. You only need to edit the list in RankName IN (OverallRank, FirstRank, SecondRank, Nrank) so that it names your real columns.
CASE
WHEN OverallRank > FirstRank and OverallRank > SecondRank and OverallRank > nRank THEN 'OverallRank'
WHEN FirstRank > OverallRank ... THEN 'FirstRank'
END
This kind of query is why you should normalise your data.
declare @id int, @numranks int
select @id = 1, @numranks = 3 -- number of Rank columns
;with cte as
(
select *
from
(
select *,
ROW_NUMBER() over (partition by id order by rank desc) rn
from
(
select * from YourBadlyDesignedTable
unpivot (Rank for RankNo in (FirstRank, SecondRank, ThirdRank))u -- etc
) v2
) v1
where id=@id and rn in (1, @numranks)
)
select
tMin.id,
tMin.OverallRank,
tMin.RankNo as BestRankLabel,
tMin.Rank as BestRankValue,
tMax.RankNo as WorstRankLabel,
tMax.Rank as WorstRankValue
from (select * from cte where rn=1) tMin
inner join (select * from cte where rn>1) tMax
on tMin.id = tmax.id
You can take out the id = @id if you want all rows.

Aggregating Several Columns in SQL

Suppose I have a table that looks like the following
id | location | dateHired | dateRehired | dateTerminated
1 | 1 | 10/1/2011 | NULL | 12/1/2011
2 | 1 | 10/3/2011 | 11/1/2011 | 12/31/2011
3 | 5 | 10/5/2011 | NULL | NULL
4 | 5 | 10/5/2011 | NULL | NULL
5 | 7 | 11/5/2011 | NULL | 12/1/2011
6 | 10 | 11/2/2011 | NULL | NULL
and I wanted to condense that into a summary table such that:
location | date | hires | rehires | terms
1 | 10/1/2011 | 1 | 0 | 0
1 | 10/3/2011 | 1 | 0 | 0
1 | 11/1/2011 | 0 | 1 | 0
1 | 12/1/2011 | 0 | 0 | 1
1 | 12/31/2011 | 0 | 0 | 1
5 | 10/5/2011 | 2 | 0 | 0
etc.
What would that SQL look like? I was thinking it would be something along the lines of:
SELECT
e.location
, -- ?
,SUM(CASE WHEN e.dateHired IS NOT NULL THEN 1 ELSE 0 END) AS Hires
,SUM(CASE WHEN e.dateRehired IS NOT NULL THEN 1 ELSE 0 END) As Rehires
,SUM(CASE WHEN e.dateTerminated IS NOT NULL THEN 1 ELSE 0 END) As Terms
FROM
Employment e
GROUP BY
e.Location
,--?
But I'm not sure whether that's entirely correct.
EDIT - This is for SQL Server 2008 R2.
Also, an INNER JOIN on the date columns assumes that there are values for all three categories, which is false; that is the original problem I was trying to solve. I was thinking of something like COALESCE, but that doesn't really make sense either.
I am sure there is probably an easier, more elegant way to solve this. However, this is the simplest and quickest approach I can think of this late that works.
CREATE TABLE #Temp
(
Location INT,
Date DATETIME,
HireCount INT,
RehireCount INT,
DateTerminatedCount INT
)
--This will keep us from having to do an insert if does not already exist
INSERT INTO #Temp (Location, Date)
SELECT DISTINCT Location, DateHired FROM Employment
UNION
SELECT DISTINCT Location, DateRehired FROM Employment
UNION
SELECT DISTINCT Location, DateTerminated FROM Employment
UPDATE #Temp
SET HireCount = Hired.HireCount
FROM #Temp
JOIN
(
SELECT Location, DateHired AS Date, COUNT(*) AS HireCount
FROM Employment
GROUP BY Location, DateHired
) AS Hired
ON Hired.Location = #Temp.Location AND Hired.Date = #Temp.Date
UPDATE #Temp
SET RehireCount = Rehire.RehireCount
FROM #Temp
JOIN
(
SELECT Location, DateRehired AS Date, COUNT(*) AS RehireCount
FROM Employment
GROUP BY Location, DateRehired
) AS Rehire
ON Rehire.Location = #Temp.Location AND Rehire.Date = #Temp.Date
UPDATE #Temp
SET DateTerminatedCount = Terminated.DateTerminatedCount
FROM #Temp
JOIN
(
SELECT Location, DateTerminated AS Date, COUNT(*) AS DateTerminatedCount
FROM Employment
GROUP BY Location, DateTerminated
) AS Terminated
ON Terminated.Location = #Temp.Location AND Terminated.Date = #Temp.Date
SELECT * FROM #Temp
How about something like:
with dates as (
select distinct location, d from (
select location, dateHired as [d]
from tbl
where dateHired is not null
union all
select location, dateRehired
from tbl
where dateRehired is not null
union all
select location, dateTerminated
from tbl
where dateTerminated is not null
) as all_dates
)
select location, [d],
(
select count(*)
from tbl
where location = dates.location
and dateHired = dates.[d]
) as hires,
(
select count(*)
from tbl
where location = dates.location
and dateRehired = dates.[d]
) as rehires,
(
select count(*)
from tbl
where location = dates.location
and dateTerminated = dates.[d]
) as terms
from dates
I don't have a SQL server handy, or I'd test it out.
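The same per-date counts can also be produced in a single pass by stacking the three date columns with UNION ALL and summing flag columns. The sketch below is one possible shape of that idea, not a tested SQL Server query: it runs on Python's sqlite3 module, the table is named Employment as in the question, the dates are rewritten in ISO format so they sort correctly as text, and the flag names hired, rehired and terminated are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employment (
    id INTEGER, location INTEGER,
    dateHired TEXT, dateRehired TEXT, dateTerminated TEXT)""")
conn.executemany("INSERT INTO Employment VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "2011-10-01", None, "2011-12-01"),
    (2, 1, "2011-10-03", "2011-11-01", "2011-12-31"),
    (3, 5, "2011-10-05", None, None),
    (4, 5, "2011-10-05", None, None),
    (5, 7, "2011-11-05", None, "2011-12-01"),
    (6, 10, "2011-11-02", None, None),
])

# One event row per non-NULL date, flagged by kind, then summed per location and date.
summary = conn.execute("""
    SELECT location, d AS date,
           SUM(hired) AS hires, SUM(rehired) AS rehires, SUM(terminated) AS terms
    FROM (SELECT location, dateHired AS d, 1 AS hired, 0 AS rehired, 0 AS terminated
          FROM Employment WHERE dateHired IS NOT NULL
          UNION ALL
          SELECT location, dateRehired, 0, 1, 0
          FROM Employment WHERE dateRehired IS NOT NULL
          UNION ALL
          SELECT location, dateTerminated, 0, 0, 1
          FROM Employment WHERE dateTerminated IS NOT NULL) AS events
    GROUP BY location, d
    ORDER BY location, d
""").fetchall()

for row in summary:
    print(row)
```

Because each event kind contributes its own rows, a date that has only terminations still appears, which is the case an INNER JOIN on the date columns would drop.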
SELECT * FROM
(SELECT location, dateHired as date, COUNT(1) as hires FROM mytable GROUP BY location, dateHired) H
INNER JOIN
(SELECT location, dateReHired as date, COUNT(1) as rehires FROM mytable GROUP BY location, dateReHired) R ON H.location = R.location AND H.date = R.date
INNER JOIN
(SELECT location, dateTerminated as date, COUNT(1) as terminated FROM mytable GROUP BY location, dateTerminated) T
ON H.location = T.location AND H.date = T.date