Find the Start and End Number - sql

I am looking for the correct window function for my SQL problem.
I have the following table and I need to find the start and end numbers of continuous ranges.
Logs table:
+------------+
| log_id |
+------------+
| 1 |
| 2 |
| 3 |
| 7 |
| 8 |
| 10 |
+------------+
Expected Result:
+------------+--------------+
| start_id | end_id |
+------------+--------------+
| 1 | 3 |
| 7 | 8 |
| 10 | 10 |
+------------+--------------+

The idea is just to subtract an increasing value and then aggregate:
select min(log_id), max(log_id)
from (
select t.*, row_number() over (order by log_id) as seqnum
from t
) t
group by (log_id - seqnum)
order by min(log_id);
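To see why subtracting the row number works, it can help to look at the intermediate values. The following is a minimal sketch, not the answerer's own demo: it assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions), and it loads the sample data into an in-memory table named logs.

```python
import sqlite3

# In-memory database holding the sample Logs table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (log_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO logs VALUES (?)", [(1,), (2,), (3,), (7,), (8,), (10,)])

# Inner step: log_id minus its row number stays constant within a consecutive run.
inner = conn.execute("""
    SELECT log_id,
           ROW_NUMBER() OVER (ORDER BY log_id) AS seqnum,
           log_id - ROW_NUMBER() OVER (ORDER BY log_id) AS grp
    FROM logs
    ORDER BY log_id
""").fetchall()

# Outer step: aggregate each constant-difference group into one range.
ranges = conn.execute("""
    SELECT MIN(log_id) AS start_id, MAX(log_id) AS end_id
    FROM (
        SELECT log_id,
               log_id - ROW_NUMBER() OVER (ORDER BY log_id) AS grp
        FROM logs
    ) t
    GROUP BY grp
    ORDER BY start_id
""").fetchall()

for row in inner:
    print(row)      # (log_id, seqnum, grp)
print(ranges)
```

Each gap in log_id makes the difference jump to a new value, so grouping by that difference isolates each continuous range.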

You can do this using row_number(); try the following.
select
min(log_id) as start_id,
max(log_id) as end_id
from
(
select
log_id,
log_id - row_number() over (order by log_id) as rnk
from logs
) t
group by
rnk

You can also use a CTE:
With CTE AS(
select log_id,
log_id-row_number() over(order by log_id) as diff from logs)
Select MIN(log_id) as start_id,MAX(log_id) as end_id from CTE group by diff
ORDER by start_id

Related

SQL Server Add row number each group

I am working on a query for SQL Server 2016. I order by serial_no and group by pay_type, and I would like to add a row number that restarts for each group, as in the example below:
row_no | pay_type | serial_no
1 | A | 4000118445
2 | A | 4000118458
3 | A | 4000118461
4 | A | 4000118473
5 | A | 4000118486
1 | B | 4000118499
2 | B | 4000118506
3 | B | 4000118519
4 | B | 4000118521
1 | A | 4000118534
2 | A | 4000118547
3 | A | 4000118550
1 | B | 4000118562
2 | B | 4000118565
3 | B | 4000118570
4 | B | 4000118572
Help me please..
SELECT
ROW_NUMBER() OVER(PARTITION BY paytype ORDER BY serial_no) as row_no,
paytype, serial_no
FROM table
ORDER BY serial_no
You can assign groups to adjacent pay types that are the same and then use row_number(). For this purpose, the difference of row numbers is a good way to determine the groups:
select row_number() over (partition by pay_type, seqnum - seqnum_2 order by serial_no) as row_no,
t.*
from (select t.*,
row_number() over (order by serial_no) as seqnum,
row_number() over (partition by pay_type order by serial_no) as seqnum_2
from t
) t;
This type of problem is one example of a gaps-and-islands problem. Why does the difference of row numbers work? I find that the simplest way to understand is to look at the results of the subquery.
Here is a db<>fiddle.
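As a rough illustration of looking at the subquery results, the sketch below prints both row numbers and their difference for the sample data. It is an assumption-laden stand-in for the linked fiddle: it uses Python's sqlite3 module (SQLite 3.25+) rather than SQL Server, and the column alias grp is made up for the example.

```python
import sqlite3

# Sample rows from the question as (pay_type, serial_no), in serial_no order.
rows = (
    [("A", s) for s in (4000118445, 4000118458, 4000118461, 4000118473, 4000118486)]
    + [("B", s) for s in (4000118499, 4000118506, 4000118519, 4000118521)]
    + [("A", s) for s in (4000118534, 4000118547, 4000118550)]
    + [("B", s) for s in (4000118562, 4000118565, 4000118570, 4000118572)]
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pay_type TEXT, serial_no INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# seqnum counts all rows; seqnum_2 counts rows within a pay_type.
# Their difference (grp) is constant inside each adjacent run of one pay_type.
result = conn.execute("""
    SELECT pay_type, serial_no, seqnum, seqnum_2,
           seqnum - seqnum_2 AS grp,
           ROW_NUMBER() OVER (PARTITION BY pay_type, seqnum - seqnum_2
                              ORDER BY serial_no) AS row_no
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (ORDER BY serial_no) AS seqnum,
               ROW_NUMBER() OVER (PARTITION BY pay_type ORDER BY serial_no) AS seqnum_2
        FROM t
    ) s
    ORDER BY serial_no
""").fetchall()

for row in result:
    print(row)
```

Within a run both counters advance together, so the difference is unchanged; when the pay type switches, only one counter keeps advancing, which shifts the difference for the next run.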
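Add this to your select list:
ROW_NUMBER() OVER ( ORDER BY (SELECT 1) )
Since you are already sorting in the outer query, the window function does not need to sort as well, which uses less CPU. Note that this produces a single running number over the whole result set and does not restart for each group.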

PostgreSQL: Filter select query by comparing against other rows

Suppose I have a table of Events that lists a userId and the time the Event occurred:
+----+--------+----------------------------+
| id | userId | time |
+----+--------+----------------------------+
| 1 | 46 | 2020-07-22 11:22:55.307+00 |
| 2 | 190 | 2020-07-13 20:57:07.138+00 |
| 3 | 17 | 2020-07-11 11:33:21.919+00 |
| 4 | 46 | 2020-07-22 10:17:11.104+00 |
| 5 | 97 | 2020-07-13 20:57:07.138+00 |
| 6 | 17 | 2020-07-04 11:33:21.919+00 |
| 7 | 17 | 2020-07-11 09:23:21.919+00 |
+----+--------+----------------------------+
I want to get the list of events that had a previous event on the same day, by the same user. The result for the above table would be:
+----+--------+----------------------------+
| id | userId | time |
+----+--------+----------------------------+
| 1 | 46 | 2020-07-22 11:22:55.307+00 |
| 3 | 17 | 2020-07-11 11:33:21.919+00 |
+----+--------+----------------------------+
How can I perform a select query that filters results by evaluating them against other rows in the table?
This can be done using an EXISTS condition:
select t1.*
from the_table t1
where exists (select *
from the_table t2
where t2.userid = t1.userid -- for the same user
and t2.time::date = t1.time::date -- on the same day
and t2.time < t1.time); -- but previously on that day
You can use lag():
select t.*
from (select t.*,
lag(time) over (partition by userid, time::date order by time) as prev_time
from t
) t
where prev_time is not null;
Here is a db<>fiddle.
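To check that the EXISTS and lag() approaches agree on the sample data, here is a small sketch. It is only an approximation of the PostgreSQL queries: it runs on Python's sqlite3 module (SQLite 3.25+), uses date(...) in place of the ::date cast, stores timestamps as text without the +00 offset, renames the column to event_time, and gives the last sample row id 7 so the ids are unique. All of those are assumptions made for the example.

```python
import sqlite3

events = [
    (1, 46, "2020-07-22 11:22:55.307"),
    (2, 190, "2020-07-13 20:57:07.138"),
    (3, 17, "2020-07-11 11:33:21.919"),
    (4, 46, "2020-07-22 10:17:11.104"),
    (5, 97, "2020-07-13 20:57:07.138"),
    (6, 17, "2020-07-04 11:33:21.919"),
    (7, 17, "2020-07-11 09:23:21.919"),  # id 7 is an assumption (the sample repeats id 6)
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, userId INTEGER, event_time TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)

# EXISTS: keep a row if the same user has an earlier row on the same day.
exists_ids = [r[0] for r in conn.execute("""
    SELECT e1.id
    FROM events e1
    WHERE EXISTS (SELECT 1
                  FROM events e2
                  WHERE e2.userId = e1.userId
                    AND date(e2.event_time) = date(e1.event_time)
                    AND e2.event_time < e1.event_time)
    ORDER BY e1.id
""")]

# LAG: partition by user and day; a non-NULL previous time means an earlier event exists.
lag_ids = [r[0] for r in conn.execute("""
    SELECT id
    FROM (
        SELECT e.*,
               LAG(event_time) OVER (PARTITION BY userId, date(event_time)
                                     ORDER BY event_time) AS prev_time
        FROM events e
    ) x
    WHERE prev_time IS NOT NULL
    ORDER BY id
""")]

print(exists_ids, lag_ids)
```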
Or row_number():
select t.*
from (select t.*,
row_number() over (partition by userid, time::date order by time) as seqnum
from t
) t
where seqnum >= 2;
You can use LAG() to find the previous row for a user. Then a simple comparison will tell you whether it occurred on the same day.
For example:
select *
from (
select
*,
lag(time) over(partition by userId order by time) as prev_time
from t
) x
where time::date = prev_time::date
You can use the ROW_NUMBER() analytic function:
SELECT id , userId , time
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY UserId, date_trunc('day',time) ORDER BY time) AS rn,
t.*
FROM Events t
) q
WHERE rn > 1
to return the events that have an earlier event by the same user on the same day.

Select the highest value of column 2 per column 1

Given the following table P_PROV
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 1 |19/06/2019 | 1 |
| 2 |18/07/2010 | 2 |
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
I want this output
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
Putting this in words, I want to return per person the maximum date. I tried something like this
SELECT DISTINCT pp.date, pp.id FROM P_PROV pp
WHERE (SELECT MAX(aa.date)
FROM P_PROV aa) = pp.date;
This one is only returning one row (of course, because the MAX will return the maximum date only), but I really don't know how to approach this issue, any kind of help would be appreciated
ROW_NUMBER provides one way to handle this:
SELECT id, date, person_id
FROM
(
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date DESC) rn
FROM yourTable t
) t
WHERE rn = 1;
Oracle has a fun way to do this using aggregation:
select max(id) keep (dense_rank first order by date desc) as id,
max(date) as date, person_id
from P_PROV
group by person_id;
Given that your ids are increasing, this probably also does what you want:
select max(id) as id, max(date) as date, person_id
from P_PROV
group by person_id;
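The shortcut above depends on ids increasing together with dates. The sketch below shows what could happen otherwise. It rests on several assumptions: it runs on Python's sqlite3 module (SQLite 3.25+) instead of Oracle, the date column is renamed prov_date and stored as ISO text so it sorts chronologically, and row 6 is a hypothetical extra row with a higher id but an older date.

```python
import sqlite3

rows = [
    (1, "2019-06-19", 1),
    (2, "2010-07-18", 2),
    (3, "2020-06-19", 1),
    (4, "2020-06-17", 2),
    (5, "2020-06-28", 3),
    (6, "2018-01-01", 3),  # hypothetical row: higher id, older date
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p_prov (id INTEGER, prov_date TEXT, person_id INTEGER)")
conn.executemany("INSERT INTO p_prov VALUES (?, ?, ?)", rows)

# ROW_NUMBER picks the whole row that carries the latest date per person.
by_row_number = conn.execute("""
    SELECT id, prov_date, person_id
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY prov_date DESC) AS rn
        FROM p_prov p
    ) ranked
    WHERE rn = 1
    ORDER BY person_id
""").fetchall()

# Independent MAX() calls can combine values that come from different rows.
by_max = conn.execute("""
    SELECT MAX(id) AS id, MAX(prov_date) AS prov_date, person_id
    FROM p_prov
    GROUP BY person_id
    ORDER BY person_id
""").fetchall()

print(by_row_number)
print(by_max)
```

For person 3 the aggregate version pairs id 6 with the date from id 5, a combination that does not exist in the table, while the ROW_NUMBER version returns id 5.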

SELECT based on multiple fields in MS-SQL

I have a table with 4 columns:
AcctNumb | PeriodEndingDate | WaterConsumption | ReadingType
There are multiple records for each AcctNumb, with the date that each record was recorded.
What I want to do is grab the most recent date, consumption reading, and reading type for each account.
I have tried using MAX(PeriodEndingDate) and GROUP BY AcctNumb, but I would need to aggregate all the other values, and none of the aggregate functions help me for the WaterConsumption, etc.
Can anyone point me in the right direction?
Thanks
EDIT
Here is a sample table
+----------+------------------+------------------+-------------+
| AcctNumb | PeriodEndingDate | WaterConsumption | ReadingType |
+----------+------------------+------------------+-------------+
| 1000 | 2018-03-31 | 122230 | A |
| 1001 | 2018-03-31 | 24850 | A |
| 1002 | 2018-03-31 | 88540 | A |
| 1000 | 2017-12-31 | 123800 | A |
| 1001 | 2017-12-31 | 3000 | E |
+----------+------------------+------------------+-------------+
The ReadingType is whether it's an actual (A) reading, or an estimate (E).
Try this
SELECT
AcctNumb,
PeriodEndingDate,
WaterConsumption,
ReadingType
FROM (SELECT
AcctNumb,
PeriodEndingDate,
WaterConsumption,
ReadingType,
ROW_NUMBER() OVER (PARTITION BY AcctNumb ORDER BY PeriodEndingDate DESC) AS MostrecentRecord
FROM <TableName>) dt
WHERE MostrecentRecord= 1
This can be done using ROW_NUMBER. It has been asked and answered thousands of times, but the query is easier to write than a duplicate is to find.
select *
from
(
select *
, RowNum = ROW_NUMBER() over(partition by AcctNumb order by PeriodEndingDate desc)
from YourTable
) x
where x.RowNum = 1
SELECT DQ.* FROM
(SELECT *,
Row_Number() OVER (PARTITION BY AcctNumb ORDER BY PeriodEndingDate DESC) AS RN
FROM YourTable
) AS DQ
WHERE DQ.RN = 1

Select top 1 Student Fee From List In SQL Server

In my SQL Server table, I have this data:
+------+-----+------------+
| Name | Fee | Date_Time |
+------+-----+------------+
| AA | 50 | 2018-03-27 |
| AA | 30 | 2018-04-10 |
| BB | 40 | 2018-01-10 |
| BB | 10 | 2018-04-10 |
| CC | 10 | 2018-04-10 |
| DD | 10 | 2018-04-10 |
+------+-----+------------+
How can I write a SQL query that returns the TOP 1 row for each Name (AA, BB, CC, DD), ordered by Date_Time DESC?
+------+-----+------------+
| Name | Fee | Date_Time |
+------+-----+------------+
| AA | 30 | 2018-04-10 |
| BB | 10 | 2018-04-10 |
| CC | 10 | 2018-04-10 |
| DD | 10 | 2018-04-10 |
+------+-----+------------+
Use the row_number() function to get the most recent Fee for each Name:
select top(1) with ties Name, Fee, Date_Time
from table t
order by row_number() over (partition by Name order by Date_Time desc)
Another approach can be
SELECT Name,Fee,Date_Time FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY NAME ORDER BY DATE_TIME DESC) RN
FROM [TABLE_NAME]
) T
WHERE RN=1
If you have multiple entries on the same day for a particular name and you want all of them to appear, you can use DENSE_RANK() instead of ROW_NUMBER(), as follows.
SELECT Name,Fee,Date_Time FROM
(
SELECT *, DENSE_RANK() OVER(PARTITION BY NAME ORDER BY DATE_TIME DESC) RN
FROM [TABLE_NAME]
) T
WHERE RN=1
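The difference between the two functions only shows up when there are ties. Here is a minimal sketch of that, assuming Python's sqlite3 module (SQLite 3.25+) in place of SQL Server, a table named fees, and one hypothetical extra row for AA on 2018-04-10 to create a tie.

```python
import sqlite3

rows = [
    ("AA", 50, "2018-03-27"), ("AA", 30, "2018-04-10"),
    ("BB", 40, "2018-01-10"), ("BB", 10, "2018-04-10"),
    ("CC", 10, "2018-04-10"), ("DD", 10, "2018-04-10"),
    ("AA", 20, "2018-04-10"),  # hypothetical row: second AA entry on the same day
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fees (Name TEXT, Fee INTEGER, Date_Time TEXT)")
conn.executemany("INSERT INTO fees VALUES (?, ?, ?)", rows)

def latest_per_name(ranking_function):
    # ranking_function is one of the two fixed strings used below, never user input;
    # it is interpolated because a function name cannot be bound as a parameter.
    sql = f"""
        SELECT Name, Fee, Date_Time
        FROM (
            SELECT *,
                   {ranking_function}() OVER (PARTITION BY Name
                                              ORDER BY Date_Time DESC) AS rn
            FROM fees
        ) ranked
        WHERE rn = 1
        ORDER BY Name, Fee
    """
    return conn.execute(sql).fetchall()

with_row_number = latest_per_name("ROW_NUMBER")   # one row per Name, tie broken arbitrarily
with_dense_rank = latest_per_name("DENSE_RANK")   # every row sharing the latest Date_Time

print(with_row_number)
print(with_dense_rank)
```

With ROW_NUMBER() only one of the two tied AA rows survives, and which one is not defined unless the ORDER BY includes a tiebreaker; DENSE_RANK() keeps both.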
Assign a row_number partitioned by Name and ordered by Date_Time descending, then select the rows where the row_number is 1.
Query
;with cte as (
select [rn] = row_number() over(
partition by [Name]
order by [Date_Time] desc
), *
from [your_table_name]
)
select [Name], [Fee], [Date_Time]
from cte
where [rn] = 1;