PostgreSQL use value from previous row if missing - sql

I have the following query:
WITH t as (
SELECT date_trunc('hour', time_series) as trunc
FROM generate_series('2013-02-27 22:00'::timestamp, '2013-02-28 2:00',
'1 hour') as time_series
GROUP BY trunc
ORDER BY trunc
)
SELECT DISTINCT ON(trunc) trunc, id
FROM t
LEFT JOIN (
SELECT id, created, date_trunc('hour', created) as trunc_u
FROM event
ORDER BY created DESC
) u
ON trunc = trunc_u
which yields the following result:
"2013-02-27 22:00:00";
"2013-02-27 23:00:00";2
"2013-02-28 00:00:00";5
"2013-02-28 01:00:00";
"2013-02-28 02:00:00";
Table event has id, created and some other columns, but only those two are relevant here. The query above gives me the id of the last event generated in each trunc time period (thanks to DISTINCT ON I get a nice aggregation per period).
Now, this query yields NULL if no events happened in a given time period. I would like it to return the previous available id instead, even if it comes from a different time period, i.e.:
"2013-02-27 22:00:00";0
"2013-02-27 23:00:00";2
"2013-02-28 00:00:00";5
"2013-02-28 01:00:00";5
"2013-02-28 02:00:00";5
I am sure I am missing some easy way to accomplish this. Any advice?

You can combine a self join with window functions.
To simplify, I'll use this table with these sample values:
create table t ( a int, b int);
insert into t values
( 1, 1),
( 2, Null),
( 3, Null),
( 4, 2 ),
( 5, Null),
( 6, Null);
In your query a is trunc_u and b is your id.
The query is:
with cte as (
select
t1.a,
coalesce( t1.b, t2.b, 0) as b,
rank() OVER
(PARTITION BY t1.a ORDER BY t2.a DESC) as pos
from t t1
left outer join t t2
on t2.b is not null and
t2.a < t1.a
)
select a, b
from cte
where pos = 1;
And results:
| A | B |
---------
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
| 6 | 2 |
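For anyone who wants to check the self-join-plus-rank query without a PostgreSQL instance, here is a minimal sketch that runs it through Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the in-memory database and the variable name filled are only for illustration.

```python
import sqlite3

# In-memory database holding the sample table from the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INT, b INT);
    INSERT INTO t VALUES (1, 1), (2, NULL), (3, NULL), (4, 2), (5, NULL), (6, NULL);
""")

# For each row t1, join every earlier row t2 that has a non-NULL b,
# rank those candidates by closeness (largest t2.a first) and keep the best one.
filled = conn.execute("""
    WITH cte AS (
        SELECT t1.a,
               COALESCE(t1.b, t2.b, 0) AS b,
               RANK() OVER (PARTITION BY t1.a ORDER BY t2.a DESC) AS pos
        FROM t t1
        LEFT OUTER JOIN t t2
          ON t2.b IS NOT NULL AND t2.a < t1.a
    )
    SELECT a, b FROM cte WHERE pos = 1 ORDER BY a
""").fetchall()

print(filled)
```

The final ORDER BY a is added here only to make the output order deterministic.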

Try:
WITH t as (
SELECT time_series as trunc
FROM generate_series('2013-02-27 22:00'::timestamp, '2013-02-28 2:00',
'1 hour') as time_series
)
SELECT DISTINCT ON(t.trunc) t.trunc, e.id
FROM t
JOIN event e
ON e.created < t.trunc
ORDER BY t.trunc, e.created DESC
If it is too slow - tell me. I will give you a faster query.
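As a portable alternative sketch (not the query above: DISTINCT ON is PostgreSQL-specific), a correlated subquery can pick the latest event created before the end of each hour, with COALESCE supplying 0 when nothing has happened yet. This runs in SQLite via Python; the event rows, the column name slot, and the recursive CTE standing in for generate_series are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INT, created TEXT);
    -- Hypothetical sample rows, chosen to match the ids shown in the question.
    INSERT INTO event VALUES
        (1, '2013-02-27 23:05:00'),
        (2, '2013-02-27 23:40:00'),
        (5, '2013-02-28 00:30:00');
""")

rows = conn.execute("""
    -- Recursive CTE used in place of PostgreSQL's generate_series.
    WITH RECURSIVE hours(slot) AS (
        SELECT '2013-02-27 22:00:00'
        UNION ALL
        SELECT datetime(slot, '+1 hour') FROM hours
        WHERE slot < '2013-02-28 02:00:00'
    )
    SELECT h.slot,
           -- Latest event created before the end of this hour, or 0 if none.
           COALESCE((SELECT e.id
                     FROM event e
                     WHERE e.created < datetime(h.slot, '+1 hour')
                     ORDER BY e.created DESC
                     LIMIT 1), 0) AS id
    FROM hours h
    ORDER BY h.slot
""").fetchall()

for row in rows:
    print(row)
```

Comparing against the end of the hour rather than its start means events inside the hour itself are counted, which is what the expected output in the question implies.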

Related

How can I select values from the same table, comparing values on each line?

I'm trying to get the ID values where the AL and DATE are the same, but the IDs are different, from the same table.
I've tried like this:
SELECT a.AL,a.ID,a.date
FROM
tabel a,
tabel b
where a.id <> b.id
AND a.al = b.al
AND a.date LIKE '%20201016%'
GROUP BY a.id,a.al ,a.date
Table
AL ID date
10 400 20201016
20 400 20201016
30 100 20201016
20 100 20201016
10 100 20201016
10 300 20201016
But it's returning repeated values.
I need the result to look like this:
AL ID date
10 400 20201016
10 100 20201016
10 300 20201016
Rather than performing a self-join, you can use analytic functions (and only query the table once):
SELECT AL,
ID,
"DATE"
FROM (
SELECT AL,
ID,
"DATE",
COUNT(*) OVER ( PARTITION BY al )
- COUNT(*) OVER ( PARTITION BY al, id ) AS cnt,
ROW_NUMBER() OVER ( PARTITION BY id ORDER BY al ) AS rn
FROM table_name
WHERE "DATE" >= DATE '2020-10-16'
AND "DATE" < DATE '2020-10-16' + INTERVAL '1' DAY
)
WHERE cnt > 0
AND rn = 1;
(Note: rather than comparing dates as strings you can use TRUNC to remove the time part or compare on a range of dates from midnight of the day up until midnight of the next day; a date range is better as it would then use an index on the column whereas using TRUNC would require a specific function-based index.)
So, for the sample data:
CREATE TABLE table_name ( AL, ID, "DATE" ) AS
SELECT 10, 400, DATE '2020-10-16' FROM DUAL UNION ALL
SELECT 20, 400, DATE '2020-10-16' FROM DUAL UNION ALL
SELECT 30, 100, DATE '2020-10-16' FROM DUAL UNION ALL
SELECT 20, 100, DATE '2020-10-16' FROM DUAL UNION ALL
SELECT 10, 100, DATE '2020-10-16' FROM DUAL UNION ALL
SELECT 10, 300, DATE '2020-10-16' FROM DUAL;
Outputs:
AL | ID | DATE
-: | --: | :--------
10 | 100 | 16-OCT-20
10 | 300 | 16-OCT-20
10 | 400 | 16-OCT-20
I think you want exists and a correlated subquery:
select t.*
from mytable t
where exists (
select 1
from mytable t1
where t1.al = t.al and t1.date = t.date and t1.id <> t.id
)
Instead of using <> when comparing the ids, use <:
SELECT a.AL,a.ID,a.date
FROM
tabel a,
tabel b
where a.id < b.id
AND a.al = b.al
AND a.date LIKE '%20201016%'
GROUP BY a.id, a.al ,a.date
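You are getting duplicates because you check every row against every other row in both directions. Checking only "less than" means each pair is matched once, so these duplicates will not appear. Note, however, that the row with the highest id in each AL group has no partner with a larger id and is therefore left out of the result.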
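To see how the EXISTS and "less than" approaches behave on the sample data, here is a small sketch using SQLite through Python. The table name tabel comes from the question; storing the date as text and dropping the LIKE filter are simplifications assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tabel (al INT, id INT, date TEXT);
    INSERT INTO tabel VALUES
        (10, 400, '20201016'), (20, 400, '20201016'), (30, 100, '20201016'),
        (20, 100, '20201016'), (10, 100, '20201016'), (10, 300, '20201016');
""")

# EXISTS: keep a row if some other row shares its AL and date but has a different id.
with_exists = conn.execute("""
    SELECT t.al, t.id FROM tabel t
    WHERE EXISTS (SELECT 1 FROM tabel t1
                  WHERE t1.al = t.al AND t1.date = t.date AND t1.id <> t.id)
    ORDER BY t.al, t.id
""").fetchall()

# Self join with "<": each pair is matched once, but the largest id per AL is lost.
with_less_than = conn.execute("""
    SELECT a.al, a.id FROM tabel a, tabel b
    WHERE a.id < b.id AND a.al = b.al
    GROUP BY a.id, a.al, a.date
    ORDER BY a.al, a.id
""").fetchall()

print(with_exists)
print(with_less_than)
```

Both variants also return AL = 20 rows, since ids 100 and 400 share that AL on the same date in the sample data, so neither reproduces the AL = 10-only output from the question exactly.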

SQL: Order by date and distinct Number without losing fields

My knowledge about SQL is not the best but I need a quick solution for this. I have a table
number | date | text
1 | 2018-01-13 | A
2 | 2018-01-15 | B
1 | 2018-02-15 | C
Now I need to remove the duplicate value "number" in the output(select) based on the date. It should look like this:
number | date | text
2 | 2018-01-15 | B
1 | 2018-02-15 | C
I tried
SELECT DISTINCT number, date ORDER BY date DESC FROM table
The problem is that I now miss the field "text" in the output. I also tried
SELECT * DISTINCT(SELECT * number ORDER BY(date) DESC) FROM table
Any ideas?
One option uses ROW_NUMBER:
SELECT number, date, text
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY number ORDER BY date DESC) rn
FROM yourTable
) t
WHERE rn = 1;
You can achieve this with the query below:
SELECT t1.number, t1.date, t1.text
FROM yourTable t1
INNER JOIN
(
SELECT number, MAX(date) AS max_date
FROM yourTable
GROUP BY number
) t2
ON t1.number = t2.number AND
t1.date = t2.max_date;
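A quick way to convince yourself that the ROW_NUMBER and MAX-join variants agree on the sample data is to run both in SQLite from Python. This is only a sketch; it assumes SQLite 3.25 or later for window functions, and the ORDER BY date is added to match the ordering in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (number INT, date TEXT, text TEXT);
    INSERT INTO yourTable VALUES
        (1, '2018-01-13', 'A'), (2, '2018-01-15', 'B'), (1, '2018-02-15', 'C');
""")

# Variant 1: number the rows of each "number" group, newest first, keep the first.
by_row_number = conn.execute("""
    SELECT number, date, text
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY number ORDER BY date DESC) AS rn
        FROM yourTable
    ) t
    WHERE rn = 1
    ORDER BY date
""").fetchall()

# Variant 2: compute the latest date per number, then join back for the other columns.
by_max_join = conn.execute("""
    SELECT t1.number, t1.date, t1.text
    FROM yourTable t1
    INNER JOIN (
        SELECT number, MAX(date) AS max_date
        FROM yourTable
        GROUP BY number
    ) t2
      ON t1.number = t2.number AND t1.date = t2.max_date
    ORDER BY t1.date
""").fetchall()

print(by_row_number)
print(by_max_join)
```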
Use the query below; it should work for your requirement:
SELECT number, date, text
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY number ORDER BY date DESC) rn
FROM test
) t
WHERE rn = 1;
It can be done just using window functions:
create table #TEST ([NUMBER] int, [DATE] date, [TEXT] char(1))
insert into #TEST values (1, '2018-01-13', 'A'), (2, '2018-01-15', 'B'), (1, '2018-02-15', 'C')
SELECT DISTINCT [NUMBER]
, FIRST_VALUE([DATE]) OVER (PARTITION BY [NUMBER] ORDER BY [DATE] DESC) AS [DATE]
, FIRST_VALUE([TEXT]) OVER (PARTITION BY [NUMBER] ORDER BY [DATE] DESC) AS [TEXT]
FROM #TEST

select distinct list of ids from table with earliest value in same table

I have the following table,
SDate Id Balance
2016-01-01 ABC 3
2016-01-01 DEF 7
2016-01-01 GHI 2
2016-02-01 ABC 6
2016-02-01 DEF 4
2016-02-01 GHI 8
2016-02-01 XYZ 12
I need to write a query that gives me a distinct list of Id's over a date range (so in this example SDate >= '2016-01-01' and SDate <= '2016-02-01') but also give me the earliest balance so the result from the table above I would like to see is,
Id Balance
ABC 3
DEF 7
GHI 2
XYZ 12
Is this possible?
UPDATE
Sorry I should have specified that for each date the Id is unique.
You can do this with a derived table that first works out the minimum SDate value for each Id value. Using this you then join back to your original table to find the Balance for the row that matches those values:
declare @t table(SDate date,Id nvarchar(3),Balance int);
insert into @t values ('2016-01-01','ABC',3),('2016-01-01','DEF',7),('2016-01-01','GHI',2),('2016-02-01','ABC',6),('2016-02-01','DEF',4),('2016-02-01','GHI',8),('2016-02-01','XYZ',12);
declare @StartDate date = '20160101';
declare @EndDate date = '20160201';
with d as
(
select Id
,min(SDate) as MinSDate
from @t
where SDate between @StartDate and @EndDate
group by id
)
select d.Id
,t.Balance
from d
inner join @t t
on(d.Id = t.Id
and d.MinSDate = t.SDate
);
Output:
Id | Balance
----+--------
ABC | 3
DEF | 7
GHI | 2
XYZ | 12
This should be possible with a window function. All you have to do is partition by id, assign a row number, and select the top row for each id.
Example:
select id,
balance
from (
select id,
balance,
row_number() over( partition by id order by SDate ) as row_num
from table1
where SDate between '2016-01-01' and '2016-02-01'
) as a
where row_num = 1
Note: the advantage of this method is it is a lot more flexible. Say you wanted the 2 oldest records, you could just change to where row_num <= 2.
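The flexibility mentioned in the note can be sketched as follows, using SQLite through Python (assuming SQLite 3.25 or later). The helper name oldest and its parameter n are made up for this example; the table name table1 comes from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (SDate TEXT, Id TEXT, Balance INT);
    INSERT INTO table1 VALUES
        ('2016-01-01', 'ABC', 3), ('2016-01-01', 'DEF', 7), ('2016-01-01', 'GHI', 2),
        ('2016-02-01', 'ABC', 6), ('2016-02-01', 'DEF', 4), ('2016-02-01', 'GHI', 8),
        ('2016-02-01', 'XYZ', 12);
""")

def oldest(n):
    """Return the n oldest (id, balance) rows per id within the date range."""
    return conn.execute("""
        SELECT id, balance
        FROM (
            SELECT id, balance,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY SDate) AS row_num
            FROM table1
            WHERE SDate BETWEEN '2016-01-01' AND '2016-02-01'
        ) AS a
        WHERE row_num <= ?
        ORDER BY id, row_num
    """, (n,)).fetchall()

print(oldest(1))  # earliest balance per id
print(oldest(2))  # two oldest rows per id, where available
```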
The analytic row_number() function should be the fastest:
select *
from (
select
t.*,
row_number() over (partition by Id order by SDate) rn
from your_table t
) t where rn = 1;
You can achieve this with a self join, which may not be the fastest or most elegant solution:
CREATE TABLE #SOPostSample
(
SDate DATE ,
Id NVARCHAR(5) ,
Balance INT
);
INSERT INTO #SOPostSample
( SDate, Id, Balance )
VALUES ( '2016-01-01', 'ABC', 3 ),
( '2016-01-01', 'DEF', 7 ),
( '2016-01-01', 'GHI', 2 ),
( '2016-02-01', 'ABC', 6 ),
( '2016-02-01', 'DEF', 4 ),
( '2016-02-01', 'GHI', 8 ),
( '2016-02-01', 'XYZ', 12 );
SELECT t1.Id ,
MIN(t2.Balance) Balance
FROM #SOPostSample t1
INNER JOIN #SOPostSample t2 ON t1.Id = t2.Id
GROUP BY t1.Id ,
t2.SDate
HAVING t2.SDate = MIN(t1.SDate);
DROP TABLE #SOPostSample;
Produces:
id Balance
============
ABC 3
DEF 7
GHI 2
XYZ 12
This works for the sample data, but please test with more data as I just wrote it quickly.
This should work. TOP 1 is only there for safety and should not be needed if SDate and Id are unique in combination. The date filter belongs in WHERE, because SDate is neither grouped nor aggregated and so cannot be used in HAVING:
SELECT o.Id ,
( SELECT TOP 1
Balance
FROM tbl
WHERE Id = o.Id
AND SDate = MIN(o.SDate)
) Balance
FROM tbl o
WHERE o.SDate BETWEEN '20160101' AND '20160201'
GROUP BY o.Id;
You can use a subquery:
SELECT Id ,
( SELECT TOP 1
Balance
FROM [TableName] AS T1
WHERE T1.Id = [TableName].Id
ORDER BY SDate
) AS Balance
FROM [TableName]
GROUP BY Id;

Max rows by group

Current SQL:
select t1.*
from table t1
where t1.id in ('2', '3', '4')
Current results:
id | seq
---+----
3 | 5
2 | 7
2 | 5
3 | 7
4 | 3
Attempt to select maxes:
select t1.*
from table t1
where t1.id in ('2', '3', '4')
and t1.seq = (select max(t2.seq)
from table2 t2
where t2.id = t1.id)
This obviously does not work since I'm using an in list. How can I adjust my SQL to get these expected results:
id | seq
---+----
2 | 7
3 | 7
4 | 3
Group By is your friend:
SELECT
id,
MAX(seq) seq
FROM TABLE
GROUP BY id
EDIT: Response to comment. To get the rest of the data from the table matching the max seq and id just join back to the table:
SELECT t1.*
FROM TABLE t1
INNER JOIN (
SELECT
id,
MAX(seq) as seq
FROM TABLE
GROUP BY id
) as t2
on t1.id = t2.id
and t1.seq = t2.seq
EDIT: Gordon and Jean-Francois are correct: you can also use the ROW_NUMBER() analytic function to get the same result. You need to check the performance difference for your application (I did not check). Here is an example of that:
SELECT *
FROM (
SELECT ROW_NUMBER() OVER (
PARTITION BY id
ORDER BY seq DESC) as row_num
,*
FROM TABLE
) as TMP
WHERE row_num = 1
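Besides performance, the two approaches differ when several rows tie for the maximum seq: the join-back returns every tied row, while ROW_NUMBER() keeps exactly one row per id. A minimal sketch in SQLite via Python (assuming SQLite 3.25 or later); the table name t is a stand-in because TABLE is a reserved word, and the duplicated (3, 7) row is added deliberately to show the tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INT, seq INT);
    -- Sample data from the question plus one extra (3, 7) row to create a tie.
    INSERT INTO t VALUES (3, 5), (2, 7), (2, 5), (3, 7), (3, 7), (4, 3);
""")

# Join back to the per-id maximum: every row that matches the maximum is returned.
join_back = conn.execute("""
    SELECT t1.id, t1.seq
    FROM t t1
    INNER JOIN (SELECT id, MAX(seq) AS seq FROM t GROUP BY id) t2
      ON t1.id = t2.id AND t1.seq = t2.seq
    ORDER BY t1.id
""").fetchall()

# ROW_NUMBER(): exactly one row per id, ties broken arbitrarily.
row_number = conn.execute("""
    SELECT id, seq
    FROM (
        SELECT id, seq,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq DESC) AS row_num
        FROM t
    ) AS tmp
    WHERE row_num = 1
    ORDER BY id
""").fetchall()

print(join_back)
print(row_number)
```

Which behaviour is right depends on whether tied rows should all be reported or collapsed to one.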
This SQL query will give you the max seq for each individual ID:
SELECT t1.*
FROM t1
WHERE t1.id in ('2', '3', '4')
AND NOT EXISTS (
SELECT *
FROM t1 t2
WHERE t2.id = t1.id
AND t2.seq > t1.seq
);
select *
from table
where (id,seq) in
(
select id,max(seq)
from table
group by id
having id in ('2','3','4')
);
That is if id and/or seq are completely part of the PK of that table.
Here's another example, using the first/last method I mentioned earlier in the comments:
with sd as (select 3 id, 5 seq, 1 dummy from dual union all
select 2 id, 7 seq, 2 dummy from dual union all
select 2 id, 5 seq, 3 dummy from dual union all
select 3 id, 7 seq, 4 dummy from dual union all
select 3 id, 7 seq, 5 dummy from dual union all
select 4 id, 3 seq, 6 dummy from dual)
select id,
max(seq) max_seq,
max(dummy) keep (dense_rank first order by seq desc) max_rows_dummy
from sd
group by id;
ID MAX_SEQ MAX_ROWS_DUMMY
---------- ---------- --------------
2 7 2
3 7 5
4 3 6
The keep (dense_rank first order by ...) part asks Oracle to keep only the values from the rows ranked 1 in the given ordering. The max(...) part is there in case more than one row has a rank of 1; it's just a way of breaking ties.

getting latest event for each eventtype from table [duplicate]

This question already has answers here:
How to select records with maximum values in two columns?
(2 answers)
Closed 9 years ago.
I have a table with events, their timestamps and an error code, and would like to get the latest occurrence of each event type. I am using Oracle SQL.
TS eventType code
t1 A 1
t2 A 5
t3 BA 2
t4 A 1
t5 B 3
t6 B 1
t7 ZA -
t8 A 1
Assuming that t is strictly increasing, I am looking for a query that returns for each eventType A,B,C,... the latest event.
TS eventType code
t3 BA 2
t6 B 1
t7 ZA -
t8 A 1
My natural approach would be to loop and union, but there seems to be no straightforward way to loop, so I was hoping there would be another way that is more in the spirit of Oracle SQL.
One method is:
SELECT ts, eventtype, code
FROM table_name t1
WHERE ts = (SELECT MAX(ts)
FROM table_name t2
WHERE t1.eventtype = t2.eventtype);
Or you can use the MAX() analytic function to do this:
SELECT ts, eventtype, code
FROM(
SELECT ts,
eventtype,
code,
MAX(ts) OVER (PARTITION BY eventtype) dt
FROM t
)
WHERE ts = dt
ORDER BY ts;
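The MAX() OVER variant is not Oracle-specific, so it can be tried out in SQLite through Python as a rough check (assuming SQLite 3.25 or later). The table name tbl and the literal 't1'..'t8' timestamps follow the sample data in the question; comparing them as strings is an assumption that only holds for this toy data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (ts TEXT, eventType TEXT, code INT);
    INSERT INTO tbl VALUES
        ('t1', 'A', 1), ('t2', 'A', 5), ('t3', 'BA', 2), ('t4', 'A', 1),
        ('t5', 'B', 3), ('t6', 'B', 1), ('t7', 'ZA', NULL), ('t8', 'A', 1);
""")

# Attach the latest ts of each event type to every row, then keep the rows
# whose own ts equals that maximum.
latest = conn.execute("""
    SELECT ts, eventType, code
    FROM (
        SELECT ts, eventType, code,
               MAX(ts) OVER (PARTITION BY eventType) AS dt
        FROM tbl
    )
    WHERE ts = dt
    ORDER BY ts
""").fetchall()

for row in latest:
    print(row)
```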
Analytical functions are the most efficient method (lowest cost execution plan) I've found to solve this and also have very simple syntax that does not require nested SELECTs and allows you to ORDER BY multiple columns easily.
Oracle 11g R2 Schema Setup:
CREATE TABLE tbl ( TS, eventType, code ) AS
SELECT 't1', 'A', 1 FROM DUAL
UNION ALL SELECT 't2', 'A', 5 FROM DUAL
UNION ALL SELECT 't3', 'BA', 2 FROM DUAL
UNION ALL SELECT 't4', 'A', 1 FROM DUAL
UNION ALL SELECT 't5', 'B', 3 FROM DUAL
UNION ALL SELECT 't6', 'B', 1 FROM DUAL
UNION ALL SELECT 't7', 'ZA', NULL FROM DUAL
UNION ALL SELECT 't8', 'A', 1 FROM DUAL;
Query 1:
SELECT MAX( ts ) KEEP ( DENSE_RANK LAST ORDER BY ts ) AS ts,
eventType,
MAX( code ) KEEP ( DENSE_RANK LAST ORDER BY ts ) AS code
FROM tbl
GROUP BY eventType
ORDER BY ts
Results:
| TS | EVENTTYPE | CODE |
|----|-----------|--------|
| t3 | BA | 2 |
| t6 | B | 1 |
| t7 | ZA | (null) |
| t8 | A | 1 |