SQL window function not giving me the right output

A B C D
1pm a 1 1
2pm a 2 2
3pm b 1 1
4pm b 2 2
5pm a 3 1
6pm a 4 2
When I do row_number() over (partition by B order by A) as C, I get column C. How do I get column D?

You need to assign a group to the "adjacent" values. One simple method is the difference of row numbers:
select a, b,
row_number() over (partition by b, (seqnum_a - seqnum_ab) order by a) as d
from (select t.*,
row_number() over (order by a) as seqnum_a,
row_number() over (partition by b order by a) as seqnum_ab
from t
) t;
The difference of row numbers is one solution to some types of gaps-and-islands problems (basically what you are asking for). Why it works is a little tricky to explain. I find that if someone sees the results of the subquery, they will usually get why the difference identifies the adjacent rows.
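To see the subquery results mentioned above, here is a small sketch that runs the query against an in-memory SQLite database from Python. The choice of SQLite (3.25 or later, for window functions), the integer encoding of column A (1 for 1pm, 2 for 2pm, and so on), and the extra grp column are assumptions made for illustration; they are not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (a integer, b text)")
# Assumption: a stores the hour as an integer so that "order by a" is chronological.
conn.executemany(
    "insert into t values (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "b"), (5, "a"), (6, "a")],
)

rows = conn.execute("""
    select a, b, seqnum_a, seqnum_ab,
           seqnum_a - seqnum_ab as grp,
           row_number() over (partition by b, (seqnum_a - seqnum_ab) order by a) as d
    from (select t.*,
                 row_number() over (order by a) as seqnum_a,
                 row_number() over (partition by b order by a) as seqnum_ab
          from t
         ) t
    order by a
""").fetchall()

# Each row is (a, b, seqnum_a, seqnum_ab, grp, d).
for row in rows:
    print(row)
```

Within a run of adjacent rows that share the same B, both row numbers increase by one per row, so their difference stays constant. When a value of B reappears after an interruption, seqnum_a has advanced further than seqnum_ab, so the difference changes and, together with B, identifies a new island.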

Related

SQL Row Count Over Partition By

As we know, a row number computed over partition by increases until the group changes, and then starts over. How can the opposite be done? That is, the number should stay the same while the group is unchanged and increase when the group changes, as follows.
NAME | ROW_COUNT
A 1
A 1
A 1
B 2
C 3
C 3
D 4
E 5
This is a case for dense_rank(): rank() leaves gaps in the sequence after ties, and row_number() gives every row a unique number even when the values tie. Order by name without partitioning by it; partitioning by name would restart the numbering in every group and return 1 for every row.
select name
, dense_rank() over (order by name) as row_count
from table;
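As a quick check of how the partition by clause affects dense_rank(), the sketch below runs both variants on the sample names using an in-memory SQLite database from Python. The table name names, and the use of SQLite 3.25 or later, are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table names (name text)")  # hypothetical table name
conn.executemany(
    "insert into names values (?)",
    [(n,) for n in ["A", "A", "A", "B", "C", "C", "D", "E"]],
)

# Ordering by name with no partition: equal names share one number,
# and the number grows by one whenever the name changes.
ranked = conn.execute("""
    select name, dense_rank() over (order by name) as row_count
    from names
    order by name
""").fetchall()

# Partitioning by name as well: every partition holds a single distinct
# name, so the rank restarts for each group and is always 1.
partitioned = conn.execute("""
    select name, dense_rank() over (partition by name order by name) as row_count
    from names
    order by name
""").fetchall()

print(ranked)
print(partitioned)
```

Only the unpartitioned variant reproduces the 1, 1, 1, 2, 3, 3, 4, 5 sequence from the question.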

Select most recent non null row

I have a table in Postgres with a timestamp and 6 value columns (A, B, C, D, E, F). A new record is appended to this table every 10 minutes; however, for columns B, D and F the actual value is fetched only every 30 minutes, meaning that only every 3rd row is not null in those columns.
I would like to write a query that outputs most recent record per every column. The only thing that comes to my mind is to write 2 queries:
SELECT A,C,E
FROM data_prices
ORDER BY date_of_record DESC LIMIT 1;
SELECT B,D,F
FROM data_prices
WHERE B is not null AND D is not null AND F is not null
ORDER BY date_of_record DESC LIMIT 1;
And then join the results into 1 table with 6 columns and 1 row. I don't know, however how to do that because in the documentation I found operations like UNION, INTERSECT, EXCEPT which append data rather than creating one wider table. Any ideas how to join these 2 selects into 1 table with 6 columns? Or maybe smarter way to get latest non NULL result per column in table?
Unfortunately, Postgres does not support lag(ignore nulls).
One method uses first_value():
select date_of_record, a,
first_value(b) over (order by (b is not null) desc, date_of_record desc) as last_b,
c,
first_value(d) over (order by (d is not null) desc, date_of_record desc) as last_d,
e,
first_value(f) over (order by (f is not null) desc, date_of_record desc) as last_f
from t
order by date_of_record desc
limit 1;
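The first_value() approach can be tried on a few made-up rows. This sketch uses an in-memory SQLite database from Python rather than Postgres (an assumption made so the example is self-contained; it needs SQLite 3.25 or later), with the table name data_prices from the question and invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table data_prices (
        date_of_record text, a int, b int, c int, d int, e int, f int
    )
""")
# Invented sample data: b, d and f are filled only in the oldest row.
conn.executemany(
    "insert into data_prices values (?, ?, ?, ?, ?, ?, ?)",
    [
        ("2020-01-01 10:00", 1, 10, 1, 100, 1, 1000),
        ("2020-01-01 10:10", 2, None, 2, None, 2, None),
        ("2020-01-01 10:20", 3, None, 3, None, 3, None),
    ],
)

# Sorting on (x is not null) desc puts non-null values first; within those,
# date_of_record desc puts the most recent one first, so first_value()
# returns the latest non-null value for every row.
latest = conn.execute("""
    select date_of_record, a,
           first_value(b) over (order by (b is not null) desc, date_of_record desc) as last_b,
           c,
           first_value(d) over (order by (d is not null) desc, date_of_record desc) as last_d,
           e,
           first_value(f) over (order by (f is not null) desc, date_of_record desc) as last_f
    from data_prices
    order by date_of_record desc
    limit 1
""").fetchone()

print(latest)
```

The single returned row carries the newest A, C and E together with the values of B, D and F from the last row where each of them was present.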

How to count by referring date in sql

I have tables like following.
table
product customer surrender_date
A a 2020/5/1
B a 2020/6/1
C b 2019/7/1
D b 2020/8/1
E b 2020/9/1
First I'd like to group by customer
product customer surrender_date
A a 2020/5/1
B a 2020/6/1
Second, I'd like to rank the rows within each customer by surrender_date, starting from the newest one.
My desired result is like following
product customer surrender_date rank
A a 2020/5/1 2
B a 2020/6/1 1
Therefore, my whole desired result is the following.
product customer surrender_date rank
A a 2020/5/1 2
B a 2020/6/1 1
C b 2019/7/1 3
D b 2020/8/1 2
E b 2020/9/1 1
Is there any way to achieve this?
I have never ranked by a date before, so if someone has a suggestion, please let me know.
You can use window functions:
select
t.*,
row_number() over(partition by customer order by surrender_date desc) rnk
from mytable t
Notes:
I don't see what the question has to do with aggregation
depending on how you want to handle ties, you might be looking for rank() or dense_rank() instead of row_number()
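A small sketch of this ranking on the sample data, run in an in-memory SQLite database from Python. Storing the dates as ISO strings (so that text ordering matches chronological order) and using SQLite 3.25 or later are assumptions for the example; mytable is the table name used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (product text, customer text, surrender_date text)")
# Assumption: dates are stored as ISO-8601 text, which sorts chronologically.
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [
        ("A", "a", "2020-05-01"),
        ("B", "a", "2020-06-01"),
        ("C", "b", "2019-07-01"),
        ("D", "b", "2020-08-01"),
        ("E", "b", "2020-09-01"),
    ],
)

ranks = conn.execute("""
    select t.*,
           row_number() over (partition by customer order by surrender_date desc) as rnk
    from mytable t
    order by product
""").fetchall()

for row in ranks:
    print(row)
```

The numbering restarts for each customer, and the newest surrender_date in each group gets rank 1.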

Query for returning the first row of each group in DB2

Suppose I have a table filled with the data below. What SQL function or query should I use in DB2 to retrieve the first row where FLD_A has the value A, the first row where FLD_A has the value B, and so on?
ID FLD_A FLD_B
1 A 10
2 A 20
3 A 30
4 B 10
5 A 20
6 C 30
I am expecting a table like the one below. I am aware of the grouping done by GROUP BY, but how can I limit the query to return only the very first row of each group?
Essentially, I would like to get the row where each new value of FLD_A appears for the first time.
ID FLD_A FLD_B
1 A 10
4 B 10
6 C 30
Try this; it uses only standard SQL:
SELECT * FROM Table1
WHERE ID IN (SELECT MIN(ID) FROM Table1 GROUP BY FLD_A)
A good way to approach this problem is with window functions and row_number() in particular:
select t.*
from (select t.*,
row_number() over (partition by fld_a order by id) as seqnum
from table1 t
) t
where seqnum = 1;
(This is assuming that "first" means "minimum id".)
If you use t.*, this will add one extra column to the output. You can just list the columns you want to avoid this.
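Both approaches can be compared on the sample data. The sketch below uses an in-memory SQLite database from Python instead of DB2 (an assumption made to keep it self-contained; window functions need SQLite 3.25 or later) and lists the columns explicitly so that seqnum is not part of the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id int, fld_a text, fld_b int)")
conn.executemany(
    "insert into table1 values (?, ?, ?)",
    [(1, "A", 10), (2, "A", 20), (3, "A", 30), (4, "B", 10), (5, "A", 20), (6, "C", 30)],
)

# Variant 1: keep the rows whose id is the minimum id of their fld_a group.
by_min = conn.execute("""
    select * from table1
    where id in (select min(id) from table1 group by fld_a)
    order by id
""").fetchall()

# Variant 2: number the rows inside each fld_a group and keep number 1.
by_row_number = conn.execute("""
    select id, fld_a, fld_b
    from (select t.*,
                 row_number() over (partition by fld_a order by id) as seqnum
          from table1 t
         ) t
    where seqnum = 1
    order by id
""").fetchall()

print(by_min)
print(by_row_number)
```

Both queries treat "first" as "smallest id" and return the same three rows here.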

Top 3 max entries for a combination with a condition

I am new to SQL, so please bear with me if this question sounds very easy. I have 4 columns in a SQL table, say A, B, C and D. For any combination of B and C I may get any number of rows. For each such combination I need at most 3 rows (which in turn give me 3 unique values of A for that combination), and the selected rows should be the ones with the top 3 values of D among all entries for that combination.
There can be any number of B, C combinations, so the above logic should apply to all of them.
Most databases support ranking functions. With these, you can do what you want as follows:
select A, B, C, D
from (select t.*,
row_number() over (partition by B, C order by D desc) as seqnum
from t
) t
where seqnum <= 3
order by B, C, D desc
The row_number() function creates a sequential number. This number starts at 1 in every B, C group and is assigned in descending order of D.
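A minimal sketch of the top-3-per-group query on invented data, run in an in-memory SQLite database from Python. The sample rows and the use of SQLite 3.25 or later are assumptions for illustration; the values of D are chosen without ties, because row_number() breaks ties arbitrarily.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (a int, b text, c text, d int)")
# Invented sample: four rows for the (x, y) combination, two for (p, q).
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        (1, "x", "y", 50),
        (2, "x", "y", 40),
        (3, "x", "y", 30),
        (4, "x", "y", 20),
        (5, "p", "q", 7),
        (6, "p", "q", 9),
    ],
)

top3 = conn.execute("""
    select a, b, c, d
    from (select t.*,
                 row_number() over (partition by b, c order by d desc) as seqnum
          from t
         ) t
    where seqnum <= 3
    order by b, c, d desc
""").fetchall()

for row in top3:
    print(row)
```

The (x, y) combination loses its lowest row (D = 20), while (p, q) keeps both of its rows because it has fewer than three.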