Difference in the output of query, using rank() and CTE - sql

My first query looks like:
select trans.* from
( select
acc_num,
acc_type,
trans_amount,
load_date,
rank() over(partition by acc_num order by load_date) as rk
from monetary
where rat_code = 123
) trans
where trans.rk =1;
My second query looks like:
with a as (
select *,
row_number() over(partition by acc_num order by load_date) as rn
from monetary
where rat_code = 123 )
select
acc_num,
acc_type,
trans_amount,
load_date
from a
where rn =1;
Can anyone please help? I am getting a different number of records in the two cases,
even though the queries look the same to me.

It's because rank() and row_number() behave differently when rows tie on the ORDER BY column.
The example below shows the difference:
Accno, dt, rank_col, rownum_col
100, 2-jun-2022, 1, 1
100, 2-jun-2022, 1, 2
100, 1-jul-2022, 3, 3
54, 2-jun-2022, 1, 1
54, 1-jul-2022, 2, 2
In the example above, row_number gives every row in a partition a unique sequential number, even when rows tie on the date. rank gives tied rows the same number and then skips ahead, so its values are not unique. Here rank = 1 returns 3 rows, but rownum = 1 returns only two.
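To see the effect without touching the real table, here is a minimal sketch using Python's built-in sqlite3 module (it assumes the bundled SQLite is 3.25 or newer, since it needs window functions). The table contents are made up for illustration; only the monetary, acc_num and load_date names come from the question.

```python
import sqlite3

# Hypothetical sample data: account 100 has two rows tied on the earliest load_date.
rows = [
    (100, "2022-06-02"),
    (100, "2022-06-02"),
    (100, "2022-07-01"),
    (54, "2022-06-02"),
    (54, "2022-07-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monetary (acc_num INTEGER, load_date TEXT)")
conn.executemany("INSERT INTO monetary VALUES (?, ?)", rows)

# Compute both numberings side by side over the same partition and ordering.
numbered = conn.execute(
    """
    SELECT acc_num,
           load_date,
           RANK()       OVER (PARTITION BY acc_num ORDER BY load_date) AS rk,
           ROW_NUMBER() OVER (PARTITION BY acc_num ORDER BY load_date) AS rn
    FROM monetary
    """
).fetchall()

rank_first = [r for r in numbered if r[2] == 1]        # rows where rk = 1
row_number_first = [r for r in numbered if r[3] == 1]  # rows where rn = 1

print(len(rank_first), len(row_number_first))  # prints: 3 2
```

If you want exactly one row per acc_num, row_number() is the safer choice. With rank() you would need a tie-breaker column in the ORDER BY.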

Related

Given an id, can you get the "index" of that items id in a sorted query

EDIT:
I am using Android versions whose SQLite version is older than 3.25, so I cannot use ROW_NUMBER.
Given the following table:
id, date(long)
1, 100
2, 25
3, 5
4, 50
If I query for items sorted:
select * from items order by date:
id, date
3, 5
2, 25
4, 50
1, 100
If I have id 4, can I query for its index in the sorted list (in this case, index 3)?
With ROW_NUMBER() window function:
SELECT rn
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY date) rn
FROM items
)
WHERE id = 4;
An alternative for versions of SQLite prior to 3.25.0, which don't support window functions:
SELECT COUNT(*) + 1 rn
FROM items
WHERE date < (SELECT date FROM items WHERE id = 4);
You can use ROW_NUMBER() to get the location of the row according to a custom ordering. The query can look like:
select rn
from (
select t.*, row_number() over(order by date) as rn from t
) x
where id = 4
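For the no-window-function case, a small check with Python's built-in sqlite3 module may help. This is only a sketch: the helper name sorted_index is made up, and it counts the rows with a smaller date, which matches the ascending order shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, date INTEGER)")
# The four rows from the question.
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, 100), (2, 25), (3, 5), (4, 50)])


def sorted_index(conn, item_id):
    """Return the 1-based position of item_id when items are ordered by date ascending.

    Hypothetical helper: counts the rows whose date is strictly smaller, then adds 1.
    """
    row = conn.execute(
        "SELECT COUNT(*) + 1 FROM items "
        "WHERE date < (SELECT date FROM items WHERE id = ?)",
        (item_id,),
    ).fetchone()
    return row[0]


position = sorted_index(conn, 4)
print(position)  # prints: 3
```

Two caveats: rows that share the same date get the same index, and an id that does not exist comes back as 1 because the subquery yields NULL and nothing is counted.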

BigQuery SQL: Sum of first N related items

I would like to know the sum of a value in the first n items in a related table. For example, I want to get the sum of a company's first 6 invoices (the invoices can be sorted by ID asc).
Current SQL:
SELECT invoices.company_id, SUM(invoices.amount)
FROM invoices
JOIN companies on invoices.company_id = companies.id
GROUP BY invoices.company_id
This seems simple but I can't wrap my head around it.
Consider also the approach below:
select company_id, (
select sum(amount)
from t.amounts amount
) as top_six_invoices_amount
from (
select invoices.company_id,
array_agg(invoices.amount order by invoices.invoice_id limit 6) amounts
from your_table invoices
group by invoices.company_id
) t
You can assign row numbers to the rows within each partition, ordered by invoice id, and then filter on them, something like this:
with array_table as (
select 'a' field, * from unnest([3, 2, 1 ,4, 6, 3]) id
union all
select 'b' field, * from unnest([1, 2, 1, 7]) id
)
select field, sum(id) from (
select field, id, row_number() over (partition by a.field order by id desc) rownum
from array_table a
)
where rownum < 3
group by field
More examples of analytic functions here:
https://medium.com/@aliz_ai/analytic-functions-in-google-bigquery-part-1-basics-745d97958fe2
https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts
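The row-number idea is not BigQuery-specific, so it can be tried locally. Below is a rough sketch with Python's built-in sqlite3 module (assuming SQLite 3.25+ for window functions). The invoice rows are invented, and FIRST_N is set to 3 instead of 6 only so the cut-off is visible on a tiny data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, company_id INTEGER, amount INTEGER)"
)
# Hypothetical data: company 1 has four invoices, company 2 has two.
conn.executemany(
    "INSERT INTO invoices (id, company_id, amount) VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 20), (3, 2, 5), (4, 1, 30), (5, 1, 40), (6, 2, 7)],
)

FIRST_N = 3  # the question asks for 6

# Number each company's invoices by id, keep the first FIRST_N, then sum them.
totals = dict(
    conn.execute(
        """
        SELECT company_id, SUM(amount)
        FROM (
            SELECT company_id,
                   amount,
                   ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY id) AS rn
            FROM invoices
        )
        WHERE rn <= ?
        GROUP BY company_id
        """,
        (FIRST_N,),
    ).fetchall()
)

print(totals)
```

Company 1's fourth invoice (amount 40) is left out of its total, while company 2, which has fewer than FIRST_N invoices, is simply summed in full.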

Getting a max value for every id given in query (Postgres)

I have a query like this:
select distinct on (foreign_id) foreign_id, id, date
from table
where foreign_id IN (1, 2, 3)
I am getting this result:
foreign_id, id, date
1, 101, 2019-03-20
2, 102, 2020-02-06
3, 103, 2020-06-09
This is good because I want only a single row for every foreign_id, but I would like that row to be the one with the max date and max id.
Right now for foreign_id 1 I am getting the date 2019-03-20, which is not the greatest date in the table.
I have tried the max() function, but it returns only one row, for one of the given foreign_ids.
Any ideas?
You are missing an ORDER BY:
select distinct on (foreign_id) foreign_id, id, date
from table
where foreign_id IN (1, 2, 3)
order by foreign_id, date DESC;
You can use analytical function as follows:
select foreign_id, id, date from
(select foreign_id, id, date,
row_number() over (partition by foreign_id order by date desc) as rn
from table
where foreign_id IN (1, 2, 3) ) t
where rn = 1
Just add the ORDER BY clause, because DISTINCT ON takes the first record of an ordered group.
select distinct on (foreign_id) foreign_id, id, date
from table
where foreign_id IN (1, 2, 3)
order by foreign_id, date desc -- add this
You can achieve this through cte and row_number() as below:
with cte as (
select foreign_id,id,date, row_number()over (partition by foreign_id order by date desc) rownum
from t
)
select foreign_id,id,date from cte where rownum=1
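DISTINCT ON is Postgres-only, but the row_number() variant can be checked anywhere. Here is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25+). The rows with id 104 and 105 are invented to create a later date and a tie, and the extra id DESC in the ORDER BY is my assumption about how ties on date should be broken, based on the "max id" wish in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (foreign_id INTEGER, id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 101, "2019-03-20"),
        (1, 104, "2021-01-15"),  # hypothetical later row for foreign_id 1
        (2, 102, "2020-02-06"),
        (3, 103, "2020-06-09"),
        (3, 105, "2020-06-09"),  # hypothetical tie on date for foreign_id 3
    ],
)

# Rank rows per foreign_id by newest date, then highest id, and keep the top one.
latest = conn.execute(
    """
    SELECT foreign_id, id, date
    FROM (
        SELECT foreign_id, id, date,
               ROW_NUMBER() OVER (
                   PARTITION BY foreign_id
                   ORDER BY date DESC, id DESC
               ) AS rn
        FROM t
        WHERE foreign_id IN (1, 2, 3)
    )
    WHERE rn = 1
    ORDER BY foreign_id
    """
).fetchall()

print(latest)
```

Without the id DESC tie-breaker, either of the two rows for foreign_id 3 could come back, so it is worth adding whenever dates can repeat.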

Select SQL logic

Folks at a loss here!!!
First, this is what I am trying to achieve:
Select all the records from the CUSTOMER_ORDER_DETAILS table shown below, and if multiple entries exist for the same CUSTOMER_NO, then:
- select the entry with PAID = 1
- if there are multiple PAID = 1 entries, then select the record with TYPE = Y
Expected Result:
877, CU115, lit, 0, 1, X
878, CU111, Toi, 1, 1, Y
879, CU117, Fla, 1, 1, X
My approach was to get count(CUSTOMER_NO) > 1 using GROUP BY on CUSTOMER_NO, but as soon as I add the remaining columns of the table to the SELECT statement, the count column shows a value of 1.
Any pointers on how to tackle this or implement if-else kind of logic?
This is a prioritization query. Here is one method to do what you want:
select t.*
from (select t.*,
row_number() over (partition by customer_no
order by paid desc, type desc
) as seqnum
from t
) t
where seqnum = 1;
This assumes that paid takes on the values 0 and 1, and that type has the values X and Y.
You can prioritize these conditions with a CASE expression in the ORDER BY of the row_number function.
select * from (
select t.*,
row_number() over(partition by customer_no
order by case when paid=1 and type='Y' then 1
when paid=1 then 2
else 3 end) as rnum
from customer_orders t
) t
where rnum=1
This assumes there can only be one row with type='Y' per customer_no if there exist multiple rows with paid=1 for that same customer_no.
If there exist multiple rows with paid =1 and all of them have a type <> 'Y' then a row is arbitrarily picked amongst them.
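To try the prioritization idea on a small data set, here is a sketch using Python's built-in sqlite3 module (assuming SQLite 3.25+). The table is a cut-down, hypothetical version of CUSTOMER_ORDER_DETAILS with only the columns that matter for the ordering, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_orders (id INTEGER, customer_no TEXT, paid INTEGER, type TEXT)"
)
# Hypothetical rows: CU111 has two paid entries, CU115 a single entry,
# CU117 one paid entry and one unpaid entry of type Y.
conn.executemany(
    "INSERT INTO customer_orders VALUES (?, ?, ?, ?)",
    [
        (1, "CU111", 0, "X"),
        (2, "CU111", 1, "X"),
        (3, "CU111", 1, "Y"),
        (4, "CU115", 0, "X"),
        (5, "CU117", 1, "X"),
        (6, "CU117", 0, "Y"),
    ],
)

# paid DESC puts paid rows first; type DESC then prefers 'Y' over 'X'.
picked = conn.execute(
    """
    SELECT id, customer_no, paid, type
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_no
                   ORDER BY paid DESC, type DESC
               ) AS seqnum
        FROM customer_orders t
    )
    WHERE seqnum = 1
    ORDER BY customer_no
    """
).fetchall()

print(picked)
```

Note that for CU117 the paid X row wins over the unpaid Y row, because paid is compared before type.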

"Group" some rows together before sorting (Oracle)

I'm using Oracle Database 11g.
I have a query that selects, among other things, an ID and a date from a table. Basically, what I want to do is keep the rows that have the same ID together, and then sort those "groups" of rows by the most recent date in the "group".
So if my original result was this:
ID Date
3 11/26/11
1 1/5/12
2 6/3/13
2 10/15/13
1 7/5/13
The output I'm hoping for is:
ID Date
3 11/26/11 <-- (Using this date for "group" ID = 3)
1 1/5/12
1 7/5/13 <-- (Using this date for "group" ID = 1)
2 6/3/13
2 10/15/13 <-- (Using this date for "group" ID = 2)
Is there any way to do this?
One way to get this is by using analytic functions; I don't have an example of that handy.
This is another way to get the specified result, without using an analytic function (this is ordering first by the most_recent_date for each ID, then by ID, then by Date):
SELECT t.ID
, t.Date
FROM mytable t
JOIN ( SELECT s.ID
, MAX(s.Date) AS most_recent_date
FROM mytable s
WHERE s.Date IS NOT NULL
GROUP BY s.ID
) r
ON r.ID = t.ID
ORDER
BY r.most_recent_date
, t.ID
, t.Date
The "trick" here is to return "most_recent_date" for each ID, and then join that to each row. The result can be ordered by that first, then by whatever else.
(I also think there's a way to get this same ordering using Analytic functions, but I don't have an example of that handy.)
You can use the MAX ... KEEP function with your aggregate to create your sort key:
with
sample_data as
(select 3 id, to_date('11/26/11','MM/DD/RR') date_col from dual union all
select 1, to_date('1/5/12','MM/DD/RR') date_col from dual union all
select 2, to_date('6/3/13','MM/DD/RR') date_col from dual union all
select 2, to_date('10/15/13','MM/DD/RR') date_col from dual union all
select 1, to_date('7/5/13','MM/DD/RR') date_col from dual)
select
id,
date_col,
-- For illustration purposes, does not need to be selected:
max(date_col) keep (dense_rank last order by date_col) over (partition by id) sort_key
from sample_data
order by max(date_col) keep (dense_rank last order by date_col) over (partition by id), id, date_col;
Here is the query using analytic functions:
select
id
, date_
, max(date_) over (partition by id) as max_date
from table_name
order by max_date, id, date_
;
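The same ordering can be reproduced outside Oracle as a quick sanity check. This is a sketch using Python's built-in sqlite3 module (assuming SQLite 3.25+). The dates from the question are rewritten in ISO format so that plain text comparison sorts them chronologically, and mytable / date_ are just the names used in the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, date_ TEXT)")
# The five rows from the question, dates in ISO format.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (3, "2011-11-26"),
        (1, "2012-01-05"),
        (2, "2013-06-03"),
        (2, "2013-10-15"),
        (1, "2013-07-05"),
    ],
)

# Sort groups by their most recent date, then keep each group's rows together
# (id) and in chronological order (date_).
rows = conn.execute(
    """
    SELECT id,
           date_,
           MAX(date_) OVER (PARTITION BY id) AS max_date
    FROM mytable
    ORDER BY max_date, id, date_
    """
).fetchall()

ordered = [(r[0], r[1]) for r in rows]
print(ordered)
```

The id in the ORDER BY matters when two groups share the same most recent date: without it their rows could interleave.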