Creating a pseudo linked list in SQL

I have a table that has the following columns
table: route
columns: id, location, order_id
and it has values such as
id, location, order_id
1, London, 12
2, Amsterdam, 102
3, Berlin, 90
5, Paris, 19
Is it possible to do a sql select statement in postgres that will return each row along with the id with the next highest order_id? So I want something like...
id, location, order_id, next_id
1, London, 12, 5
2, Amsterdam, 102, NULL
3, Berlin, 90, 2
5, Paris, 19, 3
Thanks

select
id,
location,
order_id,
lag(id) over (order by order_id desc) as next_id
from your_table
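For anyone who wants to check this on the sample rows, here is a minimal sketch. It assumes an in-memory SQLite database (3.25 or newer, for window functions) as a stand-in for Postgres. It also shows that lead() over ascending order_id gives the same answer as lag() over descending order_id.

```python
import sqlite3

# In-memory SQLite standing in for Postgres (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE route (id INTEGER, location TEXT, order_id INTEGER)")
conn.executemany(
    "INSERT INTO route VALUES (?, ?, ?)",
    [(1, "London", 12), (2, "Amsterdam", 102), (3, "Berlin", 90), (5, "Paris", 19)],
)

# lead() over ascending order_id looks one row ahead, which is the same
# row that lag() sees when the ordering is reversed.
rows = conn.execute(
    """
    SELECT id, location, order_id,
           lead(id) OVER (ORDER BY order_id)      AS next_id_lead,
           lag(id)  OVER (ORDER BY order_id DESC) AS next_id_lag
    FROM route
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

Both columns should agree on every row, with NULL (None in Python) for the row that has the highest order_id.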

Creating testbed first:
CREATE TABLE route (id int4, location varchar(20), order_id int4);
INSERT INTO route VALUES
(1,'London',12),(2,'Amsterdam',102),
(3,'Berlin',90),(5,'Paris',19);
The query:
WITH ranked AS (
SELECT id,location,order_id,rank() OVER (ORDER BY order_id)
FROM route)
SELECT b.id, b.location, b.order_id, n.id AS next_id
FROM ranked b
LEFT JOIN ranked n ON b.rank+1=n.rank
ORDER BY b.id;
You can read more on the window functions in the documentation.

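Yes, with a correlated subquery that picks the row with the smallest order_id above the current one:
select *,
(select nxt.id from routes_table nxt where nxt.order_id > main.order_id order by nxt.order_id limit 1) as next_id
from routes_table main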

Related

Difference in the output of query, using rank() and CTE

My first query looks like:
select trans.* from
( select
acc_num,
acc_type,
trans_amount,
load_date,
rank() over(partition by acc_num order by load_date) as rk
from monetary
where rat_code = 123
) trans
where trans.rk =1;
second query looks like
with a as (
select *,
row_number() over(partition by acc_num order by load_date) as rn
from monetary
where rat_code = 123 )
select
acc_num,
acc_type,
trans_amount,
load_date
from a
where rn =1;
Can anyone please help me? I am getting a different number of records in the two cases, even though the queries look the same.
It's because there is a difference between rank and row_number: rank gives tied rows the same number, while row_number gives every row in the partition a different number.
The example below shows this (both numbered per Accno, ordered by dt):
Accno, dt, rank_col, rownum_col
100, 2-jun-2022, 1, 1
100, 2-jun-2022, 1, 2
100, 1-jul-2022, 3, 3
54, 2-jun-2022, 1, 1
54, 1-jul-2022, 2, 2
In the above example, row_number gives each row within an account a unique number, whereas rank gives rows with the same date the same number and then skips ahead. So rank=1 gives you 3 rows but rownum=1 gives only two.
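To see the effect on the row counts, here is a small sketch using made-up rows in which one account has two records on its earliest load_date. It assumes SQLite 3.25+ through Python's sqlite3 module rather than the asker's actual database, and the table holds only the two columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monetary (acc_num INTEGER, load_date TEXT)")
conn.executemany(
    "INSERT INTO monetary VALUES (?, ?)",
    [
        (100, "2022-06-02"),
        (100, "2022-06-02"),  # same earliest date as the row above: a tie
        (100, "2022-07-01"),
        (54, "2022-06-02"),
        (54, "2022-07-01"),
    ],
)

# Same window for both functions, as in the two queries from the question.
numbered = """
    SELECT acc_num, load_date,
           rank()       OVER (PARTITION BY acc_num ORDER BY load_date) AS rk,
           row_number() OVER (PARTITION BY acc_num ORDER BY load_date) AS rn
    FROM monetary
"""

rank_first = conn.execute(
    f"SELECT count(*) FROM ({numbered}) WHERE rk = 1"
).fetchone()[0]
row_number_first = conn.execute(
    f"SELECT count(*) FROM ({numbered}) WHERE rn = 1"
).fetchone()[0]

print(rank_first, row_number_first)  # prints: 3 2
```

The tied pair both get rank 1, so filtering on rk = 1 returns one more row than filtering on rn = 1.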

BigQuery SQL: Sum of first N related items

I would like to know the sum of a value over the first n items in a related table. For example, I want to get the sum of a company's first 6 invoices (the invoices can be sorted by ID ascending).
Current SQL:
SELECT invoices.company_id, SUM(invoices.amount)
FROM invoices
JOIN companies on invoices.company_id = companies.id
GROUP BY invoices.company_id
This seems simple but I can't wrap my head around it.
Consider also below approach
select company_id, (
select sum(amount)
from t.amounts amount
) as top_six_invoices_amount
from (
select invoices.company_id,
array_agg(invoices.amount order by invoices.invoice_id limit 6) amounts
from your_table invoices
group by invoices.company_id
) t
You can number the rows within each partition, ordered by invoice id, and then filter on that row number. Something like this (the example keeps the 2 highest ids per field):
with array_table as (
select 'a' field, * from unnest([3, 2, 1 ,4, 6, 3]) id
union all
select 'b' field, * from unnest([1, 2, 1, 7]) id
)
select field, sum(id) from (
select field, id, row_number() over (partition by a.field order by id desc) rownum
from array_table a
)
where rownum < 3
group by field
More examples of analytic functions here:
https://medium.com/@aliz_ai/analytic-functions-in-google-bigquery-part-1-basics-745d97958fe2
https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts
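Applied to the invoices question itself, the row-number filter might look like the sketch below. It runs on SQLite (3.25+) via Python rather than BigQuery, and the invoice rows are hypothetical: company 1 has eight invoices of 10 each, company 2 has three invoices of 5 each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER, company_id INTEGER, amount INTEGER)")

# Hypothetical data: ids 1-8 belong to company 1, ids 9-11 to company 2.
invoice_rows = [(i, 1, 10) for i in range(1, 9)] + [(i, 2, 5) for i in range(9, 12)]
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)", invoice_rows)

# Number each company's invoices by id, keep the first six, then sum them.
sums = conn.execute(
    """
    SELECT company_id, SUM(amount) AS first_six_total
    FROM (
        SELECT company_id, amount,
               row_number() OVER (PARTITION BY company_id ORDER BY id) AS rn
        FROM invoices
    )
    WHERE rn <= 6
    GROUP BY company_id
    ORDER BY company_id
    """
).fetchall()

print(sums)  # prints: [(1, 60), (2, 15)]
```

A company with fewer than six invoices simply contributes all of them, as company 2 does here.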

Getting a max value for every id given in query (Postgres)

I have query like this
select distinct on (foreign_id) foreign_id, id, date
from table
where foreign_id IN (1, 2, 3)
I am getting result as
foreign_id, id, date
1, 101, 2019-03-20
2, 102, 2020-02-06
3, 103, 2020-06-09
That is good, because I want only a single row for every foreign_id, but I would like that row to be the one with the max date and max id.
Right now for foreign_id 1 I am getting the date 2019-03-20, which is not the greatest date in the table.
I have tried the max() function, but it returns only one row, for a single foreign_id.
Any ideas?
You are missing an ORDER BY:
select distinct on (foreign_id) foreign_id, id, date
from table
where foreign_id IN (1, 2, 3)
order by foreign_id, date DESC;
You can use analytical function as follows:
select foreign_id, id, date from
(select foreign_id, id, date,
row_number() over (partition by foreign_id order by date desc) as rn
from table
where foreign_id IN (1, 2, 3) ) t
where rn = 1
Just add the ORDER BY clause, because DISTINCT ON takes the first record of an ordered group.
select distinct on (foreign_id) foreign_id, id, date
from table
where foreign_id IN (1, 2, 3)
order by foreign_id, date desc -- add this
You can achieve this through cte and row_number() as below:
with cte as (
select foreign_id,id,date, row_number()over (partition by foreign_id order by date desc) rownum
from t
)
select foreign_id,id,date from cte where rownum=1
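Since the question asks for the max date and the max id, one option is to add id as a tie-breaker in the window ordering. A rough sketch follows, run on SQLite (3.25+) through Python instead of Postgres. The rows for foreign_id 1 beyond the first are invented for illustration, and the table name t is borrowed from the last answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (foreign_id INTEGER, id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 101, "2019-03-20"),
        (1, 104, "2021-01-15"),  # hypothetical later row
        (1, 105, "2021-01-15"),  # hypothetical: same date, higher id
        (2, 102, "2020-02-06"),
        (3, 103, "2020-06-09"),
    ],
)

# Latest date first; among equal dates, the highest id first.
latest = conn.execute(
    """
    SELECT foreign_id, id, date
    FROM (
        SELECT foreign_id, id, date,
               row_number() OVER (
                   PARTITION BY foreign_id
                   ORDER BY date DESC, id DESC
               ) AS rn
        FROM t
    )
    WHERE rn = 1
    ORDER BY foreign_id
    """
).fetchall()

for row in latest:
    print(row)
```

In Postgres the same tie-breaker should also work with DISTINCT ON by writing order by foreign_id, date desc, id desc.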

SQL Select highest value where duplicate ID

I have a SQL table with the columns:
ID, DayNumber, Mfm, value
432080971, 1, 15, 57
432080971, 1, 15, 59
432080978, 3, 15, 54
432080978, 4, 45, 54
Unfortunately there are some duplicated entries. What I'd like is a select statement that returns the table without duplicated ID, Daynumber and Mfm, and where if there is a double entry to select the row with the higher value.
So, as an example the above entries would be returned as:
ID, DayNumber, Mfm, value
432080971, 1, 15, 59
432080978, 3, 15, 54
432080978, 4, 45, 54
I'm using sql server management studio running sql server 2012
select top (1) with ties
ID, DayNumber, Mfm, value
from table
order by row_number() over (partition by ID, DayNumber, Mfm
order by value desc)
You have to use Group By clause and use the aggregate function MAX to get the highest value of the group. Something like this:
Select ID, DayNumber, Mfm, Max(value)
From your_table
Group By ID, DayNumber, Mfm
select
ID, DayNumber, Mfm, max(value)
from table
group by ID, DayNumber, Mfm
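A quick way to check the GROUP BY approach against the rows from the question is the sketch below. It uses SQLite through Python rather than SQL Server, and the table name readings is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "readings" is a hypothetical table name; the columns follow the question.
conn.execute(
    "CREATE TABLE readings (ID INTEGER, DayNumber INTEGER, Mfm INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (432080971, 1, 15, 57),
        (432080971, 1, 15, 59),
        (432080978, 3, 15, 54),
        (432080978, 4, 45, 54),
    ],
)

# One output row per (ID, DayNumber, Mfm), carrying the highest value.
deduped = conn.execute(
    """
    SELECT ID, DayNumber, Mfm, MAX(value) AS value
    FROM readings
    GROUP BY ID, DayNumber, Mfm
    ORDER BY ID, DayNumber, Mfm
    """
).fetchall()

for row in deduped:
    print(row)
```

This works here because every selected column is either grouped or aggregated. If the table had further columns that must come from the same row as the max value, a row_number-based query would be the safer choice.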

Unable to retrieve a row with highest value of a column using row number and group by

I am working on a query that should return, for each product, the one row with the highest price.
For Example I have
Table T1
Product Price Tax Location
Pen 10 2.25 A
Pen 5 1.25 B
Pen 15 1.5 A
Board 25 5.26 A
Board 2 NULL B
Water 5 10 A
The result should be like
Product Price Tax Location
Pen 15 1.5 A
Board 25 5.26 A
Water 5 10 A
I am trying to achieve this with ROW_NUMBER() and GROUP BY, using the following:
ALTER VIEW [dbo].[InferredBestBids]
AS
SELECT ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL
) ) AS id ,
product ,
MAX(price) AS Price ,
MIN(tax) AS Tax ,
location
FROM [dbo].InferredBids_A
WHERE NOT ( proce IS NULL
AND tax IS NULL
)
GROUP BY market ,
term
GO
When I ran the above query, it threw me the error
Column 'dbo.InferredBids_A.Location' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
When I tried to group the query results by location, it gave me incorrect results by returning multiple rows for a product depending on the location
No GROUP BY is needed if you get your row_number clauses properly engaged and then just select based on the rownumber. Feel free to add an extra row_number call to the front of the outer query if you require it for some other reason. See the example here.
SELECT Product, Price, Tax, Location
FROM (
SELECT Product, Price, Tax, Location, ROW_NUMBER()OVER(PARTITION BY Product ORDER BY Price DESC) as RowID
FROM InferredBids_A
) T
WHERE RowID = 1
If you select something that's aggregated you must GROUP BY anything else in the select list that is not also aggregated:
SELECT Product, Price ,Tax, Location
FROM (SELECT Product, Price ,Tax, Location,
RANK() OVER (PARTITION BY Product ORDER BY Price DESC) N
FROM InferredBids_A
WHERE Price IS NOT NULL OR Tax IS NOT NULL
) T WHERE N = 1
(RANK will give rows for ties, use ROW_NUMBER if you don't care about these)
Making some test data:
DECLARE @BestBids TABLE
(
Product VARCHAR(20),
Price INT,
Tax DECIMAL(10,2),
Location VARCHAR(10)
)
INSERT INTO @BestBids
VALUES
('Pen', 10, 2.25, 'A'),
('Pen', 5, 1.25, 'B'),
('Pen', 15, 1.5, 'A'),
('Board', 25, 5.26, 'A'),
('Board', 2, NULL, 'B'),
('Water', 5, 10, 'A');
We number the rows so that row number 1 goes to the highest price for each product.
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Price DESC) RN
FROM @BestBids
) a
WHERE RN=1
We wrap the sql and just pick the first row number. Here is the output:
Product Price Tax Location RN
Board 25 5.26 A 1
Pen 15 1.50 A 1
Water 5 10.00 A 1
You could use a common table expression:
WITH cte
AS ( SELECT product ,
MAX(price) AS price
FROM dbo.InferredBids_A
GROUP BY product
)
SELECT tbl.product ,
tbl.price ,
tbl.tax ,
tbl.location
FROM dbo.InferredBids_A tbl
INNER JOIN cte ON cte.product = tbl.product
AND cte.price = tbl.price
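To compare the two ideas in this thread side by side, here is a small sketch on the sample rows from the question. It runs on SQLite (3.25+) via Python instead of SQL Server, and the table name bids is a stand-in for InferredBids_A. Note that the join variant needs GROUP BY product inside the CTE and qualified column names in the outer select.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "bids" is a hypothetical stand-in for dbo.InferredBids_A.
conn.execute("CREATE TABLE bids (product TEXT, price INTEGER, tax REAL, location TEXT)")
conn.executemany(
    "INSERT INTO bids VALUES (?, ?, ?, ?)",
    [
        ("Pen", 10, 2.25, "A"),
        ("Pen", 5, 1.25, "B"),
        ("Pen", 15, 1.5, "A"),
        ("Board", 25, 5.26, "A"),
        ("Board", 2, None, "B"),
        ("Water", 5, 10, "A"),
    ],
)

# Variant 1: number rows per product by descending price, keep the first.
by_row_number = conn.execute(
    """
    SELECT product, price, tax, location
    FROM (
        SELECT product, price, tax, location,
               row_number() OVER (PARTITION BY product ORDER BY price DESC) AS rn
        FROM bids
    )
    WHERE rn = 1
    ORDER BY product
    """
).fetchall()

# Variant 2: compute the max price per product, then join back to the rows.
by_join = conn.execute(
    """
    WITH best AS (
        SELECT product, MAX(price) AS price
        FROM bids
        GROUP BY product
    )
    SELECT b.product, b.price, b.tax, b.location
    FROM bids b
    JOIN best ON best.product = b.product AND best.price = b.price
    ORDER BY b.product
    """
).fetchall()

print(by_row_number == by_join)  # prints: True
```

The two variants agree on this data. They would differ if a product had two rows tied at the top price: the join returns both, while row_number keeps only one.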