sql server select query a bit complex - sql

I have an item_prices table with prices. The prices can change at any time.
I want to display, for each item, the row with the highest date.
ITEM_prices
id | Items_name | item_price | item_date
------------------------------------------
1 A 10 2012-01-01
2 B 15 2012-01-01
3 B 16 2013-01-01
4 C 50 2013-01-01
5 A 20 2013-01-01
I want to display items A, B and C once each, with the highest date, as below:
id | Items_name | item_price | item_date
-------------------------------------------
3 B 16 2013-01-01
4 C 50 2013-01-01
5 A 20 2013-01-01

When plain aggregate functions do the job, there is no need for a window function or a CTE.
SELECT t1.*
FROM ITEM_prices t1
JOIN
(
SELECT Items_name,MAX(item_date) AS MaxItemDate
FROM ITEM_prices
GROUP BY Items_name
)t2
ON t1.Items_name=t2.Items_name AND t1.item_date=t2.MaxItemDate

One approach would be to use a CTE (Common Table Expression) if you're on SQL Server 2005 and newer (you aren't specific enough in that regard).
With this CTE, you can partition your data by some criteria - i.e. your Items_name - and have SQL Server number all your rows starting at 1 for each of those "partitions", ordered by some criteria.
So try something like this:
;WITH NewestItem AS
(
SELECT
id, Items_name, item_price, item_date,
RowNum = ROW_NUMBER() OVER(PARTITION BY Items_name ORDER BY item_date DESC)
FROM
dbo.ITEM_Prices
)
SELECT
id, Items_name, item_price, item_date
FROM
NewestItem
WHERE
RowNum = 1
Here, I am selecting only the "first" entry for each "partition" (i.e. for each Items_Name) - ordered by the item_date in descending order (newest date gets RowNum = 1).
Is that approach what you're looking for?
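The partition-and-number idea can be tried out end to end. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or newer is needed for window functions). It uses `AS row_num` rather than the T-SQL `RowNum = ...` alias form, and the Python variable names are illustrative only.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_prices (id INTEGER, items_name TEXT, item_price INTEGER, item_date TEXT)"
)
conn.executemany(
    "INSERT INTO item_prices VALUES (?, ?, ?, ?)",
    [
        (1, "A", 10, "2012-01-01"),
        (2, "B", 15, "2012-01-01"),
        (3, "B", 16, "2013-01-01"),
        (4, "C", 50, "2013-01-01"),
        (5, "A", 20, "2013-01-01"),
    ],
)

# Number the rows inside each items_name partition, newest date first.
numbered = conn.execute(
    """
    SELECT id, items_name, item_date,
           ROW_NUMBER() OVER (PARTITION BY items_name ORDER BY item_date DESC) AS row_num
    FROM item_prices
    ORDER BY items_name, row_num
    """
).fetchall()
for row in numbered:
    print(row)

# Keeping row_num = 1 leaves exactly one row per item: the newest one.
newest_ids = sorted(row[0] for row in numbered if row[3] == 1)
print(newest_ids)
```

Printing the intermediate `numbered` rows shows the numbering restarting at 1 for every item, which is what makes the final `RowNum = 1` filter work.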

One way is to use window functions to find the maximum date for each item:
select id, Items_name, item_price, item_date
from (select ip.*,
max(item_date) over (partition by items_name) as max_item_date
from item_prices ip
) ip
where item_date = max_item_date;

This will select all rows with the max date.
SELECT *
FROM item_prices
WHERE item_date = (SELECT max(item_date) FROM item_prices)
ORDER BY ID
This will select all rows for each item with the max date for that item.
select p.id, p.Items_name, p.item_price, p.item_date
from item_prices p
join (select items_name, max(item_date) as max_item_date
from item_prices
group by items_name
) ip
on p.items_name = ip.items_name and p.item_date = ip.max_item_date
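On the sample data both variants return the same rows, because every item has a price dated 2013-01-01. The difference only shows up when some item's latest date is older than the overall latest date. A minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server and a hypothetical extra item D (id 6) that is not in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_prices (id INTEGER, items_name TEXT, item_price INTEGER, item_date TEXT)"
)
conn.executemany(
    "INSERT INTO item_prices VALUES (?, ?, ?, ?)",
    [
        (1, "A", 10, "2012-01-01"),
        (2, "B", 15, "2012-01-01"),
        (3, "B", 16, "2013-01-01"),
        (4, "C", 50, "2013-01-01"),
        (5, "A", 20, "2013-01-01"),
        (6, "D", 5, "2011-06-01"),  # hypothetical row: D was last priced in 2011
    ],
)

# Variant 1: rows whose date equals the single latest date in the whole table.
global_max_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM item_prices "
        "WHERE item_date = (SELECT MAX(item_date) FROM item_prices) "
        "ORDER BY id"
    )
]

# Variant 2: rows whose date equals the latest date of their own item.
per_item_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT p.id
        FROM item_prices p
        JOIN (SELECT items_name, MAX(item_date) AS max_item_date
              FROM item_prices
              GROUP BY items_name) m
          ON p.items_name = m.items_name AND p.item_date = m.max_item_date
        ORDER BY p.id
        """
    )
]

print(global_max_ids)  # item D is missing here
print(per_item_ids)    # item D is included here
```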

Related

How to find row with equal value?

I've got a table Accounts
AMOUNT| ID_CLIENT | ID_BRANCH
250 1 1
250 1 3
100 1 4
300 2 1
300 2 3
450 3 2
100 3 2
225 4 1
225 4 2
225 4 4
225 4 5
I need to find clients who have the same amount in every branch (like ID_CLIENT = 2 and ID_CLIENT = 4). I have no idea how to implement this. Could anyone help me, please?
Use two levels of aggregation:
select client
from (select client, branch, sum(amount) as amount
from t
group by client, branch
) cb
group by client
having min(amount) = max(amount);
I can't tell if you can have multiple rows per client/branch. If not, you just need:
select client
from t
group by client
having min(amount) = max(amount);
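The two queries differ exactly when a client has several rows in one branch, as client 3 does in the sample data. A minimal sketch, assuming Python's built-in sqlite3 module as the database and the question's column names (the table name `accounts` and the Python variable names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (amount INTEGER, id_client INTEGER, id_branch INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (250, 1, 1), (250, 1, 3), (100, 1, 4),
        (300, 2, 1), (300, 2, 3),
        (450, 3, 2), (100, 3, 2),
        (225, 4, 1), (225, 4, 2), (225, 4, 4), (225, 4, 5),
    ],
)

# One level: compare the raw row amounts per client.
one_level = [
    row[0]
    for row in conn.execute(
        "SELECT id_client FROM accounts "
        "GROUP BY id_client HAVING MIN(amount) = MAX(amount) "
        "ORDER BY id_client"
    )
]

# Two levels: first total the amounts per client and branch, then compare the totals.
two_level = [
    row[0]
    for row in conn.execute(
        """
        SELECT id_client
        FROM (SELECT id_client, id_branch, SUM(amount) AS amount
              FROM accounts
              GROUP BY id_client, id_branch) cb
        GROUP BY id_client
        HAVING MIN(amount) = MAX(amount)
        ORDER BY id_client
        """
    )
]

print(one_level)
print(two_level)
```

The one-level query returns the clients the question asks for (2 and 4). The two-level query also returns client 3, because its two rows sit in the same branch and are summed into a single total, so which version is right depends on whether several rows in one branch should be added together.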
You can use analytical functions to achieve the same:
with CTE1 as
(
SELECT A.*, DENSE_RANK() OVER (PARTITION BY ID_CLIENT ORDER BY AMOUNT) DN,
COUNT(*) OVER (PARTITION BY ID_CLIENT) TOTAL_COUNT
FROM TABLE1 A ORDER BY ID_CLIENT
)
SELECT ID_CLIENT FROM
(
SELECT ID_CLIENT, SUM(DN), TOTAL_COUNT
FROM CTE1
GROUP BY ID_CLIENT, TOTAL_COUNT
HAVING SUM(DN) = TOTAL_COUNT
);
By using First_value and Last_value:
SELECT DISTINCT ID_CLIENT FROM
(
SELECT A.*,
FIRST_VALUE(AMOUNT) OVER(PARTITION BY ID_CLIENT ORDER BY AMOUNT ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FST_VAL,
LAST_VALUE(AMOUNT) OVER(PARTITION BY ID_CLIENT ORDER BY AMOUNT ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) LST_VAL
FROM TABLE1 A
) X WHERE FST_VAL = LST_VAL ;

SQL - How to SELECT the best two months which are next to each other

How to select in PostgreSQL the best two months from Table.
Table:
ID Month Value
1 2019-06 100
2 2019-07 120
3 2019-08 70
4 2019-09 200
5 2019-10 100
6 2019-11 50
I would like to select ID where sum(Value) of two months which are next to each other is the highest.
In the following case the result will be:
4 2019-09
5 2019-10
where the sum of values is equal to 300.
You can put the data on one row using join:
select t1.*, t2.*
from t t1 join
t t2
on t2.month = t1.month + interval '1 month'
order by t1.value + t2.value desc
limit 1;
Getting separate rows is trickier. You can easily get the first row using lead():
select t.*
from (select t.*, lead(value, 1, 0) over (order by month) as next_value
from t
) t
order by (value + next_value) desc
limit 1;
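The lead() approach can be checked against the sample data. This is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or newer) in place of PostgreSQL and a month column stored as text. It also pulls the id of the following row with a second lead() call, which is an addition for illustration. Note that lead() pairs each row with the next row in month order, so it assumes there are no missing months.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, month TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2019-06", 100),
        (2, "2019-07", 120),
        (3, "2019-08", 70),
        (4, "2019-09", 200),
        (5, "2019-10", 100),
        (6, "2019-11", 50),
    ],
)

# Pair every month with the one after it and keep the pair with the largest total.
best_pair = conn.execute(
    """
    SELECT id, next_id, value + next_value AS pair_total
    FROM (SELECT id, value,
                 LEAD(id) OVER (ORDER BY month) AS next_id,
                 LEAD(value, 1, 0) OVER (ORDER BY month) AS next_value
          FROM t) paired
    ORDER BY pair_total DESC
    LIMIT 1
    """
).fetchone()

print(best_pair)
```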
Getting the second month is much trickier. I am thinking that the simplest method is to unpivot the first results:
select t.*
from (select t1, t2
from t t1 join
t t2
on t2.month = t1.month + interval '1 month'
order by t1.value + t2.value desc
limit 1
) s cross join lateral
unnest(array[t1, t2]) t
order by t.month;
Here is a solution that uses solely window functions and that does not assume that month is of a date-like datatype.
This works as follows:
first, number the records by increasing month with row_number() (aliased as rn) and compute the sum of the current and previous value (aliased as vals)
then rank the records by descending vals (aliased as rnk)
then expose the row number of the record that has the highest vals (aliased as rn_max)
finally, pull out that record and the preceding one (i.e. the one with the previous row number)
Query:
select id, month, value
from (
select t.*, first_value(rn) over(order by rnk) rn_max
from (
select t.*, rank() over(order by vals desc) rnk
from (
select
t.*,
value + lag(value, 1, 0) over (order by month) vals,
row_number() over(order by month) rn
from mytable t
) t
) t
) t
where rn in (rn_max, rn_max - 1)
order by month
Step-by-step demo on DB Fiddle:
id | month | value
-: | :------ | ----:
4 | 2019-09 | 200
5 | 2019-10 | 100

redshift: how to find row_number after grouping and aggregating?

Suppose I have a table of customer purchases ("my_table") like this:
--------------------------------------
customerid | date_of_purchase | price
-----------|------------------|-------
1 | 2019-09-20 | 20.23
2 | 2019-09-21 | 1.99
1 | 2019-09-21 | 123.34
...
I'd like to be able to find the nth highest spending customer in this table (say n = 5). So I tried this:
with cte as (
select customerid, sum(price) as total_pay,
row_number() over (partition by customerid order by total_pay desc) as rn
from my_table group by customerid order by total_pay desc)
select * from cte where rn = 5;
But this gives me nonsense results. For some reason rn doesn't seem to be unique (for example there are a bunch of customers with rn = 1). I don't understand why. Isn't rn supposed to be just a row number?
Remove the partition by in the definition of row_number():
with cte as (
select customerid, sum(price) as total_pay,
row_number() over (order by total_pay desc) as rn
from my_table
group by customerid
)
select *
from cte
where rn = 5;
After grouping by customerid, each customer has exactly one row. Partitioning by customerid therefore puts every row in its own partition, so rn is always 1.
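The effect of the partition by clause is easy to see side by side. A minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or newer) in place of Redshift; customer 3 is a hypothetical extra row, and the totals are computed in a CTE first so the window function only ever sees one row per customer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (customerid INTEGER, date_of_purchase TEXT, price REAL)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "2019-09-20", 20.23),
        (2, "2019-09-21", 1.99),
        (1, "2019-09-21", 123.34),
        (3, "2019-09-22", 50.00),  # hypothetical row, not in the question
    ],
)

totals_cte = (
    "WITH totals AS (SELECT customerid, SUM(price) AS total_pay "
    "FROM my_table GROUP BY customerid) "
)

# With PARTITION BY customerid, every customer is alone in its partition.
partitioned = conn.execute(
    totals_cte
    + "SELECT customerid, "
    "ROW_NUMBER() OVER (PARTITION BY customerid ORDER BY total_pay DESC) AS rn "
    "FROM totals"
).fetchall()
partitioned.sort()

# Without it, all customers are numbered together by total spend.
ranked = conn.execute(
    totals_cte
    + "SELECT customerid, ROW_NUMBER() OVER (ORDER BY total_pay DESC) AS rn FROM totals"
).fetchall()
ranked.sort(key=lambda row: row[1])

print(partitioned)
print(ranked)
```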

selecting set of second lowest values

I have two columns of interest ID and Deadline:
ID Deadline (DD/MM/YYYY)
1 01/01/2017
1 05/01/2017
1 04/01/2017
2 02/01/2017
2 03/01/2017
2 06/02/2017
2 08/03/2017
Each ID can have multiple (n) deadlines. I need to select all rows where the Deadline is second lowest for each individual ID.
Desired output:
ID Deadline (DD/MM/YYYY)
1 04/01/2017
2 03/01/2017
Selecting minimum can be done by:
select min(deadline) from XXX group by ID
but I am lost with "middle" values. I am using Rpostgresql, but any idea helps as well.
Thanks for your help
One way is to use ROW_NUMBER() window function
SELECT id, deadline
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY deadline) rn
FROM xxx
) q
WHERE rn = 2 -- get only second lowest ones
or with LATERAL
SELECT t.*
FROM (
SELECT DISTINCT id FROM xxx
) i JOIN LATERAL (
SELECT *
FROM xxx
WHERE id = i.id
ORDER BY deadline
OFFSET 1 LIMIT 1
) t ON (TRUE)
Output:
id | deadline
----+------------
1 | 2017-04-01
2 | 2017-03-01
Here is a dbfiddle demo
Using ROW_NUMBER() after taking distinct records will eliminate the chance of getting the lowest date instead of second lowest if there are duplicate records.
select ID,Deadline
from (
select ID,
Deadline,
ROW_NUMBER() over(partition by ID order by Deadline) RowNum
from (select distinct ID, Deadline from SourceTable) T
) Tbl
where RowNum = 2
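The duplicate problem can be reproduced directly. A minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or newer) in place of PostgreSQL, deadlines stored as ISO text, and a hypothetical duplicate of the earliest deadline for ID 1:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xxx (id INTEGER, deadline TEXT)")
conn.executemany(
    "INSERT INTO xxx VALUES (?, ?)",
    [
        (1, "2017-01-01"),
        (1, "2017-01-01"),  # hypothetical duplicate of the lowest deadline
        (1, "2017-01-05"),
        (1, "2017-01-04"),
        (2, "2017-01-02"),
        (2, "2017-01-03"),
        (2, "2017-02-06"),
        (2, "2017-03-08"),
    ],
)

query = """
    SELECT id, deadline
    FROM (SELECT id, deadline,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY deadline) AS rn
          FROM {source}) numbered
    WHERE rn = 2
    ORDER BY id
"""

# Numbering the raw rows: the duplicate takes position 2 for ID 1.
on_raw_rows = conn.execute(query.format(source="xxx")).fetchall()

# Numbering distinct (id, deadline) pairs: position 2 is the second lowest value.
on_distinct_rows = conn.execute(
    query.format(source="(SELECT DISTINCT id, deadline FROM xxx)")
).fetchall()

print(on_raw_rows)
print(on_distinct_rows)
```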

Min Date from one column multiple rows

My apologies, I should have added every column and the complete problem, not just a portion.
I have a table A which stores all invoices issued (type 1) and payments received (type 4) from clients. Sometimes clients pay in 2-3 installments. I want to find the date difference between the invoice being issued and the last payment collected for that invoice. My data looks like this:
a.cltid | A.Invnum | A.Cash | A.Date     | a.type | a.status
70      | 112      | -200   | 2012-03-01 | 4      | P
70      | 112      | -500   | 2012-03-12 | 4      | P
90      | 124      | -550   | 2012-01-20 | 4      | P
70      | 112      | 700    | 2012-02-20 | 1      | p
55      | 101      | 50     | 2012-01-15 | 1      | d
90      | 124      | 550    | 2012-01-15 | 1      | P
I am running
Select *, Datediff(dd,T.date,P.date)
from (select a.cltid, a.invnumber,a.cash, min(a.date)date
from table.A as A
where a.status<>'d' and a.type=1
group by a.cltid, a.invnumber,a.cash)T
join
Select *
from (select a.cltid, a.invnumber,a.cash, min(a.date)date
from table.A as A
where a.status<>'d' and a.type=4
group by a.cltid, a.invnumber,a.cash)P
on
T.invnumb=P.invnumber and T.cltid=P.cltid
How can I make it work? So it shows me
70|112|-500|2012-03-12|4|P 70|112|700|2012-02-20|1|p|22
90|124|-550|2012-01-20|4|P 90|124|550|2012-01-15|1|P|5
You can use row_number to assign sequence number within each cltid in the order of decreasing date and then filter to get the first row for each cltid which will be the row with latest date for that cltid:
select *
from (
select A.*,
row_number() over (
partition by a.cltid order by a.date desc
) rn
from table.A as A
) t
where rn = 1;
It will return one row (with latest date) for each client. If you want to return all the rows which have latest date, use rank() instead.
Use a ranking function to get all the columns:
select a.*
from (select a.*,
row_number() over (partition by cltid order by date desc) as seqnum
from a
) a
where seqnum = 1;
Use aggregation if you only want the date. The issue with your query is that the group by clause has too many columns:
select a.cltid, max(a.date) as date
from table.A as A
group by a.cltid;
A second issue is that min() returns the earliest date, not the latest.
There are many ways to do this. Here are some of them:
test setup: http://rextester.com/VGUY60367
common table expression version, using row_number():
with cte as (
select *
, rn = row_number() over (
partition by cltid, Invnum
order by [date] desc
)
from a
)
select cltid, Invnum, Cash, [date]
from cte
where rn = 1
cross apply version:
select distinct
a.cltid
, a.Invnum
, x.Cash
, x.[date]
from a
cross apply (
select top 1
cltid, Invnum
, [date]
, Cash
from a as i
where i.cltid =a.cltid
and i.Invnum=a.Invnum
order by i.[date] desc
) as x;
top with ties version:
select top 1 with ties
*
from a
order by
row_number() over (
partition by cltid, Invnum
order by [date] desc
)
all return:
+-------+--------+---------------------+------+
| cltid | Invnum | date | Cash |
+-------+--------+---------------------+------+
| 70 | 112 | 12.03.2012 00:00:00 | -500 |
| 90 | 124 | 20.01.2012 00:00:00 | -550 |
+-------+--------+---------------------+------+
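None of the queries above computes the day difference the question asks about. One way to add it is to join each invoice row to its payment rows and take the latest payment date. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of SQL Server, with SQLite's `julianday()` standing in for `DATEDIFF`; the aliases `i`, `p` and the output column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE a (cltid INTEGER, invnum INTEGER, cash INTEGER, "
    "date TEXT, type INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?, ?, ?, ?)",
    [
        (70, 112, -200, "2012-03-01", 4, "P"),
        (70, 112, -500, "2012-03-12", 4, "P"),
        (90, 124, -550, "2012-01-20", 4, "P"),
        (70, 112, 700, "2012-02-20", 1, "p"),
        (55, 101, 50, "2012-01-15", 1, "d"),
        (90, 124, 550, "2012-01-15", 1, "P"),
    ],
)

# i = invoice rows (type 1, status other than 'd'), p = payment rows (type 4).
# MAX(p.date) is the last payment; the julianday difference is the number of days.
days_to_last_payment = conn.execute(
    """
    SELECT i.cltid, i.invnum, i.date AS invoice_date,
           MAX(p.date) AS last_payment_date,
           CAST(julianday(MAX(p.date)) - julianday(i.date) AS INTEGER) AS days
    FROM a AS i
    JOIN a AS p
      ON p.cltid = i.cltid AND p.invnum = i.invnum AND p.type = 4
    WHERE i.type = 1 AND i.status <> 'd'
    GROUP BY i.cltid, i.invnum, i.date
    ORDER BY i.cltid
    """
).fetchall()

for row in days_to_last_payment:
    print(row)
```

Invoice 101 drops out because of its 'd' status, and it has no payment rows to join to anyway.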
You can get the latest date for each invoice like this:
Select
a.cltid, a.invnumber, max(a.date) [date]
from
YourTable a
group by
a.cltid, a.invnumber