Find Column With Max on Other Column Grouping By Another Column - sql

I have a table like this:
Id - ItemId - Price - SalesId - Date
1 12 99.99924 21899234 2025-01-01 00:00:00.000000
2 123 12.34567 348923 2021-01-01 00:00:00.000000
3 1234 1234.5 3321234 2022-01-01 00:00:00.000000
4 12345 3.3246 2154234 2023-01-01 00:00:00.000000
5 1234 451.234 3423 2020-02-01 00:00:00.000000
6 12345 0.989 71112357 2020-09-15 20:20:10.000000
7 123 3435.3 71112357 2020-09-14 20:10:12.000000
I am trying to find the Price of an item at its latest Date. For example, for ItemId = 1234 the row with the latest date (2022-01-01 00:00:00.000000) has Id = 3 and a price of 1234.5. That price is what I want the query to return.
I am a beginner to SQL and tried the following query, but it gives me this error:
select "ItemId",
max("Date"),
"Price"
from "Products"
group by "ItemId"
[42803] ERROR: column "Products.Price" must appear in the GROUP BY clause or be used in an aggregate function
I appreciate any help here. Thank you!

In Postgres, you can use distinct on:
select distinct on ("ItemId") p.*
from "Products" p
order by "ItemId", "Date" desc;
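Note: If you are learning SQL, avoid double-quoting table and column names. In Postgres a quoted identifier is case-sensitive, so it has to be quoted the same way in every query; string literals take single quotes.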

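You can try using row_number(). The identifiers are quoted here because the table in the question was created with quoted, mixed-case names:
select * from
(
select "ItemId", "Date", "Price", row_number() over (partition by "ItemId" order by "Date" desc) as rn
from "Products"
) A where rn = 1;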
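The row_number() approach can be tried out on the sample rows from the question. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption; it needs an SQLite build of 3.25 or later for window functions), with the dates stored as ISO text so that they sort correctly.

```python
import sqlite3

# SQLite stands in for Postgres here; the table mirrors the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Products" '
    '("Id" INTEGER, "ItemId" INTEGER, "Price" REAL, "SalesId" INTEGER, "Date" TEXT)'
)
conn.executemany(
    'INSERT INTO "Products" VALUES (?, ?, ?, ?, ?)',
    [
        (1, 12, 99.99924, 21899234, "2025-01-01 00:00:00"),
        (2, 123, 12.34567, 348923, "2021-01-01 00:00:00"),
        (3, 1234, 1234.5, 3321234, "2022-01-01 00:00:00"),
        (4, 12345, 3.3246, 2154234, "2023-01-01 00:00:00"),
        (5, 1234, 451.234, 3423, "2020-02-01 00:00:00"),
        (6, 12345, 0.989, 71112357, "2020-09-15 20:20:10"),
        (7, 123, 3435.3, 71112357, "2020-09-14 20:10:12"),
    ],
)

# Number the rows of each item from newest to oldest and keep only the newest.
latest = conn.execute(
    '''
    SELECT "ItemId", "Date", "Price"
    FROM (
        SELECT "ItemId", "Date", "Price",
               row_number() OVER (PARTITION BY "ItemId" ORDER BY "Date" DESC) AS rn
        FROM "Products"
    ) AS ranked
    WHERE rn = 1
    ORDER BY "ItemId"
    '''
).fetchall()

# Map each ItemId to the price on its latest date.
latest_price = {item_id: price for item_id, _, price in latest}
print(latest_price[1234])
```

For ItemId = 1234 this picks the row dated 2022-01-01, which is the one the question asks for.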

Related

How do I group aggregated data a certain way

I have the following sample transactional item receipt data, consisting of Item, Vendor and Receipt Date:
Item Vendor Receipt_Date
A 1 2021-01-01 00:00:00.000
A 2 2021-01-31 00:00:00.000
B 1 2021-02-01 00:00:00.000
B 2 2021-02-10 00:00:00.000
B 3 2021-02-20 00:00:00.000
C 7 2021-03-01 00:00:00.000
I want to select the Vendor for each Item, based on the last (max) Receipt Date, so the expected result for the above sample would be:
Item Last_Vendor_For_Receipt
A 2
B 3
C 7
I can group the data per Item and Vendor, but I cannot figure out how to achieve the above expected result with an outer query. I'm using SQL Server 2012. Here's the initial query:
select
ir.Item
,ir.Vendor
,max(ir.Receipt_Date) Last_Receipt_Date
from
ItemReceipt ir
group by
ir.Item
,ir.Vendor
I checked online and in the forum, but it was hard to search for my specific question.
Thanks
Here is one approach using TOP with ROW_NUMBER:
SELECT TOP 1 WITH TIES *
FROM yourTable
ORDER BY ROW_NUMBER() OVER (PARTITION BY Item ORDER BY Receipt_Date DESC);
First you select the desired max date per item:
select max(Receipt_Date) as max_rcpt_date
, Item
from your_unknown_table
group by Item
And then you can use this as a subquery to get the vendor:
select Item
, Vendor
from your_unknown_table
where ( Receipt_Date, Item ) in
( select max(Receipt_Date) as max_rcpt_date
, Item
from your_unknown_table
group by Item
)
This will work in Oracle. I'm not sure whether this subquery structure will work in SQL Server.
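As far as I know, SQL Server does not accept the row-value form (Receipt_Date, Item) in (...). The same idea can be written as a join against the grouped subquery, which is plain SQL. A minimal sketch on the sample data, using Python's sqlite3 module as a stand-in for SQL Server (an assumption) and the ItemReceipt table name from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ItemReceipt (Item TEXT, Vendor INTEGER, Receipt_Date TEXT)")
conn.executemany(
    "INSERT INTO ItemReceipt VALUES (?, ?, ?)",
    [
        ("A", 1, "2021-01-01"), ("A", 2, "2021-01-31"),
        ("B", 1, "2021-02-01"), ("B", 2, "2021-02-10"), ("B", 3, "2021-02-20"),
        ("C", 7, "2021-03-01"),
    ],
)

# Step 1 (inner query): the last receipt date per item.
# Step 2 (join): keep the receipt rows that carry exactly that date.
last_vendor = conn.execute(
    """
    SELECT ir.Item, ir.Vendor AS Last_Vendor_For_Receipt
    FROM ItemReceipt AS ir
    JOIN (
        SELECT Item, MAX(Receipt_Date) AS Last_Receipt_Date
        FROM ItemReceipt
        GROUP BY Item
    ) AS m
      ON m.Item = ir.Item
     AND m.Last_Receipt_Date = ir.Receipt_Date
    ORDER BY ir.Item
    """
).fetchall()
print(last_vendor)
```

If two vendors share the last receipt date of an item, this form returns both rows, so a tie-breaker would still be needed in that case.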

How to calculate the average monthly number of some action in some period in Teradata SQL?

I have a table in Teradata SQL like the one below:
ID trans_date
------------------------
123 | 2021-01-01
887 | 2021-01-15
123 | 2021-02-10
45 | 2021-03-11
789 | 2021-10-01
45 | 2021-09-02
I need to calculate the average monthly number of transactions made by customers in the period between 2021-01-01 and 2021-09-01, so the client with "ID" = 789 is not counted because their transaction came later.
In the first month (01) there were 2 transactions
In the second month there was 1 transaction
In the third month there was 1 transaction
In the ninth month there was 1 transaction
So the result should be (2+1+1+1) / 4 = 1.25, shouldn't it?
How can I calculate this in Teradata SQL? The table above is only a sample of my data.
SELECT ID, AVG(txns) FROM
(SELECT ID, TRUNC(trans_date,'MON') as mth, COUNT(*) as txns
FROM mytable
-- WHERE condition matches the question but likely want to
-- use end date 2021-09-30 or use mth instead of trans_date
WHERE trans_date BETWEEN date'2021-01-01' and date'2021-09-01'
GROUP BY id, mth) mth_txn
GROUP BY id;
Your logic translated to SQL:
--(2+1+1+1) / 4
SELECT id, COUNT(*) / COUNT(DISTINCT TRUNC(trans_date,'MON')) AS avg_tx
FROM mytable
WHERE trans_date BETWEEN date'2021-01-01' and date'2021-09-01'
GROUP BY id;
You should compare to Fred's answer to see which is more efficient on your data.
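The arithmetic in the question, (2+1+1+1) / 4, is taken over all customers and counts the 2021-09-02 row, which a BETWEEN ending at 2021-09-01 would drop. A minimal sketch of that overall figure, filtering by month instead of by day; it uses Python's sqlite3 module as a stand-in for Teradata (an assumption), so strftime('%Y-%m', ...) plays the role of TRUNC(trans_date,'MON'):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, trans_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (123, "2021-01-01"), (887, "2021-01-15"), (123, "2021-02-10"),
        (45, "2021-03-11"), (789, "2021-10-01"), (45, "2021-09-02"),
    ],
)

# Transactions in the period divided by the number of months that had any.
# Multiplying by 1.0 avoids integer division.
avg_monthly = conn.execute(
    """
    SELECT COUNT(*) * 1.0 / COUNT(DISTINCT strftime('%Y-%m', trans_date))
    FROM mytable
    WHERE strftime('%Y-%m', trans_date) BETWEEN '2021-01' AND '2021-09'
    """
).fetchone()[0]
print(avg_monthly)
```

Five transactions fall in four distinct months, which gives the 1.25 from the question. Months with no transactions at all are not counted in the denominator, matching the question's own calculation.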

Select earliest date and count rows in table with duplicate IDs

I have a table called table1:
id created_date
1001 2020-06-01
1001 2020-01-01
1001 2020-07-01
1002 2020-02-01
1002 2020-04-01
1003 2020-09-01
I'm trying to write a query that provides me a list of distinct IDs with the earliest created_date they have, along with the count of rows each id has:
id created_date count
1001 2020-01-01 3
1002 2020-02-01 2
1003 2020-09-01 1
I managed to write a window function to grab the earliest date, but I'm having trouble figuring out where to fit the count statement in one:
SELECT
id,
created_date
FROM ( SELECT
id,
created_date,
row_number() OVER(PARTITION BY id ORDER BY created_date) as row_num
FROM table1
) AS a
WHERE row_num = 1
You would use aggregation:
select id, min(created_date), count(*)
from table1
group by id;
I find it amusing that you want to use window functions -- which are considered more advanced -- when lowly aggregation suffices.
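For comparison, the count can also be fitted into the window-function query from the question by adding count(*) over the same partition. A minimal sketch that runs both forms on the sample data, using Python's sqlite3 module (an assumption; window functions need SQLite 3.25 or later). The alias row_count is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, created_date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        (1001, "2020-06-01"), (1001, "2020-01-01"), (1001, "2020-07-01"),
        (1002, "2020-02-01"), (1002, "2020-04-01"),
        (1003, "2020-09-01"),
    ],
)

# Plain aggregation: one row per id with its earliest date and row count.
by_aggregation = conn.execute(
    "SELECT id, MIN(created_date), COUNT(*) FROM table1 GROUP BY id ORDER BY id"
).fetchall()

# Window version: count(*) over the partition sits next to row_number().
by_window = conn.execute(
    """
    SELECT id, created_date, row_count
    FROM (
        SELECT id,
               created_date,
               row_number() OVER (PARTITION BY id ORDER BY created_date) AS row_num,
               count(*) OVER (PARTITION BY id) AS row_count
        FROM table1
    ) AS a
    WHERE row_num = 1
    ORDER BY id
    """
).fetchall()
print(by_aggregation)
```

Both queries return the same three rows here; the aggregation is shorter, while the window form is useful when other columns of the earliest row are needed as well.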

Query to find value in column dependent on a different column in table being the minimum date

I have a dataset that looks like this. I would like to pull a distinct id, the minimum date and value on the minimum date.
id date value
1 01/01/2020 0.5
1 02/01/2020 1
1 03/01/2020 2
2 01/01/2020 3
2 02/01/2020 4
2 03/01/2020 5
This code will pull the id and the minimum date
select Distinct(id), min(nav_date)
from table
group by id
How can I get the value on the minimum date so the output of my query looks like this?
id date value
1 01/01/2020 0.5
2 01/01/2020 3
Use distinct on:
select distinct on (id) t.*
from t
order by id, date;
This can take advantage of an index on (id, date) and is typically the fastest way to do this operation in Postgres.
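Outside Postgres, where distinct on is not available, one portable alternative is an anti-join: keep a row only if no earlier row exists for the same id. A minimal sketch with Python's sqlite3 module, under two assumptions: the date column is called nav_date as in the question's query, and the sample dates are read as dd/mm/yyyy and stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, nav_date TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", 0.5), (1, "2020-01-02", 1), (1, "2020-01-03", 2),
        (2, "2020-01-01", 3), (2, "2020-01-02", 4), (2, "2020-01-03", 5),
    ],
)

# A row survives only when no other row of the same id has an earlier date.
first_rows = conn.execute(
    """
    SELECT t.id, t.nav_date, t.value
    FROM t
    WHERE NOT EXISTS (
        SELECT 1
        FROM t AS earlier
        WHERE earlier.id = t.id
          AND earlier.nav_date < t.nav_date
    )
    ORDER BY t.id
    """
).fetchall()
print(first_rows)
```

Unlike distinct on, this returns every tied row when an id has several rows on its minimum date.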

How to find most recent date given a set of values that fulfill condition

I've been trying to build an SQL query that finds from (table) the most recent date for selected IDs that fulfill the condition where 'type' is in hierarchy 'vegetables'. My goal is to get the whole row for each ID once the max(date) and hierarchy conditions are met.
Example values
ID DATE PREFERENCE AGE
123 1/3/2013 carrot 14
123 1/3/2013 apple 12
123 1/2/2013 carrot 14
124 1/5/2013 carrot 13
124 1/3/2013 apple 13
124 1/2/2013 carrot 14
125 1/4/2013 carrot 13
125 1/3/2013 apple 14
125 1/2/2013 carrot 13
I tried the following
SELECT *
FROM table
WHERE date in
(SELECT max(date) FROM (table) WHERE id in (123,124,125))
and preference in
(SELECT preference FROM (hierarchy_table)
WHERE hierarchy = 'vegetables')
and id in (123,124,125)
but it doesn't give me the most recent date for each id that meets the hierarchy conditions. (ex. in this scenario I would only get id 124)
Thank you in advance!
SELECT max(date) FROM (table) WHERE id in (123,124,125)
is giving you the max date from all dates, you need to group them.
Try replacing with:
SELECT max(date) FROM (table) GROUP BY id
This way you will get the max date for each id
I figured this out. Please see the query below as an example:
SELECT * FROM (table) t
WHERE t.date in
(SELECT max(date) FROM table sub_t where t.ID = sub_t.ID and date <= currentdate)
and preference in
(SELECT preference FROM (hierarchy_table) WHERE hierarchy ='vegetables')
and ID in ('124')
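The correlated form can be checked on the example values. This is a minimal sketch with Python's sqlite3 module, under several assumptions: the table is called prefs and its date column pref_date (both hypothetical names), the contents of hierarchy_table are invented for illustration, and the dates are read as m/d/yyyy and stored as ISO text. The hierarchy filter is repeated inside the subquery so that the maximum is taken only over qualifying rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prefs (id INTEGER, pref_date TEXT, preference TEXT, age INTEGER)")
conn.execute("CREATE TABLE hierarchy_table (preference TEXT, hierarchy TEXT)")
conn.executemany(
    "INSERT INTO prefs VALUES (?, ?, ?, ?)",
    [
        (123, "2013-01-03", "carrot", 14), (123, "2013-01-03", "apple", 12),
        (123, "2013-01-02", "carrot", 14),
        (124, "2013-01-05", "carrot", 13), (124, "2013-01-03", "apple", 13),
        (124, "2013-01-02", "carrot", 14),
        (125, "2013-01-04", "carrot", 13), (125, "2013-01-03", "apple", 14),
        (125, "2013-01-02", "carrot", 13),
    ],
)
# Hypothetical hierarchy rows; the question does not show this table.
conn.executemany(
    "INSERT INTO hierarchy_table VALUES (?, ?)",
    [("carrot", "vegetables"), ("apple", "fruits")],
)

# For each id, keep the qualifying row whose date is the latest qualifying date.
latest_veg = conn.execute(
    """
    SELECT t.id, t.pref_date, t.preference, t.age
    FROM prefs AS t
    WHERE t.id IN (123, 124, 125)
      AND t.preference IN
          (SELECT preference FROM hierarchy_table WHERE hierarchy = 'vegetables')
      AND t.pref_date = (
          SELECT MAX(s.pref_date)
          FROM prefs AS s
          WHERE s.id = t.id
            AND s.preference IN
                (SELECT preference FROM hierarchy_table WHERE hierarchy = 'vegetables')
      )
    ORDER BY t.id
    """
).fetchall()
print(latest_veg)
```

Each of the three IDs gets one row here, rather than only the ID that happens to hold the overall latest date.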
Change:
max(date)
To:
-- if your date data is in mm/dd/yyyy
max( str_to_date( date, '%m/%d/%Y' ) )
OR
-- if your date data is in dd/mm/yyyy
max( str_to_date( date, '%d/%m/%Y' ) )
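The reason the conversion matters is that dates stored as text compare character by character, not chronologically. A small Python sketch of the effect, using made-up mm/dd/yyyy strings (not rows from the question):

```python
from datetime import datetime

# Hypothetical mm/dd/yyyy strings chosen to show the problem.
raw_dates = ["1/3/2013", "1/10/2013", "12/25/2012"]

# Text comparison: "12/..." wins because '2' sorts after '/'.
latest_as_text = max(raw_dates)

# Parsing first gives the chronological maximum.
latest_as_date = max(raw_dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(latest_as_text, latest_as_date)
```

The text maximum is the oldest of the three dates, while the parsed maximum is 1/10/2013, which is what max(date) is expected to return once the column holds real dates.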