Select MIN MAX together alters behaviour of query - sql

I have a SQLite table with the following columns:
timestamp
id
value
I want to write a query that - for each id - lists the latest timestamp with the value at that timestamp, as well as the max value for each id.
For that I wrote the following query that works as expected.
SELECT timestamp, MAX(value) as max, id, value from (
SELECT * from temperatures order by timestamp DESC
) GROUP BY ID;
When I now also want to calculate the min value, I alter the query:
SELECT timestamp, MAX(value) as max, MIN(value) as min, id, value from (
SELECT * from temperatures order by timestamp DESC
) GROUP BY ID;
The problem now is that the timestamp is not the latest timestamp anymore. Why is that?

Your query is malformed: the SELECT list is not compatible with the GROUP BY, because timestamp and value are neither grouped nor aggregated. SQLite handles such bare columns with its own non-standard rules, and yes, adding another aggregate column changes what the query returns.
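To see the bare-column behaviour in isolation, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration. SQLite documents that with a single MAX() the bare columns come from the row holding the maximum, which is not necessarily the latest row; with both MIN() and MAX() they may come from either extreme row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temperatures (timestamp INTEGER, id INTEGER, value REAL)")
# Hypothetical data: the maximum value (30.0) is NOT on the latest timestamp (3).
conn.executemany(
    "INSERT INTO temperatures VALUES (?, ?, ?)",
    [(1, 1, 10.0), (2, 1, 30.0), (3, 1, 20.0)],
)

# One MAX(): the bare columns follow the row that holds the maximum value.
only_max = conn.execute(
    "SELECT timestamp, MAX(value), value FROM temperatures GROUP BY id"
).fetchone()

# MIN() and MAX() together: the bare columns come from one of the two
# extreme rows, and which one is not something to rely on.
both = conn.execute(
    "SELECT timestamp, MAX(value), MIN(value), value FROM temperatures GROUP BY id"
).fetchone()

print(only_max)
print(both)
```

If this reading is right, the first query in the question returns the latest timestamp only when the maximum value happens to sit on that row.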
I would recommend that you write the query using window functions:
SELECT id, MIN(timestamp), MAX(timestamp),
       MIN(CASE WHEN seqnum_asc = 1 THEN value END) as temp_at_min,
       MIN(CASE WHEN seqnum_desc = 1 THEN value END) as temp_at_max
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY timestamp ASC) as seqnum_asc,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY timestamp DESC) as seqnum_desc
      FROM temperatures t
     ) t
GROUP BY id;
This is standard SQL and should consistently do what you want.
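As a rough check, the same ROW_NUMBER() idea can be adapted to what the question asks for: the latest timestamp, the value at that timestamp, and the max and min value per id. The sketch below runs it through Python's sqlite3 module on made-up rows; it assumes SQLite 3.25 or later, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temperatures (timestamp INTEGER, id INTEGER, value REAL)")
# Hypothetical sample rows for two ids.
conn.executemany(
    "INSERT INTO temperatures VALUES (?, ?, ?)",
    [(1, 1, 10.0), (2, 1, 30.0), (3, 1, 20.0), (1, 2, 5.0), (2, 2, 7.0)],
)

summary = conn.execute("""
    SELECT id,
           MAX(timestamp)                                AS latest_timestamp,
           MAX(CASE WHEN seqnum_desc = 1 THEN value END) AS latest_value,
           MAX(value)                                    AS max_value,
           MIN(value)                                    AS min_value
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY timestamp DESC) AS seqnum_desc
          FROM temperatures t
         ) t
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in summary:
    print(row)
```

Every selected column is either the grouping key or an aggregate, so the result does not depend on SQLite's bare-column rules.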

You can do it with MIN(), MAX() and FIRST_VALUE() window functions:
SELECT DISTINCT id,
MIN(timestamp) OVER (PARTITION BY id) min_timestamp,
MAX(timestamp) OVER (PARTITION BY id) max_timestamp,
FIRST_VALUE(value) OVER (PARTITION BY id ORDER BY timestamp) min,
FIRST_VALUE(value) OVER (PARTITION BY id ORDER BY timestamp DESC) max
FROM temperatures
There is no need for aggregation; window functions are enough.
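A small sketch of the FIRST_VALUE() variant, again via Python's sqlite3 module with made-up rows (SQLite 3.25 or later assumed). The aliases value_at_min_ts and value_at_max_ts are illustrative names chosen here to avoid reusing min and max as column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temperatures (timestamp INTEGER, id INTEGER, value REAL)")
# Hypothetical sample rows for two ids.
conn.executemany(
    "INSERT INTO temperatures VALUES (?, ?, ?)",
    [(1, 1, 10.0), (2, 1, 30.0), (3, 1, 20.0), (1, 2, 5.0), (2, 2, 7.0)],
)

rows = conn.execute("""
    SELECT DISTINCT id,
           MIN(timestamp) OVER (PARTITION BY id) AS min_timestamp,
           MAX(timestamp) OVER (PARTITION BY id) AS max_timestamp,
           FIRST_VALUE(value) OVER (PARTITION BY id ORDER BY timestamp) AS value_at_min_ts,
           FIRST_VALUE(value) OVER (PARTITION BY id ORDER BY timestamp DESC) AS value_at_max_ts
    FROM temperatures
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Each window expression yields the same result for every row of a partition, which is why SELECT DISTINCT collapses the output to one row per id.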

Related

sql select everything with maximum date (that is smaller than a specific date) without subqueries

I would like to write a SQL query that selects, for each id, all rows where the date column holds the latest date for that id that is still earlier than, for example, 16-JUL-2021. I would like to do this without using subqueries (in Oracle); is that possible?
I tried the below but it doesn't work.
SELECT *, max(date)
WHERE date < '16-JUL-2021'
OVER(PARTITION BY id ORDER BY date DESC) as sth
FROM table
You can find the maximum date without sub-queries.
SELECT t.*,
max("DATE") OVER(PARTITION BY id ORDER BY "DATE" DESC) as max_date
FROM "TABLE" t
WHERE "DATE" < DATE '2021-07-16'
You need a sub-query to filter to only show the row(s) with the maximum date:
SELECT *
FROM (
SELECT t.*,
max("DATE") OVER(PARTITION BY id ORDER BY "DATE" DESC) as max_date
FROM "TABLE" t
WHERE "DATE" < DATE '2021-07-16'
)
WHERE "DATE" = max_date;
However, you are still only querying the table once using this technique even though it uses a sub-query.
Note DATE and TABLE are reserved words and cannot be used as unquoted identifiers; it would be better practice to use different names for those identifiers.
You could equivalently use the RANK or DENSE_RANK analytic functions instead of MAX. ROW_NUMBER, however, does not give the same output: it returns only a single row per id and drops any tied rows.
SELECT *
FROM (
SELECT t.*,
RANK() OVER(PARTITION BY id ORDER BY "DATE" DESC) as rnk
FROM "TABLE" t
WHERE "DATE" < DATE '2021-07-16'
)
WHERE rnk = 1;
But you still need a sub-query to filter the rows.
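The difference between RANK and ROW_NUMBER on ties is easy to check on a toy table. The sketch below uses Python's sqlite3 module rather than Oracle, so it only illustrates the ranking semantics; the table events and its columns are hypothetical names, and SQLite 3.25 or later is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, event_date TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2021-07-10", "a"),
        (1, "2021-07-15", "b"),
        (1, "2021-07-15", "c"),  # tied with "b" for the latest date
        (1, "2021-07-20", "d"),  # removed by the date filter
        (2, "2021-07-01", "e"),
    ],
)

def latest_rows(ranking_function):
    """Return the top-ranked rows per id using the given ranking function name."""
    sql = f"""
        SELECT id, event_date, note
        FROM (
            SELECT e.*,
                   {ranking_function}() OVER (PARTITION BY id ORDER BY event_date DESC) AS rnk
            FROM events e
            WHERE event_date < '2021-07-16'
        )
        WHERE rnk = 1
        ORDER BY id, note
    """
    return conn.execute(sql).fetchall()

with_rank = latest_rows("RANK")              # keeps both tied rows for id 1
with_row_number = latest_rows("ROW_NUMBER")  # keeps one arbitrary row of the tie
print(with_rank)
print(with_row_number)
```

ISO-formatted date strings are used here so that plain string comparison orders them correctly in SQLite.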
If you do not want to use a sub-query, you can use:
SELECT id,
MAX("DATE") AS "DATE",
MAX(col1) KEEP (DENSE_RANK LAST ORDER BY "DATE", ROWNUM) AS col1,
MAX(col2) KEEP (DENSE_RANK LAST ORDER BY "DATE", ROWNUM) AS col2,
MAX(col3) KEEP (DENSE_RANK LAST ORDER BY "DATE", ROWNUM) AS col3
FROM "TABLE"
WHERE "DATE" < DATE '2021-07-16'
GROUP BY id
However, that is not quite the same as it will only get a single row per id and will not return multiple rows tied for the greatest date per id.

SQL get values with MAX date by grouping

I have a database table with the columns:
station, source, type, date and price
I would like to get the newest price for each type.
I tried
SELECT max(date) and GROUP BY station, source, type
but I get the error "price must appear in GROUP BY clause".
Does anyone know how to do this?
If you want the most recent row for a combination of columns, you can use row_number():
select t.*
from (select t.*,
row_number() over (partition by station, source, type order by date desc) as seqnum
from t
) t
where seqnum = 1;
You can try the query below:
select * from
(
SELECT station, source, type, price, row_number() over(partition by type order by date desc) as rn from tablename
) A where rn = 1

Select all items for each category but only for latest date per each category

The table schema is very simple: item, category and date. I would like to query all items for each category, but only for the MAX date per category.
I think this does what you want:
select t.*
from (select t.*,
max(date) over (partition by category) as max_date
from t
) t
where date = max_date;
Below is for BigQuery Standard SQL (and in BigQuery style)
#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY date DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY category
For the follow-up requirement, "I have cases where multiple entries have exactly the same max date and all need to be returned", consider the version below:
#standardSQL
SELECT t.* FROM (
SELECT ARRAY_CONCAT_AGG(arr ORDER BY `date` DESC LIMIT 1) arr FROM (
SELECT category, `date`, ARRAY_AGG(t) arr
FROM `project.dataset.table` t
GROUP BY category, `date`
) GROUP BY category
), UNNEST(arr) t

SQL Finding five largest numbers instead of one Max in a table

I have a table and I need to run a query that contains some aggregate functions such as maximum, average, standard deviation, ...
but instead of a single maximum I need to return the 5 largest numbers.
the simplified query is something like this:
SELECT OSI_KEY , MAX(VALUE) , AVG(VALUE) , STDDEV(VALUE), variance(VALUE)
FROM DATA_VALUES_5MIN_6_2013
GROUP BY OSI_KEY
ORDER BY OSI_KEY
and I need some Magical ;) Query like this:
SELECT OSI_KEY , MAX1(VALUE) ,MAX2(VALUE) ,MAX3(VALUE) ,MAX4(VALUE) , MAX5(VALUE) ,
AVG(VALUE) , STDDEV(VALUE), variance(VALUE)
FROM DATA_VALUES_5MIN_6_2013
GROUP BY OSI_KEY
ORDER BY OSI_KEY
I appreciate your considerations.
Oracle has an NTH_VALUE() function. Unfortunately, it is only an analytic function and not an aggregate function. This leads to the strange construct of SELECT DISTINCT with a bunch of analytic functions:
SELECT DISTINCT OSI_KEY,
MAX(VALUE) OVER (PARTITION BY OSI_KEY),
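NTH_VALUE(VALUE, 2) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_2,
NTH_VALUE(VALUE, 3) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_3,
NTH_VALUE(VALUE, 4) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_4,
NTH_VALUE(VALUE, 5) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_5,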
AVG(VALUE) OVER (PARTITION BY OSI_KEY),
STDDEV(VALUE) OVER (PARTITION BY OSI_KEY),
variance(VALUE) OVER (PARTITION BY OSI_KEY)
FROM DATA_VALUES_5MIN_6_2013
ORDER BY OSI_KEY;
You can also do this using conditional aggregation, with a row_number() or dense_rank() in a subquery.
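A minimal sketch of that conditional-aggregation idea, run through Python's sqlite3 module on made-up data (the table name data_values is a hypothetical stand-in, and SQLite 3.25 or later is assumed). STDDEV and VARIANCE are left out because SQLite does not ship them by default; on Oracle they would sit next to AVG in the outer SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_values (osi_key INTEGER, value REAL)")
# Hypothetical data: key 1 has six values, key 2 has only two.
conn.executemany(
    "INSERT INTO data_values VALUES (?, ?)",
    [(1, v) for v in (3, 9, 1, 7, 5, 8)] + [(2, v) for v in (4, 2)],
)

top5 = conn.execute("""
    SELECT osi_key,
           MAX(CASE WHEN rn = 1 THEN value END) AS max_1,
           MAX(CASE WHEN rn = 2 THEN value END) AS max_2,
           MAX(CASE WHEN rn = 3 THEN value END) AS max_3,
           MAX(CASE WHEN rn = 4 THEN value END) AS max_4,
           MAX(CASE WHEN rn = 5 THEN value END) AS max_5,
           AVG(value)                           AS avg_value
    FROM (
        SELECT osi_key, value,
               ROW_NUMBER() OVER (PARTITION BY osi_key ORDER BY value DESC) AS rn
        FROM data_values
    )
    GROUP BY osi_key
    ORDER BY osi_key
""").fetchall()

for row in top5:
    print(row)
```

Keys with fewer than five values get NULL in the missing positions. Swapping ROW_NUMBER() for DENSE_RANK() would return the five largest distinct values instead of counting duplicates separately.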
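If you only need the five largest per-key maxima across the whole table, you can use:
SELECT OSI_KEY, MaxValue FROM (
SELECT OSI_KEY, MAX(value) AS MaxValue FROM DATA_VALUES_5MIN_6_2013 GROUP BY OSI_KEY
)
ORDER BY MaxValue DESC
FETCH FIRST 5 ROWS ONLY;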

How to get single closest value for each column type in DB2

I have this query:
SELECT * FROM TABLE1 WHERE KEY_COLUMN='NJCRF' AND TYPE_COLUMN IN ('SCORE1', 'SCORE2', 'SCORE3') AND DATE_EFFECTIVE_COLUMN<='2016-09-17'
I get about 12 records (rows) as a result.
How to get result closest to DATE_EFFECTIVE_COLUMN for each TYPE_COLUMN? In this case, how to get three records, for each type, that are closest to effective date?
UPDATE: I could use TOP if I had to go over only single type, but I have three at this moment and for each of them I need to get closest time result.
Hope I made it clear, let me know if you need more info.
If I understand correctly, you can use ROW_NUMBER():
SELECT t.*
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY TYPE_COLUMN ORDER BY DATE_EFFECTIVE_COLUMN DESC) as seqnum
FROM TABLE1 t
WHERE KEY_COLUMN = 'NJCRF' AND
TYPE_COLUMN IN ('SCORE1', 'SCORE2', 'SCORE3') AND
DATE_EFFECTIVE_COLUMN <= '2016-09-17'
) t
WHERE seqnum = 1;
If you want three records per type, just use seqnum <= 3.
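To illustrate how the seqnum cutoff controls the number of rows per type, here is a small sketch using Python's sqlite3 module instead of DB2 (SQLite 3.25 or later assumed). The score column and all sample rows are hypothetical; the column names otherwise follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (key_column TEXT, type_column TEXT, "
    "date_effective_column TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        ("NJCRF", "SCORE1", "2016-09-01", 10),
        ("NJCRF", "SCORE1", "2016-09-15", 11),
        ("NJCRF", "SCORE1", "2016-09-20", 12),  # after the cutoff date
        ("NJCRF", "SCORE2", "2016-08-30", 20),
        ("NJCRF", "SCORE3", "2016-09-17", 30),
        ("NJCRF", "SCORE3", "2016-09-10", 31),
        ("OTHER", "SCORE1", "2016-09-16", 99),  # different key
    ],
)

def closest(n):
    """Return up to n rows per type, nearest to the effective date first."""
    return conn.execute("""
        SELECT type_column, date_effective_column, score
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY type_column
                                      ORDER BY date_effective_column DESC) AS seqnum
            FROM table1 t
            WHERE key_column = 'NJCRF'
              AND type_column IN ('SCORE1', 'SCORE2', 'SCORE3')
              AND date_effective_column <= '2016-09-17'
        ) ranked
        WHERE seqnum <= ?
        ORDER BY type_column, seqnum
    """, (n,)).fetchall()

one_per_type = closest(1)
up_to_three = closest(3)
print(one_per_type)
print(up_to_three)
```

A type with fewer qualifying rows than the cutoff simply returns what it has, as SCORE2 does here.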
I like ROW_NUMBER() for this. You want to partition by TYPE, which will start the row count over for each type, then order by DATE_EFFECTIVE desc, and take only the highest date (the first row):
SELECT *
FROM (
SELECT t.*,
ROW_NUMBER() over (PARTITION BY TYPE_COLUMN ORDER BY DATE_EFFECTIVE_COLUMN desc) RN
FROM TABLE1 t
WHERE KEY_COLUMN = 'NJCRF'
AND TYPE_COLUMN IN ('SCORE1', 'SCORE2', 'SCORE3')
AND DATE_EFFECTIVE_COLUMN <= '2016-09-17'
) A
WHERE RN = 1