Limit result set in SQL window function

Assume I would like to rewrite the following aggregate query
select id, max(hittime)
from status
group by id
using an aggregate windowing function like
select id, max(hittime) over(partition by id order by hittime desc) from status
How can I specify that I am only interested in the first result within the partition?
EDIT: I was thinking that there might be a solution with [ RANGE | ROWS ] BETWEEN frame_start AND frame_end. What if I want to get not only max(hittime) but also the second, third, ...?

I think what you need is a ranking function, either ROW_NUMBER or DENSE_RANK depending on how you want to handle ties.
select id, hittime
from (
select id, hittime,
dense_rank() over(partition by id order by hittime desc) as ranking
from status
) as x
where ranking = 1; --to get max hittime
--where ranking <=2; --max and second largest
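To see how the choice of ranking function affects ties, here is a minimal sketch that runs the query above against an in-memory SQLite database from Python. The rows are made up for illustration, the helper name top_n is hypothetical, and it assumes the bundled SQLite is 3.25 or newer (the first version with window functions).

```python
import sqlite3

# In-memory database with a hypothetical status table (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status (id INTEGER, hittime INTEGER)")
conn.executemany(
    "INSERT INTO status VALUES (?, ?)",
    [(1, 10), (1, 30), (1, 30), (1, 20), (2, 5), (2, 7)],
)


def top_n(rank_fn, n):
    """Return (id, hittime) rows whose rank within their id is <= n."""
    # rank_fn is interpolated only from trusted literals below.
    sql = f"""
        SELECT id, hittime
        FROM (
            SELECT id, hittime,
                   {rank_fn}() OVER (PARTITION BY id ORDER BY hittime DESC) AS ranking
            FROM status
        ) AS x
        WHERE ranking <= ?
        ORDER BY id, hittime DESC
    """
    return conn.execute(sql, (n,)).fetchall()


top1_dense = top_n("dense_rank", 1)   # every row tied for the maximum is kept
top1_rownum = top_n("row_number", 1)  # exactly one row per id
top2_dense = top_n("dense_rank", 2)   # maximum and second-largest value
```

With this data, id 1 has two rows tied at hittime 30: dense_rank() returns both of them for ranking = 1, while row_number() returns only one.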

Use the DISTINCT keyword:
select DISTINCT id, max(hittime) over(partition by id order by hittime desc) from status

Related

Is there a way to group rankings in SQL Teradata?

I am trying to get the ranking or grouping to count like in the custom_ranking column:
I want the rank to count as in the custom_ranking column, but everything I try counts it as in the current_ranking column.
I am currently using this:
,row_number() OVER (partition by custID, propID ORDER BY trans_type desc, record_date desc) AS RANKING
Based on your sample data, this would be:
dense_rank() over (partition by custid order by propid)
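The sample data from the question is not reproduced here, so the following is only a sketch with made-up rows. It compares the asker's row_number() expression with the suggested dense_rank(), using Python's sqlite3 module (SQLite 3.25 or newer assumed). The table name trans and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; only the column names come from the question.
conn.execute(
    "CREATE TABLE trans (custID INTEGER, propID INTEGER, "
    "trans_type TEXT, record_date TEXT)"
)
conn.executemany(
    "INSERT INTO trans VALUES (?, ?, ?, ?)",
    [
        (1, 100, "A", "2020-01-01"),
        (1, 100, "B", "2020-01-02"),
        (1, 200, "A", "2020-01-03"),
        (2, 300, "A", "2020-01-01"),
        (2, 300, "A", "2020-01-05"),
        (2, 400, "B", "2020-01-02"),
    ],
)

result = conn.execute(
    """
    SELECT custID, propID,
           -- restarts at 1 for every (custID, propID) pair
           row_number() OVER (PARTITION BY custID, propID
                              ORDER BY trans_type DESC, record_date DESC) AS current_ranking,
           -- one number per distinct propID within a custID
           dense_rank() OVER (PARTITION BY custID ORDER BY propID) AS custom_ranking
    FROM trans
    ORDER BY custID, propID, current_ranking
    """
).fetchall()
```

Here custom_ranking stays the same for all rows of one property and increases by one for the next property of the same customer, which is the grouping behaviour the answer is aiming for.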

Generate custom group ranking in sql

As posted, I am trying to generate a group ranking based on the Is_True_Mode column. Each row with a 1 should start a new group, and that group should continue until the next 1 comes. I would like to produce the expected output in SQL. In the expected output, rows are grouped based on the Is_True_Mode column. The regular ranking is shown for reference (the ordering by that ranking should be kept).
You can identify the groups using a cumulative sum. Then you can use row_number() to enumerate the rows:
select t.*,
row_number() over (partition by grp order by regularranking) as expected_output
from (select t.*,
sum(is_true_mode) over (order by regularranking) as grp
from t
) t;
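A small sketch of the cumulative-sum idea, run through Python's sqlite3 module (SQLite 3.25 or newer assumed). The eight rows are invented for illustration; only the column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (regularranking INTEGER, is_true_mode INTEGER)")
# Made-up data: a 1 marks the start of a new group.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 0), (4, 1), (5, 0), (6, 1), (7, 0), (8, 0)],
)

rows = conn.execute(
    """
    SELECT regularranking, grp,
           row_number() OVER (PARTITION BY grp ORDER BY regularranking) AS expected_output
    FROM (SELECT t.*,
                 -- running total of the flags: increases by 1 at every group start
                 sum(is_true_mode) OVER (ORDER BY regularranking) AS grp
          FROM t
         ) AS s
    ORDER BY regularranking
    """
).fetchall()

group_ids = [r[1] for r in rows]
expected_output = [r[2] for r in rows]
```

The running sum gives every row the number of 1s seen so far, so all rows between two 1s share the same grp value, and row_number() then restarts inside each group.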

SQL Finding five largest numbers instead of one Max in a table

I have a table and I need to run a query that contains some aggregate functions such as maximum, average, standard deviation, ...
but instead of a single maximum it should return the 5 largest numbers.
The simplified query is something like this:
SELECT OSI_KEY , MAX(VALUE) , AVG(VALUE) , STDDEV(VALUE), variance(VALUE)
FROM DATA_VALUES_5MIN_6_2013
GROUP BY OSI_KEY
ORDER BY OSI_KEY
and I need some Magical ;) Query like this:
SELECT OSI_KEY , MAX1(VALUE) ,MAX2(VALUE) ,MAX3(VALUE) ,MAX4(VALUE) , MAX5(VALUE) ,
AVG(VALUE) , STDDEV(VALUE), variance(VALUE)
FROM DATA_VALUES_5MIN_6_2013
GROUP BY OSI_KEY
ORDER BY OSI_KEY
I appreciate your considerations.
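Oracle has an NTH_VALUE() function. Unfortunately, it is only an analytic function and not an aggregate function. This leads to the strange construct of SELECT DISTINCT with a bunch of analytic functions. The explicit frame on each NTH_VALUE() matters: with an ORDER BY and the default frame, the function only sees rows up to the current one, so the first rows of each partition would return NULL and DISTINCT would no longer collapse each OSI_KEY to one row:
SELECT DISTINCT OSI_KEY,
MAX(VALUE) OVER (PARTITION BY OSI_KEY),
NTH_VALUE(VALUE, 2) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_2,
NTH_VALUE(VALUE, 3) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_3,
NTH_VALUE(VALUE, 4) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_4,
NTH_VALUE(VALUE, 5) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_5,
AVG(VALUE) OVER (PARTITION BY OSI_KEY),
STDDEV(VALUE) OVER (PARTITION BY OSI_KEY),
variance(VALUE) OVER (PARTITION BY OSI_KEY)
FROM DATA_VALUES_5MIN_6_2013
ORDER BY OSI_KEY;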
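NTH_VALUE() is sensitive to the window frame, which is easy to overlook. The sketch below uses SQLite's nth_value() from Python (SQLite 3.25 or newer assumed) rather than Oracle, with a hypothetical table data_values and made-up rows. SQLite uses the same default frame when ORDER BY is present, so it should illustrate the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_values (osi_key INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO data_values VALUES (?, ?)", [(1, 10), (1, 30), (1, 20)]
)

# Default frame: each row only sees rows from the partition start up to itself,
# so the row holding the maximum has no "second value" yet and yields NULL.
default_frame = conn.execute(
    """
    SELECT DISTINCT osi_key,
           nth_value(value, 2) OVER (PARTITION BY osi_key ORDER BY value DESC) AS max_2
    FROM data_values
    """
).fetchall()

# Explicit full-partition frame: every row sees all rows of its partition.
full_frame = conn.execute(
    """
    SELECT DISTINCT osi_key,
           nth_value(value, 2) OVER (PARTITION BY osi_key ORDER BY value DESC
                                     ROWS BETWEEN UNBOUNDED PRECEDING
                                              AND UNBOUNDED FOLLOWING) AS max_2
    FROM data_values
    """
).fetchall()
```

With the default frame the DISTINCT leaves two rows for the key (one with NULL, one with 20); with the full frame there is a single row holding the second-largest value.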
You can also do this using conditional aggregation, with a row_number() or dense_rank() in a subquery.
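A rough sketch of what the conditional aggregation could look like, run against SQLite from Python (SQLite 3.25 or newer assumed). The table name data_values and its rows are made up, and STDDEV and VARIANCE are left out because SQLite does not ship them; on Oracle they could sit next to AVG in the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_values (osi_key INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO data_values VALUES (?, ?)",
    [(1, 10), (1, 50), (1, 30), (1, 20), (1, 40), (1, 60), (2, 7), (2, 3)],
)

top5 = conn.execute(
    """
    SELECT osi_key,
           -- pick the value whose position is 1..5; NULL if the key has fewer rows
           MAX(CASE WHEN seqnum = 1 THEN value END) AS max_1,
           MAX(CASE WHEN seqnum = 2 THEN value END) AS max_2,
           MAX(CASE WHEN seqnum = 3 THEN value END) AS max_3,
           MAX(CASE WHEN seqnum = 4 THEN value END) AS max_4,
           MAX(CASE WHEN seqnum = 5 THEN value END) AS max_5,
           AVG(value) AS avg_value
    FROM (SELECT osi_key, value,
                 row_number() OVER (PARTITION BY osi_key ORDER BY value DESC) AS seqnum
          FROM data_values
         ) AS ranked
    GROUP BY osi_key
    ORDER BY osi_key
    """
).fetchall()
```

Using row_number() counts duplicate values separately; swapping in dense_rank() would return the five largest distinct values instead.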
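-- Note: this returns the five OSI_KEYs with the largest maximum, one row each,
-- not the five largest values within each OSI_KEY.
SELECT OSI_KEY, MaxValue FROM (
SELECT OSI_KEY, MAX(VALUE) AS MaxValue FROM DATA_VALUES_5MIN_6_2013 GROUP BY OSI_KEY
)
ORDER BY MaxValue DESC
FETCH FIRST 5 ROWS ONLY;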

Aggregate function like MAX for most common cell in column?

Grouping by the highest number in a column worked great with MAX(), but what if I would like to get the value that is most common?
For example:
ID
100
250
250
300
200
250
So I would like to group by ID and, instead of getting the lowest (MIN) or highest (MAX) number, get the most common one (that would be 250, because it occurs three times).
Is there an easy way in SQL Server 2012 or am I forced to add a second SELECT where I COUNT(DISTINCT ID) and add that somehow to my first SELECT statement?
You can use dense_rank to return all the ids with the highest count. This also handles ties for the highest count.
select id from
(select id, dense_rank() over(order by count(*) desc) as rnk from tablename group by id) t
where rnk = 1
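As a quick check of the tie handling, here is a sketch using Python's sqlite3 module (SQLite 3.25 or newer assumed). The helper name modes is hypothetical, and the counts are computed in an inner subquery before ranking, which is a slight restructuring of the query above to keep it portable.

```python
import sqlite3


def modes(values):
    """Return every id that occurs most often, in ascending order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (id INTEGER)")
    conn.executemany("INSERT INTO tablename VALUES (?)", [(v,) for v in values])
    sql = """
        SELECT id
        FROM (SELECT id, dense_rank() OVER (ORDER BY cnt DESC) AS rnk
              FROM (SELECT id, count(*) AS cnt
                    FROM tablename
                    GROUP BY id) AS counts
             ) AS t
        WHERE rnk = 1
        ORDER BY id
    """
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


single_mode = modes([100, 250, 250, 300, 200, 250])  # data from the question
tied_modes = modes([100, 100, 250, 250, 300])        # made-up data with a tie
```

For the question's data only 250 comes back; for the second list both 100 and 250 are returned because they share the highest count.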
A simple way to do what you want uses top and order by:
SELECT top 1 id
FROM t
GROUP BY id
ORDER BY COUNT(*) DESC;
This is a statistic called the mode. Getting the mode and max is a bit challenging in SQL Server. I would approach it as:
WITH cte AS (
SELECT t.id, COUNT(*) AS cnt,
row_number() OVER (ORDER BY COUNT(*) DESC) AS seqnum
FROM t
GROUP BY id
)
SELECT MAX(id) AS themax, MAX(CASE WHEN seqnum = 1 THEN id END) AS MODE
FROM cte;

Select Record with Maximum Creation Date

Let us say that I have a database table with the following records:
CACHE_ID  BUSINESS_DATE  CREATED_DATE
1183      13-09-06       13-09-19 16:38:59.336000000
1169      13-09-06       13-09-24 17:19:05.762000000
1152      13-09-06       13-09-17 14:18:59.336000000
1173      13-09-05       13-09-19 15:48:59.136000000
1139      13-09-05       13-09-24 12:59:05.263000000
1152      13-09-05       13-09-27 13:28:59.332000000
I need to write a query that will return the CACHE_ID for the record which has the most recent CREATED_DATE.
I am having trouble crafting such a query. I can do a GROUP BY based on BUSINESS_DATE and get the MAX(CREATED_DATE)...of course, I won't have the CACHE_ID of the record.
Could someone help with this?
Not positive on oracle syntax, but use the ROW_NUMBER() function:
SELECT BUSINESS_DATE, CACHE_ID
FROM (SELECT t.*,
ROW_NUMBER() OVER(PARTITION BY BUSINESS_DATE ORDER BY CREATED_DATE DESC) RN
FROM YourTable t
)sub
WHERE RN = 1
The ROW_NUMBER() function assigns a number to each row. PARTITION BY is optional, but is used to start the numbering over for each value in that group, i.e. if you PARTITION BY BUSINESS_DATE, then for each unique BUSINESS_DATE value the numbering starts over at 1. ORDER BY defines the order in which the rows are numbered, and is required in the ROW_NUMBER() function.
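To try this out without an Oracle instance, the following sketch loads the rows from the question into an in-memory SQLite database via Python (SQLite 3.25 or newer assumed). The dates are stored as plain text, which only sorts correctly here because they share one fixed-width format; a real table would use a proper timestamp column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (CACHE_ID INTEGER, BUSINESS_DATE TEXT, CREATED_DATE TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1183, "13-09-06", "13-09-19 16:38:59.336000000"),
        (1169, "13-09-06", "13-09-24 17:19:05.762000000"),
        (1152, "13-09-06", "13-09-17 14:18:59.336000000"),
        (1173, "13-09-05", "13-09-19 15:48:59.136000000"),
        (1139, "13-09-05", "13-09-24 12:59:05.263000000"),
        (1152, "13-09-05", "13-09-27 13:28:59.332000000"),
    ],
)

# Most recent record per BUSINESS_DATE: numbering restarts for each date.
latest_per_date = conn.execute(
    """
    SELECT BUSINESS_DATE, CACHE_ID
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY BUSINESS_DATE
                                    ORDER BY CREATED_DATE DESC) AS RN
          FROM YourTable t
         ) sub
    WHERE RN = 1
    ORDER BY BUSINESS_DATE
    """
).fetchall()

# Most recent record overall: same idea without PARTITION BY.
latest_overall = conn.execute(
    """
    SELECT CACHE_ID
    FROM (SELECT t.*, ROW_NUMBER() OVER (ORDER BY CREATED_DATE DESC) AS RN
          FROM YourTable t
         ) sub
    WHERE RN = 1
    """
).fetchall()
```

Dropping the PARTITION BY turns the per-date query into one that numbers the whole table, so RN = 1 is then the single most recent record.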
You want to group on business date, and get the CACHE_ID with the most current created date? Use something like this:
select yt.CACHE_ID, yt.BUSINESS_DATE, yt.CREATED_DATE
from YourTable yt
where yt.CREATED_DATE = (select max(yt1.CREATED_DATE)
from YourTable yt1
where yt1.BUSINESS_DATE = yt.BUSINESS_DATE)
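Not sure of the exact syntax, but conceptually, can't you just sort by CREATED_DATE descending and take the first one? Note that TOP is SQL Server syntax; on Oracle 12c or later the equivalent is FETCH FIRST 1 ROW ONLY after the ORDER BY.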
Across all records -
select top 1 CACHE_ID from YourTable order by CREATED_DATE desc
For each BUSINESS_DATE -
select distinct
a.BUSINESS_DATE,
(
select top 1 b.CACHE_ID
from YourTable b where a.BUSINESS_DATE = b.BUSINESS_DATE
order by b.CREATED_DATE desc
) as Latest_CACHE_ID
from YourTable a