SQL. Limited query

I have a view and I request data from it with a simple query like:
SELECT * FROM my_view WHERE id IN (...)
In general on "normal data" it should return 10-100 entries per id, but for some ids, it may return more than 1,000,000 entries!
I would like to limit my query so that it would not return more than 100 entries per id, but I really have no idea other than running a query for each id separately.

Use row_number():
select v.*
from (select v.*, row_number() over (partition by id order by id) as seqnum
from my_view v
where id in (...)
) v
where seqnum <= 100;
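
The per-id cap can be tried out on a small in-memory database. The following is only a sketch: it uses Python's sqlite3 module (SQLite 3.25 or later is needed for window functions), and the `payload` column, the sample data, and the helper name `limit_rows_per_id` are assumptions made for illustration, not part of the original question.

```python
import sqlite3

def limit_rows_per_id(conn, ids, max_rows):
    """Return at most max_rows rows per id from my_view (hypothetical schema)."""
    placeholders = ",".join("?" for _ in ids)
    sql = f"""
        SELECT id, payload
        FROM (SELECT v.*,
                     ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS seqnum
              FROM my_view v
              WHERE id IN ({placeholders})) AS v
        WHERE seqnum <= ?
    """
    return conn.execute(sql, [*ids, max_rows]).fetchall()

# A plain table stands in for the view; id 2 plays the "too many rows" case.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_view (id INTEGER, payload TEXT)")
rows = [(1, f"a{i}") for i in range(5)] + [(2, f"b{i}") for i in range(500)]
conn.executemany("INSERT INTO my_view VALUES (?, ?)", rows)

result = limit_rows_per_id(conn, [1, 2], 100)
```

Because the window is ordered by the partitioning column itself, which 100 rows survive for a large id is arbitrary. Ordering by a timestamp or another column inside `OVER (...)` would make the choice deterministic.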

Related

DB2 Using max aggregate function

Can I rewrite this select to retrieve the highest value without using an aggregate function?
Select *
From A
Where Id= 123
and value = (
select max(value)
from A inner
where inner.id = 123 )

If you are certain that only one record would have the max value, or, if there are ties you don't care which gets returned, then you may use this limit query:
SELECT *
FROM A
WHERE Id = 123
ORDER BY value DESC
LIMIT 1;
If this doesn't meet your expectations, then stick with your current approach. Note that you could also use RANK() here:
WITH cte AS (
SELECT A.*, RANK() OVER (ORDER BY value DESC) AS rnk
FROM A
WHERE Id = 123
)
SELECT *
FROM cte
WHERE rnk = 1;
But like your version, the above rank query also requires a subquery.
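
The difference between the two variants only shows up when several rows tie for the maximum. Here is a minimal sketch using Python's sqlite3 module as a stand-in for DB2 (an assumption; the sample rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, value INTEGER)")
# Two rows for id 123 share the maximum value 9.
conn.executemany("INSERT INTO A VALUES (?, ?)",
                 [(123, 5), (123, 9), (123, 9), (456, 20)])

# ORDER BY ... LIMIT 1 returns exactly one row, even when there is a tie.
limit_rows = conn.execute(
    "SELECT id, value FROM A WHERE id = 123 ORDER BY value DESC LIMIT 1"
).fetchall()

# RANK() assigns 1 to every row that ties for the maximum, so all are kept.
rank_rows = conn.execute("""
    WITH cte AS (
        SELECT A.*, RANK() OVER (ORDER BY value DESC) AS rnk
        FROM A
        WHERE id = 123
    )
    SELECT id, value FROM cte WHERE rnk = 1
""").fetchall()
```

With this data, `limit_rows` holds a single row while `rank_rows` holds both tied rows, which matches the behaviour of the original `max(value)` subquery.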

How to Pass Query Answer into Limit Function Impala

I am attempting to sample 20% of a table in Impala. I have heard that the built-in Impala sampling function has issues.
Is there a way to pass a subquery to the Impala LIMIT clause to sample n percent of the entire table?
I have something like this:
select *
from table_a
order by rand()
limit (
select round(count(distinct ids) * .2, 0)
from table_a
)
The subquery returns 20% of the number of records.

I'm not sure if Impala has specific sampling logic (some databases do). But you can use window functions:
select a.*
from (select a.*,
row_number() over (order by rand()) as seqnum,
count(*) over () as cnt
from table_a a
) a
where seqnum <= cnt * 0.2;
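
The same window-function idea can be checked locally. This sketch runs against SQLite through Python's sqlite3 module rather than Impala, so it uses SQLite's `random()` in place of `rand()`; the helper name `sample_fraction` and the 50-row test table are assumptions for illustration.

```python
import sqlite3

def sample_fraction(conn, fraction):
    """Return a random sample holding the given fraction of table_a's rows."""
    sql = """
        SELECT ids
        FROM (SELECT a.*,
                     ROW_NUMBER() OVER (ORDER BY random()) AS seqnum,
                     COUNT(*) OVER () AS cnt
              FROM table_a a) AS a
        WHERE seqnum <= cnt * ?
    """
    return [row[0] for row in conn.execute(sql, (fraction,))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (ids INTEGER)")
conn.executemany("INSERT INTO table_a VALUES (?)", [(i,) for i in range(50)])

sample = sample_fraction(conn, 0.2)
```

Every row gets a random position and the total row count, so the filter keeps the first 20% of a random shuffle. Note that this sorts the whole table, which may be expensive on large inputs.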

Aggregate function like MAX for most common cell in column?

Grouping by the highest number in a column worked great with MAX(), but what if I would like to get the value that is most common?
For example:
ID
100
250
250
300
200
250
So I would like to group by ID and, instead of getting the lowest (MIN) or highest (MAX) number, get the most common one (that would be 250, because it appears 3 times).
Is there an easy way in SQL Server 2012 or am I forced to add a second SELECT where I COUNT(DISTINCT ID) and add that somehow to my first SELECT statement?

You can use dense_rank to return all the ids with the highest count. This also handles ties for the highest count.
select id from
(select id, dense_rank() over(order by count(*) desc) as rnk from tablename group by id) t
where rnk = 1

A simple way to do what you want uses top and order by:
SELECT top 1 id
FROM t
GROUP BY id
ORDER BY COUNT(*) DESC;
This is a statistic called the mode. Getting the mode and the max in the same query is a bit more challenging in SQL Server. I would approach it as:
WITH cte AS (
SELECT t.id, COUNT(*) AS cnt,
row_number() OVER (ORDER BY COUNT(*) DESC) AS seqnum
FROM t
GROUP BY id
)
SELECT MAX(id) AS themax, MAX(CASE WHEN seqnum = 1 THEN id END) AS MODE
FROM cte;
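
The two approaches above differ on ties: `top 1` and `row_number()` keep one id, while `dense_rank()` keeps every id that shares the highest count. Below is a small sketch of the `dense_rank()` variant, run on SQLite through Python's sqlite3 module instead of SQL Server (an assumption). The counting step is moved into its own CTE purely for readability, and the helper name `modes` is hypothetical.

```python
import sqlite3

def modes(conn):
    """Return every id that shares the highest occurrence count in table t."""
    sql = """
        WITH counts AS (
            SELECT id, COUNT(*) AS cnt
            FROM t
            GROUP BY id
        )
        SELECT id
        FROM (SELECT id, DENSE_RANK() OVER (ORDER BY cnt DESC) AS rnk
              FROM counts) AS ranked
        WHERE rnk = 1
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
# The sample column from the question: 250 appears three times.
conn.executemany("INSERT INTO t VALUES (?)",
                 [(100,), (250,), (250,), (300,), (200,), (250,)])
single_mode = modes(conn)

# Two more rows make 100 tie with 250 at three occurrences each.
conn.executemany("INSERT INTO t VALUES (?)", [(100,), (100,)])
tied_modes = modes(conn)
```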

Opposite of TOP in SQL Server

I need to retrieve the last few entries from a table. I can retrieve them using:
SELECT TOP n *
FROM table
ORDER BY id DESC
I looked everywhere and that's the only answer I could find, but that way I get them in reverse order. I need them in the same order as they are in the table because it's for a messaging interface.

Use a derived table:
select id, ...
from
(
select top n id, ...
from t
order by id desc
) dt
order by id

I suggest using ROW_NUMBER() like this:
SELECT *
FROM (
SELECT
*, ROW_NUMBER() OVER (ORDER BY id DESC) AS RowNo
FROM
yourTable
) AS t
WHERE
(RowNo <= @n)
ORDER BY
id
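
Both answers follow the same pattern: pick the last n rows in descending order in an inner query, then re-sort them ascending in the outer query. A minimal sketch of that pattern follows, using SQLite through Python's sqlite3 module, where `LIMIT` stands in for SQL Server's `TOP n`. The `messages` table, its columns, and the helper name `last_n_in_order` are assumptions for illustration.

```python
import sqlite3

def last_n_in_order(conn, n):
    """Return the n rows with the highest ids, sorted ascending by id."""
    # Inner query: newest n rows. Outer query: restore ascending order.
    sql = """
        SELECT id, body
        FROM (SELECT id, body
              FROM messages
              ORDER BY id DESC
              LIMIT ?) AS dt
        ORDER BY id
    """
    return conn.execute(sql, (n,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO messages VALUES (?, ?)",
                 [(i, f"msg{i}") for i in range(1, 7)])

latest = last_n_in_order(conn, 3)
```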

Get top N records grouped by another field

I have an Oracle table with ID, SUBJECT, and PAYLOAD (CLOB). I'd like to get a listing of the top 10 records that have the largest PAYLOAD (LENGTH(PAYLOAD)), grouped by subject. So if I have 10 distinct SUBJECT values in the table, the query should return 100 rows (top 10 per subject).

Use row_number():
select t.*
from (select t.*, row_number() over (partition by subject order by length(payload) desc) as seqnum
from table t
) t
where seqnum <= 10;
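
As a rough check of the top-N-per-group logic, the query can be run on a toy table. This sketch uses SQLite through Python's sqlite3 module rather than Oracle, with a TEXT column standing in for the CLOB; the table name `docs`, the sample rows, and the helper name `top_payloads_per_subject` are assumptions.

```python
import sqlite3

def top_payloads_per_subject(conn, n):
    """Return the n longest payloads per subject as (id, subject, length)."""
    sql = """
        SELECT id, subject, LENGTH(payload) AS payload_len
        FROM (SELECT t.*,
                     ROW_NUMBER() OVER (PARTITION BY subject
                                        ORDER BY LENGTH(payload) DESC) AS seqnum
              FROM docs t) AS ranked
        WHERE seqnum <= ?
        ORDER BY subject, payload_len DESC
    """
    return conn.execute(sql, (n,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, subject TEXT, payload TEXT)")
# Subject x has payloads of length 1..5, subject y of length 1..2.
rows = [(i, "x", "p" * i) for i in range(1, 6)]
rows += [(10 + i, "y", "q" * i) for i in range(1, 3)]
conn.executemany("INSERT INTO docs VALUES (?, ?, ?)", rows)

top2 = top_payloads_per_subject(conn, 2)
```

`row_number()` breaks ties between equally long payloads arbitrarily. If tied rows should all be returned, `rank()` could be swapped in at the cost of possibly exceeding n rows per subject.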
