Expand on limit SQL query results use lag() window function

Expand on limit SQL query results use lag() window function - sql

I have an table with columns id, score, parent_id ordered by the score. I have ask one scenario in here.
The different from previous question is that the parent_id might show up on multiple rows not necessary sequence rows. The updated table:
id
score
parent_id
5859
10
5859
2157043
9
5859
21064154
8
21064154
51992
7
51992
34384599
6
51992
1675761
5
5859
3465729
4
3465729
401202
3
401203
1817458
2
1817458
I want to query all columns from this table with the same order but limit results at least 5 rows to meet the unique parent_id number equal to 5. As result, the parent_id only contains 5 ids: 5859, 21064154, 51992, 3465729, 401203
Expected Results like:
id
score
parent_id
5859
10
5859
2157043
9
5859
21064154
8
21064154
51992
7
51992
34384599
6
51992
1675761
5
5859
3465729
4
3465729
401202
3
401203
The solution using lag only works for the following row has same value of parent_id. If use Java, we could use a SET to store the parent_id and keep count the unique parent_id, but how do we write in SQL?
select id, score, parent_id
from (
select *, Sum(diff) over(order by score desc)seq
from (
select *,
case when Lag(parent_id) over(order by score desc) = parent_id then 0 else 1 end diff
from t
)t
)d
where seq <= 5
order by score desc;

Perhaps this will work for you, a refactor of your existing query but an alternative to lag using exists which caters for non-sequential rows, give this a try:
select id, score, parent_id
from (
select *, Sum(keep) over (order by score desc) seq
from (
select *,
case when exists (
select * from t t2
where t2.parent_id = t.parent_id and t2.score > t.score
) then 0 else 1 end keep
from t
)t
)s
where seq <= 5
order by score desc;

One method to consider might be:
SELECT t1.*
FROM t t1
WHERE
(SELECT count(distinct parent_id)
FROM t t2
WHERE t2.score >= t1.score) <= 5
You can see a Fiddle here. The WHERE clause is simply counting the number of distinct parent_ids with a score >= the current row.

Related

How do i select all columns, plus the result of the sum

I have this select:
"Select * from table" that return:
Id
Value
1
1
1
1
2
10
2
10
My goal is create a sum from each Value group by id like this:
Id
Value
Sum
1
1
2
1
1
2
2
10
20
2
10
20
I Have tried ways like:
SELECT Id,Value, (SELECT SUM(Value) FROM Table V2 WHERE V2.Id= V.Id GROUP BY IDRNC ) FROM Table v;
But the is not grouping by id.
Id
Value
Sum
1
1
1
1
1
1
2
10
10
2
10
10

Aggregation aggregates rows, reducing the number of records in the output. In this case you want to apply the result of a computation to each of your records, task carried out by the corresponding window function.
SELECT table.*, SUM(Value) OVER(PARTITION BY Id) AS sum_
FROM table
Check the demo here.

Your attempt looks correct.
Can you try the below query :
It works for me :
SELECT Id, Value,
(SELECT SUM(Value) FROM Table V2 WHERE V2.Id= V.Id GROUP BY ID) as sum
FROM Table v;

You can do it using inner join to join with selection grouped by id :
select t.*, sum
from _table t
inner join (
select id, sum(Value) as sum
from _table
group by id
) as s on s.id = t.id
You can check it here

Your select is ok if you adjust it just a little:
SELECT Id,Value, (SELECT SUM(Value) FROM Table V2 WHERE V2.Id= V.Id GROUP BY IDRNC ) FROM Table v;
GROUP BY IDRNC is a mistake and should be GROUP BY ID
you should give an alias to a sum column ...
subquery selecting the sum does not have to have self table alias to be compared with outer query that has one (this is not a mistake - works either way)
Test:
WITH
a_table (ID, VALUE) AS
(
Select 1, 1 From Dual Union All
Select 1, 1 From Dual Union All
Select 2, 10 From Dual Union All
Select 2, 10 From Dual
)
SELECT ID, VALUE, (SELECT SUM(VALUE) FROM a_table WHERE ID = v.ID GROUP BY ID) "ID_SUM" FROM a_table v;
ID VALUE ID_SUM
---------- ---------- ----------
1 1 2
1 1 2
2 10 20
2 10 20

Return distinct results that appear more than once

I have the following data:
ID Site
2 NULL
2 32
3 6
4 7
8 12
8 13
9 14
9 14
Result should be:
ID Site
2 NULL
2 32
8 12
8 13
Note that the result find unique combinations of ID and Site that repeat more than once for a given ID.
I did the following query but does not return the result:
select distinct id, site
from Table1
group by id, site
having count(*) > 1
order by id

SELECT
ID,
site
FROM table1
WHERE ID IN (
SELECT ID
FROM (
SELECT ID ,site
FROM table1
GROUP BY ID ,site
) x
GROUP BY ID
HAVING count(*)>1
)
See: DBFIDDLE
The SELECT ID, site FROM table1 GROUP BY ID, site will select the distinct values.
Then, using HAVING count(*) > 1, only the IDs that appear more than once are filtered.
P.S. You should try to avoid using DISTINCT and GROUP BY in one query. It makes life so much more complicated when you do that ... 😉

One way to do it is to do the select distinct in a CTE, then use the count window function to get the desired result:
with u as (
select distinct *
from Table1
), v as (
select *
, count(*) over(partition by ID) as cnt
from u
)
select ID, Site
from v
where cnt > 1;
Fiddle

SQL query to find counts of numbers in running total

Suppose the table has 1 column ID and the values are as below:
ID
5
5
5
6
5
5
6
6
the output should be
ID count
5 3
6 1
5 2
6 2
How can we do that in a single SQL query.

If you want to find the Total count of the Records you have you can write like
select count(*) from database_name order by column_name;

In relational databases data in the table has no any order, see this: https://en.wikipedia.org/wiki/Table_(database)
the database system does not guarantee any ordering of the rows unless
an ORDER BY clause is specified in the SELECT statement that queries
the table.
therefore, in order to get desired results, you must have an additional colum in the table that defines an order of rows (and can by used in ORDER BY clause).
In the below examle cn column defines such an order:
select * from tab123 ORDER BY rn;
RN ID
---------- -------
1 5
2 5
3 5
4 6
5 5
6 5
7 6
8 6
Starting from Oracle version 12c new MATCH_REGOGNIZE clause can be used:
select * from tab123
match_recognize(
order by rn
measures
strt.id as id,
count(*) as cnt
one row per match
after match skip past last row
pattern( strt ss* )
define ss as ss.id = prev( ss.id )
);
On earlier versions that support windows function (Oracle 10 and above) you can use two windows functions: LAG ... over and SUM ... over, in this way
select max( id ) as id, count(*) as cnt
FROM (
select id, sum( xxx ) over (order by rn ) as yyy
from (
select t.*,
case lag( id ) over (order by rn )
when id then 0 else 1 end as xxx
from tab123 t
)
)
GROUP BY yyy
ORDER BY yyy;

Oracle Nested Grouping

The question is: For each day, list the User ID who has read the most number of messages.
user_id msgID read_date
1 1 10
1 2 10
2 2 10
2 2 23
3 2 23
I believe the date is an outer group and user_id is an inner group, but how to do group nesting in sql? Or somehow avoid this?

This is a task for a Window Function:
select *
from
(
select user_id, read_date, count(*) as cnt,
rank()
over (partition by read_date -- each day
order by count(*) desc) as rnk -- maximum number
from tab
group by user_id, read_date
) dt
where rnk = 1
This might return multiple users for one with the same maximum count, if you want just one (randomly) switch to ROW_NUMBER

select user_id
from
(
select user_id,count(msgID)
from table
group by read_date
)
where rownum <= 1;

Second maximum and minimum values

Given a table with multiple rows of an int field and the same identifier, is it possible to return the 2nd maximum and 2nd minimum value from the table.
A table consists of
ID | number
------------------------
1 | 10
1 | 11
1 | 13
1 | 14
1 | 15
1 | 16
Final Result would be
ID | nMin | nMax
--------------------------------
1 | 11 | 15

You can use row_number to assign a ranking per ID. Then you can group by id and pick the rows with the ranking you're after. The following example picks the second lowest and third highest :
select id
, max(case when rnAsc = 2 then number end) as SecondLowest
, max(case when rnDesc = 3 then number end) as ThirdHighest
from (
select ID
, row_number() over (partition by ID order by number) as rnAsc
, row_number() over (partition by ID order by number desc) as rnDesc
) as SubQueryAlias
group by
id
The max is just to pick out the one non-null value; you can replace it with min or even avg and it would not affect the outcome.

This will work, but see caveats:
SELECT Id, number
INTO #T
FROM (
SELECT 1 ID, 10 number
UNION
SELECT 1 ID, 10 number
UNION
SELECT 1 ID, 11 number
UNION
SELECT 1 ID, 13 number
UNION
SELECT 1 ID, 14 number
UNION
SELECT 1 ID, 15 number
UNION
SELECT 1 ID, 16 number
) U;
WITH EX AS (
SELECT Id, MIN(number) MinNumber, MAX(number) MaxNumber
FROM #T
GROUP BY Id
)
SELECT #T.Id, MIN(number) nMin, MAX(number) nMax
FROM #T INNER JOIN
EX ON #T.Id = EX.Id
WHERE #T.number <> MinNumber AND #T.number <> MaxNumber
GROUP BY #T.Id
DROP TABLE #T;
If you have two MAX values that are the same value, this will not pick them up. So depending on how your data is presented you could be losing the proper result.

You could select the next minimum value by using the following method:
SELECT MAX(Number)
FROM
(
SELECT top 2 (Number)
FROM table1 t1
WHERE ID = {MyNumber}
order by Number
)a
It only works if you can restrict the inner query with a where clause

This would be a better way. I quickly put this together, but if you can combine the two queries, you will get exactly what you were looking for.
select *
from
(
select
myID,
myNumber,
row_number() over (order by myID) as myRowNumber
from MyTable
) x
where x.myRowNumber = 2
select *
from
(
select
myID,
myNumber,
row_number() over (order by myID desc) as myRowNumber
from MyTable
) y
where x.myRowNumber = 2

let the table name be tblName.
select max(number) from tblName where number not in (select max(number) from tblName);
same for min, just replace max with min.

As I myself learned just today the solution is to use LIMIT. You order the results so that the highest values are on top and limit the result to 2. Then you select that subselect and order it the other way round and only take the first one.
SELECT somefield FROM (
SELECT somefield from table
ORDER BY somefield DESC LIMIT 2)
ORDER BY somefield ASC LIMIT 1

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Expand on limit SQL query results use lag() window function - sql

One method to consider might be: SELECT t1.* FROM t t1 WHERE (SELECT count(distinct parent_id) FROM t t2 WHERE t2.score >= t1.score) <= 5 You can see a Fiddle here. The WHERE clause is simply counting the number of distinct parent_ids with a score >= the current row.

Related

How do i select all columns, plus the result of the sum

Return distinct results that appear more than once

SQL query to find counts of numbers in running total

Oracle Nested Grouping

Second maximum and minimum values

Categories

Resources