SQL Select top frequent records

I have the following table:
Table
+----+------+-------+
| ID | Name | Group |
+----+------+-------+
| 0 | a | 1 |
| 1 | a | 1 |
| 2 | a | 2 |
| 3 | a | 1 |
| 4 | b | 1 |
| 5 | b | 2 |
| 6 | b | 1 |
| 7 | c | 2 |
| 8 | c | 2 |
| 9 | c | 1 |
+----+------+-------+
I would like to select the top 20 distinct names from a specific group, ordered by how often each name occurs in that group. For this example, group 1 would return a, b, c (a: 3 occurrences, b: 2 occurrences, c: 1 occurrence).
Thank you.

SELECT TOP (20) [Name], COUNT(*) AS Occurrences
FROM [Table]            -- TABLE is a reserved word, so the name is bracketed
WHERE [Group] = 1
GROUP BY [Name]
ORDER BY COUNT(*) DESC;
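
To try the filter-then-count idea without a SQL Server instance, here is a minimal sketch that runs the same logic in SQLite through Python's built-in sqlite3 module. The table name `records` and the column name `grp` are hypothetical stand-ins chosen to avoid reserved words, and SQLite's `LIMIT` takes the place of `TOP`.

```python
import sqlite3


def top_names(group_id, limit=20):
    """Return (name, occurrences) for one group, most frequent first."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table/column names; the data is the question's sample.
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT, grp INTEGER)"
    )
    data = [
        (0, "a", 1), (1, "a", 1), (2, "a", 2), (3, "a", 1), (4, "b", 1),
        (5, "b", 2), (6, "b", 1), (7, "c", 2), (8, "c", 2), (9, "c", 1),
    ]
    conn.executemany("INSERT INTO records VALUES (?, ?, ?)", data)
    rows = conn.execute(
        """
        SELECT name, COUNT(*) AS occurrences
        FROM records
        WHERE grp = ?
        GROUP BY name
        ORDER BY occurrences DESC, name   -- name breaks ties deterministically
        LIMIT ?                           -- SQLite counterpart of TOP (n)
        """,
        (group_id, limit),
    ).fetchall()
    conn.close()
    return rows


print(top_names(1))
```

For group 1 this should give a with 3 occurrences, b with 2 and c with 1, matching the result described in the question. The extra `name` sort key is an addition of this sketch so that ties come back in a stable order.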

-- Counts every (name, group) pair; add WHERE [Group] = 1 to restrict to one group.
SELECT TOP (20)
    [Name], [Group], COUNT(*) AS occurrences
FROM yourtable
GROUP BY [Name], [Group]
ORDER BY COUNT(*) DESC;

SELECT
    TOP 20
    [Name],
    [Group],
    COUNT(1) AS [Count]
FROM
    MyTable
GROUP BY
    [Name],
    [Group]
ORDER BY
    [Count] DESC

Related

Selecting rows that don't have duplicates

Let's say I have the following table:
| sku | id | value | count |
|-----|----|-------|-------|
| A | 1 | 1 | 2 |
| A | 1 | 2 | 2 |
| A | 3 | 3 | 3 |
I want to select rows that don't have the same count for the same id. So my desired outcome is:
| sku | id | value | count |
|-----|----|-------|-------|
| A | 3 | 3 | 3 |
I need something that works with Postgres 10

A simple method is window functions:
select t.*
from (select t.*, count(*) over (partition by sku, id) as cnt
from t
) t
where cnt = 1;
This assumes you really mean the sku/id combination.
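
The question asks for Postgres 10, but the same window-function query can be checked quickly in SQLite. This is a minimal sketch assuming Python's sqlite3 module linked against SQLite 3.25 or later (needed for window functions). The subquery alias `counted` is a name picked for this example.

```python
import sqlite3


def rows_without_duplicates():
    """Return rows whose (sku, id) pair occurs exactly once."""
    conn = sqlite3.connect(":memory:")
    # "count" is quoted because it doubles as a function name.
    conn.execute(
        'CREATE TABLE t (sku TEXT, id INTEGER, value INTEGER, "count" INTEGER)'
    )
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?, ?)",
        [("A", 1, 1, 2), ("A", 1, 2, 2), ("A", 3, 3, 3)],
    )
    rows = conn.execute(
        """
        SELECT sku, id, value, "count"
        FROM (
            SELECT t.*, COUNT(*) OVER (PARTITION BY sku, id) AS cnt
            FROM t
        ) AS counted
        WHERE cnt = 1
        """
    ).fetchall()
    conn.close()
    return rows


print(rows_without_duplicates())
```

With the sample data only the row with id 3 survives, because the pair (A, 1) appears twice.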

How to sum rows before a condition is met in SQL

I have a table which has multiple records for the same id. Looks like this, and the rows are sorted by sequence number.
+----+--------+----------+----------+
| id | result | duration | sequence |
+----+--------+----------+----------+
| 1 | 12 | 7254 | 1 |
+----+--------+----------+----------+
| 1 | 12 | 2333 | 2 |
+----+--------+----------+----------+
| 1 | 11 | 1000 | 3 |
+----+--------+----------+----------+
| 1 | 6 | 5 | 4 |
+----+--------+----------+----------+
| 1 | 3 | 20 | 5 |
+----+--------+----------+----------+
| 2 | 1 | 230 | 1 |
+----+--------+----------+----------+
| 2 | 9 | 10 | 2 |
+----+--------+----------+----------+
| 2 | 6 | 0 | 3 |
+----+--------+----------+----------+
| 2 | 1 | 5 | 4 |
+----+--------+----------+----------+
| 2 | 12 | 3 | 5 |
+----+--------+----------+----------+
For example, for id = 1, I would like to sum the duration of all rows up to and including the row where result = 6, which is 7254 + 2333 + 1000 + 5. Likewise, for id = 2 it would be 230 + 10 + 0. Anything after the row where result = 6 is left out.
My expected output:
+----+----------+
| id | duration |
+----+----------+
| 1 | 10592 |
+----+----------+
| 2 | 240 |
+----+----------+
The sequence has to be in ascending order.
I'm not sure how I can do this in sql.
Thank you in advance!

I think you want:
select t.id, sum(t.duration)
from t
where t.sequence <= (select t2.sequence
                     from t t2
                     where t2.id = t.id and t2.result = 6
                    )
group by t.id;
In PrestoDB, I would recommend window functions:
select id, sum(duration)
from (select t.*,
             min(case when result = 6 then sequence end) over (partition by id) as sequence_6
      from t
     ) t
where sequence <= sequence_6
group by id;

You can use a simple aggregate query with a condition that uses a subquery to recover the sequence corresponding to the record whose result is 6:
SELECT t.id, SUM(t.duration) total_duration
FROM mytable t
WHERE t.sequence <= (
SELECT sequence
FROM mytable
WHERE id = t.id AND result = 6
)
GROUP BY t.id
This demo on DB Fiddle with your test data returns:
| id | total_duration |
| --- | -------------- |
| 1 | 10592 |
| 2 | 240 |
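
The correlated-subquery approach can be reproduced locally. This is a minimal sketch using SQLite through Python's sqlite3 module. The `cutoff_result` parameter and the `MIN(...)` inside the subquery are additions of this example (`MIN` guards against an id having more than one row with the cutoff result), not part of the answers here.

```python
import sqlite3


def duration_up_to(cutoff_result=6):
    """Sum duration per id up to and including the first row with the cutoff result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable "
        "(id INTEGER, result INTEGER, duration INTEGER, sequence INTEGER)"
    )
    data = [
        (1, 12, 7254, 1), (1, 12, 2333, 2), (1, 11, 1000, 3), (1, 6, 5, 4),
        (1, 3, 20, 5), (2, 1, 230, 1), (2, 9, 10, 2), (2, 6, 0, 3),
        (2, 1, 5, 4), (2, 12, 3, 5),
    ]
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", data)
    rows = conn.execute(
        """
        SELECT t.id, SUM(t.duration) AS total_duration
        FROM mytable AS t
        WHERE t.sequence <= (
            SELECT MIN(t2.sequence)      -- first row with the cutoff result
            FROM mytable AS t2
            WHERE t2.id = t.id AND t2.result = ?
        )
        GROUP BY t.id
        ORDER BY t.id
        """,
        (cutoff_result,),
    ).fetchall()
    conn.close()
    return rows


print(duration_up_to(6))
```

An id with no row matching the cutoff drops out of the result entirely, because the subquery yields NULL and the comparison is never true. Whether that is the desired behaviour depends on the data.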

Basic group by query should solve your issue
select
id,
sum(duration) duration
from t
group by id
For specific rows only:
select
id,
sum(duration) duration
from t
where id = 1
group by id
If you want to include the totals in your result set:
select id, duration, sequence from t
union all
select
id,
sum(duration) duration,
null sequence
from t
group by id

Sum all sub group last value by group

Consider the following table:
ID | ITEM | GROUP_ID | VAL | COST
---+------+----------+-----------+-------
1 | A | 1 | 1 | 12
2 | B | 1 | 2 | 12
3 | C | 1 | 3 | 12
4 | D | 1 | 4 | 13
5 | D | 1 | 5 | 12
6 | E | 2 | 1 | 17
7 | E | 2 | 2 | 10
8 | E | 2 | 3 | 11
9 | E | 2 | 4 | 12
10 | F | 2 | 5 | 15
11 | F | 2 | 6 | 13
12 | F | 2 | 7 | 11
13 | F | 2 | 8 | 12
How can I get the following result?
GROUP_ID | VAL | COST
----------+-----------+-------
1 | 15 | 48
2 | 36 | 24
VAL is the sum of VAL per GROUP_ID.
COST is the sum, per GROUP_ID, of the COST from the last row (highest ID) of each ITEM.

Use the analytic function ROW_NUMBER(), available on Postgres, Oracle, and SQL Server.
SqlFiddleDemo
WITH last_item as (
SELECT group_id, sum(cost) as sum_cost
FROM (
SELECT t.*,
ROW_NUMBER() over (partition by item order by id desc) as rn
FROM Table1 t
) as t
WHERE rn = 1
GROUP BY t.group_id
),
val_sum as (
SELECT t.group_id, SUM(val) as sum_val
FROM Table1 t
GROUP BY t.group_id
)
SELECT v.group_id, v.sum_val, l.sum_cost
FROM val_sum v
INNER JOIN last_item l
ON v.group_id = l.group_id
OUTPUT
| group_id | sum_val | sum_cost |
|----------|---------|----------|
| 1 | 15 | 48 |
| 2 | 36 | 24 |
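
For readers without access to the fiddle, the two-CTE query can be exercised in SQLite. This is a minimal sketch assuming Python's sqlite3 module with SQLite 3.25 or later for `ROW_NUMBER()`. The subquery alias `ranked` and the final `ORDER BY` are choices made for this example.

```python
import sqlite3


def group_totals():
    """Per group: sum of val, plus sum of cost taken from each item's last row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 "
        "(id INTEGER, item TEXT, group_id INTEGER, val INTEGER, cost INTEGER)"
    )
    data = [
        (1, "A", 1, 1, 12), (2, "B", 1, 2, 12), (3, "C", 1, 3, 12),
        (4, "D", 1, 4, 13), (5, "D", 1, 5, 12), (6, "E", 2, 1, 17),
        (7, "E", 2, 2, 10), (8, "E", 2, 3, 11), (9, "E", 2, 4, 12),
        (10, "F", 2, 5, 15), (11, "F", 2, 6, 13), (12, "F", 2, 7, 11),
        (13, "F", 2, 8, 12),
    ]
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", data)
    rows = conn.execute(
        """
        WITH last_item AS (
            SELECT group_id, SUM(cost) AS sum_cost
            FROM (
                SELECT group_id, cost,
                       ROW_NUMBER() OVER (PARTITION BY item ORDER BY id DESC) AS rn
                FROM table1
            ) AS ranked
            WHERE rn = 1                 -- keep only the last row of each item
            GROUP BY group_id
        ),
        val_sum AS (
            SELECT group_id, SUM(val) AS sum_val
            FROM table1
            GROUP BY group_id
        )
        SELECT v.group_id, v.sum_val, l.sum_cost
        FROM val_sum AS v
        JOIN last_item AS l ON v.group_id = l.group_id
        ORDER BY v.group_id
        """
    ).fetchall()
    conn.close()
    return rows


print(group_totals())
```

Note that partitioning by `item` alone assumes an item never appears in more than one group. If it can, partitioning by `item, group_id` is probably what is wanted.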

Try this
WITH LastRow (id)
AS (
SELECT MAX(id)
FROM TheTable
GROUP BY item, group_id
)
SELECT group_Id, SUM(val), SUM(CASE WHEN B.id IS NULL THEN 0 ELSE cost END)
FROM TheTable A
LEFT OUTER JOIN LastRow B ON A.id = B.id
GROUP BY group_id
EDIT:
SQL Fiddle Demo
Thanks @Juan Carlos Oropeza for creating the SQL Fiddle test data

group by top two results based on order

I have been trying to get this to work with some row_number, group by, top, sort of things, but I am missing some fundamental concept. I have a table like so:
+-------+-------+-------+
| name | ord | f_id |
+-------+-------+-------+
| a | 1 | 2 |
| b | 5 | 2 |
| c | 6 | 2 |
| d | 2 | 1 |
| e | 4 | 1 |
| a | 2 | 3 |
| c | 50 | 4 |
+-------+-------+-------+
And my desired output would be:
+-------+---------+--------+-------+
| f_id | ord_n | ord | name |
+-------+---------+--------+-------+
| 2 | 1 | 1 | a |
| 2 | 2 | 5 | b |
| 1 | 1 | 2 | d |
| 1 | 2 | 4 | e |
| 3 | 1 | 2 | a |
| 4 | 1 | 50 | c |
+-------+---------+--------+-------+
The data is ordered by the ord value, with at most two results per f_id. Should I write a stored procedure for this, or can I do it in plain SQL? I have experimented with some SELECT TOP subqueries, but nothing has come close.
Here are some statements to create the test table:
create table help(name varchar(255),ord tinyint,f_id tinyint);
insert into help values
('a',1,2),
('b',5,2),
('c',6,2),
('d',2,1),
('e',4,1),
('a',2,3),
('c',50,4);

You may use the RANK or DENSE_RANK functions.
select A.name, A.ord_n, A.ord, A.f_id from
(
select
RANK() OVER (partition by f_id ORDER BY ord asc) AS rnk,
ROW_NUMBER() OVER (partition by f_id ORDER BY ord asc) AS ord_n,
help.*
from help
) A where A.rnk <= 2
Sqlfiddle demo
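
As a rough check of the top-two-per-group idea, here is a minimal sketch that loads the question's test rows into SQLite via Python's sqlite3 module (SQLite 3.25 or later assumed for window functions). It filters on `ROW_NUMBER()` rather than `RANK()`, a choice made for this example so that ties in `ord` can never yield more than two rows per `f_id`. The alias `ranked` is likewise just an example name.

```python
import sqlite3


def top_two_per_group():
    """Return at most two rows per f_id, lowest ord first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE help (name TEXT, ord INTEGER, f_id INTEGER)")
    data = [
        ("a", 1, 2), ("b", 5, 2), ("c", 6, 2), ("d", 2, 1),
        ("e", 4, 1), ("a", 2, 3), ("c", 50, 4),
    ]
    conn.executemany("INSERT INTO help VALUES (?, ?, ?)", data)
    rows = conn.execute(
        """
        SELECT f_id, ord_n, ord, name
        FROM (
            SELECT name, ord, f_id,
                   ROW_NUMBER() OVER (PARTITION BY f_id ORDER BY ord) AS ord_n
            FROM help
        ) AS ranked
        WHERE ord_n <= 2
        ORDER BY f_id, ord_n
        """
    ).fetchall()
    conn.close()
    return rows


for row in top_two_per_group():
    print(row)
```

The rows come back sorted by `f_id`, so the order differs from the desired output table in the question, but the set of rows is the same.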

Sql: Aggregation First() After Order by and Group by

id | name | value | time |
--------------------------
1 | A | 1 | 1 |
2 | B | 2 | 2 |
3 | C | 2 | 3 |
4 | A | 3 | 3 |
5 | A | 4 | 2 |
and I expected the result as below:
name | value |
--------------
A | 3 |
B | 2 |
C | 2 |
The result should show, for each name, the value with the latest time, with no duplicate names.
I tried this query:
SELECT name,First(value)
FROM
(SELECT name,value,time
FROM test
ORDER BY time DESC
)
GROUP BY name;
But I got this result:
name | value |
--------------
A | 1 |
B | 2 |
C | 2 |
I don't understand why the value for A isn't 3, because in the subselect the A values come out as 3, 4, 1, in that order.

Query:
SQLFIDDLEExample
SELECT t.name,
(SELECT t1.value
FROM test t1
WHERE t1.name = t.name
ORDER BY t1.time DESC
LIMIT 1) AS value
FROM test t
GROUP BY t.name
Result:
| NAME | VALUE |
----------------
| A | 3 |
| B | 2 |
| C | 2 |

You can also use ROW_NUMBER() with PARTITION BY:
;with cte as (
select id, row_number() over (partition by name order by time desc) rn
from test
)
select test.name, test.value from test
join cte on test.id = cte.id and rn = 1
Just choose whichever is faster.
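
The per-name ranking can be verified against the question's data. This is a minimal sketch using SQLite through Python's sqlite3 module (SQLite 3.25 or later assumed). The final `ORDER BY name` is added here only to make the output order predictable.

```python
import sqlite3


def latest_value_per_name():
    """Return (name, value) taken from the row with the latest time for each name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test (id INTEGER, name TEXT, value INTEGER, time INTEGER)"
    )
    data = [(1, "A", 1, 1), (2, "B", 2, 2), (3, "C", 2, 3),
            (4, "A", 3, 3), (5, "A", 4, 2)]
    conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", data)
    rows = conn.execute(
        """
        WITH cte AS (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY name ORDER BY time DESC) AS rn
            FROM test
        )
        SELECT test.name, test.value
        FROM test
        JOIN cte ON test.id = cte.id AND cte.rn = 1   -- newest row per name
        ORDER BY test.name
        """
    ).fetchall()
    conn.close()
    return rows


print(latest_value_per_name())
```

The `PARTITION BY name` clause is what restarts the numbering for every name. Without it, a single row for the whole table would get `rn = 1`.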

We Keep Coding
