I have the following table:
Table
+----+------+-------+
| ID | Name | Group |
+----+------+-------+
| 0 | a | 1 |
| 1 | a | 1 |
| 2 | a | 2 |
| 3 | a | 1 |
| 4 | b | 1 |
| 5 | b | 2 |
| 6 | b | 1 |
| 7 | c | 2 |
| 8 | c | 2 |
| 9 | c | 1 |
+----+------+-------+
I would like to select the top 20 distinct names from a specific group, ordered by how frequently each name occurs in that group. For this example, group 1 would return a, b, c
(a: 3 occurrences, b: 2 occurrences, c: 1 occurrence).
Thank you.
SELECT TOP(20) [Name], COUNT(*) AS Occurrences
FROM [Table]
WHERE [Group] = 1
GROUP BY [Name]
ORDER BY COUNT(*) DESC
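The filter, group, and order steps can be checked against the sample data with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so `LIMIT` stands in for `TOP`; the table name `items` and the helper `top_names` are made up for illustration, and the tie-break on `Name` is an added assumption so that equal counts come back in a stable order.

```python
import sqlite3

# Sample rows from the question: (ID, Name, Group)
SAMPLE = [
    (0, "a", 1), (1, "a", 1), (2, "a", 2), (3, "a", 1),
    (4, "b", 1), (5, "b", 2), (6, "b", 1),
    (7, "c", 2), (8, "c", 2), (9, "c", 1),
]

def top_names(rows, group, limit=20):
    """Return (name, occurrences) for one group, most frequent first."""
    conn = sqlite3.connect(":memory:")
    # "Group" is a reserved word, so the column name is quoted.
    conn.execute('CREATE TABLE items (ID INTEGER, Name TEXT, "Group" INTEGER)')
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    result = conn.execute(
        'SELECT Name, COUNT(*) AS occurrences '
        'FROM items '
        'WHERE "Group" = ? '
        'GROUP BY Name '
        'ORDER BY occurrences DESC, Name '  # Name tie-break is an assumption
        'LIMIT ?',
        (group, limit),
    ).fetchall()
    conn.close()
    return result

print(top_names(SAMPLE, 1))  # [('a', 3), ('b', 2), ('c', 1)]
```

Passing a different group number reuses the same query, which is why the group is a bound parameter rather than a literal.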
SELECT TOP(20)
name, [group], count(*) AS occurrences
FROM yourtable
GROUP BY name, [group]
ORDER BY count(*) DESC
SELECT
TOP 20
Name,
[Group],
COUNT(1) AS [Count]
FROM
MyTable
GROUP BY
Name,
[Group]
ORDER BY
[Count] DESC
Let's say I have the following table:
| sku | id | value | count |
|-----|----|-------|-------|
| A | 1 | 1 | 2 |
| A | 1 | 2 | 2 |
| A | 3 | 3 | 3 |
I want to select rows that don't have the same count for the same id. So my desired outcome is:
| sku | id | value | count |
|-----|----|-------|-------|
| A | 3 | 3 | 3 |
I need something that works with Postgres 10
A simple method is window functions:
select t.*
from (select t.*, count(*) over (partition by sku, id) as cnt
from t
) t
where cnt = 1;
This assumes you really mean the sku/id combination.
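As a rough check of the window-function approach, the sketch below runs the same idea on the three sample rows using Python's sqlite3 module instead of Postgres. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the helper name `unique_sku_id_rows` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (sku, id, value, count)
SAMPLE = [("A", 1, 1, 2), ("A", 1, 2, 2), ("A", 3, 3, 3)]

def unique_sku_id_rows(rows):
    """Return the rows whose (sku, id) combination appears exactly once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE t (sku TEXT, id INTEGER, value INTEGER, "count" INTEGER)'
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        'SELECT sku, id, value, "count" '
        'FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY sku, id) AS cnt '
        '      FROM t) AS sub '
        'WHERE cnt = 1'
    ).fetchall()
    conn.close()
    return result

print(unique_sku_id_rows(SAMPLE))  # [('A', 3, 3, 3)]
```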
I have a table which has multiple records for the same id. Looks like this, and the rows are sorted by sequence number.
+----+--------+----------+----------+
| id | result | duration | sequence |
+----+--------+----------+----------+
| 1 | 12 | 7254 | 1 |
+----+--------+----------+----------+
| 1 | 12 | 2333 | 2 |
+----+--------+----------+----------+
| 1 | 11 | 1000 | 3 |
+----+--------+----------+----------+
| 1 | 6 | 5 | 4 |
+----+--------+----------+----------+
| 1 | 3 | 20 | 5 |
+----+--------+----------+----------+
| 2 | 1 | 230 | 1 |
+----+--------+----------+----------+
| 2 | 9 | 10 | 2 |
+----+--------+----------+----------+
| 2 | 6 | 0 | 3 |
+----+--------+----------+----------+
| 2 | 1 | 5 | 4 |
+----+--------+----------+----------+
| 2 | 12 | 3 | 5 |
+----+--------+----------+----------+
For example, for id=1 I would like to sum the duration of all rows up to and including the row where result=6, which is 7254+2333+1000+5. Likewise, for id=2 it would be 230+10+0. Anything after the row where result=6 is left out.
My expected output:
+----+----------+
| id | duration |
+----+----------+
| 1 | 10592 |
+----+----------+
| 2 | 240 |
+----+----------+
The sequence has to be in ascending order.
I'm not sure how I can do this in sql.
Thank you in advance!
I think you want:
select t.id, sum(t.duration)
from t
where t.sequence <= (select t2.sequence
from t t2
where t2.id = t.id and t2.result = 6
)
group by t.id;
In PrestoDB, I would recommend window functions:
select id, sum(duration)
from (select t.*,
min(case when result = 6 then sequence end) over (partition by id) as sequence_6
from t
) t
where sequence <= sequence_6
group by id;
You can use a simple aggregate query with a condition that uses a subquery to recover the sequence of the record whose result is 6:
SELECT t.id, SUM(t.duration) total_duration
FROM mytable t
WHERE t.sequence <= (
SELECT sequence
FROM mytable
WHERE id = t.id AND result = 6
)
GROUP BY t.id
This demo on DB Fiddle with your test data returns:
| id | total_duration |
| --- | -------------- |
| 1 | 10592 |
| 2 | 240 |
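The correlated-subquery idea can be reproduced locally with the sketch below, which uses Python's sqlite3 module and the sample rows from the question. The helper name `duration_until` is hypothetical, and wrapping the subquery in `MIN(...)` is an added assumption so that an id with several result = 6 rows stops at the first one.

```python
import sqlite3

# Sample rows from the question: (id, result, duration, sequence)
SAMPLE = [
    (1, 12, 7254, 1), (1, 12, 2333, 2), (1, 11, 1000, 3), (1, 6, 5, 4), (1, 3, 20, 5),
    (2, 1, 230, 1), (2, 9, 10, 2), (2, 6, 0, 3), (2, 1, 5, 4), (2, 12, 3, 5),
]

def duration_until(rows, stop_result=6):
    """Sum duration per id up to and including the first row with stop_result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER, result INTEGER, duration INTEGER, sequence INTEGER)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        "SELECT t.id, SUM(t.duration) "
        "FROM t "
        "WHERE t.sequence <= (SELECT MIN(t2.sequence) "  # first matching row
        "                     FROM t AS t2 "
        "                     WHERE t2.id = t.id AND t2.result = ?) "
        "GROUP BY t.id "
        "ORDER BY t.id",
        (stop_result,),
    ).fetchall()
    conn.close()
    return result

print(duration_until(SAMPLE))  # [(1, 10592), (2, 240)]
```

An id with no row matching `stop_result` drops out of the output, because the subquery yields NULL and the comparison is never true.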
A basic GROUP BY query should solve your issue:
select
id,
sum(duration) duration
from t
group by id
For a specific id:
select
id,
sum(duration) duration
from t
where id = 1
group by id
If you want to include the totals in your result set:
select id, duration, sequence from t
union all
select
id,
sum(duration) duration,
null sequence
from t
group by id
Consider the following table:
ID | ITEM | GROUP_ID | VAL | COST
---+------+----------+-----------+-------
1 | A | 1 | 1 | 12
2 | B | 1 | 2 | 12
3 | C | 1 | 3 | 12
4 | D | 1 | 4 | 13
5 | D | 1 | 5 | 12
6 | E | 2 | 1 | 17
7 | E | 2 | 2 | 10
8 | E | 2 | 3 | 11
9 | E | 2 | 4 | 12
10 | F | 2 | 5 | 15
11 | F | 2 | 6 | 13
12 | F | 2 | 7 | 11
13 | F | 2 | 8 | 12
How can I get the following result?
GROUP_ID | VAL | COST
----------+-----------+-------
1 | 15 | 48
2 | 36 | 24
VAL is the sum of VAL per GROUP_ID.
COST is the sum, per GROUP_ID, of the COST from the last row (highest ID) of each ITEM.
Use the analytic function ROW_NUMBER(), which is available in PostgreSQL, Oracle, and SQL Server.
SqlFiddleDemo
WITH last_item as (
SELECT group_id, sum(cost) as sum_cost
FROM (
SELECT t.*,
ROW_NUMBER() over (partition by item order by id desc) as rn
FROM Table1 t
) as t
WHERE rn = 1
GROUP BY t.group_id
),
val_sum as (
SELECT t.group_id, SUM(val) as sum_val
FROM Table1 t
GROUP BY t.group_id
)
SELECT v.group_id, v.sum_val, l.sum_cost
FROM val_sum v
INNER JOIN last_item l
ON v.group_id = l.group_id
OUTPUT
| group_id | sum_val | sum_cost |
|----------|---------|----------|
| 1 | 15 | 48 |
| 2 | 36 | 24 |
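To see the two aggregates side by side, here is a sketch that runs essentially the same CTE query through Python's sqlite3 module on the sample data. It assumes SQLite 3.25 or newer for `ROW_NUMBER()`, and the helper name `group_totals` is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (ID, ITEM, GROUP_ID, VAL, COST)
SAMPLE = [
    (1, "A", 1, 1, 12), (2, "B", 1, 2, 12), (3, "C", 1, 3, 12),
    (4, "D", 1, 4, 13), (5, "D", 1, 5, 12),
    (6, "E", 2, 1, 17), (7, "E", 2, 2, 10), (8, "E", 2, 3, 11), (9, "E", 2, 4, 12),
    (10, "F", 2, 5, 15), (11, "F", 2, 6, 13), (12, "F", 2, 7, 11), (13, "F", 2, 8, 12),
]

QUERY = """
WITH last_item AS (
    -- keep only the highest-id row of each item, then total its cost per group
    SELECT group_id, SUM(cost) AS sum_cost
    FROM (
        SELECT group_id, cost,
               ROW_NUMBER() OVER (PARTITION BY item ORDER BY id DESC) AS rn
        FROM table1
    ) AS ranked
    WHERE rn = 1
    GROUP BY group_id
),
val_sum AS (
    SELECT group_id, SUM(val) AS sum_val
    FROM table1
    GROUP BY group_id
)
SELECT v.group_id, v.sum_val, l.sum_cost
FROM val_sum v
JOIN last_item l ON v.group_id = l.group_id
ORDER BY v.group_id
"""

def group_totals(rows):
    """Return (group_id, sum of val, sum of last cost per item) per group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 "
        "(id INTEGER, item TEXT, group_id INTEGER, val INTEGER, cost INTEGER)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

print(group_totals(SAMPLE))  # [(1, 15, 48), (2, 36, 24)]
```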
Try this
WITH LastRow (id)
AS (
SELECT MAX(id)
FROM TheTable
GROUP BY item, group_id
)
SELECT group_Id, SUM(val), SUM(CASE WHEN B.id IS NULL THEN 0 ELSE cost END)
FROM TheTable A
LEFT OUTER JOIN LastRow B ON A.id = B.id
GROUP BY group_id
EDIT:
SQL Fiddle Demo
Thanks @Juan Carlos Oropeza for creating the SQL Fiddle test data
I have been trying to get this to work with some row_number, group by, top, sort of things, but I am missing some fundamental concept. I have a table like so:
+-------+-------+-------+
| name | ord | f_id |
+-------+-------+-------+
| a | 1 | 2 |
| b | 5 | 2 |
| c | 6 | 2 |
| d | 2 | 1 |
| e | 4 | 1 |
| a | 2 | 3 |
| c | 50 | 4 |
+-------+-------+-------+
And my desired output would be:
+-------+---------+--------+-------+
| f_id | ord_n | ord | name |
+-------+---------+--------+-------+
| 2 | 1 | 1 | a |
| 2 | 2 | 5 | b |
| 1 | 1 | 2 | d |
| 1 | 2 | 4 | e |
| 3 | 1 | 2 | a |
| 4 | 1 | 50 | c |
+-------+---------+--------+-------+
The data is ordered by the ord value, with at most two results per f_id. Should I write a stored procedure for this, or can I do it in plain SQL? I have experimented with some SELECT TOP subqueries, but nothing has come close.
Here are some statements to create the test table:
create table help(name varchar(255),ord tinyint,f_id tinyint);
insert into help values
('a',1,2),
('b',5,2),
('c',6,2),
('d',2,1),
('e',4,1),
('a',2,3),
('c',50,4);
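You can use the RANK or DENSE_RANK functions to filter, with ROW_NUMBER supplying ord_n. Note that RANK keeps tied rows, so use ROW_NUMBER in the filter if you need strictly two rows per f_id.
select A.f_id, A.ord_n, A.ord, A.name from
(
select
RANK() OVER (partition by f_id ORDER BY ord asc) AS rnk,
ROW_NUMBER() OVER (partition by f_id ORDER BY ord asc) AS ord_n,
help.*
from help
) A where A.rnk <= 2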
Sqlfiddle demo
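A minimal sketch of the "top two per f_id" pattern, run through Python's sqlite3 module on the test data from the question. It filters on `ROW_NUMBER()` so ties cannot produce a third row, assumes SQLite 3.25 or newer, and sorts the output by `f_id` purely for a predictable order; the helper name `top_per_f_id` is hypothetical.

```python
import sqlite3

# Test rows from the question: (name, ord, f_id)
SAMPLE = [
    ("a", 1, 2), ("b", 5, 2), ("c", 6, 2),
    ("d", 2, 1), ("e", 4, 1),
    ("a", 2, 3), ("c", 50, 4),
]

def top_per_f_id(rows, per_group=2):
    """Return (f_id, ord_n, ord, name) for the lowest-ord rows of each f_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE help (name TEXT, ord INTEGER, f_id INTEGER)")
    conn.executemany("INSERT INTO help VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT f_id, ord_n, ord, name "
        "FROM (SELECT help.*, "
        "             ROW_NUMBER() OVER (PARTITION BY f_id ORDER BY ord) AS ord_n "
        "      FROM help) AS ranked "
        "WHERE ord_n <= ? "
        "ORDER BY f_id, ord_n",  # output order is an assumption
        (per_group,),
    ).fetchall()
    conn.close()
    return result

for row in top_per_f_id(SAMPLE):
    print(row)
```

The row ('c', 6, 2) is the only one dropped, since it is the third row for f_id 2.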
id | name | value | time |
--------------------------
1 | A | 1 | 1 |
2 | B | 2 | 2 |
3 | C | 2 | 3 |
4 | A | 3 | 3 |
5 | A | 4 | 2 |
and I expect the result below:
name | value |
--------------
A | 3 |
B | 2 |
C | 2 |
The result should show, for each distinct name, the value from the row with the latest time.
I tried this query:
SELECT name,First(value)
FROM
(SELECT name,value,time
FROM test
ORDER BY time DESC
)
GROUP BY name;
But I got this result:
name | value |
--------------
A | 1 |
B | 2 |
C | 2 |
I don't understand why the value for A isn't 3, because the subselect returns the A values in the order 3, 4, 1.
Query:
SQLFIDDLEExample
SELECT t.name,
(SELECT t1.value
FROM test t1
WHERE t1.name = t.name
ORDER BY t1.time DESC
LIMIT 1) AS value
FROM test t
GROUP BY t.name
Result:
| NAME | VALUE |
----------------
| A | 3 |
| B | 2 |
| C | 2 |
You can also use ROW_NUMBER() with PARTITION BY:
;with cte as (
select id, row_number() over (partition by name order by time desc) rn
from test
)
select test.name, test.value from test
join cte on test.id = cte.id and cte.rn = 1
Choose whichever is faster on your data.
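For a quick local check of the "latest row per name" pattern, the sketch below numbers the rows within each name by descending time and keeps the first one. It uses Python's sqlite3 module (SQLite 3.25 or newer assumed for `ROW_NUMBER()`), and the helper name `latest_values` is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (id, name, value, time)
SAMPLE = [
    (1, "A", 1, 1), (2, "B", 2, 2), (3, "C", 2, 3),
    (4, "A", 3, 3), (5, "A", 4, 2),
]

def latest_values(rows):
    """Return (name, value) taken from the row with the greatest time per name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test (id INTEGER, name TEXT, value INTEGER, time INTEGER)"
    )
    conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        "SELECT name, value "
        "FROM (SELECT name, value, "
        "             ROW_NUMBER() OVER (PARTITION BY name ORDER BY time DESC) AS rn "
        "      FROM test) AS ranked "
        "WHERE rn = 1 "
        "ORDER BY name"
    ).fetchall()
    conn.close()
    return result

print(latest_values(SAMPLE))  # [('A', 3), ('B', 2), ('C', 2)]
```

If two rows of the same name share the latest time, which one wins is not determined by this query; adding a second sort key such as `id DESC` would make it deterministic.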