SQL: need to compare row count values

I have a query that returns the ID, the NAME, and the number of times each ID/NAME combination has been entered into the table.
    SELECT
        ID,
        NAME,
        COUNT(*) count
    FROM
        TABLE
    GROUP BY
        NAME, ID, CASE_DETAIL_ID
    HAVING
        COUNT(*) > 1;
This returns the following data:
ID   NAME      COUNT
123  HAT       10
123  UMBRELLA  10
123  TOWEL     10
123  WATER     8
555  HAT       3
555  UMBRELLA  10
555  TOWEL     10
555  WATER     10
322  UMBRELLA  5
322  TOWEL     20
322  WATER     20
I want to return only the rows whose count is lower than the count of the other rows with the same ID. How can I do this? The end result should be:
ID   NAME      COUNT  FULL COUNT
123  WATER     8      10
555  HAT       3      10
322  UMBRELLA  5      20
We store multiple IDs, and I only want the rows/names whose count is lower than that of the other rows with the same ID.
I have also tried:
    WITH x AS
        (SELECT ID, NAME, COUNT(*) count
         FROM FRT.CASE_DETAIL_HISTORY
         GROUP BY
             NAME,
             ID,
             CASE_DETAIL_ID)
    SELECT x.ID, t.NAME, X.COUNT, MIN(x.count)
    FROM x
    JOIN FRT.CASE_DETAIL_HISTORY t
        ON t.ID = x.ID
    GROUP BY x.ID, t.ID, X.COUNT
However, this doesn't give me what I am looking for. I only want rows returned if the name's count doesn't match the 'mode' count of the ID.
I have also tried the query below, but I keep getting errors:
    WITH COUNT_OF_ROWS AS
        (SELECT ID, NAME, COUNT(*) count
         FROM TABLE
         GROUP BY NAME, ID, CASE_DETAIL_ID
         HAVING COUNT(*) >= 1),
    MINIMUM AS
        (SELECT COUNT_OF_ROWS.ID, COUNT_OF_ROWS.NAME,
                MIN(COUNT_OF_ROWS.COUNT) MINI
         FROM COUNT_OF_ROWS
         JOIN TABLE CD ON CD.ID = COUNT_OF_ROWS.ID
         GROUP BY COUNT_OF_ROWS.ID, COUNT_OF_ROWS.NAME
        )
    SELECT DISTINCT COUNT_OF_ROWS.*, MINIMUM.MINI
    FROM minimum, count_of_rows
    WHERE minimum.mini != count_of_rows.count;

Some sample data would help, but you can use a CTE and select the lowest count per ID with MIN(), something like this:
    WITH x AS (
        SELECT t.id, t.name, COUNT(*) AS count
        FROM table t
        GROUP BY t.id, t.name, t.CASE_DETAIL_ID
    ), y AS (
        -- lowest count per id
        SELECT x.id, MIN(x.[count]) AS mincount
        FROM x
        GROUP BY x.id
    )
    SELECT y.id, x.name, y.mincount
    FROM y
    JOIN x
        ON x.[count] = y.mincount
        AND x.id = y.id
Or, for a single ID, it can be done using TOP 1 and ORDER BY like this:
    SELECT TOP 1 id, name, COUNT(*) AS count
    FROM TABLE
    WHERE ID = 123
    GROUP BY NAME, ID, CASE_DETAIL_ID
    ORDER BY count ASC  -- ascending, so the lowest count comes first
Bear in mind that because this selects only the first row, it only works together with the WHERE clause: it always returns exactly one row.
The CTE option, on the other hand, also works per ID, without WHERE id = 123.
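To show the per-ID comparison end to end, here is a minimal sketch that runs against an in-memory SQLite database from Python. The table name `case_detail_history`, the aliases `cnt` and `full_cnt`, and the dropped CASE_DETAIL_ID grouping column are assumptions made for illustration. It also assumes the "full count" is simply the highest count within the ID, which matches the sample data but is not a true mode.

```python
import sqlite3

# Counts copied from the question's sample output; one row is inserted per occurrence.
sample_counts = {
    (123, "HAT"): 10, (123, "UMBRELLA"): 10, (123, "TOWEL"): 10, (123, "WATER"): 8,
    (555, "HAT"): 3, (555, "UMBRELLA"): 10, (555, "TOWEL"): 10, (555, "WATER"): 10,
    (322, "UMBRELLA"): 5, (322, "TOWEL"): 20, (322, "WATER"): 20,
}

conn = sqlite3.connect(":memory:")
# Hypothetical table name; CASE_DETAIL_ID is left out to keep the sketch small.
conn.execute("CREATE TABLE case_detail_history (id INTEGER, name TEXT)")
for (case_id, name), n in sample_counts.items():
    conn.executemany(
        "INSERT INTO case_detail_history VALUES (?, ?)", [(case_id, name)] * n
    )

query = """
WITH counts AS (
    -- one row per (id, name) with its occurrence count
    SELECT id, name, COUNT(*) AS cnt
    FROM case_detail_history
    GROUP BY id, name
),
full_counts AS (
    -- assumption: the 'full' count of an id is its highest per-name count
    SELECT id, MAX(cnt) AS full_cnt
    FROM counts
    GROUP BY id
)
SELECT c.id, c.name, c.cnt, f.full_cnt
FROM counts c
JOIN full_counts f ON f.id = c.id
WHERE c.cnt < f.full_cnt
ORDER BY c.id
"""
short_rows = conn.execute(query).fetchall()
for row in short_rows:
    print(row)
```

With the sample data this should list WATER for 123, UMBRELLA for 322 and HAT for 555, each next to the full count for its ID. If several names can fall short by different amounts, all of them are returned, which may or may not be what you want.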

Related

Use window functions to select the value from a column based on the sum of another column, in an aggregate query

Consider this data (View on DB Fiddle):
id  dept  value
1   A     5
1   A     5
1   B     7
1   C     5
2   A     5
2   A     5
2   B     15
2   A     2
The base query I am running is pretty simple. Just get the total value by id and the most frequent dept.
    SELECT
        id,
        MODE() WITHIN GROUP (ORDER BY dept) AS dept_freq,
        SUM(value) AS value
    FROM test
    GROUP BY id;
id  dept_freq  value
1   A          22
2   A          27
But I also need to get, for each id, the dept that concentrates the greatest value (so the greatest sum of value by id and dept, not the highest individual value in the original table).
Is there any way to use window functions to achieve that and do it directly in the base query above?
The expected output for this particular example would be:
id  dept_freq  dept_value  value
1   A          A           22
2   A          B           27
I could achieve that with the query below and then join it with the results of the base query above:
    SELECT * FROM (
        SELECT
            *,
            ROW_NUMBER() OVER (PARTITION BY id ORDER BY value DESC) AS row
        FROM (
            SELECT id, dept, SUM(value) AS value
            FROM test
            GROUP BY id, dept
        ) AS alias1
    ) AS alias2
    WHERE alias2.row = 1;
id  dept  value  row
1   A     10     1
2   B     15     1
But it is not easy to read or maintain, and it also seems pretty inefficient. So I thought it should be possible to achieve this using window functions directly in the base query, which might also help Postgres come up with a better query plan that makes fewer passes over the data. But none of my attempts using OVER (PARTITION BY ...) and FILTER worked.
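You can fetch the dept with the greatest total using the FIRST_VALUE() window function. Because the question asks for the greatest sum per dept rather than the largest single row, compute the per-dept total first and order by it; ordering by the raw value would pick B for id 1 (a single row of 7) instead of A (total 10). Adding this before your MODE() grouping should do it:
    SELECT
        id,
        highest_value_dept,
        MODE() WITHIN GROUP (ORDER BY dept) AS dept_freq,
        SUM(value) AS value
    FROM (
        SELECT
            id,
            dept,
            value,
            -- dept whose summed value is the largest within the id
            FIRST_VALUE(dept) OVER (PARTITION BY id ORDER BY dept_total DESC) AS highest_value_dept
        FROM (
            SELECT id, dept, value,
                   SUM(value) OVER (PARTITION BY id, dept) AS dept_total
            FROM test
        ) t
    ) s
    GROUP BY 1, 2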
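The difference between ordering by the raw row value and ordering by the per-dept sum is easy to check on the question's data. The following is only a sketch: it uses SQLite from Python rather than Postgres (so MODE() is left out), it needs SQLite 3.25 or later for window functions, and the column aliases `dept_total`, `dept_by_row_value` and `dept_by_total` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, dept TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "A", 5), (1, "A", 5), (1, "B", 7), (1, "C", 5),
     (2, "A", 5), (2, "A", 5), (2, "B", 15), (2, "A", 2)],
)

query = """
SELECT DISTINCT
    id,
    -- dept of the single largest row (not what the question asks for)
    FIRST_VALUE(dept) OVER (PARTITION BY id ORDER BY value DESC, dept) AS dept_by_row_value,
    -- dept with the largest summed value (what the question asks for)
    FIRST_VALUE(dept) OVER (PARTITION BY id ORDER BY dept_total DESC, dept) AS dept_by_total
FROM (
    SELECT id, dept, value,
           SUM(value) OVER (PARTITION BY id, dept) AS dept_total
    FROM test
) AS t
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For id 1 the two columns disagree (B has the largest single row, A the largest total), while for id 2 both give B. The extra `dept` in each ORDER BY is only there to break ties deterministically.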

Find duplicate values only if separate column id differs

I have the following table:
id item
1 A
2 A
3 B
4 C
3 H
1 E
I'm looking to obtain duplicate values from the id column only when the item column differs in value. The end result should be:
1 A
1 E
3 B
3 H
I've attempted:
    select id, items, count(*)
    from table
    group by id, items
    HAVING count(*) > 1
But this is giving only duplicate values from the id column and not taking into account the items column.
Any suggestions will be greatly appreciated.
You can use a window function for this; it is generally far more efficient than using a self-join:
    SELECT
        t.id,
        t.items,
        t.count
    FROM (
        SELECT *,
               COUNT(*) OVER (PARTITION BY t.id) AS count
        FROM YourTable t
    ) t
    WHERE t.count > 1;
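As a quick check of the window-function approach on the question's rows, here is a small sketch using SQLite from Python (SQLite 3.25 or later is assumed for window functions). The table name `your_table` and the alias `items_per_id` are placeholders. The inner SELECT DISTINCT is an addition of mine, so that an exact repeat of the same (id, item) pair would not count as a differing item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (3, "H"), (1, "E")],
)

query = """
SELECT id, item
FROM (
    -- count the distinct items attached to each id
    SELECT id, item, COUNT(*) OVER (PARTITION BY id) AS items_per_id
    FROM (SELECT DISTINCT id, item FROM your_table) AS distinct_pairs
) AS counted
WHERE items_per_id > 1
ORDER BY id, item
"""
dup_rows = conn.execute(query).fetchall()
for row in dup_rows:
    print(row)
```

Only ids 1 and 3 appear with more than one item, so only their rows come back.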

How to find frequency in SQL

I am having some issues with SQL code, specifically in finding the frequency of an ID.
My table looks like
Num ID
136 23
1427 45
1415 67
1416 23
7426 45
4727 12
4278 67
...
I need to see the frequency of each ID that appears two or more times.
For example: 23, 45 and 67 in the table above.
I have tried the following:
    Select distinct *, count(*)
    From table_1
    Group by 1,2
    Having count(*) >2
But it is wrong.
I need DISTINCT, as I do not want any duplicates in Num.
I think I should use a counter that resets when the value of the next row differs from the previous one and reports the frequency (1, 2, 3, and so on), and then select values greater than or equal to 2, but I do not know how to do that in SQL.
Could you help me, please?
Thanks
Use only ID in the GROUP BY:
    SELECT ID, COUNT(*) AS No_frequency
    FROM table t
    GROUP BY id
    HAVING COUNT(*) >= 2;
Note: if you have duplicate num values, use DISTINCT:
    HAVING COUNT(DISTINCT num) >= 2;
If I understand your question, you can try this:
    SELECT ID, COUNT(1)
    FROM table_1
    GROUP BY ID
    HAVING COUNT(1) >= 2
This gives you the IDs with 2 or more occurrences, together with the number of occurrences.
EDIT
I assume you are using MySQL (please add your DBMS to your question), so try this:
    SELECT ID, COUNT(1) AS FREQUENCY, GROUP_CONCAT(NUM)
    FROM table_1
    GROUP BY ID
    HAVING COUNT(1) >= 2
This works for me:
    SELECT ID, COUNT(ID) AS Frq FROM MyTable
    GROUP BY ID
    HAVING COUNT(ID) >= 2  -- ">= 2" so that IDs appearing exactly twice are included
    ORDER BY COUNT(ID) DESC
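For reference, the GROUP BY / HAVING approach can be tried on the visible rows of the question's table with a few lines of Python and SQLite. This is only a sketch: the rows elided by "..." in the question are not included, and the alias `frequency` is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (num INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?)",
    [(136, 23), (1427, 45), (1415, 67), (1416, 23),
     (7426, 45), (4727, 12), (4278, 67)],
)

query = """
SELECT id, COUNT(DISTINCT num) AS frequency
FROM table_1
GROUP BY id
HAVING COUNT(DISTINCT num) >= 2   -- keep ids that occur at least twice
ORDER BY id
"""
freq_rows = conn.execute(query).fetchall()
for row in freq_rows:
    print(row)
```

ID 12 occurs only once and is filtered out by the HAVING clause; 23, 45 and 67 each come back with a frequency of 2.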

Using distinct on other field of selected ids postgresql

I have this data
Table
id weight
1 1000
1 1000
2 2000
2 2000
3 3000
4 3000
I am trying to find the average weight of the distinct ids, excluding id 4, and I need the data in this format:
id     avg(weight)
1,2,3  2000
I have tried DISTINCT, but it gives me an average that includes all the duplicate values.
    SELECT
        String_agg(distinct id :: text, ', ') AS ids,
        Round(Coalesce(Avg(weight), 0)) AS avg
    FROM "table"
    where id != 4
I have also tried GROUP BY id, but it returns the data in a different format and also does not give the correct average.
    SELECT
        String_agg(id :: text, ', ') AS ids,
        Round(Coalesce(Avg(weight), 0)) AS avg
    FROM "table"
    where id != 4
    group by id
So how can I find this average?
Thanks.
You may try using your current logic on a subquery which finds the distinct records:
    SELECT
        STRING_AGG(id::text, ',' ORDER BY id) AS ids,
        ROUND(COALESCE(AVG(weight), 0)) AS avg
    FROM
    (
        SELECT DISTINCT id, weight
        FROM "table"
        WHERE id <> 4
    ) t;
Note: I added an ORDER BY clause to your STRING_AGG call, to ensure that the ids appear in the order you want.
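To see why the DISTINCT subquery matters, the sketch below compares the plain average with the average over distinct (id, weight) pairs. It uses SQLite from Python instead of PostgreSQL, so the id list is joined in Python rather than with STRING_AGG; the table name `weights` and the variable names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weights (id INTEGER, weight INTEGER)")
conn.executemany(
    "INSERT INTO weights VALUES (?, ?)",
    [(1, 1000), (1, 1000), (2, 2000), (2, 2000), (3, 3000), (4, 3000)],
)

# Average over every row: duplicated ids are counted more than once.
naive_avg = conn.execute(
    "SELECT AVG(weight) FROM weights WHERE id <> 4"
).fetchone()[0]

# Average over distinct (id, weight) pairs, as in the subquery approach.
distinct_avg = conn.execute(
    """
    SELECT AVG(weight)
    FROM (SELECT DISTINCT id, weight FROM weights WHERE id <> 4) AS t
    """
).fetchone()[0]

# Build the comma-separated id list in Python, in ascending id order.
id_rows = conn.execute(
    "SELECT DISTINCT id FROM weights WHERE id <> 4 ORDER BY id"
).fetchall()
ids = ",".join(str(r[0]) for r in id_rows)

print(ids, naive_avg, distinct_avg)
```

The plain average is pulled down by the repeated rows for ids 1 and 2, while the distinct version gives the 2000 the question expects.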

Grouping by number of occurrences of a repeatable value in Oracle SQL

Let's assume we have a table like this.
id name value
1 x 12
2 x 23
3 y 47
4 x 18
5 y 29
6 z 45
7 y 67
Doing a normal group by name would yield us
select name,count(*) from table group by name;
name count(*)
x 3
y 3
z 1
I want to get the reverse, i.e. group by the number of times a name occurs and count how many names occur that many times. I want my output to be:
count  number of elements occurring count times
1      1
3      2
Is it possible to do this using just a single query? Another way is to use a temp table, but I don't want to do that.
Thanks
You need one more group by:
    select cnt, count(*), min(name), max(name)
    from (select name, count(*) as cnt
          from table
          group by name
         ) n
    group by cnt
    order by 1;
I do these types of histogram queries all the time. The min() and max() provide sample data. This is useful to understand outliers and unexpected values.
You can GROUP BY twice, e.g.
    with
    Names as (
        select name as name,
               count(1) as cnt
        from MyTable
        group by name)
    select count(1),
           cnt
    from Names
    group by cnt
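The double GROUP BY can be checked against the question's sample rows with a short Python script using SQLite (chosen here only because it is easy to run; the question is about Oracle). The table name `my_table` and the alias `names_with_cnt` are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "x", 12), (2, "x", 23), (3, "y", 47), (4, "x", 18),
     (5, "y", 29), (6, "z", 45), (7, "y", 67)],
)

query = """
SELECT cnt, COUNT(*) AS names_with_cnt, MIN(name), MAX(name)
FROM (
    -- first pass: how often each name occurs
    SELECT name, COUNT(*) AS cnt
    FROM my_table
    GROUP BY name
) AS n
GROUP BY cnt          -- second pass: how many names share each count
ORDER BY cnt
"""
histogram = conn.execute(query).fetchall()
for row in histogram:
    print(row)
```

One name (z) occurs once and two names (x and y) occur three times, which matches the output requested in the question; the MIN and MAX columns show a sample name from each bucket.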