Identify the counts of each distinct value in one column in sqlite3


I am trying to identify the count of each distinct value in one column (name) in a table called brgy.
---------------------
| ID | name |
---------------------
| 1 | Alfonso |
| 2 | Arakan |
| 3 | Poblacion |
| 4 | Ilaya |
| 5 | Poblacion |
----------------------
I tried using this code but it keeps giving the COUNT as 1 despite Poblacion appearing twice in the name column:
SELECT name,COUNT(name) AS distinct_name
FROM (
SELECT DISTINCT name
FROM brgy
GROUP BY name
)
GROUP BY name;
The intended output should list each name once, together with the number of times it appears in the name column.
Expected output:
-----------------------------
| name | distinct_name |
-----------------------------
| Alfonso | 1 |
| Arakan | 1 |
| Ilaya | 1 |
| Poblacion | 2 |
-----------------------------

A simple GROUP BY name:
SELECT name, COUNT(name) AS distinct_name
FROM brgy
GROUP BY name;
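You don't need the subquery:
SELECT DISTINCT name FROM brgy GROUP BY name
It collapses each name to a single row before the outer COUNT runs, which is why every count came back as 1. GROUP BY name on the base table already takes care of the duplicates.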
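To see the difference between the two queries side by side, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table and rows are copied from the question; the variable names (`with_subquery`, `direct`) are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE brgy (ID INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO brgy (ID, name) VALUES (?, ?)",
    [(1, "Alfonso"), (2, "Arakan"), (3, "Poblacion"), (4, "Ilaya"), (5, "Poblacion")],
)

# Original attempt: duplicates are removed before counting, so every count is 1.
with_subquery = conn.execute(
    "SELECT name, COUNT(name) AS distinct_name "
    "FROM (SELECT DISTINCT name FROM brgy GROUP BY name) "
    "GROUP BY name ORDER BY name"
).fetchall()

# Grouping the base table directly counts every row per name.
direct = conn.execute(
    "SELECT name, COUNT(name) AS distinct_name "
    "FROM brgy GROUP BY name ORDER BY name"
).fetchall()

print(with_subquery)
print(direct)
```

The first result has a count of 1 for every name, while the second reports 2 for Poblacion.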

Just remove the subquery; it returns only unique entries, so every count ends up as 1:
select name, count(*) as distinct_name
from brgy
group by name
order by distinct_name, name;

Related

SQL group by a field and only return one joined row for each grouping

Table data
+-----+----------------+--------+----------------+
| ID | Required_by | Name | Another_Field |
+-----+----------------+--------+----------------+
| 1 | 7 August | cat | X |
| 2 | 7 August | cat | Y |
| 3 | 10 August | cat | Z |
| 4 | 11 August | dog | A |
+-----+----------------+--------+----------------+
What I want to do is group by the name, then for each group choose one of the rows with the earliest required by date.
For this data set, I would like to end up with either rows 1 and 4, or rows 2 and 4.
Expected result:
+-----+----------------+--------+----------------+
| ID | Required_by | Name | Another_Field |
+-----+----------------+--------+----------------+
| 1 | 7 August | cat | X |
| 4 | 11 August | dog | A |
+-----+----------------+--------+----------------+
OR
+-----+----------------+--------+----------------+
| ID | Required_by | Name | Another_Field |
+-----+----------------+--------+----------------+
| 2 | 7 August | cat | Y |
| 4 | 11 August | dog | A |
+-----+----------------+--------+----------------+
I have something that returns 1,2 and 4 but I'm not sure how to only pick one from the first group to get the desired result. I'm joining the grouping with the data table so that I can get the ID and another_field back after the grouping.
SELECT d.id, d.name, d.required_by, d.another_field
FROM
(
SELECT min(required_by) as min_date, name
FROM data
GROUP BY name
) agg
INNER JOIN
data d
on d.required_by = agg.min_date AND d.name = agg.name

This is typically solved using window functions:
select d.id, d.name, d.required_by, d.another_field
from (
select id, name, required_by, another_field,
row_number() over (partition by name order by required_by) as rn
from data
) d
where d.rn = 1;
In Postgres using distinct on() is typically faster:
select distinct on (name) *
from data
order by name, required_by
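The row_number() approach can be tried out in SQLite as well. The sketch below makes a few assumptions that are not in the question: the dates are stored as ISO strings with an arbitrary year (2023) so that they sort correctly, id is added as a tie-breaker so the result is deterministic, and the bundled SQLite is version 3.25 or later (the first release with window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data ("
    "id INTEGER PRIMARY KEY, required_by TEXT, name TEXT, another_field TEXT)"
)
# Dates use an assumed year so that text ordering matches date ordering.
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (1, "2023-08-07", "cat", "X"),
        (2, "2023-08-07", "cat", "Y"),
        (3, "2023-08-10", "cat", "Z"),
        (4, "2023-08-11", "dog", "A"),
    ],
)

# Number the rows inside each name group by date (id breaks ties),
# then keep only the first row of every group.
earliest = conn.execute(
    """
    SELECT d.id, d.name, d.required_by, d.another_field
    FROM (
        SELECT id, name, required_by, another_field,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY required_by, id) AS rn
        FROM data
    ) AS d
    WHERE d.rn = 1
    ORDER BY d.id
    """
).fetchall()

print(earliest)
```

With the id tie-breaker this returns rows 1 and 4; without it, either row 1 or row 2 could be chosen for cat.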

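SELECT d.[id]
,d.[required_by]
,d.[name]
FROM [test].[dbo].[data] d
-- correlate on name and pick a single id, so ties on the earliest date return one row
WHERE d.[id] = (SELECT TOP 1 x.[id] FROM [test].[dbo].[data] x WHERE x.[name] = d.[name] ORDER BY x.[required_by], x.[id])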

Oracle SQL: Counting how often an attribute occurs for a given entry and choosing the attribute with the maximum number of occurs

I have a table that has a number column and an attribute column like this:
1.
+-----+-----+
| num | att |
+-----+-----+
| 1   | a   |
| 1   | b   |
| 1   | a   |
| 2   | a   |
| 2   | b   |
| 2   | b   |
+-----+-----+
I want to make the number unique, with the attribute being whichever attribute occurred most often for that number, like this (this is the end product I'm interested in):
2.
+-----+-----+
| num | att |
+-----+-----+
| 1   | a   |
| 2   | b   |
+-----+-----+
I have been working on this for a while and managed to write myself a query that looks up how many times an attribute occurs for a given number like this:
3.
+-----+-----+-------+
| num | att | count |
+-----+-----+-------+
| 1   | a   | 2     |
| 1   | b   | 1     |
| 2   | a   | 1     |
| 2   | b   | 2     |
+-----+-----+-------+
But I can't think of a way to only select those rows from the above table where the count is the highest (for each number of course).
So basically what I am asking is: given table 3, how do I select only the rows with the highest count for each number? (Of course, an answer that describes a way to get from table 1 to table 2 directly also works.)

You can use aggregation and window functions:
select num, att
from (
select num, att, row_number() over(partition by num order by count(*) desc, att) rn
from mytable
group by num, att
) t
where rn = 1
For each num, this brings the most frequent att; if there are ties, the smaller att is retained.
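As a rough check of this approach, the sketch below runs it against the sample data in SQLite through Python's sqlite3 module (assuming SQLite 3.25+ for window functions). It is a variation on the query above, not the same statement: the counting step is split into a CTE named `counts`, which is an illustrative choice that mirrors the intermediate table 3 from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (num INTEGER, att TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "b"), (2, "b")],
)

# Step 1 (CTE): count occurrences of each (num, att) pair.
# Step 2: rank the pairs per num by descending count, att breaks ties.
# Step 3: keep the top-ranked att for each num.
most_frequent = conn.execute(
    """
    WITH counts AS (
        SELECT num, att, COUNT(*) AS cnt
        FROM mytable
        GROUP BY num, att
    )
    SELECT num, att
    FROM (
        SELECT num, att,
               ROW_NUMBER() OVER (PARTITION BY num ORDER BY cnt DESC, att) AS rn
        FROM counts
    ) AS ranked
    WHERE rn = 1
    ORDER BY num
    """
).fetchall()

print(most_frequent)
```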

Oracle has an aggregate function that does this, STATS_MODE():
select num, stats_mode(att)
from t
group by num;
In statistics, the most common value is called the mode -- hence the name of the function.

You can use group by and count as below
select id, col, count(col) as count
from
df_b_sql
group by id, col

Join Table From Minimum Value and Specific Name

I have:
Table id
+--------+
| number |
+--------+
| 1 |
| 2 |
| 3 |
+--------+
Table data
+-------+--------------+
| name | phone_number |
+-------+--------------+
| Bob | 111 |
| John | 333 |
| Alice | 555 |
+-------+--------------+
How can I join the tables to get this result (the minimum number, together with the row where name = 'John')?
+--------+-------+--------------+
| number | name | phone_number |
+--------+-------+--------------+
| 1 | John | 333 |
+--------+-------+--------------+

You can try the query below:
select
(select min(number) FROM ID) as number, name, phone_number
from data
where name = 'John'
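The scalar-subquery idea can be sketched in SQLite via Python's sqlite3 module. The tables and rows follow the question; storing phone_number as text and passing the name as a bound parameter are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id (number INTEGER)")
conn.execute("CREATE TABLE data (name TEXT, phone_number TEXT)")
conn.executemany("INSERT INTO id VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [("Bob", "111"), ("John", "333"), ("Alice", "555")],
)

# The scalar subquery yields one value (the minimum number), which is
# attached to every row that survives the WHERE filter.
rows = conn.execute(
    """
    SELECT (SELECT MIN(number) FROM id) AS number, name, phone_number
    FROM data
    WHERE name = ?
    """,
    ("John",),
).fetchall()

print(rows)
```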

You can use a cross join:
select min(number) as number, name, phone_number
from Table_Id
cross join Table_Data
where name = 'John'
group by name, phone_number

Depending on the RDBMS you're using, this query should get you close.
SELECT
MIN_NUMBER, NAME, PHONE_NUMBER
FROM
DATA LEFT JOIN (SELECT MIN(NUMBER) AS MIN_NUMBER FROM ID) M ON 1=1
WHERE NAME = 'John'

how to sum multiple rows with same id in SQL Server

Let's say I have the following table:
id | name | no
--------------
1 | A | 10
1 | A | 20
1 | A | 40
2 | B | 20
2 | B | 20
I want to write a select query in SQL Server that sums the "no" field for rows that have the same id.
The result should look like this:
id | name | no
--------------
1 | A | 70
2 | B | 40

Simple GROUP BY and SUM should work.
SELECT ID, NAME, SUM([NO])
FROM Your_TableName
GROUP BY ID, NAME;

Use SUM and GROUP BY
SELECT ID,NAME, SUM(NO) AS TOTAL_NO FROM TBL_NAME GROUP BY ID, NAME
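The question targets SQL Server, but the same GROUP BY and SUM pattern can be checked quickly in SQLite through Python's sqlite3 module. In this sketch the table name `items` is hypothetical, and the `no` column is double-quoted as a precaution because NO is on SQLite's keyword list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "items" is a made-up table name; "no" is quoted because NO is an SQLite keyword.
conn.execute('CREATE TABLE items (id INTEGER, name TEXT, "no" INTEGER)')
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "A", 10), (1, "A", 20), (1, "A", 40), (2, "B", 20), (2, "B", 20)],
)

# One output row per (id, name) pair, with "no" summed within the group.
totals = conn.execute(
    'SELECT id, name, SUM("no") AS total_no '
    "FROM items GROUP BY id, name ORDER BY id"
).fetchall()

print(totals)
```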

SELECT name, SUM(no) AS no FROM TABLE_NAME GROUP BY name
This returns one row per name, with the no column summed across all rows that share that name.

SQL : Getting duplicate rows along with other variables

I am working with Teradata SQL. I would like to get the duplicate rows together with their count and the other columns. I can only find ways to get the count, not the other columns along with it.
Available input
+---------+----------+----------------------+
| id | name | Date |
+---------+----------+----------------------+
| 1 | abc | 21.03.2015 |
| 1 | def | 22.04.2015 |
| 2 | ajk | 22.03.2015 |
| 3 | ghi | 23.03.2015 |
| 3 | ghi | 23.03.2015 |
Expected output :
+---------+----------+----------------------+
| id | name | count | // Other fields
+---------+----------+----------------------+
| 1 | abc | 2 |
| 1 | def | 2 |
| 2 | ajk | 1 |
| 3 | ghi | 2 |
| 3 | ghi | 2 |
What I am looking for:
I want all duplicate rows, where duplication is decided by ID, and I want to retrieve the rows themselves, not just the count.
All I have so far is:
SELECT
id, name, other-variables, COUNT(*)
FROM
Table_NAME
GROUP BY
id, name
HAVING
COUNT(*) > 1
This is not showing correct data. Thank you.

You could use a window aggregate function, like this:
SELECT *
FROM (
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
) AS sub
WHERE duplicates > 1
Using a Teradata extension to ISO SQL syntax, you can simplify the above to:
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
QUALIFY duplicates > 1
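SQLite has no QUALIFY clause, but the subquery form of the window aggregate can be tried there through Python's sqlite3 module (assuming SQLite 3.25+). The sketch reuses the `users` table name from the answer and the sample rows from the question; only id and name are selected to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "abc", "21.03.2015"),
        (1, "def", "22.04.2015"),
        (2, "ajk", "22.03.2015"),
        (3, "ghi", "23.03.2015"),
        (3, "ghi", "23.03.2015"),
    ],
)

# COUNT(*) OVER (PARTITION BY id) attaches the group size to every row,
# so the original rows are kept and can then be filtered on that size.
duplicate_rows = conn.execute(
    """
    SELECT id, name, duplicates
    FROM (
        SELECT id, name,
               COUNT(*) OVER (PARTITION BY id) AS duplicates
        FROM users
    ) AS sub
    WHERE duplicates > 1
    ORDER BY id, name
    """
).fetchall()

print(duplicate_rows)
```

Dropping the WHERE clause would keep the id 2 row with a count of 1, which matches the expected output shown in the question.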

As an alternative to the accepted and perfectly correct answer, you can use:
SELECT {all your required 'variables' (they are not variables, but attributes)}
, cnt.Count_Dups
FROM Table_NAME TN
INNER JOIN (
SELECT id
, COUNT(1) Count_Dups
FROM Table_NAME
GROUP BY id
HAVING COUNT(1) > 1 -- If you want only duplicates
) cnt
ON cnt.id = TN.id
edit: According to your edit, duplicates are on id only. Edited my query accordingly.

try this,
SELECT
id, COUNT(id)
FROM
Table_NAME
GROUP BY
id
HAVING
COUNT(id) > 1
