SQL Server : Getting distinct count on every column in a large view - sql

I have a large SQL Server 2012 database with a couple of views I need to analyse.
What I want to know for each view is the number of unique values of each column in the view. I could not find any script yet that would give me this.
So the input should be the view name and the output would be two columns like:
Column Uniques
accountid 200
accountname 178
numberofemp 23
telephone 154
notusedyet 0

You need to use COUNT() (an aggregate function) with Distinct to count only the unique values.
SELECT [column], COUNT(DISTINCT value) [Uniques]
FROM tableName
GROUP BY [column]

Get a distinct count for each column via count(distinct [ColA]) for each column you want to count (no group by). You can then unpivot to get the tabular format you desire. Here's an example:
;with DistinctColumnCount( Id, Description )
as
(
select
count(distinct Id) Id
, count(distinct Description) Description
from
EntityB
)
SELECT CountColumn, [Count].[DistinctCount]
FROM
DistinctColumnCount
unpivot
( DistinctCount for CountColumn in ( Id, [Description] ) ) as [Count]
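The question asks for something that takes only a view name, so the column list has to be generated rather than typed by hand. Here is a minimal sketch of that idea, using Python's built-in sqlite3 module instead of SQL Server so it runs anywhere. The helper name distinct_counts, the account table and the account_view view are made up for illustration. On SQL Server you would most likely read the column names from INFORMATION_SCHEMA.COLUMNS and build the same COUNT(DISTINCT ...) list with dynamic SQL.

```python
import sqlite3

def distinct_counts(conn, relation):
    """Return [(column, distinct_count), ...] for a table or view.

    `relation` is interpolated into the SQL text, so only pass trusted names.
    """
    # Read the column names from an empty result set.
    cur = conn.execute(f'SELECT * FROM "{relation}" LIMIT 0')
    columns = [d[0] for d in cur.description]
    # One COUNT(DISTINCT ...) per column, all in a single query.
    select_list = ", ".join(f'COUNT(DISTINCT "{c}")' for c in columns)
    row = conn.execute(f'SELECT {select_list} FROM "{relation}"').fetchone()
    return list(zip(columns, row))

# Hypothetical sample data, loosely modelled on the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (accountid INTEGER, accountname TEXT, notusedyet TEXT)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(1, "Acme", None), (2, "Acme", None), (3, "Globex", None)],
)
conn.execute("CREATE VIEW account_view AS SELECT * FROM account")

result = distinct_counts(conn, "account_view")
for column, uniques in result:
    print(column, uniques)
```

COUNT(DISTINCT ...) ignores NULLs, which is why a column that was never filled in reports 0, matching the notusedyet row in the question.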

Related

Using Derby SQL to calculate value for histogram

I have a table with various SKU in totes.
The table is totecontents with below columns:
ToteID
SKU
Each Tote can contain a maximum of 6 SKUs. (programmatically constrained)
select toteid, count(*) as qtypertote
from totecontents
group by toteid;
gives me a list of totes with the number of skus in each.
I now want to get to a table with following result
SkuCount, Occurrences, where each row would have the ordinal value (1 through 6) and then the number of occurrences of that value.
My efforts included the following approach
select count(*)
from
( select toteid, count(*) as qtypertote
from totecontents
group by toteid)
group by qtypertote;
Stung by the comments I performed more research. This works:
SELECT CountOfskus, COUNT(1) groupedCount
FROM
( SELECT COUNT(*) as countofskus, toteid
FROM totecontents
Group By toteid
) MyTable
GROUP BY countofskus;
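To see the two-level grouping in action, here is a small runnable sketch using Python's sqlite3 module rather than Derby. The sample rows and the per_tote alias are invented for illustration. The inner query counts SKUs per tote and the outer query counts how many totes share each count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totecontents (toteid INTEGER, sku TEXT)")
# Hypothetical data: totes 1 and 3 hold two SKUs, totes 2 and 4 hold one.
conn.executemany(
    "INSERT INTO totecontents VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (3, "C"), (3, "D"), (4, "E")],
)

histogram = conn.execute(
    """
    SELECT countofskus, COUNT(*) AS occurrences
    FROM (
        SELECT toteid, COUNT(*) AS countofskus
        FROM totecontents
        GROUP BY toteid
    ) AS per_tote          -- the derived table needs an alias in many engines
    GROUP BY countofskus
    ORDER BY countofskus
    """
).fetchall()

for sku_count, occurrences in histogram:
    print(sku_count, occurrences)
```

Note that counts with no totes at all (for example 3 through 6 here) simply do not appear in the result. Showing them as zero would need a join against a list of the values 1 to 6.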

How to delete the duplicate data in table (Postgres)

I want to delete the duplicated data in a table. I know there is a way to use
SELECT
fruit,
COUNT( fruit )
FROM
basket
GROUP BY
fruit
HAVING
COUNT( fruit )> 1
ORDER BY
fruit;
to find them, but I need to check that every column's value is equal, which means tableA.* = tableA.* (except id; id is the auto-increment primary key)
and I tried this:
SELECT
*,
COUNT( * )
FROM
myTable
GROUP BY
*
HAVING
COUNT( * )> 1
ORDER BY
id;
but it says I can't use GROUP BY *, so how can I find and delete the duplicated data (every column's value must be equal except id)?
Use
SELECT DISTINCT *
DISTINCT removes duplicated rows from the result.
You need to try something similar to the query below. Apply PARTITION BY to the columns other than Id (as it is an incrementing unique value). PARTITION BY should list the columns you want to check for duplicates.
Also refer to Row_Number in Postgres & Common Table expression in Postgres
WITH DuplicateTableRows AS
(
SELECT Id, Row_Number() OVER (PARTITION BY col1, col2... ORDER BY Id) AS rn
FROM
Table1
)
DELETE FROM Table1
WHERE Id IN (SELECT Id FROM DuplicateTableRows WHERE rn > 1)
You can do this using JSON:
select (to_jsonb(b) - 'id')
from basket b
group by 1
having count(*) > 1;
The result is as JSON. Unfortunately, to extract the values back into a record, you need to list the columns individually.
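For anyone who wants to try the delete end to end, here is a rough sketch using Python's sqlite3 module instead of Postgres. It swaps the window function for a plainer equivalent: keep the lowest id in each group of identical rows and delete the rest. The color column and the sample rows are assumptions added for illustration, and the non-id columns still have to be listed by hand in the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE basket (id INTEGER PRIMARY KEY, fruit TEXT, color TEXT)"
)
# Hypothetical data: ids 1-2 are duplicates, and so are ids 4-6.
conn.executemany(
    "INSERT INTO basket (fruit, color) VALUES (?, ?)",
    [
        ("apple", "red"),
        ("apple", "red"),
        ("apple", "green"),
        ("pear", "green"),
        ("pear", "green"),
        ("pear", "green"),
    ],
)

# Keep the smallest id per (fruit, color) group and delete everything else.
conn.execute(
    """
    DELETE FROM basket
    WHERE id NOT IN (
        SELECT MIN(id)
        FROM basket
        GROUP BY fruit, color
    )
    """
)

remaining = conn.execute(
    "SELECT id, fruit, color FROM basket ORDER BY id"
).fetchall()
for row in remaining:
    print(row)
```

GROUP BY treats NULLs as equal to each other, so rows that match in every column including NULLs are also collapsed to one survivor.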

Avg Sql Query Always Returns int

I have one column for Farmer Names and one column for Town Names in my table TRY.
I want to find Average_Number_Of_Farmers_In_Each_Town.
Select TownName ,AVG(num)
FROM(Select TownName,Count(*) as num From try Group by TownName) a
group by TownName;
But this query always returns int values. How can I get float values too?
;WITH [TRY]([Farmer Name], [Town Name])
AS
(
SELECT N'Johny', N'Bucharest' UNION ALL
SELECT N'Miky', N'Bucharest' UNION ALL
SELECT N'Kinky', N'Ploiesti'
)
SELECT AVG(src.Cnt) AS Average
FROM
(
SELECT COUNT(*)*1.00 AS Cnt
FROM [TRY]
GROUP BY [TRY].[Town Name]
) src
Results:
Average
--------
1.500000
Without ... *1.00 the result will be (!) 1 (AVG(INT 2 , INT 1) -truncated-> INT 1, see section Return types).
Your query is always returning int logically because the average is not doing anything. Both the inner and the outer queries are grouping by town name -- so there is one value for each average, and that average is the count.
If you are looking for the overall average, then something like:
Select AVG(cast(cnt as float))
FROM (Select TownName, Count(*) as cnt
From try
Group by TownName
) t
You can also do this without the subquery as:
select cast(count(*) as float) /count(distinct TownName)
from try;
EDIT:
The assumption was that each farmer in the town has one row in try. Are you just trying to count the number of distinct farmers in each town? Assuming you have a field like FarmerName that identifies a given farmer, that would be:
select TownName, count(distinct FarmerName)
from try
group by TownName;
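The underlying issue is integer arithmetic, which can be reproduced outside SQL Server. Below is a minimal sketch with Python's sqlite3 module. One caveat: SQLite's AVG always returns a floating-point value, unlike SQL Server's, so the sketch demonstrates the count(*) / count(distinct ...) variant, where integer division truncates in the same way. The table name farmers is a stand-in for try.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "farmers" is a hypothetical stand-in for the table called try in the question.
conn.execute("CREATE TABLE farmers (FarmerName TEXT, TownName TEXT)")
conn.executemany(
    "INSERT INTO farmers VALUES (?, ?)",
    [("Johny", "Bucharest"), ("Miky", "Bucharest"), ("Kinky", "Ploiesti")],
)

# Integer divided by integer: 3 farmers / 2 towns is truncated to 1.
int_avg = conn.execute(
    "SELECT COUNT(*) / COUNT(DISTINCT TownName) FROM farmers"
).fetchone()[0]

# Casting one operand to a floating-point type keeps the fraction.
float_avg = conn.execute(
    "SELECT CAST(COUNT(*) AS REAL) / COUNT(DISTINCT TownName) FROM farmers"
).fetchone()[0]

print(int_avg, float_avg)
```

The same cast (or the * 1.00 trick from the first answer) is what makes the SQL Server queries above return a fractional average.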

sql query - filtering duplicate values to create report

I am trying to list all the duplicate records in a table. This table does not have a Primary Key and has been specifically created only for creating a report to list out duplicates. It contains both unique and duplicate values.
The query I have so far is:
SELECT [OfficeCD]
,[NewID]
,[Year]
,[Type]
FROM [Test].[dbo].[Duplicates]
GROUP BY [OfficeCD]
,[NewID]
,[Year]
,[Type]
HAVING COUNT(*) > 1
This works right and gives me all the duplicates - that is the number of times it occurs.
But I want to display all the values in my report of all the columns. How can I do that without querying for each record separately?
For example:
Each table has 10 fields and [NewID] is the field which is occurring multiple times. I need to create a report with all the data in all the fields where NewID has been duplicated.
Please help.
Thank you.
You need a subquery:
SELECT * FROM yourtable
WHERE NewID IN (
SELECT NewID FROM yourtable
GROUP BY OfficeCD,NewID,Year,Type
HAVING Count(*)>1
)
Additionally, you might want to check your tags: you tagged mysql, but the syntax makes me think you mean sql-server.
Try this:
SELECT * FROM [Duplicates] WHERE NewID IN
(
SELECT [NewID] FROM [Duplicates] GROUP BY [NewID] HAVING COUNT(*) > 1
)
select d.*
from Duplicates d
inner join (
select NewID
from Duplicates
group by NewID
having COUNT(*) > 1
) dd on d.NewID = dd.NewID
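The answers above differ in what they group by, and that changes which rows come back. Here is a small sketch using Python's sqlite3 module (not SQL Server) with invented sample rows: grouping by NewID alone flags every row whose NewID repeats, while grouping by all four columns only flags NewIDs that belong to a fully identical row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Duplicates (OfficeCD TEXT, NewID INTEGER, Year INTEGER, Type TEXT)"
)
# Hypothetical data: NewID 1 repeats with different details,
# NewID 3 repeats as an exact copy, NewID 2 is unique.
conn.executemany(
    "INSERT INTO Duplicates VALUES (?, ?, ?, ?)",
    [
        ("A", 1, 2020, "x"),
        ("B", 1, 2021, "y"),
        ("A", 2, 2020, "x"),
        ("C", 3, 2019, "z"),
        ("C", 3, 2019, "z"),
    ],
)

# Every row whose NewID occurs more than once.
by_newid = conn.execute(
    """
    SELECT d.*
    FROM Duplicates d
    INNER JOIN (
        SELECT NewID FROM Duplicates GROUP BY NewID HAVING COUNT(*) > 1
    ) dd ON d.NewID = dd.NewID
    ORDER BY d.NewID, d.OfficeCD
    """
).fetchall()

# Only NewIDs that appear in a fully identical (all four columns) row.
by_full_row = conn.execute(
    """
    SELECT * FROM Duplicates
    WHERE NewID IN (
        SELECT NewID FROM Duplicates
        GROUP BY OfficeCD, NewID, Year, Type
        HAVING COUNT(*) > 1
    )
    """
).fetchall()

print(len(by_newid), len(by_full_row))
```

Which one is right depends on what "duplicate" means for the report. The question says NewID is the repeating field, which points to grouping by NewID only.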

Selecting COUNT(*) with DISTINCT

In SQL Server 2005 I have a table cm_production that lists all the code that's been put into production. The table has a ticket_number, program_type, program_name and push_number along with some other columns.
GOAL: Count all the DISTINCT program names by program type and push number.
What I have so far is:
DECLARE @push_number INT;
SET @push_number = [HERE_ADD_NUMBER];
SELECT DISTINCT COUNT(*) AS Count, program_type AS [Type]
FROM cm_production
WHERE push_number=@push_number
GROUP BY program_type
This gets me partway there, but it's counting all the program names, not the distinct ones (which I don't expect it to do in that query). I guess I just can't wrap my head around how to tell it to count only the distinct program names without selecting them. Or something.
Count all the DISTINCT program names by program type and push number
SELECT COUNT(DISTINCT program_name) AS Count,
program_type AS [Type]
FROM cm_production
WHERE push_number=@push_number
GROUP BY program_type
DISTINCT COUNT(*) will return a row for each unique count. What you want is COUNT(DISTINCT <expression>): evaluates expression for each row in a group and returns the number of unique, non-null values.
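A quick way to see the difference is to put both counts side by side. This is only a sketch, run through Python's sqlite3 module rather than SQL Server 2005, with made-up rows and a bound parameter in place of the T-SQL variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cm_production (program_type TEXT, program_name TEXT, push_number INTEGER)"
)
# Hypothetical data: program "a" was pushed twice in push 1.
conn.executemany(
    "INSERT INTO cm_production VALUES (?, ?, ?)",
    [
        ("sql", "a", 1),
        ("sql", "a", 1),
        ("sql", "b", 1),
        ("job", "c", 1),
        ("sql", "d", 2),
    ],
)

push_number = 1
counts = conn.execute(
    """
    SELECT program_type,
           COUNT(*) AS all_rows,                       -- counts every row
           COUNT(DISTINCT program_name) AS distinct_names  -- counts unique non-null names
    FROM cm_production
    WHERE push_number = ?
    GROUP BY program_type
    ORDER BY program_type
    """,
    (push_number,),
).fetchall()

for program_type, all_rows, distinct_names in counts:
    print(program_type, all_rows, distinct_names)
```

For the "sql" type the two numbers differ (3 rows, 2 distinct names), which is exactly the gap the question describes.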
I needed to get the number of occurrences of each distinct value. The column contained Region info.
The simple SQL query I ended up with was:
SELECT Region, count(*)
FROM item
WHERE Region is not null
GROUP BY Region
Which would give me a list like, say:
Region, count
Denmark, 4
Sweden, 1
USA, 10
You have to create a derived table for the distinct columns and then query the count from that table:
SELECT COUNT(*)
FROM (SELECT DISTINCT column1,column2
FROM tablename
WHERE condition ) as dt
Here dt is a derived table.
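The derived-table form is mainly useful when "distinct" spans more than one column, since not every engine accepts several columns inside COUNT(DISTINCT ...). A minimal sketch with Python's sqlite3 module, keeping the answer's placeholder names tablename, column1 and column2 and using invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (column1 INTEGER, column2 TEXT)")
# Hypothetical data: the pair (1, 'a') appears twice.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "a"), (1, "a"), (1, "b"), (2, "a")],
)

# Deduplicate the pairs first, then count the rows of the derived table.
distinct_pairs = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT column1, column2 FROM tablename) AS dt
    """
).fetchone()[0]

print(distinct_pairs)
```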
SELECT COUNT(DISTINCT program_name) AS Count, program_type AS [Type]
FROM cm_production
WHERE push_number=@push_number
GROUP BY program_type
try this:
SELECT
COUNT(program_name) AS [Count],program_type AS [Type]
FROM (SELECT DISTINCT program_name,program_type
FROM cm_production
WHERE push_number=@push_number
) dt
GROUP BY program_type
You can try the following query.
SELECT column1,COUNT(*) AS Count
FROM tablename where createddate >= '2022-07-01'::date group by column1
Here is an example that counts rows per pincode, where the pincode is stored as the last six characters of the address field:
SELECT DISTINCT
RIGHT (address, 6),
count(*) AS count
FROM
datafile
WHERE
address IS NOT NULL
GROUP BY
RIGHT (address, 6)