SQL query to determine that values in a column are unique - sql

How do I write a query that just determines whether the values in a column are unique?

Try this:
SELECT CASE WHEN count(distinct col1)= count(col1)
THEN 'column values are unique' ELSE 'column values are NOT unique' END
FROM tbl_name;
Note: This only works if 'col1' does not have the data type 'ntext' or 'text'. If you have one of these data types, use 'distinct CAST(col1 AS nvarchar(4000))' (or similar) instead of 'distinct col1'.

select count(distinct column_name), count(column_name)
from table_name;
If the # of unique values is equal to the total # of values, then all values are unique.
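To try the count comparison without a database server, here is a minimal sketch that assumes SQLite through Python's standard sqlite3 module. The helper name column_is_unique and the sample rows are made up for illustration; only the COUNT(DISTINCT ...) = COUNT(...) comparison comes from the answers above. Note that both counts skip NULLs, so NULLs never count as duplicates here.

```python
import sqlite3

def column_is_unique(conn, table, column):
    """Return True when every non-NULL value in `column` occurs only once."""
    # Identifiers cannot be bound as parameters, so they are interpolated here;
    # only pass trusted table and column names.
    sql = f"SELECT COUNT(DISTINCT {column}) = COUNT({column}) FROM {table}"
    (flag,) = conn.execute(sql).fetchone()
    return bool(flag)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_name (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO tbl_name VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20)],
)

print(column_is_unique(conn, "tbl_name", "col1"))  # col1 has no repeats
print(column_is_unique(conn, "tbl_name", "col2"))  # col2 repeats the value 10
```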

IF NOT EXISTS (
SELECT
column_name
FROM
your_table
GROUP BY
column_name
HAVING
COUNT(*)>1
)
PRINT 'All are unique'
ELSE
PRINT 'Some are not unique'
If you want to list those that aren't unique, just take the inner query and run it. HTH.
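The IF NOT EXISTS ... PRINT wrapper is T-SQL, so as a rough equivalent here is a sketch that assumes SQLite via Python's sqlite3 module and moves the branching into Python. The sample rows and the added freq column are illustrative assumptions; the GROUP BY / HAVING COUNT(*) > 1 query is the inner query from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (column_name TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?)",
    [("a",), ("b",), ("b",), ("c",), ("c",), ("c",)],
)

# The inner query from the answer, with the per-value count added.
duplicates = conn.execute(
    """
    SELECT column_name, COUNT(*) AS freq
    FROM your_table
    GROUP BY column_name
    HAVING COUNT(*) > 1
    ORDER BY column_name
    """
).fetchall()

if duplicates:
    print("Some are not unique:", duplicates)
else:
    print("All are unique")
```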

With the following query, you not only see whether your columns are unique, you also see which combination is the least unique. Furthermore, because you still see a frequency of 1 when your key is unique, you know the query actually returned results rather than simply finding nothing, which is less clear when using a HAVING clause.
SELECT Col1, Col2, COUNT(*) AS Freq
FROM Table
GROUP BY Col1, Col2
ORDER BY Freq DESC

Are you trying to return only distinct values of a column? If so, you can use the DISTINCT keyword. The syntax is:
SELECT DISTINCT column_name,column_name
FROM table_name;

If you want to check if all the values are unique and you care about NULL values, then do something like this:
select (case when count(distinct column_name) = count(column_name) and
(count(column_name) = count(*) or count(column_name) = count(*) - 1)
then 'All Unique'
else 'Duplicates'
end)
from table t;
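The effect of the extra NULL condition is easiest to see on a few small inputs. This is a sketch assuming SQLite via Python's sqlite3 module; the check helper and the table t are hypothetical. The condition COUNT(*) - COUNT(column_name) <= 1 is an equivalent way of writing the two-way OR above: it allows at most one NULL.

```python
import sqlite3

def check(values):
    """Run the NULL-aware uniqueness check against a throwaway table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (column_name INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    (verdict,) = conn.execute(
        """
        SELECT CASE
                 WHEN COUNT(DISTINCT column_name) = COUNT(column_name)
                  AND COUNT(*) - COUNT(column_name) <= 1  -- at most one NULL
                 THEN 'All Unique'
                 ELSE 'Duplicates'
               END
        FROM t
        """
    ).fetchone()
    conn.close()
    return verdict

print(check([1, 2, 3]))        # no NULLs, no repeats
print(check([1, 2, None]))     # a single NULL is still treated as unique
print(check([1, None, None]))  # two NULLs are treated as duplicates
```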

select (case when count(distinct column1 ) = count(column1)
then 'Unique'
else 'Duplicates'
end)
from table_name

As I understand it, you want to know which values in a column are unique. SELECT DISTINCT doesn't solve that problem, because it lists each value once whether or not it is actually unique in the table.
A simple solution is as follows:
SELECT COUNT(column_name), column_name
FROM table_name
GROUP BY column_name
HAVING COUNT(column_name) = 1;

This code returns the values that occur only once:
SELECT code FROM #test
group by code
having count(distinct code)= count(code)
This returns 14, which is the only unique value.

Use the DISTINCT keyword inside a COUNT aggregate function as shown below:
SELECT COUNT(DISTINCT column_name) AS some_alias FROM table_name
The above query will give you the count of distinct values in that column.

Related

Find all pairs of identical columns in different tables in a database

I would like to find identical column headers in different tables throughout a database (or across databases). I am trying to learn how the unique or foreign keys in each table fit with keys in other tables in a multi-database SQL environment (using Teradata), and I think such a query would expedite this process.
I know how to query the database name, table name, and column name, but I don't know how to specify a condition that returns only the column headers in one table that also exist in a different table.
Here is some sample code that I think is a starting point for this type of query, followed by the kind of output I am after:
select DatabaseName,TABLENAME as Tab1,Columnname as Col1, TABLENAME as Tab2, Columnname as Col2
from DBC.ColumnsV
order by DatabaseName,TABLENAME;
DatabaseName Tab1 Col1 Tab2 Col2
Dat1 Table0 Col0 Table9 Col0
Andrew's query, simplified:
SELECT DatabaseName, TableName, ColumnName,
Count(*) Over (PARTITION BY ColumnName) AS Cnt
FROM dbc.ColumnsV
QUALIFY Cnt > 1 -- only repeated columns
I think this is enough data to work with, but if you really want pairs of tables you need a self join:
WITH cte AS
(
SELECT DatabaseName, TableName, ColumnName,
Count(*) Over (PARTITION BY ColumnName) AS Cnt
FROM dbc.ColumnsV
WHERE databasename = 'open_data'
QUALIFY Cnt > 1 -- only repeated columns
)
SELECT *
FROM cte AS t1
JOIN cte AS t2
ON t1.ColumnName = t2.ColumnName -- same column
WHERE t1.DatabaseName || '.' || t1.TableName < t2.DatabaseName || '.' || t2.TableName
Of course, this will greatly increase the number of rows. It returns each pair of tables once, so you get n*(n-1)/2 rows for n tables with the same column name.
If you change the condition to <> instead of <, you get all combinations and twice the number of rows, i.e. both table1,table2 and table2,table1.
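The row counts for < versus <> can be checked on a toy catalog. This sketch assumes SQLite via Python's sqlite3 module, and columns_v is a hypothetical stand-in for dbc.ColumnsV with only three columns; QUALIFY is Teradata-specific, so it is left out and the self-join alone does the filtering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for dbc.ColumnsV with only the three columns used here.
conn.execute(
    "CREATE TABLE columns_v (DatabaseName TEXT, TableName TEXT, ColumnName TEXT)"
)
conn.executemany(
    "INSERT INTO columns_v VALUES (?, ?, ?)",
    [
        ("Dat1", "Table0", "Col0"),
        ("Dat1", "Table5", "Col0"),
        ("Dat1", "Table9", "Col0"),
        ("Dat1", "Table9", "Col1"),  # appears in one table only
    ],
)

PAIRS_SQL = """
    SELECT t1.TableName, t2.TableName, t1.ColumnName
    FROM columns_v AS t1
    JOIN columns_v AS t2
      ON t1.ColumnName = t2.ColumnName
    WHERE t1.DatabaseName || '.' || t1.TableName
          {op} t2.DatabaseName || '.' || t2.TableName
    ORDER BY 1, 2
"""

once = conn.execute(PAIRS_SQL.format(op="<")).fetchall()   # each pair once
both = conn.execute(PAIRS_SQL.format(op="<>")).fetchall()  # both orderings
print(len(once), len(both))
```

With three tables sharing Col0, the < version yields 3*(3-1)/2 = 3 pairs and the <> version yields 6.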
First we'll get a list of column names that are duplicated. Then we join that back to ColumnsV and get whatever information you want on those columns.
with cols as (
select
columnname ,
count (*) as cnt
from
dbc.columnsv
group by columnname
having count (*) > 1)
select
columnsv.*
from
dbc.columnsv
inner join cols
on columnsv.columnname = cols.columnname

Using a value from one query in second query sql

SELECT AS, COUNT(*)
FROM Table1
HAVING COUNT(AS)>1
group BY AS;
This produces the result
AS COUNT
5 2
I then want to use the AS value in another query and only output the end result. Is this possible? I was thinking of something like:
SELECT *
FROM
TABLE 2
Where AS =(
SELECT AS, COUNT(*)
FROM Table1
HAVING COUNT(AS)>1
group BY AS;
);
This is called a subquery. To be safe, you would use in instead of = (and as is a bad name for a column, because it is a SQL key word):
SELECT *
FROM TABLE2
WHERE col IN (SELECT col
FROM Table1
GROUP BY col
HAVING COUNT(col) > 1
);
Your first query is also incorrect, because the having clause goes after the group by.
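Here is a small runnable sketch of the IN subquery, assuming SQLite via Python's sqlite3 module. The column name col follows the answer above; the label column and the sample rows are made up. Two values are duplicated in Table1, which is exactly the case where = would not be safe and IN is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (col INTEGER);
    CREATE TABLE Table2 (col INTEGER, label TEXT);
    INSERT INTO Table1 VALUES (5), (5), (7), (8), (8);
    INSERT INTO Table2 VALUES (5, 'five'), (7, 'seven'), (8, 'eight');
    """
)

# GROUP BY comes before HAVING; the subquery returns one column only.
rows = conn.execute(
    """
    SELECT col, label
    FROM Table2
    WHERE col IN (SELECT col
                  FROM Table1
                  GROUP BY col
                  HAVING COUNT(col) > 1)
    ORDER BY col
    """
).fetchall()
print(rows)
```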
You could use a subquery with the in operator:
SELECT *
FROM table2
WHERE AS IN (SELECT AS
FROM table1
GROUP BY AS
HAVING COUNT(*) > 1)

T-SQL Writing a Count Statement to find two values

I am trying to get two columns to appear. I have made a union of a number of tables, so they now appear as one table.
On this combined table I now need to do a summary count of one column.
This column contains two values, so I need a count of text value 1 and a count of text value 2 in the column.
select count (column_name) as column_name
FROM table name
where column_name = 'value1'
But I am not sure how to add value 2 to this statement. Any help would be great. Much appreciated.
You can use pivot, but I think conditional aggregation is easier in this case:
select sum(case when column_name = 'value1' then 1 else 0 end) as value1,
sum(case when column_name = 'value2' then 1 else 0 end) as value2
from table name;
If you can live with the values on two rows instead of in two columns, use group by:
select column_name, count(*)
from table name
group by column_name;
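The two result shapes can be compared side by side. This is a sketch assuming SQLite via Python's sqlite3 module; the table name combined and its rows are hypothetical, and the WHERE filter on the GROUP BY variant is an addition so that only the two values of interest are listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE combined (column_name TEXT)")
conn.executemany(
    "INSERT INTO combined VALUES (?)",
    [("value1",), ("value2",), ("value1",), ("other",), ("value1",)],
)

# Conditional aggregation: one row, one column per value.
wide = conn.execute(
    """
    SELECT SUM(CASE WHEN column_name = 'value1' THEN 1 ELSE 0 END) AS value1,
           SUM(CASE WHEN column_name = 'value2' THEN 1 ELSE 0 END) AS value2
    FROM combined
    """
).fetchone()

# GROUP BY: one row per value.
tall = conn.execute(
    """
    SELECT column_name, COUNT(*)
    FROM combined
    WHERE column_name IN ('value1', 'value2')
    GROUP BY column_name
    ORDER BY column_name
    """
).fetchall()

print(wide)
print(tall)
```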
I'm not sure what you want, but from what I understand, I think this will help you:
select
Sum ( case when column_name = 'value1' then 1 else 0 end) as CountValue1,
Sum ( case when column_name = 'value2' then 1 else 0 end) as CountValue2
FROM table name
select column_name, count (*)
FROM
(
select column_name from table1
union all
select column_name from table2
) src
where column_name in ( 'value1' ,'value2')
group by column_name

Gathering Database table information?

I am trying to get the information below for a table, in tabular format. I can already get the row count, column name and attribute (data type).
No.Of columns
No.Of Rows Count
Column name
Attribute (DataType)
Min Value
Max Value
Non null count
Distinct count of the column
Any idea?
Many of these items can be found in the INFORMATION_SCHEMA.COLUMNS view, and the rest can be found by querying the table itself. You say you want this data in a tabular format, but many of the items do not 'fit' together. Can you provide a sample of what the result set should look like?
-- No.Of columns
SELECT COUNT(*)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'your_table'
-- No.Of Rows Count
SELECT COUNT(*)
FROM your_table
--Column name
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'your_table'
--Attribute (DataType)
SELECT DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'your_table'
--Min Value
SELECT MIN(column_1)
FROM your_table
--Max Value
SELECT MAX(column_1)
FROM your_table
--Non null count
SELECT SUM(CASE WHEN column_1 IS NOT NULL THEN 1 ELSE 0 END) AS not_null_count
FROM your_table
--Distinct count of the column
SELECT COUNT(DISTINCT column_1)
FROM your_table
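Several of the per-column figures can be gathered in a single row. The sketch below assumes SQLite via Python's sqlite3 module, where PRAGMA table_info plays the role of INFORMATION_SCHEMA.COLUMNS (an SQLite-specific substitution); the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (column_1 INTEGER, column_2 TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(3, "x"), (1, "y"), (None, "z"), (3, "x")],
)

# Column names and declared types; each PRAGMA table_info row is
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(your_table)")]

# Row count, min, max, non-null count and distinct count for one column.
profile = conn.execute(
    """
    SELECT COUNT(*)                 AS row_count,
           MIN(column_1)            AS min_value,
           MAX(column_1)            AS max_value,
           COUNT(column_1)          AS non_null_count,
           COUNT(DISTINCT column_1) AS distinct_count
    FROM your_table
    """
).fetchone()

print(len(columns), columns)
print(profile)
```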

SQL to find the number of distinct values in a column

I can select all the distinct values in a column in the following ways:
SELECT DISTINCT column_name FROM table_name;
SELECT column_name FROM table_name GROUP BY column_name;
But how do I get the row count from that query? Is a subquery required?
You can use the DISTINCT keyword within the COUNT aggregate function:
SELECT COUNT(DISTINCT column_name) AS some_alias FROM table_name
This will count only the distinct values for that column.
This will give you BOTH the distinct column values and the count of each value. I usually find that I want to know both pieces of information.
SELECT [columnName], count([columnName]) AS CountOf
FROM [tableName]
GROUP BY [columnName]
A SQL count of each of column_name's unique values, sorted by frequency:
SELECT column_name, COUNT(*) FROM table_name GROUP BY column_name ORDER BY 2 DESC;
Be aware that Count() ignores null values, so if you need to allow for null as its own distinct value you can do something tricky like:
select count(distinct my_col)
+ count(distinct Case when my_col is null then 1 else null end)
from my_table
/
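To see the NULL adjustment at work, here is a sketch assuming SQLite via Python's sqlite3 module with made-up rows. The second expression is the same trick as in the query above: the CASE yields 1 only for NULL rows, so its distinct count is 1 if any NULL exists and 0 otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (my_col TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [("a",), ("b",), ("a",), (None,), (None,)],
)

without_null, with_null = conn.execute(
    """
    SELECT COUNT(DISTINCT my_col),
           COUNT(DISTINCT my_col)
             + COUNT(DISTINCT CASE WHEN my_col IS NULL THEN 1 ELSE NULL END)
    FROM my_table
    """
).fetchone()

print(without_null, with_null)
```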
SELECT COUNT(DISTINCT column_name) AS column_name_count FROM table_name;
You've got to count that distinct column, then give the count an alias.
select count(*) from
(
SELECT distinct column1,column2,column3,column4 FROM abcd
) T
This will give count of distinct group of columns.
select Count(distinct columnName) as columnNameCount from tableName
Using following SQL we can get the distinct column value count in Oracle 11g.
select count(distinct(Column_Name)) from TableName
With MS SQL Server 2012 and later, you can use a window function too.
SELECT column_name, COUNT(column_name) OVER (PARTITION BY column_name)
FROM table_name
GROUP BY column_name
To do this in Presto using OVER:
SELECT DISTINCT my_col,
count(*) OVER (PARTITION BY my_col
ORDER BY my_col) AS num_rows
FROM my_tbl
Using this OVER based approach is of course optional. In the above SQL, I found specifying DISTINCT and ORDER BY to be necessary.
Caution: As per the docs, using GROUP BY may be more efficient.
select count(distinct(column_name)) AS columndatacount from table_name where somecondition=true
You can use this query, to count different/distinct data.
Without using DISTINCT this is how we could do it-
SELECT COUNT(C)
FROM (SELECT COUNT(column_name) as C
FROM table_name
GROUP BY column_name)
Count({fieldname}) is not a substitute for Count(distinct({fieldname}))
Count({fieldname}) counts every non-NULL value of the field, duplicates included, so it does not give you the number of distinct values. It is also not the same as Count(*), which counts all rows, including those where the field is NULL.