joining two columns sql query - sql

I'm currently stuck on a problem: I want to combine two columns, each produced by its own query, in a single select statement, but I'm getting errors. These are the two queries:
SELECT DISTINCT column_name FROM owner_name.table_name ORDER BY column_name;
and
SELECT DISTINCT * FROM (SELECT count(column_name) OVER (partition by column_name) Amount from owner_name.table_name order by column_name);
In the second query, for every row, I count how many rows share that row's value.
The values of the two columns were posted as two screenshots (first column, second column), which are not reproduced here.
I don't know how to show both of them next to each other, as in a normal select statement:
SELECT column_1, column_2 FROM table;

You do not want to use an analytic function for this: it computes the COUNT for every row and then relies on DISTINCT to discard the repeated rows, which involves a lot of unnecessary calculation.
Instead, it is much more efficient to GROUP BY column_name and then aggregate, so that you only generate a single row for each group to start with:
SELECT column_name,
COUNT(column_name) AS amount
FROM owner_name.table_name
GROUP BY column_name
ORDER BY column_name;
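A minimal sketch of the GROUP BY query above, run against an in-memory SQLite database through Python's sqlite3 module. The table contents are invented for illustration, and the owner_name schema prefix is dropped because SQLite has no such schema here; the original question does not say which database it targets.

```python
import sqlite3

# Hypothetical in-memory table standing in for owner_name.table_name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_name TEXT)")
conn.executemany(
    "INSERT INTO table_name (column_name) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("a",), ("b",)],
)

# One row per distinct value, paired with the number of rows holding it.
grouped_rows = conn.execute(
    """
    SELECT column_name, COUNT(column_name) AS amount
    FROM table_name
    GROUP BY column_name
    ORDER BY column_name
    """
).fetchall()

for value, amount in grouped_rows:
    print(value, amount)
```

Each distinct value appears once with its count beside it, which is the two-column layout the question asks for.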

If you do want to keep the analytic function, the equivalent query is:
SELECT DISTINCT
column_name,
COUNT(column_name) OVER (PARTITION BY column_name) Amount
FROM owner_name.table_name
ORDER BY column_name;

Related

What is the faster way to calculate number of duplicate rows present in Redshift Table

There are millions of records in the table, and I need to calculate the number of duplicate rows present in my Redshift table. I can do it with the query below:
select sum(cnt)
from (
select <primary_key>, count(*) - 1 as cnt
from table_name
group by <primary_key>
having count(*) > 1
) as dup;
Is there a faster way to achieve the same?
Is there a way to achieve this in a single query, without using a subquery?
Thanks.
You can try the following query:
SELECT Column_name, COUNT(*) Count_Duplicate
FROM Table_name
GROUP BY Column_name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
If the only criterion for duplication is a repeated primary key, then
SELECT count(1)-count(distinct <primary_key>) FROM your_table
would work, unless you have declared the column as a primary key in Redshift: it doesn't enforce the constraint, but if a column is marked as a primary key, count(distinct <primary_key>) will return the same value as count(1) even if the column contains duplicate values.
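As a rough check that the single-query form agrees with the subquery form from the question, here is a sketch using SQLite through Python's sqlite3 module. The table, the id column (standing in for <primary_key>), and the rows are hypothetical, and the column is deliberately not declared as a key. SQLite is not Redshift, so the primary-key caveat described above is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: "id" plays the role of <primary_key>, left undeclared.
conn.execute("CREATE TABLE your_table (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "x"), (1, "x"), (1, "x"), (2, "y"), (3, "z"), (3, "z")],
)

# Subquery form: sum the surplus rows of every key that occurs more than once.
subquery_count = conn.execute(
    """
    SELECT SUM(cnt) FROM (
        SELECT id, COUNT(*) - 1 AS cnt
        FROM your_table
        GROUP BY id
        HAVING COUNT(*) > 1
    ) AS dup
    """
).fetchone()[0]

# Single-query form: total rows minus the number of distinct keys.
single_query_count = conn.execute(
    "SELECT COUNT(1) - COUNT(DISTINCT id) FROM your_table"
).fetchone()[0]

print(subquery_count, single_query_count)
```

Both forms count the rows that would have to be removed to leave one row per key.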

count all the distinct records in a table

I need to count all the distinct records in a table with a single query, and without using any sub-query.
My code is
select count ( distinct *) from table_name
It gives an error:
Incorrect syntax near '*'.
I am using Microsoft SQL Server
Try this -
SELECT COUNT(*)
FROM
(SELECT DISTINCT * FROM [table_name]) A
I'm afraid that if you don't want to use a subquery, the only way to achieve that is replacing * with a concatenation of the columns in your table
select count(distinct concat(column1, column2, ..., columnN))
from table_name
To avoid undesired behaviour (like the concatenation of 1 and 31 becoming equal to the concatenation of 13 and 1), you could add a reasonable separator:
select count(distinct concat(column1, '$%&£', column2, '$%&£', ..., '$%&£', columnN))
from table_name
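The collision described above can be seen in a small sketch. It uses SQLite through Python's sqlite3 module rather than SQL Server, so the `||` operator stands in for concat(), and a plain '|' is used as the separator; the two-column table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-column table holding the (1, 31) and (13, 1) rows.
conn.execute("CREATE TABLE table_name (column1 INTEGER, column2 INTEGER)")
conn.executemany("INSERT INTO table_name VALUES (?, ?)", [(1, 31), (13, 1)])

# Without a separator both rows concatenate to the same string, '131'.
without_separator = conn.execute(
    "SELECT COUNT(DISTINCT column1 || column2) FROM table_name"
).fetchone()[0]

# With a separator the strings differ: '1|31' versus '13|1'.
with_separator = conn.execute(
    "SELECT COUNT(DISTINCT column1 || '|' || column2) FROM table_name"
).fetchone()[0]

print(without_separator, with_separator)
```

The separator only helps if it cannot occur inside the column values themselves, which is why the answer suggests an unusual string.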
You can use CTE.
;WITH CTE AS
(
SELECT DISTINCT * FROM TableName
)
SELECT COUNT(*)
FROM CTE
Hope this query gives you what you required.
As others mentioned, you cannot use DISTINCT with *. Also it is good practice to use a column name instead of the *, like a unique key / primary key of the table.
SELECT COUNT( DISTINCT id )
FROM table
select distinct Name , count(Name) from TableName
group by Name
having count(Name)=1
select @@rowcount
I had the same issue with a query that joined multiple tables, where I could not simply do count(distinct *) or count(distinct alias.*).
My solution was to create a string made up of the key columns I cared about and count them.
SELECT Count(DISTINCT person.first || '~' || person.last)
from person;
If you want to use the DISTINCT keyword, you need to specify the column on which you want the distinct records to be based.
Example:
SELECT count(DISTINCT column_name) FROM table_name

List all distinct values of column and their count

I have a column with different text values. How can I get a list of all the unique values and the count of the appearance of them in the column?
The simplest way is to use GROUP BY:
select text_column, count(*) from text_table group by text_column
more info - http://www.w3schools.com/sql/sql_groupby.asp
SELECT column_name
, COUNT(*)
FROM table_name
GROUP BY column_name
;

COUNT of DISTINCT items in a column

Here's the SQL that works (strangely) but still just returns the COUNT of all items, not the COUNT of DISTINCT items in the column.
SELECT DISTINCT(COUNT(columnName)) FROM tableName;
SELECT COUNT(*) FROM tableName
counts all rows in the table,
SELECT COUNT(columnName) FROM tableName
counts all the rows in the table where columnName is not null, and
SELECT COUNT(DISTINCT columnName) FROM tableName
counts all the rows in the table where columnName is both not null and distinct (i.e. no two the same).
SELECT DISTINCT(COUNT(columnName)) FROM tableName
is just the second query (returning, say, 42) with DISTINCT applied after the rows have been counted.
You need
SELECT COUNT(DISTINCT columnName) AS Cnt
FROM tableName;
The query in your question gets the COUNT (i.e. a result set with one row) then applies Distinct to that single row result which obviously has no effect.
SELECT COUNT(*) FROM (SELECT DISTINCT columnName FROM tableName);
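To make the difference between these forms concrete, here is a small sketch run on SQLite through Python's sqlite3 module. The table contents (two 'a' rows, one 'b' row, and one NULL) and the `scalar` helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (columnName TEXT)")
# Hypothetical data: a repeated value, a single value, and a NULL.
conn.executemany(
    "INSERT INTO tableName VALUES (?)",
    [("a",), ("a",), ("b",), (None,)],
)


def scalar(sql):
    """Run a query that yields one row with one column and return that value."""
    return conn.execute(sql).fetchone()[0]


all_rows = scalar("SELECT COUNT(*) FROM tableName")
non_null_rows = scalar("SELECT COUNT(columnName) FROM tableName")
# DISTINCT here acts on the single result row, so the count is unchanged.
distinct_applied_late = scalar("SELECT DISTINCT(COUNT(columnName)) FROM tableName")
distinct_values = scalar("SELECT COUNT(DISTINCT columnName) FROM tableName")

print(all_rows, non_null_rows, distinct_applied_late, distinct_values)
```

Only COUNT(DISTINCT columnName) collapses the repeated 'a' before counting; the DISTINCT(COUNT(...)) form returns the same number as the plain COUNT(columnName).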

SELECT *, COUNT(*) in SQLite

If I perform a standard query in SQLite:
SELECT * FROM my_table
I get all the records in my table, as expected. If I perform the following query:
SELECT *, 1 FROM my_table
I get all the records as expected, with the rightmost column holding '1' in every record. But if I perform the query:
SELECT *, COUNT(*) FROM my_table
I get only ONE row (with the correct count in the rightmost column).
Why do I get these results? I'm not very good at SQL, so maybe this behavior is expected? It seems very strange and illogical to me :(.
SELECT *, COUNT(*) FROM my_table is not what you want, and it's not really valid standard SQL: every selected column that is not inside an aggregate has to appear in GROUP BY.
You'd want something like
SELECT somecolumn,someothercolumn, COUNT(*)
FROM my_table
GROUP BY somecolumn,someothercolumn
If you want to count the number of records in your table, simply run:
SELECT COUNT(*) FROM your_table;
count(*) is an aggregate function. Aggregate functions need a GROUP BY to give meaningful per-group results; without one, the whole table is treated as a single group. See also: count columns group by.
If what you want is the total number of records in the table appended to each row, you can do something like
SELECT *
FROM my_table
CROSS JOIN (SELECT COUNT(*) AS COUNT_OF_RECS_IN_MY_TABLE
FROM MY_TABLE)
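The behaviour from the question and the CROSS JOIN workaround can be compared side by side in a short sketch using Python's sqlite3 module. The columns and rows of my_table are invented for illustration, and the count column is given the shorter hypothetical alias count_of_recs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical contents for my_table.
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy")],
)

# The aggregate turns the whole table into one group, so SQLite returns one row.
collapsed = conn.execute("SELECT *, COUNT(*) FROM my_table").fetchall()

# The count is computed once in the subquery and attached to every row.
per_row = conn.execute(
    """
    SELECT *
    FROM my_table
    CROSS JOIN (SELECT COUNT(*) AS count_of_recs FROM my_table)
    ORDER BY id
    """
).fetchall()

print(len(collapsed), collapsed[0][-1])
for row in per_row:
    print(row)
```

In the collapsed result, SQLite fills the non-aggregated columns from an arbitrary row, which is why relying on that form is discouraged.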