Select Distinct for 2 columns in SQL query - sql

If I have a table such as
1 bob
1 ray
1 bob
1 ray
2 joe
2 joe
And I want to select distinct based on the two columns so that I would get
1 bob
1 ray
2 joe
How can I word my query? Is the only way to concatenate the columns and wrap them around a distinct function operator?

select distinct id, name from [table]
or
select id, name from [table] group by id, name

You can just do:
select distinct col1, col2 from your_table;
That's exactly what the distinct operator is for: removing duplicate result rows.
Keep in mind that distinct can be an expensive operation: after producing the result rows, the DB server may have to sort (or hash) them in order to remove the duplicates.
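To check that the two forms above return the same rows, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The table name people and its column names are assumptions made for the example; the data is the data from the question.

```python
import sqlite3

# Rows from the question; the table and column names are hypothetical.
ROWS = [(1, "bob"), (1, "ray"), (1, "bob"), (1, "ray"), (2, "joe"), (2, "joe")]


def distinct_pairs():
    """Return the de-duplicated (id, name) pairs via DISTINCT and via GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", ROWS)
    # DISTINCT applies to the whole select list, i.e. to the (id, name) pair.
    with_distinct = conn.execute(
        "SELECT DISTINCT id, name FROM people ORDER BY id, name"
    ).fetchall()
    # Grouping by both columns collapses the same duplicates.
    with_group_by = conn.execute(
        "SELECT id, name FROM people GROUP BY id, name ORDER BY id, name"
    ).fetchall()
    conn.close()
    return with_distinct, with_group_by
```

Calling distinct_pairs() gives the three pairs (1, bob), (1, ray), (2, joe) from both queries, so no concatenation of the columns is needed.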

Related

Convert row values as columns in SQL Server

Table:
CompanyID Lead LeadManager
------------------------------
1 2 3
Required output:
CompanyID Role RoleID
--------------------------------
1 Lead 2
1 Leadmanager 3
You can use union all to unpivot your dataset. This is a standard solution that works across most (if not all) RDBMS:
select companyID, 'Lead' as Role, Lead as RoleID from mytable
union all select companyID, 'LeadManager', LeadManager from mytable
You can use apply to unpivot the data:
select v.*
from t cross apply
(values (t.CompanyId, 'Lead', t.Lead),
(t.CompanyId, 'LeadManager', t.LeadManager)
) v(CompanyId, Role, RoleId);
The advantage of this approach is that it scans the original table only once. This can be particularly helpful when the "table" is a complex query.
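A small runnable sketch of the union all variant, using Python's sqlite3 module as a stand-in. SQLite has no cross apply, so only the union all form is shown; the RoleID alias and the ORDER BY are additions for the example.

```python
import sqlite3


def unpivot_roles():
    """Turn the Lead and LeadManager columns into (CompanyID, Role, RoleID) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (CompanyID INTEGER, Lead INTEGER, LeadManager INTEGER)"
    )
    conn.execute("INSERT INTO mytable VALUES (1, 2, 3)")
    # One SELECT per source column; the first SELECT names the output columns.
    rows = conn.execute(
        """
        SELECT CompanyID, 'Lead' AS Role, Lead AS RoleID FROM mytable
        UNION ALL
        SELECT CompanyID, 'LeadManager', LeadManager FROM mytable
        ORDER BY Role
        """
    ).fetchall()
    conn.close()
    return rows
```

For the single sample row this returns one row per role, (1, 'Lead', 2) and (1, 'LeadManager', 3). Note that each branch of the union reads mytable again, which is the cost the apply version avoids.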

What is the difference between count (*) and count(attribute_name)?

Is there any difference between COUNT(*) and COUNT(attribute_name)?
I used count(attribute_name) as I thought that it would be specific hence the searching process would be easier. Is that true?
It would be great to see any example with sql code with my issue to help me understand better
Imagine this table:
select Count(TelephoneNumber) from Calls -- returns 3
select Count(*) from Calls -- returns 4
count(column_name) also counts duplicate values. Consider:
select Count(TelephoneNumber) from Calls -- returns 4
COUNT(*) counts all the records in the group.
COUNT(column_name) only counts non-null values.
There is also another typical expression, COUNT(DISTINCT column_name), that counts non-null distinct values.
Since you asked for it, here is a demo on DB Fiddle:
with t as (
select 1 x from dual
union all select 1 from dual
union all select null from dual
)
select count(*), count(x), count(distinct x) from t
COUNT(*) | COUNT(X) | COUNT(DISTINCTX)
-------: | -------: | ---------------:
3 | 2 | 1
COUNT(*) will count all the rows.
COUNT(column) will count non-NULLs only.
Whether you use COUNT(*) or COUNT(column) should depend only on the desired output.
Consider below Example of employee table
ID Name Description
1 Raji Smart
2 Rahi Positive
3
4 Falle Smart
select count(*) from employee;
Count(*)
4
select count(name) from employee;
Count(Name)
3
count() only counts non-null values. * references the complete row and as such never excludes any rows. count(attribute_name) only counts rows where that column is not null.
So this:
select count(attribute_name)
from the_table
is equivalent to:
select count(*)
from the_table
where attribute_name is not null
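The behaviour described in these answers can be checked with a short sketch using Python's sqlite3 module. The table t and column x are taken from the fiddle demo (two rows with 1 and one NULL); using SQLite instead of the original database is an assumption of the example.

```python
import sqlite3


def count_variants():
    """Compare COUNT(*), COUNT(x) and COUNT(DISTINCT x) on a column with a NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    # Two duplicate values and one NULL.
    conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,), (None,)])
    counts = conn.execute(
        "SELECT COUNT(*), COUNT(x), COUNT(DISTINCT x) FROM t"
    ).fetchone()
    # COUNT(x) should match COUNT(*) restricted to rows where x IS NOT NULL.
    not_null_rows = conn.execute(
        "SELECT COUNT(*) FROM t WHERE x IS NOT NULL"
    ).fetchone()[0]
    conn.close()
    return counts, not_null_rows
```

count_variants() returns ((3, 2, 1), 2): all rows, non-null values, distinct non-null values, and then the filtered COUNT(*) that equals COUNT(x).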
The difference is simple: COUNT(*) counts the number of rows produced by the query, whereas COUNT(1) counts the number of 1 values. Note that when you include a literal such as a number or a string in a query, this literal is "appended" or attached to every row that is produced by the FROM clause.

How to get grouping of rows in SQL

I have a table like this:
id name
1 washing
1 cooking
1 cleaning
2 washing
2 cooking
3 cleaning
and I would like to have a following grouping
id name count
1 washing,cooking,cleaning 3
2 washing,cooking 2
3 cleaning 1
I have tried to group by ID but can only show count after grouping by
SELECT id,
COUNT(name)
FROM WORK
GROUP BY id
But this will only give the count and not the actual combination of names.
I am new to SQL. I know it has to be relational but there must be some way.
Thanks in advance!
In PostgreSQL you can use array_agg:
SELECT id, array_agg(name), COUNT(*)
FROM WORK
GROUP BY id
In MySQL you can use group_concat:
SELECT id, group_concat(name), COUNT(*)
FROM WORK
GROUP BY id
Or, for Redshift, listagg (pass the delimiter explicitly to get a comma-separated list):
SELECT id, listagg(name, ','), COUNT(*)
FROM WORK
GROUP BY id
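As a runnable illustration of the same idea, here is a sketch with Python's sqlite3 module. SQLite happens to have its own group_concat aggregate, used here as a stand-in for the MySQL one; the lowercase table name work mirrors the question. The order of the names inside each concatenated string is not guaranteed, so the helper splits and sorts them.

```python
import sqlite3

# Rows from the question.
ROWS = [
    (1, "washing"), (1, "cooking"), (1, "cleaning"),
    (2, "washing"), (2, "cooking"),
    (3, "cleaning"),
]


def names_per_id():
    """Return (id, sorted list of names, count) for each id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE work (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO work VALUES (?, ?)", ROWS)
    rows = conn.execute(
        "SELECT id, group_concat(name), COUNT(*) FROM work GROUP BY id ORDER BY id"
    ).fetchall()
    conn.close()
    # group_concat joins with ',' by default; sort to get a stable order.
    return [(i, sorted(names.split(",")), n) for i, names, n in rows]
```

Each id comes back once, with its names gathered into one value next to the count, which is the shape asked for in the question.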

What exactly does SELECT DISTINCT(COUNT(*)) do?

I used the following query and it returned what I wanted it to return, but I'm having a tough time wrapping my head around what the query is doing.
Query is nothing fancier than what's in the title: select distinct(count(*)) from table1
Distinct is not required in your SQL: count(*) without a group by clause returns a single row, the count of all rows in the table, so there is nothing to de-duplicate.
Hence try this:
select count(*) from table1
Distinct is used for finding the distinct values in a group of values.
Say you have table1, with column1 as:
Column1
----------
a
a
b
b
a
c
If you run the following queries, you will get this output:
1) select count(*) from table1
output :6
2) select distinct(count(*)) from table1
output :6
3) select count( distinct column1) from table1
output :3
Usually distinct is used inside count, preferably with a particular column:
select count( distinct column_name_n ) from table1
The distinct is redundant. Select Count(*) on a single table with no grouping can only generate one value, so distinct (which would eliminate duplicates) is irrelevant.
If you had multiple output rows (if, for example, you were grouping on something), then it would cause the query to display only one output row for every distinct value of count(*) that would otherwise be generated.
if, for example, you had
name
Bob
Bob
Bob
Bob
Mary
Mary
Mary
Mary
Dave
Dave
Al
George
then
select count(*)
From table
group By name
would result in
4
4
2
1
1
but
select distinct count(*)
From table
group By name
would result in
4
2
1
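The grouped example can be reproduced with a short sketch using Python's sqlite3 module. The table name people is an assumption (the answer uses the placeholder "table"), and the results are sorted in Python because neither query specifies an order.

```python
import sqlite3

# Names from the example: Bob x4, Mary x4, Dave x2, Al x1, George x1.
NAMES = ["Bob"] * 4 + ["Mary"] * 4 + ["Dave"] * 2 + ["Al", "George"]


def group_counts():
    """Return per-group counts with and without DISTINCT, both sorted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT)")
    conn.executemany("INSERT INTO people VALUES (?)", [(n,) for n in NAMES])
    # One count per name, so repeated counts appear more than once.
    per_group = sorted(
        n for (n,) in conn.execute("SELECT COUNT(*) FROM people GROUP BY name")
    )
    # DISTINCT is applied after grouping, collapsing equal counts.
    distinct_counts = sorted(
        n for (n,) in conn.execute(
            "SELECT DISTINCT COUNT(*) FROM people GROUP BY name"
        )
    )
    conn.close()
    return per_group, distinct_counts
```

group_counts() returns ([1, 1, 2, 4, 4], [1, 2, 4]): five groups produce five counts, and distinct reduces them to the three different values.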

SQL Separating Distinct Values using single column

Does anyone happen to know a way of basically taking the 'Distinct' command but only using it on a single column. For lack of example, something similar to this:
Select (Distinct ID), Name, Term from Table
So it would get rid of row with duplicate ID's but still use the other column information. I would use distinct on the full query but the rows are all different due to certain columns data set. And I would need to output only the top most term between the two duplicates:
ID Name Term
1 Suzy A
1 Suzy B
2 John A
2 John B
3 Pete A
4 Carl A
5 Sally B
Any suggestions would be helpful.
select t.ID, t.Name, min(t.Term) as Term
from Table t
group by t.ID, t.Name
You can use row number for this
Select ID, Name, Term from (
Select ID, Name, Term, ROW_NUMBER ( )
OVER ( PARTITION BY ID order by Term) as rn from Table
) as tbl
Where rn = 1
Order by determines the order from which the first row will be picked.
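A runnable sketch of the row-number technique, using Python's sqlite3 module (window functions need SQLite 3.25 or newer, which is an assumption about the local build). The table name students is hypothetical; the data is the data from the question. The filter on rn sits in the outer query because the alias is not visible in the WHERE clause of the query that defines it, and the rows are ordered by Term so that the lowest term is kept for each ID.

```python
import sqlite3

# Rows from the question.
ROWS = [
    (1, "Suzy", "A"), (1, "Suzy", "B"),
    (2, "John", "A"), (2, "John", "B"),
    (3, "Pete", "A"), (4, "Carl", "A"), (5, "Sally", "B"),
]


def first_term_per_id():
    """Keep one row per ID: the one with the lowest Term."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (ID INTEGER, Name TEXT, Term TEXT)")
    conn.executemany("INSERT INTO students VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(
        """
        SELECT ID, Name, Term FROM (
            SELECT ID, Name, Term,
                   ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Term) AS rn
            FROM students
        ) AS tbl
        WHERE rn = 1
        ORDER BY ID
        """
    ).fetchall()
    conn.close()
    return rows
```

Changing the ORDER BY inside OVER (for example to Term DESC) changes which of the duplicate rows survives.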