count distinct entries by summing values on a new column - sql

how would you do this:
Tim ...
Tim ...
Henry ...
Henry ...
Henry ...
I have a table that contains these names in the first column, and I am interested in adding a new column X that should have:
0.5
0.5
0.333
0.333
0.333
so that summing the new column gives the number of distinct entries in the first column. Thank you!

You could use the following:
select name,
1.0 / count(name) over(partition by name) as X
from yourtable
See SQL Fiddle with Demo
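As a quick sanity check, here is a minimal sketch (assuming SQL Server 2008 or later; the table variable @people and its rows are made up to mirror the question) that sums the new column and compares it with COUNT(DISTINCT name). Because 1/3 is stored with a finite number of decimals, the raw sum can land a hair under 2, so the sketch rounds it.

```sql
-- Hypothetical sample data mirroring the question.
DECLARE @people TABLE (name varchar(10));
INSERT INTO @people (name)
VALUES ('Tim'), ('Tim'), ('Henry'), ('Henry'), ('Henry');

SELECT ROUND(SUM(w.X), 0)     AS rounded_sum_of_x,
       COUNT(DISTINCT w.name) AS distinct_names
FROM (
    -- Each row gets 1 / (number of rows sharing its name).
    SELECT name,
           1.0 / COUNT(name) OVER (PARTITION BY name) AS X
    FROM @people
) AS w;
```

Both columns should agree (2 for this data). If exact arithmetic matters, counting with COUNT(DISTINCT name) directly is probably safer than summing fractions.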

Please try:
select
[Name], 1.00/count(*)
from YourTable
group by [Name]
Sample:
declare @testTable table (Name varchar(10))
insert into @testTable
select 'A' UNION ALL
select 'A' UNION ALL
select 'A' UNION ALL
select 'B' UNION ALL
select 'B'
SELECT
[Name], 1.00/count(*) AS X
FROM @testTable
GROUP BY [Name]
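Note that the GROUP BY form returns one row per name, not one per original row. If the value is needed on every row, as in the question, one option is to join the grouped result back to the table. A minimal sketch, assuming SQL Server 2008 or later (the alias g and the column name X are just illustrative):

```sql
-- Same sample data as above, held in a table variable.
DECLARE @testTable TABLE (Name varchar(10));
INSERT INTO @testTable (Name)
VALUES ('A'), ('A'), ('A'), ('B'), ('B');

-- Compute 1/count once per name, then attach it to every original row.
SELECT t.Name, g.X
FROM @testTable AS t
JOIN (
    SELECT Name, 1.00 / COUNT(*) AS X
    FROM @testTable
    GROUP BY Name
) AS g
    ON g.Name = t.Name;
```

This returns five rows: three for 'A' and two for 'B'.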

Related

Counting the count of distinct values from two columns in sql

I have a table in data base in which there are corresponding values for the primary key.
I want to count the distinct values from two columns.
I already know one method of using union all and then applying groupby on that resultant table.
Select Id,Brand1
into #Temp
from data
union all
Select Id,Brand2
from data
Select ID,Count(Distinct Brand1)
from #Temp
group by ID
The same thing can be done in BigQuery, but again only by using a temp table.
Sample Table
ID Brand1 Brand2
1 A B
1 B C
2 D A
2 A D
Resultant Table
ID Distinct_Count_Brand
1 3
2 2
As you can see, the Distinct_Count_Brand column holds the number of unique brands across the two columns Brand1 and Brand2.
I already know one way (basically unpivoting), but I want to know whether there is another way to count unique values from two columns.
I don't know BigQuery's quirks, but perhaps you can just inline the union query:
SELECT ID, COUNT(DISTINCT Brand)
FROM
(
SELECT ID, Brand1 AS Brand FROM data
UNION ALL
SELECT ID, Brand2 FROM data
) t
GROUP BY ID;
In SQL Server, I would use:
Select b.id, count(distinct b.brand)
from data d cross apply
(values (id, brand1), (id, brand2)) b(id, brand)
group by b.id;
Here is a db<>fiddle.
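For a self-contained check of the CROSS APPLY approach, here is a sketch against the sample table from the question (assuming SQL Server 2008 or later; the table variable @data stands in for the real table):

```sql
-- Sample rows from the question, held in a table variable.
DECLARE @data TABLE (id int, brand1 varchar(10), brand2 varchar(10));
INSERT INTO @data (id, brand1, brand2)
VALUES (1, 'A', 'B'),
       (1, 'B', 'C'),
       (2, 'D', 'A'),
       (2, 'A', 'D');

-- Turn each row's two brand columns into two rows, then count distinct brands per id.
SELECT d.id, COUNT(DISTINCT b.brand) AS Distinct_Count_Brand
FROM @data AS d
CROSS APPLY (VALUES (d.brand1), (d.brand2)) AS b(brand)
GROUP BY d.id
ORDER BY d.id;
-- id 1 -> 3 (A, B, C); id 2 -> 2 (A, D)
```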
In BigQuery, the equivalent would be expressed as:
select t.id, count(distinct brand)
from t cross join
unnest(array[brand1, brand2]) brand
group by t.id;
Here is a BQ query that demonstrates that this works:
with t as (
select 1 as id, 'A' as brand1, 'B' as brand2 union all
select 1, 'B', 'C' union all
select 2, 'D', 'A' union all
select 2, 'A', 'D'
)
select t.id, count(distinct brand)
from t cross join
unnest(array[brand1, brand2]) brand
group by t.id;

SQL server 2008 R2, select one value of a column for each distinct value of another column

On SQL server 2008 R2, I would like to select one value of a column for each distinct value of another column.
e.g.
name id_num
Tom 53
Tom 60
Tom 27
Jane 16
Jane 16
Bill 97
Bill 83
I need to get one id_num for each distinct name, such as
name id_num
Tom 27
Jane 16
Bill 97
For each name, the id_num can be randomly picked up (not required to be max or min) as long as it is associated with the name.
For example, for Bill, I can pick up 97 or 83. Either one is ok.
I do not know how to write the SQL query.
Thanks
SELECT
name,MIN(id_num)
FROM YourTable
GROUP BY name
UPDATE:
If you want to pick id_num randomly, you can try this:
WITH cte AS (
SELECT
name, id_num,rn = ROW_NUMBER() OVER (PARTITION BY name ORDER BY newid())
FROM YourTable
)
SELECT *
FROM cte
WHERE rn = 1
SQL Fiddle Demo
You could grab the max id like this:
SELECT name, MAX(id_num)
FROM tablename
GROUP BY name
That would get you one id for each distinct name.
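Applied to the sample data from the question, MIN and MAX behave as sketched below (assuming SQL Server 2008 or later; the table variable @t is just a stand-in for the real table):

```sql
-- Sample rows from the question.
DECLARE @t TABLE (name varchar(10), id_num int);
INSERT INTO @t (name, id_num)
VALUES ('Tom', 53), ('Tom', 60), ('Tom', 27),
       ('Jane', 16), ('Jane', 16),
       ('Bill', 97), ('Bill', 83);

-- One row per name; either aggregate yields an id_num that belongs to that name.
SELECT name,
       MIN(id_num) AS min_id,
       MAX(id_num) AS max_id
FROM @t
GROUP BY name
ORDER BY name;
-- Bill 83 97
-- Jane 16 16
-- Tom  27 60
```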
select name, max(id_num)
from [mytable]
group by name
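The ORDER BY (SELECT 1) in the CTE does not really order the data within each partition, so the row numbered 1 for each name is arbitrary. That should be enough for a "pick any one" selection, although it is not guaranteed to be random.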
CREATE TABLE #tmp
(
name VARCHAR(10)
, id_num INT
)
INSERT INTO #tmp
SELECT 'Tom', 53 UNION ALL
SELECT 'Tom', 60 UNION ALL
SELECT 'Tom', 27 UNION ALL
SELECT 'Jane', 16 UNION ALL
SELECT 'Jane', 16 UNION ALL
SELECT 'Bill', 97 UNION ALL
SELECT 'Bill', 83
;WITH CTE AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY name ORDER BY (SELECT 1)) AS ID
, name
, id_num
FROM #tmp
)
SELECT *
FROM CTE
WHERE ID = 1

SQL remove duplicate rows when counting

I have a select statement using count and since I am counting the rows instead of returning them how do I make sure that I do not get a duplicate value on a column?
For Example
_table_
fName someField
Eric data
Kyle mdata
Eric emdata
Andrew todata
I want the count to be 3 because Eric is duplicated, is there a way to do that? My select is:
Select Count(*) From _table_ INTO :var
Thanks,
SELECT Count(DISTINCT fName) From _table_ INTO :var
It will count number of distinct elements from fName column.
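To see the difference side by side, here is a small sketch using the rows from the question. The CTE name people is made up, and the FROM-less SELECT syntax is an assumption that holds in SQL Server, PostgreSQL, MySQL and SQLite but not in every engine. Keep in mind that COUNT(DISTINCT fName) ignores NULL values.

```sql
-- Rows from the question, inlined as a CTE so the example is self-contained.
WITH people (fName, someField) AS (
    SELECT 'Eric',   'data'   UNION ALL
    SELECT 'Kyle',   'mdata'  UNION ALL
    SELECT 'Eric',   'emdata' UNION ALL
    SELECT 'Andrew', 'todata'
)
SELECT COUNT(*)              AS all_rows,        -- 4: every row is counted
       COUNT(DISTINCT fName) AS distinct_names   -- 3: Eric is counted once
FROM people;
```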
This will do the job: Select Count(distinct fName) From _table_ INTO :var
Try SELECT Count(DISTINCT fName) From _table_ INTO :var
You could select the count of DISTINCT first names:
declare @table table (fname varchar(20), someField varchar(20))
insert into @table (fname, someField)
select 'Eric', 'data'
union select 'Kyle', 'mdata'
union select 'Eric', 'emdata'
union select 'Andrew', 'todata'
-- returns 4, because there are 4 rows in the table
select count(*) from @table
-- returns 3, because there are 3 rows with distinct first names
select count(*) from (select distinct fname from @table) firstNames

Union select statements horizontally

Let's say the results of my select statements are as follows (I have 5 of those):
Id Animal AnimalId
1 Dog Dog1
1 Cat Cat57
Id Transport TransportId
2 Car Car100
2 Plane Plane500
I'd like to get a result as follows:
Id Animal AnimalId Transport TransportId
1 Dog Dog1
1 Cat Cat57
2 Car Car100
2 Plane Plane500
What I can do is create a table variable, specify all possible columns, and insert the records from each select statement into it. But maybe there is a better solution, like PIVOT?
Edit
queries: 1st: Select CategoryId as Id, Animal, AnimalId from Animal
2nd: Select CategoryId as Id, Transport, TransportId from Transport
How about this? If you need them in the same rows, this gets the row_number() for each row and joins on those:
select a.id,
a.aname,
a.aid,
t.tname,
t.tid
from
(
select id, aname, aid, row_number() over(order by aid) rn
from animal
) a
left join
(
select id, tname, tid, row_number() over(order by tid) rn
from transport
) t
on a.rn = t.rn
see SQL Fiddle with Demo
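One caveat with the LEFT JOIN version: if transport has more rows than animal, the extra transport rows are dropped. A possible variation is a FULL OUTER JOIN on the row numbers, sketched below. It assumes SQL Server 2008 or later and reuses the column names from the query above. The table variables and the extra 'Boat' row are hypothetical, added only to show the difference.

```sql
-- Hypothetical stand-ins for the animal and transport tables.
DECLARE @animal TABLE (id int, aname varchar(50), aid varchar(50));
DECLARE @transport TABLE (id int, tname varchar(50), tid varchar(50));
INSERT INTO @animal (id, aname, aid)
VALUES (1, 'Dog', 'Dog1'), (1, 'Cat', 'Cat57');
INSERT INTO @transport (id, tname, tid)
VALUES (2, 'Car', 'Car100'), (2, 'Plane', 'Plane500'), (2, 'Boat', 'Boat7');

-- Pair rows by position; unmatched rows on either side are kept with NULLs.
SELECT COALESCE(a.id, t.id) AS id,
       a.aname, a.aid, t.tname, t.tid
FROM (
    SELECT id, aname, aid, ROW_NUMBER() OVER (ORDER BY aid) AS rn
    FROM @animal
) AS a
FULL OUTER JOIN (
    SELECT id, tname, tid, ROW_NUMBER() OVER (ORDER BY tid) AS rn
    FROM @transport
) AS t
    ON a.rn = t.rn;
```

Here the third transport row comes back with NULL in the animal columns instead of disappearing.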
If you don't need them in the same row, then use UNION ALL:
select id, aname, aid, 'Animal' tbl
from animal
union all
select id, tname, tid, 'Transport'
from transport
see SQL Fiddle with Demo
Edit #1, here is a version with an UNPIVOT and PIVOT:
select an_id, [aname], [aid], [tname], [tid]
from
(
select *, row_number() over(partition by col order by col) rn
from animal
unpivot
(
value
for col in (aname, aid)
) u
union all
select *, row_number() over(partition by col order by col) rn
from transport
unpivot
(
value
for col in (tname, tid)
) u
) x1
pivot
(
min(value)
for col in([aname], [aid], [tname], [tid])
) p
order by an_id
see SQL Fiddle with Demo
This would do it for you:
SELECT
ID, field1, field2, '' as field3, '' as field4
FROM sometable
UNION ALL
SELECT
ID, '', '', field3, field4
FROM someothertable
create table Animal (
Animal varchar(50)
,AnimalID varchar(50)
)
create table Transport (
Transport varchar(50)
,TransportID varchar(50)
)
insert into Animal values ('Dog', 'Dog1')
insert into Animal values ('Cat', 'Cat57')
insert into Transport values ('Car', 'Car100')
insert into Transport values ('Plane', 'Plane500')
select ID = 1
,A.Animal
,A.AnimalID
,Transport = ''
,TransportID = ''
from Animal A
union
select ID = 2
,Animal = ''
,AnimalID = ''
,T.Transport
,T.TransportID
from Transport T
To get it in the format you want, select the values you want, and then null (or an empty string) for the other columns.
SELECT
CategoryId as Id,
Animal as 'Animal',
AnimalId as 'AnimalId',
null as 'Transport',
null as 'TransportId'
FROM Animal
UNION
SELECT
CategoryId as Id,
null as 'Animal',
null as 'AnimalId',
Transport as 'Transport',
TransportId as 'TransportId'
FROM Transport
I'm still not sure of the purpose of this, but this should give the output you want.
You shouldn't need to pivot, your results are already fine.
If you want, you can just UNION all 5 statements together in the same format as the first select: ID/Category/CategoryID. Then you'll get one long result set with all 5 sets appended 3 columns wide.
Is that what you want? Or do you need to distinguish between 'categories'?
given your example, try:
Select CategoryId as Id, Animal, AnimalId from Animal
union all
Select CategoryId as Id, Transport, TransportId from Transport
if you want, you can alias the columns like:
Select CategoryId as Id, Animal as category, AnimalId as categoryID from Animal
union all
Select CategoryId as Id, Transport, TransportId from Transport
You really don't need to pivot; just space out your columns like you were thinking initially. You don't pivot to move columns; you pivot to perform an aggregate function over grouped data.

Counting the rows of a column where the value of a different column is 1

I am using a select count distinct to count the number of records in a column. However, I only want to count the records where the value of a different column is 1.
So my table looks a bit like this:
Name------Type
abc---------1
def----------2
ghi----------2
jkl-----------1
mno--------1
and I want the query only to count abc, jkl and mno and thus return '3'.
I wasn't able to do this with the CASE function, because this only seems to work with conditions in the same column.
EDIT: Sorry, I should have added, I want to make a query that counts both types.
So the result should look more like:
1---3
2---2
SELECT COUNT(*)
FROM dbo.[table name]
WHERE [type] = 1;
If you want to return the counts by type:
SELECT [type], COUNT(*)
FROM dbo.[table name]
GROUP BY [type]
ORDER BY [type];
You should avoid using keywords like type as column names - you can avoid a lot of square brackets if you use a more specific, non-reserved word.
I think you'll want (assuming that you wouldn't want to count ('abc',1) twice if it is in your table twice):
select count(distinct name)
from mytable
where type = 1
EDIT: for getting all types
select type, count(distinct name)
from mytable
group by type
order by type
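Since the question mentions CASE: a CASE expression can test one column while returning another, so the counts per type can also be produced side by side in a single row. A minimal sketch, assuming SQL Server syntax (the bracketed [Type] and the output column names are illustrative):

```sql
-- Sample rows from the question, inlined as a CTE.
WITH MyTable (Name, [Type]) AS (
    SELECT 'abc', 1 UNION ALL
    SELECT 'def', 2 UNION ALL
    SELECT 'ghi', 2 UNION ALL
    SELECT 'jkl', 1 UNION ALL
    SELECT 'mno', 1
)
SELECT
    -- CASE yields Name only for the matching type; other rows become NULL and are not counted.
    COUNT(DISTINCT CASE WHEN [Type] = 1 THEN Name END) AS type1_names,  -- 3
    COUNT(DISTINCT CASE WHEN [Type] = 2 THEN Name END) AS type2_names   -- 2
FROM MyTable;
```

The GROUP BY version is usually simpler when the set of types is not known in advance. The CASE form is handy when a fixed set of types should appear as columns.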
select count(1) from tbl where type = 1
;WITH MyTable (Name, [Type]) AS
(
SELECT 'abc', 1
UNION
SELECT 'def', 2
UNION
SELECT 'ghi', 2
UNION
SELECT 'jkl', 1
UNION
SELECT 'mno', 1
)
SELECT COUNT( DISTINCT Name)
FROM MyTable
WHERE [Type] = 1