Counting the count of distinct values from two columns in sql


I have a table in a database that stores corresponding values for each key.
I want to count the distinct values across two columns.
I already know one method: use UNION ALL and then apply GROUP BY to the resulting table.
Select Id, Brand1
into #Temp
from data
union all
Select Id, Brand2
from data;
Select ID, Count(Distinct Brand1)
from #Temp
group by ID;
The same thing can be done in BigQuery, but again only by using a temp table.
Sample Table
ID Brand1 Brand2
1 A B
1 B C
2 D A
2 A D
Resultant Table
ID Distinct_Count_Brand
1 3
2 2
As you can see, the Distinct_Count_Brand column holds the number of unique brands across the two columns Brand1 and Brand2.
I already know one way (basically unpivoting), but I want to know whether there is another way to count unique values from two columns.

I don't know BigQuery's quirks, but perhaps you can just inline the union query:
SELECT ID, COUNT(DISTINCT Brand)
FROM
(
SELECT ID, Brand1 AS Brand FROM data
UNION ALL
SELECT ID, Brand2 FROM data
) t
GROUP BY ID;
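The inlined union can be checked against the sample table from the question. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in engine (an assumption; the question targets SQL Server and BigQuery, whose behaviour is not tested here). The result alias Distinct_Count_Brand is taken from the expected output in the question.

```python
import sqlite3

# In-memory database holding the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (ID INTEGER, Brand1 TEXT, Brand2 TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, "A", "B"), (1, "B", "C"), (2, "D", "A"), (2, "A", "D")],
)

# Stack Brand1 and Brand2 into one column, then count distinct brands per ID.
query = """
SELECT ID, COUNT(DISTINCT Brand) AS Distinct_Count_Brand
FROM (
    SELECT ID, Brand1 AS Brand FROM data
    UNION ALL
    SELECT ID, Brand2 FROM data
) t
GROUP BY ID
ORDER BY ID
"""
distinct_counts = conn.execute(query).fetchall()
print(distinct_counts)  # [(1, 3), (2, 2)]
```

UNION ALL is enough here because COUNT(DISTINCT ...) already removes the duplicates; a plain UNION would do the deduplication twice.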

In SQL Server, I would use:
Select b.id, count(distinct b.brand)
from data d cross apply
(values (id, brand1), (id, brand2)) b(id, brand)
group by b.id;
In BigQuery, the equivalent would be expressed as:
select t.id, count(distinct brand)
from t cross join
unnest(array[brand1, brand2]) brand
group by t.id;
Here is a BQ query that demonstrates that this works:
with t as (
select 1 as id, 'A' as brand1, 'B' as brand2 union all
select 1, 'B', 'C' union all
select 2, 'D', 'A' union all
select 2, 'A', 'D'
)
select t.id, count(distinct brand)
from t cross join
unnest(array[brand1, brand2]) brand
group by t.id;

Related

Insert/join table on multiple conditions

I’ve a table that looks like this:
Table A
Version,id
5060586,22285
5074515,22701
5074515,22285
7242751,22701
7242751,22285
I want to generate a new key called groupId that is inserted as my example below:
Table A
Version,id,groupId
5060586,22285,1
5074515,22701,2
5074515,22285,2
7242751,22701,2
7242751,22285,2
I want the groupId to be the same as long as the ids are the same across the different versions. For example, versions 5074515 and 7242751 have the same ids, so their groupId is the same. If the ids are not all the same, a new groupId should be assigned, as for version 5060586.
How can I solve this specific problem in Oracle SQL?

One approach is to create a unique value representing the set of ids in each version, then assign a groupid to the unique values of that, then join back to the original data.
INSERT ALL
INTO t (version,id) VALUES (5060586,22285)
INTO t (version,id) VALUES (5074515,22701)
INTO t (version,id) VALUES (5074515,22285)
INTO t (version,id) VALUES (7242751,22701)
INTO t (version,id) VALUES (7242751,22285)
SELECT 1 FROM dual;
WITH groups
AS
(
SELECT version
, LISTAGG(id,',') WITHIN GROUP (ORDER BY id) AS group_text
FROM t
GROUP BY version
),
groupids
AS
(
SELECT group_text, ROW_NUMBER() OVER (ORDER BY group_text) AS groupid
FROM groups
GROUP BY group_text
)
SELECT t.*, groupids.groupid
FROM t
INNER JOIN groups ON t.version = groups.version
INNER JOIN groupids ON groups.group_text = groupids.group_text;
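The three steps of this approach (build a signature per version, number the distinct signatures, join back) can also be traced outside the database. The following is an illustrative Python sketch of the same logic, not Oracle code; the variable names are made up for the example.

```python
from collections import defaultdict

# Sample (version, id) rows from the question.
rows = [
    (5060586, 22285),
    (5074515, 22701),
    (5074515, 22285),
    (7242751, 22701),
    (7242751, 22285),
]

# Step 1: one signature per version, the sorted ids joined by commas
# (the role LISTAGG ... WITHIN GROUP (ORDER BY id) plays in the query).
ids_by_version = defaultdict(list)
for version, id_ in rows:
    ids_by_version[version].append(id_)
signature = {
    version: ",".join(str(i) for i in sorted(ids))
    for version, ids in ids_by_version.items()
}

# Step 2: number the distinct signatures in sorted order (the role of ROW_NUMBER).
group_of_signature = {
    sig: n for n, sig in enumerate(sorted(set(signature.values())), start=1)
}

# Step 3: attach the group number to every original row (the final joins).
result = [(v, i, group_of_signature[signature[v]]) for v, i in rows]
for row in result:
    print(row)
```

For the sample data the signature "22285" is numbered 1 and "22285,22701" is numbered 2, which matches the groupId column the question asks for.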

You can use:
UPDATE tableA t
SET group_id = ( SELECT COUNT(DISTINCT id)
FROM TableA x
WHERE x.Version <= t.version );
Which, for the sample data:
CREATE TABLE TableA (
Version NUMBER,
id NUMBER,
group_id NUMBER
);
INSERT INTO TableA (Version, id)
SELECT 5060586,22285 FROM DUAL UNION ALL
SELECT 5074515,22701 FROM DUAL UNION ALL
SELECT 5074515,22285 FROM DUAL UNION ALL
SELECT 7242751,22701 FROM DUAL UNION ALL
SELECT 7242751,22285 FROM DUAL;
Then, after the update:
SELECT * FROM tablea;
Outputs:
VERSION ID GROUP_ID
5060586 22285 1
5074515 22701 2
5074515 22285 2
7242751 22701 2
7242751 22285 2

SQLite - Return Rows Even If They Are Duplicates

I have a simple SQLite table which has just one ID column.
I have some variable IDs that may be duplicates of each other like: 1,2,3,4,3,1 (These IDs are just examples, there could be hundreds of them).
And I have a simple query as follows:
SELECT ID FROM TEST WHERE ID in (1,2,3,4,3,1)
In the usual case the answer contains only 4 rows with ids 1,2,3,4. Is there any way to force SQLite to return rows in the order of the request (1,2,3,4,3,1) even if they are duplicates?
I have n IDs in my query and I want n rows in return even if they are duplicates.
Edit: The Table Definition is:
CREATE TABLE TEST(ID TEXT PRIMARY KEY)

You can use left join:
select t.*
from (select 1 as id, 1 as ord union all
select 2 as id, 2 as ord union all
select 3 as id, 3 as ord union all
select 4 as id, 4 as ord union all
select 3 as id, 5 as ord union all
select 1 as id, 6 as ord
) ids left join
t
on t.id = ids.id
order by ids.ord;
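When the id list is only known at run time, the derived table of (id, ord) pairs can be generated instead of typed out. The following is a rough sketch using Python's sqlite3 module with the TEST table from the question; the four stored rows and the names requested, derived, and returned are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (ID TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO TEST VALUES (?)", [("1",), ("2",), ("3",), ("4",)])

# Ids in request order, duplicates kept.
requested = ["1", "2", "3", "4", "3", "1"]

# One "SELECT ? AS id, ? AS ord" per requested id, glued together with UNION ALL.
derived = " UNION ALL ".join("SELECT ? AS id, ? AS ord" for _ in requested)
# Flatten to [id1, 1, id2, 2, ...] so each placeholder pair gets its id and position.
params = [value for pos, id_ in enumerate(requested, start=1) for value in (id_, pos)]

query = f"""
SELECT t.ID
FROM ({derived}) ids
LEFT JOIN TEST t ON t.ID = ids.id
ORDER BY ids.ord
"""
returned = [row[0] for row in conn.execute(query, params)]
print(returned)  # ['1', '2', '3', '4', '3', '1']
```

Because of the LEFT JOIN, a requested id that is missing from the table still yields a row, with NULL (None in Python) in place of the ID. For very long lists, loading the ids into a temporary table may be more practical than a long chain of UNION ALL terms.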

UNION operation with the same table

I have a scenario where I need to return the values stored in several columns of a row as a single column.
The table format is as follows:
SAMPLE_TABLE [ID, REF_TAB_A,REF_TAB_B,REF_TAB_C]
I need REF_TAB_A,REF_TAB_B,REF_TAB_C values in a single column. What I did is use UNION ALL as follows,
SELECT REF_TAB_A FROM SAMPLE_TABLE
UNION ALL
SELECT REF_TAB_B FROM SAMPLE_TABLE
UNION ALL
SELECT REF_TAB_C FROM SAMPLE_TABLE
Is there any other way to do this? What is the most efficient way to handle such a scenario?
(I'm using oracle 11g)
Thanks in advance.. :D

Using union all generally results in three scans of the table. An alternative approach is a little messier but should have just one scan:
SELECT (case when which = 'A' then REF_TAB_A
when which = 'B' then REF_TAB_B
when which = 'C' then REF_TAB_C
end)
FROM SAMPLE_TABLE cross join
(select 'A' as which from dual union all select 'B' from dual union all select 'C' from dual
) iter
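A quick way to convince yourself that the cross join returns the same values as the three-way UNION ALL is to run both on a toy table. The sketch below uses Python's sqlite3 module with made-up sample rows (both are assumptions; SQLite has no dual table, so the FROM dual parts are dropped). It only checks that the results agree and says nothing about the number of scans Oracle performs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SAMPLE_TABLE (ID INTEGER, REF_TAB_A TEXT, REF_TAB_B TEXT, REF_TAB_C TEXT)"
)
# Hypothetical rows, invented for this illustration.
conn.executemany(
    "INSERT INTO SAMPLE_TABLE VALUES (?, ?, ?, ?)",
    [(1, "a1", "b1", "c1"), (2, "a2", "b2", "c2")],
)

# The approach from the question: one SELECT per column.
union_all = """
SELECT REF_TAB_A AS val FROM SAMPLE_TABLE
UNION ALL SELECT REF_TAB_B FROM SAMPLE_TABLE
UNION ALL SELECT REF_TAB_C FROM SAMPLE_TABLE
"""

# The cross join approach: each row is repeated three times, once per column.
cross_join = """
SELECT CASE which WHEN 'A' THEN REF_TAB_A
                  WHEN 'B' THEN REF_TAB_B
                  WHEN 'C' THEN REF_TAB_C END AS val
FROM SAMPLE_TABLE
CROSS JOIN (SELECT 'A' AS which UNION ALL SELECT 'B' UNION ALL SELECT 'C') iter
"""

union_values = sorted(row[0] for row in conn.execute(union_all))
cross_values = sorted(row[0] for row in conn.execute(cross_join))
print(union_values == cross_values)  # True
```

The results are sorted before comparing because neither query guarantees a row order.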

You could alias the columns:
SELECT REF_TAB_A tab FROM SAMPLE_TABLE
UNION ALL
SELECT REF_TAB_B tab FROM SAMPLE_TABLE
UNION ALL
SELECT REF_TAB_C tab FROM SAMPLE_TABLE
What you really need to do, however, is to normalize your database. Whenever you have columns with repeating names like name1, name2, name3, namea, nameb, and namec, it is a sign that you want another table and a 1-many relationship between them.
CREATE TABLE tabs (
tab_id NUMBER PRIMARY KEY,
sample_table_id NUMBER,
tab VARCHAR2(255),
CONSTRAINT FK_sample_table
FOREIGN KEY(sample_table_id)
REFERENCES SAMPLE_TABLE(sample_table_id)
)
Now your query involves a simple JOIN.
SELECT
tab
FROM tabs t
JOIN SAMPLE_TABLE st ON t.sample_table_id = st.sample_table_id
WHERE
...

As an alternative to Gordon's CROSS JOIN trick Oracle has the UNPIVOT clause in the SELECT statement specifically for this situation.
Assuming this table:
create table tmp_test ( a number, b number, c number );
insert all
into tmp_test values (1,2,3)
into tmp_test values (4,5,6)
select * from dual;
The following query would do what you require:
select col
from tmp_test
unpivot ( col for i in (a,b,c) );
COL
----------
1
2
3
4
5
6
6 rows selected.
For this small example, an explain plan indicates that using the built-in functionality would be more efficient, but test both options and see which is better.

Counting the rows of a column where the value of a different column is 1

I am using a select count distinct to count the number of records in a column. However, I only want to count the records where the value of a different column is 1.
So my table looks a bit like this:
Name------Type
abc---------1
def----------2
ghi----------2
jkl-----------1
mno--------1
and I want the query only to count abc, jkl and mno and thus return '3'.
I wasn't able to do this with the CASE function, because this only seems to work with conditions in the same column.
EDIT: Sorry, I should have added, I want to make a query that counts both types.
So the result should look more like:
1---3
2---2

SELECT COUNT(*)
FROM dbo.[table name]
WHERE [type] = 1;
If you want to return the counts by type:
SELECT [type], COUNT(*)
FROM dbo.[table name]
GROUP BY [type]
ORDER BY [type];
You should avoid using keywords like type as column names - you can avoid a lot of square brackets if you use a more specific, non-reserved word.

I think you'll want (assuming that you wouldn't want to count ('abc',1) twice if it is in your table twice):
select count(distinct name)
from mytable
where type = 1
EDIT: for getting all types
select type, count(distinct name)
from mytable
group by type
order by type

select count(1) from tbl where type = 1

;WITH MyTable (Name, [Type]) AS
(
SELECT 'abc', 1
UNION
SELECT 'def', 2
UNION
SELECT 'ghi', 2
UNION
SELECT 'jkl', 1
UNION
SELECT 'mno', 1
)
SELECT COUNT( DISTINCT Name)
FROM MyTable
WHERE [Type] = 1
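The question mentions that CASE only seemed to work on the column being counted. A CASE expression inside the aggregate can in fact test a different column, which gives one count per type in a single row. The following is a small sketch with Python's sqlite3 module (an assumed stand-in for the asker's engine); the result names type_1 and type_2 are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, type INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("abc", 1), ("def", 2), ("ghi", 2), ("jkl", 1), ("mno", 1)],
)

# CASE yields the name only for matching rows and NULL otherwise;
# COUNT skips NULLs, so each expression counts the names of one type.
query = """
SELECT COUNT(DISTINCT CASE WHEN type = 1 THEN name END) AS type_1,
       COUNT(DISTINCT CASE WHEN type = 2 THEN name END) AS type_2
FROM mytable
"""
type_1, type_2 = conn.execute(query).fetchone()
print(type_1, type_2)  # 3 2
```

The GROUP BY versions shown in the answers return one row per type instead and do not need the types to be listed in advance.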

How to add 2 temporary tables together

I am creating temporary tables that have 2 columns, id and score, and I want to add them together.
If both tables contain the same id, I do not want to duplicate the id but instead add the scores together.
Say I have 2 temp tables called t1 and t2,
and t1 had:
id 3 score 4
id 6 score 7
and t2 had:
id 3 score 5
id 5 score 2
I would end up with a new temp table containing:
id 3 score 9
id 5 score 2
id 6 score 7
The reason I want to do this is that I am trying to build a product search. I have a few algorithms I want to use, one using fulltext and another not. I want to use both, so I want to create a temporary table based on algorithm1 and another based on algorithm2, then combine them.

How about:
SELECT id, SUM(score) AS score FROM (
SELECT id, score FROM t1
UNION ALL
SELECT id, score FROM t2
) t3
GROUP BY id

This is untested but you should be able to perform a union on the two tables and then perform a select on the results, grouping the fields and adding the scores
SELECT id,SUM(score) FROM
(
SELECT id,score FROM t1
UNION ALL
SELECT id,score FROM t2
) joined
GROUP BY id

Perform a full outer join on the ID. Select on the ID and the sum of the two "score" columns after coalescing the values to 0.
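The join-based idea can be sketched as follows with Python's sqlite3 module and the t1 and t2 rows from the question. Since FULL OUTER JOIN is not available in every engine or version, this sketch swaps in an emulation: a LEFT JOIN from t1, plus the t2 rows that have no match in t1. It also assumes each id appears at most once per table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, score INTEGER);
CREATE TABLE t2 (id INTEGER, score INTEGER);
INSERT INTO t1 VALUES (3, 4), (6, 7);
INSERT INTO t2 VALUES (3, 5), (5, 2);
""")

# First branch: every t1 row, with a missing t2 score coalesced to 0.
# Second branch: the t2 rows whose id does not occur in t1.
query = """
SELECT t1.id AS id, t1.score + COALESCE(t2.score, 0) AS score
FROM t1 LEFT JOIN t2 ON t2.id = t1.id
UNION ALL
SELECT t2.id, t2.score
FROM t2 LEFT JOIN t1 ON t1.id = t2.id
WHERE t1.id IS NULL
ORDER BY id
"""
combined = conn.execute(query).fetchall()
print(combined)  # [(3, 9), (5, 2), (6, 7)]
```

If an id can repeat within a table, the UNION ALL plus GROUP BY approach is the safer choice, because the join would pair the repeated rows with each other.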

SELECT id, SUM(score) FROM
(
SELECT id, score FROM #t1
UNION ALL
SELECT id, score FROM #t2
) AS Temp
GROUP BY id

select id, sum(score)
from (
select * from table1
union all
select * from table2
) tables
group by id

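You need to create a union of those two tables; then you can easily group the results. Use UNION ALL rather than UNION so that identical (id, score) rows from the two tables are not collapsed before summing.
SELECT id, sum(score) FROM
(
SELECT id, score FROM t1
UNION ALL
SELECT id, score FROM t2
) as tmp
GROUP BY id;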
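One detail worth checking with the union approach is the difference between UNION and UNION ALL when both tables happen to hold the same (id, score) pair. The sketch below uses Python's sqlite3 module; the rows are hypothetical (t2's score for id 3 is changed to 4 so that it duplicates the t1 row) and the helper summed_scores is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, score INTEGER);
CREATE TABLE t2 (id INTEGER, score INTEGER);
INSERT INTO t1 VALUES (3, 4), (6, 7);
INSERT INTO t2 VALUES (3, 4), (5, 2);  -- (3, 4) also exists in t1
""")

def summed_scores(set_operator):
    """Sum scores per id after combining t1 and t2 with the given operator."""
    query = f"""
    SELECT id, SUM(score) FROM (
        SELECT id, score FROM t1
        {set_operator}
        SELECT id, score FROM t2
    ) AS tmp
    GROUP BY id
    """
    return dict(conn.execute(query).fetchall())

with_union_all = summed_scores("UNION ALL")
with_union = summed_scores("UNION")
print(with_union_all)  # id 3 sums to 8
print(with_union)      # id 3 sums to 4, the duplicate row was removed first
```

UNION removes duplicate rows before the outer query sees them, so one of the two (3, 4) rows is lost and the total for id 3 comes out too low.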
