Getting distinct list of ints from 2 distinct int lists - sql

I have 2 separate queries that are just basic selects, both returning a single distinct column of ints. I need to then combined these 2 lists of ints together to produce a final single distinct list of ints.
Is there any faster way to do this than the following?
SELECT DISTINCT ID
FROM dbo.Test
UNION
SELECT DISTINCT ID
FROM dbo.Test2

If you don't have duplicates within each table, then the following is probably faster:
select id
from dbo.test
union all
select id
from dbo.test1 t1
where not exists (select 1 from dbo.test t where t.id = t1.id);
For this, you want an index on test(id).
Even with duplicates, the following is likely to be faster:
select distinct id
from dbo.test
union all
select distinct id
from dbo.test1 t1
where not exists (select 1 from dbo.test t where t.id = t1.id);
This requires indexes for both test(id) and test1(id). The idea is that the indexes are scanned to return the id.

I think that the fastest approach in your case is to remove the two DISTINCT since UNION will remove all duplicates overall anyway:
SELECT ID
FROM dbo.Test
UNION
SELECT ID
FROM dbo.Test2
Note that the two DISTINCTs don't ensure uniqueness across both sequences anyway, that's what the UNION does. If you don't need/want unique elements use UNION ALL.

Related

Order two parts of a union independently of one another

Is it possible to do something like this:
select name from table1 order by name
union
select name from table2 order by name
I know I can do this:
select name from table1
union
select name from table2 order by name
However, I want the names from table1 to appear first. I have spent the last hour Googling this and I have go nowhere. For example, I have looked here: How to order by with union in SQL?
The query needs to be a bit more complicated:
select name
from ((select distinct name, 1 as is_1 from table1)
union
(select distinct name, 0 from table2)
) n
group by name
order by max(is_1), name;
This uses select distinct in the subqueries because that can take advantage of an index on name.
Add a "sort" field and put the union inside a subquery so you can sort after the union.
untested
select a.name
from (
select name, 1 sort
from table1
union all
select name, 2 sort
from table2
) a
order by a.sort, a.name
I changed it to union all to make it clear this approach won't do a union. You could also select the sort column if you want to see it. If you don't want duplicate names, then this approach won't work.
You need another column to sort on. UNION does not allow the individual queries to have an ORDER BY clause.
Adding in a column to sort on before name allows for it to sort the individual result sets. See my example below:
CREATE TABLE #Table1 (Name VARCHAR(50))
CREATE TABLE #Table2 (Name VARCHAR(50))
INSERT INTO #Table1 VALUES ('Bart'), ('Lisa'), ('Maggie')
INSERT INTO #Table2 VALUES ('Chris'), ('Meg'), ('Stewie')
SELECT Name, 0 AS Sort FROM #Table1
UNION
SELECT Name, 1 AS Sort FROM #Table2
ORDER BY Sort, Name

Union of multiple queries using the count function

I'm working on learning more about how the UNION function works in SQL Server.
I've got a query that is directed at a single table:
SELECT Category, COUNT(*) AS Number
FROM Table1
GROUP BY Category;
This returns the number of entries for each distinct line in the Category column.
I have multiple tables that are organized by this Category column and I'd like to be able to have the results for every table returned by one query.
It seems like UNION will accomplish what I want it to do but the way I've tried implementing the query doesn't work with COUNT(*).
SELECT *
FROM (SELECT Table1.Category
Table1.COUNT(*) AS Number
FROM dbo.Table1
UNION
SELECT Table2.Category
Table2.COUNT(*) AS Number
FROM dbo.Table2) AS a
GROUP BY a.Category
I'm sure there's an obvious reason why this doesn't work but can anyone point out what that is and how I could accomplish what I'm trying to do?
You cannot write a common Group by clause for two different select's. You need to use Group by clause for each select
SELECT TABLE1.Category, --missing comma here
COUNT(*) as Number -- Remove TABLE1. alias name
FROM dbo.TABLE1
GROUP BY Category
UNION ALL --UNION
SELECT TABLE2.Category, --missing comma here
COUNT(*) as Number -- Remove TABLE1. alias name
FROM dbo.TABLE2
GROUP BY Category
If you really want to remove duplicates in result then change UNION ALL to UNION
COUNT as any associated aggregation function has to have GROUP BY specified. You have to use group by for each sub query separately:
SELECT * FROM (
SELECT TABLE1.Category,
COUNT(*) as Number
FROM dbo.TABLE1
GROUP BY TABLE1.Category
UNION ALL
SELECT TABLE2.Category,
COUNT(*) as Number
FROM dbo.TABLE2
GROUP BY TABLE2.Category
) as a
It is better to use UNION ALL vs UNION - UNION eliminates duplicates from result sets, since - let say - you want to merge both results as they are it is safer to use UNION ALL

How to return unique records between two tables without using distinct and union?

I need to return the unique records between two tables. Ideally, an UNION would solve my problem but both tables contain an object field which gives me an error(cannot ORDER objects without MAP or ORDER method) when I do UNION/distinct.
So, I was wondering if I can do a UNION ALL(to avoid the error) to get all the records first then do something to return only the unique records from there. I tried analytic function combined with the UNION ALL query but no luck so far.
Select * from Table1
union all
Select * from table2
Any help? Note:I need to return all fields.
I actually solved the problem using analytic function+row_num. The query will choose the first record for each set of duplicates hence returning only the unique records.
select * from
(
select ua.*,row_number() over (partition by p_id order by p_id ) row_num from
(
select * from table1
union all
select * from table2
)ua
) inner
where inner.row_num=1
How about this :
SELECT DISTINCT A.* FROM
(
Select * from Table1
union all
Select * from table2
) A;
(or)
SELECT col1,col2,col3...coln FROM
(
Select col1,col2,col3...coln from Table1
union all
Select col1,col2,col3...coln from table2
) A
GROUP BY A.col1,col2,col3...coln;
UNION ALL will give duplicate values as well .. instead use UNION and see if you are facing the error

How can I identify common elements in at least 2 out of 5 tables?

If I have 5 tables, what join functions should I be using if I want to find elements in a single column that occur in AT LEAST 2 out of the 5 tables?, ie: discarding only those elements that occur in a single table.
Would the code be similar if I wanted to find common elements in AT LEAST 3/5 tables?
(I'm using MS Access)
Thanks!
I'm not 100% positive I understand your question, but I think you can use UNION ALL for this:
select yourcol
from (
select distinct yourcol from t1
union all
select distinct yourcol from t2
union all
select distinct yourcol from t3
union all
select distinct yourcol from t4
union all
select distinct yourcol from t5
) t
group by id
having count(*) >= 2
SQL Fiddle Demo
Then you can change >= 2 to whatever number you want.
BTW -- if the column in question doesn't contain duplicates, you can remove distinct from the subquery.

Combining several query results into one table, how is the results order determined?

I am retuning table results for different queries but each table will be in the same format and will all be in one final table. If I want the results for query 1 to be listed first and query2 second etc, what is the easiest way to do it?
Does UNION append the table or are is the combination random?
The SQL standard does not guarantee an order unless explicitly called for in an order by clause. In practice, this usually comes back chronologically, but I would not rely on it if the order is important.
Across a union you can control the order like this...
select
this,
that
from
(
select
this,
that
from
table1
union
select
this,
that
from
table2
)
order by
that,
this;
UNION appends the second query to the first query, so you have all the first rows first.
You can use:
SELECT Col1, Col2,...
FROM (
SELECT Col1, Col2,..., 1 AS intUnionOrder
FROM ...
) AS T1
UNION ALL (
SELECT Col1, Col2,..., 2 AS intUnionOrder
FROM ...
) AS T2
ORDER BY intUnionOrder, ...