Getting a distinct value across 2 union sql server tables - sql

I'm trying to get all distinct values across 2 tables using a union.
The idea is to get a count of all unique values in the columnA column without repeats so that I can get a summation of all columns that contain a unique columnA.
This is what I tried (SQL Server Express 2008):
select
count(Distinct ColumnA)
from
(
select Distinct ColumnA as ColumnA from tableX where x = y
union
select Distinct ColumnA as ColumnA from tableY where y=z
)

SELECT COUNT(distinct tmp.ColumnA) FROM ( (SELECT ColumnA FROM TableX WHERE x=y)
UNION (SELECT ColumnA FROM TableY WHERE y=z) ) as tmp
The extra DISTINCTs on TableX and TableY aren't necessary; UNION already removes duplicates, and COUNT(DISTINCT tmp.ColumnA) would remove them in any case. Giving the derived table an alias (tmp here) should eliminate the syntax problem that probably prevented your query from executing.

SELECT COUNT(*)
FROM
(
SELECT DISTINCT ColumnA From TableX WHERE x = y
UNION
SELECT DISTINCT ColumnA From TableY WHERE y = z
) t
Using UNION will not return duplicates. If you used UNION ALL, duplicate ColumnA values from each table would be returned.
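The difference between UNION and UNION ALL described above can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and the table contents are made up for illustration; the set-operator behaviour shown is standard SQL.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableX (ColumnA TEXT);
    CREATE TABLE TableY (ColumnA TEXT);
    INSERT INTO TableX VALUES ('a'), ('a'), ('b');
    INSERT INTO TableY VALUES ('b'), ('c');
""")

def count_rows(set_operator):
    """Count the rows produced by combining both tables with the given operator."""
    sql = f"""
        SELECT COUNT(*)
        FROM (
            SELECT ColumnA FROM TableX
            {set_operator}
            SELECT ColumnA FROM TableY
        ) AS t
    """
    return conn.execute(sql).fetchone()[0]

union_count = count_rows("UNION")          # duplicates removed: a, b, c
union_all_count = count_rows("UNION ALL")  # every row kept: a, a, b, b, c
print(union_count, union_all_count)
```

With this data the UNION variant counts each distinct value once, so no extra DISTINCT is needed inside or outside the derived table.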

To get distinct values in a UNION query, you can try this:
Select distinct AUnion.Name,AUnion.Company from (SELECT Name,Company from table1 UNION SELECT Name,Company from table2)AUnion

SELECT DISTINCT Id, Name
FROM TableA
UNION ALL
SELECT DISTINCT Id, Name
FROM TableB
WHERE TableB.Id NOT IN (SELECT Id FROM TableA)

Related

How to use CASE or IF-statement in Postgres to select from different table?

I want to make a selection from one of many tables. This selection depends on some condition. How can I make it?
I suppose it should be some like this (but it doesn't work):
CASE x
WHEN x=1 THEN
select Id,Name from table1
WHEN x=2 THEN
select Id,Name from table2
WHEN x=3 THEN
select Id,Name from table3
END CASE;
An inefficient solution that queries all 3 tables but imitates a switch statement in code (assuming the retrieved columns are equivalent; note that this is T-SQL syntax):
declare @inputValue int = 1
SELECT * FROM (
SELECT 1 [key], Id from table1
UNION ALL
SELECT 2, Id from table2
UNION ALL
SELECT 3, Id from table3
) x
where x.[key] = @inputValue
One way to do it:
SELECT id, name
FROM table1
WHERE x = 1
UNION ALL
SELECT id, name
FROM table2
WHERE x = 2
UNION ALL
SELECT id, name
FROM table3
WHERE x = 3
Only one table's data will be returned (if x is any of those values).
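A minimal sketch of the UNION ALL approach above, run through Python's sqlite3 module instead of Postgres. The sample rows, the helper name select_by_switch, and passing x as a bound parameter are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical tables with one row each so the source of a row is obvious.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    CREATE TABLE table3 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'from table1');
    INSERT INTO table2 VALUES (2, 'from table2');
    INSERT INTO table3 VALUES (3, 'from table3');
""")

# Each branch is guarded by a condition on the parameter, so at most one
# branch can contribute rows for any given value of x.
QUERY = """
    SELECT id, name FROM table1 WHERE :x = 1
    UNION ALL
    SELECT id, name FROM table2 WHERE :x = 2
    UNION ALL
    SELECT id, name FROM table3 WHERE :x = 3
"""

def select_by_switch(x):
    """Return the rows of the table selected by x, or an empty list."""
    return conn.execute(QUERY, {"x": x}).fetchall()

rows_for_2 = select_by_switch(2)
rows_for_9 = select_by_switch(9)
print(rows_for_2, rows_for_9)
```

A value of x that matches none of the branches simply yields an empty result rather than an error.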

select different values for column based on what table the row exists in

I have this query
define LAST_DATE_BEFORE = to_date('03112016','ddmmyyyy')
with
table1 as (some result),
table2 as (some result),
select
MS.PAID_TRANS_IND,
MS.CURR_PRICE_PLAN_KEY,
case
when MS.SEGMENT_KEY in t1.SEGMENT_KEY
then MS.PLAN_SEGMENT_KEY
when MS.SEGMENT_KEY in t2.SEGMENT_KEY
and MS.START_ALLOC_DATE = &LAST_DATE_BEFORE + 1
then MS.SEGMENT_KEY
else null
end as SEGMENT_KEY
from MO_SU MS, table1 t1, table2 t2
table 1 and 2 have different values from table MO_SU. Now I just check column values, but I want to check if the whole row can be found in t1/t2.
I thought this could work
when MS.* in t1.*
it doesn't.
What can I do?
To reduce the question to checking whether a given row of a table exists in another table with equal values in all columns, using IN rather than a JOIN, you may need something like the following (assuming no NULL values):
with
tab1(colA, colB, colC) as ( select 'a', 'b', 'c' from dual union all
select 'A', 'B', 'C' from dual
),
tab2(columnA, columnB, columnC) as ( select 'a', 'b', 'c' from dual)
select *
from tab1
where (colA, colB, colC) in ( select columnA, columnB, columnC from tab2)
If the second table had exactly the columns you need to check, in the right order and with no other columns, in theory you could even shorten it to:
...
where (colA, colB, colC) in ( select * from tab2)
but I absolutely recommend NOT to use such an approach: it's always better to avoid things like select *.
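The row-wise IN comparison above can be tried outside Oracle as well. The following sketch uses Python's sqlite3 module (row values need SQLite 3.15 or newer, which is an assumption about the local build) and adds a hypothetical third row containing NULL to show why the "no NULL values" assumption matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (colA TEXT, colB TEXT, colC TEXT);
    CREATE TABLE tab2 (columnA TEXT, columnB TEXT, columnC TEXT);
    -- The third row is an illustrative addition containing a NULL.
    INSERT INTO tab1 VALUES ('a', 'b', 'c'), ('A', 'B', 'C'), ('x', NULL, 'z');
    INSERT INTO tab2 VALUES ('a', 'b', 'c'), ('x', NULL, 'z');
""")

# Compare whole rows at once: a tab1 row is kept only if an identical
# (colA, colB, colC) combination exists in tab2.
matching_rows = conn.execute("""
    SELECT colA, colB, colC
    FROM tab1
    WHERE (colA, colB, colC) IN (SELECT columnA, columnB, columnC FROM tab2)
""").fetchall()
print(matching_rows)
```

The row containing NULL is present in both tables but is not reported as a match, because NULL = NULL does not evaluate to true.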
OK, so if table1 and table2 contain records from MO_SU, you can do something like this:
with
table1 as (select s.rowid r, s.* from MO_SU s),
table2 as (select s.rowid r, s.* from MO_SU s)
select what_you_need
from
MO_SU MS inner join (select * from table1 t1 union all select * from table2 t2) t on (MS.rowid = t.r);
However, I don't see much point in this: if you're only using data from MO_SU, you could probably select it directly in the WITH clause. rowid is a unique identifier of a row, so if you include rowid in the CTEs you can then join on rowid to keep only the rows that are present in the CTEs.

How to delete all records returned by a subquery?

I want to delete all records that are returned by a certain query, but I can't figure out a proper way to do this. I tried to DELETE FROM mytable WHERE EXISTS (subquery), however, that deleted all records from the table and not just the ones selected by the subquery.
My subquery looks like this:
SELECT
MAX(columnA) as columnA,
-- 50 other columns
FROM myTable
GROUP BY
-- the 50 other columns above
having count(*) > 1;
This should be easy enough, but my mind is just stuck right now. I'm thankful for any suggestions.
Edit: columnA is not unique (also no other column in that table is globally unique)
Presumably, you want to use in:
DELETE FROM myTable
WHERE columnA IN (SELECT MAX(columnA) as columnA
FROM myTable
GROUP BY -- the 50 other columns above
HAVING count(*) > 1
);
This assumes that columnA is globally unique in the table. Otherwise, you will have to work a bit harder.
DELETE FROM myTable t
WHERE EXISTS (SELECT 1
FROM (SELECT MAX(columnA) as columnA,
col1, col2, . . .
FROM myTable
GROUP BY -- the 50 other columns above
HAVING count(*) > 1
) t2
WHERE t.columnA = t2.columnA AND
t.col1 = t2.col1 AND
t.col2 = t2.col2 AND . . .
);
And, even this isn't guaranteed to work if any of the columns have NULL values (although the conditions can be easily modified to handle this).
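As a rough illustration of the EXISTS variant and its NULL caveat, here is a sketch using Python's sqlite3 module with only two of the "other" columns and made-up rows. It replaces = with SQLite's null-safe IS operator on the grouped columns; other databases spell null-safe comparison differently, so treat that part as SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (columnA INTEGER, col1 TEXT, col2 TEXT);
    INSERT INTO myTable VALUES
        (1, 'x', 'y'),
        (2, 'x', 'y'),     -- duplicate of the row above on (col1, col2)
        (2, 'p', 'q'),     -- same columnA, but not part of a duplicate group
        (5, 'n', NULL),
        (7, 'n', NULL);    -- duplicate group that contains a NULL
""")

# Delete, from every group with more than one row, the row holding MAX(columnA).
# IS treats two NULLs as equal, so the (n, NULL) group is handled as well.
conn.execute("""
    DELETE FROM myTable
    WHERE EXISTS (
        SELECT 1
        FROM (
            SELECT MAX(columnA) AS columnA, col1, col2
            FROM myTable
            GROUP BY col1, col2
            HAVING COUNT(*) > 1
        ) AS t2
        WHERE myTable.columnA = t2.columnA
          AND myTable.col1 IS t2.col1
          AND myTable.col2 IS t2.col2
    )
""")

remaining = conn.execute(
    "SELECT columnA, col1, col2 FROM myTable ORDER BY columnA, col1"
).fetchall()
print(remaining)
```

Note that the row (2, 'p', 'q') survives even though its columnA equals a deleted value, which is the reason for matching on all columns when columnA is not unique.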
Another solution if the uniqueness is only guaranteed by a set of columns:
delete table1 where (col1, col2, ...) in (
select min(col1), col2, ...
from table1
where...
group by col2, ...
)
Rows with a NULL in any of the compared columns will never match, so they will not be deleted.
To handle NULLs as well, try something like
with data (id, val1, val2) as
(
select 1, '10', 10 from dual union all
select 2, '20', 21 from dual union all
select 2, null, 21 from dual union all
select 2, '20', null from dual
)
-- map null values in column to a nonexistent value in this column
select * from data d where (d.id, nvl(d.val1, '#<null>')) in
(select dd.id, nvl(dd.val1, '#<null>') from data dd)
If you need to delete all the rows of a table whose value in a given column appears in the result of a query, you can use something like
delete table
where my_column in ( select column from ...)

How to use order by with union all in sql?

I tried the sql query given below:
SELECT * FROM (SELECT *
FROM TABLE_A ORDER BY COLUMN_1)DUMMY_TABLE
UNION ALL
SELECT * FROM TABLE_B
It results in the following error:
The ORDER BY clause is invalid in views, inline functions, derived
tables, subqueries, and common table expressions, unless TOP or FOR
XML is also specified.
I need to use order by in union all. How do I accomplish this?
SELECT *
FROM
(
SELECT * FROM TABLE_A
UNION ALL
SELECT * FROM TABLE_B
) dum
-- ORDER BY .....
But if you want to have all records from TABLE_A at the top of the result list, then you can add a user-defined value to use for ordering:
SELECT *
FROM
(
SELECT *, 1 sortby FROM TABLE_A
UNION ALL
SELECT *, 2 sortby FROM TABLE_B
) dum
ORDER BY sortby
You don't really need the derived table. You can sort directly:
SELECT *, 1 AS RN FROM TABLE_A
UNION ALL
SELECT *, 2 AS RN FROM TABLE_B
ORDER BY RN, COLUMN_1
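The direct form above can be sketched with Python's sqlite3 module. The table contents are invented for illustration, and COLUMN_1 is selected explicitly instead of using *.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (COLUMN_1 TEXT);
    CREATE TABLE TABLE_B (COLUMN_1 TEXT);
    INSERT INTO TABLE_A VALUES ('m'), ('c');
    INSERT INTO TABLE_B VALUES ('z'), ('a');
""")

# The constant RN marks which table a row came from. A single ORDER BY after
# the last branch sorts the combined result: TABLE_A rows first, then TABLE_B,
# each group sorted by COLUMN_1.
rows = conn.execute("""
    SELECT COLUMN_1, 1 AS RN FROM TABLE_A
    UNION ALL
    SELECT COLUMN_1, 2 AS RN FROM TABLE_B
    ORDER BY RN, COLUMN_1
""").fetchall()
print(rows)
```

Even though 'a' sorts before 'c' alphabetically, it appears later because its RN value places all TABLE_B rows after the TABLE_A rows.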
Not a direct response to the OP, but I thought I would jump in here in response to the OP's error message, which may point you in another direction entirely!
All these answers refer to an overall ORDER BY applied once the record set has been retrieved, sorting the lot.
What if you want to ORDER BY each portion of the UNION independently, and still have them "joined" in the same SELECT?
SELECT pass1.* FROM
(SELECT TOP 1000 tblA.ID, tblA.CustomerName
FROM TABLE_A AS tblA ORDER BY 2) AS pass1
UNION ALL
SELECT pass2.* FROM
(SELECT TOP 1000 tblB.ID, tblB.CustomerName
FROM TABLE_B AS tblB ORDER BY 2) AS pass2
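Note that TOP 1000 is an arbitrary number. Use a number big enough to capture all of the data you require. Be aware that SQL Server only guarantees the order of the final result when the outermost query has its own ORDER BY, so treat the ordering inside each derived table as a way to choose rows rather than as a guaranteed output order.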
There will be times when you need to do something like this:
pull the top 5 rows from table 1 based on one sort
and the bottom 5 rows from table 2 based on another sort,
and union these together.
Solution:
select * from (
-- top 5 records
select top 5 col1, col2, col3
from table1
group by col1, col2, col3
order by col3 desc ) z
union all
select * from (
-- bottom 5 records
select top 5 col1, col2, col3
from table2
group by col1, col2, col3
order by col3 ) z
This was the only way I was able to get around the error, and it worked fine for me.
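The same "top N from one table, bottom N from another" idea can be sketched in Python's sqlite3 module. SQLite has no TOP, so LIMIT is swapped in, N is reduced to 2, and the data and the part column are hypothetical additions. The outer ORDER BY is there so the final order is well defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT, col3 INTEGER);
    CREATE TABLE table2 (col1 TEXT, col3 INTEGER);
    INSERT INTO table1 VALUES ('a', 10), ('b', 30), ('c', 20);
    INSERT INTO table2 VALUES ('d', 5), ('e', 1), ('f', 9);
""")

# Each derived table picks its own rows with its own sort and limit;
# the outer ORDER BY then fixes the order of the combined result.
rows = conn.execute("""
    SELECT * FROM (
        SELECT col1, col3, 1 AS part FROM table1 ORDER BY col3 DESC LIMIT 2
    )
    UNION ALL
    SELECT * FROM (
        SELECT col1, col3, 2 AS part FROM table2 ORDER BY col3 ASC LIMIT 2
    )
    ORDER BY part, col3
""").fetchall()
print(rows)
```

The inner ORDER BY clauses only decide which rows are kept; the row ('a', 10) from table1 and ('f', 9) from table2 are cut by the limits.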
SELECT * FROM (SELECT *
FROM TABLE_A ORDER BY COLUMN_1)DUMMY_TABLE
UNION ALL
SELECT * FROM TABLE_B
ORDER BY 2;
Here 2 is the column number. In Oracle SQL you can sort by the position of a column in the select list.
This solved my SELECT statement:
SELECT * FROM
(SELECT id,name FROM TABLE_A
UNION ALL
SELECT id,name FROM TABLE_B ) dum
order by dum.id , dum.name
Here id and name are columns available in both tables; substitute your own columns.
Simply use this; there is no need for a derived table or anything else:
SELECT *, id as TABLE_A_ID FROM TABLE_A
UNION ALL
SELECT *, id as TABLE_B_ID FROM TABLE_B
ORDER BY TABLE_A_ID, TABLE_B_ID
ORDER BY after the last UNION should apply to both datasets joined by union.
The solution shown below:
SELECT *,id AS sameColumn1 FROM Locations
UNION ALL
SELECT *,id AS sameColumn2 FROM Cities
ORDER BY sameColumn1,sameColumn2
select CONCAT(Name, '(',substr(occupation, 1, 1), ')') AS f1
from OCCUPATIONS
union
select temp.str AS f1 from
(select count(occupation) AS counts, occupation, concat('There are a total of ' ,count(occupation) ,' ', lower(occupation),'s.') As str from OCCUPATIONS group by occupation order by counts ASC, occupation ASC
) As temp
order by f1

Count rows in more than one table with tSQL

I need to count rows in more than one table in SQL Server 2008. I do this:
select count(*) from (select * from tbl1 union all select * from tbl2)
But it gives me an error of incorrect syntax near ). Why?
PS. The actual number of tables can be more than 2.
In case your tables have different numbers of columns, try it this way:
SELECT count(*)
FROM (
SELECT NULL as columnName
FROM tbl1
UNION ALL
SELECT NULL
FROM tbl2
) T
Try this. You have to give a name to your derived table:
select count(*) from
(select * from tbl1 union all select * from tbl2)a
I think you have to alias the SELECT in the FROM clause:
select count(*)
from
(
select * from tbl1
union all
select * from tbl2
) AS SUB
You also need to ensure that the * in tbl1 and tbl2 returns exactly the same number of columns and that the column types match.
I don't like doing the union before doing the count. It gives the SQL optimizer an opportunity to choose to do more work.
AlexK's (deleted) solution is fine. You could also do:
select (select count(*) from tbl1) + (select count(*) from tbl2) as cnt
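To compare the two approaches from the answers above, here is a small sketch in Python's sqlite3 module with made-up tables that deliberately have different numbers of columns. SQLite does not require the derived-table alias that SQL Server insists on, so this only shows that both queries return the same total, not the original syntax error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (a INTEGER);
    CREATE TABLE tbl2 (a INTEGER, b TEXT);
    INSERT INTO tbl1 VALUES (1), (2), (3);
    INSERT INTO tbl2 VALUES (1, 'x'), (2, 'y');
""")

# Approach 1: union a constant column from each table, then count the rows.
# Selecting NULL avoids any mismatch in column count or type between tables.
count_via_union = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT NULL AS columnName FROM tbl1
        UNION ALL
        SELECT NULL FROM tbl2
    ) AS T
""").fetchone()[0]

# Approach 2: count each table separately and add the scalar results.
count_via_sum = conn.execute("""
    SELECT (SELECT COUNT(*) FROM tbl1) + (SELECT COUNT(*) FROM tbl2) AS cnt
""").fetchone()[0]

print(count_via_union, count_via_sum)
```

Both forms extend to more than two tables by adding another UNION ALL branch or another scalar subquery.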