How can I treat a UNION query as a sub query - sql

I have a set of tables that are logically one table split into pieces for performance reasons. I need to write a query that effectively joins all the tables together so that I can apply a single WHERE clause to the result. I have successfully used a UNION on the result of applying the WHERE clause to each subtable explicitly, as in the following:
SELECT * FROM FRED_1 WHERE CHARLIE = 42
UNION
SELECT * FROM FRED_2 WHERE CHARLIE = 42
UNION
SELECT * FROM FRED_3 WHERE CHARLIE = 42
but as there are ten separate subtables, updating the WHERE clause each time is a pain. What I want is something like this:
SELECT *
FROM (
SELECT * FROM FRED_1
UNION
SELECT * FROM FRED_2
UNION
SELECT * FROM FRED_3)
WHERE CHARLIE = 42
If it makes a difference the query needs to run against a DB2 database.
Here is a more comprehensive (sanitised) version of what I need to do.
select *
from ( select * from FRD_1 union select * from FRD_2 union select * from FRD_3 ) as FRD,
( select * from REQ_1 union select * from REQ_2 union select * from REQ_3 ) as REQ,
( select * from RES_1 union select * from RES_2 union select * from RES_3 ) as RES
where FRD.KEY1 = 123456
and FRD.KEY1 = REQ.KEY1
and FRD.KEY1 = RES.KEY1
and REQ.KEY2 = RES.KEY2
NEW INFORMATION:
It looks like the problem has more to do with the number of fields in the union than anything else. If I greatly restrict the fields I can get most of the syntax variations below working. Unfortunately, restricting the fields so much means the resulting query, while potentially useful, is not giving me the result I wanted. I've managed to get an additional 3 fields from one of the tables in addition to the 2 keys. Any more than that and the query fails.

I believe you have to give a name to your subquery result. I don't know db2 so I'm taking a shot in the dark, but I know this works on several other platforms.
SELECT *
FROM (
SELECT * FROM FRED_1
UNION
SELECT * FROM FRED_2
UNION
SELECT * FROM FRED_3) AS T1
WHERE CHARLIE = 42
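The aliased-subquery form above can be tried out without a DB2 instance. The following is a minimal sketch in Python using the standard-library sqlite3 module; SQLite is only a stand-in for DB2 here, and the sample rows and the OTHER column are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (assumption for this sketch).
conn = sqlite3.connect(":memory:")
for n in (1, 2, 3):
    # OTHER is a hypothetical second column; the original tables are not shown.
    conn.execute(f"CREATE TABLE FRED_{n} (CHARLIE INTEGER, OTHER TEXT)")
conn.executemany("INSERT INTO FRED_1 VALUES (?, ?)", [(42, "a"), (7, "b")])
conn.executemany("INSERT INTO FRED_2 VALUES (?, ?)", [(42, "c")])
conn.executemany("INSERT INTO FRED_3 VALUES (?, ?)", [(9, "d")])

# The UNION is wrapped in a derived table with an alias (T1),
# so the WHERE clause is written only once.
query = """
SELECT *
FROM (
    SELECT * FROM FRED_1
    UNION
    SELECT * FROM FRED_2
    UNION
    SELECT * FROM FRED_3
) AS T1
WHERE CHARLIE = 42
"""
rows = sorted(conn.execute(query).fetchall())
print(rows)
```

Only the rows with CHARLIE = 42 from any of the three tables come back, which is the behaviour the question asks for.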

If the logical implementation is a single table but the physical implementation is multiple tables then how about creating a view that defines the logical model.
CREATE VIEW VW_FRED AS
SELECT * FROM FRED_1
UNION
SELECT * FROM FRED_2
UNION
SELECT * FROM FRED_3
then it's a simple matter of
SELECT * FROM VW_FRED WHERE CHARLIE = 42
Again, I'm not familiar with db2 syntax but this gives you the general idea.
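As a rough check of the view approach, here is a small Python sketch using the standard-library sqlite3 module. SQLite stands in for DB2, the data and the OTHER column are made up, and fred_rows is a hypothetical helper name.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FRED_1 (CHARLIE INTEGER, OTHER TEXT);
CREATE TABLE FRED_2 (CHARLIE INTEGER, OTHER TEXT);
CREATE TABLE FRED_3 (CHARLIE INTEGER, OTHER TEXT);
INSERT INTO FRED_1 VALUES (42, 'a'), (7, 'b');
INSERT INTO FRED_2 VALUES (42, 'c');
INSERT INTO FRED_3 VALUES (9, 'd');

-- The view presents the physical pieces as one logical table.
CREATE VIEW VW_FRED AS
    SELECT * FROM FRED_1
    UNION
    SELECT * FROM FRED_2
    UNION
    SELECT * FROM FRED_3;
""")


def fred_rows(charlie):
    """Query the logical table through the view with a single WHERE clause."""
    cur = conn.execute(
        "SELECT * FROM VW_FRED WHERE CHARLIE = ? ORDER BY OTHER", (charlie,)
    )
    return cur.fetchall()


print(fred_rows(42))
print(fred_rows(9))
```

The filter value changes in one place only, and adding a fourth subtable later means editing the view rather than every query.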

with
FRD as ( select * from FRD_1 union select * from FRD_2 union select * from FRD_3 ),
REQ as ( select * from REQ_1 union select * from REQ_2 union select * from REQ_3 ),
RES as ( select * from RES_1 union select * from RES_2 union select * from RES_3 )
SELECT * from FRD, REQ, RES
WHERE FRD.KEY1 = 123456
and FRD.KEY1 = REQ.KEY1
and FRD.KEY1 = RES.KEY1
and REQ.KEY2 = RES.KEY2

I'm not familiar with DB2 syntax but why aren't you doing this as an INNER JOIN or LEFT JOIN?
SELECT *
FROM FRED_1
INNER JOIN FRED_2
ON FRED_1.Charlie = FRED_2.Charlie
INNER JOIN FRED_3
ON FRED_1.Charlie = FRED_3.Charlie
WHERE FRED_1.Charlie = 42
If the values don't exist in FRED_2 or FRED_3 then use a LEFT/OUTER JOIN. I'm assuming that FRED_1 is a master table, and if a record exists then it will be in this table.

maybe:
SELECT * FROM
(select * from FRD_1
union
select * from FRD_2
union
select * from FRD_3) FRD
INNER JOIN (select * from REQ_1 union select * from REQ_2 union select * from REQ_3) REQ
on FRD.KEY1 = REQ.KEY1
INNER JOIN (select * from RES_1 union select * from RES_2 union select * from RES_3) RES
on FRD.KEY1 = RES.KEY1
WHERE FRD.KEY1 = 123456 and REQ.KEY2 = RES.KEY2

Related

BigQuery Union Distinct Where Value not in Preceding DataSet

I am trying to reconcile some student database with GSuite emails, where usernames have been created inconsistently for years.
The gist of the query I am trying to make on BigQuery is:
Match Emails to Students from email Pattern 1 and union with
Match Emails to Students from email Pattern 2 and union with
Emails not in 1 and 2.
Or in SQL:
with mymatches as (
with emaildataset as (
select 'testA' as col
union all
select 'testB'
union all
select 'testC'
union all
select 'testD'
)
select * from emaildataset where col like '%A'
union distinct
select * from emaildataset where col like '%B'
),
emaildataset2 as (
select 'testA' as col
union all
select 'testB'
union all
select 'testC'
union all
select 'testD'
)
select * from mymatches
union distinct
select * from emaildataset2 where emaildataset2.col not in (select col from mymatches)
This runs happily, but when I run the real code, then I'm getting duplicates.
The real code is now:
with matchedEmails as (
with g as (
select * from gsuite.StudentUsers
union all
select * from gsuite.AlumniUsers
)
select
std.STDCODE,
g.*
from g
inner join quick.all_students_alumni as std
on split(lower(g.Email), '@')[offset(0)] = split(quick.studentEmail(std.FNAME, std.MNAME, std.LNAME, std.STATUSTYPE), '@')[offset(0)]
where g.OU like '/Student%' or OU like '/Alumni%'
union distinct select
std.STDCODE,
g.*
from g
inner join quick.all_students_alumni as std
on split(lower(g.Email), '@')[offset(0)] = split(quick.studentEmail(std.FNAME, '', std.LNAME, std.STATUSTYPE), '@')[offset(0)]
where g.OU like '/Student%' or OU like '/Alumni%'
)
select * from matchedEmails
union distinct select
'NOT MATCHED' as STDCODE,
g.*
from (
select * from gsuite.StudentUsers
union all
select * from gsuite.AlumniUsers
) as g
where g.Email not in (select Email from matchedEmails)
and g.OU like '/Student%' or OU like '/Alumni%'
As a result though, I am getting duplicates in the Email column, which--based on my knowledge and test above--should not be, due to the where g.Email not in (select Email from matchedEmails) clause.
Am I doing something wrong?
I think the very last WHERE clause should be fixed to look like this:
where g.Email not in (select Email from matchedEmails)
and (g.OU like '/Student%' or OU like '/Alumni%')
As you can see, the parentheses around g.OU like '/Student%' or OU like '/Alumni%' were missing. AND binds more tightly than OR, so without them the condition is read as (g.Email not in (...) and g.OU like '/Student%') or OU like '/Alumni%', and every alumni row passes regardless of the NOT IN check.
Something else might still need fixing too, but this explains why you are getting duplicates in the Email column despite the where g.Email not in (select Email from matchedEmails) clause.
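The effect of the missing parentheses can be reproduced on a tiny data set. This is a minimal sketch in Python with the standard-library sqlite3 module rather than BigQuery; the tables g and matched and the example.com addresses are invented, but AND takes precedence over OR in both dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE g (Email TEXT, OU TEXT)")
conn.executemany("INSERT INTO g VALUES (?, ?)", [
    ("s1@example.com", "/Student/2020"),
    ("a1@example.com", "/Alumni/2010"),  # this one is already matched
    ("a2@example.com", "/Alumni/2011"),
])
conn.execute("CREATE TABLE matched (Email TEXT)")
conn.execute("INSERT INTO matched VALUES ('a1@example.com')")

# Parsed as (NOT IN ... AND Student) OR Alumni: alumni rows skip the NOT IN test.
without_parens = conn.execute("""
    SELECT Email FROM g
    WHERE Email NOT IN (SELECT Email FROM matched)
      AND OU LIKE '/Student%' OR OU LIKE '/Alumni%'
    ORDER BY Email
""").fetchall()

# Parsed as NOT IN ... AND (Student OR Alumni): the NOT IN test applies to every row.
with_parens = conn.execute("""
    SELECT Email FROM g
    WHERE Email NOT IN (SELECT Email FROM matched)
      AND (OU LIKE '/Student%' OR OU LIKE '/Alumni%')
    ORDER BY Email
""").fetchall()

print(without_parens)
print(with_parens)
```

The already matched alumni address appears only in the first result, which is how the duplicate rows get into the final union.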

Multiple Linked Servers in one select statement with one where clause, possible?

Got a tricky one today (Might even just be me):
I have 8 linked SQL Server 2012 servers configured on my main SQL Server, and I need to create views so that I can filter the combined results of their tables using only one WHERE clause. Currently I use UNION because the tables all have the same structure.
Currently my solution looks as follows:
SELECT * FROM [LinkedServer_1].[dbo].[Table] where value = 'xxx'
UNION
SELECT * FROM [LinkedServer_2].[dbo].[Table] where value = 'xxx'
UNION
SELECT * FROM [LinkedServer_3].[dbo].[Table] where value = 'xxx'
UNION
SELECT * FROM [LinkedServer_4].[dbo].[Table] where value = 'xxx'
UNION
SELECT * FROM [LinkedServer_5].[dbo].[Table] where value = 'xxx'
UNION
SELECT * FROM [LinkedServer_6].[dbo].[Table] where value = 'xxx'
UNION
SELECT * FROM [LinkedServer_7].[dbo].[Table] where value = 'xxx'
UNION
SELECT * FROM [LinkedServer_8].[dbo].[Table] where value = 'xxx'
As you can see this is becoming quite ugly because I have a select statement and where clause for each linked server and would like to know if there was a simpler way of doing this!
Appreciate the feedback.
Brakkie101
Instead of using views, you can use an inline table-valued function (in effect, a view with parameters). This will not save the initial effort of writing the queries, but it could save some work in the future:
CREATE FUNCTION [dbo].[fn_LinkedSever] (@value NVARCHAR(128))
RETURNS TABLE
AS
RETURN
(
SELECT * FROM [LinkedServer_1].[dbo].[Table] where value = @value
UNION
SELECT * FROM [LinkedServer_2].[dbo].[Table] where value = @value
UNION
SELECT * FROM [LinkedServer_3].[dbo].[Table] where value = @value
UNION
SELECT * FROM [LinkedServer_4].[dbo].[Table] where value = @value
UNION
SELECT * FROM [LinkedServer_5].[dbo].[Table] where value = @value
UNION
SELECT * FROM [LinkedServer_6].[dbo].[Table] where value = @value
UNION
SELECT * FROM [LinkedServer_7].[dbo].[Table] where value = @value
UNION
SELECT * FROM [LinkedServer_8].[dbo].[Table] where value = @value
);
Also, if possible, use UNION ALL instead of UNION.
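The difference between the two operators is easy to see on a toy example. The sketch below uses Python's standard-library sqlite3 module instead of SQL Server; server_1 and server_2 are hypothetical local tables standing in for the linked-server tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE server_1 (value TEXT);
CREATE TABLE server_2 (value TEXT);
INSERT INTO server_1 VALUES ('xxx'), ('yyy');
INSERT INTO server_2 VALUES ('xxx');
""")

# UNION removes duplicate rows, which requires an extra de-duplication step.
union_rows = conn.execute("""
    SELECT value FROM server_1 WHERE value = 'xxx'
    UNION
    SELECT value FROM server_2 WHERE value = 'xxx'
""").fetchall()

# UNION ALL simply concatenates the results and keeps duplicates.
union_all_rows = conn.execute("""
    SELECT value FROM server_1 WHERE value = 'xxx'
    UNION ALL
    SELECT value FROM server_2 WHERE value = 'xxx'
""").fetchall()

print(union_rows)
print(union_all_rows)
```

UNION ALL is only a drop-in replacement when identical rows coming from different servers should all be kept, or cannot occur in the first place.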

SQL join for Loop?

I have beginner knowledge on SQL and I am wondering whether this is possible in SQL.
SQL query 1 >>
select distinct(id) as active_pod from schema_naming
Query 1 output >>
active_pod
DB_1
DB_2
...
DB_20
SQL query 2 >>
select * from DB_1.mapping UNION
select * from DB_2.mapping UNION
....
select * from DB_20.mapping UNION
Due to my limited knowledge of SQL, I'm currently running query #1 first, then changing DB_1, DB_2, ... DB_20 in query #2 by hand every time and running query #2.
However, I was wondering whether there's a way to do this in one query, so I don't have to manually change the DB number in query #2 and don't have to write a UNION for every line.
something like this..(but not sure what to do with union)
select * from {
select distinct id from schema_naming}.user_map
It will be great if someone can shed light on this. (I'm trying to do this on Oracle SQL)
thank you in advance.
Are you trying to get something like this?
SELECT 'SELECT * FROM ' || active_pod || '.' || 'Mapping UNION'
FROM
(
select distinct(id) as active_pod from schema_naming
) DT;
Alternatively, use PL/SQL block:
BEGIN
For i in (SELECT 'SELECT * FROM ' || ACTIVE_POD || '.MAPPING UNION' AS QUERY
FROM SCHEMA_NAMING) loop
dbms_output.put_line(i.query);
end loop;
END;
/
Your queries will appear in the output window on your IDE.
This is definitely a hack, but it might make your life easier until a better solution is proposed. Basically, use a query to generate your second query; the only manual edit needed is to remove the unnecessary UNION on the final line.
SELECT 'SELECT * FROM ' || ACTIVE_POD || '.MAPPING UNION' AS QUERY
FROM SCHEMA_NAMING
Results:
SELECT * FROM DB_1.MAPPING UNION
SELECT * FROM DB_2.MAPPING UNION
SELECT * FROM DB_3.MAPPING UNION
SELECT * FROM DB_4.MAPPING UNION
SELECT * FROM DB_5.MAPPING UNION
SELECT * FROM DB_6.MAPPING UNION
SELECT * FROM DB_7.MAPPING UNION
SELECT * FROM DB_8.MAPPING UNION
SELECT * FROM DB_9.MAPPING UNION
SELECT * FROM DB_10.MAPPING UNION
SELECT * FROM DB_11.MAPPING UNION
SELECT * FROM DB_12.MAPPING UNION
SELECT * FROM DB_13.MAPPING UNION
SELECT * FROM DB_14.MAPPING UNION
SELECT * FROM DB_15.MAPPING UNION
SELECT * FROM DB_16.MAPPING UNION
SELECT * FROM DB_17.MAPPING UNION
SELECT * FROM DB_18.MAPPING UNION
SELECT * FROM DB_19.MAPPING UNION
SELECT * FROM DB_20.MAPPING UNION
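If the query text is assembled in a client script instead, the trailing UNION never appears. The following is a minimal Python sketch of that idea; build_union_query is a hypothetical helper, and in practice the schema list would come from the schema_naming query rather than being hard-coded.

```python
def build_union_query(schemas, table="MAPPING"):
    """Return one SELECT per schema, joined by UNION with no trailing keyword.

    The names are interpolated directly into the SQL text, so they must come
    from a trusted source (here: the ids listed in schema_naming).
    """
    selects = [f"SELECT * FROM {schema}.{table}" for schema in schemas]
    return " UNION\n".join(selects)


# Hard-coded stand-in for the result of the schema_naming query.
schemas = [f"DB_{i}" for i in range(1, 4)]
query = build_union_query(schemas)
print(query)
```

Joining with the separator places UNION only between statements, so no manual clean-up of the last line is needed.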

Using Random function in the WHERE Clause

In my WHERE Clause I'm using the random function to get a random number between 1 and 5. However the result is always empty without any error.
Here it is:
Select Question._id, question_text, question_type, topic, favorite,
picture_text, picture_src, video_text, video_src, info_title, info_text,
info_picture_src, topic_text
FROM Question
LEFT JOIN Question_Lv ON Question._id = Question_Lv.question_id
LEFT JOIN Info ON Question._id = Info.question_id
LEFT JOIN Info_Lv ON Question._id = Info_Lv.question_id
LEFT JOIN Picture ON Question._id = Picture.question_id
LEFT JOIN Picture_Lv ON Question._id = Picture_Lv.question_id
LEFT JOIN Video ON Question._id = Video.question_id
LEFT JOIN Video_Lv ON Question._id = Video_Lv.question_id
LEFT JOIN Topic ON Question.topic = Topic._id
LEFT JOIN Topic_Lv ON Topic._id = Topic_Lv.topic_id
LEFT JOIN Exam ON Question._id = Exam.question_id
WHERE Exam.exam = (random() * 5+ 1)
What is the random function doing in this case and how to use it correctly?
From Docs
random()
The random() function returns a pseudo-random integer between
-9223372036854775808 and +9223372036854775807.
So random() does not return a value between 0 and 1 as you assumed; the right-hand side almost never equals an exam number, and hence no rows are returned.
You can get it between 0 and 1 by dividing it with 2×9223372036854775808 and adding 0.5 to it.
random() / 18446744073709551616 + 0.5
So, your where clause becomes:
WHERE Exam.exam = ((random() / 18446744073709551616 + 0.5) * 5 + 1)
which is same as:
WHERE Exam.exam = 5 * random() / 18446744073709551616 + 3.5
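Also, you'll need to turn the right-hand side into an integer. The expression yields values from 1 up to (but not including) 6, so round() would return 6 for anything from 5.5 upwards and would pick 1 only half as often as 2 to 5. Truncating avoids both problems:
WHERE Exam.exam = cast(5 * random() / 18446744073709551616 + 3.5 as integer)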
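For comparison, a commonly used SQLite idiom for a random integer from 1 to 5 is abs(random()) % 5 + 1. This is a different technique from the scaling shown above, not the answer's own method. The sketch below uses Python's standard-library sqlite3 module to show that the raw value is a 64-bit integer and that the idiom stays within range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A raw call returns a signed 64-bit integer, not a fraction between 0 and 1.
raw = conn.execute("SELECT random()").fetchone()[0]
print(raw)

# abs(...) % 5 gives 0..4, so adding 1 gives an integer from 1 to 5.
draws = [
    conn.execute("SELECT abs(random()) % 5 + 1").fetchone()[0]
    for _ in range(1000)
]
print(sorted(set(draws)))
```

Note that either expression placed directly in a WHERE clause may be re-evaluated for each row, so different rows can be compared against different random numbers.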
I'll answer this question using Vertica as a database.
Vertica has the function RANDOM(), which returns a random double precision number between 0 and 1, and the function RANDOMINT(<integer>), which returns an integer number between 0 and <integer>-1. I'll use RANDOMINT(5) for this example.
As a general suggestion - Isolate your specific problem in your question. The joins in your query are not part of the problem. And use a sample table, like I do in the code below.
As some of the previous answers suggested, RANDOMINT(5) will return a new random integer between 0 and 4 for each of the rows that are read from the exam table.
See here:
WITH exam(id,exam,exam_res) AS (
SELECT 1,1,'exam_res_1'
UNION ALL SELECT 2,2,'exam_res_2'
UNION ALL SELECT 3,3,'exam_res_3'
UNION ALL SELECT 4,4,'exam_res_4'
UNION ALL SELECT 5,5,'exam_res_5'
UNION ALL SELECT 6,1,'exam_res_1'
UNION ALL SELECT 7,2,'exam_res_2'
UNION ALL SELECT 8,3,'exam_res_3'
UNION ALL SELECT 9,4,'exam_res_4'
UNION ALL SELECT 10,5,'exam_res_5'
UNION ALL SELECT 11,1,'exam_res_1'
UNION ALL SELECT 12,2,'exam_res_2'
UNION ALL SELECT 13,3,'exam_res_3'
UNION ALL SELECT 14,4,'exam_res_4'
UNION ALL SELECT 15,5,'exam_res_5'
)
SELECT * FROM exam WHERE exam=RANDOMINT(5)+1
;
id|exam|exam_res
3| 3|exam_res_3
4| 4|exam_res_4
5| 5|exam_res_5
6| 1|exam_res_1
7| 2|exam_res_2
9| 4|exam_res_4
12| 2|exam_res_2
What you need to do is make sure that you call your random number generator only once.
If your database complies with the SQL:1999 standard and supports the WITH clause (the common table expression, which I also use to generate the sample data), do that in a common table expression as well, which I call search_exam:
WITH search_exam(exam) AS (
SELECT RANDOMINT(5)+1
)
, exam(id,exam,exam_res) AS (
SELECT 1,1,'exam_res_1'
UNION ALL SELECT 2,2,'exam_res_2'
UNION ALL SELECT 3,3,'exam_res_3'
UNION ALL SELECT 4,4,'exam_res_4'
UNION ALL SELECT 5,5,'exam_res_5'
UNION ALL SELECT 6,1,'exam_res_1'
UNION ALL SELECT 7,2,'exam_res_2'
UNION ALL SELECT 8,3,'exam_res_3'
UNION ALL SELECT 9,4,'exam_res_4'
UNION ALL SELECT 10,5,'exam_res_5'
UNION ALL SELECT 11,1,'exam_res_1'
UNION ALL SELECT 12,2,'exam_res_2'
UNION ALL SELECT 13,3,'exam_res_3'
UNION ALL SELECT 14,4,'exam_res_4'
UNION ALL SELECT 15,5,'exam_res_5'
)
SELECT id,exam,exam_res FROM exam
WHERE exam=(SELECT exam FROM search_exam)
;
id|exam|exam_res
1| 1|exam_res_1
6| 1|exam_res_1
11| 1|exam_res_1
Alternatively, you can go SELECT id,exam,exam_res FROM exam INNER JOIN search_exam USING(exam) .
Happy playing -
Marco the Sane
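The "call the generator only once" idea can also be applied from client code: draw the number in one statement, then pass it to the filter as a parameter. This is a minimal sketch in Python with the standard-library sqlite3 module. SQLite stands in for Vertica, so abs(random()) % 5 + 1 replaces RANDOMINT(5)+1, and the table mirrors the sample data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam (id INTEGER, exam INTEGER, exam_res TEXT)")
# 15 rows cycling through exam numbers 1..5, like the sample data above.
conn.executemany(
    "INSERT INTO exam VALUES (?, ?, ?)",
    [(i, (i - 1) % 5 + 1, f"exam_res_{(i - 1) % 5 + 1}") for i in range(1, 16)],
)

# Draw the random exam number exactly once.
picked = conn.execute("SELECT abs(random()) % 5 + 1").fetchone()[0]

# Every row is now compared against the same value.
rows = conn.execute(
    "SELECT id, exam, exam_res FROM exam WHERE exam = ? ORDER BY id",
    (picked,),
).fetchall()
print(picked, rows)
```

Because the value is fixed before the filter runs, the result always contains exactly the three rows of a single exam.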

Using UNION with Sequel

I want to define a SQL-command like this:
SELECT * FROM WOMEN
UNION
SELECT * FROM MEN
I tried to define this with the following code sequence in Ruby + Sequel:
require 'sequel'
DB = Sequel::Database.new()
sel = DB[:women].union(DB[:men])
puts sel.sql
The result is (I made some pretty print on the result):
SELECT * FROM (
SELECT * FROM `women`
UNION
SELECT * FROM `men`
) AS 't1'
There is an additional (superfluous?) SELECT.
If I define multiple UNION like in this code sample
sel = DB[:women].union(DB[:men]).union(DB[:girls]).union(DB[:boys])
puts sel.sql
I get more superfluous SELECTs.
SELECT * FROM (
SELECT * FROM (
SELECT * FROM (
SELECT * FROM `women`
UNION
SELECT * FROM `men`
) AS 't1'
UNION
SELECT * FROM `girls`
) AS 't1'
UNION
SELECT * FROM `boys`
) AS 't1'
I detected no problem with it up to now, the results seem to be the same.
My questions:
Is there a reason for the additional selects (beside sequel internal procedures)
Can I avoid the selects?
Can I get problems with this additional selects? (Any Performance issue?)
The reason for the extra SELECTs is so code like DB[:girls].union(DB[:boys]).where(:some_column=>1) operates properly. You can use DB[:girls].union(DB[:boys], :from_self=>false) to not wrap it in the extra SELECTs, as mentioned in the documentation.