How to combine multiple SELECT statements into a single query & get a single result output - sql

I have multiple SELECT queries which is ran against different tables.
The output of all the queries have the same number of rows (every query when ran individually will have the same number of rows). Is there a way I can combine the output of all these queries into a single result? (Keep out from first query and add the output of next query as a column to the output of the next query). I dont want to save these tables into database as I am just doing some validation testing.
Example:
SELECT AAA,BBB,CCC FROM Table1
SELECT Table2.DDD, Table1.AAA
FROM Table2
INNER JOIN Table1
ON Table1.AAA = Table2.AAA
I tried writing combining the query as
SELECT Table1.AAA,Table1.BBB,Table1.CCC,T1.DDD
FROM Table1,
(SELECT Table2.DDD, Table1.AAA
FROM Table2
INNER JOIN Table1
ON Table1.AAA = Table2.AAA)T1
I tried doing the above combined query, but instead of getting 11 rows as output (both queries above had result of 11 rows), I am getting 35 rows as output.
Hope the question made sense!

You'll need to specify a criteria to match each row the first query with which row of the second query.
If, for example, the column AAA is unique in both queries and you want to match rows with the same values you could do:
select a.*, b.*
from (
SELECT AAA,BBB,CCC FROM Table1
) a
full join join (
SELECT Table2.DDD, Table1.AAA
FROM Table2
INNER JOIN Table1
ON Table1.AAA = Table2.AAA
) b on b.aaa = a.aaa
If there aren't any clear matching rules, you can produce an artificial row number on each result set and use it to match rows. For example:
select
a.aaa, a.bbb, a.ccc,
b.ddd, b.aaa
from (
SELECT AAA, BBB, CCC,
row_number() over(order by aaa) as rn
FROM Table1
) a
full join join (
SELECT Table2.DDD, Table1.AAA,
row_number() over(order by table1.aaa, table2.ddd) as rn
FROM Table2
INNER JOIN Table1
ON Table1.AAA = Table2.AAA
) b on b.rn = a.rn

If you have several results and want to have all of them as additional columns you can simply use ",":
create table temp1 as select '1' as c1 from DUAL;
create table temp2 as select '2' as c2 from DUAL;
create table temp3 as select '3' as c3 from DUAL;
select a.c1, b.c2, c.c3 from temp1 a, (select c2 from temp2) b, (select c3 from temp3) c;
An alternative could also be that you want to have all the results as additional rows then you would use UNION ALL between the individual results.

Related

Select data from multiple table without join sql

I have 2 SQL table.Table A and Table B. Both of this table have 10000 records respectively.
In Table A have 3 Column=>
ColumnA,ColumnB,ColumnC
In Table B have 3 Column=>
ColumnD,ColumnE,ColumnF
My Requirement is to select ColumnA and ColumnD with their original record(10000).
My question is how can I select only ColumnA and ColumnD.
The first problem is I cant join this two table because this two table stands as separately.
The second problem is I cant Union this two table because my requirement is to get Two column but when I union, I only get one column with combining two column.
You could create an on-the-fly join column via ROW_NUMBER, and then join on that:
WITH cte1 AS (
SELECT ColumnA, ROW_NUMBER() OVER (ORDER BY ColumnA) rn
FROM TableA
),
cte2 AS (
SELECT ColumnD, ROW_NUMBER() OVER (ORDER BY ColumnD) rn
FROM TableB
)
SELECT t1.ColumnA, t2.ColumnD
FROM cte t1
INNER JOIN cte t2
ON t1.rn = t2.rn;
Of course, this would just pair the 10K records using the arbitrary orderings in the A and D columns. If you have some specific logic for how these two columns should be paired up, then let us know. The bottom line is that you can't easily get away from the concepts of join or union to bring these two columns together.

Comparing two tables for equality in HIVE

I have two tables, table1 and table2. Each with the same columns:
key, c1, c2, c3
I want to check to see if these tables are equal to eachother (they have the same rows). So far I have these two queries (<> = not equal in HIVE):
select count(*) from table1 t1
left outer join table2 t2
on t1.key=t2.key
where t2.key is null or t1.c1<>t2.c1 or t1.c2<>t2.c2 or t1.c3<>t2.c3
And
select count(*) from table1 t1
left outer join table2 t2
on t1.key=t2.key and t1.c1=t2.c1 and t1.c2=t2.c2 and t1.c3=t2.c3
where t2.key is null
So my idea is that, if a zero count is returned, the tables are the same. However, I'm getting a zero count for the first query, and a non-zero count for the second query. How exactly do they differ? If there is a better way to check this certainly let me know.
The first one excludes rows where t1.c1, t1.c2, t1.c3, t2.c1, t2.c2, or t2.c3 is null. That means that you effectively doing an inner join.
The second one will find rows that exist in t1 but not in t2.
To also find rows that exist in t2 but not in t1 you can do a full outer join. The following SQL assumes that all columns are NOT NULL:
select count(*) from table1 t1
full outer join table2 t2
on t1.key=t2.key and t1.c1=t2.c1 and t1.c2=t2.c2 and t1.c3=t2.c3
where t1.key is null /* this condition matches rows that only exist in t2 */
or t2.key is null /* this condition matches rows that only exist in t1 */
If you want to check for duplicates and the tables have exactly the same structure and the tables do not have duplicates within them, then you can do:
select t.key, t.c1, t.c2, t.c3, count(*) as cnt
from ((select t1.*, 1 as which from table1 t1) union all
(select t2.*, 2 as which from table2 t2)
) t
group by t.key, t.c1, t.c2, t.c3
having cnt <> 2;
There are various ways that you can relax the conditions in the first paragraph, if necessary.
Note that this version also works when the columns have NULL values. These might be causing the problem with your data.
Well, the best way is calculate the hash sum of each table, and compare the sum of hash.
So no matter how many column are they, no matter what data type are they, as long as the two table has the same schema, you can use following query to do the comparison:
select sum(hash(*)) from t1;
select sum(hash(*)) from t2;
And you just need to compare the return values.
I used EXCEPT statement and it worked.
select * from Original_table
EXCEPT
select * from Revised_table
Will show us all the rows of the Original table that are not in the Revised table.
If your table is partitioned you will have to provide a partition predicate.
Fyi, partition values don't need to be provided if you use Presto and querying via SQL lab.
I would recommend you not using any JOINs to try to compare tables:
it is quite an expensive operations when tables are big (which is often the case in Hive)
it can give problems when some rows/"join keys" are repeated
(and it can also be unpractical when data are in different clusters/datacenters/clouds).
Instead, I think using a checksum approach and comparing the checksums of both tables is best.
I have developed a Python script that allows you to do easily such comparison, and see the differences in a webbrowser:
https://github.com/bolcom/hive_compared_bq
I hope that can help you!
First get count for both the tables C1 and C2. C1 and C2 should be equal. C1 and C2 can be obtained from the following query
select count(*) from table1
if C1 and C2 are not equal, then the tables are not identical.
2: Find distinct count for both the tables DC1 and DC2. DC1 and DC2 should be equal. Number of distinct records can be found using the following query:
select count(*) from (select distinct * from table1)
if DC1 and DC2 are not equal, the tables are not identical.
3: Now get the number of records obtained by performing a union on the 2 tables. Let it be U. Use the following query to get the number of records in a union of 2 tables:
SELECT count (*)
FROM
(SELECT *
FROM table1
UNION
SELECT *
FROM table2)
You can say that the data in the 2 tables is identical if distinct count for the 2 tables is equal to the number of records obtained by performing union of the 2 tables. ie DC1 = U and DC2 = U
another variant
select c1-c2 "different row counts"
, c1-c3 "mismatched rows"
from
( select count(*) c1 from table1)
,( select count(*) c2 from table2 )
,(select count(*) c3 from table1 t1, table2 t2
where t1.key= t2.key
and T1.c1=T2.c1 )
Try with WITH Clause:
With cnt as(
select count(*) cn1 from table1
)
select 'X' from dual,cnt where cnt.cn1 = (select count(*) from table2);
One easy solution is to do inner join. Let's suppose we have two hive tables namely table1 and table2. Both the table has same column namely col1, col2 and col3. The number of rows should also be same. Then the command would be as follows
**
select count(*) from table1
inner join table2
on table1.col1 = table2.col1
and table1.col2 = table2.col2
and table1.col3 = table2.col3 ;
**
If the output value is same as number of rows in table1 and table2 , then all the columns has same value, If however the output count is lesser than there are some data which are different.
Use a MINUS operator:
SELECT count(*) FROM
(SELECT t1.c1, t1.c2, t1.c3 from table1 t1
MINUS
SELECT t2.c1, t2.c2, t2.c3 from table2 t2)

Multiple count based on dynamic criteria

I have two database for which I want to compare the amount of times a case appears.
TAB1:
ID Sequence
A2D 1
A2D 2
A2D 3
A3D 1
TAB2:
ID Sequence
A2D 1
A2D 2
A3D 1
A3D 2
Now, for this example, I am trying to get this result:
ID Table1 Table2
A2D 3 2
A3D 1 2
I have tried these code without any success:
SELECT R1.ID as ID, COUNT(R1.ID) as Table1,
COUNT(R2.ID) as Table2
FROM TAB1 AS R1, TAB2 AS R2
WHERE R1.ID = R2.ID
GROUP BY R1.ID
This one gave me wrong count values...
Also, this one simply crash:
select
(
select count(*) as Table1
from TAB1
where ID = R1.ID
),(
select count(*) as Table2
from TAB2
where ID= R1.ID
)
FROM TAB1 AS R1
As you can see though, I am trying to have my criteria dynamic. Most examples I found were including basic hard-coded criteria. But for my case, I want the query to look at my first table ID, count the amount of time it appears, do it for the 2nd table with the same ID, then move on to the next ID.
If my question lacks information or is confusing just ask me, I'll do my best to be more precise.
Thanks in advance !
Here I am using a UNION ALL as a subquery
SELECT ID, SUM(T1) AS Table1, SUM(T2) AS Table2
FROM
(SELECT ID, COUNT(ID) AS T1, 0 AS T2 FROM TAB1 GROUP BY ID
UNION ALL
SELECT ID, 0 AS T1, COUNT(ID) AS T2 FROM TAB2 GROUP BY ID)
GROUP BY ID
HAVING SUM(T1)>0 AND SUM(T2)>0
I used a different approach, but unfortunately I have to use two queries, i still don't know if they can be combined together. The first one is just for making sums of both tables, and combining the results:
SELECT "Tab1" AS [Table], Tab1.ID, Count(*) AS Total
FROM Tab1
GROUP BY "Tab1", Tab1.ID
UNION SELECT "Tab2" AS [Table], Tab2.ID, Count(*) AS Total
FROM Tab2
GROUP BY "Tab2", Tab2.ID
and, since Access supports Pivot queries, you can use this:
TRANSFORM Sum(qrySums.[Total]) AS Total
SELECT qrySums.[ID]
FROM qrySums
GROUP BY qrySums.[ID]
PIVOT qrySums.[Table];
Not sure if I understand your question, but you could try something like this:
SELECT DISTINCT t.ID,
(SELECT COUNT(ID) FROM R1 WHERE ID = t.ID) AS table1,
(SELECT COUNT(ID) FROM R2 WHERE ID = t.ID) AS table2
FROM table1 t
To get the desired results, I broke it down into two sub-queries (R1SQ and R2SQ) and a main UNION query - R1R2 that uses inner, left and right joins to include all row entries including those rows that do not appear in both tables:
R1SQ
SELECT R1.Builder, Count(R1.Builder) AS Table1
FROM R1
GROUP BY R1.Builder;
R2SQ
SELECT R2.Builder_E, Count(R2.Builder_E) AS Table2
FROM R2
GROUP BY R2.Builder_E;
R1R2
SELECT R1SQ.Builder, R1SQ.Table1, R2SQ.Table2
FROM R1SQ INNER JOIN R2SQ ON R1SQ.Builder = R2SQ.Builder_E
UNION
SELECT R1SQ.Builder, R1SQ.Table1, 0 AS Table2
FROM R1SQ LEFT JOIN R2SQ ON R1SQ.Builder = R2SQ.Builder_E
WHERE (((R2SQ.Builder_E) Is Null))
UNION
SELECT R2SQ.Builder_E, 0 AS Table1, R2SQ.Table2
FROM R1SQ RIGHT JOIN R2SQ ON R1SQ.Builder = R2SQ.Builder_E
WHERE (((R1SQ.Builder) Is Null))
ORDER BY R1SQ.Builder;

Is possible have different conditions for each row in a query?

How I can select a set of rows where each row match a different condition?
Example:
Supposing I have a table with a column called name, I want the result ONLY IF the first row name matches 'A', the second row name matches 'B' and the third row name matches 'C'.
Edit:
I want to do this to work without a fixed size, but in a way I can define the sequence like R,X,V,P,T and it matches the sequence, each one in a row, but in the order.
you can, but probably not in a way you would want:
if your table has a numeric id field, that is incremented with each row, you can self join that table 3 times (lets say as "a", "b" and "c") and use the join condition a.id + 1 = b.id and b.id + 1 = c.id and put you filter in a where clause like: a.name = 'A' AND b.name = 'B' AND c.name = 'C'
but don't expect performance ...
Assuming that You know how to provide a row number to your rows (ROW_NUMBER() in SQL Server, for instance), You can create a lookup (match) table and join on it. See below for explanation:
LookupTable:
RowNum Value
1 A
2 B
3 C
Your SourceTable source table (assuming You already added RowNum to it-in case You didn't, just introduce subquery for it (or CTE for SQL Server 2005 or newer):
RowNum Name
-----------
1 A
2 B
3 C
4 D
Now You need to inner join LookupTable with your SourceTable on LookupTable.RowNum = SourceTable.RowNum AND LookupTable.Name = SourceTable.Name. Then do a left join of this result with LookupTable on RowNum only. If there is LookupTable.RowNum IS NULL in final result then You know that there is no complete match on at least one row.
Here is code for joins:
SELECT T.*, LT2.RowNum AS Matched
FROM LookupTable LT2
LEFT JOIN
(
SELECT ST.*
FROM SourceTable ST
INNER JOIN LookupTable LT ON LT.RowNum = ST.RowNum AND LT.Name = ST.Name
) T
ON LT2.RowNum = T.RowNum
Result set of above query will contain rows with Matched IS NULL if row is not matching condition from LookupTable table.
I suppose you could do a sub query for each row, but it wouldn't perform well or scale well at all and would be hard to maintain.
This may be close to what your after... but I need to know where you're getting your values for A, B, C etc...
Select [insert your fields here]
FROM
(Select T1.Name, T1.Age, RowNum as t1RowNum from T T1 order by name) T1O
Full Outer JOIN
(Select T2.Name, T2.Age, RowNum as T2rowNum From T T2 order By name) T2O
ON T1O.T1RowNum+1 = T2O.T2RowNum

How do I merge data from two tables in a single database call into the same columns?

If I run the two statements in batch will they return one table to two to my sqlcommand object with the data merged. What I am trying to do is optimize a search by searching twice, the first time on one set of data and then a second on another. They have the same fields and I’d like to have all the records from both tables show and be added to each other. I need this so that I can sort the data between both sets of data but short of writing a stored procedure I can’t think of a way of doing this.
Eg. Table 1 has columns A and B, Table 2 has these same columns but different data source. I then wan to merge them so that if a only exists in one column it is added to the result set and if both exist it eh tables the column B will be summed between the two.
Please note that this is not the same as a full outer join operation as that does not merge the data.
[EDIT]
Here's what the code looks like:
Select * From
(Select ID,COUNT(*) AS Count From [Table1]) as T1
full outer join
(Select ID,COUNT(*) AS Count From [Table2]) as T2
on t1.ID = T2.ID
Perhaps you're looking for UNION?
IE:
SELECT A, B FROM Table1
UNION
SELECT A, B FROM Table2
Possibly:
select table1.a, table1.b
from table1
where table1.a not in (select a from table2)
union all
select table1.a, table1.b+table2.b as b
from table1
inner join table2 on table1.a = table2.a
edit: perhaps you would benefit from unioning the tables before counting. e.g.
select id, count() as count from
(select id from table1
union all
select id from table2)
I'm not sure if I understand completely but you seem to be asking about a UNION
SELECT A,B
FROM tableX
UNION ALL
SELECT A,B
FROM tableY
To do it, you would go:
SELECT * INTO TABLE3 FROM TABLE1
UNION
SELECT * FROM TABLE2
Provided both tables have the same columns
I think what you are looking for is this, but I am not sure I am understanding your language correctly.
select id, sum(count) as count
from (
select id, count() as count
from table1
union all
select id, count() as count
from table2
) a
group by id