SQL Query to retrieve results from two equally designed tables - sql

How can I query the results of two equally designed tables?
if table1 contains 1 column with data:
abc
def
hjj
and table2 contains 1 column with data:
uyy
iuu
pol
then i want my query to return
abc
def
hjj
uyy
iuu
pol
but I want to make sure that if I try to do the same task with multiple columns that the associations remain.

SELECT
Column1, Column2, Column3 FROM Table1
UNION
SELECT
Column1, Column2, Column5 AS Column3 FROM Table2
ORDER BY
Column1
Notice how I do an order by at the end and that Column5 in Table2 is the equivalent of Column3 in Table1. The Order By is of course optional, but allows you to control the order of items from both tables once they are combined.
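To check that the associations between columns survive, here is a minimal sketch using Python's built-in sqlite3 module. The sample values and column types are made up for illustration; only the table and column names come from the query above. UNION matches columns by position, so each output row is one complete source row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; Column5 in Table2 plays the role of Column3.
    CREATE TABLE Table1 (Column1 TEXT, Column2 INTEGER, Column3 TEXT);
    CREATE TABLE Table2 (Column1 TEXT, Column2 INTEGER, Column5 TEXT);
    INSERT INTO Table1 VALUES ('abc', 1, 'x'), ('def', 2, 'y');
    INSERT INTO Table2 VALUES ('uyy', 3, 'z'), ('iuu', 4, 'w');
""")

# Columns are combined by position, so every result row keeps the
# values that belonged together in its source row.
rows = conn.execute("""
    SELECT Column1, Column2, Column3 FROM Table1
    UNION
    SELECT Column1, Column2, Column5 AS Column3 FROM Table2
    ORDER BY Column1
""").fetchall()

for row in rows:
    print(row)
```

Each printed tuple is one original row, so 'uyy' stays paired with 3 and 'z'.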

Use a UNION
SELECT *
FROM TABLE_A
UNION
SELECT *
FROM TABLE_B
UNION returns only the distinct rows, whereas UNION ALL returns every row from both sets, duplicates included.
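A small sketch of that difference, again with Python's sqlite3 module. The single column named col and the sample rows are assumptions for illustration; 'def' is deliberately present in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: 'def' appears in both tables.
    CREATE TABLE TABLE_A (col TEXT);
    CREATE TABLE TABLE_B (col TEXT);
    INSERT INTO TABLE_A VALUES ('abc'), ('def');
    INSERT INTO TABLE_B VALUES ('def'), ('uyy');
""")

# UNION removes the duplicate 'def'.
union_rows = conn.execute(
    "SELECT col FROM TABLE_A UNION SELECT col FROM TABLE_B"
).fetchall()

# UNION ALL keeps both copies of 'def'.
union_all_rows = conn.execute(
    "SELECT col FROM TABLE_A UNION ALL SELECT col FROM TABLE_B"
).fetchall()

print(len(union_rows), len(union_all_rows))
```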

SELECT col FROM t1 UNION SELECT col FROM t2
Union reference.

sev, since UNION is the solution to what you described and you say it didn't work, perhaps you can post the code you wrote that didn't work, as we are clearly missing part of the picture. Are you positive the second table has the records you want? How do you know for sure?

Related

Best way to understand big and complex SQL queries with many subqueries

I just started in a new project, in a new company.
I was given a big and complex SQL, with about 1000 lines and MANY subqueries, joins, sums, group by, etc.
This SQL is used for report generation (it has no inserts nor updates).
The SQL has some flaws, and my first job in the company is to identify and correct these flaws so that the report shows the correct values (I know the correct values by accessing a legacy system written in Cobol...)
How can I make it easier for me to understand the query, so I can identify the flaws?
As an experienced Java programmer, I know how to refactor complex, badly written, monolithic Java code into small, easy-to-understand pieces. But I have no clue how to do that with SQL.
The SQL looks like this:
SELECT columns
FROM
(SELECT columns
FROM
(SELECT DISTINCT columns
FROM table000 alias000
INNER JOIN
table000 alias000
ON column000 = table000.column000
LEFT JOIN
(SELECT columns
FROM (
SELECT DISTINCT columns
FROM columns
WHERE conditions) AS alias000
GROUP BY columns ) alias000
ON
conditions
WHERE conditions
) AS alias000
LEFT JOIN
(SELECT
columns
FROM many_tables
WHERE many_conditions
) )
) AS alias000
ON condition
LEFT JOIN (
SELECT columns
FROM
(SELECT
columns
FROM
many_tables
WHERE many_conditions
) ) ) AS alias001
,
(SELECT
many_columns
FROM
many_tables
WHERE many_conditions) AS alias001
) AS alias001
ON condition
LEFT JOIN
(SELECT
many_columns
FROM many_tables
WHERE many_conditions
) AS alias001
ON condition
,
(SELECT DISTINCT columns
FROM table001 alias001
INNER JOIN
table001 alias001
ON condition
LEFT JOIN
(SELECT columns
FROM (
SELECT DISTINCT columns
FROM tables
WHERE conditions
) AS alias001
GROUP BY
columns ) alias001
ON
condition
WHERE
conditions
) AS alias001
LEFT JOIN
(SELECT columns
FROM tables
WHERE conditions
) AS alias001
ON condition
LEFT JOIN (
SELECT columns
FROM
(SELECT columns
FROM tables
WHERE conditions ) AS alias001
,
(SELECT
columns
FROM
tables
WHERE conditions ) AS alias001
) AS alias001
ON condition
LEFT JOIN
(SELECT
columns
FROM
tables
WHERE conditions
) AS alias001
ON condition
WHERE
condition
) AS alias001
order by column001
How can I make it easier for me to understand the query, so I can identify the flaws?
I deal with code like this every day as we do a lot of reporting and exporting of complex data here.
Step one is to understand the meaning of what you are doing. If you don't understand the meaning, you can't evaluate whether you got the correct results. So understand exactly what you are trying to accomplish, and see if you can find the results you should get for one record in the user interface. It really helps to have something to compare against, so that as you go through the query you can see how adding new things changes the results. If your query uses single letters or something else meaningless for the derived table aliases, then as you figure out what each derived table is supposed to be doing, replace its alias with something more meaningful, like Employees instead of A. This will make it easier for the next person who works on it to decode it later.
Then start at the innermost derived table (or subquery if you prefer, but when it is being used as a table, the term derived table is more accurate). First figure out what it is supposed to be doing. For instance, maybe it is getting all the employees who have less than satisfactory performance evaluations.
Run that and check the results to see if they look correct based on the meaning of what you are doing. For instance, if you are looking at unsatisfactory evaluations and you have 10,000 employees, would 5617 seem like a reasonable result set for that chunk of data? Look for repeated records. If the same person is in there three times, then you likely have a problem where you are joining one to many and getting the many back when you only want one. This can be fixed either by using aggregate functions and GROUP BY or by putting in another derived table to replace the problem join.
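As an illustration of the repeated-records check, here is a sketch with Python's sqlite3 module. The employees and evaluations tables, their columns, and the "score below 3 means unsatisfactory" rule are all hypothetical. It shows the fan-out, a diagnostic query that lists the repeated keys, and one possible fix using an aggregating derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one employee can have many evaluations.
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE evaluations (emp_id INTEGER, score INTEGER);
    INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO evaluations VALUES (1, 2), (1, 1), (2, 2);
""")

# Naive join: Ann comes back once per unsatisfactory evaluation.
joined = conn.execute("""
    SELECT e.emp_id, e.name
    FROM employees e
    INNER JOIN evaluations v ON v.emp_id = e.emp_id
    WHERE v.score < 3
""").fetchall()

# Diagnostic: which keys are repeated in that intermediate result?
repeated = conn.execute("""
    SELECT e.emp_id, COUNT(*) AS n
    FROM employees e
    INNER JOIN evaluations v ON v.emp_id = e.emp_id
    WHERE v.score < 3
    GROUP BY e.emp_id
    HAVING COUNT(*) > 1
""").fetchall()

# One possible fix: collapse the "many" side in a derived table first.
fixed = conn.execute("""
    SELECT e.emp_id, e.name, low.worst_score
    FROM employees e
    INNER JOIN (
        SELECT emp_id, MIN(score) AS worst_score
        FROM evaluations
        WHERE score < 3
        GROUP BY emp_id
    ) AS low ON low.emp_id = e.emp_id
    ORDER BY e.emp_id
""").fetchall()

print(len(joined), repeated, fixed)
```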
Once you have the innermost part clear, start checking the results of the other derived tables, adding the code back in and checking the results at each stage until you find where records dropped out that should not have (Hey, I had 137 employees at this stage and now I only have 116. What caused that?). Remember that this is only a clue to look at why that happened. As you build a complex query there will be times when the basic results should change and times when they should not, which is why understanding the meaning of the data is critical.
Some things in general to look out for:
How null values are handled can affect results.
Mixing implicit and explicit joins can cause incorrect results in some databases.
At any rate, you should always replace all implicit joins with explicit ones. That makes the code clearer and less likely to have errors.
If you have implicit joins, look for accidental cross joins. They are very easy to introduce even in short queries; in complex ones they are much more likely, which is why implicit joins should never be used.
If you have left joins, look out for places where they get accidentally converted to inner joins by putting a WHERE condition on the left-joined table (other than a check for whether its id is null). So this structure is a problem:
FROM table1 t1
LEFT JOIN Table2 t2 ON t1.t1id = T2.t1id
WHERE t2.somefield = 'test'
and should be
FROM table1 t1
LEFT JOIN Table2 t2 ON t1.t1id = T2.t1id
AND t2.somefield = 'test'
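To see the difference concretely, here is a minimal sketch using Python's sqlite3 module. The table and column names follow the fragment above; the sample rows are made up. With the filter in WHERE, the unmatched rows of table1 disappear; with the filter in ON, they are kept with NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: only t1id 1 has a matching 'test' row.
    CREATE TABLE table1 (t1id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (t1id INTEGER, somefield TEXT);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1, 'test'), (2, 'other');
""")

# Filter in WHERE: NULL-extended rows fail the test, so the
# LEFT JOIN behaves like an INNER JOIN.
where_rows = conn.execute("""
    SELECT t1.t1id, t2.somefield
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.t1id = t2.t1id
    WHERE t2.somefield = 'test'
    ORDER BY t1.t1id
""").fetchall()

# Filter in ON: every table1 row survives; non-matches get NULL.
on_rows = conn.execute("""
    SELECT t1.t1id, t2.somefield
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.t1id = t2.t1id
                       AND t2.somefield = 'test'
    ORDER BY t1.t1id
""").fetchall()

print(where_rows)
print(on_rows)
```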
Working from the middle outward is commonplace in SQL. Converting the set-based logic of SQL into sequential logic can lead to performance issues, so try hard to avoid that, although I know it will be very tempting.
The first thing I would do is question the join syntax. Is this literally the way it is written now?
select
from tb1, tb2, tb3, tb4, tb5 ...
left join ...
That from clause should look like this
From tb1
Inner join tb2 on .....
Inner join tb3 on .....
....
Left join
http://www-03.ibm.com/software/products/en/data-studio
IBM provides an Eclipse-based analysis tool that has the capability of generating a Visual EXPLAIN graph for complex queries. It shows how indexes are used, what internal result sets are produced and combined and so on.
Example:
SELECT * FROM EMPLOYEE, DEPARTMENT WHERE WORKDEPT=DEPTNO
The solution was to simplify the query using COMMON TABLE EXPRESSIONS.
This allowed me to break the big and complex SQL query into many small and easy to understand queries.
COMMON TABLE EXPRESSIONS:
Can be used to break up complex queries, especially complex joins and sub-queries.
Are a way of encapsulating a query definition.
Persist only for the duration of the statement that defines them.
Correct use can lead to improvements in both code quality/maintainability and speed.
Can be used to reference the resulting table multiple times in the same statement (eliminating duplication in SQL).
Can be a substitute for a view when the general use of a view is not required; that is, you do not have to store the definition in metadata.
Example:
WITH cte (Column1, Column2, Column3)
AS
(
SELECT Column1, Column2, Column3
FROM SomeTable
)
SELECT * FROM cte
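As a rough illustration of the refactoring itself, the sketch below (Python's sqlite3 module, with a hypothetical orders table and made-up rows) writes the same logic once as nested derived tables and once as a chain of named common table expressions, then checks that both return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data for illustration only.
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES ('a', 10), ('a', 20), ('b', 5), ('c', 40);
""")

# Nested form: must be read from the inside out.
nested_sql = """
    SELECT customer, total
    FROM (
        SELECT customer, SUM(amount) AS total
        FROM (
            SELECT customer, amount FROM orders WHERE amount >= 10
        ) AS big_orders
        GROUP BY customer
    ) AS totals
    WHERE total > 20
    ORDER BY customer
"""

# CTE form: the same steps, named and read from top to bottom.
cte_sql = """
    WITH big_orders (customer, amount) AS (
        SELECT customer, amount FROM orders WHERE amount >= 10
    ),
    totals (customer, total) AS (
        SELECT customer, SUM(amount) FROM big_orders GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > 20
    ORDER BY customer
"""

nested_rows = conn.execute(nested_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()
print(nested_rows == cte_rows, cte_rows)
```

Each CTE can also be selected from on its own while debugging, which is what makes the individual pieces checkable.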
My new SQL looks like this:
------------------------------------------
--COMMON TABLE EXPRESSION 001--
------------------------------------------
WITH alias001 (column001, column002) AS (
SELECT column005, column006
FROM table001
WHERE condition001
GROUP by column008
)
--------------------------------------------
--COMMON TABLE EXPRESSION 002 --
--------------------------------------------
, alias002 (column009) as (
select distinct column009 from table002
)
--------------------------------------------
--COMMON TABLE EXPRESSION 003 --
--------------------------------------------
, alias003 (column1, column2, column3) as (
SELECT '1' AS column1, '1' as column2, 'name001' AS column3 FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT '1' AS column1, '1.1' as column2, 'name002' AS column3 FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT '1' AS column1, '1.2' as column2, 'name003' AS column3 FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT '2' AS column1, '2' as column2, 'name004' AS column3 FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT '2' AS column1, '2.1' as column2, 'name005' AS column3 FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT '2' AS column1, '2.2' as column2, 'name006' AS column3 FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT '3' AS column1, '3' as column2, 'name007' AS column3 FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT '3' AS column1, '3.1' as column2, 'name008' AS column3 FROM SYSIBM.SYSDUMMY1
)
--------------------------------------------
--COMMON TABLE EXPRESSION 004 --
--------------------------------------------
, alias004 (column1) as (
select distinct column1 from table003
)
------------------------------------------------------
--COMMON TABLE EXPRESSION 005 --
------------------------------------------------------
, alias005 (column1, column2) as (
select column1, column2 from alias002, alias004
)
------------------------------------------------------
--COMMON TABLE EXPRESSION 006 --
------------------------------------------------------
, alias006 (column1, column2, column3, column4) as (
SELECT column1, column2, column3, sum(column0) as column4
FROM table004
LEFT JOIN table005 ON column01 = column02
group by column1, column2, column3
)
------------------------------------------------------
--COMMON TABLE EXPRESSION 007 --
------------------------------------------------------
, alias007 (column1, column2, column3, column4) as (
SELECT column1, column2, column3, sum(column0) as column4
FROM table006
LEFT JOIN table007 ON column01 = column02
group by column1, column2, column3
)
------------------------------------------------------
--COMMON TABLE EXPRESSION 008 --
------------------------------------------------------
, alias008 (column1, column2, column3, column4) as (
select column1, column2, column3, column4 from alias007 where column5 = 123
)
----------------------------------------------------------
--COMMON TABLE EXPRESSION 009 --
----------------------------------------------------------
, alias009 (column1, column2, column3, column4) as (
select column1, column2,
CASE WHEN column3 IS NOT NULL THEN column3 ELSE 0 END as column3,
CASE WHEN column4 IS NOT NULL THEN column4 ELSE 0 END as column4
from table007
)
----------------------------------------------------------
--COMMON TABLE EXPRESSION 010 --
----------------------------------------------------------
, alias010 (column1, column2, column3) as (
select column1, sum(column4), sum(column5)
from alias009
where column6 < 2005
group by column1
)
--------------------------------------------
-- MAIN QUERY --
--------------------------------------------
select j.column1, n.column2, column3, column4, column5, column6,
column3 + column5 AS column7,
column4 + column6 AS column8
from alias010 j
left join alias006 m ON (m.column1 = j.column1)
left join alias008 n ON (n.column1 = j.column1)
EDIT: I got downvoted on this answer, possibly because people thought I was proposing this as how you should build the final query. I should clarify that this is purely to try and understand what is going on. Once you understand the subqueries and how they link together, you would then use that knowledge to make the necessary changes to the query and rebuild it in an efficient way.
I've used the technique of intermediate temp tables to troubleshoot complex queries quite a bit. They break the logic up into smaller chunks and are also useful if the original query takes a long time. You can test how to combine these intermediate tables without the overhead of rerunning the whole query. Sometimes I'll use temporary views instead of temp tables because the query optimiser can continue to use indexes on the base tables. The temporary views would then get dropped once you've finished.
I would start from the innermost subqueries and work my way to the outside.
Look for subqueries which appear several times under slightly different guises, and give each one a concise description: what is it designed to do?
Eg, replace
from (
select x1.y1, x1.y2, x1.y3 ...
from tb1, tb2, tb3, tb4, tb5 ...
left join ...
where ...
group by ...
) as a1
with
from daniel_view1 as a1
where daniel_view1 is
create view daniel_view1 as
select x1.y1, x1.y2, x1.y3 ...
from tb1, tb2, tb3, tb4, tb5 ...
left join ...
where ...
group by ...
That will already make it look cleaner. Then compare the views. Can any be merged together? You won't necessarily end up keeping the views in the final product, but they will help see the broader pattern without drowning in detail.
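A runnable sketch of the view substitution, using Python's sqlite3 module. The contents of tb1 and tb2 and the column names y1, y2 and y2_total are hypothetical stand-ins for the elided parts above, and CREATE TEMP VIEW is SQLite's syntax for a view that disappears with the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables standing in for tb1, tb2, ...
    CREATE TABLE tb1 (id INTEGER, y1 TEXT);
    CREATE TABLE tb2 (id INTEGER, y2 INTEGER);
    INSERT INTO tb1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO tb2 VALUES (1, 10), (1, 5), (2, 7);

    -- The subquery, given a name so it can be inspected on its own.
    CREATE TEMP VIEW daniel_view1 AS
        SELECT tb1.y1 AS y1, SUM(tb2.y2) AS y2_total
        FROM tb1
        INNER JOIN tb2 ON tb2.id = tb1.id
        GROUP BY tb1.y1;
""")

# Original shape: the subquery written inline as a derived table.
inline_rows = conn.execute("""
    SELECT a1.y1, a1.y2_total
    FROM (
        SELECT tb1.y1 AS y1, SUM(tb2.y2) AS y2_total
        FROM tb1
        INNER JOIN tb2 ON tb2.id = tb1.id
        GROUP BY tb1.y1
    ) AS a1
    ORDER BY a1.y1
""").fetchall()

# Refactored shape: the outer query only mentions the view.
view_rows = conn.execute("""
    SELECT a1.y1, a1.y2_total
    FROM daniel_view1 AS a1
    ORDER BY a1.y1
""").fetchall()

# Drop the helper view once the investigation is finished.
conn.execute("DROP VIEW daniel_view1")
print(inline_rows == view_rows, view_rows)
```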
Alternatively, you could insert the subquery into a temp table
insert #daniel_work1
select x1.y1, x1.y2, x1.y3 ...
from tb1, tb2, tb3, tb4, tb5 ...
left join ...
where ...
group by ...
Then replace the subquery with
select ... from #daniel_work1 as a1
The other thing you could do is to see if you can break it up into sequential steps.
If you see
select ... from ...
union all
select ... from ...
this could become
insert #steps
select 'step1', ...#1...
insert #steps
select 'step2', ...#2...
union is trickier because set union removes duplicate rows (rows where all of their columns are the same as another row).
By storing intermediate results in temp tables, you can look inside the query as it unfolded, and replay difficult steps. I have 'step_id' as the first column of all my debugging temp tables, so if it gets filled in stages, then you see what data applies to what stage.
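Here is a sketch of the step_id idea in Python's sqlite3 module. SQLite has no #temp tables, so CREATE TEMP TABLE is used instead; the sales_2023 and sales_2024 tables and their rows are hypothetical. Each branch of a UNION ALL is loaded as its own tagged step, so it can be inspected separately and then compared with the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source tables for the two UNION ALL branches.
    CREATE TABLE sales_2023 (item TEXT, qty INTEGER);
    CREATE TABLE sales_2024 (item TEXT, qty INTEGER);
    INSERT INTO sales_2023 VALUES ('pen', 3), ('ink', 1);
    INSERT INTO sales_2024 VALUES ('pen', 4);

    -- Debugging table: step_id records which branch produced each row.
    CREATE TEMP TABLE steps (step_id TEXT, item TEXT, qty INTEGER);
    INSERT INTO steps SELECT 'step1', item, qty FROM sales_2023;
    INSERT INTO steps SELECT 'step2', item, qty FROM sales_2024;
""")

# How many rows did each step contribute?
per_step = conn.execute("""
    SELECT step_id, COUNT(*) FROM steps
    GROUP BY step_id ORDER BY step_id
""").fetchall()

# The staged rows should match the original UNION ALL query.
staged = conn.execute(
    "SELECT item, qty FROM steps ORDER BY item, qty"
).fetchall()
original = conn.execute("""
    SELECT item, qty FROM sales_2023
    UNION ALL
    SELECT item, qty FROM sales_2024
    ORDER BY item, qty
""").fetchall()

print(per_step, staged == original)
```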
There are a few tricks that give a clue about what is going on. If you see a table joined to itself like this:
select ... from mytable t1 inner join mytable t2 on t2.id < t1.id
it usually means they want a cross product of the table with itself, but without duplicates and without pairing a row with itself. Keys 1 and 2 are paired only once: you will not get both 1 & 2 and 2 & 1.
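A quick sketch of that self-join pattern with Python's sqlite3 module, assuming a hypothetical mytable holding ids 1 to 3. Three rows give exactly three pairs, with no row paired with itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with three keys.
    CREATE TABLE mytable (id INTEGER PRIMARY KEY);
    INSERT INTO mytable VALUES (1), (2), (3);
""")

# The inequality keeps only one ordering of each pair
# and excludes a row being joined to itself.
pairs = conn.execute("""
    SELECT t1.id, t2.id
    FROM mytable t1
    INNER JOIN mytable t2 ON t2.id < t1.id
    ORDER BY t1.id, t2.id
""").fetchall()

print(pairs)
```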

Find records that do not have "duplicates"

I am trying to find records in a table which do not have "duplicates" based on certain criteria. I put duplicates in quotes because these records are not literal duplicates, as my example data will show.
MyTable
Column1-----Column2-----Column3
ABC---------123---------A
ABC---------123---------Z
BCD---------234---------Z
CDE---------345---------A
CDE---------345---------Z
DEF---------456---------A
DEF---------456---------Z
EFG---------567---------Z
FGH---------678---------A
Just glancing at this data, you can clearly see that the records with BCD, EFG, and FGH in Column1 do not have any additional duplicates; however, all other records look similar, except for the Column3 data.
I could write a query to find these three records, but I only care about records that have "Z" in Column3. This would result in the query only showing BCD and EFG, and not FGH.
So, I would like a query that will find records that do not have duplicates (based on Column1 and Column2) and that have "Z" in Column3.
Any help is greatly appreciated!
You can do this with aggregation and a having clause:
select column1, column2, max(column3)
from mytable
group by column1, column2
having count(*) = 1 and max(column3) = 'Z';
Different syntax, same result.
SELECT A.column1, A.column2
FROM MyTable A
LEFT JOIN MyTable B ON A.column1 = B.column1 AND A.column2 = B.column2
WHERE A.column3 = 'Z'
GROUP BY A.column1, A.column2
HAVING COUNT(*) = 1
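Both queries can be checked against the sample data from the question. The sketch below uses Python's sqlite3 module; the column types and the added ORDER BY clauses are assumptions, included only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (column1 TEXT, column2 INTEGER, column3 TEXT)"
)
# The nine rows from the question.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("ABC", 123, "A"), ("ABC", 123, "Z"),
        ("BCD", 234, "Z"),
        ("CDE", 345, "A"), ("CDE", 345, "Z"),
        ("DEF", 456, "A"), ("DEF", 456, "Z"),
        ("EFG", 567, "Z"),
        ("FGH", 678, "A"),
    ],
)

# Aggregation: groups of exactly one row whose only value is 'Z'.
by_having = conn.execute("""
    SELECT column1, column2, MAX(column3)
    FROM MyTable
    GROUP BY column1, column2
    HAVING COUNT(*) = 1 AND MAX(column3) = 'Z'
    ORDER BY column1
""").fetchall()

# Self-join: a 'Z' row that matches only itself has no "duplicate".
by_self_join = conn.execute("""
    SELECT A.column1, A.column2
    FROM MyTable A
    LEFT JOIN MyTable B
        ON A.column1 = B.column1 AND A.column2 = B.column2
    WHERE A.column3 = 'Z'
    GROUP BY A.column1, A.column2
    HAVING COUNT(*) = 1
    ORDER BY A.column1
""").fetchall()

print(by_having)
print(by_self_join)
```

Both return BCD and EFG, and FGH is excluded because its only row has "A" in Column3.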

appending 2 columns into one list in sql

I have 2 tables, and each table has 3 columns. I want to get a single column in which one column from each table is appended one after the other.
e.g. suppose one column in a table contains hai, how, are, you,
and a column in another table contains i, am, fine.
I want a query which gives hai, how, are, you, i, am, fine in just one column.
Can anybody give a query for this in SQL?
If I understand your schema correctly you have this
Table1: Column1
hai,
how,
are,
you.
Table2: Column2
i,
am,
fine.
Do This:
Insert Into Table1 (Column1)
Select Column2 From Table2
You will get this:
Table1: Column1
hai,
how,
are,
you.
i,
am,
fine.
If you have 3 Columns
Then just do this:
Insert Into Table1 (Column1, Column2, Column3) -- the (Column1, Column2, Column3) list is not necessary if those are the only columns in your Table1
Select Column1, Column2, Column3 From Table2 -- the Select Column1, Column2, Column3 could become Select * if those are the only columns of your Table2
EDIT: Do this if you don't want to modify any tables.
Select Column1, Column2, Column3
From Table1
UNION ALL
Select Column1, Column2, Column3
From Table2
Your question isn't very clear. One interpretation of it is that you want to UNION the two:
select column
from table1
union
select column
from table2;
If you really want all rows from both tables (and not the distinct values), UNION ALL will be faster than UNION.
If you want the rows in a certain order be sure to specify an ORDER BY clause.

SQL Server Count Query with no order

I have a series of select count queries tied together by UNION, e.g.
Select Count(Column1) From Table1 where Table1 column1 = 1
union
Select Count(Column2) From Table1 where Table1 column2 = 1
It works fine, but the result is just sorted in ascending or descending order. I want the rows returned in the order in which I requested them: the first query should always come first in the result, no matter what its value is. Thanks for any help.
Run two queries?
You can add a column and sort on it
Select 1 as sequence, Count(Column1) From Table1 where Table1.column1 = 1
union
Select 2 as sequence, Count(Column2) From Table1 where Table1.column2 = 1
ORDER BY sequence
Try this:
SELECT COUNT(*) AS cnt, 1 AS SortOrder FROM Table1 WHERE column1 = 1
UNION ALL
SELECT COUNT(*) AS cnt, 2 AS SortOrder FROM Table1 WHERE column2 = 1
ORDER BY SortOrder
The main change I have made is to add a column which you can use to ORDER BY. Some of the other changes I have made:
You don't mean UNION, you mean UNION ALL. Otherwise with your query if the counts were the same you'd only get one row. In the new query this wouldn't happen, but you should still use UNION ALL because that's semantically what you mean.
Writing COUNT(column1) is unnecessary because your WHERE clause guarantees that column1 can never be NULL. Use COUNT(*). I imagine that even if you write COUNT(column1) most databases will see that column1 cannot be NULL and omit the unnecessary NULL check, but again there is nothing wrong with being explicit - you want to count all rows and COUNT(*) makes that clear.
You shouldn't have Table1 column1 with a space between. There should be a dot. Or simply omit the table name as it is not required here.
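The points above can be tried out with a short sketch in Python's sqlite3 module. The rows in Table1 are made up, chosen so that the first count is larger than the second and so that two different conditions happen to produce the same count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: column1 = 1 three times, column2 = 1 once,
    -- column2 = 0 three times.
    CREATE TABLE Table1 (column1 INTEGER, column2 INTEGER);
    INSERT INTO Table1 VALUES (1, 0), (1, 0), (1, 1), (0, 0);
""")

# The SortOrder column fixes the row order regardless of the counts.
rows = conn.execute("""
    SELECT COUNT(*) AS cnt, 1 AS SortOrder FROM Table1 WHERE column1 = 1
    UNION ALL
    SELECT COUNT(*) AS cnt, 2 AS SortOrder FROM Table1 WHERE column2 = 1
    ORDER BY SortOrder
""").fetchall()

# Without a distinguishing column, UNION merges equal counts into one
# row, while UNION ALL keeps one row per query.
union_rows = conn.execute("""
    SELECT COUNT(*) FROM Table1 WHERE column1 = 1
    UNION
    SELECT COUNT(*) FROM Table1 WHERE column2 = 0
""").fetchall()
union_all_rows = conn.execute("""
    SELECT COUNT(*) FROM Table1 WHERE column1 = 1
    UNION ALL
    SELECT COUNT(*) FROM Table1 WHERE column2 = 0
""").fetchall()

print(rows, union_rows, union_all_rows)
```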

SQL Append table queries

I have two tables, say table1 with two rows of data, say row11 and row12,
and table2 with 3 rows of data, say row21, row22, row23.
Can anyone provide me with the SQL to create a query that returns
row11
row12
row21
row22
row23
Note: I don't want to create a new table, just return the data.
Use UNION ALL, based on the example data:
SELECT * FROM TABLE1
UNION ALL
SELECT * FROM TABLE2
UNION removes duplicates - if both tables each had a row whose values were "rowx, 1", the query will return one row, not two. This also makes UNION slower than UNION ALL, because UNION ALL does not remove duplicates. Know your data, and use appropriately.
select * from table1 union select * from table2
Why not use a UNION?
SELECT
Col1,Col2,Col3
FROM
TABLE1
UNION
SELECT
Col1,Col2,Col3
FROM
TABLE2
Are the columns on the two tables identical?
In MS Access you can achieve a similar result with an INSERT INTO query, although this appends the rows to TABLE1 itself rather than only returning them:
INSERT INTO TABLE1 SELECT * FROM TABLE2;