sql to combine two unrelated tables into one - sql

I have tables
table1
col1 col2
a b
c d
and table2
mycol1 mycol2
e f
g h
i j
k l
I want to combine the two tables, which have no common field into one table looking like:
table 3
col1 col2 mycol1 mycol2
a b e f
c d g h
null null i j
null null k l
ie, it is like putting the two tables side by side.
I'm stuck! Please help!

Get a row number for each row in each table, then do a full join using those row numbers:
WITH CTE1 AS
(
SELECT ROW_NUMBER() OVER(ORDER BY col1) AS ROWNUM, * FROM Table1
),
CTE2 AS
(
SELECT ROW_NUMBER() OVER (ORDER BY mycol1) AS ROWNUM, * FROM Table2
)
SELECT col1, col2, mycol1, mycol2
FROM CTE1 FULL JOIN CTE2 ON CTE1.ROWNUM = CTE2.ROWNUM
This is assuming SQL Server >= 2005.

It's really good if you put in a description of why this problem needs to be solved. I'm guessing it is just to practice sql syntax?
Anyway, since the rows don't have anything connecting them, we have to create a connection. I chose the ordering of their values. Also since they have nothing connecting them that also begs the question on why you would want to put them next to each other in the first place.
Here is the complete solution: http://sqlfiddle.com/#!6/67e4c/1
The select code looks like this:
WITH rankedt1 AS
(
SELECT col1
,col2
,row_number() OVER (order by col1,col2) AS rn1
FROM table1
)
,rankedt2 AS
(
SELECT mycol1
,mycol2
,row_number() OVER (order by mycol1,mycol2) AS rn2
FROM table2
)
SELECT
col1,col2,mycol1,mycol2
FROM rankedt1
FULL OUTER JOIN rankedt2
ON rn1=rn2

Option 1: Single Query
You have to join the two tables, and if you want each row in table1 to match to only one row in table2, you have to restrict the join somehow. Calculate row numbers in each table and join on that column. Row numbers are database-specific; here is a solution for mysql:
SELECT
t1.col1, t1.col2, t2.mycol1, t2.mycol2
FROM
(SELECT col1, col2, #t1_row := t1_row + 1 AS rownum FROM table1, (SELECT #t1_row := 0) AS r1) AS t1
LEFT JOIN
(SELECT mycol1, mycol2, #t2_row := t2_row + 1 AS rownum FROM table2, (SELECT #t2_row := 0) AS r2) AS t2
ON t1.rownum = t2.rownum;
This assumes table1 is longer than table2; if table2 is longer, either use RIGHT JOIN or switch the order of the t1 and t2 sub-selects. Also note that you can specify the order of each table separately using an ORDER BY clause in the sub-selects.
(See select increment counter in mysql)
Option 2: Post-processing
Consider making two selects, and then concatenating the results with your favorite scripting language. This is a much more reasonable approach.

Related

PostgreSQL: Error in left join

I am trying to join my master table to some sub-tables in PostgreSQL in a single select query. I am getting a syntax error and I have the feeling I am making a terrible mistake or doing something which is not allowed. The code:
Select
id,
length,
other_stuff
from my_table tbl1
Left join
(
Select
id,
height
from my_table2 tbl2) tbl2 using (id)
left join
-- I get syntax error here
(
With a as (select id from some_table),
b as (Select value from other_table)
Select id, value from a, b) tbl3 using (id)
order by tbl1.id
Can we use WITH clause in left joins sub or nested queries and Is there a better way to do this?
UPDATE1
Well, I would like to add some more details. I have three select queries like this (having unique ID) and I want to join them based on ID.
Query1:
With a as (Select id, my_other records... from postgres_table1)
b as (select id, my_records... from postgres_table2)
c as (select id, my_record.. from postgres_table3, b)
Select
id,
my_records
from a left join c on some_condtion_with_a
order by 1
Second query:
Select
id, my_records
from
(
multiple_sub_queries_by_getting_records_from_c
)
Third Query:
With d as (select id, records.. from b),
e as (select id, records.. from d),
f as (select id, records.. from e)
select
id,
records..
from f
I tried to join them using left join. The first two queries were joined successfully. While, joining third query I got the syntax error. Maybe, I am complicating things thus I asked is there a better way to do it.
You are over complicating things. There is no need to use a derived table to outer join my_table2. And there is no need for a CTE plus a derived table to join the tbl3 alias:
Select id,
length,
other_stuff
from my_table tbl1
Left join my_table2 tbl2 using (id)
left join (
select st.id, ot.value
from some_table st
cross join other_table ot
) tbl3 using (id)
order by tbl1.id;
This assumes that the cross join you create with Select id, value from a, b is intended.
Not tested, but I think you need this. try:
with a as (select id from some_table),
b as (Select value from other_table)
Select
id,
length,
other_stuff
from my_table tbl1
Left join
(
Select
id,
height
from my_table2 tbl2
)
tbl2 using (id)
left join
(
Select id, value from a, b
)
tbl3 using (id)
order by tbl1.id
I've only ever seen/used WITH in the following format:
WITH
temptablename(columns) as (query),
temptablename2(columns) as (query),
...
temptablenameX(columns) as (query)
SELECT ...
i.e. they come first
You'll probably find it easier to write queries if you use indentation to describe nesting levels. I like to make my SELECT FROM WHERE GROUPBY ORDERBY at one indent level, and then tablename INNER JOIN ON etc more indented:
SELECT
column
FROM
table
INNER JOIN
(
SELECT subcolumn FROM subtable WHERE subclause
) myalias
ON
table.id = myalias.whatever
WHERE
blah
Organising your indents every time you nest down a layer really helps. By making everything that is "a table or a block of data like a table (i.e. a subquery)" indented the same amount you can easily see the notional order that the DB should retrieve
Move your WITHs to the top of the statement, you will still use the alias names in place in the sub sub query of course
Looking at your query, there isn't much point in your subqueries.. You don't do any grouping or particularly complex processing of the data, you just select an ID and another column and then join it in. Your query will be simpler if you don't do this:
SELECT
column
FROM
table
INNER JOIN
(
SELECT subcolumn FROM subtable WHERE subclause
) myalias
ON
table.id = myalias.whatever
WHERE
blah
Instead, do this:
SELECT
column
FROM
table
INNER JOIN
subtable
ON
table.id = subtable.id
WHERE
blah
Re your updated requirements, following the same pattern.
look for --my comments
With a as (Select id, my_other records... from postgres_table1)
b as (select id, my_records... from postgres_table2)
c as (select id, my_record.. from postgres_table3, b)
d as (select id, records.. from b),
e as (select id, records.. from d),
f as (select id, records.. from e)
SELECT * FROM
(
--your first
Select
id,
my_records
from a left join c on some_condtion_with_a
) Q1
LEFT OUTER JOIN
(
--your second
Select
id, my_records
from
(
multiple_sub_queries_by_getting_records_from_c
)
) Q2
ON Q1.XXXX = Q2.XXXX --fill this in !!!!!!!!!!!!!!!!!!!
LEFT OUTER JOIN
(
--your third
select
id,
records..
from f
) Q3
ON QX.XXXXX = Q3.XXXX --fill this in !!!!!!!!!!!!!!!!!!!
It'll work, but it might not be the prettiest or most necessary SQL arrangement. As both i and HWNN have said, you can rewrite a lot of these queries where you're just doing some simple selecting in your WITH.. But likely that theyre simple enough that the database optimizer can also see this and rerwite the query for you when it runs it
Just remember to code clearly, and lay your indentation out nicely to stop it tunring into a massive, unmaintainable, undebuggable spaghetti mess

SQL Select Between Two tables

I have these two tables:
T1:
ref || Name
===========
1 || A
2 || B
3 || C
4 || D
5 || E
And
T2:
ref || Name
===========
1 || w
2 || x
6 || y
7 || z
I need this result:
Name1 || Name2
==============
A || w
B || x
C || y
D || z
E || NULL
I mean some kind of full outer join, on column ref, that will not produce NULL value until there is not any record.
The priority of join is with the values that have same ref, and if row count of tables are not equal, there are some NULL results
Use UPD: here is how you may combine values by values count from different tables:
with t1_values as (
SELECT
name,
row_number() over (order by ref) as position
FROM #t1
),
t2_values as (
SELECT
name,
row_number() over (order by ref) as position
FROM #t2
)
SELECT
t1_values.name as name1,
t2_values.name as name2
FROM t1_values
left JOIN t2_values on t1_values.position = t2_values.position
This is very complicated. You want rows that match to match. Then you want unmatched rows to match unmatched rows by position, and then everything else.
with matches as (
select distinct t1.ref
from t1
where exists (select 1 from t2 where t2.ref = t1.ref)
),
tt1 as (
select t1.*, m.ref as match_ref,
row_number() over (partition by m.ref order by t1.ref) as alt_ref
from t1 left join
matches m
on t1.ref = m.ref
),
tt2 as (
select t2.*, m.ref as match_ref,
row_number() over (partition by m.ref order by t2.ref) as alt_ref
from t2 left join
matches m
on t2.ref = m.ref
)
select tt1.name, tt2.name
from tt1 left join
tt2
on tt1.match_ref = tt2.match_ref or
(tt1.match_ref is null and tt2.match_ref is null and tt1.alt_ref = tt2.alt_ref);
Here is the idea. For each row in both tables, add two new columns:
match_ref is ref when ref exists in the other table.
alt_ref is an enumerated column for the ref values that do not match.
Once you have these columns, it is possible to join the tables together, by first checking match_ref and then -- if that is not present -- checking alt.ref.
SQL Fiddle does not appear to be working for SQL Server. However, here is an identical Postgres version that does work. Here is a working version using SQL Server (this is identical to the Postgres version).
A look at your Question
Correct me if I am wrong, but you have two tables that have an reference ID column of names that you wish to return results sets....only, you do not say distinct so you might end up with extras.
...some kind of full outer join, on column ref, that will not
produce NULL value until there is not any record..
The priority of the JOIN is with the values that have same ref, and if row count of tables are
not equal, there are some NULL results
Actually, the result is not really deterministic. Anyways, this question is still too vague. However, I think you really just want rows matching columns to appear together and anything not....well, not. But return everything.
So, if you know which side is larger, try this:
WITH C AS (SELECT DENSE_RANK() OVER (PARTITION BY ref ORDER BY NAME DESC) AS ROW_ID
, ref
, Name AS Name1
FROM T1)
SELECT Name1, B.Name2
FROM C
LEFT OUTER JOIN (SELECT DENSE_RANK() (OVER PARTITION BY ref ORDER BY NAME DESC) AS ROW_ID
, ref
, Name AS Name2) B ON B.Row_ID = C.Row_ID AND B.ref = C.ref
Each of the ref columns have a distinct ID to attach with. It is done in consecutive order, so if there is still an issue, well, you can figure that logic out. But I'm sure this will help you tremendously get where you are wanting. :)
Thank you every body and excuse me if I speak badly sometimes, count it on my fatigue because of hours working.
I think I couldn't say my meaning correctly, so I answered my own question. this is not the best answer and it is not optimized, I know, but it works!
SELECT t.ref,
t.NAME AS n1,
t2.NAME AS n2
INTO #tbl1
FROM #T1 AS t
LEFT JOIN #T2 AS t2
ON t2.ref = t.ref
WHERE t2.ref IS NOT NULL
SELECT ROW_NUMBER() OVER(ORDER BY t.NAME) AS rn,
*
INTO #tbl2
FROM #T1 AS t
WHERE t.ref NOT IN (SELECT ref
FROM #tbl1)
SELECT ROW_NUMBER() OVER(ORDER BY t.NAME) AS rn,
*
INTO #tbl3
FROM #T2 AS t
WHERE t.ref NOT IN (SELECT ref
FROM #tbl1)
SELECT n1,
n2
FROM #tbl1
UNION
SELECT t1.NAME AS n1,
t2.NAME AS n2
FROM #tbl2 t1
FULL OUTER JOIN #tbl3 t2
ON t1.rn = t2.rn

How to join several unrelated tables

I have five queries and each of them will return me single column multiple row output. I want to to write a function which will contain all of these queries.
Can anyone help?
query 1:
Select Col1 as X from Table1;
query 2:
Select Col3 as Y from Table2;
From a function I want to get a table which will have columns
X, Y
How to club these queries under single function?
Add a ROW_NUMBER() to each of the queries and join them by the row number.
Depending on number of rows returned by each of the query you'd join then by inner, left or full join.
Example below assumes that two queries return the same number of rows.
WITH
CTE1
AS
(
SELECT Col1 as X, ROW_NUMBER() OVER(ORDER BY Col1) AS rn
FROM Table1
)
,CTE2
AS
(
SELECT Col3 as Y, ROW_NUMBER() OVER(ORDER BY Col3) AS rn
FROM Table2
)
SELECT
CTE1.X, CTE2.Y
FROM
CTE1
INNER JOIN CTE2 ON CTE1.rn = CTE2.rn
Use the UNION operator:
SELECT
column_1
FROM
tbl_name_1
UNION ALL
SELECT
column_1
FROM
tbl_name_2;
If there is a relation between the two tables, try using a join.
Maybe a simple inner join would be possible here?
select Col1 as X from Table1
join
on Table1.Col1_name = Table2.col3_name

SQL query to find record with ID not in another table

I have two tables with binding primary key in database and I desire to find a disjoint set between them. For example,
Table1 has columns (ID, Name) and sample data: (1 ,John), (2, Peter), (3, Mary)
Table2 has columns (ID, Address) and sample data: (1, address2), (2, address2)
So how do I create a SQL query so I can fetch the row with ID from table1 that is not in table2. In this case, (3, Mary) should be returned?
PS: The ID is the primary key for those two tables.
Try this
SELECT ID, Name
FROM Table1
WHERE ID NOT IN (SELECT ID FROM Table2)
Use LEFT JOIN
SELECT a.*
FROM table1 a
LEFT JOIN table2 b
on a.ID = b.ID
WHERE b.id IS NULL
There are basically 3 approaches to that: not exists, not in and left join / is null.
LEFT JOIN with IS NULL
SELECT l.*
FROM t_left l
LEFT JOIN
t_right r
ON r.value = l.value
WHERE r.value IS NULL
NOT IN
SELECT l.*
FROM t_left l
WHERE l.value NOT IN
(
SELECT value
FROM t_right r
)
NOT EXISTS
SELECT l.*
FROM t_left l
WHERE NOT EXISTS
(
SELECT NULL
FROM t_right r
WHERE r.value = l.value
)
Which one is better? The answer to this question might be better to be broken down to major specific RDBMS vendors. Generally speaking, one should avoid using select ... where ... in (select...) when the magnitude of number of records in the sub-query is unknown. Some vendors might limit the size. Oracle, for example, has a limit of 1,000. Best thing to do is to try all three and show the execution plan.
Specifically form PostgreSQL, execution plan of NOT EXISTS and LEFT JOIN / IS NULL are the same. I personally prefer the NOT EXISTS option because it shows better the intent. After all the semantic is that you want to find records in A that its pk do not exist in B.
Old but still gold, specific to PostgreSQL though: https://explainextended.com/2009/09/16/not-in-vs-not-exists-vs-left-join-is-null-postgresql/
Fast Alternative
I ran some tests (on postgres 9.5) using two tables with ~2M rows each. This query below performed at least 5* better than the other queries proposed:
-- Count
SELECT count(*) FROM (
(SELECT id FROM table1) EXCEPT (SELECT id FROM table2)
) t1_not_in_t2;
-- Get full row
SELECT table1.* FROM (
(SELECT id FROM table1) EXCEPT (SELECT id FROM table2)
) t1_not_in_t2 JOIN table1 ON t1_not_in_t2.id=table1.id;
Keeping in mind the points made in #John Woo's comment/link above, this is how I typically would handle it:
SELECT t1.ID, t1.Name
FROM Table1 t1
WHERE NOT EXISTS (
SELECT TOP 1 NULL
FROM Table2 t2
WHERE t1.ID = t2.ID
)
SELECT COUNT(ID) FROM tblA a
WHERE a.ID NOT IN (SELECT b.ID FROM tblB b) --For count
SELECT ID FROM tblA a
WHERE a.ID NOT IN (SELECT b.ID FROM tblB b) --For results

How do I compare 2 rows from the same table (SQL Server)?

I need to create a background job that processes a table looking for rows matching on a particular id with different statuses. It will store the row data in a string to compare the data against a row with a matching id.
I know the syntax to get the row data, but I have never tried comparing 2 rows from the same table before. How is it done? Would I need to use variables to store the data from each? Or some other way?
(Using SQL Server 2008)
You can join a table to itself as many times as you require, it is called a self join.
An alias is assigned to each instance of the table (as in the example below) to differentiate one from another.
SELECT a.SelfJoinTableID
FROM dbo.SelfJoinTable a
INNER JOIN dbo.SelfJoinTable b
ON a.SelfJoinTableID = b.SelfJoinTableID
INNER JOIN dbo.SelfJoinTable c
ON a.SelfJoinTableID = c.SelfJoinTableID
WHERE a.Status = 'Status to filter a'
AND b.Status = 'Status to filter b'
AND c.Status = 'Status to filter c'
OK, after 2 years it's finally time to correct the syntax:
SELECT t1.value, t2.value
FROM MyTable t1
JOIN MyTable t2
ON t1.id = t2.id
WHERE t1.id = #id
AND t1.status = #status1
AND t2.status = #status2
Some people find the following alternative syntax easier to see what is going on:
select t1.value,t2.value
from MyTable t1
inner join MyTable t2 on
t1.id = t2.id
where t1.id = #id
SELECT COUNT(*) FROM (SELECT * FROM tbl WHERE id=1 UNION SELECT * FROM tbl WHERE id=2) a
If you got two rows, they different, if one - the same.
SELECT * FROM A AS b INNER JOIN A AS c ON b.a = c.a
WHERE b.a = 'some column value'
I had a situation where I needed to compare each row of a table with the next row to it, (next here is relative to my problem specification) in the example next row is specified using the order by clause inside the row_number() function.
so I wrote this:
DECLARE #T TABLE (col1 nvarchar(50));
insert into #T VALUES ('A'),('B'),('C'),('D'),('E')
select I1.col1 Instance_One_Col, I2.col1 Instance_Two_Col from (
select col1,row_number() over (order by col1) as row_num
FROM #T
) AS I1
left join (
select col1,row_number() over (order by col1) as row_num
FROM #T
) AS I2 on I1.row_num = I2.row_num - 1
after that I can compare each row to the next one as I need