Joining tables using concat column - sql

So I have two tables I want to join using SQL.
Since they did not have a common column I used
SELECT NEW_ID = CONCAT ('0',table1.ID)
Now that I have the new column with matching data in both tables, how do I join both tables? Is there any way to use the NEW_ID column as a temporary column so that I do not have to alter table 1?

In your case this is suitable. Of course, in terms of performance it is not the best solution (it compares fields of different types, and the CAST on the join column can prevent index use).
Select *
From Table1 As t1 inner join Table2 as t2
ON t1.ID = CAST(t2.ID AS INT)

Different ways, here's one:
;WITH CTE AS
(
SELECT NEW_ID = CONCAT('0', A.ID) FROM TableA AS A
)
SELECT * FROM CTE AS C INNER JOIN TableB AS B ON C.New_ID = B.ID

You can just put the expression directly in the join condition. You don't have to add a new column for that. Like this:
SELECT
*
FROM Table1 A
INNER JOIN Table2 B
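-- CAST keeps '+' a string concatenation even if Id is an integer column
ON RIGHT('00' + CAST(A.Id AS varchar(10)), 2) = RIGHT('00' + CAST(B.Id AS varchar(10)), 2)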
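Joining on an expression can be tried out quickly. This is only a sketch: it uses Python's built-in sqlite3 module instead of SQL Server, so `||` stands in for CONCAT, and the table contents and the Label/Descr columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID TEXT, Label TEXT);   -- IDs stored without the leading zero
    CREATE TABLE Table2 (ID TEXT, Descr TEXT);   -- IDs stored with the leading zero
    INSERT INTO Table1 VALUES ('1', 'one'), ('2', 'two'), ('3', 'three');
    INSERT INTO Table2 VALUES ('01', 'first'), ('02', 'second'), ('09', 'ninth');
""")

# The computed key exists only inside the ON clause; Table1 is never altered.
rows = conn.execute("""
    SELECT t1.ID, t2.ID, t1.Label, t2.Descr
    FROM Table1 AS t1
    INNER JOIN Table2 AS t2
        ON '0' || t1.ID = t2.ID
    ORDER BY t1.ID
""").fetchall()

print(rows)
```

Only IDs 1 and 2 have a zero-padded counterpart in this sample data, so only those two rows come back.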

Related

Pass values as parameter from select query

I want to pass values from output of select query to another query. Basically both queries will be part of a stored procedure. e.g.
select Id, RelId
from tables
There will be multiple rows returned by above query and I want to pass them to the following query
select name
from table2
where Id = @Id and MgId = @RelId
Please suggest
You cannot pass a multi-row result set into scalar parameters like that.
But maybe you can just join your 2 tables; that would be far more efficient.
Not knowing your table schemas, I suggest something like this. You might have to adapt it to your actual table schemas, of course.
select name
from table2 t2
inner join tables t on t2.Id = t.Id
and t2.MgId = t.RelId
EDIT
As Gordon mentioned in his answer, this approach can produce duplicate rows in your result.
If you don't want that, then here are 2 ways of getting rid of the duplicates:
select distinct name
from ...
or by grouping, adding this at the end of the statement:
group by name
Though this will work, avoiding the duplicates as in Gordon's answer is better.
I would suggest using exists:
select t2.name
from table2 t2
where exists (select 1
from tables t
where t2.Id = t.Id and t2.MgId = t.RelId
);
The difference between exists and join is that exists will not generate duplicates if there are multiple matches between the tables.
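To see the difference, here is a small sketch using Python's sqlite3 module rather than SQL Server. The table and column names follow the question; the rows are invented, with one (Id, RelId) pair deliberately repeated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tables (Id INTEGER, RelId INTEGER);
    CREATE TABLE table2 (Id INTEGER, MgId INTEGER, name TEXT);
    -- (1, 10) appears twice on purpose
    INSERT INTO tables VALUES (1, 10), (1, 10), (2, 20);
    INSERT INTO table2 VALUES (1, 10, 'alice'), (2, 20, 'bob'), (3, 30, 'carol');
""")

# INNER JOIN: one output row per matching pair, so 'alice' shows up twice.
join_names = [r[0] for r in conn.execute("""
    SELECT t2.name
    FROM table2 t2
    INNER JOIN tables t ON t2.Id = t.Id AND t2.MgId = t.RelId
    ORDER BY t2.name
""")]

# EXISTS: each table2 row is tested once, so it appears at most once.
exists_names = [r[0] for r in conn.execute("""
    SELECT t2.name
    FROM table2 t2
    WHERE EXISTS (SELECT 1 FROM tables t
                  WHERE t2.Id = t.Id AND t2.MgId = t.RelId)
    ORDER BY t2.name
""")]

print(join_names)
print(exists_names)
```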
Or...
SELECT *
INTO #Table1
FROM ...
SELECT *
INTO #Table2
FROM ...
SELECT *
FROM #Table1 T1
JOIN #Table2 T2 ON ...
DROP TABLE #Table1, #Table2

How to take distinct values in hive join

I need to take the distinct values from Table 2 while joining with Table 1 in Hive, because Table 2 has duplicate records.
Considering the join condition below, is it possible to take only the distinct key_col values from Table 2? I don't want to use select distinct * from ...
select * from Table_1 a left join Table_2 b on a.key_col = b.key_col
Note: This is in Hive
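Use LEFT SEMI JOIN. This returns each record of Table_1 that has at least one match in Table_2 exactly once, even when Table_2 contains duplicate keys. Note that, unlike your left join, it drops Table_1 rows without a match, and only Table_1 columns can be selected.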
select a.* from Table_1 a left semi join Table_2 b on a.key_col = b.key_col
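Hive is not easy to run locally, so here is a rough sketch of the same idea in Python's sqlite3 module. SQLite has no LEFT SEMI JOIN, so EXISTS stands in for it (an assumption of equivalent semantics for this simple case), and a left join against a de-duplicated subquery is shown as an alternative that keeps unmatched Table_1 rows. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_1 (key_col INTEGER);
    CREATE TABLE Table_2 (key_col INTEGER);
    INSERT INTO Table_1 VALUES (1), (2), (3);
    INSERT INTO Table_2 VALUES (1), (1), (2);   -- key 1 is duplicated
""")

# Plain left join: the duplicate key in Table_2 multiplies the Table_1 row.
plain = conn.execute("""
    SELECT a.key_col, b.key_col
    FROM Table_1 a LEFT JOIN Table_2 b ON a.key_col = b.key_col
    ORDER BY a.key_col
""").fetchall()

# Semi-join emulated with EXISTS: matched Table_1 rows only, each once.
semi = [r[0] for r in conn.execute("""
    SELECT a.key_col
    FROM Table_1 a
    WHERE EXISTS (SELECT 1 FROM Table_2 b WHERE b.key_col = a.key_col)
    ORDER BY a.key_col
""")]

# Left join against distinct keys: no multiplication, unmatched rows kept.
dedup = conn.execute("""
    SELECT a.key_col, b.key_col
    FROM Table_1 a
    LEFT JOIN (SELECT DISTINCT key_col FROM Table_2) b ON a.key_col = b.key_col
    ORDER BY a.key_col
""").fetchall()

print(plain, semi, dedup, sep="\n")
```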

T-SQL "Where not in" using two columns

I want to select all records from a table T1 where the values in columns A and B have no matching tuple for the columns C and D in table T2.
In mysql “Where not in” using two columns I can read how to accomplish that using the form select A,B from T1 where (A,B) not in (SELECT C,D from T2), but that fails in T-SQL for me resulting in "Incorrect syntax near ','.".
So how do I do this?
Use a correlated sub-query:
...
WHERE
NOT EXISTS (
SELECT * FROM SecondaryTable WHERE c = FirstTable.a AND d = FirstTable.b
)
Make sure there's a composite index on SecondaryTable over (c, d), unless that table does not contain many rows.
You can't do this in T-SQL using a WHERE IN type statement, because IN only compares a single column there.
Instead you could LEFT JOIN to the target table (T2) and keep only the rows where T2's key is NULL.
For example
SELECT
T1.*
FROM
T1 LEFT OUTER JOIN T2
ON T1.A = T2.C AND T1.B = T2.D
WHERE
T2.PrimaryKey IS NULL
will only return rows from T1 that don't have a corresponding row in T2.
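Both anti-join forms (NOT EXISTS and LEFT JOIN ... IS NULL) can be compared side by side. A minimal sketch with Python's sqlite3 module in place of SQL Server; the T1/T2 rows are invented, and PrimaryKey is simply assumed to be a non-nullable key column of T2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A INTEGER, B TEXT);
    CREATE TABLE T2 (PrimaryKey INTEGER PRIMARY KEY, C INTEGER, D TEXT);
    INSERT INTO T1 VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO T2 VALUES (1, 1, 'x'), (2, 3, 'q');   -- only (1, 'x') matches T1
""")

# Correlated NOT EXISTS: keep T1 rows with no (C, D) pair equal to (A, B).
not_exists = conn.execute("""
    SELECT A, B FROM T1
    WHERE NOT EXISTS (SELECT 1 FROM T2 WHERE T2.C = T1.A AND T2.D = T1.B)
    ORDER BY A
""").fetchall()

# LEFT JOIN + IS NULL: unmatched T1 rows carry NULLs in every T2 column.
left_join = conn.execute("""
    SELECT T1.A, T1.B
    FROM T1 LEFT OUTER JOIN T2 ON T1.A = T2.C AND T1.B = T2.D
    WHERE T2.PrimaryKey IS NULL
    ORDER BY T1.A
""").fetchall()

print(not_exists)
print(left_join)
```

The IS NULL test should be on a column that can never be NULL in a real T2 row, otherwise genuine matches could be mistaken for misses.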
I used this in MySQL because MySQL has no "EXCLUDE" statement.
This code:
Concatenates fields C and D of table T2 into one new field to make it easier to compare the columns.
Concatenates fields A and B of table T1 into one new field in the same way.
Selects all records where the new field of T1 does not match any value of the new field of T2.
SQL-Statement:
SELECT T1.* FROM T1
WHERE CONCAT(T1.A,'Seperator', T1.B) NOT IN
(SELECT CONCAT(T2.C,'Seperator', T2.D) FROM T2)
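The separator in the concatenation matters more than it may look. A small sketch in Python's sqlite3 module (so `||` replaces CONCAT; the data and the '|' separator are made up) showing that without a separator two different pairs can collapse into the same string:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A TEXT, B TEXT);
    CREATE TABLE T2 (C TEXT, D TEXT);
    INSERT INTO T1 VALUES ('ab', 'c');
    INSERT INTO T2 VALUES ('a', 'bc');   -- a different pair, same letters
""")

# No separator: 'ab'+'c' and 'a'+'bc' both become 'abc', so the T1 row
# is wrongly treated as present in T2.
no_sep = conn.execute("""
    SELECT A, B FROM T1
    WHERE A || B NOT IN (SELECT C || D FROM T2)
""").fetchall()

# With a separator that never occurs in the data the pairs stay distinct.
with_sep = conn.execute("""
    SELECT A, B FROM T1
    WHERE A || '|' || B NOT IN (SELECT C || '|' || D FROM T2)
""").fetchall()

print(no_sep)
print(with_sep)
```

Even with a separator this only holds if the separator cannot appear inside the values themselves, which is one reason the NOT EXISTS form is usually the safer choice.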
Here is an example of the answer that worked for me:
SELECT Count(1)
FROM LCSource as s
JOIN FileTransaction as t
ON s.TrackingNumber = t.TrackingNumber
WHERE NOT EXISTS (
SELECT * FROM LCSourceFileTransaction
WHERE [LCSourceID] = s.[LCSourceID] AND [FileTransactionID] = t.[FileTransactionID]
)
You see both columns exist in LCSourceFileTransaction, but one occurs in LCSource and one occurs in FileTransaction and LCSourceFileTransaction is a mapping table. I want to find all records where the combination of the two columns is not in the mapping table. This works great. Hope this helps someone.

How can I reference a single table multiple times in the same query?

Sometimes I need to treat the same table as two separate tables. What is the solution?
You can reference the same table more than once; just be sure to give each reference its own table alias:
select a.EmployeeName,b.EmployeeName as Manager
from Employees A
join Employees B on a.Mgr_id=B.Id
Use an alias like a variable name in your SQL:
select
A.Id,
A.Name,
B.Id as SpouseId,
B.Name as SpouseName
from
People A
join People B on A.Spouse = B.id
Use an alias:
SELECT t1.col1, t2.col3
FROM tbl t1
INNER JOIN tbl t2
ON t1.col1 = t2.col2
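Here is a runnable sketch of the employee/manager self-join from the first answer, using Python's sqlite3 module. The Employees rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Id INTEGER PRIMARY KEY, EmployeeName TEXT, Mgr_id INTEGER);
    INSERT INTO Employees VALUES
        (1, 'Ann', NULL),   -- top of the hierarchy, no manager
        (2, 'Bob', 1),
        (3, 'Cid', 1),
        (4, 'Dee', 2);
""")

# Alias a plays the role of "employee", alias b the role of "manager".
pairs = conn.execute("""
    SELECT a.EmployeeName, b.EmployeeName AS Manager
    FROM Employees a
    JOIN Employees b ON a.Mgr_id = b.Id
    ORDER BY a.Id
""").fetchall()

print(pairs)
```

Ann does not appear as an employee because her Mgr_id is NULL and an inner join drops her; a LEFT JOIN would keep her with a NULL manager.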
An alias is the most obvious solution:
SELECT * FROM x AS x1, x AS x2
However, if the table is the result of a query, a common table expression is quite useful:
;WITH ctx AS
( select * from z)
SELECT c1.* FROM ctx AS c1, ctx AS c2
A third solution, suitable when the query takes a long time to run, is temporary tables:
SELECT *
INTO #monkey
FROM chimpanzee
SELECT * FROM #monkey m1,#monkey m2
DROP TABLE #MONKEY
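Note that a common table expression is only available to one query (the statement directly after it), whereas a temporary table outlives that statement and lasts until it is dropped or the session (or the stored procedure that created it) ends.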
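A common table expression referenced twice in the same statement could look like the sketch below. It runs on Python's sqlite3 module, and the table z with its single column n is a made-up stand-in for "the result of a query".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE z (n INTEGER);
    INSERT INTO z VALUES (1), (2), (3), (4);
""")

# ctx is defined once and then joined to itself under two aliases.
pairs = conn.execute("""
    WITH ctx AS (SELECT n FROM z WHERE n > 1)
    SELECT c1.n, c2.n
    FROM ctx AS c1
    JOIN ctx AS c2 ON c1.n < c2.n
    ORDER BY c1.n, c2.n
""").fetchall()

print(pairs)
```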

An issue possibly related to Cursor/Join

Here is my situation:
Table one contains a set of data that uses an id as a unique identifier. This table has a one-to-many relationship with about 6 other tables, such that:
Given Table 1 with Id of 001:
Table 2 might have 3 rows with foreign key: 001
Table 3 might have 12 rows with foreign key: 001
Table 4 might have 0 rows with foreign key: 001
Table 5 might have 28 rows with foreign key: 001
I need to write a report that lists all of the rows from Table 1 for a specified time frame followed by all of the data contained in the handful of tables that reference it.
My current approach in pseudo code would look like this:
select * from table 1
foreach(result) {
print result;
select * from table 2 where id = result.id;
foreach(result2) {
print result2;
}
select * from table 3 where id = result.id
foreach(result3) {
print result3;
}
//continued for each table
}
This means that a single report can run in the neighborhood of 1000 queries. I know this is excessive; however, my sql-fu is a little weak and I could use some help.
LEFT OUTER JOIN Tables2-N on Table1
SELECT Table1.*, Table2.*, Table3.*, Table4.*, Table5.*
FROM Table1
LEFT OUTER JOIN Table2 ON Table1.ID = Table2.ID
LEFT OUTER JOIN Table3 ON Table1.ID = Table3.ID
LEFT OUTER JOIN Table4 ON Table1.ID = Table4.ID
LEFT OUTER JOIN Table5 ON Table1.ID = Table5.ID
WHERE (CRITERIA)
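One thing to watch with this approach: when several child tables each have many rows per parent, the joins multiply them together. A quick sketch in Python's sqlite3 module, with invented rows and a hypothetical val column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER PRIMARY KEY);
    CREATE TABLE Table2 (ID INTEGER, val TEXT);
    CREATE TABLE Table3 (ID INTEGER, val TEXT);
    CREATE TABLE Table4 (ID INTEGER, val TEXT);   -- left empty on purpose
    INSERT INTO Table1 VALUES (1);
    INSERT INTO Table2 VALUES (1, 'a'), (1, 'b'), (1, 'c');
    INSERT INTO Table3 VALUES (1, 'x'), (1, 'y');
""")

# 3 rows in Table2 x 2 rows in Table3 x 1 NULL-padded row for Table4.
row_count = conn.execute("""
    SELECT COUNT(*)
    FROM Table1
    LEFT OUTER JOIN Table2 ON Table1.ID = Table2.ID
    LEFT OUTER JOIN Table3 ON Table1.ID = Table3.ID
    LEFT OUTER JOIN Table4 ON Table1.ID = Table4.ID
""").fetchone()[0]

print(row_count)
```

With the counts from the question (3, 12 and 28 child rows for a single id) the same effect would give 3 * 12 * 28 = 1008 rows for that one parent, so the client has to untangle the combinations afterwards.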
Join doesn't do it for me. I hate having to de-tangle the data on the client side. All those nulls from left-joining.
Here's a set-based solution that doesn't use Joins.
INSERT INTO #LocalCollection (theKey)
SELECT id
FROM Table1
WHERE ...
SELECT * FROM Table1 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table2 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table3 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table4 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table5 WHERE id in (SELECT theKey FROM #LocalCollection)
Ah! Procedural! My SQL would look like this, if you needed to order the results from the other tables after the results from the first table.
Insert Into #rows Select id from Table1 where date between '12/30' and '12/31'
Select * from Table1 t join #rows r on t.id = r.id
Select * from Table2 t join #rows r on t.id = r.id
--etc
If you wanted to group the results by the initial ID, use a Left Outer Join, as mentioned previously.
You may be best off using a reporting tool like Crystal or Jasper, or even XSL-FO if you are feeling bold. They have things built in to handle specifically this. This is not something that would work well in raw SQL.
If the format of all of the rows (the headers as well as all of the details) is the same, it would also be pretty easy to do it as a stored procedure.
What I would do: Do it as a join, so you will have the header data on every row, then use a reporting tool to do the grouping.
SELECT * FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.resultid -- this could be a left join if the table is not guaranteed to have entries for t1.id
INNER JOIN table3 t3 ON t1.id = t3.resultid -- etc
Or, if the data is all in the same format, you could do:
SELECT cola,colb FROM table1 WHERE id = @id
UNION ALL
SELECT cola,colb FROM table2 WHERE resultid = @id
UNION ALL
SELECT cola,colb FROM table3 WHERE resultid = @id
It really depends on the format you require the data in for output to the report.
If you can give a sample of how you would like the output I could probably help more.
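As a rough illustration of the UNION ALL variant above, here is a sketch in Python's sqlite3 module. The sample rows are invented, `:id` is SQLite's named-parameter syntax standing in for the parameter in the queries above, and the extra src column is an addition (not in the original answer) so each row says which table it came from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, cola TEXT, colb TEXT);
    CREATE TABLE table2 (resultid INTEGER, cola TEXT, colb TEXT);
    CREATE TABLE table3 (resultid INTEGER, cola TEXT, colb TEXT);
    INSERT INTO table1 VALUES (1, 'head', 'h1'), (2, 'head', 'h2');
    INSERT INTO table2 VALUES (1, 'd2a', 'x'), (1, 'd2b', 'y'), (2, 'd2c', 'z');
    INSERT INTO table3 VALUES (1, 'd3a', 'w');
""")

# One round trip returns the header row and all detail rows for one id.
rows = conn.execute("""
    SELECT 'table1' AS src, cola, colb FROM table1 WHERE id = :id
    UNION ALL
    SELECT 'table2', cola, colb FROM table2 WHERE resultid = :id
    UNION ALL
    SELECT 'table3', cola, colb FROM table3 WHERE resultid = :id
    ORDER BY src, cola
""", {"id": 1}).fetchall()

for row in rows:
    print(row)
```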
Join all of the tables together.
select * from table_1 left join table_2 using(id) left join table_3 using(id);
Then, you'll want to roll up the columns in code to format your report as you see fit.
What I would do is open up cursors on the following queries:
SELECT * from table1 order by id
SELECT * from table1 r, table2 t where t.table1_id = r.id order by r.id
SELECT * from table1 r, table3 t where t.table1_id = r.id order by r.id
And then I would walk those cursors in parallel, printing your results. You can do this because all appear in the same order. (Note that I would suggest that while the primary ID for table1 might be named id, it won't have that name in the other tables.)
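Walking the cursors in parallel might look roughly like this. It is a sketch in Python with the sqlite3 module; the ChildReader helper, the table contents and the title/note columns are all hypothetical, and the child queries skip the join to table1 for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE table2 (table1_id INTEGER, note TEXT);
    CREATE TABLE table3 (table1_id INTEGER, note TEXT);
    INSERT INTO table1 VALUES (1, 'first'), (2, 'second');
    INSERT INTO table2 VALUES (1, 't2-a'), (1, 't2-b'), (2, 't2-c');
    INSERT INTO table3 VALUES (2, 't3-a');
""")

class ChildReader:
    """Wraps a cursor sorted by parent id and hands out rows one parent at a time."""

    def __init__(self, cursor):
        self._rows = iter(cursor)
        self._head = next(self._rows, None)   # one-row lookahead

    def take(self, key):
        # Skip child rows whose parent was not part of the report.
        while self._head is not None and self._head[0] < key:
            self._head = next(self._rows, None)
        matched = []
        while self._head is not None and self._head[0] == key:
            matched.append(self._head)
            self._head = next(self._rows, None)
        return matched

# One query per table, all ordered by the parent id.
parents = conn.execute("SELECT id, title FROM table1 ORDER BY id")
children = [
    ChildReader(conn.execute("SELECT table1_id, note FROM table2 ORDER BY table1_id, note")),
    ChildReader(conn.execute("SELECT table1_id, note FROM table3 ORDER BY table1_id, note")),
]

report = []
for parent in parents:
    report.append(parent)
    for child in children:
        report.extend(child.take(parent[0]))

for line in report:
    print(line)
```

This issues one query per table no matter how many parent rows there are, instead of one query per parent row per table.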
Do all the tables have the same format? If not, then you have to have a report that can display the n different types of rows. If you are only interested in the same columns then it is easier.
Most databases have some form of dynamic SQL. In that case you can do the following:
create temporary table from
select * from table1 where rows within time frame
x integer
sql varchar(something)
x = 1
while x <= numresults {
sql = 'SELECT * from table' + CAST(X as varchar) + ' where id in (select id from temporary table)'
execute sql
x = x + 1
}
But I mean basically here you are running one query on your main table to get the rows that you need, then running one query for each sub table to get rows that match your main table.
If the report requires the same 2 or 3 columns for each table you could change the select * from tablex to be an insert into and get a single result set at the end...
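The loop above could be sketched like this in Python with the sqlite3 module, building the statements in client code instead of in the database's own dynamic SQL. The table names, the created/val columns, the dates and the temporary table name wanted are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, created TEXT);
    CREATE TABLE table2 (id INTEGER, val TEXT);
    CREATE TABLE table3 (id INTEGER, val TEXT);
    INSERT INTO table1 VALUES (1, '2008-12-30'), (2, '2008-12-31'), (3, '2009-01-15');
    INSERT INTO table2 VALUES (1, 'a'), (3, 'b');
    INSERT INTO table3 VALUES (2, 'c'), (2, 'd');

    -- Step 1: collect the ids of the main rows inside the time frame.
    CREATE TEMP TABLE wanted AS
        SELECT id FROM table1 WHERE created BETWEEN '2008-12-30' AND '2008-12-31';
""")

# Step 2: one query per table, filtered by the collected ids.
results = {}
for x in range(1, 4):
    table = f"table{x}"   # built from a fixed range, never from user input
    sql = f"SELECT * FROM {table} WHERE id IN (SELECT id FROM wanted) ORDER BY 1, 2"
    results[table] = conn.execute(sql).fetchall()

for name, rows in results.items():
    print(name, rows)
```

Table names cannot be bound as query parameters, which is why the statement text is assembled as a string; that is only safe here because the names come from a fixed range.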