Join based on temp column - sql

I have two tables that need to be joined, but the only similar column has excess data that needs to be stripped. I would just modify the tables, but I only have read access to them. So, I strip the unneeded text out of the table and add a temp column, but I cannot join to it. I get the error:
Invalid column name 'TempJoin'
SELECT
CASE WHEN CHARINDEX('- ExtraText',a.Column1)>0 THEN LEFT(a.Column1, (CHARINDEX('- ExtraText', a.Column1))-1)
WHEN CHARINDEX('- ExtraText',a.Column1)=0 THEN a.Column1
END AS TempJoin
,a.Column1
,b.Column2
FROM Table1 as a
LEFT JOIN Table2 as b WITH(NOLOCK) ON b.Column2=TempJoin

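The easiest way is to wrap this in a CTE: a column alias defined in the SELECT list is not visible to the ON clause of the same query, but it becomes an ordinary column once the CTE has been defined. Also, be careful with NOLOCK unless you have an explicit reason to use it, since it permits dirty reads.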
WITH cte AS (
SELECT
CASE WHEN CHARINDEX('- ExtraText',a.Column1) > 0
THEN LEFT(a.Column1, (CHARINDEX('- ExtraText', a.Column1))-1)
WHEN CHARINDEX('- ExtraText',a.Column1) = 0
THEN a.Column1
END AS TempJoin,
a.Column1
FROM Table1 AS a
)
SELECT *
FROM cte
LEFT JOIN Table2 AS b WITH(NOLOCK) ON b.Column2 = TempJoin;
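To see the CTE approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration, and because SQLite has no CHARINDEX or LEFT, the sketch swaps in instr and substr. It also strips ' - ExtraText' with the leading space, since SQLite (unlike SQL Server) does not ignore trailing spaces when comparing strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Column1 TEXT);
    CREATE TABLE Table2 (Column2 TEXT);
    -- Invented sample rows, not from the original question.
    INSERT INTO Table1 VALUES ('Alpha - ExtraText'), ('Beta'), ('Gamma - ExtraText');
    INSERT INTO Table2 VALUES ('Alpha'), ('Beta');
""")

query = """
WITH cte AS (
    SELECT
        -- instr/substr stand in for SQL Server's CHARINDEX/LEFT.
        CASE WHEN instr(a.Column1, ' - ExtraText') > 0
             THEN substr(a.Column1, 1, instr(a.Column1, ' - ExtraText') - 1)
             ELSE a.Column1
        END AS TempJoin,
        a.Column1
    FROM Table1 AS a
)
SELECT cte.TempJoin, cte.Column1, b.Column2
FROM cte
LEFT JOIN Table2 AS b ON b.Column2 = cte.TempJoin
ORDER BY cte.Column1;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The 'Gamma' row has no partner in Table2, so the LEFT JOIN keeps it with a NULL in the last column.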

Related

How to use an alias in a sql join

I create a computed column with a CASE WHEN expression like this:
case when (a.exit_date='0001-01-01' and z.fermeture<>'0001-01-01') then z.fermeture
else a.exit_date
end as final_exit_date,
After that, I have a SQL join like this:
select a.*,b.*
from table1 as a
left join table2 as b on (a.id=b.id and b.start <= a.exit_date and a.exit_date < b.end)
where a.id=28445
This works! But I don't want to use the column a.exit_date.
I want to replace it with the column I created (final_exit_date), like this:
select a.*,b.*
from table1 as a
left join table2 as b on (a.id = b.id and b.start <= final_exit_date and final_exit_date < b.end)
where a.id=28445
Thanks in advance for reading!
When you create an expression with an alias in a SELECT list, the only other clause of that query where you are generally allowed to use the alias is ORDER BY. This frustrates many people because the alias would often be useful in a WHERE clause, a JOIN condition, or elsewhere, but that is not possible.
The solution is that you have to duplicate the expression instead of using the alias.
SELECT a.*,b.*
FROM table1 AS a
-- you will need the z table as well
LEFT JOIN table2 AS b ON (a.id=b.id and b.start <=
CASE WHEN (a.exit_date='0001-01-01' AND z.fermeture<>'0001-01-01') THEN z.fermeture ELSE a.exit_date END
AND
CASE WHEN (a.exit_date='0001-01-01' AND z.fermeture<>'0001-01-01') THEN z.fermeture ELSE a.exit_date END < b.end)
WHERE a.id=28445
Another option is to use a common table expression (CTE) or a subquery so that the alias is available. Using a CTE, it would look something like this:
;WITH records AS (
SELECT a.*, CASE WHEN (a.exit_date='0001-01-01' AND z.fermeture<>'0001-01-01') THEN z.fermeture ELSE a.exit_date END AS final_exit_date
FROM table1 AS a
LEFT OUTER JOIN otherTable AS z ON a.id = z.id -- whatever the condition is
)
SELECT r.*, b.*
FROM records AS r
LEFT JOIN table2 AS b ON r.id = b.id AND b.start<= r.final_exit_date AND r.final_exit_date < b.end
There are likely some issues with the query above since you didn't include the table names or the columns (or how they are even related), but you should be able to create a working solution by adapting one of these two approaches.
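As a rough, runnable illustration of the CTE approach, the sketch below uses Python's built-in sqlite3 module. Everything about the schema is an assumption: otherTable and its join on id are placeholders for the unnamed z table, the sample rows are invented, and table2's columns are renamed to period_start and period_end because END is a reserved word in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows; the question does not give the real ones.
    CREATE TABLE table1 (id INTEGER, exit_date TEXT);
    CREATE TABLE otherTable (id INTEGER, fermeture TEXT);
    CREATE TABLE table2 (id INTEGER, period_start TEXT, period_end TEXT, label TEXT);
    INSERT INTO table1 VALUES (28445, '0001-01-01');
    INSERT INTO otherTable VALUES (28445, '2020-06-15');
    INSERT INTO table2 VALUES
        (28445, '2020-01-01', '2020-07-01', 'first half'),
        (28445, '2020-07-01', '2021-01-01', 'second half');
""")

query = """
WITH records AS (
    SELECT a.id,
           CASE WHEN a.exit_date = '0001-01-01' AND z.fermeture <> '0001-01-01'
                THEN z.fermeture
                ELSE a.exit_date
           END AS final_exit_date
    FROM table1 AS a
    LEFT JOIN otherTable AS z ON a.id = z.id
)
-- final_exit_date is a real column of the CTE, so the join can use it.
SELECT r.id, r.final_exit_date, b.label
FROM records AS r
LEFT JOIN table2 AS b
       ON r.id = b.id
      AND b.period_start <= r.final_exit_date
      AND r.final_exit_date < b.period_end
WHERE r.id = 28445;
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The placeholder exit date is replaced by the fermeture value inside the CTE, and only the table2 period containing that date is joined.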

sql - ignore duplicates while joining

I have two tables.
Table1 is 1591 rows. Table2 is 270 rows.
I want to fetch a specific column from Table2 based on a condition between the two tables, while ignoring the duplicates in Table2. In other words, I want to join the tables but get only one value from Table2 even if the condition matches more than once. The result should be exactly 1591 rows.
I tried left, right, and inner joins, but the result has either more or fewer than 1591 rows.
Example
Table1
type,address,name
40,blabla,Adam
20,blablabla,Joe
Table2
type,currency
40,usd
40,gbp
40,omr
Joining on 'type'
Result
type,address,name,currency
40,blabla,Adam,usd
20,blablabla,Joe,null
Try this. A left join is used so that rows in Table1 with no match in Table2 are kept:
select *
from
Table1 h
left join
(select type,currency,ROW_NUMBER()over (partition by type order by
currency) as rn
from
Table2
) sr on
sr.type=h.type
and rn=1
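Try this. Apart from the bracketed identifiers, which are SQL Server syntax, it is standard SQL, so it should work on most RDBMSs. The subquery is correlated on type so that only the maximum currency of the matching type is joined:
select * from Table1 AS t
LEFT OUTER JOIN Table2 AS y ON t.[type] = y.[type] and y.currency = (SELECT MAX(m.currency) FROM Table2 AS m WHERE m.[type] = t.[type])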
If you want to control which currency is joined, consider adding an active/inactive flag column to Table2 and modifying the JOIN clause accordingly.
You can use outer apply if it's supported.
select a.type, a.address, a.name, b.currency
from Table1 a
outer apply (
select top 1 currency
from Table2
where Table2.type = a.type
) b
A typical way to do this uses a correlated subquery. This guarantees that all rows in the first table are kept. A scalar subquery generates an error if it returns more than one row, so the one below is limited to a single row.
So:
select t1.*,
(select t2.currency
from table2 t2
where t2.type = t1.type
fetch first 1 row only
) as currency
from table1 t1;
You don't specify what database you are using, so this uses standard syntax for returning one row. Some databases use limit or top instead.
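For a quick check against the example data from the question, here is a minimal sketch using Python's built-in sqlite3 module. It combines ROW_NUMBER with a LEFT JOIN so that every Table1 row survives exactly once. It assumes a SQLite build with window-function support (3.25 or later), and which currency is kept is decided by the ORDER BY inside the window, chosen here arbitrarily as alphabetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (type INTEGER, address TEXT, name TEXT);
    CREATE TABLE Table2 (type INTEGER, currency TEXT);
    INSERT INTO Table1 VALUES (40, 'blabla', 'Adam'), (20, 'blablabla', 'Joe');
    INSERT INTO Table2 VALUES (40, 'usd'), (40, 'gbp'), (40, 'omr');
""")

query = """
SELECT h.type, h.address, h.name, sr.currency
FROM Table1 AS h
LEFT JOIN (
    -- Number the currencies within each type; rn = 1 keeps one per type.
    SELECT type, currency,
           ROW_NUMBER() OVER (PARTITION BY type ORDER BY currency) AS rn
    FROM Table2
) AS sr
  ON sr.type = h.type AND sr.rn = 1
ORDER BY h.type DESC;
"""
rows = conn.execute(query).fetchall()
table1_count = conn.execute("SELECT COUNT(*) FROM Table1").fetchone()[0]
print(rows, table1_count)
```

Keeping the rn = 1 condition in the ON clause rather than in WHERE matters: in WHERE it would discard the unmatched Table1 rows, whose rn is NULL.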

Combining SQL Queries to pass result from 1 as a parameter of the 2nd (SQL Server)

I am a little out of practice with SQL and I am trying to verify some data that has been converted in a system. Some of the queries I originally developed prior to the conversion are not proving out the work. I have been able to trace the source data back and verify that conversion was correct, but this is on an account by account basis. I would like to have a query to show the full dataset.
I have been able to work a solution down to 2 queries, but I cannot figure out how to combine them into one piece to show the full data set, where one value from the first query needs to be an element in the second query.
Query 1
select distinct
CreatedDate, AccountNum
From
Table1 A
Join
Table2 B on A.Column1 = B.Column1 and a.Column2 = b.Column2
Join
Table3 C on A.Column3 = C.Column3 and A.Column4 = C.Column4
where
Condition A and Condition B
Query 2
Select distinct
AccountNum, Responsible
From
Table3 D
Join
Table4 E on D.Column1 = E.Column2
where
StartDate <= 'DateValue' and EndDate > 'DateValue'
I would like to use the CreatedDate value from query 1 as the DateValue in query 2, but I have not found a solution to give the results I am looking for.
If I add a qualifier to each query, like account number, I end up with 1 result from query 1. I then put that CreatedDate into query 2 and I get the results I want. If I only have the account number on the 2nd query, I get two results: one from time period A to B with a Responsible value of X, and a 2nd from time period C to D with a Responsible value of Y, which is the period the CreatedDate value falls in. Everything I have tried to combine these queries ends up with either a Responsible value of X or no results, when I want that Y value.
I have not been able to successfully integrate the two queries, so that I can have that CreatedDate value passed as a parameter to figure out the Responsible value.
A solution that would work would be to create an intermediate table for the results of the 1st query and then join that table to 2nd query. However, I do not have access to create/insert/update tables/records on the database, so I cannot use this method.
I think you are looking for this
SELECT DISTINCT accountnum,
responsible
FROM table1 A
JOIN table2 B
ON A.column1 = B.column1
AND a.column2 = b.column2
JOIN table3 C
ON A.column3 = C.column3
AND A.column4 = C.column4
JOIN table4 D
ON C.column1 = D.column2
AND startdate <= createddate
AND enddate > createddate
where Condition A and Condition B
Note: You may have to qualify the columns with the proper alias names.
Select distinct AccountNum, Responsible
From Table3 D
Join Table4 E on D.Column1 = E.Column2
Join (
select distinct CreatedDate, AccountNum
From Table1 A
Join Table2 B on A.Column1 = B.Column1 and a.Column2 = b.Column2
Join Table3 C on A.Column3 = C.Column3 and A.Column4 = C.Column4
where Condition A and Condition B
) X
on D.AccountNum=X.AccountNum
and D.StartDate <= X.CreatedDate and EndDate > X.CreatedDate
Another solution is to make the first query into a table-valued UDF:
Create function GetCreateDateAndAcctId([Parameters for 2 conditions here])
Returns table As
Return
select distinct CreatedDate, AccountNum
From Table1 a
Join Table2 b
on b.Column1 = a.Column1
and b.Column2 = a.Column2
Join Table3 c
on c.Column3 = a.Column3
and c.Column4 = a.Column4
where condition1 -- here put predicate
and condition2 -- using input parameters
Then, to use it, just include it as a table in your second query like this:
Select distinct AccountNum, Responsible
From Table3 d
Join Table4 e
on e.Column2 = d.Column1
outer apply dbo.GetCreateDateAndAcctId(Parameters) cd
where StartDate <= cd.CreatedDate and EndDate > cd.CreatedDate
If you do this, the logic for the first query remains in a separate database object, which gives reusability (you can use it in any other process without copying it) and better maintainability (there is only one place to fix bugs and make enhancements). Also, since it's an inline table-valued UDF, the SQL Server query processor will actually combine it with the second query's SQL into a single reusable compiled execution plan.
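If creating objects in the database is not an option, the derived-table join needs only read access. The sketch below shows the idea with Python's built-in sqlite3 module. The two tables are hypothetical stand-ins: created plays the role of the result of Query 1, and assignment plays the role of the joined tables behind Query 2, with invented rows matching the X/Y scenario described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the result of Query 1.
    CREATE TABLE created (AccountNum INTEGER, CreatedDate TEXT);
    -- Hypothetical stand-in for the tables joined in Query 2.
    CREATE TABLE assignment (AccountNum INTEGER, Responsible TEXT,
                             StartDate TEXT, EndDate TEXT);
    INSERT INTO created VALUES (1, '2021-08-10');
    INSERT INTO assignment VALUES
        (1, 'X', '2021-01-01', '2021-06-01'),
        (1, 'Y', '2021-06-01', '2022-01-01');
""")

query = """
SELECT DISTINCT d.AccountNum, d.Responsible
FROM assignment AS d
JOIN (
    -- Query 1 used as a derived table; its CreatedDate feeds the range test.
    SELECT DISTINCT CreatedDate, AccountNum
    FROM created
) AS x
  ON d.AccountNum = x.AccountNum
 AND d.StartDate <= x.CreatedDate
 AND d.EndDate > x.CreatedDate;
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because the date range is tested per joined row, each account picks up the Responsible value whose period contains its own CreatedDate, with no intermediate table.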

Optimising SQL query

I have a SQL query that performs an INNER JOIN on two tables having >50M rows each. I wish to reduce the time it takes to search through the join by reducing the rows that are joined based on a column present on one of the tables.
Say I have table1 with columns A,B,C and table2 with columns A,D,E. I wish to join based on column A but only those rows that have value 'e' for column E of table 2.
My SQL query :
SELECT one.B, two.D
FROM table1 one
INNER JOIN table2 two WHERE two.E IN ('e')
ON one.A = two.A
WHERE one.B > 10
AND two.D IN ('...')
It gives the error :
ORA-00905: missing keyword
Where am I going wrong? How do I achieve the intended result?
SELECT one.B, two.D
FROM table1 one
INNER JOIN table2 two -- WHERE two.E IN ('e') --> shouldn't use where here
ON one.A = two.A and two.E = 'e'
WHERE one.B > 10
AND two.D IN ('...')
Comments included in the code.
As vkp pointed out, the WHERE is improperly used. Instead you could also make a subquery to include that where statement. So that:
INNER JOIN table2 two WHERE two.E IN ('e')
becomes
INNER JOIN (select * from table2 WHERE E IN ('e')) two
You could also put the condition in the Where clause
SELECT a.B, b.D
FROM table1 a
JOIN table2 b
ON b.A = a.A
WHERE a.B > 10
And b.E = 'e'
AND b.D In ('...')
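For an INNER JOIN, putting the filter in the ON clause or in the WHERE clause gives the same rows. A small sketch with Python's built-in sqlite3 module can confirm this on toy data. The rows are invented, and the elided two.D IN ('...') condition from the question is left out because its values are unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (A INTEGER, B INTEGER, C TEXT);
    CREATE TABLE table2 (A INTEGER, D TEXT, E TEXT);
    -- Invented sample rows.
    INSERT INTO table1 VALUES (1, 20, 'c1'), (2, 5, 'c2'), (3, 30, 'c3');
    INSERT INTO table2 VALUES (1, 'd1', 'e'), (2, 'd2', 'e'), (3, 'd3', 'f');
""")

# Variant 1: the filter on E is part of the join condition.
filter_in_on = """
SELECT one.B, two.D
FROM table1 AS one
INNER JOIN table2 AS two ON one.A = two.A AND two.E = 'e'
WHERE one.B > 10
ORDER BY one.B;
"""

# Variant 2: the filter on E is applied in the WHERE clause.
filter_in_where = """
SELECT one.B, two.D
FROM table1 AS one
INNER JOIN table2 AS two ON one.A = two.A
WHERE one.B > 10 AND two.E = 'e'
ORDER BY one.B;
"""

rows_on = conn.execute(filter_in_on).fetchall()
rows_where = conn.execute(filter_in_where).fetchall()
print(rows_on, rows_where)
```

The equivalence holds only for inner joins. With a LEFT JOIN, moving a filter on the right-hand table into WHERE would drop the unmatched rows.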

Hive : Checking if a string from table 1 is present in a list of strings from table 2 while joining two tables

I am trying to join on whether a string (a column from table 1) is present in a list of strings (a column from table 2) in Hive QL. Can anyone please help me with the syntax?
SELECT
A.id
FROM tab1 A
inner join tab2 B
ON (
(array_contains(B.purchase_items, A.item_id) = true )
)
Above SQL does not work.
First, unless Hive QL is backwards, your query is wrong upfront:
SELECT A.ID FROM A tab1
will not work because it declares table "A" with the alias "tab1", so the A prefix no longer refers to anything. Either reverse the alias or correct the table alias reference (I assume tab1 is the table name, so go with option 1):
SELECT A.ID from tab1 A
--OR
SELECT tab1.id from A tab1
Second, joins do not work based on conditional criteria, they ARE the conditional criteria. Sort of...
For example:
SELECT A.ID
FROM tab1 A
INNER JOIN tab2 B
ON A.item_id = B.purchase_item
is almost like doing a simple cross join with a WHERE condition:
SELECT A.ID
FROM tab1 A, tab2 B --better to use it straight as "FROM tab1 A cross join tab2 B"
WHERE a.item_id = b.purchase_item
You can use LEFT SEMI JOIN, which returns the rows of the left-hand table that have at least one match in the right-hand table.
SELECT A.id FROM tab1 A
LEFT SEMI JOIN tab2 B
ON A.col1 = B.col1 AND <any-other-join-cond>;
Note that the SELECT and WHERE clauses can’t reference columns from the right hand table.
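The difference between an inner join and a semi join can be sketched outside Hive. The example below uses Python's built-in sqlite3 module, which has neither LEFT SEMI JOIN nor array columns, so it makes two loudly stated substitutions: a WHERE EXISTS subquery stands in for the semi join, and tab2_items is a hypothetical flattened form of tab2.purchase_items with one row per list element. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (id INTEGER, item_id TEXT);
    -- Hypothetical flattened form of tab2.purchase_items: one row per element.
    CREATE TABLE tab2_items (order_id INTEGER, purchase_item TEXT);
    INSERT INTO tab1 VALUES (1, 'apple'), (2, 'pear'), (3, 'kiwi');
    INSERT INTO tab2_items VALUES (10, 'apple'), (11, 'apple'), (11, 'pear');
""")

# Inner join: a left-hand row is repeated once per matching right-hand row.
inner_rows = conn.execute("""
    SELECT A.id
    FROM tab1 AS A
    INNER JOIN tab2_items AS B ON A.item_id = B.purchase_item
    ORDER BY A.id
""").fetchall()

# Semi-join semantics via EXISTS: each left-hand row appears at most once.
semi_rows = conn.execute("""
    SELECT A.id
    FROM tab1 AS A
    WHERE EXISTS (
        SELECT 1 FROM tab2_items AS B WHERE B.purchase_item = A.item_id
    )
    ORDER BY A.id
""").fetchall()

print(inner_rows, semi_rows)
```

In this sketch 'apple' appears in two purchase lists, so the inner join returns id 1 twice while the semi-join form returns it once.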