SQL - Left Join - On part of string only

I'm trying to do a LEFT JOIN, which should be simple enough, but I have two problems:
The values are stored as binary.
I need to join the left 3 characters of one string to the right 3 characters of the other (after they are converted from binary).
That is, join the left 3 characters of this one
convert(VARCHAR(max),(file_key7), 102)
in db [RF_Sydney].[dbo].[std_file]
with the right 3 characters of this one
convert(VARCHAR(max),(code_key), 11)
in db [RF_Sydney].[dbo].[std_code]

In SQL Server, you can join on any condition that can be satisfied; in other words, you can do this:
SELECT *
FROM dbo.std_file f LEFT JOIN dbo.std_code c
ON LEFT(convert(VARCHAR(max),(f.file_key7), 102), 3)
= RIGHT(convert(VARCHAR(max),(c.code_key), 11),3)
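Performance will suck, because wrapping both columns in functions stops an ordinary index on the raw columns from being used (unless you use persisted computed columns and define an index on them).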
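To see the join-on-an-expression idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The sample rows are made up, the keys are assumed to be already converted to text, and SQLite's substr(x, 1, 3) and substr(x, -3) stand in for LEFT(x, 3) and RIGHT(x, 3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE std_file (file_key7 TEXT);
    CREATE TABLE std_code (code_key TEXT);
    -- Hypothetical sample data, already converted from binary to text.
    INSERT INTO std_file VALUES ('ABC123'), ('XYZ999');
    INSERT INTO std_code VALUES ('000ABC'), ('111DEF');
""")

# substr(x, 1, 3) plays the role of LEFT(x, 3); substr(x, -3) of RIGHT(x, 3).
rows = conn.execute("""
    SELECT f.file_key7, c.code_key
    FROM std_file AS f
    LEFT JOIN std_code AS c
      ON substr(f.file_key7, 1, 3) = substr(c.code_key, -3)
    ORDER BY f.file_key7
""").fetchall()

print(rows)
```

Because it is a LEFT JOIN, the std_file row with no matching code still comes back, with NULL (None in Python) in the std_code column.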

The best way to do this is to use a computed column in each of those tables. This will allow you to simplify your join code, and even allow you to define an index on the column to improve performance if you need it. As for getting the left and right value, there are LEFT() and RIGHT() functions you can use:
LEFT(convert(VARCHAR(max),(file_key7), 102), 3)
and
RIGHT(convert(VARCHAR(max),(code_key), 11), 3)
For the join expression and query itself, we don't have enough information yet to know exactly how you want these to fit together.

Do you know the length of file_key7 or code_key? Joining on LEFT(str,len) = RIGHT(str,len) should work, but will possibly take a big performance hit. Maybe you should create a column in each table and store the partial keys in it, already converted to the right character format.
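The "store the partial key in its own column" suggestion could look roughly like this. It is only a sketch in Python's sqlite3, not SQL Server: the join_key column name, the index name, and the sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE std_file (file_key7 TEXT, join_key TEXT);
    CREATE TABLE std_code (code_key TEXT, join_key TEXT);
    -- Hypothetical sample data.
    INSERT INTO std_file (file_key7) VALUES ('ABC123'), ('XYZ999');
    INSERT INTO std_code (code_key) VALUES ('000ABC'), ('111DEF');

    -- Fill the partial keys once, then index the side that gets probed.
    UPDATE std_file SET join_key = substr(file_key7, 1, 3);
    UPDATE std_code SET join_key = substr(code_key, -3);
    CREATE INDEX ix_std_code_join_key ON std_code (join_key);
""")

# The join condition is now a plain column comparison.
matched = conn.execute("""
    SELECT f.file_key7, c.code_key
    FROM std_file AS f
    LEFT JOIN std_code AS c ON c.join_key = f.join_key
    ORDER BY f.file_key7
""").fetchall()

print(matched)
```

A manually filled column like this has to be kept in sync whenever the source column changes; a persisted computed column in SQL Server does that for you.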

Related

Join on phone numbers in different formats

I need an Oracle SQL join on two tables on fields with phone numbers that have different formats. The field in one table uses the format 555-555-5555 and the other uses (555) 555-5555.
What is the syntax that could make this work? The tables are small enough I could probably get by with dropping area codes and just focus on the last 4 digits.
Is it possible? If I can't do a join I'm curious of the syntax for a simple compare such as: Where last4(phonenumber) = '4567'
If you want to compare the whole number, you can use regexp_replace to keep only the digits and then do the comparison:
where regexp_replace(phone_number,'\D','') = '5555551234';
\D matches any non-digit character, and replacing each match with an empty string removes it.
If last 4 digits will do, you can use substr:
where substr(phone_number,-4) = '1234';
Basically, you can use any string function in your join. The ON clause doesn't have to compare plain columns; it can compare calculated values.
For example, following what you suggested, you can use SUBSTR to get the last four digits, and use this in your join:
SELECT * from tableA INNER JOIN tableB on SUBSTR(tableA.num,-4,4) = SUBSTR(tableB.num,-4,4)
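Both ideas (strip everything but the digits, or compare only the last four) can be tried out in a small sketch. This uses Python's sqlite3 instead of Oracle, and since SQLite has no built-in regexp_replace, nested replace() calls stand in for it. The digits() helper and the sample numbers are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (num TEXT);
    CREATE TABLE tableB (num TEXT);
    -- Hypothetical sample data in the two formats from the question.
    INSERT INTO tableA VALUES ('555-555-1234'), ('555-555-9999');
    INSERT INTO tableB VALUES ('(555) 555-1234'), ('(555) 555-0000');
""")

def digits(col):
    """Build a SQL expression that strips '-', '(', ')' and spaces from col."""
    for ch in "-() ":
        col = f"replace({col}, '{ch}', '')"
    return col

# Join on the full number, digits only.
full_match = conn.execute(f"""
    SELECT a.num, b.num
    FROM tableA AS a
    INNER JOIN tableB AS b ON {digits('a.num')} = {digits('b.num')}
""").fetchall()

# Join on the last four characters only.
last4_match = conn.execute("""
    SELECT a.num, b.num
    FROM tableA AS a
    INNER JOIN tableB AS b ON substr(a.num, -4) = substr(b.num, -4)
""").fetchall()

print(full_match, last4_match)
```

Matching on the last four digits alone is only safe while the tables are small enough that those four digits are unique.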

MS Access Update SQL Query Extremely Slow and Multiplying the Amount of Records Updated

I am stumped on how to make this query run more efficiently/correctly. Here is the query first and then I can describe the tables that are involved:
UPDATE agg_pivot_test AS p
LEFT JOIN jd_cleaning AS c
ON c.Formerly = IIF(c.Formerly LIKE '*or*', '*' & p.LyFinalCode & '*', CStr(p.LyFinalCode))
SET p.CyFinalCode = c.FinalCode
WHERE p.CyFinalCode IS NULL AND c.Formerly IS NOT NULL;
agg_pivot_test has 200 rows of data and only 99 fit the criteria of WHERE p.CyFinalCode IS NULL. The JOIN needs some explaining. It is an IIF because some genius decided to link last year's data to this year's data using Formerly. It is a string because sometimes multiple items have been consolidated down to one so they use "or" (e.g., 632 or 631 or 630). So if I want to match this year's data I have to use Formerly to match last year's LyFinalCode. So this year the code might be 629, but I have to use the Formerly to map the items that were 632, 631, or 630 to the new code. Make sense? That is why the ON has an IIF. Also, Formerly is a string and LyFinalCode is an integer... fun.
Anyway, when you run the query it says it is updating 1807 records when again, there are only 200 records and only 99 that fit the criteria.
Any suggestions about why this is happening or how to fix it?
An interesting problem. I don't think I've ever come across something quite like this before.
I'm guessing that rows where CyFinalCode is null are being matched multiple times by the join, so the join produces something close to a cartesian product of row matches, and that product is what the "rows updated" message is counting. It seems odd, as I would have expected Access to complain about multiple row matches, since matches should be 1:1 in an update statement.
I would suggest rewriting the query (with this join) as a select statement, and seeing what the query gives you in the way of output; Something like:
SELECT p.*, c.*
FROM agg_pivot_test p LEFT JOIN jd_cleaning c
ON c.Formerly = IIF(c.Formerly LIKE '*or*', '*' & p.LyFinalCode & '*', CStr(p.LyFinalCode))
WHERE p.CyFinalCode IS NULL AND c.Formerly IS NOT NULL
I'm also inclined to suggest changing "... & p.LyFinalCode & ..." to "... & CStr(p.LyFinalCode) & ..." - though I can't really see why it should make a difference.
The only other thing I can think to suggest is changing the join a bit (this isn't guaranteed to be better, though it might be):
UPDATE agg_pivot_test AS p LEFT JOIN jd_cleaning AS c
ON (c.Formerly = CStr(p.LyFinalCode) OR InStr(c.Formerly, CStr(p.LyFinalCode)) > 0)
SET p.CyFinalCode = c.FinalCode
WHERE p.CyFinalCode IS NULL AND c.Formerly IS NOT NULL;
(Given the syntax of your statement, I assume this SQL is running within Access via ODBC, in which case this should be fine. If I'm wrong and the SQL is running server side, you'll need to replace InStr with the server's equivalent, such as CHARINDEX on SQL Server.)
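One thing worth checking with any substring test like InStr is whether a short code can match inside a longer one, which would be one way to end up with more matches than rows. Here is a small Python sketch of the difference between a substring test and comparing whole codes. The function names and the sample rows are hypothetical; only the "632 or 631 or 630" format comes from the question.

```python
def instr_match(formerly, code):
    """Substring test, analogous to InStr(Formerly, CStr(code)) > 0."""
    return str(code) in formerly

def token_match(formerly, code):
    """Split the 'or' list and compare whole codes only."""
    tokens = [t.strip() for t in formerly.split("or")]
    return str(code) in tokens

# Hypothetical jd_cleaning rows: (Formerly, FinalCode).
jd_cleaning = [("632 or 631 or 630", 629), ("63", 700)]
ly_final_code = 63

# The substring test also matches 63 inside 632, 631 and 630.
loose = [final for formerly, final in jd_cleaning
         if instr_match(formerly, ly_final_code)]

# Whole-code comparison matches only the row that really lists 63.
exact = [final for formerly, final in jd_cleaning
         if token_match(formerly, ly_final_code)]

print(loose, exact)
```

If the real data has codes that are prefixes or substrings of other codes, splitting Formerly into one row per code (in a helper table, say) and joining on equality would avoid the extra matches.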

SQL query to find records with specific prefix

I'm writing SQL queries and getting tripped up by wanting to solve everything with loops instead of set operations. For example, here's two tables (lists, really - one column each); idPrefix is a subset of idFull. I want to select every full ID that has a prefix I'm interested in; that is, every row in idFull which has a corresponding entry in idPrefix.
idPrefix.ID    idFull.ID
-----------    ---------
12             8
15             12
300            12-1-1
               12-1-2
               15
               15-1
               300
Desired result would be everything in idFull except the value 8. Super-easy with a for each loop, but I'm just not conceptualizing it as a set operation. I've tried a few variations on the below; everything seems to return all of one table. I'm not sure if my issue is with how I'm doing joins, or how I'm using LIKE.
SELECT f.ID
FROM idPrefix AS p
JOIN idFull AS f
ON f.ID LIKE (p.ID + '%')
Details:
Values are varchars, prefixes can be any length but do not contain the delimiter '-'.
This question seems similar, but more complex; this one only uses one table.
Answer doesn't need to be fast/optimized/whatever.
Using SQL Server 2008, but am more interested in conceptual understanding than a flavor-specific query.
Aaaaand I'm coming back to both real coding & SO after ~3 years, so sorry if I'm rusty on any etiquette.
Thanks!
You can join the full table to the prefix table with a LIKE (FULL is a reserved word in SQL Server, so use a different alias):
SELECT f.ID
FROM idFull f
INNER JOIN idPrefix p ON f.ID LIKE p.ID + '%'
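The set-based version of the "for each prefix, find matching IDs" loop can be sketched like this, using Python's sqlite3 instead of SQL Server (so string concatenation is || rather than +). The data is the sample from the question plus one extra row, 150, which is added here only to show that a bare prefix LIKE also matches IDs that merely start with the same digits. The stricter variant relies on the stated rule that '-' is the delimiter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE idPrefix (ID TEXT);
    CREATE TABLE idFull (ID TEXT);
    INSERT INTO idPrefix VALUES ('12'), ('15'), ('300');
    -- Sample data from the question, plus '150' (added for illustration).
    INSERT INTO idFull VALUES ('8'), ('12'), ('12-1-1'), ('12-1-2'),
                              ('15'), ('15-1'), ('150'), ('300');
""")

# Plain prefix match: every full ID that starts with some prefix.
loose = [r[0] for r in conn.execute("""
    SELECT DISTINCT f.ID
    FROM idFull AS f
    JOIN idPrefix AS p ON f.ID LIKE p.ID || '%'
    ORDER BY f.ID
""")]

# Stricter match: the prefix itself, or the prefix followed by the delimiter.
strict = [r[0] for r in conn.execute("""
    SELECT DISTINCT f.ID
    FROM idFull AS f
    JOIN idPrefix AS p ON f.ID = p.ID OR f.ID LIKE p.ID || '-%'
    ORDER BY f.ID
""")]

print(loose)
print(strict)
```

The join does the work of the loop: each idFull row is paired with every idPrefix row whose pattern it matches, and rows with no matching prefix (here, 8) simply drop out of an inner join.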

Compare column between two tables (greater and equal to)

I have two tables, and I want to compare a column from each of them. The column reflow in table f_product must be greater than or equal to the column lreflow in table f_line. The query I used is:
SELECT f_product.oiv,f_product.product,f_product.passive,f_product.pitch,f_product.reflow,f_line.lreflow,f_product.spi,f_product.scomp,f_product.pallet,f_product.printer,f_line.line
FROM f_product,f_line
WHERE f_product.passive=f_line.passive
AND f_product.pitch=f_line.pitch
AND f_product.spi=f_line.spi
AND f_product.pallet=f_line.pallet
AND f_product.printer=f_line.printer
AND f_product.reflow >= f_line.lreflow
AND oiv='PMLE4720A'
However, the result does not seem to apply the comparison between f_product.reflow and f_line.lreflow. For example, it still lists rows with reflow=8 and lreflow=10, where reflow is less than lreflow.
Is there an error in my SQL?
I'm guessing this is Oracle? Sometimes it gets confused by the ambiguity between real WHERE clauses and an implicit join written in the WHERE. I would recast it into ANSI SQL joins:
SELECT
.....
FROM
f_product a INNER JOIN f_line b ON
(a.passive = b.passive AND
a.pitch =b.pitch AND
a.spi=b.spi AND
a.pallet=b.pallet)
where oiv='PMLE4720A'
and a.reflow >= b.lreflow
Assuming the relationship between product and line is such that it makes sense to join on these four fields...
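Another possible cause worth ruling out, and this is only a guess since the question doesn't give the column types: if reflow and lreflow are stored as text, then '8' >= '10' is true, because strings are compared character by character and '8' sorts after '1'. A minimal sketch of that effect in Python's sqlite3, with made-up rows and a hypothetical line name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption for this sketch: both columns are stored as text.
    CREATE TABLE f_product (oiv TEXT, reflow TEXT);
    CREATE TABLE f_line (line TEXT, lreflow TEXT);
    INSERT INTO f_product VALUES ('PMLE4720A', '8');
    INSERT INTO f_line VALUES ('L1', '10');
""")

# Text comparison: '8' >= '10' is true, so the row gets through.
as_text = conn.execute("""
    SELECT COUNT(*)
    FROM f_product AS p
    JOIN f_line AS l ON p.reflow >= l.lreflow
""").fetchone()[0]

# Numeric comparison after casting: 8 >= 10 is false, so no rows.
as_number = conn.execute("""
    SELECT COUNT(*)
    FROM f_product AS p
    JOIN f_line AS l
      ON CAST(p.reflow AS INTEGER) >= CAST(l.lreflow AS INTEGER)
""").fetchone()[0]

print(as_text, as_number)
```

If that turns out to be the problem, casting both sides to a number in the comparison (or storing the columns as numbers) should give the expected result.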

Splitting text in SQL Server stored procedure

I'm working with a database, where one of the fields I extract is something like:
1-117 3-134 3-133
Each of these number sets represents a different set of data in another table. Taking 1-117 as an example, 1 = equipment ID, and 117 = equipment settings.
I have another table from which I need to extract data based on the previous field. It has two columns that split equipment ID and settings. Essentially, I need a way to go from the queried column 1-117 and run a query to extract data from another table where 1 and 117 are two separate corresponding columns.
So, is there any way to split this number to run this query?
Also, how would I split those three numbers (1-117 3-134 3-133) into three different query sets?
The tricky part here is that this column can have any number of sets here (such as 1-117 3-133 or 1-117 3-134 3-133 2-131).
I'm creating these queries in a stored procedure as part of a larger document to display the extracted data.
Thanks for any help.
Since you didn't provide the DB vendor, here are two posts that answer this question for SQL Server and Oracle respectively...
T-SQL: Opposite to string concatenation - how to split string into multiple records
Splitting comma separated string in a PL/SQL stored proc
And if you're using some other DBMS, go search for "splitting text ". I can almost guarantee you're not the first one to ask, and there's answers for every DBMS flavor out there.
As you said the format is constant though, you could also do something simpler using a SUBSTRING function.
EDIT in response to OP comment...
Since you're using SQL Server, and you said that these values are always in a consistent format, you can do something as simple as using SUBSTRING to get each part of the value and assign them to T-SQL variables, where you can then use them to do whatever you want, like using them in the predicate of a query.
Assuming that what you said is true about the format always being #-### (exactly 1 digit, a dash, and 3 digits) this is fairly easy.
WITH EquipmentSettings AS (
SELECT
S.*,
Convert(int, Substring(S.AwfulMultivalue, V.number * 6 - 5, 1)) EquipmentID,
Convert(int, Substring(S.AwfulMultivalue, V.number * 6 - 3, 3)) Settings
FROM
SourceTable S
INNER JOIN master.dbo.spt_values V
-- each set takes 6 characters except the last, which has no trailing space
ON V.number BETWEEN 1 AND (Len(S.AwfulMultivalue) + 1) / 6
WHERE
V.type = 'P'
)
SELECT
E.Whatever,
D.Whatever
FROM
EquipmentSettings E
INNER JOIN DestinationTable D
ON E.EquipmentID = D.EquipmentID
AND E.Settings = D.Settings
In SQL Server 2005+ this query will support 1365 values in the string.
If the length of the digits can vary, then it's a little harder. Let me know.
If there are never more than 4 sets, you can use PARSENAME to retrieve each one:
Declare @Num varchar(20)
Set @Num='1-117 3-134 3-133'
select parsename(replace (@Num,' ','.'),3)
Result :- 1-117
Now use parsename again on that result:
Select parsename(replace(parsename(replace (@Num,' ','.'),3),'-','.'),1)
Result :- 117
If there are more than 4 values, use a split function instead.
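For the general case (any number of sets, digits of any length), the logic a split function has to implement is just "split on spaces, then split each piece on the dash". Here is that logic as a short Python sketch rather than T-SQL; split_sets is a made-up name, and the input format is assumed to be exactly what the question shows.

```python
def split_sets(multivalue):
    """Turn '1-117 3-134 3-133' into [(1, 117), (3, 134), (3, 133)]."""
    pairs = []
    for item in multivalue.split():          # one 'id-settings' set per token
        equipment_id, settings = item.split("-", 1)
        pairs.append((int(equipment_id), int(settings)))
    return pairs

pairs = split_sets("1-117 3-134 3-133")
print(pairs)

# Works the same when the digit counts or the number of sets vary.
print(split_sets("10-5 2-1310 1-117 3-133"))
```

Each resulting pair maps onto the two columns of the other table (equipment ID and settings), so the pairs can be fed to a parameterized query one at a time or loaded into a temporary table and joined.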