Pattern matching against two tables

I have two tables with lists of files. table1 has column1 with file names that begin with the string 'STO-' followed by a string pattern. In total, the string has 16 alphanumeric characters (with dashes it's a 20-character string).
This is similar to the file name string found in table2, column1. The issue, however, is that in the first table additional text and characters are appended to that 20-character string. I'm attempting to match results from both tables where those 20-character strings match, along with additional information from the table. I've found plenty of information about pattern matching within a table, but not about comparing two tables. Hopefully I'm explaining myself; here is an example to help:
TABLE1.Column1 contains a file name 'STO-100-XX-XXXX-XXXX_Text.pdf '
TABLE2.Column2 contains a file name 'STO-100-XX-XXXX-XXXX.pdf' and TABLE2.Column3='Y'
So again, I'm trying to see the list of files from TABLE1 where the first 20 characters have a match in TABLE2.

SELECT * from TABLE1 t1
INNER JOIN TABLE2 t2
ON SUBSTRING(t1.Column1, 1, 20) = SUBSTRING(t2.Column2, 1, 20)
(Tested on SQL Server 2005. The three-argument SUBSTRING form also works in MySQL and PostgreSQL, so it should work on most databases; Oracle spells the function SUBSTR.)
Also, it's a little unclear from your question, but if you additionally wanted to restrict the results based on column3, you would simply do
SELECT * from TABLE1 t1
INNER JOIN TABLE2 t2
ON SUBSTRING(t1.Column1, 1, 20) = SUBSTRING(t2.Column2, 1, 20)
WHERE t2.Column3 = 'Y'
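
To try the prefix join without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The sample file names and the 'N' row are made up for illustration, and SQLite's substr() stands in for SUBSTRING, so treat it as an approximation of the answer above rather than the exact SQL Server behavior.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (Column1 TEXT);
    CREATE TABLE TABLE2 (Column2 TEXT, Column3 TEXT);
    INSERT INTO TABLE1 VALUES ('STO-100-AB-1234-5678_Text.pdf'),
                              ('STO-200-CD-0000-1111_Other.pdf');
    INSERT INTO TABLE2 VALUES ('STO-100-AB-1234-5678.pdf', 'Y'),
                              ('STO-200-CD-0000-1111.pdf', 'N');
""")

# Join on the first 20 characters of each file name, then filter on Column3.
rows = conn.execute("""
    SELECT t1.Column1, t2.Column2
    FROM TABLE1 t1
    INNER JOIN TABLE2 t2
        ON substr(t1.Column1, 1, 20) = substr(t2.Column2, 1, 20)
    WHERE t2.Column3 = 'Y'
""").fetchall()

print(rows)
```

Only the 'STO-100-...' pair comes back, because the other prefix match is filtered out by Column3.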

Related

comparing column values in SQL

I am working with multiple tables in PostgreSQL and I want to compare the column values of two tables, but one of them is written differently:
the values are the same, but one of them starts with three zeros.
I've tried this
select * from table1, table2
where table1.projectid=table2.project_id and operating_unit= 'USA'
I even tried replacing '=' with 'IN', but both return an empty table.

Your code should work if one of the values is a number. But neither is. How about converting them?
select *
from table1 join
table2
on table1.projectid::numeric = table2.project_id::numeric and
operating_unit = 'USA';
Sadly, having to convert on both ends precludes the use of indexes. So an alternative is to just change one side or the other:
select *
from table1 join
table2
on '000' || table1.projectid = table2.project_id and
operating_unit = 'USA';
At least this makes it possible to use an index.
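
As a rough check of the padding idea, the sketch below runs both comparisons in SQLite through Python's sqlite3 module. The sample IDs are invented, and it assumes table2.project_id is the column carrying the three leading zeros (the question does not say which side it is); the || concatenation operator behaves the same way in PostgreSQL.

```python
import sqlite3

# Hypothetical data: both columns are text, table2 carries the '000' prefix.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (projectid TEXT, operating_unit TEXT);
    CREATE TABLE table2 (project_id TEXT);
    INSERT INTO table1 VALUES ('12345', 'USA'), ('67890', 'CAN');
    INSERT INTO table2 VALUES ('00012345'), ('00067890');
""")

# Plain string equality: '12345' never equals '00012345', so nothing matches.
plain = conn.execute("""
    SELECT * FROM table1 JOIN table2
        ON table1.projectid = table2.project_id
    WHERE operating_unit = 'USA'
""").fetchall()

# Prefix one side with the missing zeros before comparing.
padded = conn.execute("""
    SELECT * FROM table1 JOIN table2
        ON '000' || table1.projectid = table2.project_id
    WHERE operating_unit = 'USA'
""").fetchall()

print(plain)
print(padded)
```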

SQL UPDATE part of a string with value from other table

I need to replace part of a string with a value from another database table. Actually I need to replace the userids with emails.
DB1.TABLE1
ID|EMAIL
1 |johndoe; janedoe;
2 |otherguy; johndoe;
DB2.TABLE2
ID|USERID |EMAIL
1 |johndoe |johndoe#test.com
2 |janedoe |janedoe#test.com
3 |otherguy|otherguy#test.com
my query
UPDATE
TABLE1
set
EMAIL = TABLE2.EMAIL
from
DB2.TABLE2
where
TABLE1.EMAIL = TABLE2.USERID
How can I specify the "part of the string" part?

There are a number of comments about changing your schema...which would be the best way forward.
It looks like what you are storing in table1.email is actually a list of UserIds from table2. So you'll need to break out these ids in order to join the tables together.
If you absolutely must follow this path, then there are existing Q+As on the site that will help you:
(I've taken a leap of faith that you are using SQL Server ... but if you search I'm sure you can find similar answers for other RDBMSs)
Turning a Comma Separated string into individual rows
and
Multiple rows to one comma separated value
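
The split-then-rejoin idea can also be sketched outside the database. The example below is only an illustration under several assumptions: it uses Python's sqlite3 module, puts both tables in one in-memory database instead of DB1 and DB2, and assumes the ids are separated by ';' exactly as in the sample data. Ids with no entry in TABLE2 are left unchanged.

```python
import sqlite3

# Both tables in a single in-memory database (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (ID INTEGER PRIMARY KEY, EMAIL TEXT);
    CREATE TABLE TABLE2 (ID INTEGER PRIMARY KEY, USERID TEXT, EMAIL TEXT);
    INSERT INTO TABLE1 VALUES (1, 'johndoe; janedoe;'),
                              (2, 'otherguy; johndoe;');
    INSERT INTO TABLE2 VALUES (1, 'johndoe',  'johndoe#test.com'),
                              (2, 'janedoe',  'janedoe#test.com'),
                              (3, 'otherguy', 'otherguy#test.com');
""")

# Map each user id to its email address.
lookup = dict(conn.execute("SELECT USERID, EMAIL FROM TABLE2"))

# Break each list into ids, translate them, and write the joined list back.
for row_id, id_list in conn.execute("SELECT ID, EMAIL FROM TABLE1").fetchall():
    user_ids = [part.strip() for part in id_list.split(";") if part.strip()]
    emails = [lookup.get(user_id, user_id) for user_id in user_ids]
    new_value = "; ".join(emails) + ";"
    conn.execute("UPDATE TABLE1 SET EMAIL = ? WHERE ID = ?", (new_value, row_id))

result = conn.execute("SELECT ID, EMAIL FROM TABLE1 ORDER BY ID").fetchall()
print(result)
```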

I guess you need the following
UPDATE TABLE1
SET EMAIL = (
SELECT TABLE2.EMAIL
FROM TABLE2
WHERE TABLE1.EMAIL LIKE TABLE2.USERID + '%');

PL/SQL Developer: joining VARCHAR2 to NUMBER?

Here's where I am:
TABLE1.ITM_CD is VARCHAR2 datatype
TABLE2.ITM_CD is NUMBER datatype
Executing left join TABLE2 on TABLE1.ITM_CD = TABLE2.ITM_CD yields ORA-01722: invalid number error
Executing left join TABLE2 on to_number(TABLE1.ITM_CD) = TABLE2.ITM_CD also yields ORA-01722: invalid number error.
-- I suspect this is because one of the values in TABLE1.ITM_CD is the string "MIXED"
Executing left join TABLE2 on TABLE1.ITM_CD = to_char(TABLE2.ITM_CD) successfully runs, but it returns blank values for the fields selected from TABLE2.
Here is a simplified version of my working query:
select
A.ITM_CD
,B.COST
,B.SIZE
,B.DESCRIPTION
,A.HOLD_REASON
from
TABLE1 a
left join TABLE2 b on a.ITM_CD = to_char(b.ITM_CD)
This query returns a list of item codes and hold reasons, but just blank values for the cost, size, and descriptions. And I did confirm that TABLE2 contains values for these fields for the returned codes.
UPDATE: Here are pictures with additional info.
I selected the following info from ALL_TAB_COLUMNS--I don't necessarily know what all fields mean, but thought it might be helpful
TABLE1 sample data
TABLE2 sample data

You can convert the TABLE1.ITM_CD to a number after you strip any leading zeros and filter out the "MIXED" values:
select A.ITM_CD
,B.COST
,B.SIZE
,B.DESCRIPTION
,A.HOLD_REASON
from ( SELECT * FROM TABLE1 WHERE ITM_CD <> 'MIXED' ) a
left join TABLE2 b
on TO_NUMBER( LTRIM( a.ITM_CD, '0' ) ) = b.ITM_CD

This is a SQL (and really a database) problem, not PL/SQL. You will need to fix this - fight your bosses if you have to. Item code must be a Primary Key in one of your two tables, and a Foreign Key in the other table, pointing to the PK. Or perhaps Item code is PK in another table which you didn't show us, and Item code in both the tables you showed us should be FK pointing to that PK.
You don't have that arrangement now, which is exactly why you are having all this trouble. The data type must match (it doesn't now), and you shouldn't have values like 'MIXED' - unless your business rules allow it, and then the field should be VARCHAR2 and 'MIXED' should be one of the values in the PK column (whichever table that is in).
In your case, the problem is that the codes in VARCHAR2 format start with a leading 0, so if you compare them to the numbers converted to strings, you never get a match (and in the outer join, the unmatched rows show NULL for every TABLE2 column).
Instead, when you convert your numbers to strings, add leading zero(s) like this (the FM modifier suppresses the leading blank that TO_CHAR otherwise reserves for the sign):
...on a.ITM_CD = TO_CHAR(b.ITM_CD, 'FM099999')

You can trim leading zeros from your string column.
select
A.ITM_CD
,B.COST
,B.SIZE
,B.DESCRIPTION
,A.HOLD_REASON
from
TABLE1 a
left join TABLE2 b on trim(LEADING '0' FROM a.ITM_CD ) = to_char(b.ITM_CD)
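
The leading-zero mismatch discussed in these answers is easy to reproduce. The sketch below is an approximation in SQLite via Python's sqlite3 module, not Oracle: CAST(... AS TEXT) stands in for to_char, ltrim(x, '0') stands in for the Oracle trim call, and the sample rows are invented.

```python
import sqlite3

# Hypothetical rows: a zero-padded text code, a non-numeric code, a numeric code.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (ITM_CD TEXT, HOLD_REASON TEXT);
    CREATE TABLE TABLE2 (ITM_CD INTEGER, COST REAL);
    INSERT INTO TABLE1 VALUES ('012345', 'damaged'), ('MIXED', 'review');
    INSERT INTO TABLE2 VALUES (12345, 9.5);
""")

# Number converted to text is '12345', which never equals '012345':
# the left join keeps the row but fills the TABLE2 columns with NULL.
naive = conn.execute("""
    SELECT a.ITM_CD, b.COST
    FROM TABLE1 a
    LEFT JOIN TABLE2 b ON a.ITM_CD = CAST(b.ITM_CD AS TEXT)
    ORDER BY a.ITM_CD
""").fetchall()

# Stripping leading zeros from the text side makes the codes comparable.
trimmed = conn.execute("""
    SELECT a.ITM_CD, b.COST
    FROM TABLE1 a
    LEFT JOIN TABLE2 b ON ltrim(a.ITM_CD, '0') = CAST(b.ITM_CD AS TEXT)
    ORDER BY a.ITM_CD
""").fetchall()

print(naive)
print(trimmed)
```

Comparing as text also sidesteps the invalid-number problem, since 'MIXED' is never converted to a number.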

Select values from one table depending on referenced value in another table

I have two tables in my SQLite Database (dummy names):
Table 1: FileID F_Property1 F_Property2 ...
Table 2: PointID ForeignKey(fileid) P_Property1 P_Property2 ...
The entries in Table2 all have a foreign key column that references an entry in Table1.
I now would like to select entries from Table2 where for example F_Property1 of the referenced file in Table1 has a specific value.
I tried something naive:
select * from Table2 where fileid=(select FileID from Table1 where F_Property1 > 1)
Now this actually works, kind of. It selects a correct file id from Table1 and returns entries from Table2 with this ID. But it only uses the first returned ID. What I need it to do is basically connect the returned IDs from the inner select by OR so it returns data for all the IDs.
How can I do this? I think it is some kind of cross-table query like what is asked here What is the proper syntax for a cross-table SQL query? but these answers contain no explanation of what they are actually doing, so I'm struggling with any implementation.
They are using JOIN statements, but wouldn't this mix entries from Table1 and Table2 together while only checking matching IDs in both tables? At least that is how I understand this http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
As you may have noticed from the style, I'm very new to using databases in general, so please forgive me if not everything is clear about what I want. Please leave a comment and I will try to improve the question if necessary.

The = operator compares a single value against another, so it is assumed that the subquery returns only a single row.
To check whether a (column) value is in a set of values, use IN:
SELECT *
FROM Table2
WHERE fileid IN (SELECT FileID
FROM Table1
WHERE F_Property1 > 1)
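
A small sketch of the difference, run against SQLite through Python's sqlite3 module. The rows are made up, and Table2's foreign key column is assumed to be named fileid as in the question's own query.

```python
import sqlite3

# Hypothetical files and points; files 2 and 3 satisfy F_Property1 > 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (FileID INTEGER PRIMARY KEY, F_Property1 INTEGER);
    CREATE TABLE Table2 (PointID INTEGER PRIMARY KEY,
                         fileid INTEGER REFERENCES Table1(FileID),
                         P_Property1 TEXT);
    INSERT INTO Table1 VALUES (1, 0), (2, 5), (3, 7);
    INSERT INTO Table2 VALUES (10, 1, 'a'), (11, 2, 'b'),
                              (12, 3, 'c'), (13, 3, 'd');
""")

# '=' against a subquery: SQLite silently uses only one row of the subquery.
with_equals = conn.execute("""
    SELECT PointID FROM Table2
    WHERE fileid = (SELECT FileID FROM Table1 WHERE F_Property1 > 1)
    ORDER BY PointID
""").fetchall()

# IN checks membership in the whole set of returned FileIDs.
with_in = conn.execute("""
    SELECT PointID FROM Table2
    WHERE fileid IN (SELECT FileID FROM Table1 WHERE F_Property1 > 1)
    ORDER BY PointID
""").fetchall()

print(with_equals)
print(with_in)
```

The IN version returns the points of every matching file, while the '=' version returns the points of just one of them.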

The way joins work is not by "mixing" the data, but sort of combining them based on the key.
In your case (I am assuming the key field in Table 1 is unique), if you join those two tables on the primary key field, you will end up with all the entries in table2 plus all corresponding fields from table1. If you were doing this:
select * from table1, table2 where table1.FileID=table2.foreignkey;
then, providing your key fields are set up right, you will end up with the following:
PointID ForeignKey(fileid) P_Property1 P_Property2 FileID F_Property1 F_Property2
The field values from table1 would be from matching rows.
Now, if you do this:
select table2.* from table1, table2 where
table1.FileID=table2.foreignkey and F_Property1>1;
You would essentially get the same set of records, but it will only show the columns from the second table, and only the rows that satisfy the where condition on the first one.
Hope this helps :)

If I understood your question correctly this will get the job done.
Select t2.*
from table1 t1
inner join table2 t2 on t2.id = t1.id
where t1.Prop = 'SomeValue'

CharIndex returning null when I know there is overlap in two strings

I have created a new column in my table (table1). I am trying to populate it with data from another table, table2.
Table1 has a column called 'Name'. 'Name' contains a substring indicating the language of the row. I wish to compare this substring with the 'Language' column of table2, which contains the same substring, and insert the corresponding LanguageID into my new column.
So, for instance :
table1
Name
xxXxxxXxxxxxzxzxzxz xxxazxzxxXXXZxxzxzx 2183909213 ENG-UK nfjksdnfnd 723984782347
and table2 :
table2
Language | ID
ENG-uk | 1
In the table1 name column, the string before and after the Language can take any form, a varying number of characters. The language will always have a space before and after it.
So, I want to end up with :
table1
Name | LanguageID
xx... | 1
I have this query which I believe should work :
INSERT INTO table1 (LanguageID)
SELECT t2.ID FROM table2 t2, table1 t1 WHERE CHARINDEX(LOWER(t2.Language), LOWER(t1.Name)) != null
The problem is that when I run this, I get "(0 row(s) affected)", which should not be the case.
Does anyone have any ideas?

The reason that you don't get any matches at all is that you can't use the != operator to compare null values, you have to use is not null for that.
However, that will give you a very big result, as the return value from charindex is never null. When the string isn't found it returns zero, so that is what you should compare against.
Also, you can't insert columns, you have to first add the column to the table, then update the records:
update t1
set LanguageID = t2.ID
from table1 t1
inner join table2 t2 on charindex(lower(t2.Language), lower(t1.Name)) != 0
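
The two points above (comparing against NULL never matches, and the fix is an UPDATE with a "found" test) can be checked with a sketch in SQLite via Python's sqlite3 module. This is an approximation, not T-SQL: instr(haystack, needle) stands in for CHARINDEX (note the reversed argument order), a correlated subquery replaces UPDATE ... FROM, and the spaces around the language code are added here based on the question's statement that the code is always surrounded by spaces. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Name TEXT, LanguageID INTEGER);
    CREATE TABLE table2 (Language TEXT, ID INTEGER);
    INSERT INTO table1 (Name) VALUES
        ('xxXxx 2183909213 ENG-UK nfjksdnfnd 723984782347'),
        ('yyy 42 FRA-fr zzz');
    INSERT INTO table2 VALUES ('ENG-uk', 1), ('GER-de', 2);
""")

# '!= NULL' evaluates to NULL for every row, so the filter keeps nothing.
null_matches = conn.execute("""
    SELECT count(*) FROM table2 t2, table1 t1
    WHERE instr(lower(t1.Name), lower(t2.Language)) != NULL
""").fetchone()[0]

# Update in place; rows with no matching language end up with NULL.
conn.execute("""
    UPDATE table1
    SET LanguageID = (
        SELECT t2.ID FROM table2 t2
        WHERE instr(lower(table1.Name), ' ' || lower(t2.Language) || ' ') > 0
    )
""")

language_ids = conn.execute(
    "SELECT LanguageID FROM table1 ORDER BY rowid"
).fetchall()

print(null_matches)
print(language_ids)
```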

1st, CHARINDEX returns 0 when the search string is not found. It never returns null.
See http://msdn.microsoft.com/en-us/library/ms186323.aspx
2nd, I think you should UPDATE, not INSERT.
example (this won't work correctly if t2.Language is not UNIQUE):
UPDATE table1
SET LanguageID = (SELECT t2.ID from table2 t2 where CHARINDEX(LOWER(t2.Language), LOWER(table1.Name))>0)
where exists (SELECT t2.ID from table2 t2 where CHARINDEX(LOWER(t2.Language), LOWER(table1.Name))>0)

You should consider adding individual columns for the information packed into the first table's Name column. Otherwise, you will end up with performance issues due to the substring operation.
It's clear from the first table that the schema is not normalized. Moreover, the first table's schema is not suitable for any search or sorting operations.
