PostgreSQL: Joining Tables Based on Searched Concatenated Strings - sql

I'm not sure how to write a join clause that takes a value from table 1, then searches a string in table 2 to see if they match. Sound confusing?
Here's the actual example I'm working with.
Table 1
Customer_Id | Concat_Phone_Numbers
1           | 8888888888;1111111111
Table 2
Caller     | Callee     | Calldate
1111111111 | 3333333333 | 1/1/1900
I want to create a table that looks like this:
Desired Table
Customer_Id | Calldate
1           | 1/1/1900
I'm lost when it comes to writing the join clause so that the entire list in Table 1's second column is searched for a matching phone number/entry.
Thank you in advance for your help! (PS it's my first time asking a question!)
Edit::
Here's where I'm at now
Select
*
from table1
left join table2
on ??????????????????

Yuck! You should fix the data structure. You really need a table with one row per customer and per phone number. You'll understand why if you care about performance.
But, if you are stuck with this data model, you can do a join using string and/or array operations. Here is a method using regular expressions:
select . . .
from table1 t1 left join
table2 t2
on t2.caller ~ ('^(' || replace(t1.concat_phone_numbers, ';', '|') || ')$') or
t2.callee ~ ('^(' || replace(t1.concat_phone_numbers, ';', '|') || ')$');
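One detail worth checking in a pattern built from the list this way is the grouping around the alternation: without parentheses, '^' applies only to the first number and '$' only to the last. The sketch below uses Python's re module (not PostgreSQL itself) to show the difference; build_pattern is a hypothetical helper, and the longer test number is made up for illustration.

```python
import re

def build_pattern(concat_numbers):
    # Hypothetical helper: '8888888888;1111111111' -> '^(8888888888|1111111111)$'.
    # re.escape guards against regex metacharacters in the stored text.
    alternatives = "|".join(re.escape(n) for n in concat_numbers.split(";"))
    return "^(" + alternatives + ")$"

stored = "8888888888;1111111111"
grouped = build_pattern(stored)
ungrouped = "^" + stored.replace(";", "|") + "$"

# Grouped: only a whole-number match succeeds.
matches_exact = re.search(grouped, "1111111111") is not None
matches_longer_grouped = re.search(grouped, "88888888889999") is not None

# Ungrouped: '^' binds to the first alternative only, so a longer
# number that merely starts with 8888888888 also matches.
matches_longer_ungrouped = re.search(ungrouped, "88888888889999") is not None
```

In PostgreSQL it is also safer to parenthesize the whole concatenated pattern on the right-hand side of ~, so that the pattern is assembled before the match is evaluated.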

Related

Join two tables in mssql with column int and varchar having comma separated values

I have 2 tables, tblemployee and tbllead. My issue is that in tblemployee the column id is an integer value, while the tbllead column leademployees is a varchar holding comma separated values.
So when I join these two tables I run into a problem.
select * from tblemployee as e join tbllead as l
on e.id=l.LeadEmployees
Getting:- Error converting data type varchar to bigint.
My desired result should have all the columns values except 4.
Thanks in Advance...
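Here is a general approach. The syntax below is SQL Server (+ for concatenation, CONVERT for the cast), but the same idea works on most databases with their own concatenation and conversion functions: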
SELECT DISTINCT
t1.ID
FROM tblemployee t1
INNER JOIN tbllead t2
ON ',' + t2.leademployees + ',' LIKE '%,' + CONVERT(varchar(10), t1.ID) + ',%'
ORDER BY
t1.ID;
This query uses a trick I first saw being used by #CL , which can best be explained by showing some data. The first row of tbllead has this data:
1,2,6,7
The comparison for the inner numbers and the outer numbers is different, due to the presence/absence of a comma on one side. But if we concatenate commas around this string, we get the following:
,1,2,6,7,
Now, we can just compare every ID from the tblemployee table, surrounded by commas, against the list of employees on every record. That is, compare ,1, ,2, etc.
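As a rough, runnable illustration of the comma-wrapping idea, here is a sketch using Python's built-in sqlite3 module rather than SQL Server, so || replaces + and no CONVERT is needed. The leadid column and the employee ids 4 and 12 are assumptions added only to show non-matching rows; the list 1,2,6,7 is the one from the explanation above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblemployee (id INTEGER PRIMARY KEY);
    -- leadid is a hypothetical key column, not from the question
    CREATE TABLE tbllead (leadid INTEGER PRIMARY KEY, leademployees TEXT);
    INSERT INTO tblemployee (id) VALUES (1), (2), (4), (6), (7), (12);
    INSERT INTO tbllead (leadid, leademployees) VALUES (1, '1,2,6,7');
""")

# Wrap the list in commas, then look for ',<id>,' inside it.
matched_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT e.id
    FROM tblemployee e
    JOIN tbllead l
      ON ',' || l.leademployees || ',' LIKE '%,' || e.id || ',%'
    ORDER BY e.id
""")]
```

Because every id is compared with a comma on both sides, 12 does not match a list that only contains 1 and 2.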
If you are using MySQL, the above query would change slightly. Also, there would be an even tighter option in MySQL using FIND_IN_SET. But the answer I gave is more useful in my view, because it can easily be applied regardless of the database.

Inner-Join on two columns where one column has a single trailing character

Hi I'm new to SQL and I have 2 tables that I am trying to do an inner-join with.
------------------------
First table:
------------------------
ID-Number CustomerName
------------------------
Second table
------------------------
ID-Number CustomerDevice
(ID with a single trailing character)
Questions
What would be the best performing way to execute the inner-join on both tables' ID-Number?
Is there a method to remove the trailing character within the inner-join command?
You don't have much choice. Here is how you can express the logic:
select . . .
from t1 join
t2
on t2.id like t1.id + '_';
Unfortunately, this may not make use of indexes. (Also note that + for string concatenation is SQL Server-specific.)
You might be able to rewrite the query as:
on t1.id = left(t2.id, len(t2.id) - 1)
This should be able to use an index on t1(id).
The best approach is to fix the data, so your ids are the same type, same length, and have a properly declared foreign key relationship. Another alternative available in SQL Server is an index on a computed column:
alter table t2 add realId as (left(id, len(id) - 1));
create index idx_t2_realId on t2(realId);
Then write the join logic using realId.
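For anyone who wants to try the "strip the last character, then compare" join without a SQL Server instance, here is a minimal sketch using Python's sqlite3 module. SQLite has no left() or len(), so substr() and length() stand in for them; the column names and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sample data is hypothetical; t2.id is t1.id plus one trailing character.
    CREATE TABLE t1 (id TEXT PRIMARY KEY, customer_name TEXT);
    CREATE TABLE t2 (id TEXT PRIMARY KEY, customer_device TEXT);
    INSERT INTO t1 VALUES ('1001', 'Alice'), ('1002', 'Bob');
    INSERT INTO t2 VALUES ('1001A', 'phone'), ('1002B', 'tablet'), ('1003C', 'laptop');
""")

# Drop the final character of t2.id before comparing it with t1.id.
rows = conn.execute("""
    SELECT t1.id, t1.customer_name, t2.customer_device
    FROM t1
    JOIN t2
      ON t1.id = substr(t2.id, 1, length(t2.id) - 1)
    ORDER BY t1.id
""").fetchall()
```

The device '1003C' has no customer, so it drops out of the inner join.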
Would this work?
SELECT
t1.[ID-Number],
t1.CustomerName,
t2.CustomerDevice
FROM t1
INNER JOIN t2 ON t1.[ID-Number] = LEFT(t2.[ID-Number], LEN(t2.[ID-Number]) - 1)
EDIT: Forgot the 1
Given that the table Customer has this column
ID_number int not null;
And the table Device has this column
ID_number varchar(15);
And we know that Device.ID_number, if it is not NULL, is always equal to some Customer.ID_number with a letter appended, then (SQL Server):
SELECT *
FROM Customer c
JOIN Device d
ON c.ID_number = CAST(SUBSTRING(d.ID_number, 1, LEN(d.ID_number) - 1) AS int)
More robust solutions that allow for more possibilities in the data require more defensive coding. You may want to define a scalar function to process Device.ID_number.

sql newbie trying to cross reference data in two tables

I have two tables, DATA_TABLE and ERROR_TABLE. DATA_TABLE has a column ID with an 8 digit number. ERROR_TABLE has a column KEY with idkey= followed by eight digits.
I can't use an intersect because the tuples are different and I don't know how to work around it.
What I need is to find the rows in DATA_TABLE that have a value in the ID column that corresponds to a value in the ERROR_TABLE in the KEY column.
Here's a visual. I need to see which of these match and display those rows from the DATA_TABLE
DATA_TABLE
ID
24294857
19573859
49205983
ERROR_TABLE
KEY
idkey=24294857
idkey=66849896
idkey=94697356
I would use a simple inner join, adding the idkey= prefix to ID so the two columns can be compared:
select *
from DATA_TABLE
inner join ERROR_TABLE
on 'idkey=' || DATA_TABLE.ID = ERROR_TABLE.KEY
select dt.*
from data_table dt
join error_table et
on et.key = 'idkey=' || dt.id
Similar to PM 77-1, but using an explicit join:
SELECT *
FROM DATA_TABLE
JOIN ERROR_TABLE
ON 'idkey=' || ID = KEY;
See: http://sqlfiddle.com/#!4/dbe73/3
Do you mean something like this?
SELECT * FROM DATA_TABLE d, ERROR_TABLE e
WHERE ('idkey='||d.id = e.key)
ORDER BY d.id
The best thing to do is to correct the data in ERROR_TABLE. Failing that, cut off the characters in front of the digits before joining, so that both sides of the comparison hold the same kind of value; you can convert the result explicitly with the TO_NUMBER() function if ID is numeric. You already have plenty of join examples, so the examples below only show how to reduce the text to its digits:
-- This example assumes you will always have '=' in your string --
SELECT SUBSTR('idkey=24294857', INSTR('idkey=24294857', '=')+1) digits_only
FROM dual
/
SELECT regexp_replace('idkey=24294857','[^0-9]') digits_only
FROM dual
/
SELECT REGEXP_SUBSTR('idkey=24294857','[[:digit:]]+') AS digits_only FROM dual
/
After this your query will compare number to number in join, which is the only right thing to do.
Thanks for the help everyone!
I tried every suggestion, but couldn't get any to work. I'm really green with SQL and only understand a few commands.
Finally, I got this from a colleague and it accomplished what I need. Now, I just need to understand it...
SELECT *
FROM DATA_TABLE
WHERE EXISTS
(SELECT NULL
FROM ERROR_TABLE
WHERE ID = SUBSTR(ERROR_TABLE.KEY,7));
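To see what this query does, it may help to run the two pieces separately. The sketch below uses Python's sqlite3 module instead of Oracle, with the sample values from the question; storing ID as text is an assumption made to keep the comparison simple. SUBSTR(KEY, 7) returns everything from the 7th character onward, which skips the six characters of idkey=, and EXISTS keeps a DATA_TABLE row as soon as one ERROR_TABLE row produces the same digits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption: ID is stored as text; "key" is quoted because KEY is an SQL keyword.
    CREATE TABLE data_table (id TEXT);
    CREATE TABLE error_table ("key" TEXT);
    INSERT INTO data_table VALUES ('24294857'), ('19573859'), ('49205983');
    INSERT INTO error_table VALUES
        ('idkey=24294857'), ('idkey=66849896'), ('idkey=94697356');
""")

# Step 1: what SUBSTR(KEY, 7) leaves behind for each error row.
stripped = [r[0] for r in conn.execute(
    'SELECT substr("key", 7) FROM error_table ORDER BY rowid')]

# Step 2: keep the data rows for which at least one error row matches.
matching = [r[0] for r in conn.execute("""
    SELECT d.id
    FROM data_table d
    WHERE EXISTS (
        SELECT NULL
        FROM error_table e
        WHERE d.id = substr(e."key", 7)
    )
""")]
```

SELECT NULL inside EXISTS is only a placeholder: EXISTS tests whether the subquery returns any row, not what the row contains.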

Select query to check whether data exists in another table, where the data needs to be split and each part checked

I don't want to use any function or procedure.
I want a simple select query that checks whether each part of a string exists in another table.
For example, I have a table dummy which has a name column:
Id name
1 as;as;as
2 asd;rt
and child table
child_id name
23 as
24 asd
25 rt
Is there any way I can do that?
I have tried:
select substr(name,1,instr(name,';')-1) from dummy;
select substr(name,instr(name,';')+1,instr(name,';')-1)
from dummy;
These give only the first and second parts, but not the rest.
How do I get the other parts?
If I've got it right, you need to join these tables where the child's NAME is included in DUMMY.name:
select t1.*,
t2.child_id,
t2.name as t2name
from t1
left join t2 on (';'||t1.name||';' like '%;'||t2.name||';%')
I would need more information on this question. We do not know whether more than one of the possible strings has to be detected in a single field.
You could use three LIKE clauses for the three possible positions of the value in the list (first, last, and in the middle), plus an equality test for a list that holds a single value:
dummy.name LIKE child.name || ';%'
dummy.name LIKE '%;' || child.name
dummy.name LIKE '%;' || child.name || ';%'
In the long run, though, it is probably worth learning to build regular expressions. Here is a webpage that helped me a lot: txt2re.com

How to compare string data to table data in SQL Server - I need to know if a value in a string doesn't exist in a column

I have two tables, one an import table, the other a FK constraint on the table the import table will eventually be put into. In the import table a user can provide a list of semicolon separated values that correspond to values in the 2nd table.
So we're looking at something like this:
TABLE 1
ID | Column1
1 | A; B; C; D
TABLE 2
ID | Column2
1 | A
2 | B
3 | D
4 | E
The requirement is:
Rows in TABLE 1 with a value not in TABLE 2 (C in our example) should be marked as invalid for manual cleanup by the user. Rows where all values are valid are handled by another script that already works.
In production we'll be dealing with 6 columns that need to be checked and imports of AT LEAST 100k rows at a time. As a result I'd like to do all the work in the DB, not in another app.
BTW, it's SQL2008.
I'm stuck, anyone have any ideas. Thanks!
Seems to me you could pass ID & Column1 values from Table1 to a Table-Valued function (or a temp table in-line) which would parse the ;-delimited list, returning individual values per record.
Here are a couple options:
T-SQL: Parse a delimited string
Quick T-Sql to parse a delimited string
The result (ID, value) from the function could be used to compare (unmatched query) against values in Table 2.
SELECT DISTINCT tmp.ID
FROM tmp
LEFT JOIN Table2 ON Table2.Column2 = tmp.value
WHERE Table2.ID IS NULL
The ID results of the comparison would then be used to flag records in Table 1.
Perhaps inserting those composite values into 'TABLE 1' seemed like the most convenient solution at one time. However, unless your users are using SQL Server Management Studio or something similar to enter the values directly into the table, I assume there must be a software layer between the UI and the database. If so, you're going to save yourself a lot of headaches both now and in the long run by investing a little time in altering your code to split the semicolon-delimited inputs into discrete values before inserting them into the database. This will result in 'TABLE 1' looking something like this:
TABLE 1
ID | Column1
1 | A
1 | B
1 | C
1 | D
It's then trivial to write the SQL to find those IDs which are invalid.
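A minimal sketch of that approach, assuming the software layer is Python and using the built-in sqlite3 module in place of SQL Server: split_values is a hypothetical helper, the table and column names are simplified, and the second import row ('A; E') is made up to show a fully valid row.

```python
import sqlite3

def split_values(raw):
    # Hypothetical helper: 'A; B; C; D' -> ['A', 'B', 'C', 'D'] (blank parts skipped).
    return [part.strip() for part in raw.split(";") if part.strip()]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, column1 TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, column2 TEXT);
    INSERT INTO table2 VALUES (1, 'A'), (2, 'B'), (3, 'D'), (4, 'E');
""")

# Split each delimited input into one row per value before inserting.
imported = [(1, "A; B; C; D"), (2, "A; E")]
for row_id, raw in imported:
    conn.executemany(
        "INSERT INTO table1 (id, column1) VALUES (?, ?)",
        [(row_id, value) for value in split_values(raw)],
    )

# Anti-join: values in table1 with no counterpart in table2.
invalid = conn.execute("""
    SELECT t1.id, t1.column1
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.column2 = t1.column1
    WHERE t2.id IS NULL
    ORDER BY t1.id
""").fetchall()
```

With one value per row, the check is a plain equality join that can use an index on table2.column2.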
If it is possible, try putting the values in separate rows when importing (instead of storing it as ; separated).
This might help.
Here is an easy and straightforward solution for the IDs of the invalid rows, despite its lack of performance because of string manipulations.
select T1.ID
from [TABLE 1] T1
left join [TABLE 2] T2
on ('; ' + T1.COLUMN1 + '; ') like ('%; ' + T2.COLUMN2 + '; %')
where T1.COLUMN1 is not null
group by T1.ID, T1.COLUMN1
having count(T2.COLUMN2) < len(T1.COLUMN1) - len(replace(T1.COLUMN1, ';', '')) + 1
There are two assumptions:
The semicolon-separated list does not contain duplicates
TABLE 2 does not contain duplicates in COLUMN2.
The second assumption can easily be fixed by using (select distinct COLUMN2 from [TABLE 2]) rather than [TABLE 2].
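The counting idea can be checked on a small data set. The sketch below is an approximation in Python's sqlite3 module (|| and length() in place of + and len()), with simplified table names; rows 2 and 3 of table1 are made-up additions. It counts the listed values as the number of semicolons plus one, and counts matches on a TABLE 2 column so that a row with no matches at all counts as zero rather than one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, column1 TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, column2 TEXT);
    -- Row 1 is from the question; rows 2 and 3 are hypothetical.
    INSERT INTO table1 VALUES (1, 'A; B; C; D'), (2, 'A; E'), (3, 'C');
    INSERT INTO table2 VALUES (1, 'A'), (2, 'B'), (3, 'D'), (4, 'E');
""")

# listed  = number of ';' separators + 1
# matched = number of table2 values found in the delimiter-wrapped list
report = conn.execute("""
    SELECT t1.id,
           length(t1.column1) - length(replace(t1.column1, ';', '')) + 1 AS listed,
           count(t2.column2) AS matched
    FROM table1 t1
    LEFT JOIN table2 t2
      ON '; ' || t1.column1 || '; ' LIKE '%; ' || t2.column2 || '; %'
    GROUP BY t1.id, t1.column1
    ORDER BY t1.id
""").fetchall()

# A row is invalid when fewer values matched than were listed.
invalid_ids = [row_id for row_id, listed, matched in report if matched < listed]
```

Row 3 shows why the match count matters: its single value C has no match, so it is flagged only if unmatched rows count as zero.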