SQL: Can I use CHARINDEX to return the best match, not just the first match?

http://sqlfiddle.com/#!6/5ac78/1
Not sure if that fiddle will work. I want to return code 2 from the join on CHARINDEX.
As another example, I have a Description table (dt) that looks like this:
ID Description Code
158 INTEREST 199
159 INTEREST PAID 383
160 INTEREST PAYABLE ON ACCOUNT 384
And a master table (mt) with entries like this:
ID Narrative Code
1 INTEREST PAID NULL
I need to set the Code on the master table to 383. When I do an UPDATE based on a JOIN using CHARINDEX(dt.Description, mt.Narrative) > 0, it sets mt.Code to 199 every time.
How can I update the master table to pull the Code from the Description table with the best match, not just the first matching instance?
Thanks!

You could use a plain JOIN to find the matches, plus a LEFT JOIN to eliminate all but the longest match:
UPDATE t1
SET t1.codeA = t2_1.codeB
FROM table1 t1
JOIN table2 t2_1
ON CHARINDEX(t2_1.colB, t1.colA) > 0
LEFT JOIN table2 t2_2
ON CHARINDEX(t2_2.colB, t1.colA) > 0
AND t2_1.codeB <> t2_2.codeB
AND LEN(t2_2.colB) > LEN(t2_1.colB)
WHERE t2_2.colB IS NULL;
An SQLfiddle to test with.
Note that it's (probably) not possible to make a CHARINDEX query like this one (or your original query) use indexes, so the query may be very slow for large amounts of data.
Also, always test first before running SQL updates from random people on the Internet on your production data :)
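The longest-match idea can be tried on the sample rows from the question. This is a minimal sketch, not the answer's exact query: it uses Python's built-in sqlite3 module, where instr(haystack, needle) stands in for CHARINDEX(needle, haystack), and a correlated subquery with ORDER BY length(...) DESC LIMIT 1 replaces the self LEFT JOIN. The function name best_match_codes is made up for illustration.

```python
import sqlite3


def best_match_codes():
    """Return {mt.ID: Code} after giving each narrative the code of its longest matching description."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE dt (ID INTEGER, Description TEXT, Code INTEGER);
        CREATE TABLE mt (ID INTEGER, Narrative TEXT, Code INTEGER);
        INSERT INTO dt VALUES
            (158, 'INTEREST', 199),
            (159, 'INTEREST PAID', 383),
            (160, 'INTEREST PAYABLE ON ACCOUNT', 384);
        INSERT INTO mt VALUES (1, 'INTEREST PAID', NULL);
    """)
    # Assumption: "best match" means the longest description contained in the narrative.
    con.execute("""
        UPDATE mt
        SET Code = (
            SELECT dt.Code
            FROM dt
            WHERE instr(mt.Narrative, dt.Description) > 0
            ORDER BY length(dt.Description) DESC
            LIMIT 1
        )
        WHERE EXISTS (
            SELECT 1 FROM dt WHERE instr(mt.Narrative, dt.Description) > 0
        )
    """)
    rows = con.execute("SELECT ID, Code FROM mt").fetchall()
    con.close()
    return dict(rows)


if __name__ == "__main__":
    print(best_match_codes())
```

Here 'INTEREST PAID' contains both 'INTEREST' and 'INTEREST PAID', and the longer of the two wins, so the row ends up with code 383 rather than 199.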

This is awkward, but it seems to work:
update table1
set codeA = (
select max(codeB)
from table2
where charindex(colB, colA) > 0
)
where exists (
select 1
from table2
where charindex(colB, colA) > 0
);
Revised fiddle is here: http://sqlfiddle.com/#!6/5ac78/12
The problem is knowing which is the "best" value to return. I have assumed that, among the matching rows, the one with the highest codeB is the one you want, since that is what max(codeB) returns.

Related

Including multiple columns in NOT IN

I have two tables as below.
Table 1
Book price
A 100
B 200
C 400
D 300
Table 2
Book price
A 100
B 200
C 400
I am executing the command below because I want only the 4th record to be inserted into table 2. I want to put both column names before NOT IN. What should I do?
Insert into table2 select * from table1 t1 where t1.book not in (select book from table2);
You can use NOT EXISTS, matching on Book (presumably the primary key of both tables). To compare both columns, add AND price = t1.price to the subquery:
INSERT INTO table2
SELECT *
FROM table1 t1
WHERE NOT EXISTS ( SELECT 0 FROM table2 WHERE Book=t1.Book);
P.S.: Be careful with NULL values when using the NOT IN operator: if the subquery returns a NULL, NOT IN matches no rows at all. NOT IN is also often less performant than NOT EXISTS, although that depends on the database and the query plan.
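Both points can be checked with a small experiment. The sketch below uses Python's built-in sqlite3 module rather than the asker's database, so treat it as an illustration only; the helper names (_load, insert_missing, not_in_books) are invented. insert_missing compares both book and price through NOT EXISTS, and not_in_books shows what NOT IN returns once table2 contains a NULL book.

```python
import sqlite3


def _load(rows1, rows2):
    """Create table1/table2 in an in-memory database and fill them."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (book TEXT, price INTEGER)")
    con.execute("CREATE TABLE table2 (book TEXT, price INTEGER)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", rows1)
    con.executemany("INSERT INTO table2 VALUES (?, ?)", rows2)
    return con


def insert_missing(rows1, rows2):
    """Copy rows from table1 that have no (book, price) twin in table2; return table2."""
    con = _load(rows1, rows2)
    con.execute("""
        INSERT INTO table2 (book, price)
        SELECT t1.book, t1.price
        FROM table1 AS t1
        WHERE NOT EXISTS (
            SELECT 1 FROM table2 AS t2
            WHERE t2.book = t1.book AND t2.price = t1.price
        )
    """)
    result = con.execute("SELECT book, price FROM table2 ORDER BY book, price").fetchall()
    con.close()
    return result


def not_in_books(rows1, rows2):
    """Return the table1 books that NOT IN considers absent from table2."""
    con = _load(rows1, rows2)
    rows = con.execute(
        "SELECT book FROM table1 WHERE book NOT IN (SELECT book FROM table2) ORDER BY book"
    ).fetchall()
    con.close()
    return [r[0] for r in rows]


TABLE1 = [("A", 100), ("B", 200), ("C", 400), ("D", 300)]
TABLE2 = [("A", 100), ("B", 200), ("C", 400)]

if __name__ == "__main__":
    print(insert_missing(TABLE1, TABLE2))
    print(not_in_books(TABLE1, TABLE2))
    print(not_in_books(TABLE1, TABLE2 + [(None, 500)]))
```

With the question's data only book D is inserted. Once a NULL book is present in table2, the NOT IN query returns no rows at all, which is the pitfall mentioned above.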
I tried it and it looks fine to me.
Maybe you ran the script twice, in which case the second run reports 0 rows created.
Also make sure you committed the row.

SQL Combining two different tables

(P.S. I am still learning SQL and you can consider me a newbie)
I have 2 sample tables as follows:
Table 1
|Profile_ID| |Img_Path|
Table 2
|Profile_ID| |UName| |Default_Title|
My scenario is: from the 2nd table, I need to fetch all the records that contain a certain word, for which I have the following query:
Select Profile_Id,UName from
Table2 Where
Contains(Default_Title, 'Test')
ORDER BY Profile_Id
OFFSET 5 ROWS
FETCH NEXT 20 ROWS ONLY
(Note that i am setting the OFFSET due to requirements.)
Now, as soon as I retrieve a record from the 2nd table, I need to fetch the matching record from the 1st table based on the Profile_Id.
So, I need to return the following 2 results in one single statement:
|Profile_Id| |Img_Path|
|Profile_Id| |UName|
And i need to return the results in side-by-side columns, like :
|Profile_Id| |Img_Path| |UName|
(Note that I had to merge the 2 Profile_Id columns into one, as they both contain the same data.)
I am still learning SQL and I am learning about Union, Join etc., but I am a bit confused as to which way to go.
You can use join:
select t1.*, t2.UName
from table1 t1 join
(select Profile_Id, UName
from Table2
where Contains(Default_Title, 'Test')
order by Profile_Id
offset 5 rows fetch next 20 rows only
) t2
on t2.profile_id = t1.profile_id
Alternatively, a plain inner join (without the paging):
SELECT a.Profile_Id, a.Img_Path, b.UName
FROM table1 a INNER JOIN table2 b ON a.Profile_Id = b.Profile_Id
WHERE CONTAINS(b.Default_Title, 'Test')

SQL - Stripping a string and using it in a condition

So I have a SQL query issue given to me which I'm struggling to resolve:
It currently brings back 6710445 rows but I need to apply further conditions based on a particular string field.
SELECT
Table1.ExampleColumn1 -- (ID)
,Table1.ExampleColumn2
,Table2.ExampleColumn3
,Table2.ExampleColumn4
,Table3.ExampleColumn5
,Table3.ExampleColumn6
,Table1.StringField
FROM [Example Database].[dbo].[Table1] AS Table1
INNER JOIN [Example Database].[dbo].[Table2] AS Table2
ON Example = Example
INNER JOIN [Example Database].[dbo].[Table3] AS Table3
ON Example = Example
WHERE Month BETWEEN 201304 AND 201603
AND (Age < 19)
The 'Table1.StringField' column above holds type codes as a single string in each row, for example: "||J183,Y752,J374,Y752."
I also have a reference table (call it 'Ref1') that lists 514 of these codes individually and has no other fields.
So what I need is to keep only the rows from the query above where any value from 'Ref1' appears anywhere within 'Table1.StringField', and to exclude every other row from the result set.
I tried to strip the commas and "||" from the 'StringField' column, but it didn't work as well as I hoped and ended up bringing back over 30M rows.
Any ideas on how to do this? Preferably efficiently, so the user doesn't wait 10 minutes just to query it.
Maybe this will get you half way there... I also agree with Sean Lange's comment about not storing delimited data to begin with but I'm assuming the OP already knows this. You can also pivot/unpivot this data to achieve this as well. This is probably the most brute force way of doing sort of what you're looking to do.
--DROP TABLE #Table
--DROP TABLE #Ref
CREATE TABLE #Table (Col VARCHAR(MAX))
CREATE TABLE #Ref (Code VARCHAR(10))
INSERT INTO #Table (Col) VALUES ('A123,B234,C345'),('A123'),('C345')
INSERT INTO #Ref (Code) VALUES ('A123'),('B234')
SELECT * FROM #Table
SELECT * FROM #Ref
SELECT DISTINCT t.Col
FROM #Table t
CROSS APPLY (
SELECT CASE WHEN CHARINDEX(r.Code, t.Col) > 0 THEN 1 ELSE 0 END AS [ItsHere] FROM #Ref r) oa
WHERE oa.ItsHere = 1
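One caveat with a bare CHARINDEX test is that a short code also matches inside a longer one (a hypothetical code 'J18' would be found inside 'J183'). A possible refinement, sketched here with Python's built-in sqlite3 module, is to wrap both the cleaned list and the code in commas before searching. The cleaning step assumes every StringField looks like the sample in the question ("||" prefix, comma separators, trailing "."); the table names t and ref and the function name are invented, and instr(haystack, needle) stands in for CHARINDEX(needle, haystack).

```python
import sqlite3


def rows_with_ref_code(string_fields, ref_codes):
    """Return the StringField values that contain at least one whole reference code."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (StringField TEXT)")
    con.execute("CREATE TABLE ref (Code TEXT)")
    con.executemany("INSERT INTO t VALUES (?)", [(s,) for s in string_fields])
    con.executemany("INSERT INTO ref VALUES (?)", [(c,) for c in ref_codes])
    rows = con.execute("""
        SELECT t.StringField
        FROM t
        WHERE EXISTS (
            SELECT 1
            FROM ref
            -- Strip the assumed '||' prefix and '.' suffix, then compare ',CODE,'
            -- against ',CODE1,CODE2,' so that only whole codes can match.
            WHERE instr(
                ',' || replace(replace(t.StringField, '||', ''), '.', '') || ',',
                ',' || ref.Code || ','
            ) > 0
        )
        ORDER BY t.StringField
    """).fetchall()
    con.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(rows_with_ref_code(["||J183,Y752,J374,Y752.", "||X999."], ["Y752"]))
    print(rows_with_ref_code(["||J183,Y752."], ["J18"]))
```

Using EXISTS instead of a join also keeps each source row at most once, so no DISTINCT is needed. The string functions still run for every row and code pair, so this does not address the performance concern on its own.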
What you need to do is keep only the Table1 rows whose StringField contains at least one value from Ref1. Because StringField holds a delimited list, an equality join on Table1.StringField = Ref1.Ref_1_value would never match, so test for containment with CHARINDEX inside an EXISTS (Ref_1_value is an assumed name for the single Ref1 column). Like this:
SELECT
Table1.ExampleColumn1 -- (ID)
,Table1.ExampleColumn2
,Table2.ExampleColumn3
,Table2.ExampleColumn4
,Table3.ExampleColumn5
,Table3.ExampleColumn6
,Table1.StringField
FROM [Example Database].[dbo].[Table1] AS Table1
INNER JOIN [Example Database].[dbo].[Table2] AS Table2
ON Example = Example
INNER JOIN [Example Database].[dbo].[Table3] AS Table3
ON Example = Example
WHERE Month BETWEEN 201304 AND 201603
AND (Age < 19)
-- keep the row only if at least one reference code occurs in StringField
AND EXISTS (
SELECT 1
FROM [Example Database].[dbo].[Ref1] AS Ref1
WHERE CHARINDEX(Ref1.Ref_1_value, Table1.StringField) > 0
)

SQL Sybase Query Strange Behaviour

I've got 2 tables with exactly the same structure in the same Sybase database but they're separate tables.
This query works on one of the 2:
select * from table1 where
QUOTA_FIELD >
(SELECT
count(ACCOUNT) FROM
table1 As t1
where SECTOR = t1.SECTOR
AND
STATUS = 'QUOTA'
)
But for the second table I have to change it to this:
select * from table2 as tref where
QUOTA_FIELD >
(SELECT
count(ACCOUNT) FROM
table2 As t2
where tref.SECTOR = t2.SECTOR
AND
STATUS = 'QUOTA'
)
There's a restriction on where this will execute which means it needs to work like in the first query.
Does anyone have any ideas as to why the first might work as expected and the second wouldn't?
Since I am not yet allowed to comment, here as an answer to the question "does anyone...?":
No. I couldn't find anyone :)
The first query cannot work correctly, since it compares a column with itself (as long as the column names are all normal ASCII characters and not some similar-looking Unicode ones). Please show that the result of this query is in every case the same as that of query 2.
Also, the second query would normally be written like this: where SECTOR = tref.SECTOR...
You might be looking for something like this in query #1 :
select * from table1 t2 where
QUOTA_FIELD >
(SELECT
count(ACCOUNT) FROM
table1 As t1
where t2.SECTOR = t1.SECTOR
AND
t1.STATUS = 'QUOTA'
)
This explicitly specifies that the table in the subquery is correlated with the table in the outer query (a correlated subquery).
If this works, use the same idea in query #2
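The difference between the unqualified and the qualified form can be reproduced on a few invented rows. The sketch below runs in SQLite through Python's sqlite3 module, not in Sybase, so it only illustrates the usual name-resolution rule (an unqualified column binds to the innermost query that has it). It cannot explain why the two Sybase tables behave differently. The sample data and the names ROWS, UNQUALIFIED, CORRELATED and run are assumptions made for the demonstration.

```python
import sqlite3

# (ACCOUNT, SECTOR, STATUS, QUOTA_FIELD), invented sample data
ROWS = [
    (1, "A", "QUOTA", 2),
    (2, "A", "QUOTA", 2),
    (3, "B", "QUOTA", 2),
    (4, "B", "OPEN", 2),
]

UNQUALIFIED = """
    SELECT ACCOUNT FROM table1
    WHERE QUOTA_FIELD > (
        SELECT count(ACCOUNT) FROM table1 AS t1
        WHERE SECTOR = t1.SECTOR       -- both sides bind to t1: true for every non-NULL sector
          AND STATUS = 'QUOTA'
    )
    ORDER BY ACCOUNT
"""

CORRELATED = """
    SELECT ACCOUNT FROM table1 AS tref
    WHERE QUOTA_FIELD > (
        SELECT count(ACCOUNT) FROM table1 AS t1
        WHERE tref.SECTOR = t1.SECTOR  -- outer row's sector compared with inner rows
          AND t1.STATUS = 'QUOTA'
    )
    ORDER BY ACCOUNT
"""


def run(query):
    """Execute one of the queries above against the sample rows and return the accounts."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE table1 (ACCOUNT INTEGER, SECTOR TEXT, STATUS TEXT, QUOTA_FIELD INTEGER)"
    )
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", ROWS)
    accounts = [r[0] for r in con.execute(query)]
    con.close()
    return accounts


if __name__ == "__main__":
    print(run(UNQUALIFIED))
    print(run(CORRELATED))
```

In the unqualified form the subquery counts all three QUOTA rows regardless of sector, so no account has a quota above the count. In the correlated form sector A counts 2 and sector B counts 1, so only the sector B accounts (3 and 4) are returned.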

MS SQL - Joining on two tables with a substringed key in one column

I have 2 tables I need to join; however, on one of the tables I need to extract a key from a varchar field in each row.
Table 1 Description (numeric 18,varchar 4000)
descriptionid description
1 Blah Blah: Queue 1Blah Blah
2 foobar:Queue 2
3 rem:Queue 2 -This is a note
4 Anotherrow: Queue 3
5 Something else
Table 2 Queue - (numeric 18, varchar 100)
queueid queue
123 Queue 1
124 Queue 2
127 Queue 3
129 Queue 4
So I need to produce the output like so
View 3 Queue-Description (numeric 18, numeric 18)
descriptionid queueid
1 123
2 124
3 124
4 127
5 null
So in table 1 row 1, I need to strip out the value Queue 1 from the description, verify it is in the queue table, and look up the queueid.
I am unable to change the structure of tables 1 and 2.
What ways can this be achieved in MSSQL?
What is the most efficient way to do this in SQL - using MSSQL 2005 here.
"most efficient way"
Well... don't know about that but it is a way.
select T1.descriptionid,
T2.queueid
from Table1 as T1
left outer join Table2 as T2
on T1.description like '%'+T2.queue+'%'
Another way
select T1.descriptionid,
T2.queueid
from Table1 as T1
left outer join Table2 as T2
on charindex(T2.queue, T1.description, 1) > 0
If there is more than one match (see comment by Ed Harper), you can use this to pick the one with the longest match.
select T1.descriptionid,
T2.queueid
from Table1 as T1
outer apply (
select top 1 T3.queueid
from Table2 as T3
where charindex(T3.queue, T1.description, 1) > 0
order by len(T3.queue) desc
) as T2(queueid)
The most efficient way to do this is to add an extra column to your table and store the ID extracted from the string in it. You can do this when rows are added, and you can process the existing ones fairly easily. Trying to left join like this will be very slow.
In SQL Server 2005 you can extract your queue string using regex (through CLR integration, since T-SQL has no built-in regular expression functions). The Data Extraction section on this page contains an example.
In a stored procedure you can then build an indexed temp table that contains a new column (this allows you to do this without changing the table metadata).
If you can change the table metadata you can:
Trigger the content into another column (on insert).
Or if the information is not needed immediately a daily sql job could extract the information.
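The indexed temp table variant might look roughly like the following. It is a sketch in SQLite via Python's sqlite3 module, not SQL Server 2005 code: instr(haystack, needle) replaces CHARINDEX, a plain substring search replaces the regex extraction, and the names desc_queue, ix_desc_queue and description_queue_pairs are invented. The expensive string matching runs once while the lookup table is built, after which the join is a simple equality on descriptionid.

```python
import sqlite3


def description_queue_pairs():
    """Build a temp lookup table of (descriptionid, queueid) and join through it."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Description (descriptionid INTEGER, description TEXT);
        CREATE TABLE Queue (queueid INTEGER, queue TEXT);
        INSERT INTO Description VALUES
            (1, 'Blah Blah: Queue 1Blah Blah'),
            (2, 'foobar:Queue 2'),
            (3, 'rem:Queue 2 -This is a note'),
            (4, 'Anotherrow: Queue 3'),
            (5, 'Something else');
        INSERT INTO Queue VALUES
            (123, 'Queue 1'), (124, 'Queue 2'), (127, 'Queue 3'), (129, 'Queue 4');

        -- One-off extraction into a temp table; the base tables stay unchanged.
        CREATE TEMP TABLE desc_queue AS
        SELECT d.descriptionid,
               (SELECT q.queueid
                FROM Queue AS q
                WHERE instr(d.description, q.queue) > 0
                ORDER BY length(q.queue) DESC
                LIMIT 1) AS queueid
        FROM Description AS d;
        CREATE INDEX ix_desc_queue ON desc_queue (descriptionid);
    """)
    rows = con.execute("""
        SELECT d.descriptionid, dq.queueid
        FROM Description AS d
        JOIN desc_queue AS dq ON dq.descriptionid = d.descriptionid
        ORDER BY d.descriptionid
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in description_queue_pairs():
        print(row)
```

On the question's sample data this reproduces the requested view, including the NULL queueid for description 5.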