Find text of a column in other column with T-SQL - sql

I need to find the text from a column in one table inside a column of another table.
Imagine that you have these columns:
And you want to find the COMPLETE text of [A].X in [B].Y
and to discover where the match occurs. The rows highlighted in yellow show the expected matches:
I have been thinking of using the CONTAINS function, but I don't think it is the best idea, because you have to write out the text that you want to find (instead of using the complete text of a column).
CONTAINS T-SQL
I thought that it could be like this:
Use AdventureWorks2012;
GO
SELECT [B].Y
FROM Production.Product
WHERE CONTAINS(([A].Y), [A].X);
But it doesn't work.
Which is the best option?
I am using SQL SERVER V17.0
Thanks!!!

I would go for LIKE:
select b.y, a.x
from b join
a
on b.y like '%' + a.x + '%' ;
There is not, however, a really efficient way to do this logic in SQL Server.
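As a minimal sketch of the LIKE join above, the following uses two hypothetical table variables (@A and @B, with made-up sample rows) in place of the real tables. It also adds CHARINDEX to report where the match starts, which is an assumption about what "where the match occurs" means in the question. Note that if X contains LIKE wildcard characters such as % or _, LIKE treats them as wildcards, while CHARINDEX treats them literally.

```sql
-- Hypothetical sample data standing in for tables A and B
DECLARE @A TABLE (X VARCHAR(50));
DECLARE @B TABLE (Y VARCHAR(200));

INSERT INTO @A (X) VALUES ('sugar'), ('tea'), ('door');
INSERT INTO @B (Y) VALUES ('I eat sugar'), ('Open the door'), ('I have a car');

-- One row per (A, B) pair where the complete text of X occurs inside Y;
-- CHARINDEX returns the 1-based position of the first occurrence
SELECT b.Y,
       a.X,
       CHARINDEX(a.X, b.Y) AS MatchPosition
FROM @B AS b
JOIN @A AS a
  ON b.Y LIKE '%' + a.X + '%';
```

Rows of @B with no matching X (here 'I have a car') are simply absent from the result; a LEFT JOIN would keep them with NULLs.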

Here is a quick example searching a list of strings for a particular set of words defined in another table. This example just searches through the system error messages text and looks for the words 'overflow' and 'CryptoAPI', but you'll substitute the words table with your 'A' and the 'sys.messages' table with your table 'B'
NOTE: this isn't the most efficient way to search large amounts of text.
-- CREATE TEMP TABLE WITH WORDS TO MATCH
CREATE TABLE #words (
[Word] nvarchar(100)
)
-- SAMPLE STRINGS
INSERT INTO #words VALUES ('overflow')
INSERT INTO #words VALUES ('CryptoAPI')
-- SEARCH THROUGH SYSTEM ERROR MESSAGES FOR SAMPLE STRINGS
SELECT [W].[Word] AS 'Matched word'
, [M].[text]
FROM [sys].[messages] AS [M]
JOIN #words AS [W]
ON [M].[text] LIKE '%' + [W].[Word] + '%'

CREATE TABLE #TempA
(ColumnX VARCHAR(10)
);
CREATE TABLE #TempB
(ColumnY VARCHAR(100)
);
INSERT #TempA
VALUES('fish'),('burguer'),('sugar'),('tea'),('coffee'),('window'),('door');
INSERT #TempB
VALUES('I like potatoes'),('I eat sugar'),('I eat sugar with onions'), ('I have a car'),('I don''t like dogs');
SELECT *
FROM #TempB b
WHERE EXISTS(SELECT 1 FROM #TempA a WHERE CHARINDEX(a.ColumnX, b.ColumnY,1) > 0);

I agree with @GordonLinoff. LIKE is what I would also go for, but I want to make one improvement to the answer provided.
CREATE TABLE #TempA (ColumnX VARCHAR(10));
CREATE TABLE #TempB (ColumnY VARCHAR(100));
INSERT #TempA
VALUES ('fish')
,('burguer')
,('sugar')
,('tea')
,('coffee')
,('window')
,('door');
INSERT #TempB
VALUES ('I steam potatoes')
, ('I like potatoes')
,('I eat sugar')
,('I eat sugar with onions')
,('I have a car coffee')
,('I don''t like dogs')
,('Window is clean')
,('Open the door')
;
SELECT b.ColumnY
,a.ColumnX
FROM #TempB b
INNER JOIN #TempA a ON ' '+ b.ColumnY + ' ' LIKE '% ' + a.ColumnX + ' %'
This prevents 'tea' from being found inside 'steam', because the search word must be surrounded by spaces.

Related

SQL Server : exploding CSV for SELECT statement

I have a table structure as below;
id  txtName   intReferences
------------------------------
1   Fred      1,4,6,444,56,43,
2   Sam       5,33,5904,43
3   Tom       1200
4   Samantha  43,44,888,99
I'd like to write a T-SQL query to return all the records based on a series of numbers provided.
For example, querying for 43 would return Fred, Sam and Samantha. The catch is, when querying for 3, it shouldn't return results for Sam or Samantha, given that that isn't the number in its entirety. Looking for a direct and whole number match.
The CSV value may end in a comma.
I've tried to use the "IN" statement, but it returns results if any portion of the number exists. Ideally trying to achieve without creating a function given some database restrictions.
Use string_split() (available in SQL Server 2016 and later):
select t.*
from t cross apply
string_split(t.intReferences, ',') s
where s.value = '3';
Then, fix your data model so you are not storing integer values in strings. This is bad, bad, bad. Here are some reasons why:
Numbers should be stored as numbers, not strings (using the correct type).
SQL Server has lousy string manipulation functions.
Only one value should be stored in a column.
Foreign key relationships should be properly declared.
Resulting queries cannot be optimized to use indexes or partitions.
SQL has a great way to store lists. It is called a table not a string.
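To illustrate the advice about storing lists in a table, here is a rough sketch of a normalized layout. The names @People and @PersonReferences and their columns are hypothetical, and table variables are used only so the example runs on its own. With real tables you would also declare a foreign key from the reference table to the people table.

```sql
-- Hypothetical normalized layout: one row per (person, reference) pair
DECLARE @People TABLE (id INT PRIMARY KEY, txtName VARCHAR(200));
DECLARE @PersonReferences TABLE (
    personId  INT NOT NULL,
    reference INT NOT NULL,
    PRIMARY KEY (personId, reference)
);

INSERT INTO @People VALUES (1, 'Fred'), (2, 'Sam'), (3, 'Tom'), (4, 'Samantha');
INSERT INTO @PersonReferences VALUES
    (1, 1), (1, 4), (1, 6), (1, 444), (1, 56), (1, 43),
    (2, 5), (2, 33), (2, 5904), (2, 43),
    (3, 1200),
    (4, 43), (4, 44), (4, 888), (4, 99);

-- Whole-number match: 43 finds Fred, Sam and Samantha; 3 would find nobody
SELECT p.id, p.txtName
FROM @People AS p
JOIN @PersonReferences AS r ON r.personId = p.id
WHERE r.reference = 43;
```

Because reference is an INT column, the comparison is an exact numeric match and can be supported by an index.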
Clearly, the best way to accommodate this situation is to have properly normalized data.
Another method for querying the data with the current structure would be to check for comma + (your number) + comma. Something like this...
Declare @Temp Table(id int, txtName varchar(200), intReferences varchar(200))
Insert Into @Temp Values(1, 'Fred', '1,4,6,444,56,43,')
Insert Into @Temp Values(2, 'Sam', '5,33,5904,43')
Insert Into @Temp Values(3, 'Tom', '1200')
Insert Into @Temp Values(4, 'Samantha', '43,44,888,99')
Select *
From @Temp
Where ',' + intReferences + ',' like '%,' + '43' + ',%'
Select *
From @Temp
Where ',' + intReferences + ',' like '%,' + '3' + ',%'

How to apply trim function inside this query [duplicate]

Below is a simple SQL query to select records using an IN condition.
--like this I have 6000 usernames
select * from tblUsers where Username in ('abc ','xyz ',' pqr ',' mnop ' );
I know there are LTrim & RTrim in SQL to remove leading and trailing spaces from the left & right respectively.
I want to remove the spaces from the left & right of all the usernames that I am supplying to the select query.
Note:-
I want to trim the values that I am passing in the IN clause. (I don't want to apply LTrim & RTrim to each value passed.)
There are no trailing spaces in the records, but the values that I am passing in the IN clause are copied from Excel and then pasted into Visual Studio. Then, using the ALT key, I put a ' (single quote) on the left & right sides of each string. Because of this, some strings have trailing spaces on the right side.
How to use the trim function in the select query?
I am using MS SQL Server 2012
If I understand your question correctly you are pasting from Excel into an IN clause in an adhoc query as below.
The trailing spaces don't matter. It will still match the string foo without any trailing spaces.
But you need to ensure that there are no leading spaces.
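The comparison behaviour described here can be checked with a small standalone query. This is only an illustrative sketch; the literals are made up and no table is involved.

```sql
-- Trailing spaces are ignored by = and IN; leading spaces are not
SELECT CASE WHEN 'foo' = 'foo   '       THEN 'match' ELSE 'no match' END AS TrailingSpaces,
       CASE WHEN 'foo' = '   foo'       THEN 'match' ELSE 'no match' END AS LeadingSpaces,
       CASE WHEN 'foo' IN ('foo ', 'x') THEN 'match' ELSE 'no match' END AS InList;
```

SQL Server pads the shorter string with spaces before comparing with = or IN, so only the leading-space case fails to match. LIKE does not follow this padding rule, so the same shortcut should not be assumed there.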
As the source of the data is Excel why not just do it all there?
You can use formula
= CONCATENATE("'",TRIM(SUBSTITUTE(A1,"'","''")),"',")
Then copy the result (from column B in the screenshot above) and just trim off the extra comma from the final entry.
You can do like this:
select * from tblUsers where LTRIM(RTRIM(Username)) in (ltrim(rtrim('abc')),ltrim(rtrim('xyz')),ltrim(rtrim('pqr')),ltrim(rtrim('mnop')));
However, if you have permission to update the database, please remove all the spaces in your Username field. It is really not good to query like this.
One way to tackle your problem and still be able to benefit from an index on username is to use a persisted computed column:
Setup
-- drop table dbo.tblUsers
create table dbo.tblUsers
(
UserId INT NOT NULL IDENTITY(1, 1) CONSTRAINT PK_UserTest PRIMARY KEY,
Username NVARCHAR(64) NOT NULL,
UsernameTrimmed AS LTRIM(RTRIM(Username)) PERSISTED
)
GO
-- other columns may be included here with INCLUDE (col1, col2)
CREATE INDEX IDX_UserTest ON dbo.tblUsers (UsernameTrimmed)
GO
insert into dbo.tblUsers (Username) VALUES ('abc '),('xyz '),(' pqr '), (' mnop '), ('abc'), (' useradmin '), ('etc'), (' other user ')
GO
-- some mock data to obtain a large number of records
insert into dbo.tblUsers (Username)
select top 20000 SUBSTRING(text, 1, 64) from sys.messages
GO
Test
-- this will use the index (index seek)
select * from tblUsers where UsernameTrimmed in (LTRIM(RTRIM('abc')), LTRIM(RTRIM(' useradmin ')));
This allows for faster retrievals at the expense of extra space.
In order to get rid of query construction (and the ugliness of many LTRIMs and RTRIMs), you can push searched users in a table that looks like tblUsers.
create table dbo.searchedUsers
(
Username NVARCHAR(64) NOT NULL,
UsernameTrimmed AS LTRIM(RTRIM(Username)) PERSISTED
)
GO
Push raw values into dbo.searchedUsers.Username column and the query should look like this:
select U.*
from tblUsers AS U
join dbo.searchedUsers AS S ON S.UsernameTrimmed = U.UsernameTrimmed
The big picture
It is much better to trim your data properly in the service layer of your application (C#) so that future clients of your table can rely on clean information. So, trimming should be performed both when inserting information into tblUsers and when searching for users (IN values).
select *
from tblUsers
where RTRIM(LTRIM(Username)) in ('abc','xyz','pqr','mnop');
Answer: SELECT * FROM tblUsers WHERE LTRIM(RTRIM(Username)) in ('abc','xyz','pqr','mnop');
However, please note that functions in your WHERE clause defeat the purpose of having an index on that column; the query will use a scan rather than a seek.
I would propose that you clean your data before inserting it into tblUsers.
I think you can try this:
Just replace table2 with the name of the table from which you are getting the usernames.
-- IN compares against one row per value, so return the trimmed names as rows
select * from tblUsers where Username in (select distinct RTRIM(LTRIM(t1.Username))
from table2 t1);
I'd do it in two steps:
1) populate a temp table with all your strings with blanks
2) do a select with a subselect
create table a (a char(1))
insert into a values('a')
insert into a values('b')
insert into a values('c')
insert into a values('d')
create table #b (atmp char(5))
insert into #b values ('a ')
insert into #b values (' b')
insert into #b values (' c ')
select * from a where a in (select ltrim(rtrim(atmp)) from #b)

How to manipulate comma-separated list in SQL Server

I have a list of values such as
1,2,3,4...
that will be passed into my SQL query.
I need to have these values stored in a table variable. So essentially I need something like this:
declare @t table (num int)
insert into @t values (1),(2),(3),(4)...
Is it possible to do that formatting in SQL Server? (turning 1,2,3,4... into (1),(2),(3),(4)...)
Note: I cannot change what those values look like before they get to my SQL script; I'm stuck with that list. Also, it may not always be 4 values; it could be 1 or more.
Edit to show what values look like: under normal circumstances, this is how it would work:
select t.pk
from a_table t
where t.pk in (#place_holder#)
#placeholder# is just a literal placeholder. When someone runs the report, #placeholder# is replaced with the literal values from that report's filter:
select t.pk
from a_table t
where t.pk in (1,2,3,4) -- or whatever the user selects
t.pk is an int
note: doing
declare @t as table (
num int
)
insert into @t values (#Placeholder#)
does not work.
Your description is a bit ridiculous, but you might give this a try:
Regarding what you mean by this:
I see what you're trying to say; but if I type out '#placeholder#' in the script, I'll end up with '1','2','3','4' and not '1,2,3,4'
I assume this is a string of numbers, each number between single quotes, separated by commas:
DECLARE @passedIn VARCHAR(100)='''1'',''2'',''3'',''4'',''5'',''6'',''7''';
SELECT @passedIn; -->: '1','2','3','4','5','6','7'
Now the variable @passedIn holds exactly what you are talking about.
I'll use a dynamic SQL statement to insert this into a temp table (a declared table variable would not work here...)
CREATE TABLE #tmpTable(ID INT);
DECLARE @cmd VARCHAR(MAX)=
'INSERT INTO #tmpTable(ID) VALUES (' + REPLACE(SUBSTRING(@passedIn,2,LEN(@passedIn)-2),''',''','),(') + ');';
EXEC (@cmd);
SELECT * FROM #tmpTable;
GO
DROP TABLE #tmpTable;
UPDATE 1: no dynamic SQL necessary, all ad-hoc...
You can get the list of numbers as derived table in a CTE easily.
This can be used in a following statement like WHERE SomeID IN(SELECT ID FROM MyIDs) (similar to this: dynamic IN section )
WITH MyIDs(ID) AS
(
SELECT A.B.value('.','int') AS ID
FROM
(
SELECT CAST('<x>' + REPLACE(SUBSTRING(@passedIn,2,LEN(@passedIn)-2),''',''','</x><x>') + '</x>' AS XML) AS AsXml
) as tbl
CROSS APPLY tbl.AsXml.nodes('/x') AS A(B)
)
SELECT * FROM MyIDs
UPDATE 2:
And to answer your question exactly:
With this following the CTE
insert into @t(num)
SELECT ID FROM MyIDs
... you would actually get your declared table variable filled - if you need it later...
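On newer versions there is a shorter route that avoids both dynamic SQL and XML. The following is a sketch only, assuming SQL Server 2016 or later (database compatibility level 130 or higher), where STRING_SPLIT is available. The sample value of @passedIn is made up and mirrors the quoted list discussed above.

```sql
-- Assumes SQL Server 2016+ (database compatibility level 130 or higher)
DECLARE @passedIn VARCHAR(100) = '''1'',''2'',''3'',''4''';
DECLARE @t TABLE (num INT);

-- Strip the single quotes, split on commas, and convert each piece to INT
INSERT INTO @t (num)
SELECT CAST(s.value AS INT)
FROM STRING_SPLIT(REPLACE(@passedIn, '''', ''), ',') AS s;

SELECT num FROM @t;
```

If the list arrives without quotes (1,2,3,4), the REPLACE is harmless and the same statement still applies.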

How to minimize sql select?

I have an array of words like this one:
$word1 = array('test1','test2','test3','test4','test5',...,'test20');
I need to search in my table every row that has at least one of these words in the text column. So far, I have this sql query:
SELECT * FROM TABLE WHERE text LIKE '$word1[0]' OR text LIKE '$word1[1]'
OR ... OR text LIKE '$word1[20]'
But I see that this design isn't very efficient. Is there any way I can shorten this query, in such a way that I don't need to write out every word in the where clause?
Example SELECT * FROM TABLE WHERE text IN ($word1)
P.S.: this is an example of what I'm looking for, not an actual query I can run.
If you use a table variable instead of a list to store your words then you can use something like:
DECLARE @T TABLE (Word VARCHAR(255) NOT NULL);
INSERT @T (Word)
VALUES ('test1'), ('test2'), ('test3'), ('test4'), ('test5'), ('test20');
SELECT *
FROM TABLE t
WHERE EXISTS
( SELECT 1
FROM @T
WHERE t.Text LIKE '%' + word + '%'
);
You can also create a table type to store this, then you can pass this as a parameter to a stored procedure if required:
CREATE TYPE dbo.StringList AS TABLE (Value VARCHAR(MAX) NOT NULL);
GO
CREATE PROCEDURE dbo.YourProcedure @Words dbo.StringList READONLY
AS
SELECT *
FROM TABLE t
WHERE EXISTS
( SELECT 1
FROM @Words w
WHERE t.Text LIKE '%' + w.Value + '%'
);
GO
DECLARE @T dbo.StringList;
INSERT @T (Value)
VALUES ('test1'), ('test2'), ('test3'), ('test4'), ('test5'), ('test20');
EXECUTE dbo.YourProcedure @T;
For more on this see table-valued Parameters on MSDN.
EDIT
I may have misunderstood your requirements, as you used LIKE with no wildcard operator. In that case you can just use IN; however, I would still recommend using a table to store your values:
DECLARE @T TABLE (Word VARCHAR(255) NOT NULL);
INSERT @T (Word)
VALUES ('test1'), ('test2'), ('test3'), ('test4'), ('test5'), ('test20');
SELECT *
FROM TABLE t
WHERE t.Text IN (SELECT Word FROM @T);
You can use a SELECT like this without declaring an array:
SELECT * FROM TABLE WHERE text IN ('test1', 'test2', 'test3', 'test4', 'test5')
One solution could be:
Create a table in the database with the searched words in a column called word (for example), including wildcards if you need them.
Then use this kind of query:
SELECT *
FROM TABLE, FILTER_TABLE
WHERE TABLE.text LIKE FILTER_TABLE.word
Although I don't have access to SQL Server 2008 at the moment and SQLfiddle seems sick, it would seem you can use a table value constructor to simplify the expression somewhat;
SELECT * FROM test
JOIN (SELECT w FROM (VALUES('word1'), ('word2'), ('word3'), ('word4')) AS a(w)) a
ON test.text LIKE '%'+a.w+'%';
...which will search the text column in the test table for the words listed as values. If you don't want duplicates of rows where multiple words match, you can just add a DISTINCT to the select.
Note, though, that you may want to look into full-text indexing if you're doing extensive searches. A LIKE query with a leading wildcard cannot use an index seek, so finding words in a string this way will most likely be quite slow unless the data is already in memory.

LIKE with Multiple Consecutive White Spaces

I have following query with LIKE predicate in SQL Server 2012. It replaces white spaces with %. I have two records in the table.
DECLARE @MyTable TABLE (ITMEID INT, ITMDESC VARCHAR(100))
INSERT INTO @MyTable VALUES (1,'Healty and Alive r')
INSERT INTO @MyTable VALUES (2, 'A liver patient')
DECLARE @SearchCriteria VARCHAR(100)
SET @SearchCriteria = 'Alive'
SELECT *
FROM @MyTable
WHERE (ITMDESC LIKE '%'+REPLACE(@SearchCriteria,' ','%')+'%' ESCAPE '\')
I got this query from a friend to treat multiple consecutive white spaces as a single space. The challenge is I don't see any reference for this.
Is there a pitfall in the approach?
REPLACE(@SearchCriteria,' ','%') returns Alive unchanged here, because the search term contains no spaces. There is no Alive word in the second row, therefore it's not returned.
In fact, the WHERE clause will look like this: WHERE (ITMDESC LIKE '%Alive%' ESCAPE '\')
The second row doesn't meet it.
Probably, you want something like this:
SELECT *
FROM @MyTable
WHERE (REPLACE(ITMDESC,' ','') LIKE '%'+@SearchCriteria+'%' ESCAPE '\')
You can use it as below:
DECLARE @MyTable TABLE (ITMEID INT, ITMDESC VARCHAR(100))
INSERT INTO @MyTable VALUES (1,'Healty and Alive r')
INSERT INTO @MyTable VALUES (2, 'A liver patient')
DECLARE @SearchCriteria VARCHAR(100)
SET @SearchCriteria = 'Alive'
SELECT *
FROM @MyTable
WHERE (REPLACE(ITMDESC,' ','') LIKE '%'+@SearchCriteria+'%' ESCAPE '\')
It will return both records, as you want.
The simplest solution is to replace all spaces with some moniker and then replace that moniker with a single space.
Select Replace(Replace(ItmDesc, ' ', '<z>'), '<z>', ' ')
From MyTable
SQL Fiddle version
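For reference, one widely used way to collapse runs of spaces is a three-step nested REPLACE. This is offered as a sketch of that general technique, not necessarily what the answer above intended. It assumes the characters < and > never occur in the data, and the sample string is made up.

```sql
-- Assumes '<' and '>' do not appear in the text being cleaned
DECLARE @s VARCHAR(100) = 'Healthy   and    alive';

-- 1) turn every space into the pair '<>'
-- 2) remove every '><', which leaves a single '<>' per run of spaces
-- 3) turn each remaining '<>' back into one space
SELECT REPLACE(REPLACE(REPLACE(@s, ' ', '<>'), '><', ''), '<>', ' ') AS Collapsed;
```

The same expression can be applied to a column such as ITMDESC before comparing it with a search term that has been normalized in the same way.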