Include wildcards in SQL Server in the values themselves, not when searching with LIKE

Is there a way to include wildcards in sql server in the values themselves - not when searching with LIKE?
I have a database that users search on. They search for model numbers that contain different wildcard characters but do not know that these wildcard characters exist.
For example, a model number may be 123*abc in the database, but the user will search for 1234abc because that's what they see for their model number on their unit at home.
I'm looking for a way to allow users to search without knowledge of wildcards but have a systematic way to include model numbers with wildcard characters in the database.

What you could do is add a PERSISTED computed column to your table that holds a valid pattern expression for SQL Server. You stated that * should be any letter or numerical character, and that comma-delimited values in parentheses can be any one of those characters. Provided that neither commas nor parentheses appear in your main data, this should work:
USE Sandbox;
GO
CREATE TABLE SomeTable (SomeString varchar(15));
GO
INSERT INTO SomeTable
VALUES('123abc'),
('abc*987'),
('def(q,p,r,1)555');
GO
ALTER TABLE SomeTable ADD SomeString_Exp AS REPLACE(REPLACE(REPLACE(REPLACE(SomeString,'*','[0-9A-Za-z]'),'(','['),')',']'),',','') PERSISTED; --What you're interested in
SELECT *
FROM SomeTable;
GO
DECLARE @String varchar(15) = 'defp555';
SELECT *
FROM SomeTable
WHERE @String LIKE SomeString_Exp; --And how to search
GO
DROP TABLE SomeTable;
If * is any character, and not just any alphanumeric, then you could shorten the whole thing to the following (provided you're on SQL Server 2017 or later):
ALTER TABLE SomeTable ADD SomeString_Exp AS REPLACE(TRANSLATE(SomeString,'*()','_[]'),',','') PERSISTED;
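To sanity-check stored model numbers before they go into the table, the same translation could be mirrored in client code. This is only a rough Python sketch, assuming * means exactly one alphanumeric character and a parenthesized comma list means exactly one of the listed characters. The names pattern_to_regex and matches are hypothetical and not part of the answer above.

```python
import re


def pattern_to_regex(stored: str) -> re.Pattern:
    """Translate a stored model number such as 'def(q,p,r,1)555' into a regex.

    Assumptions (mirroring the REPLACE chain above):
      *        -> exactly one alphanumeric character
      (a,b,c)  -> exactly one of the listed characters
    A '(' without a closing ')' raises ValueError from str.index.
    """
    parts = []
    i = 0
    while i < len(stored):
        ch = stored[i]
        if ch == "*":
            parts.append("[0-9A-Za-z]")
            i += 1
        elif ch == "(":
            end = stored.index(")", i)
            options = stored[i + 1:end].split(",")
            parts.append("[" + "".join(re.escape(o) for o in options) + "]")
            i = end + 1
        else:
            # Everything else is a literal character.
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def matches(stored: str, user_input: str) -> bool:
    """Return True if the user's model number fits the stored pattern."""
    return pattern_to_regex(stored).fullmatch(user_input) is not None
```

For example, matches('def(q,p,r,1)555', 'defp555') is true, while matches('abc*987', 'abc987') is false because * must consume one character.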

I'm thinking either:
where @model_number like replace(model_number, '*', '%')
or
where @model_number like replace(model_number, '*', '_')
Depending on whether '*' means any string (first example) or exactly one character (second example).

Related

LIKE operator and % wildcard when string contains underscore

I have a table in SQL Server that stores codes. Depending on the nomenclature, some begin with 'DB_' and others with 'DBL_'. I need a way to filter the ones that start with 'DB_', since when I try to do it, it returns all the results.
CREATE TABLE CODES(Id integer PRIMARY KEY, Name Varchar(20));
INSERT INTO CODES VALUES(1,'DBL_85_RC001');
INSERT INTO CODES VALUES(2,'DBL_85_RC002');
INSERT INTO CODES VALUES(3,'DBL_85_RC003');
INSERT INTO CODES VALUES(4,'DB_20_SE_RC010');
INSERT INTO CODES VALUES(5,'DB_20_SE_RC011');
SELECT * FROM CODES where Name like 'DB_%';
The result that returns:
1|DBL_85_RC001
2|DBL_85_RC002
3|DBL_85_RC003
4|DB_20_SE_RC010
5|DB_20_SE_RC011
Expected result:
4|DB_20_SE_RC010
5|DB_20_SE_RC011
_ is a wildcard for a single character in a LIKE expression. Thus both 'DB_' and 'DBL' are LIKE 'DB_'. If you want a literal underscore you need to put it in brackets ([]):
SELECT *
FROM CODES
WHERE [Name] LIKE 'DB[_]%';
The underscore is a wildcard in SQL Server. You can escape it:
where name like 'DB$_%' escape '$'
You could also use left():
where left(name, 3) = 'DB_'
However, this is not index- or optimizer-friendly, because wrapping the column in a function prevents an index seek.

SQL Server stored procedure to search list of values without special characters

What is the most efficient way to search a column and return all matching values while ignoring special characters?
For example, if a table has a part_number column with the values '10-01', '14-02-65' and '345-23423', and the user searches for '10_01' and '140265', it should return '10-01' and '14-02-65'.
Processing the input with a regex to remove those characters is possible, so the stored procedure could be passed a parameter '1001 140265' and then split that input to form a SQL statement like
SELECT *
FROM MyTable
WHERE part_number IN ('1001', '140265')
The problem here is that this will not match anything. In this case the following would work
SELECT *
FROM MyTable
WHERE REPLACE(part_number,'-','') IN ('1001', '140265')
But I need to remove all special characters, or at the very least all of these characters ~!@#$%^&*()_+?/\{}[]; with a REPLACE for each of those characters, the query takes several minutes when the number of parts in the IN clause is less than 200.
Performance is improved by creating a function that does the replaces, so the query takes less than a minute. But without the removals the query takes around 1 second. Is there any way to create some kind of functional index that will work on multiple SQL Server engines?
You could use a computed column and index it:
CREATE TABLE MyTable (
part_number VARCHAR(10) NOT NULL,
part_number_int AS CAST(replace(part_number, '-', '') AS int)
);
ALTER TABLE dbo.MyTable ADD PRIMARY KEY (part_number);
ALTER TABLE dbo.MyTable ADD UNIQUE (part_number_int);
INSERT INTO dbo.MyTable (part_number)
VALUES ('100-1'), ('140265');
This query can now seek on the computed column efficiently:
SELECT *
FROM dbo.MyTable AS MT
WHERE MT.part_number_int IN ('1001', '140265');
Of course your replace statement will be more complex and you'll have to sanitize user input the same way you sanitize column values. But this is going to be the most efficient way to do it.
But to be honest, I'd just create a separate column to store cleansed values for querying purposes and keep the actual values for display. You'll have to maintain that column in your insert/update statements, but that's a small price to pay.
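Sanitizing the user input the same way as the column values could be done in client code before the query is built. A minimal Python sketch, assuming that "special character" means anything other than an ASCII letter or digit and that search terms are separated by whitespace; the function names are hypothetical.

```python
import re
from typing import List

# Assumption: everything except ASCII letters and digits is "special".
_NON_ALNUM = re.compile(r"[^0-9A-Za-z]")


def normalize_part_number(raw: str) -> str:
    """Strip every non-alphanumeric character from one part number."""
    return _NON_ALNUM.sub("", raw)


def normalize_search_terms(user_input: str) -> List[str]:
    """Split whitespace-separated input and normalize each term.

    Terms that become empty after stripping are dropped.
    """
    terms = (normalize_part_number(t) for t in user_input.split())
    return [t for t in terms if t]
```

The resulting list would then be passed as query parameters and compared against the cleansed column, which must have been produced by the same stripping rule.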

Regex to get data with special characters

I have some data in my table's column upn.
Here is a small sample set of this data.
Pasquale.Rombolà@it.eurw.domain.net
JuanMaria.RomanGonçalves@eurs.domain.net
Santo.Paternò@it.eurw.domain.net
Peter.Browne@UK.EURW.domain.net
François.ESTIN@fr.eurw.domain.net
Frédéric.Huynh@fr.eurw.domain.net
Frédérique.Psaume@fr.eurw.domain.net
Laura.PiñeiroGomez@eurs.domain.net
Maria.AranzabalSaldaña@eurs.domain.net
Alberto.RubioMuñoz@eurs.domain.net
Peter.Brüggemann@UK.EURW.domain.net
Russel.Peters@CA.domain.net
I want to query this table for UPN values where I have some special characters in the UPN. So my query should not return upns such as:
Peter.Browne@UK.EURW.domain.net
and
Russel.Peters@CA.domain.net
But returns everything else with special characters such as [à,ò,ñ,ü ...etc]
I have tried this query but it doesn't work.
Select * from TableName
Where [UPN] like '%[a-z,0-9,@,\.,-,A-Z]%'
It returns everything including those which don't have any special characters.
Please help.
If I understand correctly, I think you'll just need to add a "^" as the first character inside the square brackets.
At present you're saying you want to return all those UPNs where one or more characters is in the list you give (i.e. the "ordinary" characters). The "^" should reverse that and give you all the UPNs where at least one of the characters is not in the list you give.
Update: After testing locally ... Make sure your collation is "Accent Sensitive" (if necessary, add COLLATE Latin1_General_CI_AS or similar after your LIKE clause).
I found it only worked if rather than "A-Z", I actually typed out the whole alphabet.
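The effect of the leading "^" can be checked outside the database. The following is a minimal Python sketch, not the T-SQL solution itself: it assumes the "ordinary" characters are ASCII letters, digits, "@" and ".", and the name has_special_chars is hypothetical. Python's re module compares code points, so the collation concerns above do not apply here.

```python
import re

# Negated class: matches any single character that is NOT ordinary.
_SPECIAL = re.compile(r"[^a-zA-Z0-9@.]")


def has_special_chars(upn: str) -> bool:
    """Return True if at least one character falls outside the ordinary set."""
    return _SPECIAL.search(upn) is not None


upns = [
    "Santo.Paternò@it.eurw.domain.net",
    "Peter.Browne@UK.EURW.domain.net",
    "Russel.Peters@CA.domain.net",
]
# Keep only the UPNs that contain at least one special character.
special_upns = [u for u in upns if has_special_chars(u)]
```

Without the "^", the class matches any ordinary character, which is why the original query returned every row.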
You need to add a binary COLLATE clause to it. Choose the necessary collation as per your data. For the given sample data, Latin1_General_BIN works.
This snippet worked for me on my machine-
create table #t (name varchar(100));
insert into #t values
('Pasquale.Rombolà@it.eurw.domain.net'),
('JuanMaria.RomanGonçalves@eurs.domain.net'),
('Santo.Paternò@it.eurw.domain.net'),
('Peter.Browne@UK.EURW.domain.net'),
('François.ESTIN@fr.eurw.domain.net'),
('Frédéric.Huynh@fr.eurw.domain.net'),
('Frédérique.Psaume@fr.eurw.domain.net'),
('Laura.PiñeiroGomez@eurs.domain.net'),
('Maria.AranzabalSaldaña@eurs.domain.net'),
('Alberto.RubioMuñoz@eurs.domain.net'),
('Peter.Brüggemann@UK.EURW.domain.net'),
('Russel.Peters@CA.domain.net');
select * from #t where name not like '%[^a-zA-Z0-9@.]%' COLLATE Latin1_General_BIN;
Output-
Peter.Browne@UK.EURW.domain.net
Russel.Peters@CA.domain.net
To return the rows that do contain special characters, as the question asks, use LIKE instead of NOT LIKE.

Finding the "&" character in SQL SERVER using a like statement and Wildcards

I need to find the '&' in a string.
SELECT * FROM TABLE WHERE FIELD LIKE ..&...
Things we have tried :
SELECT * FROM TABLE WHERE FIELD LIKE '&&&'
SELECT * FROM TABLE WHERE FIELD LIKE '&\&&'
SELECT * FROM TABLE WHERE FIELD LIKE '&|&&' escape '|'
SELECT * FROM TABLE WHERE FIELD LIKE '&[&]&'
None of these give any results in SQLServer.
Well some give all rows, some give none.
Similar questions that didn't work or were not specific enough.
Find the % character in a LIKE query
How to detect if a string contains special characters?
some old reference Server 2000
http://web.archive.org/web/20150519072547/http://sqlserver2000.databases.aspfaq.com:80/how-do-i-search-for-special-characters-e-g-in-sql-server.html
& isn't a wildcard in SQL, therefore no escaping is needed.
Use % around the value you're looking for.
SELECT * FROM TABLE WHERE FIELD LIKE '%&%'
Your statement contains no wildcards, thus is equivalent to WHERE FIELD = '&'.
& isn't a special character in SQL so it doesn't need to be escaped. Just write
WHERE FIELD LIKE '%&%'
to search for entries that contain & somewhere in the field.
Be aware, though, that this will result in a full table scan, as the server can't use any indexes. Had you typed WHERE FIELD LIKE '&%', the server could do a range seek to find all entries starting with &.
If you have a lot of data and can't add any more constraints, you should consider using SQL Server's full-text search to create and use an FTS index, with predicates like CONTAINS or FREETEXT.

How do I escape special characters in user input for a SQL LIKE?

Take the following example (SQL Server 2008 - might work with more). You'll have to imagine @query being some parameter whose source is user input:
DECLARE @query varchar(100)
SET @query = 'less than 1% fat'
CREATE TABLE X ([A] VARCHAR(100))
INSERT X VALUES ('less than 1% fat')
INSERT X VALUES ('less than 1% of doctors recommend this - it''s full of fat!')
SELECT * FROM X WHERE A LIKE '%' + @query + '%'
DROP TABLE X
The query states 'less than 1% fat', but we actually get more than we wanted:
less than 1% fat
less than 1% of doctors recommend this - it's full of fat!
To get the required behaviour, I change @query to 'less than 1[%] fat' - then only the first result is returned.
Is there a standard way to prepare strings for clauses which use LIKEs, or do I have to roll my own?
Instead of using LIKE you can use free-text search and the CONTAINS or FREETEXT predicates. LIKE with a leading wildcard ignores indexes and results in a full table scan while the full-text searches use existing free-text indexes to speed up the search. You will have to configure free text indexing before you can use it.
If you want to stick with LIKE the best solution would be to escape the string in your client code. T-SQL provides very limited string manipulation functionality and the REPLACE function doesn't even accept wildcards. You would have to nest multiple REPLACE statements to account for all wildcards used by LIKE.
You can combine that with the ESCAPE clause and use a rare character like §, ¶ or ¤ as an escape character if your search string may contain the [ or ] characters.
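Escaping in client code could look roughly like the following. This is a minimal Python sketch, assuming the query is sent with a matching ESCAPE clause; the function name escape_like and the default escape character "!" are illustrative choices, not anything prescribed by SQL Server.

```python
def escape_like(term: str, escape_char: str = "!") -> str:
    """Escape LIKE wildcards in user input for use with an ESCAPE clause.

    Prefixes %, _, [ and the escape character itself with escape_char,
    so each of them is matched literally by LIKE.
    """
    if len(escape_char) != 1:
        raise ValueError("escape_char must be a single character")
    out = []
    for ch in term:
        if ch in ("%", "_", "[", escape_char):
            out.append(escape_char)
        out.append(ch)
    return "".join(out)
```

With this, 'less than 1% fat' becomes 'less than 1!% fat', and the escaped value would be passed as a parameter to a predicate such as A LIKE '%' + @query + '%' ESCAPE '!'.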
SQL Server allows you to specify an escape character, then you can just escape the query string before using it.
Sample from the MSDN page I linked:
SELECT c1
FROM mytbl2
WHERE c1 LIKE '%10-15!% off%' ESCAPE '!';