How to detect numbers in string in many observations? - sql

I would like to find the rows whose Name contains a given number so that I can get the ID. I use this code:
select Name, ID
from [my_table]
where [Name] like ('%000234%')
But I need this to work for many names. I tried Name like in (000234, 000235, ...), but it doesn't work. Is it possible to get the whole list of IDs by searching on Name?

If you want to select names for a specific range of numbers
select Name, ID
from [my_table]
where [Name] like '%00023[4-9]%'
This matches names containing 000234 through 000239.
For other wild card reference

Insert the search values into a table variable, then join it to the main table.
DECLARE @TempTable TABLE (TempName NVARCHAR(10))
INSERT INTO @TempTable
VALUES
('000234'),
('000235')
SELECT
A.Name,
A.ID
FROM
[my_table] A INNER JOIN
@TempTable B ON A.Name LIKE '%' + B.TempName + '%'

Added test table for clarity.
CREATE TABLE #Test (
NameString varchar(25) )
INSERT INTO #Test (NameString)
SELECT 'Mike387592'
UNION ALL
SELECT 'Nancy2387'
UNION ALL
SELECT 'Tim0088297234';
-- the statement before WITH must be terminated with a semicolon
WITH CTE AS (
SELECT
NameString
,PATINDEX('%[0-9]%', NameString) AS [Start]
,LEN(NameString) AS [End]
FROM #Test )
SELECT
NameString
,SUBSTRING(NameString, [Start], [End])
FROM CTE

If you have a slew of names in a table that have nonsequential numbers that don't have a pattern shared amongst them, one way to gather that data is to use an IN predicate and to dynamically construct your query.
E.G. You want names with numbers 000412, 001523 & 001687.
You would dynamically generate a query like this:
SELECT Name, ID
FROM [my_table]
WHERE Name IN ( '000412', '001523', '001687' )
Dynamic queries can be generated in the database or by the software calling the database, but they are generally discouraged: building SQL by concatenating input exposes you to SQL injection, and the generated statements aren't reusable. Nevertheless, this is an option.
If you must use the LIKE predicate because there are other characters surrounding your numeric string, something like this would be the generated query:
SELECT Name, ID
FROM [my_table]
WHERE Name LIKE '%000412%' OR
Name LIKE '%001523%' OR
Name LIKE '%001687%'
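If the query is generated by the calling software, the injection risk mentioned above can be reduced by generating only the placeholders and binding the values as parameters. Below is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for the real database. The function name find_ids and the sample rows are made up for illustration, and the placeholder style ("?") is SQLite's; other drivers may use a different marker.

```python
import sqlite3

def find_ids(conn, numbers):
    """Return (Name, ID) rows whose Name contains any of the given digit strings."""
    if not numbers:
        return []
    # One "Name LIKE ?" per search term; the values are bound, never concatenated.
    where = " OR ".join(["Name LIKE ?"] * len(numbers))
    sql = f"SELECT Name, ID FROM my_table WHERE {where} ORDER BY ID"
    params = [f"%{n}%" for n in numbers]
    return conn.execute(sql, params).fetchall()

# Hypothetical sample data, only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (Name TEXT, ID INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("Mike000412", 1), ("Nancy001523", 2), ("Tim000999", 3), ("Ann001687x", 4)],
)

rows = find_ids(conn, ["000412", "001523", "001687"])
print(rows)
```

Only the number of placeholders depends on the input, so the search terms themselves never become part of the SQL text.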

Related

SQL: How can I do a keyword search using words stored in a separate search table?

I am doing a keyword search in a SQL table where I want to search for a set of keywords word1, word2, ... , wordn and flag instances where these keywords are found in a new column. Assuming that I am looking for these keywords in a variable [Description] in #TABLE, the query I am using looks like this:
SELECT *
, (CASE WHEN [Description] LIKE '%word1%' THEN 'word1'
WHEN [Description] LIKE '%word2%' THEN 'word2'
...
WHEN [Description] LIKE '%wordn%' THEN 'wordn'
END) as Keywords
INTO #RESULTS_TABLE
FROM #TABLE
Now, the problem with this method is that the keywords I am searching are hard-coded into the code, which makes it inconvenient if I want to alter the set of keywords that I am searching for. Instead of doing this, I would like to have the keywords I will search in some separate table #KEYWORDS as a variable [words] and then reference all the keywords listed in that table (under that variable name) for the search. This would allow me to alter the search table and then re-run the select query on the updated search table, without having to change the select code.
Question: Assuming I have a table #KEYWORDS populated with the keywords I want to search, what is the best way to write the keyword search query so that it gets the keywords from the table rather than from hardcoded terms?
My first choice would be a temp table:
create table #words_table (
words varchar(100)
);
insert into #words_table values
('word1'),
('word2'); -- add further rows as needed
Then you can:
select t.[Description]
from
your_table t
join #words_table wt
ON t.[Description] LIKE CONCAT('%',wt.words,'%')
or
select t.[Description]
from
your_table t
join #words_table wt
ON t.[Description] LIKE '%' + wt.words +'%'
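To see how the join behaves, here is a small runnable sketch of the same idea using Python's standard-library sqlite3 module. It is not SQL Server: SQLite concatenates strings with || instead of + or CONCAT, and the sample descriptions and keywords are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (Description TEXT);
    CREATE TABLE words_table (words TEXT);
    INSERT INTO your_table VALUES ('red apple pie'), ('green pear'), ('plain bread');
    INSERT INTO words_table VALUES ('apple'), ('pear');
""")

# Join each description to every keyword it contains.
matches = conn.execute("""
    SELECT t.Description, wt.words AS Keyword
    FROM your_table AS t
    JOIN words_table AS wt
      ON t.Description LIKE '%' || wt.words || '%'
    ORDER BY t.Description
""").fetchall()
print(matches)
```

Note that, unlike the CASE expression in the question, a description that contains several keywords comes back once per matching keyword, and descriptions with no match are dropped by the inner join.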

Get maximum value in a column in sql query if the column is alphanumeric

This is the table which I have by name project and it contains 3 columns:
estimateId
name
projectName
I want to fetch the row with the maximum value of estimateId, but estimateId is alphanumeric. How can I achieve this with a SQL query?
For example estimateId contains values like:
Elooo1
Elooo2
......
Elooo10
and so on. So how can I achieve this?
Setup Testing Data
DECLARE @tmpTable TABLE ( estimateId NVARCHAR(MAX));
INSERT into @tmpTable(estimateId) VALUES ('Elooo1'),('Elooo2'),('Elooo3'),('Elooo4'),('Elooo5'),('Elooo6');
Split data based on the pattern
SELECT T.prefix AS prefix,
MAX(TRY_CAST(T.suffix AS INT)) AS suffix, -- compare as numbers so that 10 ranks above 9
MAX(estimateId) AS estimateId -- still a string comparison; can disagree with suffix once ids pass 9
FROM (SELECT estimateId,
LEFT(estimateId, PATINDEX('%[a-zA-Z][^a-zA-Z]%', estimateId)) AS prefix,
LTRIM(RIGHT(estimateId, LEN(estimateId) - PATINDEX('%[a-zA-Z][^a-zA-Z]%', estimateId))) AS suffix
FROM @tmpTable) T
GROUP BY T.prefix
Result
prefix suffix estimateId
Elooo 6 Elooo6
Reference
split alpha and numeric using sql
I only started learning SQL today, so I'm a total newbie, but I think I could solve your problem. I would do something like this:
SELECT name, projectName FROM project ORDER BY estimateId ASC
or (I think you will need ORDER BY ... DESC)
SELECT name, projectName FROM project ORDER BY estimateId DESC
You seem to be looking to extract the numeric part of the strings. Assuming that the strings have variable length, and that the numbers are always at the end, you can do:
try_cast(
substring(estimateId, patindex('%[0-9]%', estimateId), len(estimateId))
as int
)
This captures everything from the first number in the string to the end of the string, and attempts to convert it to a number (if the conversion fails, try_cast() returns null rather than raising an error).
It is not very clear what you want to use this information for. For example, if you wanted to sort your data accordingly, you would do:
select *
from mytable
order by try_cast(
substring(estimateId, patindex('%[0-9]%', estimateId), len(estimateId))
as int
)
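The reason for extracting the number can be seen in a few lines of Python. This is only an illustration of the logic, not SQL Server code: numeric_suffix is a hypothetical helper that mirrors the substring/patindex/try_cast expression, and the ids are sample values in the style of the question.

```python
import re

def numeric_suffix(estimate_id):
    """Digits from the first digit to the end of the string as an int, else None."""
    m = re.search(r"[0-9].*$", estimate_id)
    # Like try_cast, give up (None) if anything other than digits follows.
    if m is None or not re.fullmatch(r"[0-9]+", m.group(0)):
        return None
    return int(m.group(0))

ids = ["Elooo1", "Elooo2", "Elooo10", "Elooo9"]

string_max = max(ids)                       # compares character by character
numeric_max = max(ids, key=numeric_suffix)  # compares the extracted numbers
ordered = sorted(ids, key=numeric_suffix)
print(string_max, numeric_max, ordered)
```

A plain string comparison ranks Elooo9 above Elooo10, because "9" sorts after "1"; comparing the extracted integers gives the expected order. The key function returns None for ids without a clean numeric tail, which this sample does not contain; such rows would need separate handling before sorting.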

compare some lists in where condition sql

I have a question about SQL Server 2012. I have a table with a field that stores which systems use each record, as a list separated by ','. I want to pass the system names in a parameter and query the related rows:
declare @System nvarchar(50)
set @System ='BPM,SEM'
SELECT *
FROM dbo.tblMeasureCatalog t1
where ( ( select Upper(value) from dbo.split(t1.System,','))
= any( select Upper(value) from dbo.split(@System,',')))
dbo.split is a function to return systems in separated rows
Forgetting for a second that storing delimited lists in a relational database is abhorrent, you can do it using a combination of INTERSECT and EXISTS, for example:
DECLARE @System NVARCHAR(50) = 'BPM,SEM';
DECLARE @tblMeasureCatalog TABLE (System VARCHAR(MAX));
INSERT @tblMeasureCatalog VALUES ('BPM,XXX'), ('BPM,SEM'), ('XXX,SEM'), ('XXX,YYY');
SELECT mc.System
FROM @tblMeasureCatalog AS mc
WHERE EXISTS
( SELECT Value
FROM dbo.Split(mc.System, ',')
INTERSECT
SELECT Value
FROM dbo.Split(@System, ',')
);
Returns
System
---------
BPM,XXX
BPM,SEM
XXX,SEM
EDIT
Based on your question stating "Any", I assumed that you wanted rows where the terms matched any of those provided. Based on your comment, I now assume you want records where the terms match all. This is a fairly similar approach, but you need to use NOT EXISTS and EXCEPT instead.
Now, "all" is still quite ambiguous. For example, if you search for "BPM,SEM", should it return a record that is "BPM,SEM,YYY"? It does contain all of the searched terms, but it contains additional terms too. So the approach you need depends on your requirements:
DECLARE @System NVARCHAR(50) = 'BPM,SEM,XXX';
DECLARE @tblMeasureCatalog TABLE (System VARCHAR(MAX));
INSERT @tblMeasureCatalog
VALUES
('BPM,XXX'), ('BPM,SEM'), ('XXX,SEM'), ('XXX,YYY'),
('SEM,BPM'), ('SEM,BPM,XXX'), ('SEM,BPM,XXX,YYY');
-- METHOD 1 - CONTAINS ALL SEARCHED TERMS BUT CAN CONTAIN ADDITIONAL TERMS
SELECT mc.System
FROM @tblMeasureCatalog AS mc
WHERE NOT EXISTS
(
SELECT Value
FROM dbo.Split(@System, ',')
EXCEPT
SELECT Value
FROM dbo.Split(mc.System, ',')
);
-- METHOD 2 - ONLY CONTAINS ITEMS WITHIN THE SEARCHED TERMS, BUT NOT
-- NECESSARILY ALL OF THEM
SELECT mc.System
FROM @tblMeasureCatalog AS mc
WHERE NOT EXISTS
( SELECT Value
FROM dbo.Split(mc.System, ',')
EXCEPT
SELECT Value
FROM dbo.Split(@System, ',')
);
-- METHOD 3 - CONTAINS ALL ITEMS IN THE SEARCHED TERMS, AND NO ADDITIONAL ITEMS
SELECT mc.System
FROM @tblMeasureCatalog AS mc
WHERE NOT EXISTS
( SELECT Value
FROM dbo.Split(@System, ',')
EXCEPT
SELECT Value
FROM dbo.Split(mc.System, ',')
)
AND LEN(mc.System) = LEN(@System);
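The difference between these variants is easier to see as plain set operations. The following Python sketch mirrors them on the same sample rows; split here is a hypothetical stand-in for dbo.Split, the upper-casing is an assumption taken from the question, and the last variant uses set equality rather than the LEN comparison of method 3.

```python
def split(csv):
    """Stand-in for dbo.Split: a comma-separated string as a set of terms."""
    return {term.strip().upper() for term in csv.split(",") if term.strip()}

rows = ["BPM,XXX", "BPM,SEM", "XXX,SEM", "XXX,YYY",
        "SEM,BPM", "SEM,BPM,XXX", "SEM,BPM,XXX,YYY"]
searched = split("BPM,SEM,XXX")

# EXISTS + INTERSECT: at least one term in common.
any_match = [r for r in rows if split(r) & searched]
# Method 1: nothing searched is missing from the row (extra terms allowed).
contains_all = [r for r in rows if not (searched - split(r))]
# Method 2: nothing in the row falls outside the searched terms.
only_searched = [r for r in rows if not (split(r) - searched)]
# Method 3, expressed as set equality: same terms, no extras.
exact = [r for r in rows if split(r) == searched]

print(any_match, contains_all, only_searched, exact, sep="\n")
```

With these rows, every row shares at least one term with the search list, two rows contain all three searched terms, five rows stay within the searched terms, and only SEM,BPM,XXX matches exactly.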
You have a problem with your data structure because you are storing lists of things in a comma-delimited list. SQL has a great data structure for storing lists. It goes by the name "table". You should have a junction table with one row per "measure catalog" and "system".
Sometimes, you are stuck with other people's really bad design decisions. One solution is to use split(). Here is one method:
select mc.*
from dbo.tblMeasureCatalog mc
where exists (select 1
from dbo.split(mc.System, ',') t1s join
dbo.split(@System, ',') ss
on upper(t1s.value) = upper(ss.value)
);
You can try this:
declare @System nvarchar(50)
set @System ='BPM,SEM'
SELECT * from dbo.tblMeasureCatalog t1 inner join dbo.Split (@System ,',') B on t1.it=B.items

Solution to avoid non-sargable argument in where clause

In the code_list CTE in this query I have a row constructor that will eventually take any number of arguments. The column icd in the patient_codes CTE is a five-digit identifier that is more descriptive than the three-digit codes that the row constructor has. The table icd_patient has 100 million rows, so for performance's sake I would like to filter the rows of this table before I do any further work. I have
;with code_list(code_list)
as
(
select x.code_list
from (values ('70700'),('25002')) as x(code_list)
),patient_codes
as
(
select distinct icd,pat_id,id
from icd_patient
where icd in (select code_list from code_list)
)
select distinct pat_id from patient_codes
The problem, however, is that in the icd_patient table all of the icd values are five digits and more descriptive. If I look at the execution plan of this query it's pretty streamlined. If I do
;with code_list(code_list)
as
(
select x.code_list
from (values ('70700'),('25002')) as x(code_list)
),patient_codes
as
(
select substring(icd,1,3) as icd,pat_id
from icd_patient2
where substring(icd,1,3) in (select * from code_list)
)
select * from patient_codes
this of course has a large performance impact because of the substring expression in the where clause. Does something akin to LIKE IN exist so I can take advantage of my indexes?
Index on icd_patient
CREATE NONCLUSTERED INDEX [ix_icd_patient] ON [dbo].[icd_patient2]
(
[pat_id] ASC
)
INCLUDE ( [id],
This much simpler query should be better than (or, at worst, the same as) your existing query.
select pat_id
FROM dbo.icd_patient
where icd LIKE '707%'
OR icd LIKE '250%'
GROUP BY pat_id;
Note that sargability only matters if there is actually an index on this column.
An alternative (since OR can sometimes give the optimizer fits):
SELECT pat_id FROM
(
SELECT pat_id
FROM dbo.icd_patient
WHERE icd LIKE '707%'
UNION ALL
SELECT pat_id
FROM dbo.icd_patient
WHERE icd LIKE '250%'
) AS x
GROUP BY pat_id;
To make this extensible beyond a handful of OR conditions, I would use a table-valued parameter (TVP).
CREATE TYPE dbo.StringPatterns AS TABLE(s VARCHAR(3) PRIMARY KEY);
Then your stored procedure could say:
CREATE PROCEDURE dbo.whatever
@sp dbo.StringPatterns READONLY
AS
BEGIN
SET NOCOUNT ON;
SELECT p.pat_id
FROM dbo.icd_patient AS p
INNER JOIN @sp AS sp
ON p.icd LIKE sp.s + '%' -- match the code column, not pat_id
GROUP BY p.pat_id;
END
Then you can pass in your set of three-character substrings from a DataTable or other collection in C#. From T-SQL, just as an example:
DECLARE @p dbo.StringPatterns;
INSERT @p VALUES('707'),('250');
EXEC dbo.whatever @sp = @p;
Something like LIKE IN does not exist. The following is sargable:
select *
from icd_patient
where icd like '70700%' or
icd like '25002%'
This works because LIKE with a constant initial substring is a special case for SQL Server. It does not work when the strings on the right are variables.
One solution is to create an indexed view on the icd_patient table with an index on the first five characters of the icd code.
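Why a constant prefix is a special case can be sketched outside the database: on sorted data, "starts with 707" is the same as the range from '707' up to, but not including, '708', so two binary searches replace a scan of every row. The Python below is only an illustration of that idea, not of SQL Server internals; it assumes plain ordinal string comparison (real collations are more involved), and prefix_range and the sample codes are invented for the example.

```python
from bisect import bisect_left

def prefix_range(sorted_codes, prefix):
    """Return the codes starting with prefix, found with two binary searches."""
    if not prefix:
        return list(sorted_codes)
    lo = bisect_left(sorted_codes, prefix)
    # Smallest string greater than every string with this prefix:
    # bump the last character by one code point ('707' -> '708').
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect_left(sorted_codes, upper)
    return sorted_codes[lo:hi]

# Hypothetical codes, sorted as an index would keep them.
codes = sorted(["25000", "25002", "25010", "40190", "70700", "70703", "7071"])

print(prefix_range(codes, "707"))
print(prefix_range(codes, "250"))
```

A pattern with a leading wildcard, or a function such as substring applied to the column, gives no such range to search for, which is why those predicates cannot use the ordering of the index in the same way.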
Using "IN" makes that part of a command non-sargable on both sides. End of discussion.
Saying the substring version fixes it is wrong: it completely changes what the query returns, and it is still not sargable.
Any "fix" should match the results exactly. The actual fix is to join to the CTE so that the five characters match, or to put three characters in the CTE and match on those in a join, or to put four characters in the CTE, where the fourth is "%", and join using LIKE.
Using a "like" that starts with "%" increases the complexity of the search, but it would still use the index to find the value because parsing the index should use less reading by only getting the full table row when a search is successful.

How to combine IN operator with LIKE condition (or best way to get comparable results)

I need to select rows where a field begins with one of several different prefixes:
select * from table
where field like 'ab%'
or field like 'cd%'
or field like 'ef%'
or...
What is the best way to do this using SQL in Oracle or SQL Server? I'm looking for something like the following statements (which are incorrect):
select * from table where field like in ('ab%', 'cd%', 'ef%', ...)
or
select * from table where field like in (select foo from bar)
EDIT:
I would like to see how this is done either with all the prefixes given in one SELECT statement, or with all the prefixes stored in a helper table.
Length of the prefixes is not fixed.
Joining your prefix table with your actual table would work in both SQL Server & Oracle.
DECLARE @Table TABLE (field VARCHAR(32))
DECLARE @Prefixes TABLE (prefix VARCHAR(32))
INSERT INTO @Table VALUES ('ABC')
INSERT INTO @Table VALUES ('DEF')
INSERT INTO @Table VALUES ('ABDEF')
INSERT INTO @Table VALUES ('DEFAB')
INSERT INTO @Table VALUES ('EFABD')
INSERT INTO @Prefixes VALUES ('AB%')
INSERT INTO @Prefixes VALUES ('DE%')
SELECT t.*
FROM @Table t
INNER JOIN @Prefixes pf ON t.field LIKE pf.prefix
In Oracle, you can try a regular expression:
SELECT * from table where REGEXP_LIKE ( field, '^(ab|cd|ef)' );
If your prefix is always two characters, could you not just use the SUBSTRING() function to get the first two characters of "field", and then see if it's in the list of prefixes?
select * from table
where SUBSTRING(field, 1, 2) IN (prefix1, prefix2, prefix3...)
That would be "best" in terms of simplicity, if not performance. Performance-wise, you could create an indexed virtual column that generates your prefix from "field", and then use the virtual column in your predicate.
Depending on the size of the dataset, the REGEXP solution may or may not be the right answer. If you're trying to get a small slice of a big dataset,
select * from table
where field like 'ab%'
or field like 'cd%'
or field like 'ef%'
or...
may be rewritten behind the scenes as
select * from table
where field like 'ab%'
union all
select * from table
where field like 'cd%'
union all
select * from table
where field like 'ef%'
doing three index scans instead of a full scan.
If you know you're only going after the first two characters, creating a function-based index could be a good solution as well. If you really really need to optimize this, use a global temporary table to store the values of interest, and perform a semi-join between them:
select * from data_table
where transform(field) in (select pre_transformed_field
from my_where_clause_table);
You can also try it like this. Here tmp is a derived table that is populated with the required prefixes. It's a simple way, and it does the job.
select * from emp join
(select 'ab%' as Prefix
union
select 'cd%' as Prefix
union
select 'ef%' as Prefix) tmp
on emp.Name like tmp.Prefix