Get records matching regex in MS-SQL

I am using the following condition to get any records that begin with any character, have a bunch of 0s and end with a number (1 in this case).
where column like '_%[0]1'
But the issue is that it also returns d0101 etc., which I don't want. I just want d0001 or r0001. Can I use like to match the pattern exactly rather than partially?
Are there any other options in MS-SQL?

SQL Server's LIKE does not support proper regular expressions, but you can build the search clause you want like this:
where column like '_%1' and column not like '_%[^0]%1'
The second condition will exclude all cases where you have a character other than 0 in the middle of the string.
It will allow strings of all possible lengths, provided they start with an arbitrary character, then have any number of 0s and finish with a 1. All other strings will not satisfy the where clause.
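To see the two conditions working together, here is a small sketch against a table variable. The name @tst and the extra rows 'x01' and 'd0002' are just made up for illustration.

```sql
-- Hypothetical sample data; @tst and the extra rows are illustrative only.
DECLARE @tst TABLE (t varchar(10));
INSERT INTO @tst (t)
VALUES ('d0101'), ('d0001'), ('r0001'), ('x01'), ('d0002');

-- First condition: any first character, anything in between, ends with 1.
-- Second condition: reject rows with a non-0 character between the
-- first character and the final 1.
SELECT t
FROM @tst
WHERE t LIKE '_%1'
  AND t NOT LIKE '_%[^0]%1';
-- Returns d0001, r0001 and x01; d0101 and d0002 are filtered out.
```

Note that "any number of 0s" includes none at all, so a two-character value such as 'd1' would also pass. If at least one 0 is required, the first pattern could be tightened to '_0%1'.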

create table tst(t varchar(10));
insert into tst values('d0101');
insert into tst values('d0001');
insert into tst values('r0001');
select * from tst where PATINDEX('%00%1', t)>0
or
select * from tst where t like '%00%1'

You use the _ to say that you don't care what char is there (single char) and then use the rest of the string you want:
DECLARE @t TABLE (val VARCHAR(100))
INSERT INTO @t
VALUES
('d0001'),
('f0001'),
('e0005'),
('e0001')
SELECT *
FROM @t
WHERE val LIKE '_0001'
This code only really handles your two simple examples. If it is more complex, add it to your post.

Related

SQL Substring REGEX pattern matching (TERADATA)

I have a column, say LINES, with the string patterns below. I want to extract the date from the strings. For example, for each line I would need the date, i.e. 20201123 or 20201124, whichever the case may be.
Since the dates are in different positions, I can't really use substring for this. How do I go about this? Is there a simpler REGEX method that I can apply here?
Here is a simple reproduced code for testing.
create volatile table TEST
(LINES VARCHAR(1000) CHARACTER SET LATIN NOT CASESPECIFIC)
ON COMMIT PRESERVE ROWS;
insert into TEST values('path/to/file/OVERALL_GOTO_Datas.20201123.dat');
insert into TEST values('path/to/file/endartstmov20201124.20201124.dat');
insert into TEST values('path/to/file/TESTDEV20201123.20201123.5.0014.CHK.dat');
insert into TEST values('path/to/file/DEVTOTES20201124.20201124.5.0109.CHK.dat');
insert into TEST values('path/to/file/STORE_PARTNER.20201124.20201124.0.0501.CHK.dat');
SELECT * FROM TEST;
Appreciate your responses. Thanks.
Using the teradata REGEXP_SUBSTR
You should be able to use this regex :
SELECT REGEXP_SUBSTR(LINES, '(?:\.([0-9]{8})\.)') FROM TEST;
see : https://regex101.com/r/WRqEmY/2
Another way is regexp_extract ( https://teradata.github.io/presto/docs/148t/functions/regexp.html )
SELECT regexp_extract(LINES, '(?:\.([0-9]{8})\.)', 1)
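If only the eight digits are wanted without the surrounding dots, one option is to trim the match with SUBSTR. This is a sketch that assumes Teradata's REGEXP_SUBSTR returns the whole matched text, dots included; file_date is just an illustrative alias.

```sql
-- Assumption: REGEXP_SUBSTR returns the full match, e.g. '.20201123.',
-- so characters 2 through 9 of the match are the date itself.
SELECT LINES,
       SUBSTR(REGEXP_SUBSTR(LINES, '\.[0-9]{8}\.'), 2, 8) AS file_date
FROM TEST;
```

For rows where the date appears twice, this picks the first occurrence that is enclosed in dots.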

How to Extract only numbers from the String without using function in SQL

Table contains data as below
Table Name is REGISTER
Column Name is EXAM_CODE
Values like ('S6TJ','S7','S26','S24')
I want answer like below
Result set - > (6,7,26,24)
Please suggest a solution, since regexp_replace is not recognized as a built-in function name in my SQL.
The complexity of the answer depends on two things: the RDBMS used and whether the numbers in the EXAM_CODE are contiguous.
I have assumed that the RDBMS is SQL Server and the numbers in EXAM_CODE are always contiguous. If not, please advise and I can revise the answer.
The following SQL shows a way of accomplishing the above using PATINDEX:
CREATE TABLE #REGISTER (EXAM_CODE VARCHAR(10));
INSERT INTO #REGISTER VALUES ('S6TJ'),('S7'),('S26'),('S24');
SELECT LEFT(EXAM_CODE, PATINDEX('%[^0-9]%', EXAM_CODE) - 1)
FROM (
SELECT RIGHT(EXAM_CODE, LEN(EXAM_CODE) - PATINDEX('%[0-9]%', EXAM_CODE) + 1) + 'A' AS EXAM_CODE
FROM #REGISTER
) a
DROP TABLE #REGISTER
This outputs:
6
7
26
24
PATINDEX matches a specified pattern against a string (or returns 0 if there is no match).
Using this, the inner query fetches the part of the string starting at the first occurrence of a digit. The outer query then strips any text that may appear after the digits.
Note: The character A is appended to the result of the inner query in order to ensure that the PATINDEX check in the outer query will make a match. Otherwise, PATINDEX would return 0 and an error would occur.
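To make the two PATINDEX steps easier to follow, here is a small sketch with the intermediate values for one code. The column aliases are only illustrative.

```sql
-- PATINDEX returns the 1-based position of the first match, or 0 if none.
SELECT PATINDEX('%[0-9]%',  'S6TJ') AS first_digit_pos,      -- 2: '6' is the 2nd character
       PATINDEX('%[^0-9]%', '6TJA') AS first_non_digit_pos,  -- 2: 'T' is the 2nd character
       PATINDEX('%[^0-9]%', '26')   AS no_non_digit;         -- 0: digits only, no match
```

The last column shows why the 'A' is appended: for a value like 'S26' the inner result would be '26', PATINDEX would return 0, and LEFT would then be called with a length of -1.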

How to fetch a range of numbers using like operator

Currently I'm trying to fetch the records which are matching my condition.
I'm using wildcard operator but it's not fetching the records as I expect.
I have multiple records in my table and I'm using the query below:
select *
from My_table
where RegNum like '117[15-24]%'
I thought above query will fetch the records from 11715 to 11724, but currently it's fetching records from 11710 to 11719. I got to know that % wildcard operator will consider single digits only.
Is there any other way to use two digit number in wildcard operator or is there any other solution to fetch the records what I'm looking for?
I speculate that you are using SQL Server. When comparing numerical ranges, the best thing to do is to just use an inequality. If your RegNum column is text, then cast it to integer first and then compare:
SELECT *
FROM My_table
WHERE CAST(RegNum AS int) BETWEEN 11715 AND 11724;
If you want to use LIKE, we might be able to try:
SELECT *
FROM My_table
WHERE RegNum LIKE '1171[5-9]' OR RegNum LIKE '1172[0-4]';
The logic you want is perhaps better captured by:
where left(regnum, 5) between '11715' and '11724'
Not all databases support left(); in those, use the substring function instead.
You have a logic fallacy in wanting to use like for numeric ranges. like is for strings. If you want numeric ranges, use numbers. You could do this with the above condition:
where cast(left(regnum, 5) as int) between 11715 and 11724
But this is logically equivalent to the original string comparison.
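One caveat with casting: if the text column holds a value that is not a number, a plain cast raises a conversion error. Assuming SQL Server 2012 or later, TRY_CAST returns NULL instead, so such rows simply drop out. A minimal sketch with made-up data (the table variable and the '1172X' row are illustrative):

```sql
-- Hypothetical sample data; '1172X' stands in for a dirty value.
DECLARE @My_table TABLE (RegNum varchar(20));
INSERT INTO @My_table (RegNum)
VALUES ('11710'), ('11715'), ('11719'), ('11724'), ('11725'), ('1172X');

-- TRY_CAST yields NULL for '1172X', and NULL never satisfies BETWEEN.
SELECT RegNum
FROM @My_table
WHERE TRY_CAST(LEFT(RegNum, 5) AS int) BETWEEN 11715 AND 11724;
-- Returns 11715, 11719 and 11724.
```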

Why won't SQL return a value if I use "=" but will return if I use "like"?

Here are my queries:
(Won't return a value)
select * from T_VoucherHeaderEntry
where Vhe_VoucherNo = 'APV-1808-00160'
(Will return a value)
Select * from T_VoucherHeaderEntry where Vhe_VoucherNo like 'APV-1808-00160%'
I tried trimming my first query but it doesn't work.
You appear to have other control characters in your stored data, specifically carriage-return and line-feed. This highlights the issue and the final query finds all rows currently affected by this [1]:
;declare @t table (Val1 varchar(20))
insert into @t(Val1) values ('abc
'),('def')
select * from @t where Val1 = 'abc'
select * from @t where Val1 like 'abc%'
select * from @t where Val1 like '%
%'
So, fix those rows however you choose to do so. Next, add a CHECK constraint on this column:
ALTER TABLE T_VoucherHeaderEntry
ADD CONSTRAINT CK_T_VoucherHeaderEntry_NoExoticChars
CHECK (Vhe_VoucherNo not like '%[^-A-Za-z0-9]%')
(It's expressed as a double negative: the value must not contain any character that is outside the provided set, so only hyphens, letters and digits are allowed. We have to put - as the first character in the set so that it's interpreted literally and not as a range separator.)
And finally update your applications to not attempt to insert such bogus data in the first place.
[1] The third query identifies those specifically affected by the CR/LF issue. For a more general approach, once you've decided on the appropriate character range to specify in your check constraint, a variant of that same approach will find rows that won't satisfy the check constraint for you to fix.
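As a rough sketch of that variant: drop the NOT from the constraint's pattern to list the offending rows, then clean them. The table variable and sample values here are made up, and stripping CR/LF with REPLACE is only one possible fix.

```sql
-- Hypothetical sample data: the first value carries a trailing CR/LF.
DECLARE @t TABLE (Val1 varchar(20));
INSERT INTO @t (Val1)
VALUES ('APV-1808-00160' + CHAR(13) + CHAR(10)), ('APV-1808-00161');

-- Rows that would violate the check constraint: they contain at least
-- one character that is not a hyphen, letter or digit.
SELECT Val1 FROM @t WHERE Val1 LIKE '%[^-A-Za-z0-9]%';

-- One possible cleanup: remove carriage returns and line feeds.
UPDATE @t
SET Val1 = REPLACE(REPLACE(Val1, CHAR(13), ''), CHAR(10), '');

-- The equality comparison now finds the row.
SELECT Val1 FROM @t WHERE Val1 = 'APV-1808-00160';
```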
If your Vhe_VoucherNo column contains exactly the value 'APV-1808-00160', then the query below should definitely work and return data:
select * from T_VoucherHeaderEntry
where Vhe_VoucherNo = 'APV-1808-00160'
If the column value has leading or trailing white-space, you can use the trim function:
select * from T_VoucherHeaderEntry
where trim(Vhe_VoucherNo) = 'APV-1808-00160'
But if your column contains values that merely start with APV-1808-00160, then like will work, which is your 2nd query:
Select * from T_VoucherHeaderEntry
where Vhe_VoucherNo like 'APV-1808-00160%'
BTW, I noticed the two queries are from two different tables, so that may also be the reason.

Finding numeric values in varchar field

sorry if this is a duplicate, I wasn't able to find what I was looking for in the answered questions.
I'm looking to query for only records with a field formatted like this: numbers (0-9), hyphen (-), numbers (0-9), hyphen (-), numbers (0-9). This is what I have tried:
SELECT *
FROM TABLE_1
WHERE LTRIM(RTRIM(LOC_NAME)) LIKE '[0-9]-[0-9]-[0-9]'
The result set I'm looking for would be 123456-123-1234.
I thought at first they may have spaces so I trimmed the field but still no results are showing with the ABOVE query. The BELOW query returns them, but with other results:
SELECT *
FROM TABLE_1
WHERE LOC_NAME LIKE '%[0-9]-[0-9]%'
But I would get results like 1-2-3 Place...
I think this does what you want:
SELECT *
FROM TABLE_1
WHERE LTRIM(RTRIM(LOC_NAME)) NOT LIKE '%[^-0-9]%'
This checks that the field contains no character other than a hyphen or a digit.
If you specifically want two hyphens, separated by digits, then:
WHERE LTRIM(RTRIM(LOC_NAME)) NOT LIKE '%[^-0-9]%' AND
LTRIM(RTRIM(LOC_NAME)) LIKE '[0-9]%-[0-9]%-[0-9]%' AND
LTRIM(RTRIM(LOC_NAME)) NOT LIKE '%-%-%-%'
The second pattern requires at least two hyphens and a leading digit in all three parts of the name. The third forbids three or more hyphens.
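Here is a quick sketch of the three conditions run against some made-up values (the table variable and the rows are only for illustration):

```sql
-- Hypothetical sample data covering the cases discussed above.
DECLARE @t TABLE (LOC_NAME varchar(100));
INSERT INTO @t (LOC_NAME)
VALUES ('123456-123-1234'),  -- wanted format
       ('1-2-3 Place'),      -- contains a space and letters
       ('123-123-123-123'),  -- three hyphens
       ('12-34'),            -- only one hyphen
       (' 55-6-7 ');         -- wanted format, padded with spaces

SELECT LOC_NAME
FROM @t
WHERE LTRIM(RTRIM(LOC_NAME)) NOT LIKE '%[^-0-9]%'        -- only digits and hyphens
  AND LTRIM(RTRIM(LOC_NAME)) LIKE '[0-9]%-[0-9]%-[0-9]%' -- digit starts each of three parts
  AND LTRIM(RTRIM(LOC_NAME)) NOT LIKE '%-%-%-%';         -- fewer than three hyphens
-- Returns '123456-123-1234' and ' 55-6-7 '.
```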
I would do it this way:
select *
from table_1
where isnumeric(replace(LOC_NAME, '-','')) = 1;
Update (2018-Jun-12)
After reading the comments of @EzLo, I realized that the OP may need exactly two hyphens (no more, no less), so I am updating my answer with the following demo code:
create table #t (LOC_NAME varchar(100));
go
insert into #t (loc_name)
values ('a12-b12-123'), ('123456-123-11'), ('123-123-123-123')
go
select *
from #t --table_1
where isnumeric(replace(LOC_NAME, '-','')) = 1
and len(loc_name)-len(replace(LOC_NAME, '-',''))=2
The result is: