Pull 3 digits from middle of id and sort by even odd - sql

I have file ids in my database that start with:
a single character prefix
a period
a three digit client id
a hyphen
a three digit file number.
Example F.129-123
We have several ids for each client.
I need to be able to strip out the three digit file number and then pull them based on even or odd so that I can assign specific data to each result population.
One added issue. Some of the ids have characters added at the end.
Example: F.129-123A or F.129-123.NF
So I need to be able to use just the three digit file number without any other characters, because the added characters cause errors during conversion.

If you are using SQL Server, you can use CHARINDEX() to find the index of - and then get the 3 digits after it using SUBSTRING():
SELECT substring('F.123-234',charindex('-','F.123-234')+1, 3)
If you are using MySQL, you can use POSITION() to find the index of - and then get the 3 digits after it using SUBSTRING():
SELECT SUBSTRING('F.123-234',POSITION( '-' IN 'F.123-234' )+1,3);
If you are using Oracle, you can use INSTR() to find the index of - and then get the 3 digits after it using SUBSTR().
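The Oracle case has no query of its own here, so the following is a minimal sketch using the same sample literal as the other examples. INSTR() returns the 1-based position of the hyphen, and SUBSTR() then takes the three characters after it.

```sql
-- Oracle: locate the hyphen with INSTR(), then take the 3 characters after it
SELECT SUBSTR('F.123-234', INSTR('F.123-234', '-') + 1, 3) AS file_no
FROM dual;
-- returns '234'
```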
Update:
Based on the requirements in the comments, you can use a query like the one below to achieve what you need.
SELECT
SUBSTRING(MatterID,CHARINDEX('-',MatterID)+1, 3) as FileNo
FROM
Matters
WHERE
MatterID LIKE 'f.129%'
AND MatterID NOT LIKE '%col%'
AND substring( MatterID, CHARINDEX('-',MatterID)+1, 3) % 2 = 0
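To assign data to each population, one option is to label every row as even or odd in a single pass instead of filtering on one parity. The sketch below reuses the Matters table and MatterID column from the query above; the Parity alias and the format check in the WHERE clause are illustrative assumptions, and TRY_CAST assumes SQL Server 2012 or later.

```sql
-- SQL Server sketch: label each three-digit file number as even or odd
SELECT
    MatterID,
    SUBSTRING(MatterID, CHARINDEX('-', MatterID) + 1, 3) AS FileNo,
    CASE
        WHEN TRY_CAST(SUBSTRING(MatterID, CHARINDEX('-', MatterID) + 1, 3) AS int) % 2 = 0
            THEN 'even'
        ELSE 'odd'
    END AS Parity
FROM Matters
-- keep only ids shaped like X.###-###, with or without a trailing suffix
WHERE MatterID LIKE '_.[0-9][0-9][0-9]-[0-9][0-9][0-9]%'
ORDER BY Parity, FileNo;
```

Because only the first three characters after the hyphen are used, suffixes such as A or .NF never reach the conversion.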

If you are working with Microsoft SQL Server, you could use the patindex() function with substring() to get only the 3-digit file number:
select left(substring(string, PATINDEX('%[0-9][-]%', string)+2, LEN(string)), 3)
Note that if you have a different separator (e.g. / instead of -), you will need to change the character in the pattern, e.g. PATINDEX('%[0-9][/]%', string)

In Postgres you can use split_part() to get the part after the hyphen, then cast it to an integer:
select *
from the_table
order by split_part(file_id, '-', 2)::int;
This assumes that there is always exactly one - in the string. As I understand your question, that is the case because the format is fixed.
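The cast above would fail on ids that carry a suffix, such as F.129-123A. A possible variation, assuming the file number is always the first three characters after the hyphen, trims with left() before casting and also exposes the parity. The file_no and is_even aliases are illustrative.

```sql
-- PostgreSQL sketch: keep only the 3 digits after the hyphen, then cast
select t.*,
       left(split_part(file_id, '-', 2), 3)::int as file_no,
       left(split_part(file_id, '-', 2), 3)::int % 2 = 0 as is_even
from the_table t
order by is_even, file_no;
```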

Is this helpful?
Create table #tmpFileNames(id int, FileName VARCHAR(50))
insert into #tmpFileNames values(1,'F.129-123')
insert into #tmpFileNames values(2,'F.129-125')
insert into #tmpFileNames values(3,'F.129-124')
insert into #tmpFileNames values(4,'F.129-123A')
insert into #tmpFileNames values(5,'F.129-124B')
insert into #tmpFileNames values(6,'F.129-125.PQ')
insert into #tmpFileNames values(7,'F.129-123.NF')
select SUBSTRING(STUFF(FileName, 1, CHARINDEX('-',FileName), ''),1,3), * from #tmpFileNames
Order by SUBSTRING(STUFF(FileName, 1, CHARINDEX('-',FileName), ''),1,3),id
Drop table #tmpFileNames

Related

Removing characters after a specified character format

I have a field that should contain 6 digits, a period, and six digits (######.######). The application that I use allows this to be free-form entry. Because users are users and will do what they want I have several fields that have a dash and some letters afterwards (######.######-XYZ).
Using T-SQL how do I identify and subsequently remove the -XYZ so that I can return the integrity of the data. The column is an NVARCHAR(36), PK, and does not allow null values. The column in question does have a unique columnID field.
If the part you want is the first 13 characters, then use left():
select left(field, 13)
You can check if the first 13 characters are what you expect:
select (case when field like '[0-9][0-9][0-9][0-9][0-9][0-9].[0-9][0-9][0-9][0-9][0-9][0-9]%'
then left(field, 13)
else -- whatever you want when the field is bad
end)
Since it's free-form and "users are users", use charindex to 1) find out whether there is a - and 2) remove it together with whatever follows.
Example:
DECLARE @test NVARCHAR(36) = N'######.######-XYZ'
SELECT SUBSTRING(@test,1,COALESCE(NULLIF(CHARINDEX('-',@test,1),0),LEN(@test)+1)-1)
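Applied to a table, the same idea could be split into an identify step and a cleanup step. This is only a sketch: dbo.MyTable and field are placeholder names, not taken from the question. Since the column is the primary key, the UPDATE will fail if a trimmed value collides with an existing key, so it is worth reviewing the output of the first query beforehand.

```sql
-- T-SQL sketch; dbo.MyTable and field are placeholder names
-- 1) Identify rows that carry a suffix after a dash
SELECT field
FROM dbo.MyTable
WHERE CHARINDEX('-', field) > 0;

-- 2) Remove the dash and everything after it
UPDATE dbo.MyTable
SET field = LEFT(field, CHARINDEX('-', field) - 1)
WHERE CHARINDEX('-', field) > 0;
```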

How to identify combination of number and character in SQL

I have a requirement where I have to find number of records in a special pattern in the field ref_id in a table. It's a varchar column. I need to find all the records where 8th, 9th and 10th character are numeric+XX. That is it should be like 2XX or 8XX. I tried using regexp :digit: but no luck. Essentially I am looking for all records where 8th-10th characters are 1XX, 2XX, 3XX… etc
Using REGEXP_LIKE (replace table with your own table name):
SELECT COUNT(*)
FROM table
WHERE REGEXP_LIKE(ref_id,'^.{7}[0-9]XX');
.{7} whatever seven characters
[0-9] 8th character digit
XX 9th and 10th characters X
Or, with the [:digit:] class you mentioned, you may use:
SELECT COUNT(*)
FROM table
WHERE REGEXP_LIKE(ref_id,'^.{7}[[:digit:]]XX');
This can also be achieved using standard non-regex SQL functions
select * from t where s like '________XX%' -- any 8 characters and then XX
AND translate( substr(s,8,1),'?0123456789','?') is null; --8th one is numeric
No need for a regexp:
select * from mytable where substr(ref_id, 8, 3) in ('0XX','1XX','2XX','3XX','4XX','5XX','6XX','7XX','8XX','9XX')
or
select * from mytable where substr(ref_id, 8, 3) in ('1XX','2XX','3XX','4XX','5XX','6XX','7XX','8XX','9XX')
I don't know if '0XX' is a valid match or not.
Regexps tend to be slow.

SQL syntax alphanumeric characters, SQL Server

If I pull this ID down from my source system it looks like 9006ABCD.
What would the syntax look like if I just want to return 9006 as the ID?
Essentially, I don't need the alpha characters.
Assuming that '9006ABCD' is a string value, then you can extract the leading numbers using:
select left(id, patindex('%[^0-9]%', id + 'X') - 1)
Of course, there may be easier ways. If you just want the first four characters, then use left(id, 4).

SQL Server search using like while ignoring blank spaces

I have a phone column in the database, and the records contain unwanted spaces on the right. I tried to use trim and replace, but it didn't return the correct results.
If I use
phone like '%2581254%'
it returns
customerid
-----------
33470
33472
33473
33474
but I need to use the percent sign (wildcard) at the beginning only; I want the wildcard on the left side only.
So if I use it like this
phone like '%2581254'
I get nothing, because of the spaces on the right!
So I tried to use trim and replace, and I get one result only
LTRIM(RTRIM(phone)) LIKE '%2581254'
returns
customerid
-----------
33474
Note that these four ids have same phone number!
Table data
customerid phone
-------------------------------------
33470 96506217601532388254
33472 96506217601532388254
33473 96506217601532388254
33474 96506217601532388254
33475 966508307940
I added many numbers for testing purposes.
The php function takes last 7 digits and compare them.
For example
01532388254 will be 2581254
and I want to search for all users that have these 7 digits in their phone number
2581254
I can't figure out where's the problem!
It should return 4 ids instead of 1 id
Given the sample data, I suspect you have control characters in your data. For example char(13), char(10)
To confirm this, just run the following
Select customerid,phone
From YourTable
Where CharIndex(CHAR(0),[phone])+CharIndex(CHAR(1),[phone])+CharIndex(CHAR(2),[phone])+CharIndex(CHAR(3),[phone])
+CharIndex(CHAR(4),[phone])+CharIndex(CHAR(5),[phone])+CharIndex(CHAR(6),[phone])+CharIndex(CHAR(7),[phone])
+CharIndex(CHAR(8),[phone])+CharIndex(CHAR(9),[phone])+CharIndex(CHAR(10),[phone])+CharIndex(CHAR(11),[phone])
+CharIndex(CHAR(12),[phone])+CharIndex(CHAR(13),[phone])+CharIndex(CHAR(14),[phone])+CharIndex(CHAR(15),[phone])
+CharIndex(CHAR(16),[phone])+CharIndex(CHAR(17),[phone])+CharIndex(CHAR(18),[phone])+CharIndex(CHAR(19),[phone])
+CharIndex(CHAR(20),[phone])+CharIndex(CHAR(21),[phone])+CharIndex(CHAR(22),[phone])+CharIndex(CHAR(23),[phone])
+CharIndex(CHAR(24),[phone])+CharIndex(CHAR(25),[phone])+CharIndex(CHAR(26),[phone])+CharIndex(CHAR(27),[phone])
+CharIndex(CHAR(28),[phone])+CharIndex(CHAR(29),[phone])+CharIndex(CHAR(30),[phone])+CharIndex(CHAR(31),[phone])
+CharIndex(CHAR(127),[phone]) >0
If the Test Results are Positive
The following UDF can be used to strip the control characters from your data via an update
Update YourTable Set Phone=[dbo].[udf-Str-Strip-Control](Phone)
The UDF if Interested
CREATE FUNCTION [dbo].[udf-Str-Strip-Control](@S varchar(max))
Returns varchar(max)
Begin
;with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(C) As (Select Top (32) Char(Row_Number() over (Order By (Select NULL))-1) From cte1 a,cte1 b)
Select @S = Replace(@S,C,' ')
From cte2
Return LTrim(RTrim(Replace(Replace(Replace(@S,' ','><'),'<>',''),'><',' ')))
End
--Select [dbo].[udf-Str-Strip-Control]('Michael '+char(13)+char(10)+'LastName') --Returns: Michael LastName
As promised (and nudged by Bill), the following is a little commentary on the UDF.
We pass a string that we want stripped of control characters.
We create an ad-hoc tally table of ASCII characters 0 - 31.
We then run a global search-and-replace for each character in the tally table. Each character found will be replaced with a space.
The final string is stripped of repeating spaces (a little trick Gordon demonstrated several weeks ago; I don't have the original link).
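The space-collapsing step in the function's Return line can be tried on its own. The sketch below runs the same three nested Replace() calls on a literal with three consecutive spaces; it assumes the input contains no < or > characters, since those are used as temporary markers.

```sql
-- Collapse runs of spaces into a single space with three nested REPLACE calls
SELECT REPLACE(REPLACE(REPLACE('a   b', ' ', '><'), '<>', ''), '><', ' ') AS collapsed;
-- step 1 (each space becomes '><'):  'a><><><b'
-- step 2 (remove every '<>'):        'a><b'
-- step 3 ('><' back to a space):     'a b'
```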

Extracting specific part of column values in Oracle SQL

I want to extract a specific part of column values.
The target column and its values look like
TEMP_COL
---------------
DESCOL 10MG
TEGRAL 200MG 50S
COLOSPAS 135MG 30S
The resultant column should look like
RESULT_COL
---------------
10MG
200MG
135MG
This can be done using a regular expression:
SELECT regexp_substr(TEMP_COL, '[0-9]+MG')
FROM the_table;
Note that this is case sensitive and it always returns the first match.
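If the unit might also appear in lowercase (for example 10mg), a possible variation passes the 'i' match parameter to make the search case-insensitive. The extra arguments are the start position (1) and the occurrence (1); the result_col alias follows the column name from the question.

```sql
-- Oracle sketch: case-insensitive match of the first "<digits>MG" token
SELECT regexp_substr(TEMP_COL, '[0-9]+MG', 1, 1, 'i') AS result_col
FROM the_table;
```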
I would probably approach this using REGEXP_SUBSTR() rather than base functions, because the structure of the prescription text varies from record to record.
SELECT TRIM(REGEXP_SUBSTR(TEMP_COL, '(\s)(\S*)', 1, 1))
FROM yourTable
The pattern (\s)(\S*) will match a single space followed by any number of non-space characters. This should match the second term in all cases. We use TRIM() to remove a leading space which is matched and returned.
How do you know which part you want to extract? How do you know where it begins and where it ends? By the whitespace?
If so, you can use substr to cut the data and instr to find the whitespace.
Example:
select substr(tempcol, -- string
instr(tempcol, ' ', 1) + 1, -- first character after the first white-space
instr(tempcol || ' ', ' ', 1, 2) - instr(tempcol, ' ', 1) - 1) -- length up to the next space (or the end of the string)
from dual
Another solution is to use regexp_substr (but it might perform worse if you have a lot of rows):
SELECT REGEXP_SUBSTR (tempcol, '(\S*)(\s*)', 1, 2)
FROM dual;
Edit: fixed the regular expression to handle values that don't have a space after the parsed text. Sorry about that.