How to get maximum value of a specific part of strings?

How to get maximum value of a specific part of strings? - sql

I have below records
Id Title
500006 FS/97/98/037
500007 FS/97/04/035
500008 FS/97/01/036
500009 FS/97/104/040
I should split Title field and get 4th part of text and return maximum value. In this example my query should return 040 or 40.

select max(cast(right(Title, charindex('/', reverse(Title) + '/') - 1) as int))
from your_table
SQLFiddle demo

You can use PARSENAME function since you always have 4 parts(confirmed in comments section)
select max(cast(parsename(replace(Title,'/','.'),1) as int))
from yourtable

If you want to split the data in the Title column and get the part from the splitted text by position, you may try with one JSON-based approach with a simple string transformation. You need to transform the data in the Title column into a valid JSON array (FS/97/98/037 into ["FS","97","08","037"]) and after that to parse thе data with OPENJSON(). The result from OPENJSON() (using default schema and parsing JSON array) is a table with columns key, value and type, and the key column holds the index of the items in the JSON array:
Note, that using STRING_SPLIT() is not an option here, because the order of the returned rows is not guaranteed.
Table:
CREATE TABLE Data (
Id varchar(6),
Title varchar(50)
)
INSERT INTO Data
(Id, Title)
VALUES
('500006', 'FS/97/98/037'),
('500007', 'FS/97/04/035'),
('500008', 'FS/97/01/036'),
('500009', 'FS/97/104/040')
Statement:
SELECT MAX(j.[value])
FROM Data d
CROSS APPLY OPENJSON(CONCAT('["', REPLACE(d.Title, '/', '","'), '"]')) j
WHERE (j.[key] + 1) = 4
If you data has fixed format with 4 parts, even this approach may help:
SELECT MAX(PARSENAME(REPLACE(Title, '/', '.'), 1))
FROM Data

You can also try the below query.
SELECT Top 1
CAST('<x>' + REPLACE(Title,'/','</x><x>') + '</x>' AS XML).value('/x[4]','int') as Value
from Data
order by 1 desc
You can find the live demo Here.

Related

Order by generated column

I have following data structure:
CREATE TABLE test(
id INTEGER PRIMARY KEY,
data TEXT NOT NULL
);
INSERT INTO test (data) VALUES ('10.20.3.40'), ('10.100.3'), ('10.20.20.40')
The problem is that i need to order by data column using integer logic (dot-separated string as array of integers).
Using order by it returns data sorted as text:
SELECT data FROM test ORDER BY data
10.100.3
10.20.20.40
10.20.3.40
Result I need to achieve:
10.20.3.40
10.20.20.40
10.100.3
The simplest method to sort it properly without reimplementing arrays in SQLite which I've found is to add zero-padding to each part of data.
So basically I need to:
Select all data from table;
Split value of data column;
Add zero-padding and join it back;
Join newly generated column with reformatted data
Order by this column
What have I already done:
WITH RECURSIVE split_str(source, part) AS (
SELECT '10.20.3.40' || '.', NULL
UNION ALL
SELECT
substr(source, instr(source, '.') + 1),
substr('000000' || source, instr(source, '.'), 6)
FROM split_str WHERE source != ''
)
SELECT group_concat(part, '.') AS new_data FROM split_str WHERE part IS NOT NULL
It splits constant string '10.20.3.40' by dot, add leading zeros to each part and join it back using group_concat(). It returns:
000010.000020.000003.000040
Now I need to apply such a modification to values of data column from test table and somehow use this values for sorting. That's result I'm trying to get:
I'm not an expert in SQL (obviously) and don't understand how to apply expression in WITH clause on each data column separately.

As you can't use GROUP_CONCAT() if you want to preserve the order, just build up another string instead.
Then, only take the records where there's no more 'unpadded' string still to be processed.
WITH RECURSIVE
test_set(original)
AS
(
SELECT '10.20.3.40'
UNION ALL
SELECT '10.100.3'
UNION ALL
SELECT '10.20.20.40'
),
split_str(original, remaining, padded)
AS
(
SELECT original, original || '.', '' FROM test_set
---------
UNION ALL
---------
SELECT
original,
substr(remaining, instr(remaining, '.') + 1),
padded || '.' || substr('000000' || remaining, instr(remaining, '.'), 6)
FROM
split_str
WHERE
remaining != ''
)
SELECT
original,
padded
FROM
split_str
WHERE
remaining = ''
ORDER BY
padded
Demo: db<>fiddle.uk
(You may or may not want to strip the leading ., depending on your needs.)

How to fetch only a part of string

I have a column which has inconsistent data. The column named ID and it can have values such as
0897546321
ABC,0876455321
ABC,XYZ,0873647773
ABC,
99756
test only
The SQL query should fetch only Ids which are of 10 digit in length, should begin with a 08 , should be not null and should not contain all characters. And for those values, which have both digits and characters such as ABC,XYZ,0873647773, it should only fetch the 0873647773 . In these kind of values, nothing is fixed, in place of ABC, XYZ , it can be anything and can be of any length.
The column Id is of varchar type.
My try: I tried the following query
select id
from table
where id is not null
and id not like '%[^0-9]%'
and id like '[08]%[0-9]'
and len(id)=10
I am still not sure how should I deal with values like ABC,XYZ,0873647773
P.S - I have no control over the database. I can't change its values.

SQL Server generally has poor support regular expressions, but in this case a judicious use of PATINDEX is viable:
SELECT SUBSTRING(id, PATINDEX('%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%', ',' + id + ','), 10) AS number
FROM yourTable
WHERE ',' + id + ',' LIKE '%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%';
Demo

If you normalise your data, and split the delimited data into parts, you can achieve this some what more easily:
SELECT SS.value
FROM dbo.YourTable YT
CROSS APPLY STRING_SPLIT(YT.YourColumn,',') SS
WHERE LEN(SS.value) = 10
AND SS.value NOT LIKE '%[^0-9]%';
If you're on an older version of SQL Server, you'll have to use an alternative String Splitter method (such as a XML splitter or user defined inline table-value function); there are plenty of examples on these already on Stack Overflow.
db<>fiddle

sql extract rightmost number in string and increment

i have transaction codes like
"A0004", "1B2005","20CCCCCCC21"
I need to extract the rightmost number and increment the transaction code by one
"AA0004"----->"AA0005"
"1B2005"------->"1B2006"
"20CCCCCCCC21"------>"20CCCCCCCC22"
in SQL Server 2012.
unknown length of string
right(n?) always number
dealing with unsignificant number of string and number length is out of my league.
some logic is always missing.
LEFT(#a,2)+RIGHT('000'+CONVERT(NVARCHAR,CONVERT(INT,SUBSTRING( SUBSTRING(#a,2,4),2,3))+1)),3

First, I want to be clear about this: I totally agree with the comments to the question from a_horse_with_no_name and Jeroen Mostert.
You should be storing one data point per column, period.
Having said that, I do realize that a lot of times the database structure can't be changed - so here's one possible way to get that calculation for you.
First, create and populate sample table (Please save us this step in your future questions):
DECLARE #T AS TABLE
(
col varchar(100)
);
INSERT INTO #T (col) VALUES
('A0004'),
('1B2005'),
('1B2000'),
('1B00'),
('20CCCCCCC21');
(I've added a couple of strings as edge cases you didn't mention in the question)
Then, using a couple of cross apply to minimize code repetition, I came up with that:
SELECT col,
LEFT(col, LEN(col) - LastCharIndex + 1) +
REPLICATE('0', LEN(NumberString) - LEN(CAST(NumberString as int))) +
CAST((CAST(NumberString as int) + 1) as varchar(100)) As Result
FROM #T
CROSS APPLY
(
SELECT PATINDEX('%[^0-9]%', Reverse(col)) As LastCharIndex
) As Idx
CROSS APPLY
(
SELECT RIGHT(col, LastCharIndex - 1) As NumberString
) As NS
Results:
col Result
A0004 A0005
1B2005 1B2006
1B2000 1B2001
1B00 1B01
20CCCCCCC21 20CCCCCCC22
The LastCharIndex represents the index of the last non-digit char in the string.
The NumberString represents the number to increment, as a string (to preserve the leading zeroes if they exists).
From there, it's simply taking the left part of the string (that is, up until the number), and concatenate it to a newly calculated number string, using Replicate to pad the result of addition with the exact number of leading zeroes the original number string had.

Try This
DECLARE #test nvarchar(1000) ='"A0004", "1B2005","20CCCCCCC21"'
DECLARE #Temp AS TABLE (ID INT IDENTITY,Data nvarchar(1000))
INSERT INTO #Temp
SELECT #test
;WITH CTE
AS
(
SELECT Id,LTRIM(RTRIM((REPLACE(Split.a.value('.' ,' nvarchar(max)'),'"','')))) AS Data
,RIGHT(LTRIM(RTRIM((REPLACE(Split.a.value('.' ,' nvarchar(max)'),'"','')))),1)+1 AS ReqData
FROM
(
SELECT ID,
CAST ('<S>'+REPLACE(Data,',','</S><S>')+'</S>' AS XML) AS Data
FROM #Temp
) AS A
CROSS APPLY Data.nodes ('S') AS Split(a)
)
SELECT CONCAT('"'+Data+'"','-------->','"'+CONCAT(LEFT(Data,LEN(Data)-1),CAST(ReqData AS VARCHAR))+'"') AS ExpectedResult
FROM CTE
Result
ExpectedResult
-----------------
"A0004"-------->"A0005"
"1B2005"-------->"1B2006"
"20CCCCCCC21"-------->"20CCCCCCC22"

STUFF(#X
,LEN(#X)-CASE PATINDEX('%[A-Z]%',REVERSE(#X)) WHEN 0 THEN LEN(#X) ELSE PATINDEX('%[A-Z]%',REVERSE(#X))-1 END+1
,LEN(((RIGHT(#X,CASE PATINDEX('%[A-Z]%',REVERSE(#X)) WHEN 0 THEN LEN(#X) ELSE PATINDEX('%[A-Z]%',REVERSE(#X))-1 END)/#N)+1)#N)
,((RIGHT(#X,CASE PATINDEX('%[A-Z]%',REVERSE(#X)) WHEN 0 THEN LEN(#X) ELSE PATINDEX('%[A-Z]%',REVERSE(#X))-1 END)/#N)+1)#N)
works on number only strings
99 becomes 100
mod(#N) increments

Extract a string in Postgresql and remove null/empty elements

I Need to extract values from string with Postgresql
But for my special scenario - if an element value is null i want to remove it and bring the next element 1 index closer.
e.g.
assume my string is: "a$$b"
If i will use
select string_to_array('a$$b','$')
The result is:
{a,,b}
If Im trying
SELECT unnest(string_to_array('a__b___d_','_')) EXCEPT SELECT ''
It changes the order
1.d
2.a
3.b
order changes which is bad for me.
I have found a other solution with:
select array_remove( string_to_array(a||','||b||','||c,',') , '')
from (
select
split_part('a__b','_',1) a,
split_part('a__b','_',2) b,
split_part('a__b','_',3) c
) inn
Returns
{a,b}
And then from the Array - i need to extract values by index
e.g. Extract(ARRAY,2)
But this one seems to me like an overkill - is there a better or something simpler to use ?

You can use with ordinality to preserve the index information during unnesting:
select a.c
from unnest(string_to_array('a__b___d_','_')) with ordinality as a(c,idx)
where nullif(trim(c), '') is not null
order by idx;
If you want that back as an array:
select array_agg(a.c order by a.idx)
from unnest(string_to_array('a__b___d_','_')) with ordinality as a(c,idx)
where nullif(trim(c), '') is not null;

SQL Summing digits of a number

i'm using presto. I have an ID field which is numeric. I want a column that adds up the digits within the id. So if ID=1234, I want a column that outputs 10 i.e 1+2+3+4.
I could use substring to extract each digit and sum it but is there a function I can use or simpler way?

You can combine regexp_extract_all from #akuhn's answer with lambda support recently added to Presto. That way you don't need to unnest. The code would be really self explanatory if not the need for cast to and from varchar:
presto> select
reduce(
regexp_extract_all(cast(x as varchar), '\d'), -- split into digits array
0, -- initial reduction element
(s, x) -> s + cast(x as integer), -- reduction function
s -> s -- finalization
) sum_of_digits
from (values 1234) t(x);
sum_of_digits
---------------
10
(1 row)

If I'm reading your question correctly you want to avoid having to hardcode a substring grab for each numeral in the ID, like substring (ID,1,1) + substring (ID,2,1) + ...substring (ID,n,1). Which is inelegant and only works if all your ID values are the same length anyway.
What you can do instead is use a recursive CTE. Doing it this way works for ID fields with variable value lengths too.
Disclaimer: This does still technically use substring, but it does not do the clumsy hardcode grab
WITH recur (ID, place, ID_sum)
AS
(
SELECT ID, 1 , CAST(substring(CAST(ID as varchar),1,1) as int)
FROM SO_rbase
UNION ALL
SELECT ID, place + 1, ID_sum + substring(CAST(ID as varchar),place+1,1)
FROM recur
WHERE len(ID) >= place + 1
)
SELECT ID, max(ID_SUM) as ID_sum
FROM recur
GROUP BY ID

First use REGEXP_EXTRACT_ALL to split the string. Then use CROSS JOIN UNNEST GROUP BY to group the extracted digits by their number and sum over them.
Here,
WITH my_table AS (SELECT * FROM (VALUES ('12345'), ('42'), ('789')) AS a (num))
SELECT
num,
SUM(CAST(digit AS BIGINT))
FROM
my_table
CROSS JOIN
UNNEST(REGEXP_EXTRACT_ALL(num,'\d')) AS b (digit)
GROUP BY
num
;

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to get maximum value of a specific part of strings? - sql

I have below records Id Title 500006 FS/97/98/037 500007 FS/97/04/035 500008 FS/97/01/036 500009 FS/97/104/040 I should split Title field and get 4th part of text and return maximum value. In this example my query should return 040 or 40.

select max(cast(right(Title, charindex('/', reverse(Title) + '/') - 1) as int)) from your_table SQLFiddle demo

You can use PARSENAME function since you always have 4 parts(confirmed in comments section) select max(cast(parsename(replace(Title,'/','.'),1) as int)) from yourtable

You can also try the below query. SELECT Top 1 CAST('<x>' + REPLACE(Title,'/','</x><x>') + '</x>' AS XML).value('/x[4]','int') as Value from Data order by 1 desc You can find the live demo Here.

Related

Order by generated column

How to fetch only a part of string

sql extract rightmost number in string and increment

Extract a string in Postgresql and remove null/empty elements

SQL Summing digits of a number

Categories

Resources