How to get string from value - sql

I have an Customer_value column.
The column contains values like:
DAL123245,HC.533675,ABC.01232423
HC.3425364,ABC.045367544,DAL4346456
HC.35344,ABC.03543645754,ABC.023534454,DAL.4356433
ABC.043534553,HC.3453643,ABC.05746343
What I am trying to do is get the number after the first "ABC.0" string.
For example, this is what I would like to get:
1232423
5367544
3543645754
43534553
this is what I tried:
Substring(customer_value,charindex('ABC.', customer_value) + 5, len(customer_value)) as dataneeded
The issue that I got is for 1 and 2 I got that right data as needed, but for 3 and 4, because there are multiple ABC so it gave me everything after the first ABC.
How can I get the number after the first ABC. only?
Thank you so much

Just another option is to use a bit of JSON to parse and preserve the sequence in concert with a CROSS APPLY
Note: Use OUTER APPLY to see NULL values
Example
Select NewVal = replace(Value,'ABC.0','')
From YourTable A
Cross Apply (
Select Top 1 *
From OpenJSON( '["'+replace(string_escape(customer_value,'json'),',','","')+'"]' )
Where Value like 'ABC.0%'
Order by [key]
) B
Results
NewVal
1232423
45367544
3543645754
43534553

On the assumption you are using SQL Server (given your use of charindex()/substring()/len()) you can use apply to calculate the starting position and then find the next occurence utilising the start position optional parameter of charindex, then get the substring between the values.
select Substring(customer_value, p1.v, Abs(p2.v-p1.v)) as dataneeded
from t
cross apply(values(charindex('ABC.', customer_value)+5))p1(v)
cross apply(values(charindex(',', customer_value,p1.v)))p2(v)

Related

A SQL Query to select certain strings in a folder path

I have a table with a column that contains the path to SSIS packages located in a drive. The entire folder path is populated in the column. I need a SQL query to get a section of the string within the folder path.
An example of record in the column_1.
/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E
All I am interested in extracting is the "SSIS_Packages_Source_to_Target_Data_Snowflake". Everything I have tried so far throws errors. The latest code I tried is:
SELECT SUBSTRING(Column_1, LEFT(CHARINDEX('dtsx', Column_1)), LEN(Column_1) - CHARINDEX('dtsx', Column_1)).
I would really appreciate some help with this.
Thanks!
Given you know the extension and its unlikely to appear elsewhere in the string, find it, and truncate to it. Do that in a CROSS APPLY so we can use the value multiple times.
Then find the nearest slash (using REVERSE) and use SUBSTRING from there to the end.
SELECT
SUBSTRING(Y.[VALUE], LEN(Y.[VALUE]) - PATINDEX('%\%', REVERSE(Y.[VALUE])) + 2, LEN(Y.[VALUE]))
FROM (
VALUES ('/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E')
) AS X ([Value])
CROSS APPLY (
VALUES (SUBSTRING(X.[Value], 1, PATINDEX('%.dtsx%', X.[Value])-1))
) AS Y ([Value]);
Returns:
SSIS_Packages_Source_to_Target_Data_Snowflake
Another possible way is this. not sure on the performance of it though
SELECT vt.[value]
FROM (
VALUES ('/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E')
) AS X ([Value])
OUTER APPLY (
SELECT * FROM STRING_SPLIT(x.Value,'\')
) vt
WHERE vt.[value] LIKE '%.dtsx'
Thank you #Dale K for your response and solutions provided. I was able to replicate the same for my query to obtain the result. Below is how I modified the query in my environment to fetch only the new string column1 after applying the string manipulations based on your solution:
SELECT SUBSTRING(Y.column1, LEN(Y.column1) - PATINDEX('%%', REVERSE(Y.column1)) +2, LEN(Y.column1))
FROM (SELECT column1 FROM CTE1) AS X ([column2])
CROSS APPLY (SELECT SUBSTRING(X.column2, 1, PATINDEX('%.dtsx%', X.column2)-1) FROM CTE1) AS Y ([column1])
I am actually query a CTE table (CTE1) to get my desired result. The issue is that, I have other columns in the CTE1 that I need to include in the final select query results, which of course should include the string manipulated results from Column1. Currently, I get errors when I try to include other columns in my final result from the CTE1 along with the resultset from the query above.
Example of final query:
Select Jobname,job_step,job_date,job_duration,Column1 (this will be the resultset from the string manipulation)FROM CTE1;
So, what I'm currently doing that is not working is as follows:
SELECT C1.Jobname,C1.job_step,C1.job_date,C1.job_duration,Column1 =(SELECT SUBSTRING(Y.column1, LEN(Y.column1) - PATINDEX('%\%', REVERSE(Y.column1)) +2, LEN(Y.column1)) FROM (SELECT column1 FROM CTE1) AS X ([column2]) CROSS APPLY (SELECT SUBSTRING(X.column2, 1, PATINDEX('%.dtsx%', X.column2)-1) FROM CTE1) AS Y ([column1])) FROM CTE1 C1
Please, how can I obtain the final results with all the above columns present in the resultsets?
Thank you.

Extract String Between Two Different Characters

I am having some trouble trying to figure out a way to extract a string between two different characters. My issue here is that the column (CONFIG_ID) contains more that 75,000 rows and the format is not consistent, so I cannot figure out a way to get the numbers between E and B.
*CONFIG_ID*
6E15B1P
999E999B1P
1E3B1P
1E30B1P
5E24B1P
23E6B1P
Another option is to use a CROSS APPLY to calculate the values only once. Another nice thing about CROSS APPLY is that you can stack calculations and use them in the top SELECT
Notice the nullif() rather than throwing an error if the character is not found, it will return a NULL
THIS ALSO ASSUMES there are no LEADING B's
Example
Declare #YourTable Table ([CONFIG_ID] varchar(50)) Insert Into #YourTable Values
('6E15B1P')
,('999E999B1P')
,('1E3B1P')
,('1E30B1P')
,('5E24B1P')
,('23E6B1P')
,('23E6ZZZ') -- Notice No B
Select [CONFIG_ID]
,NewValue = substring([CONFIG_ID],P1,P2-P1)
From #YourTable
Cross Apply ( values (nullif(charindex('E',[CONFIG_ID]),0)+1
,nullif(charindex('B',[CONFIG_ID]),0)
) )B(P1,P2)
Results
CONFIG_ID NewValue
6E15B1P 15
999E999B1P 999
1E3B1P 3
1E30B1P 30
5E24B1P 24
23E6B1P 6
23E6ZZZ NULL -- Notice No B
SUBSTRING(config_id,PATINDEX('%E%',config_id)+1,PATINDEX('%B%',config_id)-PATINDEX('%E%',config_id)-1)
As in:
WITH dat
AS
(
SELECT config_id
FROM (VALUES ('1E30B1P')) t(config_id)
)
SELECT SUBSTRING(config_id,PATINDEX('%E%',config_id)+1,PATINDEX('%B%',config_id)-PATINDEX('%E%',config_id)-1)
FROM dat
A cased substring of a left could be enough.
select *
, CASE
WHEN [CONFIG_ID] LIKE '%E%B%'
THEN SUBSTRING(LEFT([CONFIG_ID], CHARINDEX('B',[CONFIG_ID],CHARINDEX('E',[CONFIG_ID]))),
CHARINDEX('E',[CONFIG_ID]), LEN([CONFIG_ID]))
END AS [CONFIG_EB]
from Your_Table
CONFIG_ID
CONFIG_EB
6E15B1P
E15B
999E999B1P
E999B
1E3B1P
E3B
1E30B1P
E30B
5E24B1P
E24B
23E6B1P
E6B
23E678
null
236789
null
23B456
null
Test on db<>fiddle here

Extract a string in Postgresql and remove null/empty elements

I Need to extract values from string with Postgresql
But for my special scenario - if an element value is null i want to remove it and bring the next element 1 index closer.
e.g.
assume my string is: "a$$b"
If i will use
select string_to_array('a$$b','$')
The result is:
{a,,b}
If Im trying
SELECT unnest(string_to_array('a__b___d_','_')) EXCEPT SELECT ''
It changes the order
1.d
2.a
3.b
order changes which is bad for me.
I have found a other solution with:
select array_remove( string_to_array(a||','||b||','||c,',') , '')
from (
select
split_part('a__b','_',1) a,
split_part('a__b','_',2) b,
split_part('a__b','_',3) c
) inn
Returns
{a,b}
And then from the Array - i need to extract values by index
e.g. Extract(ARRAY,2)
But this one seems to me like an overkill - is there a better or something simpler to use ?
You can use with ordinality to preserve the index information during unnesting:
select a.c
from unnest(string_to_array('a__b___d_','_')) with ordinality as a(c,idx)
where nullif(trim(c), '') is not null
order by idx;
If you want that back as an array:
select array_agg(a.c order by a.idx)
from unnest(string_to_array('a__b___d_','_')) with ordinality as a(c,idx)
where nullif(trim(c), '') is not null;

SQL Summing digits of a number

i'm using presto. I have an ID field which is numeric. I want a column that adds up the digits within the id. So if ID=1234, I want a column that outputs 10 i.e 1+2+3+4.
I could use substring to extract each digit and sum it but is there a function I can use or simpler way?
You can combine regexp_extract_all from #akuhn's answer with lambda support recently added to Presto. That way you don't need to unnest. The code would be really self explanatory if not the need for cast to and from varchar:
presto> select
reduce(
regexp_extract_all(cast(x as varchar), '\d'), -- split into digits array
0, -- initial reduction element
(s, x) -> s + cast(x as integer), -- reduction function
s -> s -- finalization
) sum_of_digits
from (values 1234) t(x);
sum_of_digits
---------------
10
(1 row)
If I'm reading your question correctly you want to avoid having to hardcode a substring grab for each numeral in the ID, like substring (ID,1,1) + substring (ID,2,1) + ...substring (ID,n,1). Which is inelegant and only works if all your ID values are the same length anyway.
What you can do instead is use a recursive CTE. Doing it this way works for ID fields with variable value lengths too.
Disclaimer: This does still technically use substring, but it does not do the clumsy hardcode grab
WITH recur (ID, place, ID_sum)
AS
(
SELECT ID, 1 , CAST(substring(CAST(ID as varchar),1,1) as int)
FROM SO_rbase
UNION ALL
SELECT ID, place + 1, ID_sum + substring(CAST(ID as varchar),place+1,1)
FROM recur
WHERE len(ID) >= place + 1
)
SELECT ID, max(ID_SUM) as ID_sum
FROM recur
GROUP BY ID
First use REGEXP_EXTRACT_ALL to split the string. Then use CROSS JOIN UNNEST GROUP BY to group the extracted digits by their number and sum over them.
Here,
WITH my_table AS (SELECT * FROM (VALUES ('12345'), ('42'), ('789')) AS a (num))
SELECT
num,
SUM(CAST(digit AS BIGINT))
FROM
my_table
CROSS JOIN
UNNEST(REGEXP_EXTRACT_ALL(num,'\d')) AS b (digit)
GROUP BY
num
;

Expanding list from SQL column with XML datatype

Given a table with an XML column, how can I expand a list stored therein, into a relational rowset?
Here is an example of a similar table.
DECLARE #test TABLE
(
id int,
strings xml
)
insert into #test (id, strings)
select 1, '<ArrayOfString><string>Alpha</string><string>Bravo</string><string>Charlie</string></ArrayOfString>' union
select 2, '<ArrayOfString><string>Bravo</string><string>Delta</string></ArrayOfString>'
I would like to obtain a rowset like this.
id string
-- ------
1 Alpha
1 Bravo
1 Charlie
2 Bravo
2 Delta
I have figured out that I can use the nodes method to output the XML value for each id like this
select id, R.strings.query('/ArrayOfString/string') as string
from #test cross apply strings.nodes('/ArrayOfString/string') as R(strings)
returning
id string
-- ------
1 <string>Alpha</string><string>Bravo</string><string>Charlie</string>
1 <string>Alpha</string><string>Bravo</string><string>Charlie</string>
1 <string>Alpha</string><string>Bravo</string><string>Charlie</string>
2 <string>Bravo</string><string>Delta</string>
2 <string>Bravo</string><string>Delta</string>
and that I can append [1] to the query to get only the first string
select id, R.strings.query('/ArrayOfString/string[1]') as string
from #test cross apply strings.nodes('/ArrayOfString/string') as R(strings)
but that still leaves me with this
id string
-- ------
1 <string>Alpha</string>
1 <string>Alpha</string>
1 <string>Alpha</string>
2 <string>Bravo</string>
2 <string>Bravo</string>
Additionally, if I change the R.strings.query to R.strings.value to get rid of the tags, I get an error; XQuery [#test.strings.value()]: 'value()' requires a singleton (or empty sequence), found operand of type 'xdt:untypedAtomic *'.
Can anyone point out what I'm missing in order to get each subsequent string value for each id on subsequent rows, and why the value() method is giving me that error?
Here is how you can do it using a cross apply and the xml methods value and nodes:
select
id,
s2.value('.', 'varchar(max)') as string
from #test
cross apply strings.nodes('/*/string') t(s2)
You were getting the error because you were missing the () around the xPath:
select id,
R.strings.value('(/ArrayOfString/string)[1]','varchar(max)') as string
from #test
cross apply strings.nodes('/ArrayOfString/string') as R(strings)
The problem with your script is that you are always selecting the first string. Using the '.' you will get the contents of the XML element string.
Maybe you could try using the string() function instead of the value() function? This should return the actual value of the node without the xml.
Here is the msdn page - http://msdn.microsoft.com/en-us/library/ms188692.aspx
Really hope that helps. :)