tsql - extract only a portion of a dynamic string - sql

I have a table with a column that holds a string such as:
xx-xx-xxx-84
xx-25-xxx-xx
xx-xx-123-xx
I want to go ahead and query out the numbers only, but as you can see, the numbers appear at different positions in the string every time. Is there a way to query for only the numbers in the string?
Thank you,
I appreciate your time!

This requires repeated application of string functions. One method that helps with all the nesting is using OUTER APPLY. Something like this:
select t3.col, t3.col3
from t outer apply
     (select patindex('%[0-9]%', t.col) as numpos) t1 outer apply
     (select substring(t.col, t1.numpos, len(t.col)) as col2) t2 outer apply
     (select (case when t2.col2 like '%-%'
                   then left(t2.col2, charindex('-', t2.col2) - 1)
                   else t2.col2
              end) as col3
     ) t3
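If it helps, here is a compact way to try the same idea against the sample values; the VALUES list and the aliases are made up purely for illustration:
select v.col,
       case when r.rest like '%-%'
            then left(r.rest, charindex('-', r.rest) - 1)
            else r.rest
       end as digits
from (values ('xx-xx-xxx-84'), ('xx-25-xxx-xx'), ('xx-xx-123-xx')) v(col)
     cross apply (select substring(v.col, patindex('%[0-9]%', v.col), len(v.col)) as rest) r;
-- expected digits: 84, 25, 123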

The easy way (works if the only non-digit characters in the strings are 'x' and '-'):
SELECT REPLACE(REPLACE(s,'x',''),'-','') FROM T
Or, if 'x' can be any non-digit character, then use the PATINDEX() function:
SELECT S,
       SUBSTRING(S,
                 PATINDEX('%[0-9]%', S),              -- start: position of the first digit
                 PATINDEX('%[0-9][^0-9]%', S)         -- position of the last digit when it is followed by a non-digit
                 + PATINDEX('%[0-9]', S)              -- or position of the last digit when the string ends with a digit
                 + 1
                 - PATINDEX('%[0-9]%', S)) AS digit   -- length = last digit position + 1 - first digit position
FROM T
-- For strings with a single group of digits, exactly one of the two middle
-- PATINDEX terms is non-zero, so their sum is the position of the last digit.
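A worked example for 'xx-25-xxx-xx', with the intermediate PATINDEX values written out against the formula above:
-- PATINDEX('%[0-9]%',       'xx-25-xxx-xx') = 4  (first digit)
-- PATINDEX('%[0-9][^0-9]%', 'xx-25-xxx-xx') = 5  (digit followed by a non-digit)
-- PATINDEX('%[0-9]',        'xx-25-xxx-xx') = 0  (string does not end with a digit)
-- length = 5 + 0 + 1 - 4 = 2, so:
SELECT SUBSTRING('xx-25-xxx-xx', 4, 2) AS digit;  -- returns '25'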

Related

Postgresql subtract comma separated string in one column from another column

The format is like:
col1                 col2
V1,V2,V3,V4,V5,V6    V4,V1,V6
V1,V2,V3             V2,V3
I want to create another column called col3 which contains the subtraction of two columns.
What I have tried:
UPDATE myTable
SET col3=(replace(col1,col2,''))
It works well for rows like row 2, where col2 appears in col1 as a contiguous substring in the same order, but because the order of the values matters to REPLACE it fails for rows like row 1.
I was wondering if there's a reliable way to achieve the same goal for rows like row 1.
So the desired output would be:
col1                 col2        col3
V1,V2,V3,V4,V5,V6    V4,V1,V6    V2,V3,V5
V1,V2,V3             V2,V3       V1
Any suggestions would be appreciated!
Split the values into rows, subtract the sets, and then assemble the result back into a string. All of this can be done in a single expression that defines a new query column.
with t (col1, col2) as (values
  ('V1,V2,V3,V4,V5,V6', 'V4,V1,V6'),
  ('V1,V2,V3', 'V2,V3')
)
select col1, col2,
       (select string_agg(v, ',')
        from (
          select v from unnest(string_to_array(t.col1, ',')) as a1(v)
          except
          select v from unnest(string_to_array(t.col2, ',')) as a2(v)
        ) x
       )
from t
DB fiddle
You will have to unnest the elements, then apply an EXCEPT on the "unnested" rows, and aggregate back:
select col1,
col2,
(select string_agg(item,',' order by item)
from (
select *
from string_to_table(col1, ',') as c1(item)
except
select *
from string_to_table(col2, ',') as c2(item)
) t)
from the_table;
I wouldn't store that result in a separate column, but if you really need to introduce even more problems by storing yet another comma separated list, you can run an UPDATE:
update the_table
set col3 = (select string_agg(item,',' order by item)
from (
select *
from string_to_table(col1, ',') as c1(item)
except
select *
from string_to_table(col2, ',') as c2(item)
) t)
;
string_to_table() requires Postgres 14 or newer. If you are using an older version, you need to use unnest(string_to_array(col1, ',')) instead.
If you need that a lot, consider creating a function:
create function remove_items(p_one text, p_other text)
  returns text
as
$$
  select string_agg(item, ',' order by item)
  from (
    select *
    from string_to_table(p_one, ',') as c1(item)
    except
    select *
    from string_to_table(p_other, ',') as c2(item)
  ) t;
$$
language sql
immutable;
Then the above can be simplified to:
select col1, col2, remove_items(col1, col2)
from the_table;
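A quick sanity check with literal arguments (assuming the function above has been created):
select remove_items('V1,V2,V3,V4,V5,V6', 'V4,V1,V6') as col3;
-- expected: V2,V3,V5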
Note: PostgreSQL is not my forte, but I thought I'd have a go at it. Try:
SELECT col1, col2,
       RTRIM(REGEXP_REPLACE(col1,
                            CONCAT('\m(?:', REPLACE(col2, ',', '|'), ')\M,?'),
                            '',
                            'g'),
             ',') AS col3
FROM myTable
See an online fiddle.
The idea is to use a regular expression to replace all the values, based on the following pattern:
\m - Word-boundary at start of word;
(?:V4|V1|V6) - A non-capture group that holds the alternatives from col2;
\M - Word-boundary at end of word;
,? - Optional comma.
When the matches are replaced with nothing, we need to clean up a possible trailing comma with RTRIM(). See an online demo where I had to replace the word-boundaries with the \b word-boundary to showcase the outcome.
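For row 1, the CONCAT() above builds the pattern \m(?:V4|V1|V6)\M,? from col2. A standalone check with literal values, just to illustrate the replacement:
SELECT RTRIM(
         REGEXP_REPLACE('V1,V2,V3,V4,V5,V6',
                        '\m(?:V4|V1|V6)\M,?',
                        '',
                        'g'),
         ',') AS col3;
-- returns: V2,V3,V5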

Select the last non-NULL value when current row is NULL

I know that there are a lot of solutions for this, but unfortunately I cannot use PARTITION or the keyword TOP. Nothing I tried from earlier posts works.
My table looks like this:
The result I want is that when any completion_percentage is NULL it should get the value from the last non-NULL completion_percentage, like this:
I tried this query but nothing works. Can you tell me where I am going wrong?
SELECT sequence,project_for_lookup,
CASE WHEN completion_percentage IS NOT NULL THEN completion_percentage
ELSE
(SELECT max(completion_percentage) FROM [project_completion_percentage] AS t2
WHERE t1.project_for_lookup=t2.project_for_lookup and
t1.sequence<t2.sequence and
t2.completion_percentage IS NOT null
END
FROM [project_completion_percentage] AS t1
SQL Server 2008 doesn't support cumulative window functions. So, I would suggest outer apply:
select cp.project_for_lookup, cp.sequence,
       coalesce(cp.completion_percentage, cp2.completion_percentage) as completion_percentage
from [project_completion_percentage] cp outer apply
     (select top 1 cp2.*
      from [project_completion_percentage] cp2
      where cp2.project_for_lookup = cp.project_for_lookup and
            cp2.sequence < cp.sequence and
            cp2.completion_percentage is not null
      order by cp2.sequence desc
     ) cp2;
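The question's real data is only shown in screenshots, so here is a minimal sketch with made-up sample rows (hypothetical project name and percentages) to sanity-check the approach:
with cp as (
    select *
    from (values ('ProjA', 1, 10.00),
                 ('ProjA', 2, NULL),
                 ('ProjA', 3, 40.00),
                 ('ProjA', 4, NULL)
         ) v(project_for_lookup, sequence, completion_percentage)
)
select cp.project_for_lookup, cp.sequence,
       coalesce(cp.completion_percentage, cp2.completion_percentage) as completion_percentage
from cp outer apply
     (select top 1 cp2.completion_percentage
      from cp cp2
      where cp2.project_for_lookup = cp.project_for_lookup and
            cp2.sequence < cp.sequence and
            cp2.completion_percentage is not null
      order by cp2.sequence desc
     ) cp2;
-- expected completion_percentage: 10.00, 10.00, 40.00, 40.00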
Does this work? It seems to for me. You were missing a parenthesis and had the sequence backwards.
http://sqlfiddle.com/#!3/465f2/4
SELECT sequence,project_for_lookup,
CASE WHEN completion_percentage IS NOT NULL THEN completion_percentage
ELSE
(
SELECT max(completion_percentage)
FROM [project_completion_percentage] AS t2
WHERE t1.project_for_lookup=t2.project_for_lookup
-- sequence was reversed. You're on the row t1, and want t2 that is from a prior sequence.
and t2.sequence<t1.sequence
and t2.completion_percentage IS NOT null
--missing a closing paren
)
END
FROM [project_completion_percentage] AS t1

SQL: Select strings which have equal words

Suppose I have a table of strings, like this:
VAL
-----------------
Content of values
Values identity
Triple combo
my combo
sub-zero combo
I want to find strings which have equal words. The result set should be like
VAL                  MATCHING_VAL
-----------------    -----------------
Content of values    Values identity
Triple combo         My combo
Triple combo         sub-zero combo
or at least something like this.
Can you help?
One method is to use a hack with regular expressions:
select t1.val, t2.val
from t t1 join
     t t2
     on regexp_like(t1.val, replace(t2.val, ' ', '|'));
You might also want the comparison to ignore differences in case:
on regexp_like(lower(t1.val), replace(lower(t2.val), ' ', '|'));
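To see what the join condition actually evaluates, here is a standalone sketch using two of the sample values; the 'i' flag makes it case-insensitive, matching the lower() variant above:
select case
         when regexp_like('Content of values',
                          replace('Values identity', ' ', '|'), 'i')
         then 'match'
         else 'no match'
       end as result
from dual;
-- replace() turns 'Values identity' into the pattern 'Values|identity',
-- so the row matches because 'values' occurs in 'Content of values'.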
You could use a combination of SUBSTRING and LIKE.
Use CHARINDEX(' ') to split the words up in the substring, if that's what you want to do.
Using some of the Oracle internal similarity matching found in UTL_MATCH (https://docs.oracle.com/database/121/ARPLS/u_match.htm#ARPLS71219)...
This logic is more for matching names or descriptions that are 'similar', where phonetic spellings or typos may cause the records not to match.
By adjusting the .5 below you can see how the similarity scores get you closer and closer to perfect matches.
with cte as (
  select 'Content of values' val from dual union all
  select 'Values identity' val from dual union all
  select 'triple combo' from dual union all
  select 'my combo' from dual union all
  select 'sub-zero combo' from dual
)
select a.*, b.*,
       utl_match.edit_distance_similarity(a.val, b.val) c,
       utl_match.jaro_winkler(a.val, b.val) jw
from cte a
cross join cte b
where utl_match.jaro_winkler(a.val, b.val) > .5
order by utl_match.edit_distance_similarity(a.val, b.val) desc
Or we could use an inner join and > if we only want one-way comparisons...
select a.*, b.*,
       utl_match.edit_distance_similarity(a.val, b.val) c,
       utl_match.jaro_winkler(a.val, b.val) jw
from cte a
inner join cte b
  on a.val > b.val
where utl_match.jaro_winkler(a.val, b.val) > .5
order by utl_match.edit_distance_similarity(a.val, b.val) desc
This returns the 3 desired records.
But it does not explicitly check whether any word matches, which was your base requirement. I just wanted you to be aware of alternatives.

SQL _ wildcard not working as expected. Why?

So I have this query:
select id, col1, len(col1)
from tableA
From there I wanted to grab all data in col1 that has exactly 5 characters and starts with '15':
select id, col1, len(col1)
from tableA
where col1 like '15___' -- underscore 3 times
Now col1 is an nvarchar(192) and there is data that starts with '15' and is of length 5. But the second query always shows me no rows.
Why is that?
Could it be that the field is padded with trailing spaces, such as '15123    '?
You could also try another solution:
select id, col1, len(col1)
from tableA
where col1 like '15%' AND Len(col1)=5
EDIT - FOR FUTURE REFERENCE:
For the sake of completeness: char and nchar always occupy their full declared size, so '15' stored in a char(10) is effectively '15' followed by 8 padding spaces, because the type pads to its fixed length, whereas varchar only stores what it is given, so '15' is simply '15'.
To get around this you could:
A) Do an LTRIM/RTRIM to cut off all extra spaces
select id, col1, len(col1)
from tableA
where rtrim(ltrim(col1)) like '15___'
B) Do a LEFT() to only grab the left 5 characters
select id, col1, len(col1)
from tableA
where left(col1,5) like '15___'
C) Cast as a varchar, a rather sloppy approach
select id, col1, len(col1)
from tableA
where CAST(col1 AS Varchar(192)) like '15___'
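A minimal sketch (the variable and its value are made up) showing why trailing spaces defeat the '15___' pattern and why trimming fixes it:
DECLARE @padded nvarchar(192) = N'15123   ';  -- 5 meaningful characters plus trailing spaces
SELECT CASE WHEN @padded LIKE '15___' THEN 'match' ELSE 'no match' END AS raw_like,          -- 'no match'
       CASE WHEN RTRIM(@padded) LIKE '15___' THEN 'match' ELSE 'no match' END AS trimmed_like; -- 'match'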
Does this query return anything?
select id, col1, len(col1)
from tableA
where len(col1) = 5 and
left(col1, 2) = '15';
If not, then there are no values that match that pattern. And, my best guess would be spaces, in which case, this might work:
select id, col1, len(col1)
from tableA
where ltrim(rtrim(col1)) like '15___';

Detect (find) string in another string ( nvarchar (MAX) )

I've got an nvarchar(max) column with values like 'A2'.
And another column, in another table, with values like '(A2 AND A3) OR A4'.
I need to detect whether the string from the second column contains the string from the first column.
So then I need to select all rows (with all their columns) of the second table whose string contains a string from the first column of the first table.
Something like this ... but that is wrong:
SELECT * Cols FROM T2
WHERE (SELECT T1.StringCol FROM T1) IN T2.StringCol
But I understand it more like this (in F#-like pseudocode):
for t1.date, t1.StringCol from t1
for t2.StringCol from t2
if t2.StringCol.Contains( t1.StringCol )
yield t2.StringCol, t1.date
This should get what you want...
select t2.*
from t1 cross join t2
where patindex('%' + t1.StringCol + '%', t2.StringCol) > 0
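A quick check with literal values (the first string comes from the question; 'B9' is a made-up value that does not occur, just to show the non-match case):
select patindex('%' + 'A2' + '%', '(A2 AND A3) OR A4') as pos_found,    -- 2 (pattern found, 1-based position)
       patindex('%' + 'B9' + '%', '(A2 AND A3) OR A4') as pos_missing;  -- 0 (no match)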