SQL IN clause doesn't work

I have a table with a column ancestry holding a list of ancestors formatted like this: "1/12/45". 1 is the root, 12 is a child of 1, and so on.
I need to find all the records having a specific node/number in their ancestry list. To do so, I wrote this SQL statement:
select * from nodes where 1 in (nodes.ancestry)
I get the following error: operator does not exist: integer = text
I tried this as well:
select * from nodes where '1' in (nodes.ancestry)
but it only returns the records whose ancestry field is exactly 1, not those having, for instance, 1/12/45.
What's wrong?
Thanks!

This sounds like a job for LIKE, not IN.
If we assume you want to search for this value in any position, we might try:
select * from nodes where '/' + nodes.ancestry + '/' like '%/1/%'
Note that the exact syntax for string concatenation varies between SQL products (PostgreSQL and Oracle, for example, use || rather than +). I'm prepending and appending a / to the ancestry column so that we don't have to treat the first and last items in the list differently from the middle items. We also surround the 1 with /s so that we don't get false matches against, e.g., /51/ or /12/.
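The delimiter-wrapping idea above can be tried out with a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in database, so concatenation is written with ||; the sample rows and the helper name nodes_with_ancestor are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, ancestry TEXT)")
conn.executemany(
    "INSERT INTO nodes (id, ancestry) VALUES (?, ?)",
    [(1, "1"), (2, "1/12/45"), (3, "2/12/45"), (4, "51/7"), (5, "3/1")],
)


def nodes_with_ancestor(conn, node_id):
    """Return ids of rows whose ancestry contains node_id as a whole path segment."""
    # Wrap the searched value in '/' so that 1 does not match 12 or 51.
    pattern = f"%/{int(node_id)}/%"
    rows = conn.execute(
        # Wrap the column in '/' too, so first and last segments behave like middle ones.
        "SELECT id FROM nodes WHERE '/' || ancestry || '/' LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]


matches = nodes_with_ancestor(conn, 1)
print(matches)
```

Rows 3 and 4 are left out even though their ancestry strings contain the character 1, which is the point of surrounding both sides with the separator.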

In MySQL you could write:
SELECT * FROM nodes
WHERE ancestry = '1'
OR LEFT(ancestry, 2) = '1/'
OR RIGHT(ancestry, 2) = '/1'
OR INSTR(ancestry, '/1/') > 0

The in operator expects a comma-separated list of values, or a query result, e.g.:
... in (1,2,3,4,5)
or:
... in (select id from SomeOtherTable)
What you need to do is to create a string from the number, so that you can look for it in the other string.
Just looking for the string '1' in the ancestry list would give false positives, as it would find it in the string '2/12/45'. You need to add the separator to the beginning and the end of both strings, so that you look for a string like '/1/' in a string like '/1/12/45/':
select * from nodes
where charindex('/' + convert(varchar(50), 1) + '/', '/' + nodes.ancestry + '/') <> 0

Related

How to ignore specific string value when using pattern and patindex function in SQL Server Query?

I have this query here.
WITH Cte_Reverse
AS (
SELECT CASE PATINDEX('%[^0-9.- ]%', REVERSE(EmailName))
WHEN 0
THEN REVERSE(EmailName)
ELSE left(REVERSE(EmailName), PATINDEX('%[^0-9.- ]%', REVERSE(EmailName)) - 1)
END AS Platform_Campaign_ID,
EmailName
FROM [Arrakis].[xtemp].[Stage_SendJobs_Marketing]
)
SELECT REVERSE(Platform_Campaign_ID) AS Platform_Campaign_ID, EmailName
FROM Cte_Reverse
WHERE REVERSE(Platform_Campaign_ID) <> '2020'
AND REVERSE(Platform_Campaign_ID) <> ''
AND LEN(REVERSE(Platform_Campaign_ID)) = 4;
It is working for the most part, below is a screenshot of the result set.
The query above extracts the four rightmost digits from the column value. But I can't figure out how to make it ignore cases where the value ends in -v2, -v1, etc., that is, a -v suffix followed by any version number.
If you want four digits, then one method is:
select substring(emailname, patindex('%[0-9][0-9][0-9][0-9]%', emailname), 4)
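To make the intended matching rule concrete, here is a rough sketch written in Python rather than T-SQL. It assumes the ID is exactly four digits at the end of the name, optionally followed by a -v<number> suffix; the function name campaign_id and the sample names are hypothetical, since the original data is not shown.

```python
import re

# Four digits not preceded by another digit, optionally followed by "-v<number>",
# anchored at the end of the string. This pattern is an assumption about the data.
CAMPAIGN_ID = re.compile(r"(?<!\d)(\d{4})(?:-v\d+)?$")


def campaign_id(email_name):
    """Return the trailing four-digit ID, ignoring a -vN suffix, or None if absent."""
    match = CAMPAIGN_ID.search(email_name.strip())
    return match.group(1) if match else None


# Hypothetical sample values for illustration only.
samples = ["Spring Promo 4821", "Spring Promo 4821-v2", "Newsletter", "Promo 12345"]
extracted = [campaign_id(name) for name in samples]
print(extracted)
```

Names without a trailing four-digit run, or with a longer run of digits, yield None under this assumption.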

Is there any way to add a unique identifier to every replacement that REGEXP_REPLACE performs?

I have a large text-CLOB that needs some converting done.
A lot of the lines in my CLOB are preceded by a variable name in brackets like so:
[VARIABLE_NAME_ONE] variable_one = 1 + variable_two;
[VARIABLE_NAME_TWO] variable_two = 2 + variable_three;
[VARIABLE_NAME_ONE] variable_one = variable_four - 4;
The problem is that some of the variable names in brackets are not unique, but they need to be unique after I'm done converting.
What I would like is to extend all the variable names in brackets with something like a counter, in order to ensure uniqueness. Because of the brackets, my initial thought was a simple regexp_replace, but is there any way to incorporate a counter in that?
To complete my explanation, I would like the previous example lines converted into this:
[VARIABLE_NAME_ONE_1] variable_one = 1 + variable_two;
[VARIABLE_NAME_TWO_2] variable_two = 2 + variable_three;
[VARIABLE_NAME_ONE_3] variable_one = variable_four - 4;
You can use a hierarchical query that splits the text on semicolons with REGEXP_SUBSTR and appends the level number just before each closing square bracket, and then recombine the pieces with the LISTAGG() function:
UPDATE tab
SET col = (
WITH t AS
(
SELECT REPLACE(REGEXP_SUBSTR(col,'[^;]+',1,level),']','_'||level||']')
AS col, level AS lvl
FROM TAB t
CONNECT BY level <= REGEXP_COUNT(col,';')
)
SELECT LISTAGG(col,';') WITHIN GROUP (ORDER BY lvl)||';'
FROM t)
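For comparison, the same "counter per replacement" idea can be sketched outside the database. This is only an illustration in Python, assuming the text has already been read out of the CLOB and that every bracketed name sits at the start of a line; number_bracket_names is a hypothetical helper name.

```python
import itertools
import re


def number_bracket_names(text):
    """Append a running counter to every bracketed name at the start of a line."""
    counter = itertools.count(1)
    # Assumption: names consist of word characters and open the line.
    pattern = re.compile(r"^\[(\w+)\]", flags=re.MULTILINE)
    # The replacement is a function, so each match can pull the next counter value.
    return pattern.sub(lambda m: f"[{m.group(1)}_{next(counter)}]", text)


source = (
    "[VARIABLE_NAME_ONE] variable_one = 1 + variable_two;\n"
    "[VARIABLE_NAME_TWO] variable_two = 2 + variable_three;\n"
    "[VARIABLE_NAME_ONE] variable_one = variable_four - 4;"
)
converted = number_bracket_names(source)
print(converted)
```

The key point is that the replacement is computed per match, which is what a plain REGEXP_REPLACE call cannot do on its own.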

SQL special group by on list of strings ending with *

I would like to perform a "special group by" on strings in SQL, some of which end with "*". I use PostgreSQL.
I can't formulate this problem clearly, even though I have partially solved it with select, union and nested queries, which are not elegant.
For example:
1) INPUT : I have a list of strings :
thestrings
varchar(9)
--------------
1000
1000-0001
1000-0002
2000*
2000-0001
2000-0002
3000*
3000-00*
3000-0001
3000-0002
2) OUTPUT: what I would like my "special group by" to return:
1000
1000-0001
1000-0002
2000*
3000*
Because 2000-0001 and 2000-0002 are included in 2000*,
and because 3000-00*, 3000-0001 and 3000-0002 are included in 3000*.
3) The SQL query I use:
SELECT every string ending with *
UNION
SELECT every string whose beginning is NOT IN (SELECT every string ending with *) <-- with multiple inelegant left functions and NOT IN subqueries
4) What my query returns:
1000
1000-0001
1000-0002
2000*
3000*
3000-00* <-- the problem
The problem is that 3000-00* stays in my result.
So my question is:
How can I generalize this to remove every string that begins with another string in the list (one ending with *)?
I thought of regular expressions, but how do I pass a list from a select into a regex?
Thanks for your help.
Select only strings for which no master string exists in the table:
select str
from mytable
where not exists
(
select *
from mytable master
where master.str like '%*'
and master.str <> mytable.str
and rtrim(mytable.str, '*') like rtrim(master.str, '*') || '%'
);
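The NOT EXISTS query above can be checked against the sample data from the question with a small sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in for PostgreSQL; rtrim with a character argument, || and LIKE behave the same way for this data. An ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (str TEXT)")
values = [
    "1000", "1000-0001", "1000-0002",
    "2000*", "2000-0001", "2000-0002",
    "3000*", "3000-00*", "3000-0001", "3000-0002",
]
conn.executemany("INSERT INTO mytable (str) VALUES (?)", [(v,) for v in values])

query = """
SELECT str
FROM mytable
WHERE NOT EXISTS (
    SELECT 1
    FROM mytable AS master
    WHERE master.str LIKE '%*'          -- only patterns can be masters
      AND master.str <> mytable.str     -- a row is not its own master
      AND rtrim(mytable.str, '*') LIKE rtrim(master.str, '*') || '%'
)
ORDER BY str
"""
kept = [row[0] for row in conn.execute(query)]
print(kept)
```

Note that 3000-00* is dropped because its stripped form 3000-00 starts with 3000, the stripped form of the master 3000*.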
Assuming that only one general pattern can match any given string, the following should do what you want:
select coalesce(tpat.thestring, t.thestring) as thestring
from t left join
t tpat
on t.thestring like replace(tpat.thestring, '*', '%') and
t.thestring <> tpat.thestring
group by coalesce(tpat.thestring, t.thestring);
However, that is not your case, because 3000-0001 matches both 3000* and 3000-00*. You can adjust for this with distinct on:
select distinct on (t.thestring) coalesce(tpat.thestring, t.thestring)
from t left join
t tpat
on t.thestring like replace(tpat.thestring, '*', '%') and
t.thestring <> tpat.thestring
order by t.thestring, length(tpat.thestring)

How to use regexp_substr() with group of delimiter characters?

I have a string like 'SERO02~~~NA_#ERO5'. I need to split it on the delimiter ~~~ so that I get SERO02 and NA_#ERO5 as the result.
I created a regular expression like this:
select regexp_substr('SERO02~~~NA_#ERO5' ,'[^~~~]+',1,2) from dual;
It worked fine and returned NA_#ERO5.
But if I change the string to ERO02~NA_#ERO5, the result is still the same.
I expect the expression to return nothing, since the delimiter ~~~ is not found in that string. Can someone help me write the correct expression?
[^~~~] matches a single character that is not one of the characters following the caret in the square brackets. Since all those characters are identical, [^~~~] is the same as [^~].
You can match it using:
SELECT REGEXP_SUBSTR(
'SERO02~~~NA_#ERO5',
'~~~(.*?)(~~~|$)',
1,
1,
NULL,
1
)
FROM DUAL;
This will match ~~~, then store zero or more characters in a capture group (the round brackets () indicate a capture group) until it finds either ~~~ or the end of the string. It will then return the first capture group.
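The same pattern can be exercised outside Oracle with a short sketch. It uses Python's re module, whose syntax matches Oracle's for this particular expression; second_field is a hypothetical helper name, and note that Python returns an empty string where Oracle would return NULL.

```python
import re


def second_field(text, delimiter="~~~"):
    """Return the text after the first literal delimiter, up to the next one or the end.

    Returns None when the delimiter does not occur at all.
    """
    escaped = re.escape(delimiter)
    # Literal delimiter, lazy capture, then either another delimiter or end of string.
    pattern = escaped + r"(.*?)(?:" + escaped + r"|$)"
    match = re.search(pattern, text)
    return match.group(1) if match else None


with_delimiter = second_field("SERO02~~~NA_#ERO5")
single_tilde = second_field("ERO02~NA_#ERO5")
print(with_delimiter, single_tilde)
```

Because the delimiter is matched as a literal three-character sequence rather than as a character class, a single ~ no longer counts as a separator.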
You can do it without regular expressions, with a bit of logic:
with test(text) as ( select 'SERO02~~~NA_#ERO5' from dual)
select case
when instr(text, '~~~') != 0 then
substr(text, instr(text, '~~~') + 3)
else
null
end
from test
This will give the part of the string after '~~~', if it exists, null otherwise.
You can edit the ELSE part to get what you need when the input string does not contain '~~~'.
Even with regexp, to match the string '~~~' you need to write it exactly, without []; the [] is used to list a set of characters, so [aaaaa] is exactly the same as [a], while [abc] means 'a' OR 'b' OR 'c'.
With regexp, even if it's not necessary, one way could be the following:
substr(regexp_substr(text, '~~~.*'), 4)
In case you want all elements. Handles NULL elements too:
SQL> with tbl(str) as (
select 'SERO02~~~NA_#ERO5' from dual
)
select regexp_substr(str, '(.*?)(~~~|$)', 1, level, null, 1) element
from tbl
connect by level <= regexp_count(str, '~~~') + 1;
ELEMENT
-----------------
SERO02
NA_#ERO5
SQL>

sql prepend entries

I have some entries that are inconsistent; they should all be prefixed with the same string.
Some have just a number, and others have the dollar sign and the number.
I have a condition that finds all the entries which do not have the dollar sign:
WHERE [mydata] not like '$%'
How do I add the string before each entry?
update table set mydata = '$' + mydata where [mydata] not like '$%'
The + operator for string concatenation works in SQL Server; in other products you may need || or a CONCAT function instead.
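As a quick check of the update, here is a minimal sketch that assumes SQLite (through Python's sqlite3 module) as a stand-in, so the concatenation is written with || instead of +. The table name prices and the sample values are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table name and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (mydata TEXT)")
conn.executemany(
    "INSERT INTO prices (mydata) VALUES (?)",
    [("10",), ("$20",), ("35.50",)],
)

# Prepend '$' only to the rows that do not already start with it.
conn.execute("UPDATE prices SET mydata = '$' || mydata WHERE mydata NOT LIKE '$%'")

result = [row[0] for row in conn.execute("SELECT mydata FROM prices ORDER BY rowid")]
print(result)
```

The WHERE clause keeps the update idempotent: running it a second time changes nothing, because every row then starts with the dollar sign.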
If you are looking for just a select statement to return the value with a $, this will work:
Select '$' + field
from [table]
where field not like '$%'
You could use a case expression instead if you wanted every record returned with the $, both those that already have it and those that don't (note that there is no where clause, so no rows are filtered out):
Select case when left(field,1) = '$' then '' else '$' end + field
from [table]
edit: You might need to convert 'field' to a varchar for the + to work; you'll get an error if the field is an int (but since the $ appears sporadically in the field, I assumed it's a varchar).