SQL special group by on list of strings ending with * - sql

I would like to perform a "special group by" on strings with SQL language, some ending with "*". I use postgresql.
I can not clearly formulate this problem, even if I have partially solved it, with select, union and nested queries which are not elegant.
For exemple :
1) INPUT : I have a list of strings :
thestrings
varchar(9)
--------------
1000
1000-0001
1000-0002
2000*
2000-0001
2000-0002
3000*
3000-00*
3000-0001
3000-0002
2) OUTPUT : That I would like my "special group by" return :
1000
1000-0001
1000-0002
2000*
3000*
Because 2000-0001 and 2000-0002 are include in 2000*,
and because 3000-00*, 3000-0001 and 3000-0002 are includes in 3000*
3) SQL query I do :
SELECT every strings ending with *
UNION
SELECT every string where the begining NOT IN (SELECT every string ending with *) <-- with multiple inelegant left functions and NOT IN subqueries
4) That what I'm doing return :
1000
1000-0001
1000-0002
2000*
3000*
3000-00* <-- the problem
The problem is : 3000-00* staying in my result.
So my question is :
How can I generalize my problem? to remove all string who have a same begining string in the list (ending with *) ?
I think of regular expressions, but how to pass a list from a select in a regex ?
Thanks for help.

Select only strings for which no master string exists in the table:
select str
from mytable
where not exists
(
select *
from mytable master
where master.str like '%*'
and master.str <> mytable.str
and rtrim(mytable.str, '*') like rtrim(master.str, '*') || '%'
);

Assuming that only one general pattern can match any given string, the following should do what you want:
select coalesce(tpat.thestring, t.thestring) as thestring
from t left join
t tpat
on t.thestring like replace(tpat.thestring, '*', '%') and
t.thestring <> tpat.thestring
group by coalesce(tpat.thestring, t.thestring);
However, that is not your case. However, you can adjust this with distinct on:
select distinct on (t.thestring) coalesce(tpat.thestring, t.thestring)
from t left join
t tpat
on t.thestring like replace(tpat.thestring, '*', '%') and
t.thestring <> tpat.thestring
order by t.thestring, length(tpat.thestring)

Related

SQL find '%' between %s

I need to find (exclude in fact) any results that contain '%' sign, wherever in a string field. That would mean ... WHERE string LIKE '%%%'. Googling about escaping gave me the following ideas. The first throws syntax error, the second returns rows but there are records actually contain '%'.
1st:
SELECT * FROM table
WHERE string NOT LIKE '%!%%' ESCAPE '!'
///tried with different escape characters
2nd:
SELECT * FROM table
WHERE string NOT LIKE '%[%]%'
Trying on GCP BigQuery.
Try:
SELECT *
FROM table
WHERE string NOT LIKE '%!%%' {ESCAPE '!'}
With curly braces as shown in microsoft sql server docs
Or also:
WITH indata(s) AS (
SELECT 'not excluded'
UNION ALL SELECT '%excluded'
UNION ALL SELECT 'Ex%cluded'
UNION ALL SELECT 'Excluded%'
)
SELECT * FROM indata WHERE INSTR(s,'%') = 0;
-- out s
-- out --------------
-- out not excluded
find (exclude in fact) any results that contain '%'
Consider below simple approach
select *
from your_table
where not regexp_contains(string , '%')

Select statement with column contains '%'

I want to select names from a table where the 'name' column contains '%' anywhere in the value. For example, I want to retrieve the name 'Approval for 20 % discount for parts'.
SELECT NAME FROM TABLE WHERE NAME ... ?
You can use like with escape. The default is a backslash in some databases (but not in Oracle), so:
select name
from table
where name like '%\%%' ESCAPE '\'
This is standard, and works in most databases. The Oracle documentation is here.
Of course, you could also use instr():
where instr(name, '%') > 0
One way to do it is using replace with an empty string and checking to see if the difference in length of the original string and modified string is > 0.
select name
from table
where length(name) - length(replace(name,'%','')) > 0
Make life easy on yourselves and just use REGEXP_LIKE( )!
SQL> with tbl(name) as (
select 'ABC' from dual
union
select 'E%FS' from dual
)
select name
from tbl
where regexp_like(name, '%');
NAME
----
E%FS
SQL>
I read the documentation mentioned by Gordon. The relevent sentence is:
An underscore (_) in the pattern matches exactly one character (as opposed to one byte in a multibyte character set) in the value
Here was my test:
select c
from (
select 'a%be' c
from dual) d
where c like '_%'
The value a%be was returned.
While the suggestions of using instr() or length in the other two answers will lead to the correct answer, they will do so slowly. Filtering on function results simply take longer than filtering on fields.

SQLite: order so that results with same starting letter would come first?

SQLite 3.7.11
Given I have these two records in a SQLITE table, with the field called name:
"preview"
"view"
If I run the query:
SELECT * FROM table WHERE name LIKE "%" + $keyword + "%" ORDER BY name
with $keyword set to "vi"
I would get the results in this order:
1) "preview"
2) "view"
Is there a way to order so that the names whose first letter is the same as the first letter of the keyword (in this example "v") would come first?
You can sort by the position of your keyword in the search string
SELECT * FROM your_table
WHERE name LIKE concat('%', '$keyword', '%')
ORDER BY POSITION('$keyword' IN name)
name
Not sure if it can be done less verbose, but this should work. I used ? as a placeholder for your variable. The query as you posted it doesn't seem to be correct SQL.
SELECT * FROM table1
WHERE name LIKE '%' || ? || '%'
ORDER BY (CASE WHEN SUBSTR(Name,1,1) = SUBSTR(?,1,1) THEN 0 ELSE 1 END),
name

Search (in Oracle), using wild card without worrying about the case or order in which the words appear

How to search using wild card (in Oracle), without worrying about case or order in which the words appear.
e.g. if I search for like '%a%b%', it should return values containing *a*b*, *A*B* ,*b*a* and *B*A* This is just a sample, the search may have 5 or more words, is it possible get the result in just one expression rather than using AND.
select *
from yourtable
where yourfield like '%a%'
and yourfield like '%b%'
Or you can investigate Oracle Text
select * from table
where upper(column) like '%A%B%';
or
select * from table
where lower(column) like '%a%b%';
or
select * from table
where upper(column) like '%A%'
and upper(column) like '%B%';
or
select * from table
where lower(column) like '%a%'
and lower(column) like '%b%';
However, if you would like to find out *a*b* or *b*a* and so on. The other solution would be to sort the CHARS in the VARCHAR first i.e. if you have a string like aebcd then sort it to abcde and then do pattern match using like. As per this you can use below query to sort the chars in a varchar and then do the pattern match on it.
SELECT 1 val
FROM (SELECT MIN(permutations) col
FROM (SELECT REPLACE (SYS_CONNECT_BY_PATH (n, ','), ',') permutations
FROM (SELECT LEVEL l, SUBSTR ('cba', LEVEL, 1) n
FROM DUAL --replace dual by your table
CONNECT BY LEVEL <= LENGTH ('cba')) yourtable
CONNECT BY NOCYCLE l != PRIOR l)
WHERE LENGTH (permutations) = LENGTH ('cba')) temp_tab
WHERE upper(col) like '%A%B%';
Returns
val
------------------
1

sql in clause doesn't work

I have a table with a column ancestry holding a list of ancestors formatted like this "1/12/45". 1 is the root, 12 is children of 1, etc...
I need to find all the records having a specific node/number in their ancestry list. To do so, I wrote this sql statement:
select * from nodes where 1 in (nodes.ancestry)
I get following error statement: operator does not exist: integer = text
I tried this as well:
select * from nodes where '1' in (nodes.ancestry)
but it only returns the records having 1 in their ancestry field. Not the one having for instance 1/12/45
What's wrong?
Thanks!
This sounds like a job for LIKE, not IN.
If we assume you want to search for this value in any position, and then we might try:
select * from nodes where '/' + nodes.ancestry + '/' like '%/1/%'
Note that exact syntax for string concatenation varies between SQL products. Note that I'm prepending and appending to the ancestry column so that we don't have to treat the first/last items in the list differently than middle items. Note also that we surround the 1 with /s, so that we don't get false matches for e.g. with /51/ or /12/.
In MySQL you could write:
SELECT * FROM nodes
WHERE ancestry = '1'
OR LEFT(ancestry, 2) = '1/'
OR RIGHT(ancestry, 2) = '/1'
OR INSTR(ancestry, '/1/') > 0
The in operator expects a comma separated list of values, or a query result, i.e.:
... in (1,2,3,4,5)
or:
... in (select id from SomeOtherTable)
What you need to do is to create a string from the number, so that you can look for it in the other string.
Just looking for the string '1' in the ancestry list would give false positives, as it would find it in the string '2/12/45'. You need to add the separator to the beginning and the end of both strings, so that you look for a string like '/1/' in a string like '/1/12/45/':
select * from nodes
where charindex('/' + convert(varchar(50), 1) + '/', '/' + nodes.ancestry + '/') <> 0