How do I search for "%" (percent sign) in text using the ILIKE operator in PostgreSQL? [duplicate]

This question already has answers here:
How can you find a literal percent sign (%) in PostgreSQL using a LIKE query?
| id | text |
|----|-------|
| 1 | AB |
| 2 | CD%EF |
| 3 | GH |
I have a text column in a table having a value with a "%" sign.
I want to extract that value using the following query:
SELECT text FROM table
WHERE text ILIKE '%%%'
The expected output is:
CD%EF
1 {row}
The actual output is:
AB
CD%EF
GH
3 {rows}

You can use \% to represent a literal percent sign:
SELECT text FROM table WHERE text ILIKE '%\%%'
Like the documentation says:
To match a literal underscore or percent sign without matching other characters, the respective character in pattern must be preceded by the escape character. The default escape character is the backslash but a different one can be selected by using the ESCAPE clause. To match the escape character itself, write two escape characters.
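As a rough way to try the escaping rule outside PostgreSQL, the sketch below uses Python's built-in sqlite3 module as a stand-in. This is an assumption made for illustration only: SQLite's LIKE is case-insensitive for ASCII (similar in spirit to ILIKE), but it has no default escape character, so the ESCAPE clause is spelled out explicitly. The table name t is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, text TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "AB"), (2, "CD%EF"), (3, "GH")],
)

# Unescaped: '%%%' is just three wildcards, so every row matches.
unescaped = [
    row[0]
    for row in conn.execute("SELECT text FROM t WHERE text LIKE '%%%' ORDER BY id")
]

# Escaped: the middle '%' is preceded by the escape character, so it is literal.
escaped = [
    row[0]
    for row in conn.execute(
        r"SELECT text FROM t WHERE text LIKE '%\%%' ESCAPE '\' ORDER BY id"
    )
]

print(unescaped)
print(escaped)
```

In PostgreSQL itself the ESCAPE clause can be omitted, since the backslash is the default escape character as the quoted documentation states.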

Related

How to extract string between quotes, with a delimiter in Snowflake

I've got a bunch of fields which are double quoted with delimiters but for the life of me, I'm unable to get any regex to pull out what I need.
In short - the delimiters can be in any order and I just need the value that's between the double quotes after each delimiter. Some sample data is below, can anyone help with what regex might extract each value? I've tried
'delimiter_1=\\W+\\w+'
but I only seem to get the first word after the delimiter (unfortunately - they do have spaces in the value)
some content delimiter_1="some value" delimiter_2="some other value" delimiter_4="another value" delimiter_3="the last value"
The problem is returning a varying number of values from the regex function. For example, if you know that there will be 4 delimiters, then you can use REGEXP_SUBSTR for each match, but if the text has a varying number of delimiters, this approach doesn't work.
I think the best solution is to write a function to parse the text:
create or replace function superparser( SRC varchar )
returns array
language javascript
as
$$
const regexp = /([^ =]*)="([^"]*)"/gm;
const array = [...SRC.matchAll(regexp)]
return array;
$$;
Then you can use LATERAL FLATTEN to process the returning values from the function:
select f.VALUE[1]::STRING key, f.VALUE[2]::STRING value
from values ('some content delimiter_1="some value" delimiter_2="some other value" delimiter_4="another value" delimiter_3="the last value"') tmp(x),
lateral flatten( superparser(x) ) f;
+-------------+------------------+
| KEY | VALUE |
+-------------+------------------+
| delimiter_1 | some value |
| delimiter_2 | some other value |
| delimiter_4 | another value |
| delimiter_3 | the last value |
+-------------+------------------+
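To check the regular expression on its own, outside Snowflake, here is a minimal Python sketch that applies the same pattern to the sample string. It only illustrates what the pattern captures; it is not a replacement for the JavaScript UDF, and the variable names are made up for the example.

```python
import re

sample = (
    'some content delimiter_1="some value" delimiter_2="some other value" '
    'delimiter_4="another value" delimiter_3="the last value"'
)

# Same pattern as in the UDF: group 1 is the key (no spaces or '='),
# group 2 is everything between the double quotes.
pattern = re.compile(r'([^ =]*)="([^"]*)"')

# findall returns one (key, value) tuple per match, in order of appearance.
pairs = pattern.findall(sample)

for key, value in pairs:
    print(key, "->", value)
```

Because the pattern is applied repeatedly across the string, the number and order of delimiters does not matter.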

Containstable is not working for special characters

I have a table, as an example Fruits, with column 'Name' which is Full Text Indexed
| ID | Name |
| 0 | *Apple |
| 1 | *Banana|
If we use full text search
select * from CONTAINSTABLE (Fruits, Name , '*')
The '*' is ignored, as if it were an escape character. I have also tried searching with '/' and '%', but to no avail. Is there any way to search for special characters?
I don't want to use the LIKE operator, as it takes more time than full-text search.

How to add delimiter to String after every n character using hive functions?

I have the hive table column value as below.
"112312452343"
I want to add a delimiter such as ":" (i.e., a colon) after every 2 characters.
I would like the output to be:
11:23:12:45:23:43
Is there any hive string manipulation function support available to achieve the above output?
For fixed length this will work fine:
select regexp_replace(str, "(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})","$1:$2:$3:$4:$5:$6")
from
(select "112312452343" as str)s
Result:
11:23:12:45:23:43
Another solution, which works for strings of any length: split the string at every empty position that is preceded ((?<=...)) by two digits (\\d{2}) starting where the previous match ended (\\G), join the resulting array with ':', and remove the trailing delimiter (:$):
select regexp_replace(concat_ws(':',split(str,'(?<=\\G\\d{2})')),':$','')
from
(select "112312452343" as str)s
Result:
11:23:12:45:23:43
If the string can contain characters other than digits, use dots (..) instead of \\d{2}:
regexp_replace(concat_ws(':',split(str,'(?<=\\G..)')),':$','')
This is actually quite simple if you're familiar with regex & lookahead.
Replace every 2 characters that are followed by another character, with themselves + ':'
select regexp_replace('112312452343','..(?=.)','$0:')
+-------------------+
| _c0 |
+-------------------+
| 11:23:12:45:23:43 |
+-------------------+
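The lookahead version can be tried quickly in Python. This is only a sketch of the same idea: Python's re module writes the whole-match reference as \g<0> where the Hive (Java) regex uses $0, and the helper name add_delimiter is hypothetical.

```python
import re

def add_delimiter(s: str, delimiter: str = ":") -> str:
    # Match every pair of characters that is followed by at least one more
    # character, and re-emit the pair followed by the delimiter.
    # A lambda is used so the delimiter is inserted literally.
    return re.sub(r"..(?=.)", lambda m: m.group(0) + delimiter, s)

print(add_delimiter("112312452343"))
print(add_delimiter("12345"))
```

The lookahead is what prevents a trailing delimiter: the last pair has no following character, so it is left untouched. With an odd-length input the final single character simply ends up in its own group.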

Only return fields that contain numbers or special characters EXCEPT . Error

In Redshift I want to return fields that contain numbers or special characters EXCEPT . (anything other than a-z and A-Z)
The following gets me anything that contains a number but I need to extend this to any special character except full stop (.)
SELECT DISTINCT name
FROM table
WHERE name ~ '[0-9]'
I need something like:
SELECT DISTINCT name
FROM table
WHERE name ~ '[0-9]' OR name ~'[,#';:#~[]{}etcetc'
Sample Data:
name
john
joh1n1
j!ohn!
jo!h2n
joh.n
jo.&hn
j.3ohn
j.$9ohn
Expected Output:
name
joh1n1
j!ohn!
jo!h2n
jo.&hn
j.3ohn
j.$9ohn
You may use
WHERE name !~ '^[[:alpha:].]+$'
Here, all records that do not consist of only alphabetic or dot symbols will be returned. ^ matches the start of a string position, [[:alpha:].]+ matches one or more letters or dots and $ matches the end of string position.
If it is for PostgreSQL you may use
WHERE name SIMILAR TO '%[^[:alpha:].]%'
The SIMILAR TO operator accepts POSIX character classes and bracket expressions and wildcards, too, and requires a full string match. So, % allows any chars before any 1 char other than letter or dot ([^[:alpha:].]), and then there may also be any other chars till the end of the string.
You can do:
SELECT DISTINCT name FROM table WHERE name !~* '[a-z]'
This means: match on names that do not contain any alphabetic character.
Operator !~* means:
Does not match regular expression, case insensitive
Edit based on the provided sample data and expected results.
If you want to match on names that contain at least one character other than an alphabetic character or a dot, then you can do:
select * from mytable where name ~* '[^a-z.]'
Demo on DB Fiddle:
with mytable(name) as (values
('john'),
('joh1n1'),
('j!ohn!'),
('jo!h2n'),
('joh.n'),
('jo.&hn'),
('j.3ohn'),
('j.$9ohn')
)
select * from mytable where name ~* '[^a-z.]'
| name |
| :------ |
| joh1n1 |
| j!ohn! |
| jo!h2n |
| jo.&hn |
| j.3ohn |
| j.$9ohn |
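The two working formulations (a name that contains some character other than a letter or dot, and a name that does not consist solely of letters and dots) can be compared on the sample data with a short Python sketch. Python's re module is used here as a stand-in for the POSIX regex operators, so [A-Za-z] replaces [[:alpha:]]; that substitution is an assumption that holds for ASCII input like the sample.

```python
import re

names = ["john", "joh1n1", "j!ohn!", "jo!h2n", "joh.n", "jo.&hn", "j.3ohn", "j.$9ohn"]

# Formulation 1 (like name ~* '[^a-z.]'):
# keep names containing at least one character that is not a letter or a dot.
contains_other = [n for n in names if re.search(r"[^a-z.]", n, re.IGNORECASE)]

# Formulation 2 (like name !~ '^[[:alpha:].]+$'):
# keep names that are NOT made up entirely of letters and dots.
not_only_allowed = [n for n in names if not re.fullmatch(r"[A-Za-z.]+", n)]

print(contains_other)
print(not_only_allowed)
```

The two formulations agree on the sample data. They differ on an empty string: the first rejects it (there is no offending character), while the second keeps it (it does not match one or more letters or dots).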

PostgreSQL String search for partial patterns removing extraneous characters

Looking for a simple SQL (PostgreSQL) regular expression or similar solution (maybe soundex) that allows a flexible search, so that dashes, spaces and similar characters are ignored and only the raw characters are compared against the table.
Currently using:
SELECT * FROM Productions WHERE part_no ~* '%search_term%'
If the user types UTR-1, it fails to bring up UTR1 or UTR 1 stored in the database.
In general, matches fail when a part_no has a dash and the user omits it (or vice versa).
Example: a search for part UTR-1 should find all of the matches below.
UTR1
UTR --1
UTR 1
Any suggestions?
You may well find the official, built-in (from 8.3 at least) full-text search capabilities in PostgreSQL worth looking at:
http://www.postgresql.org/docs/8.3/static/textsearch.html
For example:
It is possible for the parser to produce overlapping tokens from the
same piece of text.
As an example, a hyphenated word will be reported both as the entire word
and as each component:
SELECT alias, description, token FROM ts_debug('foo-bar-beta1');
alias | description | token
-----------------+------------------------------------------+---------------
numhword | Hyphenated word, letters and digits | foo-bar-beta1
hword_asciipart | Hyphenated word part, all ASCII | foo
blank | Space symbols | -
hword_asciipart | Hyphenated word part, all ASCII | bar
blank | Space symbols | -
hword_numpart | Hyphenated word part, letters and digits | beta1
SELECT *
FROM Productions
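WHERE REGEXP_REPLACE(part_no, '[^[:alnum:]]', '', 'g')
= REGEXP_REPLACE('UTR-1', '[^[:alnum:]]', '', 'g')
Create an index on REGEXP_REPLACE(part_no, '[^[:alnum:]]', '', 'g') for this to work fast. The 'g' flag is needed because PostgreSQL's REGEXP_REPLACE replaces only the first match by default, which would leave values such as UTR --1 only partly stripped.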
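The normalization idea (strip everything that is not a letter or digit from both the stored value and the search term, then compare) can be sketched in Python as follows. The helper names normalize and find_parts are hypothetical, [^0-9A-Za-z] stands in for [^[:alnum:]] on ASCII input, and Python's re.sub removes every match, not just the first.

```python
import re

def normalize(part_no: str) -> str:
    # Drop every character that is not an ASCII letter or digit.
    return re.sub(r"[^0-9A-Za-z]", "", part_no)

def find_parts(search_term: str, stored: list[str]) -> list[str]:
    # Compare normalized forms, mirroring the WHERE clause above.
    wanted = normalize(search_term)
    return [p for p in stored if normalize(p) == wanted]

stored_parts = ["UTR1", "UTR --1", "UTR 1", "UTX-9"]
print(find_parts("UTR-1", stored_parts))
```

This comparison is case-sensitive, like the SQL version; lowercasing both sides would be a further assumption about the desired behavior.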