Search PostgreSQL column for substring - sql

I have a DB column that has entries like this:
"56/45/34"
"78/34/145"
"45"
"" (i.e. NULL)
I want to search for the rows that match a certain number. For example, "45" should return the first and third rows but not the second.

We can try using a regex approach here with word boundaries:
select col
from your_table
where col ~* '\y45\y';

You can convert the delimited string to an array and then test the array
select *
from the_table
where '45' = any(string_to_array(the_column, '/'))
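The split-and-test idea behind string_to_array can be checked outside the database. The following is only a Python sketch of the same logic, not PostgreSQL code; the names rows and contains_number are made up for illustration.

```python
# Sample values from the question; None stands in for SQL NULL.
rows = ["56/45/34", "78/34/145", "45", None]


def contains_number(value, target):
    """Return True when target is one of the '/'-separated parts of value."""
    if value is None:  # a NULL column value never matches
        return False
    return target in value.split("/")


matches = [r for r in rows if contains_number(r, "45")]
print(matches)  # ['56/45/34', '45']
```

Because the comparison is against whole parts, "145" in the second row does not count as a match for "45".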

Related

Parsing a string in postgresql

Let's say I have column of datatype varchar, the column contains values similar to these
'My unique id [john3 UID=123]'
'My unique id [henry2 UID=1234]'
'My unique id [tom2 UID=56]'
'My unique id [jerry25 UID=98765]'
How can I get only the numbers after UID= in the strings using PostgreSQL?
For example, in the string 'My unique id [john3 UID=123]' I want only 123; similarly, in the string 'My unique id [jerry25 UID=98765]' I want only 98765.
Is there a way in PostgreSQL to do it?
We can use REGEXP_REPLACE here:
SELECT col, REGEXP_REPLACE(col, '.*\[\w+ UID=(\d+)\].*$', '\1') AS uid
FROM yourTable;
Edit:
If a given value might not match the above pattern and you want to return the entire original value in that case, we can use a CASE expression:
SELECT col,
CASE WHEN col LIKE '%[%UID=%]%'
THEN REGEXP_REPLACE(col, '.*\[\w+ UID=(\d+)\].*$', '\1')
ELSE col END AS uid
FROM yourTable;
You can also use regexp_matches for a shorter regular expression:
select regexp_matches(col, '(?<=UID\=)\d+') from t;
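To see what the capture is expected to produce for the sample strings, here is a small Python sketch of the same extraction. It uses Python's re module rather than PostgreSQL, and the helper name extract_uid is hypothetical.

```python
import re

# Capture the run of digits that follows "UID=".
UID_PATTERN = re.compile(r"UID=(\d+)")


def extract_uid(text):
    """Return the digits after 'UID=' as a string, or None when absent."""
    match = UID_PATTERN.search(text)
    return match.group(1) if match else None


print(extract_uid("My unique id [john3 UID=123]"))      # 123
print(extract_uid("My unique id [jerry25 UID=98765]"))  # 98765
print(extract_uid("no identifier here"))                # None
```

Returning None for a non-matching value is a choice made for this sketch; the CASE query above returns the original string instead.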

SQL Select where column contains 1 value in set of words

I need a select that returns a row if column A of that row contains any word from a list of words taken from user input.
SELECT *
FROM MyTable
WHERE ColumnA CONTAINS ANY 'list of word'
Since the list has an unknown number of words, I store the whole list in a single string. Each word can be separated by "_", "-" or white space.
You can try something like this if you are using Oracle:
SELECT * FROM MyTable WHERE ColumnA in (
select upper(regexp_substr('word1-word2-word3','[^-]+',1,level)) from dual
connect by upper(regexp_substr('word1-word2-word3','[^-]+',1,level)) is not null)
If you are using "_", replace the hyphen with an underscore in the regexp_substr pattern.
I came up with this solution:
SELECT *
from TableA tb
RIGHT JOIN STRING_SPLIT ( 'list of words' , 'separator' ) v on tb.ColumnA = v.value
WHERE tb.ColumnA IS NOT NULL
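Both answers split the list into words and then compare the column against each word. A Python sketch of that logic follows, assuming the three separators named in the question ("_", "-", white space) and a case-insensitive comparison; split_words and matches_any are illustrative names, not part of any database API.

```python
import re


def split_words(word_list):
    """Split on runs of '_', '-' or whitespace and upper-case each word."""
    return {w.upper() for w in re.split(r"[_\-\s]+", word_list) if w}


def matches_any(column_value, word_list):
    """True when the column value equals any word in the list (ignoring case)."""
    return column_value.upper() in split_words(word_list)


print(matches_any("word2", "word1-word2_word3 word4"))  # True
print(matches_any("word5", "word1-word2_word3 word4"))  # False
```

Like the queries above, this tests for equality with a word rather than for a substring inside a longer value.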

Where x character equal value

How can I select records where in the column Value the 5th character is letter A?
For example the following records:
ID Value
-------------------------
1 1234A5636A6363
2 1234A4343B6363
3 1234B5353A6363
If I run
select * from table
where Value like '%A%'
this returns all records, but I only want the first two, where the 5th character is A, regardless of whether the text contains other A characters.
select *
from your_table
where substring(Value, 5, 1) = 'A'
The LIKE operator, in addition to %, which matches any number of any character, can use _, which matches any one single character. You may try:
SELECT *
FROM yourTable
WHERE Value LIKE '____A%'; -- 4 underscores here
Use LIKE with _ (underscore), as below:
LIKE '____A%'
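In SQL Server:
select *
from YourTableName
where CHARINDEX('A', ColumnName, 5) = 5
Note: the third argument starts the search at position 5, so this matches only when the 5th character is 'A'. Without it, CHARINDEX returns the position of the first 'A' in the string, which misses values that have an earlier 'A'. Replace ColumnName with your own column.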
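A positional check and a first-occurrence search agree on the sample data but can differ on other values. The Python sketch below illustrates the difference; the fourth value, which starts with an A, is a hypothetical row added for this comparison and is not in the question.

```python
values = [
    "1234A5636A6363",
    "1234A4343B6363",
    "1234B5353A6363",
    "A234A5636A6363",  # hypothetical extra row with an earlier 'A'
]


def fifth_char_is_a(value):
    """Positional check: look only at the 5th character (index 4)."""
    return value[4:5] == "A"


def first_a_is_fifth(value):
    """First-occurrence check: is the first 'A' found at the 5th character?"""
    return value.find("A") == 4


by_position = [v for v in values if fifth_char_is_a(v)]
by_first_occurrence = [v for v in values if first_a_is_fifth(v)]
print(by_position)
print(by_first_occurrence)
```

The positional check keeps the hypothetical fourth row, while the first-occurrence check drops it because its first A is at position 1.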

Ordering Postgresql query result on Housenumber column by custom comparer

I'm using postgresql as DB.
I'm using a query to select the housenumber column, of type varchar (and some other columns), from a table with building info. I want the result to be ordered in a different way than plain string comparison.
For example, if I have following results:
"1"
"1 block2"
"1 b30"
"1 b3"
"1 b3 s4"
"10"
"2"
I want this result to be sorted by following logic:
1) take the source string "1 b3 s4"
2) split it into ["1" , "b3" , "s4"]
3) parse each substring to an integer, ignoring characters that are not digits, giving [1 , 3, 4]
4) calculate a bigger number for sorting as 1 * 1000000 + 3 * 1000 + 4 = 1003004.
Is this possible, and how could I implement this method and use it for sorting the query result?
Here is my SQL query (shortened):
SELECT housenumber, name
FROM osm_buildings
where
housenumber <> ''
order by housenumber
limit 100
I'm not sure why you would want to convert to some big integer for sorting. You can do the following:
Remove all characters that are not digits or spaces.
Convert to an array, splitting on one or more spaces.
Convert the array to an integer array.
Then you can sort on this:
order by regexp_split_to_array(regexp_replace(v.addr, '[^0-9 ]', '', 'g'), ' +')::int[]
You can store this as a value in a table, if you want to persist it.
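The ordering that an integer-array sort key produces can be previewed with a short Python sketch of the same three steps. This is an illustration of the logic rather than PostgreSQL code, and house_key is a hypothetical helper name.

```python
import re


def house_key(housenumber):
    """Build a list-of-integers sort key from a house number string."""
    # Step 1: remove every character that is not a digit or a space.
    digits_only = re.sub(r"[^0-9 ]", "", housenumber)
    # Steps 2 and 3: split on whitespace and convert each part to int.
    # str.split() with no argument also drops empty parts.
    return [int(part) for part in digits_only.split()]


houses = ["1", "1 block2", "1 b30", "1 b3", "1 b3 s4", "10", "2"]
ordered = sorted(houses, key=house_key)
print(ordered)
# ['1', '1 block2', '1 b3', '1 b3 s4', '1 b30', '2', '10']
```

Lists compare element by element, so "1 b3" sorts before "1 b30" and "2" before "10", without needing the multiplied-out integer from the question.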

pgsql parse string to get a string after certain position

I have a table column that has data like
NA_PTR_51000_LAT_CO-BOGOTA_S_A
NA_PTR_51000_LAT_COL_M_A
NA_PTR_51000_LAT_COL_S_A
NA_PTR_51000_LAT_COL_S_B
NA_PTR_51000_LAT_MX-MC_L_A
NA_PTR_51000_LAT_MX-MTY_M_A
I want to parse each column value so that I get the values in column_B. Thank you.
COLUMN_A COLUMN_B
NA_PTR_51000_LAT_CO-BOGOTA_S_A CO-BOGOTA
NA_PTR_51000_LAT_COL_M_A COL
NA_PTR_51000_LAT_COL_S_A COL
NA_PTR_51000_LAT_COL_S_B COL
NA_PTR_51000_LAT_MX-MC_L_A MX-MC
NA_PTR_51000_LAT_MX-MTY_M_A MX-MTY
I'm not sure about the PostgreSQL syntax, and I can't get SQL Fiddle to accept the schema build...
The substring positions and lengths may vary...
Select Column_A, substr(columN_A,18,length(columN_A)-17-4) from tableName
Ok how about this then:
http://sqlfiddle.com/#!15/ad0dd/56/0
Select column_A, b
from (
Select Column_A, b, row_number() OVER (ORDER BY column_A) AS k
FROM (
SELECT Column_A
, regexp_split_to_table(Column_A, '_') b
FROM test
) I
) X
Where k%7=5
Inside out:
The innermost select simply splits the data into multiple rows on _.
The middle select adds a row number so that we can use the mod operator to find every row whose number leaves a remainder of 5.
This ASSUMES that the section of data you're after is always the 5th segment AND that there are always 7 segments. It also relies on the row numbers following the order in which the segments were produced, which ORDER BY column_A alone does not guarantee.
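Use regexp_matches() with a search pattern like 'NA_PTR_51000_LAT_([^_]+)_'
This should return everything after NA_PTR_51000_LAT_ and before the next underscore, which is the value you are looking for. Note that a greedy (.+) in place of [^_]+ would run to the last underscore instead and return CO-BOGOTA_S for the first row.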
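The expected column_B values can be checked with a Python sketch that takes the fifth underscore-separated segment, next to a regex capture that stops at the first underscore. It uses Python's re module, whose greedy matching behaves the same way here; fifth_segment and by_regex are illustrative names.

```python
import re

rows = [
    "NA_PTR_51000_LAT_CO-BOGOTA_S_A",
    "NA_PTR_51000_LAT_COL_M_A",
    "NA_PTR_51000_LAT_MX-MC_L_A",
    "NA_PTR_51000_LAT_MX-MTY_M_A",
]


def fifth_segment(value):
    """Return the 5th '_'-separated segment, or None if there are fewer."""
    parts = value.split("_")
    return parts[4] if len(parts) > 4 else None


def by_regex(value):
    """Capture the characters after the fixed prefix up to the next '_'."""
    match = re.search(r"NA_PTR_51000_LAT_([^_]+)_", value)
    return match.group(1) if match else None


print([fifth_segment(r) for r in rows])  # ['CO-BOGOTA', 'COL', 'MX-MC', 'MX-MTY']
print([by_regex(r) for r in rows])       # ['CO-BOGOTA', 'COL', 'MX-MC', 'MX-MTY']

# A greedy (.+) runs to the last underscore instead:
greedy = re.search(r"NA_PTR_51000_LAT_(.+)_", rows[0]).group(1)
print(greedy)  # CO-BOGOTA_S
```

The segment approach assumes the wanted value is always the fifth segment; the regex approach assumes the prefix is always NA_PTR_51000_LAT_.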