PostgreSQL database search with regex - sql

I'm using a PostgreSQL database with VB.NET and ODBC (Windows).
I'm searching sentences for whole words by combining SELECT with a regular expression, like this:
"SELECT dtbl_id, name
FROM mytable
WHERE name ~*'" + "( |^)" + TextBox1.Text + "([^A-z]|$)'"
This searches well in some cases but because of syntax errors in text (or other reasons) it sometimes fails. For example, if I have the sentence
BILLY IDOL: WHITE WEDDING
the word "white" will be found. But if I have
CLASH-WHITE RIOT
then "white" will not be found, because there is no space before the word "white".
The simplest solution would be to temporarily replace characters such as :,.\/-= in the sentences with spaces.
Is it possible to do this in a single SELECT statement, so that it is suitable for use with .NET/ODBC? Maybe inside the same regular expression?
If it is, how?

Try this:
SELECT 'CLASH-WHITE RIOT' ~ '[[:<:]]WHITE[[:>:]]';
[[:<:]] and [[:>:]] match the beginning and the end of a word, respectively.
More information is available at: http://www.postgresql.org/docs/9.1/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
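To see why the word-boundary form succeeds where the space-based pattern fails, here is a minimal sketch in Python. It is only an illustration, not PostgreSQL code: Python's re module spells a word boundary as \b rather than [[:<:]] and [[:>:]], the function names are hypothetical, and re.escape is an added assumption so that user input containing regex metacharacters is matched literally.

```python
import re

def contains_word(sentence: str, word: str) -> bool:
    """Return True if `word` occurs in `sentence` as a whole word (case-insensitive)."""
    # \b plays the role that [[:<:]] and [[:>:]] play in the PostgreSQL answer.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None

def contains_word_space_based(sentence: str, word: str) -> bool:
    """The question's approach: a space or the start of the string must precede the word."""
    pattern = r"( |^)" + re.escape(word) + r"([^A-z]|$)"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None

print(contains_word("CLASH-WHITE RIOT", "white"))              # True
print(contains_word_space_based("CLASH-WHITE RIOT", "white"))  # False
```

The second call fails for the reason given in the question: the character before WHITE is a hyphen, not a space.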

Related

How to replace some characters after a specific character to another specific character in one big sql line in notepad++

I have a big SQL file with thousands of users, something like this:
('someone1#mydomain.com','{SSHA512}JWHCqHzazH2vGneLPfhMKkoAamzvxdNCWYOlhZ+uDx36jHdoMXwQmbEemvUMn7ZG6c9+22noXjjb2hAb99/5A/slscDJPKav','','en_US','maildir','Maildir','/home/vmail','vmail1','mydomain.com/someone1/',0,'mydomain.com','','','normal','',0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,NULL,'1970-01-01 01:01:01',0,'',NULL,NULL,'2020-03-19 13:15:58','2015-08-03 06:11:53','2020-03-19 13:15:58','9999-12-31 00:00:00',1'someone1'),
('someone2#mydomain.com','{SSHA512}UoMeyocmdC2DxM0S7B4WFdjnCNuvkngzzLus33h9nugKVlvdhlcboKmMDDuAkCHEyLBUgf8DicKWFPJVS7EOF/ytv27MQ3Ch','','en_US','maildir','Maildir','/home/vmail','vmail1','mydomain.com/someone2/',0,'mydomain.com','','','normal','',0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,NULL,'1970-01-01 01:01:01',0,'',NULL,NULL,'2015-12-17 12:27:35','2015-08-03 06:44:10','2021-06-08 06:55:33','9999-12-31 00:00:00',1'someone2'),
('someone3#mydomain.com','{SSHA512}A6ToCf4OfP3XNEU9ngEmGN/LDquH9+s9Qxme3SoJaDyVvxiWpnwwTiAALSdnmhIxDB2VQK0zhdF+jP8ARvh0N3IDL0Xv/KmL','','en_US','maildir','Maildir','/home/vmail','vmail1','mydomain.com/someone3/',0,'mydomain.com','','','normal','',0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,NULL,'1970-01-01 01:01:01',0,'',NULL,NULL,'2018-04-03 12:31:09','2015-08-03 06:50:01','2018-04-03 12:31:18','9999-12-31 00:00:00',1'someone3'),
('someone4#mydomain.com','{SSHA512}t7/JbUPQ+rtKeRTgWRH6KlETr2JsqYORBOZouzOzs4Wo6YfHYLoy0m+U4kZXk+AeNgMep2hGZSodPZdK2l2bn9MhOKHOuF/L','','en_US','maildir','Maildir','/home/vmail','vmail1','mydomain.com/someone4/',0,'mydomain.com','','','normal',''0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,NULL,'1970-01-01 01:01:01',0,'',NULL,NULL,'2020-03-18 07:48:26','2016-11-14 06:59:04','2021-06-08 05:54:28',9999-12-31 00:00:00',1'someone4')
Now I need to delete the last word ('someone1', 'someone2', 'someone3', 'someone4') that is attached to the 1 at the end of every user's row. The result should look like
....9999-12-31 00:00:00',1)
not like the original
....9999-12-31 00:00:00',1'someone1')
....9999-12-31 00:00:00',1'someone2')
etc
But keep in mind that the rows are not on separate lines. Everything is on one big line, which is why I am asking for help. Thanks a lot.
It seems (from your examples) that the rows do not contain any parentheses other than their start and end characters. So you can search for a quotation mark ', any number of letters and/or digits, another quotation mark ', and then ).
To do this:
Open the Replace window in Notepad++ with the Ctrl+H shortcut
In the Search Mode section, select Regular expression
Write '[a-zA-Z0-9]*?[-,_,.]*?[a-zA-Z0-9]*?[-,_,.]*?[a-zA-Z0-9]*?[-,_,.]*?[a-zA-Z0-9]*?'\) in the Find what box
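Write \) in the Replace with box (the match already includes both quotation marks around the user name, so only the closing parenthesis is put back)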
Click the Replace All button.
This works if user names consist of letters and digits, with runs of _, - or . in at most 3 places. (The commas inside [-,_,.] are literal, so commas are accepted as separators too.)
Be sure to keep a copy of the original file as a backup. Also be aware that this regular expression may match unrelated parts if any row contains a closing parenthesis other than the one at its end.
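The same replacement can be sketched outside Notepad++. The Python snippet below is an assumption-laden illustration rather than the answer's exact recipe: it uses a simplified single character class (letters, digits, _, . and -) instead of the longer pattern above, the sample rows are shortened, and the function name drop_trailing_name is hypothetical.

```python
import re

# Quote, user name made of letters/digits/_/./-, quote, closing parenthesis.
TRAILING_NAME = re.compile(r"'[A-Za-z0-9_.-]*'\)")

def drop_trailing_name(sql_line: str) -> str:
    """Remove the quoted user name glued to the end of each row, keeping ')'."""
    return TRAILING_NAME.sub(")", sql_line)

# Two shortened rows on a single line, as in the question.
line = (
    "('a#mydomain.com','9999-12-31 00:00:00',1'someone1'),"
    "('b#mydomain.com','9999-12-31 00:00:00',1'some.one-2')"
)
print(drop_trailing_name(line))
```

Because the pattern is anchored on the closing parenthesis, it does not matter that all rows sit on one line.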

TRIM or REPLACE in Netsuite Saved Search

I've looked at lots of examples for TRIM and REPLACE on the internet and for some reason I keep getting errors when I try.
I need to strip suffixes from my Netsuite item record names in a saved item search. There are three possible suffixes: -T, -D, -S. So I need to turn 24335-D into 24335, and 24335-S into 24335, and 24335-T into 24335.
Here's what I've tried and the errors I get:
Can you help me please? Note: I can't assume a specific character length of the starting string.
Use case: We already have a field on item records called Nickname with the suffixes stripped. But I've run into cases where Nickname is incorrect compared to Name. Ex: Name is 24335-D but Nickname is 24331-D. I'm trying to build a saved search alert that tells me any time the Nickname does not equal the suffix-stripped Name.
PS: is there anywhere I can pay for quick a la carte Netsuite saved search questions like this? I feel bad relying on free technical internet advice but I greatly appreciate any help you can give me!
You are including too much SQL. A formula is a single result-field expression, not a full statement, so there is no FROM or AS. There is another place to set the result column/field name. One option here is REGEXP_REPLACE().
REGEXP_REPLACE({name},'\-[TDS]$', '')
Regex meaning:
\- : a literal -
[TDS] : one of T D or S
$ : end of line/string
To compare fields, a Formula (Numeric) using a CASE statement can be useful, because it makes it easy to compare the result to a number in a filter, for example a simple "equal to 1".
CASE WHEN {custitem_nickname} <> REGEXP_REPLACE({name},'\-[TDS]$', '') then 1 else 0 end
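The logic of the two formulas can be checked with a small Python sketch. This is only an illustration of the regex and the comparison, not NetSuite code; the function names strip_suffix and nickname_mismatch are hypothetical.

```python
import re

# A literal hyphen followed by T, D or S at the end of the string.
SUFFIX = re.compile(r"-[TDS]$")

def strip_suffix(name: str) -> str:
    """Mirror REGEXP_REPLACE({name}, '\\-[TDS]$', ''): drop a trailing -T, -D or -S."""
    return SUFFIX.sub("", name)

def nickname_mismatch(name: str, nickname: str) -> int:
    """Mirror the CASE formula: 1 when nickname differs from the stripped name, else 0."""
    return 1 if nickname != strip_suffix(name) else 0

print(strip_suffix("24335-D"))                # 24335
print(nickname_mismatch("24335-D", "24331"))  # 1
```

Since the pattern is anchored with $, the length of the leading part of the name does not matter.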
You are getting an error because TRIM can trim only one character; see the Oracle documentation:
https://docs.oracle.com/javadb/10.8.3.0/ref/rreftrimfunc.html (last example).
So try using something like this (shown here for the -D suffix):
TRIM(TRAILING '-' FROM TRIM(TRAILING 'D' FROM {entityid}))
And always keep in mind that saved searches run as Oracle SQL queries, so the Oracle SQL documentation can help you understand how to use the available functions.

String manipulation with Replace in SQL

I am using a replace function to add some quotes around a couple of keywords.
However, this replacement doesn't work for a few cases, like the example below.
This is the query:
replace(replace(aa.SourceQuery,'sequence','"sequence"'),'timestamp','"timestamp"')
Before:
select timestamp, SparkTimeStamp
from SparkRecordCounts
After:
select "timestamp", Spark"timestamp"
from SparkRecordCounts
However, I want it to be like:
select "timestamp", Sparktimestamp
from SparkRecordCounts
EDIT I wrote this before knowing what RDBMS you were using but have left it in case it helps someone else.
I think you are looking for word boundaries in your replacement, which are generally a job for regular expressions.
Oracle has one built in, called regexp_replace, and you could use something like this:
regexp_replace(aa.SourceQuery, '(^|\s|\W)timestamp($|\s|\W)', '\1"timestamp"\2')
The regular expression looks at the start for:
^ - the start of the line OR
\s - a space character OR
\W - a non-word character
It then matches timestamp, and must end with:
$ - the end of the line OR
\s - a space character OR
\W - a non-word character
Then, and only then, does it perform the replace. \1 and \2 are used to preserve what word boundary matched at the beginning and ending of the word.
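A minimal sketch of the same capture-group idea in Python, to make the behavior concrete. It is an illustration only, not Oracle code: the function name quote_keyword is hypothetical, and re.escape is an added assumption so the keyword is treated literally.

```python
import re

def quote_keyword(sql: str, keyword: str) -> str:
    """Wrap `keyword` in double quotes only where it stands alone as a word."""
    # Group 1: start of string, whitespace, or a non-word character before the keyword.
    # Group 2: end of string, whitespace, or a non-word character after it.
    pattern = r"(^|\s|\W)" + re.escape(keyword) + r"($|\s|\W)"
    # Put the captured delimiters back around the quoted keyword.
    return re.sub(pattern, lambda m: f'{m.group(1)}"{keyword}"{m.group(2)}', sql)

query = "select timestamp, Sparktimestamp\nfrom SparkRecordCounts"
print(quote_keyword(query, "timestamp"))
```

The keyword inside Sparktimestamp is left alone because the character before it is a letter. One limitation of consuming the delimiters: two occurrences separated by a single character would not both be replaced in one pass.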
I'm not sure how other databases handle regexp_replace; it looks like MySQL can do it via a plugin, but there may not be a native method.
SQL Server has a solution to something similar here

How can I perform a SQL SELECT with a LIKE condition for a string containing an open bracket character?

I have a simple search query:
<cfquery name="_qSearch" dbtype="Query">
SELECT
*
FROM MyQoQ
WHERE
DESCRIPTION LIKE '%#URL.searchString#%'
</cfquery>
This query works excellently for most values. However, if someone searches for a value like "xxx[en", it bombs with the error message "The pattern of the LIKE conditional is malformed."
Is there any way around this, since the bracket has a special use in CFQUERY?
QoQ shares a feature of T-SQL (MS SQL Server) whereby it's not just % and _ that are wildcards in LIKE: it also supports regex-style character classes, as in [a-z] for any lowercase letter.
To escape these values and match the literal equivalents, you can use a character class itself, i.e. [[] will match a literal [, and of course you probably also want to escape any % and _ in the user input - you can do all three like so:
'%#Url.SearchString.replaceAll('[\[%_]','[$0]')#%'
That is just a simple regex replace (using String.replaceAll) to match all instances of [ or % or _ and wrap each one in [..] - the $0 on the replacement side represents the matched text.
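The effect of that replacement can be sketched in Python. This is an illustration of the escaping rule only, not CFML; the function name escape_like is hypothetical.

```python
import re

def escape_like(term: str) -> str:
    """Wrap each [, % and _ in square brackets so LIKE treats it as a literal."""
    return re.sub(r"[\[%_]", lambda m: "[" + m.group(0) + "]", term)

print(escape_like("xxx[en"))   # xxx[[]en
print(escape_like("50%_off"))  # 50[%][_]off
```

The escaped value is then placed between the two % wildcards exactly as in the original query.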

REGEX for complete word matching

OK So i am confused (obviously)
I'm trying to return rows (from Oracle) where a text field contains a complete word, not just the substring.
a simple example is the word 'I'.
Show me all rows where the string contains the word 'I', but not simply where 'I' is a substring somewhere as in '%I%'
so I wrote what i thought would be a simple regex:
select REGEXP_INSTR(upper(description), '\bI\b') from mytab;
expecting that I would be detected with word boundaries. I get no results (or rather, the result 0 for each row).
what i expect:
'I am the Administrator' -> 1
'I'm the administrator' -> 0
'Am I the administrator' -> 1
'It is the infamous administrator' -> 0
'The adminisrtrator, tis I' -> 1
Isn't \b supposed to find the contained string by word boundary?
tia
I believe that \b is not supported by your flavor of regex:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm#i1007670
Therefore you could do something like:
(^|\s)word(\s|$)
This at least ensures that your "word" is separated by whitespace or is the whole string.
Oracle doesn't support word boundary anchors, but even if it did, you wouldn't get the desired result: \b matches between an alphanumeric character and a non-alphanumeric character. The exact definition of what an alnum is differs between implementations, but in most flavors, it's [A-Za-z0-9_] (.NET also considers Unicode letters/digits).
So there are two boundaries around the I in %I%.
If you define your word boundary as "whitespace before/after the word", then you could use
(^|\s)I(\s|$)
which would also work at the start/end of the string.
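A quick way to check the whitespace-delimited pattern against the question's expected results is the Python sketch below. It is an illustration only (Python's re, not Oracle's REGEXP_INSTR), and the function name has_word_i is hypothetical; it returns 1 or 0 rather than a match position.

```python
import re

# "I" preceded by start-of-string or whitespace, followed by whitespace or end-of-string.
WHOLE_WORD_I = re.compile(r"(^|\s)I(\s|$)")

def has_word_i(description: str) -> int:
    """Return 1 if the upper-cased description contains I as a whitespace-delimited word."""
    return 1 if WHOLE_WORD_I.search(description.upper()) else 0

samples = [
    "I am the Administrator",
    "I'm the administrator",
    "Am I the administrator",
    "It is the infamous administrator",
    "The adminisrtrator, tis I",
]
for text in samples:
    print(has_word_i(text), text)
```

By contrast, in a flavor that does support it, \bI\b would match the I in "I'm", because the apostrophe is a non-word character.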
Oracle's native regex support is limited. \b or \< cannot be used as word delimiters. You may want Oracle Text for word search.