regex - how to match all select queries in a file in notepad++ - sql

I'm trying to pull out all select queries in a file. Actually I need to know how many select queries there are in a script. For that I use notepad++.
I look for strings in the following configurations:
'SELECT * FROM'
'SELECT aWORD FROM'
'SELECT FIRSTWORD, SECONDWORD, THIRDWORD FROM'
I have tried with the following regex:
select (\w+)(,\s*\w+)* from
This one didn't work in notepad++. What am I doing wrong?
Thanks in advance.
Manel

Based on your regex, it should be this:
select[\*]*[\w+]*[,\s*\w+]*from
Update: Tried this on Notepad++ 5.8.2 and it works.
As the above regex is a greedy expression and 5.8.2 does not support non-greedy regex, you may have to upgrade to v5.9.3 which does.
Tried using the below regex and it does show you non-greedy results:
select.+?from
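To see why greedy versus non-greedy matters when counting, here is a small sketch in Python. Python's re module is only a stand-in for the Notepad++ engine, and the sample script text is made up for illustration.

```python
import re

# Hypothetical one-line script containing two SELECT statements.
script = "SELECT * FROM a; SELECT x, y FROM b;"

# Greedy: .+ runs to the LAST "from" on the line, so both queries merge into one match.
greedy = re.findall(r"select.+from", script, flags=re.IGNORECASE)

# Non-greedy: .+? stops at the FIRST "from" it can reach, so each query is its own match.
lazy = re.findall(r"select.+?from", script, flags=re.IGNORECASE)

print(len(greedy), len(lazy))
```

When two queries share a line, only the non-greedy form gives a correct count.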

Use this:
['"]\s*select[*,\s\w]+from\s*['"]
It also matches invalid queries (which I don't think matters here), but it requires the surrounding quotes, single or double.
You may also want to add = to the search.

Notepad++ is built on top of Scintilla and uses POSIX-style regex; take a look at Regular Expressions in SciTE.
Based on your examples, you're probably looking for:
select[a-z\*, ]+from
I've tested this on Notepad++ 5.9 and it matches the examples you've provided.
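If the real goal is just the count, the same character-class pattern can be checked outside the editor. This is a rough sketch using Python's re module as a stand-in for Notepad++; the helper name count_selects and the extra UPDATE line are my own additions.

```python
import re

# The three configurations from the question, one per line, plus a non-SELECT line.
script = "\n".join([
    "SELECT * FROM",
    "SELECT aWORD FROM",
    "SELECT FIRSTWORD, SECONDWORD, THIRDWORD FROM",
    "UPDATE t SET x = 1",
])

# Letters, asterisk, comma and space only; the class has no newline,
# so a match cannot run across lines.
SELECT_RE = re.compile(r"select[a-z\*, ]+from", re.IGNORECASE)

def count_selects(text: str) -> int:
    """Return how many SELECT ... FROM fragments the pattern finds."""
    return len(SELECT_RE.findall(text))

print(count_selects(script))
```

Note that this class has no digits, underscores or dots, so a column such as t1.col_2 would not match without widening the class.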

Related

Extract characters between a string and the first occurrence of something in BigQuery

I want to extract a set of characters between "u1=" and the first semi-colon using a regex. For instance, given the following string: id=1w54;name=nick;u1=blue;u2=male;u3=ohio;u5=
The desired regex output should be just blue.
I tested (?<=u1=)[^;]* on https://regex101.com and it works. However, when I run this in BigQuery, using regexp_extract(string, '(?<=u1=)[^;]*') , I get an error that reads "Cannot parse regular expression: invalid perl operator: (?<"
I'm confused why this isn't working in BQ. Any help would be appreciated.
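BigQuery's regular expressions are based on RE2, which does not support lookbehind such as (?<=...). Use a capturing group instead; regexp_extract() then returns the text matched by the group: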
regexp_extract(string, 'u1=([^;]+)')
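The capturing-group idea can be tried out quickly in Python. This is only an illustration: Python's re engine is not RE2 (it does accept lookbehind), but the group-based pattern behaves the same way on this input.

```python
import re

record = "id=1w54;name=nick;u1=blue;u2=male;u3=ohio;u5="

# Match the literal key, then capture everything up to the next semicolon.
match = re.search(r"u1=([^;]+)", record)
value = match.group(1) if match else None

# With + (one or more), a key that has an empty value yields no match at all.
empty = re.search(r"u5=([^;]+)", record)

print(value, empty)
```

If an empty value should come back as an empty string rather than no match, [^;]* can be used in place of [^;]+.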

String manipulation with Replace in SQL

I am using a replace function to add some quotes around a couple of keywords.
However, the replacement gives the wrong result in a few cases, such as the example below.
This is the query:
replace(replace(aa.SourceQuery,'sequence','"sequence"'),'timestamp','"timestamp"')
Before:
select timestamp, SparkTimeStamp
from SparkRecordCounts
After:
select "timestamp", Spark"timestamp"
from SparkRecordCounts
However, I want it to be like:
select "timestamp", Sparktimestamp
from SparkRecordCounts
EDIT I wrote this before knowing what RDBMS you were using but have left it in case it helps someone else.
I think you are looking for word boundaries in your replacement, which are generally a job for regular expressions.
Oracle has one built in, called regexp_replace, and you could use something like this:
regexp_replace(aa.SourceQuery, '(^|\s|\W)timestamp($|\s|\W)', '\1"timestamp"\2')
The regular expression looks at the start for:
^ - the start of the line OR
\s - a space character OR
\W - a non-word character
It then matches timestamp, and must end with:
$ - the end of the line OR
\s - a space character OR
\W - a non-word character
Then, and only then, does it perform the replace. \1 and \2 are used to preserve what word boundary matched at the beginning and ending of the word.
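For anyone who wants to try the boundary logic without an Oracle instance, here is a minimal sketch of the same pattern in Python. The function name quote_keyword is made up, and Python's re.sub stands in for regexp_replace.

```python
import re

# Same pattern as above: a boundary, the keyword, then another boundary.
PATTERN = re.compile(r'(^|\s|\W)timestamp($|\s|\W)')

def quote_keyword(sql: str) -> str:
    """Quote the standalone word timestamp, keeping the boundary characters."""
    # \1 and \2 put back whatever matched on either side of the keyword.
    return PATTERN.sub(r'\1"timestamp"\2', sql)

before = "select timestamp, SparkTimeStamp\nfrom SparkRecordCounts"
after = quote_keyword(before)
print(after)
```

One caveat: each match consumes its boundary characters, so two keywords separated by a single space would not both be replaced in one pass.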
I'm not sure how other databases handle regexp_replace. It looks like MySQL can do it via a plugin, but there may not be a native method.
SQL Server has a solution to something similar here

SQL regex FIND & Replace everything before a word, phrase, or path

I'm using SQL Server Management Studio. I'm trying to run an UPDATE query with a REPLACE that uses a regex to strip off internal pathing. It doesn't seem to be working right. Is there some other way I need to invoke regex in SQL?
UPDATE dbo.Table
SET Path = REPLACE(Path , '.+?(?=Data)', '')
I wanted to basically go from
\\somepath\anotherpath\Data\file.txt to Data\File.txt
There are going to be variations in the paths, so I'm trying to use a regex to remove all characters before the word Data\
My regular expression is ".+?(?=Data)", which seems to be working fine in Textpad, but not in SQL.
REPLACE in SQL Server treats its search argument as a literal string, not as a regex. This can be done using the SUBSTRING and CHARINDEX functions instead.
UPDATE dbo.Table
SET Path = 'Data\' + SUBSTRING(path,CHARINDEX('\Data\',path)+len('\Data\'),len(path))
WHERE CHARINDEX('\Data\',path) > 0
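The find-the-marker-then-slice logic is easy to check outside the database. Below is a rough Python equivalent; the function name strip_before_data is hypothetical. Unlike a typical case-insensitive SQL Server collation, str.find is case-sensitive.

```python
MARKER = "\\Data\\"  # the literal text \Data\

def strip_before_data(path: str) -> str:
    """Drop everything before Data\\; leave the path alone if the marker is absent."""
    index = path.find(MARKER)  # plays the role of CHARINDEX, but 0-based and -1 when missing
    if index == -1:
        return path
    # Skip the marker's leading backslash so the result starts at "Data\".
    return path[index + 1:]

print(strip_before_data("\\\\somepath\\anotherpath\\Data\\file.txt"))
```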
You could use REVERSE and CHARINDEX to do this:
UPDATE dbo.Table
SET Path =
    CASE WHEN Path LIKE '%\%\%'
        THEN RIGHT(Path, CHARINDEX('\', REVERSE(Path), CHARINDEX('\', REVERSE(Path)) + 1) - 1)
        ELSE Path
    END
This will find the second-last backslash, and take the characters that follow it. The case when is there to deal with paths that contain fewer than two backslashes.
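The second-last-backslash idea can be sketched in Python as well, which makes the edge cases easy to test. The function name last_two_segments is invented for the example.

```python
def last_two_segments(path: str) -> str:
    """Keep only the text after the second-last backslash."""
    # Split from the right at most twice: [everything before, folder, file].
    parts = path.rsplit("\\", 2)
    if len(parts) < 3:
        # Fewer than two backslashes: leave the path untouched.
        return path
    return "\\".join(parts[1:])

print(last_two_segments("\\\\somepath\\anotherpath\\Data\\file.txt"))
```

This keeps the last folder whatever it is called, so it only gives Data\file.txt when the file sits directly under Data.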

Write regex for pattern like W00001

I am new to Regular Expressions and any help is highly appreciated.
Pattern like W00000,W00001,W00002,W00004
Must begin with W
Each string before comma must be six characters
String can only be repeated four times
Comma in between
Must not begin or end with comma
I tried below pattern and some others, like (^[W]{1}\d{5}){1,4}'), and none of them work correctly:
Select 'X' from dual Where REGEXP_LIKE ('W12342','(^[W]{1}\d{5})(?<!,)$')
My understanding is that the OP is saying the match should fail if the string begins or ends with a comma, not just that the preceding or trailing commas shouldn't match, so anchors are needed. Also, based on the regex he attempted, I infer that a single group, such as W00000, should match. So, I think the regex should be this, if the characters following the W must always be digits:
^W[:digit:]{5}(,W[:digit:]{5}){0,3}$
Or this, if they can be something other than digits:
^W[^,]{5}(,W[^,]{5}){0,3}$
UPDATE:
The OP posted the following comment:
I am on Oracle 11g and [:digit:] doesn't work. When I replace it with [0-9] it then works fine.
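According to the documentation, Oracle 11g conforms to the POSIX regex standard and supports POSIX character classes such as [:digit:]. Note, however, that a POSIX class is only valid inside a bracket expression, so it has to be written [[:digit:]]; a bare [:digit:] is read as a set of the literal characters :, d, i, g and t, which would explain the failure. I also noticed in the docs that Oracle 11g supports Perl-style backslash character class abbreviations, which I didn't think was the case when I originally wrote this answer. In that case, the following should work: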
^W\d{5}(,W\d{5}){0,3}$
Well in that case, you can do this:
(W[^,]{5},){3}W[^,]{5}
If I understood correctly, this should do it!
^W[0-9]{5}(,W[0-9]{5}){0,3}$
One W12345 pattern, optionally followed by up to three ,W12345 blocks.
Edit1: Added ^ and $ so the match fails if the string begins or ends with a comma
Edit2: Fixed the character class, since [:digit:] failed on Oracle 11g
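A quick way to sanity-check the anchored pattern against the stated rules is to run it over a few cases. This sketch uses Python's re module rather than Oracle's REGEXP_LIKE, and the name is_valid is made up.

```python
import re

# One W plus five digits, then up to three more comma-separated blocks.
CODE_LIST = re.compile(r"W[0-9]{5}(?:,W[0-9]{5}){0,3}")

def is_valid(value: str) -> bool:
    """True when the whole string is one to four W-codes separated by commas."""
    # fullmatch plays the role of the ^ and $ anchors.
    return CODE_LIST.fullmatch(value) is not None

for candidate in ["W12342", "W00000,W00001,W00002,W00004", ",W00000", "W00000,"]:
    print(candidate, is_valid(candidate))
```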

SQL LIKE Command trouble searching for '

Hi, I am facing a problem with the LIKE operator in SQL.
I want to search for special characters within a column.
The special characters are a single quotation mark ' and the braces { and }.
I have tried placing these special characters inside [], but it still doesn't work for '.
I have also tried the except option, but that was of no help either.
Waiting for a response soon
When the value you are searching for contains a single quote, you need to double it.
SELECT *
FROM dbo.Northwind
WHERE Summary LIKE 'single''quotes%'
Try using this:
select * from <table> where <column> like '%''%'
SQL Server escaping is a pain because there are various ways to escape characters, each with different meaning and use case.
A single quote is escaped with another single quote: WHERE myfield LIKE '%''%'.
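The ESCAPE clause is a different mechanism: it escapes the LIKE wildcards (%, _ and [), not the string delimiter, so it does not help with a single quote. Braces have no special meaning in LIKE and need no escaping. To search for a literal percent sign, for example:
SELECT .... WHERE my_column LIKE '%\%%' ESCAPE '\'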
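Here is a small runnable sketch of both rules, using Python's built-in sqlite3 module as a stand-in for SQL Server. The table and column names are made up. SQLite handles quote doubling and ESCAPE the same way for these queries, but SQLite has no [] wildcard in LIKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (summary TEXT)")
conn.executemany(
    "INSERT INTO notes (summary) VALUES (?)",
    [("it's here",), ("plain text",), ("{braces}",), ("100% done",)],
)

def summaries(sql: str) -> list:
    """Run a query and return the summary column as a list."""
    return [row[0] for row in conn.execute(sql)]

# A single quote inside a SQL string literal is written as two single quotes.
with_quote = summaries("SELECT summary FROM notes WHERE summary LIKE '%''%'")

# Braces are ordinary characters in LIKE; nothing to escape.
with_brace = summaries("SELECT summary FROM notes WHERE summary LIKE '%{%'")

# ESCAPE is for wildcards: \% means a literal percent sign.
with_percent = summaries("SELECT summary FROM notes WHERE summary LIKE '%\\%%' ESCAPE '\\'")

print(with_quote, with_brace, with_percent)
```

In application code, passing the search value as a bound parameter avoids having to write the doubled quote by hand.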