Regex in redshift - sql

I have a problem. I need to extract from this field:
exchange<=><br>type<=>full<br>cont<=>part<br>req<=>no<br>money<=>money<br>money<=>3100,4000,0,month<br>boss<=>0
five pieces of information:
1. full
2. part
3. 3100
4. 4000
5. month
I have tried to use regexp_substr():
regexp_substr(column,'type<=>[^<br>]*')
but I don't know regex well and can't get it to work properly. Can you help me with that?

I have never worked with Redshift, but I can help you with the regex:
"(type|cont|money)<=>([^<,]+)(,([^<,]+),[^<,]+,([^<,]+))?"
The fourth match on your example string captures everything you need from the money field, and it even excludes the 0:
Group 1: money
Group 2: 3100
Group 3: ,4000,0,month
Group 4: 4000
Group 5: month
If you run into problems, tell me.
If you want to improve your regex skills, I can teach you; it will be useful.
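To see how the pattern behaves on the sample field, here is a minimal sketch in Python. It uses Python's re module as a stand-in, not Redshift's own regex functions (which may handle capture groups differently), and the names FIELD, PATTERN and extract_fields are made up for this illustration.

```python
import re

# Sample field value copied from the question.
FIELD = ("exchange<=><br>type<=>full<br>cont<=>part<br>req<=>no<br>"
         "money<=>money<br>money<=>3100,4000,0,month<br>boss<=>0")

# Pattern from the answer above.
PATTERN = re.compile(r"(type|cont|money)<=>([^<,]+)(,([^<,]+),[^<,]+,([^<,]+))?")


def extract_fields(text):
    """Return one tuple of the five groups per match, in order of appearance."""
    # findall() reports groups that did not take part in a match as "".
    return PATTERN.findall(text)


if __name__ == "__main__":
    for number, groups in enumerate(extract_fields(FIELD), start=1):
        print(number, groups)
```

On the sample, the first two matches carry full and part in group 2, and the fourth match carries 3100, 4000 and month in groups 2, 4 and 5. The third match is money<=>money, which has to be skipped.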

Related

Are there any special considerations I need to keep in mind when adding special characters to a SQLite database?

I've searched the web, but I can't find anything that really explains the behaviour I am seeing in my SQLite database.
If I run the query:
SELECT name, COUNT(*) FROM registry WHERE presence == 0 GROUP BY name ORDER BY COUNT(*);
the output ends with the following rows:
Hugo Martins de Carvalho|13
Duarte Pacheco|14
Edite Estrela|14
Paulo Pisco|15
Carlos Caçao|16
So far so good.
However, if I try to fetch all the rows for "Carlos Caçao", nothing appears:
SELECT * FROM registry WHERE name == "Carlos Caçao"; -- this yields no results
I've tried other names that also contain ç, but this is the only case where no results appear.
Does anyone have any idea what might be going on here?

filter post regex in sql

Q1: I am trying to capture patterns like abc-12345 with a regex, using
'[aA-zZ]+\-[0-9]+'
Most of the results are correct, but a few come back with a leading [, like '[abc-57489'. What's the best way to fix the column in SQL to remove the '['?
Q2: To capture more scenarios, I am doing:
coalesce(regexp_extract(column1,'[aA-zZ]+\-[0-9]+'),
coalesce(regexp_extract(column1,'[aA-zZ]+\- [0-9]+'),
coalesce(regexp_extract(column1,'[aA-zZ]+\ - [0-9]+'),
coalesce(regexp_extract(column1,'[aA-zZ]+\ -[0-9]+'),'')))) as columnoneadjusted,
How can I filter out items, after the regex, that don't contain 'abc'?
I found a simpler answer:
coalesce(regexp_extract(column1,'(abc)+\-[0-9]+'),
coalesce(regexp_extract(column1,'(abc)+\- [0-9]+'),
coalesce(regexp_extract(column1,'(abc)+\ - [0-9]+'),
coalesce(regexp_extract(column1,'(abc)+\ -[0-9]+'),'')))) as columnoneadjusted,
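One likely reason for the stray '[' in Q1 is the character class itself: in ASCII, the range A-z also covers the characters [ \ ] ^ _ and `, so [aA-zZ] accepts a leading bracket. The sketch below illustrates this with Python's re module as a stand-in for the SQL engine's regexp_extract. The names LOOSE, STRICT, ABC_ONLY and first_match, and the sample string, are hypothetical. It also shows how an optional space on each side of the hyphen could replace the four nested coalesce calls.

```python
import re

# The range "A-z" spans ASCII 65..122, which also covers [ \ ] ^ _ and `.
LOOSE = re.compile(r"[aA-zZ]+\-[0-9]+")

# Letters only, with an optional single space on either side of the hyphen.
STRICT = re.compile(r"[a-zA-Z]+ ?- ?[0-9]+")

# Only the literal prefix "abc" (case-insensitive), same optional spaces.
ABC_ONLY = re.compile(r"abc ?- ?[0-9]+", re.IGNORECASE)


def first_match(pattern, text):
    """Return the first match of pattern in text, or "" when there is none."""
    found = pattern.search(text)
    return found.group(0) if found else ""


if __name__ == "__main__":
    sample = "ticket [abc-57489 reopened"  # hypothetical input
    print(first_match(LOOSE, sample))   # [abc-57489
    print(first_match(STRICT, sample))  # abc-57489
```

Returning "" when nothing matches mirrors the final fallback of the coalesce chain in the question.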

Using regexp in Big Query to extract URLs

I've been trying to extract any URL present within my 'Text' column in BigQuery. The column contains a mixture of text and URLs dotted throughout (a cell might contain more than one URL). I'm trying to use this regexp:
SELECT
REGEXP_EXTRACT (Text, r'(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9%_:?\+.~#&//=]*')
FROM
Data.Text_Files
I currently get 'failed to parse regular expression' when I try to run the query. I've tried modifying it, but to no avail.
The regexp works in an online builder, but I'm just not sure how to incorporate it into BigQuery.
Any help would be much appreciated, or at least pointers on how to incorporate regular expressions into BigQuery!
Try the query below. It is for BigQuery Standard SQL (see Enabling Standard SQL and Migrating from legacy SQL).
WITH YourTable AS (
SELECT 1 AS id, 'What have you tried so far? Please edit your question to show a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) of the code that you are having problems with, then we can try to help with the specific problem. You can also read [How to Ask](http://stackoverflow.com/help/how-to-ask). ' AS Text UNION ALL
SELECT 2 AS id, 'Important on SO, you can mark accepted answer by using the tick on the left of the posted answer, below the voting. see http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235 for why it is important. There are more ... You can check about what to do when someone answers your question - http://stackoverflow.com/help/someone-answers.' AS Text UNION ALL
SELECT 3 AS id, 'If an answer has helped you solve your problem and you accept it you should also consider voting it up. See more at http://stackoverflow.com/help/someone-answers and Upvote section in http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235' AS Text
)
SELECT
id,
REGEXP_EXTRACT_ALL(Text, r'(?i:(?:(?:(?:ftp|https?):\/\/)(?:www\.)?|www\.)(?:[\da-z-_\.]+)(?:[a-z\.]{2,7})(?:[\/\w\.-_\?\&]*)*\/?)') AS URL
FROM YourTable
This gives you output with the id field and a repeated field holding all the respective URLs.
If you need a flattened result, you can use the variation below:
WITH YourTable AS (
SELECT 1 AS id, 'What have you tried so far? Please edit your question to show a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) of the code that you are having problems with, then we can try to help with the specific problem. You can also read [How to Ask](http://stackoverflow.com/help/how-to-ask). ' AS Text UNION ALL
SELECT 2 AS id, 'Important on SO, you can mark accepted answer by using the tick on the left of the posted answer, below the voting. see http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235 for why it is important. There are more ... You can check about what to do when someone answers your question - http://stackoverflow.com/help/someone-answers.' AS Text UNION ALL
SELECT 3 AS id, 'If an answer has helped you solve your problem and you accept it you should also consider voting it up. See more at http://stackoverflow.com/help/someone-answers and Upvote section in http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work#5235' AS Text
)
SELECT
id, URL
FROM (
SELECT id, REGEXP_EXTRACT_ALL(Text, r'(?i:(?:(?:(?:ftp|https?):\/\/)(?:www\.)?|www\.)(?:[\da-z-_\.]+)(?:[a-z\.]{2,7})(?:[\/\w\.-_\?\&]*)*\/?)') AS URL
FROM YourTable
), UNNEST(URL) as URL
Note: you can use any regexp you find on the web here, but at most one capturing group is allowed. All inner groups must therefore be made non-capturing with ?:, as in the examples above, and only the group you expect to see in the output should be left as is, without ?:.
Your regex has an incomplete capturing group and 2 unescaped characters. I don't know which online regex builder you're using, but maybe you forgot to put your new regex into it?
The problems are as follows:
(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9%_:?\+.~#&//=]*
1. The ( right after \b is the start of a capturing group with no end. You probably want the ) right before the *.
2. The // near the end is unescaped. All slashes need to be escaped. This should probably be \/ or maybe even \/\\.
Here is an example with both of my suggestions implemented: https://regex101.com/r/pt1hqS/1
Good luck fixing it!
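The unclosed group can be checked quickly outside BigQuery. The sketch below is only an approximation: it uses Python's re module rather than the RE2 engine that BigQuery uses, and the names BROKEN, FIXED and compiles are invented for the example.

```python
import re

# Pattern as posted in the question: the last "(" is never closed.
BROKEN = (r"(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9:%._\+~#=]{2,256}"
          r"\.[a-z]{2,6}\b([-a-zA-Z0-9%_:?\+.~#&//=]*")

# Closing parenthesis placed right before the final "*", as suggested above.
FIXED = BROKEN[:-1] + ")*"


def compiles(pattern):
    """Return True when the pattern is syntactically valid for Python's re."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    print(compiles(BROKEN))  # False
    print(compiles(FIXED))   # True
    found = re.search(FIXED, "see http://stackoverflow.com/help/mcve for details")
    print(found.group(0) if found else None)
```

In this sketch Python's re accepts the unescaped // as is, so the missing parenthesis is the only thing that stops the pattern from compiling here.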

Adding a number to a column with a value in it Access

OK, so I have a column with a Count/IIf expression, as shown:
UNTESTED: Count(IIf([TEST]="UNTESTED",1))
What I want to do now is look at where and when the test was done and, if it was in a specific location and year, add 8 to that value. I am now trying to use:
UNTESTED: Count(IIf([TEST]="UNTESTED",1)) AND IIf([REGION]="CANADA" And [YEAR]<=2012,[UNTESTED]+8,[UNTESTED])
Thanks in advance if you can solve this!
Here is the answer to my question; I solved it:
IIf([Site]="CANADA" Or [Site]="ALASKA",IIf([Year]<"2013",Count(IIf([Pass_Fail]="UNTESTED",1))+8,
Count(IIf([Pass_Fail]="UNTESTED",1))),Count(IIf([Pass_Fail]="UNTESTED",1)))

regular expression to pull words beginning with #

Trying to parse an SQL string and pull out the parameters.
Ex: "select * from table where [Year] between #Yr1 and #Yr2"
I want to pull out "#Yr1" and "#Yr2"
I have tried many patterns, but none has worked, such as:
matches = Regex.Matches(sSQL, "\b#\w*\b")
and
matches = Regex.Matches(sSQL, "\b\#\w*\b")
Any help?
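The \b before the # is the problem: # is not a word character, so \b only matches there when the # directly follows a word character, and in your string it follows a space. Maybe this:
\W(#[A-Z0-9a-z]+)
or
\W(#[^\s]+)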
I would have gone with
/(?:^|\s)(#\w+)(?:\s|$)/
or, if you didn't want to include the #,
/(?:^|\s)#(\w+)(?:\s|$)/
though I also like joel's above, so maybe one of these:
/(?:^|\s)(#[^\s]+)(?:\s|$)/
/(?:^|\s)#([^\s]+)(?:\s|$)/
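The behaviour discussed in this thread can be reproduced with a short sketch. It is written in Python rather than .NET, on the assumption that \b and \w behave the same way for this ASCII input. The names SQL, ORIGINAL, PARAMS and find_params are illustrative only, and the lookbehind pattern is one possible alternative, not one taken from the answers above.

```python
import re

SQL = "select * from table where [Year] between #Yr1 and #Yr2"

# Pattern from the question. \b needs a word character on exactly one side of
# the position, but here "#" is preceded by a space, so nothing matches.
ORIGINAL = re.compile(r"\b#\w*\b")

# "#" not preceded by a word character, followed by one or more word characters.
PARAMS = re.compile(r"(?<!\w)#\w+")


def find_params(sql):
    """Return every #-prefixed parameter name found in the SQL text."""
    return PARAMS.findall(sql)


if __name__ == "__main__":
    print(ORIGINAL.findall(SQL))  # []
    print(find_params(SQL))       # ['#Yr1', '#Yr2']
```

Because the lookbehind consumes no characters, two parameters separated by a single space are both found, and a parameter at the very start or end of the string needs no special case.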