Regex to match the pattern split with slash - sql

I want to query the database column with regex to match the string like the following...
1. qwge1/dg2/hjetg3
2. tahry4/rtg5
3. jtyg6
How can I match a string made of one or more segments of the form [a-z]+[0-9], separated by zero or more slashes?

You could use:
^([a-z]+[0-9](/|$))+$
The inner expression, [a-z]+[0-9](/|$), describes a series of alphabetic characters followed by a digit, then by a slash or the end of the string. This expression may be repeated 1 to N times, followed by the end of the string.
Demo on DB Fiddle - I added a few non-matching strings to your sample data:
select val, val ~ '^([a-z]+[0-9](/|$))+$'
from (values
('qwge1/dg2/hjetg3'),
('tahry4/rtg5'),
('jtyg6'),
('abc'),
('qwge1/dg2/hjetg'),
('qwge1/dg2/3')
) x(val)
val | ?column?
:--------------- | :-------
qwge1/dg2/hjetg3 | t
tahry4/rtg5 | t
jtyg6 | t
abc | f
qwge1/dg2/hjetg | f
qwge1/dg2/3 | f
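
To experiment with the pattern outside the database, here is a minimal Python sketch using the same sample strings. Python's `re` module stands in for the PostgreSQL `~` operator, which is an assumption of this sketch; the two engines agree on this simple pattern. `STRICT_PATTERN` is a hypothetical variant that is not part of the answer: the answer's pattern also accepts a trailing slash (for example `qwge1/`), and the variant rejects it.

```python
import re

# Pattern from the answer above: one or more "letters + one digit" segments,
# each followed by a slash or by the end of the string.
ANSWER_PATTERN = re.compile(r"^([a-z]+[0-9](/|$))+$")

# Hypothetical stricter variant (not from the answer): a slash may only
# appear between two segments, so a trailing slash is rejected.
STRICT_PATTERN = re.compile(r"^[a-z]+[0-9](/[a-z]+[0-9])*$")


def matches(pattern, val):
    """Return True when the whole string matches the given pattern."""
    return pattern.search(val) is not None


samples = [
    "qwge1/dg2/hjetg3",
    "tahry4/rtg5",
    "jtyg6",
    "abc",
    "qwge1/dg2/hjetg",
    "qwge1/dg2/3",
    "qwge1/",  # extra case: trailing slash
]

for val in samples:
    print(val, matches(ANSWER_PATTERN, val), matches(STRICT_PATTERN, val))
```

If trailing slashes cannot occur in your data, the two patterns behave the same and the original one is fine.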

Related

SQL: Regex to extract everything between first and last occurrence of a character

I have a table with a string column like this:
------------------------------------------------
| Column |
------------------------------------------------
| #Extract this# and #this too# do not extract |
------------------------------------------------
| Leave this and #get this out# |
------------------------------------------------
I want to extract everything from first # occurrence and the last # occurrence like this:
--------------------------------
| Expected Output |
--------------------------------
| #Extract this# and #this too# |
--------------------------------
| #get this out# |
--------------------------------
I have tried
regexp_substr(column, '#[^.]#', 1, regexp_count(column, '#')), but it gives me an empty string.
Does anyone know how to fix this?
Thanks in advance!
select REGEXP_SUBSTR(column, '#.*#') as pattern from [table]
Demo for the regex output
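Alternatively, use substring with the positions of the first and last #. The length to extract is (position of the last #) - (position of the first #) + 1. Because the last # sits at length(col) - position('#' in reverse(col)) + 1, the length works out to length(col) - position('#' in reverse(col)) - position('#' in col) + 2; the +2 is what keeps both # characters in the result.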
SELECT
  substring(col
            from position('#' in col)
            for length(col) - position('#' in reverse(col)) - position('#' in col) + 2)
FROM
  table;
Result:
substring
#Extract this# and #this too#
#get this out#
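
Both approaches can be checked outside the database with a small Python sketch. The function names are made up for illustration, and Python's `re`, `find` and `rfind` stand in for REGEXP_SUBSTR and position/reverse. The greedy `.*` is what makes the regex stretch from the first # to the last one.

```python
import re


def between_hashes_regex(text):
    """Greedy match: from the first '#' to the last '#', inclusive."""
    m = re.search(r"#.*#", text)
    return m.group(0) if m else None


def between_hashes_index(text):
    """Same result using the positions of the first and last '#'."""
    first = text.find("#")
    last = text.rfind("#")
    if first == -1 or first == last:
        return None  # fewer than two '#' characters
    return text[first:last + 1]


rows = [
    "#Extract this# and #this too# do not extract",
    "Leave this and #get this out#",
]

for row in rows:
    print(between_hashes_regex(row), "|", between_hashes_index(row))
```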

sql-remove dashes from string column

In a stored procedure, I have this field:
LTRIM(ISNULL(O.Column1, ''))
If there is a dash (-) at the end of the value, I want to remove it. A dash should be removed only when it appears at the start or end of the value.
Any suggestions?
EDIT:
Microsoft SQL Server 2014 12.0.5546.0
Expected output:
1)input: "abc-abc" //output: "abc-abc"
2)input: "abc-" //output: "abc"
3)input: "abc" //output: "abc"
I think you might be stuck with string manipulation here.
The CASE expression here takes the LTRIM/RTRIM result from your column and first checks whether both ends have a dash, then checks each end on its own. If a dash exists, it is stripped. It's not pretty, and it won't perform well on a mountain of data, but it will do what you need.
Data setup:
create table trim (col1 varchar(10));
insert trim (col1)
values
('abc'),
(' abc-'),
('abc- '),
('abc-abc '),
(' -abc'),
('-abc '),
(NULL),
(''),
(' -abc- ');
The query:
select
case
when right(ltrim(rtrim(isnull(col1,''))),1) = '-'
and left(ltrim(rtrim(isnull(col1,''))),1) = '-'
then substring(ltrim(rtrim(isnull(col1,''))),2,len(ltrim(rtrim(isnull(col1,''))))-2)
when right(ltrim(rtrim(isnull(col1,''))),1) = '-'
then left(ltrim(rtrim(isnull(col1,''))), len(ltrim(rtrim(isnull(col1,''))))-1)
when left(ltrim(rtrim(isnull(col1,''))),1) = '-'
then right(ltrim(rtrim(isnull(col1,''))), len(ltrim(rtrim(isnull(col1,''))))-1)
else ltrim(rtrim(isnull(col1,'')))
end as trimmed
from trim;
Results:
+---------+
| trimmed |
+---------+
| abc |
| abc |
| abc |
| abc-abc |
| abc |
| abc |
| |
| |
| abc |
+---------+
SQL Fiddle Demo
Since the Database is not mentioned, here is how you do it (rather find it)
SQL Server
Remove the last character in a string in T-SQL?
Oracle
Remove last character from string in sql plus
Postgresql
Postgresql: Remove last char in text-field if the column ends with minus sign
MySQL
Strip last two characters of a column in MySQL
You can use the RIGHT function, along with SUBSTRING, to achieve the result.
SELECT CASE WHEN RIGHT(stringVal,1)= '-' THEN SUBSTRING(stringVal,1,LEN(stringVal)-1)
ELSE stringVal END AS ModifiedString
from
( VALUES ('abc-abc'), ('abc-'),('abc')) as t(stringVal)
+----------------+
| ModifiedString |
+----------------+
| abc-abc |
| abc |
| abc |
+----------------+
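
For comparison, the logic of the longer CASE expression above (trim spaces, then drop at most one dash from each end) can be sketched in a few lines of Python. The function name `strip_edge_dashes` is hypothetical, and this is only an illustration of the rule, not T-SQL you can paste into the procedure.

```python
def strip_edge_dashes(value):
    """Trim spaces, then remove at most one dash from each end."""
    text = (value or "").strip(" ")  # mirrors ISNULL(..., '') plus LTRIM/RTRIM
    if text.startswith("-"):
        text = text[1:]
    if text.endswith("-"):
        text = text[:-1]
    return text


rows = ["abc", " abc-", "abc- ", "abc-abc ", " -abc", "-abc ", None, "", " -abc- "]

for row in rows:
    print(repr(row), "->", repr(strip_edge_dashes(row)))
```

Dashes inside the value are left alone, so "abc-abc" comes back unchanged.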

What is the difference to use CARET symbol in REGEXP_LIKE in oracle?

I am new to regex, so I tried:
select * from ot.contacts where REGEXP_like(last_name,'^[A-C]');
I also tried:
select * from ot.contacts where REGEXP_like(last_name,'[A-C]');
Both return rows where last_name starts with A, B or C, and the number of records fetched is the same. When would I see a difference from using the caret symbol?
In this context, ^ represents the beginning of the string.
'^[A-C]' checks for A, B or C at the beginning of the string.
'[A-C]' checks for A, B or C anywhere in the string.
Depending on your dataset, both expressions might or might not produce the same output. Here is an example where the result set would be different:
last_name | ^[A-C] | [A-C]
----------------- | ------- | -----
Arthur | match | match
Bill | match | match
Jean-Christophe | no match | match
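
The same comparison can be reproduced with a short Python sketch. Python's `re.search` is used here as a stand-in for Oracle's REGEXP_LIKE, and the function names are made up for illustration.

```python
import re


def starts_with_a_to_c(name):
    """Anchored: A, B or C must be the first character."""
    return re.search(r"^[A-C]", name) is not None


def contains_a_to_c(name):
    """Unanchored: A, B or C may appear anywhere in the string."""
    return re.search(r"[A-C]", name) is not None


for name in ["Arthur", "Bill", "Jean-Christophe"]:
    print(name, starts_with_a_to_c(name), contains_a_to_c(name))
```

Only "Jean-Christophe" gives different answers, because its uppercase C is not at the start of the string.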

How remove symbols from the sentence in Oracle?

In Oracle database I have such table.
| TREE | ORG_NAME |
|---------------------------------|----------|
| \Google earth\Nest global\ATAP | ATAP |
| \Google earth\Nest\Beemoney\ | Beemoney |
| \Google\\\BeeKey\ | |
| | York |
I am trying to make sql query which would return such result.
| ORGANIZATION |
|-----------------------------------|
| Google earth > Nest global > ATAP |
| Google earth > Nest > Beemoney |
| Google > BeeKey |
| York |
As you can see, I want to:
1) Remove the \ symbol at the beginning and end of the sentence.
2) Replace a \ symbol inside the sentence with the > symbol.
3) Replace a \\\ sequence inside the sentence with the > symbol.
4) If the TREE column is empty, take the value from the ORG_NAME column.
Here is how I started. This SQL query solves parts 2, 3 and 4. How do I solve part 1? I think I need to use REGEXP_REPLACE, right? How do I do it correctly? Is there a more elegant way to redesign the SQL query? As you can see, I go over the same table several times.
SELECT
COALESCE (TREE, ORG_NAME) as ORGANIZATION
FROM (
SELECT
REPLACE(TREE, '\', '>') AS TREE,
ORG_NAME
FROM (
SELECT
REPLACE(TREE, '\\\', '>') AS TREE,
ORG_NAME
FROM
ORG
)
)
This could be a way with a regexp_replace and a trim to remove the characters from the beginning and the end of the string:
select nvl(regexp_replace( trim('\' from tree), '\\+', ' > '), org_name)
from yourTable
Here is a working solution which uses two calls to regexp_replace:
select
regexp_replace(
regexp_replace('\Google\\\BeeKey\', '^\\?(.*?)\\?$', '\1'), '\\+', ' > ')
from dual;
Google > BeeKey
Demo
The inner call to regexp_replace strips off any possible leading or trailing path separator. The outer call converts each run of one or more internal path separators \ into a single > separator.
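
The trim-then-replace idea can be tried on the question's sample rows with a small Python sketch. The function name `organization` is hypothetical, `str.strip` stands in for Oracle's TRIM, and `re.sub` stands in for REGEXP_REPLACE. The `or org_name` fallback imitates NVL, on the assumption that an empty TREE is stored as NULL (Oracle treats empty strings as NULL).

```python
import re


def organization(tree, org_name):
    """Strip edge backslashes, collapse inner runs into ' > ', fall back to org_name."""
    trimmed = (tree or "").strip("\\")
    cleaned = re.sub(r"\\+", " > ", trimmed)
    return cleaned or org_name  # empty result behaves like NULL in Oracle


rows = [
    ("\\Google earth\\Nest global\\ATAP", "ATAP"),
    ("\\Google earth\\Nest\\Beemoney\\", "Beemoney"),
    ("\\Google\\\\\\BeeKey\\", None),
    (None, "York"),
]

for tree, org_name in rows:
    print(organization(tree, org_name))
```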

GROUP_CONCAT automatically add double quotes only when the field contains double quotes

When I use GROUP_CONCAT in BigQuery on fields that contain double quotes, the resulting values are automatically escaped and wrapped in double quotes.
But if the fields don't contain double quotes, GROUP_CONCAT behaves a little differently.
Case1 (with double quotes)
Table
Row | word | num
--- | ---- | ---
1 | fo"o | 1
2 | ba"r | 1
3 | ba"z | 2
and the Query
SELECT GROUP_CONCAT(word) AS words, num
FROM Table
GROUP BY num
the results
Row | words | num
--- | ------------- | ---
1 | "fo""o,ba""r" | 1
2 | "ba""z" | 2
↑It's escaped automatically.
Case2 (without double quotes)
Table
Row | word | num
--- | ---- | ---
1 | fo'o | 1
2 | ba'r | 1
3 | ba'z | 2
and the Query
SELECT GROUP_CONCAT(word) AS words, num
FROM Table
GROUP BY num
the results
Row | words | num
--- | --------- | ---
1 | fo'o,ba'r | 1
2 | ba'z | 2
↑No double quotes added.
Case3 (normal CONCAT with double quotes)
※The normal CONCAT doesn't behave like GROUP_CONCAT.
No escaping double quotes are added.
Table
Row | word
--- | ----
1 | fo"o
2 | ba'r
and the Query
SELECT CONCAT(word, '12"3') AS words
FROM Table
the results
Row | words
--- | ---------
1 | fo"o12"3
2 | ba'r12"3
Question
I wonder why the results are different between these cases.
In Case1, I don't want the values to be escaped or wrapped in double quotes.
Are there any solutions?
Thanks.
The issue has been reported to Google; however, they haven't provided an ETA for when it will be addressed.
The issue has been posted in the official Google BigQuery issue and feature request tracker; updates about this matter can be found at the link provided.
As mentioned in Marilu's answer, Google mentioned plans to support this use-case but didn't provide an ETA in the feature request at the time of posting. As of February 2015, the issue has been "Fixed".
Google has added a GROUP_CONCAT_UNQUOTED function which behaves nearly identically to GROUP_CONCAT, except that it doesn't escape the double quotes.
Here is the description of the function from the Google Docs for BigQuery Aggregate Functions:
GROUP_CONCAT_UNQUOTED('str' [, separator])
Concatenates multiple strings into a single string, where each value is separated by the optional separator parameter. If separator is omitted, BigQuery returns a comma-separated string.
Unlike GROUP_CONCAT, this function will not add double quotes to returned values that include a double quote character. For example, the string a"b would return as a"b.
Example: SELECT GROUP_CONCAT_UNQUOTED(x) FROM (SELECT 'a"b' AS x), (SELECT 'cd' AS x);
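
To make the difference concrete, here is a rough Python model of the two behaviours. It is only an imitation of the output shown in the question (Case1 and Case2) and of the description quoted above, not BigQuery's actual implementation; both function names are made up for illustration.

```python
def group_concat(values, separator=","):
    """Model of the observed GROUP_CONCAT output.

    Assumption: if the joined string contains a double quote, each quote
    is doubled and the whole result is wrapped in double quotes.
    """
    joined = separator.join(values)
    if '"' in joined:
        return '"' + joined.replace('"', '""') + '"'
    return joined


def group_concat_unquoted(values, separator=","):
    """Model of GROUP_CONCAT_UNQUOTED: plain join, no quoting or escaping."""
    return separator.join(values)


print(group_concat(['fo"o', 'ba"r']))         # Case1 input
print(group_concat(["fo'o", "ba'r"]))         # Case2 input
print(group_concat_unquoted(['a"b', 'cd']))   # input from the docs example
```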