Updating the left side of a string up to a delimiter - sql

My column "ColumnOne" in my table "MyTable" has values like these (the delimiter is the '-' character):
|Something |
|Something - SomeOtherThing |
|Something - SomethingElse |
|Something - Whatever |
|OtherThing - |
I want to update the values so that they end up looking like this:
|Something |
| SomeOtherThing |
| SomethingElse |
| Whatever |
| |
So the algorithm is basically: replace each character with a space until you reach the '-', then replace the '-' with a space as well.
I tried the REPLACE function like this:
UPDATE MyTable SET ColumnOne = REPLACE(ColumnOne, ' - ', ' ' + ColumnOne) but that's wrong. I couldn't figure out what to pass as its second argument.
Any suggestions are appreciated.

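Use charindex to find the number of characters to change, stuff to perform the change, and replicate to generate a string of N spaces. For a value like 'Something' that has no '-', charindex returns 0, so stuff replaces nothing and the value is left unchanged. Try this:
UPDATE MyTable SET ColumnOne = stuff(ColumnOne,1,charindex('-',ColumnOne),replicate(' ',charindex('-',ColumnOne)))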
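To make the logic of the stuff/charindex/replicate expression easier to follow, here is a minimal Python sketch of the same idea. It is only an illustration, not T-SQL: the function name blank_through_dash is made up for this example, and it assumes the delimiter is the single character '-'.

```python
def blank_through_dash(value: str) -> str:
    """Replace everything up to and including the first '-' with spaces."""
    # str.find returns -1 when there is no '-', so n becomes 0,
    # mirroring charindex returning 0 in T-SQL.
    n = value.find("-") + 1
    # n spaces play the role of replicate(' ', n); value[n:] is the untouched tail.
    return " " * n + value[n:]


for row in ["Something", "Something - SomeOtherThing", "OtherThing - "]:
    # repr makes the leading spaces visible
    print(repr(blank_through_dash(row)))
```

The length of each value is preserved, which is what the replicate call guarantees in the SQL version.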

How to explode substrings inside a string in a column in SQL

Let's say I have a table like the one below
| Header 1 | Header 2 | Header 3
--------------------------------------------------------------------------------------
| id1 | detail1 | <a#test.com> , <b#test.com> , <c#test.com> , <d#test.com>
How do I explode it in SQL on the email substrings inside the angle brackets, so that it looks like the table below?
| Header 1 | Header 2 | Header 3. |
-------------------------------------------
| id1 | detail1 | a#test.com |
| id1 | detail1 | b#test.com |
| id1 | detail1 | c#test.com |
| id1 | detail1 | d#test.com |
Using regexp_extract_all and explode should do it.
select `Header 1`, `Header 2`, explode(regexp_extract_all(`Header 3`, '<(.+?)>')) as `Header 3` from table
This should give you:
+--------+--------+----------+
|Header 1|Header 2|Header 3 |
+--------+--------+----------+
|id1 |detail1 |a#test.com|
|id1 |detail1 |b#test.com|
|id1 |detail1 |c#test.com|
|id1 |detail1 |d#test.com|
+--------+--------+----------+
Be aware that regexp_extract_all was added in Spark 3.1.0.
For Spark below 3.1.0
This can be done with split, which is something of a dirty hack, but the strategy and the results are the same.
select `Header 1`, `Header 2`, explode(array_remove(split(`Header 3`, '[<>,\\s]+'), '')) as `Header 3` from table
This uses a regex to match the delimiters and split the string into an array. It also needs an array_remove call to drop the unwanted empty strings.
Explanation
With regexp_extract_all, we use the pattern <(.+?)> to extract every string inside angle brackets into an array like this:
['a#test.com', 'b#test.com', 'c#test.com', 'd#test.com']
For the pattern (.+?) here:
. matches any single character;
+ is a quantifier on ., matching one or more occurrences;
? makes the quantifier non-greedy, so the match stops as soon as possible;
the parentheses make the part inside the angle brackets a capturing group, so we can extract just that group;
Now with explode, we separate the elements of the array into multiple rows, hence the result above.
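The behaviour of the two patterns can be checked outside Spark. The following is a small Python sketch using the standard re module, not Spark SQL; the variable names are made up for this example. re.findall stands in for regexp_extract_all, and re.split plus a filter stands in for split with array_remove.

```python
import re

header3 = "<a#test.com> , <b#test.com> , <c#test.com> , <d#test.com>"

# Counterpart of regexp_extract_all: findall returns the captured group
# of every non-overlapping match of <(.+?)>.
extracted = re.findall(r"<(.+?)>", header3)

# Counterpart of split + array_remove: split on runs of delimiter
# characters, then drop the empty strings that the split leaves behind.
split_parts = [part for part in re.split(r"[<>,\s]+", header3) if part]

# Without the non-greedy ?, .+ runs to the last '>' and gives one big match.
greedy = re.findall(r"<(.+)>", header3)

print(extracted)    # ['a#test.com', 'b#test.com', 'c#test.com', 'd#test.com']
print(split_parts)  # ['a#test.com', 'b#test.com', 'c#test.com', 'd#test.com']
print(len(greedy))  # 1
```

The last line shows why the non-greedy modifier matters: with a greedy quantifier the whole column value would come back as a single element.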

sql-remove dashes from string column

In a stored procedure, I have this field:
LTRIM(ISNULL(O.Column1, ''))
If there is a dash (-) at the end of the value, I want to remove it, but only when the dash is at the start or end of the value, not in the middle.
Any suggestions?
EDIT:
Microsoft SQL Server 2014 12.0.5546.0
Expected output:
1)input: "abc-abc" //output: "abc-abc"
2)input: "abc-" //output: "abc"
3)input: "abc" //output: "abc"
I think you might be stuck with string manipulation here.
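The CASE expression here takes the LTRIM/RTRIM result from your column and first checks whether both ends have a dash, then checks each end separately. If dashes exist, it strips them out. It's not pretty, and won't perform well on a mountain of data, but will do what you need. Note that a value consisting of a single dash would hit the first branch with a negative SUBSTRING length, so guard against that if your data can contain one.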
Data setup:
create table trim (col1 varchar(10));
insert trim (col1)
values
('abc'),
(' abc-'),
('abc- '),
('abc-abc '),
(' -abc'),
('-abc '),
(NULL),
(''),
(' -abc- ');
The query:
select
    case
        when right(ltrim(rtrim(isnull(col1,''))),1) = '-'
         and left(ltrim(rtrim(isnull(col1,''))),1) = '-'
            then substring(ltrim(rtrim(isnull(col1,''))),2,len(ltrim(rtrim(isnull(col1,''))))-2)
        when right(ltrim(rtrim(isnull(col1,''))),1) = '-'
            then left(ltrim(rtrim(isnull(col1,''))), len(ltrim(rtrim(isnull(col1,''))))-1)
        when left(ltrim(rtrim(isnull(col1,''))),1) = '-'
            then right(ltrim(rtrim(isnull(col1,''))), len(ltrim(rtrim(isnull(col1,''))))-1)
        else ltrim(rtrim(isnull(col1,'')))
    end as trimmed
from trim;
Results:
+---------+
| trimmed |
+---------+
| abc |
| abc |
| abc |
| abc-abc |
| abc |
| abc |
| |
| |
| abc |
+---------+
Since the database is not mentioned, here is how to do it (or rather, where to find it) for each one:
SQL Server
Remove the last character in a string in T-SQL?
Oracle
Remove last character from string in sql plus
Postgresql
Postgresql: Remove last char in text-field if the column ends with minus sign
MySQL
Strip last two characters of a column in MySQL
You can use the RIGHT function, along with SUBSTRING, to achieve the result.
SELECT CASE WHEN RIGHT(stringVal,1)= '-' THEN SUBSTRING(stringVal,1,LEN(stringVal)-1)
ELSE stringVal END AS ModifiedString
from
( VALUES ('abc-abc'), ('abc-'),('abc')) as t(stringVal)
+----------------+
| ModifiedString |
+----------------+
| abc-abc |
| abc |
| abc |
+----------------+
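The rule that the CASE expressions above implement (trim spaces, then drop a dash only at the very start or very end) can be sketched in a few lines of Python. This is an illustration of the logic rather than T-SQL, and the function name strip_edge_dashes is hypothetical.

```python
from typing import Optional


def strip_edge_dashes(value: Optional[str]) -> str:
    # Counterpart of isnull(col1, '') followed by ltrim/rtrim,
    # which remove spaces only.
    text = (value or "").strip(" ")
    # Remove at most one dash from each end; inner dashes are untouched.
    if text.startswith("-"):
        text = text[1:]
    if text.endswith("-"):
        text = text[:-1]
    return text


samples = ["abc", " abc-", "abc- ", "abc-abc ", " -abc", "-abc ", None, "", " -abc- "]
for sample in samples:
    print(repr(strip_edge_dashes(sample)))
```

Run against the same rows as the data setup above, this gives 'abc' for every value except 'abc-abc ' (which becomes 'abc-abc') and the NULL and empty rows (which become '').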

Display substring separated by / in Hive

I have a column in my table with entries like:
this/is/my/dir/file
this/is/my/another/dir/file
I want to display the string without the filename:
this/is/my/dir/
This is the query which I am using:
select regexp_replace('this/is/my/another/dir/file','[^/]+','');
You can use regexp_replace to remove the file name and keep only the directory path. Without an anchor, '[^/]+' matches every path segment, so your query leaves only the slashes. The file name does not contain '/' and is always at the end of the path, so the pattern can be written as '[^/]+$'. The examples below replace the substring matching '[^/]+$' with an empty string ''.
select regexp_replace('/this/is/my/dir/file','[^/]+$','') as dir;
+-------------------+
| dir |
+-------------------+
| /this/is/my/dir/ |
+-------------------+
select regexp_replace('this/is/my/another/dir/file','[^/]+$','') as dir;
+--------------------------+
| dir |
+--------------------------+
| this/is/my/another/dir/ |
+--------------------------+
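The effect of the $ anchor can be seen with the same two patterns in Python's re module. This is a sketch for illustration only, not HiveQL; the variable names are made up for this example.

```python
import re

path = "this/is/my/another/dir/file"

# Unanchored: every run of non-slash characters is removed.
unanchored = re.sub(r"[^/]+", "", path)

# Anchored with $: only the final run (the file name) is removed.
anchored = re.sub(r"[^/]+$", "", path)

print(unanchored)  # /////
print(anchored)    # this/is/my/another/dir/
```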

How to remove symbols from the sentence in Oracle?

In an Oracle database I have a table like this.
| TREE | ORG_NAME |
|---------------------------------|----------|
| \Google earth\Nest global\ATAP | ATAP |
| \Google earth\Nest\Beemoney\ | Beemoney |
| \Google\\\BeeKey\ | |
| | York |
I am trying to write an SQL query that returns this result.
| ORGANIZATION |
|-----------------------------------|
| Google earth > Nest global > ATAP |
| Google earth > Nest > Beemoney |
| Google > BeeKey |
| York |
As you can see, I want to:
1) Remove the \ symbol at the beginning and end of the sentence.
2) Replace a \ symbol inside the sentence with the > symbol.
3) Replace a \\\ sequence inside the sentence with the > symbol.
4) If the TREE column is empty, take the value from the ORG_NAME column.
Here is how I started. This SQL query solves parts 2, 3 and 4. How do I solve part 1? I think I need to use REGEXP_REPLACE, right? How do I do it correctly? Is there a more elegant way to redesign the query? As you can see, I go over the same table several times.
SELECT
COALESCE (TREE, ORG_NAME) as ORGANIZATION
FROM (
SELECT
REPLACE(TREE, '\', '>') AS TREE,
ORG_NAME
FROM (
SELECT
REPLACE(TREE, '\\\', '>') AS TREE,
ORG_NAME
FROM
ORG
)
)
This could be a way with a regexp_replace and a trim to remove the characters from the beginning and the end of the string:
select nvl(regexp_replace( trim('\' from tree), '\\+', ' > '), org_name)
from yourTable
Here is a working solution which uses two calls to regexp_replace:
select
regexp_replace(
regexp_replace('\Google\\\BeeKey\', '^\\?(.*?)\\?$', '\1'), '\\+', ' > ')
from dual;
Google > BeeKey
The inner call to regexp_replace strips off any leading or trailing path separator. The outer call converts each run of one or more internal \ separators into a single > separator.
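The trim-then-replace approach from the answers above can be sketched in Python to check it against all four sample rows. This is an illustration, not Oracle SQL: the function name organization is hypothetical, and None stands in for NULL (Oracle treats an empty string as NULL, so both are handled alike here).

```python
import re
from typing import Optional


def organization(tree: Optional[str], org_name: Optional[str]) -> Optional[str]:
    # Counterpart of trim('\' from tree): drop leading and trailing backslashes.
    trimmed = (tree or "").strip("\\")
    # Counterpart of regexp_replace(..., '\\+', ' > '): each run of one or
    # more backslashes becomes a single ' > '.
    joined = re.sub(r"\\+", " > ", trimmed)
    # Counterpart of nvl(..., org_name): fall back when nothing is left.
    return joined or org_name


rows = [
    ("\\Google earth\\Nest global\\ATAP", "ATAP"),
    ("\\Google earth\\Nest\\Beemoney\\", "Beemoney"),
    ("\\Google\\\\\\BeeKey\\", None),
    (None, "York"),
]
for tree, org_name in rows:
    print(organization(tree, org_name))
```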

Odd bug in SQL TRIM() function

I have the following table:
select * from top3art;
path | count
-----------------------------+--------
/article/candidate-is-jerk | 338647
/article/bears-love-berries | 253801
/article/bad-things-gone | 170098
I want to trim off '/article/' in path values, so I do this:
select *, trim(leading '/article/' from path) from top3art;
path | count | ltrim
-----------------------------+--------+--------------------
/article/candidate-is-jerk | 338647 | ndidate-is-jerk
/article/bears-love-berries | 253801 | bears-love-berries
/article/bad-things-gone | 170098 | bad-things-gone
Rows 2 and 3 work just fine. But what happened to the 1st row??
It trimmed '/article/ca'. Why did it take 2 more characters?
Now watch what happens when I just trim '/articl':
select *, trim(leading '/articl' from path) as test from top3art;
path | count | test
-----------------------------+--------+----------------------
/article/candidate-is-jerk | 338647 | e/candidate-is-jerk
/article/bears-love-berries | 253801 | e/bears-love-berries
/article/bad-things-gone | 170098 | e/bad-things-gone
That works as expected... Now watch what happens when I add one more char in my trim clause, '/article':
select *, trim(leading '/article' from path) as test from top3art;
path | count | test
-----------------------------+--------+--------------------
/article/candidate-is-jerk | 338647 | ndidate-is-jerk
/article/bears-love-berries | 253801 | bears-love-berries
/article/bad-things-gone | 170098 | bad-things-gone
Same as the first result!
I can't make sense of this.
Why is this happening?
How do I fix it?
trim removes any of the characters in the first argument from the start of the second argument, and keeps going until it reaches a character that is not in that set, so it also removes the c and the a of "candidate". Instead of trim, you could use a split_part call:
select *, split_part(path, '/article/', 2) as test from top3art;
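trim removes the individual characters you list, not words or phrases.
Instead of trim, use replace(). Note that replace() removes every occurrence of '/article/' in the path, not only the leading one.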
select *, replace(path, '/article/','') from top3art;
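Python's str.lstrip treats its argument as a set of characters in the same way, so it reproduces the surprising results from the question. The sketch below is for illustration only, not PostgreSQL; remove_prefix is a hypothetical helper showing what prefix removal looks like instead.

```python
def remove_prefix(path: str, prefix: str) -> str:
    # Remove the prefix only when the string actually starts with it.
    return path[len(prefix):] if path.startswith(prefix) else path


path = "/article/candidate-is-jerk"

# Character-set stripping: 'c' and 'a' are in the set, 'n' is not.
print(path.lstrip("/article/"))           # ndidate-is-jerk
# '/articl' has no 'e' in its set, so stripping stops at 'e'.
print(path.lstrip("/articl"))             # e/candidate-is-jerk
# Whole-prefix removal gives the intended result.
print(remove_prefix(path, "/article/"))   # candidate-is-jerk
```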