Display substring separated by / in Hive

I have a column in my table with entries like:
this/is/my/dir/file
this/is/my/another/dir/file
I want to display the string without the filename:
this/is/my/dir/
This is the query which I am using:
select regexp_replace('this/is/my/another/dir/file','[^/]+','');

You can use regexp_replace to strip the file name and keep only the directory path. Because the file name never contains the character '/' and always sits at the end of the path, the pattern can be written as '[^/]+$'. The examples below replace the substring matching '[^/]+$' with an empty string ''.
select regexp_replace('/this/is/my/dir/file','[^/]+$','') as dir;
+-------------------+
| dir               |
+-------------------+
| /this/is/my/dir/  |
+-------------------+
select regexp_replace('this/is/my/another/dir/file','[^/]+$','') as dir;
+--------------------------+
| dir                      |
+--------------------------+
| this/is/my/another/dir/  |
+--------------------------+
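
To see why the query in the question fails while the anchored pattern works, here is a minimal sketch in Python. It assumes Python's re module behaves like Hive's Java regex engine for these two simple patterns; the helper name dir_part is made up for illustration.

```python
import re

path = 'this/is/my/another/dir/file'

# Unanchored: every run of non-slash characters is removed,
# so only the slashes themselves survive.
only_slashes = re.sub(r'[^/]+', '', path)

def dir_part(p):
    # Anchored with $: only the final run of non-slash characters
    # (the file name) is removed.
    return re.sub(r'[^/]+$', '', p)

print(only_slashes)    # /////
print(dir_part(path))  # this/is/my/another/dir/
```

A path that already ends in '/' is returned unchanged, because '[^/]+$' needs at least one non-slash character at the end to match.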

Related

Updating the left side of a string up to a delimiter

My column "ColumnOne" in my table "MyTable" has values like this: Delimiter is character '-'
|Something |
|Something - SomeOtherThing |
|Something - SomethingElse |
|Something - Whatever |
|OtherThing - |
I want to update the values so that they eventually look like this:
|Something |
| SomeOtherThing |
| SomethingElse |
| Whatever |
| |
So basically the algorithm is: replace each character with a space until you reach '-', then replace the '-' with a space as well.
I tried the REPLACE function, like this:
UPDATE MyTable SET ColumnOne = REPLACE(ColumnOne, ' - ', ' ' + ColumnOne)
but that's wrong. I couldn't figure out the pattern for its second argument.
Any suggestions are appreciated.

Use charindex to find the number of characters to change, stuff to perform the change, and replicate to generate a string of N spaces. Try this:
stuff(ColumnOne,1,charindex('-',ColumnOne),replicate(' ',charindex('-',ColumnOne)))
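
The logic of that expression can be sketched in Python as follows. This is only an illustration of the idea, not T-SQL; the function name blank_left_of_delimiter is hypothetical.

```python
def blank_left_of_delimiter(value, delimiter='-'):
    # str.find returns -1 when the delimiter is absent, so pos becomes 0,
    # which mirrors charindex returning 0 (nothing gets replaced).
    pos = value.find(delimiter) + 1
    # Overwrite everything up to and including the delimiter with spaces,
    # like stuff(..., 1, pos, replicate(' ', pos)).
    return ' ' * pos + value[pos:]

print(repr(blank_left_of_delimiter('Something - Whatever')))
print(repr(blank_left_of_delimiter('Something ')))
```

The result always has the same length as the input, and values without a '-' are left untouched.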

Regex to match the pattern split with slash

I want to query a database column with a regex that matches strings like the following:
1. qwge1/dg2/hjetg3
2. tahry4/rtg5
3. jtyg6
How do I allow zero or more slashes and match each [a-z]+[0-9] part?

You could use:
^([a-z]+[0-9](/|$))+$
The inner expression, [a-z]+[0-9](/|$), describes a series of alphabetic characters followed by a digit, then by a slash or the end of the string. This expression may be repeated 1 to N times, followed by the end of the string.
Demo on DB Fiddle - I added a few non-matching strings to your sample data:
select val, val ~ '^([a-z]+[0-9](/|$))+$'
from (values
('qwge1/dg2/hjetg3'),
('tahry4/rtg5'),
('jtyg6'),
('abc'),
('qwge1/dg2/hjetg'),
('qwge1/dg2/3')
) x(val)
val | ?column?
:--------------- | :-------
qwge1/dg2/hjetg3 | t
tahry4/rtg5 | t
jtyg6 | t
abc | f
qwge1/dg2/hjetg | f
qwge1/dg2/3 | f
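
The same pattern can be checked outside the database. The sketch below uses Python's re module as a stand-in for the PostgreSQL ~ operator, which is an assumption that holds for this simple pattern; the function name matches is made up.

```python
import re

# Same pattern as in the query above.
PATTERN = re.compile(r'^([a-z]+[0-9](/|$))+$')

def matches(val):
    return PATTERN.search(val) is not None

for val in ['qwge1/dg2/hjetg3', 'tahry4/rtg5', 'jtyg6',
            'abc', 'qwge1/dg2/hjetg', 'qwge1/dg2/3']:
    print(val, matches(val))
```

One thing to be aware of: as written, the pattern also accepts a trailing slash (for example 'jtyg6/'), because each segment may end with either '/' or the end of the string.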

How to remove symbols from a string in Oracle?

In an Oracle database I have a table like this:
| TREE | ORG_NAME |
|---------------------------------|----------|
| \Google earth\Nest global\ATAP | ATAP |
| \Google earth\Nest\Beemoney\ | Beemoney |
| \Google\\\BeeKey\ | |
| | York |
I am trying to write a SQL query that returns this result:
| ORGANIZATION |
|-----------------------------------|
| Google earth > Nest global > ATAP |
| Google earth > Nest > Beemoney    |
| Google > BeeKey |
| York |
As you can see, I want to:
1) Remove the \ symbol at the beginning and end of the string.
2) Replace a \ symbol inside the string with the > symbol.
3) Replace a \\\ sequence inside the string with the > symbol.
4) If the TREE column is empty, take the value from the ORG_NAME column.
Here is how I started. This SQL query solves parts 2, 3 and 4. How do I solve part 1? I think I need to use REGEXP_REPLACE, right? How do I use it correctly? Is there a more elegant way to redesign the SQL query? As you can see, I go over the same table several times.
SELECT
    COALESCE (TREE, ORG_NAME) as ORGANIZATION
FROM (
    SELECT
        REPLACE(TREE, '\', '>') AS TREE,
        ORG_NAME
    FROM (
        SELECT
            REPLACE(TREE, '\\\', '>') AS TREE,
            ORG_NAME
        FROM
            ORG
    )
)

One way is to combine trim, which removes the \ characters from the beginning and end of the string, with regexp_replace:
select nvl(regexp_replace( trim('\' from tree), '\\+', ' > '), org_name)
from yourTable

Here is a working solution which uses two calls to regexp_replace:
select
regexp_replace(
regexp_replace('\Google\\\BeeKey\', '^\\?(.*?)\\?$', '\1'), '\\+', ' > ')
from dual;
Google > BeeKey
The inner call to regexp_replace strips one leading and one trailing path separator, if present. The outer call replaces each run of one or more internal path separators (\) with ' > '.
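
The trim-then-replace idea can be sketched in Python to check it against the sample rows. This is an illustration under the assumption that Python's re module handles '\\+' the same way Oracle does; the function name organization is hypothetical.

```python
import re

def organization(tree, org_name):
    # Oracle treats '' as NULL, so an empty or missing TREE
    # falls back to ORG_NAME (the nvl / COALESCE step).
    if not tree:
        return org_name
    # Like trim('\' from tree): drop backslashes at both ends.
    trimmed = tree.strip('\\')
    # Like regexp_replace(..., '\\+', ' > '): collapse each run
    # of backslashes into a single ' > '.
    cleaned = re.sub(r'\\+', ' > ', trimmed)
    return cleaned if cleaned else org_name

print(organization('\\Google earth\\Nest global\\ATAP', 'ATAP'))
print(organization('\\Google\\\\\\BeeKey\\', None))
print(organization(None, 'York'))
```

Trimming first matters: if the separators were replaced before trimming, the leading and trailing backslashes would turn into stray ' > ' markers.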

Odd bug in SQL TRIM() function

I have the following table:
select * from top3art;
path | count
-----------------------------+--------
/article/candidate-is-jerk | 338647
/article/bears-love-berries | 253801
/article/bad-things-gone | 170098
I want to trim off '/article/' in path values, so I do this:
select *, trim(leading '/article/' from path) from top3art;
path | count | ltrim
-----------------------------+--------+--------------------
/article/candidate-is-jerk | 338647 | ndidate-is-jerk
/article/bears-love-berries | 253801 | bears-love-berries
/article/bad-things-gone | 170098 | bad-things-gone
Rows 2 and 3 work just fine. But what happened to the 1st row??
It trimmed '/article/ca'. Why did it take 2 more characters?
Now watch what happens when I just trim '/articl':
select *, trim(leading '/articl' from path) as test from top3art;
path | count | test
-----------------------------+--------+----------------------
/article/candidate-is-jerk | 338647 | e/candidate-is-jerk
/article/bears-love-berries | 253801 | e/bears-love-berries
/article/bad-things-gone | 170098 | e/bad-things-gone
That works as expected... Now watch what happens when I add one more char in my trim clause, '/article':
select *, trim(leading '/article' from path) as test from top3art;
path | count | test
-----------------------------+--------+--------------------
/article/candidate-is-jerk | 338647 | ndidate-is-jerk
/article/bears-love-berries | 253801 | bears-love-berries
/article/bad-things-gone | 170098 | bad-things-gone
Same as the first result!
I can't make sense of this.
Why is this happening?
How do I fix it?

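trim treats its first argument as a set of characters, not as a prefix: it keeps removing leading characters from the string for as long as each one appears in that set. Both c and a occur in '/article/', so the c and the a of "candidate" are removed as well. Instead of trim, you could use a split_part call: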
select *, split_part(path, '/article/', 2) as test from top3art;

trim removes the individual characters you list, not whole words or phrases.
Use replace() instead (note that it removes every occurrence of '/article/', not only a leading one):
select *, replace(path, '/article/','') from top3art;
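
Python's str.lstrip happens to use the same character-set semantics as trim(leading ... from ...), so it reproduces the surprising output from the question. The sketch below contrasts it with real prefix removal; strip_prefix is a made-up helper name.

```python
def strip_prefix(text, prefix):
    # Remove the prefix only when the string actually starts with it.
    return text[len(prefix):] if text.startswith(prefix) else text

path = '/article/candidate-is-jerk'

# Character-set semantics: 'c' and 'a' are in the set, 'n' is not.
print(path.lstrip('/article/'))         # ndidate-is-jerk
# 'e' is not in the set, so stripping stops right after '/articl'.
print(path.lstrip('/articl'))           # e/candidate-is-jerk
# Prefix semantics: only the literal '/article/' is removed.
print(strip_prefix(path, '/article/'))  # candidate-is-jerk
```

The other two rows look correct with trim only by luck: 'b' is not one of the characters in '/article/', so stripping stops exactly where the prefix ends.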

Replacing first occurrence of character in a string using HiveQL

I am trying to replace the first occurrence of '-' in a string in Hive table. I am using HiveQL. I searched this topic here and other websites, but could not find clear explanation how to use metacharacters with regexp_replace() to do that.
This is a string from which I need to remove the first '-' (replace it with an empty string): 16-001-02707
The result should be like this: 16001-02707
This is the method I used:
select regexp_replace ('16-001-02707','[^[:digit:]]', '');
However, this doesn't do anything.

select regexp_replace ('16-001-02707','^(.*?)-', '$1');
16001-02707
Following up on the OP's question in the comments, the same idea removes the Nth '-':
with t as (select '111-22-333333-4-555-6-7-8888-999999' as col)
select regexp_replace (col,'^(.*?)-','$1')
,regexp_replace (col,'^(.*?-.*?)-','$1')
,regexp_replace (col,'^((.*?-){2}.*?)-','$1')
,regexp_replace (col,'^((.*?-){3}.*?)-','$1')
,regexp_replace (col,'^((.*?-){4}.*?)-','$1')
,regexp_replace (col,'^((.*?-){5}.*?)-','$1')
from t
+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+
| _c0 | _c1 | _c2 | _c3 | _c4 | _c5 |
+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+
| 11122-333333-4-555-6-7-8888-999999 | 111-22333333-4-555-6-7-8888-999999 | 111-22-3333334-555-6-7-8888-999999 | 111-22-333333-4555-6-7-8888-999999 | 111-22-333333-4-5556-7-8888-999999 | 111-22-333333-4-555-67-8888-999999 |
+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+
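
The pattern family above generalizes to "remove the Nth dash". A minimal Python sketch follows; it assumes Python's re module matches these lazy patterns the same way Hive's Java regex engine does, and it uses \1 where Hive uses $1. The function name remove_nth_dash is hypothetical.

```python
import re

def remove_nth_dash(s, n):
    # Lazily skip the first n-1 dashes, capture everything before
    # the nth dash, and put the captured text back without that dash.
    pattern = r'^((?:.*?-){%d}.*?)-' % (n - 1)
    return re.sub(pattern, r'\1', s, count=1)

col = '111-22-333333-4-555-6-7-8888-999999'
print(remove_nth_dash('16-001-02707', 1))  # 16001-02707
print(remove_nth_dash(col, 3))             # 111-22-3333334-555-6-7-8888-999999
```

If the string has fewer than n dashes, the pattern does not match and the string is returned unchanged.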

We Keep Coding
