Replace first whitespace after a character in SQL - sql

I have a table like this:
---------------------------------
| column |
---------------------------------
| abc def #ghi jkl mno #pqr stu |
---------------------------------
I want an output in which the first whitespace after every word starting with # is replaced by #. I tried using regexp_replace, but it replaces only the first occurrence.
Expected output:
---------------------------------
| column |
---------------------------------
| abc def #ghi#jkl mno #pqr#stu |
---------------------------------
Can someone help with this?

You can use the REGEXP_REPLACE function with a pattern that captures each # together with the non-whitespace characters that follow it (#\S*) and then matches the next whitespace character (\s). In PostgreSQL, pass the 'g' flag so that every occurrence is replaced rather than only the first.
SELECT REGEXP_REPLACE("column", '(#\S*)\s', '\1#', 'g') AS "column"
FROM your_table;
In the result, the first whitespace character after each word starting with # is replaced by #.
Output:
---------------------------------
| column |
---------------------------------
| abc def #ghi#jkl mno #pqr#stu |
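A pattern like this can be checked outside the database. The following is a minimal Python sketch, not SQL: it assumes Python's re module treats this simple pattern the same way the database's regex engine does, and the function name replace_space_after_hash is made up for illustration.

```python
import re

def replace_space_after_hash(text):
    # (#\S*) captures a '#' plus the rest of that word; \s is the whitespace after it.
    # re.sub replaces every match, which corresponds to a global replace in SQL.
    return re.sub(r"(#\S*)\s", r"\1#", text)

print(replace_space_after_hash("abc def #ghi jkl mno #pqr stu"))
# abc def #ghi#jkl mno #pqr#stu
```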

Related

PostgreSQL trouble when populating a table with comma separator

I am trying to populate a table, and some values in my "name" column contain a ",", so using a comma as the separator does not work. If I use something other than a comma as the separator, it says
BadCopyFileFormat: missing data for column
s = "CREATE TABLE IF NOT EXISTS tokens (address varchar(100) NOT NULL,symbol varchar(100) NOT NULL,name varchar(100) NOT NULL)"
db_cursor.execute(s)
with open('data/tokens.csv', 'r', encoding="utf-8") as f:
    next(f)  # Skip the header row.
    db_cursor.copy_from(f, 'tokens', sep=',')
db_conn.commit()
My data look like
address symbol name
x23fva3 ABC ABC
2vajd83 DAP
29vb4h2 Wink Jamal, ab
2jsbg93 x3 xon3
Is there a way to populate the table with missing values??
What I got to work:
cat data/tokens.csv
address |symbol|name
x23fva3 | ABC | ABC
2vajd83 | DAP |
29vb4h2 | Wink | Jamal, ab
2jsbg93 | x3 | xon3
with open('data/tokens.csv', 'r', encoding="utf-8") as f:
    next(f)  # Skip the header row.
    db_cursor.copy_from(f, 'tokens', sep='|')
db_conn.commit()
select * from tokens ;
address | symbol | name
----------+--------+------------
x23fva3 | ABC | ABC
2vajd83 | DAP |
29vb4h2 | Wink | Jamal, ab
2jsbg93 | x3 | xon3
I use the pipe(|) regularly for this sort of thing as it very rarely shows up in data on its own.
UPDATE
For a file with empty values there needs to still be a separator for each field like:
address |symbol|name
x23fva3 | ABC | ABC
2vajd83 | DAP |
32vb4h3 | |
1jsbg94 | | xon3
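If the source file is a regular quoted CSV, one way to get a pipe-delimited stream with a separator for every field is to re-write it with Python's csv module before copying. This is only a sketch under assumptions: the input is assumed to quote values that contain commas, and the function name to_pipe_delimited and the n_fields parameter are hypothetical. The returned StringIO is a file-like object with read() and readline(), so it could be passed to copy_from in place of the open file.

```python
import csv
import io

def to_pipe_delimited(csv_text, n_fields=3, sep="|"):
    """Convert quoted CSV text (with header) to sep-delimited lines without the header."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # Skip the header row.
    out = io.StringIO()
    for row in reader:
        if not row:
            continue  # Ignore blank lines.
        # Pad short rows so that every field still gets its separator.
        row = row + [""] * (n_fields - len(row))
        if any(sep in field for field in row):
            raise ValueError(f"separator {sep!r} appears in data: {row!r}")
        out.write(sep.join(row) + "\n")
    out.seek(0)
    return out

sample = 'address,symbol,name\nx23fva3,ABC,ABC\n2vajd83,DAP\n29vb4h2,Wink,"Jamal, ab"\n'
print(to_pipe_delimited(sample).getvalue())
```

Raising on a separator found inside the data is a deliberate choice here: a stray pipe would otherwise shift columns silently.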

Redshift skip the first character of split_part()

I have a table column like below:
| column_a |
| ------------------ |
| Alpha_Black_1 |
| Alpha_Black_2323 |
| Alpha_Red_100 |
| Alpha_Blue_2344 |
| Alpha_Orange_33333 |
| Alpha_White_2 |
| |
Usually, when I want to split on a symbol or character, I use split_part(text, text, integer), e.g. split_part(column_a, '_', 1).
I need to remove the numeric part of each value and keep only the text part, like Alpha_Black.
I cannot use the trim function because the numeric part varies.
How can I skip the first underscore and split on the second one?
I would suggest using REGEXP_REPLACE here:
SELECT
column_a,
REGEXP_REPLACE(column_a, '_\\d+$', '') AS column_a_out
FROM yourTable;
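The effect of the pattern can be tried locally. Below is a small Python sketch, assuming Python's re engine handles _\d+$ the same way for these values; strip_numeric_suffix is a made-up name. Note that in Python a raw string needs only a single backslash.

```python
import re

def strip_numeric_suffix(value):
    # Remove an underscore followed by digits, but only at the end of the string.
    return re.sub(r"_\d+$", "", value)

for v in ["Alpha_Black_1", "Alpha_Black_2323", "Alpha_Orange_33333", ""]:
    print(repr(strip_numeric_suffix(v)))
```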

SQL padding 0 to the left of a number in string

I am a beginner in SQL; I am using PostgreSQL and doing little exercises to learn. I have a column of strings named acronym from a destination table:
DO1
ES1
ES2
FR1
FR10
FR2
FR3
FR4
FR5
FR6
FR7
FR8
FR9
GP1
GP2
IN1
IN2
MU1
RU1
TR1
UA1
I would like to add a padding zero for acronym numbers that have only one digit, output:
DO01
ES01
ES02
FR01
FR02
FR03
FR04
FR05
FR06
FR07
FR08
FR09
FR10
GP01
GP02
IN01
IN02
MU01
RU01
TR01
UA01
How can I get to the left of the first number in the string? There is some regex I think but I did not figure it out
You can use the rpad() function to add characters to the end of the value:
select rpad(col, 4, '0')
In your case, though, you want the zero in between. One simple method, assuming that the first two characters are letters, is:
(case when length(col) = 3
then left(col, 2) || '0' || right(col, 1)
else col
end)
Another possibility is using regexp_replace():
regexp_replace(col, '^([^0-9]{2})([0-9])$', '\10\2')
Both of these assume that the strings to be padded are three characters, which is consistent with your data. It is unclear what you want for other lengths.
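Both variants can be mirrored in Python to see how they treat three-character and longer values. This is a sketch only; the function names are hypothetical, and Python's replacement syntax differs from PostgreSQL's, so the group reference is written as \g<1> to keep it from being read as group 10.

```python
import re

def pad_with_length_check(code):
    # Mirrors the CASE expression: only three-character values are changed.
    if len(code) == 3:
        return code[:2] + "0" + code[-1]
    return code

def pad_with_regex(code):
    # Two non-digits followed by exactly one digit; anything else is left untouched.
    return re.sub(r"^([^0-9]{2})([0-9])$", r"\g<1>0\2", code)

print(pad_with_length_check("FR1"), pad_with_regex("FR1"))    # FR01 FR01
print(pad_with_length_check("FR10"), pad_with_regex("FR10"))  # FR10 FR10
```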
Try the to_char() function, which applies when the value is numeric:
select to_char(column1, 'fm000') as column2
from Test_table;
The fm ("fill mode") prefix suppresses leading spaces in the resulting varchar.
000 defines the number of digits you want to have.
You can use string functions like lpad(), substr(), left():
select
concat(left(columnname, 2), lpad(substr(columnname, 3), 2, '0')) result
from tablename
Results:
| result |
| ------ |
| DO01 |
| ES01 |
| ES02 |
| FR01 |
| FR10 |
| FR02 |
| FR03 |
| FR04 |
| FR05 |
| FR06 |
| FR07 |
| FR08 |
| FR09 |
| GP01 |
| GP02 |
| IN01 |
| IN02 |
| MU01 |
| RU01 |
| TR01 |
| UA01 |
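The left()/lpad() combination corresponds to slicing off the two-letter prefix and left-padding the rest. A minimal Python sketch follows, with a made-up function name; str.rjust is used as a stand-in for lpad, with the caveat that lpad truncates values longer than the target length while rjust does not. Once padded, the values also sort in numeric order as plain strings.

```python
def pad_number_part(code, width=2):
    # Keep the two-letter prefix and left-pad the numeric part with zeros.
    return code[:2] + code[2:].rjust(width, "0")

codes = ["FR1", "FR10", "FR2", "DO1"]
print(sorted(pad_number_part(c) for c in codes))
# ['DO01', 'FR01', 'FR02', 'FR10']
```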

Oracle SQL regex extraction

I have data as follows in a column
+----------------------+
| my_column |
+----------------------+
| test_PC_xyz_blah |
| test_PC_pqrs_bloh |
| test_Mobile_pqrs_bleh|
+----------------------+
How can I extract the following as columns?
+----------+-------+
| Platform | Value |
+----------+-------+
| PC | xyz |
| PC | pqrs |
| Mobile | pqrs |
+----------+-------+
I tried using REGEXP_SUBSTR
Default first pattern occurrence for platform:
select regexp_substr(my_column, 'test_(.*)_(.*)_(.*)') as platform from table
Getting second pattern occurrence for value:
select regexp_substr(my_column, 'test_(.*)_(.*)_(.*)', 1, 2) as value from table
This isn't working, however. Where am I going wrong?
For Non-empty tokens
select regexp_substr(my_column,'[^_]+',1,2) as platform
,regexp_substr(my_column,'[^_]+',1,3) as value
from my_table
;
For possibly empty tokens
select regexp_substr(my_column,'^.*?_(.*)?_.*?_.*$',1,1,'',1) as platform
,regexp_substr(my_column,'^.*?_.*?_(.*)?_.*$',1,1,'',1) as value
from my_table
;
+----------+-------+
| PLATFORM | VALUE |
+----------+-------+
| PC | xyz |
+----------+-------+
| PC | pqrs |
+----------+-------+
| Mobile | pqrs |
+----------+-------+
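The difference between the two cases shows up when a token is empty. As a rough Python sketch (the function names are hypothetical, and Python is used here only to illustrate the idea, not Oracle's behavior in every detail): matching [^_]+ skips empty tokens, while splitting on _ keeps them in position.

```python
import re

def nth_nonempty_token(text, n):
    # Like picking the n-th occurrence of [^_]+ : empty tokens are skipped.
    tokens = re.findall(r"[^_]+", text)
    return tokens[n - 1] if n <= len(tokens) else None

def nth_token(text, n):
    # Splitting on '_' keeps empty tokens, so positions stay stable.
    tokens = text.split("_")
    return tokens[n - 1] if n <= len(tokens) else None

print(nth_nonempty_token("test_PC_xyz_blah", 2), nth_token("test_PC_xyz_blah", 2))
print(repr(nth_nonempty_token("test__xyz_blah", 2)), repr(nth_token("test__xyz_blah", 2)))
```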
(.*) is greedy: it matches any character, including _, so the groups in test_(.*)_(.*)_(.*) do not necessarily stop at the underscores you expect. In addition, REGEXP_SUBSTR returns the whole match rather than a capture group unless you pass the subexpression argument, and its fourth argument selects the occurrence of the entire pattern, not a group number. Since the pattern matches the whole string once, asking for occurrence 2 returns NULL. The trick is to match every character except _, using the negated character class [^_]+, which matches any run of characters other than _. If you know the tokens more precisely, you can use a narrower class such as [A-Za-z]+ or [[:alnum:]]+. Once the string is sliced into substrings separated by _, just select the 2nd and 3rd occurrences.
For example:
SELECT REGEXP_SUBSTR( my_column,'(([^_]+))',1,2) as platform, REGEXP_SUBSTR( my_column,'(([^_]+))',1,3) as value from table;
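Note: AFAIK older Oracle versions have no straightforward way to extract a specific matching group. You can use regexp_replace for this purpose, but it is unlike other programming languages, where you can extract just group 2 and group 3. Where it is available, the subexpression argument of REGEXP_SUBSTR (the sixth argument in the query for possibly empty tokens above) does this directly.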
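Greedy matching is easy to observe with a string that has one extra token. The sketch below uses Python's re module as a stand-in, which is an assumption: Oracle's engine may differ in details, and whole_match_and_groups is a made-up helper. The whole match is always the full string, and the first greedy group swallows an underscore when it can.

```python
import re

PATTERN = re.compile(r"test_(.*)_(.*)_(.*)")

def whole_match_and_groups(text):
    # Return the full match and the captured groups, or None when nothing matches.
    m = PATTERN.search(text)
    return (m.group(0), m.groups()) if m else None

print(whole_match_and_groups("test_PC_xyz_blah"))
# ('test_PC_xyz_blah', ('PC', 'xyz', 'blah'))
print(whole_match_and_groups("test_PC_xyz_blah_extra"))
# ('test_PC_xyz_blah_extra', ('PC_xyz', 'blah', 'extra'))
```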

Change value from one column into another

I have got a table:
ID | Description
--------------------
1.13.1-3 | .1 Hello
1.13.1-3 | .2 World
1.13.1-3 | .3 Text
4.54.1-4 | sthg (.1) Ble
4.54.1-4 | sthg (.2) Bla
4.54.1-4 | aaaa (.3) Qwer
4.54.1-4 | bbbb (.4) Tyuio
I would like to change the ending of each ID by taking the value from the second column, to get a result like:
ID | Description
--------------------
1.13.1 | Hello
1.13.2 | World
1.13.3 | Text
4.54.1 | Ble
4.54.2 | Bla
4.54.3 | Qwer
4.54.4 | Tyuio
Is there any quick way to do it in PostgreSQL?
Use regex to manipulate the strings into what you want:
update mytable set
ID = regexp_replace(ID, '\.[^.]*$', '') || substring(Description from '\.[0-9]+'),
Description = regexp_replace(Description, '.*\.[0-9]+\S* ', '');
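To see what the two expressions do to each row, the same steps can be sketched in Python. This is an illustration under the assumption that Python's re behaves like PostgreSQL's regex engine for these patterns; renumber is a hypothetical helper, not part of the answer. It uses [0-9]+ for the numeric suffix so that multi-digit suffixes are handled too.

```python
import re

def renumber(id_value, description):
    # Find the ".N" marker in the description.
    suffix = re.search(r"\.[0-9]+", description)
    if suffix is None:
        return id_value, description  # Nothing to move; leave the row as it is.
    # Drop the last dotted part of the ID (e.g. ".1-3") and append the marker.
    new_id = re.sub(r"\.[^.]*$", "", id_value) + suffix.group(0)
    # Remove everything up to and including the marker and the space after it.
    new_description = re.sub(r".*\.[0-9]+\S* ", "", description)
    return new_id, new_description

print(renumber("1.13.1-3", ".2 World"))         # ('1.13.2', 'World')
print(renumber("4.54.1-4", "bbbb (.4) Tyuio"))  # ('4.54.4', 'Tyuio')
```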