How to replace rows containing alphabets and special characters with Blank spaces in snowflake - sql

I have a column "A" which contains numbers for example- 0001, 0002, 0003
the same column "A" also contains some alphabets and special characters in some of the rows for example - connn, cco*jjj, hhhhhh11111 etc.
I want to replace these alphabets and special characters rows with blank values and only want to keep the rows containing the number.
which regex expression I can use here?

If you want to extract numbers from these values (even if they end or start with non digits), you may use something like this:
create table testing ( A varchar ) as select *
from values ('123'),('aaa123'),('3343'),('aaaa');
select REGEXP_SUBSTR( A, '\\D*(\\d+)\\D*', 1, 1, 'e', 1 ) res
from testing;
+------+
| RES |
+------+
| 123 |
| 123 |
| 3343 |
| NULL |
+------+

I understand that you want to set to null all values that do not contain digits only.
If so, you can use try_to_decimal():
update mytable
set a = null
where a try_to_decimal(a) is null
Or a regexp match:
update mytable
set a = null
where a rlike '[^0-9]'

Related

SQL - trimming values before bracket

I have a column of values where some values contain brackets with text which I would like to remove. This is an example of what I have and what I want:
CREATE TABLE test
(column_i_have varchar(50),
column_i_want varchar(50))
INSERT INTO test (column_i_have, column_i_want)
VALUES ('hospital (PWD)', 'hopistal'),
('nursing (LLC)','nursing'),
('longterm (AT)', 'longterm'),
('inpatient', 'inpatient')
I have only come across approaches that use the number of characters or the position to trim the string, but these values have varying lengths. One way I was thinking was something like:
TRIM('(*',col1)
Doesn't work. Is there a way to do this in postgres SQL without using the position? THANK YOU!
If all the values contain "valid" brackets, then you may use split_part function without any regular expressions:
select
test.*,
trim(split_part(column_i_have, '(', 1)) as res
from test
column_i_have | column_i_want | res
:------------- | :------------ | :--------
hospital (PWD) | hopistal | hospital
nursing (LLC) | nursing | nursing
longterm (AT) | longterm | longterm
inpatient | inpatient | inpatient
db<>fiddle here
You can replace partial patterns using regular expressions. For example:
select *, regexp_replace(v, '\([^\)]*\)', '', 'g') as r
from (
select '''hospital (PWD)'', ''nursing (LLC)'', ''longterm (AT)'', ''inpatient''' as v
) x
Result:
r
-------------------------------------------------
'hospital ', 'nursing ', 'longterm ', 'inpatient'
See example at db<>fiddle.
Could it be as easy as:
SELECT SUBSTRING(column_i_have, '\w+') AS column_i_want FROM test
See demo
If not, and you still want to use SUBSTRING() to get upto but exclude paranthesis, then maybe:
SELECT SUBSTRING(column_i_have, '^(.+?)(?:\s*\(.*)?$') AS column_i_want FROM test
See demo
But if you really are looking upto the opening paranthesis, then maybe just use SPLIT_PART():
SELECT SPLIT_PART(column_i_have, ' (', 1) AS column_i_want FROM test
See demo

SQL change number format to custom "0000000000000000" from any integer number in SELECT statement

I am trying write a SQL SELECT statement in Snowflake where I need to select a column 'xyz' form table 'a'(select xyz from a). Although, the number in column xyz is formatted into 6,7,8 digits and I want to convert that in select statement itself to 16 digits with leading 0's. I used concat() but since the column contains either 6,7,8 digits I am not able to format it into max 16 digits with leading 0's.
Note: I need it to be in select statement as I cant update the column in the database to 16 digit format.
Example
Input: 123456, 1234567, 12345678
Output should be: 0000000000123456, 0000000001234567, 0000000012345678
Can someone help me out here please. Thanks!
Using TO_VARCHAR with format provided:
SELECT TO_VARCHAR(12345, '0000000000000000');
WITH cte(c) AS (
SELECT * FROM (VALUES (123456), (1234567),(12345678))
)
SELECT c, TO_VARCHAR(c, '0000000000000000') AS result
FROM cte;
Output:
with data(c) as (
select * FROM (
values (12345), (293848238), (284), (239432043223432)
)
)
select lpad(c, 16, '0') as c from data;
+------------------+
| C |
|------------------|
| 0000000000012345 |
| 0000000293848238 |
| 0000000000000284 |
| 0239432043223432 |
+------------------+
If you want dynamic width, or to stick to a form of CONCAT then here are some extra ways.
SELECT column1,
TO_CHAR(column1, '0000000000000000') as fixed_width,
TO_CHAR(column1, REPEAT('0', 16)) as dynamic_width,
RIGHT(REPEAT('0', 16) || column1::text, 16) as another_way,
LPAD(column1, 16, '0')
FROM VALUES
(123456),
(1234567),
(12345678),
(12345.678),
(1234567890123456789)
COLUMN1
FIXED_WIDTH
DYNAMIC_WIDTH
ANOTHER_WAY
LPAD(COLUMN1, 16, '0')
123,456
0000000000123456
0000000000123456
000000123456.000
000000123456.000
1,234,567
0000000001234567
0000000001234567
000001234567.000
000001234567.000
12,345,678
0000000012345678
0000000012345678
000012345678.000
000012345678.000
12,345.678
0000000000012346
0000000000012346
000000012345.678
000000012345.678
1,234,567,890,123,456,789
################
################
890123456789
1234567890123456
the two TO_CHAR/TO_VARCHAR methods don't deal with floating results, where-as the RIGHT version does. But that doesn't handle values that are larger the 16 DP
Using you SQL
So using the SQL you put in your comment, and then using 2 CTE's for fake out some data AND using one of the many answers to your question, we get:
WITH "abc" as (
SELECT * FROM VALUES
(123456),
(1234567)
t(ki)
), "xyz" as (
SELECT * FROM VALUES
(6, 123456),
(7, 1234567)
t(ki_value, piv_id)
)
SELECT DISTINCT
y.kI,
yo.ki_value,
yo.piv_num
FROM "abc" as y
left join (
select
ki_value,
LPAD(piv_id, 16, '0') as piv_num
--concat('0000000000',piv_id) as piv_num
from "xyz"
) as yo on y.KI = yo.piv_num
WHERE y.KI in (123456,1234567)
this gives:
KI |KI_VALUE |PIV_NUM
--|--|--
123,456 |6 |0000000000123456
1,234,567 |7 |0000000001234567
So this "works" but as your SQL is written, it's really bad. The WHERE statement is checking the values are numbers and are in the range, why on earth are you want to string pad numbers as TEXT with however many number of zeros in front, because you should leave them as numbers.

Merging tags to values separated by new line character in Oracle SQL

I have a database field with several values separated by newline.
Eg-(can be more than 3 also)
A
B
C
I want to perform an operation to modify these values by adding tags from front and end.
i.e the previous 3 values should need to be turned into
<Test>A</Test>
<Test>B</Test>
<Test>C</Test>
Is there any possible query operation in Oracle SQL to perform such an operation?
Just replace the start and end of each string with the XML tags using a multi-line match parameter of the regular expression:
SELECT REGEXP_REPLACE(
REGEXP_REPLACE( value, '^', '<Test>', 1, 0, 'm' ),
'$', '</Test>', 1, 0, 'm'
) AS replaced_value
FROM table_name;
Which, for the sample data:
CREATE TABLE table_name ( value ) AS
SELECT 'A
B
C' FROM DUAL;
Outputs:
| REPLACED_VALUE |
| :------------- |
| <Test>A</Test> |
| <Test>B</Test> |
| <Test>C</Test> |
db<>fiddle here
You can use normal replace function as follows:
Select '<test>'
|| replace(your_column,chr(10),'</test>'||chr(10)||'<test>')
|| '</test>'
From your_table;
It will be faster than its regexp_replace function.
Db<>fiddle

SQL: Select rows that contain a word

The goal is to select all rows that contain some specific word, can be in the beginning or the end of the string and/or surrounded by white-space, should not be inside other word, so to speak.
Here are couple rows in my database:
+---+--------------------+
| 1 | string with test |
+---+--------------------+
| 2 | test string |
+---+--------------------+
| 3 | testing stringtest |
+---+--------------------+
| 4 | not-a-test |
+---+--------------------+
| 5 | test |
+---+--------------------+
So in this example, selecting word test, should return rows 1, 2 and 5.
Problem is that for some reason, SELECT * FROM ... WHERE ... RLIKE '(\s|^)test(\s|$)'; returns 0 rows.
Where am I wrong and maybe, how it could be done better?
Edit: Query should also select the row with just a word test.
The answer to my first question is:
I haven't escaped special characters, so \s should be \\s.
Working query: SELECT * FROM ... WHERE ... RLIKE '(\\s|^)test(\\s|$)';. (or just a space ( |^)/( |$), also works)
Hi you could grab with trailing space and with leading space
SELECT * from new_table
where text RLIKE(' test')
union
SELECT * from new_table
where text RLIKE('test ')
REGEXP_INSTR() function, which's is an extension of the INSTR() function, might be used for version 10.0.5+ case-insensitively as default :
SELECT *
FROM t
WHERE REGEXP_INSTR(str, 'TeSt ')>0
OR REGEXP_INSTR(str, ' tESt')>0
Demo
SELECT * FROM ...
WHERE ... LIKE 'test';
This should do the trick.
Is this what you want?
SELECT * FROM ... WHERE ... LIKE
'%test%';
Use word boundary tests:
Before MySQL 8.0, and in MariaDB:
WHERE ... REGEXP '[[:<:]]test[[:>:]]'
MySQL 8.0:
WHERE ... REGEXP '\btest\b'
(If that does not work, double up the backslashes; this depends on whether the client is collapsing backslashes before MySQL gets them.)
Note that this solution will also work with punctuation such as the comma in "foo, test, bar"

show value from text..if delimited by ; show only first value

I have table with values. It is ntext because of ; delimited. Values can be empty, 1 number and numbers delimited by semicolon (as shown)
+-----------+
| room |
+-----------+
| 64 |
+-----------+
| 60008 |
+-----------+
| |
+-----------+
| 127;50047 |
+-----------+
I have this code. Substring is looking for ; and show first value. It is working only where the values are delimited. So how can I change it, that it will show first value when ; and single value also. So from table bellow I will get 64,60008, ,127.
SELECT
T0.U_Scid as 'id',
T3.U_Boarding as 'start',
T3.U_Boarding as 'end',
SUBSTRING(T5.U_Partner, 0, CHARINDEX(';', T5.U_Partner)) AS 'room_id',
CASE WHEN datalength(T5.U_Partner)=0 THEN '9999' ELSE T5.U_Partner END AS 'room_id' ,
CASE WHEN datalength(T5.U_Partner) > 4 THEN T5.U_Partner ELSE '9999' END AS 'partners_id' ,
This is just bonus question. CASE are looking for length of value, if the value is longer than 4 ( 600008 ) write to room_id 9999 and save 600008 to partners_id. If it is empty write 9999 to room_id.
How to make it works together?.so getting value from T_Partner..save it into temporary table T1.TempRoom ( I suppose ).. so T1.TempRoom (is filled with numbers like 64, ,60008,127) then CASE is checking T1.TempRoom for values and save it into room_id and partners_id.
Am I right?
Here is a simple method:
SUBSTRING(T5.U_Partner, 1, CHARINDEX(';', T5.U_Partner + ';')) AS room_id,
That is, concatenate the semicolon to the argument for CHARINDEX(). That will prevent any error occurring.
In addition, indexing for SUBSTRING() starts at 1, not 0.
And, don't use single quotes for column names. Only use them for string and date literals.
EDIT:
You can always use the verbose form:
(CASE WHEN T5.U_Partner LIKE '%;%'
THEN SUBSTRING(T5.U_Partner, 1, CHARINDEX(';', T5.U_Partner + ';'))
ELSE T5.U_Partner
END) AS room_id