Check string for substring existence - sql

How can I check whether a certain substring (for instance 18UT) is part of a string in a column?
Redshift's SUBSTRING function lets me "cut" a substring given a starting index and a length, but it does not tell me whether a specific substring exists in the column's value.
Example:
+------------------+
| col |
+------------------+
| 14TH, 14KL, 18AB |
| 14LK, 18UT, 15AK |
| 14AB, 08ZT, 18ZH |
| 14GD, 52HG, 18UT |
+------------------+
Desired result:
+------------------+------+
| col | 18UT |
+------------------+------+
| 14TH, 14KL, 18AB | No |
| 14LK, 18UT, 15AK | Yes |
| 14AB, 08ZT, 18ZH | No |
| 14GD, 52HG, 18UT | Yes |
+------------------+------+

Here is one option:
select col,
case when ', ' || col || ', ' like '%, 18UT, %' then 'Yes' else 'No' end as has_18ut
from mytable
While this will solve your immediate problem, it should be noted that storing delimited lists in a database table is bad practice and should be avoided. Each value should go in a separate row instead.
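To see why the value is wrapped in ', ' on both sides before matching, here is a minimal sketch that runs the same expression against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for Redshift only because it is easy to run locally. The table name mytable comes from the answer; the row containing 118UT is an invented case that a plain '%18UT%' pattern would wrongly match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col TEXT)")
rows = [
    ("14TH, 14KL, 18AB",),
    ("14LK, 18UT, 15AK",),   # match in the middle of the list
    ("14GD, 52HG, 18UT",),   # match at the end of the list
    ("14AB, 118UT, 18ZH",),  # invented row: contains "18UT" only inside "118UT"
]
conn.executemany("INSERT INTO mytable VALUES (?)", rows)

# Padding col with the delimiter lets one pattern cover first, middle and last items.
query = """
    SELECT col,
           CASE WHEN ', ' || col || ', ' LIKE '%, 18UT, %' THEN 'Yes' ELSE 'No' END AS has_18ut
    FROM mytable
"""
result = dict(conn.execute(query).fetchall())
for col, flag in result.items():
    print(col, "->", flag)
```

Only the two rows that hold 18UT as a whole list item are flagged; the 118UT row is not.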

Related

Remove/delete values in a column SQL

I am very new to using SQL and require help.
I have a table whose values contain a comma:
+-------------------+
| Sample |
+-------------------+
| sdferewr,yyuyuy |
| q45345,ty67rt |
| wererert,rtyrtytr |
| werr,ytuytu |
+-------------------+
I would want to delete/remove the values after the comma(,) and keep only those values before it.
Output required.
+----------+
| Sample |
+----------+
| sdferewr |
| q45345 |
| wererert |
| werr |
+----------+
How would I be able to do this in SQL? please help
Assuming that the table name is "TABLE_NAME" and the field name is "sample". Then
update TABLE_NAME set sample=SUBSTRING_INDEX(`sample`, ',', 1)
The simplest way to do that is
UPDATE table_name
SET column = substring(column for position(',' in column) - 1)
WHERE position(',' in column) > 0;
position(',' in column) returns the position of the comma, and substring(column for n) returns the first n characters, so subtracting 1 drops the comma itself. The WHERE clause skips rows that contain no comma.
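The position-then-substring idea can be tried out locally with the sketch below, which uses SQLite through Python's sqlite3 module. This is an illustration under assumptions: SQLite's instr(sample, ',') plays the role of position(',' in sample), substr(sample, 1, n) plays the role of substring(sample for n), and the table name samples and the row without a comma are made up. Because the position is 1-based, taking that many characters would include the comma, hence the "- 1".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sample TEXT)")  # hypothetical table name
conn.executemany(
    "INSERT INTO samples VALUES (?)",
    [("sdferewr,yyuyuy",), ("q45345,ty67rt",), ("nocomma",)],  # last row is invented
)

# Keep only the text before the first comma; rows without a comma are left alone.
conn.execute(
    """
    UPDATE samples
    SET sample = substr(sample, 1, instr(sample, ',') - 1)
    WHERE instr(sample, ',') > 0
    """
)
values = [row[0] for row in conn.execute("SELECT sample FROM samples ORDER BY rowid")]
print(values)
```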

Redshift skip the first character of split_part()

I have a table column like below:
| column_a |
| ------------------ |
| Alpha_Black_1 |
| Alpha_Black_2323 |
| Alpha_Red_100 |
| Alpha_Blue_2344 |
| Alpha_Orange_33333 |
| Alpha_White_2 |
| |
Usually, when I want to split on a symbol or character, I use split_part(text, text, integer), for example split_part(column_a, '_', 1).
I need to remove the numeric part of each value and keep only the text part, like Alpha_Black.
I cannot use the trim function because the numeric part can change.
How can I skip the first underscore and split from the second one?
I would suggest using REGEXP_REPLACE here:
SELECT
column_a,
REGEXP_REPLACE(column_a, '_\\d+$', '') AS column_a_out
FROM yourTable;
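The pattern in that query matches an underscore followed by one or more digits at the very end of the string. A small Python sketch of the same regular expression makes it easy to check against the sample values; the function name strip_numeric_suffix is made up for this illustration, and Python needs only a single backslash in a raw string.

```python
import re

def strip_numeric_suffix(value: str) -> str:
    """Remove a trailing '_<digits>' group, leaving earlier underscores untouched."""
    return re.sub(r"_\d+$", "", value)

for sample in ["Alpha_Black_1", "Alpha_Orange_33333", "Alpha_White"]:
    print(sample, "->", strip_numeric_suffix(sample))
```

A value with no numeric suffix, such as Alpha_White, passes through unchanged.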

SQL padding 0 to the left of a number in string

I am a beginner in SQL. I am using PostgreSQL and doing small exercises to learn. I have a column of strings named acronym in a destination table:
DO1
ES1
ES2
FR1
FR10
FR2
FR3
FR4
FR5
FR6
FR7
FR8
FR9
GP1
GP2
IN1
IN2
MU1
RU1
TR1
UA1
I would like to add a padding zero for acronym numbers that have only one digit, output:
DO01
ES01
ES02
FR01
FR02
FR03
FR04
FR05
FR06
FR07
FR08
FR09
FR10
GP01
GP02
IN01
IN02
MU01
RU01
TR01
UA01
How can I get to the left of the first number in the string? I think a regex could do it, but I could not figure it out.
You can use the rpad() function to add characters to the end of the value:
select rpad(col, 4, '0')
In your case, though, you want the zero in between. One simple method, assuming that the first two characters are letters, is:
(case when length(col) = 3
then left(col, 2) || '0' || right(col, 1)
else col
end)
Another possibility is using regexp_replace():
regexp_replace(col, '^([^0-9]{2})([0-9])$', '\10\2')
Both of these assume that the strings to be padded are three characters, which is consistent with your data. It is unclear what you want for other lengths.
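As a quick way to check the regular expression against the sample data, here is the same pattern in Python. The function name pad_acronym is invented for this sketch. One difference from the SQL version: Python would read \10 as group 10, so the replacement is written \g<1>0 to mean "group 1, then a literal 0".

```python
import re

def pad_acronym(acronym: str) -> str:
    """Insert a 0 between two non-digit characters and a single trailing digit."""
    return re.sub(r"^([^0-9]{2})([0-9])$", r"\g<1>0\2", acronym)

for sample in ["DO1", "FR1", "FR10"]:
    print(sample, "->", pad_acronym(sample))
```

Values that already have two digits, such as FR10, do not match the pattern and are returned as they are.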
If the column is numeric rather than a string, you can use the to_char() function:
select to_char(column1, 'fm000') as column2
from Test_table;
The fm ("fill mode") prefix suppresses leading spaces in the resulting varchar.
000 defines the number of digits you want to have.
You can use string functions like lpad(), substr(), left():
select
concat(left(columnname, 2), lpad(substr(columnname, 3), 2, '0')) result
from tablename
Results:
| result |
| ------ |
| DO01 |
| ES01 |
| ES02 |
| FR01 |
| FR10 |
| FR02 |
| FR03 |
| FR04 |
| FR05 |
| FR06 |
| FR07 |
| FR08 |
| FR09 |
| GP01 |
| GP02 |
| IN01 |
| IN02 |
| MU01 |
| RU01 |
| TR01 |
| UA01 |

hive regexp_extract after second occurrence of delimiter

We have a Hive table column that holds strings separated by ';', and we need to extract the string after the second occurrence of ';'.
+-----------------+
| col1 |
+-----------------+
| a;b;c;d |
| e;f; ;h |
| i;j;k;l |
+-----------------+
Required output:
+-----------+
| col1 |
+-----------+
| c |
| <null> |
| k |
+-----------+
select regexp_extract
Split the string on ';', which returns an array of values; from this you can take the element at index 2 (array indexes start at 0).
select split(str,';')[2]
from tbl
If you want to convert empty and space-only strings to NULLs like in your example, then this macro can be useful:
create temporary macro empty_to_null(s string) case when trim(s)!='' then s end;
select empty_to_null(split(col1,'\\;')[2]);
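The split-then-index logic, combined with the empty-to-NULL step, can be sketched in Python as follows. This is only an illustration of the logic, not Hive code: the function name third_field is made up, None stands in for NULL, and a value with fewer than three fields is treated as missing.

```python
from typing import Optional

def third_field(value: str, delimiter: str = ";") -> Optional[str]:
    """Return the field after the second delimiter, or None if it is absent or blank."""
    parts = value.split(delimiter)
    if len(parts) <= 2:
        return None  # fewer than three fields: nothing after the second delimiter
    field = parts[2]
    # Same idea as the empty_to_null macro: blank or space-only becomes None.
    return field if field.strip() != "" else None

for sample in ["a;b;c;d", "e;f; ;h", "i;j;k;l"]:
    print(sample, "->", third_field(sample))
```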

Match a set of characters from one table into the records of an other table

I have two tables (T-SQL):
tblInvalidCharactersList tblMonthsRecords
+-----------+-----------+ +--------+-------------+
| CodePoint | Character | | RecRef | Name |
+-----------+-----------+ +--------+-------------+
| 38 | & | | 21 | Firs> name |
+-----------+-----------+ +--------+-------------+
| 64 | # | | 89 | #Second name|
+-----------+-----------+ +--------+-------------+
| 62 | > | | 321 | Third n«me |
+-----------+-----------+ +--------+-------------+
| 171 | « | | 381 | Fourth name |
+-----------+-----------+ +--------+-------------+
I want to find those records of the tblMonthsRecords which have at least one (or more) character(s) from the Character column of the tblInvalidCharactersList table.
I tried:
SELECT
[RecRef],
[Name]
FROM [tblMonthsRecords]
WHERE [Name] IN (SELECT Character FROM [tblInvalidCharactersList])
and it returns no results at all.
I even tried the NOT IN clause and, as you may guess, it returns all records.
The reason I am not hardcoding the character list in a LIKE clause is that I want the list to be dynamically updated.
You can think of tblInvalidCharactersList as a character "black list".
I would use exists:
select mr.*
from tblMonthsRecords mr
where exists (select 1
from tblInvalidCharactersList icl
where charindex(icl.Character, mr.name) > 0
);
You don't seem to care about the actual invalid character.
IN looks for an exact match against the whole Name value; it does not search for the character inside Name.
Use the LIKE operator instead:
select Distinct a.*
from tblMonthsRecords a
join tblInvalidCharactersList b
on a.Name like '%' + b.Character + '%'
Another way is to use charindex as the join condition:
charindex(b.Character,a.Name) > 0
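The EXISTS approach can be tried locally with the sketch below, which loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made for runnability, not T-SQL: its instr(haystack, needle) replaces charindex(needle, haystack), so the argument order is reversed. The rows are copied from the question as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tblInvalidCharactersList (CodePoint INTEGER, "Character" TEXT)')
conn.execute("CREATE TABLE tblMonthsRecords (RecRef INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO tblInvalidCharactersList VALUES (?, ?)",
    [(38, "&"), (64, "#"), (62, ">"), (171, "«")],
)
conn.executemany(
    "INSERT INTO tblMonthsRecords VALUES (?, ?)",
    [(21, "Firs> name"), (89, "#Second name"), (321, "Third n«me"), (381, "Fourth name")],
)

# A record is flagged if at least one black-listed character occurs anywhere in Name.
query = """
    SELECT mr.RecRef
    FROM tblMonthsRecords mr
    WHERE EXISTS (SELECT 1
                  FROM tblInvalidCharactersList icl
                  WHERE instr(mr.Name, icl."Character") > 0)
    ORDER BY mr.RecRef
"""
flagged = [row[0] for row in conn.execute(query)]
print(flagged)
```

Because the black list is read from the table at query time, adding a row to tblInvalidCharactersList changes the result without touching the query.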