How to trim out letter in the column

I don't know an effective way to trim the letters in a name. For example, the f_name column contains Jenny, Johnny, Doe, Ken, Smith.
I want to trim each name so that it consists of only its first 2 letters, giving Je, Jo, Do, Ke, Sm as the output for a new column.
But the names do not all have the same number of letters: Johnny has 6 letters and John has 4.
Is there an effective way to trim values of uneven length without checking the length of every f_name and writing a condition for each case, like the one below?
CASE WHEN LENGTH(f_name) > 4 THEN LTRIM(f_name, 2)

For Oracle use substr():
with data (f_name) as (
select 'Jenny' from dual union all
select 'Johnny' from dual union all
select 'Doe' from dual union all
select 'Ken' from dual union all
select 'Smith' from dual
)
select substr(f_name, 1, 2)
from data
Returns:
SUBSTR(F_NAME,1,2)
------------------
Je
Jo
Do
Ke
Sm
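
To check the expected values outside the database, the same idea can be sketched in Python. This is only an illustration of the logic, not Oracle code, and the function name first_two is made up for this example. Like SUBSTR, slicing needs no length check:

```python
def first_two(name):
    """Return the first two characters, or the whole string if it is shorter."""
    return name[:2]


names = ["Jenny", "Johnny", "Doe", "Ken", "Smith"]
print([first_two(n) for n in names])
```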

Use SUBSTR. The length check is not needed, because SUBSTR simply returns the whole value when it is shorter than the requested length:
SUBSTR(f_name, 1, 2)

If you want the least ambiguous abbreviation for each name across all names, you could write something like:
with s as (select level as lvl from dual connect by level < (select max(length(f_name)) from your_table))
select f_name,
max(sub_f_name) keep (dense_rank first order by cnt, lvl desc) as least_acronym
from (
select t.f_name
, substr(t.f_name, -s.lvl) as sub_f_name
, s.lvl
, count(*) over (partition by substr(t.f_name, -s.lvl)) as cnt
from your_table t
, s)
group by f_name
NB: this is just an idea and has not been tested.

Related

SQL remove unwanted special characters from a string

Hi, I am new to SQL and am writing a CASE statement for a column of grade values.
The values have a length of 3, like A02, B04, A10, A09, D03. The first character is a letter and the next 2 are digits.
If a user enters 'A02, I want to change it to A02. Basically, remove any special characters if they are present.
CASE
WHEN Grade like '[^0-9A-z]%' THEN ''
else Grade end as Grade
So far I have this, but I only know how to search for the character with regex, not how to remove it.

Unless you really want to do a CASE for the fun of it, in Oracle I'd do it like this, which removes punctuation characters and spaces when you select it. Note that this does not verify the format, so a grade of Z1234 would still be returned.
WITH tbl(ID, grade) AS (
SELECT 1, 'A01' FROM dual UNION ALL
SELECT 1, '''B02' FROM dual UNION ALL
SELECT 2, '$ C01&' FROM dual
)
SELECT ID, grade, REGEXP_REPLACE(grade, '([[:punct:]]| )') AS grade_scrubbed
from tbl;
        ID GRADE     GRADE_SCRUBBED
---------- --------- --------------
         1 A01       A01
         1 'B02      B02
         2 $ C01&    C01

3 rows selected.
HOWEVER, that said, since you seem to want to verify the format and use regex, you could do it this way although it's a little fugly. See comments.
WITH tbl(ID, grade) AS (
-- Test data. Include every crazy combo you'd never expect to see,
-- because you WILL see it, it's just a matter of time :-)
SELECT 1, 'A01' FROM dual UNION ALL
SELECT 1, '''B02' FROM dual UNION ALL
SELECT 2, '$ C01&' FROM dual UNION ALL
SELECT 3, 'DDD' FROM dual UNION ALL
SELECT 4, 'A'||CHR(10)||'DEF' FROM dual UNION ALL
SELECT 5, 'Z1234' FROM dual UNION ALL
SELECT 6, NULL FROM dual
)
SELECT ID, grade,
CASE
-- Correct format of A99.
WHEN REGEXP_LIKE(grade, '^[A-Z]\d{2}$')
THEN grade
-- if not A99, see if stripping out punctuation and spaces make it match A99.
-- If so, return with punctuation and spaces stripped out.
WHEN NOT REGEXP_LIKE(grade, '^[A-Z]\d{2}$')
AND REGEXP_LIKE(REGEXP_REPLACE(grade, '([[:punct:]]| )'), '^[A-Z]\d{2}$')
THEN REGEXP_REPLACE(grade, '([[:punct:]]| )')
-- if not A99, and stripping out punctuation and spaces didn't make it match A99,
-- then the grade is in the wrong format.
WHEN NOT REGEXP_LIKE(grade, '^[A-Z]\d{2}$')
AND NOT REGEXP_LIKE(REGEXP_REPLACE(grade, '([[:punct:]]| )'), '^[A-Z]\d{2}$')
THEN 'Invalid grade format'
-- Something fell through all cases we tested for. Always expect the unexpected!
ELSE 'No case matched!'
END AS grade_scrubbed
from tbl;
        ID GRADE                GRADE_SCRUBBED
---------- -------------------- --------------------
         1 A01                  A01
         1 'B02                 B02
         2 $ C01&               C01
         3 DDD                  Invalid grade format
         4 A
DEF                             Invalid grade format
         5 Z1234                Invalid grade format
         6                      No case matched!

7 rows selected.
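
The same decision order (already valid, valid after stripping, otherwise invalid) can be sketched in Python to test the rules against sample data before putting them into SQL. This is only an illustration under assumptions: the name scrub_grade is hypothetical, string.punctuation stands in for [[:punct:]], and None stands in for NULL.

```python
import re
import string

# A99 format: one uppercase letter followed by exactly two ASCII digits.
GRADE = re.compile(r"[A-Z][0-9]{2}")
# Delete ASCII punctuation and spaces, mirroring '([[:punct:]]| )'.
STRIP = str.maketrans("", "", string.punctuation + " ")


def scrub_grade(grade):
    if grade is None:
        # Mirrors the ELSE branch that a NULL grade falls into.
        return "No case matched!"
    if GRADE.fullmatch(grade):
        return grade
    cleaned = grade.translate(STRIP)
    if GRADE.fullmatch(cleaned):
        return cleaned
    return "Invalid grade format"


for g in ["A01", "'B02", "$ C01&", "DDD", "A\nDEF", "Z1234", None]:
    print(repr(g), "->", scrub_grade(g))
```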

Extract city from the address column

What happens if ship 3 and 4 are null but ship2 is not null? In that case ship2 should be the city and state.
Here is sample data in the picture.

I prefer the old-fashioned SUBSTR + INSTR combination which, compared to Gordon's and Barbaros' suggestions, seems somewhat better: their queries also return strings that don't contain a comma at all, while the OP says
extract city from 1 letter until 1 comma
Here's a comparison:
with tab (addr) as
(
select 'RALEIGH, NC 27604-3229' from dual union all
select 'SUITE A' from dual union all
select 'COEUR D ALENE, ID 83815-8652' from dual union all
select '*O/S CITY LIMITS*' from dual
)
select addr,
substr(addr, 1, instr(addr, ',') - 1) littlefoot,
--
regexp_substr(addr, '[^,]+', 1, 1) gordon,
regexp_substr(addr,'[^,]+') barbaros
from tab;

ADDR                         LITTLEFOOT      GORDON               BARBAROS
---------------------------- --------------- -------------------- --------------------
RALEIGH, NC 27604-3229       RALEIGH         RALEIGH              RALEIGH
SUITE A                                      SUITE A              SUITE A
COEUR D ALENE, ID 83815-8652 COEUR D ALENE   COEUR D ALENE        COEUR D ALENE
*O/S CITY LIMITS*                            *O/S CITY LIMITS*    *O/S CITY LIMITS*

If you want the part before the first comma, you can use regexp_substr():
select regexp_substr(addr, '[^,]+', 1, 1)

Just use regexp_substr with [^,]+ pattern as below
select regexp_substr(address,'[^,]+') as city
from tab;
Or alternatively, by creating an auxiliary table:
with tab as
(
select 'RALEIGH, NC 27604-3229' as str from dual union all
select 'SALINAS, CA 93901' from dual union all
select 'DEPEW, NY 14043-2603' from dual
)
select regexp_substr(str,'[^,]+') as city
from tab;

If you don't want to use regexp, you can just use:
select substr(city,1,(instr(city,',')-1))
from mytable;
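
The difference between the two approaches (what happens when there is no comma at all) can be sketched in Python. This is an illustration only: city_before_comma is a hypothetical name, and None stands in for the NULL that SUBSTR + INSTR returns when INSTR finds no comma.

```python
def city_before_comma(addr):
    """Return the text before the first comma, or None if there is no comma."""
    head, sep, _rest = addr.partition(",")
    if not sep:
        # No comma found: mirror SUBSTR(addr, 1, INSTR(addr, ',') - 1) -> NULL.
        return None
    return head


for addr in ["RALEIGH, NC 27604-3229", "SUITE A", "COEUR D ALENE, ID 83815-8652"]:
    print(addr, "->", city_before_comma(addr))
```

A regex such as [^,]+ would instead return the whole string for 'SUITE A', which is the behaviour compared in the table above.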

In SQL sort by Alphabets first then by Numbers

In an H2 database, when I apply ORDER BY to a varchar column, numbers come first and then letters. I need letters to come first, then numbers.
I have tried
ORDER BY IF(name RLIKE '^[a-z]', 1, 2), name
but I get an error saying that IF is not available in H2.
My column data looks like this:
A
1-A
3
M
2-B
5
B-2
It should be sorted like this:
A
B-2
M
1-A
2-B
3
5

try this out
SELECT MYCOLUMN FROM MYTABLE ORDER BY REGEXP_REPLACE (MYCOLUMN,'(*)(\d)(*)','}\2') , MYCOLUMN

One option is to alter the sort key in the ORDER BY clause so that values starting with a digit sort last.
WITH tab
AS (SELECT 'A' col FROM DUAL
UNION ALL
SELECT '1-A' FROM DUAL
UNION ALL
SELECT '3' FROM DUAL
UNION ALL
SELECT 'M' FROM DUAL
UNION ALL
SELECT '2-B' FROM DUAL
UNION ALL
SELECT '5' FROM DUAL
UNION ALL
SELECT 'B-2' FROM DUAL)
SELECT col
FROM tab
ORDER BY CASE WHEN SUBSTR (col, 1, 1) < CHR (58) THEN CHR (177) || col ELSE col END;
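I used CHR(58) because the ASCII codes of the digits end at 57 ('9'), so any value whose first character is below CHR(58) is treated as numeric (this also catches spaces and some punctuation). CHR(177) is used as the prefix because its code is higher than that of any ASCII letter, so the prefixed values sort last.
For reference: ASCII table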

Given the example dataset, I'm not sure whether you need any further logic than this, so I'll refrain from making further assumptions:
DECLARE @temp TABLE (myval char(3))
INSERT INTO @temp VALUES
('A'), ('1-A'), ('3'), ('M'), ('2-B'), ('5'), ('B-2')
SELECT myval
FROM @temp
ORDER BY CASE WHEN LEFT(myval, 1) LIKE '[a-Z]'
THEN 1
ELSE 2
END
,LEFT(myval, 1)
Gives output:
myval
A
B-2
M
1-A
2-B
3
5
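
The two-level sort key used in these answers (first a group number, then the value itself) can be sketched in Python to verify the expected order. This is an illustration only, with a hypothetical helper name; it sorts by the whole value within each group, not just by the first character.

```python
def letters_first_key(value):
    """Sort key: group 0 for values starting with a letter, group 1 otherwise."""
    starts_with_letter = value[:1].isalpha()
    return (0 if starts_with_letter else 1, value)


values = ["A", "1-A", "3", "M", "2-B", "5", "B-2"]
print(sorted(values, key=letters_first_key))
```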

substring, after last occurrence of character?

I need help with this problem:
I have a column named phone_number, and I want to query it to get the string to the right of the last occurrence of '.', for all kinds of numbers, in one single SQL query.
Example numbers:
515.123.1277
011.44.1345.629268
I need to get 1277 and 629268 respectively.
I have this so far:
select phone_number,
case when length(phone_number) <= 12
then
substr(phone_number,-4)
else
substr (phone_number, -6) end
from employees;
This works for this example, but I want it for all kinds of # formats.
Would be great to get some input.
Thanks

It should be as easy as this regex:
SELECT phone_number, REGEXP_SUBSTR(phone_number, '[^.]*$')
FROM employees;
With the end anchor $, it gets everything after the final '.' (that is, the trailing run of characters that are not '.'). If the last character is '.', it returns NULL.

Search for a pattern consisting of the period, [.], followed by digits, \d+, followed by the end of the string, $.
Put the digits in a capture group by placing the pattern \d+ in parentheses (see below). The group is then selected with the subexpr parameter, 1 (the last parameter).
Here is the solution:
WITH t AS
( SELECT '414.352.3100' p_number FROM dual
UNION ALL
SELECT '515.123.1277' FROM dual
UNION ALL
SELECT '011.44.1345.629268' FROM dual
)
SELECT regexp_substr(t.p_number, '[.](\d+)$', 1, 1, NULL, 1) end_num FROM t;
END_NUM
========================================================================
3100
1277
629268

You can do something like this in oracle:
select regexp_substr(num,'[^\.]+',1,regexp_count(num,'\.')+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );
Prior to 11gR2 you can use regexp_replace instead of regexp_count:
select regexp_substr(num,'[^\.]+',1,length(regexp_replace (num , '[^\.]+'))+1) last_number from
(select '515.123.1277' num from dual union all
select '011.44.1345.629268' from dual );
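
To sanity-check the expected results for other number formats, the "everything after the last dot" rule can be sketched in Python. This is an illustration only: after_last_dot is a hypothetical name, and the empty string stands in for the NULL that Oracle returns when the value ends with '.'.

```python
def after_last_dot(number):
    """Return the text to the right of the last '.', or the whole string if there is none."""
    return number.rpartition(".")[2]


for n in ["515.123.1277", "011.44.1345.629268", "414.352.3100"]:
    print(n, "->", after_last_dot(n))
```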

Check if string variations exists in another string

I need to check if a partial name matches full name. For example:
Partial_Name | Full_Name
--------------------------------------
John,Smith | Smith William John
Eglid,Timothy | Timothy M Eglid
I have no clue how to approach this type of matching.
Another thing is that name and last name may come in the wrong order, making it harder.
I could do something like this, but this only works if names are in the same order and 100% match
decode(LOWER(REGEXP_REPLACE(Partial_Name,'[^a-zA-Z'']','')), LOWER(REGEXP_REPLACE(Full_Name,'[^a-zA-Z'']','')), 'Same', 'Different')

You could use this pattern on the text provided; it works in most engines that support lookaheads and backreferences:
([^ ,]+),([^ ,]+)(?=.*\b\1\b)(?=.*\b\2\b)

WITH
/*
tab AS
(
SELECT 'Smith William John' Full_Name, 'John,Smith' Partial_Name FROM dual
UNION ALL SELECT 'Timothy M Eglid', 'Eglid,timothy' FROM dual
UNION ALL SELECT 'Tim M Egli', 'Egli,Tim,M2' FROM dual
UNION ALL SELECT 'Timot M Eg', 'Eg' FROM dual
),
*/
tmp AS (
SELECT Full_Name, Partial_Name,
trim(CASE WHEN instr(Partial_Name, ',') = 0 THEN Partial_Name
ELSE regexp_substr(Partial_Name, '[^,]+', 1, lvl+1)
END) token
FROM tab t CROSS JOIN (SELECT lvl FROM (SELECT LEVEL-1 lvl FROM dual
CONNECT BY LEVEL <= (SELECT MAX(LENGTH(Partial_Name) - LENGTH(REPLACE(Partial_Name, ',')))+1 FROM tab)))
WHERE LENGTH(Partial_Name) - LENGTH(REPLACE(Partial_Name, ',')) >= lvl
)
SELECT Full_Name, Partial_Name
FROM tmp
GROUP BY Full_Name, Partial_Name
HAVING count(DISTINCT token)
= count(DISTINCT CASE WHEN REGEXP_LIKE(Full_Name, token, 'i')
THEN token ELSE NULL END);
In tmp, each partial_name is split into tokens (separated by commas).
The final query retrieves only those rows whose full_name matches all of the corresponding tokens.
This query works with any number of commas in partial_name. If there can be only zero or one comma, the query is much simpler:
SELECT * FROM tab
WHERE instr(Partial_Name, ',') > 0
AND REGEXP_LIKE(full_name, substr(Partial_Name, 1, instr(Partial_Name, ',')-1), 'ix')
AND REGEXP_LIKE(full_name, substr(Partial_Name,instr(Partial_Name, ',')+1), 'ix')
OR instr(Partial_Name, ',') = 0
AND REGEXP_LIKE(full_name, Partial_Name, 'ix');

This is what I ended up doing. I'm not sure it is the best approach.
I split the partial name by comma and check whether the first name and the last name are each present in the full name. If both are present, it is a match.
CASE
WHEN
instr(trim(lower(Full_Name)),
trim(lower(REGEXP_SUBSTR(Partial_Name, '[^,]+', 1, 1)))) > 0
AND
instr(trim(lower(Full_Name)),
trim(lower(REGEXP_SUBSTR(Partial_Name, '[^,]+', 1, 2)))) > 0
THEN 'Y'
ELSE 'N'
END AS MATCHING_NAMES
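
The token-matching idea can be sketched in Python to try out edge cases. This is an illustration under assumptions, not a translation of either query: names_match is a hypothetical helper, and it compares whole words, which is stricter than the INSTR and REGEXP_LIKE checks above (those also accept a token that is only part of a word).

```python
def names_match(partial_name, full_name):
    """True if every comma-separated token of partial_name is a word in full_name."""
    tokens = [t.strip().lower() for t in partial_name.split(",") if t.strip()]
    words = set(full_name.lower().split())
    # An empty partial name is treated as "no match" rather than a trivial match.
    return bool(tokens) and all(t in words for t in tokens)


print(names_match("John,Smith", "Smith William John"))
print(names_match("Eglid,Timothy", "Timothy M Eglid"))
print(names_match("Egli,Tim,M2", "Tim M Egli"))
```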
