Oracle SQL - Parsing a name string and converting it to first initial & last name - sql

Does anyone know how to turn this string: "Smith, John R"
Into this string: "jsmith" ?
I need to lowercase everything with lower()
Find where the comma is and track it's integer location value
Get the first character after that comma and put it in front of the string
Then get the entire last name and stick it after the first initial.
Sidenote - instr() function is not compatible with my version
Thanks for any help!

Start by writing your own INSTR function - call it my_instr for example. It will start at char 1 and loop until it finds a ','.
Then use as you would INSTR.

The best way to do this is using Oracle Regular Expressions feature, like this:
SELECT LOWER(regexp_replace('Smith, John R',
'(.+)(, )([A-Z])(.+)',
'\3\1', 1, 1))
FROM DUAL;
That says, 1) when you find the pattern of any set of characters, followed by ", ", followed by an uppercase character, followed by any remaining characters, take the third element (initial of first name) and append the last name. Then make everything lowercase.
Your side note: "instr() function is not compatible with my version" doesn't make sense to me, as that function's been around for ages. Check your version, because Regular Expressions was only added to Oracle in version 9i.
Thanks for the points.
-- Stew

instr() is not compatible with your version of what? Oracle? Are you using version 4 or something?

There is no need to create your own function, and quite frankly, it seems a waste of time when this can be done fairly easily with sql functions that already exist. Care must be taken to account for sloppy data entry.
Here is another way to accomplish your stated goal:
with name_list as
(select ' Parisi, Kenneth R' name from dual)
select name
-- There may be a space after the comma. This will strip an arbitrary
-- amount of whitespace from the first name, so we can easily extract
-- the first initial.
, substr(trim(substr(name, instr(name, ',') + 1)), 1, 1) AS first_init
-- a simple substring function, from the first character until the
-- last character before the comma.
, substr(trim(name), 1, instr(trim(name), ',') - 1) AS last_name
-- put together what we have done above to create the output field
, lower(substr(trim(substr(name, instr(name, ',') + 1)), 1, 1)) ||
lower(substr(trim(name), 1, instr(trim(name), ',') - 1)) AS init_plus_last
from name_list;
HTH,
Gabe

I have a hard time believing you don’t have access to a proper instr() but if that’s the case, implement your own version.
Assuming you have that straightened out:
select
substr(
lower( 'Smith, John R' )
, instr( 'Smith, John R', ',' ) + 2
, 1
) || -- first_initial
substr(
lower( 'Smith, John R' )
, 1
, instr( 'Smith, John R', ',' ) - 1
) -- last_name
from dual;
Also, be careful about your assumption that all names will be in that format. Watch out for something other than a single space after the comma, last names having data like “Parisi, Jr.”, etc.

Related

SQL SELECT WHERE without underscore and more

I want to select where 2 strings but without taking
underscore
apostrophe
dash..
Hello !
I want to select an option in my SQL database who look like this :
Chef d'équipe aménagement-finitions
With an original tag who look like this
chef-déquipe-aménagement-finitions
Some results in database had a - too
SELECT *
FROM table
WHERE REPLACE(name, '-', ' ') = REPLACE('chef-déquipe-aménagement-finitions', '-', ' ')
didnt work because of missing '
And a double replace didn't work too.
I want the string be able to compare without taking
underscore
apostrophe
dash
and all things like that
is this possible ?
Thanks for your help
Have good day !
Depends on your rdbms, but here's how I would perform in MySQL 8. If using a different version or rdbms, then first determine how to escape the single quote and modify as needed.
with my_data as (
select 'Chef d''équipe aménagement-finitions' as name
)
select name,
lower(replace(replace(name, '\'', ''), ' ', '-')) as name2
from my_data;
name
name2
Chef d'équipe aménagement-finitions
chef-déquipe-aménagement-finitions
Sql-server and Postgres version:
lower(replace(replace(name, '''', ''), ' ', '-')) as name
After posting, this, I re-read and noticed you are also looking to replace other characters. You could either keep layering the replace function, or, look into other functions.

How do I dynamically extract substring from string?

I’m trying to dynamically extract a substring from a very long URL. For example, I may have the following URLs:
https://www.google.com/ABCDEF Version=“0.0.00.0” GHIJK
https://www.google.com/ABCDEFGH Version=“0.0.0.0” IJKLM
https://www.google.com/ABC Version=“0.0.0.00” 12345
I am trying to extract the version code only (0.0.0.0).
This is what I have so far:
SELECT SUBSTR(col, INSTR(col, ‘Version=“‘)+9)
FROM table
This query returns the following result:
0.0.00.0” GHIJK … (url continues on)
So, I attempt to find “Version” in the link, so I can start from the same position in each row. This works fine, however I’m having a hard time dynamically locating the ending quote (“). I tried using INSTR in the third parameter of my SUBSTR function, like so:
SELECT SUBSTR(col, INSTR(col, ‘Version=“‘)+9, INSTR(col, ‘“‘))
FROM table
I figured that this would find the position of the ending quote, and then use that number for the length, but it returns a strange output. I’ve also used POSITION, CHARINDEX, LENGTH, and LOCATE. None of these functions work in Oracle.
I think maybe when I put +9 after the first INSTR function, it’s setting the query to a fixed position instead of a dynamic one, but I’m not sure how else to remove ‘Version=“‘.
Here's one option (which, actually, selects what's between double quotes - that's version in your example; if there were some other similar substring, you'd get a wrong result).
with test (col) as
(select 'https://www.google.com/ABCDEF Version="0.0.00.0" GHIJK' from dual union all
select 'https://www.google.com/ABCDEFGH Version="0.0.0.0" IJKLM' from dual union all
select 'https://www.google.com/ABC Version="0.0.0.00" 12345' from dual
)
select col,
replace(regexp_substr(col, '".+"'), '"') version
from test;
which results in
https://www.google.com/ABCDEF Version="0.0.00.0" GHIJK 0.0.00.0
https://www.google.com/ABCDEFGH Version="0.0.0.0" IJKLM 0.0.0.0
https://www.google.com/ABC Version="0.0.0.00" 12345 0.0.0.00
You can still use use INSTR to locate the second " in the string, then subtract the location of the first " to get the length that you need to get. Below is an example query:
SELECT col,
SUBSTR (col, INSTR (col, '"') + 1, INSTR (col, '"', 1, 2) - INSTR (col, '"') - 1) version
FROM test;
You can use REGEXP_SUBSTR() with Version=(\d.*\d?) pattern in order to extract the piece between Version=" and "(your quotes are presumed to be regular double quotes " ")
SELECT REGEXP_SUBSTR(url,'Version="(\d.*\d)"',1,1,null,1) AS version
FROM t
where
the third argument(1) is position,
the fourth argument(1) is occurence, and especially important to use the last one as being capture group (1)
indeed using '"(\d.*\d)"' pattern is enough for the
current data set
or
REGEXP_REPLACE() with capture group \2 as
SELECT REGEXP_REPLACE(url,'^(.*Version=")([^"]*).*','\2') AS version
FROM t
Demo

how to replace dots from 2nd occurrence

I have column with software versions. I was trying to remove dot from 2nd occurrence in column, like
select REGEXP_REPLACE('12.5.7.8', '.','');
expected out is 12.578
sample data is here
Is it possible to remove dot from 2nd occurrence
One option is to break this into two pieces:
Get the first number.
Get the rest of the numbers as an array.
Then convert the array to a string with no separator and combine with the first:
select (split_part('12.5.7.8', '.', 1) || '.' ||
array_to_string((REGEXP_SPLIT_TO_ARRAY('12.5.7.8', '[.]'))[2:], '')
)
Another option is to replace the first '.' with something else, then get rid of the '.'s and replace the something else with a '|':
select translate(regexp_replace(version, '^([^.]+)[.](.*)$', '\1|\2'), '|.', '.')
from software_version;
Here is a db<>fiddle with the three versions, including the version a_horse_with_no_name mentions in the comment.
I'd just take the left and right:
concat(
left(str, position('.' in str)),
replace(right(str, -position('.' in str)), '.', '')
)
For a str of 12.34.56.78, the left gives 12. and the right using a negative position gives 34.56.78 upon which the replace then removes the dots

Oracle remove special characters

I have a column in a table ident_nums that contains different types of ids. I need to remove special characters(e.g. [.,/#&$-]) from that column and replace them with space; however, if the special characters are found at the beginning of the string, I need to remove it without placing a space. I tried to do it in steps; first, I removed the special characters and replaced them with space (I used
REGEXP_REPLACE) then found the records that contain spaces at the beginning of the string and tried to use the TRIM function to remove the white space, but for some reason is not working that.
Here is what I have done
Select regexp_replace(id_num, '[:(),./#*&-]', ' ') from ident_nums
This part works for me, I remove all the unwanted characters from the column, however, if the string in the column starts with a character I don't want to have space in there, I would like to remove just the character, so I tried to use the built-in function TRIM.
update ident_nums
set id_num = TRIM(id_num)
I'm getting an error ORA-01407: can't update ident_nums.id_num to NULL
Any ideas what I am doing wrong here?
It does work if I add a where clause,
update ident_nums
set id_num = TRIM(id_num) where id = 123;
but I need to update all the rows with the white space at the beginning of the string.
Any suggestions are welcome.
Or if it can be done better.
The table has millions of records.
Thank you
Regexp can be slow sometimes so if you can do it by using built-in functions - consider it.
As #Abra suggested TRIM and TRANSLATE is a good choice, but maybe you would prefer LTRIM - removes only leading spaces from string (TRIM removes both - leading and trailing character ). If you want to remove "space" you can ommit defining the trim character parameter, space is default.
select
ltrim(translate('#kdjdj:', '[:(),./#*&-]', ' '))
from dual;
select
ltrim(translate(orginal_string, 'special_characters_to_remove', ' '))
from dual;
Combination of Oracle built-in functions TRANSLATE and TRIM worked for me.
select trim(' ' from translate('#$one,$2-zero...', '#$,-.',' ')) as RESULT
from DUAL
Refer to this dbfiddle
I think trim() is the key, but if you want to keep only alpha numerics, digits, and spaces, then:
select trim(' ' from regexp_replace(col, '[^a-zA-Z0-9 ]', ' ', 1, 0))
regexp_replace() makes it possible to specify only the characters you want to keep, which could be convenient.
Thanks, everyone, It this query worked for me
update update ident_nums
set id_num = LTRIM(REGEXP_REPLACE(id_num, '[:space:]+', ' ')
where REGEXP_LIKE(id_num, '^[ ?]')
this should work for you.
SELECT id_num, length(id_num) length_old, NEW_ID_NUM, length(NEW_ID_NUM) len_NEW_ID_NUM, ltrim(NEW_ID_NUM), length(ltrim(NEW_ID_NUM)) length_after_ltrim
FROM (
SELECT id_num, regexp_replace(id_num, '[:(),./#*&-#]', ' ') NEW_ID_NUM FROM
(
SELECT '1234$%45' as id_num from dual UNION
SELECT '#SHARMA' as id_num from dual UNION
SELECT 'JACK TEST' as id_num from dual UNION
SELECT 'XYZ#$' as id_num from dual UNION
SELECT '#ABCDE()' as id_num from dual -- THe 1st character is space
)
)

SQL I need to extract a stored procedure name from a string

I am a bit new to this site but I have looked an many possible answers to my question but none of them has answered my need. I have a feeling it's a good challenge. Here it goes.
In one of our tables we list what is used to run a report this can mean that we can have a short EXEC [svr1].[dbo].[stored_procedure] or "...From svr1.dbo.stored_procedure...".
My goal is to get the stored procedure name out of this string (column). I have tried to get the string between '[' and ']' but that breaks when there are no brackets. I have been at this for a few days and just can't seem to find a solution.
Any assistance you can provide is greatly appreciated.
Thank you in advance for entertaining this question.
almostanexpert
Considering the ending character of your sample sentences is space, or your sentences end without trailing ( whether space or any other character other than given samples ), and assuming you have no other dots before samples, the following would be a clean way which uses substring(), len(), charindex() and replace() together :
with t(str) as
(
select '[svr1].[dbo].[stored_procedure]' union all
select 'before svr1.dbo.stored_procedure someting more' union all
select 'abc before svr1.dbo.stored_procedure'
), t2(str) as
(
select replace(replace(str,'[',''),']','') from t
), t3(str) as
(
select substring(str,charindex('.',str)+1,len(str)) from t2
)
select
substring(
str,
charindex('.',str)+1,
case
when charindex(' ',str) > 0 then
charindex(' ',str)
else
len(str)
end - charindex('.',str)
) as "Result String"
from t3;
Result String
----------------
stored_procedure
stored_procedure
stored_procedure
Demo
With the variability of inputs you seem to have we will need to plan for a few scenarios. The below code assumes that there will be exactly two '.' characters before the stored_procedure, and that [stored_procedure] will either end the string or be followed by a space if the string continues.
SELECT TRIM('[' FROM TRIM(']' FROM --Trim brackets from final result if they exist
SUBSTR(column || ' ', --substr(string, start_pos, length), Space added in case proc name is end of str
INSTR(column || ' ', '.', 1, 2)+1, --start_pos: find second '.' and start 1 char after
INSTR(column || ' ', ' ', INSTR(column || ' ', '.', 1, 2), 1)-(INSTR(column || ' ', '.', 1, 2)+1))
-- Len: start after 2nd '.' and go until first space (subtract 2nd '.' index to get "Length")
))FROM TABLE;
Working from the middle out we'll start with using the SUBSTR function and concatenating a space to the end of the original string. This allows us to use a space to find the end of the stored_procedure even if it is the last piece of the string.
Next to find our starting position, we use INSTR to search for the second instance of the '.' and start 1 position after.
For the length argument, we find the index of the first space after that second '.' and then subtract that '.' index.
From here we have either [stored_procedure] or stored_procedure. Running the TRIM functions for each bracket will remove them if they exist, and if not will just return the name of the procedure.
Sample inputs based on above description:
'EXEC [svr1].[dbo].[stored_procedure]'
'EXEC [svr1].[dbo].[stored_procedure] FROM TEST'
'svr1.dbo.stored_procedure'
Note: This code is written for Oracle SQL but can be translated to mySQL using similar functions.