Trim specific string with leading zeros - sql

I am working on an Oracle server with a table that contains one really weird column.
This column contains strings like:
[0X]+00000026
[22]+2222,555,6666
[WRI] 0000,00
FKI
555
Every case has its own structure.
Now I would like to transform the first example to '26'.
The second one I would like to transform to '2222'.
The last one to '555'.
How would you build that?
Have you ever seen something similar?
Best Regards

I think this does what you want:
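select ltrim(replace(regexp_substr(str, '(^|[+])[0-9]+'), '+', ''), '0')
regexp_substr picks the first run of digits at the start of the string or right after a '+', replace drops the '+', and ltrim strips the leading zeros so that '00000026' becomes '26'.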
Here is a db<>fiddle.
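To check the pattern against the sample values outside the database, the same logic can be sketched in Python. This is only an illustration, not the Oracle query itself: the function name leading_number is made up here, and returning "0" for an all-zero run is an assumption the question does not cover.

```python
import re

def leading_number(value):
    """Return the first digit run found at the start of value or right
    after a '+', with leading zeros removed."""
    match = re.search(r"(?:^|\+)([0-9]+)", value)
    if match is None:
        return None  # plays the role of the NULL that regexp_substr returns
    digits = match.group(1).lstrip("0")
    return digits or "0"  # assumption: an all-zero run collapses to "0"

samples = ["[0X]+00000026", "[22]+2222,555,6666", "[WRI] 0000,00", "FKI", "555"]
results = {s: leading_number(s) for s in samples}
```

The values that have neither a leading digit nor a '+' followed by digits come back as None, which matches the rows the SQL expression would leave as NULL.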

Related

How can I automatically extract content from a field in a SQL query?

The environment I am currently working in is Snowflake.
As a matter of data sensitivity, I will be using pseudonyms in the following question.
I have a specific field in one of my tables called FIELD_1. The data in this field is structured as such:
I am trying to figure out how to automatically extract from my FIELD_1 the output I have in FIELD_2.
Does anyone have any idea what kind of query I would need to achieve this? Any help would be GREATLY appreciated! I am really quite stuck on this problem.
Thank you!
You seem to want everything up to the first four numbers. Then to replace the underscores with spaces. If so:
select replace(regexp_substr(field_1, '^[^0-9]*[0-9]{4}'), '_', ' ')
Or alternatively, if you want the first three components separated by underscores:
select replace(regexp_substr(field_1, '^[^_]+_[^_]+_[0-9]{4}'), '_', ' ')
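The first pattern can be tried out in Python before running it in Snowflake. The sample value below is hypothetical, since the original data is not shown, and name_up_to_year is just an illustrative name.

```python
import re

def name_up_to_year(field_1):
    """Keep everything up to and including the first four consecutive
    digits, then turn underscores into spaces."""
    match = re.match(r"[^0-9]*[0-9]{4}", field_1)
    if match is None:
        return None  # no four-digit run reachable from the start
    return match.group(0).replace("_", " ")

# Hypothetical FIELD_1 value; the real data was not included in the question.
example = name_up_to_year("ACME_REPORT_2021_0415_v2")
```

With this input, example is "ACME REPORT 2021". A value with no digits at all yields None.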
If the data is really as simple as you've described here, you can use a variable-length LEFT() function in conjunction with REPLACE() to get the desired output:
SELECT FIELD_1, REPLACE(LEFT(FIELD_1, LEN(FIELD_1)-10),'_',' ') AS FIELD_2
FROM table_name
See also:
SELECT - Snowflake Documentation
LEFT - Snowflake Documentation
REPLACE - Snowflake Documentation
LENGTH, LEN - Snowflake Documentation

Remove unnecessary Characters by using SQL query

Do you know how to remove the kinds of characters below in a single query?
Note: I'm retrieving this data from the Access app and putting only the valid data into SQL.
select DISTINCT ltrim(rtrim(a.Company)) from [Legacy].[dbo].[Attorney] as a
This column is the company name column. I need to keep only values made of letters. I need to remove rows that contain only numbers, rows that mix numbers and other characters, NULL and empty values, and anything else such as + and -.
Based on your extremely vague "rules" I am going to make a guess.
Maybe something like this will be somewhere close.
select DISTINCT ltrim(rtrim(a.Company))
from [Legacy].[dbo].[Attorney] as a
where LEN(ltrim(rtrim(a.Company))) > 1
and IsNumeric(a.Company) = 0
This will exclude entries that are shorter than 2 characters and entries that can be converted to a number.
This should select the rows you want to delete:
where company not like '%[a-zA-Z]%' or -- has no letters at all
company like '%[^ a-zA-Z0-9.&]%' -- has a not-allowed character
The list of allowed characters in the second expression may not be complete.
If this works, then you can easily adapt it for a delete statement.
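As a rough way to test the rule on a handful of values before deleting anything, the same two conditions can be written in Python. The allowed character set is copied from the LIKE pattern above and may be just as incomplete. The sample rows are invented.

```python
import re

HAS_LETTER = re.compile(r"[a-zA-Z]")
ONLY_ALLOWED = re.compile(r"[ a-zA-Z0-9.&]*")  # same set as the LIKE pattern

def is_valid_company(name):
    """True when the trimmed name has at least one letter and contains
    no character outside the allowed set."""
    if name is None:
        return False
    name = name.strip()
    if not name:
        return False
    return bool(HAS_LETTER.search(name)) and ONLY_ALLOWED.fullmatch(name) is not None

# Invented sample rows, including NULL (None) and empty values.
rows = ["Smith & Sons", "12345", "+-", "", None, "A.B. Legal 2", "  Jones LLP  "]
valid = [r.strip() for r in rows if is_valid_company(r)]
```

Rows that fail is_valid_company are the ones the WHERE clause is meant to select for deletion.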

Transform data in Google bigquery - extract text, split it into multiple columns and pivoting the data

I have some weblog data in big query which I need to transform to make it easier to use and query. The data looks like:
I want to extract and transform the data within the curly brackets after Results{…..} (colored blue). The data is of the form ‘(\d+((PQ)|(KL))+\d+)’ and there can be 1-20+ entries in the result array. I am only interested in the first 16 entries.
I have been able to extract the data within the curly brackets into a new column, using SUBSTR and REGEXP_EXTRACT. But I'm unable to SPLIT it into columns (sometimes there is only 1 result, so the delimiter "," is missing). I'm new to regex; maybe I can use something like ‘(\d+((PQ)|(KL))+\d+){1}’ etc. to split the data into multiple columns and then pivot it.
Ideal output in my case would be to transform it into something like:
In the above solution, each row in original table is repeated from 1-16 times depending on the number of items in the Results array.
I’m not completely sure if it’s possible to do this in big query. I’ll be grateful if anyone can help me out a little here.
If this is not possible, then I can have 16 rows for every event with NULL values in Event_details for cases where there are fewer than 16 entries in the result array.
In case both of these are not possible, the last solution would be to have it transformed into something like:
The reason I want to transform the data is that in most of the cases I would need to find which result array items are appearing and in what order.
Check this out: Split string into multiple columns with bigquery.
In their case it's delimited by spaces. Replace the \s with ','
something like:
SELECT
Regexp_extract(StringToParse,r'^[^{]*\{(?:[^,]*,){0}(\d+(?:(?:PQ)|(?:KL))+\d+)\s?') as Word0,
Regexp_extract(StringToParse,r'^[^{]*\{(?:[^,]*,){1}(\d+(?:(?:PQ)|(?:KL))+\d+)\s?') as Word1,
Regexp_extract(StringToParse,r'^[^{]*\{(?:[^,]*,){2}(\d+(?:(?:PQ)|(?:KL))+\d+)\s?') as Word2,
Regexp_extract(StringToParse,r'^[^{]*\{(?:[^,]*,){3}(\d+(?:(?:PQ)|(?:KL))+\d+)\s?') as Word3
FROM
(SELECT 'bla{1234PQ5,6789KL0,1234PQ5,6789KL0,123' as StringToParse)
Use SPLIT()
SELECT Event_ID, Event_UserID, Event_SessionID, Keyword,
SPLIT(REGEXP_EXTRACT(Event_details,"Results\{(.*)\}"),",") as Event_details_item
FROM mydata.mytable
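The extract-then-split idea, including the single-item case with no comma, can be sketched in Python to see what rows it would produce. The input strings and the function names result_items and exploded_rows are hypothetical. The limit of 16 comes from the question.

```python
import re

ITEM = re.compile(r"\d+(?:PQ|KL)+\d+")  # item shape described in the question

def result_items(event_details, limit=16):
    """Pull the text inside Results{...} and return up to `limit` items."""
    block = re.search(r"Results\{(.*?)\}", event_details)
    if block is None:
        return []
    # findall copes with a single item, where no ',' delimiter is present
    return ITEM.findall(block.group(1))[:limit]

def exploded_rows(event_id, event_details):
    """One (event_id, position, item) tuple per item, like an unpivoted table."""
    return [(event_id, pos, item)
            for pos, item in enumerate(result_items(event_details), start=1)]

# Hypothetical log line; the real Event_details values were not shown.
rows = exploded_rows(1, "q=shoes Results{12PQ3,45KL6} done")
```

Keeping the position alongside each item preserves the order in which the results appeared, which is what the question says it needs.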

SQL fetch results by concatenating words in column

I have a column store_name (varchar). In that column I have entries like prime sport, best buy... with a space. But when a user types a concatenated string like primesport, without the space, I need to show the result prime sport. How can I achieve this? Please help me.
SELECT *
FROM TABLE
WHERE replace(store_name, ' ', '') LIKE '%'+#SEARCH+'%' OR STORE_NAME LIKE '%'+#SEARCH +'%'
I don't know of a complete solution and am looking for one myself, but what I do know may work for you. You can achieve this with different kinds of string operations:
Mike can be Myke or Myce or Mikke and so on.
Cat can be Kat or katt or catt and so on.
For this you would write a function that generates the possible spellings, build a SQL query from all of them, and query the database.
A similar kind of search is known as Soundex search, which is available in both Oracle and Microsoft SQL Server. Have a look at it; it may work.
And overall, make use of functions like upper and lower.
Have you tried using replace()?
You can remove the whitespace in the query and then use LIKE:
SELECT * FROM table WHERE replace(store_name, ' ', '') LIKE '%primesport%'
It will work for entries like 'prime soft' when querying with 'primesoft'.
Or you can use regex.
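A small runnable sketch of the replace() plus LIKE approach, using Python's built-in sqlite3 as a stand-in for the real database. The stores table and its rows are made up for illustration, and SQLite uses || for concatenation where the query above uses +.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stores (store_name TEXT)")
conn.executemany(
    "INSERT INTO stores VALUES (?)",
    [("prime sport",), ("best buy",), ("corner shop",)],
)

def find_stores(search):
    """Match the search term against store names with spaces removed on
    both sides, so 'primesport' finds 'prime sport'."""
    needle = search.replace(" ", "")
    sql = (
        "SELECT store_name FROM stores "
        "WHERE replace(store_name, ' ', '') LIKE '%' || ? || '%' "
        "ORDER BY store_name"
    )
    return [row[0] for row in conn.execute(sql, (needle,))]

matches = find_stores("primesport")
```

Stripping the spaces from the search term as well means that a single comparison covers both 'primesport' and 'prime sport' as input. Note that % and _ typed by the user would still act as LIKE wildcards.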

How to do string manipulation in SQL query

I know I'm close to figuring this out but need a little help. What I'm trying to do is grab a column from a particular table, but chop off the first 4 characters. For example, if the value in a column is "KPIT08L", the result I want is 08L. Here is what I have so far, but I'm not getting the desired results.
SELECT LEFT(FIELD_NAME, 4)
FROM TABLE_NAME
First up, left will give you the leftmost characters. If you want the characters starting at a specific location, you need to look into mid:
select mid (field_name,5) ...
Secondly, if you value performance, portability and scalability at all, this sort of "sub-column" manipulation should generally be avoided. It's usually far easier (and faster) to patch columns together than to split them apart.
In other words, keep the first four characters in their own column and the rest in a separate column, and do your selects on the relevant one. If you're using anything less than a full column, then it's technically not one attribute of the row.
Try with
SELECT MID(FIELD_NAME, 5) FROM TABLE_NAME
Mid is very powerful: it lets you select the starting point and everything after it, or,
if specified, the desired length, as in
SELECT MID(FIELD_NAME, 5, 2) FROM TABLE_NAME ' gives 08 in your example text
SELECT RIGHT(FIELD_NAME,LEN(FIELD_NAME)-4)
FROM TABLE_NAME;
If it is for a generic string then the above one will work...
Don't have Access at my current location, but please try this.
SELECT RIGHT(FIELD_NAME, LEN(FIELD_NAME)-4)
FROM TABLE_NAME
LEFT(FIELD_NAME, 4) will return the first 4 characters of FIELD_NAME.
What you need to do is :
SELECT MID(FIELD_NAME, 5)
FROM TABLE_NAME
If you have a FIELD_NAME of 10 characters, the function will return the last 6 characters (chopping off the first 4)!
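The difference between taking the first four characters and skipping them can be checked with Python's built-in sqlite3. This is only a stand-in for Access: SQLite has no MID or LEFT, so substr is assumed here to play both roles, with the same 1-based start position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (field_name TEXT)")
conn.execute("INSERT INTO table_name VALUES ('KPIT08L')")

# substr(x, 1, 4) behaves like LEFT(x, 4); substr(x, 5) like MID(x, 5);
# substr(x, 5, 2) like MID(x, 5, 2).
first_four, rest, two_chars = conn.execute(
    "SELECT substr(field_name, 1, 4), substr(field_name, 5), "
    "substr(field_name, 5, 2) FROM table_name"
).fetchone()
```

For the value from the question, first_four is 'KPIT', rest is '08L' and two_chars is '08'.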