I have a column error_desc with values like:
Failure occurred in (Class::Method) xxxxCalcModule::endCustomer. Fan id 111232 is not Effective or not present in BL9_XXXXX for date 20160XXX.
What SQL query can I use to display only the number 111232 from that column? The number starts at the 66th position of the VARCHAR column and ends at the 71st.
SELECT substr(ERROR_DESC,66,6) as ABC FROM bl1_cycle_errors where error_desc like '%FAN%'
This solution uses regular expressions.
The challenge I faced was getting rid of the alphanumeric tokens. To isolate the standalone number, we have to keep only pure numbers and filter out words, alphanumeric tokens, and punctuation.
Words that contain no digits can be filtered out easily using:
[^[:digit:]]
Possible combinations of alphanumerics are:
1. Begins with letters, contains numbers, may end with letters or punctuation:
[a-zA-Z]+[0-9]+[[:punct:]]*[a-zA-Z]*[[:punct:]]*
2. Begins with numbers, then contains letters, may contain punctuation:
[0-9]+[[:punct:]]*[a-zA-Z]+[[:punct:]]*
3. Begins with numbers, then contains punctuation, may contain letters:
[0-9]+[a-zA-Z]*[[:punct:]]+[a-zA-Z]*
Combining these regular expressions with the | operator, we get:
select trim(REGEXP_REPLACE(error_desc,'[^[:digit:]]|[a-zA-Z]+[0-9]+[[:punct:]]*[a-zA-Z]*[[:punct:]]*|[0-9]+[[:punct:]]*[a-zA-Z]+[[:punct:]]*|[0-9]+[a-zA-Z]*[[:punct:]]+[a-zA-Z]*',' '))
from error_table;
This will work in most cases.
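The combined pattern can be checked outside the database. This is a minimal sketch, assuming Python's `re` module as a stand-in for REGEXP_REPLACE. Python has no POSIX classes, so `[[:digit:]]` becomes `[0-9]` and `[[:punct:]]` is approximated with ASCII `string.punctuation`. The name `extract_number` is hypothetical. Python tries alternatives left to right, which may differ from the database's matching rules, so treat the result as indicative only.

```python
import re
import string

# Stand-in for POSIX [[:punct:]]; ASCII punctuation only (an assumption).
PUNCT = "[" + re.escape(string.punctuation) + "]"

ALTERNATIVES = [
    "[^0-9]",                                                # any single non-digit
    "[a-zA-Z]+[0-9]+" + PUNCT + "*[a-zA-Z]*" + PUNCT + "*",  # letters, then digits
    "[0-9]+" + PUNCT + "*[a-zA-Z]+" + PUNCT + "*",           # digits, then letters
    "[0-9]+[a-zA-Z]*" + PUNCT + "+[a-zA-Z]*",                # digits, then punctuation
]
PATTERN = re.compile("|".join(ALTERNATIVES))


def extract_number(error_desc):
    """Replace every match with a space, then trim, like trim(REGEXP_REPLACE(...))."""
    return PATTERN.sub(" ", error_desc).strip()


sample = (
    "Failure occurred in (Class::Method) xxxxCalcModule::endCustomer. "
    "Fan id 111232 is not Effective or not present in BL9_XXXXX for date 20160XXX."
)
print(extract_number(sample))
```

Any other standalone number in the text would survive as well, which is one reason the query only works in most cases.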
I have the above entries in my database. My requirement is to extract the fields containing non-English characters (including data that combines English and non-English characters, like the HotelName field for ID 45).
I tried the regexp_like function, looking for alphanumeric and non-alphanumeric characters, but some of my data contains a combination of both, and the condition fails there.
Thanks in Advance
Raghavan
Does this do what you want?
where regexp_like(hotelname, '[^a-zA-Z0-9 ]')
That is, where the hotel name contains any character that is not a "letter" or digit. You may need to take additional characters into account as well, such as commas, periods, and hyphens.
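To see what this character class catches, here is a small sketch that uses Python's `re` module in place of regexp_like. The hotel names and the helper `has_unexpected_chars` are made up for illustration and are not from the original table.

```python
import re

# Same class as in the query: anything that is not an ASCII letter, digit, or space.
NON_ALNUM = re.compile(r"[^a-zA-Z0-9 ]")


def has_unexpected_chars(name):
    """True when the name contains at least one character outside the class."""
    return NON_ALNUM.search(name) is not None


# Hypothetical sample values.
names = ["Grand Hotel", "Hôtel Lumière", "Hotel 45", "St. Regis"]
flagged = [n for n in names if has_unexpected_chars(n)]
print(flagged)
```

Note that "St. Regis" is flagged too, because of the period. That is why punctuation such as commas, periods, and hyphens may need to be added to the class.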
I have a field in a database table in the format:
111_2222_33333,222_444_3,aaa_bbb_ccc
This format is uniform across the entire field: three underscore-separated numeric values, a comma, three more underscore-separated numeric values, another comma, and then three underscore-separated text values, with no spaces in between.
I want to extract the middle value from the second numeric sequence. In the example above, I want to get 444.
In a SQL query I inherited, the regex used is ^.*,(\d+)_.*$ but this doesn't seem to do anything.
I've tried to identify the first comma, the first number after it, and the following underscore (,222_) to use as a starting point, and from there get the next number without the _ after it.
This (,\d*_)(\d+[^_]) selects ,222_444 and is the closest I've gotten.
We can try using REGEXP_REPLACE with a capture group:
SELECT
REGEXP_REPLACE(
'111_2222_33333,222_444_3,aaa_bbb_ccc',
'^[^,]+,[^_]+_(.*?)_[^_]+,.*$',
'\1') AS num
FROM yourTable;
The first capture group of the above regex contains the value you want, which is 444 for the sample string.
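The capture-group replacement can be reproduced with a short sketch, assuming Python's `re.sub` as a stand-in for REGEXP_REPLACE. The pattern is copied unchanged from the query. The name `middle_value` is hypothetical.

```python
import re

# Same pattern as in the query: skip the first comma-separated block,
# skip the first number of the second block, capture the middle number.
PATTERN = re.compile(r"^[^,]+,[^_]+_(.*?)_[^_]+,.*$")


def middle_value(field):
    """Replace the whole string with the first capture group."""
    return PATTERN.sub(r"\1", field)


print(middle_value("111_2222_33333,222_444_3,aaa_bbb_ccc"))  # prints 444
```

If the input does not match the pattern, the substitution leaves it unchanged, so a non-conforming row comes back in full rather than as an empty value.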
I need to find rows where the phone number field contains unexpected characters.
Most of the values in this field look like:
123456-7890
This is expected. However, we are also seeing character values in this field such as * and #.
I want to find all rows where these unexpected character values exist.
Expected:
Numbers are expected
Hyphen with numbers is expected (hyphen alone is not)
NULL is expected
Empty is expected
Tried this:
WHERE phone_num is not like ' %[0-9,-,' ' ]%
Still getting rows where phone has numbers.
At https://regexr.com/3c53v you can edit the regex to match your needs.
I am going to use the example regex for this purpose:
select * from Table1
Where NOT REGEXP_LIKE(PhoneNumberColumn, '^[+]*[(]{0,1}[0-9]{1,4}[)]{0,1}[-\s\./0-9]*$')
You can use translate()
...
WHERE translate(Phone_Number,'a1234567890-', 'a') is NOT NULL
This will strip out all valid characters leaving behind the invalid ones. If all the characters are valid, the result would be NULL. This does not validate the format, for that you'd need to use REGEXP_LIKE or something similar.
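The stripping idea can be sketched in Python with `str.translate`, which deletes the valid characters and leaves any invalid ones behind. This is only an illustration: Oracle treats an empty string as NULL, while Python returns an empty string, so the check below is for a non-empty result. In the SQL, the leading 'a' maps to itself so that the third argument of translate() is not empty. The phone values and the name `invalid_chars` are made up.

```python
# Characters considered valid in a phone number, as in the query.
VALID = "1234567890-"

# Translation table that deletes every valid character.
STRIP_VALID = str.maketrans("", "", VALID)


def invalid_chars(phone):
    """Return whatever is left after removing the valid characters."""
    return phone.translate(STRIP_VALID)


# Hypothetical sample values.
phones = ["123456-7890", "123*456", "#5551234", ""]
bad = [p for p in phones if invalid_chars(p)]
print(bad)
```

A lone hyphen passes this check, which matches the remark that the approach does not validate the format.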
You can use regexp_like().
...
WHERE regexp_like(phone_num, '[^ 0123456789-]|^-|-$')
[^ 0123456789-] matches any character that is not a space, a digit, or a hyphen. ^- matches a hyphen at the beginning of the string and -$ matches one at the end. The pipes are "ors", i.e. a|b matches if pattern a matches or if pattern b matches.
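The three alternatives can be tried on a few values with a short sketch, assuming Python's `re.search` in place of regexp_like. The phone values and the name `is_unexpected` are made up for illustration.

```python
import re

# Same three alternatives as in the query:
# a disallowed character, a leading hyphen, or a trailing hyphen.
UNEXPECTED = re.compile(r"[^ 0-9-]|^-|-$")


def is_unexpected(phone):
    """True when any of the three alternatives matches somewhere in the value."""
    return UNEXPECTED.search(phone) is not None


# Hypothetical sample values.
phones = ["123456-7890", "123*456", "-", "555-", "555 1234"]
flagged = [p for p in phones if is_unexpected(p)]
print(flagged)
```

Empty strings do not match any alternative, so they are not flagged. NULL handling is a database concern and is not modelled here.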
Oracle has REGEXP_LIKE for regex compares:
WHERE REGEXP_LIKE(phone_num,'[^0-9''\-]')
If you're unfamiliar with regular expressions, there are plenty of good sites to help you build them.
I would like to extract strings of varying length located between two repeating underscores in Hive QL. Below I show a sampling of the pattern of the rows. Specifically, I would like to extract the string between the 3rd and 4th underscores. Thanks!
2016_sadfsa_IL_THIS_xsdaf_asd_eventbyevent_tsaC_NA_300x250
2017_thisshopper_MA_THIS_NAT_Leb_ReasonsWhy_HDIMC_NA_300x600
2017_FordShopper_IL_THESE_NAT_sov_winterEvent_HDIMC_NA_300x600
I kept trying and adapted this from earlier answers written for non-Hive SQL. I am still interested in better ways of doing this. Note that creative_str is the name of the column:
select creative_str, ltrim(rtrim(substring(regexp_replace(cast(creative_str as varchar(1000)), '_', repeat(cast(' ' as varchar(1000)),10000)), 30001, 10000)))
from impression_cr
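The padding trick in this query is easier to follow in a small sketch. Each underscore becomes 10000 spaces, so the fourth field always falls inside the character window 30001 to 40000, and trimming recovers it. The Python version below mirrors that logic and assumes the first three fields together are shorter than the padding width. The name `fourth_field` is hypothetical.

```python
PAD = 10000  # width of the space run that replaces each underscore


def fourth_field(creative_str):
    """Mirror of ltrim(rtrim(substring(regexp_replace(...), 30001, 10000)))."""
    padded = creative_str.replace("_", " " * PAD)
    # SQL substring(..., 30001, 10000) is the 0-based slice [30000:40000].
    return padded[3 * PAD:4 * PAD].strip()


rows = [
    "2016_sadfsa_IL_THIS_xsdaf_asd_eventbyevent_tsaC_NA_300x250",
    "2017_thisshopper_MA_THIS_NAT_Leb_ReasonsWhy_HDIMC_NA_300x600",
    "2017_FordShopper_IL_THESE_NAT_sov_winterEvent_HDIMC_NA_300x600",
]
print([fourth_field(r) for r in rows])
```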
You should be able to do this with Hive's SPLIT() function. If you're trying to grab the value between the third and fourth underscores, this will do it:
SELECT SPLIT("2016_sadfsa_IL_THIS_xsdaf_asd_eventbyevent_tsaC_NA_300x250", "[_]")[3],
SPLIT("2017_thisshopper_MA_THIS_NAT_Leb_ReasonsWhy_HDIMC_NA_300x600", "[_]")[3],
SPLIT("2017_FordShopper_IL_THESE_NAT_sov_winterEvent_HDIMC_NA_300x600", "[_]")[3]
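The same indexing can be checked with Python's `str.split`, which is zero-based like the array returned by Hive's SPLIT(). Index 3 is therefore the fourth element, the one between the third and fourth underscores. This is only a sketch of the indexing logic, not Hive code.

```python
rows = [
    "2016_sadfsa_IL_THIS_xsdaf_asd_eventbyevent_tsaC_NA_300x250",
    "2017_thisshopper_MA_THIS_NAT_Leb_ReasonsWhy_HDIMC_NA_300x600",
    "2017_FordShopper_IL_THESE_NAT_sov_winterEvent_HDIMC_NA_300x600",
]

# Split on the underscore and take the element at zero-based index 3.
between_third_and_fourth = [row.split("_")[3] for row in rows]
print(between_third_and_fourth)  # prints ['THIS', 'THIS', 'THESE']
```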
I have a table with a string field containing location information. I want to be able to query this table and retrieve all of the tags matching the format xxxxxxAA where xxxxxx is a 6-digit number and AA is two alphabetic characters.
Is there a method of querying this using SQL or is this something that I need to do in VBA?
Sample data:
BGS5 PM RGP5
022051PM
022201PM
030539PM
WAS3N
179546MM
And I want to return the following without knowing the values:
022051PM
022201PM
030539PM
179546MM
thanks in advance
Jason
You can use a query with a Like comparison in the WHERE clause.
SELECT y.text_field
FROM YourTable AS y
WHERE y.text_field Like '######[A-Z][A-Z]'
The # matches a digit.
[A-Z] matches one character from a character class consisting of only letters. That character class is actually upper case letters. However, the comparison is case-insensitive, so it will match lower case letters, too.
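For comparison, the Like pattern can be expressed as a regular expression and run against the sample data from the question. This sketch assumes Python's `re.fullmatch`, because Like compares the pattern against the whole value. `[A-Za-z]` stands in for the case-insensitive `[A-Z]`.

```python
import re

# Six digits followed by two letters, matched against the whole value.
TAG = re.compile(r"[0-9]{6}[A-Za-z]{2}")

sample = ["BGS5 PM RGP5", "022051PM", "022201PM", "030539PM", "WAS3N", "179546MM"]

# Keep only the values that match the tag format in full.
matches = [s for s in sample if TAG.fullmatch(s)]
print(matches)
```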