Exclude Minimum 6 Digits and Replace Trailing digits In Hive - sql

Can someone help me write the logic below in Hive?
I have a column whose values are digit strings with trailing 0s. I need to replace all of these trailing 0s with 9s, but at least 6 digits must remain before the first 9. If fewer than 6 digits precede the trailing 0s, enough 0s have to be kept so that at least 6 digits come before the 9s. Some scenarios follow.
1234506600000000000
Here the number of digits before the trailing 0s is 8 (12345066), so I just need to replace the remaining 0s with 9s, and the output will be 1234506699999999999.
1234500000000000000
Here there are only 5 digits before the trailing 0s, so the 0 in the 6th position must also count as a digit and be left alone when replacing the 0s with 9s. The output will be 1234509999999999999.
1000000000000000000
Here there is only 1 digit before the trailing 0s, so I need to keep 5 extra 0s and replace the remaining 0s with 9s. The final output will be 1000009999999999999.
Input Output
1234506600000000000 1234506699999999999
1234500000000000000 1234509999999999999
1000000000000000000 1000009999999999999

If you want to modify leftjoin's technique from the other question, you can tweak the regex so that it matches at least 6 digits, including 0s:
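-- The lazy \\d{6,}? keeps as few leading digits as possible (but at least 6);
-- (0+)$ captures the trailing 0s, which translate() then turns into 9s.
-- lpad left-pads the result with '0' to 13 characters.
with mytable as (
select '1234560000000' as input union all
select '123450000000' union all
select '12340000000' union all
select '1230000000'
)
select lpad(concat(splitted[0], translate(splitted[1],'0','9')),13,'0')
from
(
select split(regexp_replace(input,'(\\d{6,}?)(0+)$','$1|$2'),'\\|') splitted
from mytable
)s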
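To check the regex logic outside Hive, here is a minimal Python sketch of the same pattern. The function name is made up for illustration, and the sketch skips the lpad step, so it returns a string of the same length as its input.

```python
import re

# Same pattern as the Hive query: a lazy run of at least 6 digits,
# followed by the trailing 0s up to the end of the string.
TRAILING_ZEROS = re.compile(r"(\d{6,}?)(0+)$")


def nines_for_trailing_zeros(value: str) -> str:
    """Replace trailing 0s with 9s while keeping at least 6 leading digits.

    Hypothetical helper, not part of the original answer.
    """
    return TRAILING_ZEROS.sub(
        lambda m: m.group(1) + "9" * len(m.group(2)), value
    )


if __name__ == "__main__":
    for sample in ("1234560000000", "123450000000", "12340000000", "1230000000"):
        print(sample, "->", nines_for_trailing_zeros(sample))
```

Values with fewer than 7 digits, or with no trailing 0 after the first 6 digits, do not match the pattern and come back unchanged in this sketch.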
If you want to go the replace/pad/replace route I proposed, check the length of the number after it has been rtrim'd and, if it is less than 6, rpad it out to 6 with zeroes. Most implementations of rpad chop the string off at 6 chars if it is longer than 6; if they didn't, it would be nice and simple to just call rpad after rtrim. If Hive's rpad performs a substring op, it might be worth making your own rpad function that leaves strings longer than N alone.
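The trim/pad/replace idea can be sketched in Python as follows. Both function names are hypothetical, and rpad_keep stands in for the "rpad that leaves longer strings alone" suggested above; it is not an existing Hive function.

```python
def rpad_keep(text: str, width: int, pad: str = "0") -> str:
    """Right-pad text to width, but never truncate longer strings.

    Hypothetical helper mirroring the custom rpad suggested in the answer.
    """
    if len(text) >= width:
        return text
    return text + pad * (width - len(text))


def replace_trailing_zeros(value: str, min_keep: int = 6) -> str:
    """Strip trailing 0s, pad back to at least min_keep digits, fill with 9s."""
    head = rpad_keep(value.rstrip("0"), min_keep)
    head = head[: len(value)]  # never grow beyond the original length
    return head + "9" * (len(value) - len(head))


if __name__ == "__main__":
    for sample in ("12345066" + "0" * 11, "12345" + "0" * 14, "1" + "0" * 18):
        print(sample, "->", replace_trailing_zeros(sample))
```

The final slice is only a guard for inputs shorter than 6 characters, so the result always has the same length as the input.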

Related

Length of VARCHAR in Teradata

I'm trying to write a query that only includes customer numbers either six or seven digits long. The numbers are stored in a VARCHAR(30) field in Teradata. I've tried the following:
...
AND LENGTH(STAFF_NO) > 5
AND LENGTH(STAFF_NO) < 8
...
...
AND CHARACTER_LENGTH(STAFF_NO) > 5
AND CHARACTER_LENGTH(STAFF_NO) < 8
...
...
AND CHAR_LENGTH(STAFF_NO) > 5
AND CHAR_LENGTH(STAFF_NO) < 8
...
but all of these have returned no rows; the query, in each case, has only looked at the maximum length of the field (30) rather than the actual number of characters in it.
How can I filter so it only checks the number of actual characters in the field?
It depends on how you want to handle the trailing / leading whitespace.
Include: CHAR_LENGTH(STAFF_NO) BETWEEN 6 AND 7
Exclude: CHAR_LENGTH(TRIM(STAFF_NO)) BETWEEN 6 AND 7
It shouldn't make a difference whether it's VARCHAR or CHAR.
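As a rough illustration of the include/exclude distinction, here is a small Python sketch. The function name and the sample value are assumptions made for the example, and the code only mimics the length check rather than Teradata itself.

```python
def has_six_or_seven_chars(staff_no: str, trim: bool) -> bool:
    """Check the 6-to-7 character rule, optionally ignoring surrounding blanks."""
    text = staff_no.strip(" ") if trim else staff_no
    return 6 <= len(text) <= 7


if __name__ == "__main__":
    padded = "123456   "  # six digits followed by three trailing blanks
    print(has_six_or_seven_chars(padded, trim=False))  # blanks are counted
    print(has_six_or_seven_chars(padded, trim=True))   # blanks are ignored
```

If the stored values carry trailing blanks, only the trimmed check can succeed, which would explain a filter that returns no rows.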
Use:
LENGTH(TRIM(BOTH FROM STAFF_NO)) AS STAFF_NO_LENGTH
This removes empty spaces from either side of the string.
You need to use the CHARACTER_LENGTH function to get the number of characters in a VARCHAR column.
Cheers!!
Each of the functions you used should return the length of the column value, not the maximum length of the column itself. Perhaps there are extra spaces at the end of STAFF_NO? Try adding a TRIM as follows:
AND CHARS(TRIM(STAFF_NO)) BETWEEN 6 AND 7

REGEXP_LIKE between number range

Can someone please help me finalize the code below?
I only want to look for a 6-digit number, between 100000 and 999999, anywhere in the RMK field.
REGEXP_LIKE(RMKADH.RMK, '[[:digit:]]')
The current code works but is bringing back anything with a number so I'm trying to narrow it down to 6 digits together. I've tried a few but no luck.
Edit:
I want to flag this field if a 6 digit number is present. The reference will always be 6 digits long only, no more no less. But as it's a free text field it could be anywhere and contain anything.
Example output I do want to flag: >abc123456markj< = flagged.
Output I don't want to flag: >Mark 23647282< because the number it finds is more than 6 characters in length I know it's not a valid reference.
Try this:
REGEXP_LIKE(RMKADH.RMK, '[1-9][[:digit:]]{5}') AND length(RMKADH.RMK) = 6
For more info, see: Multilingual Regular Expression Syntax
You can do a REGEXP_SUBSTR to get 6 digits out of the given field and compare it using between
select * from t
where to_number(regexp_substr(col,'[[:digit:]]{6}')) between 100000 and 999999;
Please note that if a sequence longer than 6 digits exists, the above solution takes only its first 6 digits into consideration. If you want this to work for any 6 consecutive digits, the solution will have to be a different one.
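For the requirement in the question (flag a reference of exactly 6 digits, but not a longer digit run), one possible approach is to require a non-digit or a string boundary on both sides. The Python sketch below shows the idea; the function name is hypothetical. An Oracle pattern along the lines of '(^|[^[:digit:]])[1-9][[:digit:]]{5}([^[:digit:]]|$)' should express the same rule, but that pattern is an untested assumption.

```python
import re

# Exactly six digits, first digit 1-9 (so 100000..999999),
# not preceded or followed by another digit.
SIX_DIGIT_REF = re.compile(r"(?<!\d)[1-9]\d{5}(?!\d)")


def has_six_digit_reference(remark: str) -> bool:
    """Return True when the free text contains a standalone 6-digit number."""
    return SIX_DIGIT_REF.search(remark) is not None


if __name__ == "__main__":
    for remark in ("abc123456markj", "Mark 23647282"):
        print(repr(remark), has_six_digit_reference(remark))
```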
If you want to get all the records that contain only numeric values, you can use the query below:
REGEXP_LIKE(RMKADH.RMK, '^[[:digit:]]+$');
The above matches strings that consist entirely of digits from start to end, however many there are. So if your numbers span from 1 digit to any number of digits, this will be useful.
SELECT
    to_number(regexp_replace('abc123456markj', '[^[:digit:]]', '')) digits
FROM
    dual
WHERE
    REGEXP_LIKE('abc123456markj', '[[:digit:]]')
AND
    length(regexp_replace('abc123456markj', '[^[:digit:]]', '')) = 6
AND
    regexp_replace('abc123456markj', '[^[:digit:]]', '') BETWEEN 100000 AND 999999;

zero padding in teradata sql

Table A
Id varchar(30)
I'm trying to re-create some logic where I have to use 9-digit Ids irrespective of the actual length of the value in the Id field.
So, for instance, if the Id has length 6, I'll need to left-pad it with 3 leading zeros. The actual length can be anything from 1 to 9.
Any ideas how to implement this in Teradata SQL?
If the actual length is 1 to 9 characters, why is the column defined as VARCHAR(30)?
If it was a numeric column it would be easy:
CAST(CAST(numeric_col AS FORMAT '9(9)') AS CHAR(9))
For strings there's no FORMAT like that, but depending on your release you might have an LPAD function:
LPAD(string_col, 9, '0')
Otherwise it's:
SUBSTRING('000000000' FROM CHAR_LENGTH(string_col)+1) || string_col,
If there are more than nine characters, all of the previous calculations will return all of them.
If you want to truncate (or want a CHAR instead of a VARCHAR result), you have to add a final CAST AS CHAR(9).
And finally, if there are leading or trailing blanks, you might want to use TRIM(string_col).
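The SUBSTRING-based fallback can be easier to follow in a small Python sketch of the same arithmetic. The function name is hypothetical, and stripping blanks first is the optional TRIM step mentioned above.

```python
def left_pad_id(raw_id: str, width: int = 9) -> str:
    """Left-pad an Id with zeros by slicing a string of zeros.

    Mirrors SUBSTRING('000000000' FROM CHAR_LENGTH(s) + 1) || s.
    """
    text = raw_id.strip(" ")   # optional TRIM step
    zeros = "0" * width
    # Skip as many zeros as the value has characters, then append the value.
    return zeros[len(text):] + text


if __name__ == "__main__":
    for sample in ("123456", " 7 ", "1234567890"):
        print(repr(sample), "->", left_pad_id(sample))
```

When the value is longer than nine characters the slice of zeros is empty, so the value passes through untruncated.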

using oracle sql substr to get last digits

I have a query result and need to get the trailing digits of one column, say 'term'.
The value of column term can be like:
'term'     'number' (output)
---------------------------
xyz012     12
xyz112     112
xyz1       1
xyz02      2
xyz002     2
xyz88      88
Note: Not limited to the above scenarios; the requirement is that the last 3 or fewer characters can be digits.
Function I used: to_number(substr(term.name,-3))
(Initially I assumed that the last 3 characters are always digits, but I was wrong.)
I am using to_number because if the last 3 digits are '012' then the number should be '12'.
But as one can see, some specific cases (like 'xyz88' and 'xyz1') would give
ORA-01722: invalid number
How can I achieve this using substr or regexp_substr ?
Did not explore regexp_substr much.
Using REGEXP_SUBSTR,
select column_name, to_number(regexp_substr(column_name,'\d+$'))
from table_name;
\d matches digits. Along with +, it becomes a group with one or more digits.
$ matches end of line.
Putting it together, this regex extracts a group of digits at the end of a string.
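The same extraction, sketched in Python with the sample values from the question. The function name is made up for illustration, and returning None for a term without trailing digits is a choice of this sketch.

```python
import re
from typing import Optional


def trailing_number(term: str) -> Optional[int]:
    """Return the run of digits at the end of term as an int, or None."""
    match = re.search(r"\d+$", term)
    # int() drops leading zeros, like to_number does in the Oracle query.
    return int(match.group()) if match else None


if __name__ == "__main__":
    for term in ("xyz012", "xyz112", "xyz1", "xyz02", "xyz002", "xyz88"):
        print(term, "->", trailing_number(term))
```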
Oracle has the function regexp_instr(), which does what you want:
select term, cast(substr(term, 1-regexp_instr(reverse(term),'[^0-9]')) as int) as num
from table_name;
select SUBSTRING(acc_no,len(acc_no)-1,len(acc_no)) from table_name;

SQL to_char() printing 2 digit number as 0xx and not touching 3 number digit

I'm currently battling with something
that must be trivial for you.
I have 2 numbers, 191 and 97, and I need to put them in a SQL query as chars, where 97 must be printed as 097.
At first I tried 999, but it added 2 spaces to my numbers.
Then I tried 099, which does print 097 but adds a space to it.
to_char(:center, '099') = " 191" and " 097"
Where is this space coming from?
Thanks.
What you're looking for is the Format Modifier element:
to_char(:center, 'fm099')
The leading space is for the potential minus sign. To remove it you can use FM in the format:
to_char(v_num,'FM099')
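As an analogy only (this is Python's format mini-language, not Oracle's to_char), the sketch below reproduces the same effect: a format that reserves a slot for the sign prints a leading space for positive numbers, and a format without that slot does not. The function names are hypothetical.

```python
def with_sign_slot(n: int) -> str:
    """Zero-pad to 3 digits and reserve one leading character for the sign."""
    return format(n, " 04d")


def without_sign_slot(n: int) -> str:
    """Zero-pad to 3 characters with no reserved sign position."""
    return format(n, "03d")


if __name__ == "__main__":
    for n in (191, 97, -97):
        print(repr(with_sign_slot(n)), repr(without_sign_slot(n)))
```

The first function behaves like '099' in the question (space or minus in front), while the second is closer to 'FM099' for non-negative values.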
Element 9 (example: 9999): Returns value with the specified number of digits with a leading space if positive or with a leading minus if negative. Leading zeros are blank, except for a zero value, which returns a zero for the integer part of the fixed-point number.
From http://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements004.htm#i34510
Use @DavidAldridge's solution, or trim your value.
If you want all column values to have the same number of digits, even when the actual value has fewer digits, try this:
a) SELECT TO_CHAR(COLUMN_NAME, 'FM099') FROM TABLE_NAME;
b) SELECT TO_CHAR(COLUMN_NAME, 'FM000') FROM TABLE_NAME;
Both work fine, but I don't know which one is the better choice.