Formatting hierarchical queries with the LPAD function - sql

SELECT LPAD(last_name, LENGTH(last_name)+(LEVEL*2)-2,'_')
AS org_chart
FROM employees
START WITH last_name='King'
CONNECT BY PRIOR employee_id=manager_id ;
LPAD(char1,n [,char2]) returns char1, left-padded to length n with the sequence
of characters in char2.
This tells SQL to take the LAST_NAME and left-pad it with the '_' character till
the length of the resultant string is equal to the value determined by
LENGTH(last_name)+(LEVEL*2)-2.
For LEVEL = 1, (2 * 1) - 2 = 2 - 2 = 0, so nothing is added.
For LEVEL = 2, (2 * 2) - 2 = 4 - 2 = 2, so it gets padded with 2 '_'
characters and is displayed indented.
What I don't understand is how the formula was arrived at, i.e. why LENGTH(last_name) has to be added to (LEVEL*2)-2.
In the output, King doesn't get padded with '-' at all.
This is the correct output
ORG_CHART
KING
__PAUL
__JONES
____SCOTT
______ADAMS
____FORD
______SMITH
__BLAKE
____ALLEN
____WARD
____MARTIN
____TURNER
____JAMES
__CLARK
____MILLER
But by the formula, lpad(king,4+(l*10)-10,'-')=>lpad(king,4,'-'),
which I take to mean King should be padded with 4 '-' characters.

You need to include the length of the field because you have to allow for the number of characters in that, plus the amount of indentation you want. Taking the third-level Greenberg for example, that is displayed as:
____Greenberg
... with four underscores. Level is 3 here, so (level * 2) - 2 is four. But if you only used that value you'd get:
lpad('Greenberg', 4, '_')
and the output of that is just:
Gree
You want the final output string, including the underscores, to be four characters longer than the name on its own. 'Greenberg' is 9 characters, and ____Greenberg is 13; so your padding length has to be 13, which is the length of the name plus the number of underscores you want to appear in front.
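A quick check against dual may make the difference concrete. This is only a sketch using the literal 'Greenberg' from the example above; the column aliases are made up for illustration.

```sql
-- n is the total length of the result, not the number of pad characters.
SELECT LPAD('Greenberg', 4, '_')                       AS too_short,  -- 'Gree': cut down to 4 characters
       LPAD('Greenberg', LENGTH('Greenberg') + 4, '_') AS indented    -- '____Greenberg': 13 characters
FROM dual;
```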
Another way to get the same effect is with:
SELECT LPAD('_', (LEVEL - 1) * 2, '_') || last_name AS org_chart
...
That makes the underscore padding separate from the name itself - it's based just on the level, and the name is just concatenated on the end.
For King, the level is 1. You said the formula is:
lpad(king,4+(l*10)-10,'-')=>lpad(king,4,'-')
Which is right, but 'King' is already four characters long, so padding it out to four characters has no effect. You are padding it out to a final length of 4. lpad doesn't add four underscores regardless; it only adds underscores up to the requested length, which is 4 in this case.
I think you're just misinterpreting how the function works. As the documentation says (emphasis added):
LPAD returns expr1, left-padded to length n characters with the
sequence of characters in expr2.
...
The argument n is the total length of the return value as it is displayed on your terminal screen.
So:
select lpad('King',4,'_') from dual;
LPAD('KING',4,'_')
------------------
King
If you asked for a longer final length you'd get the number of underscores needed to pad 'King' out to that length:
select lpad('King',5,'_') from dual;
LPAD('KING',5,'_')
------------------
_King
If you want King to be indented by two underscores as well, and subsequent levels to be indented further to match (so Kochhar gets 4 and Greenberg gets 6), then remove the -2 from the calculation.
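As a sketch of that change, assuming the same employees table used in the question, the query might become:

```sql
-- Same hierarchy as before, but without the -2, so level 1 is padded by 2,
-- level 2 by 4, level 3 by 6, and so on.
SELECT LPAD(last_name, LENGTH(last_name) + (LEVEL * 2), '_') AS org_chart
FROM employees
START WITH last_name = 'King'
CONNECT BY PRIOR employee_id = manager_id;
```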

Related

Oracle SQL - Redacting multiple occurrences all but last four digits of numbers of varying length within free text narrative

Is there a straightforward way, perhaps using REGEXP_REPLACE or the like, to redact all but the last four digits of numbers (of varying length, 5 digits or more) appearing within free text (there may be multiple occurrences of separate numbers within the text)?
E.g.
Input = 'This is a test text with numbers 12345, 9876543210 and separately number 1234567887654321 all buried within the text'
Output = 'This is a test text with numbers ****5, ******3210 and separately number ************4321 all buried within the text'
With REGEXP_REPLACE it's obviously straightforward to replace all numbers with the *, but it's maintaining the final four digits and replacing with the correct number of *s that's vexing me.
Any help would be much appreciated!
(Just for context, due to the usual kind of business limitations, this had to be done within the query retrieving the data rather than using actual Oracle DBMS redaction functionality).
Many thanks.
You could try the following regex:
regexp_replace(txt, '(\d{4})(\d+(\D|$))', '****\2')
This matches a run of 4 digits followed by at least one more digit and then a non-digit character (or the end of the string), and replaces the first 4 digits with 4 stars.
Demo on DB Fiddle:
with t as (select 'select This is a test text with numbers 12345, 9876543210 and separately number 1234567887654321 all buried within the text' txt from dual)
select regexp_replace(txt, '(\d{4})(\d+\D)', '****\2') new_text from t
| NEW_TEXT |
| :-------------------------------------------------------------------------------------------------------------------------- |
| select This is a test text with numbers ****5, ****543210 and separately number ****567887654321 all buried within the text |
Edit
Here is a simplified version, suggested by Aleksej in the comments:
regexp_replace(txt, '(\d{4})(\d+)', '****\2')
This works because of the greediness of the regexp engine, which will consume as many digits as possible with '\d+'.
If you really need to keep the length of the numbers, then (I think) there is no way to do it in one step. You'll have to split the string into numbers and non-numbers and then replace the digits separately:
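To see the greedy match in isolation, here is a small sketch against dual with two of the numbers from the question. It assumes an Oracle version whose regular expressions accept \d (as the queries above already do); the alias is arbitrary.

```sql
-- (\d{4}) takes the first four digits; the greedy (\d+) takes all remaining digits.
-- Only the first group is replaced, so longer numbers keep more than four digits.
SELECT REGEXP_REPLACE('12345, 9876543210', '(\d{4})(\d+)', '****\2') AS new_text
FROM dual;
-- Returns: ****5, ****543210
```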
SELECT listagg(CASE WHEN REGEXP_LIKE(txt, '\d{5,}') -- if the string is of your desired format
THEN LPAD('*', LENGTH(txt) - 4,'*') || SUBSTR(txt, LENGTH(txt) -3) -- replace all digits but the last 4 with *
ELSE txt END)
within GROUP (ORDER BY lvl)
FROM (SELECT LEVEL lvl, REGEXP_SUBSTR(txt, '(\d+|\D+)', 1, LEVEL ) txt -- Split the string in numerical and non numerical parts
FROM (select 'This is a test text with numbers 12345, 9876543210 and separately number 1234567887654321 all buried within the text' AS txt FROM dual)
CONNECT BY REGEXP_SUBSTR(txt, '(\d+|\D+)', 1, LEVEL ) IS NOT NULL)
Result:
This is a test text with numbers *2345, ******3210 and separately number ************4321 all buried within the text
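The masking expression inside the CASE can be tried on a single number first. This sketch hard-codes one of the numbers from the question as a literal; in the full query the value comes from the txt column.

```sql
-- LENGTH - 4 stars, followed by the last four characters of the number.
SELECT LPAD('*', LENGTH('9876543210') - 4, '*')
       || SUBSTR('9876543210', LENGTH('9876543210') - 3) AS masked
FROM dual;
-- Returns: ******3210
```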
And as your example replaced the first four digits of your first number, you might also want to replace at least 4 digits:
SELECT listagg(CASE WHEN REGEXP_LIKE(txt, '\d{5,}') -- if the string is of your desired format
THEN LPAD('*', GREATEST(LENGTH(txt) - 4, 4),'*') || SUBSTR(txt, GREATEST(LENGTH(txt) -3, 5)) -- replace all digits but the last 4 with *
ELSE txt END)
within GROUP (ORDER BY lvl)
FROM (SELECT LEVEL lvl, REGEXP_SUBSTR(txt, '(\d+|\D+)', 1, LEVEL ) txt -- Split the string in numerical and non numerical parts
FROM (select 'This is a test text with numbers 12345, 9876543210 and separately number 1234567887654321 all buried within the text' AS txt FROM dual)
CONNECT BY REGEXP_SUBSTR(txt, '(\d+|\D+)', 1, LEVEL ) IS NOT NULL)
(Added GREATEST in the second line to replace at least 4 digits.)
Result:
This is a test text with numbers ****5, ******3210 and separately number ************4321 all buried within the text

how to repeat characters in a string

I'm trying to link two tables, one has an 'EntityRef' that's made of four alpha characters and a sequential number...
EntityRef
=========
SWIT1
LIVE32
KIRB48
MEHM38
BRAD192
The table that I'm trying to link to stores the reference in a 15 character field where the 4 alphas are at the start and the numbers are at the end but with zeros in between to make up the 15 characters...
EntityRef
=========
SWIT00000000001
LIVE00000000032
So, to get these to link, my options are to either remove the zeros on one field or add the zeros on the other.
I've gone for the latter as it seems to be a simpler approach and eliminates the risk of getting into problems if the numeric element contains a zero.
So, the alpha is always the 4 characters at the beginning, the number is the remainder, and 15 minus the LEN() of the EntityRef is the number of zeros that I need to insert...
left(entityref,4) as 'Alpha',
right(entityref,len(EntityRef)-4) as 'Numeric',
15-len(EntityRef) as 'No.of Zeros'
Alpha  Numeric  No.of Zeros
=====  =======  ===========
SWIT   1        10
LIVE   32       9
KIRB   48       9
MEHM   38       9
MALL   36       9
So, I need to concatenate the three elements but I don't know how to create the string of zeros to the specified length...how do I do that??
Concat(Alpha, '0'*[No. of Zeros], Numeric)
What is the correct way to repeat a character a specified number of times?
You can use string manipulation. In this case:
LEFT() to get the alpha portion.
REPLICATE() to get the zeros.
STUFF() to get the number.
The query:
select left(val, 4) + replicate('0', 15 - len(val)) + stuff(val, 1, 4, '')
from (values ('SWIT1'), ('ABC12345')) v(val)
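To see what each function contributes, the pieces can be selected separately for one value. This is a sketch in SQL Server syntax using 'LIVE32' from the question as a literal; the column aliases are only for illustration.

```sql
SELECT LEFT('LIVE32', 4)                    AS alpha,   -- 'LIVE'
       REPLICATE('0', 15 - LEN('LIVE32'))   AS zeros,   -- '000000000' (9 zeros)
       STUFF('LIVE32', 1, 4, '')            AS digits,  -- '32'
       LEFT('LIVE32', 4)
         + REPLICATE('0', 15 - LEN('LIVE32'))
         + STUFF('LIVE32', 1, 4, '')        AS padded;  -- 'LIVE00000000032'
```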
You may try left padding with zeroes:
SELECT
LEFT(EntityRef, 4) +
RIGHT('00000000000' + SUBSTRING(ISNULL(EntityRef,''), 5, 30), 11) AS EntityRef
FROM yourTable;
Or cast the numeric part to an integer, which drops the leading zeros:
select *
from t1 inner join t2
on concat(left(t2.EntityRef, 4), cast(right(t2.EntityRef, 11) as bigint)) = t1.EntityRef
I found the answer as soon as I posted the question (sometimes it helps you think it through!).
(left(entityref,4) + replicate('0',15-len(EntityRef)) +
right(entityref,len(EntityRef)-4)),

Can (should) I include elements of TO_CHAR, CONCAT and FM in the same select statement?

I'm following up on a previous post that successfully generated an Oracle SELECT statement. From within that prior script I now need to
concatenate two different fields (numeric values for 3-digit area codes and 7-digit phone numbers), then
format the resulting column as XXX-XXX-XXXX
but my attempts at using TO_CHAR, CONCAT (or ||; I have tried doing the concatenation both ways), and FM in the same line result in invalid number or invalid operator errors (depending on how I've rearranged the elements in the line), painfully reminding me that my barely-elementary scripting shows a significant lack of understanding of proper use and syntax.
The combination of TO_CHAR and CONCAT (||) successfully produces a 10-digit string, but I'm trying to attain a result formatted as XXX-XXX-XXXX from the following (I've edited out the lines from the original script for data elements not relevant to this particular question; nothing in the original query is nested, it just selects several fields and has a series of left joins linking on a common UID field in different tables)
select distinct
cn.dflt_id StudentIdNumber,
to_char (p.area_code || p.phone_no) Phone
from
co_name cn
left join co_v_name_phone1 p on cn.name_id = p.name_id
order by cn.dflt_id
Would anyone offer helpful advice on attaining the desired XXX-XXX-XXXX formatting in the resulting Phone column? My attempts with variants of 'fm999g999g9999' have thus far not been successful.
Thanks,
Scott
Here are a few options that crossed my mind; have a look, pick the one you find the most appropriate. If you still have problems, post your own test case.
RES2 is a simple concatenation of substrings that have a - in between
RES3 uses format mask with an adjusted NLS_NUMERIC_CHARACTERS for thousands
RES4 concatenates area code (which is OK by itself) with regular expression that splits a string into two parts; the first has {3} characters, and the second one has {4} of them
By the way, are area codes really numbers? No leading zeros?
SQL> with test (area_code, phone_number) as
       (select 123, 9884556 from dual union
        select 324, 1254789 from dual
       )
     select
       to_char(area_code) || to_char(phone_number) l_concat,
       --
       substr(to_char(area_code) || to_char(phone_number), 1, 3) ||'-'||
       substr(to_char(area_code) || to_char(phone_number), 4, 3) ||'-'||
       substr(to_char(area_code) || to_char(phone_number), 7)
       res2,
       --
       to_char(to_char(area_code) || to_char(phone_number),
         '000g000g0000', 'nls_numeric_characters=.-') res3,
       --
       to_char(area_code) ||'-'||
       regexp_replace(to_char(phone_number), '(\d{3})(\d{4})', '\1-\2') res4
     from test;
L_CONCAT      RES2          RES3          RES4
------------- ------------- ------------- -------------
1239884556    123-988-4556  123-988-4556  123-988-4556
3241254789    324-125-4789  324-125-4789  324-125-4789
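Applied to the query in the question, the RES4 approach might look like the sketch below. It assumes p.area_code and p.phone_no always hold 3 and 7 digits respectively; rows with no match in the left join would show just '-' in the Phone column, so some NULL handling may be needed.

```sql
select distinct
       cn.dflt_id StudentIdNumber,
       -- area code, a dash, then the 7-digit number split as 3-4
       to_char(p.area_code) || '-' ||
       regexp_replace(to_char(p.phone_no), '(\d{3})(\d{4})', '\1-\2') Phone
from co_name cn
left join co_v_name_phone1 p on cn.name_id = p.name_id
order by cn.dflt_id;
```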

Strange phenomenon with substring and charindex

Comparing the code snippets below, the last row of the 'Result 1' result set has only one character (a comma) after the number, whereas the other rows (1 - 3) have a comma and a digit after the 4 digits that I want.
For 'Result 2', I specifically changed the length of the substring to be two characters shorter, yet in the last row only a single character, the trailing comma, is removed, while in rows 1 - 3 both the trailing comma and the digit are removed. There is no blank space in the last row. Could someone please advise why this is happening?
Code 1:
select substring(c,2,charindex(',',c,2)) as empno
from table t
where len(c) > 1
and substring(c,1,1) = ','
Result 1:
7654,7
7698,7
7782,7
7788,
Code 2:
select substring(c,2,charindex(',',c,2)-2) as empno
from table t
where len(c) > 1
and substring(c,1,1) = ','
Result 2:
7654
7698
7782
7788
*edit: table t is:-
c
----------------------
,7654,7698,7782,7788,
7654,7698,7782,7788,
654,7698,7782,7788,
54,7698,7782,7788,
4,7698,7782,7788,
,7698,7782,7788,
7698,7782,7788,
698,7782,7788,
98,7782,7788,
8,7782,7788,
,7782,7788,
7782,7788,
782,7788,
82,7788,
2,7788,
,7788,
7788,
788,
88,
8,
,
CHARINDEX:
Searches for one character expression inside another and returns the starting position of the first match, or 0 if it is not found.
is defined as:
CHARINDEX ( expressionToFind , expressionToSearch [ , start_location ] )
AND
SUBSTRING:
Returns part of a character, binary, text, or image expression in SQL Server.
is defined as:
SUBSTRING ( expression ,start , length )
Query1:
substring(c,2,charindex(',',c,2))
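Here CHARINDEX, searching from position 2, returns the position of the next ',', which is 6 in every case.
That value is used as the length argument of SUBSTRING, so each result is up to 6 characters long. The last row, ',7788,', has only 5 characters left from position 2 onward, which is why it comes back as '7788,' without an extra digit.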
Query 2:
substring(c,2,charindex(',',c,2)-2)
Here CHARINDEX again returns 6 in every case, but 2 is subtracted from it.
So the length passed to SUBSTRING is now 4, and every row returns exactly 4 characters.
The query1 and query2 columns below show the length value passed to SUBSTRING in each query:
c                      query1  query2
,7654,7698,7782,7788,  6       4
,7698,7782,7788,       6       4
,7782,7788,            6       4
,7788,                 6       4
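The same behaviour can be reproduced with two of the rows as literals. This sketch is in SQL Server syntax and uses a VALUES list in place of table t; the aliases are illustrative.

```sql
SELECT c,
       CHARINDEX(',', c, 2)                       AS comma_pos, -- 6 for both rows
       SUBSTRING(c, 2, CHARINDEX(',', c, 2))      AS code1,     -- '7654,7' and '7788,'
       SUBSTRING(c, 2, CHARINDEX(',', c, 2) - 2)  AS code2      -- '7654'   and '7788'
FROM (VALUES (',7654,7698,7782,7788,'), (',7788,')) AS t(c);
```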

how to add zeros after decimal in Oracle

I want to add zeros after the number.
For example, if a = 6895,
then I want a = 6895.00.
The datatype of a is number(12).
I am using the code below:
select to_char(6895,'0000.00') from dual;
I'm getting the desired result from the above code, but
'6895' can be any number, so I have to add the '0's to the format manually each time.
For example:
select to_char(68955698,'00000000.00') from dual;
Can anyone suggest a better method?
The number format models are the place to start when converting numbers to characters. 0 prepends 0s, which means you'd have to get rid of them somehow. However, 9 means:
Returns value with the specified number of digits with a leading space if positive or with a leading minus if negative. Leading zeros are blank, except for a zero value, which returns a zero for the integer part of the fixed-point number.
So, the following gets you almost there, save for the leading space:
SQL> select to_char(987, '9999.00') from dual;
TO_CHAR(
--------
  987.00
You then need to use a format model modifier, FM, which is described thusly:
FM Fill mode. Oracle uses trailing blank characters and leading zeroes
to fill format elements to a constant width. The width is equal to the
display width of the largest element for the relevant format model
...
The FM modifier suppresses the above padding in the return value of
the TO_CHAR function.
This gives you a format model of fm9999.00, which'll work:
SQL> select to_char(987, 'fm9999.00') from dual;
TO_CHAR(
--------
987.00
If you want a lot of digits before the decimal then simply add a lot of 9s.
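For the NUMBER(12) column in the question, that could look like the sketch below. The 0 in the last integer position is an addition not mentioned above: it keeps a leading zero for values below 1, which a mask made only of 9s would drop.

```sql
-- fm removes the leading padding; twelve integer positions cover NUMBER(12).
SELECT TO_CHAR(6895,     'fm999999999990.00') AS a1,  -- 6895.00
       TO_CHAR(68955698, 'fm999999999990.00') AS a2,  -- 68955698.00
       TO_CHAR(0,        'fm999999999990.00') AS a3   -- 0.00
FROM dual;
```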
datatype of a is number(12);
Then use 12 9s in the format model, and keep the decimal part to just 2 digits. Since the column datatype is NUMBER(12), it cannot hold a number with more than 12 digits.
SQL> WITH DATA AS(
     SELECT 12 num FROM dual UNION ALL
     SELECT 234 num FROM dual UNION ALL
     SELECT 9999 num FROM dual UNION ALL
     SELECT 123456789 num FROM dual)
     SELECT to_char(num,'999999999999D99') FROM DATA
     /
TO_CHAR(NUM,'999
----------------
           12.00
          234.00
         9999.00
    123456789.00
Update Regarding leading spaces
SQL> select ltrim(to_char(549,'999999999999.00')) from dual;
LTRIM(TO_CHAR(54
----------------
549.00
SQL>
Using CASE and SUBSTR, it is very simple to add the missing leading zero when the formatted value starts with a decimal point:
CASE WHEN SUBSTR(COLUMN_NAME,1,1) = '.' THEN '0'||COLUMN_NAME ELSE COLUMN_NAME END