Replace the digits of a number with subsequent higher digits - sql

Given a number, I want to replace each digit with the next digit that is larger. If there is no next larger digit, leave the digit as it is.
E.g. input: 1234, output: 2344
Since Oracle processes data row by row, I first tried to split the digits of the number into rows using the query below.
SELECT REGEXP_SUBSTR ('1234','[[:digit:]]',1,LEVEL) txt
FROM dual
CONNECT BY LEVEL <= length('1234');
The query will give me this result.
TXT
----------------
1
2
3
4
But I am stuck here: how do I compare the rows and replace each digit with the next larger one?
Attempted expansion and clarification based on comments:
Treat the number as a string of digits. For each digit, find the first digit among the remaining digits to the right of the current one that has a higher value than the current digit. That may not be the highest-value digit in the string, or even the highest among all the digits to the right; it is just the first higher value encountered. If there is no higher value, keep the current digit intact. Only following digits are considered; preceding ones are ignored.
Some examples:
1234 -> 2344
1357 -> 3577
1157 -> 5577
1245638 -> 2456888
Breaking down the last one:
Digit 1 is 1; the first digit in the remaining string 245638 that is higher than 1 is 2.
Digit 2 is 2; the first digit in the remaining string 45638 that is higher than 2 is 4.
Digit 3 is 4; the first digit in the remaining string 5638 that is higher than 4 is 5.
Digit 4 is 5; the first digit in the remaining string 638 that is higher than 5 is 6.
Digit 5 is 6; the first digit in the remaining string 38 that is higher than 6 is 8.
Digit 6 is 3; the first digit in the remaining string 8 that is higher than 3 is 8.
Digit 7 is 8; no subsequent digit is higher than 8 so keep existing digit 8.

After some clarification in comments:
WITH t AS (
SELECT LEVEL AS pos,
ROWNUM AS txt_order,
REGEXP_SUBSTR ('1245638','[[:digit:]]',1,LEVEL) AS txt
FROM dual
CONNECT BY LEVEL <= LENGTH('1245638')
),
v AS (
SELECT t1.pos, t1.txt,
MIN(t2.txt) KEEP (DENSE_RANK FIRST ORDER BY t2.pos) as new_txt
FROM t t1
LEFT JOIN t t2 ON t2.pos > t1.pos AND t2.txt > t1.txt
GROUP BY t1.pos, t1.txt
)
SELECT LISTAGG(NVL(new_txt, txt), NULL) WITHIN GROUP (ORDER BY pos) AS OUTPUT
FROM v;
OUTPUT
--------
2456888
The t CTE is just your original query. Now the v CTE is finding the first digit later in the list which is larger than the current one; the nvl uses the current digit if there isn't one larger. The listagg just sticks the digits back together in the right order.
The same logic is available as an SQL Fiddle that uses a recursive CTE instead of the connect-by to generate the digits, so that multiple values from a table can be 'converted' in one go. It gives:
ORIGINAL OUTPUT
---------------------------------------- --------
1234 2344
1157 5577
1357 3577
1245638 2456888
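As a cross-check outside the database, the same rule can be sketched in a few lines of Python. This is only an illustration of the logic described above, not part of the Oracle solution; the function name next_higher_digits is made up for this example.

```python
def next_higher_digits(number: str) -> str:
    """Replace each digit with the first larger digit to its right, if any."""
    result = []
    for pos, digit in enumerate(number):
        # Look only at the digits after the current position and take the
        # first one that is larger; fall back to the digit itself.
        replacement = next((d for d in number[pos + 1:] if d > digit), digit)
        result.append(replacement)
    return "".join(result)


for value in ("1234", "1157", "1357", "1245638"):
    print(value, next_higher_digits(value))
```

Single digits compare correctly as characters, so no conversion to integers is needed here.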

Related

split(regexp_replace ) Like Function In Presto : 331

Is there any way to split values based on consecutive 0's in Presto? The first part must contain at least 6 digits: if the digit count is less than 6, some of the 0's need to be counted as digits before splitting; if the digit count is >= 6, the value just needs to be split into 2 groups.
The query below works as expected in Hive, but I am not able to do the same in Presto.
select low as orginal_Value,
split(regexp_replace(low,'(\\d{6,}?)(0+)$','$1|$2'),'\\|') Output_Value from test;
Presto Query:
presto> SELECT regexp_split('1234567890000', '(\d{6,}?)(0+)$') as output;
output
[1234567890000]
(1 row)
It works now:
select split(regexp_replace('1234567890000','(\d{6,}?)(0+)$','$1|$2'), '|') as output;
output
-------------------
[123456789, 0000]
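To see what the pattern does independently of Presto, here is a small Python sketch of the same replace-then-split idea. It assumes Python's re module treats this particular pattern the same way Presto's regex engine does, which holds for the example above but is not guaranteed in general; the function name split_trailing_zeros is made up for this illustration.

```python
import re


def split_trailing_zeros(value: str) -> list:
    # Lazily take at least 6 digits, then capture the run of zeros at the end,
    # mark the boundary with '|' and split on it.
    marked = re.sub(r"(\d{6,}?)(0+)$", r"\1|\2", value)
    return marked.split("|")


print(split_trailing_zeros("1234567890000"))
print(split_trailing_zeros("120000000"))
```

A value that does not match the pattern (fewer than 7 digits, or no trailing zeros) comes back as a single-element list.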

How to format a numeric value in Postgresql

I have a query that generates a number, shown below:
SELECT MAX(RIGHT ("node_id",3)::numeric) + 1 as newnum from sewers.structures
WHERE(
ST_WITHIN(
ST_CENTROID((ST_SetSRID(structures.geom, 4326))),
ST_SetSRID((SELECT geom FROM sewers."Qrtr_Qrtr_Sections" WHERE "plat_page" = '510A'),4326)) ) and "node_id" != 'PRIVATE' and "node_id" !='PRIV_SAN' and "node_id" !='PRIV_STORM'
When I run this it generates a number based on the previously placed values. The output will be a number of up to 3 digits. I want to take an output of fewer than three digits and force it into a 3-digit format.
For example, if I generate the number 93 I would like to format it as 093. The same goes for single-digit numbers like 2, which I want formatted as 002, and so on. However, if it generates a 3-digit number, I want it to keep the same format, so 121 stays as 121.
If I got your question right, you're looking for lpad():
WITH j (x) AS (
VALUES (2),(121),(93)
)
SELECT lpad(x::text,3,'0') FROM j;
lpad
------
002
121
093
(3 rows)
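Since the output will be a string, you can also use to_char with a format of three 0s. Note that to_char reserves a leading space for the sign, so use the FM prefix (to_char(1,'FM000')) if you need exactly three characters.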
select to_char(1,'000');
to_char
---------
001
(1 row)
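If the padding ever has to happen in application code instead of in the query, the equivalent is a one-liner. A minimal Python sketch, with pad3 as a made-up helper name:

```python
def pad3(n: int) -> str:
    # Zero-pad to at least three characters; longer numbers are left untouched.
    return f"{n:03d}"


for n in (2, 93, 121):
    print(pad3(n))
```

One difference worth knowing: this never shortens a value, whereas lpad(x::text, 3, '0') in PostgreSQL truncates anything longer than three characters.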

create/rank 50 digit number and cast it to varchar

I can't create such a long numeric value (numeric value out of range), so I have to cast it, but it doesn't work.
ID DesiredID
1 100000..1(50 digit long)
2 100000..2(50 digit long)
3 100000..3(50 digit long)
...
999 435345...(50 digit long)
The numbers can have any values, but they need to be 50 digit long and the ID starts from 1 and goes up to a three digit number(999).
I have tried something like
select (100000.....000 + dense_rank() over (order by ID))::varchar(50)
but I am getting the numeric value out of range error. With:
select (1000 + dense_rank() over (order by ID))::varchar(4)
the sql works.
Postgres supports numerics of practically unlimited length (up to 131072 digits before the decimal point), so both of these work:
select '10000000000000000000000000000000000000000000000000000000000001'::numeric
select 10000000000000000000000000000000000000000000000000000000000001
Your code will work fine if the number you start with really has 50 digits.
I will bet two pizzas and a bottle of beer that you have a 1 followed by 50 zeros, totalling 51 digits.
Here's an example of it working:
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=c197e05132fa454b1187201ee28ca39e
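The off-by-one is easy to check with any arbitrary-precision integer type. A quick Python sketch (desired_id is a hypothetical helper, and using 10**49 as the base is an assumption about what the 50-digit starting value should be):

```python
BASE_50_DIGITS = 10 ** 49   # 1 followed by 49 zeros: 50 digits in total
BASE_51_DIGITS = 10 ** 50   # 1 followed by 50 zeros: 51 digits in total


def desired_id(row_id: int, base: int = BASE_50_DIGITS) -> str:
    # Add the rank to the base and render it as text, like the ::varchar cast.
    return str(base + row_id)


print(len(desired_id(1)), len(desired_id(999)))
print(len(str(BASE_51_DIGITS + 1)))
```

With the 50-digit base, every ID from 1 to 999 stays exactly 50 characters long.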

SQLite3 Order by highest/lowest numerical value

I am trying to do a query in SQLite3 to order a column by numerical value. Instead of getting the rows ordered by the numerical value of the column, the rows are ordered alphabetically by the first digit's numerical value.
For example in the query below 110 appears before 2 because the first digit (1) is less than two. However the entire number 110 is greater than 2 and I need that to appear after 2.
sqlite> SELECT digit,text FROM test ORDER BY digit;
1|one
110|One Hundred Ten
2|TWO
3|Three
sqlite>
Is there a way to make 110 appear after 2?
It seems like digit is stored as a string, not as a number. You need to convert it to a number to get the proper ordering. A simple approach uses:
SELECT digit, text
FROM test
ORDER BY digit + 0
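The difference can be reproduced with Python's built-in sqlite3 module. This sketch assumes the digit column was declared as TEXT (the question does not show the table definition) and also shows an explicit CAST as an alternative to digit + 0:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: digit is a TEXT column, which would explain the string ordering.
conn.execute('CREATE TABLE test (digit TEXT, "text" TEXT)')
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("1", "one"), ("110", "One Hundred Ten"), ("2", "TWO"), ("3", "Three")],
)


def ordered(order_by: str) -> list:
    # order_by is a fixed expression chosen below, not user input.
    query = f"SELECT digit FROM test ORDER BY {order_by}"
    return [row[0] for row in conn.execute(query)]


as_text = ordered("digit")
as_number = ordered("digit + 0")
as_cast = ordered("CAST(digit AS INTEGER)")
print(as_text)
print(as_number)
print(as_cast)
```

If the column really should hold numbers, declaring it as INTEGER in the first place avoids the conversion at query time.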

display non-printable ascii characters in SQL as :ascii: or :print: does not work

I am trying to fetch all non-printable ASCII characters from the DESCRIPTION field of a table using SQL in TOAD, but the query below is not working.
select
regexp_instr(a.description,'[^[:ascii:]]') as description from
poline a where a.ponum='XXX' and a.siteid='YYY' and
regexp_instr(a.description,'[^[:ascii:]]') > 0
The above query raised the error ORA-12729: invalid character class in regular expression. I tried :print: instead of :ascii:, but it didn't return any result. Below is the description for this record, which has non-printable characters.
Sherlock 16 x 6.5” Wide Wheelbarrow wheel .M100P.10R – Effluent care bacteria and enzyme formulation
:ascii: is not a valid character class, and even if it were, it doesn't appear to be what you are trying to get here (ASCII does contain non-printable characters). The valid classes are listed in the Oracle regular expression documentation.
Actually if you replace :ascii: with :print: in your original query, it will indeed return the first position in each POLINE.DESCRIPTION that is a non-printable character. (If it returns nothing for you, it may be because your DESCRIPTION data is actually all printable.)
But since you stated that you want to identify every non-printable char in each DESCRIPTION in POLINE, some changes would be needed. I'll include an example that gets every match as a starting place.
In this example, each DESCRIPTION will be decomposed to its individual constituent characters, and each char will be checked for printability. The location within the DESCRIPTION string along with the ASCII number of the non-printable character will be returned.
This example assumes there is a unique identifier for each row in POLINE, here called POLINE_ID.
First, create the test table:
CREATE TABLE POLINE(
POLINE_ID NUMBER PRIMARY KEY,
PONUM VARCHAR2(32),
SITEID VARCHAR2(32),
DESCRIPTION VARCHAR2(256)
);
And load some data. I inserted a couple non-printing chars in the example Sherlock string you provided, #23 and #17. An example string composed of only the first 64 ASCII chars (of which the first 31 are not in :print:) is also included, and some fillers to fall through the PONUM and SITEID predicates.
INSERT INTO POLINE VALUES (1,'XXX','YYY','Sherlock'||CHR(23)||' 16 x 6.5” Wide Wheelbarrow wheel .M100P.10R –'||CHR(17)||' Effluent care bacteria and enzyme formulation');
DECLARE
V_STRING VARCHAR2(64) := CHR(1);
BEGIN
FOR POINTER IN 2..64 LOOP
V_STRING := V_STRING||CHR(POINTER);
END LOOP;
INSERT INTO POLINE VALUES (2, 'XXX','YYY',V_STRING);
INSERT INTO POLINE VALUES (3, 'AAA','BBB',V_STRING);
END;
/
INSERT INTO POLINE VALUES(4,'XXX','YYY','VOLTRON');
Now we have 4 rows total. Three of them contain (multiple) non-printable characters, but only two of them should match all the restrictions.
Then run a query. There are two example queries below. The first uses regular expressions as in your initial example query (with :cntrl: in place of :print:). As an alternative, a second variant is included that just checks whether each char is among the first 31 ASCII chars.
Both example queries will index every char of each DESCRIPTION, check whether it is printable, and collect the ASCII number and location of each non-printable character in each candidate DESCRIPTION. The example table here has DESCRIPTIONs that are 256 chars long, so this is used as the max index in the cartesian join.
Please note, these are not efficient, and are designed to get EVERY match. If you end up only needing the first match after all, your original query with :print: substituted will perform much better. This could also be tuned by dropping into PL/SQL or perhaps going recursive (if PL/SQL is allowed in your use case, or you are on 11gR2+, etc.). Also, some predicates here, such as REGEXP_LIKE, do not impact the end result and serve only as a preliminary filter. These could be superfluous (or worse) for you, depending on your data set.
First example, using regex and :cntrl:
SELECT
POLINE_ID,
STRING_INDEX AS NON_PRINTABLE_LOCATION,
ASCII(REGEXP_SUBSTR(SUBSTR(DESCRIPTION, STRING_INDEX, 1), '[[:cntrl:]]', 1, 1)) AS NON_PRINTABLE_ASCII_NUMBER
FROM POLINE
CROSS JOIN (SELECT LEVEL AS STRING_INDEX
FROM DUAL
CONNECT BY LEVEL < 257) CANDIDATE_LOCATION
WHERE PONUM = 'XXX'
AND SITEID = 'YYY'
AND REGEXP_LIKE(DESCRIPTION, '[[:cntrl:]]')
AND REGEXP_INSTR(SUBSTR(DESCRIPTION, STRING_INDEX, 1), '[[:cntrl:]]', 1, 1, 0) > 0
AND STRING_INDEX <= LENGTH(DESCRIPTION)
ORDER BY 1 ASC, 2 ASC;
Second example, using ASCII numbers:
SELECT
POLINE_ID,
STRING_INDEX AS NON_PRINTABLE_LOCATION,
ASCII(SUBSTR(DESCRIPTION, STRING_INDEX, 1)) AS NON_PRINTABLE_ASCII_NUMBER
FROM POLINE
CROSS JOIN (SELECT LEVEL AS STRING_INDEX
FROM DUAL
CONNECT BY LEVEL < 257) CANDIDATE_LOCATION
WHERE PONUM = 'XXX'
AND SITEID = 'YYY'
AND REGEXP_LIKE(DESCRIPTION, '[[:cntrl:]]')
AND ASCII(SUBSTR(DESCRIPTION, STRING_INDEX, 1)) BETWEEN 1 AND 31
AND STRING_INDEX <= LENGTH(DESCRIPTION)
ORDER BY 1 ASC, 2 ASC;
In our test data, these queries will produce equivalent output. We should expect this to have two hits (for chrs 17 and 23) in the Sherlock DESCRIPTION, and 31 hits for the first-64-ascii DESCRIPTION.
Result:
POLINE_ID NON_PRINTABLE_LOCATION NON_PRINTABLE_ASCII_NUMBER
1 9 23
1 56 17
2 1 1
2 2 2
2 3 3
2 4 4
2 5 5
2 6 6
2 7 7
2 8 8
2 9 9
2 10 10
2 11 11
2 12 12
2 13 13
2 14 14
2 15 15
2 16 16
2 17 17
2 18 18
2 19 19
2 20 20
2 21 21
2 22 22
2 23 23
2 24 24
2 25 25
2 26 26
2 27 27
2 28 28
2 29 29
2 30 30
2 31 31
33 rows selected.
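The expected positions can be double-checked outside Oracle with a short Python sketch. It mirrors the second query (code points 1 to 31) and rebuilds the same test strings; control_chars is a name invented for this illustration, and it assumes characters are counted the way Oracle's LENGTH counts them for this data.

```python
def control_chars(description: str) -> list:
    """Return (1-based position, code point) for every char with code 1..31."""
    return [
        (pos, ord(ch))
        for pos, ch in enumerate(description, start=1)
        if 1 <= ord(ch) <= 31
    ]


# Same test strings as rows 1 and 2 of the example table.
sherlock = (
    "Sherlock" + chr(23)
    + " 16 x 6.5” Wide Wheelbarrow wheel .M100P.10R –" + chr(17)
    + " Effluent care bacteria and enzyme formulation"
)
first_64 = "".join(chr(code) for code in range(1, 65))

print(control_chars(sherlock))
print(len(control_chars(first_64)))
```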
EDIT In response to comments, here is some elaboration on what we can expect from [[:cntrl:]] and [^[:cntrl:]] with regexp_instr.
[[:cntrl:]] will match any of the first 31 ascii characters, while [^[:cntrl:]] is the logical negation of [[:cntrl:]], so it will match anything except the first 31 ascii characters.
To compare these, we can start with the simplest case of only one character, ascii #31. Since there's only one character, the result can only be either match or miss. One will expect the following to return 1 for the match:
SELECT REGEXP_INSTR(CHR(31),'[[:cntrl:]]',1,1,0) AS MATCH_INDEX FROM DUAL;
MATCH_INDEX
1
But 0 for the miss with negating [^[:cntrl:]] :
SELECT REGEXP_INSTR(CHR(31),'[^[:cntrl:]]',1,1,0) AS MATCH_INDEX FROM DUAL;
MATCH_INDEX
0
Now if we include two (or more) characters that are a mix of printable and non-printable, there are more possible outcomes. Both [[:cntrl:]] and [^[:cntrl:]] can match, but they can only match different things. If we move from only ascii #31 to ascii #64 followed by ascii #31, we will still expect [[:cntrl:]] to match, but it should now return 2, since the non-printable character is in the second position.
SELECT REGEXP_INSTR(CHR(64)||CHR(31),'[[:cntrl:]]',1,1,0) AS MATCH_INDEX FROM DUAL;
MATCH_INDEX
2
And now [^[:cntrl:]] also has the opportunity to match (at the first position):
SELECT REGEXP_INSTR(CHR(64)||CHR(31),'[^[:cntrl:]]',1,1,0) AS MATCH_INDEX FROM DUAL;
MATCH_INDEX
1
When there is a mix of printable and control characters, both [[:cntrl:]] and [^[:cntrl:]] can match, but they will match at different indices.
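The same four cases can be replayed in Python as a sanity check. Python's re module has no [[:cntrl:]] class, so this sketch approximates it with the explicit range \x00-\x1f (an assumption that leaves out DEL, chr(127)); match_index is a made-up helper that imitates REGEXP_INSTR's 1-based result with 0 for no match.

```python
import re

# Approximation of [[:cntrl:]] and its negation using an explicit range.
CNTRL = re.compile(r"[\x00-\x1f]")
NOT_CNTRL = re.compile(r"[^\x00-\x1f]")


def match_index(pattern, text: str) -> int:
    # 1-based position of the first match, or 0 when nothing matches.
    found = pattern.search(text)
    return found.start() + 1 if found else 0


print(match_index(CNTRL, chr(31)), match_index(NOT_CNTRL, chr(31)))
print(match_index(CNTRL, chr(64) + chr(31)), match_index(NOT_CNTRL, chr(64) + chr(31)))
```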