While sorting data in ascending order in DataWeave, I noticed that if the input to orderBy() mixes strings that start with digits and strings that start with letters, such as ["6", "7", "a", "8"], the result is ["6","7","8","a"].
Is there a way to have the strings that start with alphabetic characters sorted before the numbers?
By default, digits come before alphabetic characters in Unicode. You can use the criteria parameter of orderBy() to change how numbers are compared with letters. For example, if you prefix each numeric string with {, which is the character just after z (lowercase z), the numbers will sort after any alphabetic characters.
Example:
%dw 2.0
output application/json
import isNumeric from dw::core::Strings
---
payload orderBy (if (isNumeric($)) "{" ++ $ else $ )
Output:
[
"a",
"6",
"7",
"8"
]
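For readers who want to check the idea outside DataWeave, here is a minimal Python sketch of the same prefix trick. It is only an analogy: str.isdigit() stands in for isNumeric, and the sample list is the one from the question.

```python
# Sort letters before digits by giving numeric strings a sort key that
# starts with "{" (code point 123, just after "z" at 122).
data = ["6", "7", "a", "8"]

def sort_key(value: str) -> str:
    # Only the key is prefixed; the original strings are left untouched.
    return "{" + value if value.isdigit() else value

result = sorted(data, key=sort_key)
print(result)  # ['a', '6', '7', '8']
```

Because the prefix is applied only inside the key function, the output contains the original values, just as the DataWeave criteria lambda leaves the payload items unchanged.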
I need to delete rows from a dataframe where a particular column contains strings with numeric substrings. See the shaded column of my dataframe.
Rows whose values have 0E as a prefix, 21 (any two-digit number) as a suffix, or 24A (any two-digit number followed by a letter) as a suffix should be deleted.
Any suggestions?
Thanks in advance.
You can use boolean indexing with a str.contains() regex:
^0E - starts with 0E
\d{2}$ - ends with 2 digits
\d{2}[A-Z]$ - ends with 2 digits and 1 capital letter
col = ... # target column
mask = df[col].str.contains(r'^0E|\d{2}$|\d{2}[A-Z]$')
df = df.loc[~mask]
@tdy gave a good answer, but one place needs to be modified, if I understand it correctly.
For values that end with two digits, or with two digits and a capital letter, the regex should be:
.*\d{2}[A-Z]?$
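As a quick sanity check, the combined pattern can be tried with Python's re module before applying it to the dataframe. This is a sketch only: the sample values are made up for illustration, and keep() is a hypothetical helper. Since str.contains() searches anywhere in the string (like re.search), the leading .* is optional there.

```python
import re

# Prefix "0E", or a suffix of two digits optionally followed by one capital letter.
PATTERN = re.compile(r'^0E|\d{2}[A-Z]?$')

def keep(value: str) -> bool:
    """Return True when the row should stay (the pattern does not match)."""
    return PATTERN.search(value) is None

# Hypothetical sample values, not taken from the question's dataframe.
samples = ["0E123X", "AB21", "XY24A", "ABC", "A1B"]
kept = [s for s in samples if keep(s)]
print(kept)  # ['ABC', 'A1B']
```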
I am trying to retrieve each sequence of 5 numbers / letters inside the brackets, as in this example:
accuracy of action - [1232d, 74294, qw23t, 23d45, 76wer, 12874] march
and from that I want to extract 1232d 74294 qw23t 23d45 76wer 12874
I know that to extract only a single 5-character sequence in square brackets I can use \[[a-z0-9 ]{5,7}\], but I don't know how to retrieve several 5-character sequences.
Right now, since all the words inside [...] consist of 5 alphanumeric chars, you can use
(?:\G(?!^),\s*|\[)(\w+)(?=[^\]\[]*])
See the regex demo.
Details:
(?:\G(?!^),\s*|\[) - either the end of the preceding successful match followed by a comma and zero or more whitespaces, or a [ char
(\w+) - Group 1: one or more word chars
(?=[^\]\[]*]) - followed with zero or more chars other than [ and ] and then a ].
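Note that \G is not available in every regex flavor; Python's built-in re module, for instance, does not support it. A two-step alternative, sketched here under the assumption that every item is exactly 5 word characters, first grabs the bracket contents and then pulls the items out of them. The function name extract_codes is made up for this example.

```python
import re

text = "accuracy of action - [1232d, 74294, qw23t, 23d45, 76wer, 12874] march"

def extract_codes(source: str) -> list[str]:
    codes = []
    # Step 1: capture whatever sits between each pair of square brackets.
    for inner in re.findall(r'\[([^\]\[]*)\]', source):
        # Step 2: collect every standalone run of exactly 5 word characters.
        codes.extend(re.findall(r'\b\w{5}\b', inner))
    return codes

print(" ".join(extract_codes(text)))  # 1232d 74294 qw23t 23d45 76wer 12874
```

Words outside the brackets, such as "march", are ignored because the second search only ever sees the captured bracket contents.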
I have tried the following:
vars.counter as Number {format:'00'}
vars.counter as Number {format:'##'}
vars.counter as String {format:'00'}
vars.counter as String {format:'##'}
None of the above turns 1 into 01.
How can I do this in Mule 4?
Numbers (integers, floating point) don't have a format in DataWeave, as in many other languages. You have to convert them to a String with the desired pattern. I tried the following combinations:
%dw 2.0
output application/json
---
[
1 as String {format:'##'},
1 as String {format:'00'},
1 as String {format:'#0'}
// , 1 as String {format:'0#'} ERROR!
]
Output:
[
"1",
"01",
"1"
]
Only the all zeros combination gives the desired result.
I have input consisting of five-character strings of upper-case English letters, e.g. ABCDE, and I need to convert each one into a unique two-character ASCII output.
E.g. ABCDE and ZZZZZ should give two different outputs.
I have converted from ABCDE into hex which gives me 4142434445, but from this can I get to a two character output value I require?
Example:
INPUT1 = ABCDE
Converted to hex = 4142434445
INPUT2 = 4142434445
OUTPUT = ?? Any 2 ASCII Characters
Other examples of INPUT1 =
BIRAL
BRMAL
KLAAX
So you're starting with a 5-digit base-26 number, and you want to squeeze that into some 2-digit scheme with base n?
All possible 1-5 digit base-26 numbers gives you a number space of 26^5 = 11,881,376.
So you want the minimum n where n^2 >= 11,881,376.
Which gives you 3447 (3446^2 = 11,874,916 falls just short of 11,881,376).
Now it's up to you to go and find a suitable glyph block somewhere in Unicode where you can reliably block out 3447 separate characters to act as your new base/alphabet, and construct a mapping from your 5-char base-26 ABCDE-type number onto your 2-char base-3447 weird-glyph number. Good luck with that.
There's not enough variety in ASCII to do this, since it has only 128 characters in total, of which only 95 are printable. Limiting yourself to 2 chars of ASCII means you can address a number space of at most 128^2 = 16,384.
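The arithmetic can be checked with a short Python sketch. It computes the smallest base n with n^2 >= 26^5, which comes out as 3447 (3446^2 = 11,874,916 is slightly too small), and shows one possible mapping from a 5-letter word to a pair of base-n digits. The function name encode and the choice A=0 ... Z=25 are assumptions made for illustration; picking the actual glyphs is left open, as above.

```python
import math

SPACE = 26 ** 5                        # 11,881,376 possible 5-letter words
MIN_BASE = math.isqrt(SPACE - 1) + 1   # smallest n with n * n >= SPACE

def encode(word: str) -> tuple[int, int]:
    """Map a 5-letter upper-case word to two digits in base MIN_BASE."""
    value = 0
    for ch in word:
        value = value * 26 + (ord(ch) - ord("A"))  # A=0 ... Z=25
    return divmod(value, MIN_BASE)                 # (high digit, low digit)

print(SPACE, MIN_BASE)    # 11881376 3447
print(encode("ABCDE"))    # (5, 1775)
print(encode("ZZZZZ"))    # (3446, 3013)
print(128 ** 2)           # 16384, far fewer than 11,881,376
```

Each digit in the pair would then index into whatever 3447-symbol alphabet is chosen; two ASCII characters can never cover the range.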
I am calling
FORMAT(myNum, '#,###') AS myNum
Which works for 123456789 as the output is 123,456,789
Also works for negative numbers
However, 0 is showing up as a blank field.
How do I get 0 to show up as 0? I am also curious as to why 0 is being removed as the query without the format shows 0 in that column's field when there should be a 0.
Note: I do not need any decimals and would prefer to use the above code if at all possible.
If you want 0 to be displayed when the value is zero, you should use:
FORMAT(myNum, '#,###0') AS myNum
According to this reference:
https://msdn.microsoft.com/en-us/library/ee634206.aspx
0 (zero character)
Digit placeholder. Displays a digit or a zero. If the expression has a digit in the position where the zero appears in the format string, displays the digit; otherwise, displays a zero in that position.
If the number has fewer digits than there are zeros (on either side of the decimal) in the format expression, displays leading or trailing zeros. If the number has more digits to the right of the decimal separator than there are zeros to the right of the decimal separator in the format expression, rounds the number to as many decimal places as there are zeros. If the number has more digits to the left of the decimal separator than there are zeros to the left of the decimal separator in the format expression, displays the extra digits without modification.
# Digit placeholder:
Displays a digit or nothing. If the expression has a digit in the position where the # character appears in the format string, displays the digit; otherwise, displays nothing in that position.
This symbol works like the 0 digit placeholder, except that leading and trailing zeros aren't displayed if the number has fewer digits than there are # characters on either side of the decimal separator in the format expression.