Is this a bug in VBA's IsNumeric and CDbl() functions? - vba

Consider the following VBA function:
Function castAndAdd(inputValue As Variant) As Variant
If IsNumeric(inputValue) Then
castAndAdd = CDbl(inputValue) + 4
Else
castAndAdd = inputValue
End If
End Function
Calling it from the immediate window gives this output:
?castAndAdd("5,7")
61
?castAndAdd("5, 7")
5, 7
Stepping through the "5,7" call, I find that IsNumeric("5,7") returns True. My first thought was that this happens because a comma is used as a decimal separator in Europe. That would still be odd, because I'm in the United States, so my locale should mean that Excel only recognizes a period as a decimal separator, right?
Even if we set aside the Europe/US issue, the bigger problem is that CDbl("5,7") returns 57, so that CDbl("5,7") + 4 returns 61, not 9.7 as I would have expected if the comma is a decimal separator. Is this a bug, or am I just not understanding how to use CDbl()?

The comma is recognized not as a decimal separator but as a thousands separator. The parser does not check that each separator is followed by exactly three digits; it simply strips every thousands separator before interpreting the string as a number.
So even CDbl("4,5,,6,7") yields 4567. All of this holds when the comma is the thousands separator. If, as in some European countries, the point is the thousands separator, the same thing happens with points.
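The stripping behaviour described in this answer can be imitated in a few lines of Python. This is only a sketch of the described effect, not VBA's actual parser; the function name `loose_cdbl` and the `thousands_sep` parameter are made up for illustration.

```python
def loose_cdbl(text: str, thousands_sep: str = ",") -> float:
    """Imitate the lenient parsing described above (illustrative, not VBA's real code)."""
    # Drop every thousands separator, wherever it appears, then parse what is left.
    return float(text.replace(thousands_sep, ""))


print(loose_cdbl("5,7") + 4)   # "5,7" is read as 57 before 4 is added
print(loose_cdbl("4,5,,6,7"))  # separators in odd places are ignored as well
```

In this sketch "5, 7" raises a ValueError because of the embedded space, which loosely mirrors IsNumeric("5, 7") returning False in the question.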


REGEX Extract Amount Without Currency

SELECT
ocr_text,
bucket,
REGEXP_EXTRACT('-?[0-9]+(\.[0-9]+)?', ocr_text)
FROM temp
I am trying to extract amounts from a string that will not have currency present. Any number that does not have decimals should not match. Commas should be allowed assuming they follow the correct rules (at hundreds marker)
56 no (missing decimals)
56.45 yes
120 no (missing decimals)
120.00 yes
1200.00 yes
1,200.00 yes
1,200 no (missing decimals)
1200 no (missing decimals)
134.5 no (decimal not followed by 2 digits)
23,00.00 no (invalid comma location)
I'm a noob to regex, so I know my statement above does not yet meet the criteria I've listed. However, I'm already stuck on the error (INVALID_FUNCTION_ARGUMENT) premature end of char-class on my REGEXP_EXTRACT line.
Can someone point me in the right direction? How can I resolve my current issue? How can I modify to correctly incorporate the other criteria listed?
Here is a general regex pattern for a positive/negative number with two decimal places and optional thousands comma separators:
(?<!\S)(?:-?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{2})|-?[0-9]+(\.[0-9]{2}))(?!\S)
Your updated query:
SELECT
ocr_text,
bucket,
REGEXP_EXTRACT(ocr_text, '(?<!\S)(?:-?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{2})|-?[0-9]+(\.[0-9]{2}))(?!\S)')
FROM temp;
From the Presto docs I read, it supposedly supports Java's regex syntax. In the event that lookarounds are not working, you may try this version:
SELECT
ocr_text,
bucket,
REGEXP_EXTRACT(ocr_text, '(\s|^)(?:-?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{2})|-?[0-9]+(\.[0-9]{2}))(\s|$)')
FROM temp;
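The lookaround pattern from this answer can be checked against the sample values in the question. The sketch below uses Python's `re` module rather than Presto, on the assumption that both engines treat this particular pattern the same way; the helper name `extract_amount` is hypothetical.

```python
import re

# The lookaround pattern from the answer, copied unchanged.
AMOUNT = re.compile(
    r"(?<!\S)(?:-?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{2})|-?[0-9]+(\.[0-9]{2}))(?!\S)"
)


def extract_amount(text):
    """Return the first amount found in text, or None if there is none."""
    match = AMOUNT.search(text)
    return match.group(0) if match else None


samples = ["56", "56.45", "120", "120.00", "1200.00", "1,200.00",
           "1,200", "1200", "134.5", "23,00.00"]
for sample in samples:
    print(sample, "->", extract_amount(sample))
```

Only 56.45, 120.00, 1200.00 and 1,200.00 produce a match, which agrees with the yes/no list in the question.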
REGEXP_EXTRACT(ocr_text, '^[-]?(\d*\.\d*)')
Pattern: ^[-]?(\d*\.\d*)
Explanation:
^ - Start of line
[-]? - With or without negative dash (-)
\d* - 0 or more digits
\. - a decimal point (escaped, because an unescaped . is a special character in regex that matches any character)
\d* - 0 or more digits (the decimal part)
$ - End of the line (not included in the pattern above; append it if nothing may follow the number)
Bonus tip: There are helpful tools online to test your regex!
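To see what the pattern ^[-]?(\d*\.\d*) accepts in practice, here is a small Python check. It is a sketch using Python's `re` module as a stand-in for Presto, and the helper name `first_decimal` is hypothetical.

```python
import re

# The pattern from this answer, with the decimal point escaped.
PATTERN = re.compile(r"^[-]?(\d*\.\d*)")


def first_decimal(text):
    """Return the text matched at the start of the string, or None."""
    match = PATTERN.match(text)
    return match.group(0) if match else None


for sample in ["56.45", "-3.50", "56", "134.5", "1,200.00", "."]:
    print(repr(sample), "->", repr(first_decimal(sample)))
```

Because both digit groups use *, the pattern also accepts "134.5" and even a lone ".", and it rejects "1,200.00", so it does not enforce the two-decimal and comma rules from the question.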
The pattern below extracts numeric values, but it catches every number in the text; restricting it to numbers that appear next to specific letters does not work well. Any suggestions are welcome.
-?\d+\.?\d*
I have used regex for NLP work.

What is significance of trailing "$" in some function names?

I recently looked at some VBA source at Microsoft, Convert Fractions to Decimal Values (https://support.microsoft.com/en-us/kb/185424), and I noticed that several functions had a trailing "$", specifically trim$(), left$(), and mid$(). My question is: what does the "$" signify?
I downloaded the microsoft function and it ran correctly under Excel 2007.
Since VBA trim() works differently from the worksheet function trim(), I wrote a small program to compare the operation of the 3 possible trim() calls. I found that trim() and trim$() produced identical output. worksheetfunction.trim(), of course, produces output that has extraneous space characters removed from inside the string.
I am very curious about the trailing "$", and will be grateful for enlightenment!
Thank you,
Dave
To quote from https://bytes.com/topic/access/answers/196893-difference-between-left-left-function
Allen Browne
The trailing $ is a type declaration character for the String data type in
VBA.
The result returned from Left$() is a string, whereas Left() returns a
Variant.
You must use Left(), not Left$() if there is any chance of Null values,
since the Variant can be Null but the String cannot.
That post has a full worked example
The syntax is a left-over habit from ancient history. In early versions of Basic, variables did not have to be declared; instead, the data type was implied by the name of the variable. Any variable ending with $ was a string and any variable ending with % was an integer.
FORTRAN had a similar convention: any variable whose name started with the letter I, J, K, L, M or N was an integer; all others were real.

Trailing space in i to string conversion in ABAP

On a SAP system, ABAP version 7.40 SP05, I just encountered a failure in unit tests on string comparison, but both strings should be the same?! Turns out it's not the case, as preceding conversion from i to string seems to produce extra trailing space in one of the strings.
This code bit:
DATA(i) = 111.
DATA(s1) = CONV string( i ).
DATA(s2) = '111'.
DATA(s3) = |111|.
Produces (as seen in debugger):
S1 111 3100310031002000 CString{4}
S2 111 310031003100 C(3)
S3 111 310031003100 CString{3}
The converted one has an extra trailing space. How does this happen, and how can I prevent it in i to string conversions? Obviously stuff like this makes me debug for a long time to find out what is up (because unless I check the hex values, the debugger does not show that extra space...).
To understand why the space is added in the first place, check the documentation on the default conversion rules that are applied by CONV:
The character "-" is set at the last position for a negative value,
and a blank is set for a positive value.
Since you can't use the formatting options of string expressions with the CONV operator, I'd suggest changing the code to use |{ i }| (which might be a good idea for other values as well, since you'll probably need some formatting options when comparing date / time values in unit tests anyway).
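The quoted conversion rule can be modelled in Python to show where the extra character comes from. This is only a sketch of the rule as quoted, not of the ABAP runtime; the function name `abap_int_to_string` is hypothetical.

```python
def abap_int_to_string(value: int) -> str:
    """Model the quoted rule: the last position holds '-' or a blank."""
    sign = "-" if value < 0 else " "
    return f"{abs(value)}{sign}"


s1 = abap_int_to_string(111)
print(repr(s1))                      # shows the trailing blank
print(s1.encode("utf-16-le").hex())  # hex dump comparable to the debugger view
print(repr(s1.rstrip()))             # trailing blank removed
```

The hex dump of the modelled string ends in 2000, the same trailing blank that the debugger shows for S1.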
You cannot prevent it. The best way I have found so far in ABAP is to use CONDENSE on the result:
DATA i TYPE i VALUE 12.
DATA idx TYPE string.
idx = i. " idx = '12 '.
CONDENSE idx. " idx = '12'.

How can you format the result of a formula as a number?

I'm using a formula in Excel 2007 to grab a mailbox size from a string. I'm stripping out all of the text before and after, and removing the , characters, yet Excel will not format the result as a number.
Because of this, I can't run any statistics such as total size or average size.
=SUBSTITUTE(MID(MailboxEX01[[#This Row],[TotalItemSize]], FIND("(", MailboxEX01[[#This Row],[TotalItemSize]]) + 1, FIND("bytes", MailboxEX01[[#This Row],[TotalItemSize]]) - (FIND("(", MailboxEX01[[#This Row],[TotalItemSize]]) + 2)), ",", "")
I tried =TEXT({the above}, "#,##0"), which successfully added the , character as the thousands separator, but (I guess unsurprisingly) still failed to format the cell as a number.
Does anybody know how I can force the result in this cell to format as a number? Thanks.
Your formula takes text values, strips stuff out, and replaces stuff. But the result is still text. If the result of that formula contains only the characters 0 to 9, you can coerce it into a number by using
={your formula}+0
or
={your formula}*1
or
=--({your formula})
Using =TEXT({your formula}, "format") will not produce a number. The TEXT() function returns text, as the name implies.
But if you use a mathematical operator on a number that is stored as text (or a text that represents a number), the maths operation will coerce that text value into a real number. The third suggestion uses the double unary (Google that) to coerce the text into a number.
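The same distinction between "text that looks like a number" and a real number exists in most languages. The Python sketch below mirrors the formula's steps (take what sits between "(" and "bytes", drop the commas) and then converts explicitly. The sample strings and the name `mailbox_bytes` are assumptions, since the question does not show the actual TotalItemSize values.

```python
def mailbox_bytes(total_item_size: str) -> int:
    """Extract the byte count from a string such as '512 KB (524,288 bytes)'."""
    start = total_item_size.index("(") + 1
    end = total_item_size.index("bytes")
    # Still text at this point, like the result of SUBSTITUTE(MID(...)).
    digits = total_item_size[start:end].strip().replace(",", "")
    # The explicit conversion plays the role of +0, *1 or the double unary.
    return int(digits)


# Hypothetical sample values.
sizes = ["1.2 GB (1,288,490,188 bytes)", "512 KB (524,288 bytes)"]
values = [mailbox_bytes(s) for s in sizes]
print(sum(values), sum(values) / len(values))
```

Once the values are real numbers, totals and averages work as expected.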

How Do I get this Split Function to Work? (VB.NET)

So, I made a program that for the most part, converts numbers to letters. My problem before was it was converting each individual digit instead of each number e.g. (1-0-1 instead of 101). Someone suggested that I use the Split function:
Dim numbers As String() = DTB.Split(" ")
So now it reads each number all the way through, since it only splits where there is a space in between. My problem now is that it translates, for example, "[102, 103, 104]" as "[102", "103" and "104]", because it only splits where there is a space. Obviously, you can't convert "[102" or "104]" because they aren't actual numbers.
Does anyone have a solution on what I should do to get this to convert no matter the spacing? Would Regex be the way to go?
Use a regular expression with \d+; it matches each run of consecutive digits,
so
12234abcsdf23434
will return two matches
12234
23434
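Here is a minimal sketch of the \d+ approach, written in Python for brevity (in VB.NET the equivalent would go through System.Text.RegularExpressions.Regex.Matches). The helper name `extract_numbers` is hypothetical.

```python
import re


def extract_numbers(text):
    """Return every run of digits in text as an integer, ignoring all other characters."""
    return [int(digits) for digits in re.findall(r"\d+", text)]


print(extract_numbers("[102, 103, 104]"))
print(extract_numbers("12234abcsdf23434"))
```

Because the pattern only looks for digits, brackets, commas and irregular spacing no longer matter.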