String to Int throwing a NumberFormatException - kotlin

I want to convert a String to an Int.
The String in question is 1585489022235.
I first have to convert the value to a String, which appears to work (the println below prints the value shown above).
val id = data.get("id").toString()
println(id)
val toInt = id.toInt()
When I attempt to convert from a String to an Int, this exception is thrown:
Error Message: java.lang.NumberFormatException: For input string: "1585489022235"
Stacktrace:
java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68)
java.base/java.lang.Integer.parseInt(Integer.java:658)
java.base/java.lang.Integer.parseInt(Integer.java:776)
Many thanks in advance

The value "1585489022235" can't be converted to an Int because it is greater than the maximum value of the Int type, which is available as Int.MAX_VALUE and equals 2147483647.
What you can do is convert the string to a Long:
val x = id.toLong()
The maximum value of the Long type is 9223372036854775807, which is available as Long.MAX_VALUE.
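As a rough cross-check of the two limits, here is a small Python sketch. It assumes only that Kotlin's Int is a 32-bit signed integer and Long a 64-bit signed integer; the helper name `fits` is made up for this illustration.

```python
# Signed integer limits for 32-bit (Kotlin Int) and 64-bit (Kotlin Long).
INT_MAX = 2**31 - 1    # 2147483647, same as Int.MAX_VALUE
LONG_MAX = 2**63 - 1   # 9223372036854775807, same as Long.MAX_VALUE


def fits(value: int, max_value: int) -> bool:
    """Return True if value lies in the signed range that ends at max_value."""
    return -max_value - 1 <= value <= max_value


value = int("1585489022235")
print(fits(value, INT_MAX))   # False: too large for a 32-bit Int
print(fits(value, LONG_MAX))  # True: fits in a 64-bit Long
```

The id from the question is about 738 times larger than Int.MAX_VALUE, which is why toInt() throws while toLong() succeeds.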
You can find more here: Basic Types - Numbers

Related

Sum over numeric string values in SQL

I have numeric values stored in a column of a Spark SQL database as strings. I store these numeric values as strings since they can potentially overflow all numeric datatypes (>128bit numbers).
So far, I could use the normal SUM() function to sum up values. I am curious if it's always safe to do this and how I can deal with cases where it doesn't work.
My thoughts:
I think that internally the numeric string values are cast to real numeric datatypes during the summation. In cases where this internal cast fails, the whole summation will fail.
The Decimal type has a maximum precision of 38 digits. This is not actually mentioned in the Spark docs, but you get an error if you try to create a type with a larger precision. The Databricks docs state this explicitly.
Still, 38 digits let you store and safely add numbers of up to 126 bits (log2(10^38) ≈ 126.2).
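// Assumes spark-shell, where spark.implicits._ is already in scope
import org.apache.spark.sql.functions.sum
import org.apache.spark.sql.types.DecimalType
val df = Seq(
  "123456789012345678901234567890",
  "1",
  "100000100000100000100000100001"
).toDF("x")
// Cast the strings to a 38-digit decimal (scale 0) before summing
df.withColumn("x", $"x".cast(new DecimalType(38))).agg(sum($"x")).show(false)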
+------------------------------+
|sum(x)                        |
+------------------------------+
|223456889012445679001234667892|
+------------------------------+
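The 126-bit estimate and the sum above can be checked outside Spark. The sketch below uses Python's decimal module with the precision capped at 38 digits as a stand-in for a 38-digit decimal type; it is only an illustration of the arithmetic, not of how Spark implements it.

```python
import math
from decimal import Decimal, localcontext

# 10**38 lies between 2**126 and 2**127, so any value below 2**126
# fits in 38 decimal digits.
print(math.log2(10**38))  # roughly 126.2
bounds_ok = 2**126 < 10**38 < 2**127

values = [
    "123456789012345678901234567890",
    "1",
    "100000100000100000100000100001",
]

# Sum the strings with arithmetic limited to 38 significant digits.
with localcontext() as ctx:
    ctx.prec = 38
    total = sum((Decimal(v) for v in values), Decimal(0))

print(total)  # 223456889012445679001234667892
```

Note that the precision limit applies to the result as well as the inputs, so a sum of many near-limit values can still exceed 38 digits.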
If you need even more you can create a custom aggregate function operating on strings.
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.functions.udaf
// Sums numeric strings via BigInt, so there is no fixed precision limit
object BigSum extends Aggregator[String, String, String] {
  def zero: String = "0"
  // Add one input value to the running buffer
  def reduce(buffer: String, x: String): String = (BigInt(buffer) + BigInt(x)).toString
  // Combine two partial buffers
  def merge(b1: String, b2: String): String = (BigInt(b1) + BigInt(b2)).toString
  def finish(b: String): String = b
  def bufferEncoder: Encoder[String] = Encoders.STRING
  def outputEncoder: Encoder[String] = Encoders.STRING
}
val big_sum = udaf(BigSum)
df.agg(big_sum($"x")).show(false)
+------------------------------+
|bigsum$(x)                    |
+------------------------------+
|223456889012445679001234667892|
+------------------------------+
PS. My BigSum is not very efficient because it converts back and forth between String and BigInt. It would be better to use BigInt as the buffer type, but I didn't know how to write an Encoder[BigInt]...
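The zero / reduce / merge pattern of the aggregator can be sketched in plain Python, whose int type has arbitrary precision. The split into two partitions and the helper name `add_strings` are assumptions made for this illustration; Spark decides the real partitioning.

```python
from functools import reduce

ZERO = "0"  # plays the role of the aggregator's zero value


def add_strings(a: str, b: str) -> str:
    """Add two decimal strings using Python's arbitrary-precision int."""
    return str(int(a) + int(b))


# Hypothetical partitions of the input column.
partitions = [
    ["123456789012345678901234567890", "1"],
    ["100000100000100000100000100001"],
]

# "reduce" step: fold each partition into a partial sum.
partials = [reduce(add_strings, part, ZERO) for part in partitions]

# "merge" step: combine the partial sums into the final result.
total = reduce(add_strings, partials, ZERO)
print(total)  # 223456889012445679001234667892
```

Because addition is associative and commutative, the result does not depend on how the values are split across partitions.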

Getting TypeError trying to convert Object values to Integer or Datetime Value Year

This is the dataframe
The column movie_df["year"] was created from extracting values from the title column:
movie_df["year"] = movie_df["title"].str.extract(r"(\d+)")
Line of code
Error
In Python, int is a built-in that converts a string to an integer, so when you pass int itself you are actually passing a callable, as the error describes. According to the pandas documentation, .astype takes the dtype name as a string. For example:
movie_df["year"].astype("int32")
Documentation:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.astype.html
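For comparison, here is a plain-Python sketch of the same two steps: extract the first run of digits from each title, then convert it to an integer. The sample titles and the helper name `first_number` are made up for illustration. A title without digits yields None here; in pandas, str.extract yields NaN in that case, which can make an integer conversion fail.

```python
import re

# Hypothetical sample titles; the real dataframe is not shown in the question.
titles = ["Toy Story (1995)", "Untitled"]


def first_number(title: str):
    """Return the first run of digits in title as an int, or None if absent."""
    match = re.search(r"(\d+)", title)
    return int(match.group(1)) if match else None


years = [first_number(t) for t in titles]
print(years)  # [1995, None]
```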

Select columns based on exact row value matches

I am trying to select columns of a specified integer value (24). However, my new dataframe includes all values equal to and greater than 24. I have tried converting from integer to both float and string, and it gives the same results. Writing "24" and 24 gives the same result. The dataframe is loaded from a .csv file.
data_PM1_query24 = data_PM1.query('hours == "24"' and 'averages_590_nm_minus_blank > 0.3')
data_PM1_sorted24 = data_PM1_query24.sort_values(by=['hours', 'averages_590_nm_minus_blank'])
data_PM1_sorted24
What am I missing here?
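Please try the code below. In your call, Python evaluates 'hours == "24"' and 'averages_590_nm_minus_blank > 0.3' before query ever sees it, and the and operator applied to two non-empty strings returns the second one, so only the averages condition was applied. Both conditions have to go into a single query string. I'm assuming that the data type of "hours" and "averages_590_nm_minus_blank" is float. If it is not, convert them to float.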
data_PM1_query24 = data_PM1.query('hours == 24 & averages_590_nm_minus_blank > 0.3')
Or you can also use:
data_PM1_query24 = data_PM1[(data_PM1.hours == 24) & (data_PM1.averages_590_nm_minus_blank > 0.3)]
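The reason the original call ignored the hours condition can be seen without pandas. This short sketch shows what Python's and operator returns for two non-empty strings; the variable names are chosen only for illustration.

```python
first = 'hours == "24"'
second = 'averages_590_nm_minus_blank > 0.3'

# `and` returns its second operand when the first one is truthy,
# and every non-empty string is truthy.
passed_to_query = first and second
print(passed_to_query)  # averages_590_nm_minus_blank > 0.3

# Putting both conditions inside one string keeps them both.
combined = "hours == 24 & averages_590_nm_minus_blank > 0.3"
print("hours" in passed_to_query, "hours" in combined)  # False True
```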
Hope this solves your query!

FormatNumber replacing number with 0

Not understanding this:
Number returned from DataReader: 185549633.66000035
We have a requirement to keep a number of decimal places chosen by the user.
For example: maintain 7 places.
We are using:
FormatNumber(dr.Item("Field"), 7, TriState.False, , TriState.True)
The result is: 185,549,633.6600000.
We would like to maintain the 3 (or 35) at the end.
When we subtract two numbers from the resulting query we get a delta, but showing the two numbers to 6, 7 or 8 decimal places does not work, so the user sees what looks like a false delta.
Any advice would be appreciated.
Based on my testing, you must be working with Double values rather than Decimal. Not surprisingly, the solution to your problem can be found in the documentation.
For a start, you should not be using FormatNumber. We're not in VB6 anymore, Toto. To format a number in VB.NET, call ToString on that number. I tested this:
Dim dbl = 185549633.66000035R
Dim dec = 185549633.66000035D
Dim dblString = dbl.ToString("n7")
Dim decString = dec.ToString("n7")
Console.WriteLine(dblString)
Console.WriteLine(decString)
and I saw the behaviour you describe, i.e. the output was:
185,549,633.6600000
185,549,633.6600004
I read the documentation for the Double.ToString method (note that FormatNumber would be calling ToString internally) and this is what it says:
By default, the return value only contains 15 digits of precision although a maximum of 17 digits is maintained internally. If the value of this instance has greater than 15 digits, ToString returns PositiveInfinitySymbol or NegativeInfinitySymbol instead of the expected number. If you require more precision, specify format with the "G17" format specification, which always returns 17 digits of precision, or "R", which returns 15 digits if the number can be represented with that precision or 17 digits if the number can only be represented with maximum precision.
I then tested this:
Dim dbl = 185549633.66000035R
Dim dblString16 = dbl.ToString("G16")
Dim dblString17 = dbl.ToString("G17")
Console.WriteLine(dblString16)
Console.WriteLine(dblString17)
and the result was:
185549633.6600004
185549633.66000035
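The same effect is a property of 64-bit floating point in general, not of VB.NET. As a sketch in Python, where float is a 64-bit double and decimal.Decimal stands in for a decimal type: 15 significant digits drop the trailing digits, 17 significant digits are enough to recover the stored value, and a decimal type keeps the digits as written.

```python
from decimal import Decimal

text = "185549633.66000035"
as_float = float(text)      # 64-bit double, like VB.NET's Double
as_decimal = Decimal(text)  # exact decimal, comparable to VB.NET's Decimal

short = format(as_float, ".15g")  # 15 significant digits
full = format(as_float, ".17g")   # 17 significant digits

print(short, full)
print(float(short) == as_float)  # False: 15 digits lose the tail
print(float(full) == as_float)   # True: 17 digits round-trip a double
print(as_decimal)                # 185549633.66000035
```

If the exact decimal digits matter to the user, reading the value as a decimal type in the first place avoids the problem rather than formatting around it.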

Format a number to display a comma when larger than a thousand

I am writing some code in Visual Basic.net and have a question.
If I have a long number, that is larger than 1000, how can I format this value to be 1,000 (with a comma) and for this to be stored in a string?
For example:
1234 will be stored as 1,234
12345 will be stored as 12,345
123456 will be stored as 123,456
Is this done with a TryParse statement?
May I have some help to do this?
Take a look at The Numeric ("N") Format Specifier
General use:
Dim dblValue As Double = -12445.6789
Console.WriteLine(dblValue.ToString("N", CultureInfo.InvariantCulture))
' Displays -12,445.68
If you are only using integers then the following:
Dim numberString As String = 1234.ToString("N0")
This gives numberString = "1,234", because the "N0" format does not add any digits after the decimal point.
For those wanting to do a currency with commas and decimals use the following: .ToString("$0,00.00")
Using string interpolation ($ notation), shown here in C#:
int myvar = 12345;
Console.WriteLine($"Here is my number: {myvar:N0}");
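For readers comparing across languages, the same grouping can be sketched in Python, where the "," option of the format mini-language inserts thousands separators. This is only an analogue of the "N0" and "N" specifiers, not the .NET API.

```python
# Integers with thousands separators, like "N0".
numbers = (1234, 12345, 123456)
formatted = [f"{n:,}" for n in numbers]
print(formatted)  # ['1,234', '12,345', '123,456']

# Two decimal places with grouping, like "N" under the invariant culture.
with_decimals = f"{-12445.6789:,.2f}"
print(with_decimals)  # -12,445.68
```

In both cases the result is a string, which answers the "stored in a string" part of the question; no parsing step is involved.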