I need to call scala.math.pow to calculate a number, but I'm having trouble with a column that I created in Spark SQL and cast to a double.
This is the line I use to call the power function.
scala.math.pow(pr, $"numinLinks")
I have a Spark SQL DataFrame with a column that I attempted to cast to a double using this UDF.
val toDouble = udf[Double, Int]( _.toDouble)
Then I called this on my data frame.
val joinDFAdjusted = join.withColumn("numInLinks", toDouble(joinDF("numInLinks")))
In the schema, my column shows up as StructField(numInLinks, Double, true). This is the error I receive:
found: org.apache.spark.sql.ColumnName
required: Double
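Just use the pow function from org.apache.spark.sql.functions, which works on columns (scala.math.pow only accepts plain Double values, hence the type error):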
import org.apache.spark.sql.functions.pow
join.withColumn("numInLinksExp", pow($"pr", $"numinLinks"))
As the title states, when creating a table you define each variable with a data type, like:
CREATE TABLE ExampleTable (
ID INTEGER,
NAME VARCHAR(200),
Integerandfloat
)
Question: You can define a variable as an integer or as a float, etc. However, is there a data type that can hold both, an integer as well as a float number?
Some databases support variant data types that can have an arbitrary type. For instance, SQL Server has sql_variant.
Most databases also allow you to create your own data type (using create type). However, the power of that functionality depends on the database.
For the choice between a float and an integer, there isn't much choice. An 8-byte floating point representation covers all 4-byte integers, so you can just use a float. However, float is generally not very useful in relational databases. Fixed-point representations (numeric/decimal) are more common and might also do what you want.
Just store it using float.
Think of it this way: you have two variables, one of integer type (let's call it i) and another of float type (let's call it f).
If you do:
i = 0.55
RESULT -> i = 0
But if you have:
f = 0.55
RESULT -> f = 0.55
This way you can also store an integer value in f:
f = 1
RESULT -> f = 1
I am trying to convert/cast a string of scientific notation (for example, '9.62809864308e-05') into a float in SQL.
I tried the standard method: CONVERT(FLOAT, x) where x = '9.62809864308e-05', but it returns the error message: Unimplemented fixed char conversion function - bpchar_float8:2585.
What I'm doing is very straightforward. My table has 2 columns: ID and rate (with rate being the string scientific notation that I am trying to cast to float). I added a 3rd column to my table and tried to populate the 3rd column with the float representation of x:
UPDATE my_table
SET 3rd_column = CONVERT(FLOAT, 2nd_column)
The data type of 2nd_column is CHAR(20).
Furthermore, not every string float is in scientific notation -- some are in normal float notation. So I'm wondering if there is a built in function that can take care of all of this.
Thank you!
It turns out that for any string representation of a float x, say x = '0.00023' or x = '2.3e-04', CONVERT(FLOAT, x) will convert x from char (string) to float.
The reason it didn't work for me was that my strings contained whitespace.
I am new to Elm, and to functional programming in general. I was getting a puzzling type mismatch when doing division inside a call to 'show'. This code produces the mismatch:
import Graphics.Element exposing (..)
columns = 2
main = placePiece 10
placePiece: Int -> Element
placePiece index =
  show (index/columns)
The code produces this error:
Type mismatch between the following types on line 9, column 3 to 22:
Int
Float
It is related to the following expression:
show (index / columns)
Which I read to mean that it expects an Int, but got a Float. But show works with any type. If I use floor to force the division into an Int, I get the same error. But if I hard-code the numbers, e.g. show (10/2), it works fine.
So what part of the code above is expecting to get an Int?
Reason for the error
Actually in this case the compiler is expecting a Float but getting an Int. The Int is the argument index of the placePiece function, and it expects a Float because Basics.(/) expects Float arguments.
Why literal numbers work
When you just hard code numbers, the compiler can figure out that although you're using whole numbers, you may want to use them as Float instead of Int.
Fixing the error
There are three ways to fix this error. If you really want to accept an Int but want floating point division, you'll have to turn the integer into a floating point number:
import Graphics.Element exposing (..)
columns = 2
main = placePiece 10
placePiece: Int -> Element
placePiece index =
  show (toFloat index / columns)
If you're ok with the placePiece function taking a floating point number you can change the type signature:
import Graphics.Element exposing (..)
columns = 2
main = placePiece 10
placePiece: Float -> Element
placePiece index =
  show (index/columns)
If you wanted integer division, you can use the Basics.(//) operator:
import Graphics.Element exposing (..)
columns = 2
main = placePiece 10
placePiece: Int -> Element
placePiece index =
  show (index//columns)
I can't figure out how to invoke a Java UDF that accepts a Tuple as input.
gsmCell = LOAD '$gsmCell' using PigStorage('\t') as
(branchId,
cellId: int,
lac: int,
lon: double,
lat: double
);
gsmCellFiltered = FILTER gsmCell BY cellId is not null and
lac is not null and
lon is not null and
lat is not null;
gsmCellFixed = FOREACH gsmCellFiltered GENERATE FLATTEN (pig.parser.GSMCellParser(* ) ) as
(cellId: int,
lac: int,
lon: double,
lat: double,
);
When I wrap the input for GSMCellParser in (), this is what I get inside the UDF:
Tuple(Tuple).
Pig wraps all the fields into a tuple and puts it inside one more tuple.
When I try to pass a list of fields, or use * or $0.., I get an exception:
Caused by: org.apache.pig.impl.logicalLayer.validators.TypeCheckerException: ERROR 1045:
<line 28, column 57> Could not infer the matching function for pig.parser.GSMCellParser as multiple or none of them fit. Please use an explicit cast.
at org.apache.pig.newplan.logical.visitor.TypeCheckingExpVisitor.visit(TypeCheckingExpVisitor.java:761)
at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:88)
at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.visitor.TypeCheckingRelVisitor.visitExpressionPlan(TypeCheckingRelVisitor.java:191)
at org.apache.pig.newplan.logical.visitor.TypeCheckingRelVisitor.visit(TypeCheckingRelVisitor.java:157)
at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:246)
What am I doing wrong?
My aim is to feed my UDF with a tuple. The tuple should contain a list of fields (i.e. the size of the tuple should be 4: cellid, lac, lon, lat).
Update:
I've tried GROUP ALL:
--filter non valid records
gsmCellFiltered = FILTER gsmCell BY cellId is not null and
lac is not null and
lon is not null and
lat is not null and
azimuth is not null and
angWidth is not null;
gsmCellFilteredGrouped = GROUP gsmCellFiltered ALL;
--fix records
gsmCellFixed = FOREACH gsmCellFilteredGrouped GENERATE FLATTEN (pig.parser.GSMCellParser($1)) as
(cellId: int,
lac: int,
lon: double,
lat: double,
azimuth: double,
ppw,
midDist: double,
maxDist,
cellType: chararray,
angWidth: double,
gen: chararray,
startAngle: double
);
Caused by: org.apache.pig.impl.logicalLayer.validators.TypeCheckerException: ERROR 1045:
<line 27, column 64> Could not infer the matching function for pig.parser.GSMCellParser as multiple or none of them fit. Please use an explicit cast.
The input schema for this UDF is: Tuple
I don't get the idea.
A tuple is an ordered set of fields. The LOAD function returns a tuple to me.
I want to pass the whole tuple to my UDF.
From the signature of the T EvalFunc<T>.eval(Tuple) method, you can see that all EvalFunc UDFs are passed a Tuple - this tuple contains all the arguments passed to the UDF.
In your case, calling GSMCellParser(*) means that the first argument of the Tuple will be the current tuple being processed (hence the tuple in a tuple).
Conceptually, if you want the tuple to contain just the fields, you should invoke it as GSMCellParser(cellid, lac, lat, lon); the Tuple passed to the eval func would then have a schema of (int, int, double, double). This also makes your Tuple coding easier, as you don't have to fish the fields out of the passed 'tuple in a tuple'. Instead, you know that field 0 is the cellid, field 1 is the lac, etc.
I'm trying to divide one number by another and then immediately ceil() the result. These would normally be variables, but for simplicity let's stick with constants.
If I try any of the following, I get 3 when I want to get 4.
double num = ceil(25/8); // 3
float num = ceil(25/8); // 3
int num = ceil(25/8); // 3
I've read through a few threads on here (tried the nextafter() suggestion from this thread) as well as other sites and I don't understand what's going on. I've checked and my variables are the numbers I expect them to be and I've in fact tried the above, using constants, and am still getting unexpected results.
Thanks in advance for the help. I'm sure it's something simple that I'm missing but I'm at a loss at this point.
This is because you are doing integer arithmetic. The value is already 3 before ceil is called, because 25 and 8 are both integers: 25/8 is calculated first using integer arithmetic, evaluating to 3.
Try:
double value = ceil(25.0/8);
This will ensure the compiler treats the constant 25.0 as a floating point number.
You can also use an explicit cast to achieve the same result:
double value = ceil(((double)25)/8);
This is because the expression is evaluated before being passed as an argument to the ceil function. You need to cast one of the operands to a double first, so that the result is a floating-point value when it is passed to ceil.
double num = ceil((double)25/8);