Tagged Unions in Elm

I'm reading http://elm-lang.org/guide/model-the-problem and want to better understand Tagged Unions in Elm. Specifically I came across this example:
type Scale = Normal | Logarithmic
type Widget
    = ScatterPlot (List (Int, Int))
    | LogData (List String)
    | TimePlot Scale (List (Time, Int))
The way I think it's interpreted is as follows:
Scale is a type with 2 possible values: Normal or Logarithmic
Widget is a type with 3 possible values: ScatterPlot, LogData, or TimePlot
However, how do I interpret the (List (Int, Int)) part in ScatterPlot? Similarly, how do I interpret the Scale (List (Time, Int)) part in TimePlot?

List is a built-in type, taking one parameter (another type) and meaning "a list containing values of this type as its elements". So List (Int, Int) is a list of (Int, Int). So what's (Int, Int)?
In general any (a, b) is a tuple with members of type a and type b. A tuple is a bit like a record without field names, so you can only distinguish elements by their position - however unlike a list the elements can be of different types. So (Int, Int) is a tuple containing two Ints, where Int is just an integer.
Thus, List (Int, Int) is a list of tuples of two integers.
With TimePlot you've actually got two parameters, each of a different type: Scale and List (Time, Int). The latter should now make sense given the explanation of List (Int, Int); the only difference is that the tuple has Time as its first type instead of Int.
So TimePlot takes two types as parameters, and it becomes a TimePlot Scale (List (Time, Int)).
In Elm and related languages, type notation (and function application) are defined such that any expression a b c d means a with parameters b, c, and d. If c d is meant to be one parameter it is put in parentheses.
As Andreas says, think of the union 'tags' as functions. They really are functions; they are usually called data constructors (or value constructors). TimePlot is a function taking a Scale and a List (Time, Int) and returning a Widget. Normal takes no parameters, so it is simply a value of type Scale, and so on.
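To make the "tags behave like functions" reading concrete, here is a minimal sketch. The alias for Time is an assumption added only so the snippet stands alone (the guide takes Time from Elm's Time module), and the names scatter, timePlot, normalPlot and example are hypothetical.

```elm
module Widgets exposing (..)

-- Assumption: a stand-in alias so the sketch is self-contained.
type alias Time = Float

type Scale = Normal | Logarithmic

type Widget
    = ScatterPlot (List ( Int, Int ))
    | LogData (List String)
    | TimePlot Scale (List ( Time, Int ))

-- A tag can be used anywhere a function of this type is expected.
scatter : List ( Int, Int ) -> Widget
scatter =
    ScatterPlot

timePlot : Scale -> List ( Time, Int ) -> Widget
timePlot =
    TimePlot

-- Like any function, a tag can be partially applied.
normalPlot : List ( Time, Int ) -> Widget
normalPlot =
    TimePlot Normal

example : Widget
example =
    normalPlot [ ( 0.0, 1 ), ( 1.5, 4 ) ]
```

The annotations are the point here: if the compiler accepts them, the tags have exactly the function types described above.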

Just think of them as function signatures. So a ScatterPlot must be created like this:
ScatterPlot [(1,1), (2,2)]
and you pattern match on it in a case expression like this (every branch must return the same type, so each one returns a length here):
case widget of
    ScatterPlot l -> List.length l -- l has type List (Int, Int)
    LogData l -> List.length l -- l has type List String
    TimePlot scale l -> List.length l -- scale has type Scale, l has type List (Time, Int)

Related

List split in Elm

Write a function to split a list into two lists. The length of the first part is specified by the caller.
I am new to Elm, so I am not sure if my reasoning is correct. I think that I need to transform the input list into an array so that I can slice it at the provided index. I am struggling a bit with the syntax as well. Here is my code so far:
listSplit: List a -> Int -> List(List a)
listSplit inputList nr =
let myArray = Array.fromList inputList
in Array.slice 0 nr myArray
So I am thinking to return a list containing 2 lists(first one of the specified length), but I am stuck in the syntax. How can I fix this?
Alternative implementation:
split : Int -> List a -> (List a, List a)
split i xs =
    (List.take i xs, List.drop i xs)
I'll venture a simple recursive definition, since a big part of learning functional programming is understanding recursion (which foldl is just an abstraction of):
split : Int -> List a -> (List a, List a)
split splitPoint inputList =
    splitHelper splitPoint inputList []
{- We use a typical trick here, where we define a helper function
   that requires some additional arguments. -}
splitHelper : Int -> List a -> List a -> (List a, List a)
splitHelper splitPoint inputList leftSplitList =
    case inputList of
        [] ->
            -- This is a base case, we end here if we ran out of elements
            (List.reverse leftSplitList, [])
        head :: tail ->
            if splitPoint > 0 then
                -- This is the recursive case
                -- Note the typical trick here: we are shuffling elements
                -- from the input list and putting them onto the
                -- leftSplitList.
                -- This will reverse the list, so we need to reverse it back
                -- in the base cases
                splitHelper (splitPoint - 1) tail (head :: leftSplitList)
            else
                -- here we got to the split point,
                -- so the rest of the list is the output
                (List.reverse leftSplitList, inputList)
Use List.foldl
split : Int -> List a -> (List a, List a)
split i xs =
    let
        f : a -> (List a, List a) -> (List a, List a)
        f x (p, q) =
            if List.length p >= i then
                (p, q++[x])
            else
                (p++[x], q)
    in
        List.foldl f ([], []) xs
When list p reaches the desired length, append element x to the second list q.
Append element x to list p otherwise.
Normally in Elm, you use List for a sequence of values. Array is used specifically for fast indexing access.
When dealing with lists in functional programming, try to think in terms of map, filter, and fold. They should be all you need.
To return a pair of something (e.g. two lists), use a tuple. Elm supports tuples of up to three elements.
Additionally, there is a function splitAt in the List.Extra package that does exactly the same thing, although it is better to roll your own for the purpose of learning.
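For completeness, the Array-based attempt from the question can also be made to type-check. This is only a sketch that keeps the question's signature (a list containing two lists) and converts each slice back to a list; the module name is hypothetical, and the tuple-returning versions above are still the more idiomatic choice.

```elm
module ListSplit exposing (listSplit)

import Array


listSplit : List a -> Int -> List (List a)
listSplit inputList nr =
    let
        myArray =
            Array.fromList inputList
    in
    -- First slice: elements before index nr.
    -- Second slice: elements from index nr to the end.
    [ Array.toList (Array.slice 0 nr myArray)
    , Array.toList (Array.slice nr (Array.length myArray) myArray)
    ]
```

With this, listSplit [1, 2, 3, 4] 2 gives [[1, 2], [3, 4]]. Note that Array.slice treats negative indexes as offsets from the end of the array, so for a negative nr this behaves differently from the List.take / List.drop version.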

Create Table variable datatype that would allow to save integer/floats [SQL]

As the title states: when creating a table and defining a column with its datatype, like this:
CREATE TABLE ExampleTable (
    ID INTEGER,
    NAME VARCHAR(200),
    Integerandfloat
);
Question: You can define a column as integer or as float, etc. However, is there a datatype that can hold both integer and float values?
Some databases support variant data types that can have an arbitrary type. For instance, SQL Server has sql_variant.
Most databases also allow you to create your own data type (using create type). However, the power of that functionality depends on the database.
For the choice between a float and an integer, there isn't much choice. An 8-byte floating point representation covers all 4-byte integers, so you can just use a float. However, float is generally not very useful in relational databases. Fixed-point representations (numeric/decimal) are more common and might also do what you want.
Just store it using float.
Think of it this way: you have two variables, one of integer type (let's call it i) and another of float type (let's call it f).
If you do:
i = 0.55
RESULT -> i = 0
But if you have:
f = 0.55
RESULT -> f = 0.55
So you can also store an integer value in f:
f = 1
RESULT -> f = 1

Range of two arbitrary numbers

In Kotlin, one can create a range of two numbers by writing a..b, but a <= b is necessary for the range to be non-empty.
Is there a short way for creating the range "between" two arbitrary numbers?
The logic for this would be: min(a,b)..max(a,b)
There's no short way built into the standard library, I'm afraid. But you can easily add your own. Your question gives one way (min and max here are the ones from kotlin.math, which need to be imported):
fun rangeBetween(a: Int, b: Int) = min(a, b) .. max(a, b)
And here's another:
fun rangeBetween(a: Int, b: Int) = if (a > b) a downTo b else a .. b
(They both behave the same for in checks, but differ in the iteration order: the first one always counts up from the lower to the higher, while the latter will count up or down from the first number to the second.)
Unfortunately those can't be made generic, as both the min()/max() functions and the type of range are different for Ints, Longs, Bytes, Shorts, etc. But you could add overloads for other types if needed.
(I don't know why Kotlin is so fussy about distinguishing ascending and descending ranges. You'd think that this was a fairly common case, and that it would be a simplification to allow ranges to count up or down as needed.)

What does comparable mean in Elm?

I'm having trouble understanding what exactly a comparable is in Elm. Elm seems as confused as I am.
On the REPL:
> f1 = (<)
<function> : comparable -> comparable -> Bool
So f1 accepts comparables.
> "a"
"a" : String
> f1 "a" "b"
True : Bool
So it seems String is comparable.
> f2 = (<) 1
<function> : comparable -> Bool
So f2 accepts a comparable.
> f2 "a"
As I infer the type of values flowing through your program, I see a conflict
between these two types:
comparable
String
So String is and is not comparable?
Why is the type of f2 not number -> Bool? What other comparables can f2 accept?
Normally when you see a type variable in a type in Elm, this variable is unconstrained. When you then supply something of a specific type, the variable gets replaced by that specific type:
-- say you have a function:
foo : a -> a -> a -> Int
-- then once you give a value with an actual type to foo, all occurrences of `a` are replaced by that type:
value : Float
foo value : Float -> Float -> Int
comparable is a type variable with a built-in special meaning. That meaning is that it will only match against "comparable" types, like Int, String and a few others. But otherwise it should behave the same. So I think there is a little bug in the type system, given that you get:
> f2 "a"
As I infer the type of values flowing through your program, I see a conflict
between these two types:
comparable
String
If the bug weren't there, you would get:
> f2 "a"
As I infer the type of values flowing through your program, I see a conflict
between these two types:
Int
String
EDIT: I opened an issue for this bug
Compare any two comparable values. Comparable values include String, Char, Int, Float, Time, or a list or tuple containing comparable values. These are also the only values that work as Dict keys or Set members.
taken from the elm docs here.
In older Elm versions:
Comparable types includes numbers, characters, strings,
lists of comparable things, and tuples of comparable things. Note that
tuples with 7 or more elements are not comparable; why are your tuples
so big?
This means that:
[(1,"string"), (2, "another string")] : List (Int, String) -- is comparable
But having
(1, "string", True) : (Int, String, Bool) -- or...
[(1,True), (2, False)] : List (Int, Bool ) -- are ***not comparable yet***.
This issue is discussed here
Note: Usually people encounter problems with the comparable type when they try to use a union type as a Key in a Dict.
Tags and Constructors of union types are not comparable. So the following doesn't even compile.
type SomeUnion = One | Two | Three
Dict.fromList [ (One, "one related"), (Two, "two related") ] : Dict SomeUnion String
Usually when you try to do this, there is a better approach to your data structure. But until this gets decided - an AllDict can be used.
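One common workaround, which is not the AllDict approach mentioned above but a different technique, is to convert each union value to a comparable key by hand and use that as the Dict key. A minimal sketch, where the module name and the helpers toKey, related and lookup are hypothetical:

```elm
module UnionKeys exposing (..)

import Dict exposing (Dict)


type SomeUnion
    = One
    | Two
    | Three


-- Map every tag to a String, which is comparable.
toKey : SomeUnion -> String
toKey value =
    case value of
        One ->
            "One"

        Two ->
            "Two"

        Three ->
            "Three"


-- The Dict is keyed by String rather than by SomeUnion.
related : Dict String String
related =
    Dict.fromList
        [ ( toKey One, "one related" )
        , ( toKey Two, "two related" )
        ]


-- Lookups go through the same conversion.
lookup : SomeUnion -> Maybe String
lookup value =
    Dict.get (toKey value) related
```

The cost of this design is that the Dict type no longer mentions SomeUnion, so keeping every access behind helpers like lookup is what stops unrelated strings from being used as keys.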
I think this question can be related to this one. Int and String are both comparable in the sense that strings can be compared to strings and ints can be compared to ints. A function that can take any two comparables would have a signature comparable -> comparable -> ... but within any one evaluation of the function both of the comparables must be of the same type.
I believe the reason f2 is confusing above is that 1 is a number instead of a concrete type (which seems to stop the compiler from recognizing that the comparable must be of a certain type, probably should be fixed). If you were to do:
i = 4 // 2
f1 = (<) i -- type Int -> Bool
f2 = (<) "a" -- type String -> Bool
you would see it actually does collapse comparable to the correct type when it can.

Can't invoke Java UDF which accepts Tuple input

I can't understand the way to invoke Java UDF which accepts Tuple as input.
gsmCell = LOAD '$gsmCell' using PigStorage('\t') as
(branchId,
cellId: int,
lac: int,
lon: double,
lat: double
);
gsmCellFiltered = FILTER gsmCell BY cellId is not null and
lac is not null and
lon is not null and
lat is not null;
gsmCellFixed = FOREACH gsmCellFiltered GENERATE FLATTEN (pig.parser.GSMCellParser(* ) ) as
(cellId: int,
lac: int,
lon: double,
lat: double,
);
When I wrap input for GSMCellParser using () I get inside UDF:
Tuple(Tuple).
Pig wraps all fields into a tuple and puts it inside one more tuple.
When I try to pass a list of fields, or use * or $0.., I get an exception:
Caused by: org.apache.pig.impl.logicalLayer.validators.TypeCheckerException: ERROR 1045:
<line 28, column 57> Could not infer the matching function for pig.parser.GSMCellParser as multiple or none of them fit. Please use an explicit cast.
at org.apache.pig.newplan.logical.visitor.TypeCheckingExpVisitor.visit(TypeCheckingExpVisitor.java:761)
at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:88)
at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.visitor.TypeCheckingRelVisitor.visitExpressionPlan(TypeCheckingRelVisitor.java:191)
at org.apache.pig.newplan.logical.visitor.TypeCheckingRelVisitor.visit(TypeCheckingRelVisitor.java:157)
at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:246)
What am I doing wrong?
My aim is to feed my UDF with a tuple. The tuple should contain a list of fields (i.e. the size of the tuple should be 4: cellId, lac, lon, lat).
UPD:
I've tried GROUP ALL:
--filter non valid records
gsmCellFiltered = FILTER gsmCell BY cellId is not null and
lac is not null and
lon is not null and
lat is not null and
azimuth is not null and
angWidth is not null;
gsmCellFilteredGrouped = GROUP gsmCellFiltered ALL;
--fix records
gsmCellFixed = FOREACH gsmCellFilteredGrouped GENERATE FLATTEN (pig.parser.GSMCellParser($1)) as
(cellId: int,
lac: int,
lon: double,
lat: double,
azimuth: double,
ppw,
midDist: double,
maxDist,
cellType: chararray,
angWidth: double,
gen: chararray,
startAngle: double
);
Caused by: org.apache.pig.impl.logicalLayer.validators.TypeCheckerException: ERROR 1045:
<line 27, column 64> Could not infer the matching function for pig.parser.GSMCellParser as multiple or none of them fit. Please use an explicit cast.
The input schema for this UDF is: Tuple
I don't get the idea.
A tuple is an ordered set of fields. The LOAD function returns a tuple to me.
I want to pass the whole tuple to my UDF.
From the signature of the T EvalFunc<T>.eval(Tuple) method, you can see that all EvalFunc UDFs are passed a Tuple - this tuple contains all the arguments passed to the UDF.
In your case, calling GSMCellParser(*) means that the first field of the Tuple will be the current tuple being processed (hence the tuple in a tuple).
Conceptually, if you want the tuple to contain just the fields, you should invoke it as GSMCellParser(cellId, lac, lat, lon); the Tuple passed to the eval function would then have a schema of (int, int, double, double). This also makes your UDF code easier, as you don't have to fish the fields out of the passed 'tuple in a tuple'; you know that field 0 is the cellId, field 1 is the lac, etc.