Pattern behind shapeless Aux classes - api

While studying the shapeless and spray libraries, I've seen many inner Aux types, traits, objects and classes. It's not hard to see that they are used to augment the existing internal API; it looks much like a "companion object pattern" for factories and helper methods. Example from HList sources:
trait Length[-L <: HList] {
type Out <: Nat
def apply() : Out
}
trait LengthAux[-L <: HList, N <: Nat] {
def apply() : N
}
object Length {
implicit def length[L <: HList, N <: Nat](implicit length : LengthAux[L, N]) = new Length[L] {
type Out = N
def apply() = length()
}
}
object LengthAux {
import Nat._
implicit def hnilLength = new LengthAux[HNil, _0] {
def apply() = _0
}
implicit def hlistLength[H, T <: HList, N <: Nat](implicit lt : LengthAux[T, N], sn : Succ[N]) = new LengthAux[H :: T, Succ[N]] {
def apply() = sn
}
}

In the case of Length, for instance, the Length trait is the shape we hope to end up with, because it conveniently has the length encoded as a type member, but that isn't a convenient form for doing an implicit search. So an "Aux" class is introduced which takes the result type, named Out in the Length trait, and adds it to the type parameters of LengthAux as N, which is the length. Once this result is encoded into the actual type of the trait, we can search for LengthAux instances in implicit scope, knowing that if we find one with the L we are searching for, it will have the correct length as its N parameter.


Dataframe processing generically using Scala

I understand the code below, and it was helpful.
But I would like to turn it into a generic approach. I cannot get started, and I suspect it may not be possible with the case statement. I am looking at another approach, but I am interested in whether a generic approach is also possible here.
import spark.implicits._
import org.apache.spark.sql.Encoders
// Creating case classes with the schema of your json objects. We're making
// these to make use of strongly typed Datasets. Notice that the MyChgClass has
// each field as an Option: this will enable us to choose between "chg" and
// "before"
case class MyChgClass(b: Option[String], c: Option[String], d: Option[String])
case class MyFullClass(k: Int, b: String, c: String, d: String)
case class MyEndClass(id: Int, after: MyFullClass)
// Creating schemas for the from_json function
val chgSchema = Encoders.product[MyChgClass].schema
val beforeSchema = Encoders.product[MyFullClass].schema
// Your dataframe from the example
val df = Seq(
(1, """{"b": "new", "c": "new"}""", """{"k": 1, "b": "old", "c": "old", "d": "old"}""" ),
(2, """{"b": "new", "d": "new"}""", """{"k": 2, "b": "old", "c": "old", "d": "old"}""" )
).toDF("id", "chg", "before")
// Parsing the json string into our case classes and finishing by creating a
// strongly typed dataset with the .as[] method
val parsedDf = df
.withColumn("parsedChg",from_json(col("chg"), chgSchema))
.withColumn("parsedBefore",from_json(col("before"), beforeSchema))
.drop("chg")
.drop("before")
.as[(Int, MyChgClass, MyFullClass)]
// Mapping over our dataset with a lot of control of exactly what we want. Since
// the "chg" fields are options, we can use the getOrElse method to choose
// between either the "chg" field or the "before" field
val output = parsedDf.map{
case (id, chg, before) => {
MyEndClass(id, MyFullClass(
before.k,
chg.b.getOrElse(before.b),
chg.c.getOrElse(before.c),
chg.d.getOrElse(before.d)
))
}
}
output.show(false)
parsedDf.printSchema()
We have many such situations, but with differing payloads. I can get the fields of the case class, but I cannot see the forest for the trees when it comes to making this generic, e.g. a [T] type approach for the code below. Can this be done at all?
I can get a List of attributes, and am wondering whether something like attrList.map(x => ...) with substitution can be used for chg.b etc.?
val output = parsedDf.map{
case (id, chg, before) => {
MyEndClass(id, MyFullClass(
before.k,
chg.b.getOrElse(before.b),
chg.c.getOrElse(before.c),
chg.d.getOrElse(before.d)
))
}
}
Does the following macro work for your use case?
// libraryDependencies += scalaOrganization.value % "scala-reflect" % scalaVersion.value
import scala.language.experimental.macros
import scala.reflect.macros.blackbox
def mkInstance[A, B](before: A, chg: B): A = macro mkInstanceImpl[A]
def mkInstanceImpl[A: c.WeakTypeTag](c: blackbox.Context)(before: c.Tree, chg: c.Tree): c.Tree = {
import c.universe._
val A = weakTypeOf[A]
val classAccessors = A.decls.collect {
case m: MethodSymbol if m.isCaseAccessor => m
}
val arg = q"$before.${classAccessors.head}"
val args = classAccessors.tail.map(m => q"$chg.${m.name}.getOrElse($before.$m)")
q"new $A($arg, ..$args)"
}
// in a different subproject
val output = parsedDf.map{
case (id, chg, before) => {
MyEndClass(id,
mkInstance(before, chg)
)
}
}
// scalacOptions += "-Ymacro-debug-lite"
// scalac: new MyFullClass(before.k, chg.b.getOrElse(before.b), chg.c.getOrElse(before.c), chg.d.getOrElse(before.d))
https://scastie.scala-lang.org/bXq5FHb3QuC5PqlhZOfiqA
Alternatively you can use Shapeless
// libraryDependencies += "com.chuusai" %% "shapeless" % "2.3.10"
import shapeless.{Generic, HList, LabelledGeneric, Poly2}
import shapeless.ops.hlist.{IsHCons, Mapped, ZipWith}
import shapeless.ops.record.Keys
def mkInstance[A, B, L <: HList, H, T <: HList, OptT <: HList, L1 <: HList, T1 <: HList, T2 <: HList, K <: HList](
before: A, chg: B
)(implicit
// checking that field names in tail of A are equal to field names in B
aLabelledGeneric: LabelledGeneric.Aux[A, L1],
bLabelledGeneric: LabelledGeneric.Aux[B, T2],
isHCons1: IsHCons.Aux[L1, _, T1],
keys: Keys.Aux[T1, K],
keys1: Keys.Aux[T2, K],
// checking that field types in B are Options of field types in tail of A
aGeneric: Generic.Aux[A, L],
isHCons: IsHCons.Aux[L, H, T],
mapped: Mapped.Aux[T, Option, OptT],
bGeneric: Generic.Aux[B, OptT],
zipWith: ZipWith.Aux[OptT, T, getOrElsePoly.type, T],
): A = {
val aHList = aGeneric.to(before)
aGeneric.from(isHCons.cons(isHCons.head(aHList), zipWith(bGeneric.to(chg), isHCons.tail(aHList))))
}
object getOrElsePoly extends Poly2 {
implicit def cse[A]: Case.Aux[Option[A], A, A] = at(_ getOrElse _)
}
Since all the classes are now known at compile time, it's better to use compile-time reflection (macros themselves, or macros hidden inside type classes as in Shapeless), but in principle runtime reflection can also be used:
import scala.reflect.ClassTag
import scala.reflect.runtime.{currentMirror => rm}
import scala.reflect.runtime.universe._
def mkInstance[A: TypeTag : ClassTag, B: TypeTag : ClassTag](before: A, chg: B): A = {
val A = typeOf[A]
val B = typeOf[B]
val classAccessors = A.decls.collect {
case m: MethodSymbol if m.isCaseAccessor => m
}.toList
val arg = rm.reflect(before).reflectMethod(classAccessors.head)()
val args = classAccessors.tail.map(m =>
rm.reflect(chg).reflectMethod(B.decl(m.name).asMethod)()
.asInstanceOf[Option[_]].getOrElse(
rm.reflect(before).reflectMethod(m)()
)
)
rm.reflectClass(A.typeSymbol.asClass)
.reflectConstructor(A.decl(termNames.CONSTRUCTOR).asMethod)(arg :: args : _*)
.asInstanceOf[A]
}
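For comparison, the same "iterate over the field names, prefer chg, fall back to before" idea can be sketched in Python with runtime introspection. This is only an illustration of the generic merge, not a Spark solution: the dataclasses below mirror the case classes from the question, and mk_instance is a hypothetical helper that treats None as "no change".

```python
from dataclasses import dataclass, fields
from typing import Optional, TypeVar

T = TypeVar("T")


@dataclass
class MyChgClass:
    b: Optional[str] = None
    c: Optional[str] = None
    d: Optional[str] = None


@dataclass
class MyFullClass:
    k: int
    b: str
    c: str
    d: str


def mk_instance(before: T, chg: object) -> T:
    """Build a new record of before's type, preferring non-None values from chg."""
    merged = {}
    for f in fields(before):
        # Fields that chg does not have (such as the key k) fall back to before.
        override = getattr(chg, f.name, None)
        merged[f.name] = getattr(before, f.name) if override is None else override
    return type(before)(**merged)


before = MyFullClass(k=1, b="old", c="old", d="old")
chg = MyChgClass(b="new", c="new")
result = mk_instance(before, chg)
print(result)  # MyFullClass(k=1, b='new', c='new', d='old')
```

Like the runtime-reflection version above, this checks nothing at compile time: a field-name or type mismatch between the two classes only shows up when the code runs.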

Generic transpose (or anything else really!) in Kotlin

Working on an Advent of Code puzzle, I found myself defining a function to transpose matrices of integers:
fun transpose(xs: Array<Array<Int>>): Array<Array<Int>> {
val cols = xs[0].size // 3
val rows = xs.size // 2
var ys = Array(cols) { Array(rows) { 0 } }
for (i in 0..rows - 1) {
for (j in 0..cols - 1)
ys[j][i] = xs[i][j]
}
return ys
}
It turns out that in the following puzzle I also needed to transpose a matrix, but it wasn't a matrix of Ints, so I tried to generalize. In Haskell I would have had something of type
transpose :: [[a]] -> [[a]]
and to replicate that in Kotlin I tried the following:
fun transpose(xs: Array<Array<Any>>): Array<Array<Any>> {
val cols = xs[0].size
val rows = xs.size
var ys = Array(cols) { Array(rows) { Any() } } // maybe this is the problem?
for (i in 0..rows - 1) {
for (j in 0..cols - 1)
ys[j][i] = xs[i][j]
}
return ys
}
This seems ok but it isn't. In fact, when I try calling it on the original matrix of integers I get Type mismatch: inferred type is Array<Array<Int>> but Array<Array<Any>> was expected.
The thing is, I don't really understand this error message: I thought Any was a supertype of anything else?
Googling around, I gathered that I should use some sort of type constraint syntax (sorry, not sure that's what it's called in Kotlin), thus changing the signature to fun <T: Any> transpose(xs: Array<Array<T>>): Array<Array<T>>, but then at the return line I get Type mismatch: inferred type is Array<Array<Any>> but Array<Array<T>> was expected.
So my question is, how do I write a transpose matrix that works on any 2-dimensional array?
As you pointed out yourself, the line Array(cols) { Array(rows) { Any() } } creates an Array<Array<Any>>, so if you use it in your generic function, you won't be able to return it when Array<Array<T>> is expected.
Instead, you should make use of this lambda to directly provide the correct value for the correct index (instead of initializing to arbitrary values and replacing all of them):
inline fun <reified T> transpose(xs: Array<Array<T>>): Array<Array<T>> {
val cols = xs[0].size
val rows = xs.size
return Array(cols) { j ->
Array(rows) { i ->
xs[i][j]
}
}
}
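The same "compute each cell from its indices instead of filling in placeholders" idea, sketched in Python for comparison. The type variable T and the empty-matrix guard are additions for this sketch; Python lists need no reified element type, so the Kotlin-specific part of the problem does not arise here.

```python
from typing import List, TypeVar

T = TypeVar("T")


def transpose(xs: List[List[T]]) -> List[List[T]]:
    """Return the transpose of a rectangular matrix given as a list of rows."""
    if not xs:
        return []
    rows, cols = len(xs), len(xs[0])
    # Each output cell is computed from its indices, so no placeholder value is needed.
    return [[xs[i][j] for i in range(rows)] for j in range(cols)]


print(transpose([[1, 2, 3], [4, 5, 6]]))    # [[1, 4], [2, 5], [3, 6]]
print(transpose([["a", "b"], ["c", "d"]]))  # [['a', 'c'], ['b', 'd']]
```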
"I don't really understand this error message: I thought Any was a supertype of anything else?"
This is because arrays in Kotlin are invariant in their element type. If you don't know about generic variance, it describes how the subtyping hierarchy of a generic type relates to the subtyping hierarchy of its type arguments.
For example, assume you have a type Foo<T>. Now, the fact that Int is a subtype of Any doesn't necessarily imply that Foo<Int> is a subtype of Foo<Any>. You can look up the jargon, but essentially you have 3 possibilities here:
We say that Foo is covariant in its type argument T if Foo<Int> is a subtype of Foo<Any> (Foo types "vary the same way" as T)
We say that Foo is contravariant in its type argument T if Foo<Int> is a supertype of Foo<Any> (Foo types "vary the opposite way" compared to T)
We say that Foo is invariant in its type argument T if neither of the above holds
Arrays in Kotlin are invariant. Kotlin's read-only List, however, is covariant in the type of its elements. This is why it's ok to assign a List<Int> to a variable of type List<Any> in Kotlin.

How to make deeply nested function call polymorphic?

So I have a custom programming language, and in it I am doing some math formalization/modeling. In this instance I am doing basically this (a pseudo-javascript representation):
isIntersection([1, 2, 3], [1, 2], [2, 3]) // => true
isIntersection([1, 2, 3], [1, 2, 3], [3, 4, 5]) // => false
function isIntersection(setTest, setA, setB) {
i = 0
while (i < setTest.length) {
let t = setTest[i]
if (includes(setA, t) || includes(setB, t)) {
i++
} else {
return false
}
}
return true
}
function includes(set, element) {
for (x in set) {
if (isEqual(element, x)) {
return true
}
}
return false
}
function isEqual(a, b) {
if (a is Set && b is Set) {
return isSetEqual(a, b)
} else if (a is X... && b is X...) {
return isX...Equal(a, b)
} ... {
...
}
}
function isSetEqual(a, b) {
i = 0
while (i < a.length) {
let x = a[i]
let y = b[i]
if (!isEqual(x, y)) {
return false
}
i++
}
return true
}
The isIntersection is checking isEqual, and isEqual is configured to be able to handle all kinds of cases of equality check, from sets compared to sets, objects to objects, X's to X's, etc..
The question is, how can we make the isEqual somehow ignorant of the implementation details? Right now you have to have one big if/else/switch statement for every possible type of object. If we add a new type, we have to modify this gigantic isEqual method to add support for it. How can we avoid this, and just define them separately and cleanly?
I was thinking initially of making the objects be "instances of classes" so to speak, with class methods. But I like the purity of having everything just be functions and structs (objects without methods). Is there any way to implement this sort of thing without using classes with methods, instead keeping it just functions and objects?
If not, then how would you implement it with classes? Would it just be something like this?
class Set {
isEqual(set) {
i = 0
while (i < this.length) {
let x = this[i]
let y = set[i]
if (!x.isEqual(y)) {
return false
}
i++
}
return true
}
}
This would mean every object would have to have an isEqual defined on it. How does Haskell handle such a system? Basically looking for inspiration on how this can be most cleanly done. I want to ideally avoid having classes with methods.
Note: You can't just delegate to == native implementation (like assuming this is in JavaScript). We are using a custom programming language and are basically trying to define the meaning of == in the first place.
Another approach is to pass around an isEqual function along with everything somehow, though I don't really see how to do this and if it were possible it would be clunky. So not sure what the best approach is.
Haskell leverages its type and type-class system to deal with polymorphic equality.
The relevant code is
class Eq a where
(==) :: a -> a -> Bool
The English translation is: a type a implements the Eq class if, and only if, it defines a function (==) which takes two inputs of type a and outputs a Bool.
Generally, we declare certain "laws" that type-classes should abide by. For example, x == y should be identical to y == x in all cases, and x == x should never be False. There's no way for the compiler to check these laws, so one typically just writes them into the documentation.
Once we have defined the typeclass Eq in the above manner, we have access to the (==) function (which can be called using infix notation - ie, we can either write (==) x y or x == y). The type of this function is
(==) :: forall a . Eq a => a -> a -> Bool
In other words, for every a that implements the typeclass Eq, (==) is of type a -> a -> Bool.
Consider an example type
data Boring = Dull | Uninteresting
The type Boring has two proper values, Dull and Uninteresting. We can define the Eq implementation as follows:
instance Eq Boring where
Dull == Dull = True
Dull == Uninteresting = False
Uninteresting == Uninteresting = True
Uninteresting == Dull = False
Now, we will be able to evaluate whether two elements of type Boring are equal.
ghci> Dull == Dull
True
ghci> Dull == Uninteresting
False
Note that this is very different from JavaScript's notion of equality. It's not possible to compare elements of different types using (==). For example,
ghci> Dull == 'w'
<interactive>:146:9: error:
* Couldn't match expected type `Boring' with actual type `Char'
* In the second argument of `(==)', namely 'w'
In the expression: Dull == 'w'
In an equation for `it': it = Dull == 'w'
When we try to compare Dull to the character 'w', we get a type error because Boring and Char are different types.
We can thus define
includes :: Eq a => [a] -> a -> Bool
includes [] _ = False
includes (x:xs) element = element == x || includes xs element
We read this definition as follows:
includes is a function that, for any type a which implements equality testing, takes a list of as and a single a and checks whether the element is in the list.
If the list is empty, then includes list element will evaluate to False.
If the list is not empty, we write the list as x : xs (a list with the first element as x and the remaining elements as xs). Then x:xs includes element iff either x equals element, or xs includes element.
We can also define
instance Eq a => Eq [a] where
[] == [] = True
[] == (_:_) = False
(_:_) == [] = False
(x:xs) == (y:ys) = x == y && xs == ys
The English translation of this code is:
Consider any type a such that a implements the Eq class (in other words, so that (==) is defined for type a). Then [a] also implements the Eq type class - that is, we can use (==) on two values of type [a].
The way that [a] implements the typeclass is as follows:
The empty list equals itself.
An empty list does not equal a non-empty list.
To decide whether two non-empty lists (x:xs) and (y:ys) are equal, check whether their first elements are equal (aka whether x == y). If the first elements are equal, check whether the remaining elements are equal (whether xs == ys) recursively. If both of these are true, the two lists are equal. Otherwise, they're not equal.
Notice that we're actually using two different ==s in the implementation of Eq [a]. The equality x == y is using the Eq a instance, while the equality xs == ys is recursively using the Eq [a] instance.
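One way to picture what these instances amount to is to pass the equality function around explicitly, which is roughly the "pass an isEqual along with everything" idea from the question. The Python sketch below is only an illustration under that reading: int_eq, list_eq and includes are hypothetical names, and int_eq leans on Python's built-in == as the primitive.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
Eq = Callable[[A, A], bool]


def int_eq(x: int, y: int) -> bool:
    """Primitive equality for ints, playing the role of a base instance."""
    return x == y


def list_eq(elem_eq: Eq) -> Eq:
    """Given equality on elements, build equality on lists of those elements."""
    def eq(xs: List[A], ys: List[A]) -> bool:
        if len(xs) != len(ys):
            return False
        return all(elem_eq(x, y) for x, y in zip(xs, ys))
    return eq


def includes(eq: Eq, xs: List[A], element: A) -> bool:
    """Check membership using the supplied equality, like `Eq a => [a] -> a -> Bool`."""
    return any(eq(element, x) for x in xs)


nested_eq = list_eq(list_eq(int_eq))  # equality on lists of lists of ints
print(nested_eq([[1], [2, 3]], [[1], [2, 3]]))       # True
print(includes(list_eq(int_eq), [[1, 2], [3]], [3]))  # True
```

Here the caller has to build and pass nested_eq by hand; in Haskell the compiler picks the matching instances from the types.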
In practice, defining Eq instances is typically so simple that Haskell lets the compiler do the work. For example, if we had instead written
data Boring = Dull | Uninteresting deriving (Eq)
Haskell would have automatically generated the Eq Boring instance for us. Haskell also lets us derive other type classes like Ord (where the functions (<) and (>) are defined), Show (which allows us to turn our data into Strings), and Read (which allows us to turn Strings back into our data type).
Keep in mind that this approach relies heavily on static types and type-checking. Haskell makes sure that we only ever use the (==) function when comparing elements of the same type. The compiler also always knows at compile time which definition of (==) to use in any given situation because it knows the types of the values being compared, so there is no need to do any sort of dynamic dispatch (although there are situations where the compiler will choose to do dynamic dispatch).
If your language uses dynamic typing, this method will not work and you'll be forced to use dynamic dispatch of some variety if you want to be able to define new types. If you use static typing, you should definitely look into Haskell's type class system.
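For the dynamically typed case, one possible form of dynamic dispatch that keeps everything as plain functions and structs is a registry keyed by runtime type. The following Python sketch assumes exactly that design; register_equality, is_equal and Point are hypothetical names, and unregistered types simply raise a KeyError.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Registry mapping a runtime type to the function that compares two values of it.
_EQUALITY: Dict[type, Callable[[Any, Any], bool]] = {}


def register_equality(tp: type):
    """Decorator that records fn as the equality function for values of type tp."""
    def decorator(fn):
        _EQUALITY[tp] = fn
        return fn
    return decorator


def is_equal(a: Any, b: Any) -> bool:
    # Values of different types are never equal; otherwise look up the handler.
    if type(a) is not type(b):
        return False
    return _EQUALITY[type(a)](a, b)


@register_equality(int)
def int_equal(a, b):
    return a == b  # primitive case, assumed to be built in


@register_equality(list)
def list_equal(a, b):
    return len(a) == len(b) and all(is_equal(x, y) for x, y in zip(a, b))


@dataclass(eq=False)  # plain struct: no generated equality method
class Point:
    x: int
    y: int


@register_equality(Point)
def point_equal(a, b):
    return is_equal(a.x, b.x) and is_equal(a.y, b.y)


print(is_equal([Point(1, 2), 3], [Point(1, 2), 3]))  # True
print(is_equal([1, 2], [1, 3]))                      # False
```

Adding a new type means registering one more function; is_equal itself never changes.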

Kotlin - Type of `if` and `when` Expressions

I understand that Kotlin is a statically typed language, and that all types are determined at compile time.
Here is a when expression that returns different types:
fun main(){
val x = readLine()?.toInt() ?: 0
val y = when(x){
1 -> 42
2 -> "Hello"
else -> 3.14F
}
println(y::class.java)
}
During runtime (Kotlin 1.3.41 on JVM 1.8) this is the output:
When x = 1, it prints class java.lang.Integer
When x = 2, it prints class java.lang.String
Otherwise, it prints class java.lang.Float
When does the compiler determine the type of y? Or, how does the compiler infer the type of y at compile time?
Actually, the type of the when expression resolves to Any in this case, so the y variable can hold any value. The IDE even warns you that "Conditional branch result of type X is implicitly cast to Any"; at least Android Studio does, as does Kotlin Playground.
The type of that variable is Any (the closest common supertype of all those types), but the underlying value is untouched.
What does that mean? You can safely access only the members that are common to all those types, that is, only the members available on Any. The ::class.java property is available for all types.
See this example, where I use some other types to better illustrate what this is about.
abstract class FooGoo {
fun foogoo(): String = "foo goo"
}
class Foo: FooGoo() {
fun foo(): String = "foo foo"
}
class Goo: FooGoo() {
fun goo(): String = "goo goo"
}
class Moo {
fun moo(): String = "moo moo"
}
fun main(x: Int) {
val n = when (x) {
0 -> Foo()
1 -> Goo()
else -> throw IllegalStateException()
} // n is implicitly cast to FooGoo, as it's the closest common superclass of Foo and Goo
// n now exposes only the methods of FooGoo, so only `foogoo` can be called (plus the methods of Any)
val m = when (x) {
0 -> Foo()
1 -> Goo()
else -> Moo()
} // m is implicitly cast to Any, as there is no common supertype except Any
// m now exposes only the methods of Any, but the underlying object is unchanged,
// so `m::class.java` will return the real class of that object.
println(m::class.java) // The real type of m is not erased; we can still access it
if (m is FooGoo) {
m.foogoo() // After the `is` check, m is smart-cast to FooGoo, so its methods can be used.
}
}
During compile-time, the inferred type of y is Any which is the supertype of all types in Kotlin. During run-time, y can reference [literally] any type of object. The IDE generates a warning "Conditional branch result of type Int/String/Float is implicitly cast to Any".
In the example,
When x = 1, it refers to an object of type java.lang.Integer.
When x = 2, it refers to an object of type java.lang.String.
Otherwise, it refers to an object of type java.lang.Float.
Thanks Slaw for the quick explanation:
There's a difference between the declared type of a variable and the actual type of the object it references. It's no different than doing val x: Any = "Hello, World!";
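The declared-type versus runtime-type distinction can also be shown loosely in Python. This is only an analogy: Python's annotations are not enforced by the interpreter, so the annotation `object` below stands in for Kotlin's inferred Any, and pick is a hypothetical function mirroring the when expression.

```python
from typing import Union


def pick(x: int) -> Union[int, str, float]:
    """Mirror of the `when` expression: a different runtime type per branch."""
    if x == 1:
        return 42
    if x == 2:
        return "Hello"
    return 3.14


for x in (1, 2, 3):
    # The variable is declared with the broadest type; the object keeps its own class.
    y: object = pick(x)
    print(type(y).__name__)  # prints int, then str, then float
```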

Is there a way to get a Curried form of the binary operators in SML/NJ?

For example, instead of
- op =;
val it = fn : ''a * ''a -> bool
I would rather have
- op =;
val it = fn : ''a -> ''a -> bool
for use in
val x = getX()
val l = getList()
val l' = if List.exists ((op =) x) l then l else x::l
Obviously I can do this on my own, for example,
val l' = if List.exists (fn y => x = y) l then l else x::l
but I want to make sure I'm not missing a more elegant way.
You could write a helper function that curries a function:
fun curry f x y = f (x, y)
Then you can do something like
val curried_equals = curry (op =)
val l' = if List.exists (curried_equals x) l then l else x::l
My knowledge of SML is scant, but I looked through the Ullman book and couldn't find an easy way to convert a function that accepts a tuple to a curried function. They have two different signatures and aren't directly compatible with one another.
I think you're going to have to roll your own.
Or switch to Haskell.
Edit: I've thought about it, and now know why one isn't the same as the other. In SML, nearly all of the functions you're used to actually accept only one parameter. It just so happens that most of the time you're passing a tuple with more than one element. Still, a tuple is a single value and is treated as such by the function. You can't pass such a function a partial tuple. It's either the whole tuple or nothing.
Any function that accepts more than one parameter is, by definition, curried. When you define a function that accepts multiple parameters (as opposed to a single tuple with multiple elements), you can partially apply it and use its return value as the argument to another function.
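The curry helper and the partial application it enables can be sketched in Python as well. This is a loose analogue rather than SML semantics: Python's operator.eq takes two positional arguments instead of a tuple, and add_if_missing is a hypothetical name for the "prepend x unless it is already in the list" step from the question.

```python
import operator
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def curry(f: Callable[[A, B], C]) -> Callable[[A], Callable[[B], C]]:
    """Turn a two-argument function into a chain of one-argument functions."""
    return lambda x: lambda y: f(x, y)


curried_equals = curry(operator.eq)


def add_if_missing(x: A, l: List[A]) -> List[A]:
    # curried_equals(x) is a one-argument predicate, usable wherever one is expected.
    return l if any(map(curried_equals(x), l)) else [x] + l


print(add_if_missing(2, [1, 2, 3]))  # [1, 2, 3]
print(add_if_missing(4, [1, 2, 3]))  # [4, 1, 2, 3]
```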