Can I set/unset a default Coder in Scio?

I would like to consistently apply a custom RicherIndicatorCoder to my case class RicherIndicator. Moreover, if I fail to provide a new Coder for Tuples or KVs containing RicherIndicator, I would like a compile-time or runtime error rather than a silent fallback to a suboptimal coder.
However, Scio does not seem to honor the @DefaultCoder annotation:
@DefaultCoder(classOf[RicherIndicatorCoder]) // Ignored
case class RicherIndicator(
  policy: Policy,
  indicator: Indicator
)
Nor does Scio give priority to custom coders registered with the CoderRegistry, instead falling back on its own default coder:
val registry = sc.pipeline.getCoderRegistry
registry.registerCoderForClass(classOf[RicherIndicator], RicherIndicatorCoder.of) // Not used
Therefore I must use setCoder(RicherIndicatorCoder.of) wherever an SCollection of this type appears, and carefully comb through the pipeline in case there are composite types which include a RicherIndicator.
Is there a way to set my custom coder as the default, or to disable falling back on the default Magnolia or Kryo based coder?

The Java annotation does not work in Scala. You can wrap your Beam Coder as an implicit Scio Coder like this:
implicit val richerIndicatorCoder: Coder[RicherIndicator] = Coder.beam(RicherIndicatorCoder.of)
It should be picked up as long as the implicit is in scope.


Convert IVectorView to std::span

winrt::hstring is convertible to std::basic_string_view which comes in handy quite often. However, I am unable to do the same for IVectorView.
Looking at the interface of IVector, I imagine you would have to convert it back to the underlying implementation type, so I tried:
using impl_type = winrt::impl::vector_impl<float, std::vector<float>, winrt::impl::single_threaded_collection_base>;
winrt::Windows::Foundation::Collections::IVectorView<float> vector_view = GetIVectorView();
auto& impl = *winrt::get_self<impl_type>(vector_view);
auto& container = impl.get_container();
which compiles, but container.size() is 0, which is incorrect.
Edit:
vector_view was the result of the TensorFloat.GetAsVectorView method. So I can solve my problem by using the TensorFloat.CreateReference method to get an IMemoryBufferReference instead of an IVectorView.
However, I'd still like to know whether an IVectorView can be converted to a std::span, and if not, why this is not allowed.
The IVector and IVectorView interfaces are specifically designed not to expose the underlying contiguous memory, probably to support cases where there is no underlying contiguous memory or the implementation language doesn't expose it as such (JavaScript, perhaps).
You probably could get back the implementation type when you know C++/WinRT provides the implementation; however, in my case there is no way of knowing the implementation type. In any case, it's inadvisable to do this.
In my case it would have been better if TensorFloat.GetAsVectorView did not exist, so that I would have found TensorFloat.CreateReference instead.
Also, it would be nice if C++/WinRT were range-v3 compatible. But until then, the most advisable thing to do is just copy into a std::vector.

Kotlin [1..n] constructor parameter

Is there a way to enforce 1..* parameters in Kotlin that will still allow the spread operator?
I've tried:
class Permission(
    // 1..n compliance
    accessiblePage: Webpage,
    vararg accessiblePages: Webpage
) {
And that does enforce 1..*, but it also means that Permission(*pages) won't work, so that's a pretty awkward interface.
Is there an easy way to enforce 1..* without a runtime constructor error?
There is, unfortunately, no way to check this in Kotlin at compile time aside from the way you mentioned. Since vararg parameters are really just syntactic sugar for an array, your code is essentially
class Permission(
    accessiblePage: Webpage,
    accessiblePages: Array<Webpage>
)
So the question then becomes "Can you ensure that an array has at least one element in it at compile time?" For most languages, that's a clear no, although the Kotlin team did at one point experiment with it:
[C]urrently, Kotlin compiler doesn't collect static information about
collections size. FYI, at some point Kotlin team tried to collect such
information and use it for warnings about possible
IndexOutOfBoundException and stuff like that, but it was found that
there were a very little demand on such diagnostics in real-life
projects, so, given complexity of such analysis, it was abandoned[.]
(https://github.com/Kotlin/KEEP/issues/139#issuecomment-405551324)
It's possible that this metadata will be added at some point, but you shouldn't expect it soon.
That said, you could always combine a runtime check in the case of an Array with an overloaded signature in the case of varargs. This would mean that your vararg example would work the same, but passing an array to the function would subject it to a runtime check (you'd also not have to use the spread operator anymore):
class Permission(
    accessiblePage: Webpage,
    vararg accessiblePages: Webpage
) {
    // A secondary constructor must delegate to the primary one, so the
    // emptiness check happens while extracting the first element.
    constructor(accessiblePages: Array<Webpage>) : this(
        accessiblePages.firstOrNull()
            ?: throw IllegalArgumentException("Must have at least one accessible page."),
        *accessiblePages.copyOfRange(1, accessiblePages.size)
    )
}
called like
val permission1 = Permission(Webpage(), Webpage())
val permission2 = Permission() // Would fail at compile time
val pages = arrayOf<Webpage>()
val permission3 = Permission(pages) // Would fail at runtime. Note also the lack of the spread operator.

how to convert Java Map to read it in Kotlin?

I am facing a very basic problem (one I never faced in Java before) that might be due to my lack of knowledge of Kotlin.
I am currently trying to read a YAML file, so I'm doing it this way:
private val factory = YamlConfigurationFactory(LinkedHashMap::class.java, validator, objectMapper, "dw")
Based on the Dropwizard guide for configurations:
https://www.dropwizard.io/1.3.12/docs/manual/testing.html
So later in my function I do this:
val yml = File(Paths.get("config.yml").toUri())
var keyValues = factory.build(yml)
When using my debugger I can see there is a Map with key->values, just as it should be.
Now when I do keyValues.get("my-key") I get:
type inference failed. the value of the type parameter K should be mentioned in input types
I tried this, but no luck:
var keyValues = LinkedHashMap<String, Any>()
keyValues = factory.build(yml)
The YamlConfigurationFactory requires a class to map to, but I don't know if there is a more direct way to specify a Kotlin class than appending .kotlin to the current solution, like
LinkedHashMap::class.java.kotlin
Here it also throws an error.
Ideas?
Well, this is a typical problem with JVM generics. Class<LinkedHashMap> carries no info on what the actual types of its keys and values are, so the keyValues variable always ends up with the type LinkedHashMap<*, *>, simply because it can't be checked at compile time. There are two ways around this:
Unsafe Cast
This is how you would deal with the problem in standard Java: just cast the LinkedHashMap<*, *> to LinkedHashMap<String, Any> (or whatever the actual expected type is). This produces a warning because the compiler can't verify the cast is safe, but such situations are often unavoidable when dealing with JVM generics and serialisation.
val keyValues = factory.build(yml) as LinkedHashMap<String, Any>
Type Inference Magic
When using Kotlin, you can avoid the cast by actually creating an instance of Class<LinkedHashMap<String, Any>> explicitly. Of course, since this is still the JVM, you lose all the type info at runtime, but it is enough to tell the type inference engine what your result should be. You'll need a special helper method for this (or at least I haven't found a simpler solution yet), but it only needs to be declared once somewhere in your project:
inline fun <reified T> classOf(): Class<T> = T::class.java
...
val factory = YamlConfigurationFactory(classOf<LinkedHashMap<String, Any>>(), ...)
Using this "hack", you'll get an instance of LinkedHashMap directly, however, always remember that this is just extra info for the type inference engine but effectively it just hides the unsafe cast. Also, you can't use this if the type is not known at compile type (reified).

Why use Arrow's Options instead of Kotlin nullable

I was having a look at the Arrow library. Why would one ever want to use an Option type instead of Kotlin's built-in nullables?
I have been using the Option data type provided by Arrow for over a year, and at the beginning we asked ourselves the exact same question. The answer follows.
Option vs Nullable
If you compare just the Option data type with nullables in Kotlin, they are almost even. Same semantics (there is some value or there isn't), almost the same syntax (with Option you use map, with nullables you use the safe call operator).
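For instance, here is a minimal sketch of that equivalence (assuming arrow-core is on the classpath; Some, None, and map are Arrow's Option API):
import arrow.core.None
import arrow.core.Option
import arrow.core.Some

// Nullable version: the safe call operator propagates absence.
fun lengthNullable(s: String?): Int? = s?.length

// Option version: map does the same job.
fun lengthOption(s: Option<String>): Option<Int> = s.map { it.length }

fun main() {
    println(lengthNullable(null))        // prints null
    println(lengthOption(None))          // prints the None case
    println(lengthOption(Some("arrow"))) // prints a Some wrapping 5
}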
But when using Options, you open the door to the benefits of the Arrow ecosystem!
Arrow ecosystem (functional ecosystem)
When using Options, you are using the monad pattern. When using the monad pattern with libraries like Arrow, Scala Cats, or Scalaz, you can benefit from several functional concepts. Here are just three examples of the benefits (there are many more):
1. Access to other Monads
Option is not the only one! For instance, Either is very useful for expressing failures without throwing exceptions. Try, Validated, and IO are examples of other common monads that help us do, in a better way, things we do on typical projects.
2. Conversion between monads + abstractions
You can easily convert one monad to another. You have a Try but want to return (and express) an Either? Just convert to it. You have an Either but don't care about the error? Just convert it to an Option.
val foo = Try { 2 / 0 }
val bar = foo.toEither()
val baz = bar.toOption()
This abstraction also helps you create functions that don't care about the container (monad) itself, just about the content. For example, you can create an extension function sum(anyContainerWithBigDecimalInside, anotherContainerWithBigDecimal) that works with ANY MONAD (to be more precise, with any instance of Applicative) this way:
fun <F> Applicative<F>.sum(vararg kinds: Kind<F, BigDecimal>): Kind<F, BigDecimal> {
    return kinds.reduce { kindA, kindB ->
        map(kindA, kindB) { (a, b) -> a.add(b) }
    }
}
A little complex to understand, but very helpful and easy to use.
3. Monad comprehensions
Going from nullables to monads is not just about changing safe call operators into map calls. Take a look at the "binding" feature that Arrow provides as its implementation of the "monad comprehensions" pattern:
fun calculateRocketBoost(rocketStatus: RocketStatus): Option<Double> {
    return binding {
        val (gravity) = rocketStatus.gravity
        val (currentSpeed) = rocketStatus.currentSpeed
        val (fuel) = rocketStatus.fuel
        val (science) = calculateRocketScienceStuff(rocketStatus)
        val fuelConsumptionRate = Math.pow(gravity, fuel)
        val universeStuff = Math.log(fuelConsumptionRate * science)
        universeStuff * currentSpeed
    }
}
All the functions used, and also the properties of the rocketStatus parameter in the above example, are Options. Inside the binding block, the flatMap calls are abstracted away for us. The code is a lot easier to read (and write), and you don't need to check whether the values are present: if any of them is not, the computation stops and the result is an Option containing None!
Now try to imagine this code with null checks instead: not just safe call operators but probably also if-null-then-return code paths, as in the sketch below. A lot harder, isn't it?
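For comparison, here is a rough sketch of what the nullable version might look like (assuming the same hypothetical RocketStatus fields and calculateRocketScienceStuff function as above, but with nullable types instead of Options):
fun calculateRocketBoost(rocketStatus: RocketStatus): Double? {
    // Every access needs its own null check and early return.
    val gravity = rocketStatus.gravity ?: return null
    val currentSpeed = rocketStatus.currentSpeed ?: return null
    val fuel = rocketStatus.fuel ?: return null
    val science = calculateRocketScienceStuff(rocketStatus) ?: return null
    val fuelConsumptionRate = Math.pow(gravity, fuel)
    val universeStuff = Math.log(fuelConsumptionRate * science)
    return universeStuff * currentSpeed
}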
Also, the binding example above uses Option, but the true power of monad comprehensions as an abstraction shows when you use them with monads like IO, with which you can abstract asynchronous code execution in the exact same "clean, sequential and imperative" way as above.
Conclusion
I strongly recommend starting to use monads like Option, Either, etc. as soon as you see that the concept fits the semantics you need, even if you are not sure whether you will reap the other big benefits of the functional ecosystem or don't know them very well yet. Soon you'll be using them without noticing the learning curve. In my company, we use them in almost all Kotlin projects, even the object-oriented ones (which are the majority).
Disclaimer: If you really want a detailed talk about why Arrow is useful, then please head over to https://soundcloud.com/user-38099918/arrow-functional-library and listen to one of the people who work on it (5:35 min).
The people who create and use that library simply want to use Kotlin differently than the people who created it, and use "the Option datatype similar to how Scala, Haskell and other FP languages handle optional values".
This is just another way of defining return types for values whose presence you do not know in advance.
Let me show you three versions:
nullability in Kotlin
val someString: String? = if (condition) "String" else null
object with another value
val someString: String = if (condition) "String" else ""
the Arrow version
val someString: Option<String> = if (condition) Some("String") else None
A major part of Kotlin logic can be to never use nullable types like String?, but you will need to use them when interoperating with Java. When doing that, you need to use safe calls like string?.split("a") or the not-null assertion string!!.split("a").
I think it is perfectly valid to use safe calls when using Java libraries, but the Arrow guys seem to think differently and want to use their logic all the time.
The benefit of using the Arrow logic is "empowering users to define pure FP apps and libraries built atop higher order abstractions".
One thing other answers haven't mentioned: you can have Option<Option<SomeType>> where you can't have SomeType??. Or Option<SomeType?>, for that matter. This is quite useful for compositionality. E.g. consider Kotlin's Map.get:
abstract operator fun get(key: K): V?
Returns the value corresponding to the given key, or null if such a key is not present in the map.
But what if V is a nullable type? Then when get returns null it can be because the map stored a null value for the given key or because there was no value; you can't tell! If it returned Option<V>, there wouldn't be a problem.
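Here is a small sketch of that ambiguity (plain Kotlin plus Arrow's Option; the getOption extension at the bottom is a hypothetical helper, not part of the standard library):
import arrow.core.None
import arrow.core.Option
import arrow.core.Some

// Hypothetical helper: distinguishes "key absent" from "key present with a null value".
fun <K, V> Map<K, V>.getOption(key: K): Option<V> =
    if (containsKey(key)) Some(getValue(key)) else None

fun main() {
    val map: Map<String, String?> = mapOf("a" to null)

    // Both lookups return null, even though "a" is present and "b" is not.
    println(map["a"])           // null (key present, value is null)
    println(map["b"])           // null (key absent)

    // The Option-returning accessor removes the ambiguity.
    println(map.getOption("a")) // a Some wrapping null (key present)
    println(map.getOption("b")) // None (key absent)
}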

Lambdas with captured variables

Consider the following code:
private void DoThis() {
    int i = 5;
    var repo = new ReportsRepository<RptCriteriaHint>();
    // This does NOT work
    var query1 = repo.Find(x => x.CriteriaTypeID == i).ToList<RptCriteriaHint>();
    // This DOES work
    var query2 = repo.Find(x => x.CriteriaTypeID == 5).ToList<RptCriteriaHint>();
}
So when I hardwire an actual number into the lambda function, it works fine. When I use a captured variable in the expression, it comes back with the following error:
No mapping exists from object type ReportBuilder.Reporter+<>c__DisplayClass0 to a known managed provider native type.
Why? How can I fix it?
Technically, the correct way to fix this is for the framework that is accepting the expression tree from your lambda to evaluate the i reference; in other words, it's a limitation of the specific LINQ framework you're using. What it is currently trying to do is interpret i as a member access on some type known to it (the provider) from the database. Because of the way lambda variable capture works, the i local variable is actually a field on a hidden class (the one with the funny name) that the provider doesn't recognize.
So, it's a framework problem.
If you really must get by, you could construct the expression manually, like this:
ParameterExpression x = Expression.Parameter(typeof(RptCriteriaHint), "x");
var query = repo.Find(
    Expression.Lambda<Func<RptCriteriaHint, bool>>(
        Expression.Equal(
            Expression.MakeMemberAccess(
                x,
                typeof(RptCriteriaHint).GetProperty("CriteriaTypeID")),
            Expression.Constant(i)),
        x)).ToList();
... but that's just masochism.
Your comment on this entry prompts me to explain further.
Lambdas are convertible into one of two types: a delegate with the correct signature, or an Expression<TDelegate> of the correct signature. LINQ to external databases (as opposed to any kind of in-memory query) works using the second kind of conversion.
The compiler converts lambda expressions into expression trees, roughly speaking, like this:
1. The syntax tree is parsed by the compiler - this happens for all code.
2. The syntax tree is rewritten after taking into account variable capture. Capturing variables is just like in a normal delegate or lambda - so display classes get created, and captured locals get moved into them (this is the same behaviour as variable capture in C# 2.0 anonymous delegates).
3. The new syntax tree is converted into a series of calls to the Expression class so that, at runtime, an object tree is created that faithfully represents the parsed text.
LINQ to external data sources is supposed to take this expression tree and interpret it for its semantic content, and interpret symbolic expressions inside the tree as either referring to things specific to its context (e.g. columns in the DB), or immediate values to convert. Usually, System.Reflection is used to look for framework-specific attributes to guide this conversion.
However, it looks like SubSonic is not properly treating symbolic references that it cannot find domain-specific correspondences for; rather than evaluating the symbolic references, it's just punting. Thus, it's a SubSonic problem.