Serializing Rust HashMap with custom hash function

I'm trying to serialize a Rust HashMap that uses a custom hash function by calling bincode::serialize_into(writer, cache)?;, where cache is a struct defined as follows:
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};
use serde::{Deserialize, Serialize};
#[derive(Deserialize, Serialize)]
pub(crate) struct Cache<S>
where
    S: Default + Hasher,
{
    records: HashMap<DNA, Mapping, BuildHasherDefault<S>>,
}
When compiling, I get the error the trait 'cli::_::_serde::Serialize' is not implemented for 'S', referring to the cache variable.
As described in this issue, serialization with custom hash functions (in my case FnvHasher) should be supported and should only require the BuildHasher trait to be implemented.
That trait is already implemented by BuildHasherDefault which is used to create a BuildHasher instance.
Edit:
To illustrate the issue I've created a playground example (as bincode is not available in the playground, I used serde_json instead).
As the example shows, it is possible to create and serialize a HashMap custom_hashmap with a custom hash function (in this case FnvHasher).
But this does not work for a Cache object cache wrapping a HashMap with a custom hash function. In that case the compiler asks for the Serialize trait, which is not implemented for FnvHasher.

Related

How to create JSON object strategy according to a schema with rust proptest?

I'd like to create a JSON strategy using rust proptest library. However, I do not want to create an arbitrary JSON. I'd like to create it according to a schema (more specifically, OpenAPI schema). This means that keys of the JSON are known and I do not want to create them using any strategy, but I'd like to create the values using the strategy (pretty-much recursively).
I already implemented the strategy for primitive types, but I do not know how to create a JSON object strategy.
I would like the strategy to have the type BoxedStrategy<serde_json::Value>, or to be able to map the strategy to this type, because JSON objects can contain other objects and thus I need to be able to compose the strategies.
I found a HashMapStrategy strategy, however, it can be only created by a hash_map function that takes two strategies - one for generating keys and one for values. I thought that I could use Just strategy for the keys, but it did not lead anywhere. Maybe prop_filter_map could be used.
Here is the code. There are tests too. One is passing because it tests only primitive type and the other is failing since I did not find a way to implement generate_json_object function.
I tried this but the types do not match. Instead of a strategy of map from string to JSON value, it is a strategy of a map from string to BoxedStrategy.
fn generate_json_object(object: &ObjectType) -> BoxedStrategy<serde_json::Value> {
    let mut json_object = serde_json::Map::with_capacity(object.properties.len());
    for (name, schema) in &object.properties {
        let schema_kind = &schema.to_item_ref().schema_kind;
        json_object.insert(name.clone(), schema_kind_to_json(schema_kind));
    }
    Just(serde_json::Value::Object(json_object)).boxed()
}
One can create a vector of strategies, which implements a Strategy trait and can be boxed. So to create a serde_json::Value::Object, we create a vector of tuples. The first element will be a Just of key and the second element will be a boxed strategy of value. The boxed strategy of value can be created by schema_kind_to_json function. After we have a vector of tuples which implement a Strategy, we can use .prop_map to transform it to a serde_json::Value::Object.
fn generate_json_object(object: &ObjectType) -> BoxedStrategy<serde_json::Value> {
    let mut vec = Vec::with_capacity(object.properties.len());
    for (name, schema) in &object.properties {
        let schema_kind = &schema.to_item_ref().schema_kind;
        vec.push((Just(name.clone()), schema_kind_to_json(schema_kind)));
    }
    vec.prop_map(|vec| serde_json::Value::Object(serde_json::Map::from_iter(vec)))
        .boxed()
}

Kotlin: Generic types in Kotlin

To get the class definition to be used for example for json deserialization the following can be used in Kotlin:
Map::class.java
An example usage is the following:
val map = mapper.readValue(json, Map::class.java)
But now how to have the generic type definition?
Something like this does not compile:
val map = mapper.readValue(decodedString, Map<String, String>::class.java)
So my question is: what is the generic equivalent of *::class.java?
Class<T> (in Java) or KClass<T> (in Kotlin) can only represent classes, not all types. If the API you're using only uses Class<T> or KClass<T>, it simply doesn't support generic types (at least in those functions).
Instead, KType (or Type in Java) is the proper type to use to represent the complete type information including generics. You could use it this way:
val myMapType: KType = typeOf<Map<String,String>>()
Unfortunately, KType doesn't have a type parameter (it's not KType<T>), and that makes it impossible to use for compile-time type checking: you can't have the equivalent of fun deserialize(Input, KClass<T>): T using KType instead of KClass, because you can't define the T for the return type by using only a KType argument.
There are several tricks to work around this:
In both Java and Kotlin, one of the ways is to get this information through inheritance by providing a generic superclass and inheriting from it.
In general, serialization APIs (especially the deserializing part) provide workarounds using this, such as Jackson's TypeReference or Gson's TypeToken. It's basically their version of Type but with a type parameter to have some compile-time type safety.
In Kotlin, there is sometimes another way depending on the situation: making use of reified type parameters. Using inline functions, the compiler can know more information at compile time about the type parameters by replacing them with the actual inferred type at the call site when inlining the function's body. This allows things like T::class in the inline function's body. This is how you can get functions like typeOf to get a KType.
Some Kotlin-specific APIs of deserialization libraries use inline functions to remove the hassle from the user, and get type information directly. This is what jackson-module-kotlin does by providing an inline readValue extension without a Class argument, which reifies the type parameter to get the target type information
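The two tricks described above can be sketched without any serialization library. This is a minimal illustration, not Jackson's or Gson's actual implementation: TypeToken and kTypeOf are hypothetical names introduced here, and the sketch assumes a JVM target with a Kotlin version where typeOf is stable (1.6 or later).

```kotlin
import java.lang.reflect.ParameterizedType
import java.lang.reflect.Type
import kotlin.reflect.KType
import kotlin.reflect.typeOf

// Hypothetical stand-in for a library type token: an anonymous subclass
// records T in its generic superclass signature, which survives erasure.
abstract class TypeToken<T> {
    val type: Type =
        (javaClass.genericSuperclass as ParameterizedType).actualTypeArguments[0]
}

// Reified variant: the compiler substitutes the concrete T at each call site,
// so the full generic type is available inside the inlined body.
inline fun <reified T> kTypeOf(): KType = typeOf<T>()

fun main() {
    // Trick 1: inheritance. The trailing {} creates the anonymous subclass.
    val token = object : TypeToken<Map<String, String>>() {}
    val captured = token.type as ParameterizedType
    println(captured.rawType)                   // the raw java.util.Map interface
    println(captured.actualTypeArguments.size)  // 2

    // Trick 2: reified type parameter.
    val kType = kTypeOf<Map<String, String>>()
    println(kType.arguments.size)               // 2
}
```

Unlike a bare KType, the TypeToken<T> carries T at compile time, which is what lets a library declare a signature that returns T from a token argument.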

What are nullable rules when calling Java from Kotlin

Why does Kotlin in one case infer the type returned from Java to be nullable, while in another case it can be either nullable or non-nullable?
I've checked both HashMap.get and JsonNode.get and could not identify any @NotNull-like annotations in the classes or anywhere in the inheritance chain. What makes Kotlin treat those two calls differently?
I have read the documentation at https://kotlinlang.org/docs/java-interop.html#null-safety-and-platform-types, but its explanation uses the term "Platform Types" without explaining what those are, and it does not explain the difference in behavior anyway.
import com.fasterxml.jackson.databind.JsonNode
private fun docType(node: JsonNode, map: java.util.HashMap<String,String>) {
    val x: JsonNode = node.get("doc_type") // DOES compile and can throw NPE at runtime
    val y: JsonNode? = node.get("doc_type") // DOES compile and Kotlin's type system will force you to check for null
    val z: String = map.get("a") // ERROR: Type mismatch: inferred type is String? but String was expected
}
Kotlin provides seamless interoperability with Java, without compromising its own null-safety... almost. One exception is that Kotlin assumes that all types that are defined in Java are not-null.
To understand, let's look at JsonNode.get()
Platform types
public JsonNode get(String fieldName) { return null; }
Note that JsonNode is defined in Java, and is therefore a 'platform type' - and Kotlin does not 'translate' it to JsonNode?, even though that would be technically correct (because in Java all reference types are nullable).
When calling Java from Kotlin, for convenience it's assumed that the platform type is non-nullable. If this wasn't the case, you would always have to check that any instance of any platform type is not null.
So, to answer your question about what a 'platform type' is, it's a term that means
some type that is defined in an external target language,
you can't mention it explicitly in Kotlin code (but there's probably a synonymous Kotlin equivalent),
and we're going to assume that it's non-nullable for convenience.
Also the notation is <type>!, for example String! - which we can take to mean String or String?
Nullability annotations
The closest Java equivalent of Kotlin's nullable ? symbol is a nullability annotation, which the Kotlin compiler can parse and take into account. However, none are used on JsonNode methods. And so Kotlin will quite happily assume that node.get("") will return JsonNode, not JsonNode?.
As you noted, there are none defined for HashMap.get(...).
So how does Kotlin know that map.get("a") returns a nullable type?
Type inference
Type inference can't help. The (Java) method signature
public V get(Object key) {
//...
}
indicates that a HashMap<String, String> should return String, not String?. Something else must be going on...
Mapped types
For most Java types, Kotlin will just use the definition as provided. But for some, Kotlin decides to treat them specially, and completely replace the Java definition with its own version.
You can see the list of mapped types in the docs. And while HashMap isn't in there, Map is. And so, when we're writing Kotlin code, HashMap doesn't inherit from java.util.Map - because it's mapped to kotlin.collections.Map
Aside: in fact if you try and use java.util.Map you'll get a warning
So if we look at the code for the get function that kotlin.collections.Map defines, we can see that it returns a nullable value type
/**
* Returns the value corresponding to the given [key], or `null` if such a key is not present in the map.
*/
public operator fun get(key: K): V?
And so the Kotlin compiler can look at HashMap.get(...) and deduce that, because it's implementing kotlin.collections.Map.get(...), the returned value must be a nullable value, which in our case is String?.
Workaround: External annotations
For whatever reason, Jackson doesn't use the nullability annotations that would solve this problem. Fortunately IntelliJ provides a workaround that, while not as strict, will provide helpful warnings: external annotations.
Once I follow the instructions...
Alt+Enter → 'Annotate method...'
Select 'Nullable' annotation
Save annotations.xml
Now node.get("") will show a warning.
This annotation isn't visible to the Kotlin compiler, so it can only be a warning - not a compilation error.
java.util.HashMap.get implements the interface method java.util.Map.get. Kotlin maps some Java types to its own types internally. The full table of these mappings is available on the website. In our particular case, we see that java.util.Map gets mapped internally to kotlin.collections.Map, whose get function looks like
abstract operator fun get(key: K): V?
So as far as Kotlin is concerned, java.util.Map is just a funny name for kotlin.collections.Map, and all of the methods on java.util.Map actually have the signatures of the corresponding ones from kotlin.collections.Map (which are basically the same except with correct null annotations).
So while the first two node.get calls are Java calls and return platform types, the third one (as far as Kotlin is concerned) is actually calling a method Kotlin understands: namely, get from its own Map type. And that type has an explicit nullability annotation already available, so Kotlin can confidently say that that value can be null and needs to be checked.
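The difference between a mapped type and a platform type can be reproduced without Jackson. The following is a small sketch that uses System.getProperty as the unannotated Java method (an assumption made here in place of JsonNode.get); the property key and the function name are made up for illustration.

```kotlin
// Hypothetical property key; assumed not to be set in the running JVM.
const val MISSING_KEY = "no.such.property.in.this.sketch"

// System.getProperty returns the platform type String!, so Kotlin accepts
// the non-null declaration below and only fails when the code runs.
fun readMissingAsNonNull(): Result<Int> = runCatching {
    val value: String = System.getProperty(MISSING_KEY)
    value.length
}

fun main() {
    val map = java.util.HashMap<String, String>()

    // Mapped type: get is seen through kotlin.collections.Map, so the result
    // is String? and assigning it to a plain String would not compile.
    val fromMap: String? = map.get("a")
    println(fromMap)                            // null

    // Platform type treated as nullable: the compiler forces a null check.
    val asNullable: String? = System.getProperty(MISSING_KEY)
    println(asNullable)                         // null

    // Platform type treated as non-null: compiles, fails at run time.
    println(readMissingAsNonNull().isFailure)   // true
}
```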

Property references vs. lambdas for getter/setter

I need to get and set a property of another class from a method, and therefore need to pass in either the property reference or lambdas for the getter and the setter:
Passing in the property reference
otherInstance::property
Passing in a lambda for the getter and one for the setter:
{otherInstance.property} // getter
{value -> otherInstance.property = value} // setter
I like the first one, because for me the code is easier to read and shorter, but my alarm bells ring when I read about it on the official documentation, because of the term "reflection". My knowledge from Java is that reflection generally isn't a good thing. Is that also valid with Kotlin? Is it valid with this case? Is one of both ways (property reference or lambdas) more performant or more safe?
By using KMutableProperty0 you would technically be exposing an object that can be used for reflection. If you want to be strict about avoiding reflection, you could use separate function references for the getter and setter. Note that a higher-order function does not have to be given a lambda: the compiler can interpret a property reference as a function if the effective signature matches. This would unfortunately mean having to pass the property reference twice, and the setter has to be retrieved via what is technically reflection in this case:
class Test(var x: Int)
fun foo(getter: () -> Int, setter: (Int) -> Unit) {
    //...
}
val test = Test(1)
foo(test::x, test::x.setter)
// Zero reflection call:
foo(test::x) { test.x = it }
At some point you have to question how badly you want to avoid reflection, because the above code looks very messy to me. If your class takes a KMutableProperty0 reference, it is much simpler to use. As long as your receiving function isn't using the reference to introspect the code, and only calls get() or set() on it, you are not really using reflection in the ways that are suggested should be avoided.
fun foo(property: KMutableProperty0<Int>) {
    //...
}
val test = Test(1)
foo(test::x)
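To make the comparison concrete, here is a small runnable sketch of both styles. Counter, incrementViaProperty and incrementViaLambdas are hypothetical names chosen for illustration; the receiving functions only call get() and set(), which is the usage described above as not really being reflection.

```kotlin
import kotlin.reflect.KMutableProperty0

class Counter(var value: Int)

// Style 1: a single bound property reference; only get() and set() are used.
fun incrementViaProperty(property: KMutableProperty0<Int>) {
    property.set(property.get() + 1)
}

// Style 2: separate getter and setter functions.
fun incrementViaLambdas(getter: () -> Int, setter: (Int) -> Unit) {
    setter(getter() + 1)
}

fun main() {
    val counter = Counter(1)
    incrementViaProperty(counter::value)
    // A property reference is accepted where () -> Int is expected;
    // the setter is passed as an ordinary lambda.
    incrementViaLambdas(counter::value) { counter.value = it }
    println(counter.value)  // 3
}
```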
The documentation section is titled Member references and reflection.
Property references, which you are referring to, do not use reflection themselves.
Reflection is only mentioned in a different section, Obtaining member references from a class reference:
dynamically inspect an object to see e.g. what properties and functions it contains and which annotations exist on them. This is called reflection, and it's not very performant, so avoid it unless you really need it.
Kotlin has got its own reflection library (kotlin-reflect.jar must be included in your build). When targeting the JVM, you can also use the Java reflection facilities. Note that the Kotlin reflection isn't quite feature-complete yet - in particular, you can't use it to inspect built-in classes like String.

Kotlin and Immutable Collections?

I am learning Kotlin and it is looking likely I may want to use it as my primary language within the next year. However, I keep getting conflicting research that Kotlin does or does not have immutable collections and I'm trying to figure out if I need to use Google Guava.
Can someone please give me some guidance on this? Does it by default use Immutable collections? What operators return mutable or immutable collections? If not, are there plans to implement them?
Kotlin's List from the standard library is readonly:
interface List<out E> : Collection<E>
A generic ordered collection of elements. Methods in this interface
support only read-only access to the list; read/write access is
supported through the MutableList interface.
Parameters
E - the type of elements contained in the list.
As mentioned, there is also the MutableList
interface MutableList<E> : List<E>, MutableCollection<E>
A generic ordered collection of elements that supports adding and
removing elements.
Parameters
E - the type of elements contained in the list.
Due to this, Kotlin enforces read-only behaviour through its interfaces at compile time, instead of throwing exceptions at run time like the default Java implementations do.
Likewise, there is a MutableCollection, MutableIterable, MutableIterator, MutableListIterator, MutableMap, and MutableSet, see the stdlib documentation.
It is confusing, but there are three, not two, types of immutability:
(1) Mutable - you are supposed to change the collection (Kotlin's MutableList)
(2) Readonly - you are NOT supposed to change it (Kotlin's List), but something may (a cast to the mutable type, or a change from Java)
(3) Immutable - no one can change it (Guava's immutable collections)
So in case (2) List is just an interface that does not have mutating methods, but you can change the instance if you cast it to MutableList.
With Guava (case (3)) you are safe from anybody to change the collection, even with a cast or from another thread.
Kotlin chose to be readonly in order to use Java collections directly, so there is no overhead or conversion in using Java collections.
As you see in other answers, Kotlin has readonly interfaces to mutable collections that let you view a collection through a readonly lens. But the collection can be bypassed via casting or manipulated from Java. But in cooperative Kotlin code that is fine, most uses do not need truly immutable collections and if your team avoids casts to the mutable form of the collection then maybe you don't need fully immutable collections.
The Kotlin collections allow both copy-on-change mutations and lazy mutations. So to answer part of your question: things like filter, map, flatMap and the operators + and - all create copies when used on non-lazy collections. When used on a Sequence, they transform the values as the collection is accessed and continue to be lazy (resulting in another Sequence). For a Sequence, however, calling anything such as toList, toSet or toMap will result in the final copy being made. By naming convention, almost anything that starts with to makes a copy.
In other words, most operators return you the same type as you started with, and if that type is "readonly" then you will receive a copy. If that type is lazy, then you will lazily apply the change until you demand the collection in its entirety.
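A short sketch of the eager-copy versus lazy behaviour described above. The function name lazyDemo and the counter are made up for illustration; only standard library calls are used.

```kotlin
// Returns how many elements had been transformed before and after the
// sequence was materialized, plus the materialized result.
fun lazyDemo(source: List<Int>): Triple<Int, Int, List<Int>> {
    var evaluations = 0
    val lazyDoubled = source.asSequence().map { evaluations++; it * 2 }
    val before = evaluations              // nothing has run yet
    val materialized = lazyDoubled.toList()
    return Triple(before, evaluations, materialized)
}

fun main() {
    val source = listOf(1, 2, 3)

    // Eager: map on a List builds a new List; the source is untouched.
    val doubled = source.map { it * 2 }
    println(source)   // [1, 2, 3]
    println(doubled)  // [2, 4, 6]

    // Lazy: map on a Sequence does no work until a terminal call such as toList.
    val (before, after, result) = lazyDemo(source)
    println(before)   // 0
    println(after)    // 3
    println(result)   // [2, 4, 6]
}
```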
Some people want them for other reasons, such as parallel processing. In those cases, it might be best to look at really high performance collections designed just for those purposes. And only use them in those cases, not in all general cases.
In the JVM world it is hard to avoid interop with libraries that want standard Java collections, and converting to/from these collections adds a lot of pain and overhead for libraries that do not support the common interfaces. Kotlin gives a good mix of interop and lack of conversion, with readonly protection by contract.
So if you can't avoid wanting immutable collections, Kotlin easily works with anything from the JVM space:
Guava (https://github.com/google/guava)
Dexx a port of the Scala collections to Java (https://github.com/andrewoma/dexx) with Kotlin helpers (https://github.com/andrewoma/dexx/blob/master/kollection/README.md)
Eclipse Collections (formerly GS-Collections) a really high performance, JDK compatible, top performer in parallel processing with immutable and mutable variations (home: https://www.eclipse.org/collections/ and Github: https://github.com/eclipse/eclipse-collections)
PCollections (http://pcollections.org/)
Also, the Kotlin team is working on Immutable Collections natively for Kotlin, that effort can be seen here:
https://github.com/Kotlin/kotlinx.collections.immutable
There are many other collection frameworks out there for all different needs and constraints; Google is your friend for finding them. There is no reason the Kotlin team needs to reinvent them for its standard library. You have a lot of options, and they specialize in different things such as performance, memory use, not-boxing, immutability, etc. "Choice is Good" ... therefore some others: HPPC, HPPC-RT, FastUtil, Koloboke, Trove and more...
There are even efforts like Pure4J; since Kotlin now supports annotation processing, it could perhaps be ported to Kotlin for similar ideals.
Kotlin 1.0 will not have immutable collections in the standard library. It does, however, have read-only and mutable interfaces. And nothing prevents you from using third party immutable collection libraries.
Methods in Kotlin's List interface "support only read-only access to the list" while methods in its MutableList interface support "adding and removing elements". Both of these, however, are only interfaces.
Kotlin's List interface enforces read-only access at compile-time instead of deferring such checks to run-time like java.util.Collections.unmodifiableList(java.util.List) (which "returns an unmodifiable view of the specified list... [where] attempts to modify the returned list... result in an UnsupportedOperationException."). It does not enforce immutability.
Consider the following Kotlin code:
import com.google.common.collect.ImmutableList
import kotlin.test.assertEquals
import kotlin.test.assertFailsWith
fun main(args: Array<String>) {
    val readOnlyList: List<Int> = arrayListOf(1, 2, 3)
    val mutableList: MutableList<Int> = readOnlyList as MutableList<Int>
    val immutableList: ImmutableList<Int> = ImmutableList.copyOf(readOnlyList)
    assertEquals(readOnlyList, mutableList)
    assertEquals(mutableList, immutableList)
    // readOnlyList.add(4) // Kotlin: Unresolved reference: add
    mutableList.add(4)
    assertFailsWith(UnsupportedOperationException::class) { immutableList.add(4) }
    assertEquals(readOnlyList, mutableList)
    assertEquals(mutableList, immutableList)
}
Notice how readOnlyList is a List and methods such as add cannot be resolved (and won't compile), mutableList can naturally be mutated, and add on immutableList (from Google Guava) can also be resolved at compile-time but throws an exception at run-time.
All of the above assertions pass with the exception of the last one, which results in Exception in thread "main" java.lang.AssertionError: Expected <[1, 2, 3, 4]>, actual <[1, 2, 3]>. That is, we successfully mutated a read-only List!
Note that using listOf(...) instead of arrayListOf(...) returns an effectively immutable list as you cannot cast it to any mutable list type. However, using the List interface for a variable does not prevent a MutableList from being assigned to it (MutableList<E> extends List<E>).
Finally, note that an interface in Kotlin (as well as in Java) cannot enforce immutability as it "cannot store state" (see Interfaces). As such, if you want an immutable collection you need to use something like those provided by Google Guava.
See also ImmutableCollectionsExplained · google/guava Wiki · GitHub
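The same point can be checked without Guava on the classpath. The sketch below swaps in java.util.Collections.unmodifiableList over a defensive copy as the protected list; this is an assumption made here to keep the example self-contained and is not the Guava-based code from the answer. The helper name mutateThroughCast is made up.

```kotlin
import java.util.Collections

// Tries to add an element after casting the read-only view to MutableList.
fun mutateThroughCast(list: List<Int>): Result<Boolean> =
    runCatching { (list as MutableList<Int>).add(4) }

fun main() {
    val backing = arrayListOf(1, 2, 3)

    // Read-only view: the interface hides add, but the cast brings it back.
    val readOnly: List<Int> = backing
    println(mutateThroughCast(readOnly).isSuccess)  // true
    println(readOnly)                               // [1, 2, 3, 4]

    // Copy wrapped in an unmodifiable view: the cast still compiles,
    // but the mutation is rejected at run time.
    val protectedCopy: List<Int> = Collections.unmodifiableList(ArrayList(backing))
    val attempt = mutateThroughCast(protectedCopy)
    println(attempt.exceptionOrNull() is UnsupportedOperationException)  // true
    println(protectedCopy)                          // [1, 2, 3, 4]
}
```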
NOTE: This answer is here because the code is simple and open-source and you can use this idea to make your collections that you create immutable. It is not intended only as an advertisement of the library.
The Klutter library has new Kotlin immutable wrappers that use Kotlin delegation to wrap an existing Kotlin collection interface with a protective layer without any performance hit. There is then no way to cast the collection, its iterator, or other collections it might return into something that could be modified. They become in effect immutable.
Klutter 1.20.0 adds immutable protectors for existing collections. Based on an SO answer by @miensol, it provides a lightweight delegate around collections that prevents any avenue of modification, including casting to a mutable type and then modifying. Klutter goes a step further by protecting sub-collections such as iterator, listIterator, entrySet, etc. All of those doors are closed, and because Kotlin delegation is used for most methods, you take no hit in performance. Simply call myCollection.asReadOnly() (protect) or myCollection.toImmutable() (copy then protect) and the result is the same interface but protected.
Here is an example from the code showing how simple the technique is: it delegates the interface to the actual class while overriding mutation methods, and any sub-collections returned are wrapped on the fly.
/**
 * Wraps a List with a lightweight delegating class that prevents casting back to mutable type
 */
open class ReadOnlyList<T>(protected val delegate: List<T>) : List<T> by delegate, ReadOnly, Serializable {
    companion object {
        @JvmField val serialVersionUID = 1L
    }
    override fun iterator(): Iterator<T> {
        return delegate.iterator().asReadOnly()
    }
    override fun listIterator(): ListIterator<T> {
        return delegate.listIterator().asReadOnly()
    }
    override fun listIterator(index: Int): ListIterator<T> {
        return delegate.listIterator(index).asReadOnly()
    }
    override fun subList(fromIndex: Int, toIndex: Int): List<T> {
        return delegate.subList(fromIndex, toIndex).asReadOnly()
    }
    override fun toString(): String {
        return "ReadOnly: ${super.toString()}"
    }
    override fun equals(other: Any?): Boolean {
        return delegate.equals(other)
    }
    override fun hashCode(): Int {
        return delegate.hashCode()
    }
}
Along with helper extension functions to make it easy to access:
/**
 * Wraps the List with a lightweight delegating class that prevents casting back to mutable type,
 * specializing for the case of the RandomAccess marker interface being retained if it was there originally
 */
fun <T> List<T>.asReadOnly(): List<T> {
    return this.whenNotAlreadyReadOnly {
        when (it) {
            is RandomAccess -> ReadOnlyRandomAccessList(it)
            else -> ReadOnlyList(it)
        }
    }
}
/**
 * Copies the List and then wraps with a lightweight delegating class that prevents casting back to mutable type,
 * specializing for the case of the RandomAccess marker interface being retained if it was there originally
 */
@Suppress("UNCHECKED_CAST")
fun <T> List<T>.toImmutable(): List<T> {
    val copy = when (this) {
        is RandomAccess -> ArrayList<T>(this)
        else -> this.toList()
    }
    return when (copy) {
        is RandomAccess -> ReadOnlyRandomAccessList(copy)
        else -> ReadOnlyList(copy)
    }
}
You can see the idea and extrapolate to create the missing classes from this code which repeats the patterns for other referenced types. Or view the full code here:
https://github.com/kohesive/klutter/blob/master/core-jdk6/src/main/kotlin/uy/klutter/core/common/Immutable.kt
And with tests showing some of the tricks that allowed modifications before, but now do not, along with the blocked casts and calls using these wrappers.
https://github.com/kohesive/klutter/blob/master/core-jdk6/src/test/kotlin/uy/klutter/core/collections/TestImmutable.kt
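For readers who only want to see the core idea, here is a stripped-down sketch of the delegation technique. ProtectedList is a hypothetical class written for this illustration and is not part of Klutter; it deliberately omits the iterator and sub-list wrapping, which shows why the real code overrides those methods.

```kotlin
// Implements only the read-only List interface and forwards every call to
// the delegate, so the wrapper itself is not a MutableList.
class ProtectedList<T>(private val delegate: List<T>) : List<T> by delegate

fun main() {
    val backing = arrayListOf(1, 2, 3)
    val exposed: List<Int> = backing
    val wrapped: List<Int> = ProtectedList(backing)

    println(exposed is MutableList<*>)  // true: a cast would allow mutation
    println(wrapped is MutableList<*>)  // false: the cast is blocked

    // The delegate's own iterator still leaks through unless it is wrapped
    // as well, which is the gap the overrides shown above close.
    println(wrapped.iterator() is MutableIterator<*>)  // true
}
```

Because the wrapper shares the backing list, changes made through backing remain visible through wrapped; copying first, as toImmutable does above, is what removes that last door.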
Now we have https://github.com/Kotlin/kotlinx.collections.immutable.
fun <T> Iterable<T>.toImmutableList(): ImmutableList<T>
fun <T> Iterable<T>.toImmutableSet(): ImmutableSet<T>
fun <T> Iterable<T>.toPersistentList(): PersistentList<T>
fun <T> Iterable<T>.toPersistentSet(): PersistentSet<T>