Using a Jackson attribute to accumulate state as a byproduct of serialization - jackson

Here's my scenario:
I have a deep compositional tree of POJOs from various classes. I need to write a utility that can dynamically process this tree without having a baked in understanding of the class/composition structure
Some properties in my POJOs are annotated with a custom annotation @PIIData("phone-number") that declares that the property may contain PII, and optionally what kind of PII (e.g. phone number)
As a byproduct of serializing the root object, I'd like to accumulate a registry of PII locations based on their JSON path
Desired data structure:
path                               type
household.primaryEmail             email-address
household.members[0].cellNumber    phone-number
household.members[0].firstName     first-name
household.members[1].cellNumber    phone-number
I don't care about the specific pathing/location language used (JSON Pointer, Json Path).
I could achieve this with some reflection and maintenance of my own path, but it feels like something I should be able to do with Jackson since it's already doing the traversal. I'm pretty sure that using Jackson's attributes feature is the right way to attach my object that will accumulate the data structure. However, I can't figure out a way to get at the path at runtime. Here's my current Scala attempt (hackily?) built on top of a filter that is applied to all objects through a mixin:
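import scala.collection.mutable

import com.fasterxml.jackson.annotation.JsonFilter
import com.fasterxml.jackson.core.JsonGenerator
import com.fasterxml.jackson.databind.{ObjectMapper, SerializerProvider}
import com.fasterxml.jackson.databind.ser.{BeanPropertyWriter, PropertyWriter}
import com.fasterxml.jackson.databind.ser.impl.{SimpleBeanPropertyFilter, SimpleFilterProvider}

object Test {
  // Mixed into every class so that the "pii" filter sees all properties
  @JsonFilter("pii")
  class PiiMixin

  // Collects whatever the filter reports during serialization
  class PiiAccumulator {
    val state = mutable.ArrayBuffer[String]()
    def accumulate(test: String): Unit = state += test
  }

  def main(args: Array[String]): Unit = {
    val filter = new SimpleBeanPropertyFilter() {
      override def serializeAsField(pojo: Any, jgen: JsonGenerator, provider: SerializerProvider, writer: PropertyWriter): Unit = {
        if (writer.getAnnotation(classOf[PIIData]) != null) {
          provider.getAttribute("pii-accumulator").asInstanceOf[PiiAccumulator].accumulate(writer.getFullName.toString)
        }
        super.serializeAsField(pojo, jgen, provider, writer)
      }
      override def include(writer: BeanPropertyWriter): Boolean = true
      override def include(writer: PropertyWriter): Boolean = true
    }
    val provider = new SimpleFilterProvider().addFilter("pii", filter)
    val mapper = new ObjectMapper()
    mapper.addMixIn(classOf[Object], classOf[PiiMixin])
    val accum = new PiiAccumulator()
    mapper.writer(provider)
      .withAttribute("pii-accumulator", accum) // withAttribute takes a single key/value pair; withAttributes expects a Map
      .writeValueAsString(null) // Pass in any arbitrary object here
  }
}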
This code has enabled me to dynamically buffer up a list of property names that contain PII, but I can't figure out how to get their locations within the resulting JSON doc. Perhaps the Jackson architecture somehow precludes knowing that at runtime. Is there some other place I can hook in to do something like this, perhaps while converting to a JsonNode?
Thanks!

Okay, found it. You can access the recursive path/location during serialization via JsonGenerator.getOutputContext.pathAsPointer(). So by changing my code above to the following:
if (writer.getAnnotation(classOf[PIIData]) != null) {
  provider.getAttribute("pii-accumulator").asInstanceOf[PiiAccumulator]
    .accumulate(jgen.getOutputContext.pathAsPointer().toString + "/" + writer.getName)
}
I'm able to dynamically buffer a list of special locations in the resulting JSON document for further dynamic processing.
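For comparison, the reflection-based alternative mentioned in the question (walking the tree yourself and maintaining your own path) might look roughly like the Kotlin sketch below. It is only a sketch under several assumptions: PIIData here is a stand-in with a single string value, Household and Member are made-up sample classes, only fields declared directly on each class are inspected, and maps and cyclic object graphs are not handled.

```kotlin
import java.lang.reflect.Modifier

// Stand-in for the question's custom annotation (assumed shape: one string value).
@Target(AnnotationTarget.FIELD)
@Retention(AnnotationRetention.RUNTIME)
annotation class PIIData(val value: String = "")

// Made-up sample model, only for illustration.
class Member(
    @PIIData("first-name") val firstName: String,
    @PIIData("phone-number") val cellNumber: String
)
class Household(
    @PIIData("email-address") val primaryEmail: String,
    val members: List<Member>
)

// Walks the object graph depth-first and records "path -> PII type"
// for every field that carries @PIIData.
fun collectPii(node: Any?, path: String, out: MutableMap<String, String>) {
    if (node == null) return
    if (node is Iterable<*>) {
        node.forEachIndexed { i, child -> collectPii(child, "${path}[$i]", out) }
        return
    }
    // Do not reflect into JDK types such as String or Integer.
    if (node.javaClass.name.startsWith("java.")) return
    for (field in node.javaClass.declaredFields) {
        if (Modifier.isStatic(field.modifiers)) continue
        field.isAccessible = true
        val childPath = "$path.${field.name}"
        field.getAnnotation(PIIData::class.java)?.let { out[childPath] = it.value }
        collectPii(field.get(node), childPath, out)
    }
}

fun main() {
    val household = Household(
        "ann@example.com",
        listOf(Member("Ann", "555-0100"), Member("Bob", "555-0101"))
    )
    val registry = linkedMapOf<String, String>()
    collectPii(household, "household", registry)
    registry.forEach { (path, type) -> println("$path\t$type") }
}
```

The trade-off is the one the question already hints at: this duplicates a traversal that Jackson performs anyway, and it knows nothing about Jackson's property renaming or inclusion rules, so the paths only match the JSON output when field names and JSON property names coincide.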

Related

kotlinx.serialization JSON replacing default serializers in gradle mpp multiplatform project

I want to use my own custom KSerializer<LocalDateTime> with kotlinx.serialization and kotlinx.datetime
@ExperimentalSerializationApi
@Serializer(forClass = LocalDateTime::class)
object LocalDateTimeSerializer : KSerializer<LocalDateTime> {
...
I create my Json like this:
val JSON = Json {
    prettyPrint = true; prettyPrintIndent = " ".repeat(2)
    serializersModule = this.serializersModule.apply {
        overwriteWith(
            SerializersModule {
                contextual(Instant::class, InstantSerializer)
                contextual(LocalDateTime::class, LocalDateTimeSerializer)
            }
        )
    }
}
But whatever I try, I cannot manage to replace the default LocalDateTimeIso8601Serializer with mine:
val l = JSON.decodeFromString<LocalDateTime>(s) // does NOT(!) use my own Serializer
// have to give it explicitly to work, but that's not what I want
val l = JSON.decodeFromString<LocalDateTime>(LocalDateTimeSerializer, s) // works, but explicitly
Is it possible to replace a default serializer?
This is not possible.
The closest you can get to what you want is specifying a default serializer for a whole file, with the file-level annotation @file:UseSerializers(LocalDateTimeSerializer::class).
If I'm not mistaken, the reason for this is that kotlinx.serialization is a reflectionless serializer; all serializers are defined at compile time. Applying @Contextual is a way to disable that behavior and determine a serializer at runtime based on context (not what you are after here). I guess you could request a feature to apply a default serializer to a full module (likely already requested), but I can see how that is harder to implement and can lead to more unexpected conflicts or behavior than at file scope, which is why it may currently not be supported.
As a hack, you could consider using a wrapper type for LocalDateTime which uses your custom serializer. But, I'd recommend against this. Essentially, this is the same as applying the annotation everywhere, in that at every occurrence you need to make sure to use the right type.

Union classes or class erasure for Firestore deserialization in Kotlin

I have a Firestore collection that holds different data objects with no common key or values.
In Kotlin, this is represented by something like
sealed class Task()
data class WorkTask(val id: String): Task()
data class ReductionTask(val time: Date): Task()
I would like to deserialize the data from the Firestore collection in a way like:
val tasks = result.toObjects(Task::class.java)
val workTasks = tasks.filterIsInstance<WorkTask>()
val reductionTasks = tasks.filterIsInstance<ReductionTask>()
In summary, I would like to retrieve a union WorkTask | ReductionTask | OtherTask from Firestore that I can hold in one list and later either filter or pattern-match by instance.
EDIT:
Currently, my workaround is to have 1 common key (type) that holds the type of the object:
inline fun <reified T : Any> QuerySnapshot.deserializeByType(
    crossinline selector: (type: String) -> Class<out T>
): List<T> {
    return this.documents.mapNotNull { document ->
        // Documents without a "type" field are skipped
        val type = document.getString("type") ?: return@mapNotNull null
        document.toObject(selector(type))
    }
}

querySnapshot.deserializeByType<Task> { type ->
    when (type) {
        "WORK" -> WorkTask::class.java
        "REDUCE" -> ReductionTask::class.java
        ...
    }
}
And in theory, I could just provide a list of classes and let it try/catch to deserialize. But that seems to be hacky as hell.
Since the Firestore SDK is implemented in Java, it doesn't know anything about Kotlin sealed classes. It works purely off what it understands about JavaBean-style POJOs, which use naming conventions for the getter and setter methods on classes. toObject() simply maps the names of Firestore document fields to the getters and setters (obtained by reflection) of the provided class. That's all.
Your workaround (or some variation of it) is currently the only viable option, since your code needs to make a judgement about which actual class is actually being represented by the document in question. There's really no way for the Firestore SDK to know which class the document should populate - you have to tell it that.
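To illustrate the discriminator approach without depending on the Firestore SDK, here is a minimal Kotlin sketch in which plain maps stand in for documents. The Doc alias, the decode and describe functions, and the sample field values are all assumptions made for the example; only the "type" key and the WORK/REDUCE values come from the workaround above.

```kotlin
import java.util.Date

sealed class Task
data class WorkTask(val id: String) : Task()
data class ReductionTask(val time: Date) : Task()

// Hypothetical stand-in for a document: a plain field map.
typealias Doc = Map<String, Any?>

// The "type" field decides which concrete class a document becomes.
// Unknown or missing types yield null and are dropped by the caller.
fun decode(doc: Doc): Task? = when (doc["type"]) {
    "WORK" -> WorkTask(doc["id"] as String)
    "REDUCE" -> ReductionTask(doc["time"] as Date)
    else -> null
}

// Because Task is sealed, this `when` is exhaustive without an else branch.
fun describe(task: Task): String = when (task) {
    is WorkTask -> "work ${task.id}"
    is ReductionTask -> "reduce until ${task.time}"
}

fun main() {
    val docs: List<Doc> = listOf(
        mapOf("type" to "WORK", "id" to "w1"),
        mapOf("type" to "REDUCE", "time" to Date(0)),
        mapOf("type" to "UNKNOWN")
    )
    val tasks = docs.mapNotNull(::decode)
    val workTasks = tasks.filterIsInstance<WorkTask>()
    println(workTasks)
    tasks.forEach { println(describe(it)) }
}
```

Once the documents are in one List<Task>, filtering by instance and pattern matching with `when` both work with the standard library alone; the only part that needs knowledge of the stored data is the decode step.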

Kotlin multiple class for data storage

I am developing a simple Android app, that will display an icon of a vehicle and the user can click on the icon to display the vehicle information. I want to load the data dynamically when I build the app i.e. the data will come from an external source including the picture for the icon.
I am new to Kotlin and not sure what to search for to find a suitable solution. What is the correct way to define the data? Is it best to create a class as below and then create an array of that class (not sure if this is possible)?
public class VehicleSpec()
{
    var OEM: String? = null
    var ModelName: String? = null
    var EngineSize: String? = null
}
Or would it be better to create a multi-dimensional array and then link the data to the cells?
var VehicleSpec = Array(20) { arrayOfNulls<String>(20) }
VehicleSpec[0][0] = null // OEM
VehicleSpec[0][1] = null // ModelName
VehicleSpec[0][2] = null // EngineSize
What is the best way to set up the data storage? Are there any good references to understand how this should be set up?
> What is the correct way to define the data, is it best to create a class as below then create an array of the class
Using an array for the properties of an object is not making the full use of the type safety you have in Kotlin (and even Java for that matter).
If what you want to express is multiple properties of an object, then you should use a class to define those properties. This is especially true if the properties have different types.
There is no performance difference between an array and a class, because you'll get a reference to the heap in both cases. You could save on performance only if you convert your multi-dimensional array approach to a single-dimension array with smart indexing. Most of the time, you should not consider this option unless you are handling a lot of data and if you know that performance is an issue at this specific level.
> (not sure if this is possible)
Defining lists/arrays of classes is definitely possible.
Usually, for classes that are only used as data containers, you should prefer data classes, because they give you useful methods for free, and these methods make sense for simple "data bags" like in your case (equals, hashCode, component access, etc.).
data class Vehicle(
    val OEM: String,
    val ModelName: String,
    val EngineSize: String
)
Also, I suggest using val instead of var as much as possible. Immutability is more idiomatic in Kotlin.
Last but not least, prefer non-null values to null values if you know a value must always be present. If there are valid cases where the value is absent, you should use null instead of a placeholder value like empty string or -1.
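As a small illustration of the points above, here is a sketch of a list of data-class instances and the generated methods in use. The property names are lower-cased, the sample values are made up, and making engineSize nullable is an assumption chosen to show a case where absence is a valid state.

```kotlin
// Made-up sample type and values; engineSize is nullable here on the
// assumption that some vehicles have no engine size to report.
data class Vehicle(val oem: String, val modelName: String, val engineSize: String?)

fun main() {
    // A list of class instances instead of a two-dimensional array
    val vehicles = listOf(
        Vehicle("Toyota", "Corolla", "1.8L"),
        Vehicle("Tesla", "Model 3", null)
    )
    // Destructuring uses the generated component functions
    val (oem, modelName) = vehicles[0]
    println("$oem $modelName")
    // copy() creates a modified instance without mutating the original
    val renamed = vehicles[1].copy(modelName = "Model Y")
    println(renamed)
    // Structural equality comes from the generated equals()
    println(vehicles[0] == Vehicle("Toyota", "Corolla", "1.8L"))
}
```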
First of all, using the "class approach" makes the code easier for you to understand and gives you the full benefits of the language itself, so don't try to save data in an array; let the compiler handle that.
Secondly, I suggest you have maybe two types (and use data classes ;-) )
data class VehicleListEntry(
    val id: Long,
    val name: String
)
and
data class VehicleSpec(
    val id: Long,
    val oem: String = "",
    val modelName: String = "",
    val engineSize: String = ""
)
From my perspective, try to avoid null values whenever possible.
So if you have strings that you only display, use empty strings instead of null.
Now add a model to store your data:
class VehicleModel() {
    private val specs: MutableMap<Long, VehicleSpec> = mutableMapOf()
    private var entries: List<VehicleListEntry> = listOf()
    fun getSpec(id: Long) = specs[id]
    // Block body: an assignment is not an expression in Kotlin, so "= specs[spec.id] = spec" does not compile
    fun addSpec(spec: VehicleSpec) {
        specs[spec.id] = spec
    }
    fun getEntries(): List<VehicleListEntry> = entries
    fun setEntries(data: List<VehicleListEntry>) {
        entries = data.toMutableList()
    }
}
You could also use a data class for your model, which looks like
data class VehicleModel(
    val specs: MutableMap<Long, VehicleSpec> = mutableMapOf(),
    var entries: List<VehicleListEntry> = listOf()
)
And last but not least, a controller to bring everything together:
class VehicleController() {
    private val model = VehicleModel()
    init {
        // TODO get the entries list together
    }
    fun getEntries() = model.entries
    fun getSpec(id: Long): VehicleSpec? {
        // TODO load the data from external source (or check the model first)
        // TODO store the data into the model
        // TODO return result
        return model.specs[id] // placeholder so this compiles: returns only what is already in the model
    }
}

How to make a builder for a Kotlin data class with many immutable properties

I have a Kotlin data class that I am constructing with many immutable properties, which are being fetched from separate SQL queries. If I want to construct the data class using the builder pattern, how do I do this without making those properties mutable?
For example, instead of constructing via
var data = MyData(val1, val2, val3)
I want to use
builder.someVal(val1)
// compute val2
builder.someOtherVal(val2)
// ...
var data = builder.build()
while still using Kotlin's data class feature and immutable properties.
I agree with the data copy block in Grzegorz's answer, but it's essentially the same syntax as creating data classes with constructors. If you want to use that method and keep everything legible, you'll likely be computing everything beforehand and passing the values all together in the end.
To have something more like a builder, you may consider the following:
Let's say your data class is
data class Data(val text: String, val number: Int, val time: Long)
You can create a mutable builder version like so, with a build method to create the data class:
class Builder {
    var text = "hello"
    var number = 2
    var time = System.currentTimeMillis()
    internal fun build()
        = Data(text, number, time)
}
Along with a builder method like so:
fun createData(action: Builder.() -> Unit): Data {
    val builder = Builder()
    builder.action()
    return builder.build()
}
Action is a function from which you can modify the values directly, and createData will build it into a data class for you directly afterwards.
This way, you can create a data class with:
val data: Data = createData {
    //execute stuff here
    text = "new text"
    //calculate number
    number = -1
    //calculate time
    time = 222L
}
There are no setter methods per se, but you can directly assign the mutable variables with your new values and call other methods within the builder.
You can also make use of Kotlin's get and set by specifying your own functions for each variable so it can do more than set the field.
There's also no need for returning the current builder class, as you always have access to its variables.
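As a rough sketch of the custom-setter idea, the builder below normalizes text whenever it is assigned. The trimming rule is an arbitrary assumption chosen only to show a setter doing more than storing the value; everything else mirrors the builder described above.

```kotlin
data class Data(val text: String, val number: Int, val time: Long)

class Builder {
    // Custom setter: assumed normalization rule, here trimming whitespace
    var text = "hello"
        set(value) {
            field = value.trim()
        }
    var number = 2
    var time = System.currentTimeMillis()

    fun build() = Data(text, number, time)
}

// Runs the caller's block with the builder as receiver, then builds the immutable result
fun createData(action: Builder.() -> Unit): Data = Builder().apply(action).build()

fun main() {
    val data = createData {
        text = "   new text   " // stored as "new text" by the setter
        time = 222L
    }
    println(data)
}
```

The resulting Data instance stays fully immutable; only the short-lived Builder has mutable state.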
Additional note: If you care, createData can be shortened to this:
fun createData(action: Builder.() -> Unit): Data = with(Builder()) { action(); build() }
"With a new builder, apply our action and build"
I don't think Kotlin has native builders. You can always compute all values and create the object at the end.
If you still want to use a builder you will have to implement it by yourself. Check this question
There is no need for creating custom builders in Kotlin - in order to achieve builder-like semantics, you can leverage copy method - it's perfect for situations where you want to get object's copy with a small alteration.
data class MyData(val val1: String? = null, val val2: String? = null, val val3: String? = null)
val temp = MyData()
    .copy(val1 = "1")
    .copy(val2 = "2")
    .copy(val3 = "3")
Or:
val empty = MyData()
val with1 = empty.copy(val1 = "1")
val with2 = with1.copy(val2 = "2")
val with3 = with2.copy(val3 = "3")
Since you want everything to be immutable, copying must happen at every stage.
Also, it's fine to have mutable properties in the builder as long as the result produced by it is immutable.
It's possible to mechanize the creation of the builder classes with annotation processors.
I just created ephemient/builder-generator to demonstrate this.
Note that currently, kapt works fine for generated Java code, but there are some issues with generated Kotlin code (see KT-14070). For these purposes this isn't an issue, as long as the nullability annotations are copied through from the original Kotlin classes to the generated Java builders (so that Kotlin code using the generated Java code sees nullable/non-nullable types instead of just platform types).

How to find all classes in a package using reflection in kotlin

Is it possible to find all kotlin classes in a given package?
I also need only annotated classes but it's not a big deal. Any suggestions ?
Kotlin on the JVM suffers the same issue as Java in this regard due to the implementation of class loaders.
Class loaders are not required to tell the VM which classes it can provide, instead they are just handed requests for classes, and have to return a class or throw an exception.
Source and more information: Can you find all classes in a package using reflection?
To summarize the linked thread, there are a number of solutions that allow you to inspect your current class path.
The Reflections library is pretty straightforward and has a lot of additional functionality, like getting all subtypes of a class, getting all types/members annotated with some annotation (optionally with matching annotation parameters), etc.
Guava has ClassPath, which returns ClassInfo POJOs - not enough for your use case, but useful to know as Guava is available almost everywhere.
Write your own by querying classloader resources and code sources. Would not suggest this route unless you absolutely cannot add library dependencies.
Here's an example of querying classloader resources, adapted from https://www.javaworld.com/article/2077477/java-tip-113--identify-subclasses-at-runtime.html
Requires Java 8 or higher.
import java.io.File
import java.net.URL

// Call this function using something like:
//     findClasses("com.mypackage.mysubpackage")
// Modified from https://www.javaworld.com/article/2077477/java-tip-113--identify-subclasses-at-runtime.html
// Note: this only finds classes stored as .class files in a directory, not classes inside a JAR.
fun findClasses(pckgname: String) {
    // Translate the package name into an absolute resource path
    var name = pckgname
    if (!name.startsWith("/")) {
        name = "/$name"
    }
    name = name.replace('.', '/')
    // Get a File object for the package. `Launcher` stands for a class of yours
    // that is loaded from the classpath you want to scan.
    val url: URL = Launcher::class.java.getResource(name) ?: return
    val directory = File(url.file)
    println("Finding classes:")
    if (directory.exists()) {
        // Walk the package directory, skipping nested and anonymous classes (names containing '$')
        directory.walk()
            .filter { f -> f.isFile && !f.name.contains('$') && f.name.endsWith(".class") }
            .forEach {
                val fullyQualifiedClassName = pckgname +
                    it.canonicalPath.removePrefix(directory.canonicalPath)
                        .dropLast(6) // remove .class
                        .replace(File.separatorChar, '.')
                try {
                    // Try to create an instance of the object
                    val o = Class.forName(fullyQualifiedClassName).getDeclaredConstructor().newInstance()
                    if (o is MyInterfaceOrClass) {
                        println(fullyQualifiedClassName)
                        // Optionally, make a function call here: o.myFunction()
                    }
                } catch (cnfex: ClassNotFoundException) {
                    System.err.println(cnfex)
                } catch (nsmex: NoSuchMethodException) {
                    // The type has no no-argument constructor (this includes interfaces)
                } catch (iex: InstantiationException) {
                    // We tried to instantiate an abstract class
                } catch (iaex: IllegalAccessException) {
                    // The class or its constructor is not accessible
                }
            }
    }
}
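Since the question also asks for annotated classes only, the class names found by a scan like the one above could be filtered by annotation instead of being instantiated. The following is a minimal sketch under assumptions: Marker, Alpha and Beta are made-up names, and the list of class names is passed in by hand here rather than produced by a real classpath scan.

```kotlin
// Hypothetical marker annotation; it must have RUNTIME retention to be visible via reflection.
@Target(AnnotationTarget.CLASS)
@Retention(AnnotationRetention.RUNTIME)
annotation class Marker

// Made-up sample classes
@Marker class Alpha
class Beta

// Loads each named class and keeps those carrying the given annotation.
// Names that cannot be loaded are silently skipped.
fun annotatedWith(classNames: List<String>, annotation: Class<out Annotation>): List<Class<*>> =
    classNames.mapNotNull { name ->
        try {
            Class.forName(name)
        } catch (e: ClassNotFoundException) {
            null
        }
    }.filter { it.isAnnotationPresent(annotation) }

fun main() {
    val names = listOf(Alpha::class.java.name, Beta::class.java.name, "no.such.Clazz")
    println(annotatedWith(names, Marker::class.java).map { it.name })
}
```

Checking the annotation on the Class object avoids calling constructors at all, which sidesteps the missing-constructor and accessibility cases that instantiation has to handle.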