How to deserialize JSON from a Kafka ConsumerRecord - Kotlin

I'm looking to access some fields on a Kafka ConsumerRecord. I'm able to receive the event data, which is a Java object, i.e. ConsumerRecord(topic = test.topic, partition = 0, leaderEpoch = 0, offset = 0, CreateTime = 1660933724665, serialized key size = 32, serialized value size = 394, headers = RecordHeaders(headers = [], isReadOnly = false), key = db166cbf1e9e438ab4eae15093f89c34, value = {"eventInfo":...}).
I'm able to access the eventInfo values, which come back as a JSON string. I'm fairly new to Kotlin and Kafka, so I'm not entirely sure this is correct, but I basically want to access the fields in value. However, I can't get rid of the error that appears when calling mapper.readValue, which is:
None of the following functions can be called with the arguments supplied.
import com.afterpay.shop.favorites.model.Product
import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import org.apache.avro.generic.GenericData.Record
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.springframework.kafka.annotation.KafkaListener
import org.springframework.kafka.support.Acknowledgment
import org.springframework.stereotype.Component
@Component
class KafkaConsumer {
    @KafkaListener(topics = ["test.topic"], groupId = "group-id")
    fun consume(consumerRecord: ConsumerRecord<String, Any>, ack: Acknowledgment) {
        val mapper = jacksonObjectMapper()
        val value = consumerRecord.value()
        val record = mapper.readValue(value, Product::class.java)
        println(value)
        ack.acknowledge()
    }
}
Is this the correct way to accomplish this?

First, change ConsumerRecord<String, Any> to ConsumerRecord<String, Product>, then change value.deserializer in your consumer config/factory to use JsonDeserializer.
Then your consumerRecord.value() will already be a Product instance, and you don't need an ObjectMapper
https://docs.spring.io/spring-kafka/docs/current/reference/html/#json-serde
Otherwise, if you use StringDeserializer, change Any to String so that the mapper.readValue argument types are correct.
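To make both options concrete, here is a minimal sketch rather than a drop-in implementation; the broker address, the bean wiring, and the assumption that Product is a plain data class that Jackson can bind are mine:
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer
import org.springframework.context.annotation.Bean
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory
import org.springframework.kafka.core.ConsumerFactory
import org.springframework.kafka.core.DefaultKafkaConsumerFactory
import org.springframework.kafka.support.serializer.JsonDeserializer

// Option A: JsonDeserializer in the consumer factory, so value() is already a Product.
@Bean
fun consumerFactory(): ConsumerFactory<String, Product> =
    DefaultKafkaConsumerFactory(
        mapOf<String, Any>(
            ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG to "localhost:9092", // assumed broker address
            ConsumerConfig.GROUP_ID_CONFIG to "group-id"
        ),
        StringDeserializer(),
        JsonDeserializer(Product::class.java)
    )

@Bean
fun kafkaListenerContainerFactory(): ConcurrentKafkaListenerContainerFactory<String, Product> {
    val factory = ConcurrentKafkaListenerContainerFactory<String, Product>()
    factory.setConsumerFactory(consumerFactory())
    return factory
}

@KafkaListener(topics = ["test.topic"], groupId = "group-id", containerFactory = "kafkaListenerContainerFactory")
fun consume(consumerRecord: ConsumerRecord<String, Product>, ack: Acknowledgment) {
    println(consumerRecord.value()) // already a Product, no ObjectMapper needed
    ack.acknowledge()
}

// Option B: keep StringDeserializer and map the JSON yourself.
// readValue(String, Class<T>) resolves once value() returns String rather than Any.
@KafkaListener(topics = ["test.topic"], groupId = "group-id")
fun consumeAsString(consumerRecord: ConsumerRecord<String, String>, ack: Acknowledgment) {
    val product = jacksonObjectMapper().readValue(consumerRecord.value(), Product::class.java)
    println(product)
    ack.acknowledge()
}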

Related

Kotlin short-cut to assign value to variable using stream function or other

for (i in 0 until result.size) { result[i].config = addConfig(taskNames!![i], processKeys!![i]) }
Here result is a list of a class that has a data member config, and taskNames and processKeys are lists of strings.
Is there a way in Kotlin to map result.config to the respective taskNames and processKeys without using a traditional loop and referring to the length of result? I am new to Kotlin.
class Process {
    var processKey: String? = null
    var task: List<Task>? = null
}
class Task {
    var taskName: String? = null
    var processVariables: List<ProcessVariable>? = null
}
class ProcessVariable {
    var name: String? = null
    var label: String? = null
    var applicableValue: List<String>? = null
}
result is already present, with a data member config of type ProcessVariable.
If I understand your problem correctly, you need to combine 3 lists.
So iterating over the lists may be easier to understand than some clever way of list transformations.
You can get rid of the traditional for loop, so you don't need to calculate the size of the loop:
result.forEachIndexed { i, resultData ->
    resultData.config = addConfig(taskNames[i], processKeys[i])
}
If you want to combine two lists, you can use the zip method:
val configList = taskNames.zip(processKeys) { tsk, prc -> addConfig(tsk, prc) }
In your example, the result objects already exist. Maybe it is easier to create new result objects:
val results = configList.map {
    Result(config = it)
}
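As a further sketch (my own addition, assuming result, taskNames and processKeys all have the same length and that addConfig returns the config type), the three lists can also be combined in a single pass with zip:
// Pair each existing result with its (taskName, processKey) and set config in one pass.
result.zip(taskNames.zip(processKeys)) { res, (task, key) ->
    res.config = addConfig(task, key)
}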

Kafka-Streams: How to query RocksDB persistent store when the key is avro serialized?

I am building a Kafka Streams topology in which I need to materialize the contents of a GlobalKTable in a persistent RocksDB store. This store will then be used for querying on every instance of my application.
The store consists of key-value pairs, where the key is serialized using Avro and the payload is just a byte array.
My problem is when I try to query the store by a specific key, as shown in the code below:
fun topology() {
    val streamsBuilder = StreamsBuilder()
    streamsBuilder.globalTable<Key, ByteArray>(
        SOURCE_TOPIC_FOR_GLOBAL_KTABLE,
        Consumed.with(null /* uses the default serde provided in config, which is SpecificAvroSerde */,
            Serdes.ByteArray()),
        Materialized.`as`(STORE_NAME))
    val kafkaStreams = KafkaStreams(streamsBuilder.build(), getProperties(), DefaultKafkaClientSupplier())
    kafkaStreams.start()
}
private fun getProperties(): Properties {
    val properties = Properties()
    properties[StreamsConfig.BOOTSTRAP_SERVERS_CONFIG] = "kafka-brokers-ip"
    properties[AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG] = "schema-registry-ip"
    properties[StreamsConfig.APPLICATION_ID_CONFIG] = "app-id"
    properties[ConsumerConfig.CLIENT_ID_CONFIG] = "client-id"
    properties[ConsumerConfig.AUTO_OFFSET_RESET_CONFIG] = "earliest"
    properties[StreamsConfig.STATE_DIR_CONFIG] = "C:/dev/kafka-streams"
    properties[StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG] = SpecificAvroSerde::class.java
    properties[StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG] = SpecificAvroSerde::class.java
    return properties
}
And this is the function that I use to query the store:
val store = kafkaStreams
    .store(STORE_NAME, QueryableStoreTypes.keyValueStore<Key, ByteArray>())
store.all().forEach { keyValue ->
    if (keyValue.key == Key("25-OCT-19 USD")) {
        // we get here, so the key is in the store
        println("Keys are equal.")
        println(keyValue) // this gets printed: KeyValue({"value": "25-OCT-19 USD"}, [B@3ca343b5)
        println(store.get(Key("25-OCT-19 USD"))) // this is null
        println(store.get(keyValue.key)) // this is also null
    }
}
As you can see from the code comments, there are values in the store, but when I query by a specific key, I get nothing back.

Get or Insert within a Transaction on Doobie in Scala

I'm reading through the Doobie documentation and trying to do a simple get-or-create within a transaction. I get an Option from the first query and attempt a getOrElse, running an insert in the else case; however, I keep getting "value map is not a member of Any" within the getOrElse call. What's the correct way to either get an existing row or create a new row in instances and return that result in a transaction?
import doobie._
import doobie.implicits._
import cats._
import cats.effect._
import cats.implicits._
import org.joda.time.DateTime
import scala.concurrent.ExecutionContext
case class Instance(id : Int, hostname : String)
case class User(id : Int, instanceId: Int, username : String, email : String, created : DateTime)
class Database(dbUrl : String, dbUser: String, dbPass: String) {
implicit val cs = IO.contextShift(ExecutionContext.global)
val xa = Transactor.fromDriverManager[IO](
"org.postgresql.Driver", dbUrl, dbUser, dbPass
)
def getOrCreateInstance(hostname: String) = for {
existingInstance <- sql"SELECT id, hostname FROM instances i WHERE i.hostname = $hostname".query[Instance].option
ensuredInstance <- existingInstance.getOrElse(sql"INSERT INTO instances(hostname) VALUES(?)".update.withGeneratedKeys[Instance]("id", "hostname"))
} yield ensuredInstance
}
I got the following answer thanks to the people in the #scala/freenode chatroom. I'm posting it here for completeness and for anyone interested in doing this without the for comprehension used in the other answer.
def getOrCreateInstance(hostname: String): ConnectionIO[Instance] =
OptionT(sql"SELECT id, hostname FROM instances i WHERE i.hostname = $hostname".query[Instance].option)
.getOrElseF(sql"INSERT INTO instances(hostname) VALUES($hostname)".update.withGeneratedKeys[Instance]("id", "hostname").compile.lastOrError)
I believe something like this should work for you,
def getOrCreateInstance(hostname: String): ConnectionIO[Instance] = for {
existingInstance <- sql"SELECT id, hostname FROM instances i WHERE i.hostname = $hostname".query[Instance].option
ensuredInstance <- existingInstance.fold(sql"INSERT INTO instances(hostname) VALUES($hostname)".update.withGeneratedKeys[Instance]("id", "hostname").take(1).compile.lastOrError)(_.pure[ConnectionIO])
} yield ensuredInstance
where you compile the fs2 Stream and also lift the existing instance into a ConnectionIO in the case that it already exists.

Avro specific vs generic record types - which is best or can I convert between?

We’re trying to decide between providing generic vs specific record formats for consumption by our clients
with an eye to providing an online schema registry clients can access when the schemas are updated.
We expect to send out serialized blobs prefixed with a few bytes denoting the version number so schema
retrieval from our registry can be automated.
Now, we’ve come across code examples illustrating the relative adaptability of the generic format for
schema changes but we’re reluctant to give up the type safety and ease-of-use provided by the specific
format.
Is there a way to obtain the best of both worlds? I.e. could we work with and manipulate the specific generated
classes internally and then have them converted to generic records automatically just before serialization?
Clients would then deserialize the generic records (after looking up the schema).
Also, could clients convert these generic records they received to specific ones at a later time? Some small code examples would be helpful!
Or are we looking at this all the wrong way?
What you are looking for is the Confluent Schema Registry service and the libraries that help integrate with it.
Here is a sample that serializes and deserializes Avro data with an evolving schema. Please note the sample is Kafka-based.
import io.confluent.kafka.serializers.KafkaAvroDeserializer;
import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.apache.avro.generic.GenericRecord;
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Hex;
import java.util.HashMap;
import java.util.Map;
public class ConfluentSchemaService {

    public static final String TOPIC = "DUMMYTOPIC";

    private KafkaAvroSerializer avroSerializer;
    private KafkaAvroDeserializer avroDeserializer;

    public ConfluentSchemaService(String confluentSchemaRegistryURL) {
        // Properties map
        Map<String, String> propMap = new HashMap<>();
        propMap.put("schema.registry.url", confluentSchemaRegistryURL);
        // Output after deserialization should be a specific record and not a generic record
        propMap.put("specific.avro.reader", "true");

        avroSerializer = new KafkaAvroSerializer();
        avroSerializer.configure(propMap, true);

        avroDeserializer = new KafkaAvroDeserializer();
        avroDeserializer.configure(propMap, true);
    }

    public String hexBytesToString(byte[] inputBytes) {
        return Hex.encodeHexString(inputBytes);
    }

    public byte[] hexStringToBytes(String hexEncodedString) throws DecoderException {
        return Hex.decodeHex(hexEncodedString.toCharArray());
    }

    public byte[] serializeAvroPOJOToBytes(GenericRecord avroRecord) {
        return avroSerializer.serialize(TOPIC, avroRecord);
    }

    public Object deserializeBytesToAvroPOJO(byte[] avroBytearray) {
        return avroDeserializer.deserialize(TOPIC, avroBytearray);
    }
}
The following classes have all the code you are looking for:
io.confluent.kafka.serializers.KafkaAvroDeserializer
io.confluent.kafka.serializers.KafkaAvroSerializer
Please follow the link for more details:
http://bytepadding.com/big-data/spark/avro/avro-serialization-de-serialization-using-confluent-schema-registry/
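A brief usage sketch of the class above, in Kotlin (the registry URL and myAvroRecord are placeholders of my own; because specific.avro.reader is set to true, deserialization yields a specific record when the generated class is on the classpath):
// Round-trip an Avro record (a GenericRecord or a generated SpecificRecord) through the service.
val schemaService = ConfluentSchemaService("http://schema-registry:8081") // assumed URL
val bytes = schemaService.serializeAvroPOJOToBytes(myAvroRecord)
val hex = schemaService.hexBytesToString(bytes)       // handy for logging or text storage
val decoded = schemaService.deserializeBytesToAvroPOJO(schemaService.hexStringToBytes(hex))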
Can I convert between them?
I wrote the following Kotlin code to convert from a SpecificRecord to a GenericRecord and back, via JSON.
PositionReport is a class generated from Avro with the Avro plugin for Gradle; it is:
@org.apache.avro.specific.AvroGenerated
public class PositionReport extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
...
The functions used are below
/**
 * Encodes a record in AVRO Compatible JSON, meaning union types
 * are wrapped. For prettier JSON just use the Object Mapper
 * @param pos PositionReport
 * @return String
 */
private fun PositionReport.toAvroJson(): String {
    val writer = SpecificDatumWriter(PositionReport::class.java)
    val baos = ByteArrayOutputStream()
    val jsonEncoder = EncoderFactory.get().jsonEncoder(this.schema, baos)
    writer.write(this, jsonEncoder)
    jsonEncoder.flush()
    return baos.toString("UTF-8")
}
/**
 * Converts from Generic Record into JSON - Seems smarter, however,
 * to unify this function and the one above but whatevs
 * @param record GenericRecord
 * @param schema Schema
 */
private fun GenericRecord.toAvroJson(): String {
    val writer = GenericDatumWriter<Any>(this.schema)
    val baos = ByteArrayOutputStream()
    val jsonEncoder = EncoderFactory.get().jsonEncoder(this.schema, baos)
    writer.write(this, jsonEncoder)
    jsonEncoder.flush()
    return baos.toString("UTF-8")
}
/**
 * Takes a Generic Record of a position report and hopefully turns
 * it into a position report... maybe it will work
 * @param gen GenericRecord
 * @return PositionReport
 */
private fun toPosition(gen: GenericRecord): PositionReport {
    if (gen.schema != PositionReport.getClassSchema()) {
        throw Exception("Cannot convert GenericRecord to PositionReport as the Schemas do not match")
    }
    // We will convert into JSON - and use that to then convert back to the SpecificRecord
    // Probably there is a better way
    val json = gen.toAvroJson()
    val reader: DatumReader<PositionReport> = SpecificDatumReader(PositionReport::class.java)
    val decoder: Decoder = DecoderFactory.get().jsonDecoder(PositionReport.getClassSchema(), json)
    val pos = reader.read(null, decoder)
    return pos
}
/**
 * Converts a Specific Record to a Generic Record (I think)
 * @param pos PositionReport
 * @return GenericData.Record
 */
private fun toGenericRecord(pos: PositionReport): GenericData.Record {
    val json = pos.toAvroJson()
    val reader: DatumReader<GenericData.Record> = GenericDatumReader(pos.schema)
    val decoder: Decoder = DecoderFactory.get().jsonDecoder(pos.schema, json)
    val datum = reader.read(null, decoder)
    return datum
}
There are a couple of differences, however, between the two:
Fields in the SpecificRecord that are of Instant type will be encoded in the GenericRecord as long, and enums are slightly different.
So, for example, in my unit test of this function, time fields are tested like this:
val gen = toGenericRecord(basePosition)
assertEquals(basePosition.getIgtd().toEpochMilli(), gen.get("igtd"))
And enums are validated by string:
val gen = toGenericRecord(basePosition)
assertEquals(basePosition.getSource().toString(), gen.get("source").toString())
So to convert between the two you can do:
val gen = toGenericRecord(basePosition)
val newPos = toPosition(gen)
assertEquals(newPos, basePosition)
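On the "probably there is a better way" note, one possible alternative (my own sketch, not part of the original answer, and assuming the same PositionReport class) is to round-trip through Avro binary instead of JSON:
private fun toGenericViaBinary(pos: PositionReport): GenericData.Record {
    // Write with the specific writer, then read back with a generic reader over the same schema.
    val baos = ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(baos, null)
    SpecificDatumWriter(PositionReport::class.java).write(pos, encoder)
    encoder.flush()
    val decoder = DecoderFactory.get().binaryDecoder(baos.toByteArray(), null)
    val reader: DatumReader<GenericData.Record> = GenericDatumReader(pos.schema)
    return reader.read(null, decoder)
}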

ReactiveMongo : How to write macros handler to Enumeration object?

I use ReactiveMongo 0.10.0, and I have the following user case class and gender Enumeration object:
case class User(
_id: Option[BSONObjectID] = None,
name: String,
gender: Option[Gender.Gender] = None)
object Gender extends Enumeration {
type Gender = Value
val MALE = Value("male")
val FEMALE = Value("female")
val BOTH = Value("both")
}
And I declare two implicit macro handlers:
implicit val genderHandler = Macros.handler[Gender.Gender]
implicit val userHandler = Macros.handler[User]
But when I run the application, I get the following error:
Error:(123, 48) No apply function found for reactive.userservice.Gender.Gender
implicit val genderHandler = Macros.handler[Gender.Gender]
^
Error:(125, 46) Implicit reactive.userservice.Gender.Gender for 'value gender' not found
implicit val userHandler = Macros.handler[User]
^
Does anybody know how to write a macro handler for an Enumeration object?
Thanks in advance!
I stumbled upon your question a few times searching for the same answer. I did it this way:
import myproject.utils.EnumUtils
import play.api.libs.json.{Reads, Writes}
import reactivemongo.bson._
object DBExecutionStatus extends Enumeration {
  type DBExecutionStatus = Value
  val Error = Value("Error")
  val Started = Value("Success")
  val Created = Value("Running")

  implicit val enumReads: Reads[DBExecutionStatus] = EnumUtils.enumReads(DBExecutionStatus)
  implicit def enumWrites: Writes[DBExecutionStatus] = EnumUtils.enumWrites

  implicit object BSONEnumHandler extends BSONHandler[BSONString, DBExecutionStatus] {
    def read(doc: BSONString) = DBExecutionStatus.withName(doc.value)
    def write(stats: DBExecutionStatus) = BSON.write(stats.toString)
  }
}
You have to create a read/write pair by hand and populate it with your values.
Hope you already solved this issue given the question age :D