Spring WebFlux: I want to send data to Kafka after saving to the database

I'm trying to send data to Kafka after my database operation is successful.
I have a POST endpoint which stores the data in MongoDB and returns the whole object along with the MongoDB UUID.
Now I want to perform an additional task: if the data is successfully saved in MongoDB, I should call my Kafka producer method and send the data.
Not sure how to do it.
Current codebase:
public Mono<?> createStock(StockDTO stockDTONBody) {
    // logger.info("Received StockDTO body: {}", stockDTONBody);
    Mono<StockDTO> stockDTO = mongoTemplate.save(stockDTONBody);
    // HERE I WANT TO SEND TO KAFKA IF DATA IS SAVED TO MONGO.
    return stockDTO;
}

Thanks @Alex for the help. I am adding my answer for others.
public Mono<?> createStock(StockDTO stockDTONBody) {
    // logger.info("Received StockDTO body: {}", stockDTONBody);
    Mono<StockDTO> stockDTO = mongoTemplate.save(stockDTONBody);
    // =============== Kafka code added ======================
    return stockDTO.flatMap(data -> sendToKafka(data, "create"));
}
public Mono<?> sendToKafka(StockDTO stockDTO, String eventName) {
    Map<String, Object> data = new HashMap<>();
    data.put("event", eventName);
    data.put("campaign", stockDTO);
    // fire-and-forget: the Kafka send is subscribed to here, independently of the returned Mono
    template.send(kafkaTopicName, data.toString()).log().subscribe();
    System.out.println("sending to Kafka " + eventName + data.toString());
    return Mono.just(stockDTO);
}

This can result in dual writes: if your data is saved in Mongo and something goes wrong while publishing to Kafka, the data will be missing in Kafka. Instead you should use change data capture for this. Mongo provides change streams, which can be used here, or there are other open-source Kafka connectors available that you can configure to listen to MongoDB's changelog and stream those changes to Kafka.
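For reference, a minimal change-stream sketch using Spring Data's ReactiveMongoTemplate and reusing the reactive Kafka template from above; the collection name "stocks" is illustrative, and the connector-based approach mentioned above is the more robust option in production:

reactiveMongoTemplate
        .changeStream("stocks", ChangeStreamOptions.empty(), StockDTO.class)
        .mapNotNull(ChangeStreamEvent::getBody)   // e.g. delete events carry no body
        .flatMap(stock -> template.send(kafkaTopicName, stock.toString()))
        .subscribe();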

Related

Problem with MassTransit dynamic event publish (JSON event)

We need to publish multiple events as JSON strings from the DB. We publish the JSON event with MassTransit like this:
using var scope = _serviceScopeFactory.CreateScope();
var sendEndpointProvider = scope.ServiceProvider.GetService<ISendEndpointProvider>();
var endpoint = await sendEndpointProvider.GetSendEndpoint(new System.Uri("exchange:IntegrationEvents.DynamicEvent:DynamicEvent"));
var json = JsonConvert.SerializeObject(dynamicObject, Newtonsoft.Json.Formatting.None); // sample
var obj = JsonConvert.DeserializeObject(json, new JsonSerializerSettings { });
await endpoint.Send(obj, i => i.Serializer.ContentType.MediaType = "application/json");
and in the configuration we use:
cfg.UseRawJsonSerializer();
When we use this config, the JSON event is published successfully, but we have a strange problem: all event consumers are called with empty message data. In RabbitMQ just our "DynamicEvent" is published, but in MassTransit all consumers are called.
Thank you for letting us know if we made a mistake.
You don't need all of that JSON manipulation; just send the message object using the endpoint with the serializer configured for RawJson. I cover JSON interoperability in this video.
Also, MassTransit does not allow anonymous types to be published. You might be able to publish dynamic or Expando objects.
I used ExpandoObject like this and get the exception "Messages types must not be in the System namespace: System.Dynamic.ExpandoObject":
dynamic dynamicObject = new ExpandoObject();
dynamicObject.Id = 1;
dynamicObject.Name = "NameForName";
await endpoint.Send(dynamicObject);
and using it like this we get the same result, "all consumers called":
var dynamicObject = new ExpandoObject() as IDictionary<string, object>;
dynamicObject.Add("Id", 1);
dynamicObject.Add("Name", "NameForName");
I watched your great video; you used RabbitMQ directly. How do I "send the message object using the endpoint with the serializer configured for RawJson" in C# code?

How to get the latest data from Redis in real time?

I want to implement a real-time application with Redis.
There is data that is pushed to Redis in real time, as in the source code below, which uses the Lettuce library.
RedisClient redisClient = RedisClient.create(uri);
StatefulRedisConnection<String, String> connection = redisClient.connect();
RedisStringAsyncCommands<String, String> asyncCommands = connection.async();
List<RedisFuture<?>> futures = Lists.newArrayList();
while (true) {
    futures.add(asyncCommands.set("key", "value"));
}
If I want to check the data on the client in real time, how can I implement it?
At first I used the pub/sub approach, but pub/sub could not get the stored data; it just publishes data to a channel and subscribers receive it in real time (a sketch of that pub/sub approach is shown at the end of this question).
For example, Kafka can continuously get data through a consumer, like this:
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        logger.info("offset = {}, value = {}", record.offset(), record.value());
    }
}
Are there any ways to do this?
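For reference, the pub/sub approach I tried looks roughly like this with Lettuce (the channel name "updates" is illustrative); a subscriber only receives messages published while it is connected, which is why the already stored values cannot be fetched this way:

StatefulRedisPubSubConnection<String, String> pubSub = redisClient.connectPubSub();
pubSub.addListener(new RedisPubSubAdapter<String, String>() {
    @Override
    public void message(String channel, String message) {
        // only messages published while subscribed arrive here;
        // values already stored in Redis are not replayed
        System.out.println(channel + " -> " + message);
    }
});
pubSub.sync().subscribe("updates");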

Streaming objects from S3 using Spring AWS Integration

I am working on a use case where I am supposed to poll S3, read the stream for the content, do some processing, and upload it to another bucket rather than writing the file to my server.
I know I can achieve this using S3StreamingMessageSource in Spring AWS Integration, but the problem I am facing is that I do not know how to process the message stream received by polling.
public class S3PollerConfigurationUsingStreaming {

    @Value("${amazonProperties.bucketName}")
    private String bucketName;

    @Value("${amazonProperties.newBucket}")
    private String newBucket;

    @Autowired
    private AmazonClientService amazonClient;

    @Bean
    @InboundChannelAdapter(value = "s3Channel", poller = @Poller(fixedDelay = "100"))
    public MessageSource<InputStream> s3InboundStreamingMessageSource() {
        S3StreamingMessageSource messageSource = new S3StreamingMessageSource(template());
        messageSource.setRemoteDirectory(bucketName);
        messageSource.setFilter(new S3PersistentAcceptOnceFileListFilter(new SimpleMetadataStore(),
                "streaming"));
        return messageSource;
    }

    @Bean
    @Transformer(inputChannel = "s3Channel", outputChannel = "data")
    public org.springframework.integration.transformer.Transformer transformer() {
        return new StreamTransformer();
    }

    @Bean
    public S3RemoteFileTemplate template() {
        return new S3RemoteFileTemplate(new S3SessionFactory(amazonClient.getS3Client()));
    }

    @Bean
    public PollableChannel s3Channel() {
        return new QueueChannel();
    }

    @Bean
    IntegrationFlow fileStreamingFlow() {
        return IntegrationFlows
                .from(s3InboundStreamingMessageSource(),
                        e -> e.poller(p -> p.fixedDelay(30, TimeUnit.SECONDS)))
                .handle(streamFile())
                .get();
    }
}
Can someone please help me with the code to process the stream?
Not sure what your problem is, but I see that you have a mix of concerns. If you use messaging annotations (see @InboundChannelAdapter in your config), what is the point of using the same s3InboundStreamingMessageSource in the IntegrationFlow definition?
Anyway, it looks like you have already explored StreamTransformer for yourself. It has a charset property to convert your InputStream from the remote S3 resource to a String; otherwise it returns a byte[]. Everything else, what to do with this converted content and how, is up to you.
Also, I don't see a reason to have s3Channel as a QueueChannel, since the start of your flow is pollable anyway through the @InboundChannelAdapter.
At a high level I would say we have more questions for you than vice versa...
UPDATE
It is not clear what your idea for the InputStream processing is, but it really is a fact that after the S3StreamingMessageSource you are going to have exactly an InputStream as the payload in the next handler.
Also not sure what your streamFile() is, but it must really expect an InputStream as input from the payload of the request message.
You can also use the mentioned StreamTransformer over there:
@Bean
IntegrationFlow fileStreamingFlow() {
    return IntegrationFlows
            .from(s3InboundStreamingMessageSource(),
                    e -> e.poller(p -> p.fixedDelay(30, TimeUnit.SECONDS)))
            .transform(Transformers.fromStream("UTF-8"))
            .get();
}
And the next .handle() will be ready for String as a payload.
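For the original use case (process the content and copy it to another bucket), the flow could continue with a handler along these lines. This is only a sketch: the processing step is a placeholder, it assumes amazonClient.getS3Client() returns an AmazonS3 client as in the config above, and it assumes the FileHeaders.REMOTE_FILE header populated by the streaming source is a usable object key:

@Bean
IntegrationFlow fileStreamingFlow() {
    return IntegrationFlows
            .from(s3InboundStreamingMessageSource(),
                    e -> e.poller(p -> p.fixedDelay(30, TimeUnit.SECONDS)))
            .transform(Transformers.fromStream("UTF-8"))
            .handle(String.class, (payload, headers) -> {
                // placeholder for the real processing logic
                String processed = payload.toUpperCase();
                // FileHeaders.REMOTE_FILE carries the source object name
                String key = (String) headers.get(FileHeaders.REMOTE_FILE);
                amazonClient.getS3Client().putObject(newBucket, key, processed);
                return null; // nothing to send downstream; the flow ends here
            })
            .get();
}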

Java: insert a single row at a time into Google BigQuery?

I am creating an application where every time a user clicks on an article, I need to capture the article data and the user data to calculate the reach of every article and be able to run analytics on the reach data.
My application is on App Engine.
When I check the documentation for inserts into BQ, most of it points towards bulk inserts in the form of jobs or streams.
Question:
Is it even good practice to insert into BigQuery one row at a time every time a user action is initiated? If so, could you point me to some Java code to do this effectively?
There are limits on the number of load jobs and DML queries (1,000 per day), so you'll need to use streaming inserts for this kind of application. Note that streaming inserts are different from loading data from a Java stream.
TableId tableId = TableId.of(datasetName, tableName);
// Values of the row to insert
Map<String, Object> rowContent = new HashMap<>();
rowContent.put("booleanField", true);
// Bytes are passed in base64
rowContent.put("bytesField", "Cg0NDg0="); // 0xA, 0xD, 0xD, 0xE, 0xD in base64
// Records are passed as a map
Map<String, Object> recordsContent = new HashMap<>();
recordsContent.put("stringField", "Hello, World!");
rowContent.put("recordField", recordsContent);
InsertAllResponse response =
    bigquery.insertAll(
        InsertAllRequest.newBuilder(tableId)
            .addRow("rowId", rowContent)
            // More rows can be added in the same RPC by invoking .addRow() on the builder
            .build());
if (response.hasErrors()) {
    // If any of the insertions failed, this lets you inspect the errors
    for (Entry<Long, List<BigQueryError>> entry : response.getInsertErrors().entrySet()) {
        // inspect row error
    }
}
(From the example at https://cloud.google.com/bigquery/streaming-data-into-bigquery#bigquery-stream-data-java)
Note especially that a failed insert does not always throw an exception. You must also check the response object for errors.
Is it even good practice to insert into BigQuery one row at a time every time a user action is initiated?
Yes, it's pretty typical to stream event data to BigQuery for analytics. You could get better performance if you buffer multiple events into the same streaming insert request to BigQuery, but one row at a time is definitely supported.
A simplified version of Google's example:
Map<String, Object> row1Data = new HashMap<>();
row1Data.put("booleanField", true);
row1Data.put("stringField", "myString");
Map<String, Object> row2Data = new HashMap<>();
row2Data.put("booleanField", false);
row2Data.put("stringField", "myOtherString");
TableId tableId = TableId.of("myDatasetName", "myTableName");
InsertAllResponse response =
    bigQuery.insertAll(
        InsertAllRequest.newBuilder(tableId)
            .addRow("row1Id", row1Data)
            .addRow("row2Id", row2Data)
            .build());
if (response.hasErrors()) {
    // If any of the insertions failed, this lets you inspect the errors
    for (Map.Entry<Long, List<BigQueryError>> entry : response.getInsertErrors().entrySet()) {
        // inspect row error
    }
}
You can use the Cloud Logging API to write one row at a time.
https://cloud.google.com/logging/docs/reference/libraries
Sample code from the documentation:
public class QuickstartSample {
    /** Expects a new or existing Cloud log name as the first argument. */
    public static void main(String... args) throws Exception {
        // Instantiates a client
        Logging logging = LoggingOptions.getDefaultInstance().getService();
        // The name of the log to write to
        String logName = args[0]; // "my-log";
        // The data to write to the log
        String text = "Hello, world!";
        LogEntry entry =
            LogEntry.newBuilder(StringPayload.of(text))
                .setSeverity(Severity.ERROR)
                .setLogName(logName)
                .setResource(MonitoredResource.newBuilder("global").build())
                .build();
        // Writes the log entry asynchronously
        logging.write(Collections.singleton(entry));
        System.out.printf("Logged: %s%n", text);
    }
}
In this case you need to create a sink from the logs. The messages will then be routed to the BigQuery table.
https://cloud.google.com/logging/docs/export/configure_export_v2
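The sink can also be created programmatically with the same google-cloud-logging client; a rough sketch, where the sink name, dataset, and filter are illustrative:

Logging logging = LoggingOptions.getDefaultInstance().getService();
SinkInfo sinkInfo =
    SinkInfo.newBuilder(
            "article-clicks-sink",
            SinkInfo.Destination.DatasetDestination.of("my_dataset"))
        .setFilter("logName:my-log")  // route only the relevant log entries
        .build();
Sink sink = logging.create(sinkInfo);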

RabbitMQ: selectively retrieving messages from a queue

I'm new to RabbitMQ and was wondering about a good approach to this problem I'm mulling over. I want to create a service that subscribes to a queue and only pulls messages that meet specific criteria; for instance, if a specific subject header is in the message.
I'm still learning about RabbitMQ and was looking for tips on how to approach this. My questions include: how can the consumer pull only specific messages from the queue? How can the producer set a subject header in the message (if that's even the right term)?
RabbitMQ is perfect for this situation. You have a number of options to do what you want, and I suggest reading the documentation to get a better understanding. I would suggest that you use a topic or direct exchange; topic is more flexible. It goes like this:
The producer code connects to the RabbitMQ broker and creates an exchange with a specific name.
The producer publishes to the exchange. Each message is published with a routing key.
The consumer connects to the RabbitMQ broker.
The consumer creates a queue.
The consumer binds the queue to the exchange, the same exchange defined by the producer. The binding also includes the routing keys of the messages required by this particular consumer.
Let's say you were publishing log messages. The routing key might be something like "log.info", "log.warn", "log.error", and each message published by the producer will have the relevant routing key attached. You will then have a consumer that sends an email for all the error messages and another one that writes all the error messages to a file. The emailer will define the binding from its queue to the exchange with the routing key "log.error". This way, although the exchange receives all messages, the queue defined for the emailer will only contain the error messages. The file logger will define a new, separate queue bound to the same exchange and set up a different routing key. You could do three separate bindings for the three different routing keys required, or just use the wildcard "log.*" to request all messages from the exchange starting with "log.".
This is a simple example that shows how you can achieve what you want to do; a minimal consumer-side sketch follows below. Look here for code examples, specifically tutorial number 5.
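A sketch of the emailer binding described above, using the Java amqp-client; the exchange name "logs", the queue setup, and the routing key are illustrative and must match what your producer declares:

import com.rabbitmq.client.*;

public class ErrorEmailConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        // topic exchange shared with the producer
        channel.exchangeDeclare("logs", "topic");

        // each consumer gets its own queue, bound only with the routing keys it cares about
        String queueName = channel.queueDeclare().getQueue();
        channel.queueBind(queueName, "logs", "log.error");

        DeliverCallback deliverCallback = (consumerTag, delivery) -> {
            String message = new String(delivery.getBody(), "UTF-8");
            // only "log.error" messages ever reach this queue
            System.out.println("emailer received: " + message);
        };
        channel.basicConsume(queueName, true, deliverCallback, consumerTag -> { });
    }
}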
Making the best use of RabbitMQ's exchange/routing is recommended. If you do want to check based on the message content, the following code is a viable solution:
retrieve messages from the queue, check them, and selectively ack the messages you're interested in.
Pull one message:
    GetResponse resp = channel.basicGet(QUEUE_NAME, false);
Ack one message:
    channel.basicAck(resp.getEnvelope().getDeliveryTag(), false);
Example
import com.rabbitmq.client.*;

public class ReceiveLogs {
    private final static String QUEUE_NAME = "hello";

    public static void main(String[] argv) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            channel.queueDeclare(QUEUE_NAME, true, false, false, null);
            // pull one message, ack it manually, and exit
            GetResponse resp = channel.basicGet(QUEUE_NAME, false);
            if (resp != null) {
                String message = new String(resp.getBody(), "UTF-8");
                System.out.println(" [x] Received '" + message + "'");
                channel.basicAck(resp.getEnvelope().getDeliveryTag(), false);
            }
            System.out.println();
        }
    }
}
dependency
compile group: 'com.rabbitmq', name: 'amqp-client', version: '5.8.0'
To retrieve messages from RabbitMQ via the management HTTP API, we first need to connect to the RabbitMQ server:
public WebClient GetRabbitMqConnection(string userName, string password)
{
    var client = new WebClient();
    client.Credentials = new NetworkCredential(userName, password);
    return client;
}
Now retrieve the messages from RabbitMQ using the code below:
public string GetRabbitMQMessages(string domainName, string port,
    string queueName, string virtualHost, WebClient client, string methodType)
{
    string messageResult = string.Empty;
    string strUri = "http://" + domainName + ":" + port +
        "/api/queues/" + virtualHost + "/";
    var data = client.DownloadString(strUri + queueName + "/");
    var queueInfo = JsonConvert.DeserializeObject<QueueInfo>(data);
    if (queueInfo == null || queueInfo.messages == 0)
        return string.Empty;
    if (methodType == "POST")
    {
        string postbody =
            "{\"ackmode\":\"ack_requeue_true\",\"count\":" +
            "\"$totalMessageCount\",\"name\":\"${DomainName}\"," +
            "\"requeue\":\"false\",\"encoding\":\"auto\",\"vhost\":" +
            "\"${QueueName}\"}";
        postbody = postbody
            .Replace("$totalMessageCount", queueInfo.messages.ToString())
            .Replace("${DomainName}", domainName)
            .Replace("${QueueName}", queueName);
        messageResult = client.UploadString(strUri + queueName +
            "/get", "POST", postbody);
    }
    return messageResult;
}
I think this will help you to implement RabbitMQ.
If you want to retrieve a single message at a time, add the following to your retrieving code:
Boolean autoAck = false;
model.BasicConsume(Queuename, autoAck);
model.BasicGet("Queuename", false);
model.BasicGet("Queuename", false);
By adding these RabbitMQ properties you can retrieve messages one by one from the queue, in FIFO order.