Byte array in heap dump held by Subscriber in Spring Cloud Sleuth Reactor ScopePassingSpanSubscriber - spring-webflux

We are using AWS Fargate ECS tasks for our Spring WebFlux Java 11 microservice, built FROM the gcr.io/distroless/java:11 image. We downloaded a heap dump of the application and can see huge byte[] instances being held in memory by Spring Cloud Sleuth. Below are our Spring Cloud Sleuth settings:
sleuth:
  baggage:
    correlation-fields: client-id
    remote-fields: client-id
  propagation:
    type: AWS
  reactor:
    instrumentation-type: DECORATE_QUEUES
  instrument:
    web:
      skipPattern: (^swagger.*|.+docs.*)
Below is a screenshot of the heap dump object in memory. This byte[] has the references shown below. We do process document byte[] content by downloading it from AWS S3, but these byte[] references in memory appear to come from the Spring Cloud Sleuth library. They account for around 63% of the consumed heap. We are deliberately using DECORATE_QUEUES for its lower performance impact.
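For context, DECORATE_QUEUES wraps Reactor's internal queues so that the trace context travels with queued elements. If even that retention turns out to be the problem, Sleuth 3.x also offers a MANUAL instrumentation type, in which nothing is decorated automatically and the trace context is only restored where explicitly requested. A rough sketch (the endpoint and log message are just placeholders):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.cloud.sleuth.instrument.web.WebFluxSleuthOperators;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;
import reactor.core.publisher.SignalType;

// Requires: spring.sleuth.reactor.instrumentation-type: MANUAL
@RestController
class ManualInstrumentationController {

    private static final Logger log = LoggerFactory.getLogger(ManualInstrumentationController.class);

    @GetMapping("/manual")
    Mono<String> manual() {
        return Mono.just("hello")
                .map(String::toUpperCase)
                // With MANUAL instrumentation the span is only put back in scope
                // for this specific callback, not for every queue/operator.
                .doOnEach(WebFluxSleuthOperators.withSpanInScope(SignalType.ON_NEXT,
                        signal -> log.info("correlated log line [{}]", signal.get())));
    }
}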

Related

AWS S3 Java SDK V1 expose JMX metrics

I want to view AmazonS3 JMX metrics, specifically the number of retries performed by the client. The only problem is that, according to this article, it seems that only CloudWatch is supported, and I want to inspect the metrics on the fly using JConsole.
Enabling metrics without a credentials file doesn't seem to work.
Is there some kind of workaround?
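For reference, the v1 SDK's default metric collection can also be switched on programmatically instead of via the -Dcom.amazonaws.sdk.enableDefaultMetrics JVM flag; whether that actually surfaces the retry counts in JConsole (rather than only in CloudWatch) is exactly the open question here, so treat the sketch below only as a starting point:

import com.amazonaws.metrics.AwsSdkMetrics;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class S3MetricsDemo {

    public static void main(String[] args) {
        // Programmatic equivalent of -Dcom.amazonaws.sdk.enableDefaultMetrics:
        // switches on the SDK's default metric collection; administration of
        // the collector is exposed over JMX.
        AwsSdkMetrics.enableDefaultMetrics();

        // Any subsequent SDK calls are now measured.
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        s3.listBuckets().forEach(bucket -> System.out.println(bucket.getName()));
    }
}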

Connecting to pivotal cloud cache from a spring boot gemfire client app on non PCF (VSI) Platform

I have a Pivotal Cloud Cache service with an HTTPS URL, and I can connect to the HTTPS service via gfsh.
I have a Spring Boot app annotated with @ClientCacheApplication, which is running on a VSI, on a separate VSI server, in a non-PCF / non-cloud environment.
Is there a way to connect to the HTTPS PCC service from the Spring Boot client app?
First, you should be using Spring Boot for Apache Geode [alternatively, VMware Tanzu GemFire] (SBDG); see the project page and documentation for more details.
By using SBDG, you eliminate the need to explicitly annotate your Spring Boot (and Apache Geode or GemFire ClientCache) application with SDG's @ClientCacheApplication annotation. See here for more details.
NOTE: If you are unfamiliar with SBDG, then you can follow along in the Getting Started Sample. Keep in mind that SBDG is just an extension of Spring Boot dedicated to Apache Geode (and GemFire).
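As a quick illustration (the class name is just an example), with the org.springframework.geode:spring-geode-starter dependency on the classpath a minimal SBDG client needs no cache annotation at all; SBDG auto-configures the ClientCache:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// With spring-geode-starter on the classpath, SBDG auto-configures a
// ClientCache bean; no @ClientCacheApplication annotation is required.
@SpringBootApplication
public class ClientApp {

    public static void main(String[] args) {
        SpringApplication.run(ClientApp.class, args);
    }
}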
I also have documentation on connecting your Spring Boot app to a Pivotal Cloud Cache (now known as VMware Tanzu GemFire for VMs) instance.
One particular piece of documentation that is not yet present in SBDG's docs covers the case where your Spring Boot application runs off-platform (that is, it has not been deployed to Pivotal CloudFoundry, or rather VMware Tanzu Application Service) and is "connecting" to the Pivotal Cloud Cache (VMware Tanzu GemFire for VMs) service on-platform (that is, GemFire running in PCF as PCC, or in VMW TAS as VMW Tanzu GemFire for VMs).
To do this, you need to use the new SNI Services Gateway provided by GemFire itself. This interface allows GemFire/Geode clients (whether a Spring Boot application or otherwise) to run off-platform, yet still connect to the GemFire service (PCC or VMW Tanzu GemFire for VMs) on-platform (e.g. PCF or VMW TAS).
This is also required if you are deploying your Spring Boot application in its own foundation on-platform, separately from the services foundation where the GemFire service is running. For example, if you deploy and run your Spring Boot app in the APP_A_FOUNDATION and the GemFire service is running in the SERV_2_FOUNDATION, both on-platform, then you would also need to use the GemFire SNI Service Gateway feature.
This can be configured using Spring Boot easily enough.
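To make that concrete, here is a minimal sketch using the plain Apache Geode client API (the SNI proxy hook was added around Geode 1.13); the host names, ports, and SSL settings below are placeholders, and TLS has to be configured in any case since SNI only applies to SSL connections:

import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.proxy.ProxySocketFactories;

public class SniClientExample {

    public static void main(String[] args) {
        // All pool connections are routed through the platform's SNI gateway;
        // the locator address is the service-internal one advertised by the cluster.
        ClientCache cache = new ClientCacheFactory()
                .set("ssl-enabled-components", "all")
                .set("ssl-truststore", "/path/to/truststore.jks")    // placeholder
                .set("ssl-truststore-password", "changeit")          // placeholder
                .addPoolLocator("locator.services.internal", 10334)  // placeholder
                .setPoolSocketFactory(ProxySocketFactories.sni("sni-gateway.example.com", 15443))
                .create();

        System.out.println("Client cache created: " + !cache.isClosed());
        cache.close();
    }
}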
I have posted an internal query, reaching out to the people who have more information on this subject, and I am currently waiting to hear back from them.
Supposedly (so I was told) there is an acceptance test (SNIAcceptanceTest) that would demonstrate how this feature works and how to use it, but I cannot find any references to it in the Apache Geode codebase (develop branch).
I will get back to you (in the comments below) if I hear back from anyone.

How can I configure Redis as a Spring Cloud Dataflow Source?

I've searched for examples and have not found any.
My intention is to use a Redis Stream as a source for Spring Cloud Dataflow and route messages to AWS Kinesis or S3 data sinks.
Redis is not listed as a Spring Cloud Dataflow source. Will I have to create a custom binder?
Redis only seems to be available as a sink, with Pub/Sub.
There used to be a redis-binder for Spring Cloud Stream, but that has been deprecated for a while now. We have plans to implement a binder for Redis Streams in the future, though.
That said, if you have data in Redis, it'd be good to start building a redis-source as a custom application. We have many suppliers/sources that you can use as a reference.
There's currently also a blog series in the works, which can provide further guidance when building custom applications.
Lastly, feel free to contribute the redis-supplier/source to the applications repo; we can collaborate on a pull request.
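As a rough, unofficial sketch of what such a custom redis-source could look like (assuming Spring Data Redis reactive support and the functional Spring Cloud Stream model; the stream key "events" and the bean name are placeholders):

import java.util.Map;
import java.util.function.Supplier;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.ReactiveRedisConnectionFactory;
import org.springframework.data.redis.connection.stream.MapRecord;
import org.springframework.data.redis.connection.stream.StreamOffset;
import org.springframework.data.redis.stream.StreamReceiver;

import reactor.core.publisher.Flux;

@Configuration
public class RedisSourceConfiguration {

    // Spring Cloud Stream binds a Supplier<Flux<...>> bean named "redisSupplier"
    // to the destination configured for "redisSupplier-out-0".
    @Bean
    public Supplier<Flux<Map<String, String>>> redisSupplier(ReactiveRedisConnectionFactory connectionFactory) {
        StreamReceiver<String, MapRecord<String, String, String>> receiver =
                StreamReceiver.create(connectionFactory);

        // "events" is a placeholder stream key; each stream entry is emitted
        // downstream as its field/value map.
        return () -> receiver.receive(StreamOffset.fromStart("events"))
                .map(MapRecord::getValue);
    }
}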

Scaling a Spring Batch application in PCF without Spring Cloud Dataflow and with other cloud services disabled

I have read a lot of articles about Spring Batch scaling on cloud platforms, and I have also followed Michael Minella's YouTube video on high-performance batch processing (https://www.youtube.com/watch?v=J6IPlfm7N6w).
My use case is processing a large file of more than 1 GB with Spring Batch in PCF. I understand that the file can be split and the DeployerPartitionHandler class can be used to dynamically start a new instance in PCF per partition/file, but the catch is that we don't have Spring Cloud Dataflow and Spring Cloud services enabled in our PCF environment.
I saw that we can combine Spring Batch with Spring Integration and RabbitMQ to do remote chunking of the large file using a master/worker configuration. But those workers need to be started manually in PCF as separate instances, and based on the load we have to manually start more worker instances.
Is there any other way provided by Spring Batch and PCF to autoscale the worker instances according to load? Or is there a way to dynamically start a new instance in PCF when the master has a chunk ready while reading the file?
FYI: if I use the Autoscaler feature of PCF based on some metric such as CPU utilization, every new instance reads the whole file again and processes it.
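For reference, a minimal sketch of what the manager (master) side of that remote-chunking setup could look like with Spring Batch Integration; the channel names, chunk size, and file path are placeholders, and the bridging of the request/reply channels to RabbitMQ is omitted:

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.integration.chunk.RemoteChunkingManagerStepBuilderFactory;
import org.springframework.batch.integration.config.annotation.EnableBatchIntegration;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.PassThroughLineMapper;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.integration.channel.DirectChannel;
import org.springframework.integration.channel.QueueChannel;

@Configuration
@EnableBatchProcessing
@EnableBatchIntegration
public class ManagerConfiguration {

    // Outbound channel: chunks of items are sent to workers (bridge this to a RabbitMQ queue).
    @Bean
    public DirectChannel requests() {
        return new DirectChannel();
    }

    // Inbound channel: workers reply with chunk completion status.
    @Bean
    public QueueChannel replies() {
        return new QueueChannel();
    }

    // Only the manager reads the large file; items are shipped to workers in chunks.
    @Bean
    public FlatFileItemReader<String> largeFileReader() {
        return new FlatFileItemReaderBuilder<String>()
                .name("largeFileReader")
                .resource(new FileSystemResource("/tmp/large-input.txt")) // placeholder path
                .lineMapper(new PassThroughLineMapper())
                .build();
    }

    @Bean
    public Step managerStep(RemoteChunkingManagerStepBuilderFactory factory,
                            FlatFileItemReader<String> largeFileReader) {
        return factory.get("managerStep")
                .chunk(1000)                 // items per chunk sent over the wire
                .reader(largeFileReader)
                .outputChannel(requests())
                .inputChannel(replies())
                .build();
    }
}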

Is Redis a good idea for Spring Cloud Stream? Should I use Kafka or RabbitMQ?

I'm deploying a small Spring Cloud Stream project,
using only http sources and jdbc sinks (3 instances each). The estimated load is 10 hits/second.
I was thinking of using Redis because I feel more comfortable with it, but in the latest documentation almost all the references are to Kafka and RabbitMQ, so I am wondering if Redis is not going to be supported in the future or if there is any issue with using it.
Regards
Redis is not recommended for production with Spring Cloud Stream - the binder is not fully functional and message loss is possible.