java.io.NotSerializableException: org.apache.storm.tuple.TupleImpl in Storm BaseStatefulWindowedBolt (2.2.0) - serialization

My Storm topology (2.2.0) sometimes crashes with:
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.io.NotSerializableException: org.apache.storm.tuple.TupleImpl
in a bolt extending BaseStatefulWindowedBolt (my "windowbolt").
Interestingly, this seems to happen only when a certain throughput level is reached or when network latency / packet loss occurs (I inject such network conditions into one of the Storm Docker containers for benchmarking purposes).
After one of this bolt's tasks fails, the others usually fail as well at some point, and the topology is not able to recover from that failure in a stateful way.
As the exception does not seem to involve any custom classes, I am quite unsure about the reason behind the failure.
Is this possibly an implementation / configuration problem on my side, or a problem with, or maybe even normal behavior of, Apache Storm under these conditions?
Error fetched via the Storm CLI:
{"Comp-Errors":{"windowbolt":"java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.io.NotSerializableException: org.apache.storm.tuple.TupleImpl\n\tat org.apache.storm.utils.Utils$1.run(Utils.java:409)\n\tat java.base\/java.lang.Thread.run(Unknown Source)\nCaused by: java.lang.RuntimeException: java.lang.RuntimeException: java.io.NotSerializableException: org.apache.storm.tuple.TupleImpl\n\tat org.apache.storm.executor.Executor.accept(Executor.java:290)\n\tat org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:131)\n\tat org.apache.storm.utils.JCQueue.consume(JCQueue.java:111)\n\tat org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:172)\n\tat org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:159)\n\tat org.apache.storm.utils.Utils$1.run(Utils.java:394)\n\t... 1 more\nCaused by: java.lang.RuntimeException: java.io.NotSerializableException: org.apache.storm.tuple.TupleImpl\n\tat org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:36)\n\tat com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)\n\tat com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)\n\tat com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)\n\tat com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534)\n\tat org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:38)\n\tat org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:40)\n\tat org.apache.storm.daemon.worker.WorkerTransfer.tryTransferRemote(WorkerTransfer.java:116)\n\tat org.apache.storm.daemon.worker.WorkerState.tryTransferRemote(WorkerState.java:524)\n\tat org.apache.storm.executor.ExecutorTransfer.tryTransfer(ExecutorTransfer.java:68)\n\tat org.apache.storm.executor.bolt.BoltOutputCollectorImpl.boltEmit(BoltOutputCollectorImpl.java:112)\n\tat org.apache.storm.executor.bolt.BoltOutputCollectorImpl.emit(BoltOutputCollectorImpl.java:65)\n\tat org.apache.storm.task.OutputCollector.emit(OutputCollector.java:93)\n\tat org.apache.storm.task.OutputCollector.emit(OutputCollector.java:93)\n\tat org.apache.storm.task.OutputCollector.emit(OutputCollector.java:93)\n\tat org.apache.storm.task.OutputCollector.emit(OutputCollector.java:93)\n\tat org.apache.storm.task.OutputCollector.emit(OutputCollector.java:42)\n\tat org.apache.storm.topology.WindowedBoltExecutor.execute(WindowedBoltExecutor.java:313)\n\tat org.apache.storm.topology.PersistentWindowedBoltExecutor.execute(PersistentWindowedBoltExecutor.java:137)\n\tat org.apache.storm.topology.StatefulBoltExecutor.doExecute(StatefulBoltExecutor.java:145)\n\tat org.apache.storm.topology.StatefulBoltExecutor.handleTuple(StatefulBoltExecutor.java:137)\n\tat org.apache.storm.topology.BaseStatefulBoltExecutor.execute(BaseStatefulBoltExecutor.java:71)\n\tat org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:236)\n\tat org.apache.storm.executor.Executor.accept(Executor.java:283)\n\t... 6 more\nCaused by: java.io.NotSerializableException: org.apache.storm.tuple.TupleImpl\n\tat java.base\/java.io.ObjectOutputStream.writeObject0(Unknown Source)\n\tat java.base\/java.io.ObjectOutputStream.writeObject(Unknown Source)\n\tat org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:33)\n\t... 29 more\n"},"Topology Name":"KafkaTopology"}
"windowbolt" implementation:
public class StatefulWindowBolt extends BaseStatefulWindowedBolt<KeyValueState<String, AvgState>> {
private OutputCollector collector;
private Counter counter;
private Counter windowCounter;
private KeyValueState<String, AvgState> state;
@Override
public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
this.collector = collector;
this.counter = context.registerCounter("WindowBolt_Executed");
this.windowCounter = context.registerCounter("WindowBolt_WindowNumber");
}
@Override
public void execute(TupleWindow inputWindow) {
long window_sum = 0;
long window_length = 0;
long ts = 0;
long max_ts = 0;
long start_event_time = inputWindow.getStartTimestamp();
long end_event_time = inputWindow.getEndTimestamp();
long partition = -1;
String note = "/";
Map<String, AvgState> map = new HashMap<String, AvgState>();
Iterator<Tuple> it = inputWindow.getIter();
while (it.hasNext()) {
Tuple tuple = it.next();
if (window_length == 0){
//same for whole window because of FieldsGrouping by partition
partition = tuple.getIntegerByField("partition");
note = tuple.getStringByField("note");
}
Long sensordata = tuple.getLongByField("sensordata");
window_sum += sensordata;
ts = tuple.getLongByField("timestamp");
if (ts > max_ts) {
max_ts = ts;
} else {
//
}
String city = tuple.getStringByField("city");
AvgState state = map.get(city);
if (state == null){
state = new AvgState(0,0);
}
map.put(city, new AvgState(state.sum+sensordata, state.count + 1));
counter.inc();
window_length++;
}
long window_avg = window_sum / window_length;
// emit the results
JSONObject json_message = new JSONObject();
json_message.put("window_avg", window_avg);
json_message.put("start_event_time", start_event_time);
json_message.put("end_event_time", end_event_time);
json_message.put("window_size", window_length);
json_message.put("last_event_ts", max_ts);
json_message.put("count_per_city", print(map));
json_message.put("partition", partition);
json_message.put("note", note);
String kafkaMessage = json_message.toString();
String kafkaKey = "window_id: " + windowCounter.getCount();
collector.emit(new Values(kafkaKey, kafkaMessage));
windowCounter.inc();
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields("key", "message"));
}
public String print(Map<String, AvgState> map) {
StringBuilder mapAsString = new StringBuilder("{");
for (String key : map.keySet()) {
AvgState state = map.get(key);
mapAsString.append(key + "=" + state.count + ", ");
}
mapAsString.delete(mapAsString.length()-2, mapAsString.length()).append("}");
return mapAsString.toString();
}
@Override
public void initState(KeyValueState<String, AvgState> state) {
this.state = state;
}
}
In my main method I set:
config.setFallBackOnJavaSerialization(true);
config.registerSerialization(AvgState.class);
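For completeness, AvgState is just a plain value holder along these lines (a sketch reconstructed from the bolt code above, so the field and constructor details are assumptions):
import java.io.Serializable;

// Plain state holder as used by the bolt (state.sum, state.count, new AvgState(sum, count)).
// The no-arg constructor keeps it friendly to Kryo's default FieldSerializer; Serializable
// additionally covers the Java-serialization fallback enabled above.
public class AvgState implements Serializable {
    public long sum;
    public long count;

    public AvgState() {
    }

    public AvgState(long sum, long count) {
        this.sum = sum;
        this.count = count;
    }
}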

Related

Hangfire - Recurring job can’t be scheduled, see inner exception for details

I have an app which is live on three different servers, using a load balancer for user distribution.
The app uses its own queue, and I have added a filter so that jobs keep their original queue in case they fail at some point. But then again, it continues to act like the app is not running. The error is below:
System.InvalidOperationException: Recurring job can't be scheduled, see inner exception for details.
---> Hangfire.Common.JobLoadException: Could not load the job. See inner exception for the details.
---> System.IO.FileNotFoundException: Could not resolve assembly 'My.Api'.
at System.TypeNameParser.ResolveAssembly(String asmName, Func`2 assemblyResolver, Boolean throwOnError, StackCrawlMark& stackMark)
at System.TypeNameParser.ConstructType(Func`2 assemblyResolver, Func`4 typeResolver, Boolean throwOnError, Boolean ignoreCase, StackCrawlMark& stackMark)
at System.TypeNameParser.GetType(String typeName, Func`2 assemblyResolver, Func`4 typeResolver, Boolean throwOnError, Boolean ignoreCase, StackCrawlMark& stackMark)
at System.Type.GetType(String typeName, Func`2 assemblyResolver, Func`4 typeResolver, Boolean throwOnError)
at Hangfire.Common.TypeHelper.DefaultTypeResolver(String typeName)
at Hangfire.Storage.InvocationData.DeserializeJob()
--- End of inner exception stack trace ---
at Hangfire.Storage.InvocationData.DeserializeJob()
at Hangfire.RecurringJobEntity..ctor(String recurringJobId, IDictionary`2 recurringJob, ITimeZoneResolver timeZoneResolver, DateTime now)
--- End of inner exception stack trace ---
at Hangfire.Server.RecurringJobScheduler.ScheduleRecurringJob(BackgroundProcessContext context, IStorageConnection connection, String recurringJobId, RecurringJobEntity recurringJob, DateTime now)
What can be the issue here? The apps are running. And once I trigger the recurring jobs, they are good to go, until they show the above error.
This is my AppStart file:
private IEnumerable<IDisposable> GetHangfireServers()
{
Hangfire.GlobalConfiguration.Configuration
.SetDataCompatibilityLevel(CompatibilityLevel.Version_170)
.UseSimpleAssemblyNameTypeSerializer()
.UseRecommendedSerializerSettings()
.UseSqlServerStorage(HangfireServer, new SqlServerStorageOptions
{
CommandBatchMaxTimeout = TimeSpan.FromMinutes(5),
SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5),
QueuePollInterval = TimeSpan.Zero,
UseRecommendedIsolationLevel = true,
DisableGlobalLocks = true
});
yield return new BackgroundJobServer(new BackgroundJobServerOptions {
Queues = new[] { "myapp" + GetEnvironmentName() },
ServerName = "MyApp" + ConfigurationHelper.GetAppSetting("Environment")
});
}
public void Configuration(IAppBuilder app)
{
var container = new Container();
container.Options.DefaultScopedLifestyle = new AsyncScopedLifestyle();
RegisterTaskDependencies(container);
container.RegisterWebApiControllers(System.Web.Http.GlobalConfiguration.Configuration);
container.Verify();
var configuration = new HttpConfiguration();
configuration.DependencyResolver = new SimpleInjectorWebApiDependencyResolver(container);
/* HANGFIRE CONFIGURATION */
if (Environment == "Production")
{
GlobalJobFilters.Filters.Add(new PreserveOriginalQueueAttribute());
Hangfire.GlobalConfiguration.Configuration.UseActivator(new SimpleInjectorJobActivator(container));
Hangfire.GlobalConfiguration.Configuration.UseLogProvider(new Api.HangfireArea.Helpers.CustomLogProvider(container.GetInstance<Core.Modules.LogModule>()));
app.UseHangfireAspNet(GetHangfireServers);
app.UseHangfireDashboard("/hangfire", new DashboardOptions
{
Authorization = new[] { new DashboardAuthorization() },
AppPath = GetBackToSiteURL(),
DisplayStorageConnectionString = false
});
AddOrUpdateJobs();
}
/* HANGFIRE CONFIGURATION */
app.UseWebApi(configuration);
WebApiConfig.Register(configuration);
}
public static void AddOrUpdateJobs()
{
var queueName = "myapp" + GetEnvironmentName();
RecurringJob.AddOrUpdate<HangfireArea.BackgroundJobs.AttachmentCreator>(
"MyApp_MyTask",
(service) => service.RunMyTask(),
"* * * * *", queue: queueName, timeZone: TimeZoneInfo.FindSystemTimeZoneById("Turkey Standard Time"));
}
What can be the problem here?
It turns out that Hangfire itself does not work well when multiple apps use the same SQL schema. To solve this problem I used Hangfire.MAMQSqlExtension. It is a third-party extension, but the repo says that it is officially recognized by Hangfire.
If you are using the same schema for multiple apps, you have to use this extension in all your apps, otherwise you'll face the error mentioned above.
If your apps have different versions alive at the same time (e.g. production, test, development), this extension by itself does not fully handle failed jobs. If a job fails, regular Hangfire will not respect its original queue and will move it to the default queue, which will eventually create problems if your app only works with its own queue or if the default queue is shared. To force Hangfire to respect the original queue attribute, I used this solution, which works great, and you get to name your app's queue depending on your web.config or appsettings.json.
Another option I found was to use Hangfire's background process https://www.hangfire.io/overview.html#background-process.
public class CleanTempDirectoryProcess : IBackgroundProcess
{
public void Execute(BackgroundProcessContext context)
{
Directory.CleanUp(Directory.GetTempDirectory());
context.Wait(TimeSpan.FromHours(1));
}
}
And set the delay. This solved the issue for me, as I need the job to run repeatedly. I'm not sure of the implications this might have for the dashboard.
You can create a job filter that does the same as the automatic retry while also placing the job in its queue.
The difference is that you cannot delay the retried job; it will run again immediately.
public class AutomaticRetryQueueAttribute : JobFilterAttribute, IApplyStateFilter, IElectStateFilter
{
private string queue;
private int attempts;
private readonly object _lockObject = new object();
private readonly ILog _logger = LogProvider.For<AutomaticRetryQueueAttribute>();
public AutomaticRetryQueueAttribute(int Attempts = 10, string Queue = "Default")
{
queue = Queue;
attempts = Attempts;
}
public int Attempts
{
get { lock (_lockObject) { return attempts; } }
set
{
if (value < 0)
{
throw new ArgumentOutOfRangeException(nameof(value), @"Attempts value must be equal or greater than zero.");
}
lock (_lockObject)
{
attempts = value;
}
}
}
public string Queue
{
get { lock (_lockObject) { return queue; } }
set
{
lock (_lockObject)
{
queue = value;
}
}
}
public void OnStateApplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
{
var newState = context.NewState as EnqueuedState;
if (!string.IsNullOrWhiteSpace(queue) && newState != null && newState.Queue != Queue)
{
newState.Queue = String.Format(Queue, context.BackgroundJob.Job.Args.ToArray());
}
if ((context.NewState is ScheduledState || context.NewState is EnqueuedState) &&
context.NewState.Reason != null &&
context.NewState.Reason.StartsWith("Retry attempt"))
{
transaction.AddToSet("retries", context.BackgroundJob.Id);
}
}
public void OnStateUnapplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
{
if (context.OldStateName == ScheduledState.StateName)
{
transaction.RemoveFromSet("retries", context.BackgroundJob.Id);
}
}
public void OnStateElection(ElectStateContext context)
{
var failedState = context.CandidateState as FailedState;
if (failedState == null)
{
// This filter accepts only failed job state.
return;
}
var retryAttempt = context.GetJobParameter<int>("RetryCount") + 1;
if (retryAttempt <= Attempts)
{
ScheduleAgainLater(context, retryAttempt, failedState);
}
else
{
_logger.ErrorException($"Failed to process the job '{context.BackgroundJob.Id}': an exception occurred.", failedState.Exception);
}
}
private void ScheduleAgainLater(ElectStateContext context, int retryAttempt, FailedState failedState)
{
context.SetJobParameter("RetryCount", retryAttempt);
const int maxMessageLength = 50;
var exceptionMessage = failedState.Exception.Message.Length > maxMessageLength
? failedState.Exception.Message.Substring(0, maxMessageLength - 1) + "…"
: failedState.Exception.Message;
// If attempt number is less than max attempts, we should
// schedule the job to run again later.
var reason = $"Retry attempt {retryAttempt} of {Attempts}: {exceptionMessage}";
context.CandidateState = (IState)new EnqueuedState { Reason = reason };
if (context.CandidateState is EnqueuedState enqueuedState)
{
enqueuedState.Queue = String.Format(Queue, context.BackgroundJob.Job.Args.ToArray());
}
_logger.WarnException($"Failed to process the job '{context.BackgroundJob.Id}': an exception occurred. Retry attempt {retryAttempt} of {Attempts} will be performed.", failedState.Exception);
}
}
Delete the old Hangfire database and recreate it with a new name,
or use the in-memory storage method (UseInMemoryStorage).

Park XML message in invalid format to AMQP parking lot queue

Given I have this IntegrationFlow:
IntegrationFlows.from(
Amqp.inboundAdapter(rabbitConnectionFactory, QUEUE)
.messageConverter(new MarshallingMessageConverter(xmlMarshaller))
.defaultRequeueRejected(false)
.concurrentConsumers(2)
.maxConcurrentConsumers(4)
.channelTransacted(true)
.errorHandler(new ConditionalRejectingErrorHandler())
)
.log(INFO, AMQP_LOGGER_CATEGORY)
.publishSubscribeChannel(s -> s
.subscribe(f -> f
.handle(deathCheckHandler))
.subscribe(f -> f.handle(service))
)
.get();
where deathCheckHandler is
@Component
public class DeathCheckHandler {
private static final Logger logger = LoggerFactory.getLogger(lookup().lookupClass());
private static final int RETRY_COUNT = 3;
private final RabbitTemplate rabbitTemplate;
private final Jaxb2Marshaller xmlMarshaller;
public DeathCheckHandler(RabbitTemplate rabbitTemplate, Jaxb2Marshaller xmlMarshaller) {
this.rabbitTemplate = rabbitTemplate;
this.xmlMarshaller = xmlMarshaller;
}
@ServiceActivator
public void check(Message<?> message) {
MessageHeaders headers = message.getHeaders();
Optional<XDeath> rejected = findAnyRejectedXDeathMessageHeader(headers);
if (rejected.isPresent()) {
int rejectedCount = rejected.get().getCount();
logger.debug("Rejected count is {}", rejectedCount);
if (rejectedCount > RETRY_COUNT) {
parkMessage(message);
}
}
}
private void parkMessage(Message<?> message) {
Object payload = message.getPayload();
MessageHeaders headers = message.getHeaders();
String parkingExchange = (String) headers.get("amqp_receivedExchange");
String parkingRoutingKey = ((String) headers.get("amqp_consumerQueue")).replace("queue", "plq");
rabbitTemplate.setMessageConverter(new MarshallingMessageConverter(xmlMarshaller));
logger.warn("Tried more than {} times. Parking rejected message: {} to exchange {} and routing key {}", RETRY_COUNT, payload, parkingExchange, parkingRoutingKey);
rabbitTemplate.convertAndSend(parkingExchange, parkingRoutingKey, payload);
// cause the message to be acknowledged and not routed to DLQ
throw new ImmediateAcknowledgeAmqpException("Give up retrying message: " + payload);
}
}
DeathCheckHandler handles dead-lettering which is set up on AMQP queues.
How can I park an XML message that is in an incorrect format, i.e. when MarshallingMessageConverter throws UnmarshallingFailureException?
I want to park it in a similar way to how I do it in DeathCheckHandler#parkMessage.
It should probably be possible with ConditionalRejectingErrorHandler, but I don't know how.
Clone the ConditionalRejectingErrorHandler.
Use this method as a template...
@Override
public void handleError(Throwable t) {
log(t);
if (!this.causeChainContainsARADRE(t) && this.exceptionStrategy.isFatal(t)) {
if (this.discardFatalsWithXDeath && t instanceof ListenerExecutionFailedException) {
Message failed = ((ListenerExecutionFailedException) t).getFailedMessage();
if (failed != null) {
List<Map<String, ?>> xDeath = failed.getMessageProperties().getXDeathHeader();
if (xDeath != null && xDeath.size() > 0) {
this.logger.error("x-death header detected on a message with a fatal exception; "
+ "perhaps requeued from a DLQ? - discarding: " + failed);
throw new ImmediateAcknowledgeAmqpException("Fatal and x-death present");
}
}
}
throw new AmqpRejectAndDontRequeueException("Error Handler converted exception to fatal", this.rejectManual,
t);
}
}
By default, fatal exceptions with an x-death header are discarded via an ImmediateAcknowledgeAmqpException.
It's not easy to subclass and override this method because the fields are private, so it would be easiest to just copy this class (and publish to the parking lot before throwing the IAAE).
I will make some improvements to this class to make it easier to customize/override.
Pull Request.
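For illustration, here is a rough sketch of what such a copied/customized handler could look like; the class and helper names are made up, and this only outlines the approach of parking on unmarshalling failures and rejecting everything else:
import org.springframework.amqp.AmqpRejectAndDontRequeueException;
import org.springframework.amqp.ImmediateAcknowledgeAmqpException;
import org.springframework.amqp.core.Message;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.amqp.rabbit.support.ListenerExecutionFailedException;
import org.springframework.oxm.UnmarshallingFailureException;
import org.springframework.util.ErrorHandler;

// Hypothetical error handler for the listener container: publish the raw failed message
// to the parking lot queue when the payload cannot be unmarshalled, otherwise reject
// without requeue so normal dead-lettering applies.
public class ParkingUnmarshallingErrorHandler implements ErrorHandler {

    private final RabbitTemplate rabbitTemplate;

    public ParkingUnmarshallingErrorHandler(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    @Override
    public void handleError(Throwable t) {
        if (t instanceof ListenerExecutionFailedException && containsCause(t, UnmarshallingFailureException.class)) {
            Message failed = ((ListenerExecutionFailedException) t).getFailedMessage();
            if (failed != null) {
                String exchange = failed.getMessageProperties().getReceivedExchange();
                String routingKey = failed.getMessageProperties().getConsumerQueue().replace("queue", "plq");
                // send the original raw message; no conversion is attempted this time
                rabbitTemplate.send(exchange, routingKey, failed);
                throw new ImmediateAcknowledgeAmqpException("Parked unparseable XML message");
            }
        }
        throw new AmqpRejectAndDontRequeueException("Error handler converted exception to fatal", t);
    }

    private boolean containsCause(Throwable t, Class<? extends Throwable> type) {
        for (Throwable cause = t; cause != null; cause = cause.getCause()) {
            if (type.isInstance(cause)) {
                return true;
            }
        }
        return false;
    }
}
It would be registered on the inbound adapter via .errorHandler(...) in place of the plain ConditionalRejectingErrorHandler shown in the flow above.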

Spring AOP - passing arguments between annotated methods

I've written a utility to monitor individual business transactions. For example, Alice calls a method which calls more methods, and I want info on just Alice's call, separate from Bob's call to the same method.
Right now the entry point creates a Transaction object and it's passed as an argument to each method:
class Example {
public Item getOrderEntryPoint(int orderId) {
Transaction transaction = transactionManager.create();
transaction.trace("getOrderEntryPoint");
Order order = getOrder(orderId, transaction);
transaction.stop();
logger.info(transaction);
return item;
}
private Order getOrder(int orderId, Transaction t) {
t.trace("getOrder");
Order order = getItems(itemId, t);
t.addStat("number of items", order.getItems().size());
for (Item item : order.getItems()) {
SpecialOffer offer = getSpecialOffer(item, t);
if (null != offer) {
t.incrementStat("offers", 1);
}
}
t.stop();
return order;
}
private SpecialOffer getSpecialOffer(Item item, Transaction t) {
t.trace("getSpecialOffer(" + item.id + ")", TraceCategory.Database);
return offerRepository.getByItem(item);
t.stop();
}
}
This will print to the log something like:
Transaction started by Alice at 10:42
Statistics:
number of items : 3
offers : 1
Category Timings (longest first):
DB : 2s 903ms
code : 187ms
Timings (longest first):
getSpecialOffer(1013) : 626ms
getItems : 594ms
Trace:
getOrderEntryPoint (7ms)
getOrder (594ms)
getSpecialOffer(911) (90ms)
getSpecialOffer(1013) (626ms)
getSpecialOffer(2942) (113ms)
It works great, but passing the transaction object around is ugly. Someone suggested AOP, but I don't see how to pass the transaction created in the first method to all the other methods.
The Transaction object is pretty simple:
public class Transaction {
private String uuid = UUID.createRandom();
private List<TraceEvent> events = new ArrayList<>();
private Map<String,Int> stats = new HashMap<>();
}
class TraceEvent {
private String name;
private long durationInMs;
}
The app that uses it is a web app, and thus multi-threaded, but each individual transaction runs on a single thread - no multi-threading, async code, competition for resources, etc.
My attempt at an annotation:
#Around("execution(* *(..)) && #annotation(Trace)")
public Object around(ProceedingJoinPoint point) {
String methodName = MethodSignature.class.cast(point.getSignature()).getMethod().getName();
//--- Where do I get this call's instance of TRANSACTION from?
if (null == transaction) {
transaction = TransactionManager.createTransaction();
}
transaction.trace(methodName);
Object result = point.proceed();
transaction.stop();
return result;
Introduction
Unfortunately, your pseudo code does not compile. It contains several syntactical and logical errors. Furthermore, some helper classes are missing. If I did not have spare time today and was looking for a puzzle to solve, I would not have bothered making my own MCVE out of it, because that would actually have been your job. Please do read the MCVE article and learn to create one next time, otherwise you will not get a lot of qualified help here. This was your free shot because you are new on SO.
Original situation: passing through transaction objects in method calls
Application helper classes:
package de.scrum_master.app;
public class Item {
private int id;
public Item(int id) {
this.id = id;
}
public int getId() {
return id;
}
@Override
public String toString() {
return "Item[id=" + id + "]";
}
}
package de.scrum_master.app;
public class SpecialOffer {}
package de.scrum_master.app;
public class OfferRepository {
public SpecialOffer getByItem(Item item) {
if (item.getId() < 30)
return new SpecialOffer();
return null;
}
}
package de.scrum_master.app;
import java.util.ArrayList;
import java.util.List;
public class Order {
private int id;
public Order(int id) {
this.id = id;
}
public List<Item> getItems() {
List<Item> items = new ArrayList<>();
int offset = id == 12345 ? 0 : 1;
items.add(new Item(11 + offset));
items.add(new Item(22 + offset));
items.add(new Item(33 + offset));
return items;
}
}
Trace classes:
package de.scrum_master.trace;
public enum TraceCategory {
Code, Database
}
package de.scrum_master.trace;
class TraceEvent {
private String name;
private TraceCategory category;
private long durationInMs;
private boolean finished = false;
public TraceEvent(String name, TraceCategory category, long startTime) {
this.name = name;
this.category = category;
this.durationInMs = startTime;
}
public long getDurationInMs() {
return durationInMs;
}
public void setDurationInMs(long durationInMs) {
this.durationInMs = durationInMs;
}
public boolean isFinished() {
return finished;
}
public void setFinished(boolean finished) {
this.finished = finished;
}
@Override
public String toString() {
return "TraceEvent[name=" + name + ", category=" + category +
", durationInMs=" + durationInMs + ", finished=" + finished + "]";
}
}
Transaction classes:
Here I tried to mimic your own Transaction class with as few changes as possible, but there was a lot I had to add and modify in order to emulate a simplified version of your trace output. This is not thread-safe, and the way I am locating the last unfinished TraceEvent is not nice and only works cleanly if there are no exceptions. But you get the idea, I hope. The point is to just make it basically work and subsequently get log output similar to your example. If this was originally my code, I would have solved it differently.
package de.scrum_master.trace;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.UUID;
public class Transaction {
private String uuid = UUID.randomUUID().toString();
private List<TraceEvent> events = new ArrayList<>();
private Map<String, Integer> stats = new HashMap<>();
public void trace(String message) {
trace(message, TraceCategory.Code);
}
public void trace(String message, TraceCategory category) {
events.add(new TraceEvent(message, category, System.currentTimeMillis()));
}
public void stop() {
TraceEvent event = getLastUnfinishedEvent();
event.setDurationInMs(System.currentTimeMillis() - event.getDurationInMs());
event.setFinished(true);
}
private TraceEvent getLastUnfinishedEvent() {
return events
.stream()
.filter(event -> !event.isFinished())
.reduce((first, second) -> second)
.orElse(null);
}
public void addStat(String text, int size) {
stats.put(text, size);
}
public void incrementStat(String text, int increment) {
Integer currentCount = stats.get(text);
if (currentCount == null)
currentCount = 0;
stats.put(text, currentCount + increment);
}
@Override
public String toString() {
return "Transaction {" +
toStringUUID() +
toStringStats() +
toStringEvents() +
"\n}\n";
}
private String toStringUUID() {
return "\n uuid = " + uuid;
}
private String toStringStats() {
String result = "\n stats = {";
for (Entry<String, Integer> statEntry : stats.entrySet())
result += "\n " + statEntry;
return result + "\n }";
}
private String toStringEvents() {
String result = "\n events = {";
for (TraceEvent event : events)
result += "\n " + event;
return result + "\n }";
}
}
package de.scrum_master.trace;
public class TransactionManager {
public Transaction create() {
return new Transaction();
}
}
Example driver application:
package de.scrum_master.app;
import de.scrum_master.trace.TraceCategory;
import de.scrum_master.trace.Transaction;
import de.scrum_master.trace.TransactionManager;
public class Example {
private TransactionManager transactionManager = new TransactionManager();
private OfferRepository offerRepository = new OfferRepository();
public Order getOrderEntryPoint(int orderId) {
Transaction transaction = transactionManager.create();
transaction.trace("getOrderEntryPoint");
sleep(100);
Order order = getOrder(orderId, transaction);
transaction.stop();
System.out.println(transaction);
return order;
}
private Order getOrder(int orderId, Transaction t) {
t.trace("getOrder");
sleep(200);
Order order = new Order(orderId);
t.addStat("number of items", order.getItems().size());
for (Item item : order.getItems()) {
SpecialOffer offer = getSpecialOffer(item, t);
if (null != offer)
t.incrementStat("special offers", 1);
}
t.stop();
return order;
}
private SpecialOffer getSpecialOffer(Item item, Transaction t) {
t.trace("getSpecialOffer(" + item.getId() + ")", TraceCategory.Database);
sleep(50);
SpecialOffer specialOffer = offerRepository.getByItem(item);
t.stop();
return specialOffer;
}
private void sleep(long millis) {
try {
Thread.sleep(millis);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
new Example().getOrderEntryPoint(12345);
new Example().getOrderEntryPoint(23456);
}
}
If you run this code, the output is as follows:
Transaction {
uuid = 62ec9739-bd32-4a56-b6b3-a8a13624961a
stats = {
special offers=2
number of items=3
}
events = {
TraceEvent[name=getOrderEntryPoint, category=Code, durationInMs=561, finished=true]
TraceEvent[name=getOrder, category=Code, durationInMs=451, finished=true]
TraceEvent[name=getSpecialOffer(11), category=Database, durationInMs=117, finished=true]
TraceEvent[name=getSpecialOffer(22), category=Database, durationInMs=69, finished=true]
TraceEvent[name=getSpecialOffer(33), category=Database, durationInMs=63, finished=true]
}
}
Transaction {
uuid = a420cd70-96e5-44c4-a0a4-87e421d05e87
stats = {
special offers=2
number of items=3
}
events = {
TraceEvent[name=getOrderEntryPoint, category=Code, durationInMs=469, finished=true]
TraceEvent[name=getOrder, category=Code, durationInMs=369, finished=true]
TraceEvent[name=getSpecialOffer(12), category=Database, durationInMs=53, finished=true]
TraceEvent[name=getSpecialOffer(23), category=Database, durationInMs=63, finished=true]
TraceEvent[name=getSpecialOffer(34), category=Database, durationInMs=53, finished=true]
}
}
AOP refactoring
Preface
Please note that I am using AspectJ here because two things about your code would never work with Spring AOP, which works with a delegation pattern based on dynamic proxies:
self-invocation (internally calling a method of the same class or super-class)
intercepting private methods
Because of these Spring AOP limitations I advise you to either refactor your code so as to avoid the two issues above or to configure your Spring applications to use full AspectJ via LTW (load-time weaving) instead.
As you noticed, my sample code does not use Spring at all because AspectJ is completely independent of Spring and works with any Java application (or other JVM languages, too).
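If you stay with Spring and want to use full AspectJ via LTW as suggested above, a minimal configuration sketch could look like this (the class name is made up; on top of that the JVM needs a weaving agent, e.g. -javaagent:/path/to/aspectjweaver.jar, and the aspect has to be declared in a META-INF/aop.xml, details depending on your setup):
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.EnableLoadTimeWeaving;

// Hypothetical Spring Java config switching on load-time weaving, so the aspect shown
// further below can also weave self-invocations and private methods.
@Configuration
@EnableLoadTimeWeaving
public class LtwConfig {
}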
Refactoring idea
Now what should you do in order to get rid of passing around tracing information (Transaction objects), polluting your core application code and tangling it with trace calls?
You extract transaction tracing into an aspect taking care of all trace(..) and stop() calls.
Unfortunately your Transaction class contains different types of information and does different things, so you cannot completely get rid of context information about how to trace for each affected method. But at least you can extract that context information from the method bodies and transform it into a declarative form using annotations with parameters.
These annotations can be targeted by an aspect taking care of handling transaction tracing.
Added and updated code, iteration 1
Annotations related to transaction tracing:
package de.scrum_master.trace;
import static java.lang.annotation.ElementType.METHOD;
import static java.lang.annotation.RetentionPolicy.RUNTIME;
import java.lang.annotation.Retention;
import java.lang.annotation.Target;
@Retention(RUNTIME)
@Target(METHOD)
public @interface TransactionEntryPoint {}
package de.scrum_master.trace;
import static java.lang.annotation.ElementType.METHOD;
import static java.lang.annotation.RetentionPolicy.RUNTIME;
import java.lang.annotation.Retention;
import java.lang.annotation.Target;
@Retention(RUNTIME)
@Target(METHOD)
public @interface TransactionTrace {
String message() default "__METHOD_NAME__";
TraceCategory category() default TraceCategory.Code;
String addStat() default "";
String incrementStat() default "";
}
Refactored application classes with annotations:
package de.scrum_master.app;
import java.util.ArrayList;
import java.util.List;
import de.scrum_master.trace.TransactionTrace;
public class Order {
private int id;
public Order(int id) {
this.id = id;
}
@TransactionTrace(message = "", addStat = "number of items")
public List<Item> getItems() {
List<Item> items = new ArrayList<>();
int offset = id == 12345 ? 0 : 1;
items.add(new Item(11 + offset));
items.add(new Item(22 + offset));
items.add(new Item(33 + offset));
return items;
}
}
Nothing much here, only added an annotation to getItems(). But the sample application class changes massively, getting much cleaner and simpler:
package de.scrum_master.app;
import de.scrum_master.trace.TraceCategory;
import de.scrum_master.trace.TransactionEntryPoint;
import de.scrum_master.trace.TransactionTrace;
public class Example {
private OfferRepository offerRepository = new OfferRepository();
@TransactionEntryPoint
@TransactionTrace
public Order getOrderEntryPoint(int orderId) {
sleep(100);
Order order = getOrder(orderId);
return order;
}
@TransactionTrace
private Order getOrder(int orderId) {
sleep(200);
Order order = new Order(orderId);
for (Item item : order.getItems()) {
SpecialOffer offer = getSpecialOffer(item);
// Do something with special offers
}
return order;
}
@TransactionTrace(category = TraceCategory.Database, incrementStat = "specialOffers")
private SpecialOffer getSpecialOffer(Item item) {
sleep(50);
SpecialOffer specialOffer = offerRepository.getByItem(item);
return specialOffer;
}
private void sleep(long millis) {
try {
Thread.sleep(millis);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
new Example().getOrderEntryPoint(12345);
new Example().getOrderEntryPoint(23456);
}
}
See? Except for a few annotations there is nothing left of the transaction tracing logic; the application code only takes care of its core concern. If you also remove the sleep() method, which only makes the application slower for demonstration purposes (because we want some nice statistics with measured times >0 ms), the class gets even more compact.
But of course we need to put the transaction tracing logic somewhere, more precisely modularise it into an AspectJ aspect:
Transaction tracing aspect:
package de.scrum_master.trace;
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Collection;
import java.util.stream.Collectors;
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.After;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
import org.aspectj.lang.reflect.MethodSignature;
#Aspect("percflow(entryPoint())")
public class TransactionTraceAspect {
private static TransactionManager transactionManager = new TransactionManager();
private Transaction transaction = transactionManager.create();
#Pointcut("execution(* *(..)) && #annotation(de.scrum_master.trace.TransactionEntryPoint)")
private static void entryPoint() {}
#Around("execution(* *(..)) && #annotation(transactionTrace)")
public Object doTrace(ProceedingJoinPoint joinPoint, TransactionTrace transactionTrace) throws Throwable {
preTrace(transactionTrace, joinPoint);
Object result = joinPoint.proceed();
postTrace(transactionTrace);
addStat(transactionTrace, result);
incrementStat(transactionTrace, result);
return result;
}
private void preTrace(TransactionTrace transactionTrace, ProceedingJoinPoint joinPoint) {
String traceMessage = transactionTrace.message();
if ("".equals(traceMessage))
return;
MethodSignature signature = (MethodSignature) joinPoint.getSignature();
if ("__METHOD_NAME__".equals(traceMessage)) {
traceMessage = signature.getName() + "(";
traceMessage += Arrays.stream(joinPoint.getArgs()).map(arg -> arg.toString()).collect(Collectors.joining(", "));
traceMessage += ")";
}
transaction.trace(traceMessage, transactionTrace.category());
}
private void postTrace(TransactionTrace transactionTrace) {
if ("".equals(transactionTrace.message()))
return;
transaction.stop();
}
private void addStat(TransactionTrace transactionTrace, Object result) {
if ("".equals(transactionTrace.addStat()) || result == null)
return;
if (result instanceof Collection)
transaction.addStat(transactionTrace.addStat(), ((Collection<?>) result).size());
else if (result.getClass().isArray())
transaction.addStat(transactionTrace.addStat(), Array.getLength(result));
}
private void incrementStat(TransactionTrace transactionTrace, Object result) {
if ("".equals(transactionTrace.incrementStat()) || result == null)
return;
transaction.incrementStat(transactionTrace.incrementStat(), 1);
}
#After("entryPoint()")
public void logFinishedTransaction(JoinPoint joinPoint) {
System.out.println(transaction);
}
}
Let me explain what this aspect does:
@Pointcut(..) entryPoint() says: Find me all methods in the code annotated by @TransactionEntryPoint. This pointcut is used in two places:
@Aspect("percflow(entryPoint())") says: Create one aspect instance for each control flow beginning at a transaction entry point.
@After("entryPoint()") logFinishedTransaction(..) says: Execute this advice (AOP terminology for a method linked to a pointcut) after an entry point method has finished. The corresponding method just prints the transaction statistics, just like in the original code at the end of Example.getOrderEntryPoint(..).
@Around("execution(* *(..)) && @annotation(transactionTrace)") doTrace(..) says: Wrap methods annotated by @TransactionTrace and do the following (method body):
add new trace element and start measuring time
execute original (wrapped) method and store result
update trace element with measured time
add one type of statistics (optional)
increment another type of statistics (optional)
return wrapped method's result to its caller
The private methods are just helpers for the @Around advice.
The console log when running the updated Example class with AspectJ active is:
Transaction {
uuid = 4529d325-c604-441d-8997-45ca659abb14
stats = {
specialOffers=2
number of items=3
}
events = {
TraceEvent[name=getOrderEntryPoint(12345), category=Code, durationInMs=468, finished=true]
TraceEvent[name=getOrder(12345), category=Code, durationInMs=366, finished=true]
TraceEvent[name=getSpecialOffer(Item[id=11]), category=Database, durationInMs=59, finished=true]
TraceEvent[name=getSpecialOffer(Item[id=22]), category=Database, durationInMs=50, finished=true]
TraceEvent[name=getSpecialOffer(Item[id=33]), category=Database, durationInMs=51, finished=true]
}
}
Transaction {
uuid = ef76a996-8621-478b-a376-e9f7a729a501
stats = {
specialOffers=2
number of items=3
}
events = {
TraceEvent[name=getOrderEntryPoint(23456), category=Code, durationInMs=452, finished=true]
TraceEvent[name=getOrder(23456), category=Code, durationInMs=351, finished=true]
TraceEvent[name=getSpecialOffer(Item[id=12]), category=Database, durationInMs=50, finished=true]
TraceEvent[name=getSpecialOffer(Item[id=23]), category=Database, durationInMs=50, finished=true]
TraceEvent[name=getSpecialOffer(Item[id=34]), category=Database, durationInMs=50, finished=true]
}
}
You see, it looks almost identical to the original application.
Idea for further simplification, iteration 2
When reading method Example.getOrder(int orderId) I was wondering why you are calling order.getItems(), looping over it and calling getSpecialOffer(item) inside the loop. In your sample code you do not use the results for anything other than updating the transaction trace object. I am assuming that in your real code you do something with the order and with the special offers in that method.
But just in case you really do not need those calls inside that method, I suggest you factor the calls out right into the aspect, getting rid of the TransactionTrace annotation parameters String addStat() and String incrementStat(). The Example code would get even simpler, and the annotation @TransactionTrace(message = "", addStat = "number of items") in class Order would go away, too.
I am leaving this refactoring to you if you think it makes sense.

The implementation of the FlinkKafkaConsumer010 is not serializable error

I created a custom class that is based on Apache Flink. The following are some parts of the class definition:
public class StreamData {
private StreamExecutionEnvironment env;
private DataStream<byte[]> data ;
private Properties properties;
public StreamData(){
env = StreamExecutionEnvironment.getExecutionEnvironment();
}
public StreamData(StreamExecutionEnvironment e , DataStream<byte[]> d){
env = e ;
data = d ;
}
public StreamData getDataFromESB(String id, int from) {
final Pattern TOPIC = Pattern.compile(id);
Properties properties = new Properties();
properties.setProperty("bootstrap.servers", "localhost:9092");
properties.setProperty("group.id", Long.toString(System.currentTimeMillis()));
properties.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.setProperty("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
properties.put("metadata.max.age.ms", 30000);
properties.put("enable.auto.commit", "false");
if (from == 0)
properties.setProperty("auto.offset.reset", "earliest");
else
properties.setProperty("auto.offset.reset", "latest");
StreamExecutionEnvironment e = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<byte[]> stream = env
.addSource(new FlinkKafkaConsumer011<>(TOPIC, new AbstractDeserializationSchema<byte[]>() {
@Override
public byte[] deserialize(byte[] bytes) {
return bytes;
}
}, properties));
return new StreamData(e, stream);
}
public void print(){
data.print() ;
}
public void execute() throws Exception {
env.execute() ;
}
}
Using the StreamData class, I try to get some data from Apache Kafka and print it in the main function:
StreamData stream = new StreamData();
stream.getDataFromESB("original_data", 0);
stream.print();
stream.execute();
I got the error:
Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The implementation of the FlinkKafkaConsumer010 is not serializable. The object probably contains or references non serializable fields.
Caused by: java.io.NotSerializableException: StreamData
As mentioned here, I think it's because some data type in the getDataFromESB function is not serializable. But I don't know how to solve the problem!
Your AbstractDeserializationSchema is an anonymous inner class, which as a result contains a reference to the outer StreamData class which isn't serializable. Either let StreamData implement Serializable, or define your schema as a top-level class.
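For example, a top-level schema could look roughly like this (class name made up; the AbstractDeserializationSchema package may differ slightly between Flink versions):
import org.apache.flink.api.common.serialization.AbstractDeserializationSchema;

// A standalone top-level class carries no hidden reference to the surrounding StreamData
// instance, so Flink can serialize it when shipping the Kafka source to the cluster.
public class RawBytesDeserializationSchema extends AbstractDeserializationSchema<byte[]> {

    @Override
    public byte[] deserialize(byte[] bytes) {
        return bytes; // pass the raw Kafka record value through unchanged
    }
}
It would then be passed to the consumer as new FlinkKafkaConsumer011<>(TOPIC, new RawBytesDeserializationSchema(), properties).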
It seems that you are importing FlinkKafkaConsumer010 in your code but using FlinkKafkaConsumer011. Please use the following dependency in your sbt file:
"org.apache.flink" %% "flink-connector-kafka-0.11" % flinkVersion

AbstractStringBuilder.ensureCapacityInternal get NullPointerException in storm bolt

In our online system, the Storm bolt gets a NullPointerException, though I think I check for it before line 61; it gets the NullPointerException once in a while.
import ***.KeyUtils;
import ***.redis.PipelineHelper;
import ***.redis.PipelinedCacheClusterClient;
import **.redis.R2mClusterClient;
import org.apache.commons.lang3.StringUtils;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.IRichBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Tuple;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import java.util.Map;
/**
* RedisBolt batch operate
*/
public class RedisBolt implements IRichBolt {
static final long serialVersionUID = 737015318988609460L;
private static ApplicationContext applicationContext;
private static long logEmitNumber = 0;
private static StringBuffer totalCmds = new StringBuffer();
private Logger logger = LoggerFactory.getLogger(getClass());
private OutputCollector _collector;
private R2mClusterClient r2mClusterClient;
@Override
public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) {
_collector = outputCollector;
if (applicationContext == null) {
applicationContext = new ClassPathXmlApplicationContext("spring/spring-config-redisbolt.xml");
}
if (r2mClusterClient == null) {
r2mClusterClient = (R2mClusterClient) applicationContext.getBean("r2mClusterClient");
}
}
@Override
public void execute(Tuple tuple) {
String log = tuple.getString(0);
String lastCommands = tuple.getString(1);
try {
//log count
if (StringUtils.isNotEmpty(log)) {
logEmitNumber++;
}
if (StringUtils.isNotEmpty(lastCommands)) {
if(totalCmds==null){
totalCmds = new StringBuffer();
}
totalCmds.append(lastCommands);//line 61
}
//log count control
int numberLimit = 1;
String flow_log_limit = r2mClusterClient.get(KeyUtils.KEY_PIPELINE_LIMIT);
if (StringUtils.isNotEmpty(flow_log_limit)) {
try {
numberLimit = Integer.parseInt(flow_log_limit);
} catch (Exception e) {
numberLimit = 1;
logger.error("error", e);
}
}
if (logEmitNumber >= numberLimit) {
StringBuffer _totalCmds = new StringBuffer(totalCmds);
try {
//pipeline submit
PipelinedCacheClusterClient pip = r2mClusterClient.pipelined();
String[] commandArray = _totalCmds.toString().split(KeyUtils.REDIS_CMD_SPILT);
PipelineHelper.cmd(pip, commandArray);
pip.sync();
pip.close();
totalCmds = new StringBuffer();
} catch (Exception e) {
logger.error("error", e);
}
logEmitNumber = 0;
}
} catch (Exception e) {
logger.error(new StringBuffer("====RedisBolt error for log=[ ").append(log).append("] \n commands=[").append(lastCommands).append("]").toString(), e);
_collector.reportError(e);
_collector.fail(tuple);
}
_collector.ack(tuple);
}
@Override
public void cleanup() {
}
@Override
public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
}
@Override
public Map<String, Object> getComponentConfiguration() {
return null;
}
}
exception info:
java.lang.NullPointerException
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:113)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
    at java.lang.StringBuffer.append(StringBuffer.java:237)
    at com.jd.jr.dataeye.storm.bolt.RedisBolt.execute(RedisBolt.java:61)
    at org.apache.storm.daemon.executor$fn__5044$tuple_action_fn__5046.invoke(executor.clj:727)
    at org.apache.storm.daemon.executor$mk_task_receiver$fn__4965.invoke(executor.clj:459)
    at org.apache.storm.disruptor$clojure_handler$reify__4480.onEvent(disruptor.clj:40)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451)
    at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
    at org.apache.storm.daemon.executor$fn__5044$fn__5057$fn__5110.invoke(executor.clj:846)
    at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:484)
    at clojure.lang.AFn.run(AFn.java:22)
    at java.lang.Thread.run(Thread.java:745)
Can anyone give me some advice on finding the reason?
That is a really odd thing to happen. Please read the code of these two classes:
https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/lang/AbstractStringBuilder.java
https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/lang/StringBuffer.java
AbstractStringBuilder has a no-arg constructor which doesn't allocate the 'value' field, which makes accessing the 'value' field an NPE. None of the constructors in StringBuffer use that constructor, though. So maybe something odd happens in serialization/deserialization and unfortunately the 'value' field in AbstractStringBuilder ends up null.
Maybe initializing totalCmds in prepare() would be better, and you also need to consider synchronization (thread-safety) between bolts. prepare() is called per bolt instance, so instance fields are thread-safe, but static class fields are not.
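To sketch that suggestion (only an outline, not the original RedisBolt): keep the buffer and counter as non-static instance fields and create them in prepare(), so each executor owns its own state and nothing can show up uninitialized:
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

// Each executor gets its own buffer and counter, created after deserialization in prepare(),
// so they can never arrive null and are never shared between bolt instances.
public class RedisBoltSketch extends BaseRichBolt {

    private transient OutputCollector collector;
    private transient StringBuilder totalCmds;  // one executor thread per instance, so StringBuilder is enough
    private transient long logEmitNumber;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector outputCollector) {
        this.collector = outputCollector;
        this.totalCmds = new StringBuilder();   // always initialized here, never serialized with the bolt
        this.logEmitNumber = 0;
    }

    @Override
    public void execute(Tuple tuple) {
        String log = tuple.getString(0);
        String lastCommands = tuple.getString(1);
        if (log != null && !log.isEmpty()) {
            logEmitNumber++;
        }
        if (lastCommands != null && !lastCommands.isEmpty()) {
            totalCmds.append(lastCommands);
        }
        // flushing to Redis would go here, as in the original bolt
        collector.ack(tuple);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // no output streams in this sketch
    }
}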
I think I may have found the problem.
The key point is
"StringBuffer _totalCmds = new StringBuffer(totalCmds);" and "totalCmds.append(lastCommands);//line 61"
When a new object is created, there are several steps:
(1) allocate memory and return the reference
(2) initialize
If the append happens after (1) and before (2), then, since StringBuffer.java extends AbstractStringBuilder.java, which declares
/**
* The value is used for character storage.
*/
char[] value;
'value' is not yet initialized, so this will hit null:
@Override
public synchronized void ensureCapacity(int minimumCapacity) {
if (minimumCapacity > value.length) {
expandCapacity(minimumCapacity);
}
}
This bolt has another problem: some data may be lost in a multithreaded environment.