I'm doing something in an Eclipse plugin that throws a ResourceException. I in turn need to know what the path of the resource involved is.
I can do this via ((ResourceStatus) caughtException.getStatus()).getPath() , however I then get admonished with Discouraged access: The type 'ResourceException' is not API (same warning for ResourceStatus). Should I just ignore the warning, or is there a better way for me to get the path? I'm just worried that this might change later on.
Technically I could extract the path from the exception's message, but that feels gross, and I have had bad luck scraping data out of human-presentable strings :-/
ResourceException extends CoreException which is part of the official API so you can catch that.
ResourceStatus implements IResourceStatus which again is an official API.
So something like:
catch (CoreException ex) {
IStatus status = ex.getStatus();
if (status instanceof IResourceStatus) {
IPath path = ((IResourceStatus)status).getPath();
....
}
}
IResourceStatus also contains the definitions of the error codes that IStatus.getCode returns for a resource exception.
I have a basic stream processing flow which looks like
master topic -> my processing in a mapper/filter -> output topics
and I am wondering about the best way to handle "bad messages". This could potentially be things like messages that I can't deserialize properly, or perhaps the processing/filtering logic fails in some unexpected way (I have no external dependencies so there should be no transient errors of that sort).
I was considering wrapping all my processing/filtering code in a try catch and if an exception was raised then routing to an "error topic". Then I can study the message and modify it or fix my code as appropriate and then replay it on to master. If I let any exceptions propagate, the stream seems to get jammed and no more messages are picked up.
Is this approach considered best practice?
Is there a convenient Kafka streams way to handle this? I don't think there is a concept of a DLQ...
What are the alternative ways to stop Kafka jamming on a "bad message"?
What alternative error handling approaches are there?
For completeness here is my code (pseudo-ish):
class Document {
// Fields
}
class AnalysedDocument {
Document document;
String rawValue;
Exception exception;
Analysis analysis;
// All being well
AnalysedDocument(Document document, Analysis analysis) {...}
// Analysis failed
AnalysedDocument(Document document, Exception exception) {...}
// Deserialisation failed
AnalysedDocument(String rawValue, Exception exception) {...}
}
KStreamBuilder builder = new KStreamBuilder();
KStream<String, AnalysedDocument> analysedDocumentStream = builder
.stream(Serdes.String(), Serdes.String(), "master")
.mapValues(new ValueMapper<String, AnalysedDocument>() {
@Override
public AnalysedDocument apply(String rawValue) {
Document document;
try {
// Deserialise
document = ...
} catch (Exception e) {
return new AnalysedDocument(rawValue, e);
}
try {
// Perform analysis
Analysis analysis = ...
return new AnalysedDocument(document, analysis);
} catch (Exception e) {
return new AnalysedDocument(document, e);
}
}
});
// Branch based on whether analysis mapping failed to produce errorStream and successStream
errorStream.to(Serdes.String(), customPojoSerde(), "error");
successStream.to(Serdes.String(), customPojoSerde(), "analysed");
KafkaStreams streams = new KafkaStreams(builder, config);
streams.start();
Any help greatly appreciated.
Right now, Kafka Streams offers only limited error handling capabilities. There is work in progress to simplify this. For now, your overall approach seems to be a good way to go.
One comment about handling de/serialization errors: handling those errors manually requires you to do de/serialization "manually". This means you need to configure ByteArraySerdes for key and value for the input/output topics of your Streams app and add a map() that does the de/serialization (i.e., KStream<byte[],byte[]> -> map() -> KStream<keyType,valueType>, or the other way round if you also want to catch serialization exceptions). Otherwise, you cannot try-catch deserialization exceptions.
With your current approach, you "only" validate that the given string represents a valid document, but it could be the case that the message itself is corrupted and cannot be converted into a String in the source operator in the first place. Thus, you don't actually cover deserialization exceptions with your code. However, if you are sure a deserialization exception can never happen, your approach would be sufficient, too.
Update
This issue is tackled via KIP-161 and will be included in the next release, 1.0.0. It allows you to register a callback via the parameter default.deserialization.exception.handler. The handler will be invoked every time an exception occurs during deserialization and allows you to return a DeserializationHandlerResponse (CONTINUE -> drop the record and move on, or FAIL, which is the default).
Update 2
With KIP-210 (will be part of Kafka 1.1) it's also possible to handle errors on the producer side, similar to the consumer part, by registering a ProductionExceptionHandler via the config default.production.exception.handler that can return CONTINUE.
Update Mar 23, 2018: Kafka 1.0 provides much better and easier handling for bad messages ("poison pills") via KIP-161 than what I described below. See default.deserialization.exception.handler in the Kafka 1.0 docs.
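As a rough sketch of the configuration described in the two updates above, both handlers are registered through ordinary Streams config properties. The snippet only builds a java.util.Properties object using the property names quoted in the text. LogAndContinueExceptionHandler is a handler class that ships with Kafka Streams; com.example.SkipOnSendFailureHandler is a hypothetical class name standing in for your own ProductionExceptionHandler implementation.

```java
import java.util.Properties;

class ErrorHandlingConfig {
    // Builds only the error-handling part of a Kafka Streams configuration.
    static Properties errorHandlingProperties() {
        Properties props = new Properties();
        // KIP-161: log the bad record and continue instead of failing (FAIL is the default).
        props.put("default.deserialization.exception.handler",
                "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler");
        // KIP-210: hypothetical class name; replace it with your own
        // ProductionExceptionHandler implementation.
        props.put("default.production.exception.handler",
                "com.example.SkipOnSendFailureHandler");
        return props;
    }

    public static void main(String[] args) {
        errorHandlingProperties().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These entries would be merged into the same configuration object that is passed to KafkaStreams alongside the application id and bootstrap servers.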
This could potentially be things like messages that I can't deserialize properly [...]
Ok, my answer here focuses on the (de)serialization issues as this might be the most tricky scenario to handle for most users.
[...] or perhaps the processing/filtering logic fails in some unexpected way (I have no external dependencies so there should be no transient errors of that sort).
The same thinking (for deserialization) can also be applied to failures in the processing logic. Here, most people tend to gravitate towards option 2 below (minus the deserialization part), but YMMV.
I was considering wrapping all my processing/filtering code in a try catch and if an exception was raised then routing to an "error topic". Then I can study the message and modify it or fix my code as appropriate and then replay it on to master. If I let any exceptions propagate, the stream seems to get jammed and no more messages are picked up.
Is this approach considered best practice?
Yes, at the moment this is the way to go. Essentially, the two most common patterns are (1) skipping corrupted messages or (2) sending corrupted records to a quarantine topic aka a dead letter queue.
Is there a convenient Kafka streams way to handle this? I don't think there is a concept of a DLQ...
Yes, there is a way to handle this, including the use of a dead letter queue. However, it's (at least IMHO) not that convenient yet. If you have any feedback on how the API should allow you to handle this -- e.g. via a new or updated method, a configuration setting ("if serialization/deserialization fails send the problematic record to THIS quarantine topic") -- please let us know. :-)
What are the alternative ways to stop Kafka jamming on a "bad message"?
What alternative error handling approaches are there?
See my examples below.
FWIW, the Kafka community is also discussing the addition of a new CLI tool that allows you to skip over corrupted messages. However, as a user of the Kafka Streams API, I think ideally you want to handle such scenarios directly in your code, and fall back to CLI utilities only as a last resort.
Here are some patterns for the Kafka Streams DSL to handle corrupted records/messages aka "poison pills". This is taken from http://docs.confluent.io/current/streams/faq.html#handling-corrupted-records-and-deserialization-errors-poison-pill-messages
Option 1: Skip corrupted records with flatMap
This is arguably what most users would like to do.
We use flatMap because it allows you to output zero, one, or more output records per input record. In the case of a corrupted record we output nothing (zero records), thereby ignoring/skipping the corrupted record.
Benefit of this approach compared to the other ones listed here: We need to manually deserialize a record only once!
Drawback of this approach: flatMap "marks" the input stream for potential data re-partitioning, i.e. if you perform a key-based operation such as groupings (groupBy/groupByKey) or joins afterwards, your data will be re-partitioned behind the scenes. Since this might be a costly step we don't want that to happen unnecessarily. If you KNOW that the record keys are always valid OR that you don't need to operate on the keys (thus keeping them as "raw" keys in byte[] format), you can change from flatMap to flatMapValues, which will not result in data re-partitioning even if you join/group/aggregate the stream later.
Code example:
Serde<byte[]> bytesSerde = Serdes.ByteArray();
Serde<String> stringSerde = Serdes.String();
Serde<Long> longSerde = Serdes.Long();
// Input topic, which might contain corrupted messages
KStream<byte[], byte[]> input = builder.stream(bytesSerde, bytesSerde, inputTopic);
// Note how the returned stream is of type KStream<String, Long>,
// rather than KStream<byte[], byte[]>.
KStream<String, Long> doubled = input.flatMap(
(k, v) -> {
try {
// Attempt deserialization
String key = stringSerde.deserializer().deserialize(inputTopic, k);
long value = longSerde.deserializer().deserialize(inputTopic, v);
// Ok, the record is valid (not corrupted). Let's take the
// opportunity to also process the record in some way so that
// we haven't paid the deserialization cost just for "poison pill"
// checking.
return Collections.singletonList(KeyValue.pair(key, 2 * value));
}
catch (SerializationException e) {
// log + ignore/skip the corrupted message
System.err.println("Could not deserialize record: " + e.getMessage());
}
return Collections.emptyList();
}
);
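The zero-or-one output idea behind option 1 can be tried without a Kafka cluster. The following is a minimal sketch using plain java.util.stream as an analogue, not the Kafka Streams API: decodeLong is a hypothetical stand-in for a long deserializer that rejects anything that is not exactly 8 bytes, and IllegalArgumentException plays the role of SerializationException.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SkipCorruptedSketch {
    // Decodes an 8-byte big-endian long; anything else is treated as corrupted.
    static long decodeLong(byte[] raw) {
        if (raw == null || raw.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes");
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    // Emits one doubled value for a valid record and nothing for a corrupted one.
    static Stream<Long> doubleOrSkip(byte[] raw) {
        try {
            return Stream.of(2 * decodeLong(raw));
        } catch (IllegalArgumentException e) {
            System.err.println("Skipping corrupted record: " + e.getMessage());
            return Stream.empty();
        }
    }

    static List<Long> process(List<byte[]> input) {
        return input.stream()
                .flatMap(SkipCorruptedSketch::doubleOrSkip)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        byte[] valid = ByteBuffer.allocate(8).putLong(21L).array();
        byte[] corrupted = {1, 2, 3};
        System.out.println(process(Arrays.asList(valid, corrupted))); // prints [42]
    }
}
```

As in the Kafka Streams version, each valid record is decoded exactly once, and the decoded value is used immediately.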
Option 2: dead letter queue with branch
Compared to option 1 (which ignores corrupted records) option 2 retains corrupted messages by filtering them out of the "main" input stream and writing them to a quarantine topic (think: dead letter queue). The drawback is that, for valid records, we must pay the manual deserialization cost twice.
KStream<byte[], byte[]> input = ...;
KStream<byte[], byte[]>[] partitioned = input.branch(
(k, v) -> {
boolean isValidRecord = false;
try {
stringSerde.deserializer().deserialize(inputTopic, k);
longSerde.deserializer().deserialize(inputTopic, v);
isValidRecord = true;
}
catch (SerializationException ignored) {}
return isValidRecord;
},
(k, v) -> true
);
// partitioned[0] is the KStream<byte[], byte[]> that contains
// only valid records. partitioned[1] contains only corrupted
// records and thus acts as a "dead letter queue".
KStream<String, Long> doubled = partitioned[0].map(
(key, value) -> KeyValue.pair(
// Must deserialize a second time unfortunately.
stringSerde.deserializer().deserialize(inputTopic, key),
2 * longSerde.deserializer().deserialize(inputTopic, value)));
// Don't forget to actually write the dead letter queue back to Kafka!
partitioned[1].to(Serdes.ByteArray(), Serdes.ByteArray(), "quarantine-topic");
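The split into a "valid" stream and a quarantine stream can likewise be sketched with plain Java collections. This is an analogue of branch(), not Kafka Streams code: decodeUtf8 is a hypothetical strict decoder that rejects malformed UTF-8, and the map returned by split stands in for the two branches (true for valid records, false for the dead letter queue).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class QuarantineSketch {
    // Strict UTF-8 decoding: malformed input raises instead of being replaced.
    static String decodeUtf8(byte[] raw) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(raw))
                .toString();
    }

    // Validation only; the decoded value is thrown away, as in option 2.
    static boolean isValid(byte[] raw) {
        try {
            decodeUtf8(raw);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Key true holds the valid records, key false holds the quarantined ones.
    static Map<Boolean, List<byte[]>> split(List<byte[]> input) {
        return input.stream().collect(Collectors.partitioningBy(QuarantineSketch::isValid));
    }

    public static void main(String[] args) {
        byte[] good = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] bad = {(byte) 0xC3, (byte) 0x28}; // invalid two-byte UTF-8 sequence
        Map<Boolean, List<byte[]>> parts = split(Arrays.asList(good, bad));
        System.out.println("valid=" + parts.get(true).size()
                + " quarantined=" + parts.get(false).size());
    }
}
```

The quarantined records stay as raw bytes, so they can be written out unchanged and inspected or replayed later.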
Option 3: Skip corrupted records with filter
I only mention this for completeness. This option looks like a mix of options 1 and 2, but is worse than either of them. Compared to option 1, you must pay the manual deserialization cost for valid records twice (bad!). Compared to option 2, you lose the ability to retain corrupted records in a dead letter queue.
KStream<byte[], byte[]> validRecordsOnly = input.filter(
(k, v) -> {
boolean isValidRecord = false;
try {
stringSerde.deserializer().deserialize(inputTopic, k);
longSerde.deserializer().deserialize(inputTopic, v);
isValidRecord = true;
}
catch (SerializationException e) {
// log + ignore/skip the corrupted message
System.err.println("Could not deserialize record: " + e.getMessage());
}
return isValidRecord;
}
);
KStream<String, Long> doubled = validRecordsOnly.map(
(key, value) -> KeyValue.pair(
// Must deserialize a second time unfortunately.
stringSerde.deserializer().deserialize(inputTopic, key),
2 * longSerde.deserializer().deserialize(inputTopic, value)));
Any help greatly appreciated.
I hope I could help. If yes, I'd appreciate your feedback on how we could improve the Kafka Streams API to handle failures/exceptions in a better/more convenient way than today. :-)
For the processing logic you could take this approach:
someKStream
.mapValues(inputValue -> {
// for each execution the below "return" could provide a different class than the previous run!
// e.g. "return isFailedProcessing ? failValue : successValue;"
// where failValue and successValue have no related classes
return someObject; // someObject class vary at runtime depending on your business
}) // here you'll have KStream<whateverKeyClass, Object> -> yes, Object for the value!
// you could have a different logic for choosing
// the target topic, below is just an example
.to((k, v, recordContext) -> v instanceof failValueClass ?
"dead-letter-topic" : "success-topic",
// you could completelly ignore the "Produced" part
// and rely on spring-boot properties only, e.g.
// spring.kafka.streams.properties.default.key.serde=yourKeySerde
// spring.kafka.streams.properties.default.value.serde=org.springframework.kafka.support.serializer.JsonSerde
Produced.with(yourKeySerde,
// JsonSerde could be an instance configured as you need
// (with type mappings or headers setting disabled, etc)
new JsonSerde<>()));
Your classes, though different and landing into different topics, will serialize as expected.
If you are not using to() but instead want to continue with other processing, you could use branch() and split the logic based on the class of the Kafka value; the trick for branch() is to return KStream<keyClass, ?>[] so that the individual array items can then be cast to the appropriate class.
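The routing decision itself is just a runtime type check, which can be sketched in plain Java. In this minimal illustration Success, Failure, process and topicFor are all hypothetical names, and no Kafka classes are involved; topicFor plays the role of the topic-choosing lambda passed to to().

```java
class TopicRoutingSketch {
    // Hypothetical, unrelated value types for the two outcomes.
    static final class Success {
        final int doubled;
        Success(int doubled) { this.doubled = doubled; }
    }

    static final class Failure {
        final String rawValue;
        final Exception cause;
        Failure(String rawValue, Exception cause) {
            this.rawValue = rawValue;
            this.cause = cause;
        }
    }

    // Stand-in for the mapValues step: returns a Success or a Failure as Object.
    static Object process(String raw) {
        try {
            return new Success(2 * Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return new Failure(raw, e);
        }
    }

    // Stand-in for the topic-choosing lambda: picks the topic from the value's class.
    static String topicFor(Object value) {
        return value instanceof Failure ? "dead-letter-topic" : "success-topic";
    }

    public static void main(String[] args) {
        System.out.println(topicFor(process("21")));   // prints success-topic
        System.out.println(topicFor(process("oops"))); // prints dead-letter-topic
    }
}
```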
If you want to send an exception (custom exception) to another topic (ERROR_TOPIC_NAME):
@Bean
public KStream<String, ?> kafkaStreamInput(StreamsBuilder kStreamBuilder) {
KStream<String, InputModel> input = kStreamBuilder.stream(INPUT_TOPIC_NAME);
return service.messageHandler(input);
}
public KStream<String, ?> messageHandler(KStream<String, InputModel> inputTopic) {
KStream<String, Object> output;
output = inputTopic.mapValues(v -> {
try {
//return InputModel
return normalMethod(v);
} catch (Exception e) {
//return ErrorModel
return errorHandler(e);
}
});
output.filter((k, v) -> (v instanceof ErrorModel)).to(KafkaStreamsConfig.ERROR_TOPIC_NAME);
output.filter((k, v) -> (v instanceof InputModel)).to(KafkaStreamsConfig.OUTPUT_TOPIC_NAME);
return output;
}
If you want to handle Kafka exceptions and skip them:
@Autowired
public ConsumerErrorHandler(
KafkaProducer<String, ErrorModel> dlqProducer) {
this.dlqProducer = dlqProducer;
}
@Bean
ConcurrentKafkaListenerContainerFactory<?, ?> kafkaListenerContainerFactory(
ConcurrentKafkaListenerContainerFactoryConfigurer configurer,
ObjectProvider<ConsumerFactory<Object, Object>> kafkaConsumerFactory) {
ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
configurer.configure(factory, kafkaConsumerFactory.getIfAvailable());
factory.setErrorHandler(((exception, data) -> {
ErrorModel errorModel = ErrorModel.builder().message()
.status("500").build();
assert data != null;
dlqProducer.send(new ProducerRecord<>(DLQ_TOPIC, data.key().toString(), errorModel));
}));
return factory;
}
All of the above answers, although valid and useful, assume that your streams topology is stateless. For example, going back to the original example,
master topic -> my processing in a mapper/filter -> output topics
"my processing in a mapper/filter" should be stateless, i.e. not re-partitioning (writing to a persistent re-partition topic) or doing a toTable() (writing to a changelog topic). If the processing fails further down the topology and you commit the transaction (by following any of the three options mentioned above: flatMap, branch, or filter), then you have to cater for eventually deleting that inconsistent state, either manually or programmatically. That would mean writing extra custom code to automate this.
I would personally expect Streams to also give you a LogAndSkip option for any unhandled runtime exception, not only for deserialization and production ones.
Has anyone any ideas on this?
I don't believe these examples work at all when working with Avro.
When the schema can't be resolved (i.e there is bad/non-avro message corrupting the topic, for example) there is no key or value to deserialize in the first place because by the time the DSL .branch() code is called, the exception has already been thrown (or handled).
Can anyone confirm if this is indeed the case? The very fluent approach you refer to here isn't possible when working with Avro?
KIP-161 does explain how to use a handler, however, it's much more fluent to see it as part of the topology.
I am new to JavaFX coding (in IntelliJ IDEA), and have been reading / searching all over on how to swap scenes in the main controller / container. I found jewelsea's answer in another thread (Loading new fxml in the same scene), but am receiving an error on the following code.
public static void loadVista(String fxml) {
try {
mainController.setVista(
FXMLLoader.load(VistaNavigator.class.getResource(fxml)));
} catch (IOException e) {
e.printStackTrace();
}
}
The error I am receiving is the following:
Error:(56, 27) java: method setVista in class sample.MainController cannot be applied to given types;
required: javafx.scene.Node
found: java.lang.Object
reason: actual argument java.lang.Object cannot be converted to javafx.scene.Node by method invocation conversion
I know other people have gotten this to work, but all I have done is create a new project and copy the code. Can anyone help me?
It looks like you are trying to compile this with JDK 1.7: the code will only work in JDK 1.8 (the difference here being the enhanced type inference for generic methods introduced in JDK 1.8).
You should configure IntelliJ to use JDK 1.8 instead of 1.7.
If you want to try to revert the code to be JDK 1.7 compatible, you can try to replace it with
public static void loadVista(String fxml) {
try {
mainController.setVista(
FXMLLoader.<Node>load(VistaNavigator.class.getResource(fxml)));
} catch (IOException e) {
e.printStackTrace();
}
}
(with the appropriate import javafx.scene.Node;, if needed). Of course, there may be other incompatibilities since the code you are using is targeted at JDK 1.8.
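The difference between the two call styles can be reproduced without JavaFX. In this minimal sketch Node, Pane, load and setVista are hypothetical stand-ins that only mimic the shape of the real API: load is a generic method whose return type is chosen by the caller, just like FXMLLoader.load.

```java
class TypeWitnessSketch {
    static class Node { }               // hypothetical stand-in for javafx.scene.Node
    static class Pane extends Node { }  // hypothetical stand-in for a loaded layout

    // The return type is a type parameter that the caller (or the compiler) picks.
    @SuppressWarnings("unchecked")
    static <T> T load(String resource) {
        return (T) new Pane();
    }

    // Accepts only a Node, like MainController.setVista in the question.
    static String setVista(Node node) {
        return node.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        // Java 8 and later infer T = Node from the parameter type of setVista.
        System.out.println(setVista(load("vista1.fxml")));
        // Explicit type witness: the form suggested above for older compilers.
        System.out.println(setVista(TypeWitnessSketch.<Node>load("vista1.fxml")));
    }
}
```

Both calls print Pane when compiled with JDK 8 or later; only the second form is expected to compile on JDK 1.7, where T would otherwise be inferred as Object.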
According to Microsoft's samples, here's how one would go about streaming a file through WCF:
// Service class which implements the service contract
public class StreamingService : IStreamingSample
{
public System.IO.Stream GetStream(string data)
{
//this file path assumes the image is in
// the Service folder and the service is executing
// in service/bin
string filePath = Path.Combine(
System.Environment.CurrentDirectory,
".\\image.jpg");
//open the file, this could throw an exception
//(e.g. if the file is not found)
//having includeExceptionDetailInFaults="True" in config
// would cause this exception to be returned to the client
try
{
FileStream imageFile = File.OpenRead(filePath);
return imageFile;
}
catch (IOException ex)
{
Console.WriteLine(
String.Format("An exception was thrown while trying to open file {0}", filePath));
Console.WriteLine("Exception is: ");
Console.WriteLine(ex.ToString());
throw ex;
}
}
...
Now, how do I know who's responsible for releasing the FileStream when the transfer is done?
EDIT: If the code is put inside a "using" block the stream gets shut down before the client receives anything.
The service should clean up and not the client. WCF's default for OperationBehaviorAttribute.AutoDisposeParameters seems to be true, therefore it should do the disposing for you. Although there doesn't seem to be a fixed answer on this.
You could try using the OperationContext.OperationCompleted Event:
OperationContext clientContext = OperationContext.Current;
clientContext.OperationCompleted += new EventHandler(delegate(object sender, EventArgs args)
{
if (fileStream != null)
fileStream.Dispose();
});
Put that before your return.
Check this blog
Short answer: the calling code, via a using block.
Long answer: sample code should never be held up as an exemplar of good practice, it's only there to illustrate one very specific concept. Real code would never have a try block like that, it adds no value to the code. Errors should be logged at the topmost level, not down in the depths. Bearing that in mind, the sample becomes a single expression, File.OpenRead(filePath), that would be simply plugged into the using block that requires it.
UPDATE (after seeing more code):
Just return the stream from the function, WCF will decide when to dispose it.
The stream needs to be closed by the party responsible for reading it. For example, if the service returns the stream to the client, it is the client application's responsibility to close the stream, as the service doesn't know or control when the client finishes reading the stream. Also, WCF will not take care of closing the stream, again because it doesn't know when the receiving party has finished reading. :)
HTH,
Amit Bhatia
Working with Linq2Sql as a driver for a Wcf Service. Let's go bottom up....
Down at the bottom, we have the method that hits Linq2Sql...
public virtual void UpdateCmsDealer(CmsDealer currentCmsDealer)
{
this.Context.CmsDealers.Attach(currentCmsDealer,
this.ChangeSet.GetOriginal(currentCmsDealer));
}
That gets used by my Wcf service as such...
public bool UpdateDealer(CmsDealer dealer)
{
try
{
domainservice.UpdateCmsDealer(dealer);
return true;
}
catch
{
return false;
}
}
And called from my Wpf client code thus (pseudocode below)...
[...pull the coreDealer object from Wcf, it is a CmsDealer...]
[...update the coreDealer object with new data, not touching the relation fields...]
try
{
contextCore.UpdateDealer(coreDealer);
}
catch (Exception ex)
{
[...handle the error...]
}
Now, the CmsDealer type has >1< foreign key relationship; it uses a "StateId" field to link to a CmsItemStates table. So yes, in the above, coreDealer.StateId is filled, and accessing coreDealer.CmsItemState.Title does show me the title of the appropriate state.
Now, here is the thing... if you comment out the line...
domainservice.UpdateCmsDealer(dealer);
In the Wcf service it STILL bombs with the exception below, which indicates to me that it isn't really a Linq2Sql problem but rather a Linq2Sql over Wcf issue.
"System.Data.Linq.ForeignKeyReferenceAlreadyHasValueException was unhandled by user code
Message="Operation is not valid due to the current state of the object."
InnerException is NULL. The end result of it all, when it bubbles up to the error handler (the catch ex block), is that the exception message complains about the deserializer. When I can snatch a debug, the actual code throwing the error is this snippet from the CmsDealer model code built by Linq2Sql.
[Column(Storage="_StateId", DbType="UniqueIdentifier NOT NULL")]
public System.Guid StateId
{
get
{
return this._StateId;
}
set
{
if ((this._StateId != value))
{
if (this._CmsItemState.HasLoadedOrAssignedValue)
{
throw new System.Data.Linq.ForeignKeyReferenceAlreadyHasValueException();
}
this.OnStateIdChanging(value);
this.SendPropertyChanging();
this._StateId = value;
this.SendPropertyChanged("StateId");
this.OnStateIdChanged();
}
}
}
In short, it would appear that some stuff is happening "under the covers", which is fine, but the documentation is nonexistent. Hell, googling for "ForeignKeyReferenceAlreadyHasValueException" turns up almost nothing :)
I would prefer to continue working with the Linq2Sql objects directly over Wcf. I could, if needed, create a flat proxy class that had no association, ship it up the wire to Wcf then use it as a data source for a server side update... but that seems like a lot of effort when clearly this is an intended scenario... right?
Thanks!
The default serializer will first set the State, which will set the StateId. After that it will try to set the serialized StateId and then the exception is thrown.
The problem is that you did not specify that you want your classes to be decorated with the DataContract attribute.
Go to the properties of your LinqToSqlGenerator and set the Serialization Mode to Unidirectional.
This will cause the tool to add the DataMember attribute to the required properties and you will see that the StateId will not be a DataMember since it will be automatically set when the State Property is set while deserializing.
The error is likely due to something changing the fk value after it has been initially set - are you sure you don't have some custom initialisation code somewhere that might be initially setting the value?
You could breakpoint the set (where it's throwing), and step out each time it's set (skipping the exception if you need to) which should hopefully point you in the right direction.