Testing Flink with embedded Kafka

I have a simple Flink application that sums up events with the same id and timestamp within the last minute:
DataStream<String> input = env
    .addSource(new FlinkKafkaConsumer<>(topic, new SimpleStringSchema(), consumerProps))
    .uid("app");

DataStream<Pixel> pixels = input.map(record -> mapper.readValue(record, Pixel.class));

pixels
    .assignTimestampsAndWatermarks(new TimestampsAndWatermarks())
    .keyBy("id")
    .timeWindow(Time.minutes(1))
    .sum("constant")
    .addSink(simpleNotificationServiceSink);

env.execute(jobName);
private static class TimestampsAndWatermarks extends BoundedOutOfOrdernessTimestampExtractor<Pixel> {

    public TimestampsAndWatermarks() {
        super(Time.seconds(90));
    }

    // timestampReadable is the timestamp rounded to minutes, in format yyyyMMddhhmm
    @Override
    public long extractTimestamp(Pixel pixel) {
        return Long.parseLong(pixel.timestampReadable);
    }
}
I would like to implement the following scenario:
Start embedded Kafka
Publish a couple of messages to the topic
Consume the messages with Flink
Check the correctness of the output produced by Flink
Does Flink provide utilities to test the job with embedded Kafka? If yes, what is the recommended approach?
Thanks.

There's a JUnit rule you can use to bring up an embedded Kafka broker (see https://github.com/charithe/kafka-junit).
To have tests that terminate cleanly, try something like this:
public class TestDeserializer extends YourKafkaDeserializer<T> {

    // Tests send this marker as the last record to terminate the stream.
    public final static String END_APP_MARKER = "END_APP_MARKER";

    @Override
    public boolean isEndOfStream(ParseResult<T> nextElement) {
        if (nextElement.getParseError() == null)
            return false;
        return END_APP_MARKER.equals(nextElement.getParseError().getRawData());
    }
}
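Putting the two together, a test for your scenario could look roughly like the sketch below. It is only a sketch: it assumes TestDeserializer is (or wraps) a Flink DeserializationSchema<String> that passes raw records through and terminates on the marker, CollectSink is a hypothetical sink that gathers results in a static list, the topic name and JSON payloads are made up, and the kafka-junit helper calls (EphemeralKafkaBroker.create(), helper().produceStrings(), helper().consumerConfig()) should be double-checked against the library version you use:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.junit.ClassRule;
import org.junit.Test;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.github.charithe.kafka.EphemeralKafkaBroker;
import com.github.charithe.kafka.KafkaJunitRule;

import static org.junit.Assert.assertEquals;

public class FlinkWithEmbeddedKafkaTest {

    // Brings up an embedded Kafka broker for the duration of the test class.
    @ClassRule
    public static KafkaJunitRule kafkaRule = new KafkaJunitRule(EphemeralKafkaBroker.create());

    private static final ObjectMapper mapper = new ObjectMapper();

    // Hypothetical sink that collects the job output in a static list for assertions.
    public static class CollectSink implements SinkFunction<Pixel> {
        static final List<Pixel> values = Collections.synchronizedList(new ArrayList<>());

        @Override
        public void invoke(Pixel value) {
            values.add(value);
        }
    }

    @Test
    public void sumsPixelsWithSameIdAndTimestamp() throws Exception {
        // 1. Publish a couple of messages plus the end-of-stream marker.
        kafkaRule.helper().produceStrings("pixels",
                "{\"id\":\"a\",\"timestampReadable\":\"202401011200\",\"constant\":1}",
                "{\"id\":\"a\",\"timestampReadable\":\"202401011200\",\"constant\":1}",
                TestDeserializer.END_APP_MARKER);

        // 2. Run the job against the embedded broker; execute() returns once
        //    isEndOfStream() fires for the marker record.
        Properties consumerProps = kafkaRule.helper().consumerConfig();
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        env.addSource(new FlinkKafkaConsumer<>("pixels", new TestDeserializer(), consumerProps))
                .map(record -> mapper.readValue(record, Pixel.class))
                .assignTimestampsAndWatermarks(new TimestampsAndWatermarks())
                .keyBy("id")
                .timeWindow(Time.minutes(1))
                .sum("constant")
                .addSink(new CollectSink());
        env.execute("embedded-kafka-test");

        // 3. Check the output produced by Flink: one window result for the key/timestamp.
        assertEquals(1, CollectSink.values.size());
    }
}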

Handling a received message not covered in become

I am using Akka.NET to develop a logistics simulation.
Having tried various patterns, it seems to me that FSM-type behaviour using become will substantially simplify development.
The system has a repeating clock tick message that all relevant actors receive in order to simulate accelerated passage of time for the entire simulation system. This clock tick message should be handled by all actors that are subscribed to it regardless of which message loop is currently active for any specific actor.
Am I correct in thinking that the only way to handle the clock message in all message loops is by explicitly checking for it in all message loops, or is there a way of defining messages that are handled regardless of which message loop is active?
If the former is the case, my idea is to check for a clock tick message in a ReceiveAny, which all the message loops need to have anyway, and then pass it on to an appropriate handler.
You could use stashing to stash the messages while simulating. I came up with the following code sample to better explain how that works:
// See https://aka.ms/new-console-template for more information
using Akka.Actor;
using Akka.NET_StackOverflow_Questions_tryout.Questions;

var actorSystem = ActorSystem.Create("stackOverFlow");
var sim = actorSystem.ActorOf(Props.Create(() => new StackOverflow71079733()));

sim.Tell(5000L);
sim.Tell("string");
sim.Tell(1000L);
sim.Tell("strin2");
sim.Tell("strin3");

Console.ReadLine();

public class StackOverflow71079733 : ReceiveActor, IWithUnboundedStash
{
    public IStash Stash { get; set; }
    private readonly IActorRef _simActor;

    public StackOverflow71079733()
    {
        _simActor = Context.ActorOf<SimulationActor>();
        ClockTickMessage();
    }

    private void Simulate(long ticks)
    {
        Console.WriteLine($"Ticks: {ticks}");
        Receive<Done>(d =>
        {
            Console.WriteLine("Simulation done");
            Become(ClockTickMessage);
            Stash?.Unstash();
        });

        // You can add additional messages that may need to be handled while the simulation is happening,
        // e.g.:
        Receive<string>(s => Console.WriteLine($"received in '{s}' in simulation"));

        // While the simulation is on-going, add any other incoming message to the stash
        // so that it is not lost and can be picked up and handled after the simulation is done.
        ReceiveAny(any =>
        {
            Stash.Stash();
            Console.WriteLine($"Stashed Ticks: {any}");
        });

        _simActor.Tell(ticks);
    }

    private void ClockTickMessage()
    {
        // you can create an object to represent the ClockTickMessage
        Receive<long>(ticks =>
        {
            Become(() => Simulate(ticks));
        });
    }
}

/// <summary>
/// The simulation runs in another actor so that the parent actor can keep receiving clock tick messages
/// in case the simulation takes a long time to complete.
/// </summary>
public sealed class SimulationActor : ReceiveActor
{
    private IActorRef _sender;

    public SimulationActor()
    {
        Receive<long>(l =>
        {
            _sender = Sender;
            Thread.Sleep(TimeSpan.FromMilliseconds(l));
            _sender.Tell(Done.Instance);
        });
    }
}

public sealed class Done
{
    public static Done Instance = new Done();
}

How to determine job's queue at runtime

Our web app allows the end user to set the queue of recurring jobs in the UI. (We create a queue for each server, named after the server, and allow users to choose which server runs the job.)
How the job is registered:
RecurringJob.AddOrUpdate<IMyTestJob>(input.Id, x => x.Run(), input.Cron, TimeZoneInfo.Local, input.QueueName);
It worked properly, but sometimes when checking the logs on Production we found that a job ran on the wrong queue (server). We don't have much access to Production, so we tried to reproduce it on Development, but it didn't happen there.
To temporarily fix this issue, we need to get the queue name while the job is running, compare it with the current server name, and stop the job when they are different.
Is this possible, and how can we get it from the PerformContext?
Note: we use Hangfire 1.7.9 and ASP.NET Core 3.1.
You may have a look at https://github.com/HangfireIO/Hangfire/pull/502
A dedicated filter intercepts the queue changes and restores the original queue.
I guess you can just stop the execution in a very similar filter, or set a parameter to cleanly stop execution during the IElectStateFilter.OnStateElection phase by changing the CandidateState to FailedState.
Maybe your problem comes from an already existing filter which messes with the queues.
Here is the code from the link above:
public class PreserveOriginalQueueAttribute : JobFilterAttribute, IApplyStateFilter
{
    public void OnStateApplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
    {
        var enqueuedState = context.NewState as EnqueuedState;

        // Activating only when enqueueing a background job
        if (enqueuedState != null)
        {
            // Checking if an original queue is already set
            var originalQueue = JobHelper.FromJson<string>(context.Connection.GetJobParameter(
                context.BackgroundJob.Id,
                "OriginalQueue"));

            if (originalQueue != null)
            {
                // Override any other queue value that is currently set (by other filters, for example)
                enqueuedState.Queue = originalQueue;
            }
            else
            {
                // Queueing for the first time, we should set the original queue
                context.Connection.SetJobParameter(
                    context.BackgroundJob.Id,
                    "OriginalQueue",
                    JobHelper.ToJson(enqueuedState.Queue));
            }
        }
    }

    public void OnStateUnapplied(ApplyStateContext context, IWriteOnlyTransaction transaction)
    {
    }
}
I have found a simple solution: since we know the recurring job id, we can get its information from JobStorage and compare its queue with the current queue (the current server name):
public bool IsCorrectQueue()
{
    List<RecurringJobDto> recurringJobs = Hangfire.JobStorage.Current.GetConnection().GetRecurringJobs();
    var myJob = recurringJobs.FirstOrDefault(x => x.Id.Equals("My job Id"));
    var definedQueue = myJob.Queue;
    var currentServerQueue = string.Concat(Environment.MachineName.ToLowerInvariant().Where(char.IsLetterOrDigit));
    return definedQueue == "default" || definedQueue == currentServerQueue;
}
Then check it inside the job:
public async Task Run()
{
    // Check correct queue
    if (!IsCorrectQueue())
    {
        Logger.Error("Wrong queue detected");
        return;
    }

    // Job logic
}

Analog of TestNG Reporter.getOutput() in Spock

I run API tests using Groovy and Spock.
Request/response data produced by third-party libraries appears on system out (I see it in the Jenkins log).
Question:
What is the proper way to start and stop recording system out for each test iteration into some list of strings?
TestNG has Reporter.getOutput(result), which returns all log entries that appeared during a test iteration run.
Is there something similar in Spock?
Am I right in assuming it should be some implementation of a run listener, where I start recording in beforeIteration() and attach it to the report in afterIteration()?
Solved by using the OutputCapture from spring-boot-starter-test.
A sample RunListener:
class RunListener extends AbstractRunListener {

    OutputCapture outputCapture

    void beforeSpec(SpecInfo spec) {
        Helper.log "[BEFORE SPEC]: ${spec.name}"
        outputCapture = new OutputCapture()
        outputCapture.captureOutput() // register a copy of the System.out and System.err streams
    }

    void beforeFeature(FeatureInfo feature) {
        Helper.log "[BEFORE FEATURE]: ${feature.name}", 2
    }

    void beforeIteration(IterationInfo iteration) {
        outputCapture.reset() // clear the stream copy before each test iteration
    }

    void afterIteration(IterationInfo iteration) {
    }

    void error(ErrorInfo error) {
        // attach the content of the copied stream to the report if the test iteration failed
        Allure.addAttachment("${error.method.iteration.name}_console_out", "text/html", outputCapture.toString(), "txt")
    }

    void afterFeature(FeatureInfo feature) {
    }

    void afterSpec(SpecInfo spec) {
        outputCapture.releaseOutput()
    }
}
In short:
OutputCapture is an implementation of OutputStream which logs to both the original out and a copy stream. The copy gets cleared in beforeIteration() and its content is attached to the report by the RunListener (in the sample above, when an error is reported), so each test iteration receives its own part of the output.

I need the answer of one JADE agent to depend on information from others and don't know how to do it

I'm new to JADE and I have 5 agents in Eclipse that share a formula for finding an average, and the question is how to send information from an agent to this formula for calculation?
I'll be glad if someone can help me with this.
For example, here is one of my agents. There's no formula in it, because I don't know how to represent it. This is the math expression for it: n += alfa(y(1,2) - y(1,1))
public class FirstAgent extends Agent {

    private Logger myLogger = Logger.getMyLogger(getClass().getName());

    public class WaitInfoAndReplyBehaviour extends CyclicBehaviour {

        public WaitInfoAndReplyBehaviour(Agent a) {
            super(a);
        }

        public void action() {
            ACLMessage msg = myAgent.receive();
            if (msg != null) {
                ACLMessage reply = msg.createReply();
                if (msg.getPerformative() == ACLMessage.REQUEST) {
                    String content = msg.getContent();
                    if ((content != null) && (content.indexOf("What is your number?") != -1)) {
                        myLogger.log(Logger.INFO, "Agent " + getLocalName() + " - Received Info Request from " + msg.getSender().getLocalName());
                        reply.setPerformative(ACLMessage.INFORM);
                        try {
                            reply.setContentObject(7);
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    } else {
                        myLogger.log(Logger.INFO, "Agent " + getLocalName() + " - Unexpected request [" + content + "] received from " + msg.getSender().getLocalName());
                        reply.setPerformative(ACLMessage.REFUSE);
                        reply.setContent("( UnexpectedContent (" + content + "))");
                    }
                } else {
                    myLogger.log(Logger.INFO, "Agent " + getLocalName() + " - Unexpected message [" + ACLMessage.getPerformative(msg.getPerformative()) + "] received from " + msg.getSender().getLocalName());
                    reply.setPerformative(ACLMessage.NOT_UNDERSTOOD);
                    reply.setContent("( (Unexpected-act " + ACLMessage.getPerformative(msg.getPerformative()) + ") )");
                }
                send(reply);
            } else {
                block();
            }
        }
    }

    protected void setup() {
        // register the behaviour so the agent starts handling requests
        addBehaviour(new WaitInfoAndReplyBehaviour(this));
    }
}
So from what I can make out, you want to (1) send a formula/task to multiple platforms, (2) have it performed locally, and (3) have the results communicated back.
I think there are at least two ways of doing this:
The first is sending an object in an ACLMessage using Java serialization. This is a more OOP approach and not very "agenty".
The second is cloning or creating a local task agent.
Using Java serialization (Solution 1)
Create an object for the calculation:
class CalculationTask implements Serializable {
    int n;

    void calculate() {
        n += alfa(y(1, 2) - y(1, 1)); // the formula from the question
    }
}
Send the calculation object in an ACLMessage from the senderAgent:
request.setContentObject(new CalculationTask());
Receive the calculation object in the receiverAgent and perform the calculation on it, then set the completed task in the response:
CalculationTask myTask = (CalculationTask) request.getContentObject();
myTask.calculate();
ACLMessage response = request.createReply();
response.setContentObject(myTask);
response.setPerformative(ACLMessage.INFORM);
send(response);
The senderAgent then receives the completed job:
ACLMessage inform = blockingReceive();
CalculationTask completeTask = (CalculationTask) inform.getContentObject();
completeTask.process(); // use the result
Creating local Task Agents (Solution 2)
The agent-oriented way of doing it would be to launch a task agent on each platform and have each task agent complete the task and respond appropriately.
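For illustration, a task agent for Solution 2 could look roughly like the sketch below. The TaskAgent name, the plain-text request format (two numbers for y(1,1) and y(1,2)) and the alfa coefficient are assumptions made up for the example, since the question does not define them:

import jade.core.Agent;
import jade.core.behaviours.CyclicBehaviour;
import jade.lang.acl.ACLMessage;
import jade.lang.acl.MessageTemplate;

public class TaskAgent extends Agent {

    protected void setup() {
        addBehaviour(new CyclicBehaviour(this) {
            public void action() {
                // Wait for a REQUEST carrying the inputs for the formula.
                ACLMessage request = myAgent.receive(MessageTemplate.MatchPerformative(ACLMessage.REQUEST));
                if (request == null) {
                    block();
                    return;
                }

                // Assumed content format: "y11 y12", e.g. "3.0 7.0".
                String[] parts = request.getContent().trim().split("\\s+");
                double y11 = Double.parseDouble(parts[0]);
                double y12 = Double.parseDouble(parts[1]);
                double alfa = 0.5;             // placeholder coefficient
                double n = alfa * (y12 - y11); // n += alfa(y(1,2) - y(1,1)) from the question

                // Report the local result back to the requester.
                ACLMessage reply = request.createReply();
                reply.setPerformative(ACLMessage.INFORM);
                reply.setContent(Double.toString(n));
                myAgent.send(reply);
            }
        });
    }
}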

Can collections / iterables / streams be passed into @ParameterizedTests?

In JUnit 5.0.0 M4 I could do this:
@ParameterizedTest
@MethodSource("generateCollections")
void testCollections(Collection<Object> collection) {
    assertOnCollection(collection);
}

private static Iterator<Collection<Object>> generateCollections() {
    Random generator = new Random();
    // We'll run as many tests as possible in 500 milliseconds.
    final Instant endTime = Instant.now().plusNanos(500000000);
    return new Iterator<Collection<Object>>() {
        @Override public boolean hasNext() {
            return Instant.now().isBefore(endTime);
        }

        @Override public Collection<Object> next() {
            // Dummy code
            return Arrays.asList("this", "that", Instant.now());
        }
    };
}
Or any number of other things that ended up with collections of one type or another being passed into my @ParameterizedTest. This no longer works: I now get the error
org.junit.jupiter.api.extension.ParameterResolutionException:
Error resolving parameter at index 0
I've been looking through the recent commits to SNAPSHOT and there are a few changes in the area, but I can't see anything that definitely changes this.
Is this a deliberate change? I'd ask this on a JUnit 5 developer channel but I can't find one. And it's not a bug per se: passing a collection is not a documented feature.
If this is a deliberate change, then this is a definite use-case for @TestFactory...
See https://github.com/junit-team/junit5/issues/872
The next snapshot build should fix the regression.
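In the meantime, the @TestFactory route mentioned in the question works as a stop-gap. Here is a sketch that reuses the generateCollections() iterator from the question, built on the standard DynamicTest.stream() factory:

import java.util.Collection;
import java.util.stream.Stream;

import org.junit.jupiter.api.DynamicTest;
import org.junit.jupiter.api.TestFactory;

class CollectionDynamicTests {

    // (copy the private static generateCollections() method from the question into this class)

    @TestFactory
    Stream<DynamicTest> collections() {
        // Each collection produced by the generator becomes its own dynamic test.
        return DynamicTest.stream(
                generateCollections(), // the Iterator<Collection<Object>> from the question
                collection -> "collection of size " + collection.size(),
                CollectionDynamicTests::assertOnCollection);
    }

    private static void assertOnCollection(Collection<Object> collection) {
        // same assertions as in the @ParameterizedTest version
    }
}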