correct way of using IgniteDataStreamer API - ignite

I have a multi-threaded application that keeps writing to an Ignite cache using a write() method. At startup it calls initialize(), which creates a cache and a streamer object. Once all threads are done, flush() is called, which writes any entries that failed to insert in write() and then closes the streamer. I have some questions:
In the write() method, what should I do with the IgniteFuture? Should I wait on it until it completes?
Does this future complete when the entry is written into the cache, or when it is placed into the streamer's internal buffer?
In the flush() method, I write all the entries that failed in write() before closing the streamer object. Is this the correct way?
Code below:
public void initialize(final String cacheName) {
    getOrCreateCache(cacheName, true); // create new cache or get existing if it already exists
    this.streamer = ignite.dataStreamer(cacheName);
    this.cacheName = cacheName;
}
public void write(K key, V value) {
    try {
        IgniteFuture<?> future = streamer.addData(key, value); // what to do with this future. Should i wait on that?
        numberOfEntriesWrittentIntoCache.incrementAndGet();
    } catch (IgniteInterruptedException | IgniteDataStreamerTimeoutException | CacheException | IllegalStateException e) {
        failedMap.put(key, value);
    }
}
public void flush() {
    try {
        if (streamer != null) {
            if (failedMap.size() > 0) {
                LOGGER.info("Writing " + failedMap.size() + " failed entries");
                failedMap.forEach((k, v) -> writeToGrid(k, v));
            }
        }
    } catch (IllegalStateException | CacheException | IgniteException e) {
        LOGGER.error("Exception while writing/closing ignite");
    } catch (Exception e) {
        LOGGER.error("Exception while writing/closing ignite");
    } finally {
        failedMap.clear();
        if (streamer != null) {
            streamer.close();
            streamer = null;
        }
        LOGGER.info("Number of entries written into cache are " + numberOfEntriesWrittentIntoCache.intValue());
    }
}

The data streamer sends data to other nodes in batches. The future returned from addData(...) completes when the batch containing the provided entry is flushed into the cache.
So if you wait on each future returned from addData(...), you may wait forever if autoFlushFrequency is not configured and neither flush() nor close() is called. And even if autoFlushFrequency is configured, each call to write() will block until its batch is flushed.
I don't see anything wrong with trying to write the failed entries to the data streamer again at the end of all processing, but I can't think of a case where it would actually be useful.
The only thing I would change is to wrap writeToGrid(k, v) inside the forEach in its own try-catch block. Otherwise a single exception will stop the processing of all remaining failed entries.
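The per-entry try-catch suggested above can be sketched without any Ignite dependency. In this minimal sketch, FailedEntryRetry and retryAll are hypothetical names, and the writer argument is only a stand-in for the asker's writeToGrid(k, v); it assumes the write fails with an unchecked exception.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

final class FailedEntryRetry {
    /**
     * Retries every previously failed entry once. Each write runs in its own
     * try-catch, so one failure does not stop the remaining entries.
     * Returns the entries that failed again.
     */
    static <K, V> Map<K, V> retryAll(Map<K, V> failed, BiConsumer<K, V> writer) {
        Map<K, V> stillFailed = new LinkedHashMap<>();
        failed.forEach((key, value) -> {
            try {
                // Stand-in for writeToGrid(key, value) from the question.
                writer.accept(key, value);
            } catch (RuntimeException e) {
                // Remember the entry and carry on with the next one.
                stillFailed.put(key, value);
            }
        });
        return stillFailed;
    }
}
```

In flush(), the returned map could be logged before failedMap is cleared, so entries that failed twice are not lost silently.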

Spring Integration testing a Files.inboundAdapter flow

I have this flow that I am trying to test but nothing works as expected. The flow itself works well but testing seems a bit tricky.
This is my flow:
@Configuration
@RequiredArgsConstructor
public class FileInboundFlow {
    private final ThreadPoolTaskExecutor threadPoolTaskExecutor;
    private String filePath;
    @Bean
    public IntegrationFlow fileReaderFlow() {
        return IntegrationFlows.from(Files.inboundAdapter(new File(this.filePath))
                        .filterFunction(...)
                        .preventDuplicates(false),
                endpointConfigurer -> endpointConfigurer.poller(
                        Pollers.fixedDelay(500)
                                .taskExecutor(this.threadPoolTaskExecutor)
                                .maxMessagesPerPoll(15)))
                .transform(new UnZipTransformer())
                .enrichHeaders(this::headersEnricher)
                .transform(Message.class, this::modifyMessagePayload)
                .route(Map.class, this::channelsRouter)
                .get();
    }
    private String channelsRouter(Map<String, File> payload) {
        boolean isZip = payload.values()
                .stream()
                .anyMatch(file -> isZipFile(file));
        return isZip ? ZIP_CHANNEL : XML_CHANNEL; // ZIP_CHANNEL and XML_CHANNEL are PublishSubscribeChannel
    }
    @Bean
    public SubscribableChannel xmlChannel() {
        var channel = new PublishSubscribeChannel(this.threadPoolTaskExecutor);
        channel.setBeanName(XML_CHANNEL);
        return channel;
    }
    @Bean
    public SubscribableChannel zipChannel() {
        var channel = new PublishSubscribeChannel(this.threadPoolTaskExecutor);
        channel.setBeanName(ZIP_CHANNEL);
        return channel;
    }
    // There is a @ServiceActivator on each channel
    @ServiceActivator(inputChannel = XML_CHANNEL)
    public void handleXml(Message<Map<String, File>> message) {
        ...
    }
    @ServiceActivator(inputChannel = ZIP_CHANNEL)
    public void handleZip(Message<Map<String, File>> message) {
        ...
    }
    // Plus a @Transformer on the XML_CHANNEL
    @Transformer(inputChannel = XML_CHANNEL, outputChannel = BUS_CHANNEL)
    private List<BusData> xmlFileToIngestionMessagePayload(Map<String, File> xmlFilesByName) {
        return xmlFilesByName.values()
                .stream()
                .map(...)
                .collect(Collectors.toList());
    }
}
I would like to test multiple cases. The first one checks the message payload published on each channel at the end of fileReaderFlow.
So I defined this test class:
@SpringBootTest
@SpringIntegrationTest
@ExtendWith(SpringExtension.class)
class FileInboundFlowTest {
    @Autowired
    private MockIntegrationContext mockIntegrationContext;
    @TempDir
    static Path localWorkDir;
    @BeforeEach
    void setUp() {
        copyFileToTheFlowDir(); // here I copy a file to trigger the flow
    }
    @Test
    void checkXmlChannelPayloadTest() throws InterruptedException {
        Thread.sleep(1000); // waiting for the flow execution
        PublishSubscribeChannel xmlChannel = this.getBean(XML_CHANNEL, PublishSubscribeChannel.class); // I extract the channel to listen to the message sent to it.
        xmlChannel.subscribe(message -> {
            assertThat(message.getPayload()).isInstanceOf(Map.class); // This is never executed
        });
    }
}
As expected, that test does not work, because assertThat(message.getPayload()).isInstanceOf(Map.class); is never executed.
After reading the documentation I didn't find any hint that would help me solve this issue. Any help would be appreciated! Thanks a lot.
First of all, channel.setBeanName(XML_CHANNEL); does not affect the target bean. You call it during the bean creation phase, and the dependency injection container knows nothing about this setting: it simply does not consult it. If you really want to dictate XML_CHANNEL as the bean name, look into the @Bean(name) attribute instead.
The problem in the test is that you are overlooking the asynchronous logic of the flow. Files.inboundAdapter() works in a completely different thread and emits messages outside of your test method. So even if you managed to subscribe to the channel in time, before any message is emitted to it, that doesn't mean your test would work correctly: the assertThat() would be performed on a different thread, so its result would never be reported in the JUnit context of your test method.
So, what I'd suggest is:
1. Have Files.inboundAdapter() stopped at the beginning of the test, before any setup you'd like to do in the test. Or at least don't place files into that filePath, so the channel adapter doesn't emit messages.
2. Take the channel from the application context and, if you wish, subscribe to it or use a ChannelInterceptor.
3. Pass an async barrier, e.g. a CountDownLatch, to that subscriber.
4. Start the channel adapter or put a file into the directory being scanned.
5. Wait for the async barrier before verifying some value or state.
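The barrier pattern from these steps can be illustrated in plain Java. This is only a sketch under stated assumptions: AsyncBarrierSketch and awaitFirstPayload are hypothetical names, the single-thread executor stands in for the polling channel adapter, and the Consumer stands in for the channel subscriber. No Spring Integration API is used here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

final class AsyncBarrierSketch {
    /** Returns the payload captured by the subscriber, or null on timeout or interruption. */
    static Object awaitFirstPayload(Object payloadToEmit, long timeoutMillis) {
        CountDownLatch received = new CountDownLatch(1);
        AtomicReference<Object> captured = new AtomicReference<>();

        // Subscribe before anything is emitted: capture the payload, then open the barrier.
        Consumer<Object> subscriber = payload -> {
            captured.set(payload);
            received.countDown();
        };

        // Stand-in for the channel adapter: emits on a different thread.
        ExecutorService emitter = Executors.newSingleThreadExecutor();
        try {
            emitter.execute(() -> subscriber.accept(payloadToEmit));
            // Wait on the barrier in the calling (test) thread.
            if (!received.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return null;
            }
            return captured.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            emitter.shutdownNow();
        }
    }
}
```

The point of the design is that the subscriber only records the payload; the assertions then run on the returned value in the test thread, where JUnit can report them.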

Curator LeaderLatch EOFException on shutdown

We use LeaderLatch to select a leader in our cluster.
We use it like this:
leaderLatch.addListener(new LeaderLatchListener() {
    @Override
    public void isLeader() {
        // create leader tasks runner
    }
    @Override
    public void notLeader() {
        // shutdown leader tasks runner
    }
});
leaderLatch.start();
leaderLatch.await();
We also have a graceful shutdown process:
CloseableUtils.closeQuietly(leaderLatch);
Now, the problem is that when I shut down a non-leader instance, the await() method throws an EOFException.
This is the code from LeaderLatch itself:
public void await() throws InterruptedException, EOFException
{
    synchronized(this)
    {
        while ( (state.get() == State.STARTED) && !hasLeadership.get() )
        {
            wait();
        }
    }
    if ( state.get() != State.STARTED )
    {
        throw new EOFException();
    }
}
Since I have closed it, the state is not STARTED but CLOSED, so an empty EOFException is thrown.
Is there a better way?
We use curator-recipes-4.2.0.
Regards,
Ido
The contract for await() is to not return until it owns the lock. It has no way of indicating that you don't own the lock other than to throw an exception. I suggest you use the version of await that takes a timeout and returns a boolean. You can then close the lock and check the result of await(). Do this in a loop if you want.
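A minimal sketch of that loop follows. To keep it self-contained it does not depend on Curator: TimedLatch is a hypothetical interface that mirrors the shape of the timed await(long, TimeUnit) method returning a boolean, and shuttingDown is an assumed flag set by the application's shutdown hook. The 100 ms poll interval is arbitrary.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class LeadershipWaiter {
    /** Hypothetical stand-in with the same shape as the timed await on a leader latch. */
    interface TimedLatch {
        boolean await(long timeout, TimeUnit unit) throws InterruptedException;
    }

    /**
     * Polls the latch with a short timeout until leadership is acquired
     * or a shutdown is requested. Returns true only if leadership was acquired.
     */
    static boolean awaitLeadership(TimedLatch latch, BooleanSupplier shuttingDown) {
        try {
            while (!shuttingDown.getAsBoolean()) {
                if (latch.await(100, TimeUnit.MILLISECONDS)) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        }
        return false;
    }
}
```

With a real latch, the call might look like awaitLeadership(leaderLatch::await, stopRequested::get), where stopRequested is an AtomicBoolean set before the latch is closed.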

getting apache ignite continuous query to work without enabling p2p class loading

I have been trying to get my Ignite continuous query code to work without enabling peer class loading, but the code does not work. I tried debugging and found that the call to cache.query(qry) fails with the error "Failed to marshal custom event". When I enable peer class loading, the code works as expected. Could someone provide guidance on how I can make this work without peer class loading?
Following is the code snippet that calls the continuous query.
public void subscribeEvent(IgniteCache<String,String> cache,String inKeyStr,ServerWebSocket websocket ){
    System.out.println("in thread "+Thread.currentThread().getId()+"-->"+"subscribe event");
    //ArrayList<String> inKeys = new ArrayList<String>(Arrays.asList(inKeyStr.split(",")));
    ContinuousQuery<String, String> qry = new ContinuousQuery<>();
    /****
     * Continuous Query Impl
     */
    inKeys = ","+inKeyStr+",";
    qry.setInitialQuery(new ScanQuery<String, String>((k, v) -> inKeys.contains(","+k+",")));
    qry.setTimeInterval(1000);
    qry.setPageSize(1);
    // Callback that is called locally when update notifications are received.
    // Factory<CacheEntryEventFilter<String, String>> rmtFilterFactory = new com.ccx.ignite.cqfilter.FilterFactory().init(inKeyStr);
    qry.setLocalListener(new CacheEntryUpdatedListener<String, String>() {
        @Override public void onUpdated(Iterable<CacheEntryEvent<? extends String, ? extends String>> evts) {
            for (CacheEntryEvent<? extends String, ? extends String> e : evts)
            {
                System.out.println("websocket locallsnr data in thread "+Thread.currentThread().getId()+"-->"+"key=" + e.getKey() + ", val=" + e.getValue());
                try{
                    websocket.writeTextMessage("key=" + e.getKey() + ", val=" + e.getValue());
                }
                catch (Exception e1){
                    System.out.println("exception local listener "+e1.getMessage());
                    qry.setLocalListener(null) ; }
            }
        }
    } );
    qry.setRemoteFilterFactory( new com.ccx.ignite.cqfilter.FilterFactory().init(inKeys));
    try{
        cur = cache.query(qry);
        for (Cache.Entry<String, String> e : cur)
        {
            System.out.println("websocket initialqry data in thread "+Thread.currentThread().getId()+"-->"+"key=" + e.getKey() + ", val=" + e.getValue());
            websocket.writeTextMessage("key=" + e.getKey() + ", val=" + e.getValue());
        }
    }
    catch (Exception e){
        System.out.println("exception cache.query "+e.getMessage());
    }
}
Following is the remote filter class, which I have packaged into a self-contained JAR and placed in Ignite's libs folder so that it can be picked up by the server nodes.
public class FilterFactory
{
    public Factory<CacheEntryEventFilter<String, String>> init(String inKeyStr ){
        System.out.println("factory init called jun22 ");
        return new Factory <CacheEntryEventFilter<String, String>>() {
            private static final long serialVersionUID = 5906783589263492617L;
            @Override public CacheEntryEventFilter<String, String> create() {
                return new CacheEntryEventFilter<String, String>() {
                    @Override public boolean evaluate(CacheEntryEvent<? extends String, ? extends String> e) {
                        //List inKeys = new ArrayList<String>(Arrays.asList(inKeyStr.split(",")));
                        System.out.println("inside remote filter factory ");
                        String inKeys = ","+inKeyStr+",";
                        return inKeys.contains(","+e.getKey()+",");
                    }
                };
            }
        };
    }
}
The overall logic I'm trying to implement is to have a websocket client subscribe to an event by specifying a cache name and the key(s) of interest.
The subscribe event code is called, which creates a continuous query and registers a local listener callback for any update event on the key(s) of interest.
The remote filter is expected to filter the update events based on the key(s) passed to it as a string, and the local listener is invoked if the event passes the filter. The local listener writes the updated key and value to the websocket reference passed to the subscribe event code.
The version of Ignite I'm using is 1.8.0. However, the behaviour is the same in 2.0 as well.
Any help is greatly appreciated!
Here is the log snippet containing the relevant error:
factory init called jun22
exception cache.query class org.apache.ignite.spi.IgniteSpiException: Failed to marshal custom event: StartRoutineDiscoveryMessage [startReqData=StartRequestData [prjPred=org.apache.ignite.configuration.CacheConfiguration$IgniteAllNodesPredicate@269707de, clsName=null, depInfo=null, hnd=CacheContinuousQueryHandlerV2 [rmtFilterFactory=com.ccx.ignite.cqfilter.FilterFactory$1@5dc301ed, rmtFilterFactoryDep=null, types=0], bufSize=1, interval=1000, autoUnsubscribe=true], keepBinary=false, routineId=b40ada9f-552d-41eb-90b5-3384526eb7b9]
From FilterFactory you are returning an instance of an anonymous class, which in turn holds a reference to the enclosing FilterFactory instance, and FilterFactory is not serializable.
Just replace the returned anonymous CacheEntryEventFilter-based class with a corresponding static nested class.
You also need to explicitly deploy your continuous query classes (the remote filters specifically) on all nodes in the topology. Just create a JAR file with them and put it into the libs folder before starting the nodes.
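The difference between the two kinds of classes can be reproduced with plain Java serialization, without Ignite or the JCache API. In this sketch all names (KeyFilters, SerializableKeyFilter, StaticKeyFilter, canSerialize) are hypothetical, and the anonymous class reads an instance field of its enclosing class to make the captured reference explicit.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Predicate;

// Deliberately NOT Serializable, like FilterFactory in the question.
final class KeyFilters {
    interface SerializableKeyFilter extends Predicate<String>, Serializable {}

    private final String delimiter = ",";

    // Anonymous class: it uses an instance field, so it keeps a reference
    // to the enclosing KeyFilters instance, which cannot be serialized.
    SerializableKeyFilter anonymousFilter(String inKeyStr) {
        return new SerializableKeyFilter() {
            @Override public boolean test(String key) {
                return (delimiter + inKeyStr + delimiter).contains(delimiter + key + delimiter);
            }
        };
    }

    // Static nested class: no hidden reference to an enclosing instance.
    static final class StaticKeyFilter implements SerializableKeyFilter {
        private static final long serialVersionUID = 1L;
        private final String inKeys;

        StaticKeyFilter(String inKeyStr) {
            this.inKeys = "," + inKeyStr + ",";
        }

        @Override public boolean test(String key) {
            return inKeys.contains("," + key + ",");
        }
    }

    /** Returns true if standard Java serialization accepts the object. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is an IOException.
            return false;
        }
    }
}
```

Applied to the question, the same idea would mean turning both the Factory and the CacheEntryEventFilter into static nested (or top-level) classes that take inKeyStr through their constructors.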

Outlook Add-In :: COM object that has been separated from its underlying RCW cannot be used

While I have found many instances of this question on SO, none of the solutions I have implemented have solved my problem; hopefully you can help me solve this riddle. Note: This is my first foray into the world of COM objects, so my ignorance is as deep as it is wide.
As a beginning, I am using Adrian Brown's Outlook Add-In code. I won't duplicate his CalendarMonitor class entirely; here are the relevant parts:
public class CalendarMonitor
{
    private ItemsEvents_ItemAddEventHandler itemAddEventHandler;
    public event EventHandler<EventArgs<AppointmentItem>> AppointmentAdded = delegate { };
    public CalendarMonitor(Explorer explorer)
    {
        _calendarItems = new List<Items>();
        HookupDefaultCalendarEvents(session);
    }
    private void HookupDefaultCalendarEvents(_NameSpace session)
    {
        var folder = session.GetDefaultFolder(OlDefaultFolders.olFolderCalendar);
        if (folder == null) return;
        try
        {
            HookupCalendarEvents(folder);
        }
        finally
        {
            Marshal.ReleaseComObject(folder);
            folder = null;
        }
    }
    private void HookupCalendarEvents(MAPIFolder calendarFolder)
    {
        var items = calendarFolder.Items;
        _calendarItems.Add(items);
        // Add listeners
        itemAddEventHandler = new ItemsEvents_ItemAddEventHandler(CalendarItems_ItemAdd);
        items.ItemAdd += itemAddEventHandler;
    }
    private void CalendarItems_ItemAdd(object obj)
    {
        var appointment = (obj as AppointmentItem);
        if (appointment == null) return;
        try
        {
            AppointmentAdded(this, new EventArgs<AppointmentItem>(appointment));
        }
        finally
        {
            Marshal.ReleaseComObject(appointment);
            appointment = null;
        }
    }
Bits not relevant to adding appointments have been redacted.
I instantiate the CalendarMonitor class when I spool up the Add-in, and do the work in the AppointmentAdded event, including adding a UserProperty to the AppointmentItem:
private void ThisAddIn_Startup(object sender, EventArgs e)
{
    _calendarMonitor = new CalendarMonitor(Application.ActiveExplorer());
    _calendarMonitor.AppointmentAdded += monitor_AppointmentAdded;
}
private async void monitor_AppointmentAdded(object sender, EventArgs<AppointmentItem> e)
{
    var item = e.Value;
    Debug.Print("Outlook Appointment Added: {0}", item.GlobalAppointmentID);
    try
    {
        var result = await GCalUtils.AddEventAsync(item);
        //store a reference to the GCal Event for later.
        AddUserProperty(item, Resources.GCalId, result.Id);
        Debug.Print("GCal Appointment Added: {0}", result.Id);
    }
    catch (GoogleApiException ex)
    {
        PrintToDebug(ex);
    }
    finally
    {
        Marshal.ReleaseComObject(item);
        item = null;
    }
}
The error is thrown here, where I try to add a UserProperty to the AppointmentItem. I have followed the best example I could find:
private void AddUserProperty(AppointmentItem item, string propertyName, object value)
{
    UserProperties userProperties = null;
    UserProperty userProperty = null;
    try
    {
        userProperties = item.UserProperties;
        userProperty = userProperties.Add(propertyName, OlUserPropertyType.olText);
        userProperty.Value = value;
        item.Save();
    }
    catch (Exception ex)
    {
        Debug.Print("Error setting User Properties:");
        PrintToDebug(ex);
    }
    finally
    {
        if (userProperty != null) Marshal.ReleaseComObject(userProperty);
        if (userProperties != null) Marshal.ReleaseComObject(userProperties);
        userProperty = null;
        userProperties = null;
    }
}
... but it chokes when I try to add the UserProperty to the AppointmentItem. I get the ever-popular error: COM object that has been separated from its underlying RCW cannot be used. In all honesty, I have no idea what I'm doing, so I'm desperately seeking a Jedi Master to my Padawan.
The main problem here is using Marshal.ReleaseComObject on RCWs that are used in more than one place by the managed runtime.
In fact, this code provoked the problem. Let's look at the CalendarMonitor class:
private void CalendarItems_ItemAdd(object obj)
{
    var appointment = (obj as AppointmentItem);
    if (appointment == null) return;
    try
    {
        AppointmentAdded(this, new EventArgs<AppointmentItem>(appointment));
    }
    finally
    {
        Marshal.ReleaseComObject(appointment);
After the event returns, it tells the managed runtime to release the COM object (from the point of view of the whole managed runtime, but no further).
        appointment = null;
    }
}
Then an async event handler is attached, which will actually return before using the appointment, right at the await line:
private async void monitor_AppointmentAdded(object sender, EventArgs<AppointmentItem> e)
{
    var item = e.Value;
    Debug.Print("Outlook Appointment Added: {0}", item.GlobalAppointmentID);
    try
    {
        var result = await GCalUtils.AddEventAsync(item);
This method actually returns here. C#'s async code generation breaks async methods at await points, generating continuation passing style (CPS) anonymous methods for each block of code that handles an awaited result.
        //store a reference to the GCal Event for later.
        AddUserProperty(item, Resources.GCalId, result.Id);
        Debug.Print("GCal Appointment Added: {0}", result.Id);
    }
    catch (GoogleApiException ex)
    {
        PrintToDebug(ex);
    }
    finally
    {
        Marshal.ReleaseComObject(item);
Look, it's releasing the COM object again. No problem, but not optimal at all. This is an indicator of not knowing what is going on when using ReleaseComObject; it's better to avoid it unless proven necessary.
        item = null;
    }
}
In essence the use of ReleaseComObject should be subject to a thorough review of the following points:
Do I need to actually make sure the managed environment releases the object right now instead of at an indeterminate time?
Occasionally, some native objects need to be released to cause relevant side effects.
For instance, under a distributed transaction to make sure the object commits, but if you find the need to do that, then perhaps you're developing a serviced component and you're not enlisting objects in manual transactions properly.
Other times, you're iterating a huge set of objects, no matter how small each object is, and you may need to free them in order to not bring either your application or the remote application down. Sometimes, GC'ing more often, switching to 64-bit and/or adding RAM solves the problem in one way or the other.
Am I the sole owner of/pointer to the object from the managed environment's point of view?
For instance, did I create it, or was the object provided indirectly by another object I created?
Are there no further references to this object or its container in the managed environment?
Am I definitely not using the object after ReleaseComObject, in the code that follows it, or at any other time (e.g. by making sure not to store it in a field, or closure, even in the form of an iterator method or async method)?
This is to avoid the dreaded disconnected RCW exception.

What is the reason that setDefaultUseCaches(false) of URLConnection is eagerly called in the org.apache.catalina.core.JreMemoryLeakPreventionListener

This question may be a bit difficult to answer. It belongs to the same series as "What is the reason that Policy.getPolicy() is considered as it will retain a static reference to the context and can cause memory leak"; you can read that one for more background.
I took the source code from org.apache.cxf.common.logging.JDKBugHacks and also from org.apache.catalina.core.JreMemoryLeakPreventionListener.
There is this piece of code:
URL url = new URL("jar:file://dummy.jar!/");
URLConnection uConn = new URLConnection(url) {
    @Override
    public void connect() throws IOException{
        // NOOP
    }
};
uConn.setDefaultUseCaches(false);
The comment says:
/*
 * Several components end up opening JarURLConnections without
 * first disabling caching. This effectively locks the file.
 * Whilst more noticeable and harder to ignore on Windows, it
 * affects all operating systems.
 *
 * Those libraries/components known to trigger this issue
 * include:
 * - log4j versions 1.2.15 and earlier
 * - javax.xml.bind.JAXBContext.newInstance()
 */
However, I can hardly understand it. Why do they eagerly call setDefaultUseCaches(false), and why is it harmful on Windows that caching is enabled by default? I cannot find any clue in java.net.JarURLConnection.
I found an answer myself. Anyone can correct me if you think I am wrong.
Look at sun.net.www.protocol.jar.JarURLConnection, which I assume is the default implementation of java.net.JarURLConnection. There is a piece of code below.
If caching is enabled, the JarFile is not closed when the stream is closed or when the entry is not found, which means the file stays locked.
class JarURLInputStream extends java.io.FilterInputStream {
    JarURLInputStream (InputStream src) {
        super (src);
    }
    public void close () throws IOException {
        try {
            super.close();
        } finally {
            if (!getUseCaches()) {
                jarFile.close(); //will not close
            }
        }
    }
}
public void connect() throws IOException {
    if (!connected) {
        /* the factory call will do the security checks */
        jarFile = factory.get(getJarFileURL(), getUseCaches());
        /* we also ask the factory the permission that was required
         * to get the jarFile, and set it as our permission.
         */
        if (getUseCaches()) {
            jarFileURLConnection = factory.getConnection(jarFile);
        }
        if ((entryName != null)) {
            jarEntry = (JarEntry)jarFile.getEntry(entryName);
            if (jarEntry == null) {
                try {
                    if (!getUseCaches()) {
                        jarFile.close(); //will not close
                    }
                } catch (Exception e) {
                }
                throw new FileNotFoundException("JAR entry " + entryName +
                        " not found in " +
                        jarFile.getName());
            }
        }
        connected = true;
    }
}
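The effect of the eager call can be checked with a small experiment. This is a sketch rather than Tomcat's actual code: DefaultUseCachesSketch and its methods are hypothetical names, and the URL is built through URI.create(...).toURL() instead of the URL constructor used in the listener. Note that setDefaultUseCaches changes a process-wide default, so it also affects connections created afterwards by other code.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

final class DefaultUseCachesSketch {
    /** Creates a connection that never connects; only its cache flags matter here. */
    static URLConnection noopConnection(URL url) {
        return new URLConnection(url) {
            @Override
            public void connect() throws IOException {
                // no-op
            }
        };
    }

    /** Disables the default, then reports the useCaches flag of a connection created afterwards. */
    static boolean useCachesOfNewConnectionAfterDisabling() {
        try {
            // Same dummy JAR URL as in the listener; the file does not need to exist.
            URL url = URI.create("jar:file://dummy.jar!/").toURL();
            // The setter is an instance method but changes the shared default.
            noopConnection(url).setDefaultUseCaches(false);
            // A connection created later starts with caching disabled.
            return noopConnection(url).getUseCaches();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With caching off by default, the !getUseCaches() branches in the code above are taken, so the JarFile is closed together with the stream.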