Apache Ignite: Scan Query without filter and Iterator over cache both return an empty result set, although the number of entries in the cache is non-zero

Apache Ignite Version: 2.8.0
While only the first Ignite cache node is started, the code snippets below give the expected output. Once a second Ignite node joins the cluster, the entries cursor in the first snippet is empty, while in the second snippet iter.hasNext() returns false.
The Ignite cluster is started with shared-nothing persistence.
Cache mode is set to CacheMode.REPLICATED.
Both nodes are in server mode.
A call to size() in both instances gives a non-zero value.
A call to localSize() gives a non-zero value in the first instance, but 0 in the second.
Map results = new HashMap();
QueryCursor<Cache.Entry> entries = binaryCache.query(new ScanQuery(null));
logger.log(Level.DEBUG, "CacheSize from entrySet: [%s]", binaryCache.size()); // Outputs a non-zero number.
try {
    for (Cache.Entry e : entries) {
        results.put(e.getKey(), e.getValue());
    }
} catch (IOException e) {
    throw new RuntimeException("Error deserializing ignite entry", e);
}
return results.entrySet();
OR
Map results = new HashMap();
Iterator<Cache.Entry<Object, BinaryObject>> iter = binaryCache.iterator();
logger.log(Level.DEBUG, "CacheSize from entrySet: [%s]", binaryCache.size()); // Outputs a non-zero number.
try {
    while (iter.hasNext()) {
        Cache.Entry e = iter.next();
        results.put(e.getKey(), e.getValue());
    }
} catch (IOException e) {
    throw new RuntimeException("Error deserializing ignite entry", e);
}
return results.entrySet();
Since null is passed to ScanQuery, all entries in the cache should ideally be fetched.

IgniteCache is iterable, as you noticed in the second example.
If you need to get the stored values, the simplest code would look like this:
IgniteCache<Object, BinaryObject> myCache = ignite.getOrCreateCache(ccfg).withKeepBinary();
for (Cache.Entry<Object, BinaryObject> entry : myCache) {
    System.out.println(entry.getKey() + " " + entry.getValue());
}
System.out.println("trying again");
for (Cache.Entry<Object, BinaryObject> entry : myCache) {
    System.out.println(entry.getKey() + " " + entry.getValue());
}

If the cache has a cache store, it may desync, since cache store loads are not propagated across the cluster. This is how cache stores work on replicated caches.
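If the data does come from a cache store, one way to keep every node populated is to trigger the load through IgniteCache.loadCache(...), which runs the configured CacheStore load on each server node that holds the cache. A minimal sketch, assuming a cache named "myCache" that has a CacheStore configured (the cache name is only an example):

Ignite ignite = Ignition.ignite();
IgniteCache<Object, BinaryObject> cache = ignite.cache("myCache").withKeepBinary();

// loadCache() delegates to the configured CacheStore on every server node,
// so in a REPLICATED cache each node ends up with its own copy of the data.
cache.loadCache(null);

for (Cache.Entry<Object, BinaryObject> entry : cache) {
    System.out.println(entry.getKey() + " " + entry.getValue());
}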

Related

ChronicleMap cannot store/use the defined number of max entries after removing a few entries?

Chronicle Map Versions I used - 3.22ea5 / 3.21.86
I am trying to use ChronicleMap as an LRU cache.
I have two ChronicleMaps with identical configuration and allowSegmentTiering set to false. Consider one as the main map and the other as the backup.
When the main map gets full, a few entries are removed from it and the backup map is used in parallel. Once the entries are removed from the main map, the entries from the backup map are refilled into the main map.
A sample is shown below.
ChronicleMap<ByteBuffer, ByteBuffer> main = ChronicleMapBuilder.of(ByteBuffer.class, ByteBuffer.class).name("main")
        .entries(61500)
        .averageKey(ByteBuffer.wrap(new byte[500]))
        .averageValue(ByteBuffer.wrap(new byte[5120]))
        .allowSegmentTiering(false)
        .create();
ChronicleMap<ByteBuffer, ByteBuffer> backup = ChronicleMapBuilder.of(ByteBuffer.class, ByteBuffer.class).name("backup")
        .entries(100)
        .averageKey(ByteBuffer.wrap(new byte[500]))
        .averageValue(ByteBuffer.wrap(new byte[5120]))
        .allowSegmentTiering(false)
        .create();
System.out.println("Main Heap Size -> " + main.offHeapMemoryUsed());
SecureRandom random = new SecureRandom();
while (true)
{
    System.out.println();
    AtomicInteger entriesAdded = new AtomicInteger(0);
    try
    {
        int mainEntries = main.size();
        while /*(true) Loop until error is thrown */ (mainEntries < 61500)
        {
            try
            {
                byte[] keyN = new byte[500];
                byte[] valueN = new byte[5120];
                random.nextBytes(keyN);
                random.nextBytes(valueN);
                main.put(ByteBuffer.wrap(keyN), ByteBuffer.wrap(valueN));
                mainEntries++;
            }
            catch (Throwable t)
            {
                System.out.println("Max Entries is not yet reached!!!");
                break;
            }
        }
        System.out.println("Main Entries -> " + main.size());
        for (int i = 0; i < 10; i++)
        {
            byte[] keyN = new byte[500];
            byte[] valueN = new byte[5120];
            random.nextBytes(keyN);
            random.nextBytes(valueN);
            backup.put(ByteBuffer.wrap(keyN), ByteBuffer.wrap(valueN));
        }
        AtomicInteger removed = new AtomicInteger(0);
        AtomicInteger i = new AtomicInteger(Math.max((backup.size() * 5), ((main.size() * 5) / 100)));
        main.forEachEntry(c -> {
            if (i.get() > 0)
            {
                c.context().remove(c);
                i.decrementAndGet();
                removed.incrementAndGet();
            }
        });
        System.out.println("Removed " + removed.get() + " Entries from Main");
        backup.forEachEntry(b -> {
            ByteBuffer key = b.key().get();
            ByteBuffer value = b.value().get();
            b.context().remove(b);
            main.put(key, value);
            entriesAdded.incrementAndGet();
        });
        if (backup.size() > 0)
        {
            System.out.println("It will never be logged");
            backup.clear();
        }
    }
    catch (Throwable t)
    {
        // System.out.println();
        // t.printStackTrace(System.out);
        System.out.println();
        System.out.println("-------------------------Failed----------------------------");
        System.out.println("Added " + entriesAdded.get() + " Entries in Main | Lost " + (backup.size() + 1) + " Entries in backup");
        backup.clear();
        break;
    }
}
main.close();
backup.close();
The above code yields the following result.
Main Entries -> 61500
Removed 3075 Entries from Main
Main Entries -> 61500
Removed 3075 Entries from Main
Main Entries -> 61500
Removed 3075 Entries from Main
Max Entries is not yet reached!!!
Main Entries -> 59125
Removed 2956 Entries from Main
Max Entries is not yet reached!!!
Main Entries -> 56227
Removed 2811 Entries from Main
Max Entries is not yet reached!!!
Main Entries -> 53470
Removed 2673 Entries from Main
-------------------------Failed----------------------------
Added 7 Entries in Main | Lost 3 Entries in backup
In the above result, the maximum number of entries the main map could hold decreased in the subsequent iterations, and the refilling from the backup map eventually failed as well.
In Issue 128 it was said that the entries are deleted properly.
Then why does the above sample code fail? What am I doing wrong here? Is Chronicle Map not designed for such a usage pattern?
Even if I use only one map, the maximum number of entries the map can hold gets reduced after each removal of entries.

Why am I getting a missing hash-tags error when I try to run JedisCluster.scan() using a match pattern?

I'm trying to run scan on my redis cluster using Jedis. I tried using the .scan(...) method as follows for a match pattern but I get the following error:
"JedisCluster only supports SCAN commands with MATCH patterns containing hash-tags"
my code is as follows (excerpted):
private final JedisCluster redis;
...
String keyPrefix = "helloWorld:*";
ScanParams params = new ScanParams()
        .match(keyPrefix)
        .count(100);
String cur = SCAN_POINTER_START;
boolean done = false;
while (!done) {
    ScanResult<String> resp = redis.scan(cur, params);
    ...
    cur = resp.getStringCursor();
    if (resp.getStringCursor().equals(SCAN_POINTER_START)) {
        done = true;
    }
}
When I run my code, it gives this error about hash-tags:
"JedisCluster only supports SCAN commands with MATCH patterns containing hash-tags"
In the redis-cli I can simply use match patterns like the one I wrote for the keyPrefix variable. Why am I getting this error?
How do I get Jedis to show me all the keys that match a given substring?
The problem is that the redis variable is a JedisCluster object and not a Jedis object.
A cluster object holds a collection of nodes, and scanning the cluster is different from scanning an individual node.
To solve the issue, you can scan each individual node, as shown below:
String keyPrefix = "helloWorld:*";
ScanParams params = new ScanParams()
        .match(keyPrefix)
        .count(100);
redis.getClusterNodes().values().stream().forEach(pool -> {
    boolean done = false;
    String cur = SCAN_POINTER_START;
    try (Jedis jedisNode = pool.getResource()) {
        while (!done) {
            ScanResult<String> resp = jedisNode.scan(cur, params);
            ...
            cur = resp.getStringCursor();
            if (cur.equals(SCAN_POINTER_START)) {
                done = true;
            }
        }
    }
});
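As the error message itself suggests, JedisCluster accepts a MATCH pattern only when it contains a hash tag, because all keys sharing a tag hash to the same slot. A hedged sketch of that alternative, assuming the keys had been written with a {helloWorld} tag (a hypothetical naming scheme, not the one used in the question):

// Hypothetical alternative: keys stored as "{helloWorld}:<id>" all hash to one slot,
// so JedisCluster.scan(...) accepts the MATCH pattern directly.
ScanParams taggedParams = new ScanParams()
        .match("{helloWorld}:*")
        .count(100);
String cursor = SCAN_POINTER_START;
do {
    ScanResult<String> page = redis.scan(cursor, taggedParams);
    page.getResult().forEach(System.out::println);
    cursor = page.getStringCursor();
} while (!cursor.equals(SCAN_POINTER_START));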

Hibernate Search manual indexing throws an "org.hibernate.TransientObjectException: The instance was not associated with this session"

I use Hibernate Search 5.11 in my Spring Boot 2 application to provide full-text search.
This library requires documents to be indexed.
When my app is launched, I manually re-index the data of an indexed entity (MyEntity.class) every five minutes (for a specific reason, due to my server context).
MyEntity.class has a property attachedFiles, which is a HashSet populated through a @OneToMany join with lazy loading enabled:
@OneToMany(mappedBy = "myEntity", cascade = CascadeType.ALL, orphanRemoval = true)
private Set<AttachedFile> attachedFiles = new HashSet<>();
I wrote the required indexing process, but an exception is thrown on fullTextSession.index(result) when the attachedFiles property of a given entity contains one or more items:
org.hibernate.TransientObjectException: The instance was not associated with this session
In debug mode, a message like "Unable to load [...]" is shown for the entity's HashSet value in this case.
If the HashSet is empty (not null, just empty), no exception is thrown.
My indexing method:
private void indexDocumentsByEntityIds(List<Long> ids) {
    final int BATCH_SIZE = 128;
    Session session = entityManager.unwrap(Session.class);
    FullTextSession fullTextSession = Search.getFullTextSession(session);
    fullTextSession.setFlushMode(FlushMode.MANUAL);
    fullTextSession.setCacheMode(CacheMode.IGNORE);
    CriteriaBuilder builder = session.getCriteriaBuilder();
    CriteriaQuery<MyEntity> criteria = builder.createQuery(MyEntity.class);
    Root<MyEntity> root = criteria.from(MyEntity.class);
    criteria.select(root).where(root.get("id").in(ids));
    TypedQuery<MyEntity> query = fullTextSession.createQuery(criteria);
    List<MyEntity> results = query.getResultList();
    int index = 0;
    for (MyEntity result : results) {
        index++;
        try {
            fullTextSession.index(result); // index each element
            if (index % BATCH_SIZE == 0 || index == ids.size()) {
                fullTextSession.flushToIndexes(); // apply changes to indexes
                fullTextSession.clear(); // free memory since the queue is processed
            }
        } catch (TransientObjectException toEx) {
            LOGGER.info(toEx.getMessage());
            throw toEx;
        }
    }
}
Does anyone have an idea?
Thanks!
This is probably caused by the "clear" call you have in your loop.
In essence, what you're doing is:
load all entities to reindex into the session
index one batch of entities
remove all entities from the session (fullTextSession.clear())
try to index the next batch of entities, even though they are not in the session anymore... ?
What you need to do is to only load each batch of entities after the session clearing, so that you're sure they are still in the session when you index them.
There's an example of how to do this in the documentation, using a scroll and an appropriate batch size: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#search-batchindex-flushtoindexes
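For reference, a condensed sketch of that scroll-based loop from the documentation, adapted here to MyEntity with a batch size of 128 (this version walks over all MyEntity rows; the ID filter from the method above would still need to be added):

FullTextSession fullTextSession = Search.getFullTextSession(session);
fullTextSession.setFlushMode(FlushMode.MANUAL);
fullTextSession.setCacheMode(CacheMode.IGNORE);
Transaction transaction = fullTextSession.beginTransaction();
// Scrollable results avoid loading too many objects into memory at once.
ScrollableResults results = fullTextSession.createCriteria(MyEntity.class)
        .setFetchSize(BATCH_SIZE)
        .scroll(ScrollMode.FORWARD_ONLY);
int index = 0;
while (results.next()) {
    index++;
    fullTextSession.index(results.get(0)); // index each element
    if (index % BATCH_SIZE == 0) {
        fullTextSession.flushToIndexes(); // apply changes to the indexes
        fullTextSession.clear();          // free memory since the queue is processed
    }
}
transaction.commit();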
Alternatively, you can just split your ID list in smaller lists of 128 elements, and for each of these lists, run a query to get the corresponding entities, reindex all these 128 entities, then flush and clear.
Thanks for the explanations @yrodiere, they helped me a lot!
I chose your alternative solution:
Alternatively, you can just split your ID list in smaller lists of 128 elements, and for each of these lists, run a query to get the corresponding entities, reindex all these 128 entities, then flush and clear.
...and everything works perfectly!
Good catch!
See the code solution below:
private List<List<Object>> splitList(List<Object> list, int subListSize) {
    List<List<Object>> splittedList = new ArrayList<>();
    if (!CollectionUtils.isEmpty(list)) {
        int i = 0;
        int nbItems = list.size();
        while (i < nbItems) {
            int maxLastSubListIndex = i + subListSize;
            int lastSubListIndex = (maxLastSubListIndex > nbItems) ? nbItems : maxLastSubListIndex;
            List<Object> subList = list.subList(i, lastSubListIndex);
            splittedList.add(subList);
            i = lastSubListIndex;
        }
    }
    return splittedList;
}

private void indexDocumentsByEntityIds(Class<Object> clazz, String entityIdPropertyName, List<Object> ids) {
    Session session = entityManager.unwrap(Session.class);
    List<List<Object>> splittedIdsLists = splitList(ids, 128);
    for (List<Object> splittedIds : splittedIdsLists) {
        FullTextSession fullTextSession = Search.getFullTextSession(session);
        fullTextSession.setFlushMode(FlushMode.MANUAL);
        fullTextSession.setCacheMode(CacheMode.IGNORE);
        Transaction transaction = fullTextSession.beginTransaction();
        CriteriaBuilder builder = session.getCriteriaBuilder();
        CriteriaQuery<Object> criteria = builder.createQuery(clazz);
        Root<Object> root = criteria.from(clazz);
        criteria.select(root).where(root.get(entityIdPropertyName).in(splittedIds));
        TypedQuery<Object> query = fullTextSession.createQuery(criteria);
        List<Object> results = query.getResultList();
        int index = 0;
        for (Object result : results) {
            index++;
            try {
                fullTextSession.index(result); // index each element
                if (index == splittedIds.size()) {
                    fullTextSession.flushToIndexes(); // apply changes to indexes
                    fullTextSession.clear(); // free memory since the queue is processed
                }
            } catch (TransientObjectException toEx) {
                LOGGER.info(toEx.getMessage());
                throw toEx;
            }
        }
        transaction.commit();
    }
}

Which part of the following code will run on the server side?

I am loading data from MySQL into an Ignite cache with the following code. The code is run with Ignite in client mode and will load the data into the Ignite cluster.
I would ask:
Which parts of the code will run on the server side?
The working mechanism of loading data into the cache looks like map-reduce, so what tasks are sent to the server? The SQL?
In particular: will the following code run on the client side or the server side?
CacheConfiguration cfg = StudentCacheConfig.cache("StudentCache", storeFactory);
IgniteCache cache = ignite.getOrCreateCache(cfg);
The following is the full code that loads the data into the cache:
public class LoadStudentIntoCache {
    public static void main(String[] args) {
        Ignition.setClientMode(false);
        String configPath = "default-config.xml";
        Ignite ignite = Ignition.start(configPath);
        CacheJdbcPojoStoreFactory storeFactory = new CacheJdbcPojoStoreFactory<Integer, Student>();
        storeFactory.setDialect(new MySQLDialect());
        IDataSourceFactory factory = new MySqlDataSourceFactory();
        storeFactory.setDataSourceFactory(new Factory<DataSource>() {
            public DataSource create() {
                try {
                    DataSource dataSource = factory.createDataSource();
                    return dataSource;
                } catch (Exception e) {
                    return null;
                }
            }
        });
        //
        CacheConfiguration<Integer, Student> cfg = StudentCacheConfig.cache("StudentCache", storeFactory);
        IgniteCache<Integer, Student> cache = ignite.getOrCreateCache(cfg);
        List<String> sqls = new ArrayList<String>();
        sqls.add("java.lang.Integer");
        sqls.add("select id, name, birthday from db1.student where id < 1000");
        sqls.add("java.lang.Integer");
        sqls.add("select id, name, birthday from db1.student where id >= 1000 and id < 1000");
        cache.loadCache(null, sqls.toArray(new String[0]));
        Student s = cache.get(1);
        System.out.println(s.getName() + "," + s.getBirthday());
        ignite.close();
    }
}
The code you showed here is executed within your application; there is no magic happening. Usually that would be a client node, but in your case it is started in server mode (probably by mistake): Ignition.setClientMode(false).
The data loading process happens on each server node, i.e. each server node executes the provided SQL queries to load the data from the DB.
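If the intent really was to run this loader as a client node, the only change needed is to flip the flag before starting Ignite; the loadCache(...) call is still broadcast to the server nodes. A minimal sketch, reusing cfg and sqls from the question's code:

Ignition.setClientMode(true); // start this JVM as a client node instead of a server
Ignite ignite = Ignition.start("default-config.xml");

IgniteCache<Integer, Student> cache = ignite.getOrCreateCache(cfg);
// Each server node runs the configured CacheJdbcPojoStore with the provided SQL
// and loads the data from the DB; the client only initiates the call.
cache.loadCache(null, sqls.toArray(new String[0]));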

jdbc and processing output using Jakarta Commons Math

Using JDBC, I am querying my database of ambulance response times. My goal is to take the output and process it into statistics using the Jakarta Commons Math library. So far I have succeeded in querying my database and outputting the response times to the console. My next step is to process this output statistically (mean, median, mode, etc.), and this is where I am stuck. Shown below is my code.
package javaDatabase;

import java.sql.*;
import org.apache.commons.math3.stat.StatUtils;

public class javaConnect3
{
    public static void main(String[] args)
    {
        Connection conn = null;
        Statement stmt = null;
        try
        {
            conn = DriverManager
                    .getConnection("jdbc:sqlserver://myServerAddress;database=myDatabase;integratedsecurity=false;user=myUser;password=myPassword");
            stmt = conn.createStatement();
            String strSelect = "SELECT M_SecondsAtStatus FROM MManpower WHERE M_tTime > 'august 25, 2014' AND M_Code = 'USAR'";
            ResultSet rset = stmt.executeQuery(strSelect);
            while (rset.next())
            {
                int values = rset.getInt("M_SecondsAtStatus");
                System.out.println(values);
            }
            // I am hoping to derive useful statistics from my database, such as the following.
            // This uses Jakarta Commons Math.
            // System.out.println("min: " + StatUtils.min(values));
            // System.out.println("max: " + StatUtils.max(values));
            // System.out.println("mean: " + StatUtils.mean(values));
            // System.out.println("product: " + StatUtils.product(values));
            // System.out.println("sum: " + StatUtils.sum(values));
            // System.out.println("variance: " + StatUtils.variance(values));
        } catch (SQLException ex)
        {
            ex.printStackTrace();
        } finally
        {
            try
            {
                if (stmt != null)
                    stmt.close();
                if (conn != null)
                    conn.close();
            } catch (SQLException ex)
            {
                ex.printStackTrace();
            }
        }
    }
}
An error message pops up in Eclipse and the variable "values" is underlined in red: "values cannot be resolved to a variable".
I am not sure how to get this to work. I don't understand how to turn the ambulance response times coming from the database into something Apache Commons Math will understand.
How can I get Apache Commons Math to take the output from my database and generate a statistical result?
You need something like this:
List<Double> values = new ArrayList<>();
while (rset.next()) {
    values.add((double) rset.getInt("M_SecondsAtStatus"));
}
// StatUtils works on a primitive double[], so unbox the list first
double[] data = values.stream().mapToDouble(Double::doubleValue).toArray();
double mean = StatUtils.mean(data);
Alternatively, you could query the database for the statistics directly. The name of the aggregate function depends on your database:
SELECT avg(M_SecondsAtStatus)
FROM MManpower
WHERE M_tTime > 'august 25, 2014'
AND M_Code = 'USAR'
I'd say the second option is more efficient, because you don't have to transfer all those values to the JVM to do a calculation that the database can do for you.
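For illustration, a hedged sketch of reading that single aggregate value over the same JDBC connection as in the question (run inside the existing try/catch; the avgSeconds column alias is made up here):

String avgSql = "SELECT AVG(M_SecondsAtStatus) AS avgSeconds "
        + "FROM MManpower "
        + "WHERE M_tTime > 'august 25, 2014' AND M_Code = 'USAR'";
try (Statement aggStmt = conn.createStatement();
     ResultSet rs = aggStmt.executeQuery(avgSql)) {
    if (rs.next()) {
        // The database computes the mean; only a single row comes back over the wire.
        System.out.println("mean: " + rs.getDouble("avgSeconds"));
    }
}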
Another alternative is to use a DescriptiveStatistics object to collect all the data and then use it to compute the summary statistics you need. Using it avoids the explicit type casting and conversion:
DescriptiveStatistics ds = new DescriptiveStatistics();
while (rset.next()) {
    int observation = rset.getInt("M_SecondsAtStatus");
    ds.addValue(observation);
}
System.out.println("min: " + ds.getMin());
System.out.println("max: " + ds.getMax());
...