Ignite v2.7.0: within one Ignite transaction I first query data by object ID, then modify and save the data, then query data by the same object ID again, but the returned data is old. Why?
See the test code:
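The original test code is not included here; purely to illustrate the scenario being described, a minimal sketch might look like the following (cache name, key and value types, and transaction settings are assumptions, not the poster's actual code):

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.transactions.Transaction;
import org.apache.ignite.transactions.TransactionConcurrency;
import org.apache.ignite.transactions.TransactionIsolation;

public class TxReadAfterWrite {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // The cache must be TRANSACTIONAL for explicit transactions to apply.
            CacheConfiguration<Long, String> cfg = new CacheConfiguration<>("personCache");
            cfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
            IgniteCache<Long, String> cache = ignite.getOrCreateCache(cfg);
            cache.put(1L, "old value");

            try (Transaction tx = ignite.transactions().txStart(
                    TransactionConcurrency.PESSIMISTIC, TransactionIsolation.REPEATABLE_READ)) {
                String before = cache.get(1L);   // first read by key
                cache.put(1L, "new value");      // modify and save the data
                String after = cache.get(1L);    // second read inside the same transaction
                System.out.println(before + " -> " + after);
                tx.commit();
            }
        }
    }
}

The question is whether the second read inside the same transaction should already see the new value.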
I have set the backup count of my Ignite cache to zero. I have created two server nodes (say s1 and s2) and one client node (c1), set the cache mode to PARTITIONED, and inserted data into the cache. When I stopped server 2 and tried to access the data, some of it was missing. If the backup count is 0, how can data be copied from one server node to another? Does Ignite do this automatically when a node is stopped?
The way Ignite manages this is with backups. If you set the backup count to zero, you have no resilience, and removing a node will result in data loss (unless you enable persistence). You can configure how Ignite responds to this situation with the partition loss policy.
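For example, a minimal sketch of a cache configuration with one backup copy and an explicit partition loss policy (the cache name and value type are placeholders):

import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.PartitionLossPolicy;
import org.apache.ignite.configuration.CacheConfiguration;

// One backup copy per partition, plus an explicit policy controlling how the
// cache behaves when partitions are lost.
CacheConfiguration<String, Object> cacheCfg = new CacheConfiguration<>("myCache");
cacheCfg.setCacheMode(CacheMode.PARTITIONED);
cacheCfg.setBackups(1); // with backups = 0, stopping a node loses its primary partitions
cacheCfg.setPartitionLossPolicy(PartitionLossPolicy.READ_ONLY_SAFE);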
I am doing a POC to ingest data from Oracle into an Ignite cluster and fetch the data from Ignite in another application. When I created the model and cache, I specified the key as String and the value as a custom object. The data loaded into the cluster, but when I query "SELECT * FROM TB_USER" I get only two columns, _KEY and _VAL. I am trying to get all the columns from TB_USER. What configuration is required for this?
There are three ways of configuring SQL tables in Ignite:
DDL statements (CREATE TABLE). As far as I can see, you used something else.
QueryEntities. You should list every column that you want to see in your table in the QueryEntity#fields property. All names should correspond to field names of your Java objects.
Annotations. Fields annotated with @QuerySqlField become columns in your table (a minimal sketch follows this list).
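For example, with the annotation-based approach, a sketch could look like this (the TbUser class and its fields are assumptions based on the question):

import org.apache.ignite.cache.query.annotations.QuerySqlField;
import org.apache.ignite.configuration.CacheConfiguration;

public class TbUser {
    @QuerySqlField(index = true)
    private String id;

    @QuerySqlField
    private String name;

    @QuerySqlField
    private String email;
}

// Register the value class so its annotated fields become SQL columns:
CacheConfiguration<String, TbUser> cacheCfg = new CacheConfiguration<>("TB_USER");
cacheCfg.setIndexedTypes(String.class, TbUser.class);

With this in place, a query like SELECT id, name, email FROM TbUser (the SQL table name defaults to the value type's simple name) returns individual columns rather than just _KEY and _VAL.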
We are using NiFi to pull data from Oracle and perform some transformations. The pipeline works fine for small amounts of data but fails with the error "no output to read from socket" when the data volume is high (around 1 million records).
Is there any help available, or any configuration change that we need to make?
I am using Spark Job Server and using Spark SQL to get data from a Cassandra table as follows:
public Object runJob(JavaSparkContext jsc, Config config) {
    CassandraSQLContext sq = new CassandraSQLContext(JavaSparkContext.toSparkContext(jsc));
    sq.setKeyspace("rptavlview");
    DataFrame vadevent = sq.sql("SELECT username,plan,plate,ign,speed,datetime,odo,gd,seat,door,ac from rptavlview.vhistory");
    vadevent.registerTempTable("history");
    sq.cacheTable("history");
    DataFrame vadevent1 = sq.sql("SELECT plate,ign,speed,datetime FROM history where username='" + params[0] + "' and plan='" + params[1] + "'");
    long count = vadevent.rdd().count();
    return count;
}
But I am getting "Table not found: history".
Can anybody explain how to cache Cassandra data in Spark memory and reuse the same data, either across concurrent requests of the same job or as two jobs, one for caching and the other for querying?
I am using DSE 5.0.4, so the Spark version is 1.6.1.
You can allow Spark jobs to share the state of other contexts. This link goes into more depth.
If I create an IgniteRDD out of a cache with 10M entries in my Spark job, will it load all 10M entries into my Spark context? Please find my code below for reference.
SparkConf conf = new SparkConf().setAppName("IgniteSparkIntgr").setMaster("local");
JavaSparkContext context = new JavaSparkContext(conf);
JavaIgniteContext<Integer, Subscriber> igniteCxt = new JavaIgniteContext<Integer, Subscriber>(context, "example-ignite.xml");
JavaIgniteRDD<Integer, Subscriber> cache = igniteCxt.fromCache("subscriberCache");
DataFrame query_res = cache.sql("select id, lastName, company from Subscriber where id between ? and ?", 12, 15);
DataFrame input = loadInput(context);
DataFrame joined_df = input.join(query_res, input.col("id").equalTo(query_res.col("ID")));
System.out.println(joined_df.count());
In the above code, subscriberCache has more than 10M entries. At any point in the above code, will the 10M Subscriber objects be loaded into the JVM, or does it only load the query output?
FYI: Ignite is running in a separate JVM.
The cache.sql(...) method queries data that is already in the Ignite in-memory cache, so you should load the data before calling it. You can use the IgniteRDD.saveValues(...) or IgniteRDD.savePairs(...) method for this. Each of them will iterate through all partitions and load all the data that currently exists in Spark into Ignite.
Note that any transformations or joins that you do with the resulting DataFrame will be executed locally on the driver. You should avoid this as much as possible to get the best performance from the Ignite SQL engine.
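For example, building on the code from the question, the cache could be populated from an existing Spark pair RDD before it is queried (the loadSubscribers helper is hypothetical):

// Load the pairs that currently exist in Spark into the "subscriberCache" cache.
JavaPairRDD<Integer, Subscriber> pairs = loadSubscribers(context); // hypothetical loader
cache.savePairs(pairs);

// The SQL query then runs against the data now in Ignite; the resulting DataFrame
// holds the query output rather than the full cache contents.
DataFrame query_res = cache.sql(
    "select id, lastName, company from Subscriber where id between ? and ?", 12, 15);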