Which part of the following code will run on the server side - Ignite

I am loading data from MySQL into an Ignite cache with the following code. The code is run with Ignite in client mode and will load the data into the Ignite cluster.
I would ask:
Which parts of the code will run on the server side?
The working mechanism of loading data into the cache looks like map-reduce, so what tasks are sent to the server? The SQL?
In particular: will the following code run on the client side or the server side?
CacheConfiguration cfg = StudentCacheConfig.cache("StudentCache", storeFactory);
IgniteCache cache = ignite.getOrCreateCache(cfg);
Following is the full code that loads the data into the cache:
public class LoadStudentIntoCache {
    public static void main(String[] args) {
        // false means this node starts as a server node
        Ignition.setClientMode(false);
        String configPath = "default-config.xml";
        Ignite ignite = Ignition.start(configPath);

        CacheJdbcPojoStoreFactory<Integer, Student> storeFactory = new CacheJdbcPojoStoreFactory<Integer, Student>();
        storeFactory.setDialect(new MySQLDialect());
        IDataSourceFactory factory = new MySqlDataSourceFactory();
        storeFactory.setDataSourceFactory(new Factory<DataSource>() {
            public DataSource create() {
                try {
                    DataSource dataSource = factory.createDataSource();
                    return dataSource;
                } catch (Exception e) {
                    return null;
                }
            }
        });

        CacheConfiguration<Integer, Student> cfg = StudentCacheConfig.cache("StudentCache", storeFactory);
        IgniteCache<Integer, Student> cache = ignite.getOrCreateCache(cfg);

        // Arguments for loadCache: pairs of key type name and SQL statement
        List<String> sqls = new ArrayList<String>();
        sqls.add("java.lang.Integer");
        sqls.add("select id, name, birthday from db1.student where id < 1000");
        sqls.add("java.lang.Integer");
        sqls.add("select id, name, birthday from db1.student where id >= 1000 and id < 1000");

        cache.loadCache(null, sqls.toArray(new String[0]));

        Student s = cache.get(1);
        System.out.println(s.getName() + "," + s.getBirthday());
        ignite.close();
    }
}

The code you showed here will be executed within your application; there is no magic happening. Usually that's a client node, however in your case it's started in server mode (probably by mistake): Ignition.setClientMode(false).
The data loading process will happen on each server node, i.e. each server node will execute the provided SQL queries to load the data from the DB.
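For reference, a minimal sketch of the intended setup, assuming the goal is to run the loader as a client node (it reuses cfg and sqls from the code above):

// Sketch only: start the loader JVM as a client node so the data lives on the server nodes.
Ignition.setClientMode(true);
Ignite ignite = Ignition.start("default-config.xml");

// Executed on the client; the cache itself is created across the cluster.
IgniteCache<Integer, Student> cache = ignite.getOrCreateCache(cfg);

// Called on the client, but Ignite distributes the load job:
// each server node invokes the configured CacheJdbcPojoStore.loadCache(...)
// and executes the provided SQL against MySQL.
cache.loadCache(null, sqls.toArray(new String[0]));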

Related

How to retain mocked data in service

I have a service called AcctBeneficiaryService and am trying to write a JUnit test case for it, but it is failing with a NullPointerException at the for loop.
My service looks like this:
@Service
public class AcctBeneficiaryService {
public List<AcctBeneficiary> getAccountBeneficiaryDetails(long rltshpId) throws Exception {
RltshpData rlshpData = rltshpClient.getRltshpInfo(rltshpId);
List<AcctBeneficiary> acctBeneficiaries = new ArrayList<>();
List<Long> conidList = new ArrayList<>();
for (BigDecimal account : rlshpData.getAccounts()) { // NullPointerException is thrown here
AcctBeneficiary acctBeneficiary = new AcctBeneficiary();
Account accountInfo = new Account();
My test case looks like this:
public void testGetAccountBeneficiaryDetails() throws Exception {
//Initializing required objects for Mock
RltshpData rlshpData = new RltshpData();
RltshpInfo rltshpInfo = new RltshpInfo();
List<BigDecimal> accounts = new ArrayList<>();
//Mock Data for rltshpInfo
rltshpInfo.setIrNo(135434);
rltshpInfo.setIsoCtryCd("US");
rltshpInfo.setRltshpNa("Individual");
//Mock Data for accounts
accounts.add(new BigDecimal(4534));
//accounts.add(new BigDecimal(4564));
//populate mocking data to the object
rlshpData.setRltshpInfo(rltshpInfo);
rlshpData.setAccounts(accounts);
when(service.getAccountBeneficiaryDetails(new Long(1234))).thenReturn(Mockito.<AcctBeneficiary>anyList());
assertNotNull(acctBeneficiaryList);
Error Details
java.lang.NullPointerException
    at .fplbeneficiariesrest.service.AcctBeneficiaryService.getAccountBeneficiaryDetails(AcctBeneficiaryService.java:64)
    at .fplbeneficiariesrest.service.AcctBeneficiaryServiceTest.testGetAccountBeneficiaryDetails(AcctBeneficiaryServiceTest.java:160)
The null pointer occurs at for(BigDecimal account : rlshpData.getAccounts()). The for loop does not receive the mocked data because rlshpData.getAccounts() is executed before the mocked data takes effect.
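The rlshpData built in the test is never returned to the service, so the loop sees no mocked accounts. A minimal sketch of how the test could be structured instead, assuming the service's rltshpClient field is injectable and its type is called RltshpClient (that class name is an assumption here): mock the collaborator and stub getRltshpInfo, then call the real service method.

import static org.junit.Assert.assertNotNull;
import static org.mockito.Mockito.when;

import java.math.BigDecimal;
import java.util.Collections;
import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.MockitoJUnitRunner; // org.mockito.runners.MockitoJUnitRunner in Mockito 1.x

@RunWith(MockitoJUnitRunner.class)
public class AcctBeneficiaryServiceTest {

    @Mock
    private RltshpClient rltshpClient;       // assumed type of the service's rltshpClient field

    @InjectMocks
    private AcctBeneficiaryService service;  // real service instance with the mock injected

    @Test
    public void testGetAccountBeneficiaryDetails() throws Exception {
        // Mock data the collaborator should return
        RltshpData rlshpData = new RltshpData();
        RltshpInfo rltshpInfo = new RltshpInfo();
        rltshpInfo.setIrNo(135434);
        rltshpInfo.setIsoCtryCd("US");
        rltshpInfo.setRltshpNa("Individual");
        rlshpData.setRltshpInfo(rltshpInfo);
        rlshpData.setAccounts(Collections.singletonList(new BigDecimal(4534)));

        // Stub the collaborator, not the class under test
        when(rltshpClient.getRltshpInfo(1234L)).thenReturn(rlshpData);

        List<AcctBeneficiary> result = service.getAccountBeneficiaryDetails(1234L);
        assertNotNull(result);
    }
}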

Apache Calcite | HSQLDB - Table Not Found Exception

I am trying to learn Apache Calcite by following the RelBuilderExample, with the storage layer being HSQLDB.
Unfortunately, I keep getting a "Table Not Found" exception when I call the builder.scan(tableName) API of Apache Calcite. When I query the data in HSQLDB directly using ResultSet rs = statement.executeQuery("SELECT * from file"); I am able to retrieve the data. Here is the relevant code:
//I create an instance of RelBuilder using the Config defined below
RelBuilder builder = RelBuilder.create(config().build());
//This line throws me exception: org.apache.calcite.runtime.CalciteException: Table 'file' not found
builder = builder.scan("file");
/**
Building the configuration backed by HSQLDB
*/
public static Frameworks.ConfigBuilder config() throws Exception{
//Getting the ConnectionSpec for the in memory HSQLDB
//FileHSQLDB.URI = "jdbc:hsqldb:mem:intel"
//FileHSQLDB.USER = "user"
//FileHSQLDB.PASSWORD = "password"
final ConnectionSpec cs = new ConnectionSpec(FileHSQLDB.URI, FileHSQLDB.USER, FileHSQLDB.PASSWORD, "org.hsqldb.jdbcDriver", "intel");
//cs.url = "jdbc:hsqldb:mem:intel"
//cs.driver = "org.hsqldb.jdbcDriver"
//cs.username = "user"
//cs.password = "password"
DataSource dataSource = JdbcSchema.dataSource(cs.url, cs.driver, cs.username, cs.password);
Connection connection = dataSource.getConnection();
Statement statement = connection.createStatement();
//This returns me 3 results
ResultSet rs = statement.executeQuery("SELECT * from file");
while(rs.next()) {
String id = rs.getString("file_id");
System.out.println(id);
}
// Next I create the rootSchema
SchemaPlus rootSchema = Frameworks.createRootSchema(true);
//I suspect that there is some issue in the below line. I think I
//am not using Apache Calcite APIs properly, but not sure what I
//am doing wrong.
rootSchema.add("intel", JdbcSchema.create(rootSchema, "intel", dataSource, cs.catalog, cs.schema));
return Frameworks.newConfigBuilder().defaultSchema(rootSchema);
Can someone please help me figure out what I may be doing wrong?
If your table is file (lowercase) then make sure you quote the table name in the query, i.e. "SELECT * from \"file\"".
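For illustration, a small sketch of that quoting, reusing dataSource and builder from the question (the scan(...) call through the "intel" sub-schema is an assumption on my part): HSQLDB stores unquoted identifiers in upper case, so a table created with a quoted lowercase name such as "file" must be quoted when it is referenced.

// Plain JDBC against the same dataSource: quote the lowercase identifier.
try (Connection connection = dataSource.getConnection();
     Statement statement = connection.createStatement();
     ResultSet rs = statement.executeQuery("SELECT * FROM \"file\"")) {
    while (rs.next()) {
        System.out.println(rs.getString("file_id"));
    }
}

// RelBuilder looks tables up by name (case-sensitively) rather than via SQL quoting;
// since the JdbcSchema was registered as "intel" on the root schema, the table
// likely has to be referenced through that sub-schema, using the exact stored
// case of its name ("file" if created quoted, "FILE" if created unquoted):
builder = builder.scan("intel", "file");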

Google Dataflow write to multiple tables based on input

I have logs which I am trying to push to Google BigQuery. I am trying to build the entire pipeline using Google Dataflow. The log structure is different and can be classified into four different types. In my pipeline I read logs from PubSub, parse them, and write them to a BigQuery table. The table to which the logs need to be written depends on one parameter in the logs. The problem is that I am stuck on how to change the table name for BigQueryIO.Write at runtime.
You can use side outputs.
https://cloud.google.com/dataflow/model/par-do#emitting-to-side-outputs-in-your-dofn
The following sample code reads a BigQuery table and splits it into 3 different PCollections. Each PCollection ends up being sent to a different Pub/Sub topic (which could be different BigQuery tables instead).
Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
PCollection<TableRow> weatherData = p.apply(
BigQueryIO.Read.named("ReadWeatherStations").from("clouddataflow-readonly:samples.weather_stations"));
final TupleTag<String> readings2010 = new TupleTag<String>() {
};
final TupleTag<String> readings2000plus = new TupleTag<String>() {
};
final TupleTag<String> readingsOld = new TupleTag<String>() {
};
PCollectionTuple collectionTuple = weatherData.apply(ParDo.named("tablerow2string")
.withOutputTags(readings2010, TupleTagList.of(readings2000plus).and(readingsOld))
.of(new DoFn<TableRow, String>() {
@Override
public void processElement(DoFn<TableRow, String>.ProcessContext c) throws Exception {
if (c.element().getF().get(2).getV().equals("2010")) {
c.output(c.element().toString());
} else if (Integer.parseInt(c.element().getF().get(2).getV().toString()) > 2000) {
c.sideOutput(readings2000plus, c.element().toString());
} else {
c.sideOutput(readingsOld, c.element().toString());
}
}
}));
collectionTuple.get(readings2010)
.apply(PubsubIO.Write.named("WriteToPubsub1").topic("projects/fh-dataflow/topics/bq2pubsub-topic1"));
collectionTuple.get(readings2000plus)
.apply(PubsubIO.Write.named("WriteToPubsub2").topic("projects/fh-dataflow/topics/bq2pubsub-topic2"));
collectionTuple.get(readingsOld)
.apply(PubsubIO.Write.named("WriteToPubsub3").topic("projects/fh-dataflow/topics/bq2pubsub-topic3"));
p.run();

Dapper.Net and the DataReader

I have a very strange error with Dapper:
There is already an open DataReader associated with this Command which must be closed first.
But I don't use a DataReader! I just run a select query in my server application and take the first result:
//How I run query:
public static T SelectVersion(IDbTransaction transaction = null)
{
return DbHelper.DataBase.Connection.Query<T>("SELECT * FROM [VersionLog] WHERE [Version] = (SELECT MAX([Version]) FROM [VersionLog])", null, transaction, commandTimeout: DbHelper.CommandTimeout).FirstOrDefault();
}
//And how I call this method:
public Response Upload(CommitRequest message) //It is called on the server from the client
{
//Preparing data from CommitRequest
using (var tr = DbHelper.DataBase.Connection.BeginTransaction(IsolationLevel.Serializable))
{
int v = SelectQueries<VersionLog>.SelectVersion(tr) != null ? SelectQueries<VersionLog>.SelectVersion(tr).Version : 0; //Call my query here
int newVersion = v + 1; //update version
//Saving changes from CommitRequest to db
//The updated version is saved to the database too, maybe that is the problem?
return new Response
{
Message = String.Empty,
ServerBaseVersion = versionLog.Version,
};
}
}
}
And saddest of all, this exception appears at random times. I think the problem is concurrent access to the server from two clients.
Please help.
This sometimes happens if the model and the database schema do not match and an exception is raised inside Dapper.
If you really want to get to the bottom of this, the best way is to include the Dapper source in your project and debug.

Amazon RDS w/ SQL Server won't allow bulk insert from CSV source

I've tried two methods and both fall flat...
BULK INSERT TEMPUSERIMPORT1357081926
FROM 'C:\uploads\19E0E1.csv'
WITH (FIELDTERMINATOR = ',',ROWTERMINATOR = '\n')
You do not have permission to use the bulk load statement.
but you cannot enable that SQL role with Amazon RDS?
So I tried using OPENROWSET, but it requires ad hoc queries to be enabled, which I don't have permission to do!
I know this question is really old, but it was the first one that came up when I searched for bulk inserting into an AWS SQL Server RDS instance. Things have changed, and you can now do it after integrating the RDS instance with S3. I answered this question in more detail on this question. The overall gist is that you set up the instance with the proper role, put your file on S3, and then you can copy the file over to RDS with the following commands:
exec msdb.dbo.rds_download_from_s3
    @s3_arn_of_file='arn:aws:s3:::bucket_name/bulk_data.csv',
    @rds_file_path='D:\S3\seed_data\data.csv',
    @overwrite_file=1;
Then BULK INSERT will work (your_table below is a placeholder for the target table name):
BULK INSERT your_table
FROM 'D:\S3\seed_data\data.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
AWS doc
You can enable ad hoc distributed queries by heading to your Amazon Management Console, navigating to your RDS menu, creating a DB Parameter Group with ad hoc distributed queries set to 1, and then attaching this parameter group to your DB instance.
Don't forget to reboot your DB once you have made these changes.
Here is the source of my information:
http://blogs.lessthandot.com/index.php/datamgmt/dbadmin/turning-on-optimize-for-ad/
Hope this helps you.
2022
I'm adding this for anyone like me who wants to quickly insert data into RDS from C#.
While RDS allows CSV bulk uploads directly from S3, there are times when you just want to upload data straight from your program.
I've written a C# utility method which does inserts using a StringBuilder to concatenate statements into batches of 2000 inserts per call, which is much faster than an ORM like Dapper that does one insert per call.
This method should handle date, int, double, and varchar fields, but I haven't had to use it for character escaping or anything like that.
//call as
FastInsert.Insert(MyDbConnection, new[] { new { someField = "someValue" } }, "my_table");
class FastInsert
{
static int rowSize = 2000;
internal static void Insert(IDbConnection connection, object[] data, string targetTable)
{
var props = data[0].GetType().GetProperties();
var names = props.Select(x => x.Name).ToList();
foreach(var batch in data.Batch(rowSize))
{
var sb = new StringBuilder($"insert into {targetTable} ({string.Join(",", names)})");
string lastLine = "";
foreach(var row in batch)
{
sb.Append(lastLine);
var values = props.Select(prop => CreateSQLString(row, prop));
lastLine = $"select '{string.Join("','", values)}' union all ";
}
lastLine = lastLine.Substring(0, lastLine.Length - " union all".Length) + " from dual";
sb.Append(lastLine);
var fullQuery = sb.ToString();
connection.Execute(fullQuery);
}
}
private static string CreateSQLString(object row, PropertyInfo prop)
{
var value = prop.GetValue(row);
if (value == null) return "null";
if (prop.PropertyType == typeof(DateTime))
{
return $"'{((DateTime)value).ToString("yyyy-MM-dd HH:mm:ss")}'";
}
//if (prop.PropertyType == typeof(string))
//{
return $"'{value.ToString().Replace("'", "''")}'";
//}
}
}
static class Extensions
{
public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> source, int size) //split an IEnumerable into batches
{
T[] bucket = null;
var count = 0;
foreach (var item in source)
{
if (bucket == null)
bucket = new T[size];
bucket[count++] = item;
if (count != size)
continue;
yield return bucket;
bucket = null;
count = 0;
}
// Return the last bucket with all remaining elements
if (bucket != null && count > 0)
{
Array.Resize(ref bucket, count);
yield return bucket;
}
}
}