Deleting many managed objects selected by fragment type - Cumulocity

I want to delete many managed objects, selected by fragment type. There are more than 2,000 of them. Unfortunately I cannot delete them all with a single call; I have to call my function repeatedly until everything is gone. How can I delete such a list of managed objects efficiently? Not defining a page size did not help...
This is my current function:
InventoryFilter filter = new InventoryFilter();
filter.byFragmentType("xy_fragment");
ManagedObjectCollection moc = inventoryApi.getManagedObjectsByFilter(filter);
int count = 0;
// max page size is 2000
for (ManagedObjectRepresentation mo : moc.get(2000).allPages()) {
    if (mo.get("c8y_IsBinary") != null) {
        binariesApi.deleteFile(mo.getId());
    } else {
        inventoryApi.delete(mo.getId());
    }
    LOG.debug(count + " remove: " + mo.getName() + ", " + mo.getType());
    count++;
}
LOG.info("all objects removed, count: " + count);

By calling moc.get(2000).allPages() you already obtain an iterator that fetches the following pages on demand as you iterate over it.
The problem you are facing is caused by deleting elements from the same collection you are iterating over. You delete the elements of the first page, but by the time the second page is requested from the server it no longer contains the elements you expect, because everything that is left has shifted forward by one page size due to your deletions.
You can avoid all of that by making a local copy of all elements you want to delete first:
List<ManagedObjectRepresentation> allObjects = Lists.newArrayList(moc.get(2000).allPages());
for (ManagedObjectRepresentation mo : allObjects) {
    if (mo.get("c8y_IsBinary") != null) {
        binariesApi.deleteFile(mo.getId());
    } else {
        inventoryApi.delete(mo.getId());
    }
}
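If holding more than 2,000 full representations in memory is a concern, a lighter variant of the same idea (my own sketch, not part of the original answer) is to snapshot only the ids, together with the one fragment needed to pick the right delete call:

// GId is the id type returned by mo.getId(); java.util.List/ArrayList assumed imported
List<GId> binaryIds = new ArrayList<>();
List<GId> otherIds = new ArrayList<>();
for (ManagedObjectRepresentation mo : moc.get(2000).allPages()) {
    if (mo.get("c8y_IsBinary") != null) {
        binaryIds.add(mo.getId());
    } else {
        otherIds.add(mo.getId());
    }
}
binaryIds.forEach(binariesApi::deleteFile);
otherIds.forEach(inventoryApi::delete);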

There is no bulk delete allowed on the inventory API so your method of looping through the objects is the correct approach.
A bulk delete is already a dangerous tool on the other APIs, but on the inventory API it would give you the potential to accidentally delete all of your data with a single call (since all data associated with a managed object is also deleted when the managed object itself is deleted).
That is why it is not available.

I solved the problem by calling the method repeatedly until no more elements are found. It is not pretty, but I have no better idea.
public synchronized void removeManagedObjects(String deviceTypeKey) {
    int count = 0;
    do {
        count = deleteManagedObjects(deviceTypeKey);
    } while (count > 0);
}

private int deleteManagedObjects(String deviceTypeKey) {
    InventoryFilter filter = new InventoryFilter();
    filter.byFragmentType(deviceTypeKey);
    ManagedObjectCollection moc = inventoryApi.getManagedObjectsByFilter(filter);
    int count = 0;
    if (moc == null) {
        LOG.info("ManagedObjectCollection is null");
        return count;
    }
    for (ManagedObjectRepresentation mo : moc.get(2000).allPages()) {
        if (mo.get("c8y_IsBinary") != null) {
            binariesApi.deleteFile(mo.getId());
        } else {
            inventoryApi.delete(mo.getId());
        }
        LOG.debug(count + " remove: " + mo.getName() + ", " + mo.getType());
        count++;
    }
    LOG.info("all objects removed, count: " + count);
    return count;
}

Related

Hibernate Search manual indexing throws an "org.hibernate.TransientObjectException: The instance was not associated with this session"

I use Hibernate Search 5.11 in my Spring Boot 2 application to provide full-text search.
This library requires documents to be indexed.
When my app is launched, I try to manually re-index the data of an indexed entity (MyEntity.class) every five minutes (for a specific reason, due to my server context).
I try to index the data of MyEntity.class.
MyEntity.class has a property attachedFiles, which is a HashSet filled through a @OneToMany() join with lazy loading enabled:
@OneToMany(mappedBy = "myEntity", cascade = CascadeType.ALL, orphanRemoval = true)
private Set<AttachedFile> attachedFiles = new HashSet<>();
I wrote the required indexing process, but an exception is thrown on "fullTextSession.index(result);" whenever the attachedFiles property of a given entity contains one or more items:
org.hibernate.TransientObjectException: The instance was not associated with this session
In that case the debugger shows a message like "Unable to load [...]" on the entity's HashSet value.
If the HashSet is empty (not null, just empty), no exception is thrown.
My indexing method :
private void indexDocumentsByEntityIds(List<Long> ids) {
    final int BATCH_SIZE = 128;
    Session session = entityManager.unwrap(Session.class);
    FullTextSession fullTextSession = Search.getFullTextSession(session);
    fullTextSession.setFlushMode(FlushMode.MANUAL);
    fullTextSession.setCacheMode(CacheMode.IGNORE);

    CriteriaBuilder builder = session.getCriteriaBuilder();
    CriteriaQuery<MyEntity> criteria = builder.createQuery(MyEntity.class);
    Root<MyEntity> root = criteria.from(MyEntity.class);
    criteria.select(root).where(root.get("id").in(ids));
    TypedQuery<MyEntity> query = fullTextSession.createQuery(criteria);
    List<MyEntity> results = query.getResultList();

    int index = 0;
    for (MyEntity result : results) {
        index++;
        try {
            fullTextSession.index(result); //index each element
            if (index % BATCH_SIZE == 0 || index == ids.size()) {
                fullTextSession.flushToIndexes(); //apply changes to indexes
                fullTextSession.clear(); //free memory since the queue is processed
            }
        } catch (TransientObjectException toEx) {
            LOGGER.info(toEx.getMessage());
            throw toEx;
        }
    }
}
Does anyone have an idea?
Thanks!
This is probably caused by the "clear" call you have in your loop.
In essence, what you're doing is:
load all entities to reindex into the session
index one batch of entities
remove all entities from the session (fullTextSession.clear())
try to index the next batch of entities, even though they are no longer in the session...
What you need to do is to only load each batch of entities after the session clearing, so that you're sure they are still in the session when you index them.
There's an example of how to do this in the documentation, using a scroll and an appropriate batch size: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#search-batchindex-flushtoindexes
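For reference, the scroll-based pattern from that part of the documentation looks roughly like this (my own sketch for Hibernate Search 5.x, reusing MyEntity and BATCH_SIZE from your code):

fullTextSession.setFlushMode(FlushMode.MANUAL);
fullTextSession.setCacheMode(CacheMode.IGNORE);
Transaction tx = fullTextSession.beginTransaction();
// Scroll through the results instead of loading them all into the session up front
ScrollableResults results = fullTextSession.createCriteria(MyEntity.class)
        .setFetchSize(BATCH_SIZE)
        .scroll(ScrollMode.FORWARD_ONLY);
int index = 0;
while (results.next()) {
    index++;
    fullTextSession.index(results.get(0)); // index each element
    if (index % BATCH_SIZE == 0) {
        fullTextSession.flushToIndexes(); // apply changes to indexes
        fullTextSession.clear();          // free memory; entities are loaded again on the next scroll step
    }
}
tx.commit();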
Alternatively, you can just split your ID list in smaller lists of 128 elements, and for each of these lists, run a query to get the corresponding entities, reindex all these 128 entities, then flush and clear.
Thanks for the explanations @yrodiere, they helped me a lot!
I chose your alternative solution:
Alternatively, you can just split your ID list in smaller lists of 128 elements, and for each of these lists, run a query to get the corresponding entities, reindex all these 128 entities, then flush and clear.
...and everything works perfectly!
Good call!
See the code solution below:
private List<List<Object>> splitList(List<Object> list, int subListSize) {
    List<List<Object>> splittedList = new ArrayList<>();
    if (!CollectionUtils.isEmpty(list)) {
        int i = 0;
        int nbItems = list.size();
        while (i < nbItems) {
            int maxLastSubListIndex = i + subListSize;
            int lastSubListIndex = (maxLastSubListIndex > nbItems) ? nbItems : maxLastSubListIndex;
            List<Object> subList = list.subList(i, lastSubListIndex);
            splittedList.add(subList);
            i = lastSubListIndex;
        }
    }
    return splittedList;
}
private void indexDocumentsByEntityIds(Class<Object> clazz, String entityIdPropertyName, List<Object> ids) {
    Session session = entityManager.unwrap(Session.class);
    List<List<Object>> splittedIdsLists = splitList(ids, 128);
    for (List<Object> splittedIds : splittedIdsLists) {
        FullTextSession fullTextSession = Search.getFullTextSession(session);
        fullTextSession.setFlushMode(FlushMode.MANUAL);
        fullTextSession.setCacheMode(CacheMode.IGNORE);
        Transaction transaction = fullTextSession.beginTransaction();

        CriteriaBuilder builder = session.getCriteriaBuilder();
        CriteriaQuery<Object> criteria = builder.createQuery(clazz);
        Root<Object> root = criteria.from(clazz);
        criteria.select(root).where(root.get(entityIdPropertyName).in(splittedIds));
        TypedQuery<Object> query = fullTextSession.createQuery(criteria);
        List<Object> results = query.getResultList();

        int index = 0;
        for (Object result : results) {
            index++;
            try {
                fullTextSession.index(result); //index each element
                if (index == splittedIds.size()) {
                    fullTextSession.flushToIndexes(); //apply changes to indexes
                    fullTextSession.clear(); //free memory since the queue is processed
                }
            } catch (TransientObjectException toEx) {
                LOGGER.info(toEx.getMessage());
                throw toEx;
            }
        }
        transaction.commit();
    }
}

Entity Framework, update multiple fields more efficiently

Using Entity Framework, I am updating about 300 rows, across about 9 columns, roughly every 30 seconds. Below is how I am currently doing it. My question is: how can I make this code more efficient?
Every once in a while I can feel the impact on my database, and I just want to make this as efficient as possible.
// FOREACH OF MY 300 ROWS
var original = db.MarketDatas.FirstOrDefault(x => x.BBSymbol == targetBBsymbol);
if (original != null)
{
    //if (original.BBSymbol.ToUpper() == "NOH7 INDEX")
    //{
    //    var x1 = 1;
    //}
    original.last_price = marketDataItem.last_price;
    original.bid = marketDataItem.bid;
    original.ask = marketDataItem.ask;
    if (marketDataItem.px_settle_last_dt_rt != null)
    {
        original.px_settle_last_dt_rt = marketDataItem.px_settle_last_dt_rt;
    }
    if (marketDataItem.px_settle_actual_rt != 0)
    {
        original.px_settle_actual_rt = marketDataItem.px_settle_actual_rt;
    }
    original.chg_on_day = marketDataItem.chg_on_day;
    if (marketDataItem.prev_close_value_realtime != 0)
    {
        original.prev_close_value_realtime = marketDataItem.prev_close_value_realtime;
    }
    if (marketDataItem.px_settle_last_dt_rt != null)
    {
        // d1 is defined elsewhere in the surrounding code (not shown here)
        DateTime d2 = (DateTime)marketDataItem.px_settle_last_dt_rt;
        if (d1.Day == d2.Day)
        {
            //market has settled
            original.settled = "yes";
        }
        else
        {
            //market has NOT settled
            original.settled = "no";
        }
    }
    if (marketDataItem.updateTime.Year != 1)
    {
        original.updateTime = marketDataItem.updateTime;
    }
    db.SaveChanges();
}
Watching what is being hit in the debugger...
SELECT TOP (1)
    [Extent1].[MarketDataID] AS [MarketDataID],
    [Extent1].[BBSymbol] AS [BBSymbol],
    [Extent1].[Name] AS [Name],
    [Extent1].[fut_Val_Pt] AS [fut_Val_Pt],
    [Extent1].[crncy] AS [crncy],
    [Extent1].[fut_tick_size] AS [fut_tick_size],
    [Extent1].[fut_tick_val] AS [fut_tick_val],
    [Extent1].[fut_init_spec_ml] AS [fut_init_spec_ml],
    [Extent1].[last_price] AS [last_price],
    [Extent1].[bid] AS [bid],
    [Extent1].[ask] AS [ask],
    [Extent1].[px_settle_last_dt_rt] AS [px_settle_last_dt_rt],
    [Extent1].[px_settle_actual_rt] AS [px_settle_actual_rt],
    [Extent1].[settled] AS [settled],
    [Extent1].[chg_on_day] AS [chg_on_day],
    [Extent1].[prev_close_value_realtime] AS [prev_close_value_realtime],
    [Extent1].[last_tradeable_dt] AS [last_tradeable_dt],
    [Extent1].[fut_notice_first] AS [fut_notice_first],
    [Extent1].[updateTime] AS [updateTime]
FROM [dbo].[MarketDatas] AS [Extent1]
WHERE ([Extent1].[BBSymbol] = @p__linq__0) OR (([Extent1].[BBSymbol] IS NULL) AND (@p__linq__0 IS NULL))
It seems it updates the same thing multiple times, if I am understanding it correctly.
UPDATE [dbo].[MarketDatas]
SET [last_price] = @0, [chg_on_day] = @1, [updateTime] = @2
WHERE ([MarketDataID] = @3)

UPDATE [dbo].[MarketDatas]
SET [last_price] = @0, [chg_on_day] = @1, [updateTime] = @2
WHERE ([MarketDataID] = @3)
You can reduce this to 2 round trips.
Don't call SaveChanges() inside the loop. Move it outside and call it after you are done processing everything.
Write the select so that it retrieves all the originals in one go and pushes them into an in-memory collection, then look each item up in that collection as you update/insert.
Code:
// use this as your source
// to retrieve an item later use TryGetValue
var originals = db.MarketDatas
    .Where(x => arrayOftargetBBsymbol.Contains(x.BBSymbol))
    .ToDictionary(x => x.BBSymbol, y => y);

// iterate over changes you want to make
foreach (var change in changes)
{
    MarketData original = null;
    // is there an existing entity
    if (originals.TryGetValue(change.targetBBsymbol, out original))
    {
        // update your original
    }
}

// save changes all at once
db.SaveChanges();
You could simply execute db.SaveChanges() after your foreach loop. I think that would do exactly what you are asking for.
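Applied to the code in the question, that just means leaving the per-row update logic untouched and moving the single SaveChanges() call out of the loop. A minimal sketch (marketDataItems is a hypothetical name for whatever drives the 300-row loop):

foreach (var marketDataItem in marketDataItems) // hypothetical collection driving the 300 rows
{
    var original = db.MarketDatas.FirstOrDefault(x => x.BBSymbol == targetBBsymbol);
    if (original != null)
    {
        // ... copy the changed fields onto 'original', exactly as in the question ...
    }
}
db.SaveChanges(); // one SaveChanges call for all pending updates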
It seems it updates the same thing multiple times, if I am
understanding it correctly.
Entity Framework performs a database round-trip for every entity it updates.
Just check the parameter values; they will be different.
how can I make the code more efficient
The major problem is that your current solution is not scalable.
It works well when you only have a few entities to update, but it will get worse and worse as the number of items to update in a batch increases.
It's often better to perform this kind of logic entirely in the database, but perhaps you cannot do that.
Disclaimer: I'm the owner of the project Entity Framework Extensions
This library can make your code more efficient by allowing you to save multiple entities at once. All bulk operations are supported:
BulkSaveChanges
BulkInsert
BulkUpdate
BulkDelete
BulkMerge
BulkSynchronize
Example:
// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);

// Customize Primary Key
context.BulkMerge(customers, operation => {
    operation.ColumnPrimaryKeyExpression =
        customer => customer.Code;
});

Sqlite3 - node js insert multiple rows in two tables, lastID not working

I know there are many solutions for multiple insertion in sqlite3, but I am looking for an efficient method where data is inserted into two tables and the data of the second table depends on the first table. This is a Node.js application.
I have two tables, programs and tests, in sqlite3. The tests table contains the id of a program, i.e. one program can contain multiple tests.
The method suggested on the official page of the sqlite3 module is as follows:
var sqlite3 = require('sqlite3').verbose();
var db = new sqlite3.Database(':memory:');

db.serialize(function() {
    db.run("CREATE TABLE lorem (info TEXT)");

    var stmt = db.prepare("INSERT INTO lorem VALUES (?)");
    for (var i = 0; i < 10; i++) {
        stmt.run("Ipsum " + i);
    }
    stmt.finalize();

    db.each("SELECT rowid AS id, info FROM lorem", function(err, row) {
        console.log(row.id + ": " + row.info);
    });
});

db.close();
Since in my case I have to insert data into two tables, I am using the following code:
var programData = resp.program_info; // contains complete data of programs and tests

db1.run("INSERT INTO programs (`parent_prog_id`, `prog_name`, `prog_exercises`, `prog_orgid`, `prog_createdby`, `prog_created`, `prog_modified`, `prog_status`) VALUES (?,?,?,?,?,?,?,?)",
    prog.parent_program_id, prog.programName, JSON.stringify(prog), req.session.org_id, prog.created_by, prog.created_at, prog.updated_at, prog.program_status,
    function(err) {
        if (err) {
            throw err;
        } else {
            var count = 1;
            var step2PostedData = prog;
            for (i in step2PostedData.testsInfo) {
                var c = 0;
                for (j in step2PostedData.testsInfo[i]) {
                    var obj = Object.keys(step2PostedData.testsInfo[i])[c];
                    db1.prepare("INSERT INTO `tests` (`parent_name`, `test_name`, `test_alias`, `sequences`, `duration`, `prog_id`, `org_id`, `test_createdby`, `test_created`, `test_modified`, `test_status`) VALUES (?,?,?,?,?,?,?,?,?,?,?)")
                        .run(count, obj,
                            step2PostedData.testsInfo[i][j].alias,
                            step2PostedData.testsInfo[i][j].sequences,
                            step2PostedData.testsInfo[i][j].duration,
                            this.lastID, // using this I am getting the program id
                            req.session.org_id,
                            prog.created_by,
                            prog.created_at,
                            prog.updated_at,
                            prog.program_status);
                    c++;
                    count++;
                }
            }
        }
    });
Now, my problem is that with the suggested method (prepared statements, no callback) I do not get the last inserted program id from the programs table.
For example, if I use the following code:
var stmt = db1.prepare("INSERT INTO programs (`parent_prog_id`, `prog_name`, `prog_exercises`, `prog_orgid`, `prog_createdby`, `prog_created`, `prog_modified`, `prog_status`) VALUES (?,?,?,?,?,?,?,?)");
var stmt2 = db1.prepare("INSERT INTO `tests` (`parent_name`, `test_name`, `test_alias`, `sequences`, `duration`, `prog_id`, `org_id`, `test_createdby`, `test_created`, `test_modified`, `test_status`) VALUES (?,?,?,?,?,?,?,?,?,?,?)");

programData.forEach(function(prog) {
    // Inserts data in the programs table
    stmt.run(prog.parent_program_id, prog.programName, JSON.stringify(prog), req.session.org_id, prog.created_by, prog.created_at, prog.updated_at, prog.program_status);

    for (i in step2PostedData.testsInfo) {
        var c = 0;
        for (j in step2PostedData.testsInfo[i]) {
            var obj = Object.keys(step2PostedData.testsInfo[i])[c];
            stmt2.run(
                count,
                obj,
                step2PostedData.testsInfo[i][j].alias,
                step2PostedData.testsInfo[i][j].sequences,
                step2PostedData.testsInfo[i][j].duration,
                'what should be there', // How to get the last inserted program ID here?
                req.session.org_id,
                prog.created_by,
                prog.created_at,
                prog.updated_at,
                prog.program_status);
        } // inner for loop ends
    } // outer for loop ends

    stmt.finalize();
    stmt2.finalize();
});
If I use this.lastID, it returns null, which is expected since there is no callback now.
If I use sqlite3_last_insert_rowid(), I get a "sqlite3_last_insert_rowid is not defined" error.
If I use last_insert_rowid(), I get "last_insert_rowid() is not defined".
Question:
How can I pass the last inserted program id there? Currently the program id ends up as null.
Edit:
If using a callback is the only way to get the last ID, then I will keep my code as it currently is. In that case, can anyone suggest how I can speed up the insertion?
Thank you!
sqlite3_last_insert_rowid() is a part of the C API.
last_insert_rowid() is an SQL function.
If the documentation tells you that lastID is valid inside the callback, then you must use the callback.
Just move all the child INSERTs into the completion callback of the parent INSERT.
(The suggested code is not intended to show how to get the last inserted ID; it just demonstrates that the values actually have been inserted.)
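A rough sketch of that structure (my own illustration, reusing the tables and column lists from the question; error handling kept minimal):

db1.run(
    "INSERT INTO programs (`parent_prog_id`, `prog_name`, `prog_exercises`, `prog_orgid`, `prog_createdby`, `prog_created`, `prog_modified`, `prog_status`) VALUES (?,?,?,?,?,?,?,?)",
    [prog.parent_program_id, prog.programName, JSON.stringify(prog), req.session.org_id, prog.created_by, prog.created_at, prog.updated_at, prog.program_status],
    function(err) {               // a regular function, not an arrow function, so this.lastID is available
        if (err) throw err;
        var progId = this.lastID; // id of the program row that was just inserted
        var stmt2 = db1.prepare("INSERT INTO `tests` (`parent_name`, `test_name`, `test_alias`, `sequences`, `duration`, `prog_id`, `org_id`, `test_createdby`, `test_created`, `test_modified`, `test_status`) VALUES (?,?,?,?,?,?,?,?,?,?,?)");
        // run stmt2 once per test here, passing progId as the prog_id value
        stmt2.finalize();
    }
);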

Insert Multiple records RIA services

I have a service class that holds the calls to the RIA service.
I have the following method to save multiple records. As a first step I have to get the famous MaxId from the table and then increment it in the foreach while adding the objects.
public bool SaveRecs(ObservableCollection<Acc> accList)
{
    int i = 0;
    invokeOperation = Context.GetMaxAcc();
    invokeOperation.Completed +=
        (s, a) =>
        {
            foreach (Acc item in accList)
            {
                //if (CheckAcc(item.name, item.id)) continue;
                item.id = invokeOperation.Value + i;
                Context.Accs.Add(item);
                i++;
            }
        };
    return Commit();
}
The problem is that the first time the method is called it does nothing, and the second time it works; the strange part is that it may then give a duplicate-key error.
When I debug the code, the id is ZERO.
Is this the correct way of doing it?
Thanks
My mistake, I should move the commit after the foreach.
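In other words, the commit has to happen inside the Completed handler, after every id has been assigned. A sketch along those lines (my own reconstruction, reusing the names from the question; note that the original bool return value no longer reflects the outcome of the asynchronous commit):

public void SaveRecs(ObservableCollection<Acc> accList)
{
    invokeOperation = Context.GetMaxAcc();
    invokeOperation.Completed += (s, a) =>
    {
        int i = 0;
        foreach (Acc item in accList)
        {
            item.id = invokeOperation.Value + i;
            Context.Accs.Add(item);
            i++;
        }
        Commit(); // commit only after all ids have been assigned
    };
}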

Why can I not use Continuation when using a proxy class to access MS CRM 2013?

So I have a standard service reference proxy class for MS CRM 2013 (i.e. right-click, Add Service Reference, etc...). I then ran into the limitation that CRM data calls return at most 50 results, and I wanted to get the full list of results. I found two methods; one looks more correct but doesn't seem to work. I was wondering why it doesn't, and/or whether there is something I'm doing incorrectly.
Basic setup and process
crmService = new CrmServiceReference.MyContext(new Uri(crmWebServicesUrl));
crmService.Credentials = System.Net.CredentialCache.DefaultCredentials;
var accountAnnotations = crmService.AccountSet.Where(a => a.AccountNumber == accountNumber).Select(a => a.Account_Annotation).FirstOrDefault();
Using Continuation (something I want to work, but it looks like it doesn't)
while (accountAnnotations.Continuation != null)
{
    accountAnnotations.Load(crmService.Execute<Annotation>(accountAnnotations.Continuation.NextLinkUri));
}
Using that method, .Continuation is always null and accountAnnotations.Count is always 50 (but there are more than 50 records).
After struggling with .Continuation for a while I came up with the following alternative method (but it seems "not good"):
var accountAnnotationData = accountAnnotations.ToList();
var accountAnnotationFinal = accountAnnotations.ToList();
var index = 1;
while (accountAnnotationData.Count == 50)
{
    accountAnnotationData = (from a in crmService.AnnotationSet
                             where a.ObjectId.Id == accountAnnotationData.First().ObjectId.Id
                             select a).Skip(50 * index).ToList();
    accountAnnotationFinal = accountAnnotationFinal.Union(accountAnnotationData).ToList();
    index++;
}
So the second method seems to work, but for any number of reasons it doesn't feel like the best approach. Is there a reason .Continuation is always null? Is there some setup step I'm missing, or some nicer way to do this?
The way to get all the records from CRM is to use paging. Here is an example with a QueryExpression, but you can also use FetchXML if you want:
// Query using the paging cookie.
// Define the paging attributes.
// The number of records per page to retrieve.
int fetchCount = 3;
// Initialize the page number.
int pageNumber = 1;
// Initialize the number of records.
int recordCount = 0;
// Define the condition expression for retrieving records.
ConditionExpression pagecondition = new ConditionExpression();
pagecondition.AttributeName = "address1_stateorprovince";
pagecondition.Operator = ConditionOperator.Equal;
pagecondition.Values.Add("WA");
// Define the order expression to retrieve the records.
OrderExpression order = new OrderExpression();
order.AttributeName = "name";
order.OrderType = OrderType.Ascending;
// Create the query expression and add condition.
QueryExpression pagequery = new QueryExpression();
pagequery.EntityName = "account";
pagequery.Criteria.AddCondition(pagecondition);
pagequery.Orders.Add(order);
pagequery.ColumnSet.AddColumns("name", "address1_stateorprovince", "emailaddress1", "accountid");
// Assign the pageinfo properties to the query expression.
pagequery.PageInfo = new PagingInfo();
pagequery.PageInfo.Count = fetchCount;
pagequery.PageInfo.PageNumber = pageNumber;
// The current paging cookie. When retrieving the first page,
// pagingCookie should be null.
pagequery.PageInfo.PagingCookie = null;
Console.WriteLine("#\tAccount Name\t\t\tEmail Address");
while (true)
{
    // Retrieve the page.
    EntityCollection results = _serviceProxy.RetrieveMultiple(pagequery);
    if (results.Entities != null)
    {
        // Retrieve all records from the result set.
        foreach (Account acct in results.Entities)
        {
            Console.WriteLine("{0}.\t{1}\t\t{2}",
                ++recordCount,
                acct.EMailAddress1,
                acct.Name);
        }
    }
    // Check for more records, if it returns true.
    if (results.MoreRecords)
    {
        // Increment the page number to retrieve the next page.
        pagequery.PageInfo.PageNumber++;
        // Set the paging cookie to the paging cookie returned from current results.
        pagequery.PageInfo.PagingCookie = results.PagingCookie;
    }
    else
    {
        // If no more records are in the result nodes, exit the loop.
        break;
    }
}