Optimizing re-labeling code

Our company recently migrated from Exchange 2007 to Google Apps for Business. We have several shared mailboxes with a fairly complex and extensive folder structure and were asked to fully implement the labeling technique in these mailboxes (in order to optimize search).
For example: after migration, a conversation that used to live in the folder MyCompany/Projects/2012/Q3/Approved Projects/StackOverflow now has the single label MyCompany/Projects/2012/Q3/Approved Projects/StackOverflow. The intention is that this conversation should instead be labeled with each of the labels MyCompany, Projects, 2012, Q3, Approved Projects and StackOverflow.
I have written a script that does exactly this (at least in my test environment). The problem is that, according to what I've read, there are limits on the number of calls you are allowed to make to the Google APIs. Also, script execution time is very, very poor.
I was wondering if there is a way to somehow perform the operations client-side and send them to the Google API in bulk. I have read about the Cache Service and was wondering if I am looking in the right direction.
This is my script:
function addLabels() {
  // Get all user labels
  var allLabels = GmailApp.getUserLabels();
  for (var i = 0; i < allLabels.length; i++) {
    var label = allLabels[i];                 // label to get the threads from
    var threads = label.getThreads();         // threads assigned to this label
    var labels = label.getName().split("/");  // array of new label names
    // Add the new labels to the threads only if there is a "/" in the label's name
    // (indexOf returns -1 when not found, so compare against -1, not null)
    if (label.getName().indexOf("/") !== -1) {
      for (var a = 0; a < labels.length; a++) {
        trace("Adding label '" + labels[a] + "' to " + threads.length + " threads in '" + label.getName() + "'.");
        // Create a new label with the specified name and add it to the threads
        //var newLabel = GmailApp.createLabel(labels[a]); // commented out for test purposes
        //newLabel.addToThreads(threads);                 // commented out for test purposes
      }
    }
  }
}

function trace(message) {
  Logger.log(message);
}
Thanks in advance!

You may want to run this script in stages. The first stage would be to determine all of the new labels that need to be created and the threads that should be assigned to them. The second stage would be to create those labels. The final stage would be to assign those labels to the threads. If you are dealing with a large amount of email you may need to shard this work, keeping queues of work that is left to do, and using triggers to continue the processing.
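A minimal sketch of that staged approach in Apps Script, building on the script above. The chunked addToThreads call and the 100-thread batch size are my assumptions (addToThreads caps how many threads it accepts per call, so check the current limit), and for a very large mailbox each stage could persist its progress (for example in PropertiesService) and resume from a time-driven trigger:

function relabelInStages() {
  var plan = {}; // stage 1 output: { "Projects": [threads], "2012": [threads], ... }

  // Stage 1: work out which new labels each thread needs.
  GmailApp.getUserLabels().forEach(function(label) {
    if (label.getName().indexOf("/") === -1) return;
    var threads = label.getThreads();
    label.getName().split("/").forEach(function(name) {
      plan[name] = (plan[name] || []).concat(threads);
    });
  });

  // Stages 2 and 3: create each label once, then apply it in chunks.
  Object.keys(plan).forEach(function(name) {
    var label = GmailApp.getUserLabelByName(name) || GmailApp.createLabel(name);
    var threads = plan[name];
    for (var i = 0; i < threads.length; i += 100) {
      label.addToThreads(threads.slice(i, i + 100));
    }
  });
}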

Related

How to create several new records in another SQL table from one button-click

I'm new here. Thanks in advance for your advice.
I’m working on an app which will ask the user how many items they made.
The user will enter a number. My app should then create that many new records in a table called 'Items_Made'.
E.g. The app asks “How many items did you make?”, the user enters “19”, the app then creates 19 new records in the 'Items_Made' table.
I've managed to pull together some code (shown below) that creates ONE new record, but I would like it to create several. I probably need some kind of loop or 'while' function but am unsure how to do so.
var createDatasource = app.datasources.Items_Made.modes.create;
var newItem = createDatasource.item;
createDatasource.createItem();
This code successfully creates 1 record. I would like it to be able to create several.
Creating a lot of records via client script is not recommended, especially if you lose the connection or the app gets closed by mistake. In my opinion, the best way to handle this would be via server script, for two reasons: first, it's more reliable, and second, it's faster. As in the example from the official documentation, to create a record you need to do something like this:
// Assume a model called "Fruits" with a string field called "Name".
var newRecord = app.models.Fruits.newRecord();
newRecord.Name = "Kiwi"; // properties/fields can be read and written.
app.saveRecords([newRecord]); // save changes to database.
The example above shows how to create a single record. To create several records at once, you can use a for loop like this:
function createRecordsInBulk() {
  var newRecords = [];
  for (var i = 0; i < 19; i++) {
    var newRecord = app.models.Fruits.newRecord();
    newRecord.Name = "Kiwi " + i;
    newRecords.push(newRecord);
  }
  app.saveRecords(newRecords);
}
In the example above, you initialize newRecords, an empty array that holds all the new records to be created at once. Then, using a for loop, you generate 19 new records and push them into newRecords. Finally, once the loop is finished, you save all the records at once by calling app.saveRecords and passing the newRecords array as an argument.
Now, all this is happening on the server side. Obviously you need a way to call this from the client side. For that, you need to use the google.script.run method. So from the client side you need to do the following:
google.script.run.withSuccessHandler(function(result) {
  app.datasources.Fruits.load();
}).createRecordsInBulk();
All this information is clearly documented on the App Maker official documentation site. I strongly suggest you always check there first, as I believe you can get a faster resolution by reading the documentation.
I'd suggest making a dropdown or textbox where the user can select/enter the number of items they want to create and then attach the following code to your 'Create' button:
var createDatasource = app.datasources.Items_Made.modes.create;
var userinput = Number(widget.root.descendants.YourTextboxOrDropdown.value);
// i < userinput so that exactly `userinput` records are created (i <= userinput would create one extra)
for (var i = 0; i < userinput; i++) {
  var newItem = createDatasource.item;
  createDatasource.createItem();
}
Simple loop with your user input should get this accomplished.
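Combining the two answers: if the count should come from the user rather than being hard-coded, the server function can take it as a parameter and the client can pass the widget value along. This is only a sketch; the function, widget, and model names are illustrative and error handling is omitted:

// Server script (sketch): create `count` records in a single save.
function createItemsInBulk(count) {
  var newRecords = [];
  for (var i = 0; i < count; i++) {
    newRecords.push(app.models.Items_Made.newRecord());
  }
  app.saveRecords(newRecords);
}

// Client script on the button (sketch): pass the widget value and reload the datasource on success.
var count = Number(widget.root.descendants.YourTextboxOrDropdown.value);
google.script.run.withSuccessHandler(function() {
  app.datasources.Items_Made.load();
}).createItemsInBulk(count);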

How to improve speed when using RallyAPIForJava

I use RallyApiForJava to get stories from Rally with the getRequest method. It is very slow when getting about 500 stories from Rally. How can I improve the speed?
Limiting the scope helps performance. Here is an example of limiting the query by LastUpdateDate, scoping the request to a project and fetching only some fields:
int x = -30;
Calendar cal = GregorianCalendar.getInstance();
cal.add( Calendar.DAY_OF_YEAR, x);
Date nDaysAgoDate = cal.getTime();
SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mmZ");
QueryRequest defectRequest = new QueryRequest("Defect");
defectRequest.setProject(projectRef);
defectRequest.setFetch(new Fetch(new String[] {"Name", "FormattedID","State","Priority"}));
defectRequest.setQueryFilter(new QueryFilter("LastUpdateDate", ">", iso.format(nDaysAgoDate)));
Hydrating collections (e.g. Tasks on User Stories) requires a separate request, but if you only need a count of the items in a collection, you may save time by not hydrating it. The CRUD examples available in the User Guide illustrate this API extensively, and I don't think that on the user side, as far as custom code goes, there are ways to make it faster other than limiting the results to only what's necessary.
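If the slow part is paging through a large result set, raising the page size (and capping the total number of results) can also reduce the number of round-trips. The sketch below follows the query style above, reuses projectRef, and assumes an existing RallyRestApi instance (restApi); the page size and limit values are arbitrary, and the setPageSize/setLimit calls are the ones I recall the Java toolkit's QueryRequest exposing, so verify them against the version you use:

QueryRequest storyRequest = new QueryRequest("HierarchicalRequirement");
storyRequest.setProject(projectRef);
storyRequest.setFetch(new Fetch("Name", "FormattedID", "ScheduleState"));
storyRequest.setPageSize(200); // fewer, larger pages instead of many small ones
storyRequest.setLimit(500);    // stop once 500 results have been returned
QueryResponse storyResponse = restApi.query(storyRequest);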

Existing saga instances after applying the [Unique] attribute to IContainSagaData property

I have a bunch of existing sagas in various states of a long running process.
Recently we decided to make one of the properties on our IContainSagaData implementation unique by using the Saga.UniqueAttribute (about which more here http://docs.particular.net/nservicebus/nservicebus-sagas-and-concurrency).
After deploying the change, we realized that all our old saga instances were not being found, and after further digging (thanks Charlie!) discovered that by adding the unique attribute, we were required to data fix all our existing sagas in Raven.
Now, this is pretty poor - kind of like adding an index to a database column and then finding that all the table data is no longer selectable - but being what it is, we decided to create a tool for doing this.
So after creating and running this tool we've now patched up the old sagas so that they now resemble the new sagas (sagas created since we went live with the change).
However, despite all the data now looking right we're still not able to find old instances of the saga!
The tool we wrote does two things. For each existing saga, the tool:
Adds a new RavenJToken called "NServiceBus-UniqueValue" to the saga metadata, setting the value to the same value as our unique property for that saga, and
Creates a new document of type NServiceBus.Persistence.Raven.SagaPersister.SagaUniqueIdentity, setting the SagaId, SagaDocId, and UniqueValue fields accordingly.
My questions are:
Is it sufficient to simply make the data look correct or is there something else we need to do?
Another option we have is to revert the change which added the unique attribute. However in this scenario, would those new sagas which have been created since the change went in be OK with this?
Code for adding metadata token:
var policyKey = RavenJToken.FromObject(saga.PolicyKey); // This is the unique field
sagaDataMetadata.Add("NServiceBus-UniqueValue", policyKey);
Code for adding new doc:
var policyKeySagaUniqueId = new SagaUniqueIdentity
{
    Id = "Matlock.Renewals.RenewalSaga.RenewalSagaData/PolicyKey/" + Guid.NewGuid().ToString(),
    SagaId = saga.Id,
    UniqueValue = saga.PolicyKey,
    SagaDocId = "RenewalSaga/" + saga.Id.ToString()
};
session.Store(policyKeySagaUniqueId);
Any help much appreciated.
EDIT
Thanks to David's help on this we have fixed our problem - the key difference was that we used SagaUniqueIdentity.FormatId() to generate our document IDs rather than a new Guid. This was trivial to do since we were already referencing the NServiceBus and NServiceBus.Core assemblies.
The short answer is that it is not enough to make the data resemble the new identity documents. Where you are using Guid.NewGuid().ToString(), that data is important! That's why your solution isn't working right now. I spoke about the concept of identity documents (specifically about the NServiceBus use case) during the last quarter of my talk at RavenConf 2014 - here are the slides and video.
So here is the long answer:
In RavenDB, the only ACID guarantees are on the Load/Store by Id operations. So if two threads are acting on the same Saga concurrently, and one stores the Saga data, the second thread can only expect to get back the correct saga data if it is also loading a document by its Id.
To guarantee this, the Raven saga persister uses an identity document like the one you showed. It contains the SagaId, the UniqueValue (mostly for human comprehension and debugging; the database doesn't technically need it), and the SagaDocId (which is a little duplication, as it's only {SagaTypeName}/{SagaId} and we already have the SagaId).
With the SagaDocId, we can use the Include feature of RavenDB to do a query like this (which is from memory, probably wrong, and should only serve to illustrate the concept as pseudocode)...
var identityDocId = // some value based on incoming message
var idDoc = RavenSession
// Look at the identity doc's SagaDocId and pull back that document too!
.Include<SagaIdentity>(identityDoc => identityDoc.SagaDocId)
.Load(identityDocId);
var sagaData = RavenSession
.Load(idDoc.SagaDocId); // Already in-memory, no 2nd round-trip to database!
So the identityDocId is very important because it describes the uniqueness of the value coming from the message; not just any old Guid will do. So what we really need to know is how to calculate that.
For that, the NServiceBus saga persister code is instructive:
void StoreUniqueProperty(IContainSagaData saga)
{
    var uniqueProperty = UniqueAttribute.GetUniqueProperty(saga);
    if (!uniqueProperty.HasValue) return;

    var id = SagaUniqueIdentity.FormatId(saga.GetType(), uniqueProperty.Value);
    var sagaDocId = sessionFactory.Store.Conventions.FindFullDocumentKeyFromNonStringIdentifier(saga.Id, saga.GetType(), false);

    Session.Store(new SagaUniqueIdentity
    {
        Id = id,
        SagaId = saga.Id,
        UniqueValue = uniqueProperty.Value.Value,
        SagaDocId = sagaDocId
    });

    SetUniqueValueMetadata(saga, uniqueProperty.Value);
}
The important part is the SagaUniqueIdentity.FormatId method from the same file.
public static string FormatId(Type sagaType, KeyValuePair<string, object> uniqueProperty)
{
    if (uniqueProperty.Value == null)
    {
        throw new ArgumentNullException("uniqueProperty", string.Format("Property {0} is marked with the [Unique] attribute on {1} but contains a null value. Please make sure that all unique properties are set on your SagaData and/or that you have marked the correct properties as unique.", uniqueProperty.Key, sagaType.Name));
    }

    var value = Utils.DeterministicGuid.Create(uniqueProperty.Value.ToString());
    var id = string.Format("{0}/{1}/{2}", sagaType.FullName.Replace('+', '-'), uniqueProperty.Key, value);

    // Raven has a size limit of 255 bytes == 127 unicode chars
    if (id.Length > 127)
    {
        // generate a guid from the hash:
        var key = Utils.DeterministicGuid.Create(sagaType.FullName, uniqueProperty.Key);
        id = string.Format("MoreThan127/{0}/{1}", key, value);
    }

    return id;
}
This relies on Utils.DeterministicGuid.Create(params object[] data) which creates a Guid out of an MD5 hash. (MD5 sucks for actual security but we are only looking for likely uniqueness.)
static class DeterministicGuid
{
    public static Guid Create(params object[] data)
    {
        // use MD5 hash to get a 16-byte hash of the string
        using (var provider = new MD5CryptoServiceProvider())
        {
            var inputBytes = Encoding.Default.GetBytes(String.Concat(data));
            var hashBytes = provider.ComputeHash(inputBytes);

            // generate a guid from the hash:
            return new Guid(hashBytes);
        }
    }
}
That's what you need to replicate to get your utility to work properly.
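Applied to the migration tool from the question, that means the identity document's Id should be derived from the saga type and the unique property rather than a fresh Guid. A sketch of what that step could look like, reusing the names from the question (RenewalSagaData, PolicyKey) and assuming the tool references the NServiceBus assemblies so it can call SagaUniqueIdentity.FormatId directly, as the edit to the question describes:

// Sketch: patch one existing saga using the deterministic id format NServiceBus expects.
var uniqueProperty = new KeyValuePair<string, object>("PolicyKey", saga.PolicyKey);

var identity = new SagaUniqueIdentity
{
    // Derived from saga type + unique property via FormatId, instead of Guid.NewGuid(),
    // so the persister computes the same id when a message arrives later.
    Id = SagaUniqueIdentity.FormatId(typeof(RenewalSagaData), uniqueProperty),
    SagaId = saga.Id,
    UniqueValue = saga.PolicyKey,
    SagaDocId = "RenewalSaga/" + saga.Id.ToString()
};

session.Store(identity);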
What's really interesting is that this code made it all the way to production - I'm surprised you didn't run into trouble before this, with messages creating new saga instances when they really shouldn't because they couldn't find the existing Saga data.
I almost think it might be a good idea if NServiceBus would raise a warning any time you tried to find Saga Data by anything other than a [Unique] marked property, because it's an easy thing to forget to do. I filed this issue on GitHub and submitted this pull request to do just that.

Should I use unique tables for every user?

I'm working on an web app that collects traffic information for websites that use my service. Think google analytics but far more visual. I'm using SQL Server 2012 for the backbone of my app and am considering using MongoDB as the data gathering analytic side of the site.
If I have 100 users with an average of 20,000 hits a month on their site, that's 2,000,000 records in a single collection that will be getting queried.
Should I use MongoDB to store this information (I'm new to it and new things are intimidating)?
Should I dynamically create new collections/tables for every new user?
Thanks!
With MongoDB the collection (the rough equivalent of a SQL table) can get quite big without much issue; that is largely what it is designed for - the "Mongo" comes from "huMONGOus" (pretty clever, eh?). This is a great use for MongoDB, which is very good at storing point-in-time information.
Options:
1. New collection for each client
This is very easy to do; I use a GetCollectionSafe method for it:
public class MongoStuff
{
    private static MongoDatabase GetDatabase()
    {
        var databaseName = "dbName";
        var connectionString = "connStr";
        var client = new MongoClient(connectionString);
        var server = client.GetServer();
        return server.GetDatabase(databaseName);
    }

    public static MongoCollection<T> GetCollection<T>(string collectionName)
    {
        return GetDatabase().GetCollection<T>(collectionName);
    }

    public static MongoCollection<T> GetCollectionSafe<T>(string collectionName)
    {
        var db = GetDatabase();
        if (!db.CollectionExists(collectionName))
        {
            db.CreateCollection(collectionName);
        }
        return db.GetCollection<T>(collectionName);
    }
}
Then you can call it with:
var collection = MongoStuff.GetCollectionSafe<Record>("ClientName");
Running this script
static void Main(string[] args)
{
    var times = new List<long>();
    for (int i = 0; i < 1000; i++)
    {
        Stopwatch watch = new Stopwatch();
        watch.Start();
        MongoStuff.GetCollectionSafe<Person>(String.Format("Mark{0:000}", i));
        watch.Stop();
        Console.WriteLine(watch.ElapsedMilliseconds);
        times.Add(watch.ElapsedMilliseconds);
    }
    Console.WriteLine(String.Format("Max : {0} \nMin : {1} \nAvg : {2}", times.Max(f => f), times.Min(f => f), times.Average(f => f)));
    Console.ReadKey();
}
Gave me (on my laptop)
Max : 180
Min : 1
Avg : 6.635
Benefits:
Ease of splitting data if one client needs to go on their own
Might match your mental map of the problem
Cons:
Almost impossible to aggregate data over all collections
Hard to find collections in management tools (like Robomongo)
2. One Large Collection
Use one collection for everything and access it this way:
var coll = MongoStuff.GetCollection<Record>("Records");
Put an index on the collection (the index will make reads orders of magnitude quicker):
coll.EnsureIndex(new IndexKeysBuilder().Ascending("ClientId"));
This only needs to be run once (per collection, per index).
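With that index in place, per-client reads stay fast even though everything lives in one collection; a read would then be scoped by ClientId, for example (a sketch using the same legacy C# driver API as the snippets above; Record and ClientId are the assumed names):

// Sketch: fetch one client's records from the shared, indexed collection.
var coll = MongoStuff.GetCollection<Record>("Records");
var clientRecords = coll.Find(Query.EQ("ClientId", clientId)).ToList();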
Benefits:
One simple place to find data
Aggregating over all clients is possible
More traditional MongoDB setup
Cons:
All clients' data is intermingled
May not map as well mentally
Just as a reference, the MongoDB size limits are documented here: http://docs.mongodb.org/manual/reference/limits/
3. Store only aggregated data
If you never intend to break the data down to individual records, just save the aggregates themselves, e.g. page loads:

#     Page           Total Time    Average Time
15    Default.html   1545          103
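A pre-aggregated document for that kind of summary might look something like this (the shape is only an illustration, not something from the question):

// Sketch: one document per client, page and day; incremented as hits come in.
public class PageLoadSummary
{
    public ObjectId Id { get; set; }
    public string ClientId { get; set; }
    public string Page { get; set; }
    public DateTime Day { get; set; }
    public int Hits { get; set; }
    public long TotalTimeMs { get; set; } // average time = TotalTimeMs / Hits
}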
I will let someone else tackle the MongoDB side of your question as I don't feel I'm the best person to comment on it, but I would point out that MongoDB is a very different animal and you'll lose a lot of the referential integrity you enjoy in SQL.
In terms of SQL design, I would not use a different schema for each customer. Your database schema and backups could grow uncontrollably, and maintaining a dynamically growing schema will be a nightmare.
I would suggest one of two approaches:
Either you can create a new database for each customer:
This is more secure as users cannot access each other's data (just use different credentials) and users are easier to manage/migrate and separate.
However, many hosting providers charge per database, it will cost more to run and maintain, and should you wish to compare data across users it becomes much more challenging.
The second approach is to simply host all users in a single database. Your tables will grow large (although 2 million rows is not over the top for a well-maintained SQL database), and you would simply use a UserID column to discriminate between customers.
The emphasis will be on you to get the performance you need through proper indexing.
Users' data will exist in the same system, and there is no SQL defense against users accessing each other's data - your code will have to be good!

How can I speed my Entity Framework code?

My SQL and Entity Framework knowledge is somewhat limited. In one Entity Framework (4) application, I notice it takes forever (about 2 minutes) to complete one of my method calls. The first queries do not take much time, but when I loop through the Entity Framework objects returned by the queries, even though I am only reading (not modifying) the data, it takes forever to complete the nested loops, even though there are only dozens of entries in each list and only a few levels of looping.
I expect the example below could be re-written with a fancier query that could probably include all of the filtering I am doing in my loops with some SQL words I don't really know how to use, so if someone could show me what the equivalent SQL expression would be, that would be extremely educational to me and probably solve my current performance problem.
Moreover, since other parts of this and other applications I develop often want to do more complex computations on SQL data, I would also like to know a good way to retrieve data from Entity Framework to local memory objects that do not have huge delays in reading them. In my LINQ-to-SQL project there was a similar performance problem, and I solved it by refactoring the whole application to load all SQL data into parallel objects in RAM, which I had to write myself, and I wonder if there isn't a better way to either tell Entity Framework to not keep doing whatever high-latency communication it is doing, or to load into local RAM objects.
In the example below, the code gets a list of food menu items for a member (i.e. a person) on a certain date via a SQL query, and then I use other queries and loops to filter out the menu items on two criteria: 1) If the member has a rating of zero for any group id which the recipe is a member of (a many-to-many relationship) and 2) If the member has a rating of zero for the recipe itself.
Example:
List<PFW_Member_MenuItem> MemberMenuForCookDate =
    (from item in _myPfwEntities.PFW_Member_MenuItem
     where item.MemberID == forMemberId
     where item.CookDate == onCookDate
     select item).ToList();

// Now filter out recipes in recipe groups rated zero by the member:
List<PFW_Member_Rating_RecipeGroup> ExcludedGroups =
    (from grpRating in _myPfwEntities.PFW_Member_Rating_RecipeGroup
     where grpRating.MemberID == forMemberId
     where grpRating.Rating == 0
     select grpRating).ToList();

foreach (PFW_Member_Rating_RecipeGroup grpToExclude in ExcludedGroups)
{
    List<PFW_Member_MenuItem> rcpsToRemove = new List<PFW_Member_MenuItem>();
    foreach (PFW_Member_MenuItem rcpOnMenu in MemberMenuForCookDate)
    {
        PFW_Recipe rcp = GetRecipeById(rcpOnMenu.RecipeID);
        foreach (PFW_RecipeGroup group in rcp.PFW_RecipeGroup)
        {
            if (group.RecipeGroupID == grpToExclude.RecipeGroupID)
            {
                rcpsToRemove.Add(rcpOnMenu);
                break;
            }
        }
    }
    foreach (PFW_Member_MenuItem rcpToRemove in rcpsToRemove)
        MemberMenuForCookDate.Remove(rcpToRemove);
}

// Now filter out recipes rated zero by the member:
List<PFW_Member_Rating_Recipe> ExcludedRecipes =
    (from rcpRating in _myPfwEntities.PFW_Member_Rating_Recipe
     where rcpRating.MemberID == forMemberId
     where rcpRating.Rating == 0
     select rcpRating).ToList();

foreach (PFW_Member_Rating_Recipe rcpToExclude in ExcludedRecipes)
{
    List<PFW_Member_MenuItem> rcpsToRemove = new List<PFW_Member_MenuItem>();
    foreach (PFW_Member_MenuItem rcpOnMenu in MemberMenuForCookDate)
    {
        if (rcpOnMenu.RecipeID == rcpToExclude.RecipeID)
            rcpsToRemove.Add(rcpOnMenu);
    }
    foreach (PFW_Member_MenuItem rcpToRemove in rcpsToRemove)
        MemberMenuForCookDate.Remove(rcpToRemove);
}
You can use EFProf (http://www.hibernatingrhinos.com/products/EFProf) to see exactly what EF is sending to SQL. It can also show you how many queries you are sending and how many of them are unique, and it provides some analysis of each query (e.g. whether it is unbounded). With Entity Framework's navigation properties it is quite easy not to realize you are making a database request; when you access a navigation property inside a loop, you run into the N + 1 problem.
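For the code in the question specifically, most of the looping can be pushed into the database so the menu items come back already filtered in one round-trip. This is only a sketch: the entity and property names are copied from the question, but the PFW_Recipe navigation property from menu item to recipe is my assumption, so treat it as a starting point rather than a drop-in replacement.

// Sketch: let SQL do the filtering instead of nested loops in memory.
List<PFW_Member_MenuItem> memberMenuForCookDate =
    (from item in _myPfwEntities.PFW_Member_MenuItem
     where item.MemberID == forMemberId
     where item.CookDate == onCookDate
     // exclude recipes the member rated zero
     where !_myPfwEntities.PFW_Member_Rating_Recipe
         .Any(r => r.MemberID == forMemberId && r.Rating == 0 && r.RecipeID == item.RecipeID)
     // exclude recipes belonging to any group the member rated zero
     // (item.PFW_Recipe is an assumed navigation property to the recipe)
     where !item.PFW_Recipe.PFW_RecipeGroup
         .Any(g => _myPfwEntities.PFW_Member_Rating_RecipeGroup
             .Any(gr => gr.MemberID == forMemberId && gr.Rating == 0 && gr.RecipeGroupID == g.RecipeGroupID))
     select item).ToList();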
You could use the virtual keyword on the collection properties of your model (if you are using Code First) to enable lazy-loading proxies; that way you will not have to get all the data back at once, only as you need it.
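For example, with Code First a lazily loaded collection is just a virtual navigation property (the class below is a guess at the question's model, not actual code from it):

public class PFW_Recipe
{
    public int RecipeID { get; set; }

    // virtual lets EF create a proxy that loads the groups only when they are accessed
    public virtual ICollection<PFW_RecipeGroup> PFW_RecipeGroup { get; set; }
}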
Also consider NoTracking for read-only data:
context.bigTable.MergeOption = MergeOption.NoTracking;