How to improve speed when using RallyAPIForJava

I am using RallyApiForJava to get stories from Rally with the getRequest method. It is very slow when fetching about 500 stories. How can I improve the speed?

Limiting the scope helps performance. Here is an example of limiting the query by LastUpdateDate, scoping the request to a project and fetching only some fields:
// Build a cutoff date 30 days in the past
int x = -30;
Calendar cal = GregorianCalendar.getInstance();
cal.add(Calendar.DAY_OF_YEAR, x);
Date nDaysAgoDate = cal.getTime();
SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mmZ");

// Scope the query to one project, fetch only the needed fields,
// and filter to items updated in the last 30 days
QueryRequest defectRequest = new QueryRequest("Defect");
defectRequest.setProject(projectRef);
defectRequest.setFetch(new Fetch(new String[] {"Name", "FormattedID", "State", "Priority"}));
defectRequest.setQueryFilter(new QueryFilter("LastUpdateDate", ">", iso.format(nDaysAgoDate)));
Hydrating collections (e.g. Tasks on user stories) requires a separate request, but if you only need a count of the items in a collection, you can save time by not hydrating it. The CRUD examples available in the User Guide illustrate this API extensively. On the user side, as far as custom code goes, there is not much that can make it faster other than limiting the results to only what is necessary.
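For example, here is a minimal sketch of reading a collection's count without hydrating it, assuming an existing RallyRestApi instance named restApi and the standard WSAPI behavior of returning a summary object (with a Count field) for collections that are fetched but not hydrated:
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;

// Fetch stories with only the Tasks collection summary, not the tasks themselves
QueryRequest storyRequest = new QueryRequest("HierarchicalRequirement");
storyRequest.setProject(projectRef);
storyRequest.setFetch(new Fetch(new String[] {"Name", "FormattedID", "Tasks"}));
QueryResponse storyResponse = restApi.query(storyRequest);
for (JsonElement element : storyResponse.getResults()) {
    JsonObject story = element.getAsJsonObject();
    // The un-hydrated collection summary carries a Count without a second request
    int taskCount = story.getAsJsonObject("Tasks").get("Count").getAsInt();
    System.out.println(story.get("FormattedID").getAsString() + " has " + taskCount + " tasks");
}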

Related

Should I use unique tables for every user?

I'm working on a web app that collects traffic information for websites that use my service. Think Google Analytics, but far more visual. I'm using SQL Server 2012 for the backbone of my app and am considering using MongoDB as the data-gathering, analytic side of the site.
If I have 100 users with an average of 20,000 hits a month on their site, that's 2,000,000 records in a single collection that will be getting queried.
Should I use MongoDB to store this information (I'm new to it and new things are intimidating)?
Should I dynamically create new collections/tables for every new user?
Thanks!
With MongoDB, a collection (the equivalent of a SQL table) can get quite big without much issue; that is largely what it is designed for (the "Mongo" is part of "humongous", pretty clever eh). This is a great use case for MongoDB, which is great at storing point-in-time information.
Options:
1. New Collection for each Client
This is very easy to do; I use a GetCollectionSafe method for it:
public class MongoStuff
{
    private static MongoDatabase GetDatabase()
    {
        var databaseName = "dbName";
        var connectionString = "connStr";
        var client = new MongoClient(connectionString);
        var server = client.GetServer();
        return server.GetDatabase(databaseName);
    }

    public static MongoCollection<T> GetCollection<T>(string collectionName)
    {
        return GetDatabase().GetCollection<T>(collectionName);
    }

    public static MongoCollection<T> GetCollectionSafe<T>(string collectionName)
    {
        var db = GetDatabase();
        if (!db.CollectionExists(collectionName))
        {
            db.CreateCollection(collectionName);
        }
        return db.GetCollection<T>(collectionName);
    }
}
Then you can call it with:
var collection = MongoStuff.GetCollectionSafe<Record>("ClientName");
Running this script:
static void Main(string[] args)
{
    var times = new List<long>();
    for (int i = 0; i < 1000; i++)
    {
        Stopwatch watch = new Stopwatch();
        watch.Start();
        MongoStuff.GetCollectionSafe<Person>(String.Format("Mark{0:000}", i));
        watch.Stop();
        Console.WriteLine(watch.ElapsedMilliseconds);
        times.Add(watch.ElapsedMilliseconds);
    }
    Console.WriteLine(String.Format("Max : {0} \nMin : {1} \nAvg : {2}", times.Max(f => f), times.Min(f => f), times.Average(f => f)));
    Console.ReadKey();
}
gave me (on my laptop):
Max : 180
Min : 1
Avg : 6.635
Benefits:
Easy to split out one client's data if they need to go on their own
Might match your mental map of the problem
Cons:
Almost impossible to aggregate data over all collections
Hard to find collections in management GUIs (like Robomongo)
2. One Large Collection
Use one collection for everything and access it this way:
var coll = MongoStuff.GetCollection<Record>("Records");
Put an index on the collection (the index will make reads orders of magnitude quicker):
coll.EnsureIndex(new IndexKeysBuilder().Ascending("ClientId"));
This only needs to be run once (per collection, per index).
Benefits:
One simple place to find data
Aggregating over all clients is possible
More traditional MongoDB setup
Cons:
All clients' data is intermingled
May not map as well to your mental model
For reference, the MongoDB size limits are documented here:
http://docs.mongodb.org/manual/reference/limits/
3. Store only aggregated data
If you never intend to break the data down to individual records, just save the aggregates themselves. For example, for page loads:
Page Loads:
    #     Page           Total Time    Average Time
    15    Default.html   1545          103
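A minimal sketch of maintaining such an aggregate with the same legacy 1.x C# driver used above (the PageLoadStats collection and the ClientId, Page, Hits and TotalTimeMs fields are hypothetical, as are the clientId and responseTimeMs variables):
// Requires the MongoDB.Bson, MongoDB.Driver and MongoDB.Driver.Builders namespaces.
var stats = MongoStuff.GetCollection<BsonDocument>("PageLoadStats");
var query = Query.And(
    Query.EQ("ClientId", clientId),
    Query.EQ("Page", "Default.html"));
var update = Update
    .Inc("Hits", 1)                      // one more page load
    .Inc("TotalTimeMs", responseTimeMs); // accumulate the total time
// Upsert creates the aggregate document the first time this client/page pair is seen.
stats.Update(query, update, UpdateFlags.Upsert);
// The average time is then computed on read as TotalTimeMs / Hits.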
I will let someone else tackle the MongoDB side of your question, as I don't feel I'm the best person to comment on it, but I would point out that MongoDB is a very different animal and you will lose a lot of the referential integrity you enjoy in SQL.
In terms of SQL design, I would not use a different schema for each customer. Your database schema and backups could grow uncontrollably, and maintaining a dynamically growing schema will be a nightmare.
I would suggest one of two approaches:
Either you can create a new database for each customer:
This is more secure as users cannot access each other's data (just use different credentials) and users are easier to manage/migrate and separate.
However, many hosting providers charge per database, it will cost more to run and maintain, and should you wish to compare data across users it becomes much more challenging.
Your second approach is to simply host all users in a single DB; your tables will grow large (although 2 million rows is not over the top for a well-maintained SQL DB), and you would simply use a UserID column to discriminate between users.
The emphasis will be on you to get the performance you need through proper indexing.
Users' data will exist in the same system and there is no SQL defense against users accessing each other's data - your code will have to be good!

How can I speed up my Entity Framework code?

My SQL and Entity Framework knowledge is somewhat limited. In one Entity Framework (4) application, I notice it takes forever (about 2 minutes) to complete one of my method calls. The first queries do not take much time, but when I loop through the Entity Framework objects returned by the queries, it takes forever to complete the nested loops, even though I am only reading (not modifying) the data I supposedly already got and there are only dozens of entries in each list and a few levels of looping.
I expect the example below could be re-written with a fancier query that could probably include all of the filtering I am doing in my loops with some SQL words I don't really know how to use, so if someone could show me what the equivalent SQL expression would be, that would be extremely educational to me and probably solve my current performance problem.
Moreover, since other parts of this and other applications I develop often want to do more complex computations on SQL data, I would also like to know a good way to retrieve data from Entity Framework to local memory objects that do not have huge delays in reading them. In my LINQ-to-SQL project there was a similar performance problem, and I solved it by refactoring the whole application to load all SQL data into parallel objects in RAM, which I had to write myself, and I wonder if there isn't a better way to either tell Entity Framework to not keep doing whatever high-latency communication it is doing, or to load into local RAM objects.
In the example below, the code gets a list of food menu items for a member (i.e. a person) on a certain date via a SQL query, and then I use other queries and loops to filter out the menu items on two criteria: 1) If the member has a rating of zero for any group id which the recipe is a member of (a many-to-many relationship) and 2) If the member has a rating of zero for the recipe itself.
Example:
List<PFW_Member_MenuItem> MemberMenuForCookDate =
    (from item in _myPfwEntities.PFW_Member_MenuItem
     where item.MemberID == forMemberId
     where item.CookDate == onCookDate
     select item).ToList();

// Now filter out recipes in recipe groups rated zero by the member:
List<PFW_Member_Rating_RecipeGroup> ExcludedGroups =
    (from grpRating in _myPfwEntities.PFW_Member_Rating_RecipeGroup
     where grpRating.MemberID == forMemberId
     where grpRating.Rating == 0
     select grpRating).ToList();

foreach (PFW_Member_Rating_RecipeGroup grpToExclude in ExcludedGroups)
{
    List<PFW_Member_MenuItem> rcpsToRemove = new List<PFW_Member_MenuItem>();
    foreach (PFW_Member_MenuItem rcpOnMenu in MemberMenuForCookDate)
    {
        PFW_Recipe rcp = GetRecipeById(rcpOnMenu.RecipeID);
        foreach (PFW_RecipeGroup group in rcp.PFW_RecipeGroup)
        {
            if (group.RecipeGroupID == grpToExclude.RecipeGroupID)
            {
                rcpsToRemove.Add(rcpOnMenu);
                break;
            }
        }
    }
    foreach (PFW_Member_MenuItem rcpToRemove in rcpsToRemove)
        MemberMenuForCookDate.Remove(rcpToRemove);
}

// Now filter out recipes rated zero by the member:
List<PFW_Member_Rating_Recipe> ExcludedRecipes =
    (from rcpRating in _myPfwEntities.PFW_Member_Rating_Recipe
     where rcpRating.MemberID == forMemberId
     where rcpRating.Rating == 0
     select rcpRating).ToList();

foreach (PFW_Member_Rating_Recipe rcpToExclude in ExcludedRecipes)
{
    List<PFW_Member_MenuItem> rcpsToRemove = new List<PFW_Member_MenuItem>();
    foreach (PFW_Member_MenuItem rcpOnMenu in MemberMenuForCookDate)
    {
        if (rcpOnMenu.RecipeID == rcpToExclude.RecipeID)
            rcpsToRemove.Add(rcpOnMenu);
    }
    foreach (PFW_Member_MenuItem rcpToRemove in rcpsToRemove)
        MemberMenuForCookDate.Remove(rcpToRemove);
}
You can use EFProf (http://www.hibernatingrhinos.com/products/EFProf) to see exactly what EF is sending to SQL. It can also show you how many queries you are sending and how many of them are unique, and it provides some analysis of each query (e.g. whether it is unbounded). With Entity Framework's navigation properties it is quite easy not to realize you are making a database request: when you touch a navigation property inside a loop, you run into the N + 1 problem.
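As an illustration, one way to avoid the per-item requests is to let the database evaluate both of the question's filters in a single query. Here is a minimal sketch (it assumes the entity sets and columns shown in the question, plus a hypothetical PFW_Recipe navigation property from PFW_Member_MenuItem to its recipe - adjust to the actual model):
// Sub-queries for the member's zero-rated groups and recipes; EF composes
// these into the main query rather than executing them separately.
var excludedGroupIds =
    from grpRating in _myPfwEntities.PFW_Member_Rating_RecipeGroup
    where grpRating.MemberID == forMemberId && grpRating.Rating == 0
    select grpRating.RecipeGroupID;

var excludedRecipeIds =
    from rcpRating in _myPfwEntities.PFW_Member_Rating_Recipe
    where rcpRating.MemberID == forMemberId && rcpRating.Rating == 0
    select rcpRating.RecipeID;

List<PFW_Member_MenuItem> MemberMenuForCookDate =
    (from item in _myPfwEntities.PFW_Member_MenuItem
     where item.MemberID == forMemberId
        && item.CookDate == onCookDate
        && !excludedRecipeIds.Contains(item.RecipeID)
        && !item.PFW_Recipe.PFW_RecipeGroup
               .Any(g => excludedGroupIds.Contains(g.RecipeGroupID))
     select item).ToList();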
You could also use the virtual keyword on the collection properties of your model (if you are using Code First) to enable lazy-loading proxies; that way you will not have to get all the data back at once, only as you need it.
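For instance, a minimal Code First sketch (the class and property names here are hypothetical, not from the question's model):
public class Recipe
{
    public int RecipeID { get; set; }
    public string Name { get; set; }

    // 'virtual' lets EF create a lazy-loading proxy, so the related groups
    // are only queried when this property is first accessed.
    public virtual ICollection<RecipeGroup> RecipeGroups { get; set; }
}
Be aware that lazy loading inside a loop re-introduces the N + 1 pattern described above, so it helps most when you genuinely only touch a few of the related collections.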
Also consider NoTracking for read-only data:
context.bigTable.MergeOption = MergeOption.NoTracking;

Optimizing re-labeling code

Our company recently migrated from Exchange 2007 to Google Apps for Business. We have several shared mailboxes with a fairly complex and extensive folder structure and were asked to fully implement the labeling technique in these mailboxes (in order to optimize search).
e.g. Say, after migration, a conversation that used to be in MyCompany/Projects/2012/Q3/Approved Projects/StackOverflow (folder structure) now has the label MyCompany/Projects/2012/Q3/Approved Projects/StackOverflow. The intention here is that this conversation would have to be labeled with the labels MyCompany, Projects, 2012, Q3, Approved Projects and StackOverflow.
I have written a script that does exactly this (at least in my test environment). The problem is that, according to what I've read, there are limits on the number of calls you are allowed to make to the Google API. Also, script execution time is very, very poor.
I was wondering if there is a way to somehow perform operations client-side and send them to the Google API in bulk. I have read about the Cache Service and was wondering if I am looking in the right direction.
This is my script:
function addLabels() {
  // Get all labels
  var allLabels = GmailApp.getUserLabels();
  for (var i = 0; i < allLabels.length; i++) {
    var label = allLabels[i];                 // label to get the threads from
    var threads = label.getThreads();         // threads assigned to this label
    var labels = label.getName().split("/");  // array of new label names
    // add the new labels to the specified threads only if there's a "/" in the label's name
    if (label.getName().indexOf("/") != -1) {
      for (var a = 0; a < labels.length; a++) {
        trace("Adding label '" + labels[a] + "' to " + threads.length + " threads in '" + label.getName() + "'.");
        // create a new label with the specified name and add it to the threads
        //var newLabel = GmailApp.createLabel(labels[a]); // commented out for test purposes
        //newLabel.addToThreads(threads);                 // commented out for test purposes
      }
    }
  }
}

function trace(message) {
  Logger.log(message);
}
Thanks in advance!
You may want to run this script in stages. The first stage would be to determine all of the new labels that need to be created and the threads that should be assigned to them. The second stage would be to create those labels. The final stage would be to assign those labels to the threads. If you are dealing with a large amount of email you may need to shard this work, keeping queues of work that is left to do, and using triggers to continue the processing.
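As a rough sketch of the batching part (the relabelBatch name and the batch size of 100 are assumptions, and the queue/trigger bookkeeping for very large mailboxes is left out):
function relabelBatch() {
  var allLabels = GmailApp.getUserLabels();
  for (var i = 0; i < allLabels.length; i++) {
    var fullName = allLabels[i].getName();
    if (fullName.indexOf("/") == -1) continue;   // nothing to split
    var threads = allLabels[i].getThreads();
    var parts = fullName.split("/");
    for (var p = 0; p < parts.length; p++) {
      // Reuse the label if it already exists instead of recreating it.
      var target = GmailApp.getUserLabelByName(parts[p]) || GmailApp.createLabel(parts[p]);
      // One addToThreads call labels a whole batch instead of one thread at a time.
      for (var start = 0; start < threads.length; start += 100) {
        target.addToThreads(threads.slice(start, start + 100));
      }
    }
  }
}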

Rails session variables aren't getting set/passed as expected

Thanks for taking a look at this relatively newb question.
I have a web app built on Rails 3 that allows users to view multiple stories at a time, with each story having multiple posts. I use JavaScript to poll the server at regular intervals to search for new posts on all of the open stories. I use session variables to keep track of where I ended my last search for each of those open stories, so that I don't have to search the entire table of posts from scratch each time I poll the server.
Here is the action for when a user first opens a story:
def open_story
  story = Story.find(params[:story_id])
  # keep track of the last post searched for each open story, to assist when we poll for new posts to that story
  last_post_searched = Post.last.id
  session["last_post_searched_for_story_#{story.id}"] = last_post_searched
  @posts = story.posts.where("posts.id <= ?", last_post_searched)
  respond_with @posts
end
Here is the action for when the client polls the server for new post updates on an array of open stories:
def update_stories
  results = {}
  open_stories_id_array = params[:open_stories]
  open_stories_id_array.each do |open_story_id|
    debugger
    start_search_at_post_id = session["last_post_searched_for_story_#{open_story_id}"] + 1
    session["last_post_searched_for_story_#{open_story_id}"] = Post.last.id
    story = Story.find(open_story_id)
    updates = story.posts.where("posts.id between ? AND ?",
                                start_search_at_post_id,
                                session["last_post_searched_for_story_#{open_story_id}"])
    results[open_story_id] = updates
  end
  respond_with(results)
end
For reasons that I can't figure out, my session variables don't increment to the new Post.last.id in my update_stories action in a timely fashion. Here is how I can recreate the problem:
Say I have 30 posts in my db to various different stories.
I call open_story on story 1. This sets session["last_post_searched_for_story_1"] to 30.
I then make a new post to story 1 (post 31).
My client polls the update_stories action to get new posts for story 1.
update_stories searches for posts with ids between 31 and 31 for story with id of 1, and returns the post that I just made.
Then, a little while later my client automatically polls update_stories again to check for any new posts on story 1. This is where the problem occurs.
Instead of session["last_post_searched_for_story_1"] containing the value, 31, it retains its previous value of 30 so that my db search returns my original new post for a second time. Often, my client will call update_stories several times before session["last_post_searched_for_story_1"] increments to 31. It's almost as if the session variable is very slow to save its new value, or I'm experiencing some sort of lazy loading problem.
Any help figuring this problem out would be greatly appreciated and eagerly accepted.
Thanks
BTW, as I still have a lot to learn, feel free to give feedback on better ways to handle this issue or if I am violating any rails best practices.
I see two problems with your code:
First, you may want to order your results before applying the last method; the last record returned by the database is not necessarily the last one to be created.
Secondly, to select the last post, you should restrict the query to only the posts for that story and then take the last post from that set.
So, instead of this:
story = Story.find(params[:story_id])
# keep track of the last post searched for each open story, to assist when we poll for new posts to that story
last_post_searched = Post.last.id
You could have it like this:
story = Story.find(params[:story_id])
last_post_searched = Post.joins(:story).where("stories.id = ?", story.id).order("posts.created_on DESC").first.id

Using Magento API to get Products

I'm using the Magento API to get product data for products from a certain category from another domain. I have made the API call etc... The code I'm currently using to get the product data looks like this:
$productList = $client->call($session, 'catalog_category.assignedProducts', 7);
foreach ($productList as $product) {
    $theProduct = array();
    $theProduct['info'] = $client->call($session, 'catalog_product.info', $product['sku']);
    $allProducts[] = $theProduct;
}
The code works fine, but it is extremely slow. When I add the image call to the loop, it takes about 50 seconds for the page to load, and that's for a site with only 5 products. What I want to know is the following:
Is the code above correct and it's just Magento's API script is very slow?
Is the code above not the best way of doing what I need?
Could there be any other factors making this go so slow?
Any help would be much appreciated. At least if I know I'm using the code right I can look at other avenues.
Thanks in advance!
================= EDIT =================
Using multiCall as suggested by Matthias Zeis, the data arrives much more quickly. Here's the code I used:
$apicalls = array();
$i = 0;
$productList = $client->call($session, 'catalog_category.assignedProducts', 7);
foreach ($productList as $product) {
    $apicalls[$i] = array('catalog_product.info', $product['product_id']);
    $i++;
}
$list = $client->multiCall($session, $apicalls);
This now works much quicker than before! The next issue I've found is that the catalog_product_attribute_media.list call doesn't seem to work in the same way, even though the products all have images set.
The error I'm getting in the var_dump is:
Requested image not exists in product images' gallery.
Anybody know why this may now be happening? Thanks again in advance.
1. Is the code above correct and it's just Magento's API script is very slow?
Your code is correct, but the script is slow because (a) the SOAP API is not blazingly fast and (b) you are making separate calls for every single product.
2. Is the code above not the best way of doing what I need?
If you use the SOAP v1 API or XML-RPC, you can try multiCall. First, call catalog_category.assignedProducts to fetch the product ids, then collect those ids and execute a single multiCall. That should cut the waiting time down quite a bit.
Unfortunately, Magento doesn't provide a nice solution out of the box to deliver the data like you need it. I recommend that you implement your own custom API call.
Use a product collection model:
$collection = Mage::getModel('catalog/product')->getCollection();
This will get you a Mage_Catalog_Model_Resource_Product_Collection object which can be used to filter, sort and paginate your product list. Iterate over the collection and build an array containing the data you need; you can also generate thumbnails for your products directly while building the data array:
foreach ($collection as $product) {
    $data[$product->getSku()] = array(
        /* the attributes you need ... */
        'small_image' => Mage::helper('catalog/image')->init($product, 'image')
            ->constrainOnly(true)
            ->keepAspectRatio(true)
            ->keepFrame(false)
            ->resize(100, 150)
            ->__toString(),
        /* some more attributes ... */
    );
}
This should give you quite a performance improvement.
But of course this only is the tip of the iceberg. If this solution is not fast enough for you, avoid SOAP and bypass a part of the Magento stack by building your own API. This doesn't have to be a complex solution: it could be a simple PHP script with HTTP Basic Authentication which parses the URL for filter criteria etc., includes app/Mage.php and calls Mage::app() to initialise the Magento framework. The benefit is that you have the comfort of using Magento classes but you don't have to go through the whole routing process.
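A minimal sketch of such a script (the export_products.php file name, the category_id parameter, the attribute list and the JSON output are illustrative assumptions, not a prescribed API):
<?php
// export_products.php - lightweight read-only endpoint that bypasses SOAP
// and the normal Magento routing.
require_once 'app/Mage.php';
Mage::app();

// ... perform HTTP Basic Authentication and validate the input here ...

$categoryId = (int) $_GET['category_id'];

$collection = Mage::getModel('catalog/product')->getCollection()
    ->addCategoryFilter(Mage::getModel('catalog/category')->load($categoryId))
    ->addAttributeToSelect(array('name', 'sku', 'price'));

$data = array();
foreach ($collection as $product) {
    $data[$product->getSku()] = array(
        'name'  => $product->getName(),
        'price' => $product->getPrice(),
    );
}

header('Content-Type: application/json');
echo json_encode($data);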
Don't forget that you can cache the results; I would imagine that you will show the same products to quite a few visitors on the other domain, so even caching for a few minutes may help your server.
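For instance, a small sketch using Magento's built-in cache (the cache id, tag and five-minute lifetime are arbitrary choices):
$cache   = Mage::app()->getCache();
$cacheId = 'exported_products_category_' . $categoryId;

$json = $cache->load($cacheId);
if ($json === false) {
    $json = json_encode($data);   // $data built as in the sketch above
    // keep it for 300 seconds so repeated visitors don't rebuild the list
    $cache->save($json, $cacheId, array('EXPORTED_PRODUCT_JSON'), 300);
}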
3. Could there be any other factors making this go so slow?
There may be some reasons why the calls are that slow on your server - but without knowing the volume of your data, your server hardware and the customisations you have done, even a best guess won't be that good.