What is the highest acceptable number for higherFetchLimit and endNumber with the executeUserSearchRequest API?
You can enter any number for HigherFetchLimit, even a 6- or 7-digit one, but it is recommended to set the maximum to around 10000.
endNumber depends on your startNumber: the call returns that range of records out of the superset of records that were found for your query and kept in cache. It's like doing pagination.
It is recommended to keep each range to around 100-200 records (depending on your start number).
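To make the pagination idea concrete, here is a minimal sketch of how the startNumber/endNumber windows could be computed on the client. It assumes 1-based, inclusive numbering, which the answer above does not confirm, and page_windows is a hypothetical helper, not part of the API.

```python
def page_windows(total_hits, page_size=200):
    """Yield (startNumber, endNumber) pairs covering total_hits records.

    Assumption: numbering is 1-based and inclusive on both ends.
    """
    start = 1
    while start <= total_hits:
        end = min(start + page_size - 1, total_hits)
        yield start, end
        start = end + 1


# Example: 450 cached hits read in windows of 200 records.
windows = list(page_windows(450, page_size=200))
```

Each pair would then be passed as startNumber and endNumber in successive calls.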
Does anyone know the length of the CustomerID field in the Shopify Customer JSON? I want to store the customerID in my database, where the column length is restricted and I cannot change it, so I need to know the length.
Thanks in advance.
Finally I got the answer from Shopify.
As for the IDs they obviously seem to be BIGINT. But that would be wasteful and I seriously cannot imagine Shopify having anticipated gazillions of data rows for the next 1000's of years to come. So what's more likely is that they're composite primary keys which also would make sense given that Shopify surely needs to do some kind of partitioning.
Generally, you will find that most resources follow an N * [10^12, 10^13 - 1] pattern. Customers and Products are in N=1 as far as I can tell. Options are in N=2, Images in N=5, etc. What's beyond that is anyone's guess, but it probably consists of some kind of composite key or MMR sequence (among other solutions) to identify the DB within a cluster for the first part, and some random INT key for the actual row. Random as in something like FLOOR(rand() * (max - min) + min), because you don't want curious merchants, app vendors or up-to-no-good black hats to be able to predict IDs.
There isn't a predefined length on CustomerID, as their resources follow the ActiveRecord pattern of incrementing integers as IDs (for the time being at least). As of now, it's around 12 digits max, but that is growing.
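Since the IDs are growing integers, one way to reason about the column is to check which integer widths a given ID fits in. This is only a sketch: the sample ID below is made up, and the helper names are hypothetical.

```python
INT32_MAX = 2**31 - 1   # 2,147,483,647 (10 digits)
INT64_MAX = 2**63 - 1   # 9,223,372,036,854,775,807 (19 digits)


def digits(n):
    """Number of decimal digits in a non-negative integer."""
    return len(str(n))


def fits_int32(n):
    return 0 <= n <= INT32_MAX


def fits_int64(n):
    return 0 <= n <= INT64_MAX


sample_id = 123456789012  # made-up 12-digit ID, not a real customer
```

A 12-digit ID already overflows a signed 32-bit column, while a signed 64-bit column (or a string column of 19 or more characters) leaves room for growth.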
I have a SORTED SET of user_id:rating for every level in the game (2000+ levels). There are 2,000,000 users in the set.
I need to create 2 ratings: first, the top 100 of all users; second, the top 5 friends of each player.
The first can be solved very easily with ZRANGE.
But there is a problem with the second, because on average every user has 500 friends.
There are 2 ways:
1) I can do 500 requests with ZSCORE/ZRANK and sort the users on the backend (too many requests, bad performance)
2) I can create a SORTED SET for each user and update it in the background on every user's update (more data, more RAM, more complex)
Maybe there are other options I missed?
I believe your main concern here should be your data model. Does every user have a sorted set of his friends?
I would recommend something like this:
users:{id}:friends values as the ids of friends
users:scoreboard values as the user ids and score as the rating of each
As an answer to your first concern, you can consider using pipelines, which will drastically reduce the number of round trips; nonetheless, you will still need to handle ordering the results yourself.
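A minimal sketch of the pipelined variant, assuming a redis-py style client object r. The function name and arguments are hypothetical; the ZSCORE calls are queued and sent in one batch, and the sorting happens on the client.

```python
import heapq


def top_friends_pipelined(r, level_key, friend_ids, n=5):
    """Fetch every friend's score in one batch, then keep the n best."""
    pipe = r.pipeline(transaction=False)  # plain batching, no MULTI/EXEC
    for fid in friend_ids:
        pipe.zscore(level_key, fid)       # queued, not sent yet
    scores = pipe.execute()               # one list, same order as queued

    # Friends without a score on this level come back as None.
    rated = [(fid, s) for fid, s in zip(friend_ids, scores) if s is not None]
    return heapq.nlargest(n, rated, key=lambda pair: pair[1])
```

With 500 friends this is still 500 commands on the server, but only one network round trip.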
A better answer to your problem, provided you have the two sorted sets described earlier:
Get the intersection between the two, using the "zinterstore" command and storing the result in a sorted set created solely for this purpose. As a result, the new sorted set will contain all the user's friends ids with their rating as the score (need to be careful here since you will need to specify the score of the new sorted set, it can either be the SUM, MIN or MAX of the scores).
ref: http://redis.io/commands/zinterstore
At this point a simple "zrevrangebyscore" with a limit will give you the sorted result you are looking for.
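The two steps can be sketched as follows with a redis-py style client r. The temporary key name and the TTL are assumptions added for illustration. The friends key is given weight 0 and the scoreboard weight 1, so with SUM the stored score is exactly the rating.

```python
def top_friends(r, user_id, scoreboard_key, n=5, ttl_seconds=60):
    """Top-n friends of user_id on one scoreboard via ZINTERSTORE."""
    friends_key = f"users:{user_id}:friends"
    tmp_key = f"tmp:users:{user_id}:friends_top"  # hypothetical key name

    # Intersect friends with the scoreboard; weights make the result
    # score equal to 0 * friend_score + 1 * rating.
    r.zinterstore(tmp_key, {friends_key: 0, scoreboard_key: 1}, aggregate="SUM")
    r.expire(tmp_key, ttl_seconds)  # let the scratch set clean itself up

    # Highest ratings first, limited to n entries.
    return r.zrevrangebyscore(
        tmp_key, "+inf", "-inf", start=0, num=n, withscores=True
    )
```

The intersection costs server time proportional to the smaller input set, so it stays cheap for a 500-friend list even when the scoreboard holds millions of users.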
I am using the Wikipedia API to get the images for a string I input.
It always returns 10 results, but I want more than that, approximately 50.
https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&format=json&piprop=thumbnail&pithumbsize=500&pilimit=50&generator=prefixsearch&gpssearch=game of thrones
How do I get 50 results?
Solved: I also needed to add
&gpslimit=100
It should be in the range 1-100.
By default, most of the generators and props used in the query action have a limit of 10. Whenever you need to increase the limit for your query, you have to set the corresponding limit value for all of them, because the resulting query limit is equal to the smallest of them.
So, if your query uses generator=geosearch with prop=links|extracts|categories|images and you need 20 results, you must set the limit parameters for geosearch, links, extracts, categories and images to 20.
https://en.wikipedia.org/w/api.php?...&ggslimit=20&pllimit=20&exlimit=20&cllimit=20&imlimit=20
Of course this has to comply with the allowed max limit for each parameter. For example, for extracts the allowed max limit is 20 (default: 1), which means that you can't get more than 20 pages in your final response, even if the other limits are higher than 20. This also means that in your case above the effect of gpslimit=100 will be the same as gpslimit=50, because pilimit=50.
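As a small illustration, the request from the question can be built with every limit set to the same value. This sketch only constructs the URL and does not call the API; the function name is hypothetical, while the parameters are the ones used in the question and the accepted fix.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def prefix_image_query(search, limit=50, thumb_size=500):
    """Build a prefixsearch + pageimages URL with matching limits."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "prefixsearch",
        "gpssearch": search,
        "gpslimit": limit,        # generator limit
        "prop": "pageimages",
        "piprop": "thumbnail",
        "pithumbsize": thumb_size,
        "pilimit": limit,         # prop limit, kept equal to gpslimit
    }
    return f"{API}?{urlencode(params)}"


url = prefix_image_query("game of thrones", limit=50)
```

Keeping both limits driven by one argument avoids the situation where one of them silently stays at its default of 10.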
I want to set up a system that will end up ranking competitors against one another based on votes. In this example, there will be 250 competitors, but only 4 people able to cast votes. We ideally want it set up in a hot-or-not fashion (using the Elo rating system), but I wonder how many votes must be cast before we'd get a fair ranking?
Does anyone have any thoughts on how I might establish a fair(ish) rating without each voter casting thousands of votes?
It depends on your k-factor, i.e. how quickly you want ratings to correct to changes in skill.
If you use a higher k-factor, the rankings will quickly approximate the skill of competitors. However, in that case the ranking will be mostly a short term value, with chance, pairings and "bad days" affecting it greatly.
Using a multiple level k-factor system, like the chess world does, lets you both quickly converge to approximate ratings for new players (and the initial set of players) and track a longer term ranking for established players.
I would recommend starting with the values FIDE uses, so you don't have to retest extensively:
400 as a denominator in the exponents, so that 200 points' difference = 75% winning chance
k = 30 for the first 30 games
k = 20 after that until the player has reached 2400 ranking at least once
k = 10 thereafter
If 30 games is too much for the initial period, you could use a lower number but increase the initial k proportionally. Beware that this will make the initial ranking very variable.
If you want a different normalization than the 200 points -> 75%, you can divide all the numbers above by the same constant.
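For reference, a minimal sketch of the standard Elo formulas with the tiered k values listed above. The function names are my own, and peak_rating is assumed to be tracked per competitor by the caller.

```python
def expected_score(rating_a, rating_b, scale=400):
    """Expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def k_factor(games_played, peak_rating):
    """Tiered k: 30 for the first 30 games, 20 until 2400 was reached, then 10."""
    if games_played < 30:
        return 30
    if peak_rating < 2400:
        return 20
    return 10


def updated_rating(rating_a, rating_b, score_a, k):
    """New rating of A after one vote; score_a is 1 for a win, 0 for a loss."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# A 200-point favourite has an expected score of about 0.76.
favourite = expected_score(1700, 1500)
```

In a hot-or-not setup each vote is one game: the picked competitor gets score 1 and the other gets score 0, and both ratings are updated from their pre-vote values.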
I have an app that is displaying metrics about defects in a project.
I have the option of making one query that returns all the defects, and from that I can break out about four different metrics (how many defects escaped QA in 90 days, 180 days, and then the same metrics again but only counting sev1/sev2 defects).
I could make four queries and limit the results to one so that I just get a count for each. Or I could make one query that encompasses them all (all defects that escaped QA in 180 days) and then count up the difference.
I'm figuring that the number of defects that escaped QA in the last six months will generally be less than 100, and certainly less than 500 worst case.
Which would you do: four queries with one result each, or one single query that on average might return 50, perhaps worst case 500?
And I guess the key question is: where are the inflection points? Perhaps I have more metrics tomorrow (who knows, 8?) and a different average defect count. Is there a rule of thumb I could use to help choose which approach?
Well, I would probably make the series of four queries and use the result count. If you are expecting 500 defects, that will end up being three queries, each with up to 200 defects, anyway.
The solution where you do each individual query and use the total result count would be safe with even a very large amount of defects. Plus I usually find it to be a bad plan to think that I know the data sets that an App will be dealing with. Most of my Apps end up living much longer and being used on larger datasets than I intended.
The max page size is 200, so it sounds like you'd be requesting between 1 and 3 pages to get all the data vs. 4 queries with a page size of 1 and using the TotalResultCount...
You'd definitely have less aggregation code to write if you use the multi query approach (letting the server do the counting for you based on your supplied filters).
I'd guess the 4 independent queries might be faster, but it would be interesting to hear back about your experimental results...
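For comparison, this is roughly the aggregation code the single-query approach would need on the client. It is a sketch only: the Defect fields and the severity encoding (1 and 2 meaning sev1/sev2) are assumptions, not the API's actual schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Defect:
    created: date   # when the escaped defect was logged
    severity: int   # assumed encoding: 1 and 2 mean sev1/sev2


def escape_metrics(defects, today):
    """Bucket one 180-day result set into the four counts."""
    cut_90 = today - timedelta(days=90)
    cut_180 = today - timedelta(days=180)
    counts = {"all_90": 0, "all_180": 0, "sev12_90": 0, "sev12_180": 0}
    for d in defects:
        if d.created < cut_180:
            continue  # outside the widest window
        severe = d.severity <= 2
        counts["all_180"] += 1
        counts["sev12_180"] += severe
        if d.created >= cut_90:
            counts["all_90"] += 1
            counts["sev12_90"] += severe
    return counts
```

Adding a fifth or eighth metric here means another branch in the loop, whereas with the multi-query approach it means one more filtered query.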