Is there a way to get a collection in a random order? - morphia

Is it possible to get the full collection of documents in a random order? Or just a random sample?

You can use the aggregation framework's $sample. The Morphia API to access that is here.
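For reference, the $sample stage itself is a one-stage aggregation pipeline. The sketch below builds that pipeline as plain Python data rather than through Morphia; the helper name random_sample_pipeline and the sample size of 5 are made up for illustration.

```python
def random_sample_pipeline(size):
    """Build an aggregation pipeline that picks `size` documents at random."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    # $sample is a standard MongoDB aggregation stage; its only option is
    # the number of documents to select at random.
    return [{"$sample": {"size": size}}]


pipeline = random_sample_pipeline(5)
# Assumption: with a driver such as pymongo this list would be passed to
# collection.aggregate(pipeline); Morphia wraps the same stage in its own API.
```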

Related

WikiMedia API: How to GET a List of Articles by Category and Location?

I would like to get a list of 5 articles that belong to the "Cathedrals in Paris" category AND are near my location (Lat=48.8, Lon=2.3).
Is there a way to achieve both in the same GET request?
If not, what is considered the best practice for achieving this goal? I know I can use the geosearch (or the categorymembers) generator, but then what? Loop through each article?
Thanks.
I checked at https://www.mediawiki.org/ and well, the simple answer is no.
The best option is to search by location OR by category (preferably whichever returns FEWER results), and then loop through each article, keeping only those that also satisfy the other criterion.
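A minimal sketch of that approach, assuming the English Wikipedia endpoint and the standard list=geosearch and list=categorymembers query modules. The helper names, the radius, and the limits are illustrative choices; the HTTP call itself is left out.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"  # assumed endpoint


def geosearch_url(lat, lon, radius_m=10000, limit=500):
    """URL for pages within radius_m metres of (lat, lon)."""
    params = {
        "action": "query",
        "list": "geosearch",
        "gscoord": f"{lat}|{lon}",
        "gsradius": radius_m,
        "gslimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def category_url(category, limit=500):
    """URL for the members of a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def articles_in_both(nearby, members, limit=5):
    """Keep nearby pages, in their original order, that are also category members."""
    member_ids = {page["pageid"] for page in members}
    return [page for page in nearby if page["pageid"] in member_ids][:limit]
```

Here nearby and members stand for the lists found under query.geosearch and query.categorymembers in the two JSON responses; both carry a pageid for each page, which is what the intersection relies on.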

What's the best way to vectorise text data in NLTK if I want to preserve the order of the sentence?

I'm classifying text data and want to feed it into a model, but I am stuck on an issue. I don't want to use CountVectorizer because it doesn't preserve the text's structure, but I also don't want to manually convert each word into an array, due to inefficiency.
What methods can I use that will help in such a context?
Thanks
This is not a direct answer to the question but provides a perspective.
If the sequence of words matters more than a bag-of-words representation, then graph-based models such as conditional random fields would help. For example, pycrfsuite is a good starting point.
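To make the contrast with bag-of-words concrete, here is a rough sketch of order-preserving features: one dict per token, each recording its neighbours. The feature names are invented for illustration; a list of per-token dicts like this is the kind of input a sequence model such as a CRF is typically trained on.

```python
def token_features(tokens, i):
    """Features for the token at position i, including its neighbours."""
    word = tokens[i]
    features = {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
    }
    if i > 0:
        features["prev.lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        features["next.lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True  # end of sentence
    return features


def sentence_features(tokens):
    """One feature dict per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Unlike a count vector, reordering the words changes the output, because each token's features depend on what comes before and after it.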

Redis: finding keys that match some pattern

I want to get from Redis all the keys whose list (each value is a list type) has more than x items.
How do I do that?
Is there a simple way, or do I have to use Lua? If Lua, how?
There are many ways to achieve that, each with its own pros and cons. The first decision you have to make is whether you want to have the answer to your query pre-prepared or computed ad-hoc.
For pre-prepared, you'll have to maintain an index of the lists by length. For ad-hoc, you'll have to scan all the lists and get their lengths at runtime.
Assuming you are trying to implement an ad-hoc query, a server-side Lua script is a good choice if you already know how. If not, you can either learn it (https://redis.io/commands/eval) or use a regular Redis client in the language of your choice.
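As a sketch of the client-side ad-hoc option, the function below scans the keyspace and checks each list's length. It assumes a client object exposing scan_iter, type and llen the way redis-py does; the function name and the pattern parameter are illustrative.

```python
def lists_longer_than(client, min_len, pattern="*"):
    """Return the keys of list type that hold more than min_len elements."""
    matches = []
    # SCAN walks the keyspace incrementally instead of blocking like KEYS.
    for key in client.scan_iter(match=pattern):
        key_type = client.type(key)
        # The type comes back as bytes unless the client decodes responses.
        if isinstance(key_type, bytes):
            key_type = key_type.decode()
        if key_type == "list" and client.llen(key) > min_len:
            matches.append(key)
    return matches
```

With redis-py this would be called as lists_longer_than(redis.Redis(), x). It makes a round trip per key, which is the cost a server-side Lua script avoids.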

The correct way to manage data for display in the Yii framework?

If I have a shop that displays a bunch of products and I get these products returned from the database as an array, is there a specific way to display this data using Yii templates, or is it sufficient to simply loop through the array and print it out in "divs" as needed?
I know if I just spit it out in DIVs, it would work, but is it the "correct" way to do it according to the framework?
For this there are zii widgets, and also many extensions.
I think for a store CListView will be a good start. There are many wikis that explain a lot about CListView.
You can easily extend it and add functionality.
Zii widgets also provide pagination, sorting, and custom styling when used along with a data provider.

Lucene: Getting terms from an unstored field

Is there any way to retrieve all the terms in a particular field which is unfortunately not stored? I cannot rebuild the index. Position-based information is not necessary. I just need the list of terms.
UPDATE
I've built a sample index with one stored, another unstored field and tested it with Luke. I was wondering whether I could get access to all terms just like Luke did. This may not be the brightest idea, but might work.
Lucene uses two different concepts: indexing and storing. If you want to extract the terms, you don't need to store anything. You can use Luke, or iterate over the terms through the API. For the Java API, see the question "How can I get the list of unique terms from a specific field in Lucene?".
Luke is Open Source, so just look at how Luke does it.