Formatting couchdb-lucene results with a couchdb list


Situation...
I have a simple couchapp that lists emails stored in a CouchDB database. The emails are queried with a simple view and then piped through a list function to give me a pretty table in which I can click an email to view it. That works great.
The next evolution of this app was to add full-text searching of the email subject lines with couchdb-lucene. I think I have that nailed down as well, since I can search using Lucene and get valid results back. What I can't quite grasp is how to take those results and pipe them back into my existing list function so they get formatted correctly.
Here is an example of my view + list URL that gives me the HTML
http://localhost:5984/tenant103/_design/Email/_list/emaillist/by_type?startkey=["Email",2367264774866]&endkey=["Email",0]&limit=20&descending=true&include_docs=true
And here is my search URL that also gives me results
http://localhost:5984/_fti/local/tenant103/_design/Email/by_subject?q=OM-2875&include_docs=true
My thinking was I would build the URL like this
http://localhost:5984/_fti/local/tenant103/_design/Email/_list/emaillist/by_subject?q=OM-2875&include_docs=true
But that just returns
{
reason: "bad_request",
code: 400
}
This is a learning project for myself with CouchDB so I may not be getting some simple concepts here.

CouchDB-Lucene does not natively support list transformations and CouchDB can only apply list transformations to its own map/reduce views. Sorry about that!
Robert Newson.
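
Since the list function cannot be applied to the search results, one workaround is to format them on the client side instead. The following is only a minimal sketch of that idea, not something couchdb-lucene provides: it assumes the search response (with include_docs=true) is a JSON object with a "rows" array whose entries carry "id" and "doc", and that each email document has a "subject" field. Both the response shape and the field name are assumptions to check against your own data.

```python
from html import escape


def render_email_table(lucene_response):
    """Turn a parsed search response into a simple HTML table.

    Assumes each row looks like {"id": ..., "doc": {"subject": ...}};
    that shape and the "subject" field are assumptions, not confirmed here.
    """
    table_rows = []
    for row in lucene_response.get("rows", []):
        doc = row.get("doc") or {}
        doc_id = escape(str(row.get("id", "")))
        subject = escape(str(doc.get("subject", "")))
        table_rows.append(f"<tr><td>{doc_id}</td><td>{subject}</td></tr>")
    return "<table>\n" + "\n".join(table_rows) + "\n</table>"


# Hypothetical sample response, used only to exercise the function.
sample = {"rows": [{"id": "a1", "score": 1.0, "doc": {"subject": "OM-2875 <urgent>"}}]}
html_table = render_email_table(sample)
```

The same function could be fed the JSON fetched from the _fti URL in the question, so the table markup lives in one place outside the design document.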

Related

How to get some specific result using MarkLogic search API

I am new to MarkLogic and I am trying to get a specific kind of result from a search query.
More specifically, I want to search for a word through the search API and get back the documents that contain that word.
No header information, no rank or any other metadata; I just want the documents themselves as the result.
Is there a way to do this with a single request?
Or do I need to write some code to get this result?
I would appreciate any help.
Thanks

If you are accessing MarkLogic from outside, I'd have a look at a POST call to /v1/search with an Accept header of multipart/mixed. Details should be described here: https://docs.marklogic.com/REST/POST/v1/search
If running inside MarkLogic, you could consider using the low-level cts:search, which indeed returns documents directly. Keep in mind though that it won't paginate results, and it is usually unwise to return more than about 50 to 100 documents at once. It would just hog memory, and not allow for parallel processing.
HTH!
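
A rough sketch of the first suggestion, for illustration only. It builds (but does not send) a POST request to /v1/search with the multipart/mixed Accept header. The host, port, and the JSON body carrying the search word as "qtext" are assumptions on my part; check them against the linked documentation, and note that a real call will also need authentication configured for your server.

```python
import json
import urllib.request


def build_search_request(word, base_url="http://localhost:8000"):
    """Build a POST /v1/search request that asks for documents, not a summary.

    base_url and the {"search": {"qtext": ...}} body are assumptions;
    the endpoint and the Accept header come from the answer above.
    """
    body = json.dumps({"search": {"qtext": word}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/search",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "multipart/mixed",
        },
    )


request = build_search_request("lucene")
```

Sending it would be a matter of passing the request to an opener that handles your server's authentication; each part of the multipart response should then correspond to one matching document.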

MusicBrainz API search provides different results from web page

I'm trying to work with MusicBrainz's API but I'm having some issues with the results of the search endpoint.
Let's have an example searching for Who's Who? - SIZE020 - Klack (Mix Two)
Searching from their site leads to this page, with an almost correct first result (probably because the 100% correct information is not in the database at all).
Using the API leads to different situations which are causing some issues.
I made some different attempts with no success, even if I think I know enough of Lucene's syntax to write a successful query for this service.
Take 1 - empty results with the query "Who's Who? - SIZE020 - Klack (Mix Two)"
Take 2 - completely wrong results with the query Who's+Who%3F+-+SIZE020+-+Klack+(Mix+Two) (same result with the unescaped ? character)
Take 3 - empty results with the query "Who's" AND "Who?" AND "SIZE020" AND "Klack" AND "Mix" AND "Two"
Now, I know that SIZE020 shouldn't be in the query, but I don't want to deal with file names on client side so I'm just pushing the query to their service hoping that everything will work. And it works, but only if I query the service through their website, making me think that my query syntax is wrong and leaving me clueless.
Do you have any hint on why I get different results between website and xml API?
EDIT: as a side question, given a random file name, what's the best way to submit the query? I'm getting good results using the web version and submitting typical mp3 filenames (like artist_-_title_(version).mp3), but I'm not getting anything good from my client.

Searching via the web service always uses the "indexed search with advanced query syntax" search method; this can't be changed.
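
Because the advanced query syntax gives characters such as ? ( ) - and " a special meaning, one thing worth trying from a client is escaping them before sending a raw file name. This is only a sketch under that assumption; the answer above does not say escaping will reproduce the website's results. The helper names are made up for illustration, and the character list follows the Lucene query parser's documented special characters.

```python
import re
from urllib.parse import urlencode

# Characters (and the two-character operators && and ||) that the Lucene
# query parser treats specially.
_LUCENE_SPECIAL = re.compile(r'(&&|\|\||[+\-!(){}\[\]^"~*?:\\/])')


def escape_lucene(text):
    """Prefix every Lucene special character with a backslash."""
    return _LUCENE_SPECIAL.sub(r"\\\1", text)


def build_search_url(raw_text, entity="recording"):
    """Build a web service search URL for the escaped text.

    The entity default and fmt=json are choices made for this sketch.
    """
    params = urlencode({"query": escape_lucene(raw_text), "fmt": "json"})
    return f"https://musicbrainz.org/ws/2/{entity}?{params}"


escaped = escape_lucene("Who's Who? (Mix Two)")
url = build_search_url("Who's Who? - SIZE020 - Klack (Mix Two)")
```

With escaping in place, the ? and the parentheses are treated as literal text rather than as a wildcard and a grouping, which is at least one difference between a raw file name and a valid advanced query.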

Facebook like scroll down and searching/adding

I am working on enhancing the search functionality of a website.
The current search works as follows:
1. Read all the rows from the database.
2. Find the keywords in each row and return the result.
The problem is that it is too slow: the backend has to prepare all the data, which means reading everything from different databases and putting it into HTML.
The solution that comes to mind is:
Show partial search results (say 10), so that reading and searching rows stops as soon as enough results have been found in the database.
Once the user scrolls down the page, use AJAX to trigger another round of searching.
My questions are:
Is this a good (or even possible) way to do it?
Is there any tutorial or source I should look up?
I know it is a rather abstract question, but I need advice on this.
Thanks in advance.
Update from my research:
https://github.com/webcreate/infinite-ajax-scroll
This jQuery library can handle the front end.
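
On the backend side, the "stop once enough results are found" idea described in the question can be sketched by letting the database do the filtering and paging. This is a minimal illustration using an in-memory SQLite database; the articles table, its columns, and the search_page function are hypothetical names, and a real site would likely want a proper full-text index rather than LIKE.

```python
import sqlite3


def search_page(conn, keyword, offset=0, page_size=10):
    """Return one page of matching rows instead of scanning everything in code.

    The table and column names are hypothetical. Wildcard characters in
    the keyword (% and _) are not escaped in this sketch.
    """
    cursor = conn.execute(
        "SELECT id, body FROM articles "
        "WHERE body LIKE ? ORDER BY id LIMIT ? OFFSET ?",
        (f"%{keyword}%", page_size, offset),
    )
    return cursor.fetchall()


# Hypothetical data, only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO articles (body) VALUES (?)",
    [(f"post {i} about couchdb",) for i in range(25)],
)

first_page = search_page(conn, "couchdb")
second_page = search_page(conn, "couchdb", offset=10)
```

Each scroll event would then request the next offset over AJAX, and the server only ever prepares one page of HTML at a time.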

How to build a recursive structure with MongoDB

I'm trying to do something usually simple with SQL (with foreign key in the same table for example) (it may be as easy with MongoDB, I just don't know yet) which is to build a recursive data structure.
For this example, I'll talk about Pages in a Website. I'd like to make a multiple level page structure. So there could be:
Home
Our Products
Product 1
Product 2
About us
Where are we?
Contact us
Let's say pages would have a title and a content.
I need to know what's the best way to do this, and also how I could build a sitemap based on that data structure (page that shows every page from every level).
I'm building a node.js app with MongoDB for this case.
EDIT: Wouldn't it work by simply referencing a parent page in each page? Pages would be like { title: 'test', content: 'hello world', parentPage: ObjectID(parent page) }
Thanks for the help!

Personally I would implement a materialised paths structure here. It is very easy to update and query using prefix regexes that are not case-insensitive (which means the query can use an index), so an example would look like:
{_id: {}, path: 'about_us/where_are_we'}
This also, as you can see, allows SEO-friendly URLs to hit directly on this tree, giving you maximum power. This is particularly helpful in help systems where you would like to display a URL like:
/help/how-to-use-my-site
Since how-to-use-my-site can hit directly on the path, or even further, you can house two fields and hit directly on the full text like:
{_id: {}, path: 'about_us/where_are_we', normalised_url: 'where_are_we'}
Of course, as the previous answer said, you have to know how you wish to access your content, but materialised paths are a good start in my opinion.
You can read more on tree structures in Mongo here: http://www.mongodb.org/display/DOCS/Trees+in+MongoDB
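
To make the materialised path idea a bit more concrete, here is a small sketch in plain Python (no database driver involved). It builds the kind of anchored, case-sensitive prefix regex filter described above and derives an indented sitemap from a list of page documents. The function names are made up for illustration, and the documents assume a "title" field as in the question plus the "path" field from this answer.

```python
import re


def descendants_filter(path):
    """Build a query filter matching every page below the given path.

    The regex is anchored with ^ and is case-sensitive, which is what
    allows an index on "path" to be used.
    """
    return {"path": {"$regex": "^" + re.escape(path) + "/"}}


def sitemap_lines(pages):
    """Return page titles in tree order, indented by depth."""
    ordered = sorted(pages, key=lambda page: page["path"].split("/"))
    return ["  " * page["path"].count("/") + page["title"] for page in ordered]


# Hypothetical documents, as they might come back from a find() on the collection.
pages = [
    {"title": "Where are we?", "path": "about_us/where_are_we"},
    {"title": "Home", "path": "home"},
    {"title": "About us", "path": "about_us"},
    {"title": "Contact us", "path": "about_us/contact_us"},
]

about_filter = descendants_filter("about_us")
sitemap = sitemap_lines(pages)
```

The filter dictionary is what you would pass to a find call, and sorting on the split path keeps each parent directly ahead of its children, which is all a simple sitemap needs.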

You will need to know how you want to access your data.
The last time I used a tree structure I implemented this (I took inspiration from various sources) in Ruby. It stores an _id path and the complete URI (slugified page titles), and it is a pain to handle structures like this.
On the other hand, you can create a collection of root documents with embedded documents for the branches and leaves. This is simpler to handle, but you will have to fetch the whole tree when querying, and you can query the inner documents only if you know how deep they are.
From my past experience, all the work needed to support a tree structure is not worth the candle (unless it is a requirement); most users will create a loose structure based more on tags than on fixed categories.

Programmatic access to On-Line Encyclopedia of Integer Sequences

Is there a way to search and retrieve the results from On-Line Encyclopedia of Integer Sequences (http://oeis.org) programmatically?
I have searched their site and the results are always returned in HTML. They do not seem to provide an API, but in the policy statement they say it's acceptable to access the database programmatically. But how can I do it without screen scraping?
Thanks a lot for your help.

The OEIS now provides several points of access, not just ones using their internal format. These seem largely undocumented, so here are all of the endpoints that I have found:
https://oeis.org/search?fmt=json&q=<sequenceTerm>&start=<itemToStartAt>
Returns a JSON formatted response of the results found from the sequenceTerm given. If too many results were returned, count will be > 0 whilst results will be null. If no results were returned, count will be 0. itemToStartAt is used for pagination of results, as only a maximum of 10 are ever returned. This starts at 0. If you wanted to return a second page of results, this would equal 10. Information about what each of the entries means can be found here.
https://oeis.org/search?fmt=text&q=<sequenceTerm>&start=<itemToStartAt>
Exactly the same arguments as before; however, this returns the results in the OEIS internal format, which is largely written about here. Unless your project requires it, I'd highly recommend using the JSON format over this.
https://oeis.org/search?fmt=<json|text>&q=id:A<sequenceNumber>
Will return a single result if the sequenceNumber is found. This is the suggested method for obtaining single sequences, as it appears to be far more optimised than some of the alternative methods that can be used as queries. Requests often take under a second. Alternative search query methods can be found on this page.
https://oeis.org/A<sequenceNumber>/graph?png=1
This endpoint can be used to grab the images used to graph the data points. Alternatively, setting png to zero returns the HTML page containing the graph.
https://oeis.org/recent.txt
This returns a list of recently updated entries in the OEIS internal format. There are no parameters available, or JSON format, as this seems like a static text file that is simply being served to the client. Due to the length of replies from the OEIS database (for some sequences replies can take above five seconds), I'd highly recommend heavily caching requests and using the above endpoint to update them when they change.
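
The URL patterns listed above can be wrapped in a few small helpers. This is only a sketch that builds the URLs (it makes no requests); the function names are made up for illustration, and the six-digit zero padding of A-numbers follows the form seen in IDs such as A000032.

```python
from urllib.parse import urlencode

BASE = "https://oeis.org"


def search_url(terms, start=0, fmt="json"):
    """URL for a paginated search; start is the index of the first result."""
    return f"{BASE}/search?" + urlencode({"fmt": fmt, "q": terms, "start": start})


def sequence_url(number, fmt="json"):
    """URL for a single sequence looked up by its A-number."""
    return f"{BASE}/search?" + urlencode({"fmt": fmt, "q": f"id:A{number:06d}"})


def graph_url(number):
    """URL for the PNG graph of a sequence."""
    return f"{BASE}/A{number:06d}/graph?png=1"


second_page = search_url("2,5,14,50,233", start=10)
lucas = sequence_url(32, fmt="text")
```

Given the slow responses mentioned above, the fetched results would be a natural thing to cache, keyed by these URLs.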

A URL of the form http://oeis.org/search?fmt=text&q=2,5,14,50,233 gives a nicely formatted text output.
But it seems there is no way to get a single sequence in text form.

If you happen to use Mathematica, it sounds like the following notebook might help. It allows you to specify a sequence and automatically import a detailed list of matching entries from the OEIS:
http://www.brotherstechnology.com/math/oeis_mathematica.html

It looks like direct use of their CGI program is the only API they provide.
URL for Searching the Database

https://oeis.org/search?q=id:A000032&fmt=text
gives the plain text form of an entry in their internal format
https://oeis.org/eishelp1.html
