MusicBrainz API search provides different results from web page - lucene

I'm trying to work with MusicBrainz's API but I'm having some issues with the results of the search endpoint.
As an example, take a search for Who's Who? - SIZE020 - Klack (Mix Two)
Searching from their site leads to this page, with an almost correct first result (probably because the fully correct information is not in the database at all).
The API, however, returns different results, and that is causing problems.
I have tried several variations without success, even though I think I know enough of Lucene's syntax to write a working query for this service.
Take 1 - empty results with the query "Who's Who? - SIZE020 - Klack (Mix Two)"
Take 2 - completely wrong results with the query Who's+Who%3F+-+SIZE020+-+Klack+(Mix+Two) (same result with the unescaped ? character)
Take 3 - empty results with the query "Who's" AND "Who?" AND "SIZE020" AND "Klack" AND "Mix" AND "Two"
Now, I know that SIZE020 shouldn't be in the query, but I don't want to deal with file names on client side so I'm just pushing the query to their service hoping that everything will work. And it works, but only if I query the service through their website, making me think that my query syntax is wrong and leaving me clueless.
Do you have any hint as to why I get different results from the website and the XML API?
EDIT: as a side question, given an arbitrary file name, what's the best way to submit the query? I'm getting good results using the web version and submitting typical mp3 filenames (like artist_-_title_(version).mp3), but I'm not getting anything good from my client.

Searching via the web service always uses the "indexed search with advanced query syntax" search method; this can't be changed.
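Because the web service always parses the query as advanced (Lucene) syntax, characters such as ?, -, ( and ) in a raw file name are read as operators rather than as text. A minimal sketch of escaping them before building the request URL follows. The `recording` entity and the `fmt=json` parameter are assumptions for illustration, and escaping alone may not reproduce the website's ranking.

```python
import re
from urllib.parse import urlencode

# Characters the Lucene query syntax treats as special; each one is
# escaped with a preceding backslash so it is matched literally.
_LUCENE_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_lucene(text: str) -> str:
    """Backslash-escape Lucene special characters in free text."""
    return _LUCENE_SPECIAL.sub(r"\\\1", text)


def build_search_url(raw_title: str, entity: str = "recording") -> str:
    """Build a search URL; the entity and fmt=json are assumptions."""
    params = {"query": escape_lucene(raw_title), "fmt": "json"}
    return f"https://musicbrainz.org/ws/2/{entity}?{urlencode(params)}"


print(build_search_url("Who's Who? - SIZE020 - Klack (Mix Two)"))
```

Letting `urlencode` handle the percent-encoding avoids mixing URL escaping with Lucene escaping by hand.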

Related

How to get a specific result using the MarkLogic search API

I am new to MarkLogic and I am trying to get a specific kind of result from a search query.
More specifically, I want to search for a word through the search API and get back the documents that contain that word.
No header information, no rank or any other metadata; I just want the documents themselves as the result.
Is there a way to get the documents back with a single request?
Or do I need to write some code to get this result?
I'd appreciate any help.
Thanks
If you are accessing MarkLogic from outside, I'd have a look at a POST call to /v1/search with an Accept header of multipart/mixed. Details should be described here: https://docs.marklogic.com/REST/POST/v1/search
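As a rough sketch of that call, the snippet below only builds the request object and does not send it. The host, port, the combined-query JSON body with `qtext`, and the `pageLength` value are assumptions for illustration; authentication is left out, and the linked documentation remains the authority on the accepted parameters.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_search_request(base_url: str, word: str,
                         page_length: int = 10) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/search asking for documents."""
    # Assumed body shape: a combined query carrying a plain string query.
    body = json.dumps({"search": {"qtext": word}}).encode("utf-8")
    url = f"{base_url}/v1/search?{urlencode({'pageLength': page_length})}"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # multipart/mixed asks for the matching documents themselves.
            "Accept": "multipart/mixed",
        },
    )


# Hypothetical host and port; adjust to your REST instance.
req = build_search_request("http://localhost:8000", "example")
print(req.get_method(), req.full_url)
```

The response would be a multipart body with one part per matching document, so the client still needs a multipart parser to split it.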
If running inside MarkLogic, you could consider using the low-level cts:search, which indeed returns documents directly. Keep in mind though that it won't paginate results, and it is usually unwise to return more than about 50 to 100 documents at once. It would just hog memory, and not allow for parallel processing.
HTH!

Is it possible to use the Canonical Landscape API to get script output?

The documentation I can find for the Canonical Landscape API lets you do lots of things with scripts, but I can't find anything suggesting that you can get output. However, if you use the Canonical web interface, script output is available, so it's presumably exposed somehow...?
I just had this issue as well, and since this question is currently the first hit on Google, I wanted to share the answer for everyone. Suppose you run ExecuteScript on a Landscape client and get back an ID of 123, and the job has already finished. You then use that ID to call the GetActivities API with an input argument of "query" with value "parent-id:123". If there is a result, you will find the script output you are looking for under the result_text field of the response. It worked very well here.
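The two steps can be sketched as small helpers. The actual API call (and its request signing) is deliberately left out; the sketch assumes the GetActivities response has already been parsed into a list of dictionaries, and the sample data below is made up for illustration.

```python
from typing import Any


def child_activity_query(script_activity_id: int) -> str:
    """Query string for GetActivities that selects the script's child activities."""
    return f"parent-id:{script_activity_id}"


def script_outputs(activities: list[dict[str, Any]]) -> list[str]:
    """Collect result_text from activities that have produced output."""
    return [a["result_text"] for a in activities if a.get("result_text")]


# Hypothetical parsed response; real responses carry more fields.
sample = [
    {"id": 124, "result_text": "hello from the client\n"},
    {"id": 125, "result_text": None},  # e.g. a job that has not finished
]

print(child_activity_query(123))
print(script_outputs(sample))
```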

Formatting couchdb-lucene results with a couchdb list

Situation...
I have a simple couchapp that lists out emails that are stored in the couch database, these emails are queried with a simple view and then piped through a list to give me a pretty table that I can click on the emails to view them. That works great.
The next evolution of this app was to add full-text searching of the email subject lines with couchdb-lucene, and I think I have that nailed down as well, since I can search using Lucene and get valid results back. What I can't quite grasp is how to take those results and pipe them back into my existing list function so they get formatted correctly.
Here is an example of my view + list URL that gives me the HTML
http://localhost:5984/tenant103/_design/Email/_list/emaillist/by_type?startkey=["Email",2367264774866]&endkey=["Email",0]&limit=20&descending=true&include_docs=true
And here is my search URL that also gives me results
http://localhost:5984/_fti/local/tenant103/_design/Email/by_subject?q=OM-2875&include_docs=true
My thinking was I would build the URL like this
http://localhost:5984/_fti/local/tenant103/_design/Email/_list/emaillist/by_subject?q=OM-2875&include_docs=true
But that just returns
{
reason: "bad_request",
code: 400
}
This is a learning project for myself with CouchDB so I may not be getting some simple concepts here.
CouchDB-Lucene does not natively support list transformations and CouchDB can only apply list transformations to its own map/reduce views. Sorry about that!
Robert Newson.

Programmatic access to On-Line Encyclopedia of Integer Sequences

Is there a way to search and retrieve the results from On-Line Encyclopedia of Integer Sequences (http://oeis.org) programmatically?
I have searched their site and the results are always returned in HTML. They do not seem to provide an API, but in the policy statement they say it's acceptable to access the database programmatically. But how can I do that without screen scraping?
Thanks a lot for your help.
The OEIS now provides several points of access, not just ones using their internal format. These seem largely undocumented, so here are all of the endpoints that I have found:
https://oeis.org/search?fmt=json&q=<sequenceTerm>&start=<itemToStartAt>
Returns a JSON formatted response of the results found from the sequenceTerm given. If too many results were returned, count will be > 0 whilst results will be null. If no results were returned, count will be 0. itemToStartAt is used for pagination of results, as only a maximum of 10 are ever returned. This starts at 0. If you wanted to return a second page of results, this would equal 10. Information about what each of the entries means can be found here.
https://oeis.org/search?fmt=text&q=<sequenceTerm>&start=<itemToStartAt>
Exactly the same arguments as before, but this returns the results in the OEIS internal format, which is largely documented here. Unless your project requires it, I'd highly recommend using the JSON format over this.
https://oeis.org/search?fmt=<json|text>&q=id:A<sequenceNumber>
Will return a single result if the sequenceNumber is found. This is the suggested method for obtaining single sequences, as it appears to be far more optimised than some of the alternative methods that can be used as queries. Requests often take under a second. Alternative search query methods can be found on this page.
https://oeis.org/A<sequenceNumber>/graph?png=1
This endpoint can be used to grab the images used to graph the data points. Alternatively, setting png to zero returns the HTML page containing the graph.
https://oeis.org/recent.txt
This returns a list of recently updated entries in the OEIS internal format. There are no parameters available, or JSON format, as this seems like a static text file that is simply being served to the client. Due to the length of replies from the OEIS database (for some sequences replies can take above five seconds), I'd highly recommend heavily caching requests and using the above endpoint to update them when they change.
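The search endpoints above can be wrapped in a couple of URL builders. This is only a sketch that constructs the URLs described in this answer; the function names are made up, and no request is sent.

```python
from urllib.parse import urlencode

OEIS_SEARCH = "https://oeis.org/search"


def search_url(query: str, start: int = 0, fmt: str = "json") -> str:
    """URL for a search; start is the zero-based offset of the first result."""
    return f"{OEIS_SEARCH}?{urlencode({'fmt': fmt, 'q': query, 'start': start})}"


def sequence_url(number: int, fmt: str = "json") -> str:
    """URL for a single sequence, using the id:A<sequenceNumber> query form."""
    return search_url(f"id:A{number:06d}", fmt=fmt)


def page_starts(pages: int, page_size: int = 10) -> list[int]:
    """Offsets for successive pages, given at most 10 results per response."""
    return [i * page_size for i in range(pages)]


print(search_url("2,5,14,50,233"))
print(sequence_url(32, fmt="text"))
print(page_starts(3))
```

Since replies can be slow, the fetched responses would be a natural thing to cache, keyed by these URLs.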
A URL of the form http://oeis.org/search?fmt=text&q=2,5,14,50,233 gives a nicely formatted text output.
But it seems there is no way to get a single sequence in text form.
If you happen to use Mathematica, it sounds like the following notebook might help. It allows you to specify a sequence and automatically import a detailed list of matching entries from the OEIS:
http://www.brotherstechnology.com/math/oeis_mathematica.html
It looks like direct use of their CGI program is the only API they provide.
URL for Searching the Database
https://oeis.org/search?q=id:A000032&fmt=text
gives the plain text form of an entry in their internal format
https://oeis.org/eishelp1.html

YouTube API - Querying by publish date

I'm writing a webapp that uses the YouTube Code API to do specific types of searches. In this case, I'm trying to search for all videos that match a query, and which were uploaded between two dates. This document says I can use published-min and published-max parameters, while this one says I can use updated-min and updated-max.
Both of these parameter sets cause YouTube to return an error:
published-min returns "This service does not support the 'published-min' parameter"
updated-min returns "This service does not support the 'updated-max' parameter"
Using neither parameter returns a correct result set.
How can I limit my result set to hits within a specified date range?
The Reference Guide for YouTube's Data API doesn't list anything that would suggest the possibility to filter on time interval in general.
The published-min argument is only advertised in the "User activity feeds" section which is something different and probably not the thing you wanted. Or is it?
The updated-min argument in your link is referenced in a generic gdata context. It looks like they intended to describe all the things common to all the specialized APIs, but somehow updated-min isn't available everywhere.
As for your actual problem, I would suggest sorting on time (orderby=published) and doing the filtering on the client side. I know this is not optimal, but it is the only way I can see with what Google gives us.
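A minimal sketch of that client-side filtering, assuming the feed has already been parsed into dictionaries with a `published` timestamp in ISO 8601 form (the entry shape and sample data are hypothetical):

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    # Older Python versions reject "Z" in fromisoformat(), so normalise it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def published_between(entries, start: datetime, end: datetime):
    """Keep entries whose published time falls in the range [start, end)."""
    return [e for e in entries if start <= parse_ts(e["published"]) < end]


# Hypothetical entries, already sorted by publish time by the service.
entries = [
    {"title": "newest", "published": "2014-09-23T08:00:00Z"},
    {"title": "in range", "published": "2014-09-21T12:30:00Z"},
    {"title": "too old", "published": "2014-09-20T23:59:59Z"},
]
start = datetime(2014, 9, 21, tzinfo=timezone.utc)
end = datetime(2014, 9, 22, tzinfo=timezone.utc)

print([e["title"] for e in published_between(entries, start, end)])
```

Because the results arrive sorted by publish time, paging can stop as soon as an entry older than `start` appears.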
The YouTube Data API v3 supports publishedAfter and publishedBefore parameters for search results. For example:
https://www.googleapis.com/youtube/v3/search?key={{YOUKEY}}&channelId={{CHANNELID}}&part=snippet,id&order=date&maxResults=50&publishedAfter=2014-09-21T00:00:00Z&publishedBefore=2014-09-22T02:00:00Z
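A sketch of building such a URL from Python datetimes follows. It searches by query text rather than by channel, so the `q` and `type` values, the function names, and the placeholder key are assumptions for illustration; no request is sent.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as a UTC RFC 3339 timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def search_url(api_key: str, query: str, after: datetime, before: datetime,
               max_results: int = 50) -> str:
    """Build a v3 search URL restricted to a publish-date window."""
    params = {
        "key": api_key,
        "q": query,
        "part": "snippet",
        "type": "video",
        "order": "date",
        "maxResults": max_results,
        "publishedAfter": rfc3339(after),
        "publishedBefore": rfc3339(before),
    }
    return "https://www.googleapis.com/youtube/v3/search?" + urlencode(params)


after = datetime(2014, 9, 21, 0, 0, tzinfo=timezone.utc)
before = datetime(2014, 9, 22, 2, 0, tzinfo=timezone.utc)
print(search_url("YOUR_KEY", "klack", after, before))  # placeholder key
```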