Understanding RESTful. URIs for complex actions - api

I'm trying to build a RESTful service, and I've run into some problems. I'll describe these problems (questions) using an imaginary RESTful service as an example.
For example, I need a "News" service on my site. News items can be of different types: local news and global news. News items are added by an administrator. A user can view both local and global news (separately or together). News is shown in pages. A user can view an individual news item.
So I've built the following verb-noun table for this task:
GET /news - Get all news
POST /news - Create news
GET /news/{id} - Show the news with id={id}
PUT /news/{id} - Edit the news with id={id}
GET /news/{type}/{page}/{per_page} - Get news page #{page} of type {type}
GET /news/{page} - Get news page #{page} of both types
So, here are the problems:
1) How do I distinguish {page} from {id}? Maybe {id} can only be a number, while {page} is a string starting with 'p' (for example 'p1')?
2) The user can change the value of "per_page", i.e. how many news items are shown on a page. Isn't /news/{type}/{page}/{per_page} too complicated? How can it be simplified?
3) What should the URLs in the browser look like for this service? Will they differ from the URIs in the table above?
For example:
/news - Viewing news (1st page with default 'per_page' and default 'type')
/news/{type} - Viewing news (1st page with default 'per_page' and type={type})
/news/{id} - Viewing the news item with id={id}
/news/{type}/{page}/{per_page} - Viewing a specific page of news of a specific type.
4) Additional functionality, for example filtered search (getting news by date, author or title).
How do I implement this with REST? How should the filter object (XML or JSON) be transmitted? What should the URL of the filter results page look like? /news/{date:12.12.2012,author:'admin'} or something better?
Sorry for my rough English. If you see any grammar or other mistakes, feel free to correct them.
Thanks in advance.

I'd say you should use regular params for the type, page and per_page. Type, Page and Per_Page do not represent unique Resources, but are rather filters to the collection of News Resources. So I'd do
/news
/news/{id}
/news?type={type}&page={page}&per_page={per_page}
Same for additional filtering.
Make sure to check out http://www.ics.uci.edu/~fielding/pubs/dissertation/evaluation.htm#sec_6_2
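To illustrate the query-string approach, here is a minimal sketch of reading those filters on the server. The helper name parseNewsQuery and the defaults (all types, page 1, 10 items per page) are assumptions made for the example, not part of the question.

```javascript
// Hypothetical helper: read the collection filters from a request URL.
// The defaults ("all", page 1, 10 per page) are assumptions for this sketch.
function parseNewsQuery(requestUrl) {
  // Request URLs are usually relative, so a placeholder base is supplied.
  const { searchParams } = new URL(requestUrl, "http://localhost");

  // Fall back to the default when the value is missing or not a positive integer.
  const toPositiveInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isInteger(n) && n > 0 ? n : fallback;
  };

  return {
    type: searchParams.get("type") ?? "all",
    page: toPositiveInt(searchParams.get("page"), 1),
    perPage: toPositiveInt(searchParams.get("per_page"), 10),
  };
}
```

With this shape, /news and /news?type=local&page=3&per_page=25 both address the same collection resource, and /news/{id} stays unambiguous because no filter ever appears in the path.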

As Gordon wrote, you can use request params as normal. Remember that REST doesn't mean only clean and nice URLs.
So leave the id and type parameters in the URI, but pass the pagination params in the query string.
Also, to distinguish different URI parts, you could use the pattern from Google's GData, i.e. each param is preceded by its name:
/news
/news/id/{id}
/news/type/{type}
With some parsing on the server side, you could add many parameters, support optional parameters, and not enforce an exact ordering.
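The server-side parsing could look roughly like this. It is only a sketch: the function name parseNamedSegments is hypothetical, and it assumes the first path segment names the resource and all following segments come in name/value pairs.

```javascript
// Hypothetical parser for name/value path segments, e.g. /news/type/local/id/42.
// Assumes the first segment is the resource and the rest are name/value pairs.
function parseNamedSegments(path) {
  const [resource, ...rest] = path.split("/").filter(Boolean);
  const params = {};
  // Walk the remaining segments two at a time; a trailing unpaired name is ignored.
  for (let i = 0; i + 1 < rest.length; i += 2) {
    params[rest[i]] = decodeURIComponent(rest[i + 1]);
  }
  return { resource, params };
}
```

Because every value is labelled by the segment before it, /news/type/local/id/42 and /news/id/42/type/local parse to the same parameters.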

Related

how to get table info and summary of page using Wikipedia api?

I want to get minimal information about a Wikipedia page using the MediaWiki API, like DuckDuckGo does. For example for Steve Carell: https://duckduckgo.com/?q=steve+carell&t=hp&ia=news&iax=about
How can I get this information in HTML format, given a Wikipedia URL (e.g. https://en.wikipedia.org/wiki/Steve_Carell)?
You can use the MediaWiki API for that. There's an extension, TextExtracts, which is exactly for that (and it is installed on Wikipedia).
In your case, e.g.:
https://en.wikipedia.org/w/api.php?action=query&prop=extracts&exsentences=1&titles=Steve%20Carell
will return something like:
<p class=\"mw-empty-elt\">\n</p>\n\n<p class=\"mw-empty-elt\">\n \n</p>\n<p><b>Steven John Carell</b> (<span></span>; born August 16, 1962) is an American actor, comedian, producer, writer and director.</p>
You can also customize how many sentences (or characters) the API returns; please consult the API documentation for that.
There's also a way to retrieve the short description, which is saved at Wikidata (and visible in the mobile view of Wikipedia). This call would be:
https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&titles=Steve_Carell
This returns the following property in the pageprops of the page:
"wikibase-shortdesc": "American actor"
This may fit better depending on your use case.
You can even get both of the results with a single, combined, request:
https://en.wikipedia.org/w/api.php?action=query&prop=extracts|pageprops&exsentences=1&titles=Steve_Carell
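As a rough sketch, the combined request can be built and its response read like this. The function names are hypothetical, and format=json is an addition that the URLs above do not include; the response shape assumed here is the default JSON layout, with pages keyed by page ID under query.pages.

```javascript
// Hypothetical helper: build the combined extracts + pageprops request.
// format=json is added here so the response can be parsed by a program.
function buildSummaryUrl(title) {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    prop: "extracts|pageprops",
    exsentences: "1",
    titles: title,
  });
  return `https://en.wikipedia.org/w/api.php?${params}`;
}

// Hypothetical helper: pull the extract and short description out of a response.
// Assumes the default JSON layout, where query.pages is keyed by page ID.
function readSummary(json) {
  const [page] = Object.values(json?.query?.pages ?? {});
  return {
    extract: page?.extract ?? null,
    shortDescription: page?.pageprops?.["wikibase-shortdesc"] ?? null,
  };
}
```

Either field comes back as null when the page does not provide it, so the caller can fall back from the short description to the extract or the other way round.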

How to get all Wikipedia page links with their pageIDs?

A request like this:
https://en.wikipedia.org/w/api.php?action=query&format=json&titles=Title&prop=links&pllimit=500
provides me with a list of links (that the page contains), where every link consists of the title and the ns (namespace).
Is there a way to also get the page ID together with title and ns? (The less work it is for the server, the better, of course.)
You need to use the generator parameter. Here is an example for the Cobra Wikipedia page:
https://en.wikipedia.org/w/api.php?action=query&generator=links&titles=Cobra&prop=info&gpllimit=500
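A minimal sketch of using that generator request from code follows. The function names are hypothetical, format=json is added for machine-readable output, and the response is assumed to use the default JSON layout (query.pages keyed by page ID). Linked pages that do not exist are assumed to come back without a pageid.

```javascript
// Hypothetical helper: build the generator=links request for one page.
function buildLinksUrl(title) {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    generator: "links",
    titles: title,
    prop: "info",
    gpllimit: "500",
  });
  return `https://en.wikipedia.org/w/api.php?${params}`;
}

// Hypothetical helper: reduce the response to pageid, ns and title per link.
// pageid is null for entries that carry none (assumed: links to missing pages).
function linksFromResponse(json) {
  const pages = Object.values(json?.query?.pages ?? {});
  return pages.map(({ pageid, ns, title }) => ({ pageid: pageid ?? null, ns, title }));
}
```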

How to get with Mediawiki API all images in a category which are not in another one?

I am entirely new to APIs, so sorry if the question is silly.
I would like to get all images in a category on Commons, let's say X, but exclude those which are also in another one (Y). I do not understand whether I can actually do this.
https://commons.wikimedia.org/w/api.php?action=query&list=categorymembers&cmtype=file&cmtitle=Category:X
will get all of them, but how do I exclude some?
Moreover, I would like the result to include the description of each image, not just the file name. Is that possible?
MediaWiki has, by default, no built-in support for building and querying category intersections. To accomplish this task, extensions, external tools, or multiple API queries with result processing are required.
CirrusSearch API
On Wikimedia Commons, as on the whole Wikimedia wiki farm, CirrusSearch powers filtered search, including search for category intersections, and it is also available through the API (action=query&list=search&srsearch=incategory:A+-incategory:B, which is Category:A minus Category:B).
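A sketch of building that search request in code might look like the following. The function name is hypothetical. Quoting the category names (so that names with spaces work), restricting the search to the File namespace with srnamespace=6, and the srlimit value are all assumptions added for the example.

```javascript
// Hypothetical helper: search for files in one category but not in another.
// Category names are passed without the "Category:" prefix (an assumption).
function buildCategoryDifferenceUrl(include, exclude) {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    list: "search",
    // Quotes keep multi-word category names together (assumed search syntax).
    srsearch: `incategory:"${include}" -incategory:"${exclude}"`,
    srnamespace: "6", // 6 is the File namespace
    srlimit: "50",    // arbitrary page size for this sketch
  });
  return `https://commons.wikimedia.org/w/api.php?${params}`;
}
```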
FastCCI
One of the tools I can recommend (because it's a dedicated high-performance solution and is actually running) is fastcci, developed by Daniel Schwen. For Wikimedia Commons specifically, a database is already maintained and a web service is running, but it can be set up for any wiki, provided the tool has a host to run on and database access.
Query
Consider the following query URL:
https://fastcci.wmflabs.org/?c1=3302993&c2=15516712&d1=0&d2=0&s=200&a=not&t=js
https://fastcci.wmflabs.org/ - Host Wikimedia Commons fastcci runs on
c1 - ID of category 1
c2 - ID of category 2
d1 - depth of category 1 to search in (fastcci by default considers sub-categories)
d2 - depth of category 2 to search in (fastcci by default considers sub-categories)
s - Number of results to return
o - Offset
a - conjunction
t - connection type (t=js for a JSONP response; otherwise assumes being used as websocket)
Response
fastcciCallback( [ 'RESULT 27572680,0,0|1675043,0,0|27577015,0,0|27577043,0,0|27577106,0,0|27576896,0,0|27576790,0,0|23481936,0,0|17560964,0,0|11009066,0,0', 'OUTOF 10', 'DBAGE 378310', 'DONE'] );
RESULT is followed by a |-separated list of up to 50 integer triplets of the form pageId,depth,tag. Each triplet stands for one image or category.
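Parsing the RESULT line described above could be sketched as follows. The function name parseFastcciResult is hypothetical, and the input is assumed to be the array of strings that is passed to fastcciCallback.

```javascript
// Hypothetical helper: turn the RESULT line into { pageId, depth, tag } objects.
// `lines` is assumed to be the array of strings passed to fastcciCallback.
function parseFastcciResult(lines) {
  const resultLine = lines.find((line) => line.startsWith("RESULT "));
  if (!resultLine) return [];
  return resultLine
    .slice("RESULT ".length)
    .split("|")
    .filter(Boolean) // skip empty pieces, e.g. from an empty result list
    .map((triplet) => {
      const [pageId, depth, tag] = triplet.split(",").map(Number);
      return { pageId, depth, tag };
    });
}
```

The resulting page IDs can then be joined with | and passed to the pageids query to look up the titles.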
Resources
Sample client side implementation - to see it in action, visit any category page and look next to the Good pictures button.
Example is FilesOf('Category:Saaleck') - FilesOf('Category:Rapeseed fields in Saxony-Anhalt')
Server application
Presentation on YouTube
Slides
A note on pageIDs
page IDs → page titles: GET /w/api.php?action=query&pageids=page_IDs_separated_by_pipe
page titles → page IDs: GET /w/api.php?action=query&titles=Titles_separated_by_pipe
AFAIK, there is no way to get that directly using the API. But, assuming both categories are reasonably small, you could get all images from both of them and then compute the complement in your code.
To retrieve the description, you can use prop=imageinfo&iiprop=extmetadata&iiextmetadatafilter=ImageDescription.
In the context of your example query, it would look like this:
https://commons.wikimedia.org/w/api.php?action=query&generator=categorymembers&gcmtype=file&gcmtitle=Category:X&prop=imageinfo&iiprop=extmetadata&iiextmetadatafilter=ImageDescription
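Computing the complement in your own code could be sketched like this. Both function names are hypothetical. The inputs are assumed to be arrays of page objects with a title field, as the category queries return them, and the description path assumes the JSON layout of prop=imageinfo with iiprop=extmetadata.

```javascript
// Hypothetical helper: keep the members of X whose title does not appear in Y.
// Both arguments are assumed to be arrays of objects with a `title` field.
function membersOnlyIn(membersX, membersY) {
  const excluded = new Set(membersY.map((member) => member.title));
  return membersX.filter((member) => !excluded.has(member.title));
}

// Hypothetical helper: read the description from a page returned with
// prop=imageinfo&iiprop=extmetadata (assumed response layout).
function descriptionOf(page) {
  return page?.imageinfo?.[0]?.extmetadata?.ImageDescription?.value ?? null;
}
```

Using a Set keeps the comparison linear in the size of the two categories, which matters once continuation has produced a few thousand members.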

REST API: How to search for other attribute

I use node.js for a REST API.
The following actions are available:
/contacts, GET, finds all contacts
/contacts, POST, creates a new contact
/contacts/:id, GET, shows or gets a specific contact by its id
/contacts/:id, PUT, updates a specific contact
/contacts/:id, DELETE, removes a specific contact
What would be a logical route for searching or querying for a user?
Should I put this in the third route, or should I create an extra route?
I'm sure you will get a lot of different opinions on this question. Personally I would see "searching" as filtering on "all contacts" giving:
GET /contacts?filter=your_filter_statement
You probably already have filtering-parameters on GET /contacts to allow pagination that works well with the filter-statement.
EDIT:
Use this for parsing your querystring:
var url = require('url');
and in your handler ('request' being your nodejs http-request object):
var parsedUrl = url.parse(request.url, true);
var filterStatement = parsedUrl.query.filter;
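Applying the filter to the collection could then be sketched as below. This is only an illustration: it uses the WHATWG URL class rather than url.parse, the function name is hypothetical, and it assumes the filter is a plain substring matched case-insensitively against a hypothetical name field on each contact.

```javascript
// Hypothetical helper: apply ?filter=... to an in-memory list of contacts.
// Assumes each contact has a `name` string; the matching rule is an assumption.
function filterContacts(contacts, requestUrl) {
  // Request URLs are usually relative, so a placeholder base is supplied.
  const { searchParams } = new URL(requestUrl, "http://localhost");
  const filter = searchParams.get("filter");
  if (!filter) return contacts; // no filter: GET /contacts returns everything
  const needle = filter.toLowerCase();
  return contacts.filter((contact) => contact.name.toLowerCase().includes(needle));
}
```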
Interesting question. This is a discussion that I have had several times.
I don't think there is a clear answer, or maybe there is and I just don't know it or don't agree with it. I would say that you should add a new route, /contacts/_search, which performs an action on the contacts list, in this case a search. That makes it clear and well-defined what the request does.
GET /contacts finds all contacts. You want a subset of all contacts. What delimiter in a URI represents subsets? It's not "?"; that's non-hierarchical. The "/" character is used to delimit hierarchical path segments. So for a subset of contacts, try a URI like /contacts/like/dave/ or /contacts/by_name/susan/.
This concept of subsetting data by path segments is for more than collections--it applies more broadly. Your whole site is a set, and each top-level path segment defines a subset of it: http://yoursite.example/contacts is a subset of http://yoursite.example/. It also applies more narrowly: /contacts/:id is a subset of /contacts, and /contacts/:id/firstname is a subset of /contacts/:id.
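A small sketch of how such path-based subsets might be routed follows. The function name and the returned shape are hypothetical, and only the by_name subset is handled. Note that in this scheme a subset keyword such as by_name must never collide with a real id.

```javascript
// Hypothetical route matcher for /contacts, /contacts/:id and /contacts/by_name/:name.
// Returns null for paths it does not recognise.
function matchContactsRoute(path) {
  const segments = path.split("/").filter(Boolean).map(decodeURIComponent);
  if (segments[0] !== "contacts") return null;
  if (segments.length === 1) return { kind: "all" };
  if (segments.length === 2) return { kind: "byId", id: segments[1] };
  if (segments.length === 3 && segments[1] === "by_name") {
    return { kind: "byName", name: segments[2] };
  }
  return null;
}
```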

How to get the result of "all pages with prefix" using Wikipedia api?

I wish to use the Wikipedia API to extract the results of this page:
http://en.wikipedia.org/wiki/Special:PrefixIndex
When searching "something" on it, for example this:
http://en.wikipedia.org/w/index.php?title=Special%3APrefixIndex&prefix=tal&namespace=4
Then, I would like to access each of the resulting pages and extract their information.
What API call might I use?
You can use list=allpages and specify apprefix. For example:
http://en.wikipedia.org/w/api.php?format=xml&action=query&list=allpages&apprefix=tal&aplimit=max
This query will give you the id and title of each article that starts with tal. If you want to get more information about each page, you can use this list as a generator:
http://en.wikipedia.org/w/api.php?format=xml&action=query&generator=allpages&gapprefix=tal&gaplimit=max&prop=info
You can give different values to the prop parameter to get different information about the page.
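A sketch of building the generator request in code is shown here. The function name is hypothetical. It uses format=json instead of format=xml, and it adds gapnamespace so that the namespace from the question (4) can be passed; the example URLs above leave the namespace at its default.

```javascript
// Hypothetical helper: list pages by title prefix, with page info attached.
// gapnamespace is an addition; the default of 0 is the main article namespace.
function buildPrefixUrl(prefix, namespace = 0) {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    generator: "allpages",
    gapprefix: prefix,
    gapnamespace: String(namespace),
    gaplimit: "max",
    prop: "info",
  });
  return `https://en.wikipedia.org/w/api.php?${params}`;
}
```

For the search in the question, buildPrefixUrl("tal", 4) would target the same namespace as the Special:PrefixIndex link.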