How to get extracts for all returned pages by Wikipedia API? - wikipedia-api

I am using the API sandbox to work out how to use the extracts property to retrieve the first sentence of every page (limit 10) matching the title Amsterdam. As the sandbox query below shows, though, the extracts property only works on the first retrieved page.
How can I get extracts for all the returned pages?
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=info|extracts&generator=search&exsentences=1&exintro=1&gsrsearch=Amsterdam&gsrnamespace=0&gsrlimit=10

You need to set exlimit for prop=extracts. Its default is 1, which is why only the first page gets an extract; the maximum value is 20:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=info|extracts&generator=search&exsentences=1&exintro=1&gsrsearch=Amsterdam&gsrnamespace=0&gsrlimit=10&exlimit=10
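
For anyone calling this from code rather than the sandbox, here is a minimal Python sketch of the same query. The helper names build_extracts_query and extracts_by_title are made up for illustration, and the response shape assumed is the default format=json layout, where query.pages is keyed by page ID.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_extracts_query(search_term, limit=10):
    """Return a query URL asking for a one-sentence intro extract of every search hit."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "info|extracts",
        "generator": "search",
        "gsrsearch": search_term,
        "gsrnamespace": 0,
        "gsrlimit": limit,
        "exsentences": 1,
        "exintro": 1,
        # Without exlimit only one page gets an extract; the answer gives 20 as the cap.
        "exlimit": min(limit, 20),
    }
    return API_URL + "?" + urlencode(params)


def extracts_by_title(response):
    """Map page title -> extract (or None) for every page in a decoded JSON response."""
    pages = response.get("query", {}).get("pages", {})
    return {page["title"]: page.get("extract") for page in pages.values()}
```

Pages that come back without an extract map to None, so it is easy to spot when exlimit was too low for the number of search hits.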

Related

Section content using MediaWiki API

I'm using the MediaWiki API to get the content of a Wikipedia page like this in JSON.
http://en.wikipedia.org/w/api.php?format=json&action=query&titles=New_York&prop=extracts
I'd like each section to be separated out instead of having the entire content of the page as one value. I know you can get each section like this but I want it to also include the content with each section.
http://en.wikipedia.org/w/api.php?format=json&action=parse&prop=sections&page=New_York
Is this possible to do with the API?

If you know the number of the section you want, you can get its contents through action=parse with the section parameter. For example, the "19th century" section of the New_York article would be:
https://en.wikipedia.org/w/api.php?action=parse&page=New_York&format=json&prop=wikitext&section=4
To get the section number you can use
http://en.wikipedia.org/w/api.php?format=json&action=parse&prop=sections&page=New_York
and then find the index of the entry whose line field matches your section title. In this case: "line":"19th century","index":"4".
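
The two-step lookup could be sketched in Python roughly like this. find_section_index and section_wikitext_url are hypothetical helper names, and the code assumes the prop=sections response has the shape parse.sections, a list of objects with line and index fields as quoted above.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def find_section_index(sections_response, heading):
    """Return the "index" of the section whose "line" equals heading, or None."""
    for section in sections_response.get("parse", {}).get("sections", []):
        if section.get("line") == heading:
            return section.get("index")
    return None


def section_wikitext_url(page, index):
    """Build the action=parse URL that returns the wikitext of one section."""
    params = {
        "action": "parse",
        "page": page,
        "format": "json",
        "prop": "wikitext",
        "section": index,
    }
    return API_URL + "?" + urlencode(params)
```

To get every section with its content, you would loop over all entries in the sections list and request each index in turn. That means one request per section.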

Constrain Wikipedia Search API to generate only NS:0 pages

I am calling the Wikipedia API from Java using the following search query to generate redirected pages:
https://en.wikipedia.org//w/api.php?format=json&action=query&generator=allpages&gapfilterredir=redirects&prop=links&continue=&gapfrom=D
where the final 'D' is just an example for the continue-from.
I am interested in only iterating over items in namespace:0. In fact, if I don't, the continue return value includes category pages, which break the next query iteration.
Thank you in advance.

The parameter you need from the Allpages API is
…&gapnamespace=0&…
Note, however, that 0 is already the default when you omit it.
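
If the loop over continue values is what breaks, it may help to carry the whole continue object from each response into the next request instead of rebuilding gapfrom by hand. A rough Python sketch (the question uses Java, but the idea carries over); allpages_redirect_params and next_params are hypothetical names, and the response is assumed to contain a top-level continue object whenever more results remain.

```python
def allpages_redirect_params(start_from, namespace=0):
    """Parameters for iterating redirect pages in one namespace, starting at a title."""
    return {
        "format": "json",
        "action": "query",
        "generator": "allpages",
        "gapfilterredir": "redirects",
        "gapnamespace": namespace,  # 0 is the default, spelled out here for clarity
        "prop": "links",
        "continue": "",
        "gapfrom": start_from,
    }


def next_params(params, response):
    """Return parameters for the following request, or None when nothing is left."""
    continuation = response.get("continue")
    if not continuation:
        return None
    merged = dict(params)        # leave the caller's dict untouched
    merged.update(continuation)  # copies gapcontinue / plcontinue / continue as returned
    return merged
```

The caller keeps requesting with the merged parameters until next_params returns None.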

Angellist api: How to get to second page of data?

I was looking at the AngelList API (https://angel.co/api) and noticed a section on pagination. It says entries are limited to a maximum of 50 per page (e.g. https://api.angel.co/1/users/135/roles has 2 pages' worth of data but returns only 1 page). The documentation mentions pagination but does not say how to get the 2nd page.
Any ideas?
Chetan

Add ?page=2 to the end of the request to get the second page.
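
If the request URL may already carry a query string, plain string concatenation gets fiddly. A small Python sketch of setting the page parameter safely; with_page is a hypothetical helper, not part of any AngelList client.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page(url, page):
    """Return url with its "page" query parameter set to the given page number."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous "page" value.
    query = [(key, value) for key, value in parse_qsl(parts.query) if key != "page"]
    query.append(("page", str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, with_page("https://api.angel.co/1/users/135/roles", 2) gives https://api.angel.co/1/users/135/roles?page=2.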

Google custom search REST number of results (num field)

I'm trying to figure out how to force Google Custom Search to give me back 20 results per page.
I tried sending the REST request below with my new Custom Search Engine configured as:
Standard edition: Free, ads are required on results pages.
https://www.googleapis.com/customsearch/v1?key=AIzaSyCgGuZie_Xo-hOECNXOTKp5Yk7deryqro8&cx=015864032944730029962:5ipe0q27hgy&q=test&alt=json&num=20
IT DOES NOT WORK!
but
https://www.googleapis.com/customsearch/v1?key=AIzaSyCgGuZie_Xo-hOECNXOTKp5Yk7deryqro8&cx=015864032944730029962:5ipe0q27hgy&q=test&alt=json&num=10
IT WORKS!
But reading documentation at
https://developers.google.com/custom-search/docs/xml_results#numsp
it says that:
Optional. The num parameter identifies the number of search results to return.
The default num value is 10, and the maximum value is 20. If you request more than 20 results, only 20 results will be returned.
Note: If the total number of search results is less than the requested number of results, all available search results will be returned.
Has anyone experienced this problem?
PS: I also tried sending that REST request with my new Custom Search Engine configured as:
Site Search: Starts at $100 per year, ads are optional on results pages.
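But nothing changed: there is still no way to obtain 20 results in a single request/page.

The page you quoted documents the XML results format. The REST API you are calling is documented at the URL below, which describes each parameter. It says num is restricted to integers between 1 and 10, inclusive.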
https://developers.google.com/custom-search/v1/using_rest#query-params
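
Since num tops out at 10, getting 20 results means more than one request. A rough Python sketch of how the requests could be split, assuming the JSON API's start parameter (the 1-based index of the first result to return, which the answer above does not mention). paged_search_params is a hypothetical helper, and key and cx still have to be added to each dict before sending.

```python
def paged_search_params(query, total, page_size=10):
    """Split a request for `total` results into per-request parameter dicts."""
    if not 1 <= page_size <= 10:
        raise ValueError("num must be between 1 and 10, inclusive")
    requests = []
    start = 1  # result indexes are 1-based
    while start <= total:
        num = min(page_size, total - start + 1)
        requests.append({"q": query, "num": num, "start": start})
        start += num
    return requests
```

For 20 results this yields two parameter sets, one with start=1 and one with start=11, each with num=10.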

How to get the result of "all pages with prefix" using Wikipedia api?

I want to use the Wikipedia API to retrieve the results of this page:
http://en.wikipedia.org/wiki/Special:PrefixIndex
When I search for a prefix on it, for example:
http://en.wikipedia.org/w/index.php?title=Special%3APrefixIndex&prefix=tal&namespace=4
Then, I would like to access each of the resulting pages and extract their information.
What API call might I use?

You can use list=allpages and specify apprefix. For example:
http://en.wikipedia.org/w/api.php?format=xml&action=query&list=allpages&apprefix=tal&aplimit=max
This query will give you the id and title of each article that starts with tal. If you want to get more information about each page, you can use this list as a generator:
http://en.wikipedia.org/w/api.php?format=xml&action=query&generator=allpages&gapprefix=tal&gaplimit=max&prop=info
You can give different values to the prop parameter to get different information about the page.
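
A minimal Python sketch of the generator form, in case it helps. prefix_query_url and page_titles are hypothetical names. It asks for format=json rather than xml, and it passes gapnamespace explicitly because the PrefixIndex example in the question used namespace=4, while the queries above search the default namespace 0.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def prefix_query_url(prefix, namespace=0, prop="info"):
    """Build a generator=allpages query for every page whose title starts with prefix."""
    params = {
        "format": "json",
        "action": "query",
        "generator": "allpages",
        "gapprefix": prefix,
        "gapnamespace": namespace,
        "gaplimit": "max",
        "prop": prop,
    }
    return API_URL + "?" + urlencode(params)


def page_titles(response):
    """Return the sorted titles of all pages in a decoded JSON response."""
    pages = response.get("query", {}).get("pages", {})
    return sorted(page["title"] for page in pages.values())
```

Calling prefix_query_url("tal", namespace=4) should correspond to the PrefixIndex search from the question.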