How can I get the page ID and Wikidata ID of a title, along with multiple languages, in a single API call?

I have been calling the Wikipedia API to retrieve a page ID and Wikidata item ID with the call below, and it works fine:
https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&ppprop=wikibase_item&redirects=1&format=xml&titles=Cat
But I need to retrieve the same information for other languages of my choice. For example, if I specify German and French in my call, it should look up their translations of the word Cat and retrieve their page info. There is a langlinks property in the Wikipedia API, but somehow it doesn't work with the query action together with pageprops.
So ideally, I want something like this:
https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&ppprop=wikibase_item&prop=langlinks&lllang=de&lllang=fr&titles=Cat
Any help would be appreciated.

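Using lllang twice will just result in the second value overwriting the first one; the same goes for repeating prop, so join its values with | instead. You'll have to omit the lllang parameter, and then you get all the links (the default is 10 per request; raise it with lllimit):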
https://en.wikipedia.org/w/api.php?action=query&prop=pageprops|langlinks&ppprop=wikibase_item&titles=Cat
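As a rough sketch of the approach above, the request can be built once and the wanted languages picked out on the client side. The helper names, the `lllimit=max` and `formatversion=2` settings, and the sample response are assumptions for illustration, not part of the original answer. Note that the language links carry only the title on the other wiki; the page ID there would presumably need a further request to that wiki.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build one query asking for the Wikidata item and all language links."""
    params = {
        "action": "query",
        "prop": "pageprops|langlinks",  # several props are joined with "|"
        "ppprop": "wikibase_item",
        "lllimit": "max",               # assumption: raise the default of 10 links
        "redirects": 1,
        "format": "json",
        "formatversion": 2,             # assumption: pages come back as a list
        "titles": title,
    }
    return API_URL + "?" + urlencode(params)


def pick_languages(response, wanted):
    """Keep only the language links whose code is in `wanted`."""
    results = []
    for page in response["query"]["pages"]:
        links = {
            link["lang"]: link["title"]
            for link in page.get("langlinks", [])
            if link["lang"] in wanted
        }
        results.append({
            "pageid": page.get("pageid"),
            "wikibase_item": page.get("pageprops", {}).get("wikibase_item"),
            "langlinks": links,
        })
    return results


# Hand-written sample in the shape assumed above (values are illustrative).
sample = {
    "query": {
        "pages": [{
            "pageid": 6678,
            "title": "Cat",
            "pageprops": {"wikibase_item": "Q146"},
            "langlinks": [
                {"lang": "de", "title": "Hauskatze"},
                {"lang": "es", "title": "Felis silvestris catus"},
                {"lang": "fr", "title": "Chat"},
            ],
        }]
    }
}

print(build_query_url("Cat"))
print(pick_languages(sample, {"de", "fr"}))
```

If a page has more language links than one response holds, the response should contain a `continue` object whose values have to be sent back with the next request.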

Related

[Mendeley API]: How to search for partial terms

I am using the Mendeley API to retrieve documents in the profile of a user.
Specifically, I am using this API:
GET https://api.mendeley.com/search/documents?view=all&limit=25&title=ONTOLOGY
I would like to search for all the documents that match a partial term, i.e. instead of the full word "ONTOLOGY" I would like to get the same result if I do an HTTP call like
GET https://api.mendeley.com/search/documents?view=all&limit=25&title=ONTOLO
How can I achieve that?
Should I use a wildcard character?
I tried
ONTOLO*
ONTOLO$
ONTOLO?
with no luck.
I haven't found any documentation related to this feature.
Thanks!!

How to get table info and a summary of a page using the Wikipedia API?

I want to get minimal information about a Wikipedia page using the MediaWiki API, like DuckDuckGo does. For example, for Steve Carell: https://duckduckgo.com/?q=steve+carell&t=hp&ia=news&iax=about
How can I get this information in HTML format, given a Wikipedia URL (e.g. https://en.wikipedia.org/wiki/Steve_Carell)?
You can use the MediaWiki API for that. There's an extension, TextExtracts, which is exactly for that (and it is installed on Wikipedia).
In your case, e.g.:
https://en.wikipedia.org/w/api.php?action=query&prop=extracts&exsentences=1&titles=Steve%20Carell
will return something like:
<p class=\"mw-empty-elt\">\n</p>\n\n<p class=\"mw-empty-elt\">\n \n</p>\n<p><b>Steven John Carell</b> (<span></span>; born August 16, 1962) is an American actor, comedian, producer, writer and director.</p>
You can also customize how many sentences (exsentences) or characters (exchars) the API returns; please consult the API documentation for details.
There's also a way to retrieve the short description, which is stored at Wikidata (and visible in the mobile view of Wikipedia). This call would be:
https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&titles=Steve_Carell
This returns the following property in the pageprops of the page:
"wikibase-shortdesc": "American actor"
This may fit better depending on your use case.
You can even get both results with a single combined request:
https://en.wikipedia.org/w/api.php?action=query&prop=extracts|pageprops&exsentences=1&titles=Steve_Carell
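A minimal sketch of how the combined request might be built and its result unpacked. The function names, the `formatversion=2` response shape, and the step that drops the empty `mw-empty-elt` paragraphs are assumptions added for illustration; the sample response is hand-written from the output quoted above.

```python
import re
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"

# Matches the empty placeholder paragraphs seen in the extract above.
EMPTY_PARAGRAPH = re.compile(r'<p class="mw-empty-elt">\s*</p>\s*')


def build_summary_url(title, sentences=1):
    """Ask for the extract and the page props in one request."""
    params = {
        "action": "query",
        "prop": "extracts|pageprops",
        "exsentences": sentences,
        "format": "json",
        "formatversion": 2,  # assumption: pages come back as a list
        "titles": title,
    }
    return API_URL + "?" + urlencode(params)


def parse_summary(response):
    """Return the cleaned HTML extract and the short description, if any."""
    page = response["query"]["pages"][0]
    html = EMPTY_PARAGRAPH.sub("", page.get("extract", ""))
    return {
        "title": page["title"],
        "extract_html": html,
        "short_description": page.get("pageprops", {}).get("wikibase-shortdesc"),
    }


# Hand-written sample built from the output quoted in the answer.
sample = {
    "query": {
        "pages": [{
            "title": "Steve Carell",
            "extract": (
                '<p class="mw-empty-elt">\n</p>\n\n'
                '<p class="mw-empty-elt">\n \n</p>\n'
                "<p><b>Steven John Carell</b> (<span></span>; born August 16, 1962) "
                "is an American actor, comedian, producer, writer and director.</p>"
            ),
            "pageprops": {"wikibase-shortdesc": "American actor"},
        }]
    }
}

print(build_summary_url("Steve Carell"))
print(parse_summary(sample))
```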

How to get all Wikipedia page links with their pageIDs?

A request like this:
https://en.wikipedia.org/w/api.php?action=query&format=json&titles=Title&prop=links&pllimit=500
gives me a list of the links that the page contains, where every link consists of the title and the ns (namespace).
Is there a way to also get the page ID together with title and ns? (The less work it is for the server, the better, of course.)
You need to use the generator parameter. Here is an example for the Cobra Wikipedia page:
https://en.wikipedia.org/w/api.php?action=query&generator=links&titles=Cobra&prop=info&gpllimit=500
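The generator request above could be wrapped in a small loop that also follows continuation, sketched here under some assumptions: `formatversion=2` is used so that pages arrive as a list, the `fetch` callable is injected so the loop can be tried without network access, and the User-Agent string is a placeholder. Linked pages that do not exist come back without a page ID, so `None` is yielded for them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Perform one API request and decode the JSON body."""
    url = API_URL + "?" + urlencode(params)
    # Placeholder User-Agent; replace it with one describing your tool.
    request = Request(url, headers={"User-Agent": "links-example/0.1 (hypothetical)"})
    with urlopen(request) as response:
        return json.load(response)


def iter_linked_pages(title, fetch=fetch_json):
    """Yield (pageid, ns, title) for every page linked from `title`."""
    params = {
        "action": "query",
        "generator": "links",
        "titles": title,
        "prop": "info",
        "gpllimit": "max",
        "format": "json",
        "formatversion": 2,  # assumption: pages come back as a list
    }
    while True:
        data = fetch(params)
        for page in data.get("query", {}).get("pages", []):
            # Red links have no page ID, hence .get().
            yield page.get("pageid"), page["ns"], page["title"]
        if "continue" not in data:
            break
        # Send the continuation values back with the next request.
        params = {**params, **data["continue"]}


def fake_fetch(params):
    """Offline stand-in for fetch_json that returns two made-up batches."""
    if "gplcontinue" not in params:
        return {
            "continue": {"gplcontinue": "next", "continue": "gplcontinue||"},
            "query": {"pages": [{"pageid": 1, "ns": 0, "title": "A"}]},
        }
    return {"query": {"pages": [{"ns": 0, "title": "B", "missing": True}]}}


print(list(iter_linked_pages("Cobra", fetch=fake_fetch)))
```

Calling `iter_linked_pages("Cobra")` without the `fetch` argument would go to the live API instead.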

How can I query Wikidata API to get details of all the Korean films?

If possible, I want to return the results in JSON or XML format. Is there any way to do so? Earlier I did it using freebase.com, but it is now deprecated. Please help.
This query would look a lot like the one to get the list of all films on Wikidata but adding another filter:
Instead of http://wdq.wmflabs.org/api?q=claim[31:11424] (return all the entities marked as instances of film), you would do
http://wdq.wmflabs.org/api?q=CLAIM[31:11424] AND CLAIM[495:884] (return all the entities marked as instances of film with South Korea (Q884) as country of origin (P495))
http://wdq.wmflabs.org/api?q=CLAIM[31:11424] AND CLAIM[495:423] (the same for North Korea (Q423))
Then, to parse the results and get the entities' data, it would be the same as for the list of all the films.
Notes:
You will probably need to encode those URLs to get something that looks like: http://wdq.wmflabs.org/api?q=CLAIM%5B31%3A11424%5D%20AND%20CLAIM%5B495%3A884%5D
Here is the full API documentation. Notice that this is an experimental API, which might be replaced in the coming year.
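The encoding step from the notes can be sketched with the standard library. The `wdq_url` helper and its (property, item) argument format are made up for this illustration; only the endpoint and the CLAIM syntax come from the answer.

```python
from urllib.parse import quote

WDQ_API = "http://wdq.wmflabs.org/api"


def wdq_url(*claims):
    """Join (property, item) pairs with AND and percent-encode the query."""
    query = " AND ".join(f"CLAIM[{prop}:{item}]" for prop, item in claims)
    # safe="" makes quote() encode every reserved character, ":" included.
    return f"{WDQ_API}?q={quote(query, safe='')}"


# Instance of (P31) film (Q11424), country of origin (P495) South Korea (Q884).
print(wdq_url((31, 11424), (495, 884)))
# prints http://wdq.wmflabs.org/api?q=CLAIM%5B31%3A11424%5D%20AND%20CLAIM%5B495%3A884%5D
```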
The overview on Wikipedia may be more complete than Wikidata, as you've also noticed yourself. However, I could only find overviews per year, such as on https://en.wikipedia.org/wiki/List_of_South_Korean_films_of_2015.
To get a list of titles from that page, you would first retrieve the raw wikicode of the page: https://en.wikipedia.org/w/index.php?action=raw&title=List_of_South_Korean_films_of_2015, and then run a regular expression such as /\{lang\|[^\|]+\|([^\}]+)/g on the code.
This returns a list of 149 titles.
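The same regular expression can be tried in Python on a small piece of wikicode. The sample table rows below are invented to match the `{{lang|...|...}}` pattern the expression targets; they are not copied from the actual page, and the function name is hypothetical.

```python
import re

# Python version of the expression above: capture the text after the
# language code inside a {{lang|xx|...}} template.
LANG_TEMPLATE = re.compile(r"\{lang\|[^|]+\|([^}]+)")


def extract_titles(wikicode):
    """Return the captured text of every lang template, in page order."""
    return [match.strip() for match in LANG_TEMPLATE.findall(wikicode)]


# Invented sample rows in the assumed format.
sample_wikicode = """
| ''[[First Example Film]]'' || {{lang|ko|제목 하나}} || Director A
|-
| ''[[Second Example Film]]'' || {{lang|ko|제목 둘}} || Director B
"""

print(extract_titles(sample_wikicode))
```

On the real page, the input would be the text returned by the action=raw URL.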

How to get the result of "all pages with prefix" using the Wikipedia API?

I wish to use the Wikipedia API to extract the result of this page:
http://en.wikipedia.org/wiki/Special:PrefixIndex
When searching "something" on it, for example this:
http://en.wikipedia.org/w/index.php?title=Special%3APrefixIndex&prefix=tal&namespace=4
Then, I would like to access each of the resulting pages and extract their information.
Which API call might I use?
You can use list=allpages and specify apprefix. For example:
http://en.wikipedia.org/w/api.php?format=xml&action=query&list=allpages&apprefix=tal&aplimit=max
This query will give you the ID and title of each article whose title starts with tal. If you want to get more information about each page, you can use this list as a generator:
http://en.wikipedia.org/w/api.php?format=xml&action=query&generator=allpages&gapprefix=tal&gaplimit=max&prop=info
You can give different values to the prop parameter to get different information about the page.
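Both forms of the query could be built from one helper, sketched below. The function names and the JSON output format are choices made for this example; the namespace argument uses apnamespace (gapnamespace in the generator form), which is worth setting when the prefix search is not in the article namespace, as in the namespace=4 example from the question. The sample response is hand-written.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def prefix_query_url(prefix, namespace=0, prop=None):
    """Build a list=allpages query, or a generator query when `prop` is given."""
    if prop is None:
        params = {
            "action": "query",
            "list": "allpages",
            "apprefix": prefix,
            "apnamespace": namespace,
            "aplimit": "max",
        }
    else:
        # As a generator, every allpages parameter gets a "g" prefix.
        params = {
            "action": "query",
            "generator": "allpages",
            "gapprefix": prefix,
            "gapnamespace": namespace,
            "gaplimit": "max",
            "prop": prop,
        }
    params["format"] = "json"
    return API_URL + "?" + urlencode(params)


def parse_allpages(response):
    """Return (pageid, title) pairs from a list=allpages response."""
    return [(page["pageid"], page["title"]) for page in response["query"]["allpages"]]


# Hand-written sample in the shape of a list=allpages response.
sample = {
    "query": {
        "allpages": [
            {"pageid": 101, "ns": 0, "title": "Tal"},
            {"pageid": 102, "ns": 0, "title": "Talk"},
        ]
    }
}

print(prefix_query_url("tal"))
print(prefix_query_url("tal", namespace=4, prop="info"))
print(parse_allpages(sample))
```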