MediaWiki API for Wikipedia - is it possible to search by title in ALL languages?

I know that to look up the page ID of a Wikipedia article with a known title, I can do:
https://en.wikipedia.org/w/api.php?action=query&titles=7_Studios
However, in this case, 7_Studios is a French Wikipedia article, so the above link would not work. Instead I need to try
https://fr.wikipedia.org/w/api.php?action=query&titles=7_Studios
My question is: if I do not know which language the article is written in, only its title, how can I make sure I find it using the API?

As Bergi mentioned, you can use Wikidata for this: it contains the database of interwiki links. It's possible that some article titles won't be there, but most should be.
To do this, you can use the wbgetentities module: you specify the title to search for and a list of wikis to search. For example:
https://www.wikidata.org/w/api.php?action=wbgetentities&titles=7_Studios&sites=enwiki|frwiki|nlwiki|dewiki
You can specify up to 50 wikis in one query. Currently, there are around 300 Wikipedias, so if you really need to query all of them, you may need up to 6 requests for each title.
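The batching described above could be sketched roughly as follows. This is a minimal sketch that only builds the request URLs; it assumes you already have a list of site IDs (such as enwiki or frwiki) from somewhere, and the helper names chunked and build_lookup_urls are made up for illustration. The format=json parameter is added so the responses are easy to parse.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
MAX_SITES_PER_REQUEST = 50  # the per-query limit mentioned above


def chunked(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_lookup_urls(title, sites):
    """Return one wbgetentities URL per batch of up to 50 site IDs.

    `sites` is a caller-supplied list of site IDs (an assumption of this
    sketch); nothing here fetches the list of Wikipedias for you.
    """
    urls = []
    for batch in chunked(sites, MAX_SITES_PER_REQUEST):
        params = {
            "action": "wbgetentities",
            "titles": title,
            "sites": "|".join(batch),  # urlencode turns "|" into %7C
            "format": "json",
        }
        urls.append(f"{WIKIDATA_API}?{urlencode(params)}")
    return urls
```

Calling build_lookup_urls("7_Studios", ["enwiki", "frwiki", "nlwiki", "dewiki"]) yields a single URL equivalent to the example above; with around 300 site IDs it would yield about 6 URLs, which you would then request one after another.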

Related

Wikipedia data extraction

I am trying to populate some tables with Hindi Wikipedia data. I have to populate them with article titles, their categories and their corresponding English URLs.
Right now I am finding the category and English URL by parsing the HTML file and locating the particular div tag. This is taking a lot of time. Is there a more direct and efficient way to populate the categories?
I have downloaded Hindi Wikipedia from this link: ftp://wikipedia.c3sl.ufpr.br/wikipedia/hiwiki/20131201/
You could either use some sort of parsing engine like Wikiprep: http://www.cs.technion.ac.il/~gabr/resources/code/wikiprep/
Or you could use the MediaWiki engine to handle the Wiki markup language.
http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
There may be other options relevant to your case; you can also check here:
http://en.wikipedia.org/wiki/Wikipedia:Database_download#Help_importing_dumps_into_MySQL
(I've personally used options #1 and #2)

Use the Apple Search API to search by genre?

Is it possible to use the Apple Search API to search by genre? I'm thinking specifically of games in the App Store, using Obj-C.
As has been pointed out in the comments of this question...
Search Apple App store by genre with iOS/Obj-c
There seems to be a problem with trying to search by genre, so I'm looking for answers with examples of that actually working, not just links to the docs.
It's not actually documented on the Search API documentation, but you can add a genreId parameter to the search URL and it restricts the search to a particular genre.
If you look at the JSON returned from a search for "Yelp", there are 4 interesting things:
"genreIds":["6005", "6001"]
"genres":["Social Networking", "Weather"]
"primaryGenreName":"Social Networking"
"primaryGenreId":6005
Adding &genreId=6001 to a URL will find apps in the US in the "Weather" category. I'm using the search term "Check" in the URL.
https://itunes.apple.com/search?term=Check&country=us&entity=software&genreId=6001
Because it's not documented, you can't rely on it working forever. You may also be able to use the primaryGenreName as a parameter, but I didn't try that. You'll also have to figure out which numbers correspond to which categories.
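As a rough illustration (in Python rather than Obj-C, to keep it short), the URL above can be assembled from its parts, and the genre numbers can be collected from the results you get back. This is only a sketch: the function names are made up, genreId remains undocumented as noted, and pairing genreIds with genres by position is an assumption based on the "Yelp" output shown earlier.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://itunes.apple.com/search"


def build_genre_search_url(term, genre_id, country="us"):
    """Build a software search URL restricted by the undocumented genreId."""
    params = {
        "term": term,
        "country": country,
        "entity": "software",
        "genreId": genre_id,  # undocumented parameter, may stop working
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def genre_names_by_id(results):
    """Collect a genre ID -> genre name mapping from search results.

    `results` is a list of result dictionaries as decoded from the JSON.
    Assumes "genreIds" and "genres" line up by position, which matches
    the sample output above but is not guaranteed anywhere.
    """
    mapping = {}
    for app in results:
        ids = app.get("genreIds", [])
        names = app.get("genres", [])
        for genre_id, name in zip(ids, names):
            mapping[genre_id] = name
    return mapping
```

Running a few ordinary searches and feeding the decoded results into genre_names_by_id is one way to build up the table of numbers and categories without guessing.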
The Search API is documented here: http://www.apple.com/itunes/affiliates/resources/documentation/itunes-store-web-service-search-api.html
You can use this link to generate an RSS Feed of your liking. Without knowing too much about how you intend to use it, I would suggest looking at these two solutions and using the best one that suits your needs.

Soundcloud search only in title

Is it possible (like on youtube with intitle: parameter) to narrow the API search so that it only looks at the Title? I am looking for specific songs from local artists, and I often find DJ mixes that have the song title in the description.
So, are there ANY additional parameters that can be passed inside a q query? And is there any documentation on this?
Unfortunately not at this time. We're hoping to improve this in the future, so stay tuned to announcements on the Developer Blog.

Joomla 1.5 Article Meta Keywords and Description management

Thx for your time!
I am currently using sh404SEF, and it has a "Title and Metas" manager. This is pretty much what I need; the only problems are that if URLs are purged, so are the titles and metas, and that it has no place for keywords. Here is a screenshot of what it looks like: http://img412.imageshack.us/img412/5624/sh404titlemetasmanager.jpg
I am looking for an administrative component that will allow me to manage all the article keywords and descriptions in one place, for multiple articles at a time. The component needs to update the keywords and descriptions of the actual articles in the [#__content] table, and not be an overload plug-in. I looked through the extensions directory but did not find what I was looking for.
You could try looking at Scribe - http://scribeseo.com/
It's paid for but pretty good for SEO/meta etc.

Tool or methods for automatically creating contextual links within a large corpus of content?

Here's the basic scenario - I have a corpus of say 100,000 newspaper-like articles. Minimally they will all have a well-defined title, and some amount of body content.
What I want to do is find runs of text in articles that ought to link to other articles.
So, if article Foo has a run of text like "Students in 8th grade are being encouraged to read works by Jean-Paul Sartre" and article Bar is titled (and about) "The important works of Jean-Paul Sartre", I'd like to automagically create that HTML link from Foo to Bar within the text of Foo.
You should ask yourself something before adding the links: what benefit for users do you want to achieve by doing this? You probably want to increase the navigability of your site. Maybe it is better to create an easier way to add links to older articles in the form used to submit new ones. Maybe it is possible to add a "one click search for selected text" feature. Maybe you can add wiki-like functionality that lets users propose a link for selected text. You probably also want to add links to related articles (generated through a tagging system or text mining) below the articles.
Some potential problems with a fully automated link adder:
You may need to implement a good word sense disambiguation algorithm, because bad automatic links placed with regex (or simple substring matching) will confuse or even irritate the user.
As the number of articles is large, you do not want to generate the HTML for the extra links on every request; cache it instead.
You need to decide how to handle duplicate titles, and titles that contain another title as a substring (take the longest title, link to the most recent article, or prefer an article from the same category).
TLDR version: find alternative solutions that provide desired functionality to the users.
What you are looking for are text mining tools. You can find more info and links at http://en.wikipedia.org/wiki/Text_mining. You might also want to check out Lucene and its ports at http://lucene.apache.org. Using these tools, the basic idea would be to find a set of similar articles based on the article (or title) in question. You could search various properties of the article: titles, content, or both. A tagging system a la Delicious (or Stack Overflow) might also be helpful. Rather than pre-creating the links between articles, you'd present the relevant articles in an interface much like the Related questions interface on the right-hand side of this page.
If you wanted to find and link specific text in each article, I think you'd need to do some preprocessing to select pertinent phrases to key on. Even then I think it would be very hard to avoid missing things due to punctuation or misspellings, and equally hard to avoid adding irrelevant links for the same reasons.
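For a sense of what the simplest version of this looks like, here is a minimal sketch of plain title matching that prefers the longest title when one title contains another. It is exactly the kind of naive matching warned about above: it does no word sense disambiguation, no phrase selection and no caching, and it assumes titles start and end with word characters. The function name and the title-to-URL dictionary are made up for illustration.

```python
import html
import re


def link_titles(body, titles_to_urls):
    """Wrap occurrences of known article titles in `body` with HTML links.

    `titles_to_urls` maps article titles to their URLs (hypothetical input).
    Matching is case-insensitive and on whole words only.
    """
    if not titles_to_urls:
        return html.escape(body)
    # Longest titles first: the regex tries alternatives left to right,
    # so a title that contains another title wins at the same position.
    titles = sorted(titles_to_urls, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in titles) + r")\b",
        re.IGNORECASE,
    )
    lookup = {t.lower(): url for t, url in titles_to_urls.items()}
    parts = []
    last = 0
    for match in pattern.finditer(body):
        # Escape the plain text between matches, then emit the link.
        parts.append(html.escape(body[last:match.start()]))
        url = lookup[match.group(0).lower()]
        parts.append(
            f'<a href="{html.escape(url)}">{html.escape(match.group(0))}</a>'
        )
        last = match.end()
    parts.append(html.escape(body[last:]))
    return "".join(parts)
```

With 100,000 titles a single alternation like this gets large, so in practice you would probably run it once per article when it is saved and store the result, rather than on every request.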