I need to check whether a link pasted by a user actually points to a Wikipedia article about a movie. So far I can check that the link is a valid Wikipedia article, but how do I know that it is about a movie and not about something else?
This query returns a non-negative page ID if the article is a valid Wikipedia page:
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=Aliens_(film)
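For reference, here is a minimal Python sketch of the validity check described above. It only covers the "is this a real page" part, not the "is it a movie" part. The helper names (`build_query_url`, `valid_page_ids`, `fetch_json`) and the User-Agent string are my own placeholders, and it assumes the default JSON layout where `query.pages` is keyed by page ID and missing pages carry a negative ID and a `missing` flag.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build the same extract query as the URL above for a given title."""
    params = {
        "format": "json",
        "action": "query",
        "prop": "extracts",
        "exintro": 1,
        "explaintext": 1,
        "redirects": 1,
        "titles": title,
    }
    return API + "?" + urlencode(params)


def valid_page_ids(response):
    """Return the positive page IDs found in a decoded query response."""
    pages = response.get("query", {}).get("pages", {})
    return [
        int(page_id)
        for page_id, page in pages.items()
        if int(page_id) > 0 and "missing" not in page
    ]


def fetch_json(url):
    """Fetch and decode a URL. The User-Agent value is a placeholder."""
    request = Request(url, headers={"User-Agent": "example-link-checker/0.1"})
    with urlopen(request, timeout=10) as reply:
        return json.load(reply)
```

Something like `valid_page_ids(fetch_json(build_query_url("Aliens_(film)")))` should then give a non-empty list for an existing article and an empty list otherwise.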
I am trying to search by section with the Wikipedia API.
What I already know:
For the below:
https://en.wikipedia.org/w/api.php?format=xml&action=query&prop=revisions&titles=Game_of_Thrones_(season_1)&rvprop=content&rvsection=0
I know that rvsection=0 gives me section 0 of the Wikipedia page, and that I can change this value to get other sections of the page, e.g. 1, 2, 3.
What I am wondering is whether, and how, I can search by section name. E.g. the Wikipedia page linked above has a section named "Episodes"; how can I search for it so that I get all the content from that section?
If this is not possible, is there a workaround? What I want to do is get episode information from different Wikipedia pages.
I have done some more research into this and have found a partial solution.
If we want to get a certain section, we first need to query the list of sections with the API call below:
https://en.wikipedia.org/w/api.php?action=parse&format=json&page=(NAME_OF_WIKI_PAGE)&prop=sections&disabletoc=1
This gives us JSON with the name and index of each section.
Once we have section info, use the parse API to get the wikitext. If we want the HTML, we can change prop to text:
https://en.wikipedia.org/w/api.php?action=parse&format=json&page=(NAME_OF_WIKI)&prop=wikitext&section=(SECTION #)&disabletoc=1
As a result, we get the specific section we want formatted in JSON. The next step for me is sorting this and trying to get this HTML/wiki text into plain text.
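The two steps above (list the sections, then request one by its index) could be tied together roughly like this in Python. This is only a sketch: the function names are made up for illustration, and it assumes each entry in `parse.sections` has a `line` field with the heading text and an `index` field to pass as `section`. Headings containing markup may need extra cleanup before comparing.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def sections_url(page):
    """URL for step 1: list the sections of a page."""
    params = {"action": "parse", "format": "json", "page": page, "prop": "sections"}
    return API + "?" + urlencode(params)


def find_section_index(parse_response, name):
    """Return the index of the section whose heading matches name, or None."""
    wanted = name.strip().lower()
    for section in parse_response.get("parse", {}).get("sections", []):
        if section.get("line", "").strip().lower() == wanted:
            return section["index"]
    return None


def section_wikitext_url(page, index):
    """URL for step 2: wikitext of one section (use prop=text for HTML)."""
    params = {
        "action": "parse",
        "format": "json",
        "page": page,
        "prop": "wikitext",
        "section": index,
    }
    return API + "?" + urlencode(params)
```

Fetch `sections_url("Game_of_Thrones_(season_1)")` with any HTTP client, decode the JSON, pass it to `find_section_index(data, "Episodes")`, and then request `section_wikitext_url(...)` with the index you got back.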
This is my link for getting one random article using the Wikipedia API:
https://en.wikipedia.org/w/api.php?%20format=json&action=query&prop=extracts&exsentences=2&exintro=&explaintext=&generator=random&grnnamespace=0
I need it to return the first two sentences of the first section, and it works pretty well.
I want to use this kind of link but pick the random article from a specific category. This is what I have tried after searching online:
https://en.wikipedia.org/w/api.php?%20format=json&action=query&prop=extracts&exsentences=2&exintro=&explaintext=&generator=random&grnnamespace=0&cmtitle=Category:Music
(I have added this part to the original link: cmtitle=Category:Music)
It doesn't work for me.
It returns a random article just like the first link does, not one from the wanted category (Music in this link).
There is no API to get a random category member (and using a parameter from some unrelated API module is certainly not going to help). You could screen scrape Special:RandomInCategory (or turn it into an API module - patches welcome :)
Try using cmlimit to get all of the category members. Then use a programming language such as Python to make the request, store every category member in an array, and use the random module to pick a random member from that array. You can then use it in a link to get the specific page for that category member, or anything else that you need.
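A rough Python sketch of that approach, for what it's worth. The function names are hypothetical, and it assumes the `list=categorymembers` module with `cmtitle`, `cmlimit` and `cmcontinue`, plus a response shaped like `query.categorymembers` with a `title` per entry. Large categories come back in batches, so you would keep requesting with the `cmcontinue` value until none is returned.

```python
import random
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def category_members_url(category, cmcontinue=None):
    """URL for one batch of articles in a category such as 'Category:Music'."""
    params = {
        "action": "query",
        "format": "json",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": 0,  # articles only, like grnnamespace=0 in the question
        "cmlimit": "max",
    }
    if cmcontinue:
        params["cmcontinue"] = cmcontinue
    return API + "?" + urlencode(params)


def member_titles(response):
    """Collect the member titles from one decoded response."""
    members = response.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]


def next_continue(response):
    """Return the cmcontinue token for the next batch, or None when done."""
    return response.get("continue", {}).get("cmcontinue")


def pick_random_title(titles):
    """Pick one title at random, or None if the list is empty."""
    return random.choice(titles) if titles else None
```

The chosen title can then be passed as `titles=` to the extracts query from the question instead of using `generator=random`.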
If I search Google for a keyword such as "Sesame oil", it shows content from Wikipedia on the right-hand side. Those details are informative for users.
I wanted to know whether Wikipedia provides an API that I can use as well, so that when a user searches for a keyword, details from Wikipedia can be shown too.
You can use the Wikipedia search API to find the articles that are closest to the keyword. Once you have the title, there is a publicly available summary endpoint that gives you the title, a short text extract, the Wikidata description and an image for an article, which you can present to the user.
As for your question about whether it's legal - yep, it totally is.
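To make the search-then-summary flow a bit more concrete, here is a small Python sketch. The helper names are mine, and it assumes the `list=search` module for the search step and the REST `page/summary/{title}` endpoint for the summary step, with `title`, `description`, `extract` and `thumbnail.source` fields in the summary response. Treat the field handling as an assumption to verify against a real response.

```python
from urllib.parse import quote, urlencode

API = "https://en.wikipedia.org/w/api.php"
SUMMARY = "https://en.wikipedia.org/api/rest_v1/page/summary/"


def search_url(keyword, limit=1):
    """URL for step 1: full-text search for the keyword."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": keyword,
        "srlimit": limit,
    }
    return API + "?" + urlencode(params)


def top_title(search_response):
    """Title of the best search hit, or None if nothing matched."""
    hits = search_response.get("query", {}).get("search", [])
    return hits[0]["title"] if hits else None


def summary_url(title):
    """URL for step 2: summary of the article with this title."""
    return SUMMARY + quote(title.replace(" ", "_"), safe="")


def card_from_summary(summary):
    """Reduce a decoded summary response to the fields shown to the user."""
    return {
        "title": summary.get("title"),
        "description": summary.get("description"),
        "extract": summary.get("extract"),
        "image": summary.get("thumbnail", {}).get("source"),
    }
```

Fetch `search_url("Sesame oil")`, feed the decoded JSON to `top_title`, then fetch `summary_url(...)` for that title and pass the result to `card_from_summary`.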
I'm trying to find out whether it's possible to create a new post in a group via the LinkedIn API and have a link in the title of the post.
I can't find anything in the documentation here about what is or isn't allowed in the title:
https://developer.linkedin.com/documents/groups-api
Do you know if this is possible?
I have seen that it is possible.
An example is here: http://www.linkedin.com/groups/Now-new-website-FoodBarcelona-Restaurants-1446917.S.136723085?qid=90c4aa0a-fc9a-4934-b7e5-3d555133b8e5&trk=group_most_popular-0-b-ttl&goback=%2Egmp_1446917 (not sure if it's visible without being a member).
In the list of questions the title is the link to the details page, but on the details page there is a link shown at the end of the title.
I'm not sure, though, whether that is passed as HTML or whether LinkedIn automatically parses links and converts them to HTML.
Doesn't look like it; you can test by trying to create a post here:
http://simplelinkedin.fiftymission.net/demo/groups.php#groupCreatePost
When you add any HTML to the title, the API responds with a 400 error:
com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '\' (code 92) in start tag Expected a quote
at [row,col {unknown-source}]: [3,34]
Logically, I'm not sure how it would work: when you create a post via the API, it is posted to a group on LinkedIn.com, and every group post title is itself a link to the full post within the group.
The code for links posted within notes appears like this:
justiceclaus.com
Whereas links everywhere else are explicitly nofollow, like this:
http://www.justiceclause.com/
I believe that every link provides some value, even if it is very little. It's not going to pass the link juice that a link on the Facebook homepage would, but even nofollow and redirect links mean something.
Having some nofollow backlinks helps with ranking. I think that to Google it basically looks like this: if a certain percentage of your backlinks are nofollow, you aren't a spammer. Beyond that, a nofollow link doesn't give you any link juice and doesn't pass any value or keywords.
On the other hand, links from Facebook normally bring in traffic, i.e. potential customers. Therefore, there might be a different kind of value you're looking at here.