Wikibook API query - wikipedia-api

I am trying to understand how to build an API call that returns data (in JSON format) for the recipes, ingredients, and procedures mentioned here. Could anyone help me out with this?
https://en.wikibooks.org/wiki/Cookbook:Recipes
This URL has the recipe names and when clicked on each item it gets the ingredients and the procedure.

To get all recipes I would not use Cookbook:Recipes but rather Category:Recipes which is more complete.
The API call to list all recipes which are listed in Category:Recipes is the following:
https://en.wikibooks.org/w/api.php?action=query&generator=categorymembers&gcmtitle=Category:Recipes&gcmlimit=max&format=json&gcmcontinue=.
It will return you 500 recipes but there are more on Wikibooks. To get the remaining ones, use the continue -> gcmcontinue value in the response and append it to the next API call.
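The continuation loop described above can be sketched roughly as follows. This is only an illustration, not an official client: `iter_category_pages`, `fetch_json`, and the User-Agent string are hypothetical names chosen here, and the fetch function is passed in so the loop can be tried without network access.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

API_URL = "https://en.wikibooks.org/w/api.php"


def fetch_json(params: Dict[str, str]) -> dict:
    """Hypothetical helper: GET the API with the given parameters, decode JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # The User-Agent value is a placeholder; replace it with your own.
    request = urllib.request.Request(url, headers={"User-Agent": "recipe-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_category_pages(fetch: Callable[[Dict[str, str]], dict]) -> Iterator[dict]:
    """Yield every page in Category:Recipes, following the continue values."""
    params = {
        "action": "query",
        "generator": "categorymembers",
        "gcmtitle": "Category:Recipes",
        "gcmlimit": "max",
        "format": "json",
    }
    while True:
        data = fetch(params)
        # With a generator, the results are keyed by page id under query.pages.
        yield from data.get("query", {}).get("pages", {}).values()
        continuation = data.get("continue")
        if not continuation:
            break
        # Copy every continuation value (including gcmcontinue) into the next call.
        params = {**params, **continuation}
```

Calling `list(iter_category_pages(fetch_json))` would then collect all pages, issuing one request per batch of results.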
To get the ingredients and procedure of a recipe, call for example
https://en.wikibooks.org/w/api.php?action=query&prop=revisions&format=json&rvprop=content&rvslots=%2A&titles=Cookbook:Biscuits.
You can use the | character to retrieve multiple recipes with the same API call:
https://en.wikibooks.org/w/api.php?action=query&prop=revisions&format=json&rvprop=content&rvslots=%2A&titles=Cookbook:Biscuits|Cookbook:Baklava.
If you want to retrieve only the ingredients or only the procedure of recipes, use the additional parameter rvsection=. Most of the time (but not always) the ingredients are in the first section and the procedure is the second section. So calling
https://en.wikibooks.org/w/api.php?action=query&prop=revisions&format=json&rvprop=content&rvslots=%2A&titles=Cookbook:Biscuits&rvsection=1 returns you the ingredients for making biscuits and
https://en.wikibooks.org/w/api.php?action=query&prop=revisions&format=json&rvprop=content&rvslots=%2A&titles=Cookbook:Biscuits&rvsection=2 returns you the procedure for making biscuits.
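As a rough sketch of how these calls could be assembled and read in Python: the function names `build_revision_url` and `extract_wikitext` are made up for this example, and the extraction assumes the default JSON layout where, with rvslots set, the page text sits under revisions[0].slots.main["*"].

```python
from typing import Dict, Optional, Sequence
from urllib.parse import urlencode

API_URL = "https://en.wikibooks.org/w/api.php"


def build_revision_url(titles: Sequence[str], section: Optional[int] = None) -> str:
    """Build the revisions query for one or more titles, optionally one section."""
    params = {
        "action": "query",
        "prop": "revisions",
        "format": "json",
        "rvprop": "content",
        "rvslots": "*",
        # Multiple titles are joined with the | character.
        "titles": "|".join(titles),
    }
    if section is not None:
        params["rvsection"] = str(section)
    return API_URL + "?" + urlencode(params)


def extract_wikitext(response: dict) -> Dict[str, str]:
    """Map each returned page title to its raw wikitext."""
    result = {}
    for page in response.get("query", {}).get("pages", {}).values():
        revisions = page.get("revisions")
        if not revisions:
            continue  # missing page, or the requested section does not exist
        result[page["title"]] = revisions[0]["slots"]["main"]["*"]
    return result
```

For example, `build_revision_url(["Cookbook:Biscuits"], section=1)` produces the ingredients request, with the special characters percent-encoded.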

I'm not sure if this was your question, but in addition to Pascalco's answer, it is worth noting that you will NOT be able to get structured JSON data that details ingredients with their quantities and procedures, i.e. something like:
{ "ingredient": "milk", "quantity": { "number": "1", "unit":"liter"}}
The API will return the page's raw MediaWiki markup in one single field, plus a few extra pieces of metadata about the page.
Moreover, the fact that those pages do not use templates makes this type of data very difficult to extract, either with a syntax parser or an HTML parser.
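To give an idea of what a best-effort extraction might look like, here is a small heuristic that simply collects the bullet and numbered list lines from a section's raw markup. It is only a sketch: `list_items` is a name invented here, it assumes ingredients are written as `*` items and steps as `#` items, and it does not attempt to split quantities from ingredient names, which is the hard part.

```python
from typing import List


def list_items(wikitext: str) -> List[str]:
    """Return the text of every line written as a wiki list item (* or #)."""
    items = []
    for line in wikitext.splitlines():
        stripped = line.strip()
        if stripped.startswith(("*", "#")):
            # Drop the list markers but keep any links or formatting as-is.
            items.append(stripped.lstrip("*#").strip())
    return items
```

Pages that format their ingredients as tables or free text would slip through this entirely, so the output still needs manual checking.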

Related

How to efficiently retrieve a list of all collections a product belongs to in Shopify?

I want to create a CSV export of product data from a Shopify store. For each product I'm exporting data like the product name, price, image URL etc... In this export I also want to list, for each product, all the collections the product belongs to, preferably in the hierarchal order the collections appear in the site's navigation menu (e.g Men > Shirts > Red Shirts).
If my understanding of the API is correct, for each product I need to make a separate call to the Collect API to get a list of collections it belongs to then another call to the Collections API to get the handle of each collection. This sounds like a lot of API calls for each product.
Is there a more efficient way to do this?
Is there any way to figure out the aforementioned hierarchy of collections?
Unfortunately, as you pointed out, I don't think there is an efficient way of doing this because of the way that the Shopify API is structured. It does not permit collections to be queried from products, rather only products queried from collections. That is, one can't see what collections a product belongs to, but can see what products belong to a collection.
The ShopifyAPI::Collect or ShopifyAPI::Collection REST resource does not return Product variant information, which is needed to get the price information as per the requirements. Furthermore, ShopifyAPI::Collect is limited to custom collections only, and would not work for products in ShopifyAPI::SmartCollection's. For this reason I suggest using GraphQL instead of REST to get the information needed.
query ($collectionCursor: String, $productCursor: String) {
  collections(first: 1, after: $collectionCursor) {
    edges {
      cursor
      node {
        id
        handle
        products(first: 8, after: $productCursor) {
          edges {
            cursor
            node {
              id
              title
              variants(first: 100) {
                edges {
                  node {
                    price
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
{
  "collectionCursor": null,
  "productCursor": null
}
The $productCursor variable can be used to iterate over all of the products in a collection and the $collectionCursor to iterate over all collections. Note that only the first 100 variants need to be queried since Shopify has a hard limit on 100 variants per product.
The same query can be used to iterate over ShopifyAPI::SmartCollection's.
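The two-cursor iteration could be driven along these lines. This is a sketch under assumptions: `run_query` is a hypothetical callable that sends the query above with the given variables and returns the `data` part of the response, and the loop detects the end of a list by receiving an empty page, since the query as written does not request pageInfo.

```python
from typing import Callable, Iterator, Optional, Tuple


def iter_collection_products(
    run_query: Callable[[dict], dict]
) -> Iterator[Tuple[str, dict]]:
    """Yield (collection handle, product node) for every product in every collection."""
    collection_cursor: Optional[str] = None
    while True:
        product_cursor: Optional[str] = None
        current_collection_cursor: Optional[str] = None
        while True:
            data = run_query(
                {"collectionCursor": collection_cursor, "productCursor": product_cursor}
            )
            collection_edges = data["collections"]["edges"]
            if not collection_edges:
                return  # no collection left after collection_cursor
            edge = collection_edges[0]
            current_collection_cursor = edge["cursor"]
            product_edges = edge["node"]["products"]["edges"]
            if not product_edges:
                break  # this collection is exhausted
            for product_edge in product_edges:
                yield edge["node"]["handle"], product_edge["node"]
            # Stay on the same collection and ask for the next page of products.
            product_cursor = product_edges[-1]["cursor"]
        # Move past the collection that was just finished.
        collection_cursor = current_collection_cursor
```

This costs one extra request per collection to see the empty page; adding pageInfo { hasNextPage } to both connections in the query would avoid that.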
Alternatively the same query using the REST API would look something like this in Ruby.
collections = ShopifyAPI::Collection.all # paginate
collections.each do |collection|
  collection.products.each do |product|
    product.title
    # note the extra call to the Product API to get variant info
    ShopifyAPI::Product.find(product.id).variants.each do |variant|
      variant.price
    end
  end
end
I don't see any way to address the inefficiencies with the REST query, but you might be able to improve on the GraphQL queries by using Shopify's GraphQL Bulk Operations.

Why does GraphQL query work with the "query" keyword before the curly braces?

I created a small API for authors and books as example. The problem is that I don't understand why a query can look different but still get me the same output. I've included 3 examples.
The GraphQL query looks like this:
{
"query":
"query{
author(id: 1) {
name
}
}"
}
Why is this query working if within the query is the keyword "query" two times? When I write the query like this:
{
"query":
"{
author(id: 1) {
name
}
}"
}
it also works, and when I write it like that:
{
"query":
"author{
author(id: 1) {
name
}
}"
}
It is not working. Why is that so?
GraphQL specifies three types of operations:
query – a read‐only fetch.
mutation – a write followed by a fetch.
subscription – a long‐lived request that fetches data in response to source events.
What you are sending to your server is a JSON object with a single property (query) the value of which is a GraphQL document that represents your actual request to the GraphQL service. This property is (unfortunately) called query by convention but it has nothing to do with the actual operation inside the document you are sending.
Any operation included in your GraphQL document must follow this format:
OperationType [Name] [VariableDefinitions] [Directives] SelectionSet
Name, VariableDefinitions and Directives are all optional. The OperationType is one of query, mutation or subscription. SelectionSet is the collection of fields you are requesting for that operation type. Only selection sets are wrapped in curly brackets. In your example, you have two selection sets -- one containing the author field and one containing the name field.
There's an exception to the above called query shorthand:
If a document contains only one query operation, and that query defines no variables and contains no directives, that operation may be represented in a short‐hand form which omits the query keyword and query name.
In other words if your operation:
is a query
is the only operation in the document
contains no variable definitions or directives
You can omit the query keyword and the operation name. This leaves you with just a selection set, which is wrapped in a set of curly brackets.
So your first two examples are equally valid. The third example is not valid because author is not a valid operation kind.
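To make the two layers visible, here is a small Python sketch that builds the JSON body for both valid forms. `build_request_body` is just an illustrative helper name; the outer "query" key belongs to the transport format, while the text inside it is the GraphQL document.

```python
import json
from typing import Optional


def build_request_body(document: str, variables: Optional[dict] = None) -> str:
    """Wrap a GraphQL document in the JSON object sent to the server."""
    payload = {"query": document}  # JSON property name, not the operation type
    if variables is not None:
        payload["variables"] = variables
    return json.dumps(payload)


# Full form: the document itself starts with the operation type "query".
explicit = build_request_body("query { author(id: 1) { name } }")

# Query shorthand: the operation type is omitted, only the selection set remains.
shorthand = build_request_body("{ author(id: 1) { name } }")
```

Both bodies have exactly one top-level "query" property; only the document text inside it differs.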
The first query key on your examples is a requirement from GraphQL to actually call the endpoint, it has to be present to actually run queries or mutations. You can see it in the docs.
The first example works because at the root of a GraphQL Schema there has to be an action with keywords query or mutation, and in your case you are triggering a query.
The second example works because if you don't define what type of action (query or mutation) on your request, it always defaults to execute a query.
The third example does not work because you don't have the action author at the root of your Schema.
I guess the first keyword query is what makes some confusion in this case.

How to do pagination in a RESTful API in an effective way?

I want to support pagination in my RESTful API.
My API method should return a JSON list of products via http://localhost/products/v1/getproductsbycategory. There are potentially thousands of products, and I want to page through them, so my method should look something like this:
public function getProductsByCategory($product_id, $page) {
    $perPage = 5;
    // Clamp to page 1 so the offset can never be negative.
    $start = (max(1, (int) $page) - 1) * $perPage;
    // Bind the offset and limit as parameters instead of interpolating them into the SQL.
    $stmt = $this->conn->prepare("SELECT id,product,description,destination_url,expiry_type,savings,expiry,title,last_updated_on FROM products WHERE product_id=? ORDER BY last_updated_on DESC LIMIT ?, ?");
    $stmt->bind_param('iii', $product_id, $start, $perPage);
    $stmt->execute();
    $productbycategory = $stmt->get_result();
    $stmt->close();
    return $productbycategory;
}
}
Firstly, in a RESTful call, the URL should ideally be noun-based and not verbs. We are using HTTP verbs (GET, PUT, POST, etc) to do an action on a noun - product in your case.
So, the URL should be http://localhost/products/v1/category
This effectively means you are GETting product of type v1 based on category. To get a given page number, simply add it as a query parameter -
http://localhost/products/v1/category?page=1
and handle it accordingly in your GET implementation corresponding to localhost/products/v1/category
Hope this helps.
Pagination has nothing to do with the JSON format per se - it's all about the query string in the URL and how the server interprets that.
Expanding on @Sampada's answer, you can have a URL like
http://localhost/products/v1/category?pageSize=5&pageNumber=2
and then you'll simply pick the corresponding elements on the server side (consider whether you'll want 0 or 1-based index for the pageNumber), and return them.
Additionally you can wrap this collection in an object that also provides links as to navigate to the previous/next/specific page - see HATEOAS & Richardson's Maturity Model level 3.
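A minimal sketch of such a paged response, written in Python for brevity: `paginate`, the 1-based `page_number`, and the link names (`self`, `first`, `prev`, `next`, `last`) are choices made for this example, not a fixed standard. In a real service the slice would normally be done by the database with LIMIT/OFFSET rather than in memory.

```python
from typing import Any, Dict, Sequence

BASE_URL = "http://localhost/products/v1/category"


def paginate(items: Sequence[Any], page_number: int, page_size: int) -> Dict[str, Any]:
    """Return one page of items plus navigation links (page_number is 1-based)."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be >= 1")
    total = len(items)
    last_page = max(1, -(-total // page_size))  # ceiling division
    start = (page_number - 1) * page_size

    def link(number: int) -> str:
        return f"{BASE_URL}?pageSize={page_size}&pageNumber={number}"

    links = {"self": link(page_number), "first": link(1), "last": link(last_page)}
    if page_number > 1:
        links["prev"] = link(page_number - 1)
    if page_number < last_page:
        links["next"] = link(page_number + 1)
    return {
        "items": list(items[start:start + page_size]),
        "total": total,
        "links": links,
    }
```

A client can then follow `links["next"]` until it is absent instead of computing page numbers itself.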

SharePoint change column id for REST requests

I recently started experimenting with the REST API for SharePoint 2013 Foundation and I am trying to return all entries in a list. My GET request returns the data I am looking for, but the IDs used to identify the columns in the list are not helpful for identifying what the information is (see images below). The column IDs between 'Title' and 'ID', in the second image, are a jumble of characters.
SharePoint List View
Response Data
Is there any way to configure the list to use the column names as IDs? Also, is there some significance to the characters currently used as IDs?
You will need to make a second request to get a listing of columns that includes the InternalName and the Title which is what you are trying to reference:
You can use this REST call:
_api/web/lists/GetByTitle('Project Details')/fields
or you can use CSOM:
using (ClientContext context = new ClientContext(url))
{
List list = context.Web.Lists.GetByTitle("Project Details");
context.Load(list, l => l.Fields);
context.ExecuteQuery();
foreach(Field field in list.Fields)
{
Console.WriteLine(field.Title);
Console.WriteLine(field.InternalName);
}
}
SharePoint automatically generates the InternalName and it is a read-only field, at least using REST. It'll be easier to get the Field Data to correlate the InternalName to the Title than changing the values.
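Once the field listing is available, correlating the two could look roughly like this in Python. The sketch assumes the verbose OData JSON format (Accept: application/json;odata=verbose), where results are wrapped in `d.results`; the function names and the sample internal name are invented for illustration.

```python
from typing import Dict


def internal_name_to_title(fields_response: dict) -> Dict[str, str]:
    """Build an InternalName -> Title lookup from the /fields response."""
    # Assumption: verbose OData JSON, which nests the list under d.results.
    results = fields_response["d"]["results"]
    return {field["InternalName"]: field["Title"] for field in results}


def relabel_item(item: dict, name_map: Dict[str, str]) -> dict:
    """Return a copy of a list item keyed by display title where one is known."""
    return {name_map.get(key, key): value for key, value in item.items()}


# Hypothetical data: "q7xm" stands in for an auto-generated internal name.
fields = {"d": {"results": [
    {"InternalName": "Title", "Title": "Title"},
    {"InternalName": "q7xm", "Title": "Project Owner"},
]}}
item = {"Title": "Alpha", "q7xm": "Dana"}
relabeled = relabel_item(item, internal_name_to_title(fields))
```

Keys that have no matching field, such as metadata properties, are left unchanged. Display titles are not guaranteed to be unique, so two columns with the same title would collide in the relabeled item.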
The column you are referring to, between Title and Id, is the ID of the content type associated to the item. It is not a column ID.
The SharePoint REST API is OData compliant, so you can use the $select parameter to query for the necessary fields.
http://server/site/_api/web/lists('guid')/items?$select=Column1,Column2
Please be aware though, lookup fields need to be expanded as well, otherwise you get only the Id of the lookup item.
http://server/site/_api/web/lists('guid')/items?$select=LookupColumn/Title&$expand=LookupColumn

Instagram API error

I am using the Instagram API to get user info:
api = InstagramAPI(access_token=access_token)
profile = api.user(user_id="kallaucyahoocojp") # I try to put output data to profile variable here
And I get the below error:
DownloadError: Unable to fetch URL: https://api.instagram.com/v1/users/kallaucyahoocojp.json?access_token=(u'1191812153.f78cd79.d2d99595c79d4c23a7994d85ea0d412c', {u'username': u'kallaucyahoocojp', u'bio': u'\u30c4\u30a4\u30c3\u30bf\u30d5\u30a9\u30ed\u30ef\u30fc\u5897\u52a0\u30b5\u30fc\u30d3\u30b9', u'website': u'http://twitter\u30d5\u30a9\u30ed\u30ef\u30fc.jp', u'profile_picture': u'http://images.ak.instagram.com/profiles/anonymousUser.jpg', u'full_name': u'Kallauc', u'id': u'1191812153'})
Can anybody help me to fix it?
You need to pass the numeric-based user id, rather than the username. For example, instead of passing kallaucyahoocojp, you might pass 1234 if t
Here's how to get the ID if you don't have it:
Search for the instagram user id using this endpoint. In the python api:
api.user_search(q="kallaucyahoocojp", count=100)
Check the results for an exact string match on each user name while iterating through the results (calling .lower() to be sure to ignore potential case issues).
If you don't find the user in the first page of results, call to the next page using the max id returned.
Get the user id from the matching user's search result, then call your original function again with the numeric id.
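The exact-match step could be sketched like this. It is an illustration only: `find_user_id` is a made-up helper, and it works on plain dicts shaped like the entries of the raw JSON `data` array returned by the search endpoint, not on the model objects the Python client may return.

```python
from typing import Iterable, Optional


def find_user_id(users: Iterable[dict], username: str) -> Optional[str]:
    """Return the id of the user whose name matches exactly, ignoring case."""
    wanted = username.lower()
    for user in users:
        if user.get("username", "").lower() == wanted:
            return user["id"]
    return None  # not on this page; request the next page and try again
```

A result of `None` means the name was not on this page of results, which is the cue to fetch the next page rather than to take the first partial match.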
A couple of very important notes:
Notice that I called the search function for users with a count of 100. You can pick any number, but contrary to other SO posts, the first user is not always the user you want in a search. The search can and will match partials, and not always according to an exact match first. How do I know? I have production instagram apps. I will qualify and say that usually the results are in the first 2-3 matches. Decide what is cheaper; repeated API calls that bring you closer to the limit, or 1 large bulk call where you are certain to get all the results.
The python Instagram API last I checked does a terrible job returning paging information. You actually get the paging URL which defeats the purpose of the python API itself to get additional pages. Your options are extract the next id parameter from the URL using urlparse or something similar, or fix the API to return the paging data as an object per the json (I've done both). What happens is the API itself is discarding part of the json and only giving you the URL which normally you don't want/need.
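For the first option, pulling the id out of the paging URL takes only a few lines with the standard library. The parameter name `max_id` and the sample URL are assumptions for the sake of the example; check the URL you actually receive for the real parameter name.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def next_page_id(paging_url: str, param: str = "max_id") -> Optional[str]:
    """Extract a paging parameter from the next-page URL, or None if absent."""
    query = parse_qs(urlparse(paging_url).query)
    values = query.get(param)
    return values[0] if values else None


# Hypothetical paging URL, used only to show the call.
example = "https://api.instagram.com/v1/users/search?q=abc&max_id=12345&access_token=TOKEN"
page_id = next_page_id(example)
```

The returned value can then be passed to the next search call in place of following the URL by hand.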
In your example, here's the search response:
{
  "meta": {
    "code": 200
  },
  "data": [
    {
      "username": "kallaucyahoocojp",
      "bio": "ツイッタフォロワー増加サービス",
      "website": "http://twitterフォロワー.jp",
      "profile_picture": "http://images.ak.instagram.com/profiles/anonymousUser.jpg",
      "full_name": "Kallauc",
      "id": "1191812153"
    }
  ]
}
Revising your call:
api = InstagramAPI(access_token=access_token)
profile = api.user(user_id="1191812153")
I should note that you may not need to call the user call if you did a search because you may simply have all the info you need. It will depend on what you are doing of course, so I am giving you the general method to use the rest of the user api.
For extracting profile info using Instagram API, userid is required.
The endpoint for extracting userID:
https://api.instagram.com/v1/users/search?q=[username]&access_token=[HERE]
The endpoint for extracting profile info:
https://api.instagram.com/v1/users/[userid]/?access_token=[HERE]
Note that before extracting information, check the login permissions for your access token.