Does Google Scholar have an API available that we can use in our research applications? - journal

I am working on a research publication and collaboration project that has a literature search feature in it.
Google Scholar seems like a good fit since it is free to use, but when I researched it I could not find any information about an API.
Please let me know if there is any API for Google Scholar that is valid.
TIA.

There's no official Google Scholar API. There are third-party solutions such as the free scholarly Python package, which supports profile, author, cite and organic results (search_pubs appears to be the method for organic results, although the method name confuses me).
Note that if you use scholarly constantly without rate-limiting your requests, Google may block your IP (as mentioned by @RadioControlled). Use it wisely.
Alternatively, there's the Google Scholar API from SerpApi, a paid API with a free plan. It supports organic, cite, profile and author results and handles blocks on the SerpApi backend, so your IP won't be blocked.
Example code to parse profile results with scholarly, using the search_keyword method:
import json
from scholarly import scholarly

# will paginate to the next page by default
authors = scholarly.search_keyword("biology")

for author in authors:
    print(json.dumps(author, indent=2))
# part of the output:
'''
{
"container_type": "Author",
"filled": [],
"source": "SEARCH_AUTHOR_SNIPPETS",
"scholar_id": "LXVfPc8AAAAJ",
"url_picture": "https://scholar.google.com/citations?view_op=medium_photo&user=LXVfPc8AAAAJ",
"name": "Eric Lander",
"affiliation": "Broad Institute",
"email_domain": "",
"interests": [
"Biology",
"Genomics",
"Genetics",
"Bioinformatics",
"Mathematics"
],
"citedby": 552013
}
... other author results
'''
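Regarding the IP-block warning above: a minimal sketch of one way to slow down iteration over a result generator. The helper name `throttled`, the delay value and the `limit` parameter are my own assumptions, not part of scholarly. Since scholarly fetches results in pages, a per-item pause is a blunt tool, but it keeps a long-running loop from hammering Google.

```python
import time
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def throttled(items: Iterable[T],
              delay_seconds: float = 2.0,
              limit: Optional[int] = None) -> Iterator[T]:
    """Yield at most `limit` items, sleeping `delay_seconds` between them.

    `throttled` is a hypothetical helper; tune the delay to your own needs.
    """
    for index, item in enumerate(islice(items, limit)):
        if index > 0:
            # Pause before handing out every item after the first one.
            time.sleep(delay_seconds)
        yield item


# Usage with scholarly (not executed here, it needs network access):
#   from scholarly import scholarly
#   for author in throttled(scholarly.search_keyword("biology"), 5.0, limit=20):
#       print(author["name"])
```

Capping the number of results with `limit` matters as much as the delay, because the generator would otherwise keep paginating.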
Example code to parse profile results using Google Scholar Profile Results API from SerpApi:
import json
from serpapi import GoogleScholarSearch

# search parameters
params = {
    "api_key": "Your SerpApi API key",
    "engine": "google_scholar_profiles",
    "hl": "en",            # language
    "mauthors": "biology"  # search query
}

search = GoogleScholarSearch(params)
results = search.get_dict()

# only first page results
for result in results["profiles"]:
    print(json.dumps(result, indent=2))
# part of the output:
'''
{
"name": "Masatoshi Nei",
"link": "https://scholar.google.com/citations?hl=en&user=VxOmZDgAAAAJ",
"serpapi_link": "https://serpapi.com/search.json?author_id=VxOmZDgAAAAJ&engine=google_scholar_author&hl=en",
"author_id": "VxOmZDgAAAAJ",
"affiliations": "Laura Carnell Professor of Biology, Temple University",
"email": "Verified email at temple.edu",
"cited_by": 384074,
"interests": [
{
"title": "Evolution",
"serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aevolution",
"link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:evolution"
},
{
"title": "Evolutionary biology",
"serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aevolutionary_biology",
"link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:evolutionary_biology"
},
{
"title": "Molecular evolution",
"serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Amolecular_evolution",
"link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:molecular_evolution"
},
{
"title": "Population genetics",
"serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Apopulation_genetics",
"link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:population_genetics"
},
{
"title": "Phylogenetics",
"serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aphylogenetics",
"link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:phylogenetics"
}
],
"thumbnail": "https://scholar.googleusercontent.com/citations?view_op=small_photo&user=VxOmZDgAAAAJ&citpid=3"
}
... other results
'''
I wrote a dedicated blog post at SerpApi, "Scrape historic Google Scholar results using Python", which shows how to scrape historic 2017-2021 organic and cite Google Scholar results to CSV and SQLite.
Disclaimer: I work for SerpApi.

A quick search shows that others are trying to implement such APIs, but Google does not provide one. It is not clear whether this is legal; see for instance "How to get permission from Google to use Google Scholar Data, if needed?".

Related

Wikipedia API requests for English page and Spanish equivalent. Unable to retrieve Spanish JSON for recentchanges

I'm attempting to use the Wikipedia API. However, I'm finding that I can only retrieve recent changes for English pages. For example:
https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rctitle=Battle%20of%20Palma&format=json
returns:
{
"batchcomplete": "",
"query": {
"recentchanges": [{
"type": "edit",
"ns": 0,
"title": "Battle of Palma",
"pageid": 67226819,
"revid": 1061578435,
"old_revid": 1049512969,
"rcid": 1455128770,
"timestamp": "2021-12-22T15:19:02Z"
}]
}
}
However, the same request with the same page in Spanish returns an empty array:
https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rctitle=Batalla%20de%20Palma&format=json
{"batchcomplete":"","query":{"recentchanges":[]}}
Even though both of the pages exist:
English: https://en.wikipedia.org/wiki/Battle_of_Palma
Spanish: https://es.wikipedia.org/wiki/Batalla_de_Palma
I was using the wrong API call. The call above returns recent changes across Wikipedia. I needed the page history call documented here: https://www.mediawiki.org/wiki/API:REST_API/Reference#Get_page_history
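For illustration, a small sketch that builds the page history URL for each language edition. The function name `page_history_url` is hypothetical. The `/w/rest.php/v1/page/{title}/history` path comes from the MediaWiki REST API reference linked above. Note that the language subdomain changes too, whereas the Spanish request in the question was sent to en.wikipedia.org.

```python
from urllib.parse import quote


def page_history_url(language: str, title: str) -> str:
    """Build the REST API page-history URL for one Wikipedia language edition.

    `page_history_url` is a hypothetical helper name.
    """
    # Wikipedia titles use underscores for spaces; quote() escapes the rest.
    encoded_title = quote(title.replace(" ", "_"), safe="")
    return f"https://{language}.wikipedia.org/w/rest.php/v1/page/{encoded_title}/history"


print(page_history_url("en", "Battle of Palma"))
print(page_history_url("es", "Batalla de Palma"))
```

Fetching either URL with any HTTP client should return that page's revision list as JSON.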

How to find API endpoints that accept oauth2 tokens

I have an Angular 6 application which makes requests to various OAuth2 providers. I've managed to successfully request access tokens from these providers using the implicit grant type (I will be working on authorization code soon). Now I'm trying to find a list of API endpoints that I can test the access tokens with. For example, requesting user profile information from Google.
So far, I’ve been able to get access tokens from the following providers:
Google (https://accounts.google.com)
Anilist (http://anilist.co)
OneDrive (https://login.live.com)
DropBox (https://www.dropbox.com)
Does anyone know any publicly accessible API endpoints for any of the above (or any other oauth2 provider) that I can test with?
Thanks
Here is how you can answer your question for Google.
You first connect to the Google API explorer web application: https://developers.google.com/apis-explorer/#p/
This web page helps you browse the many Google APIs. Search for the API named API Discovery Service. It provides information about other Google APIs, such as which APIs are available and the resource and method details for each API.
Therefore, to get a list of all APIs, you can call the list entry point of the API Discovery Service here: https://www.googleapis.com/discovery/v1/apis?preferred=true
Here is the beginning of the result:
{
"kind": "discovery#directoryList",
"discoveryVersion": "v1",
"items": [
{
"kind": "discovery#directoryItem",
"id": "abusiveexperiencereport:v1",
"name": "abusiveexperiencereport",
"version": "v1",
"title": "Abusive Experience Report API",
"description": "Views Abusive Experience Report data, and gets a list of sites that have a significant number of abusive experiences.",
"discoveryRestUrl": "https://abusiveexperiencereport.googleapis.com/$discovery/rest?version=v1",
"icons": {
"x16": "https://www.gstatic.com/images/branding/product/1x/googleg_16dp.png",
"x32": "https://www.gstatic.com/images/branding/product/1x/googleg_32dp.png"
},
"documentationLink": "https://developers.google.com/abusive-experience-report/",
"preferred": true
},
[...]
For each API listed by the previous call, the discoveryRestUrl field gives you a URL from which you can get information such as the entry point of the corresponding API.
For instance, the Gmail API is described here: https://www.googleapis.com/discovery/v1/apis/gmail/v1/rest
In the output, extract the OAuth2 part from the auth entry to get the scopes:
"auth": {
"oauth2": {
"scopes": {
"https://mail.google.com/": {
"description": "Read, compose, send, and permanently delete all your email from Gmail"
},
"https://www.googleapis.com/auth/gmail.compose": {
"description": "Manage drafts and send emails"
},
"https://www.googleapis.com/auth/gmail.insert": {
"description": "Insert mail into your mailbox"
},
"https://www.googleapis.com/auth/gmail.labels": {
"description": "Manage mailbox labels"
},
"https://www.googleapis.com/auth/gmail.metadata": {
"description": "View your email message metadata such as labels and headers, but not the email body"
},
"https://www.googleapis.com/auth/gmail.modify": {
"description": "View and modify but not delete your email"
},
"https://www.googleapis.com/auth/gmail.readonly": {
"description": "View your email messages and settings"
},
"https://www.googleapis.com/auth/gmail.send": {
"description": "Send email on your behalf"
},
"https://www.googleapis.com/auth/gmail.settings.basic": {
"description": "Manage your basic mail settings"
},
"https://www.googleapis.com/auth/gmail.settings.sharing": {
"description": "Manage your sensitive mail settings, including who can manage your mail"
}
}
}
},
In the description, you will also find the endpoint for the Gmail API: https://www.googleapis.com/gmail/v1/users/
Finally, you can access this API by means of OAuth2.
NOTE: all scopes associated with one or several APIs are listed here: https://developers.google.com/identity/protocols/googlescopes
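The scope extraction step described above can be sketched in Python roughly as follows. The function names are my own, and the sample document is a trimmed copy of the Gmail excerpt shown earlier. `fetch_discovery_doc` performs a network request and is defined only to show where a real document would come from.

```python
import json
from typing import Any, Dict
from urllib.request import urlopen


def fetch_discovery_doc(discovery_rest_url: str) -> Dict[str, Any]:
    """Download one discovery document (network access required)."""
    with urlopen(discovery_rest_url) as response:
        return json.load(response)


def list_oauth_scopes(discovery_doc: Dict[str, Any]) -> Dict[str, str]:
    """Map each OAuth2 scope URL in a discovery document to its description."""
    scopes = discovery_doc.get("auth", {}).get("oauth2", {}).get("scopes", {})
    return {url: details.get("description", "") for url, details in scopes.items()}


# Trimmed sample taken from the Gmail discovery output quoted above.
sample_doc = {
    "auth": {
        "oauth2": {
            "scopes": {
                "https://www.googleapis.com/auth/gmail.readonly": {
                    "description": "View your email messages and settings"
                },
                "https://www.googleapis.com/auth/gmail.send": {
                    "description": "Send email on your behalf"
                },
            }
        }
    }
}

for scope, description in list_oauth_scopes(sample_doc).items():
    print(f"{scope}: {description}")
```

With a real document you would call `list_oauth_scopes(fetch_discovery_doc(url))`, passing a discoveryRestUrl value from the directory list.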

Spotify API: endpoint currently-playing podcast support

I'm developing a tool (in Python 3) that gets my recently played track on Spotify. For that, I use the official Spotify API (you can try it there). When listening to music, I get a JSON response containing the track, artist and much more info.
Unfortunately, this endpoint does not support listening to podcasts. When I listen to a podcast, the returned JSON is
{
"timestamp": 1545990763374,
"context": {
"external_urls": {
"spotify": "https://open.spotify.com/show/2tuQXnufTLetdGd7c24EfW"
},
"href": "https://api.spotify.com/v1/shows/2tuQXnufTLetdGd7c24EfW",
"type": "show",
"uri": "spotify:show:2tuQXnufTLetdGd7c24EfW"
},
"progress_ms": 11357,
"item": null,
"currently_playing_type": "episode",
"is_playing": true
}
which is not enough for my purposes. I googled a lot but did not find any endpoints of the Spotify API supporting podcasts.
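To make the problem concrete, here is a rough sketch of what can still be recovered from such a response: the playback type and the show ID from the context URI. The function name `describe_playback` is hypothetical, and this only works with the fields shown above. It does not retrieve the episode itself.

```python
from typing import Any, Dict


def describe_playback(payload: Dict[str, Any]) -> str:
    """Summarise a currently-playing response, including the podcast case."""
    item = payload.get("item")
    if item is not None:
        # Normal case: a track object is present.
        return f"track: {item.get('name', 'unknown')}"
    if payload.get("currently_playing_type") == "episode":
        # Podcast case from the question: item is null, only context is set.
        uri = (payload.get("context") or {}).get("uri", "")
        prefix = "spotify:show:"
        show_id = uri[len(prefix):] if uri.startswith(prefix) else "unknown"
        return f"episode of show: {show_id}"
    return "nothing identifiable"


# The response from the question, reduced to the fields used here.
podcast_payload = {
    "context": {"type": "show", "uri": "spotify:show:2tuQXnufTLetdGd7c24EfW"},
    "item": None,
    "currently_playing_type": "episode",
    "is_playing": True,
}
print(describe_playback(podcast_payload))
```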
Does anyone know a workaround?

Google custom search api returns 0 results when searchType is image

I'm using Google Custom Search API to search for images. The search result is always 0 when I request for images. I'm following the documentation at the link below:
https://developers.google.com/custom-search/json-api/v1/reference/cse/list
According to the docs, specifying searchType=image makes the API look only for images.
Here's what my url looks like:
https://www.googleapis.com/customsearch/v1?key=[API_Key]&cx=017576662512468239146:omuauf_lfve&searchType=image&q=cars
and the result looks like below:
{
"kind": "customsearch#search",
"url": {
"type": "application/json",
"template": "https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&cref={cref?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json"
},
"queries": {
"request": [
{
"title": "Google Custom Search - cars",
"totalResults": "0",
"searchTerms": "cars",
"count": 10,
"inputEncoding": "utf8",
"outputEncoding": "utf8",
"safe": "off",
"cx": "017576662512468239146:omuauf_lfve",
"searchType": "image"
}
]
},
"searchInformation": {
"searchTime": 0.049329,
"formattedSearchTime": "0.05",
"totalResults": "0",
"formattedTotalResults": "0"
}
}
If I remove searchType from the request, I get results back in the form of web pages. What is wrong here?
Your Custom Search Engine might have Image Search disabled. The CSE API returns 0 results if the searchType requested is disabled.
You can enable it by visiting https://cse.google.com/cse/all, opening your search engine, and switching the "Image Search" toggle to ON.
I'm two years late, but I found the solution.
Go to your Custom Search Engine dashboard and turn on the "Search the entire web" property, or something like that.
Once that property is activated, you should be able to see results.
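For anyone checking this from code, a small sketch that builds the same image request and detects the empty-result case from the response shown in the question. The helper names are my own. The parameters (key, cx, searchType, q) and the searchInformation.totalResults field are the ones that appear above.

```python
from typing import Any, Dict
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/customsearch/v1"


def image_search_url(api_key: str, cx: str, query: str) -> str:
    """Build a Custom Search request URL restricted to image results."""
    params = {"key": api_key, "cx": cx, "searchType": "image", "q": query}
    return f"{BASE_URL}?{urlencode(params)}"


def has_results(response: Dict[str, Any]) -> bool:
    """Return True when the response reports at least one result."""
    # totalResults is delivered as a string, as in the output above.
    total = response.get("searchInformation", {}).get("totalResults", "0")
    return int(total) > 0


empty_response = {"searchInformation": {"totalResults": "0"}}
if not has_results(empty_response):
    print("No results: check the image search settings of the search engine.")
```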

REST pattern create, update and delete same endpoint

I have a page where I list the books of a school. The user can update a book, add a new book or delete an existing book. All actions must be saved when the form is submitted.
How can I map a REST API for that? I could take advantage of the endpoints I already have.
UPDATE
PUT /schools/1/books
{
"books": [
{
"id": "1",
"name": "Book 1"
}
]
}
CREATE
POST /schools/1/books
{
"books": [
{
"name": "Book 2"
},
{
"name": "Book 3"
}
]
}
DELETE
DELETE /schools/1/books
{
"books": [
{
"id": 2
}
]
}
But I need everything to run in the same transaction, and it wouldn't make sense to submit three requests.
I also thought of creating a new endpoint that would create books that don't exist, update books that exist, and remove books that are not present in the request.
So if this school has Book 1 and Book 2, I could update Book 1, create New Book and remove Book 2 with:
PUT /schools/1/batch-books
{
"books": [
{
"id": "1",
"name": "Updated Book 1"
},
{
"name": "New Book"
}
]
}
Do you guys have other options?
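For clarity, the semantics of that batch endpoint can be sketched like this: entries without an id are created, entries with an id are updated, and existing books missing from the request are deleted. The function name `plan_batch` is hypothetical, and the sketch only plans the work. The database transaction that would wrap it is left out.

```python
from typing import Any, Dict, List, Set, Tuple

Book = Dict[str, Any]


def plan_batch(existing_ids: Set[str],
               submitted: List[Book]) -> Tuple[List[Book], List[Book], List[str]]:
    """Split a submitted book list into creates, updates and deletes."""
    to_create = [book for book in submitted if "id" not in book]
    to_update = [book for book in submitted if "id" in book]
    # Any stored book whose id was not sent back is treated as removed.
    kept_ids = {str(book["id"]) for book in to_update}
    to_delete = sorted(existing_ids - kept_ids)
    return to_create, to_update, to_delete


# The school currently has Book 1 and Book 2, as in the example above.
creates, updates, deletes = plan_batch(
    {"1", "2"},
    [{"id": "1", "name": "Updated Book 1"}, {"name": "New Book"}],
)
print(creates, updates, deletes)
```

One thing to watch with this design is that a client sending a partial list by mistake deletes everything it left out.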
I would separate things into different resources:
/books and /books/{id} for books. They give book details and allow you to manage them.
/schools and /schools/{id} for schools. They give school details and allow you to manage them.
/schools/{id}/books to associate books with schools, meaning the books that are available within a school. This resource provides methods to manage a list of links to books.
Let me detail the last resource. It is related to hypermedia. In the following, I'll use JSON-LD, but you're free to use other hypermedia tools.
A GET method will return the list of associated books:
GET /schools/1/books
[
  {
    "@id": "http://api.example.com/books/1895638109"
  },
  {
    "@id": "http://api.example.com/books/8371023509"
  }
]
Note that you can implement mechanisms to return more details if needed. Leveraging the Prefer header seems to be a great approach (see the link below for more details).
In addition, you could provide the following methods:
POST to add a link to the school. The request payload would be: {"@id": "http://api.example.com/books/1895638109"}. The response should be a 201 status code.
DELETE to delete a specific link from a school. A query parameter could be used to specify which link to remove.
PATCH to perform several operations in one call and so provide some batch processing. You can leverage JSON-PATCH for the request processing. Within the response, you could describe what happened. There is no specification at this level, so you're free to use what you want. Here is a sample request payload:
PATCH /schools/1/books/
[
  {
    "op": "add", "value": "http://api.example.com/books/1895638109"
  },
  {
    "op": "remove", "path": "http://api.example.com/books/8371023509"
  }
]
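As a rough server-side illustration, the operations in that payload could be applied to a school's set of book links like this. The function name `apply_link_patch` is hypothetical, and the operation shape follows the sample above ("add" with a value, "remove" with a path) rather than strict JSON Patch. Working on a copy keeps the call all-or-nothing: if any operation fails, the original set is untouched.

```python
from typing import Dict, List, Set


def apply_link_patch(links: Set[str], operations: List[Dict[str, str]]) -> Set[str]:
    """Apply add/remove operations to a set of book links, all or nothing."""
    result = set(links)  # copy, so a failure leaves the input unchanged
    for operation in operations:
        op = operation.get("op")
        if op == "add":
            result.add(operation["value"])
        elif op == "remove":
            target = operation["path"]
            if target not in result:
                raise ValueError(f"link not associated with school: {target}")
            result.remove(target)
        else:
            raise ValueError(f"unsupported operation: {op!r}")
    return result


current_links = {"http://api.example.com/books/8371023509"}
patched = apply_link_patch(current_links, [
    {"op": "add", "value": "http://api.example.com/books/1895638109"},
    {"op": "remove", "path": "http://api.example.com/books/8371023509"},
])
print(patched)
```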
The following links could give you some hints on how to design such a use case:
Implementing bulk updates within RESTful services: http://restlet.com/blog/2015/05/18/implementing-bulk-updates-within-restful-services/
On choosing a hypermedia type: http://sookocheff.com/post/api/on-choosing-a-hypermedia-format/
Creating Client-Optimized Resource Representations in APIs: http://www.freshblurbs.com/blog/2015/06/25/api-representations-prefer.html
Hope it helps you,
Thierry