Google Custom Search API - Reverse image search - google-custom-search

I've got a collection of images that I'm missing information for. I'd like to be able to do a Google reverse image search to find names, keywords, links to similar images, etc. I'm aware that scraping the search results page is against the TOS, and have gotten suggestions that using the Custom Search API is the right way to go about it, but I haven't been able to find anything in the documentation detailing reverse image search. Is anyone able to point me in the right direction if this is possible with the API, or verify if it is, in fact, supported?
Much appreciated!

As for the current API, I have not found any mention of reverse image search functionality, nor anything beyond string-based queries. You can look for yourself in the detailed API references for Custom Search (a plain string-based query is sketched below for illustration):
https://developers.google.com/custom-search/docs/xml_results
https://developers.google.com/custom-search/json-api/v1/reference/cse/list
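For illustration, this is the only kind of query the JSON API accepts (a minimal sketch; the key and cx values are placeholders). There is no request parameter that takes an image:
import requests

# An ordinary string-based query; `q` is the only kind of query the
# Custom Search JSON API accepts, so there is nothing to pass an image to.
resp = requests.get(
    "https://www.googleapis.com/customsearch/v1",
    params={"key": "YOUR_API_KEY", "cx": "YOUR_ENGINE_ID", "q": "danny devito"},
)
for item in resp.json().get("items", []):
    print(item["title"], item["link"])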
So the Custom Search API doesn't provide that facility.
After looking at the other APIs Google provides, none of them offer this functionality either. Try looking at all the different kinds of APIs Google supports here:
https://developers.google.com/apis-explorer/#p/
Thus the conclusion: no reverse image search through an API from Google. (This might change; I can't say for sure. Check the links provided above.)
There are paid APIs available from some vendors:
TinEye API https://services.tineye.com/TinEyeAPI
Incandescent API http://incandescent.xyz/pricing/
Otherwise you have to ask Google to pardon you for the little TOS violation for your pet project :)

SerpApi, a third-party solution, supports scraping Google reverse image search results. It's a paid API with a free trial.
As an example, let's use this image of Danny DeVito: https://i.imgur.com/HBrB8p0.png
Example Python code (client libraries for other languages are also available):
from serpapi import GoogleSearch

params = {
    "engine": "google_reverse_image",                # SerpApi's reverse image engine
    "google_domain": "google.com",
    "image_url": "https://i.imgur.com/HBrB8p0.png",  # image to search by
    "api_key": "secret_api_key"
}

search = GoogleSearch(params)
results = search.get_dict()                          # parsed JSON as a Python dict
Example JSON response:
...
"image_results": [
  {
    "position": 1,
    "title": "Danny DeVito - Wikipedia",
    "link": "https://en.wikipedia.org/wiki/Danny_DeVito",
    "displayed_link": "https://en.wikipedia.org › wiki › Danny_DeVito",
    "snippet": "Daniel Michael DeVito Jr. (born November 17, 1944) is an American actor, comedian, director, producer, and screenwriter. He gained prominence for his ...",
    "cached_page_link": "https://webcache.googleusercontent.com/search?q=cache:EVb7AC9xwHYJ:https://en.wikipedia.org/wiki/Danny_DeVito+&cd=1&hl=en&ct=clnk&gl=us",
    "related_pages_link": "https://www.google.com/search?q=related:https://en.wikipedia.org/wiki/Danny_DeVito&sa=X&ved=2ahUKEwi7uom3wJ_xAhWxHDQIHct6DmQQHzAAegQIBhAQ"
  },
  {
    "position": 2,
    "title": "Danny DeVito - IMDb",
    "link": "https://www.imdb.com/name/nm0000362/",
    "displayed_link": "https://www.imdb.com › name",
    "snippet": "Danny DeVito, Actor: Matilda. Danny DeVito has amassed a formidable and versatile body of work as an actor, producer and director that spans the stage, ...",
    "cached_page_link": "https://webcache.googleusercontent.com/search?q=cache:c6r3v14HA7cJ:https://www.imdb.com/name/nm0000362/+&cd=2&hl=en&ct=clnk&gl=us"
  },
  {
    "position": 3,
    "title": "Danny DeVito - Simple English Wikipedia, the free encyclopedia",
    "link": "https://simple.wikipedia.org/wiki/Danny_DeVito",
    "displayed_link": "https://simple.wikipedia.org › wiki › Danny_DeVito",
    "thumbnail": "https://serpapi.com/searches/60cbb000ce87f8cca8f63685/images/9db7034fa3524b93ce0598116fd3b874800a67b8b9434cd54a009f2be5fd0809.jpeg",
    "thumbnail_destination_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/2/21/Danny_DeVito_by_Gage_Skidmore.jpg/1200px-Danny_DeVito_by_Gage_Skidmore.jpg",
    "image_resolution": "1200 × 1427",
    "snippet": "Daniel Michael \" · Danny\" · DeVito, Jr. (born November 17, 1944) is an American actor, director, producer and screenwriter. He has starred in and directed a number ...",
    "cached_page_link": "https://webcache.googleusercontent.com/search?q=cache:2DR2mxjaZbsJ:https://simple.wikipedia.org/wiki/Danny_DeVito+&cd=32&hl=en&ct=clnk&gl=us",
    "related_pages_link": "https://www.google.com/search?q=related:https://simple.wikipedia.org/wiki/Danny_DeVito&sa=X&ved=2ahUKEwi7uom3wJ_xAhWxHDQIHct6DmQQHzAfegQIFhAQ"
  },
  ...
]
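Assuming the response shape above, pulling the positions, titles, and links out of the dictionary returned by search.get_dict() is straightforward:
# Iterate over the reverse-image matches from the `results` dict above.
for result in results.get("image_results", []):
    print(result["position"], result["title"], result["link"])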
Check out the documentation for more details.
Disclaimer: I work at SerpApi.

Related

How to get the URL of a ROBLOX game's thumbnail?

I am looking for an API that returns the URL of the thumbnail.
I have this API: https://www.roblox.com/asset-thumbnail/image?assetId=1970&width=768&height=432&format=png but this returns image bytes.
(the reason I asked this question here is because I can't access the developer forums)
See the docs for the Games API and the docs for the Thumbnails API.
The Thumbnail API thumbnails.roblox.com/v1/games/multiget/thumbnails?universeIds={universeID} will give you the image URLs you need. However, you'll need the universeId first, so follow these steps:
1) Convert PlaceID to UniverseID
If you only know the placeID (the number from the game URL), then you can use the https://games.roblox.com/v1/games/multiget-place-details endpoint to get the universeID.
Let's use Bee Swarm Simulator as an example; its placeID is 1537690962:
https://games.roblox.com/v1/games/multiget-place-details?placeIds=1537690962
[
{
"placeId": 1537690962,
"name": "Bee Swarm Simulator",
"description": "Grow your own swarm of bees, collect pollen, and make honey in Bee Swarm Simulator! Meet friendly bears, complete their quests and get rewards! As your hive grows larger and larger, you can explore further up the mountain. Use your bees to defeat dangerous bugs and monsters. Look for treasures hidden around the map. Discover new types of bees, all with their own traits and personalities!\r\n\r\nJoin Bee Swarm Simulator Club for Honey, Treats and codes! https://www.roblox.com/groups/3982592\r\n\r\n\ud83e\uddf8 Bee Swarm Simulator toys are available now at Walmart Supercenters, Smyths Toys, and GameStop! Collect toy bees and hive slots to construct a real hive! There are also adorable plushies, bear action figures, and more!\r\n",
"sourceName": "Bee Swarm Simulator",
"sourceDescription": "Grow your own swarm of bees, collect pollen, and make honey in Bee Swarm Simulator! Meet friendly bears, complete their quests and get rewards! As your hive grows larger and larger, you can explore further up the mountain. Use your bees to defeat dangerous bugs and monsters. Look for treasures hidden around the map. Discover new types of bees, all with their own traits and personalities!\r\n\r\nJoin Bee Swarm Simulator Club for Honey, Treats and codes! https://www.roblox.com/groups/3982592\r\n\r\n\ud83e\uddf8 Bee Swarm Simulator toys are available now at Walmart Supercenters, Smyths Toys, and GameStop! Collect toy bees and hive slots to construct a real hive! There are also adorable plushies, bear action figures, and more!\r\n",
"url": "https://www.roblox.com/games/1537690962/Bee-Swarm-Simulator",
"builder": "Onett",
"builderId": 1912490,
"hasVerifiedBadge": false,
"isPlayable": true,
"reasonProhibited": "None",
"universeId": 601130232,
"universeRootPlaceId": 1537690962,
"price": 0,
"imageToken": "T_1537690962_3990"
}
]
And now we've got the universeID.
2) Get thumbnails
With the universeID, we can make the call to https://thumbnails.roblox.com/v1/games/multiget/thumbnails?universeIds={universeID}&size=768x432&format=Png&isCircular=false
https://thumbnails.roblox.com/v1/games/multiget/thumbnails?universeIds=601130232&size=768x432&format=Png&isCircular=false
{
  "data": [
    {
      "universeId": 601130232,
      "error": null,
      "thumbnails": [
        {
          "targetId": 1538429222,
          "state": "Completed",
          "imageUrl": "https://tr.rbxcdn.com/7242c75dd5e18de97464e397164f6c68/768/432/Image/Png"
        }
      ]
    }
  ]
}
And now you've got the thumbnail URL. Hope this helps.
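Putting the two steps together, a minimal Python sketch using the requests library (an assumption on my part; error handling and any authentication are omitted) could look like this:
import requests

def get_game_thumbnail_url(place_id):
    # Step 1: resolve the placeID to a universeID.
    details = requests.get(
        "https://games.roblox.com/v1/games/multiget-place-details",
        params={"placeIds": place_id},
    ).json()
    universe_id = details[0]["universeId"]

    # Step 2: fetch the thumbnail info for that universe.
    thumbs = requests.get(
        "https://thumbnails.roblox.com/v1/games/multiget/thumbnails",
        params={"universeIds": universe_id, "size": "768x432",
                "format": "Png", "isCircular": "false"},
    ).json()
    return thumbs["data"][0]["thumbnails"][0]["imageUrl"]

print(get_game_thumbnail_url(1537690962))  # Bee Swarm Simulator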

Using structured data markup with a review authority

I'm trying to use structured data to produce a review like this on Google Search (please see the image).
According to this link, I have to write the following structured data markup:
<script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@type": "Review",
  "itemReviewed": {
    "@type": "Thing",
    "name": "Super Book"
  },
  "author": {
    "@type": "Person",
    "name": "Joe"
  },
  "reviewRating": {
    "@type": "Rating",
    "ratingValue": "7",
    "bestRating": "10"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Washington Times"
  }
}
</script>
But according to this link, I have to get the review from a trusted review authority. I'm wondering why we need the structured data markup (where we have static 'rating', 'bestRating', etc. values; surely these shouldn't be static), or how we can combine this with a trusted review authority to get a dynamic rating that changes over time?
If I'm understanding your question correctly, I think you are confusing two issues. Google requires reviews to be created using Schema markup in order for the review to have a chance to rank directly in the SERPs.
It is the companies that provide reviews (Yelp, Angie's List, Washington Times, etc.) that have to set up their content management systems to output user-generated review data in the proper markup.
So if you're a web developer working for one of these companies, then it makes sense to code the CMS so that the listings are displayed using schema markup.
If you are the marketer, your job is to get reviews, not to format the way they are displayed.
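As a rough sketch of what that means in practice (the record shape and field names here are made up for illustration), the markup is generated from stored review data rather than hard-coded, so the rating values change whenever the underlying reviews change:
import json

def review_jsonld(review):
    # Build the Review markup from a review record stored in the CMS
    # (the record's shape here is hypothetical).
    data = {
        "@context": "http://schema.org/",
        "@type": "Review",
        "itemReviewed": {"@type": "Thing", "name": review["item_name"]},
        "author": {"@type": "Person", "name": review["author"]},
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": str(review["rating"]),
            "bestRating": str(review["best_rating"]),
        },
        "publisher": {"@type": "Organization", "name": review["publisher"]},
    }
    return '<script type="application/ld+json">%s</script>' % json.dumps(data)

print(review_jsonld({"item_name": "Super Book", "author": "Joe", "rating": 7,
                     "best_rating": 10, "publisher": "Washington Times"}))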
There are of course other ways to use Schema markup on your own site to boost organic traffic. Consider for example the first SERP screenshot displayed in this article.
Here the webmaster has used schema markup to list three upcoming events in their result, which gives them four links in a single listing. This makes the listing stand out and gives users more incentive to click, almost guaranteeing a higher click-through rate than if they had not used the markup.

Schema.org: is ProfessionalService deprecated?

After reading several recent popular articles on the internet, I decided to use ProfessionalService over LocalBusiness for my web design company. It is my understanding that LocalBusiness is very broad and that it is best to be as specific as possible, which is why I opted to use both ProfessionalService and additionalType with The Product Types Ontology.
Using Google Tag Manager my json-ld looks like this:
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "ProfessionalService",
  "additionalType": "http://www.productontology.org/id/Web_design",
  "name": "BYBE",
  "url": "https://www.bybe.net",
  "logo": "https://www.bybe.net/wp-content/themes/showboat/logo-bybe.png",
  "description": "Creative website design company based in Bournemouth and Poole, Dorset.",
  "telephone": "01202 949749",
  "areaServed": ["Bournemouth", "Poole", "Dorset"],
  "openingHoursSpecification": [
    {
      "@type": "OpeningHoursSpecification",
      "dayOfWeek": [
        "Monday",
        "Tuesday",
        "Wednesday",
        "Thursday",
        "Friday"
      ],
      "opens": "09:00",
      "closes": "17:00"
    }
  ],
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "Flat 11, East Cliff Grange, 35 Knyveton Road",
    "addressLocality": "Bournemouth",
    "addressRegion": "Dorset",
    "postalCode": "BH1 3QJ"
  },
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": "50.73744",
    "longitude": "-1.8495269"
  },
  "sameAs": [
    "https://plus.google.com/+ByBeBournemouth",
    "https://twitter.com/bybe_net",
    "https://www.facebook.com/ByBeUK",
    "https://uk.pinterest.com/bybenet/",
    "https://www.youtube.com/c/ByBeBournemouth",
    "https://www.linkedin.com/company/bybe"
  ]
}
</script>
I'm a little confused over the choice of words Schema has used on the ProfessionalService page:
SOURCE
Original definition: "provider of professional services."
The general ProfessionalService type for local businesses was deprecated due to confusion with Service. For reference, the types that it included were: Dentist, AccountingService, Attorney, Notary, as well as types for several kinds of HomeAndConstructionBusiness: Electrician, GeneralContractor, HousePainter, Locksmith, Plumber, RoofingContractor. LegalService was introduced as a more inclusive supertype of Attorney.
It's not clear whether ProfessionalService is completely deprecated, since it is still listed among the Schema.org types. I suspect they mean it's deprecated for a certain kind of usage. I'd be grateful if a Schema Jedi could shed some light on this issue.
Question(s):
Is ProfessionalService completely deprecated? If it's not, then please include an example demonstrating the type of usage that is deprecated; that way it'll help me and others.
ProfessionalService is deprecated for all cases, not only for some specific ones.
However, it will likely never be removed from Schema.org, because it would do more harm than good: many sites might still use this type, and many of them will probably never update their structured data (or even notice that it got deprecated in the meantime).
See also what the Schema.org webmaster, Dan Brickley, says about superseded types:
We shouldn't make the warnings too heavy or it creates awkwardness e.g. when search marketing people have recommended something to their clients then it gets superseded. We want consumers to respect older structures wherever possible and not worry publishers into constantly updating in the absence of concrete product-related incentives imho.
So if you have to use this type, nothing will break (just don't expect updates for this type, or integration with future developments of the vocabulary). But if possible, it would be better to use an alternative.
If not using ProfessionalService, the closest type for your web design company would be LocalBusiness. The services (design, development, consulting, CMS updates, etc.) your company provides can be modelled with Service (where the provider is the LocalBusiness) and/or with makesOffer (where the Offer can reference the Service with itemOffered), or with hasOfferCatalog in the same way if you want to model them as a list.
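For example, a cut-down restructuring along those lines might look roughly like this (just a sketch; most of your existing properties such as address, geo, openingHoursSpecification, and sameAs carry over unchanged, and the serviceType value is only illustrative):
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "LocalBusiness",
  "name": "BYBE",
  "url": "https://www.bybe.net",
  "makesOffer": {
    "@type": "Offer",
    "itemOffered": {
      "@type": "Service",
      "serviceType": "Web design",
      "provider": {
        "@type": "LocalBusiness",
        "name": "BYBE"
      }
    }
  }
}
</script>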

Can I retrieve sitelinks through the Custom Search API?

I want to scrape the sitelinks that are shown in Google search results (like "About us", "Home Page", etc.). Is there any way I can retrieve them?
I recently implemented the Google Search JSON API, and from my understanding, the only way to get the website links is through the JSON callback, where each result contains formattedUrl or htmlFormattedUrl. The query would be the site in question, and hopefully the first results would give you relevant links for the site (a minimal code sketch follows the example item below).
However, if I properly understood your question, you want to scrape the sub-links of a given website, which is something a web crawler would do. If you are the owner of the website, you can create a sitemap using many tools around the web, but if your intentions can be classified as "other", then I believe you are barking up the wrong tree. See this question, which will point you toward creating a simple web crawler.
// Example customsearch#result item in which the query was Deovandski.
"items": [
{
"kind": "customsearch#result",
"title": "Student Experience - College of Science and Mathematics (NDSU)",
"htmlTitle": "Student Experience - College of Science and Mathematics (NDSU)",
"link": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
"displayLink": "www.ndsu.edu",
"snippet": "Sep 16, 2015 ... Association for Computing Machinery Student Chapter Chair: Jordan Goetze \nAdvisor: Brian Slator. Upsilon Pi Epsilon President: Deovandski ...",
"htmlSnippet": "Sep 16, 2015 \u003cb\u003e...\u003c/b\u003e Association for Computing Machinery Student Chapter Chair: Jordan Goetze \u003cbr\u003e\nAdvisor: Brian Slator. Upsilon Pi Epsilon President: \u003cb\u003eDeovandski\u003c/b\u003e ...",
"cacheId": "pyzF9XJwrXsJ",
"formattedUrl": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
"htmlFormattedUrl": "https://www.ndsu.edu/scimath/currentstudents/student_experience/",
"pagemap": {
"cse_image": [
{
"src": "https://www.ndsu.edu/fileadmin/_processed_/csm_080117_anatomy_03med_9dbc3c8cce.jpg"
}
],
"cse_thumbnail": [
{
"width": "184",
"height": "275",
"src": "https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcTTL-GZRfSv30cyESsCnd_65BFoLMDdo8fqNS58mHfRbGiOTjSq-e-o28FE"
}
]
}
},
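As mentioned above, here is a minimal sketch of that approach with the Custom Search JSON API (the key and cx values are placeholders): query for the site and collect each result's formattedUrl.
import requests

# Query the Custom Search JSON API for the site in question and collect
# the URLs Google returns for it; these are the closest thing to
# sitelinks the API exposes.
resp = requests.get(
    "https://www.googleapis.com/customsearch/v1",
    params={"key": "YOUR_API_KEY", "cx": "YOUR_ENGINE_ID", "q": "Deovandski"},
)
for item in resp.json().get("items", []):
    print(item["formattedUrl"])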

Google Search API Results Completely Different from Google.com Results

Below is one JSON item returned from the following query:
https://www.googleapis.com/customsearch/v1?key={key}&cx={key}&q=Action+Motivation%2c+Inc.&alt=json
The "dc.type" in the Json is "Patent" and this is obviously patent data BUT I didn't specify that search engine. I've googled this to death but can't find anything re why patent data would be returned from a simple query like this. If Google "Action Motivation, Inc." on the regular google.com page, I get completely different (normal) results. Has anyone had this problem?
"items": [
{
"kind": "customsearch#result",
"title": "Patent US5622527 - Independent action stepper - Google Patents",
"htmlTitle": "Patent US5622527 - Independent \u003cb\u003eaction\u003c/b\u003e stepper - Google Patents",
"link": "https://www.google.com/patents/US5622527",
"displayLink": "www.google.com",
"snippet": "Apr 22, 1997 ... Original Assignee, Icon Health & Fitness, Inc., Proform Fitness ....",
"htmlSnippet": "Apr 22, 1997 \u003cb\u003e...\u003c/b\u003e Original Assignee, Icon Health & Fitness..."
"formattedUrl": "https://www.google.com/patents/US5622527",
"htmlFormattedUrl": "https://www.google.com/patents/US5622527",
"pagemap": {
"book": [
{
"description": "A motivational exercise stepping machine has a pair of independently operable pivoting treadles for operation..."
"url": "https://www.google.com/patents/US5622527?utm_source=gb-gplus-share",
"name": "Patent US5622527 - Independent action stepper",
"image": "https://www.google.com/patents?id=&printsec=frontcover&img=1&zoom=1"
}
],
"metatags": [
{
***"dc.type": "Patent"***,
"dc.title": "Independent action stepper",
"dc.contributor": "William T. Dalebout",
"dc.date": "1994-3-23",
"dc.description": "A motivational exercise stepping machine has a pair of independently operable pivoting treadles for operation by a user's feet. Each treadle..."
"dc.relation": "JP:S5110842"
}
]
}
},
{
When using their API, you can issue around 40 requests per hour. The results you see from the API are not what a real user sees. You are limited to what they give you; it's not really useful if you want to track ranking positions or see what a real user would see. That's something you are not allowed to gather.
If you want a higher number of API requests, you need to pay.
60 requests per hour cost 2,000 USD per year; more queries require a custom deal.