Refining search results of Google Custom Search Engine - google-custom-search

We are using Custom Search Engine(CSE) of Google for getting search results in our website forum. It is working fine and we are able to get the results properly. For making the search easier, we have planned to provide advanced search options like sorting and filtering options in forum which is using Custom Search Engine to get the search result.
We would like to provide the below options for filtering.
Filtering by category - Each forum comes under some category, need to filter the search result based on the category.
Filtering by tags - Each forum has some tags, need to filter the search result based on the tags.
How can we achieve the above requirement using CSE of Google? If it’s not possible using CSE, is there any other different searching option to achieve this scenario?

Related

How to get the most searched words in Solr? [duplicate]

I'm trying to organize a solr search engine. I've already set up the misspelling system and the suggestions.
However I can't seem to find how to retrieve the top 10 most searched words/terms/keywords in solr/lucene. How can I get this? I want to display those on my homepage.
Solr does not provide this kind of feature out of the box. There is the StatsComponent, that provides you with all kind of statistics, but all of those are numeric only.
Depending on how you access solr (directly or via your own app) you could intercept all calls an log the query string. I did this in a recent project where I logged a queries to a database. If you submit all keywords to an other core on your solr server, you can faceting queries on your search terms as described by Hyque
You could use a facet for retrieving the Top X words like this:
http://yourservergoeshere/solr/select?q=*&wt=xml&indent=true&facet=true&facet.query=*&facet.field=message&facet.limit=10&facet.minCount=1
The value of facet.field depends on the field you like to search in. With facet.limit you'll (obviously) limit the amount of results to 10. You'll find the facet results at the end of the results, starting with "facet_counts"
Edit: I really should go to bed earlier. I didn't see the "most searched" in your question. Sorry for that.
Apache Solr does not provide any such capability as of today. There is a desire for this and a JIRA ticket corresponding to it. You can vote for it if you'd like to see it in Solr some day: https://issues.apache.org/jira/browse/SOLR-10359.
The stats component provides information around statistics, but it's mostly numeric in nature. You could parse server logs and come up with a way to build a Frequently Searched Terms (e.g. pump those logs in SiLK or Kibana for visualization).
If you have the ability to change the front end and add some javascript code to the UI or can intercept the search request and make an async or batch calls to APIs for tracking, you can use SearchStax Analytics that provides Search Analytics that tracks searches, clicks, cart actions, revenue, etc.

Custom Search API not returning all results

I am a long time customer of using the Custom Search API.
The problem - as described in the CSE documentation - is that the API is intended to search your own site and not the web in general. It misses results, for example from books.google.com, and results from other languages etc.
Is there another (paid) API that returns all results?
Sample search string: "الاستخدامات التالية من التطبيق"
(The above search gets 1 result in Google Search but 0 results in the Custom Search I am paying for.)
Thanks.
I didn't want to switch to Bing, but I was getting better results in the end.
For anyone else having this issue:
https://learn.microsoft.com/en-us/rest/api/cognitiveservices/bing-web-api-v7-reference

Filter google query results

I'm writing a search engine for wikipedia articles using lucene on the wiki xml dump and I want to calculate the accuracy of the engine when compared to google wiki result on a particular query, when I give "site:en.wikipedia.org" along with the query. I want to do it for multiple queries so I'm getting the google search result URLs manually. I got Google APIs to use a bot to search Google but the problem is I want to get rid off certain type of results like
"/Category:"
"/icon:"
"/file:"
"/photo:"
and user pages.
But I haven't found a convenient way to do this except for using an iterative method of issuing a query, get n number of results, then filter out by using regular expressions, then retrieve the remaining (n-x) results and so on. Google keeps blocking me when I do that.
Is there an intelligent way to get Google results the way I want using Java?
Thanks in advance guys.
You could just try excluding those pages from the Google results, like this:
living people site:en.wikipedia.org -inurl:category -inurl:category_talk -inurl:file -inurl:file_talk -inurl:user -inurl:user_talk

Amazon API search results vs. Amazon.com search results

For our web app, which will use Amazon's API as a basis for some of the site's main interactions, we required the ability to do a generalized search of Amazon's products and return results based on relevancy. The expectation was that their API would work exactly like their actual site's search.
Unfortunately it does not. For instance, querying "joy of cooking" does not return a link to the famous cook book, but to some food processor. Contrarily, on the actual site, one would see the book isn't just first, but it and any derivations occupy the top 5 or so results.
Is there a way of getting this level of relevancy search from Amazon's API without specifying a node to browse through? We need to be able to search everything at once, and the API seems very limited on parameter sets.
The answer is that, if you use "All" as your sorting basis, rather than "Blended", you will get results that are inline with Amazon's own product search. Older docs don't seem to account for this discrepency, but testing both methods has shown "All" to be the preferred product sorting method.
http://docs.amazonwebservices.com/AWSECommerceService/2010-11-01/DG/
Pagesearch under "SearchIndex: All"
You don't get any item sorting options with this method, but if all you want is "most relevant" results, this is the preferred method.

Programmatic Querying of Google and Other Search Engines With Domain and Keywords

I'm trying to find out if there is a programmatic way to determine how far down in a search engine's search results my site shows up for given keywords. For example, my query would provide my domain name, and keywords, and the result would return a say 94 indicating that my site was the 94th result. I'm specifically interested in how to do this with google but also interested in Bing and Yahoo.
No.
There is no programmatic access to such data. People generally roll out their own version of such trackers. Get the Google search page and use regexes to find your position. But now different results are show in different geographies and results are personalize.
gl=us parameter will help you getting results from US, you can change geography accordingly to get the results.
Before creating this from scratch, you may want to save yourself some time (and money) by using a service that does exactly that [and more]: Ginzametrics.
They have a free plan (so you can test if it fits your requirements and check if it's really worth creating your own tool), an API and can even import data from Google Analytics.