Removing PDFs and PPTs from Google Custom Search

I want to remove all non-HTML content (PPTs, DOCs, PDFs) from my Google Custom Search results. I have tried various URL patterns in my custom search engine, but it still returns PDFs and PPTs in the search results.

In the control panel, go to Search features → Advanced → Websearch settings → Query Addition. The documentation describes this setting as follows:
Query Addition: Appends additional query parameters to the search. Search results will
be served using "OR" logic. Supported values: Any search term to add to user query.
Advanced Google Site Search features
https://support.google.com/customsearch/answer/3037004?hl=en
I added -filetype:pdf -filetype:ppt -filetype:doc as the Query Addition, and PDFs, DOCs, and PPTs no longer appear in the results.
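If you query the engine from code rather than relying on the control panel setting, the same exclusion terms can be appended to the user's query before it is sent. This is only a minimal sketch: the helper name exclude_filetypes is made up for illustration, and it assumes the search backend honours -filetype: terms in the query text the same way the Query Addition field does.

```python
def exclude_filetypes(query, extensions):
    """Append one -filetype:<ext> term per extension to the user's query.

    Hypothetical helper: it mimics what the Query Addition setting does,
    but on the client side.
    """
    exclusions = " ".join(f"-filetype:{ext}" for ext in extensions)
    return f"{query} {exclusions}".strip()


if __name__ == "__main__":
    print(exclude_filetypes("annual report", ["pdf", "ppt", "doc"]))
```

Keeping the exclusions in the control panel is simpler when every search should be filtered; building them in code is mainly useful when only some searches should exclude documents.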

Related

Search in multiple sites using Google Custom Search JSON

I am trying to figure out how to search multiple sites using the Google Custom Search JSON API, meaning that results should come only from a specific list of sites.
I was playing with the API explorer - https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list?apix_params=%7B%22cx%22%3A%22011602274690322925368%3Atkz2zvvpmk0%22%2C%22siteSearch%22%3A%22www.walla.co.il%22%7D
and noticed the siteSearch query parameter, but it accepts only a single string, not a list of sites.
How can I search only specific sites?
Thanks
There are a couple of things you can do.
If you know the specific sites you want to search, you can add them as refinements to your engine. Then query for that refinement by adding 'more:<REFINEMENT_LABEL>' to the query.
Or, add 'site:' operators to the query itself. For example: cats site:cnn.com OR site:bbc.com
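The second option can be sketched in a few lines of Python. The helper name restrict_to_sites is hypothetical; it only assembles the query text and does not call the API.

```python
def restrict_to_sites(query, sites):
    """Return the query followed by site: operators joined with OR."""
    if not sites:
        return query
    site_terms = " OR ".join(f"site:{site}" for site in sites)
    return f"{query} {site_terms}"


if __name__ == "__main__":
    # Reproduces the example from the answer above.
    print(restrict_to_sites("cats", ["cnn.com", "bbc.com"]))
```

The resulting string is what you would pass as the q parameter of the request.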

Search a database via Google Custom Search? Attach Google CSE to a (SQL/NoSQL) database for a website?

TOPIC - Google Search Engine / Custom Search - with Database
References
Search for "Google Search Engine" and "Google Custom Search"
(New to StackOverflow; just joined the other day. I'm limited to 2 links I can post right now.)
NOTE:
I have not YET decided/committed to any specific coding language, framework, etc. Not until I figure out how to accomplish my question (below).
BACKGROUND INFO
What I'm trying to do (for now) is add a search box / search engine to a simple website I'm building. Before I get too far into it (planning ahead), I would like to use Google CSE if at all possible (it can do A LOT of things and works well). However, I will have a database of "items" (not sure of the type YET; it will depend on my options and on what I can do with CSE) that I want to be able to search quickly from the search box, i.e. like Amazon.com.
QUESTION:
Is there any way at all to use Google Custom Search and/or the Custom Search API to search, or attach to, a database (SQL, NoSQL, or other)? I would HIGHLY prefer being able to do all of this in Google Cloud Platform, using one of their storage/database products.
If I understand what you are trying to do, Google CSE is enough.
From the Google doc you linked:
#Defining a Custom Search Engine in Control Panel
In the Sites to search section, add the pages you want to include in
your search engine. You can include any sites you want, not just the
sites you own. You can include whole site URLs or individual pages
URLs. You can also use URL patterns.
#Enabling Autocomplete
[...]you can enable or disable autocomplete feature using
enableAutoComplete attribute.
As for "Is there any way at all [..] to search a database": I would say not directly, but it's not a big problem.
Google CSE works on "indexable web pages", so it will not work against a raw DB, an access-restricted site, or a custom network that is not served over http(s)://.
But in your case, if you build a DB, I suppose you will have to make web pages to display the stored data to your users (like product pages on Amazon)?
If so, you can run Google CSE against those pages by adding your http://[server ip] or http://[domain name] to the list of sites to search.
As far as I know, Custom Search does not guarantee that all of your content will be indexed.
You probably want to try exporting a full sitemap.xml and an RSS feed; if the custom search results from either of these don't satisfy you, you will probably want to look at the Google Search Appliance product.
There's also http://sphinxsearch.com/ by the way.

Google autocomplete API for my site

Is it possible to make the Google autocomplete API return results only for my site instead of for all sites? I see that there is a ds parameter, but its only purpose seems to be searching YouTube. How can I get autocomplete, or related or suggested search terms, for a single site only?
I needed the very same thing, and so far the only way I have found to get this working is to create a custom search engine and then pass it as a parameter to the autocomplete call:
http://clients1.google.com/complete/search?client=partner&gs_ri=partner&partnerid={0}&ds=cse
Where {0} is your custom search engine ID.
Certain features, such as returning the results as XML, don't work if you use the partner ID, but at least all the autocomplete results will be from your site.
You can also have multiple search engines and use different ones in different text boxes. The results are just a JSON string that you parse.
Good luck

How to restrict Google Custom Search to specific url prefixes?

If I go to the Google CSE control panel (https://www.google.com/cse/all) I see a list of my custom search engines.
When I click on one I can see in the list the option "sites to search". There I can list
example.com/cool-path
example.com/awesome-path
etc
How do I use the API to do the same? To add multiple domains, sites, or paths to the search? I can't find any documentation specifying this behavior.
The API connects to a search engine that is defined in your Custom Search control panel.
So you create the search engines you need (each with its own list of sites to search) and then connect to whichever one you need through the API, using that engine's unique ID (the cx parameter).
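As a rough illustration of that approach, the sketch below keeps a small table of engines and builds the request URL for whichever one is needed. The engine IDs, the ENGINES table, and the function name build_search_url are placeholders invented for this example; the endpoint and the key, cx, and q parameters are the ones used elsewhere on this page.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

# Hypothetical engine IDs. Each engine is assumed to have its own
# "sites to search" list configured in the control panel.
ENGINES = {
    "cool": "ENGINE_ID_FOR_COOL_PATH",
    "awesome": "ENGINE_ID_FOR_AWESOME_PATH",
}


def build_search_url(api_key, engine_name, query):
    """Build a request URL that targets one of the preconfigured engines."""
    params = {"key": api_key, "cx": ENGINES[engine_name], "q": query}
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("MY_KEY", "cool", "hello world"))
```

The site restrictions themselves stay in the control panel; the code only chooses which engine to ask.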

Multiple file types search using Google Custom Search API

I need to get Google search results for particular filetypes.
For example, in a browser I would search Google directly for "hyperloop filetype:pdf", and it would list PDF files for "hyperloop".
For this, my Google Custom Search request URI will be https://www.googleapis.com/customsearch/v1?key=MY_KEY&cx=MY_UNIQUE_ID&q=hyperloop&fileType=pdf
However, I would now like to get search results for "hyperloop" restricted to the file types .ppt or .doc.
In a browser, I would achieve this by googling "hyperloop filetype:ppt OR filetype:doc".
What is the equivalent search request URI for this query?
I could not find anything about passing multiple values for a single parameter in the Google Custom Search documentation.
Rather than doing
q=hyperloop&fileType=pdf
you can use
q=hyperloop%20filetype:pdf%20OR%20filetype:doc
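A short Python sketch of the same idea, applied to the ppt/doc case from the question, is shown below. The function name filetype_search_url is made up for this example, and MY_KEY and MY_UNIQUE_ID are the placeholders from the question. Note that urlencode also percent-encodes the colon as %3A, which decodes to the same query text.

```python
from urllib.parse import quote, urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def filetype_search_url(api_key, engine_id, term, extensions):
    """Put filetype: operators joined with OR into q and encode the URL."""
    operators = " OR ".join(f"filetype:{ext}" for ext in extensions)
    params = {"key": api_key, "cx": engine_id, "q": f"{term} {operators}"}
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return f"{API_ENDPOINT}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(filetype_search_url("MY_KEY", "MY_UNIQUE_ID", "hyperloop", ["ppt", "doc"]))
```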
Use this; it works for me:
$url = 'https://www.googleapis.com/customsearch/v1?key=AIzaSyCJUGIb_tevRKD-Kxxi5f4&cx=010407088:onjj7gscy2g&q=' . urlencode($keywords) . '&filetype=doc&filetype=docx';