Google CSE multiple refinements - google-custom-search

I'm using Google CSE and would like to use multiple refinements in a query, e.g. q=search+term+more:refinement1+more:refinement2. However, I can't seem to find this in the documentation. I know this works for a single more:... label, but how can I specify multiple labels?
Thx

Probably you can't.
I've noticed that you can also specify a site to force the search, such as: "keyword1 keyword2 site:includesite.com", but you can't specify more than one.
However, you can specify more than one excluded site: "keyword1 keyword2 -site:includesite1.com -site:includesite2.com". I don't know why. :)

Related

How to limit scrapy to a particular section of a website, e.g. http://www.domain.com/section/

I have a scrapy project which crawls all the internal links of a given website. This is working fine, however we have found a few situations where we want to limit the crawling to a particular section of a website.
For example, if you could imagine a bank has a special section for investor information, e.g. http://www.bank.com/investors/
So in the example above, everything in http://www.bank.com/investors/ only would be crawled. For example, http://www.bank.com/investors/something/, http://www.bank.com/investors/hello.html, http://www.bank.com/investors/something/something/index.php
I know I could write some hacky code on parse_url which scans the URL and does a pass if it doesn't meet my requirements (i.e. it's not /investors/), but that seems horrible.
Is there a nice way to do this?
Thank you.
I figured this out.
You need to pass an allow pattern to the LinkExtractor for the section you want to crawl.
For example:
Rule(LinkExtractor(allow=(self.this_folder_only)), callback="parse_url", follow=True)
Everything else will be denied.
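To see what such an allow pattern lets through, here is a minimal sketch using only Python's re module rather than Scrapy itself. It assumes the pattern is a regular expression tested against each absolute URL, which is how LinkExtractor's allow argument is documented to behave. The pattern and the is_allowed helper are illustrative names, not taken from the original project.

```python
import re

# Hypothetical pattern: only URLs under /investors/ on www.bank.com qualify.
INVESTORS_ONLY = re.compile(r"^https?://www\.bank\.com/investors/")


def is_allowed(url: str) -> bool:
    """Return True when the URL falls inside the allowed section."""
    return INVESTORS_ONLY.search(url) is not None


# Example URLs from the question, plus two that should be rejected.
candidates = [
    "http://www.bank.com/investors/something/",
    "http://www.bank.com/investors/hello.html",
    "http://www.bank.com/investors/something/something/index.php",
    "http://www.bank.com/about/",
    "http://www.bank.com/",
]

allowed = [url for url in candidates if is_allowed(url)]
```

The same regular expression string could then be passed as the allow value in the Rule shown above.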

Make google custom search location-aware

When I search for "football images" on google.co.uk, it knows that I mean the sport that elsewhere might be called "soccer". If I do the same search on google.com, I get American Football.
I'm using the custom search API - how can I tell it that I'm in the UK and would like results relevant to here?
You can limit your engine to operate on sites from a particular country via "cr" param, e.g. in Custom Element it looks like this:
<gcse:search cr="gb"></gcse:search>
Google knows some synonyms on the web, but if your particular use case is not recognized correctly, you can add it in the Control Panel under Search Features > Synonyms.
More on synonyms:
Sorry, I ended up answering this myself. I couldn't get any of the instructions under the custom search API itself to work. The answer offered above was also mentioned there, but it just made my CSE context XML apparently invalid. You can, however, make requests to a custom search engine by using the instructions here https://developers.google.com/custom-search/json-api/v1/using_rest and an API key.
This is how I did it:
<gcse:search cr="countryUK"></gcse:search>
You can even return results in a specific language. The code below returns results only in French:
<gcse:search lr = "lang_fr"></gcse:search>
This is the reference for Google Custom Search Element Control API: https://developers.google.com/custom-search/docs/element?hl=en
Try to set the 'gl' parameter to the country you want.
For details, look into CSE:list
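As a rough sketch of how the gl, cr and lr parameters mentioned in these answers could be combined in a JSON API request, the Python below only builds the request URL and does not send it. The build_search_url helper and the placeholder key and engine ID are made up for illustration; the values "uk" and "countryUK" are assumed to be the right codes for the United Kingdom.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, gl=None, cr=None, lr=None):
    """Build a Custom Search JSON API URL with optional locale parameters."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    # Only include the locale parameters that were actually supplied.
    for name, value in (("gl", gl), ("cr", cr), ("lr", lr)):
        if value is not None:
            params[name] = value
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Placeholder credentials; replace with a real API key and engine ID.
url = build_search_url(
    "football images", "YOUR_KEY_HERE", "CX_HERE", gl="uk", cr="countryUK"
)
```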

Customize Google Custom Search

Does anyone know how to spread the result set over all of the sites configured in a Google Custom Search engine? For example, if I have provided sites site 1, site 2, ..., site n to search from, I want, say, the top five results from each of these individual websites as JSON. Is there a way to achieve this?
I know this may be a little late, but it might help someone out.
This will return 2 results via the REST API for GCSE.
https://www.googleapis.com/customsearch/v1?key=YOUR-KEY_HERE&cx=CX_HERE&fields=kind,items&filter=1&num=2&prettyPrint=true&q=querystring
It's the num=2 part you're looking for.
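The num parameter caps the total number of results, not the number per site. One possible way to get a fixed number per site, sketched here in Python, is to issue one request per site and restrict each with the JSON API's siteSearch parameter. This is an assumption about how the problem could be approached, not something the answer above confirms; per_site_urls and the example.com hosts are hypothetical, and the code only builds URLs without sending them.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def per_site_urls(query, sites, api_key, engine_id, per_site=5):
    """Return one request URL per site, each limited to `per_site` results."""
    urls = []
    for site in sites:
        params = {
            "key": api_key,
            "cx": engine_id,
            "q": query,
            "num": per_site,          # results per request (the API allows 1 to 10)
            "siteSearch": site,       # restrict this request to one site
            "siteSearchFilter": "i",  # "i" includes the site, "e" would exclude it
        }
        urls.append(f"{API_ENDPOINT}?{urlencode(params)}")
    return urls


# Hypothetical sites and placeholder credentials.
urls = per_site_urls(
    "querystring", ["site1.example.com", "site2.example.com"],
    "YOUR_KEY_HERE", "CX_HERE",
)
```

Each request counts against the daily query quota, so this costs one query per site.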

Flickr API - Photos Search, excluding tags: Am I doing this wrong?

So, I'm trying to pull all photos of a specific user's account via the flickr.photos.search method, but I want to exclude photos with a particular tag. The related documentation page states that "You can exclude results that match a term by prepending it with a - character." Well, I tried implementing that option, but what I get in return is only one photo (even though there are several photos with the tag in question). That result remains the same whether or not that specific photo has the tag in question, AND whether or not I use the "-" option to exclude that tag rather than include it. I also tried the text method with the exact same result. Here's my REST call:
http://api.flickr.com/services/rest/?&method=flickr.photos.search&api_key='.$api_key.'&user_id='.$user_id.'&tag_mode=any&tags=-blog&extras=url_o,url_t&format=json
And here is the page where I'm trying to get this all working:
http://corazonbrew.com/temp/
Anyone know what is going on here?
It seems the answer in the Flickr discussion board I linked to earlier is proving true. In order to use the exclusion option, there has to also be at least one other, non-excluded tag. Well, that is just not good enough for me.
A couple of friends tell me this is a longstanding bug that will not be fixed anytime soon, if ever. But those friends also kindly reminded me of my n00bishness- this whole time I thought I needed to affect the feed to get the desired output. I totally was not realizing I could just use some good ol' PHP if statements to weed out what I don't want.
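The client-side filtering described above could look roughly like this, written in Python rather than PHP. It assumes the request adds tags to the extras parameter, so that each photo in the response carries a space-separated tags string. The sample data and the exclude_tag helper are made up for illustration.

```python
def exclude_tag(photos, unwanted):
    """Return the photos whose tag list does not contain `unwanted`."""
    kept = []
    for photo in photos:
        # With extras=tags, tags are assumed to arrive as one space-separated string.
        tags = photo.get("tags", "").split()
        if unwanted not in tags:
            kept.append(photo)
    return kept


# Made-up sample shaped like entries in a photos.search JSON response.
sample = [
    {"id": "1", "tags": "blog beer"},
    {"id": "2", "tags": "brewery"},
    {"id": "3", "tags": ""},
]

kept = exclude_tag(sample, "blog")
```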

Normal Google Custom Search

I'm writing an application that analyses search engine results.
With the Google Search API now being deprecated and limited to 1000 queries/day, they are forcing developers to move to the AJAX APIs and to use the Custom Search API to do a Google search.
The thing is I don't need a Custom Search, I need a general search not one that is filtered by site; OK maybe filtered by USA/UK (Google.com/Google.co.uk).
Does anyone know how to just do a regular Google search using the AJAX APIs? Is the Custom Search the right thing to be using?
I don't want to hit the 1000/day limit using the old service but this is exactly what I need.
I did find: How do I create a CSE that searches the entire web?
http://www.google.com/support/customsearch/bin/answer.py?hl=en&answer=1210656
But by the sounds of it this will distort the search results.
Thank you.
OK. Here's how I think it is done.
Create a Custom Search Engine.
Add a site such as *.com.
When this is created, go to the Advanced tab and download the context XML.
Remove the Background Label associated with the site.
Upload the XML to replace the previous context.
This seems to work just fine and is returning the same values as far as I can see.
Yes, you are right *in theory, and this should let you get 100 results a day on the fly. Just this Saturday though, Google confirmed how here -
(* so far though, we can't get it working...)