How to find the search volume of a large set of keywords (say 1 million) with Keyword Planner or the AdWords API - seo

I am using Keyword Planner to find keywords' search volumes, but when I upload a .CSV file there is a limit of 3,000 keywords.
How can I get the search volume for a much larger set of keywords?
Is there any way to do this with the AdWords API or some other tool?

Yes. You can use the AdWords API TargetingIdeaService to get Keyword Planner data for large sets of keywords.
This will let you retrieve search volume (and other metrics) for a specified list of keywords. Along with your input list of keywords, you have to specify a requestType of STATS in your TargetingIdeaSelector.
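For illustration, a minimal sketch of a STATS request using the googleads Python client (the client library, the v201809 API version, and the batch size are assumptions; the service also caps how many queries one request may carry, so a million keywords must be sent in batches):

```python
from googleads import adwords

PAGE_SIZE = 100  # assumption: adjust paging and batching to your quota


def get_search_volumes(client, keywords):
    # TargetingIdeaService backs Keyword Planner in the AdWords API.
    service = client.GetService('TargetingIdeaService', version='v201809')
    selector = {
        'ideaType': 'KEYWORD',
        'requestType': 'STATS',  # STATS = metrics for exactly the keywords given
        'requestedAttributeTypes': ['KEYWORD_TEXT', 'SEARCH_VOLUME'],
        'paging': {'startIndex': 0, 'numberResults': PAGE_SIZE},
        'searchParameters': [{
            'xsi_type': 'RelatedToQuerySearchParameter',
            'queries': keywords,  # the API caps queries per request, so batch
        }],
    }
    volumes = {}
    page = service.get(selector)
    for entry in page['entries']:
        attrs = {a['key']: getattr(a['value'], 'value', None)
                 for a in entry['data']}
        volumes[attrs['KEYWORD_TEXT']] = attrs['SEARCH_VOLUME']
    return volumes


if __name__ == '__main__':
    # Reads credentials from a googleads.yaml file.
    adwords_client = adwords.AdWordsClient.LoadFromStorage()
    print(get_search_volumes(adwords_client, ['flower delivery', 'cheap flowers']))
```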

Related

Google Custom Search with no sites and "Search entire web" returns far fewer results than a normal Google search

I'm trying to create a simple Python program that returns the ratio of search result counts between two keywords (so two searches in total). However, the google-api-python-client that uses my Custom Search Engine always returns about a tenth of the results of a normal Google search.
I don't have any restrictions on the CSE, except that I've set it to use 'google.fi' and set the user geolocation to Finland, because that mirrors the way I normally search on Google on the web.
Any ideas?
According to:
https://developers.google.com/custom-search/json-api/v1/reference/cse/list
You can adjust the number of results returned with the "num" parameter. However:
Valid values are integers between 1 and 10, inclusive.
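For what it's worth, a minimal sketch of such a ratio check with google-api-python-client (the API key and engine ID are placeholders; note that totalResults is itself only an estimate, which may account for part of the discrepancy):

```python
from googleapiclient.discovery import build

API_KEY = 'YOUR_API_KEY'  # placeholder
CSE_ID = 'YOUR_CSE_ID'    # placeholder


def estimated_total(query):
    service = build('customsearch', 'v1', developerKey=API_KEY)
    # num is capped at 10 per request, but totalResults carries the
    # engine's estimate of the full result count.
    res = service.cse().list(q=query, cx=CSE_ID, num=10).execute()
    return int(res['searchInformation']['totalResults'])


ratio = estimated_total('python') / estimated_total('java')
print('Result-count ratio:', ratio)
```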

Solr: How to store a limited amount of text instead of the full body

As per our business requirements, I need to index the full story body (consider a news story, for example), but in the Solr query result I need to return only a preview text (say, the first 400 characters) to bind to the target news listing page.
As I know, there are two options in the schema file for any field: stored=false/true. The only way I can see right now is to set it to true, take the full story body in the result, and then excerpt the preview text manually, but this seems impractical because (1) it will occupy GBs of disk space to store the full body and (2) the JSON response becomes very heavy (the query result can return 40K-50K stories).
I also know about limiting the number of records, but for certain reasons we need the complete result at once.
Any help with achieving this requirement efficiently?
In order to display just 400 characters in the news overview, you can simply use the Solr Highlighting feature and specify the number of snippets and their size. For instance, the Standard highlighter has these parameters:
hl.snippets: Specifies the maximum number of highlighted snippets to generate per field. It is possible for any number of snippets from zero to this value to be generated. This parameter accepts per-field overrides.
hl.fragsize: Specifies the size, in characters, of fragments to consider for highlighting. 0 indicates that no fragmenting should be considered and the whole field value should be used. This parameter accepts per-field overrides.
If you want to index everything but store only part of the text, you can follow the solution advised here in the Solr community.
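As an illustration, a hedged sketch of such a highlighting request over HTTP (the core name, field names, and query are made up, and the stock /select handler is assumed):

```python
import requests

# Ask Solr for light fields only, plus a ~400-character highlighted
# preview of the body field instead of the stored full text.
params = {
    'q': 'body:election',
    'fl': 'id,title',      # keep the full body out of the response
    'hl': 'true',
    'hl.fl': 'body',
    'hl.snippets': 1,
    'hl.fragsize': 400,
    'rows': 50,
    'wt': 'json',
}
resp = requests.get('http://localhost:8983/solr/news/select', params=params)
data = resp.json()

# Previews come back in a separate "highlighting" section, keyed by doc id.
for doc in data['response']['docs']:
    preview = data['highlighting'][doc['id']].get('body', [''])[0]
    print(doc['title'], '->', preview)
```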

Filter the fields returned by Elasticsearch hits to enhance performance (source filtering)

We have indexed documents with around 70 fields, some of them with store=yes but not indexed, and others with store=no but indexed (some analyzed and some not analyzed). When querying, our .NET client (the one talking to the ES cluster for search) extracts the complete documents that match the search.
We want to enhance performance, but we don't need all the fields of the indexed documents (the fields required vary from query to query and are passed as view columns).
At the query level (the JSON query body), what is the best way to do this filtering? Source filtering, maybe? I'm not sure; I googled, but the documentation is very immature. Is there a way to specify in the query that for this search request body I want only these fields?
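For reference, Elasticsearch supports exactly this through source filtering in the request body; a minimal sketch, with made-up index and field names:

```python
import json
import requests

# Only the listed fields are returned in each hit's _source; the rest of
# the document never leaves the cluster, which trims the response payload.
query = {
    '_source': ['title', 'price'],  # hypothetical field names
    'query': {'match': {'description': 'wireless'}},
}
resp = requests.post('http://localhost:9200/products/_search',
                     headers={'Content-Type': 'application/json'},
                     data=json.dumps(query))
for hit in resp.json()['hits']['hits']:
    print(hit['_source'])
```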

Elasticsearch - higher scoring if higher frequency of term

I have 2 documents, and am searching for the keyword "Twitter". Suppose both documents are blog posts with a "tags" field.
Document A has ONLY 1 term in the "tags" field, and it's "Twitter".
Document B has 100 terms in the "tags" field, but 3 of them are "Twitter".
Elasticsearch gives the higher score to Document A even though Document B has a higher term frequency; Document B's score is "diluted" because it has more terms. How do I give Document B the higher score, since it contains the search term more often?
I know Elasticsearch/Lucene performs some normalization based on the number of terms in the document. How can I disable this normalization so that Document B gets the higher score?
As the other answer says, it would be interesting to see whether you get the same result on a single shard. I think you would, and that depends on the norms for the tags field, which are taken into account when computing the score using the tf/idf similarity (the default).
In fact, Lucene does take into account the term frequency, in other words the number of times the term appears within the field (1 or 3 in your case), and the inverse document frequency, in other words how frequent the term is in the index, in order to compare it with the other terms in the query (in your case that makes no difference, since you are searching for a single term).
But there's another factor, called norms, that rewards shorter fields and takes into account any index-time boosting, which can be set per field (in the mapping) or even per document. You can verify that norms are the reason for your result by enabling the explain option in your search request and looking at the explain output.
I guess the fact that the first document contains only that tag makes it score higher than the other one, which contains the tag multiple times but a lot of other tags as well. If you don't like this behaviour, you can just disable norms in your mapping for the tags field. Norms are enabled by default when the field is "index":"analyzed" (the default). You can either switch to "index":"not_analyzed" if you don't want your tags field to be analyzed (that usually makes sense, but it depends on your data and domain) or add the "omit_norms": true option to the mapping for your tags field.
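To make that concrete, a hedged sketch of such a mapping for the older (pre-2.x) Elasticsearch versions where string fields and omit_norms apply; the index and type names are made up:

```python
import json
import requests

# Disable length normalization on the tags field so that a document with
# many tags is not penalized relative to one with a single tag.
mapping = {
    'mappings': {
        'post': {                          # hypothetical type name
            'properties': {
                'tags': {
                    'type': 'string',
                    'index': 'not_analyzed',  # treat each tag as one token
                    'omit_norms': True,
                },
            },
        },
    },
}
resp = requests.put('http://localhost:9200/blog',  # hypothetical index
                    headers={'Content-Type': 'application/json'},
                    data=json.dumps(mapping))
print(resp.json())
```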
Are the documents found on different shards? From the Elasticsearch documentation:
"When a query is executed on a specific shard, it does not take into account term frequencies and other search engine information from the other shards. If we want to support accurate ranking, we would need to first execute the query against all shards and gather the relevant term frequencies, and then, based on it, execute the query."
The solution is to specify the search type. Use the dfs_query_and_fetch search type to execute an initial scatter phase that computes the distributed term frequencies for more accurate scoring.
You can read more here.
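A minimal sketch of passing that search type as a query-string parameter (the index name is made up):

```python
import json
import requests

query = {'query': {'match': {'tags': 'Twitter'}}}
# dfs_* search types add a pre-query phase that collects term statistics
# from all shards, so scoring uses global rather than per-shard frequencies.
resp = requests.post(
    'http://localhost:9200/blog/_search',
    params={'search_type': 'dfs_query_and_fetch'},
    headers={'Content-Type': 'application/json'},
    data=json.dumps(query))
for hit in resp.json()['hits']['hits']:
    print(hit['_score'], hit['_source'])
```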

Flickr Geo queries not returning any data

I cannot get the Flickr API to return any data for lat/lon queries.
http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&has_geo=1&extras=geo&bbox=0,0,180,90
This should return something, anything. It doesn't work if I use lat/lon either. I can get some photos returned if I look up a place_id first and then use that in the query, except then all the photos returned are from anywhere and not from that place ID.
E.g.,
http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&placeId=8iTLPoGcB5yNDA19yw
I deleted my key, obviously; replace it with yours to test.
Any help appreciated, I am going mad over this.
I believe that the Flickr API won't return any results if you don't put additional search terms in your query. If I recall from the documentation, this is treated as an unbounded search. Here is a quote from the documentation:
Geo queries require some sort of limiting agent in order to prevent the database from crying. This is basically like the check against "parameterless searches" for queries without a geo component.
A tag, for instance, is considered a limiting agent as are user defined min_date_taken and min_date_upload parameters — If no limiting factor is passed we return only photos added in the last 12 hours (though we may extend the limit in the future).
My app uses the same kind of geo searching, so what I do is put in an additional search term of the minimum date taken, like so:
http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&has_geo=1&extras=geo&bbox=0,0,180,90&min_taken_date=2005-01-01%2000:00:00
Oh, and don't forget to sign your request and fill in the api_sig field. My experience is that geo-based searches don't behave consistently unless you attach your api_key and sign your search. For example, I would sometimes get search results and then later, with the same search, get no images when I didn't sign my query.
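For completeness, a hedged sketch of the classic MD5 request signing described above (the key and secret are placeholders; this assumes the old pre-OAuth scheme of MD5-hashing the secret concatenated with the sorted parameter pairs):

```python
import hashlib
import requests

API_KEY = 'your_api_key'        # placeholder
API_SECRET = 'your_api_secret'  # placeholder

params = {
    'method': 'flickr.photos.search',
    'api_key': API_KEY,
    'media': 'photo',
    'has_geo': '1',
    'extras': 'geo',
    'bbox': '0,0,180,90',
    'min_taken_date': '2005-01-01 00:00:00',  # the limiting agent
    'format': 'json',
    'nojsoncallback': '1',
}

# Classic Flickr signing: secret + key/value pairs sorted by key, MD5-hexed.
sig_base = API_SECRET + ''.join(k + params[k] for k in sorted(params))
params['api_sig'] = hashlib.md5(sig_base.encode('utf-8')).hexdigest()

resp = requests.get('https://api.flickr.com/services/rest/', params=params)
print(resp.json())
```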