Google Custom Search API - limit results to 1 per domain

Is there any way to limit the results returned by the Google Custom Search API to 1 per domain?

Yes, if you're only looking for the first result. If you're familiar with Python, you can try the following:
res = service.cse().list(q=query, cx='YOUR_SEARCH_ENGINE_ID').execute()
name = res['items'][0]['title'] #title of the first search result
url = res['items'][0]['link'] #url of the first search result
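For context, here is a minimal sketch of how the service object above might be built, assuming the google-api-python-client package and an API key from the Google Cloud console (the key and engine ID below are placeholders):

from googleapiclient.discovery import build

# Placeholder credentials: substitute your own API key and engine ID.
service = build('customsearch', 'v1', developerKey='YOUR_API_KEY')
res = service.cse().list(q='example query', cx='YOUR_SEARCH_ENGINE_ID').execute()
print(res['items'][0]['title'], res['items'][0]['link'])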

Related

Search for a keyword in a user's tweets

Whenever I run the code below, it gives me the 10 most recent tweets from @SpongeBob instead of only the tweets that include "Bikini Bottom" among the last 10. How do I make it conditional on the keyword?
user = api.get_user('SpongeBob')
public_tweets = api.user_timeline("Bikini Bottom", count=10, screen_name="SpongeBob")
for tweet in public_tweets:
    print(tweet.text)
You need to use the Twitter Search API, with the correct search operators.
In this case, you want to search for the string "bikini bottom from:spongebob"
With Tweepy, this will be:
public_tweets = api.search("bikini bottom from:spongebob")
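Put together, a minimal sketch of the whole flow might look like this, assuming a Tweepy v3-era API object and placeholder credentials:

import tweepy

# Placeholder credentials: substitute your own app/user tokens.
auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_SECRET')
api = tweepy.API(auth)

# The search operators do the filtering server-side.
for tweet in api.search("bikini bottom from:spongebob"):
    print(tweet.text)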

How to get all URLs (not just titles) in a Wikipedia article using the MediaWiki API?

I am using the MediaWiki API to retrieve all possible URLs from a Wikipedia article ('https://en.wikipedia.org/w/api.php?action=query&prop=links&redirects&pllimit=500&format=json'), but it only gives a list of link titles. For example, the Artificial Intelligence Wikipedia page has a link titled "delivery networks", but the actual URL is "https://en.wikipedia.org/wiki/Content_delivery_network", which is what I want.
Use a generator:
action=query&
format=jsonfm&
titles=Estelle_Morris&
redirects&
generator=links&
gpllimit=500&
prop=info&
inprop=url
See API docs on generators and the info module.
I have replaced most of my previous answer, including the code, to use the information provided in Tgr's answer, in case someone else would like sample Python code. This code is heavily based on MediaWiki's sample code for so-called 'raw continuations'.
I have deliberately limited the number of links requested per invocation to five so that one more parameter possibility could be demonstrated.
import requests

def query(request):
    request['action'] = 'query'
    request['format'] = 'json'
    request['prop'] = 'info'
    request['generator'] = 'links'
    request['inprop'] = 'url'
    previousContinue = {}
    while True:
        req = request.copy()
        req.update(previousContinue)  # resume from the last continuation point
        result = requests.get('https://en.wikipedia.org/w/api.php', params=req).json()
        if 'error' in result:
            raise RuntimeError(result['error'])
        if 'warnings' in result:
            print(result['warnings'])
        if 'query' in result:
            yield result['query']
        if 'continue' in result:
            previousContinue = {'gplcontinue': result['continue']['gplcontinue']}
        else:
            break

for result in query({'titles': 'Estelle Morris', 'gpllimit': '5'}):
    # each result maps page IDs to page-info objects carrying 'fullurl'
    for url in [page['fullurl'] for page in list(result.values())[0].values()]:
        print(url)
I mentioned in my first answer that, if the OP wanted to do something similar with artificial intelligence then he should begin with 'Artificial intelligence' — noting the capitalisation. Otherwise the search would start with a disambiguation page and all of the complications that could arise with those.

How to get crawl stats from the Webmaster Tools API

I want to get the data behind the Crawl Stats graph (I can't add the image, graph.png, here because I don't have 10 reputation).
So, for each day, I want to get the 3 values: pages crawled per day, kilobytes downloaded per day, and time spent downloading a page.
The idea is to get an array like this:
$datas['2015-11-20']['pages_crawled'] = 125;
$datas['2015-11-20']['kilobytes'] = 1452;
$datas['2015-11-20']['time_spent'] = 1023;
$datas['2015-11-21']['pages_crawled'] = 146;
$datas['2015-11-21']['kilobytes'] = 2410;
$datas['2015-11-21']['time_spent'] = 1563;
$datas['2015-11-22']['pages_crawled'] = 102;
$datas['2015-11-22']['kilobytes'] = 1560;
$datas['2015-11-22']['time_spent'] = 1400;
Something like this. Thanks especially to @alex for his help.
Unfortunately, you can't get these Crawl Stats via the API.
The only supported methods are webmasters.urlcrawlerrorscounts.query, webmasters.urlcrawlerrorssamples.list, webmasters.urlcrawlerrorssamples.get, and webmasters.urlcrawlerrorssamples.markAsFixed (https://developers.google.com/apis-explorer/#p/webmasters/v3/).
So you can get information about crawl errors, but not general crawl stats.
You can retrieve the crawl error counts with this API call:
https://www.googleapis.com/webmasters/v3/sites/https%3A%2F%2Fwww.mywebsite.com/urlCrawlErrorsCounts/query?latestCountsOnly=true&fields=countPerTypes&key={YOUR_API_KEY}
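As a rough illustration, the same call from Python with the requests library might look like this (a sketch only: the site URL and key are placeholders, and in practice this API generally expects OAuth authorization rather than a bare key):

import requests
from urllib.parse import quote

# Placeholder values: substitute your own site URL and API key.
site = quote('https://www.mywebsite.com', safe='')  # the site URL must be percent-encoded
url = 'https://www.googleapis.com/webmasters/v3/sites/' + site + '/urlCrawlErrorsCounts/query'
params = {'latestCountsOnly': 'true', 'fields': 'countPerTypes', 'key': 'YOUR_API_KEY'}

response = requests.get(url, params=params)
response.raise_for_status()  # raises on HTTP errors such as 401/403
print(response.json())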
The crawl statistics themselves, however, are not exposed through the API.

Get ALL tweets, not just recent ones via twitter API (Using twitter4j - Java)

I've built an app using twitter4j which pulls in a bunch of tweets when I enter a keyword, takes the geolocation out of the tweet (or falls back to the profile location), and then maps them using ammaps. The problem is that I'm only getting a small portion of tweets. Is there some kind of limit here? I've got a DB collecting the tweet data, so soon enough it will have a decent amount, but I'm curious as to why I'm only getting tweets from within the last 12 hours or so.
For example, if I search by my username, I only get one tweet, which I sent today.
Thanks for any info!
EDIT: I understand Twitter doesn't allow public access to the firehose; my question is more about why I'm limited to finding only recent tweets.
You need to keep re-running the query, resetting the maxId each time, until you get nothing back. You can also use setSince and setUntil to bound the dates.
An example:
Query query = new Query();
query.setCount(DEFAULT_QUERY_COUNT);
query.setLang("en");

// set the bounding dates (sdf is a SimpleDateFormat such as "yyyy-MM-dd")
query.setSince(sdf.format(startDate));
query.setUntil(sdf.format(endDate));

QueryResult result = searchWithRetry(twitter, query); // searchWithRetry is my function that deals with rate limits
while (result.getTweets().size() != 0) {
    List<Status> tweets = result.getTweets();
    System.out.print("# Tweets:\t" + tweets.size());

    long minId = Long.MAX_VALUE;
    for (Status tweet : tweets) {
        // do stuff here

        // track the smallest ID seen so the next page starts below it
        if (tweet.getId() < minId)
            minId = tweet.getId();
    }

    query.setMaxId(minId - 1);
    result = searchWithRetry(twitter, query);
}
It really depends on which API you are using: the Streaming API or the Search API. The Search API has an optional parameter, result_type, which can take the following values:
* mixed: include both popular and real-time results in the response.
* recent: return only the most recent results in the response.
* popular: return only the most popular results in the response.
The default is mixed.
As far as I understand, you are getting the recent results, which is why you only see a recent set of tweets. Another issue is the low volume of tweets that carry geolocation information: because very few users attach location information to their tweets or profiles, you will get very few geotagged tweets.
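If you want to experiment with this parameter outside twitter4j, a minimal Tweepy sketch might look like this (placeholder credentials; 'recent' or 'popular' can be substituted for 'mixed'):

import tweepy

# Placeholder credentials: substitute your own app/user tokens.
auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_SECRET')
api = tweepy.API(auth)

# result_type may be 'mixed' (the default), 'recent', or 'popular'.
for tweet in api.search('spongebob', result_type='mixed', count=100):
    print(tweet.created_at, tweet.text)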

Twitter user search with API 1.1: limit the number of results and paging

When performing a user search with the new Twitter API (it can be tested here: https://dev.twitter.com/console), I found problems with limiting the number of returned results, as well as with paging.
So, let's say I want to get 5 results from the search and use the count parameter:
https://api.twitter.com/1.1/users/search.json?q=Online&count=5
It works correctly and returns 5 records. But if I set count to zero, there is still one result returned:
https://api.twitter.com/1.1/users/search.json?q=Online&count=0
Is this expected?
Then I tried to use paging for the same purpose (I wanted to get the first page and limit the results in it):
https://api.twitter.com/1.1/users/search.json?q=Online&per_page=5&page=1
Now it looks like the limit doesn't work at all; many more than 5 records are returned.
Does anybody know if something is wrong with the queries, or is it an API bug?
Try it like this:
https://api.twitter.com/1.1/users/search.json
Parameters:
1) page: specifies the page of results to retrieve.
2) count: the number of user results to retrieve per page (maximum 20).
The following example returns 1 result from page 2 for the search keyword "wordpress":
<?php
require_once('api/TwitterAPIExchange.php');
require_once('token.php'); // defines $settings with your OAuth access tokens

$searchword = "wordpress";
$url = 'https://api.twitter.com/1.1/users/search.json';
$getfield = '?page=2&count=1&q=' . $searchword;
$requestMethod = 'GET';

$twitter = new TwitterAPIExchange($settings);
$followers = $twitter->setGetfield($getfield)
                     ->buildOauth($url, $requestMethod)
                     ->performRequest();

$json = json_decode($followers, true);
print_r($json);
?>
I think per_page is not a valid parameter in the 1.1 API, so it is ignored and the default of 20 results is returned.
https://dev.twitter.com/docs/api/1.1/get/users/search
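For comparison, here is a rough Python equivalent of the PHP example above, assuming OAuth 1.0a user credentials and the requests_oauthlib package (all token values are placeholders):

import requests
from requests_oauthlib import OAuth1

# Placeholder credentials: substitute your own app/user tokens.
auth = OAuth1('CONSUMER_KEY', 'CONSUMER_SECRET', 'ACCESS_TOKEN', 'ACCESS_SECRET')

# count is capped at 20 per page; use page to step through further results.
params = {'q': 'wordpress', 'count': 1, 'page': 2}
resp = requests.get('https://api.twitter.com/1.1/users/search.json', params=params, auth=auth)
resp.raise_for_status()
for user in resp.json():
    print(user['screen_name'])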