Whenever I run the code below, it gives me the most recent 10 tweets from #Spongebob, instead of giving me the tweets that include "Bikini Bottom" from the last 10 tweets. How do I make it conditional to the keyword?
user = api.get_user('SpongeBob')
public_tweets = api.user_timeline("Bikini Bottom", count=10, screen_name = "SpongeBob")
for tweet in public_tweets:
print(tweet.text)
You need to use the Twitter Search API, with the correct search operators.
In this case, you want to search for the string "bikini bottom from:spongebob"
With Tweepy, this will be:
public_tweets = api.search("bikini bottom from:spongebob")
Related
I am using premium account (not sandbox) for data collection.
I want to collect:
All tweets in English that contain ‘china’ or ‘chinese’ that are user geolocated to US and not geolocated at tweet level, excluding all retweets
All tweets in English that contain ‘china’ or ‘chinese’ that are user geolocated to ‘Minnesota’ and not geolocated at tweet level, excluding all retweets
The code is as follows:
premium_search_args = load_credentials('twitter_API.yaml',
yaml_key ='search_tweets_premium_api', env_overwrite=False)
# keywords for the search
# key word 1
keywords = '(China OR Chinese) lang:en profile_country:US -place_country:US -is:retweet'
# key word 2
keywords = '(China OR Chinese) lang:en -place_country:US profile_region:"Minnesota" -is:retweet'
# define search rule
rule = gen_rule_payload(keywords,from_date='2019-12-01',
to_date='2019-12-10',results_per_call=500)
# create result stream and print before start
rs = ResultStream(rule_payload=rule, max_results=1250000,
**premium_search_args)
My problems are that:
For the first one, a large portion of the results I get didn’t satisfy the query. First, some don’t have Profile Geo enrichment, i.e. user.derived.locations attribute is not in the user object. Second, if it is, a lot don’t have country code US, i.e. they are identified to other countries.
For the second one, the result I get from this method is a smaller subset of the results I can get from 1). That is, when I filter all tweets user geolocated to Minnesota (by user.derived.locations.region) from profile_country:US, it gives a larger sample than using profile_region:“Minnesota”. A considerable amount of data is missing using this method.
I have tried several times but it seems that user geolocation operators don’t work exactly what I want. Does anyone has any idea why this is the case? I would very much appreciate any answers/suggestions/comments.
Thank you!
Is there any way to limit the results returned by the Google Custom Search API to 1 per domain?
Yes. If you're looking for the first result. Then you can try the following if you are familiar with Python.
res = service.cse().list(query,cx='YOURSEARCHENGINEIDHERE,).execute()
name = res['items'][0]['title'] #title of the first search result
url = res['items'][0]['link'] #url of the first search result
I've built an app using twitter4j which pulls in a bunch of tweets when I enter a keyword, takes the geolocation out of the tweet (or falls back to profile location) then maps them using ammaps. The problem is I'm only getting a small portion of tweets, is there some kind of limit here? I've got a DB going collecting the tweet data so soon enough it will have a decent amount, but I'm curious as to why I'm only getting tweets within the last 12 hours or so?
For example if I search by my username I only get one tweet, that I sent today.
Thanks for any info!
EDIT: I understand twitter doesn't allow public access to the firehose.. more of why am I limited to only finding tweets of recent?
You need to keep redoing the query, resetting the maxId every time, until you get nothing back. You can also use setSince and setUntil.
An example:
Query query = new Query();
query.setCount(DEFAULT_QUERY_COUNT);
query.setLang("en");
// set the bounding dates
query.setSince(sdf.format(startDate));
query.setUntil(sdf.format(endDate));
QueryResult result = searchWithRetry(twitter, query); // searchWithRetry is my function that deals with rate limits
while (result.getTweets().size() != 0) {
List<Status> tweets = result.getTweets();
System.out.print("# Tweets:\t" + tweets.size());
Long minId = Long.MAX_VALUE;
for (Status tweet : tweets) {
// do stuff here
if (tweet.getId() < minId)
minId = tweet.getId();
}
query.setMaxId(minId-1);
result = searchWithRetry(twitter, query);
}
Really it depend on which API system you are using. I mean Streaming or Search API. In the search API there is a parameter (result_type) that is an optional parameter. The values of this parameter might be followings:
* mixed: Include both popular and real time results in the response.
* recent: return only the most recent results in the response
* popular: return only the most popular results in the response.
The default one is the mixed one.
As far as I understand, you are using the recent one, that is why; you are getting the recent set of tweets. Another issue is getting low volume of tweets that have the geological information. Because there are very few users added the geological information to their profile, you are getting very few tweets.
When performing user search with the new Twitter API (can be checked here https://dev.twitter.com/console), found the problem with limiting the number of returned results, as well as with paging.
So, let's say I want to get 5 results from searching and use count parameter:
https://api.twitter.com/1.1/users/search.json?q=Online&count=5
It works correctly, returns 5 records. But if I set count to zero, there is still one result returned:
https://api.twitter.com/1.1/users/search.json?q=Online&count=0
Is it expected?
Then i tried to use paging for the same purposes (wanted to get the first page, and limit results in it):
https://api.twitter.com/1.1/users/search.json?q=Online&per_page=5&page=1
Now it looks like the limit doesn't work at all, there are much more than 5 records returned.
Does anybody know if something is wrong with the queries, or it's an API bug?
**Try like this : **
https://api.twitter.com/1.1/users/search.json
Parameters:
1) page : Specifies the page
2) count : Number of user results to retrieve per page and max is 20.
Following example will return 1 result from page 2 for search keyword "wordpress"
<?php
require_once('api/TwitterAPIExchange.php');
require_once("token.php"); // For your access token - viral
$searchword = "wordpress";
$url = 'https://api.twitter.com/1.1/users/search.json';
$getfield = '?&page=2&count=1&q='.$searchword;
$requestMethod = 'GET';
$twitter = new TwitterAPIExchange($settings);
$followers = $twitter->setGetfield($getfield)
->buildOauth($url, $requestMethod)
->performRequest();
$json = json_decode($followers, true);
print_r($json);
?>
i think the 'per_page' is not a valid parameter in this 1.1 API.
And then, it returns the default number 20.
https://dev.twitter.com/docs/api/1.1/get/users/search
I'd like to ask if there is a more efficient way to get more than 50 results besides these options?
How do I get more locations?
Foursquare Venue API & Number of Results
and this, which is for the old API Foursquare API nearByVenue service issue
I'm using the current foursquare api for the venue search https://developer.foursquare.com/docs/venues/search .
What I'd like is something like an offset option, in order to get more results, but it seems that there isn't such an option.
Is there an alternative solution?
Thank you in advance.
you should use venues explore with offset and limit as paramters,
venues explore gives you totalResults and you can use this response to calculate number of pages you need in paginate
for example assume totalResults is 90(pay attention at offset and limit parametr value )
in first request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=0&limit=30
in second request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=30&limit=30
in third request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=60&limit=30
for 90 results you can get all records with above three request
There is actually another option not mentioned here (not pagination though)
Using the (experimental?) categoryId filter.
You can search for a single point (ll) a few times with different category ids, giving you many results (some duplicates as venues can have more than one category).
So you can search for 'Food' venues and 'Nightlife' venues at the same place, getting 100 results in stand of 50.. as said it is 100 results, but not unique results, could be duplicates. I think that is more efficient then trying to play around with the browse radius thing.
Not pagination, but will give a lot more results than a normal search - usually enough even in urban areas.
But yea, having some sort of way to extract more than 50 on a single point is not possible, but could be nice :)
Afraid not. Currently there is no pagination, in order to find more venues you need to move your search area around as in the answers you highlighted. I agree, pagination would be handy though!
For the explorer endpoint this worked for me: If the maximum number of results that is returned for instance is 100, just use offset=100 in the next call which gives you the next 100 results starting from 100 (the offset). Iterate (e.g. using while loop) and keep increasing offset by 100 until you reach the total number or results (which is returned in in the api for totalResults).
My first stack overflow post, tried to answer as clearly as possible
def getNearbyVenues(neighborhoods, latitudes, longitudes, radius=500,ven_num=300):
venues_list=[]
for name, lat, lng in zip(neighborhoods, latitudes, longitudes):
i=0
while (i < ven_num+50):
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&offset={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
lat,
lng,
radius,
i,
LIMIT)
# make the GET request
results = requests.get(url).json()['response']['groups'][0]['items']
# return only relevant information for each nearby venue
venues_list.append([(
name,
lat,
lng,
v['venue']['name'],
v['venue']['location']['lat'],
v['venue']['location']['lng'],
v['venue']['categories'][0]['name']) for v in results])
i=i+50
nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
nearby_venues.columns = ['Neighborhood',
'Neighborhood Latitude',
'Neighborhood Longitude',
'Venue',
'Venue Latitude',
'Venue Longitude',
'Venue Category']
print('Ok')
return(nearby_venues)
the code above has worked perfectly with me, where ven_num variable is the desired limit for calling venues in a certain neighborhood