Foursquare Venue API & Number of Results, in a more efficient way?

Foursquare Venue API & Number of Results, in a more efficient way? - api

I'd like to ask if there is a more efficient way to get more than 50 results besides these options?
How do I get more locations?
Foursquare Venue API & Number of Results
and this, which is for the old API Foursquare API nearByVenue service issue
I'm using the current foursquare api for the venue search https://developer.foursquare.com/docs/venues/search .
What I'd like is something like an offset option, in order to get more results, but it seems that there isn't such an option.
Is there an alternative solution?
Thank you in advance.

you should use venues explore with offset and limit as paramters,
venues explore gives you totalResults and you can use this response to calculate number of pages you need in paginate
for example assume totalResults is 90(pay attention at offset and limit parametr value )
in first request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=0&limit=30
in second request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=30&limit=30
in third request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=60&limit=30
for 90 results you can get all records with above three request

There is actually another option not mentioned here (not pagination though)
Using the (experimental?) categoryId filter.
You can search for a single point (ll) a few times with different category ids, giving you many results (some duplicates as venues can have more than one category).
So you can search for 'Food' venues and 'Nightlife' venues at the same place, getting 100 results in stand of 50.. as said it is 100 results, but not unique results, could be duplicates. I think that is more efficient then trying to play around with the browse radius thing.
Not pagination, but will give a lot more results than a normal search - usually enough even in urban areas.
But yea, having some sort of way to extract more than 50 on a single point is not possible, but could be nice :)

Afraid not. Currently there is no pagination, in order to find more venues you need to move your search area around as in the answers you highlighted. I agree, pagination would be handy though!

For the explorer endpoint this worked for me: If the maximum number of results that is returned for instance is 100, just use offset=100 in the next call which gives you the next 100 results starting from 100 (the offset). Iterate (e.g. using while loop) and keep increasing offset by 100 until you reach the total number or results (which is returned in in the api for totalResults).
My first stack overflow post, tried to answer as clearly as possible

def getNearbyVenues(neighborhoods, latitudes, longitudes, radius=500,ven_num=300):
venues_list=[]
for name, lat, lng in zip(neighborhoods, latitudes, longitudes):
i=0
while (i < ven_num+50):
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&offset={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
lat,
lng,
radius,
i,
LIMIT)
# make the GET request
results = requests.get(url).json()['response']['groups'][0]['items']
# return only relevant information for each nearby venue
venues_list.append([(
name,
lat,
lng,
v['venue']['name'],
v['venue']['location']['lat'],
v['venue']['location']['lng'],
v['venue']['categories'][0]['name']) for v in results])
i=i+50
nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
nearby_venues.columns = ['Neighborhood',
'Neighborhood Latitude',
'Neighborhood Longitude',
'Venue',
'Venue Latitude',
'Venue Longitude',
'Venue Category']
print('Ok')
return(nearby_venues)
the code above has worked perfectly with me, where ven_num variable is the desired limit for calling venues in a certain neighborhood

Related

Twitter Premium API Profile location operators profile_country: and profile_region: not working

I am using premium account (not sandbox) for data collection.
I want to collect:
All tweets in English that contain ‘china’ or ‘chinese’ that are user geolocated to US and not geolocated at tweet level, excluding all retweets
All tweets in English that contain ‘china’ or ‘chinese’ that are user geolocated to ‘Minnesota’ and not geolocated at tweet level, excluding all retweets
The code is as follows:
premium_search_args = load_credentials('twitter_API.yaml',
yaml_key ='search_tweets_premium_api', env_overwrite=False)
# keywords for the search
# key word 1
keywords = '(China OR Chinese) lang:en profile_country:US -place_country:US -is:retweet'
# key word 2
keywords = '(China OR Chinese) lang:en -place_country:US profile_region:"Minnesota" -is:retweet'
# define search rule
rule = gen_rule_payload(keywords,from_date='2019-12-01',
to_date='2019-12-10',results_per_call=500)
# create result stream and print before start
rs = ResultStream(rule_payload=rule, max_results=1250000,
**premium_search_args)
My problems are that:
For the first one, a large portion of the results I get didn’t satisfy the query. First, some don’t have Profile Geo enrichment, i.e. user.derived.locations attribute is not in the user object. Second, if it is, a lot don’t have country code US, i.e. they are identified to other countries.
For the second one, the result I get from this method is a smaller subset of the results I can get from 1). That is, when I filter all tweets user geolocated to Minnesota (by user.derived.locations.region) from profile_country:US, it gives a larger sample than using profile_region:“Minnesota”. A considerable amount of data is missing using this method.
I have tried several times but it seems that user geolocation operators don’t work exactly what I want. Does anyone has any idea why this is the case? I would very much appreciate any answers/suggestions/comments.
Thank you!

options for questions in Watson conversation api

I need to get the available options for a certain question in Watson conversation api?
For example I have a conversation app and in some cases Y need to give the users a list to select an option from it.
So I am searching for a way to get the available reply options for a certain question.

I can't answer to the NPM part, but you can get a list of the top 10 possible answers by setting alternate_intents to true. For example.
{
"context":{
"conversation_id":"cbbea7b5-6971-4437-99e0-a82927607079",
"system":{
"dialog_stack":["root"
],
"dialog_turn_counter":1,
"dialog_request_counter":1
}
},
"alternate_intents":true,
"input":{
"text":"Is it hot outside?"
}
}
This will return at most the top ten answers. If there is a limited number of intents it will only show them.
Part of your JSON response will have something like this:
"intents":[{
"intent":"temperature",
"confidence":0.9822100598134365
},
{
"intent":"conditions",
"confidence":0.017789940186563623
}
This won't get you the output text though from the node. So you will need to have your answer store elsewhere to cross reference.
Also be aware that just because it is in the list, doesn't mean it's a valid answer to give the end user. The confidence level needs to be taken into account.
The confidence level also does not work like a normal confidence. You need to determine your upper and lower bounds. I detail this briefly here.
Unlike earlier versions of WEA, the confidence is relative to the
number of intents you have. So the quickest way to find the lowest
confidence is to send a really ambiguous word.
These are the results I get for determining temperature or conditions.
treehouse = conditions / 0.5940327076534431
goldfish = conditions / 0.5940327076534431
music = conditions / 0.5940327076534431
See a pattern?🙂 So the low confidence level I will set at 0.6. Next
is to determine the higher confidence range. You can do this by mixing
intents within the same question text. It may take a few goes to get a
reasonable result.
These are results from trying this (C = Conditions, T = Temperature).
hot rain = T/0.7710267712183176, C/0.22897322878168241
windy desert = C/0.8597747113239446, T/0.14022528867605547
ice wind = C/0.5940327076534431, T/0.405967292346557
I purposely left out high confidence ones. In this I am going to go
with 0.8 as the high confidence level.

Gmail API : resultSizeEstimate gives odd numbers

https://developers.google.com/gmail/api/v1/reference/users/messages/list
The Gmail API allows to retrieve an estimate for the number of message for a given query (from:send1#gmail.com is:unread). Somehow the number that the API returns seems very different from the one shown on the webmail.
Any idea on how to return the actual number?

resultSizeEstimate is only an estimate and isn't guaranteed to be accurate for general queries. It should give more reasonable (still estimated) numbers for queries on specific labels ("label:MYLABEL" or "label:MYLABEL is:unread").
Unfortunately, there currently isn't a method of getting the actual numbers, other than retrieving all of them and looking at the size of the list returned.

Partial answer - I was hitting the 103 limit as well.
I've noticed that by playing with the maxResults parameter of the Google API I get different, yet still invalid, results.
With this value of maxResults:
request = service.users().messages().list(userId=user_id, labelIds=formatted_labels)
--> resultSizeEstimate = 103
With this value of maxResults:
request = service.users().messages().list(userId=user_id, labelIds=formatted_labels, maxResults=100)
--> resultSizeEstimate = 103
With this value of maxResults:
request = service.users().messages().list(userId=user_id, labelIds=formatted_labels, maxResults=1000)
--> resultSizeEstimate = 511
With this value of maxResults:
request = service.users().messages().list(userId=user_id, labelIds=formatted_labels, maxResults=50)
--> resultSizeEstimate = 52

This is not exactly the solution to all cases but it may help. You can get the exact number of emails under a given label using Users.labels:get
https://developers.google.com/gmail/api/v1/reference/users/labels/get
The messagesTotal value gives you that number.
Unfortunately, as others notice, listing messages with Users.messages:list or doing a search do not return accurate results.

Get ALL tweets, not just recent ones via twitter API (Using twitter4j - Java)

I've built an app using twitter4j which pulls in a bunch of tweets when I enter a keyword, takes the geolocation out of the tweet (or falls back to profile location) then maps them using ammaps. The problem is I'm only getting a small portion of tweets, is there some kind of limit here? I've got a DB going collecting the tweet data so soon enough it will have a decent amount, but I'm curious as to why I'm only getting tweets within the last 12 hours or so?
For example if I search by my username I only get one tweet, that I sent today.
Thanks for any info!
EDIT: I understand twitter doesn't allow public access to the firehose.. more of why am I limited to only finding tweets of recent?

You need to keep redoing the query, resetting the maxId every time, until you get nothing back. You can also use setSince and setUntil.
An example:
Query query = new Query();
query.setCount(DEFAULT_QUERY_COUNT);
query.setLang("en");
// set the bounding dates
query.setSince(sdf.format(startDate));
query.setUntil(sdf.format(endDate));
QueryResult result = searchWithRetry(twitter, query); // searchWithRetry is my function that deals with rate limits
while (result.getTweets().size() != 0) {
List<Status> tweets = result.getTweets();
System.out.print("# Tweets:\t" + tweets.size());
Long minId = Long.MAX_VALUE;
for (Status tweet : tweets) {
// do stuff here
if (tweet.getId() < minId)
minId = tweet.getId();
}
query.setMaxId(minId-1);
result = searchWithRetry(twitter, query);
}

Really it depend on which API system you are using. I mean Streaming or Search API. In the search API there is a parameter (result_type) that is an optional parameter. The values of this parameter might be followings:
* mixed: Include both popular and real time results in the response.
* recent: return only the most recent results in the response
* popular: return only the most popular results in the response.
The default one is the mixed one.
As far as I understand, you are using the recent one, that is why; you are getting the recent set of tweets. Another issue is getting low volume of tweets that have the geological information. Because there are very few users added the geological information to their profile, you are getting very few tweets.

Facebook Application that Searches for nearest Place

Is it possible to create a facebook application that would display a list of the nearest place of interest location?
Much like the current existing Urbanspoon Application (http://www.urbanspoon.com/c/338/Perth-restaurants.html) but on facebook.
Can somebody please point me to the right direction :)

Get place ID for user's current location:
SELECT current_location FROM user WHERE uid=me()
Get coordinates for that place ID:
https://graph.facebook.com/106412259396611
Get nearby places:
https://graph.facebook.com/search?type=place&center=65.5833,22.15&distance=1000
You could of course get the latitude and longitude by other means.

I know this is a resolved post, but i've also been doing some work on this and thought I'd add my findings / advice...
Using the FQL:
SELECT page_id, name, description, display_subtext FROM place WHERE distance(latitude, longitude, "52.3", "-1.53333") < 10000 AND checkin_count > 50 AND CONTAINS("coffee") order by checkin_count DESC LIMIT 100
This will get pages containing the word "coffee" in their title, description or meta-data that are within 10k of the specified lat/lon.
"page_id": 163313220350407,
"name": "Starbucks At Warwick Services, M40",
"display_subtext": "Coffee Shop・Service Station Supply・3,012 were here"
I like to set a min check-in count of around 50 to help filter out the ad-hoc places that single users create such as "the coffee machine at jim's house". This helps ensure better quality and relevancy of returned results.
Hope this helps someone.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas