How often to run the cron, to mine twitter public timeline?

How often to run the cron, to mine twitter public timeline? - api

The webapps that depend on the public timeline of twitter, how often do they collect the data? There must be hundreds of thousands of messages every minute, correct? How do they manage to collect all the tweets, without missing any of them?

Some services (Friendfeed is a good example) are granted access to the Twitter Streaming API, aka the 'firehose'. It requires approval and a written agreement.

The publictimeline is not a great place to mine data anymore. Twitter now uses its Streaming APIs to output tweets like crazy. The closest comparison to the publictimeline would be the spritzer method, but that only includes a small sample. If you need to gather all (or more) tweets than the spritzer method, you'll need to sign a written agreement to get access to other Streaming API (HTTP push) feeds, such as the firehose feed, which returns all public tweets.

The twitter API is rate limited, as has been said. The public timeline (twitter.com/public_timeline) is not rate limited in the same sense, but it is only updated every 5 seconds, so most tweets never appear there.
There are I think three or four companies that have access to the firehose, as Twitter's full feed is called. FriendFeed is one of these. Another is Gnip. Gnip resells the feed to other companies. This is probably the only feasible way to get a full twitter feed.

Go here:
http://twitter.com/help/request_whitelisting
and get your account white-listed (allows 20,000 per hour) if 100 requests per hour isn't enough.
#ceejayoz its not 100 GET requests its 100 requests in general excluding a few requests like verify_credentials and rate_limit_status.

Related

Creating Multiple Google/YouTube Data API Keys

Is it possible to create multiple API keys for the YouTube Data API?
The majority of Live YouTube Subscriber Counters use loads of different API keys for their counters (as can be seen in their JavaScript code).
The aim of doing so is to not exceed the daily quota limit of 1,000,000 and having to send requests every few seconds per page visited would mean that the limit would be reached very quickly.
How are they able to get away with this?

Here is a SO post to answer your question.
Technically you can run your application using different API Keys it
should work fine. Technically there is nothing wrong with creating
additional projects on Google Developer console. You don't need to go
as far as creating another Google account.

Podio API limit

I am working on one product which is fetching all the organization/workspace and app details of the customer. The customer can refresh them any time.
So let’s say I have one customer who has 100 applications across multiple workspaces so around it is making around 110 calls to get each application detail, workspace details and organizations.
Now if that customer refreshed the applications multiple times like 10 times in an hour so the action only for that is 1000 API calls. If I have 50 such users active and doing this thing then it will be something 50000.
AFAIK I can not make so many API calls in an hour so how to handle this scenario. I know a lot of applications are doing such things so want to understand how everyone is handling this.

If you need a higher rate limit, I would encourage you to contact Podio support and ask specifically for what you need. We have internal guidelines for evaluating these kinds of requests and may increase the limit for your user and client ID if appropriate.
In general, though, I would expect your app to implement some kind of batching, transient storage, and/or caching layers, especially if your customers are interacting with Podio exclusively or primarily through your system.
Please see our official statement here: https://developers.podio.com/index/limits
Summary:
The general limit is 5,000 API calls per hour, but if the API call is marked as "Rate limited" in the API reference the call is deemed resource intensive and a lower rate of 1000 calls per hour is enforced. If you hit the rate limits the API will begin returning 420 HTTP error codes for all API calls. Rate limits are per user per API key.
Contacting support:
If you have a project that requires a higher rate limit contact support#podio.com with a brief description of your project, your estimated usage and the client_id of the API key you are using.
Usage tips:
Tips for reducing API usage
Avoid making API requests inside loops. Instead of fetching individual objects inside a loop, fetch a collection of objects in one API operation. E.g. filter items
Cache results whenever possible. This is especially true when you are displaying data to the public (i.e. every sees the same output).
Don't poll for changes. Instead of polling Podio to see if your content has changed use webhooks or push to receive a notification. This might save you thousands of requests: https://developers.podio.com/doc/hooks
Use logging to see how many requests you're making
Bundle responses with "fields" parameter

You might want to build an API proxy app; you would need a messaging queue and a rate limiter. This would lets you keep track of the api calls consumptions across apps and users.
Also worth noting: some API routes are more expensive than others if they are more resource intensive on the Podio side… The term in use is rate-limited: rate limited api route are bound to 1k calls an hours, so in effect costs 5 times as much as regular routes.
Hope this helps!

Measure how hot a topic is on Twitter

What kind of service should I use to measure how hot a topic is on Twitter, and how hot it has been in the past?
I thought about:
The Twitter API (https://dev.twitter.com/rest/reference/get/search/tweets) that lets me run searches up to 100 tweets. So in this case I have to make multiple calls to determine how many tweets there are. Is that correct?
TweetReach, that gives reports like this: https://tweetreach.com/reports/16000571, but the cheapest plan is at 300$/month.

With the Twitter API, you have a few options, but none of them may be exactly what you want, and none of them can go back very far into the past. You would have to either compile that information yourself, or use an external service like the one you mentioned.
Using the search API, you can only get results from the past 7 days, and are limited to 100 tweets per request. You can also set result_type to popular to get the most popular tweets about that search term. Twitter does have rate limits, but the ones for search are relatively high. You can use 180 requests every 15 minutes for any user you have authenticated, plus 450 requests every 15 minutes for the app itself (completely separate from the user requests). So if you only use app requests, you can get 45,000 tweets every 15 minutes.
If you don't need to search for specific terms, you can get trending topics in different areas using trends. The available areas can be retrieved using trends/available. Searching for trends also gives you the tweet_volume of each trend over the past 24 hours. If you check the trends every 24 hours and save the volumes, you can build up histories of trending topics.
Another option is using the streaming api. This only gives you current tweets, but you can use track to only get results for a set of terms, which you can then analyze.
Any external service, like TweetReach, will probably either cost you money or strictly limit the amount you can do with it unless you pay.

I'm the Social Media Manager for Union Metrics (we make TweetReach and lots of other things) and I just wanted to let you know that our free snapshots are built on the Search API which gives it those restrictions you've already discussed above, while our full snapshot reports can grab up to 1500 tweets for $20.
We do have more comprehensive Twitter analytics which I think you've already looked at, and those do backfill 30 days before tracking going forward. However you might have missed our new product Echo, which allows for a full, interactive search of the entire Twitter archive (you can see it in action here https://unionmetrics.com/product/echo-twitter-archive-search/) and is available through our Social Suite.
I understand if you don't have a large budget, and I completely understand the dilemma of cost of your time to build what you need vs. budget restrictions. Hope this helps at least let you know what else we offer!
Sarah A. Parker
Social Media Manager | Union Metrics
Fine Makers of TweetReach, The Union Metrics Social Suite, and more

Twitter Streaming API limits?

I understand the Twitter REST API has strict request limits (few hundred times per 15 minutes), and that the streaming API is sometimes better for retrieving live data.
My question is, what exactly are the streaming API limits? Twitter references a percentage on their docs, but not a specific amount. Any insight is greatly appreciated.
What I'm trying to do:
Simple page for me to view the latest tweet (& date / time it was posted) from ~1000 twitter users. It seems I would rapidly hit the limit using the REST API, so would the streaming API be required for this application?

You should be fine using the Streaming API, unless those ~1000 users combined are tweeting more than (very) roughly 60 tweets per second at any moment.
Using the Streaming API endpoint statuses/filter with the follow parameter, you are allowed up to 5000 users. There is no rate limit except when the stream returns more than about 1% of the all tweets being tweeted at that moment. (60 tweets per second is 1% of the average rate of tweets, which is always fluctuating, so don't rely on that number.)
If your stream does go above the 1% threshold, you can detect this. (See the LIMIT notice.) Then you would use the REST API to find missed tweets.

Twitter simply will not allow multiple streams from one registered app/account. Doing so will result in the older one being closed.
Also too many connection tries are not allowed as well and will result in a user being blocked.
Reference docs: Public Streaming API (outdated)

API call request limit

I have been looking into various different APIs which can provide my the weather data I need in JSON format. A lot of these API's have certain limits such as: in order to get more requests per minute, you need to pay more money per month so that your app can make more API requests.
However, a lot of these API's also have free account which five you limited access to them.
So what I was thinking is, wouldn't it be possible for a developer to just make lots of different developer accounts with an API provider and then just make lots of different API keys?
That way, they wouldn't have to pay anything as they could stick with the free accounts. Whenever one of the API keys has reached the maximum daily request calls, the developer could just put a switch statement in their code which gets their software to use a different API key.
I see no reason why this wouldn't work from a technical point of view... but, is such a thing allowed?
Thanks, Dan.

This would technically be possible, and it happens.
It is also probably against the service's terms, a good reason for the service to ban all your sock puppet accounts, and perhaps even illegal.
If the service that offers the API has spent time and money implementing a per-developer limit for their API, they have almost certainly enforced that in their terms of service, and you would be wise to respect those.
(relevant xkcd)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas