I have a project that involves having public data downloaded from Google plus, can you give me a reference on how I can download like 1 GB of any type of public data from Google plus?
The data can be posts or circles information. I've tried to work with developer tools but the far I got is downloading my own profile information but what I need is public data.
Thanks !
There is no truly "public" data on Google+.
Every stream is unique to a user.
Try viewing the site without logging in, and you'll see what I mean.
Since users have the ability to block other users from viewing even their "public" posts, before Google shows you a post they check to see if you're on the blocked list. For them to be able to do that, you have to be logged in.
Your best bet would be to create a dummy account and only look at your nearby stream or What's Hot.
Otherwise you'd need to circle users, and that would create the stream. G+ is not like twitter. There's no firehose to speak of.
To programmatically cull data, you would have to use their API, but even then their HTTP API limits you to 20 results per search and you have to provide a query.
You could get up to 100 results per user if you picked individuals and got their userids, but again there's not a programmatic way to get a bulk dump.
You could randomly select users by using an activity search for a dictionary entry, and then seed that into the activity listing api... something like (in pure pseudocode)
for Random word in dictionary
group = userids from GET https://www.googleapis.com/plus/v1/activities?query=[word]
for userid in group
GET https://www.googleapis.com/plus/v1/people/[userid]/activities/collection/public
Actual code would of course depend on the language.
Related
If you type IAM <GO> in the terminal you'll be shown the UserID, UUID, CLID etc.
Is it possible to extract this information through blpapi when using the Desktop API to connect via BBComm? I've seen references to Identity and populating that by sending an AuthorizationRequest but it appears that's only relevant for SAPI/B-PIPE.
To the best of my knowledge and after asking a couple of Bloomberg reps - this isn't possible. The best work around which I've found is: each user creates an EQS screen called their UUID. Add some filtering which causes this screening to return nothing. Then the application, upon start up, requests all possible UUIDs as EQS screens and stops when it doesn't get back an error - that's the UUID.
This is a dirty, dirty hack and, granted, this only works if you have few distinct users using your system. You don't want to ask may users to create such a screen and probably don't want to iterate over thousands of EQS screen names.
There is a "SID report" which is provide together with monthly invoices from Bloomberg which contains the UUIDs for users - this can be used to look up existing users but when setting up a brand new account you have to manually copy this information out of the terminal.
This is more of a general programming question.
I'm trying to create an app, think of it as a Yelp clone. I have most of it working but I'm missing one important feature. The data of the places around me. For now I'm only focused on food, so I'd like it if I search something like "Pizza", it'd show me all the pizza joints near me.
I was originally planning to use Google Places API. However if you havent heard, they're changing their pricing and lowering the free tier and upping the cost by a huge margin.
There's also the problem of saving the data. One workaround I saw a user suggest was to just keep using Google's API, but every time you make the query, store the data in your own DB as well (I only need address and name and latitude and longitude) so eventually, you'd have what you need in a sense. However I also want to have something like a simple rating system for each place like Yelp, but Google (and all other places like MapBox, Here Maps, etc) states something along the lines of "info from their API should not be stored or cached for more than 24hrs" but it's very broad and not specific.
So what I was planning to do was, call the Google API, grab the 3 info I need (Address, Name, Lat/Lng), add more fields to store the rating, likes, whatever else the user will add. Then store it in my database, but that doesn't seem like a solution now.
So does anyone have any ideas or advice? Or know of a service where I can get the details of all the food places? And if possible, can anyone confirm that storing the Name, Address, Lat&Lng is a violation of their policy since in my eyes, it's public data, but something like the rating that Google provides, or the pictures that Google provides, now that's Google property.
For obtaining places you can use OpenStreetMap, e.g. using Overpass API. Since larger traffic can be expected you should run your own database(s) instead of using the public APIs.
However OSM doesn't contain ratings. So you have to combine this data with some other publicly available rating system.
Is it possible to create a Kapow Robot that can search Google for the Operating hours of the Businesses from our list/database and update the timings if changes are made?
Please share if there are any other more efficient ways than the KAPOW robot that can be implemented with minimal effort and cost-effectiveness.
That's what the Google Places API is there for. While you could in theory just open Google Maps in a Load Page action, enter the query string and then parse the results, I would advise against it. Here's why:
The API will be faster, returning results in a structured manner (JSON)
Kapow has actions for calling RESTful services and parsing/modifying JSON
Google does not like robots parsing their pages, and most likely will lock you out (i.e. present you with Captchas sooner or later)
If you decide to go for the API, here's what you should do:
Get your API key first, see this page for details: https://developers.google.com/places/web-service/get-api-key. Note that the free plan allows for 1,000 requests within a 24-hours limit (https://developers.google.com/places/web-service/usage)
Maintain the place ids for all the businesses you'd like to query regularly, and update your list.
For each place, retrieve the details as described in the API documentation. The opening hours will be within the JSON response: https://developers.google.com/places/web-service/details
Update your list. I'd recommend using a definite type in Kapow for that, and using the actions Store in Database and Query Database. In case you need the data elsewhere, you may create additional robots (e.g. for Excel files, sending data per email, et cetera).
Context
I am in the process of providing some consultancy on doing a HTTP GET using YouTube Data API V3; in order to develop a Windows based application to GET a list of results from Youtube, for say a specific CATEGORY, or a specific TAG.
We are open to using any programming language(I'm from a C++ background and am hoping You tube will support direct HTTP connections without using Google client SDK and so on) to connect to YouTube and (HTTP) GET data.(Once a month or so, so YouTube API quotas should not be problem).
The Issue
We are being told by some of my client's web developers that YouTube API v3 will only return a maximum of 500 records/results, for say a query that returns JUST the Total viewers, the Video's link, and basic meta data such as that.
S, say I wish to find 5,000 results for category "House music" or "basketball" - and I have the Developer Key etc are all set up, would that be possible?
If so, what GET fields would I need to populate(such as "max_results_per_page")?
Thank you.
The API won't provide more than ~500 search results for any arbitrary query. It's by design. Technically, it means that the nextPageToken field won't be returned once you hit ~500 results. No additional parameter can change that.
If you want more than ~500 results for a query, you have to split it into more specific sub-queries. I'd suggest using the publishedAfter and publishedBefore parameters to achieve that, but feel free to experiment with the other ones here.
This only holds for the search-Query. Other queries like "PlaylisItem:list" deliver more results. I have tested with 100.000 items to get the videos of a playlist.
I'm running a contest on the web where the image with the most likes wins. It's tiresom having to go through 900 images manually so what I want to do is, sort all images with the tag lets say #computer after the amount of likes, with the most liked pics on top. I have searched the net like crazy for some program or site that does this (ExtraGram, gramhoot, statigram, webstagram) but none offer to sort by amount of likes and it drives me INSANE! It's a really relevant request.
I've tried istafeed.js but it doesn't include all images, actually it leaves out the ones with the moest likes which defies the purpose.
There's nothing I know of in the Instagram API that sends back media sorted by likes in advance. I don't think there's a tool to do this either, but writing one is relatively simple IMO and I've done it before for a contest specifically.
The simplest thing to do is to do the following:
Use the Instagram API (via a library or pure REST) to query by tag. For instance, if you only care about the most recently tagged media or you want to process by date, you can use the [/tag/tag-name/media/recent][1] enpoint.
Page through each result page by processing the next_max_id/next_max_tag_id.
Collect the results locally into a database. You will receive the "like" count for each media item. You will have to update the data if you want to track the likes over time.
Sort the results using your database or if it's a small result set, you could skip #3 and just sort in memory.
If you need to refresh the results, you need to subscribe to the Tag via the API. You can give Instagram a URL to then push updates, and then you'll have to retrieve 1 or media items and update them in your database accordingly.
You will of course need to register your application with Instagram to get an API key if you want to do this. Then you can either send them your client_id or use OAuth.
The best way to achieve this is to pull the photos in and then sort them programmatically based on the likes numeric value. I've designed a plugin that does this automatically for you for anyone interested.
Instagram Journal