I am trying to get the twitter name, website, and markets for any cryptocurrency listed on coinmarketcap.
For example:
https://coinmarketcap.com/currencies/bitcoin shows all of the data that I need but how would I parse the data listed on that page to get the twitter name and website associated with bitcoin?
Don't try to parse the data on that page if you are trying to use it in an application. If you are doing it for a one time data collection then maybe that is OK. There is absolutely no guarantee that the format will remain consistent or that the information will even be there tomorrow.
You should try to find an API that gives you the information that you are looking for. An API is a contract that is expected to be honored and therefore is reliable. A quick look at CoinMarketCap's API doesn't appear to have the info you are looking for but maybe another one exists that does.
If you were to parse the HTML you could write a regex for the specific thing you are looking for. For example if you want to get the website you could write a regex that would pick up the pattern:
Website
the capture group ([^"]+) is the website in this example. You could do something like that for every element you want to get.
I am running a website, whose community is powered by Disqus. I would like to create user profile pages, where the page would display the particular user's most recent activity, but only for my particular site (forum, in Disqus' terminology).
I ran through the entire API documentation, but I could not find a way that would allow me to filter by both user, and forum. I would be able to grab either the entire list of posts for a given forum, or the one from a particular user.
In every API call, there is a mysterious query paramater, where I tried to plug a series of filters, but none of them worked.
Is there something that I could be missing?
It's not that obvious, but you can use the query param as a filter for users. Try something like this:
https://disqus.com/api/3.0/forums/listPosts.json?forum={SHORTNAME}&query=user:{USERNAME}&api_key={YOUR_API_KEY}
Right, this is really working on my nerves, but Instagram has to do something about their bloody documentation.
I am already trying for a week to live update my website with new instagram posts without refreshing the page. Twitter was fairly easy, but instagram is just one big mess. Basically I use the realtime Instagram API, the callback and all that stuff is working fine, but thanks to Instagram it does not return me an ID from the post that is new, the callback only returns some basic stuff:
[{"changed_aspect": "media", "object": "tag", "object_id": "nofilter", "time": 1391091743, "subscription_id": xxxxx, "data": {}}]
with this data you are nothing, except for the Tag, but I knew the tag before this callback too so doesn't matter. It actually only tells me that there is a new post. I have tried doing the same request as when the page loads, when this callback occurs, and get the posts that are newer than those already on the page. Unfortunately I have not succeeded in this yet. I have picked the ID from the last posted instagram post, and checked if it is in the callback request, and it's not.
What am I doing wrong?
I'd appreciate some help, thanks!
Edit:
I'd like to note that this is not only a problem with the realtime api, but also with the normal API. I just don't know how to compare data so I don't get duplicates in my database(normal api), or on my website (realtime). I can't find any tutorial or documentation (Yes, I might be blind), that explains to me how to compare data. I can only find the min_id and max_id, but no explanation what these id's contains. I checked these id's with id's from results, and they do not match. It's not an ID from a media item.
I also checked the next_url, and in my logic thinking, this should be a URL to the next page (like Twitter).
Am I looking at this all wrong?
Ok strike my old answer, I changed the way I do this. Here's how I'll do it now.
I still wait for 10 hits on my Real-time subscription, when I reach 10 I send off a new thread (if one is not already running).
The sync thread queries my DB for a value, I need the last min_tag_id I used. Then I query:
https://api.instagram.com/v1/tags/*/media/recent?access_token=*&min_tag_id=*
Try it out here: https://api.instagram.com/v1/tags/montreal/media/recent?access_token=*
You'll get 20 results, and a min_tag_id value. Append that to your url, you'll see you get no results. Wait a couple of seconds and refresh. Eventually you'll get some media, and a new min_tag_id.
(You can ignore the "next_url" value they give you, you won't be using that).
Basically you only need to store that min_tag_id and query until you have no more results, that means you're done then.
When you get a subscription push, you need to query that endpoint (tag / recent).
I normally start an synchronous thread to perform this so I can answer in under 2 seconds to Instagram.
Then you parse that endpoint and look for a "next url" value.
Keep querying that end point, parsing the media and going to the next url until you find your stop condition.
For me I try to match 10 consecutive records in my DB. Basically from the tag, I store media when then meet my business rules.
The Instagram documentation is accurate and actually well written.
The realtime API is working correctly. As stated in the documentation:
The changed data is not included in the payload, so it is up to you
how you'd like to fetch the new data. For example, you may decide only
to fetch new data for specific users, or after a certain number of
photos have been posted.
http://instagram.com/developer/realtime/
You only receive a notification that an update has happened to your subscribed object. It is up to you to call the API to find out what that data is.
You can call the /tags/[tag-name]/media/recent with an access token that you have previously stored on your own server or DB. Then, you should be able to compare the data returned from that endpoint with any data you have retrieved prior, and just pull the objects that you do not yet have.
I am trying to get some information about the post people do in google plus. In particular I am interested in the "+1"'s.
Either from the google api or directly from the google plus web site you can get the total count and name of the people who did "+1". But, I would be interested in getting the time or timestamp of the "+1"'s. Does anyone knows if it is possible or how can I do that?
Help is always appreciated
Thanks to all,
As you can see at https://developers.google.com/+/api/latest/activities, the only data we return for a +1 is a list of the +1-ers, as well as the total number of people who +1'ed the item you are looking at.
If you would like to request additional data, please file a feature request in our Issue Tracker: https://code.google.com/p/google-plus-platform/issues/list. It would really help if you could be as detailed as possible in the type of data you would like to see and how you would ideally use that data.
If you're dealing with your own website and only care about +1s for pages on your own domain, you could use Google Analytic's social information to see how +1s change over time. You wouldn't get information about who did the +1ing.
I'm trying to query Picasa Web Albums to pull album/photo data (obviously!) however initially I only need to pull the first 4 photos. Is there any way to limit a particular field or specify the items per page?
I've already accomplished something similar with Facebook's Graph API however I'm unable to find anything similar for Picasa. The only items I can find related to limiting the response is specifying which fields, but nothing related to the number of rows.
Use max-results argument. See doc here.
https://developers.google.com/picasa-web/docs/2.0/reference#Parameters