Search repos with GitHub API using date query

If I make the following GET request to the GitHub API, I get about 58 entries:
https://api.github.com/search/repositories?&per_page=250&sort=updated&order=asc&q=%22github-octocat%22
However, the following requests, which add date qualifiers, return 0 entries:
Created since date: ( created:>=2010-09-01 )
https://api.github.com/search/repositories?&per_page=250&sort=updated&order=asc&q=%22github-octocat+created:>=2010-09-01%22
Date range: ( created:2012-08-13..2020-08-14 )
https://api.github.com/search/repositories?&per_page=250&sort=updated&order=asc&q=%22github-octocat+created:2012-08-13..2020-08-14%22
In the GitHub docs, under the section Constructing a search query, the syntax is outlined as such:
SEARCH_KEYWORD_1 SEARCH_KEYWORD_N QUALIFIER_1 QUALIFIER_N
The GitHub docs at Search by when a repository was created or last updated outline date formats like the two above, and Query for values between a range outlines valid combinations of them. I suspect these examples are not meant for this use, since they use URLs intended for browsers, such as https://github.com/search?q=case+pushed%3A%3E%3D2013-03-06+fork%3Aonly&type=Repositories, rather than api.github.com, which is also confusing.
I'm trying to apply the patterns shown in the following resources in order to get a range of dates filter:
Github API call: filter by committer-date, answer by Poonacha
Get issues on a date range from Github enterprise API, answer by Al Neill
Any pointers greatly appreciated.

Even though the GitHub docs outline the syntax as:
SEARCH_KEYWORD_1 SEARCH_KEYWORD_N QUALIFIER_1 QUALIFIER_N
date qualifiers and similar are not recognized when they sit inside the quoted search phrase, because everything between the double quotes is treated as literal text to match. Instead of including the date qualifier inside the double quotes (shown as %22 below):
q=%22mysearch+pushed%3A2017-09-01..2020-10-01+sort:updated%22
pull the qualifiers out and add them after the closing quote, still within the q parameter:
q=%22mysearch%22+pushed%3A2017-09-01..2020-10-01+sort:updated
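As a rough illustration of the fix above, here is a minimal Python sketch that builds the request URL with only the search phrase inside the double quotes and the qualifiers appended after it. The helper name build_search_url is hypothetical, and per_page=100 is an assumption on my part (as far as I know, GitHub caps the page size at 100, so 250 would not return more).

```python
from urllib.parse import quote

API_ROOT = "https://api.github.com/search/repositories"


def build_search_url(phrase, qualifiers, sort="updated", order="asc", per_page=100):
    """Build a repository search URL (hypothetical helper, not part of any GitHub SDK)."""
    # Only the phrase itself is wrapped in double quotes; qualifiers stay outside.
    terms = ['"{}"'.format(phrase)] + list(qualifiers)
    # Percent-encode each term, then join with '+', which stands for a space in a query string.
    q = "+".join(quote(term, safe="") for term in terms)
    return "{}?q={}&sort={}&order={}&per_page={}".format(API_ROOT, q, sort, order, per_page)


url = build_search_url("github-octocat", ["created:2012-08-13..2020-08-14"])
print(url)
```

The same helper can take created:>=2010-09-01 as the qualifier; the colon and comparison characters are percent-encoded the same way as in the browser URL from the docs.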

Related

Using an API to Extract All Comments from a Reddit Post

I am using the Reddit API (Pushshift) : https://github.com/pushshift/api
Using the documentation, I understand how I can use this to extract every comment containing the word "covid" that was left in a certain time period:
https://api.pushshift.io/reddit/search/comment?q=covid&after=3h&before=2h&size=1
The output looks something like this:
{"data":[{"subreddit_id":"t5_2qh6p","author_is_blocked":false,"comment_type":null,"edited":false,"author_flair_type":"richtext","total_awards_received":0,"subreddit":"Conservative","author_flair_template_id":null,"id":"j98zf27","gilded":0,"archived":false,"collapsed_reason_code":null,"no_follow":false,"author":"VamboRoolOkay","send_replies":true,"parent_id":41917615743,"score":1,"author_fullname":"t2_7uxkru5f","all_awardings":[],"body":"I will never believe that election fraud wasn't a significant factor. Go ahead - call it a conspiracy theory. But I also maintained that Covid was lab-created. Truth is the Daughter of Time.","top_awarded_type":null,"author_flair_css_class":null,"author_patreon_flair":false,"collapsed":false,"author_flair_richtext":[{"e":"text","t":"Conservative"}],"is_submitter":false,"gildings":{},"collapsed_reason":null,"associated_award":null,"stickied":false,"author_premium":false,"can_gild":true,"link_id":"t3_116l7ct","unrepliable_reason":null,"author_flair_text_color":"dark","score_hidden":true,"permalink":"/r/Conservative/comments/116l7ct/kamala_harris_plans_on_running_with_biden_in_2024/j98zf27/","subreddit_type":"public","locked":false,"author_flair_text":"Conservative","treatment_tags":[],"created_utc":1676866031,"subreddit_name_prefixed":"r/Conservative","controversiality":0,"author_flair_background_color":"","collapsed_because_crowd_control":null,"distinguished":null,"retrieved_utc":1676866047,"updated_utc":1676866048,"body_sha1":"328df3784d15f77b98a84418c4ce720822227cfe","utc_datetime_str":"2023-02-20 
04:07:11"}],"error":null,"metadata":{"es":{"took":98,"timed_out":false,"_shards":{"total":828,"successful":828,"skipped":824,"failed":0},"hits":{"total":{"value":573,"relation":"eq"},"max_score":null}},"es_query":{"size":1,"query":{"bool":{"must":[{"bool":{"must":[{"simple_query_string":{"fields":["body"],"query":"covid","default_operator":"and"}},{"range":{"created_utc":{"gte":1676862433000}}},{"range":{"created_utc":{"lt":1676866033000}}}]}}]}},"aggs":{},"sort":{"created_utc":"desc"}},"es_query2":"{\"size\":1,\"query\":{\"bool\":{\"must\":[{\"bool\":{\"must\":[{\"simple_query_string\":{\"fields\":[\"body\"],\"query\":\"covid\",\"default_operator\":\"and\"}},{\"range\":{\"created_utc\":{\"gte\":1676862433000}}},{\"range\":{\"created_utc\":{\"lt\":1676866033000}}}]}}]}},\"aggs\":{},\"sort\":{\"created_utc\":\"desc\"}}","api_launch_time":1673017478.254743,"api_request_start":1676873233.6143198,"api_request_end":1676873233.7406816,"api_total_time":0.12636184692382812}}
My Question: Suppose I identify a post that contains the word "covid" - now, I want to retrieve every comment on this post : Is this possible to do?
For instance, based on the output of these results, I see that :
link_id: t3_116l7ct
parent_id:41917615743
Can I somehow use this information to write an API query to retrieve all comments from this post?
I tried the following query but got an empty result: https://api.pushshift.io/reddit/comment/search/?link_id=t3_116cjib
Thanks!

Date range search using Google Custom Search API

I am using the Google Custom Search API to search for images. My implementation is using Java, and this is how I build my search string:
URL url = new URL("https://ajax.googleapis.com/ajax/services/search/images?"
+ "v=1.0&q=barack%20obama&userip=INSERT-USER-IP");
How would I modify the URL to limit search results to a date range, for example 2014-08-15 to 2014-09-31?
You can specify a date range using the sort parameter. For your example, you would add this to your query string: sort=date:r:20140815:20140931.
This is documented at https://developers.google.com/custom-search/docs/structured_data#page_dates
Also if you use Google's Java API you can use the Query class and its setSort() method rather than building the URL by hand.
I think a better way is to put this into the query itself. The q parameter accepts an 'after' operator, which can be used like this:
https://customsearch.googleapis.com/customsearch/v1?
key=<api_key>&
cx=<search_engine_id>&
q="<your_search_word> after:<YYYY-MM-DD>"
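For the sort-based approach, a small Python sketch of assembling the request might look like the following. It assumes the customsearch/v1 endpoint shown above and the date:r:START:END form from the first answer; the helper names and the placeholder key and engine ID are hypothetical. Note that September has only 30 days, so the sketch uses 2014-09-30 as the end date.

```python
from datetime import date
from urllib.parse import urlencode

ENDPOINT = "https://customsearch.googleapis.com/customsearch/v1"


def date_range_sort(start, end):
    """Format two dates as the date:r:YYYYMMDD:YYYYMMDD restriction."""
    return "date:r:{:%Y%m%d}:{:%Y%m%d}".format(start, end)


def build_url(api_key, engine_id, query, start, end):
    """Assemble the request URL; urlencode handles the percent-encoding."""
    params = {
        "key": api_key,      # placeholder, supply your own
        "cx": engine_id,     # placeholder, supply your own
        "q": query,
        "sort": date_range_sort(start, end),
    }
    return ENDPOINT + "?" + urlencode(params)


url = build_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "barack obama",
                date(2014, 8, 15), date(2014, 9, 30))
print(url)
```

Using date objects rather than hand-typed strings has the side benefit that an impossible date such as September 31 raises a ValueError before any request is sent.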

Obtain Deviantart Deviation ID / UUID from page URL

I was looking at the Deviantart API to see what you can do with it.
A lot of requests require you to provide a deviation id to work with.
Take for instance adding a deviation to favorites ( in Collections -> Add deviation to favorites above, I cannot post more than 2 links... )
Now I looked through the API to figure out how to obtain that id, but I did not find out how to do so.
If I only have the deviation URL, for instance http://kennyklent.deviantart.com/art/Pinkie-Pie-Dancing-296143815 , how can I tell its deviation-id?
It is not the number at the end, 296143815, as I would have expected.
If it helps, here's one example from the api's /browse/dailydeviations endpoint
"deviationid": "27FD366A-30CB-FC3E-DE54-9621E90FCE60",
"printid": "E984FC87-8B57-239C-FE7C-E2674A0DDFC4",
"url": "http://mudimba.deviantart.com/art/SF-Botanical-Gardens-57879397",
So this deviation SF-Botanical-Gardens-57879397 has the id 27FD366A-30CB-FC3E-DE54-9621E90FCE60 - but how would I find out if it wasn't listed in the examples?
Update 06/2017:
For anyone stumbling across this 2 years later, the answer below still works but there is now another way to get the UUID. Every Deviation now has a meta property da:appurl showing the UUID value on the deviation page itself.
To stay with the SF-Botanical-Gardens-57879397 example from above, looking at the page source at http://mudimba.deviantart.com/art/SF-Botanical-Gardens-57879397 reveals:
<meta property="da:appurl" content="DeviantArt://deviation/27FD366A-30CB-FC3E-DE54-9621E90FCE60">
Which contains exactly the UUID value 27FD366A-30CB-FC3E-DE54-9621E90FCE60
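If you already have the page source, pulling the UUID out of that meta tag could be sketched in Python as below. This is only an illustration: the function name is hypothetical, the pattern assumes the attribute order and URL scheme shown in the example above, and whether current pages still carry this tag is not something I have checked.

```python
import re

# Assumes the tag looks like the example above: property first, then content.
APPURL_RE = re.compile(
    r'<meta\s+property="da:appurl"\s+'
    r'content="DeviantArt://deviation/([0-9A-Fa-f-]{36})"'
)


def extract_deviation_uuid(page_html):
    """Return the UUID from the da:appurl meta tag, or None if it is absent."""
    match = APPURL_RE.search(page_html)
    return match.group(1) if match else None


sample = ('<meta property="da:appurl" '
          'content="DeviantArt://deviation/27FD366A-30CB-FC3E-DE54-9621E90FCE60">')
print(extract_deviation_uuid(sample))
```

For anything beyond a quick script, an HTML parser would be more robust than a regular expression against attribute reordering.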
Original answer
I got an answer from a Deviantart dev directly, http://comments.deviantart.com/1/492518964/3755610860
You cannot convert integer IDs into UUID format, you have to query the api to find the correct uuid. So for your example, you would query the /gallery/folders endpoint and then the gallery/{folderid} endpoint to get the list of deviations in that folder.
There's no easier way to obtain the UUID for a given URL for now.

google analytics API, how to extract pageviews for a specific page?

Google Analytics API: how to extract pageviews for a specific page?
I tried using something like
ga:pagePath=~page.php%3fid%3d44 (page.php?id=44)
but it doesn't seem to work... I get "no results found" where I have 20 pageviews for sure
UPDATE
I think I found the solution
ga:pagePath==/website/page.php?id=44
for some reason I had to include the complete path and ==
To match a partial path in filters, you should use
ga:pagePath=@page.php?id=44
=@ tells GA to match a substring.
What you were originally using was incorrect for this.
I think your problem is that you put the hex version of the ? and = characters into your query, which doesn't match how Analytics stores the page paths. If you change these to the normal characters it should work:
ga:pagePath=~page.php?id=44
Your other solution should work as well but is a bit more inflexible in case you wanted to tweak the query to return other pages.
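One thing to watch with the =~ form: it takes a regular expression, and in regular expressions . and ? are metacharacters, so the literal text page.php?id=44 may need escaping before it matches. The Python sketch below uses the re module purely as a stand-in to show the effect; I am assuming Analytics treats these characters the same way, and the variable names are only for illustration.

```python
import re

path = "/website/page.php?id=44"   # page path as it would be stored
fragment = "page.php?id=44"        # partial path we want to match

# Unescaped, '?' makes the preceding 'p' optional instead of matching a literal '?'.
unescaped_matches = re.search(fragment, path) is not None

# Escaped, '.' and '?' are matched literally.
escaped = re.escape(fragment)
escaped_matches = re.search(escaped, path) is not None

# Filter expressions built from the pieces above.
exact_filter = "ga:pagePath==" + path
regex_filter = "ga:pagePath=~" + escaped

print(unescaped_matches, escaped_matches)
print(regex_filter)
```

On Python 3.7 or later, re.escape leaves the = sign alone and only escapes the dot and the question mark.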

Preventing YQL from URL encoding a key

I am wondering if it is possible to prevent YQL from URL encoding a key for a datatable?
Example:
The current guardian API works with IDs like this:
item_id = "environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy"
The problem with these IDs is that they contain slashes (/) and these characters should not be URL encoded in the API call but instead stay as they are.
So if I now have this query
SELECT * FROM guardian.content.item WHERE item_id='environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy'
while using the following url definition in my datatable
<url>http://content.guardianapis.com/{item_id}</url>
then this results in this API call
http://content.guardianapis.com/environment%2F2010%2Foct%2F29%2Fbiodiversity-talks-ministers-nagoya-strategy?format=xml&order-by=newest&show-fields=all
Instead the guardian API expects the call to look like this:
http://content.guardianapis.com/environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy?format=xml&order-by=newest&show-fields=all
So the problem is really just that the / characters get encoded as %2F, which I don't want to happen in this case.
Any ideas on how this can be achieved?
You can also check the full datatable I am using:
http://github.com/spier/yql-tables/blob/master/guardian/guardian.content.item.xml
The URI-template expansions in YQL (e.g. {item_id}) only follow the version 3 spec. With version 4 it would be possible to simply (only slightly) change the expansion to do what you want, but alas not currently with YQL.
So, a solution. You could bring a very, very basic <execute> block into play: one which adds the item_id value to the path as needed.
<execute><![CDATA[
response.object = request.path(item_id).get().response;
]]></execute>
Finally, see the diff against your table (with a few other, minor tweaks to allow the above to work).
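To make the encoding difference from the question concrete, here is a short Python sketch (not YQL) that reproduces both URLs. It only illustrates what happens when the slash is or is not treated as a safe character during percent-encoding; it says nothing about how YQL implements its template expansion.

```python
from urllib.parse import quote

BASE = "http://content.guardianapis.com/"
item_id = "environment/2010/oct/29/biodiversity-talks-ministers-nagoya-strategy"

# Encoding every reserved character turns each '/' into %2F (the unwanted result).
encoded = BASE + quote(item_id, safe="")

# Marking '/' as safe keeps the path segments intact (what the Guardian API expects).
preserved = BASE + quote(item_id, safe="/")

print(encoded)
print(preserved)
```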