I am using the Pushshift API for Reddit: https://github.com/pushshift/api
Using the documentation, I understand how to extract every comment containing the word "covid" that was posted in a certain time period:
https://api.pushshift.io/reddit/search/comment?q=covid&after=3h&before=2h&size=1
The output looks something like this:
{"data":[{"subreddit_id":"t5_2qh6p","author_is_blocked":false,"comment_type":null,"edited":false,"author_flair_type":"richtext","total_awards_received":0,"subreddit":"Conservative","author_flair_template_id":null,"id":"j98zf27","gilded":0,"archived":false,"collapsed_reason_code":null,"no_follow":false,"author":"VamboRoolOkay","send_replies":true,"parent_id":41917615743,"score":1,"author_fullname":"t2_7uxkru5f","all_awardings":[],"body":"I will never believe that election fraud wasn't a significant factor. Go ahead - call it a conspiracy theory. But I also maintained that Covid was lab-created. Truth is the Daughter of Time.","top_awarded_type":null,"author_flair_css_class":null,"author_patreon_flair":false,"collapsed":false,"author_flair_richtext":[{"e":"text","t":"Conservative"}],"is_submitter":false,"gildings":{},"collapsed_reason":null,"associated_award":null,"stickied":false,"author_premium":false,"can_gild":true,"link_id":"t3_116l7ct","unrepliable_reason":null,"author_flair_text_color":"dark","score_hidden":true,"permalink":"/r/Conservative/comments/116l7ct/kamala_harris_plans_on_running_with_biden_in_2024/j98zf27/","subreddit_type":"public","locked":false,"author_flair_text":"Conservative","treatment_tags":[],"created_utc":1676866031,"subreddit_name_prefixed":"r/Conservative","controversiality":0,"author_flair_background_color":"","collapsed_because_crowd_control":null,"distinguished":null,"retrieved_utc":1676866047,"updated_utc":1676866048,"body_sha1":"328df3784d15f77b98a84418c4ce720822227cfe","utc_datetime_str":"2023-02-20 04:07:11"}],"error":null,"metadata":{"es":{"took":98,"timed_out":false,"_shards":{"total":828,"successful":828,"skipped":824,"failed":0},"hits":{"total":{"value":573,"relation":"eq"},"max_score":null}},"es_query":{"size":1,"query":{"bool":{"must":[{"bool":{"must":[{"simple_query_string":{"fields":["body"],"query":"covid","default_operator":"and"}},{"range":{"created_utc":{"gte":1676862433000}}},{"range":{"created_utc":{"lt":1676866033000}}}]}}]}},"aggs":{},"sort":{"created_utc":"desc"}},"es_query2":"{\"size\":1,\"query\":{\"bool\":{\"must\":[{\"bool\":{\"must\":[{\"simple_query_string\":{\"fields\":[\"body\"],\"query\":\"covid\",\"default_operator\":\"and\"}},{\"range\":{\"created_utc\":{\"gte\":1676862433000}}},{\"range\":{\"created_utc\":{\"lt\":1676866033000}}}]}}]}},\"aggs\":{},\"sort\":{\"created_utc\":\"desc\"}}","api_launch_time":1673017478.254743,"api_request_start":1676873233.6143198,"api_request_end":1676873233.7406816,"api_total_time":0.12636184692382812}}
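For reference, here is the same request as a minimal Python sketch (a sketch only, assuming the requests library; the parameters mirror the URL above):

import requests

# Same query as the URL above: comments containing "covid", posted between 3 and 2 hours ago
resp = requests.get(
    "https://api.pushshift.io/reddit/search/comment",
    params={"q": "covid", "after": "3h", "before": "2h", "size": 1},
)
for comment in resp.json()["data"]:
    print(comment["link_id"], comment["created_utc"], comment["body"][:80])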
My question: suppose I identify a post through a comment that contains the word "covid". Is it possible to retrieve every comment on that post?
For instance, in the output above I see that:
link_id: t3_116l7ct
parent_id: 41917615743
Can I somehow use this information to write an API query to retrieve all comments from this post?
I tried the following query but got an empty result: https://api.pushshift.io/reddit/comment/search/?link_id=t3_116cjib
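For completeness, here is the query I tried expressed in the same sketch style as above (it returns an empty "data" list for me):

import requests

# The link_id query I attempted; "data" comes back empty
resp = requests.get(
    "https://api.pushshift.io/reddit/comment/search/",
    params={"link_id": "t3_116cjib"},
)
print(resp.json()["data"])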
Thanks!
Problem: attempting to insert a JSON string into a Postgres table column of json datatype intermittently fails; the error appears for some record insertion attempts but not others.
Using multiple third-party JSON validator apps, I confirmed that the JSON I am inserting is indeed valid, and I have confirmed that any single-quote (') characters have been escaped by doubling them (''), yet the issue persists.
What are some additional troubleshooting steps to consider?
Here is a scrubbed sample of the JSON I have attempted to insert:
{"id": "jf4ba72kFNQ","publishedAt": "2012-09-02T06:07:28Z","channelId": "UCrbUQCaozffv1soNdfDROXQ","title": "Scout vs. Witch: a tale of boy meets ghoul (Official Version)","tags": ["L4D","TF2","SFM","animation","zombies","Valve","video game"],"description": "Howdy folks (he''s alive!). I made a new SFM video (October 2015), called \"Nick in a Hotel Room\". Please check it out: https://www.youtube.com/watch?v=FOCTgwBIun0\n\nAlso check out some early behind the scenes of Scout vs. Witch:\nhttps://www.youtube.com/watch?v=73tQEBgD09I\n\nYou can find links to my stuff on my website: http://nailbiter.net\n\n-----\n\nhey gang,\nI''m the animator who made this cartoon. Hope you like it.\n\nThis is my little mash-up of a bunch of stuff I like. What happens when the Scout from Valve''s Team Fortress 2 video-game walks into the wrong neighborhood (Left 4 Dead). Hilarity (and a bodycount) ensues. It was created using Source Film Maker (for all the dialog stuff and the montage at the beginning), and with TF2/Source SDK for the entire 300 alley-run sequence. I had already completed that part before SFM was released. The big zombie horde scenes and a couple others were shot in Left 4 Dead. I hope you get a kick out of it.\n\nStuff I did:\nI animated all of the characters (using Maya) except for the big crowd scenes and parts of the headcrab zombie (the crawling and the legs). The faces in the dialog scenes were animated in SFM.\n\nAlso did additional mapping, particles, motion graphics, zombie maya rigging, and created blendshapes for the Witch''s face to enable her to talk/emote. I didn''t do a full set, just the phonemes I needed for this performance. Inspiration for her performance was based on Meg Mucklebones (if you''ve ever seen Legend) mixed with the demon ladies in Army of Darkness. I have a feeling Valve had seen those movies too when they designed her..\n\nthanks for watching."}
I am answering my own question by enumerating the troubleshooting steps I have found so far: some are working knowledge that practitioners will already have, and some are more obscure insights (or buried in the Postgres docs, which are thorough but esoteric) that I found through my own trial and error.
Steps
Make sure you have escaped any single-quote (') characters by doubling them (''), as in the sketch after these steps.
Make sure your JSON is actually a single-line string. JSON is very easy to copy as a multiline string, and Postgres JSON columns will not accept it (the fix is as easy as deleting each newline).
The most obscure issue I've found: even when encapsulated in a JSON string field, a question mark (?) weirdly enough breaks the JSON syntax for Postgres. Something like {"url": "myurl.com?queryParam=someId"} will be rejected as invalid. Solve this by escaping the question mark, like: {"url": "myurl.com\?queryParam=someId"}
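To make the first two steps concrete, here is a minimal Python sketch (using psycopg2; the table name videos, the column payload, and the connection string are hypothetical) that serializes the JSON to a single-line string and doubles any single quotes before building the INSERT by hand. Passing the JSON as a bound parameter, as in the commented-out line, avoids the manual escaping entirely.

import json
import psycopg2  # assumed driver; any DB-API driver works similarly

record = {
    "id": "jf4ba72kFNQ",
    "title": "Scout vs. Witch: a tale of boy meets ghoul (Official Version)",
    "description": "Howdy folks (he's alive!). I made a new SFM video...",  # truncated
}

# Single-line step: json.dumps emits one line with no raw newlines
json_text = json.dumps(record)

# Quote step: double any single quotes so the string can sit inside a SQL literal
escaped = json_text.replace("'", "''")

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection string
with conn, conn.cursor() as cur:
    # Building the literal by hand, as described in the steps above
    cur.execute("INSERT INTO videos (payload) VALUES ('%s'::json)" % escaped)
    # A bound parameter avoids the manual quoting altogether:
    # cur.execute("INSERT INTO videos (payload) VALUES (%s::json)", (json_text,))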
After much googling, I decided to post my problem here in the hope that someone can help me. What I want to achieve is to perform queries like the following:
q1: (adjective) "jumps" (preposition) // any adj followed by "jumps" followed by any prep.
q2: (adjective:brown) "jumps" (preposition) // brown as adj. followed by "jumps" followed by any prep.
q3: (adjective:brown) (verb:jumps) (preposition) // brown as adj followed by jumps as verb followed by any preposition.
In a more general form, what I want is
(POS[:specific_word]) (POS[:specific_word]) (POS[:specific_word])
For that, I have the text tagged as follows:
the|[pos:DT][lemma:the] quick|[pos:JJ][lemma:quick] brown|[pos:JJ][lemma:brown] fox|[pos:NN][lemma:fox] jumps|[pos:NNS][lemma:jump] over|[pos:IN][lemma:over] the|[pos:DT][lemma:the] lazy|[pos:JJ][lemma:lazy] dog|[pos:NN][lemma:dog]
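To make the intended semantics concrete, here is a small pure-Python sketch (not a Lucene solution) that parses the tagged line above into (word, pos, lemma) triples and checks the three query shapes against it. I'm assuming the slots may be separated by a few intervening tokens, in the spirit of a span-near query with slop, since "brown" and "jumps" are not adjacent in the sample sentence; note also that this particular tagging labels "jumps" as NNS rather than a verb tag.

import re

tagged = ("the|[pos:DT][lemma:the] quick|[pos:JJ][lemma:quick] "
          "brown|[pos:JJ][lemma:brown] fox|[pos:NN][lemma:fox] "
          "jumps|[pos:NNS][lemma:jump] over|[pos:IN][lemma:over] "
          "the|[pos:DT][lemma:the] lazy|[pos:JJ][lemma:lazy] dog|[pos:NN][lemma:dog]")

# Parse "word|[pos:X][lemma:Y]" chunks into (word, pos, lemma) triples
tokens = []
for chunk in tagged.split():
    word, annot = chunk.split("|", 1)
    pos = re.search(r"\[pos:([^\]]+)\]", annot).group(1)
    lemma = re.search(r"\[lemma:([^\]]+)\]", annot).group(1)
    tokens.append((word, pos, lemma))

def ok(constraint, token):
    """A constraint is (pos, word); None means 'any'."""
    pos, word = constraint
    return (pos is None or token[1] == pos) and (word is None or token[0] == word)

def matches(query, tokens, slop=1):
    """In-order match of the constraints, allowing up to `slop` skipped tokens in total."""
    for start in range(len(tokens)):
        if not ok(query[0], tokens[start]):
            continue
        i, budget, matched = start + 1, slop, True
        for constraint in query[1:]:
            while i < len(tokens) and budget > 0 and not ok(constraint, tokens[i]):
                i, budget = i + 1, budget - 1
            if i < len(tokens) and ok(constraint, tokens[i]):
                i += 1
            else:
                matched = False
                break
        if matched:
            return True
    return False

q1 = [("JJ", None), (None, "jumps"), ("IN", None)]     # (adjective) "jumps" (preposition)
q2 = [("JJ", "brown"), (None, "jumps"), ("IN", None)]  # (adjective:brown) "jumps" (preposition)
q3 = [("JJ", "brown"), ("NNS", "jumps"), ("IN", None)] # the sample tagging labels "jumps" as NNS
print(matches(q1, tokens), matches(q2, tokens), matches(q3, tokens))  # True True True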
The first thing I thought of was to index the extra info of each term as a payload and then use PayloadNearQuery to access the payload of each span. The problem is that PayloadNearQuery matches the terms first and only then accesses their payloads, so none of the three queries above will work (correct me if I'm wrong).
The second thing I thought of was to index the extra info as synonyms of the term, but that way the second query won't work, since I can't require that the first term is an adjective and the specific word "brown" at the same time.
Any way to address this problem, suggestions, etc. will be appreciated.
I'm trying to return a string with the total count of items in a Middleman blog. (I'm currently using three blogs on one site.)
The closest I've come to getting the count is including = i in a loop, where the results went from 0 to 34; so I know that one particular blog has 35 items, but I can't get that value on its own.
It feels like I should be able to do something like:
def get_articles_count(blogName)
  data.blog(blogName).articles.count
end

= get_articles_count('posts')
Bonus request: I'd love to know how I could have tracked down the answer myself, if possible. I'm clearly missing something, and I'd love to know where I should be looking. I've been referencing the local sitemap data at http://localhost:4567/__middleman/sitemap, the Middleman docs, and the Middleman blog docs, but I can't work out whether an item in a blog is a page or an article. I only use article in my example because that's what the loops require for displaying post information.
It turns out that it's incredibly simple:
= blog.articles.count
I've got a small bot communicating with users on ICQ; it's using Twisted.Words with the OSCAR protocol. I need to see their online status, but that seems to be possible only when I have them in my buddy list. So here comes the question:
How do I add a buddy to my buddy list in Twisted.Words Oscar?
That's pretty weird, but there seems to be nothing about it in the API docs, and I couldn't find any good clues in the oscar.py source code. :\
I finally came up with a solution, after hours of looking at the oscar.py code and at the OSCAR protocol documentation.
So here we go. Look at the function gotBuddyList(self, l) in this example:
http://twistedmatrix.com/documents/current/words/examples/oscardemo.py
You might have your own analogue; it's the callback function that is called when the SSI is received. It's bound like this:
self.requestSSI().addCallback(self.gotBuddyList)
So inside this gotBuddyList(self, l) function you put this:
self.groupAll = l[0][0]
In my case, this holds the first buddy group in my buddy list (which I had created manually in advance, from a regular ICQ client). The l variable is the SSI received from the server; it contains your buddy groups, the buddies in those groups, and other items such as settings, according to the OSCAR docs.
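For context, a rough sketch of what that callback might look like (a sketch only; the surrounding calls follow the oscardemo.py example linked above, so adapt it to your own callback):

def gotBuddyList(self, l):
    # l is the parsed SSI; l[0] is the list of buddy groups
    self.groupAll = l[0][0]  # keep a reference to the first buddy group
    self.activateSSI()       # as in oscardemo.py
    self.clientReady()       # tell the server we're ready to receive events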
I'm going to add my buddies to the first group in my list. If your case is different or you want a more flexible solution, you'll have to investigate that part further.
Next, when you want to add a new buddy to your buddy list, you do this (assuming you are still inside one of the methods of your BOSConnection subclass):
buddy = oscar.SSIBuddy(the_uin_to_add)  # pass the UIN of the buddy to add
try:
    buddyID = max(self.groupAll.usersToID.itervalues()) + 1  # increment past the highest buddyID in the group
except ValueError:  # the group is still empty
    buddyID = 1
self.groupAll.addUser(buddyID, buddy)  # add the buddy to the group locally
self.addItemSSI(buddy)  # actually send the change to the server
And there you are: the buddy is now in your list. If they're online, you'll immediately get an updateBuddy event containing info about their online status and so on.
I couldn't really work out what the buddyID is; there's no info explaining it. I finally assumed that it's just an internal ID within the group the buddy belongs to, and it's limited to 32767. I decided to start from 1 and increment by one past the highest ID in the group each time.
That's all I have. I hope it helps someone one day. If you can add anything or correct me, I'll be glad to see your comments!