How does Google reCAPTCHA v2 work behind the scenes? - captcha

This post refers to Google reCAPTCHA v2 (not the latest version).
Recently Google introduced a simplified "captcha" verification system (video) that enables users to pass the "captcha" just by clicking on it.
But how can it differentiate a bot from a person just by a click?
As per this answer (assuming a similar implementation), reCAPTCHA first generates a hidden key and attaches it to a hidden input element. It also lazily renders a check box (not an actual check box input, but a div) with the same key. When clicked, the check box sends an asynchronous request (XHR) to the Google backend servers to mark the key as a valid verification key (i.e. a key that has to be validated when the form is submitted).
But why can't bots automate that click (at least, browser-based bots)?
How might this work?

This is speculation, but based on Google's reference to the "risk analysis engine" they use (http://googleonlinesecurity.blogspot.com/2014/12/are-you-robot-introducing-no-captcha.html), I would assume it looks at how you behaved prior to clicking, how your cursor moved on its way to the check box (organic path/acceleration), which part of the check box was clicked (random places, or dead center every time), your browser fingerprint, Google cookies and their contents, click location history tied to your fingerprint or to your account if it detects one, etc.
It's fairly difficult to fake "organic" behavior in such a way that it would fool a continuously learning pattern detection engine. In the cases where it's not sure, it still prompts you to match an actual CAPTCHA string.
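Whatever the risk analysis does on the client, the question already notes that the key has to be validated when the form is submitted. A minimal sketch of that server-side step follows, assuming the documented `siteverify` endpoint and that the form delivers the token in the `g-recaptcha-response` field. The function names are illustrative, not part of any Google library.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret, token, remote_ip=None):
    """Build the POST request that asks Google whether a token is valid."""
    fields = {"secret": secret, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional field in the documented API
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=body, method="POST")


def is_verified(response_body):
    """Interpret the JSON reply; only an explicit success=true counts."""
    try:
        reply = json.loads(response_body)
    except ValueError:
        return False
    return isinstance(reply, dict) and reply.get("success") is True


def verify_token(secret, token, remote_ip=None, timeout=5):
    """Send the request over the network and return True if Google accepts it."""
    request = build_verify_request(secret, token, remote_ip)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_verified(response.read().decode("utf-8"))
```

The site only ever sees the final yes/no answer; none of the behavioural signals speculated about above are exposed to it.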

A new paper has been released with several tests against reCAPTCHA:
https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf
Some highlights:
By keeping a cookie active for 9+ days (by browsing sites with Google resources), you can then pass reCAPTCHA by only clicking the checkbox;
There are no restrictions based on requests per IP;
The browser's user agent must be real, and Google runs tests against your environment to ensure it matches the user agent;
Google tests if the browser can render a Canvas;
Screen resolution and mouse events don't affect the results;
Google has already fixed the cookie vulnerability and is probably restricting some behaviors based on IPs.
Another interesting finding is that Google runs a VM in JavaScript that obfuscates much of reCAPTCHA code and behavior. This VM is known as botguard and is used to protect other services besides reCAPTCHA:
https://github.com/neuroradiology/InsideReCaptcha
UPDATE 2017
A recent paper (from August) was published at WOOT 2017, achieving 85% accuracy in solving noCAPTCHA reCAPTCHA audio challenges:
http://uncaptcha.cs.umd.edu/papers/uncaptcha_woot17.pdf
UPDATE 2018
Google is introducing reCAPTCHA v3, which looks like a "human score prediction engine" that is calibrated per website. It can be installed into different pages of a website (working like a Google Analytics script) to help reCAPTCHA and the website owner to understand the behaviour of humans vs. bots before filling a reCAPTCHA.
https://www.google.com/recaptcha/intro/v3beta.html

My bots are running well against reCAPTCHA.
Here is my solution.
Have your bot follow these steps:
First, write a human mouse-move function that moves the mouse along a B-spline (ask me for the source code). This is the most important point.
For better results, also use a VPN such as https://www.purevpn.com
For every reCAPTCHA, do these steps:
1 If you use a VPN, switch IP first
2 Clear all browser cookies
3 Clear the browser cache
4 Set one of these user agents at random:
a. Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
b. Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0
5 Move the mouse with the human mouse-move function from a random point into the "I'm not a robot" image, each time ending within a different 10x10 random range
6 Then click, each time with a random delay between WM_LBUTTONDOWN and WM_LBUTTONUP
7 Take a screenshot of the image CAPTCHA
8 Send the screenshot to http://www.deathbycaptcha.com or https://2captcha.com and let them solve it
9 After receiving the click coordinates from the CAPTCHA solver, use the human mouse-move function to move to and click the reCAPTCHA images
10 Use the human mouse-move function to move to and click the reCAPTCHA Verify button
In 75% of all tries, the reCAPTCHA will be solved.
Cheers, Google
Tom

May I present my guess, since this is not an open technology.
Google says it's about combining information from before, during, and after the click to distinguish a human from a robot. But I am more interested in that final click on the check box.
Say the POST data (solved CAPTCHA) has a field called fingerprint, a string calculated from user behavior. I think there may also be a field for the check box location. My guess is that the check box sits in a coordinate system randomly generated by the Google back end and encrypted with the public key of my site. A robot may "guess/calculate" a location for this box, but when the site owner makes the GET query with the private key to verify the user's identity, Google decrypts the coordinate system and says whether the user clicked in the right place. So there is only one possible correct click location (with some offsets, since it's a square box) in this random coordinate system, which is known only to Google and the site owner.

Please remember that Google also uses reCAPTCHA together with canvas fingerprinting to uniquely recognize users/browsers without cookies!

Related

Is it possible to create a button to delete an hour old web history?

I would like to offer my users the possibility to delete the last hour of their web history (in order to protect people who used the website but don't have technical skills), or at minimum the URL of the website and the referrer (Google search, social media link, etc.).
Is this possible with an HTML/JS button that would interact with the history the way some extensions do ( https://developer.mozilla.org/fr/docs/Mozilla/Add-ons/WebExtensions/API/history/deleteUrl )?
Thanks for your help
Clear-Site-Data is a header, implemented to different degrees by different browsers, to indicate to the browser that you'd like the browser to clear cookies, caches and other kinds of storage for your site. This can be a useful security practice for your site -- to clear any sign that the user has logged in, or to make sure that no one who uses the device later will be able to access that user's account.
While I can see important and legitimate use cases for it, it isn't possible to force or prompt the browser to delete the browsing history, even just for your site. You can see how there might be security issues with that, like an attacker clearing evidence of their malicious site from the user's device, or just annoying websites that hide themselves from history and so the user can't see where they were. See these other questions that have also longed for this capability:
How to clear browsing history using JavaScript?
How to clear browsers (IE, Firefox, Opera, Chrome) history using JavaScript or Java except from browser itself?
You might consider exploring the History.replaceState method and other parts of the History API. That won't let you delete URLs and referrers in general, but it can be used to modify the URL of the current page in the history. So if a user arrives on your site visiting a page about something particularly sensitive or revealing, you might be able to modify the current history so that their browser only records that the user visited your domain, and not that particular page.
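As a rough illustration of the Clear-Site-Data header described above, here is a sketch of a logout endpoint that sends it. The `/logout` path, the handler class, and the helper name are assumptions made for this example; browsers differ in which directives they honor, and as far as I know they only act on the header over HTTPS. Note that there is no directive for browsing history, which is consistent with the answer.

```python
from http.server import BaseHTTPRequestHandler

# Directive names from the Clear-Site-Data specification; browser support varies.
KNOWN_DIRECTIVES = ("cache", "cookies", "storage", "executionContexts", "*")


def clear_site_data_value(*directives):
    """Return a header value such as '"cache", "cookies"'."""
    if not directives:
        raise ValueError("at least one directive is required")
    for name in directives:
        if name not in KNOWN_DIRECTIVES:
            raise ValueError(f"unknown directive: {name!r}")
    # Each directive must be sent as a double-quoted string.
    return ", ".join(f'"{name}"' for name in directives)


class LogoutHandler(BaseHTTPRequestHandler):
    """Hypothetical /logout endpoint that asks the browser to wipe site data."""

    def do_POST(self):
        if self.path != "/logout":
            self.send_error(404)
            return
        self.send_response(204)
        self.send_header(
            "Clear-Site-Data",
            clear_site_data_value("cache", "cookies", "storage"),
        )
        self.end_headers()
```

The handler can be served with `http.server.HTTPServer(("", 8000), LogoutHandler)` for local experiments.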

Google OAuth2 consent screen verification for apps for personal use only

I recently asked this question, and users #DalmTo and #Sergio NH gave me an exhaustive answer, for which I thank them very much.
Moving on to the question: we started publishing the application, and verification was not required, since no scopes were added (it is a little unclear why the requests worked in test mode for an application in which these scopes, for Google Drive, Google Sheets and Google Ads, were not added).
However, this time the application in "In Production" mode began to give us an "Unverified app screen" (see Unverified app screen). We decided that we still need to add the scopes to the list, and, of course, the scope list (described above) requires verification by Google.
We started filling in the necessary fields, while studying the Google documentation at the same time, and came across the following information (see block Verification process -> What are the requirements for verification?):
Apps not applicable for verification
Apps for internal use only (single domain use)
Apps for personal use only
Apps that are Gmail SMTP plugins for WordPress
Apps that are in development or staging/testing
Apps for personal use only
And this is exactly our case: we have already received permission from Google Ads and are just generating simple reports that we want to integrate with Google Sheets. That is, this is an elementary script that works within this account (however, we still need to go through the first consent screen, even for this developer account) and cannot be distributed to any other accounts.
But when we add our scopes, Google requires us to pass verification, forcing us to fill in the required fields: domains and their verification via the Search Console (we have already done this, and this stage does not cause difficulties), and links to YouTube videos in which we must show how the scopes are used.
And this is the stage that is not clear. We do not allow other people's accounts to connect to this application, and the software does not have any interface; it is just a script that receives data from Google Ads and saves it to Google Sheets (creating a file via Google Drive). We have described all this in the scope usage description field. But the link to the YouTube video is a required field, and we sincerely do not understand why (considering our case) we should record something and, most importantly, what exactly we should record, given that the documentation itself says that in our case we do not even need verification.
Maybe we did not understand something and now we are doing it wrong? We will be glad to receive any tips from experts working with Google Cloud Console and apologize in advance for broken English.
We also apologize in advance to the StackOverflow community that we have to publish such elementary (which we are absolutely sure of from our side) questions here. We come here from Google Cloud Console - > Support - > Community support, and we must first try to publish posts in the Google Groups specified there, but they simply do not answer us, apparently considering our questions too elementary and not worthy of attention (however, these same questions in Google Groups are moderated) (for example, the previous question). And we are no longer able to contact any other support. Once again, we apologize for having to ask about this here.
It is true that if your app is a single-user app, then you do not need to be verified.
However, if you don't get your app verified, there will be some restrictions:
You will see the unverified app screen.
Your refresh tokens will probably only be good for two weeks.
In the case of the YouTube API, uploaded videos will be stuck as private.
If you can live with those points, then you don't need to verify your app and you can continue as is.
If, on the other hand, you don't want to see the unverified app screen and you want a refresh token that will last longer than two weeks, you will need to verify your app. Yes, even if your app is a console application running as a job somewhere, you still show the consent screen. This is the YouTube video you will need to show Google: show the consent screen popping up, show the URL bar, and then show your script running. You also need to set up the homepage and privacy policy pages. Yes, I 100% agree with you that this is silly.
When you go through the process, explain to Google that this is a single-user script running as a job somewhere.
Unfortunately, when Google changed it so that refresh tokens expire for unverified apps, they pretty much tied the hands of all developers who are running such single-user scripts. We now have to get our apps verified if we don't want to have to request a new refresh token every two weeks.
If your program needs access to the requested scopes on a Google account, then even though the only user is yourself, you also need to provide a YouTube video demonstrating how you use the program. The reviewer has no guarantee that you will not make the program public.

Logic for parking payment

I want to create an app for faster payment of parking.
This question is more about logic of my app, and what tools I need to use about creating it.
At this point, I use a parking place every day and I pay for it through the web page.
I do it like this.
Log in to the page.
Click on the menu, which redirects me to www.parkingexample.page/payments
There is a search menu where I enter my car plate number. If my car is found, it shows how much I need to pay, and a "Pay" button appears.
I click the "Pay" button and then it's all done.
So my goal is to create an app that, when I start it, automatically connects to the page and searches for my plate. If the plate is found and payment is needed, there would be just one button, "Pay".
I think I should do it as described below, but since I haven't created any web apps (I'm a 100% back-end developer), I'm asking whether my thought process is correct.
Also, I don't want to use a WebView, as I think it's not necessary for me.
When I start my app, it sends a 'POST' request to the page to log in.
Then I send a 'GET' request to www.parkingexample.page/payments with params = 'mycarplatenumber'
Somehow I need to click the "Pay" button on the page when it appears, so I think it's probably a 'POST' request again, but at this point I'm not sure.
So the QUESTION is: is my logic valid, or can it be done in some other way?
UPDATE. ADDED SCREENSHOTS
The first screenshot is the menu after I log in, with the search bar where I need to enter my car plate.
The second screenshot is where I found my car (entered the plate number and clicked search).
Now the page is updated with the sum I have to pay, and there is a "PAID" button in the bottom right corner that I need to click.
And that's all I need.
To validate whether your suggested sequence is correct I would start by capturing your typical browser session between yourself and your parking provider with something like Fiddler. Then I would use HTTP client library of choice (for C# it would be something like HttpClient) and emulate the same flow with correct headers, query parameters and such like.
Looking at your screenshots, it seems the application is ASP.NET Web Forms, which can get a bit painful to emulate due to the way its state management works: you will likely need to decode the ViewState object (to ensure you're passing it back correctly) and locate all the dynamic field ids that it uses for postbacks. This, however, is very doable.
If you discover that the above is too hard to emulate (or there's javascript involved) it might be easier to explore Remote Selenium WebDriver coupled with a headless browser like PhantomJS. You'd then have your PhantomJS interact with the page on your server, and you'll drive it with your mobile app. Basically you'll reduce the complexity of your parking provider page to a well documented API.
Hopefully that gives you a starting point
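To make the Web Forms point more concrete, here is a minimal sketch of echoing the page's hidden state fields back in a postback. It assumes the usual ASP.NET hidden inputs such as `__VIEWSTATE` and `__EVENTVALIDATION` are present; the `plate` field name is purely hypothetical and would have to be read from a captured session (for example with Fiddler).

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def build_postback(page_html, extra_fields):
    """Echo the hidden state fields back, plus the fields we fill in ourselves."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    form = dict(parser.fields)
    form.update(extra_fields)  # e.g. the search box and the clicked button
    return urlencode(form).encode("ascii")
```

The returned bytes can be used as the body of the next POST, sent with the same cookies as the page that produced the hidden fields.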
In your application, all you will need are the service calls and the security part that logs the user in each time to check for payment.
So it can be a simple Spring Boot application, where you use the security module for the login and keep everything else minimal: for example, you don't need a database, just a redirect to your page. If you are not familiar with front-end frameworks, you can use basic HTML/CSS pages for the client side.
Another important point: you should start by designing your application before coding, because it's very important to know all the ideas behind it.
Enjoy building it!

Tracking Interactive PDF Clicks

I came across a curious question today, asked by my boss. Is it possible to track the clicks to pages inside an interactive PDF without it being embedded in a web page?
The client wants the user to download a PDF from his/her website and track what pages the user is clicking on inside the downloaded PDF.
After searching around on google for a while all I kept getting was links to pages telling you how to track PDF downloads.
Anyone who can shed some light on this or offer me a definitive yes or no to this question would be greatly appreciated.
The Javascript for Acrobat API Reference makes note of this event (page 368 of the API reference):
Page/Open
4.05
This event occurs whenever a new page is viewed by the user and after page
drawing for the page has occurred.
The target for this event is the Doc.
This event does not listen to the rc return code.
This would imply to me that you can hook this event and (assuming the end user permits the communication) send info to your web server every time they change pages.
Obviously this is limited to when the user is reading in Acrobat (Reader or Professional); it will not work if they are reading directly in Chrome or Firefox. And to re-emphasize, Acrobat will prompt the user to ask if it is allowed to communicate with an external website. If the user denies it, no tracking.
As has already been stated, the PageOpen event would be the hook for tracking pages. But as this works only in (Acrobat) JavaScript-enabled viewers that have an internet connection available at that moment, those stats would be suboptimal.
We also have to point out that this kind of tracking is highly questionable from a privacy point of view (in Europe, it may even be illegal).
A little bit less questionable could be to chop up the document into single pages, add navigation links, and then use the server stats.
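For the split-into-single-pages idea, the "server stats" part could be as simple as counting requests per page file in the access log. The sketch below assumes Common Log Format lines and a hypothetical `page-N.pdf` naming scheme for the single-page files; both would need adjusting to the real server.

```python
import re
from collections import Counter

# Matches the request part of a Common Log Format line,
# e.g. "GET /doc/page-3.pdf HTTP/1.1" 200
REQUEST = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')
# Hypothetical naming scheme for the single-page files.
PAGE_FILE = re.compile(r"/page-(\d+)\.pdf$")


def count_page_hits(log_lines):
    """Return a Counter mapping page number -> successful (200) requests."""
    hits = Counter()
    for line in log_lines:
        request = REQUEST.search(line)
        if not request or request.group("status") != "200":
            continue
        page = PAGE_FILE.search(request.group("path"))
        if page:
            hits[int(page.group(1))] += 1
    return hits
```

Calling `count_page_hits(open("access.log"))` would then give a per-page tally without any script running inside the PDF.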

Google Plus Interactive Posts not displayed on any stream (Client side API)

As the title denotes, looking for insight on reasons why an Interactive Post doesn't show up on any stream (user sharing the post, and to whomever the user is sharing it with).
Brief
Implement client side api of G+ interactive posts
This seems successful
application auth is requested and if granted is displayed in user's "applications list"
intended content, prefill text, etc. are all displayed when the trigger to initiate the share is invoked
no error indicators (that I know of) are displayed when the "Share" button is clicked by the user (the act of actually posting the share).
it is visible in some way only to Google - explained below
Findings
It seems Google is blocking the post because even though the share isn't posted on any stream (origin nor target), I received a warning about violating Google policies as displayed below, indicating that the (http) post was sent/submitted...
further inspection of network activity also displays what looks like (a guess on my part) a spam score (of 8), somehow already pre-determined (another guess on my part):
https://apis.google.com/u/0/_/sharebox/post/?spam=8&hl=en&ozv=...
Questions
Primary question is why would interactive posts not appear on stream? Any debugging tool out there?
IF my guess on spam blocking is accurate, then why would such be the case? For interactive posts (which somehow inherently is a case of some user "promoting" something in the first place) - eg: with a "BUY" calltoactionlabel?
IF my other guess on the content being "pre-tagged" as spam, how/why would that occur. I didn't include it above, but it is a "product page" - the idea of it, which isn't new nor revolutionary, is to give the opportunity for a user to "share" an item he/she just purchased, say in a normal checkout flow?
It's my assumption that implementation was done correctly, no errors reported, etc. - or perhaps it wasn't? Though it seems unlikely, it's "grasping at straws" time..
Further testing/debugging seems unwise given the warning of policy violation - and yes, I've stopped further dev on this to prevent harming my accounts (one personal, one work, both used above for testing this API).
Thanks for any assistance/input.
Note: I've posted this on G+ community (no luck so far) so once this is resolved I'll share the answer there too (or vice versa).
It looks like you are posting the interactive post from http://localhost or from some other private domain URL. The Google crawler can only handle interactive posts whose content is hosted on a public domain.
As from their website -
Important: Interactive posts will not work when PhotoHunt is hosted at http://localhost:8888 because the Google crawler can only access public URLs to get microdata about the content of the post. In the case of PhotoHunt for Java, you can deploy your app to appspot.com as a public Google App Engine app.
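A quick way to catch this before sharing is to check whether the content URL could be reachable by an external crawler at all. The sketch below is only a rough pre-check under stated assumptions: the function name is made up for this example, it does not resolve DNS, and it says nothing about how Google's crawler actually decides.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_publicly_reachable(url):
    """Rough pre-check: reject localhost and literal non-global IP addresses."""
    host = urlsplit(url).hostname
    if not host:
        return False
    if host == "localhost" or host.endswith((".localhost", ".local")):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a regular hostname; DNS is not resolved in this sketch
    return address.is_global
```

For example, `looks_publicly_reachable("http://localhost:8888/photohunt")` is rejected, while a URL on a deployed appspot.com app passes the check.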