What are the best ways to test failures? I know of the user-agent trick, where setting it to "Googlebot", for example, will fail the test.
However, are there other ways to test this?
I can test how my application behaves when the score is below the acceptable threshold, but I would like to simulate a bot (in the eyes of Google) in the browser in some way.
I am not sure whether this is what you are looking for, but you can try issuing a duplicate request to your back-end endpoint with the same g-recaptcha-response parameter that you have already used in a previous request.
When you call the back-end for the first time, you will get a valid response from https://www.google.com/recaptcha/api/siteverify:
{"success":true,"score":0.9,"action":"register","challenge_ts":"2020-07-21T18:09:09Z","hostname":"localhost"}
After calling the same endpoint a second time with the same g-recaptcha-response parameter, the Google API will respond with something like this:
{"success":false,"score":0.0,"error-codes":["TimeoutOrDuplicate"]}
On desktop or phone, using Chrome's Incognito mode will get you a low score of 0.1 ~ 0.3 (in my case, at least). See here: https://tehnoblog.org/google-no-captcha-invisible-recaptcha-first-experience-results-review/#google-invisible-recaptcha-v3.0-live-demo
I know this is probably strictly case-specific, but I do feel like I encounter this problem a lot so I will make an effort to try and understand it better.
I am new to using APIs, and I have never succeeded in using one without copying someone else's code. In this case, I can't even find any examples on forums or in the API documentation.
I'm trying to pull my balance from my investment bank, NordNet, to scroll it, amongst other things, on an Arduino display I've made. Right now I use Python with Selenium to automatically but "physically" log in to NordNet and grab my balance from the DOM. As I'm afraid I might get "punished" for such botted behavior, and because the script is fairly high-maintenance (the HTML changes over time), I would obviously much rather get this information through NordNet's new API.
Link to NordNet's API doc
Every time I try to utilize an API doc, it's always the same: it looks easy, but I can never get it to work.
This time I tried to just play a little with the API before exploring further.
I use Postman to send the simplest request:
https://www.nordnet.se/api/2
And I get a successful code 200 JSON response.
I then try to take it a step further to access my account data using this endpoint:
https://www.nordnet.se/api/2/accounts
For this one, I obviously need authentication of some sort.
The doc looks like this:
So I set up my Postman client like this and get the response shown:
I've put my NordNet login into the "Auth" tab as "basic auth", and I can then see that Postman encodes this info in some way in the "Headers" tab.
I'm getting an unauthorized response code and I have no idea why. Am I using Postman wrong (probably)? Is the API faulty (probably not)? There is a mention of a session_id that should contain both password and username? Maybe something else completely...
I hope you can help
The documentation says to use the session_id as both username and password for that API.
So try logging in, then get the session id from the network tab (try with both sid and ssid) and pass it as the username and password for authorization.
I believe sid is for HTTP and ssid is for HTTPS, so try both.
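If it helps to see what Postman is putting in the headers: HTTP basic auth is just a base64-encoded username:password pair, so with the session id used as both values the header could be built like this (a sketch; the session id value is a placeholder):

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header value that basic auth produces."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"

# Per the answer above: use the session id as both username and password.
session_id = "YOUR_SESSION_ID"  # placeholder, taken from the network tab
headers = {"Authorization": basic_auth_header(session_id, session_id)}
# e.g. requests.get("https://www.nordnet.se/api/2/accounts", headers=headers)
```

Comparing this value against what Postman generates in the "Headers" tab is a quick way to confirm the credentials you think you're sending are actually being sent.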
Background: my website is pretty simple, containing a main page with a list of links (provided by 3rd party service) - each links pops up a file upload input with a submit button. In that popup I embedded the Recaptcha script, and verified the token upon file submission. Because of this multiple popup setup I chose V3 for zero user interactions with the verification mechanism.
Now a question arises: how should I interpret Google's response?
Google documentation for V3 says:
reCAPTCHA learns by seeing real traffic on your site. For this reason,
scores in a staging environment or soon after implementing may differ
from production. As reCAPTCHA v3 doesn't ever interrupt the user flow,
you can first run reCAPTCHA without taking action and then decide on
thresholds by looking at your traffic in the admin console. By
default, you can use a threshold of 0.5.
It is pretty clear to me from this description that the score is what matters: 0.0 for most likely a bot, 1.0 for most likely a human. So in my code, I check that success == true and score >= 0.5.
However, none of the V3 examples I find online for server-side validation pay any attention to the score. Here are three of them; all three only check for the request being successful:
https://stackoverflow.com/a/54118106/3367818
https://stackoverflow.com/a/52633797/3367818
https://dzone.com/articles/adding-google-recaptcha-v3-to-your-laravel-app
Finally, my question is: is that a misconception of V3's mechanism, or am I missing something?
Thanks.
Yes, you should definitely be checking the value of "score" in Google's verification response.
Those three examples are very light on detail and actually pretty confusing.
"success" simply means that you sent a well-formed request with the right token and secret.
It sounds like you're already checking the value of "score", so that's great, but I just wanted to clarify this for anyone who finds this question and is still a little confused.
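A minimal server-side check that looks at both fields might be sketched like this in Python (the field names come from the siteverify response shown earlier in this thread; the optional action check follows the v3 docs' recommendation):

```python
def is_human(result, threshold=0.5, expected_action=None):
    """Accept only a successful verification whose score clears the threshold.

    `result` is the parsed JSON from the siteverify endpoint.
    """
    if not result.get("success"):
        return False  # malformed request, bad secret, or reused token
    if expected_action is not None and result.get("action") != expected_action:
        return False  # token was issued for a different action
    return result.get("score", 0.0) >= threshold
```

The 0.5 threshold is just the default the docs suggest; as quoted above, you'd tune it after watching real traffic in the admin console.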
To complement BrettM's answer...
It depends on the way the verification is being handled.
Using reCAPTCHA PHP client library:
See ReCaptcha::verify(), lines 180-182.
When setting the threshold:
$recaptcha = new ReCaptcha($secret);
$response = $recaptcha
->setExpectedHostname($hostname)
->setExpectedAction($action)
->setScoreThreshold(.5)
->verify($token, $ip);
$response->isSuccess() will return false when threshold is not met.
$response->getErrors() will contain E_SCORE_THRESHOLD_NOT_MET
Without setting the threshold:
$response->isSuccess() will return true, unless there are errors.
$response->getScore() should now be checked.
Without using reCAPTCHA PHP client library:
Both $response['success'] and $response['score'] should be checked.
The documentation I can find for the Canonical Landscape API lets you do lots of things with scripts, but I can't find anything suggesting that you can get output. However, if you use the Canonical web interface, script output is available, so it's presumably exposed somehow...?
I just had this issue as well, and since you're currently the first hit on Google, I wanted to share the answer for everyone. If you run ExecuteScript on a Landscape client and get back an ID of 123, then (assuming the job has already finished) you can use that ID to query the GetActivities API, with an input argument of "query" set to "parent-id:123". If there is a result, you will find the script output you are looking for under the result_text field of the response. Good luck! It worked very well over here.
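As a sketch of the flow just described (Python; the query syntax and the result_text field name are taken from the answer above, while the exact shape of the GetActivities response is an assumption):

```python
def activity_query(parent_id):
    """Build the GetActivities 'query' argument for an ExecuteScript ID."""
    return f"parent-id:{parent_id}"

def extract_script_output(activities):
    """Pull result_text out of each activity returned by GetActivities."""
    return [a["result_text"] for a in activities if "result_text" in a]

# e.g. pass activity_query(123) as the "query" input argument to GetActivities,
# then feed the returned list of activities to extract_script_output().
```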
I'm using google custom search engine with a search engine that searches specific sites and excludes some patterns in these sites.
I'm testing the API locally and I receive 12 results. I test the same exact call in staging (Heroku, US region) and I receive 410 results.
Does google personalise the results when using a custom search engine?
If yes, how do I turn it off? If no, do you have any idea why am I seeing this difference?
Update
OK, I did a test. I issued the exact same request both through a proxy and without one, and the results are vastly different.
Now the question is: can this behaviour be disabled?
OK, found it. By specifying the userIp param (https://developers.google.com/custom-search/json-api/v1/using_rest), you can force Google to use the same behaviour regardless of location.
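For example, a request to the Custom Search JSON API could pin the caller's IP like this (a sketch using only the standard library; the key and cx values are placeholders):

```python
import urllib.parse

def build_search_url(key, cx, query, user_ip):
    """Build a Custom Search JSON API URL that fixes userIp for the request."""
    params = {"key": key, "cx": cx, "q": query, "userIp": user_ip}
    return "https://www.googleapis.com/customsearch/v1?" + urllib.parse.urlencode(params)

# url = build_search_url("API_KEY", "ENGINE_ID", "hello world", "1.2.3.4")
```

Sending the same userIp from both environments should make the local and staging result counts comparable.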
Does anybody know where documentation for the Metacritic API is, or whether it still works? There used to be a Metacritic API at https://market.mashape.com/byroredux/metacritic-v2#get-user-details, which disappeared today.
Otherwise, I'm trying to scrape the site myself, but I keep getting blocked by a 429 Slow Down response. I got data about 3 times this hour and haven't been able to get any more in the last 20 minutes, which is making testing difficult and the application possibly useless. Please let me know if there's anything else I could be doing to scrape that I don't know about.
I was using that API as well for an app I wrote a while ago. It looks like the creator removed it from Mashape. I just sent him an email to ask whether it'll be back up. I did find this scraper online; it only has a few endpoints, but following the examples given you could easily add more. Let me know if you make any progress!
Edit: Looks like CBS requested it to be taken down. The ToS prohibits scraping:
[…] you agree not to do the following, or assist others to do the following:
Engage in unauthorized spidering, “scraping,” data mining or harvesting of Content, or use any other unauthorized automated means to gather data from or about the Services;
Though I was hoping for a JavaScript way of doing this, the creator of the API also gave me some info.
He says I was getting blocked for not having a User-Agent header, and that I should use a 429 handling procedure, i.e. re-request with longer pauses in between.
A PHP plugin is available as well: http://datalinx.io/shop/metacritic-api/
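That 429 handling procedure could be sketched like this in Python (a User-Agent header plus exponential backoff; the delay schedule and the User-Agent string are my own assumptions, not something the site documents):

```python
import time
import urllib.error
import urllib.request

def backoff_delays(base=1.0, retries=5):
    """Exponentially growing pauses: 1s, 2s, 4s, ..."""
    return [base * (2 ** attempt) for attempt in range(retries)]

def fetch(url, user_agent="Mozilla/5.0 (compatible; my-scraper)"):
    """GET with a User-Agent header, retrying with longer pauses on HTTP 429."""
    for delay in backoff_delays():
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise  # only retry on rate limiting
            time.sleep(delay)
    raise RuntimeError("still rate-limited after retries")
```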
I had to add a user agent, like JCDJulian said, and now it allows me to scrape. So for Ruby:
agent = Mechanize.new
agent.user_agent_alias = "Mac Firefox"
Then it stopped giving me the 403 Forbidden error.