Background: I'm trying to log into an HTTPS site with my valid credentials, navigate to a page that has a frequently updated list, and then scrape the list.
I was using code someone else wrote, which worked until a few weeks ago. I am new to this, but even i can see that the code was not very good, so i am trying to rewrite.
First I log into the site and create an tunnel. Then I move to the page where my list is and grab the list, etc.
Here's what's weird. The login fails every time, until I turn on Fiddler. With Fiddler running it succeeds every time.
Any idea about why this would happen and how to fix?
Many thanks.
I got it working!
For anyone who finds themselves in the same situation (I've seen a number of posting of similar questions - but the answers hadn't worked for me, so I expect I am not alone), I eventually saw that I needed to set the security protocol to TLS. The specific syntax I used was:
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls;
The setting needs to be specified before the Httpwebrequest get or post event occurs.
If you have a similar problem, I hope this helps.
I had an invalid "User-Agent" header. It contained invalid characters (ä, ö, ü).
Related
I was creating an API for TD Ameritrade (my first time creating or dealing with APIs) and I needed to put in my own call back URL. I know that callback URL is where the API sends information to and i heard that I can just use my localhost API. I scoured the internet and I dont know how that would work and I was wondering if i can just use http://localhost?
Sorry if I seem like a noob because I am
In short, yes.
Follow the excellent directions at
https://www.reddit.com/r/algotrading/comments/c81vzq/td_ameritrade_api_access_2019_guide/. (Even with them, I spent excessive time on trial and error!)
Since stackoverflow has a limit of 8 links in a response, and the localhost text string looks like a link, I’m showing it with the colon replaced by a semicolon, i.e., http;//localhost to reduce the link count. Sorry.
I used the Chrome browser after first trying Brave, which did not work for, possibly because of my option selections.
Go to https://developer.tdameritrade.com/user/me/apps
Add a new app using http;//localhost (delete existing app if there is one).
Copy the resulting consumer key text string (AKA client_id or OAuth User ID).
Go to https://developer.tdameritrade.com/content/simple-auth-local-apps, follow instructions. Note: leading/trailing blanks were inserted by MSWord due to copy/paste of the auth code, which had to be manually deleted after wasting excessive time identifying the problem. The address string looks like:
https://auth.tdameritrade.com/auth?response_type=code&redirect_uri=http%3A%2F%2Flocalhost&client_id=ConsumerKeyTextString%40AMER.OAUTHAP
This returns a page stating the server refused to connect, but the address bar now contains a VeryLongStringOfCharacters in the address bar:
https;//localhost/?code= VeryLongStringOfCharacters
Copy the contents of the address bar, go to https://www.urldecoder.org/, decode the above, and extract the text after “code=”. This is your refresh_token
Go to: https://developer.tdameritrade.com/authentication/apis/post/token-0, fill out the fields with
grant_type=authorization_code
refresh_token=<<blank>>
access_type=offline
code=RefreshTokenTextString
client_id=ConsumerKeyTextString#AMER.OAUTHAP
redirect_uri=http://localhost
Press SEND.
If the resulting page starts with HTTP/1.1 200 OK, you have succeeded.
Try updating your redirect to:
redirect_uri=https://localhost
They may require https now and you need a colon instead of a semicolon. Everything looks correct. This process generally takes me more then one attempt, and 15 minutes to an hour to get my refresh token squared away every 90 days.
dont use #AMER.OAUTHAP in client_id
If you generate a new code and based on that try to get a new access token. it should work.
I'm trying to a CSRF protection to an existing MVC4 web application which uses DevExpress grids. I've added the Html.AntiForgeryToken() into the forms on the aspx pages (which contain ascx as partials containing the grids) and can see the __RequestVerificationToken and it's value clearly in developer tools when a save is called.
I've tried commenting out all my ValidateAntiForgeryToken attributes bar one - I went with the delete post method for simplicity (And also to eliminate the DevExpress grids messing with it) and I still keep coming up against this error:
There was a HttpAntiForgeryException
Url: http://localhost:54653/Users/Delete/f86ad393-0039-44e8-beed-a66dbab9266e?ReturnURL=http%3A%2F%2Flocalhost%3A54653%2FUsers
The exception message is
The required anti-forgery form field "__RequestVerificationToken" is not present.
Does anyone have any idea why this might be happening? Could it be that the error is non-descriptive and it's actually that the token doesn't match rather than that it doesn't exist? In previous answers to this question people just say "oh, you have to add the token," which is obviously not helpful here.
Are you submitting the form manually through Ajax? If that's the case, you need to pass the anti forgery token as another parameter with the name "__RequestVerificationToken".
Point 1 : Make sure if your application is has https secure protocol. Please load in https.
Point 2 : In case of DevExpress you have to call in the below pattern.
ViewContext.Writer.Write(Html.AntiForgeryToken().ToHtmlString());
After struggling with this for days I had a thought - maybe the browser is stopping the cookie being written. I did a search for dev servers and cookies not being written, and found that with Chrome and IE10 and up that there's problems writing the cookies.
I downloaded Firefox and tried it with that and it worked instantly. I then reapplied all the validate attributes to the all the controller methods and the all worked, every single one of them! Even the DexExpress postbacks seem to be working correctly.
I'll carry out more exhaustive testing, but for now, I think we're there.
Not exactly. If MVC AntiForgeryToken is already defined on page where you are using MvcxGridView and you want to protect grid actions you should send this token back to server during grid client side begin callback event.
settings.ClientSideEvents.BeginCallback = "function(s,e) { e.customArgs[\"__RequestVerificationToken\"] = $('input[name=\"__RequestVerificationToken\"]', $(s.GetMainElement())).val(); }";
Does anybody know where documentation for the Metacritic api is/if it still works. There used to be a Metacritic API at https://market.mashape.com/byroredux/metacritic-v2#get-user-details which disappeared today.
Otherwise I'm trying to scrape the site myself but keeping getting a blocked by a 429 Slow down. I got data like 3 times this hour and haven't been able to get anymore in the last 20 minutes which is making testing difficult and application possibly useless. Please let me know if there's anything else I can be doing to scape I don't know about.
I was using that API as well for an app I wrote a while ago. Looks like the creator removed it from Mashape. I just sent him an email to ask whether it'll be back up. I did find this scraper online. It only has a few endpoints but following the examples given you could easily add more. Let me know if you make any progress!
Edit: Looks like CBS requested it to be taken down. The ToS prohibits scraping:
[…] you agree not to do the following, or assist others to do the following:
Engage in unauthorized spidering, “scraping,” data mining or harvesting of Content, or use any other unauthorized automated means to gather data from or about the Services;
Though I was hoping for a Javascript way of doing this, the creator of the API also told me some info.
He says I was getting blocked for not having a User agent in the header and should use a 429 handling procedure i.e. re-request with longer pauses in between.
A PHP plugin available as well: http://datalinx.io/shop/metacritic-api/
I had to add a user agent like JCDJulian said and now it allows me to scrape. So for Ruby:
agent = Mechanize.new
agent.user_agent_alias = "Mac Firefox"
Then it stopped giving me the 403 Forbidden error.
Background
In our Apache configuration we use mod-auth-external (previously on Google Code) to invoke PAM authentication.
Now there is a request for proper handling of shadow-based password expiration:
If password is before warning period Apache should respond with HTTP status code 200. Nothing new here.
If password is in warning period (its validity end is near) Apache should respond with HTTP status code 200, but include somehow information about the warning period.
If password is in expiration period (it is no longer valid but user can still change it on his own) Apache should respond with HTTP status code 401 and include somehow information about expiration period.
If password is beyond expiration period (it is no longer valid and account was locked, administrator must unlock it) Apache should respond with HTTP status code 401 and include somehow information about the locked state.
(There are also corner cases of page missing or some other errors. It is not clear what to do then. But it seems that solving above points would allow to solve those corner cases as well.)
Our PAM authenticator (used through mod-auth-external) is able to differentiate those cases by adjusting return values. That we already have.
The problem is however how to get information from the authenticator to the associated action serving the page (either actual page with 200 status code or 401 error document).
Current investigations
It should be noted that there is significant difference between requirement 2 and requirements 3 and 4.
Requirements 3 and 4 alone are somewhat easier because they both involve our mod-auth-external authenticator returning error (access denied). So we only need to know how to get that error code in 401 error page. I even raised issue on that on mod-auth-external page.
Requirement 2 is much more difficult. In that case our authenticator must return 0 (access granted) and still somehow propagate information about the warning to whatever gets served in the end.
Logs parsing
Obvious (and ugly) idea is to parse logs. mod-auth-external description on Google Code Wiki mentions that authenticator return value gets written to Apache syslog. Also whatever authenticator prints to standard error stream gets logged as well.
This could be used to pass information from authenticator to some other entities.
The difficulty here is that it is not clear how to do it safely. What to print to be sure that "the other entity" will match properly current request with log entry. Mere URL doesn't seem to be enough since there can be multiple requests for the same URL at the same time. While I don't see anything more useful in what authenticator gets.
Another issue here is that it seems that to be able to parse the logs you have to have some non-trivial code running for "the other entity". And this complicates things further since how should we do it?
Another idea
If we could make the authenticator somehow modify "request session" (or whatever, maybe just environment? - I don't know, I'm new to Apache) to add arbitrary data to it we would be (almost) at home.
Our authenticator would somehow store "password status" and also possibly days remaining to the end of warning/expiration period (if applicable). Then upon serving 401 error page we would retrieve that back and use it to dynamically generate content of the page.
Or even better we would have it stored in session so that the other end could read that data directly. (For cases where it is not simply a browser showing page.)
But so far I fail to see how to do that.
Do you have any idea how to meet those requirements?
For over a month I got no answer here. Nor on GitHub issue that I opened for mod-auth-external.
So I ended doing a custom modification to our mod-auth-external. I don't like modifying third party software but this one seems dead anyway. And also it turned out we are using pretty old version (2.2.9 which I upgraded to 2.2.11, the last in 2.2.x line). Which already had some customizations anyway.
I explained details of the solution in a comment to my GitHub issue so I will not repeat them here.
I will however comment on shadow details as they were not mentioned there.
I had two choices: either use getspnam function to retrieve shadow data or to parse messages generated by PAM. First attempts based on getspnam function but in the end I used PAM messages. I didn't have strong reasons for any of those. However I decided to propagate in HTTP response not only shadow status but any PAM message that was generated and so it seemed easier to follow that way.
Good Morning (morning in Oregon). I am going through the Intuit Developer Testing process and continue to get errors when attempting to render the PayPage with ticket and opid strings that were provided to me in the previous step. To prevent links in this message, I separated the https:// from the remainder of the url.
Using:
Firefox browser (internet explorer for intuit is under construction?)
Windows XP Professional version 2002 service pack 3
ORIGINAL DATA FROM PREVIOUS STEP AND VARIABLE URL FOR RENDER PAYPAGE STEP:
Ticket=JiRr77+9RA7vv73vv70JdVcrYWnvv71b77+977+977+9Cu+/vW7vv70a13061215442877+9DhprATIXA++/ve+/vXfvv73vv73vv73doO+/vXDvv70nHWTvv73v
OpId=77-977-9VO--vX130612154428Dvv73vv70ZHyo7
PTC url with variables: (https://) paymentservices.ptcfe.intuit.com/checkout/terminal?Ticket=(X)&OpId=(Y)&action=checkout
When I DO NOT encode the url, I get the Secure Payment Processing window with the following message: Transaction Not Processed - Our system is currently facing some intermittent problems. Please try again in few minutes.
Unencoded url used:
(https://) paymentservices.ptcfe.intuit.com/checkout/terminal?Ticket=JiRr77+9RA7vv73vv70JdVcrYWnvv71b77+977+977+9Cu+/vW7vv70a13061215442877+9DhprATIXA++/ve+/vXfvv73vv73vv73doO+/vXDvv70nHWTvv73v&OpId=77-977-9VO--vX130612154428Dvv73vv70ZHyo7&action=checkout
When I encode the ticket portion of the url only and then paste the results into the url, I get the Transaction not processed screen again.
Resulting url w/ encoding of ticket only:
(https://) paymentservices.ptcfe.intuit.com/checkout/terminal?Ticket=JiRr77%2B9RA7vv73vv70JdVcrYWnvv71b77%2B977%2B977%2B9Cu%2B%2FvW7vv70a13061215442877%2B9DhprATIXA%2B%2B%2Fve%2B%2FvXfvv73vv73vv73doO%2B%2FvXDvv70nHWTvv73v&OpId=77-977-9VO--vX130612154428Dvv73vv70ZHyo7&action=checkout
When I encode the entire original url I get the QBMS Error Page and [Intuit Logo] - Not Found - Error Code 404
Encoded url entire resulting url encoded:
(https://) paymentservices.ptcfe.intuit.com/checkout/terminal%3FTicket%3DJiRr77%2B9RA7vv73vv70JdVcrYWnvv71b77%2B977%2B977%2B9Cu%2B%2FvW7vv70a13061215442877%2B9DhprATIXA%2B%2B%2Fve%2B%2FvXfvv73vv73vv73doO%2B%2FvXDvv70nHWTvv73v%26OpId%3D77-977-9VO--vX130612154428Dvv73vv70ZHyo7%26action%3Dcheckout
I feel I have tried everything and I do not know what else to try. I am under a deadline to get this done. I tried everything suggested in the one thread I found where this issue was discussed. But I noted there was never really an answer on that thread. I hope you can help me out. :)
I just had this problem, and found the cause. They want the value for Ticket and OpId url encoded. In most scripting languages like PHP, that will turn spaces into "+" symbols. They then expect those plus symbols to also be url encoded. So, in PHP if you do something like this to the Ticket and OpId and any other parameters you pass it will fix the problem (this assumes $ticket and $op_id are already set):
$ticket = str_replace('+','%2B',urlencode($ticket));
$op_id = str_replace('+','%2B',urlencode($op_id));