Scrapy - one request needs recent output from another - scrapy

I have a spider for an API. The API works like this: you call a URL, which gives you back an authorization token that is valid for a few minutes for making API calls.
The problem I see now is that I make this call once and pass the token to all subsequent requests, but when the token expires the requests start to fail.
What would be the best way to go about this? Ideally I'd just schedule the token request every minute and share the token as some kind of variable that the requests then use at the moment they are executed. But I think that's not possible.
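The "shared variable that is read at execution time" idea can be sketched in plain Python. This is only a minimal sketch, not Scrapy's own mechanism: `TokenProvider`, `fetch_token`, and `max_age` are hypothetical names, and `fetch_token` stands in for whatever code calls the token URL.

```python
import time
from typing import Callable, Optional


class TokenProvider:
    """Caches a short-lived token and refetches it once it is older than max_age."""

    def __init__(
        self,
        fetch_token: Callable[[], str],  # hypothetical: performs the token call
        max_age: float = 60.0,           # assumption: refresh roughly every minute
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._max_age = max_age
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one if it is missing or stale."""
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._max_age:
            self._token = self._fetch_token()
            self._fetched_at = now
        return self._token
```

In a Scrapy project, one place to call `get()` could be a downloader middleware's `process_request`, which runs just before each request is downloaded, so the token is read at execution time rather than when the request was created. How the token itself is fetched there without blocking the crawl is left open in this sketch.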

Related

What's the right way to use tokens for REST APIs?

I need to access a REST API using a token.
I am able to create a token that expires in 1 hour using one endpoint and then use that token to fetch some data at another endpoint.
I need to call the second endpoint multiple times every day and I could just create a token and then fetch the data each time, but that feels silly so I wonder what would be the right way to do this.
Should I be storing the token and the time of expiration and then reusing it until I know it's expired before I get a new token or how should I go about doing this? The only tokens I've used before are ones that don't expire, so I'm not really sure how to do this.
I would implement the pseudocode logic below:
1/a/ Check whether token != Null. If true, go to 3/
1/b/ If false (token == Null), go to 2/
2/ getToken() {make a request for a new token}; go to 3/ after successfully retrieving a new token.
3/ queryAPI(token) {query the REST API}. If the token has expired you will get error 401 (sometimes 400 or 403 when the API does not send back the right error code; test it with your API). Using a try/catch, purge (delete) the current token and then go to 2/. If you get code 200, go to 4/
4/ ???
5/ profit
This way you do not need to check yourself whether the token has expired; the API endpoint will tell you.
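A rough Python version of steps 1/ to 3/ could look like the sketch below. The names `TokenClient`, `get_token`, and `query_api` are hypothetical; `query_api` is assumed to return a `(status_code, body)` pair instead of raising, so the try/catch becomes a status check here.

```python
from typing import Callable, Optional, Tuple

# Add 400 or 403 here if your API reports an expired token that way.
EXPIRED_STATUSES = {401}


class TokenClient:
    def __init__(
        self,
        get_token: Callable[[], str],                  # hypothetical token request
        query_api: Callable[[str], Tuple[int, str]],   # hypothetical API call
    ) -> None:
        self._get_token = get_token
        self._query_api = query_api
        self._token: Optional[str] = None

    def query(self, max_attempts: int = 2) -> str:
        for _ in range(max_attempts):
            if self._token is None:                       # steps 1/ and 2/
                self._token = self._get_token()
            status, body = self._query_api(self._token)   # step 3/
            if status == 200:
                return body
            if status in EXPIRED_STATUSES:
                self._token = None                        # purge, then retry
                continue
            raise RuntimeError(f"unexpected status {status}")
        raise RuntimeError("token rejected even after a refresh")
```

The `max_attempts` limit is an addition to the pseudocode: it stops the loop from retrying forever if the API keeps rejecting freshly issued tokens.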

How should I handle an expired JWT

I am new to JWTs and I have a question about it.
I have a web app built with React, Node.js, and Express, using axios for Ajax calls and the npm jsonwebtoken package for the access tokens.
I read a lot about JWT access tokens and refresh tokens but still one thing is not clear to me.
Let's say a user logged in and got an access token and a refresh token, and the access token will expire in 15 minutes.
What is the best way to go about it?
1. Set a timeout that will execute an API call to get a new access token after 15 minutes (let's say 14.5 minutes to be on the safe side).
2. Set an interceptor that will check whether the token is still valid and, if not, first get a new token and then continue with the request.
3. Is there another way I didn't consider?
If number 1 is the best way, how do I handle a page refresh? The way I have it set up right now, when a user logs in, the login function calls a _refreshCountDown function that:
counts down the time until the token expires, using setTimeout
executes the refresh_token API call
calls itself again to start a new countdown based on the new expiration time
Now, if a user refreshes the page, the login function is not called, and therefore _refreshCountDown is never called.
How would you handle this scenario?
I will appreciate any answer.
Thanks :)

Alternate approaches to token based authentication

I have a RESTful API which users will reach via a set of web/mobile clients, and I am trying to figure out how to handle token auth. My understanding is that traditional token auth works as follows:
User auths by providing user/pass, receives back a token and expiration
Until expiration, the token is passed with every request
Upon expiration, a new token is requested (either by providing a separate 'refresh' token or just by re-authenticating with user/pass)
Is there a good reason not to generate a new token with each request? That is: an initial token is requested via user/pass. This token is passed with the first API request, which returns the content of the api response plus a new token which must be passed with the following request... The advantage to this approach would be that each request (action) the user takes 'resets' the expiration of the token auth such that the token expiration time basically becomes the period of time the user can be inactive without being logged out. Is there a good reason not to use this approach? The approach laid out above seems more commonplace (which is why I ask).
Finally, one only slightly related question. What stops someone who is watching the network from grabbing the token from the user? In particular in the first scheme, it seems easy to do (in the second method, you would need to capture the incoming request and then quickly get the next token before the user does).
From what I read, you want a sliding window in which a user is authenticated. Every new request within the expiry window prolongs the session.
If I understand that correctly I would suggest an alternate approach; every time a request is successfully authenticated update your store in which you have your tokens and update the expiration time.
This way you don't have to bother your users with all the hassle of grabbing the new token every single time.
So, yes, there's a good reason not to do that: it's not necessary for your use case and only annoys the user.
With the above approach I assume that you have a store (database) in which you keep your tokens + an expiration date.
So the process is this:
The user provides username + password
Create record in store
Give user the token
Update store every time a successful request is made
On a related note: don't give the users the expiration date. That's fine when using cookies, for example, but it is merely useful as an additional security measure.
On your slightly related question: nothing stops anyone from grabbing the token if you don't use TLS/SSL/HTTPS. Always use TLS (which is SSL, which is HTTPS, more or less).
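The store-based sliding window described in this answer might be sketched as follows. It is a minimal in-memory illustration only: `SlidingTokenStore` and `ttl` are hypothetical names, a dict stands in for the database, and the username/password check is assumed to have happened before `issue` is called.

```python
import secrets
import time
from typing import Callable, Dict, Optional


class SlidingTokenStore:
    def __init__(self, ttl: float = 900.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl          # assumption: allowed inactivity in seconds
        self._clock = clock
        self._records: Dict[str, dict] = {}  # stand-in for the database table

    def issue(self, username: str) -> str:
        """Create a record in the store and give the user the token."""
        token = secrets.token_urlsafe(32)
        self._records[token] = {
            "user": username,
            "expires_at": self._clock() + self._ttl,
        }
        return token

    def authenticate(self, token: str) -> Optional[str]:
        """Return the user for a live token and push its expiry forward."""
        record = self._records.get(token)
        if record is None:
            return None
        now = self._clock()
        if now >= record["expires_at"]:
            del self._records[token]   # expired: drop the record
            return None
        record["expires_at"] = now + self._ttl  # successful request: slide window
        return record["user"]
```

The client keeps sending the same token; only the server-side expiry moves, so the user never has to pick up a new token per request.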

JMeter Security Token not parsed

I'm busy with a performance test for Confluence created with JMeter. But I'm having a problem with a security token that is required for creating a page with a POST request. This is the query I use; the atl_token is present in the query:
spaceKey=BD&titleWritten=false&linkCreation=false&title=TEST1&wysiwygContent=TEST1kahdjkaskdjadhkajdlkajsdjaldkjsadlajksdjakldjlkacmnlknmclknmlsanmclanmlclanmldmaldlksadlasmdcalcmlamlamclmalkdjsakjdalksxlakmkslmlknmdlasmdlasdad&confirm=Save&parentPageString=Backend+Development+Home&moveHierarchy=true&atl_token=c52cba0fa075e0fde71e3a5546b95a049e9926a8
But when I use this query and paste it into a web page, it says the following:
Your session has expired. You may need to re-submit the form or reload the page.
Is this a timeout or should I do something else in JMeter?
Try adding an HTTP Cookie Manager and an HTTP Header Manager to your test plan. In most cases they will keep track of the session ID (storing and sending it).
You could also read a few articles online about how to use these components in practice to get a better understanding of them...
You can take the following measures to get rid of this issue:
Add a Cookie Manager to your script.
Check the response of the request before the POST request. Ideally it should contain the atl_token.
If you find the token in the earlier request, add a Regular Expression Extractor to that request and fetch the token.
Pass that token in the actual query that you are calling to create the page.
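To make the extraction step concrete, here is a small Python sketch of the kind of pattern a Regular Expression Extractor could use. The HTML shape is an assumption (a hidden form field named atl_token); check the actual response of your earlier request, since the token may appear in a different form there.

```python
import re
from typing import Optional

# Assumption: the earlier response carries the token in a field like
#   <input type="hidden" name="atl_token" value="...">
ATL_TOKEN_RE = re.compile(r'name="atl_token"\s+value="([^"]+)"')


def extract_atl_token(html: str) -> Optional[str]:
    """Return the first atl_token found in the response body, or None."""
    match = ATL_TOKEN_RE.search(html)
    return match.group(1) if match else None
```

The captured group is what would be stored in a JMeter variable and substituted for the hard-coded atl_token value in the POST query, so every run sends a token that belongs to the current session.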

Keeping cookies between outbound HTTP calls

I've got a Mule application that listens for requests from one application, then responds by calling a JSON API multiple times to authenticate and then retrieve several pieces of data, doing some transformation, and returning the results. The API requires HTTP basic authentication. When an account authenticates, the application that provides the API 1) returns a session/authentication cookie that can be used to identify the current user in subsequent calls, and 2) updates the database to record the last authentication timestamp for the current user. The API also has a call to check whether the session/authentication cookie is still valid.
I currently have a flow that invokes the authentication method, then goes on to make a bunch of calls with the session/authentication cookie.
The issue is when the Mule application gets many requests at once, the application that provides the API deadlocks trying to update the authentication timestamp, since the flow will authenticate once for each request. Is there a way (possibly using the object store) to store the session/authentication cookie for use by subsequent requests to the Mule flow? Basically, I want the flow to suspend all other requests to the same flow, check to see if there are stored cookies, check to see if they are still valid, authenticate (again or for the first time) to get a new session/authentication cookie if needed, store the new cookie, then continue.
Is that a reasonable way of doing that, and is it even possible? If not, I think you can get the gist of what I'm trying to accomplish. What better way is there? Thanks!
edit: I've done a little experimentation, and I can definitely use the object store to hold on to the cookie. The part I'm stuck on now is how to get only the first request to re-authenticate if there is no valid cookie while any near-simultaneous requests wait. I'm looking into VM queues and the Mule Requester, but I'm not sure that will work. I will post the code for a fully functional test when I'm done.
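Independent of Mule, the "only the first request re-authenticates while the others wait" behaviour is usually a lock around a check-then-refresh step. The Python sketch below illustrates that idea only; it is not Mule configuration, and `SharedSession`, `authenticate`, and `is_valid` are hypothetical stand-ins for the API's authentication call and its cookie-validity call.

```python
import threading
from typing import Callable, Optional


class SharedSession:
    def __init__(
        self,
        authenticate: Callable[[], str],    # hypothetical: returns a new cookie
        is_valid: Callable[[str], bool],    # hypothetical: asks the API to check it
    ) -> None:
        self._authenticate = authenticate
        self._is_valid = is_valid
        self._lock = threading.Lock()
        self._cookie: Optional[str] = None

    def get_cookie(self) -> str:
        # Only one caller at a time runs the check-and-refresh step; the
        # others block on the lock and then find the cookie already stored.
        with self._lock:
            if self._cookie is None or not self._is_valid(self._cookie):
                self._cookie = self._authenticate()
            return self._cookie
```

Holding the lock during the validity check also serializes that call; if it is slow, one option would be to cache the result of the last check for a short time instead of asking the API on every request.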