Scraping an API protected by Imperva

I want to scrape an API protected by Imperva. It uses an X-D-Token request header and sets visid_incap_* and incap_ses_* cookies in the response headers.
With a datacenter proxy I get roughly one 200 response for every fifty 429 responses, even with only one concurrent request. With a residential proxy I get 200 responses.
Is there any way to get past this protection using only a datacenter proxy?
This is the request:
GET /api/magasins/72/navigation-content/ HTTP/2
Host: api.cora.fr
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36
Accept: application/vnd.api.v1+json
Accept-Language: fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Cache-Control: no-cache
Pragma: no-cache
Cora-Auth: apidrive
App-Id: 1
App-Signature: BROWSER;WEB;91.0.4472.114;;1.25.0;1;2;Chrome;1080;1920
X-D-Token: 3:R3wmbBvTR1D6vDBmzCLerA==:qzf5V/qwmQQShcP5/cFIjM/goahseigjk/Xs2H5btwW5kCw+nLSNStvZUdugaCm1WIVl4vGCwXFf8Te0GueaZV3koYe2oCe7YiDelKihZ5LSVVz3T6uNKMaOxpSFD+CIP6usg48ioqTCv/Wme5hdCQ8n7b5qR25xWKhFCYesCoYZnen2LHVOVnMWde6AkItRarRDG5IcEUW0XYyojX9i+XL6X3Mgnynvsb7l6wVVW4AruNE80MiLkSgo2XHlh3SBFArXBdBvvyKUpfRUGZokMqYDIS03w/ShB1OJ4KUfKs6Wu1hrNCZlY3N8RTE/S8oYAsjpagWzQwTuCTwCLtYv+48kvXRIihtHC1IQ5nRPsd7s4TuanGYsYDjm3CMaUpvA+pQIqLTiLUYdG+lIMfYXUpQpGOXC+2gF69yxyFQbtxpbluv7NsHELoaaLQHvoYKI:JA3UaEpTRK6Wjf6b6yXbvJ28p7vjimPImMsmAN8GEmI=
Uuid: 2bb3d0ad-04c1-485a-a98d-4ac3d753fd1b
Origin: https://www.cora.fr
Referer: https://www.cora.fr/
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
Te: trailers
This is the 429 response:
HTTP/2 429 Too Many Requests
Date: Mon, 27 Sep 2021 13:08:36 GMT
Content-Type: application/json
Server: api.cora.fr
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Strict-Transport-Security: max-age=31536000; includeSubDomains
Access-Control-Allow-Origin: https://www.cora.fr
Vary: Origin
Set-Cookie: nlbi_2346747=ozjvaL0oPg66DB86rtkoMQAAAAA9QXZBJ+NCBIVB3SmXJNXF; path=/; Domain=.cora.fr; Secure; SameSite=None
Set-Cookie: visid_incap_2346747=70eYj3uqQcKlyni2K4A651PCUWEAAAAAQUIPAAAAAAD3srVYiHapPbOcjTfZu3h0; expires=Tue, 27 Sep 2022 10:34:34 GMT; HttpOnly; path=/; Domain=.cora.fr; Secure; SameSite=None
Set-Cookie: incap_ses_1099_2346747=FIAFfin/tmOLYnuIJm9AD1PCUWEAAAAA/xhZaPz5UshGvjEiQfzp2w==; path=/; Domain=.cora.fr; Secure; SameSite=None
X-Cdn: Imperva
X-Iinfo: 0-27795484-27795152 pNYN RT(1632748115671 0) q(0 0 0 0) r(1 1) U5
{"message": "429 Too Many Requests","429}
These are the headers of the 200 response:
HTTP/2 200 OK
Content-Type: application/json
Vary: Accept-Encoding
Cache-Control: no-cache, private
Date: Mon, 27 Sep 2021 14:13:10 GMT
X-Ratelimit-Limit: 10
X-Ratelimit-Remaining: 9
X-Ratelimit-Reset:
Etag: W/"09af7903630eefe87a18365ff527e6917bac5da1"
Server: api.cora.fr
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Strict-Transport-Security: max-age=31536000; includeSubDomains
Access-Control-Allow-Origin: https://www.cora.fr
Vary: Origin
Set-Cookie: nlbi_2346747=FsJOKpMDRQqeSktYrtkoMQAAAADB426mJ/c0BiDBDsyETwFU; path=/; Domain=.cora.fr; Secure; SameSite=None
Set-Cookie: visid_incap_2346747=Dsbf9nV0RN+yazh0zGE893bRUWEAAAAAQUIPAAAAAAD6RV43J8UcxEZJHt07UrHN; expires=Tue, 27 Sep 2022 08:49:53 GMT; HttpOnly; path=/; Domain=.cora.fr; Secure; SameSite=None;
Set-Cookie: incap_ses_476_2346747=qsgAeXWGFRc+6OUn+BebBnbRUWEAAAAAimqhOCYEdQHug9mxUEC0wA==; path=/; Domain=.cora.fr; Secure; SameSite=None
X-Cdn: Imperva
X-Iinfo: 10-120722009-120615528 pNNN RT(1632751989673 0) q(0 0 0 -1) r(4 4) U5
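The 200 response above advertises X-Ratelimit-Limit and X-Ratelimit-Remaining, so whatever the proxy type, requests can be paced against those headers and slowed down after a 429. Below is a minimal Python sketch of that pacing logic. `next_delay` is a hypothetical helper, not part of any library. Retry-After does not appear in the captured responses and is handled only as an assumption, and the post does not say whether the Imperva 429 is tied to these counters at all.

```python
def next_delay(status, headers, attempt, base=1.0, cap=60.0):
    """Return how many seconds to wait before sending the next request.

    status  -- HTTP status code of the last response
    headers -- dict of response headers with lower-cased names
    attempt -- number of consecutive 429 responses seen so far
    """
    if status == 429:
        # Assumption: if the server sends Retry-After in seconds, honour it.
        retry_after = headers.get("retry-after", "")
        if retry_after.isdigit():
            return min(float(retry_after), cap)
        # Otherwise use capped exponential backoff: 1s, 2s, 4s, ...
        return min(base * (2 ** attempt), cap)

    # Assumption: X-Ratelimit-Remaining counts requests left in the window.
    remaining = headers.get("x-ratelimit-remaining", "")
    if remaining.isdigit() and int(remaining) == 0:
        return base
    return 0.0


print(next_delay(429, {}, 3))
print(next_delay(200, {"x-ratelimit-remaining": "9"}, 0))
```

The function only decides how long to sleep. The caller still has to send the request, lower-case the header names, and track the number of consecutive 429 responses.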

Related

New to VBA: MSXML2.XMLHTTP strips session cookies from POST response

I am very new to VBA and am trying to scrape a website. So far I have been able to get cookies from the initial GET request and use them in a POST for a successful login. The next step is to capture the session and user cookies and use them in subsequent requests.
Unfortunately, this is where my problem begins.
After a successful login I call .getAllResponseHeaders() to capture all headers, but the two cookies (Set-Cookie: xf_user and Set-Cookie: xf_session) are missing, so I cannot capture them for later use. For comparison, I am posting the Fiddler (correct) response and the response captured by VBA (incorrect).
I am not sure what I am doing wrong. Please suggest any options; I am happy to take an alternative approach. I think I am very close to success and just need your expert advice.
Fiddler Response
HTTP/1.1 303 See Other
Date: Thu, 30 Apr 2020 04:55:14 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
Last-Modified: Thu, 30 Apr 2020 04:55:24 GMT
Location: https://f95zone.to/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: private, no-cache, max-age=0
Set-Cookie: xf_user=19872%2CUsOoxkBS4bzvLttbYhWkicE-JFQ-vBWo2L68LEVS; expires=Fri, 30-Apr-2021 04:55:24 GMT; Max-Age=31536000; path=/; secure; HttpOnly
Set-Cookie: xf_session=nlJRIrZOrbAiQGVAo_wRJhDSKBsy7wKz; path=/; secure; HttpOnly
Strict-Transport-Security: max-age=15768000
CF-Cache-Status: DYNAMIC
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: 58beab553a76fea5-MEL
alt-svc: h3-27=":443"; ma=86400, h3-25=":443"; ma=86400, h3-24=":443"; ma=86400, h3-23=":443"; ma=86400
cf-request-id: 026b0969420000fea583bd8200000001
Content-Length: 0
VBA Response
date: Thu, 30 Apr 2020 13:47:02 GMT
content-type: text/html; charset=utf-8
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
last-modified: Thu, 30 Apr 2020 13:47:01 GMT
expires: Thu, 19 Nov 1981 08:52:00 GMT
cache-control: private, no-cache, max-age=0
strict-transport-security: max-age=15768000
cf-cache-status: DYNAMIC
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 58c1b6504d3dfe8d-MEL
alt-svc: h3-27=":443"; ma=86400, h3-25=":443"; ma=86400, h3-24=":443"; ma=86400, h3-23=":443"; ma=86400
cf-request-id: 026cf0462f0000fe8d47804200000001
Snippet From My Code
Set objXMLHTTPSearch = CreateObject("MSXML2.XMLHTTP")
objXMLHTTPSearch.Open "POST", "https://f95zone.to/login/login", False
objXMLHTTPSearch.setRequestHeader "Accept", "text/html, application/xhtml+xml, image/jxr, */*"
objXMLHTTPSearch.setRequestHeader "Accept-Language", "en-US"
objXMLHTTPSearch.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
objXMLHTTPSearch.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
objXMLHTTPSearch.setRequestHeader "Accept-Encoding", "gzip, deflate"
objXMLHTTPSearch.setRequestHeader "Host", "f95zone.to"
objXMLHTTPSearch.setRequestHeader "Content-Length", Len(dataSTR)
objXMLHTTPSearch.setRequestHeader "Connection", "Keep-Alive"
objXMLHTTPSearch.setRequestHeader "Cache-Control", "no-cache"
objXMLHTTPSearch.withCredentials = True
objXMLHTTPSearch.send dataSTR
statusSearch = objXMLHTTPSearch.status
fetchHeader = objXMLHTTPSearch.getAllResponseHeaders()
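As far as I know, MSXML2.XMLHTTP does not expose Set-Cookie in getAllResponseHeaders(), and the usual suggestion is to try WinHttp.WinHttpRequest.5.1 or MSXML2.ServerXMLHTTP instead; treat that as an assumption to verify. Once a header dump like the Fiddler one is available, pulling the cookie values out is simple string handling. Here is a minimal Python sketch of that parsing step. `parse_set_cookies` is a hypothetical helper, and the cookie values in the sample are shortened.

```python
def parse_set_cookies(raw_headers):
    """Collect name/value pairs from every Set-Cookie line in a header dump."""
    cookies = {}
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "set-cookie":
            continue
        # Only the first "name=value" segment is the cookie itself;
        # the remaining segments (expires, path, secure, ...) are attributes.
        pair = value.split(";", 1)[0].strip()
        key, eq, val = pair.partition("=")
        if eq:
            cookies[key] = val
    return cookies


# Shortened sample modelled on the Fiddler response above.
raw = (
    "Content-Type: text/html; charset=utf-8\r\n"
    "Set-Cookie: xf_user=19872%2CUsOo; expires=Fri, 30-Apr-2021 04:55:24 GMT; path=/; secure; HttpOnly\r\n"
    "Set-Cookie: xf_session=nlJRIrZO; path=/; secure; HttpOnly\r\n"
)
print(parse_set_cookies(raw))
```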

Download page text using WebClient in VB.NET

I'm trying to download a simple web page as text using WebClient, but I always run into a problem.
I think the problem is the User-Agent, but when I set one on the WebClient I get the same problem.
Captured HTTP headers for the page:
GET /wp-json/binlist/v1/441442/?_wpnonce=335f68c9e2 HTTP/1.1
Host: binlist.org:443
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Cookie: _ga=GA1.2.1639241798.1540059335
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.0.1617 Safari/537.36
HTTP/1.1 200
access-control-allow-headers: Authorization, Content-Type
access-control-expose-headers: X-WP-Total, X-WP-TotalPages
allow: GET
alt-svc: quic=":443"; ma=86400; v="43,39"
cache-control: max-age=0
content-encoding: gzip
content-length: 221
content-type: application/json; charset=UTF-8
date: Sat, 22 Jun 2019 10:02:14 GMT
expires: Sat, 22 Jun 2019 10:02:13 GMT
host-header: 192fc2e7e50945beb8231a492d6a8024
link: <https://binlist.org/wp-json/>; rel="https://api.w.org/"
server: nginx
set-cookie: wpSGCacheBypass=0; expires=Sat, 22-Jun-2019 09:02:13 GMT; Max-Age=0; path=/
status: 200
vary: Accept-Encoding
x-cache-enabled: True
x-content-type-options: nosniff
x-proxy-cache: MISS
x-robots-tag: noindex
x-wp-nonce: 335f68c9e2
My code:
Private Sub Button4_Click(sender As Object, e As EventArgs) Handles Button4.Click
    Dim webClient As New System.Net.WebClient
    webClient.Headers("User-Agent") = "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
    Dim result As String = webClient.DownloadString("https://binlist.org/wp-json/binlist/v1/441442/?_wpnonce=a7ddc554d3")
    RichTextBox3.Text = result
End Sub
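The post does not say what the actual error is, so the following covers only one possibility. The captured response carries content-encoding: gzip, and a client that does not decompress automatically will see binary data instead of JSON. Here is a minimal Python sketch of the decoding step. `decode_body` is a hypothetical helper, and the payload is made-up sample data, not the real API output.

```python
import gzip
import json


def decode_body(body, headers):
    """Return the response body as text, undoing gzip if the server used it.

    body    -- raw response bytes
    headers -- dict of response headers with lower-cased names
    """
    if headers.get("content-encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    # The capture declares charset=UTF-8 for the JSON body.
    return body.decode("utf-8")


# Made-up payload standing in for the compressed JSON response.
payload = gzip.compress(json.dumps({"bin": "441442"}).encode("utf-8"))
text = decode_body(payload, {"content-encoding": "gzip"})
print(json.loads(text))
```

Note also that the _wpnonce value in the code differs from the one in the capture, which may matter if the nonce is validated server-side.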

JSESSIONID in JMeter after logging in to a site

What should I do to correlate the JSESSIONID so that I can use different logins in JMeter? I have already parameterized the login values, but I am stuck on this JSESSIONID.
Sample Count: 1
Error Count: 0
Data type ("text"|"bin"|""): text
Response code: 200
Response message: OK
Response headers:
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
X-AREQUESTID: 340x9129744x1
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: Thu, 01 Jan 1970 00:00:00 GMT
X-ASEN: SEN-1047238
Set-Cookie: atlassian.xsrf.token=AVWR-AYBS-V3UU-QQRS|fef17187ee7e13e93c498a08e44fb5c2b90aba75|lout; Path=/
X-AUSERNAME: anonymous
X-Content-Type-Options: nosniff
Set-Cookie: JSESSIONID=3075A3A258CBA5D6131F724E3C0800CC; Path=/; HttpOnly
X-Accel-Buffering: no
Vary: User-Agent
Content-Type: text/html;charset=UTF-8
Transfer-Encoding: chunked
Date: Sun, 07 Oct 2018 09:40:27 GMT
Content-Encoding: gzip
HTTPSampleResult fields:
ContentType: text/html;charset=UTF-8
DataEncoding: UTF-8
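In JMeter this kind of correlation is normally handled by an HTTP Cookie Manager, or by a Regular Expression Extractor applied to the response headers. The extraction itself comes down to one regular expression, sketched here in Python. `extract_jsessionid` is a hypothetical helper, the sample headers are abbreviated from the sampler result above, and the pattern is one reasonable choice rather than the only one.

```python
import re

# Capture everything after "JSESSIONID=" up to the next ";" or whitespace.
JSESSIONID_RE = re.compile(
    r"^Set-Cookie:\s*JSESSIONID=([^;\s]+)",
    re.IGNORECASE | re.MULTILINE,
)


def extract_jsessionid(response_headers):
    """Return the JSESSIONID value from a block of response headers, or None."""
    match = JSESSIONID_RE.search(response_headers)
    return match.group(1) if match else None


sample = (
    "HTTP/1.1 200 OK\n"
    "Server: Apache-Coyote/1.1\n"
    "Set-Cookie: JSESSIONID=3075A3A258CBA5D6131F724E3C0800CC; Path=/; HttpOnly\n"
    "Content-Type: text/html;charset=UTF-8\n"
)
print(extract_jsessionid(sample))
```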

Retrieving multiple set-cookies in VB.NET

I have a common function to retrieve the cookies (login):
postReq.CookieContainer = tempCookies
tempCookies.Add(postresponse.Cookies)
Return tempCookies
The site I'm getting the cookies from is giving me three Set-Cookies:
Set-Cookie: PHPSESSID=abcdefghijklmnopqrstuvwxyz; path=/
Set-Cookie: login=abcdefghijklmnopqrstuvwxyz; expires=Sat, 10-Nov-2012 10:02:56
Set-Cookie: auth=abcdefghijklmnopqrstuvwxyz; expires=Mon, 10-Nov-2014 10:02:56 GMT; path=/; domain=.domain.here
Though it successfully retrieves all three, when I try to pass the cookies, it only sends the first one:
Cookie: PHPSESSID=abcdefghijklmnopqrstuvwxyz;
I want it to be like this:
Cookie: PHPSESSID=abcdefghijklmnopqrstuvwxyz;login=abcdefghijklmnopqrstuvwxyz;auth=abcdefghijklmnopqrstuvwxyz;
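For reference, the target header can be built by joining the name/value pairs with "; ". Here is a minimal Python sketch of that step. `build_cookie_header` is a hypothetical helper. It deliberately skips the domain and path matching that a real cookie container applies before sending a cookie, and that matching may be why only one of the three is sent here (an assumption, not confirmed by the post).

```python
def build_cookie_header(cookies):
    """Join name/value pairs into a single Cookie request header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


cookies = {
    "PHPSESSID": "abcdefghijklmnopqrstuvwxyz",
    "login": "abcdefghijklmnopqrstuvwxyz",
    "auth": "abcdefghijklmnopqrstuvwxyz",
}
print("Cookie: " + build_cookie_header(cookies))
```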

So nginx is not interpreting folded headers correctly?

HTTP/1.1 header field values can be
folded onto multiple lines if the
continuation line begins with a space
or horizontal tab. All linear white
space, including folding, has the same
semantics as SP. A recipient MAY
replace any linear white space with a
single SP before interpreting the
field value or forwarding the message
downstream. (quoted from here)
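The unfolding rule quoted above can be sketched in a few lines of Python: a line break followed by spaces or tabs is replaced with a single space before the field value is interpreted. `unfold_headers` is a hypothetical helper, and the sample assumes the continuation line starts with one space.

```python
import re


def unfold_headers(raw):
    """Replace each line break followed by SP/HTAB with a single space."""
    return re.sub(r"\r?\n[ \t]+", " ", raw)


raw = (
    "Host: localhost\r\n"
    "Cookie: A=t;\r\n"
    " artDate=t\r\n"
    "Cache-Control: max-age=0\r\n"
)
print(unfold_headers(raw))
```

After unfolding, the Cookie field reads "A=t; artDate=t", so a recipient that applies this rule would see both cookies.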
Here's my server-side script, which just dumps the cookie content:
var_dump($_COOKIE);exit;
Here is my test; please pay attention to the Cookie header:
GET /logtest.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; zh-CN; rv:1.9.2.17) Gecko/20110420 AlexaToolbar/alxf-2.11 Firefox/3.6.17
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Cookie: A=t;
artDate=t
Cache-Control: max-age=0
HTTP/1.1 200 OK
Server: iis/8.0
Date: Mon, 23 May 2011 12:38:00 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Keep-Alive: timeout=20
X-Powered-By: PHP/5.3.2
Set-Cookie: ZDEDebuggerPresent=php,phtml,php3; path=/
27
array(1) {
["A"]=>
string(1) "t"
}
0
GET /logtest.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; zh-CN; rv:1.9.2.17) Gecko/20110420 AlexaToolbar/alxf-2.11 Firefox/3.6.17
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Cookie: A=t;
artDate=t
Cache-Control: max-age=0
HTTP/1.1 200 OK
Server: iis/8.0
Date: Mon, 23 May 2011 12:38:11 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Keep-Alive: timeout=20
X-Powered-By: PHP/5.3.2
Set-Cookie: ZDEDebuggerPresent=php,phtml,php3; path=/
27
array(1) {
["A"]=>
string(1) "t"
}
0
GET /logtest.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; zh-CN; rv:1.9.2.17) Gecko/20110420 AlexaToolbar/alxf-2.11 Firefox/3.6.17
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Cookie: A=t;artDate=t
Cache-Control: max-age=0
HTTP/1.1 200 OK
Server: iis/8.0
Date: Mon, 23 May 2011 12:38:55 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Keep-Alive: timeout=20
X-Powered-By: PHP/5.3.2
Set-Cookie: ZDEDebuggerPresent=php,phtml,php3; path=/
47
array(2) {
["A"]=>
string(1) "t"
["artDate"]=>
string(1) "t"
}
0
It's a known issue that doesn't have a high priority.