I am trying to do a perfectly conventional thing: I am using CloudFront in front of S3 to host a static website, but I also want to host another website in a subdirectory. Following the instructions, I believe I got S3 working:
% curl -v http://mydomain.me.s3-website-us-west-1.amazonaws.com/c
> GET /c HTTP/1.1
> Host: mydomain.me.s3-website-us-west-1.amazonaws.com
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
< x-amz-error-code: Found
< x-amz-error-message: Resource Found
< x-amz-request-id: 9BB13A73FFB4503E
< x-amz-id-2: 3JX26tNdHi1irPbFJS7E1BifwliygqRZsZIc/qZptjBqBjjmGL7YGK6xfG23GZR70R0Ou+3ZAiM=
< Location: /c/
< Content-Type: text/html; charset=utf-8
< Content-Length: 313
< Date: Tue, 01 Dec 2020 01:58:08 GMT
< Server: AmazonS3
So /c is redirecting to /c/, which I believe is correct, and that new location definitely serves correctly:
% curl -v http://mydomain.me.s3-website-us-west-1.amazonaws.com/c/
> GET /c/ HTTP/1.1
> Host: mydomain.me.s3-website-us-west-1.amazonaws.com
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 200 OK
< x-amz-id-2: BD0wdDnhonp7Y5i2b7mUDVbIXKYu4O52YPUKVQx5GDaLW5hmDzcrsF/EixdksCtkt/NK6Bg24hY=
< x-amz-request-id: 7F11B109218EF9ED
< Date: Tue, 01 Dec 2020 01:58:11 GMT
< Last-Modified: Tue, 01 Dec 2020 01:31:59 GMT
< x-amz-version-id: zSq5IxE3Ug8oG5SSW.lZsCYydp42.h.4
< ETag: "7999ccd49fe930021167ae6f8fe95eb6"
< Content-Type: text/html
< Content-Length: 36
< Server: AmazonS3
<
And it actually gives me my file. But when I try to go through CloudFront for /c:
% curl -v https://mydomain.me/c
> GET /c HTTP/2
> Host: mydomain.me
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/2 403
< content-type: application/xml
< date: Tue, 01 Dec 2020 01:59:43 GMT
< server: AmazonS3
< x-cache: Error from cloudfront
< via: 1.1 58b53da3f7d231b76d30fcffbf4945a1.cloudfront.net (CloudFront)
< x-amz-cf-pop: SFO20-C1
< x-amz-cf-id: PSjqsinkkfheUfhEPVYbbujMqemugFbrYxM-pQMIihMk3dpp2W4Bmw==
and it returns the familiar S3 Access Denied XML error. For /c/, it is even weirder:
% curl -v https://mydomain.me/c/
> GET /c/ HTTP/2
> Host: mydomain.me
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/2 200
< content-type: application/x-directory; charset=UTF-8
< content-length: 0
< last-modified: Tue, 01 Dec 2020 01:30:44 GMT
< x-amz-version-id: 4L.jn6WG3emcGutRuwEZv_lE0aO07AGR
< accept-ranges: bytes
< server: AmazonS3
< date: Tue, 01 Dec 2020 02:00:31 GMT
< etag: "d41d8cd98f00b204e9800998ecf8427e"
< x-cache: RefreshHit from cloudfront
< via: 1.1 37d64bca4c93552139fb3a85c9c4a119.cloudfront.net (CloudFront)
< x-amz-cf-pop: SFO20-C1
< x-amz-cf-id: r5lS4QTmg07XhIXRlXsNJ4qcJaWXfj5Ik9fXZPY_dzLjED-A2MhBiA==
It "works", but it returns an empty file, which it says is a directory listing.
I have logging turned on, and that last one returns:
b5063beaaa3c80c2ad85635ddb1c5fac3da6b5510e9ef332c9e0df0c9abdd45a mydomain.me [01/Dec/2020:01:57:47 +0000] 73.202.134.48 b5063beaaa3c80c2ad85635ddb1c5fac3da6b5510e9ef332c9e0df0c9abdd45a 116EA2ED16AA56DE REST.GET.NOTIFICATION - "GET /mydomain.me?notification= HTTP/1.1" 200 - 115 - 15 - "-" "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.888 Linux/4.9.217-0.3.ac.206.84.332.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.262-b10 java/1.8.0_262 vendor/Oracle_Corporation" - noe+YUO+FeYaIukSpTTKl9npt1R0+uAr4Hqzx/mQge2bfhydBiiquR9EWG3iGanDRjK/EagN5Ss= SigV4 ECDHE-RSA-AES128-SHA AuthHeader s3-us-west-1.amazonaws.com TLSv1.2
Is CloudFront running some Java library?
curl -v https://mydomain.me/c/index.html works fine.
I assume I have misconfigured CloudFront, but cannot figure out how. Any suggestions?
1. Click on the CloudFront distribution ID.
2. Select the "Origins and Origin Groups" tab.
3. Click the checkbox for the first item under "Origins" (assuming you only have one).
4. Click "Edit".
5. Change the "Origin Domain Name" to "mydomain.me.s3-website-us-west-1.amazonaws.com" (following your example).
6. Click "Yes, Edit".
I've done this a hundred times, I know this is a requirement, and it bites me every time!
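If you'd rather confirm this from the command line, something like the following should show which endpoint the distribution currently points at (the distribution ID below is a made-up placeholder):

# List the origin domain names configured on the distribution.
aws cloudfront get-distribution-config --id E1A2B3C4D5E6F \
  --query 'DistributionConfig.Origins.Items[].DomainName'
# A REST endpoint (e.g. "mydomain.me.s3.us-west-1.amazonaws.com") will not
# issue the /c -> /c/ redirect or serve index documents; the website endpoint
# ("mydomain.me.s3-website-us-west-1.amazonaws.com") will.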
I was trying to write a Karate test that validates that a particular response contains the Content-Encoding header with a value of gzip. I tried two endpoints on my API, and both times the Content-Encoding field was missing from Karate's response, even though both endpoints do return it with Postman and with curl.
I then hit the postman-echo service to see whether Karate was only having issues with my API endpoint, and it seems it is not just mine. Could someone take a look at my code and see if I am doing something incorrectly that keeps the header field from showing up in the response?
Feature: test getting Content-Encoding

Background:
  * url 'https://postman-echo.com/gzip'

Scenario:
  Given header Accept-Encoding = 'gzip'
  When method get
  Then status 200
  And match responseHeaders contains {'Content-Encoding':'#present'}
This is what the Karate request looks like:
1 > GET https://postman-echo.com/gzip
1 > Accept-Encoding: gzip
1 > Connection: Keep-Alive
1 > Host: postman-echo.com
1 > User-Agent: Apache-HttpClient/4.5.12 (Java/1.8.0_252)
and the response looks like:
1 < 200
1 < Connection: keep-alive
1 < Content-Type: application/json; charset=utf-8
1 < Date: Fri, 08 May 2020 16:18:42 GMT
1 < ETag: W/"ef-7kclc8pzXTvQiPUaEOf6j95iFaE"
1 < Vary: Accept-Encoding
1 < set-cookie: sails.sid=s%3A6G_FShPRZH4V1G-tVDfUEEfMwQQmolo5.T2Cb37zqYA21FTyRyIGutVWQWo9ta4EWiod36%2FkM88I; Path=/; HttpOnly
{
"gzipped": true,
"headers": {
"x-forwarded-proto": "https",
"x-forwarded-port": "443",
"host": "postman-echo.com",
"x-amzn-trace-id": "Root=1-5eb58662-c4aaeec26efd116ac0544a18",
"accept-encoding": "gzip",
"user-agent": "Apache-HttpClient/4.5.12 (Java/1.8.0_252)"
},
"method": "GET"
}
The curl response headers are:
curl --location --request GET 'https://postman-echo.com/gzip' \
> --header 'Accept-Encoding: gzip' -I
HTTP/2 200
date: Fri, 08 May 2020 16:21:53 GMT
content-type: application/json; charset=utf-8
content-length: 220
content-encoding: gzip
etag: W/"dc-BuD8DN1qXT7trYUQtZOuSvbq1pM"
vary: Accept-Encoding
set-cookie: sails.sid=s%3Aj86lznX3nK20fnEN4B3nbHESrfWqVJ3M.236VrsmQp7V%2F7%2BrvG%2FEtlc9yUVLTtylh1yyIAdQJSiY; Path=/; HttpOnly
This seems to be the way the Apache HttpClient works, or how it is configured for Karate. I just found that you do get the header back with karate-jersey:
1 > GET https://postman-echo.com/gzip
1 > Accept: */*
1 > Accept-Encoding: gzip
1 > User-Agent: Jersey/2.30 (HttpUrlConnection 1.8.0_231)
22:40:48.981 [ForkJoinPool-1-worker-1] DEBUG com.intuit.karate - response time in milliseconds: 1177.37
1 < 200
1 < Connection: keep-alive
1 < Content-Encoding: gzip
1 < Content-Length: 248
1 < Content-Type: application/json; charset=utf-8
1 < Date: Fri, 08 May 2020 17:10:48 GMT
1 < ETag: W/"f8-sigbV4PuNI2Fx08AqzMEqW1WIYY"
1 < Vary: Accept-Encoding
So if Jersey is OK for the rest of your tests, maybe that's all you need. Getting this to work for karate-apache is not a priority for me, so if you or anyone else is willing to investigate or fix it, that would be great.
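For what it's worth, the transparent-decompression behavior is visible outside Karate too: with --compressed, curl also decodes the gzip body, but unlike the Apache HttpClient it leaves the Content-Encoding header in place (assuming postman-echo.com still responds as shown above):

# Ask for gzip and let curl decode it; -D - dumps the response headers to stdout.
curl -s -D - --compressed -o /dev/null https://postman-echo.com/gzip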
I want to pull a list of all the movies and shows I have seen on Netflix for a personal project, which Netflix has a page for.
Results from trying curl:
curl https://www.netflix.com/MoviesYouveSeen -v
* Trying 50.112.92.119...
* Connected to www.netflix.com (50.112.92.119) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* ALPN/NPN, server did not agree to a protocol
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* subject: CN=www.netflix.com,OU=Operations,O="Netflix, Inc.",L=Los Gatos,ST=CALIFORNIA,C=US
* start date: Apr 14 00:00:00 2015 GMT
* expire date: Apr 12 23:59:59 2017 GMT
* common name: www.netflix.com
* issuer: CN=Symantec Class 3 Secure Server CA - G4,OU=Symantec Trust Network,O=Symantec Corporation,C=US
> GET /MoviesYouveSeen HTTP/1.1
> Host: www.netflix.com
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 302 Found
< Cache-Control: no-cache, no-store
< Date: Tue, 26 Apr 2016 14:47:16 GMT
< Edge-Control: no-cache, no-store
< location: https://www.netflix.com/Login?nextpage=https%3A%2F%2Fwww.netflix.com%2FMoviesYouveSeen
< req_id: 2a134cc9-7f77-4a35-9d83-0099fc7a2466
< Server: shakti-prod i-8cf6164a
< Set-Cookie: nflx-rgn=uw2|1461682036196; Max-Age=-1; Expires=Tue, 26 Apr 2016 14:47:15 GMT; Path=/; Domain=.netflix.com
< Set-Cookie: memclid=b40d0e2c-27b3-4d72-9b14-4477fcf5fa39; Max-Age=31536000; Expires=Wed, 26 Apr 2017 14:47:16 GMT; Path=/; Domain=.netflix.com
< Set-Cookie: nfvdid=BQFmAAEBEDgFjrzXIIi7X6rTj6vmSYUwYpekhXXCCx5ywGWHaOvo0%2BmNx86oMCsliwERTTbRi6FwmgZM3YhqFUBfffSwJ0Kd; Max-Age=31536000; Expires=Wed, 26 Apr 2017 14:47:16 GMT; Path=/; Domain=.netflix.com
< Strict-Transport-Security: max-age=31536
< Via: 1.1 i-6af8eaad (us-west-2)
< X-Content-Type-Options: nosniff
< X-Frame-Options: DENY
< X-Netflix-From-Zuul: true
< X-Netflix.nfstatus: 1_1
< X-Originating-URL: https://www.netflix.com/MoviesYouveSeen
< X-Xss-Protection: 1; mode=block; report=https://ichnaea.netflix.com/log/freeform/xssreport
< Content-Length: 256
< Connection: keep-alive
<
* Connection #0 to host www.netflix.com left intact
I also tried wget:
wget https://www.netflix.com/MoviesYouveSeen
--2016-04-26 10:57:23-- https://www.netflix.com/MoviesYouveSeen
Resolving www.netflix.com (www.netflix.com)... 54.244.126.7, 50.112.115.177, 54.214.7.82, ...
Connecting to www.netflix.com (www.netflix.com)|54.244.126.7|:443... connected.
HTTP request sent, awaiting response... 302 Found
Syntax error in Set-Cookie: nflx-rgn=uw2|1461682643973; Max-Age=-1; Expires=Tue, 26 Apr 2016 14:57:23 GMT; Path=/; Domain=.netflix.com at position 39.
Location: https://www.netflix.com/Login?nextpage=https%3A%2F%2Fwww.netflix.com%2FMoviesYouveSeen [following]
--2016-04-26 10:57:24-- https://www.netflix.com/Login?nextpage=https%3A%2F%2Fwww.netflix.com%2FMoviesYouveSeen
Reusing existing connection to www.netflix.com:443.
HTTP request sent, awaiting response... 200 OK
Syntax error in Set-Cookie: nflx-rgn=uw2|1461682644112; Max-Age=-1; Expires=Tue, 26 Apr 2016 14:57:23 GMT; Path=/; Domain=.netflix.com at position 39.
Length: unspecified [text/html]
Saving to: ‘MoviesYouveSeen’
MoviesYouveSeen [ <=> ] 41.63K 220KB/s in 0.2s
2016-04-26 10:57:24 (220 KB/s) - ‘MoviesYouveSeen’ saved [42629]
It looks like I am not being properly authenticated. In my browser, if I view source, I can see the list of movies. Any suggestions for getting the data?
That 302 response is redirecting you to the login page. You'd need to be logged in, i.e. present valid session cookies, for the query to work correctly.
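If you still want the page from the command line, one common approach is to reuse your browser's session by exporting its cookies to a Netscape-format cookie file and replaying them with curl; the file name here is an assumption:

# cookies.txt: cookies for .netflix.com exported from a logged-in browser session.
curl -L -b cookies.txt -A 'Mozilla/5.0' \
  'https://www.netflix.com/MoviesYouveSeen' -o MoviesYouveSeen.html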
Per my understanding:
the Accept header is used by HTTP clients to tell the server what content types they'll accept. The server will then send back a response, which will include a Content-Type header telling the client what the content type of the returned content actually is.
With this understanding, I tried the following:
curl -X GET -H "Accept: application/xml" http://www.google.com -v
* About to connect() to www.google.com port 80 (#0)
* Trying 173.194.33.81...
* connected
* Connected to www.google.com (173.194.33.81) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8y zlib/1.2.5
> Host: www.google.com
> Accept: application/xml
>
< HTTP/1.1 200 OK
< Date: Tue, 02 Sep 2014 17:58:05 GMT
< Expires: -1
< Cache-Control: private, max-age=0
< Content-Type: text/html; charset=ISO-8859-1
< Set-Cookie: PREF=ID=5c30672b67a74789:FF=0:TM=1409680685:LM=1409680685:S=PsGclk3vR4HWjann; expires=Thu, 01-Sep-2016 17:58:05 GMT; path=/; domain=.google.com
< Set-Cookie: NID=67=rPuxpwUu5UNuapzCdbD5iwVyjjC9TzP_Ado29h3ucjEq4A_2qkSM4nQM3RO02rfyuHmrh-hvmwmgFCmOvISttFfHv06f8ay4_6Gl4pXRjqxihNhJSGbvujjDRzaSibfy; expires=Wed, 04-Mar-2015 17:58:05 GMT; path=/; domain=.google.com; HttpOnly
< P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
< Server: gws
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
< Alternate-Protocol: 80:quic
< Transfer-Encoding: chunked
<
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><
As you can see in the response, I am sent Content-Type: text/html; charset=ISO-8859-1, which is not what I asked for.
Why is a different representation (HTML in this case) sent, even though I asked for XML?
Thanks
From RFC 2616:
If an Accept header field is present,
and if the server cannot send a response which is acceptable
according to the combined Accept field value, then the server SHOULD
send a 406 (not acceptable) response.
Here, "should" means that Google aren't actually obliged to throw a 406 error. But since you're receiving an HTML response, it has effectively the same meaning.
I am verifying that my application handles file content delivered in chunked transfer-encoding mode. I am not sure what change to make to the httpd.conf file to force chunked encoding through Apache. Is it even possible to do this with the Apache server, and if not, what would be an easier solution? I am using Apache 2.4.2 and HTTP 1.1.
By default, keep-alive is On in Apache and I do not see the data as chunked when testing with wireshark.
EDIT: Added more info:
The only way I managed to do this was by enabling the deflate module. Then I configured my client to send an "Accept-Encoding: gzip, deflate" header, and Apache would compress the file and send it back in chunked mode. I had to enable the file type in the module, though:
AddOutputFilterByType DEFLATE image/png
See example:
curl --raw -v --header "Accept-Encoding: gzip, deflate" http://localhost/image.png | more
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /image.png HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost
> Accept: */*
> Accept-Encoding: gzip, deflate
>
< HTTP/1.1 200 OK
< Date: Mon, 13 Apr 2015 10:08:45 GMT
* Server Apache/2.4.7 (Ubuntu) is not blacklisted
< Server: Apache/2.4.7 (Ubuntu)
< Last-Modified: Mon, 13 Apr 2015 09:48:53 GMT
< ETag: "3b5306-5139805976dae-gzip"
< Accept-Ranges: bytes
< Vary: Accept-Encoding
< Content-Encoding: gzip
< Transfer-Encoding: chunked
< Content-Type: image/png
<
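For completeness, the relevant httpd.conf pieces were roughly the following (a sketch; the module path varies by distribution, and Debian/Ubuntu use a2enmod instead of an explicit LoadModule line):

# Load mod_deflate and compress PNG responses; once the output is compressed,
# Apache no longer knows the final length up front, so it sends the response
# with Transfer-Encoding: chunked on HTTP/1.1.
LoadModule deflate_module modules/mod_deflate.so
AddOutputFilterByType DEFLATE image/png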
This resource, http://www.httpwatch.com/httpgallery/chunked/, produces chunked results, which is very useful for testing clients. You can see this by running:
$ curl --raw -i http://www.httpwatch.com/httpgallery/chunked/
HTTP/1.1 200 OK
Cache-Control: private,Public
Transfer-Encoding: chunked
Content-Type: text/html
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Mon, 22 Jul 2013 09:41:04 GMT
7b
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
2d
<html xmlns="http://www.w3.org/1999/xhtml">
....
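Those stray tokens between the markup lines are the chunk-size lines of the chunked framing, given in hex. You can sanity-check them from any shell:

# 7b hex = 123 bytes (the DOCTYPE line plus its CRLF); 2d hex = 45 bytes.
printf '%d %d\n' 0x7b 0x2d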
I got HTTP chunked-encoded data this way on Ubuntu; it might help.
On the Apache server, create a file named index.php in the directory where your index page lives (e.g. /var/www/html/) and paste in the content below (PHP must be installed):
<?php phpinfo(); ?>
Then try to curl the page as below :
root@ubuntu-16:~# curl -v http://10.11.0.230:2222/index.php
* Trying 10.11.0.230...
* Connected to 10.11.0.230 (10.11.0.230) port 2222 (#0)
> GET /index.php HTTP/1.1
> Host: 10.11.0.230:2222
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Wed, 01 Jul 2020 07:51:24 GMT
< Server: Apache/2.4.18 (Ubuntu)
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=UTF-8
<
<!DOCTYPE html>
<html>
<body>
...
...
...
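To confirm that the response really is chunked without scrolling through all of phpinfo()'s output, you can look at just the headers (same host and port as above):

# -D - dumps the response headers to stdout; the body is discarded.
curl -s -D - -o /dev/null http://10.11.0.230:2222/index.php | grep -i '^transfer-encoding'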
I'm trying to port one of my Android applications to work natively on Mac OS X.
For the initialisation of the application, it needs to connect to a server and read only the headers of the server's response. The server (a 3rd-party server) will respond with 82274 bytes of data, but the only data useful to me is in the headers; specifically, I only need to read the session cookie and retrieve its value. All of the other data is redundant.
Through Googling, the only promising-looking solution I found is as follows:
// Create the request.
// Create the request.
NSMutableURLRequest *theRequest = [NSMutableURLRequest requestWithURL:[NSURL URLWithString:@"http://www.grooveshark.com/"]];
[theRequest setHTTPMethod:@"HEAD"];
[theRequest setValue:@"MySpecialUserAgent/1.0" forHTTPHeaderField:@"User-Agent"];
[theRequest setTimeoutInterval:15.0];
[theRequest setCachePolicy:NSURLRequestReloadIgnoringCacheData];
However, this still downloads the entire page.
Can anyone point me in the right direction?
Let's see what happens if we hit that URL.
› curl -v -X HEAD http://www.grooveshark.com
* About to connect() to www.grooveshark.com port 80 (#0)
* Trying 8.20.213.76...
* connected
* Connected to www.grooveshark.com (8.20.213.76) port 80 (#0)
> HEAD / HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: www.grooveshark.com
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: richhickey
< Date: Sat, 15 Dec 2012 20:27:38 GMT
< Content-Type: text/html; charset=UTF-8
< Connection: close
< Location: http://grooveshark.com
< Vary: Accept-Encoding
< X-Hostname: rhl081
< X-Hostname: rhl081
<
* Closing connection #0
So www.grooveshark.com redirects to grooveshark.com. Let's see if that page honors HEAD requests correctly.
› curl -v -X HEAD http://grooveshark.com
* About to connect() to grooveshark.com port 80 (#0)
* Trying 8.20.213.76...
* connected
* Connected to grooveshark.com (8.20.213.76) port 80 (#0)
> HEAD / HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: grooveshark.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: richhickey
< Date: Sat, 15 Dec 2012 20:28:06 GMT
< Content-Type: text/html; charset=UTF-8
< Connection: close
< Vary: Accept-Encoding
< Set-Cookie: PHPSESSID=844a5e6bdd6d84a97afd8f42faf4eb95; expires=Sat, 22-Dec-2012 20:28:06 GMT; path=/; domain=.grooveshark.com
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Vary: Accept-Encoding
< X-Hostname: rhl061
< Set-Cookie: ismobile=no;domain=.grooveshark.com;path=/
< X-country: US
<
* Closing connection #0
That looks good. I suspect that your request is falling back to a GET when following that redirect. It looks like Chris Suter ran into the same thing and gave an example solution: http://sutes.co.uk/2009/12/nsurlconnection-using-head-met.html
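The gist of that workaround, sketched from memory against the NSURLConnection delegate API of that era (untested, so treat it as a starting point):

// NSURLConnection calls this before following the 301; re-assert the HEAD
// method so the redirected request does not fall back to a GET.
- (NSURLRequest *)connection:(NSURLConnection *)connection
             willSendRequest:(NSURLRequest *)request
            redirectResponse:(NSURLResponse *)redirectResponse
{
    if (redirectResponse == nil) {
        return request; // the initial request, not a redirect
    }
    NSMutableURLRequest *redirected = [request mutableCopy];
    [redirected setHTTPMethod:@"HEAD"];
    return redirected;
}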
In the future you might want to try running your requests through a local proxy so that you can see them in flight. That would probably reveal that you make a HEAD request to www.grooveshark.com followed by a GET to grooveshark.com.