I have a requirement to SFTP ".csv" files from a corporate on-premises Linux box to an S3 bucket.
The current setup is as follows:
The on-premises Linux box is NOT connected to the internet.
The corporate network is connected to AWS through Direct Connect.
There are several VPCs for different purposes. Only one VPC has an IGW and a public subnet (to accept requests coming from the public internet); all other VPCs have neither an IGW nor public subnets.
The corporate network and several AWS VPCs (those without an IGW) are connected to each other through a Transit Gateway.
Can someone please advise whether I should use AWS Transfer or S3 VPC interface endpoints to transfer files from on-premises (the corporate network) to the S3 bucket, and why?
Thank you in advance for your valuable advice.
You should create a server endpoint that can be accessed only within your VPC, as described in the AWS Transfer Family documentation.
Note that this is a special endpoint for AWS Transfer. It is not an endpoint for Amazon S3.
Alternatively, you could run an SFTP server on an Amazon EC2 instance, as long as the instance also has access to Amazon S3 to upload the files received.
Of course, I'd also recommend avoiding SFTP altogether and uploading directly to Amazon S3 if it is at all possible. Using SFTP adds complexity and expense that is best avoided.
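As a rough illustration of the VPC-only endpoint option, here is a minimal Python sketch of the request that the AWS Transfer Family `CreateServer` API accepts for an internal SFTP server backed by S3. The VPC, subnet, and security group IDs are placeholders, and the helper names are hypothetical. The boto3 call is imported lazily and only runs if you invoke it yourself with valid credentials.

```python
def build_sftp_server_request(vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for the Transfer Family CreateServer call.

    No Elastic IPs (AddressAllocationIds) are attached, so the endpoint is
    reachable only from inside the VPC and from networks routed to it
    (for example through Direct Connect or a Transit Gateway).
    """
    return {
        "Protocols": ["SFTP"],
        "Domain": "S3",  # store uploaded files in Amazon S3
        "IdentityProviderType": "SERVICE_MANAGED",
        "EndpointType": "VPC",
        "EndpointDetails": {
            "VpcId": vpc_id,
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


def create_sftp_server(request):
    """Send the request to AWS. Assumes boto3 is installed and configured."""
    import boto3  # imported lazily so the builder works without boto3

    response = boto3.client("transfer").create_server(**request)
    return response["ServerId"]


if __name__ == "__main__":
    # Placeholder IDs, not real resources.
    print(build_sftp_server_request("vpc-EXAMPLE", ["subnet-EXAMPLE"], ["sg-EXAMPLE"]))
```

The security group would still need to allow inbound TCP port 22 from the corporate address range, and each SFTP user needs an IAM role that grants access to the target bucket.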
Whether I try to create an AWS S3 File Gateway (EC2) in the Management Console or with Terraform, I get the same problem below...
If I launch the EC2 instance in a public subnet, the gateway is created. If I try to launch the gateway in a private subnet (with NAT, all ports open in and out), it won't work. I get...
HTTP ERROR 500
I am running a VPN and am able to ping the instance's private IP when I use the Management Console. I get the same error code with Terraform run from a Cloud9 instance, which is also able to ping the instance.
Since I intend to share the S3 bucket over NFS, it's important that the instance resides in a private subnet. I'm new to the AWS S3 File Gateway. I have read through the documentation, but nothing clearly states how to do this or why a private subnet would be different, so if you have any pointers I could look into, I'd love to know!
For further reference (not really needed), my testing in Terraform is mostly based on this GitHub repository:
https://github.com/davebuildscloud/terraform_file_gateway/tree/master/terraform
I was unable to get the AWS console test to work, but I realised my Terraform test was poorly done: I was mistakenly skipping over a dependency that established the VPC peering connection to the Cloud9 instance. Once I fixed that, it worked. Still, I would love to know what would be required to get this to work through the Management Console too...
I have a Google Enterprise Subscription (Redis Cloud/Fixed Plan/GCP/us-east1/Standard/100MB).
I can connect to the database from my local DEVELOPMENT environment.
BUT I CANNOT connect when I publish the app to Google Cloud Platform (Cloud Run).
My Cloud Run app is in the same region as the Redis instance (us-east1).
The connection between your GCP project and the Redis instance is achieved through VPC network peering, as specified in the docs. Check all the restrictions and considerations for VPC network peering in the GCP documentation. I believe that routing all traffic from your service through a Serverless VPC Access connector attached to the VPC network that is peered with your Redis instance could do the trick.
Alternatively, assigning your Cloud Run service a static outbound IP address, as described in the docs, should also let the connection succeed. You'll basically need to configure the Cloud Run service's VPC egress to route all outbound traffic through a VPC network (using a Serverless VPC Access connector) that has a Cloud NAT gateway configured with the static IP address. Making sure that this IP address is allowed in the Source IP ACL of your Redis Enterprise instance should then permit the connection.
Finally, if you face too many difficulties achieving such a connection, you could try hosting your Redis instance in Cloud Memorystore and following the relevant section of the docs (you'll once again need to create a VPC connector).
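When debugging this kind of setup, it can help to check plain TCP reachability from inside the Cloud Run container before involving a Redis client at all. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the host and port you pass in are whatever your Redis endpoint exposes.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    A False result points at routing, firewall, or ACL problems
    (for example a missing VPC connector or a blocked source IP)
    rather than at Redis authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False
```

Calling `can_reach(redis_host, redis_port)` at service startup and logging the result shows whether the egress path works independently of the Redis credentials.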
I am creating an Angular 6 frontend application. My backend APIs are built in .NET. Assume the application is similar to https://www.amazon.com/.
My query relates only to deploying the frontend on AWS. A large number of users, with a variable traffic pattern, is expected on my portal. I thought of using AWS Elastic Beanstalk as a PaaS web server.
Can AWS S3/ELB be used instead of Elastic Beanstalk without any limitations?
I'm not 100% sure what you mean by combining an Elastic Load Balancer with S3. I think you may be confused about the purpose of the ELB, which is to distribute requests across multiple servers (e.g. Node.js servers). It cannot be used with S3, which is already highly available.
There are numerous options when serving an Angular app:
You could serve the files using a Node.js app, but unless you are doing server-side rendering (using Angular Universal), I don't see the point, because you are just serving static files (files that don't get stitched together by a server, as happens when you use PHP). It is more complicated to deploy and maintain a server, even using Elastic Beanstalk, and it is probably difficult to get the same performance as you could with other setups (see below).
What I suspect most people would do is configure an S3 bucket to host and serve the static files of your Angular app (https://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteHosting.html). You basically configure your domain name to resolve to the S3 bucket's website URL. This is extremely cheap, as you are not paying for a server that runs constantly. You only pay a small storage cost plus a data transfer fee that is directly proportional to your traffic.
You can further improve on the S3 setup by creating a CloudFront distribution that uses your S3 bucket as its origin (the location that it gets files from). When you configure your domain name to resolve to your CloudFront distribution, a user's request no longer fetches the files from the S3 bucket (which could be in a region on the other side of the world and so slower). Instead, the request is directed to the closest "edge location", which will be much closer to your user and is checked first for cached files. It is basically a global content delivery network for your files. This is a bit more expensive than S3 on its own. See https://aws.amazon.com/premiumsupport/knowledge-center/cloudfront-serve-static-website/.
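To make the S3 hosting option more concrete, here is a minimal Python sketch of the website configuration that the S3 `PutBucketWebsite` API accepts. Pointing the error document at `index.html` is an assumption on my part: it is a common way to let a single-page app handle its own routes, though S3 still returns a 404 status code for those responses. The helper names and the bucket name are hypothetical, and the boto3 call only runs if you invoke it yourself.

```python
def build_website_configuration(index_document="index.html",
                                error_document="index.html"):
    """Build the WebsiteConfiguration structure for S3 static hosting."""
    return {
        "IndexDocument": {"Suffix": index_document},
        # Assumption: unknown paths fall back to the Angular entry point.
        "ErrorDocument": {"Key": error_document},
    }


def enable_static_hosting(bucket_name, configuration):
    """Apply the configuration. Assumes boto3 is installed and configured."""
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("s3").put_bucket_website(
        Bucket=bucket_name,
        WebsiteConfiguration=configuration,
    )


if __name__ == "__main__":
    print(build_website_configuration())
```

The contents of the Angular build output folder would then be uploaded to the bucket, with CloudFront optionally placed in front of it as described above.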
My Airflow application is running on an AWS EC2 instance that has an IAM role attached. Currently I am creating the Airflow S3 connection using a hardcoded access key and secret key, but I want my application to pick up the AWS credentials from the instance itself.
How can I achieve this?
We have a similar setup: our Airflow instance runs inside containers deployed on an EC2 machine. We set up the policies for S3 access on the EC2 machine's instance profile. You don't need to pick up the credentials on the EC2 machine, because the machine has an instance profile that should carry all the permissions you need. On the Airflow side, we only use the aws_default connection. In its extra parameter we only set the default region, and there are no credentials.
Here is a detailed article about instance profiles: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
The question is answered, but for future reference, it is possible to do this without relying on aws_default, using only environment variables. Here is an example that writes logs to S3 using an AWS connection that benefits from IAM:
AIRFLOW_CONN_AWS_LOG="aws://"
AIRFLOW__CORE__REMOTE_LOG_CONN_ID=aws_log
AIRFLOW__CORE__REMOTE_LOGGING=true
AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER="s3://path to bucket"
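The point of the bare `aws://` value is that the connection URI carries no access key or secret, so the AWS SDK falls back to the instance profile. A small standard-library sketch (the helper name is hypothetical) shows how to check that a connection URI has no embedded credentials:

```python
from urllib.parse import urlparse


def has_embedded_credentials(connection_uri):
    """Return True if the URI contains a login or password component."""
    parsed = urlparse(connection_uri)
    return bool(parsed.username or parsed.password)


if __name__ == "__main__":
    # The URI from the environment variable above has no credentials.
    print(has_embedded_credentials("aws://"))
    # A URI with a hardcoded key pair (placeholder values) does.
    print(has_embedded_credentials("aws://AKIAEXAMPLE:placeholder-secret@"))
```

Note that on Airflow 2.x the remote logging options moved from the core section to the logging section, so the variable names there use the `AIRFLOW__LOGGING__` prefix instead of `AIRFLOW__CORE__`.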
I have a Lambda function that needs to access an EC2 instance over SSH, load files, and save them to S3. For that, I have placed both the EC2 instance and the Lambda function in the default VPC and the same subnet. The problem is that the function can connect to EC2 but not to S3.
It has been killing me since this morning: when I remove the VPC settings, the function uploads the files to S3, but then the connection to EC2 is lost.
I tried adding a NAT gateway to the default VPC (although I am not sure whether I did it correctly, because I am new to this), but it didn't change anything.
I am confused, as my EC2 instance, which is in the same VPC and subnet, can access the internet, but the Lambda function is not able to access S3.
I am not sure how to proceed.
Please help!!!
The Lambda function will not get a public IP assigned to it from within a VPC, so it will never have direct Internet access like your EC2 instance has. You will have to move the Lambda function to a private subnet with a route to a NAT Gateway in order to give it Internet access. It sounds like you attempted this but configured it incorrectly.
If all the Lambda function needs to access is S3, then it is easier to set up a VPC endpoint for S3 in your VPC (either a gateway endpoint or an AWS PrivateLink interface endpoint).
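As a sketch of the endpoint option, the following Python builds the request for the EC2 `CreateVpcEndpoint` API for an S3 gateway endpoint. The VPC ID, region, and route table ID are placeholders, the helper names are hypothetical, and choosing a gateway endpoint rather than an interface endpoint is my assumption. The boto3 call only runs if you invoke it yourself with valid credentials.

```python
def build_s3_gateway_endpoint_request(vpc_id, region, route_table_ids):
    """Build keyword arguments for creating an S3 gateway endpoint."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Route tables of the subnets the Lambda function is attached to.
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(request, region):
    """Send the request to AWS. Assumes boto3 is installed and configured."""
    import boto3  # imported lazily so the builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(**request)
    return response["VpcEndpoint"]["VpcEndpointId"]


if __name__ == "__main__":
    # Placeholder IDs, not real resources.
    print(build_s3_gateway_endpoint_request("vpc-EXAMPLE", "us-east-1", ["rtb-EXAMPLE"]))
```

With the endpoint associated with the route table of the Lambda function's subnet, S3 traffic stays inside the AWS network and no NAT gateway is needed for it.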