Direct upload to S3 from NAS

I'm trying to set up an S3 backup for my company's NAS (a QNAP TS-EC879U-RD), and I'm having some trouble. The NAS device itself has a much faster network connection than the computer that I'm using to upload (10Gb/s vs 1Gb/s), but it seems like every tool I explore has to move data through my computer first. I'm sure there must be a way to bypass this bottleneck, but I can't imagine what it is. Any help pointing me in the right direction would be much appreciated.

The only way to do this would be to have the NAS itself run the upload software. Otherwise, the data MUST be read from the NAS into your PC and then sent via the upload software to S3.
Think about it this way: what you're asking for is software that can see what files are on the NAS and then tell the NAS to upload those files through its own network connection. If you could do that without installing anything on the NAS, then any software running on your PC (e.g., a virus) could tell the NAS to upload its data to some hacker's servers.
As such, either the software runs on the NAS itself, which can then upload data directly to S3, or the upload software will need to pull the data from the NAS into your PC and then upload it to S3.
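If the QNAP can run a script directly (these units generally offer SSH access and can have Python installed), the "software on the NAS" approach can be as small as a boto3 loop like the sketch below. The share path, bucket name, and credential setup are placeholders, not anything from your environment.

```python
# Minimal sketch, meant to run ON the NAS so data leaves via its 10Gb/s link.
# Assumes Python and boto3 are available on the NAS; paths/bucket are placeholders.
import os
import boto3

s3 = boto3.client("s3")  # credentials come from env vars or ~/.aws/credentials

SOURCE_DIR = "/share/backups"          # hypothetical NAS share path
BUCKET = "example-nas-backup-bucket"   # placeholder bucket name

for root, _dirs, files in os.walk(SOURCE_DIR):
    for name in files:
        local_path = os.path.join(root, name)
        key = os.path.relpath(local_path, SOURCE_DIR)  # key mirrors the share layout
        s3.upload_file(local_path, BUCKET, key)
        print(f"uploaded {local_path} -> s3://{BUCKET}/{key}")
```

Scheduled from the NAS's own cron, the data goes straight out of the NAS's fast interface and never passes through your PC.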

Related

How to access Amazon S3 backup without Jungle Disk

I've been backing up my Mac to the Amazon S3 cloud using Jungle Disk. Now that Mac is dead. Fine, my backups are in the cloud. So, I go to my other Mac and download Jungle Disk. It is a workgroup version of the software. When I run it, it wants me to verify that I purchased the software. Well, when I first set up the Jungle Disk client some years ago there was a free client. I'd rather not pay for this unless there's no good alternative.
Next I login to my Amazon S3 Console. I have a bunch of buckets there which are impossible to navigate.
So, I google around for S3 browsers and find Cyberduck. I download and install that. When I run it, it wants a server URL. At this point I'm stuck.
Is there a client that knows about the structure of backups in S3 that I can install on this other Mac to get to my backed up data?
After a couple of conversations with Jungle Disk support I was given this (undocumented) URL:
https://downloads.jungledisk.com/jungledisk/JungleDiskDesktop3160.dmg
I've downloaded and installed the client, didn't have to pay anything, and I've gotten to my backed up data. Whew!
Sol got his stuff fixed. Sharing additional background for future readers: Jungle Disk uses the WebDAV standard to allow access through our web service layer. Depending on the version of Jungle Disk you're running, we have a few different URLs you'll authenticate to. Ping our team at support.jungledisk.com and we'll get you set up.

Where to begin with managing web servers / business document file management

I've inherited a couple of web servers - one Linux, one Windows - with a few sites on them, nothing too essential. I'd like to test out setting up back-ups for the servers to both a local machine and a cloud server, and then also use the cloud server to access business documents, with the local machine as a back-up for those documents.
I'd like to be able to access all data wherever I am via an internet connection. I can imagine it running as follows:
My PC <--> Cloud server - access by desktop VPN or Web UI
My PC <--> Web Servers - via RDP, FTP, Web UI (control panels) or SSH
My PC <--> Local Back-up - via RDP, FTP, SSH or if I'm in the office, Local Network
Web servers --> Local Back-up - nightly via FTP or SSH
Cloud Server --> Local Back-up - nightly via FTP or SSH
Does that make sense? If so, what would everyone recommend for a cloud server and also how best to set up the back-up server?
I have a couple of spare PCs that could serve as local back-up machines - would that work? I'm thinking they'd have to be online 24/7.
Any help or advice given or pointed to would be really appreciated. Trying to understand this stuff to improve my skill set.
Thanks for reading!
Personally I think you should explore using AWS's S3. The better (S)FTP clients can all handle S3 (Cyberduck, Transmit, etc.), the API is friendly if you want to write a script, there is a great CLI suite that you could use in a cron job, and there are quite a few custom solutions to assist with the workflow you describe, s3tools being one of the better-known ones. The web UI is fairly decent as well.
Automating the entire lifecycle you described would be a fairly simple process. Here's one process for Windows, another general tutorial, another for Windows, and a quick review of some other S3 tools.
I personally use a similar workflow with S3/Glacier that's fully automated, versions backups, and migrates them to Glacier after a certain timeframe for long-term archival.
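For the S3/Glacier piece, a rough boto3 sketch of the bucket-side setup is below; the bucket name, prefix, and 30-day window are illustrative assumptions rather than my actual configuration.

```python
# Sketch: enable versioning and add a lifecycle rule that migrates backups
# to Glacier after 30 days. Bucket name, prefix and timings are placeholders.
import boto3

s3 = boto3.client("s3")
BUCKET = "example-backup-bucket"  # placeholder

# Keep old versions of backed-up files
s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)

# Move everything under backups/ to Glacier for long-term archival
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-backups",
                "Filter": {"Prefix": "backups/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            }
        ]
    },
)
```

The nightly copies themselves can then just be a cron job on each server running a small upload script or one of the CLI tools mentioned above.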

Amazon S3 WebDAV access

I would like to access my Amazon S3 buckets without third-party software, but simply through the WebDAV functionality available in most operating systems. Is there a way to do that ? It is important to me that no third-party software is required.
There are a number of ways to do this. I'm not sure about your situation, so here they are:
Option 1: Easiest: You can use a 3rd party "cloud gateway" provider, like http://storagemadeeasy.com/CloudDav/
Option 2: Set up your own "cloud gateway" server
Set up a dedicated server or virtual server to act as a gateway. Using Amazon's own EC2 would be a good choice.
Set up software that mounts S3 as a drive. Two I know of on Windows: (1) CloudBerry Drive http://www.cloudberrylab.com/ and (2) WebDrive (http://webdrive.com). For Linux, I have never done it, but you can try: https://github.com/s3fs-fuse/s3fs-fuse
Set up a WebDAV server like CrushFTP. (It comes to mind because it's stable and cheap and works on any OS.) Another option is IIS, but I personally find it harder to set up securely for WebDAV.
Set up a user in your WebDAV server (i.e., CrushFTP or IIS) with access to the mapped S3 drive.
Possible snag: Assuming you're using Windows, to start your services automatically and have this work, you may need to set up both services to use the same Windows user account (Services->(Your Service)->[right-click]Properties->Log On tab). This is because the S3 mapping software might not map the S3 drive for all Windows users. Alternatively, you can use FireDaemon if you get stuck on this step to start the programs as a service all under the same username.
Other notes: I have experience using WebDrive under pretty heavy loads, and it seems to work well. Under tons of pounding (I'm talking thousands of files per hour being added to a 5 TB WebDrive) it started to crash Windows. But I'm not sure if you are going that far with it. Also, if you're using EC2, you may not have that issue since it was likely caused by a huge transfer queue in memory and EC2 will have faster transit to S3 and keep the queue smaller.
I finally gave up on this idea and today I use Rclone (https://rclone.org) to synchronize my files between AWS S3 and different computers. Rclone has the ability to mount remote storage on a local computer, but I don't use this feature. I simply use the copy and sync commands.
S3 does not support WebDAV, so you're out of luck!
Also, S3 does not support hierarchical namespaces, so you can't directly map a filesystem onto it.
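To make the flat-namespace point concrete, here's a small boto3 sketch (the bucket name is a placeholder): keys may contain slashes, but they are just strings, and "folders" only exist as a listing convention built on a prefix and delimiter.

```python
# Demonstration that S3 keys are flat strings; "folders" are only a listing convention.
import boto3

s3 = boto3.client("s3")
BUCKET = "example-bucket"  # placeholder

# Two independent keys; no "docs/" directory object has to exist anywhere.
s3.put_object(Bucket=BUCKET, Key="docs/report.pdf", Body=b"...")
s3.put_object(Bucket=BUCKET, Key="docs/2013/notes.txt", Body=b"...")

# Clients that show folders do it by listing with a prefix and delimiter:
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix="docs/", Delimiter="/")
print([o["Key"] for o in resp.get("Contents", [])])            # objects directly "in" docs/
print([p["Prefix"] for p in resp.get("CommonPrefixes", [])])   # simulated subfolders
```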
There is an example java project here for putting a webdav server over Amazon S3 - https://github.com/miltonio/milton-aws

apache restricting bandwidth

I've been looking around the web without much success.
I am running a local XAMPP (1.7.0) installation and a web app that I have developed that backs up my files and sends them to an FTP server. The problem is that Apache seems to be using only a limited amount of my bandwidth and I am unsure why this is happening.
It usually doesn't get above 64 KB/s, but I know that my current broadband will allow over 1 MB/s, which is a massive difference. Also, if I use my FTP program to log in to the server it will let me download in excess of 500 KB/s. Does anyone know how I can get around this, because my backups are very big files and take hours to copy at 64 KB/s?
Thanks Mic
You seem to be confusing download and upload speeds: "if I use my FTP program to log in to the server it will let me download in excess of 500 KB/s." Your downloads use the downstream side of the connection, while the backups your web app sends are uploads, limited by the usually much smaller upstream.
Are you perhaps on an asymmetric ADSL or cable connection (something like 1 Mb down / 64 KB up)?
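A quick unit check shows why the two figures in the question aren't directly comparable: FTP clients report kilobytes per second, while line speeds are usually quoted in kilobits.

```python
# Quick unit check: the question mixes kilobytes (KB/s) with kilobits (kb/s).
upload_seen = 64      # KB/s observed while the web app uploads the backups
download_seen = 500   # KB/s seen when downloading with the FTP client

print(upload_seen * 8)    # 512 kb/s  -- likely close to the line's upstream cap
print(download_seen * 8)  # 4000 kb/s -- runs over the much larger downstream
```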

Amazon EC2 Windows AMI with shared S3 storage

I've currently got a base Windows 2008 Server AMI that I created on Amazon EC2. I use it to create 20-30 EBS-based EC2 instances at a time for processing large amounts of data into PDFs for a client. However, once the data processing is complete, I have to manually connect to each machine and copy off the files. This takes a lot of time and effort, and so I'm trying to figure out the best way to use S3 as a centralised storage for the outputted PDF files.
I've seen a number of third party (commercial) utilities that can map S3 buckets to drives within Windows, but is there a better, more sensible way to achieve what I want? Having not used S3 before, only EC2, I'm not sure of what options are available, and I've not been able to find anything online addressing the issue of using S3 as centralised storage for multiple EC2 Windows instances.
Update: Thanks for the suggestions of command-line tools for using S3. I was hoping for something a little more integrated and less ad hoc. Seeing as EC2 is closely related to S3 (S3 used to be the default storage mechanism for AMIs, etc.), I thought there might be something neater/easier I could do. Perhaps even something around private cloud networks and EC2-backed S3 servers (an area I know nothing about). No other ideas?
I'd probably look for a command-line tool. A quick search on Google led me to a .NET tool:
http://s3.codeplex.com/
And a Java one:
http://www.beaconhill.com/opensource/s3cp.html
I'm sure there are others out there as well.
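If neither of those fits, the copy step could also be a short boto3 script run at the end of each instance's processing job. Everything below (the output folder, the bucket name, keying uploads by instance ID) is an assumed layout rather than something from your setup:

```python
# Sketch: push each instance's finished PDFs into a shared bucket, keyed by instance ID,
# so nothing has to be copied off the machines by hand. Paths and bucket are placeholders.
import glob
import os
from urllib.request import urlopen

import boto3

OUTPUT_DIR = r"C:\jobs\output"          # hypothetical PDF output folder on the instance
BUCKET = "example-pdf-results-bucket"   # placeholder bucket name

# Every EC2 instance can read its own ID from the instance metadata service
instance_id = urlopen(
    "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
).read().decode()

s3 = boto3.client("s3")
for path in glob.glob(os.path.join(OUTPUT_DIR, "*.pdf")):
    key = f"{instance_id}/{os.path.basename(path)}"
    s3.upload_file(path, BUCKET, key)
    print(f"uploaded {path} -> s3://{BUCKET}/{key}")
```

With an instance role (or access keys) allowing s3:PutObject on that bucket, all 20-30 instances end up writing into the same central location.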
You could use an EC2 instance with an EBS volume exported through Samba, which could act as centralized storage that the Windows instances map as a network drive.
This sounds very much like a Hadoop/Amazon MapReduce job to me. Unfortunately, Hadoop is best deployed on Linux:
Hadoop on Windows Server
I assume the software you use for PDF processing is Windows-only?
If this is not the case, I'd seriously consider porting your solution to Linux.