I have an internet-facing SFTP server that receives regular CSV file updates. Is there any command to have BigQuery retrieve data from this SFTP server and load it into tables? Alternatively, is there an API or Python library that supports this?
As for BigQuery - there's no integration I know of with SFTP.
You'll need to either:
Create or find a script that reads from SFTP and pushes to GCS (see the sketch after this list).
Add an HTTPS service to the SFTP server, so your data can be read by the GCS Storage Transfer Service (https://cloud.google.com/storage-transfer/docs/).
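A minimal sketch of the script approach (option 1), assuming the paramiko, google-cloud-storage and google-cloud-bigquery libraries are installed and GCP credentials are already configured; the host, login, bucket and table names below are placeholders:

    # Pull a CSV from SFTP, stage it in GCS, then load it into BigQuery.
    import paramiko
    from google.cloud import bigquery, storage

    SFTP_HOST = "sftp.example.com"             # placeholder
    REMOTE_PATH = "/exports/data.csv"          # placeholder
    LOCAL_PATH = "/tmp/data.csv"
    BUCKET = "my-staging-bucket"               # placeholder
    TABLE = "my_project.my_dataset.my_table"   # placeholder

    # 1. Fetch the CSV from the SFTP server.
    transport = paramiko.Transport((SFTP_HOST, 22))
    transport.connect(username="user", password="secret")  # or use a key
    sftp = paramiko.SFTPClient.from_transport(transport)
    sftp.get(REMOTE_PATH, LOCAL_PATH)
    sftp.close()
    transport.close()

    # 2. Stage the file in Cloud Storage.
    storage.Client().bucket(BUCKET).blob("incoming/data.csv").upload_from_filename(LOCAL_PATH)

    # 3. Load it from GCS into BigQuery.
    bq = bigquery.Client()
    job = bq.load_table_from_uri(
        f"gs://{BUCKET}/incoming/data.csv",
        TABLE,
        job_config=bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.CSV,
            skip_leading_rows=1,
            autodetect=True,
        ),
    )
    job.result()  # wait for the load job to finish

You could then run this on whatever schedule the CSV updates arrive, e.g. with cron or Cloud Scheduler.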
Yet another third-party tool supporting (S)FTP in and out of GCP is Magnus - Workflow Automator, which is part of the Potens.io suite. It supports BigQuery, Cloud Storage, and most Google APIs, as well as many simple utility-type tasks (BigQuery Task, Export to Storage Task, Loop Task, and many more), along with advanced scheduling, triggering, etc. It is also available on the Marketplace.
The FTP-to-GCS Task accepts a source FTP URI and can transfer single or multiple files, based on its input, to a destination location in Google Cloud Storage. The list of files uploaded to Google Cloud Storage is saved to a parameter for later use within the workflow. The source can be SFTP, FTP, or FTPS.
See here for more
Disclosure: I am a GDE for Google Cloud, the creator of those tools, and a lead on the Potens team.
Related
Let's say there is a file at this link, example.com/file.bin, and I want to transfer it to a cloud storage service (Mega or Google Drive, for example), or I have a download/upload host and want to transfer files from a cloud storage service to that host. How can I do that (besides the good old download-and-reupload method and transfer sites like Multcloud)?
At first I thought I could use Python + the Selenium framework to handle the cloud storage side, but that only works if I have the file on my own system. Can I deploy the code on a host and then use it to transfer the files? (Some cloud storage services don't have an API for downloading, so I think it's necessary to use Selenium.)
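To illustrate the server-side idea in the question: a host you control can stream the file straight from the URL into cloud storage, so it never touches your own machine. A minimal sketch using requests and Google Cloud Storage as the example target (Mega or Drive would need their own clients); the URL and bucket name are placeholders:

    # Stream a remote file directly into a cloud storage bucket from the host.
    import requests
    from google.cloud import storage

    url = "https://example.com/file.bin"            # placeholder source URL
    bucket = storage.Client().bucket("my-bucket")   # placeholder bucket

    with requests.get(url, stream=True) as resp:
        resp.raise_for_status()
        resp.raw.decode_content = True
        # Upload the response body as it is downloaded, without a local copy.
        bucket.blob("file.bin").upload_from_file(resp.raw)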
Can someone please help me find the protocol used by Azure Storage Explorer to connect to Azure Storage?
Is it SMB or REST?
Azure Storage Explorer (ASE) is a wrapper around the azcopy command-line tool.
As an example, you can copy one of the azcopy commands generated by ASE into Notepad to inspect it.
azcopy internally uses the REST API.
To capture all the outgoing REST API calls, you can also use the Fiddler tool.
Follow the instructions at the link below and you should be able to see them:
https://learn.microsoft.com/en-us/power-query/web-connection-fiddler
So the chain is: ASE uses azcopy, which uses the REST API.
Alternatively, you can find the azcopy logs for each individual session under "%USERPROFILE%\.azcopy".
It is REST.
Storage Explorer uses the Azure Storage SDKs for JavaScript, which are wrappers over the REST API.
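Storage Explorer itself uses the JavaScript SDKs, but the same layering is visible from any of the Storage SDKs. A small Python sketch, assuming the azure-storage-blob package and a placeholder connection string, that turns on HTTP logging so the underlying REST calls show up:

    # Observe the REST traffic emitted by the Azure Storage SDK.
    import logging
    from azure.storage.blob import BlobServiceClient

    # Surface the SDK's HTTP-level log output.
    logging.basicConfig(level=logging.DEBUG)

    service = BlobServiceClient.from_connection_string(
        "<your-connection-string>",   # placeholder
        logging_enable=True,          # log the underlying REST requests/responses
    )

    # A simple SDK call; the logs show it as a request to the Blob REST endpoint.
    for container in service.list_containers():
        print(container.name)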
I recently started using Google Colab and found that I need to re-authenticate and re-copy my data, which is stored on Google Cloud Storage, every time I start a new session.
Given that I'm using all Google services, is there a way to ensure that my settings are maintained in the environment?
There's no way to persist local files on the VM across sessions. You'll want to store persistent data externally in something like Drive, GCS, or your local filesystem.
Some recipes are available in the I/O example notebook.
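For example, a minimal start-of-session cell along those lines (it has to be re-run in each new session), assuming the google-cloud-storage client is available and using placeholder project, bucket and file names:

    # Re-authenticate, mount Drive for persistent files, and re-copy data from GCS.
    from google.colab import auth, drive
    from google.cloud import storage

    # Re-authenticate this VM with your Google account.
    auth.authenticate_user()

    # Mount Drive if you want files to survive across sessions.
    drive.mount('/content/drive')

    # Re-copy data from GCS into the ephemeral VM filesystem.
    client = storage.Client(project='my-project-id')   # placeholder project
    bucket = client.bucket('my-bucket')                # placeholder bucket
    bucket.blob('data/train.csv').download_to_filename('/content/train.csv')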
I am using a shared host with no root access. I want to transfer files between the host and any online storage. The goal is to take online backups and restore files from storage to the host directly, without downloading and re-uploading them. Please suggest a way to do this.
Mover.io is an online service that helps you easily transfer files and folders from one cloud storage service to another. The service works on a freemium model – you can transfer up to 10 GB of data for free and then pay $1 per extra GB of transfer.
Mover has connectors for all popular cloud storage providers. You can copy files from your Google Drive to Dropbox, from SkyDrive to Box, or even from your old Google account to a new one. They also support FTP, allowing you to transfer files directly from Google Drive or Dropbox to your FTP server, over the cloud.
I'm trying to host a database on Amazon RDS, and the actual content the database will store info on (videos) will be hosted on Amazon S3. I have some questions about this process I was hoping someone can help me with.
Can a database hosted on Amazon RDS interact with (search, update) something on Amazon S3? So if I have a database on Amazon RDS and run a delete command to remove a specific video, is it possible to have that command also remove the video on S3? Also, is there a tutorial on how to make the two interact?
Thanks very much!
You will need an intermediary scripting language to maintain this process. For instance, if you're building a web-based application that stores videos on S3 and their metadata, including their locations, on RDS, you could write a PHP application (hosted on an EC2 instance, or elsewhere outside of Amazon's cloud) that connects to the MySQL database on RDS, runs the appropriate queries, and then interacts with Amazon S3 to complete the corresponding task there (e.g. delete a video, as you described).
To do this you would use the Amazon AWS SDK, for PHP the link is: http://aws.amazon.com/php/
You can use the Java, Ruby, Python, .NET/Windows, and mobile SDKs to do these various tasks on S3, as well as control other areas of AWS if you use them.
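The description above uses PHP, but as a rough sketch of the same flow with the AWS SDK for Python (boto3) plus a MySQL client; the host, credentials, table, bucket and key names are placeholders, and error handling is omitted:

    # Delete a video's database row on RDS, then delete the matching object on S3.
    import boto3
    import pymysql

    VIDEO_ID = 42                    # placeholder id
    BUCKET = "my-video-bucket"       # placeholder bucket

    # 1. Look up and remove the video's row in the RDS MySQL database.
    conn = pymysql.connect(host="mydb.example.rds.amazonaws.com",   # placeholder
                           user="app", password="secret", database="videos")
    with conn.cursor() as cur:
        cur.execute("SELECT s3_key FROM videos WHERE id = %s", (VIDEO_ID,))
        (s3_key,) = cur.fetchone()
        cur.execute("DELETE FROM videos WHERE id = %s", (VIDEO_ID,))
    conn.commit()
    conn.close()

    # 2. Delete the corresponding object from S3.
    boto3.client("s3").delete_object(Bucket=BUCKET, Key=s3_key)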
Alternatively, you can find third-party scripts that do what you want and build an application around them; for example, someone may have written a simpler S3 interaction class you could use instead of writing your own.
For a couple of command-line applications I've built, I have used this handy and free tool: http://s3tools.org/s3cmd which is basically a command-line tool for interacting with S3. Very useful for bash scripts.
Tyler