Not able to download multiple DynamoDB tables using dynamodump
$ python dynamodump.py -m backup -r us-east-1 -s 'DEV_*'
INFO:root:Found 0 table(s) in DynamoDB host to backup:
INFO:root:Backup of table(s) DEV_* completed!
But I am able to download if I give a single table name, or "*" (which downloads all DynamoDB tables).
I have followed the procedure in the link below:
https://github.com/bchew/dynamodump
Can anyone suggest how to download multiple DynamoDB tables matching a specific pattern (like QA_* / DEV_* / PROD_* / TEST_*)?
for i in $(aws dynamodb list-tables | jq -r '.TableNames[]' | grep '^QA_');
do
echo "======= Starting backup of $i at $(date) =========="
python dynamodump.py -m backup -r us-east-1 -s "$i"
done
The above script will work if you want to back up multiple DynamoDB tables. Before running the script you have to install jq: (https://stedolan.github.io/jq/download/)
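If you would rather not install jq, a hedged alternative is to let the AWS CLI do the filtering itself with a JMESPath --query expression (the 'DEV_' prefix below is just an example):
for t in $(aws dynamodb list-tables --query "TableNames[?starts_with(@, 'DEV_')]" --output text);
do
echo "======= Starting backup of $t at $(date) =========="
python dynamodump.py -m backup -r us-east-1 -s "$t"
done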
I have some parquet files stored in HDFS that I want to convert to CSV files first and then export to a remote file using ssh.
I don't know if it's possible or simple to do by writing a Spark job (I know we can convert Parquet to CSV just by using spark.read.parquet and then writing the same DataFrame out as CSV with spark.write), but I would really like to do it with an impala-shell request.
So, I thought about something like this:
hdfs dfs -cat my-file.parquet | ssh myserver.com 'cat > /path/to/my-file.csv'
Can you please help me with this request?
Thank you!
Example without Kerberos:
impala-shell -i servername:portname -B -q 'select * from table' -o filename '--output_delimiter=\001'
A full explanation, including how to add a header, is in this link: http://beginnershadoop.com/2019/10/02/impala-export-to-csv/
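On a kerberized cluster the same export should work by adding the -k flag; a hedged sketch (server, query, and file name are placeholders just like above):
impala-shell -k -i servername:portname -B -q 'select * from table' -o filename '--output_delimiter=\001'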
You can do that in multiple ways.
One approach could be as in the example below.
With impala-shell you can run a query and pipe to ssh to write the output in a remote machine.
$ impala-shell --quiet --delimited --print_header --output_delimiter=',' -q 'USE fun; SELECT * FROM games' | ssh remoteuser@ip.address.of.remote.machine "cat > /home/..../query.csv"
This command changes from the default database to the fun database and runs a query on it.
You can change --output_delimiter (e.g. '\t'), include or omit --print_header, and adjust other options.
I want to sync my server data to Google Cloud Storage automatically using a shell script. I don't know how to write the script. Every time I need to run:
gsutil -m rsync -d -r [Source] gs://[Bucket-name]
If anyone knows the answer please help me!
To automate the sync process, use a cron job:
Create a script to run with cron $ nano backup.sh
Paste your gsutil command in the script $ gsutil -m rsync -d -r [Source_PATH] gs://bucket-name
Make the script executable $ chmod +x backup.sh
Based on your use case, put the shell script (backup.sh) in one of the below folders: a) /etc/cron.daily b) /etc/cron.hourly c) /etc/cron.monthly d) /etc/cron.weekly
If you want to run this script at a specific time, go to the terminal and type: $ crontab -e
Then schedule the script with cron as often as you want, for example at midnight: 00 00 * * * /path/to/your/backup.sh
In case you are using Windows on your local server, the commands will be the same as above, but make sure to use Windows paths instead.
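Putting the pieces together, a minimal sketch of backup.sh plus the crontab entry (the source path, bucket name, and log path are placeholders):
#!/bin/bash
# backup.sh - sync a local directory to a Cloud Storage bucket
gsutil -m rsync -d -r /path/to/source gs://your-bucket-name

# crontab entry (added via crontab -e) to run it every night at midnight:
# 00 00 * * * /path/to/backup.sh >> /var/log/gcs-backup.log 2>&1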
Instead of moving, I want to copy all my keys from one particular db to another.
Is it possible in Redis, and if yes, then how?
If you can't use MIGRATE COPY because of your Redis version (2.6), you might want to copy each key separately. This takes longer, but doesn't require you to log in to the machines themselves, and allows you to move data from one database to another.
Here's how I copy all keys from one database to another (but without preserving TTLs):
#set connection data accordingly
source_host=localhost
source_port=6379
source_db=0
target_host=localhost
target_port=6379
target_db=1
#copy all keys without preserving ttl!
redis-cli -h $source_host -p $source_port -n $source_db keys \* | while read key; do
echo "Copying $key"
redis-cli --raw -h $source_host -p $source_port -n $source_db DUMP "$key" \
| head -c -1 \
| redis-cli -x -h $target_host -p $target_port -n $target_db RESTORE "$key" 0
done
Keys are not going to be overwritten; if you need them to be, delete those keys before copying or simply flush the whole target database before starting.
This copies all keys from database number 0 to database number 1 on localhost.
redis-cli --scan | xargs redis-cli migrate localhost 6379 '' 1 0 copy keys
If you use the same server/port you will get a timeout error but the keys seem to copy successfully anyway. GitHub Redis issue #1903
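If you only want to copy keys matching a pattern, redis-cli's --scan accepts a --pattern filter; a hedged variant of the same one-liner ('user:*' is just an example pattern):
redis-cli --scan --pattern 'user:*' | xargs redis-cli migrate localhost 6379 '' 1 0 copy keys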
redis-cli -a $source_password -p $source_port -h $source_ip keys \* | while read key;
do echo "Copying $key";
redis-cli --raw -a $source_password -h $source_ip -p $source_port -n $dbname DUMP "$key" | head -c -1 | redis-cli -x -a $destination_password -h $destination_IP -p $destination_port RESTORE "$key" 0;
done
Latest solution:
Use the RIOT open-source command line tool provided by Redislabs to copy the data.
Reference: https://developer.redis.com/riot/riot-redis/cookbook.html#_performing_migration
GitHub project link: https://github.com/redis-developer/riot
How to install: https://developer.redis.com/riot/riot-redis/
# Source Redis db
SH=test1-redis.com
SP=6379
# Target Redis db
TH=test1-redis.com
TP=6379
# Copy from db0 to db1 (standalone Redis db, Or cluster mode disabled)
#
riot-redis -h $SH -p $SP --db 0 replicate -h $TH -p $TP --db 1 --batch 10000 \
--scan-count 10000 \
--threads 4 \
--reader-threads 4 \
--reader-batch 500 \
--reader-queue 2000 \
--reader-pool 4
RIOT is quicker, supports multithreading, and works well for cross-environment Redis data copies (AWS ElastiCache, Redis OSS, and Redislabs).
Not directly. I would suggest using the always convenient redis-rdb-tools package (from Sripathi Krishnan) to extract the data from a normal rdb dump and reinject it into another instance.
See https://github.com/sripathikrishnan/redis-rdb-tools
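For example, the package's rdb tool can emit the Redis protocol for a single database from an RDB file, which you can pipe straight into the target instance. A hedged sketch, assuming the tool's protocol output mode and --db filter; the dump path, db numbers, host and port are placeholders:
# select db 0 from the dump and replay it into db 1 of the target instance
rdb --command protocol --db 0 /var/lib/redis/dump.rdb | redis-cli -h target-host -p 6379 -n 1 --pipe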
As far as I understand, you need to copy keys from a particular DB (e.g. 5) to another DB, say 10. If that is the case, you can use redis database dumper (https://github.com/r043v/rdd). Although per the documentation it has a switch (-d) to select a database for the operation, it didn't work for me, so here is what I did:
1.) Edit the rdd.c file and look for the int main(int argc, char **argv) function
2.) Change the DB to as per your requirement
3.) Compile the source with make
4.) Dump all keys using ./rdd -o "save.rdd"
5.) Edit the rdd.c file again and change the DB
6.) Make again
7.) Import by using ./rdd "save.rdd" -o insert -s "IP" -p "Port"
I know this is old, but for those of you coming here from Google:
I just published a command line interface utility to npm and github that allows you to copy keys that match a given pattern (even *) from one Redis database to another.
You can find the utility here:
https://www.npmjs.com/package/redis-utils-cli
Try using DUMP to first dump all the keys and then RESTORE them.
If you are migrating keys inside the same Redis instance, you can use the MOVE command for that (note that MOVE moves rather than copies), pipelined for more speed:
#!/bin/bash
#set connection data accordingly
source_host=localhost
source_port=6379
source_db=4
target_db=0
total=$(redis-cli -h $source_host -p $source_port -n $source_db keys \* | sed 's/^/MOVE /g' | sed 's/$/ '$target_db'/g' | wc -c)
#move all keys to the target db (MOVE removes them from the source db)
time redis-cli -h $source_host -p $source_port -n $source_db keys \* | \
sed 's/^/MOVE /g' | sed 's/$/ '$target_db'/g' | \
pv -s $total | \
redis-cli -h $source_host -p $source_port -n $source_db >/dev/null
I've backed up all my MySQL databases with the following command:
mysqldump -u root -ppasswod --all-databases | gzip > all.sql.gz
I just wanted to know: will I be able to restore all of the databases with the following command?
gunzip < alldb.sql.gz | mysql -u root -ppassword -h localhost
Can you also tell me how to back up all of the MySQL users too?
I can't test it because I'm not sure, and I don't want to break any DB on my current system.
Yes. Generally, to restore compressed backup files you can do the following:
gunzip < alldb.sql.gz | mysql -u [uname] -p[pass] [dbname]
Please consult How to Back Up and Restore a MySQL Database
Note that the --all-databases option applies to the backup only. The backup file itself will contain all the relevant CREATE DATABASE quux; statements needed for the restore.
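Since the dump already carries the database names, an --all-databases backup can be restored without passing a [dbname] at all; a hedged variant using the file name from the question:
gunzip < all.sql.gz | mysql -u root -p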
This is the command I use to backup all databases in MySQL:
mysqldump -u USERNAME -p --all-databases --events --ignore-table=mysql.event --extended-insert --add-drop-database --disable-keys --flush-privileges --quick --routines --triggers | gzip > "all_databases.gz"
The '--all-databases' option tells the command to include all of the databases. If you want to specify one or more, remove that option and replace it with '--databases dbname1 dbname2 dbnameX'.
To back up all of your MySQL users, passwords, and permissions, include the 'mysql' database in your backup. The --all-databases option already includes this database (a sketch for also exporting the grants as SQL follows below).
The '--routines' option includes stored procedures and functions in the backup.
The '--triggers' option includes any triggers in the backup.
To restore from a *.gz mysqldump file:
gunzip < all_databases.gz | mysql -u USERNAME -p
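As a hedged extra: if you also want the privileges as plain GRANT statements, a commonly used sketch is to generate SHOW GRANTS statements for every account and capture their output (USERNAME and the output file names are placeholders):
# generate a "SHOW GRANTS FOR 'user'@'host';" line for every account
mysql -u USERNAME -p -N -B -e "SELECT CONCAT('SHOW GRANTS FOR ''', user, '''@''', host, ''';') FROM mysql.user" > show_grants.sql
# run those statements and append the missing semicolons to get loadable GRANT statements
mysql -u USERNAME -p -N -B < show_grants.sql | sed 's/$/;/' > all_grants.sql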
To display a progress bar while importing a sql.gz file, download pv and use the following:
pv mydump.sql.gz | gunzip | mysql -u root -p
If the pv command is not installed on your system, install it with one of the commands below.
In CentOS/RHEL
yum install pv
In Debian/Ubuntu
apt-get install pv
On macOS
brew install pv
The output looks something like this:
pv mydump.sql.gz | gunzip | mysql -u root -p dbname
Enter password:
255MiB 0:05:49 [ 748kiB/s] [===========> ] 30%
I have a requirement where I need to export the report data directly to CSV, since getting the array/query response, then building the CSV, and then uploading the final CSV to Amazon takes time. Is there a way I can directly create the CSV with Redshift PostgreSQL?
PgSQL - Export select query data directly to Amazon S3 with headers
Here is my version of PgSQL - version 8.0.2 on Amazon Redshift.
Thanks
You can use the UNLOAD statement to save results to an S3 bucket. Keep in mind that this will create multiple files (at least one per compute node).
You will have to download all the files, combine them locally, sort (if needed), then add column headers and upload the result back to S3.
Doing this from an EC2 instance shouldn't take a lot of time - the connection between EC2 and S3 is quite good.
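For reference, a hedged sketch of what such an UNLOAD might look like when issued through psql; the cluster host, database, bucket path, and IAM role are all placeholders, and PARALLEL OFF simply keeps the output in a single file for small result sets:
psql -h your-cluster.example.redshift.amazonaws.com -p 5439 -U your_user your_db -c "
UNLOAD ('SELECT col1, col2 FROM my_table')
TO 's3://my-bucket/exports/files_prefix'
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRedshiftRole'
DELIMITER AS ','
PARALLEL OFF
ALLOWOVERWRITE;"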
In my experience, the quickest method is to use shell commands:
# run query on the redshift
export PGPASSWORD='__your__redshift__pass__'
psql \
-h __your__redshift__host__ \
-p __your__redshift__port__ \
-U __your__redshift__user__ \
__your__redshift__database__name__ \
-c "UNLOAD __rest__of__query__"
# download all the results
s3cmd get s3://path_to_files_on_s3/bucket/files_prefix*
# merge all the files into one
cat files_prefix* > files_prefix_merged
# sort merged file by a given column (if needed)
sort -n -k2 files_prefix_merged > files_prefix_sorted
# add column names to destination file
echo -e "column 1 name\tcolumn 2 name\tcolumn 3 name" > files_prefix_finished
# add merged and sorted file into destination file
cat files_prefix_sorted >> files_prefix_finished
# upload destination file to s3
s3cmd put files_prefix_finished s3://path_to_files_on_s3/bucket/...
# cleanup
s3cmd del s3://path_to_files_on_s3/bucket/files_prefix*
rm files_prefix* files_prefix_merged files_prefix_sorted files_prefix_finished