trying to download a dataset from a website - apache

I am trying to download a dataset from a website, but I can't download the whole folder; I have to download each file separately, which will take a lot of time. I am wondering if there is any way to download the whole folder at once?
The website link: http://www.physionet.org/pn4/eegmmidb/

Use wget with the -r switch to turn on mirroring.
This command will do what you want:
wget --no-parent -r http://www.physionet.org/pn4/eegmmidb/
It'll produce a mirror copy of everything from that directory on down.
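If you only care about the .edf and .edf.event files and want a flatter local layout, wget's accept list and directory options can help; this is just a sketch using standard wget flags, so adjust the --cut-dirs count to taste:
wget -r --no-parent -nH --cut-dirs=1 -A '.edf,.edf.event' http://www.physionet.org/pn4/eegmmidb/
Here -A keeps only files matching those suffixes, -nH drops the hostname directory, and --cut-dirs=1 strips the leading pn4 path component.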

These two for loops, run in bash, should do it:
for S in S{001..109}; do
    mkdir ${S}
    cd ${S}
    for R in R{01..14}; do
        file="http://www.physionet.org/pn4/eegmmidb/${S}/${S}${R}.edf"
        wget "$file"
        wget "${file}.event"
    done
    cd ..
done
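A variation on the same idea (an untested sketch) is to generate the full URL list first and hand it to a single wget call; -x recreates the directory structure locally, -nH and --cut-dirs trim the hostname and the pn4 prefix, and -i reads the URLs from the file:
for S in S{001..109}; do
    for R in R{01..14}; do
        echo "http://www.physionet.org/pn4/eegmmidb/${S}/${S}${R}.edf"
        echo "http://www.physionet.org/pn4/eegmmidb/${S}/${S}${R}.edf.event"
    done
done > urls.txt
wget -x -nH --cut-dirs=1 -i urls.txt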

Related

How to automate commands on Cygwin

Hi, I am looking to automate my file transferring to my jailbroken iPhone over USB with a bash file, which will launch the relay and then do the file transfers.
With this here I installed and successfully transferred files to my iPhone with Cygwin, but now I want to automate the file transfer.
First I need to start the relay with Cygwin, and these commands are required:
cd pyusbmux/python-client/
chmod +x *
./tclrelay.py -t 22:2222
So I created a .sh file that does it, but when I launch it, Cygwin gives me these errors.
This is what should happen on the left and the result of the script on the right.
How can I make Cygwin open with those commands?
In addition, make sure that tcpON.sh has proper line termination by using d2u from the dos2unix package:
d2u tcpON.sh
You should add a proper shebang on the first line of your script:
https://linuxize.com/post/bash-shebang/
#!/bin/bash
cd /cygdrive/e/Grez/Desktop
cd pyusbmux/python-client/
chmod +x *
./tclrelay.py -t 22:2222
You can use Cygwin.bat as a base and make a tcpON.bat batch file like:
C:
chdir c:\cygwin64\bin
bash --login /cygdrive/e/Grez/Desktop/tcpON.sh
Verify the proper cd command to be sure that you are always in the expected directory.
It is not the only way, but it is probably the most flexible (IMHO).
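If you want to sanity-check the script itself before wrapping it in a batch file, something along these lines (paths taken from the answer above) should work from an open Cygwin shell:
file /cygdrive/e/Grez/Desktop/tcpON.sh     # should not report "CRLF line terminators"
d2u /cygdrive/e/Grez/Desktop/tcpON.sh      # convert the line endings if it does
chmod +x /cygdrive/e/Grez/Desktop/tcpON.sh
bash /cygdrive/e/Grez/Desktop/tcpON.sh     # run it once by hand before automating it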

scp command - transfer folder over ssh

I have an Arduino Yun and want to set up the server for the Yun.
So what I want is to copy a folder that contains a .py file and an index.html to my Yun.
I used the Mac terminal to do this operation;
the command looks like this:
scp -r /Users/gudi/Desktop/LobsterHeartRate root@192.168.240.1:/mnt/sda1
and then the terminal asked for the password.
After I typed it, it shows:
scp: /mnt/sda1/LobsterHeartRate: Not a directory
I didn't type /mnt/sda1/LobsterHeartRate, so why does it show this error?
Your code
scp -r /Users/gudi/Desktop/LobsterHeartRate root@192.168.240.1:/mnt/sda1
requires that the remote directory /mnt/sda1 exists. This looks like it is not true in your case. Check it using ssh root@192.168.240.1 ls /mnt/sda1.
scp is a simple tool and it does not allow you to rename directories on the fly, and the target directory must exist. You might try
scp -r /Users/gudi/Desktop/LobsterHeartRate root@192.168.240.1:/mnt/
ssh root@192.168.240.1 mv /mnt/LobsterHeartRate /mnt/sda1
or something like that, if it suits your needs. But for copying more files, rsync is usually more suitable. Check its manual page and give it a try next time.
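For completeness, a rough sketch of the rsync equivalent (assuming rsync is available on both ends; /mnt/sda1 still has to be a directory):
rsync -av /Users/gudi/Desktop/LobsterHeartRate root@192.168.240.1:/mnt/sda1/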
As Jens Höpken notes, your post is a bit sparse. But trying to read between the lines of your post, I suspect that LobsterHeartRate is a DIRECTORY on your local system but a FILE named LobsterHeartRate on your target system. This might be happening right at the top of the directory tree, or perhaps you have directories/files of the same name further down the tree. scp -rv might help resolve any confusion here.
Beware: scp -r resolves symbolic links. If you want to preserve symlinks you need to do something else. For historic reasons I use the following, though cpio with a find front-end opens up interesting possibilities for fine-grained file selections.
( cd /Users/gudi/Desktop && tar -cf - LobsterHeartRate ) |
ssh root@192.168.240.1 'cd /mnt/sda1 && tar -xf -'
For a safe "dry run" you could change the -xf to a -tf. The && chains are required to prevent bad things from happening if any prior command fails.
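The find + cpio combination mentioned above could look roughly like this (an untested sketch; cpio -o writes an archive from the file list on stdin and cpio -id extracts it, creating directories as needed):
( cd /Users/gudi/Desktop && find LobsterHeartRate -type f | cpio -o ) |
ssh root@192.168.240.1 'cd /mnt/sda1 && cpio -id'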
Disclaimer: any debugging is left as an exercise for the student.

output file after docker image is created through Dockerfile

I'm creating a Dockerfile to build my Docker image. I was wondering what the best way is, or if it's even possible, to create a log file of some sort that shows the results of the build, so I can see whether there were any errors in the process. For example, right now I have this:
monoVersion="3.8.0"
mkdir ~/mono
curl http://download.mono-project.com/sources/mono/mono-$monoVersion.tar.bz2 | tar xj --strip-components 1 -C ~/mono
cd ~/mono
git apply /src/mono-fix-20131106.patch
./configure --prefix=/usr/local
make -j 2
make install
in an install.sh script. In my Dockerfile I have:
FROM centos
MAINTAINER crystaltwix
ADD . /src
RUN cd /src ; ./install.sh
I'd like a way to look at the output after the image is created, so that every time I grab a different version of Mono, or do something similar when creating a new image, I can check whether any errors were generated. Is this possible? Or is that "connection" to the image being built closed once the Dockerfile is completed? Thank you.
One approach that I have been seeing for years (well before Docker, though Docker makes it easier) is to have the script save its output to a build log kept with the resulting image, so that when you use the image and find a bug, you can be sure you know how the image was created.
RUN cd /src ; ./install.sh | tee buildlog
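If you also want stderr in the log, or a copy of the output on the host side, here are two hedged variants (the log names and the image tag are only illustrative):
RUN cd /src && ./install.sh 2>&1 | tee /src/buildlog
docker build -t crystaltwix/mono . 2>&1 | tee build.log
The first keeps a complete log inside the image; the second simply captures what docker build already streams for every RUN step.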

Remote rsync in parallel

I'm trying to run rsync over ssh in parallel to transfer files between two machines for evaluation purposes. I want to see how much faster I can get compared to a single rsync process.
I tried these two solutions:
https://wiki.ncsa.illinois.edu/display/~wglick/Parallel+Rsync but with no great success.
https://gist.github.com/rcoup/5358786 (I couldn't make it work)
Based on the first link, I ran a command like this:
ssh HOST "mkdir -p ~/destdir/basefolder"
cd ./basefolder; ls | xargs -n1 -P 4 -I% rsync -arvuz -e ssh % HOST:~/destdir/basefolder/.
and I get the files transferred, but it doesn't seem to work well... In this case, it will run a process for every file and folder in basefolder, but when it finds a folder, it will transfer everything inside that folder using only one process.
I tried to use find -type f, but I got problems because I lose the file hierarchy.
Does anyone know of a method to do what I want? (Use rsync in parallel over ssh while keeping the file and folder hierarchy.)
Since you tagged your question 'gnu-parallel', the obvious answer is to refer you to http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync
cd src-dir; find . -type f -size +100000 | parallel -v ssh fooserver mkdir -p /dest-dir/{//}\;rsync -Havessh {} fooserver:/dest-dir/{}
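Adapted to the layout from the question (HOST and destdir are placeholders, and the paths are kept relative to the remote home directory to avoid quoting ~), that recipe becomes roughly:
cd ./basefolder
find . -type f | parallel -j4 -v ssh HOST mkdir -p destdir/basefolder/{//}\;rsync -arvuz -e ssh {} HOST:destdir/basefolder/{}
{//} is GNU parallel's replacement string for the directory part of each path, so the remote directory is created before the per-file rsync runs; -j4 caps it at four transfers at a time.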

Chaining terminal script on mac os x

I am trying to chain some terminal commands together so that I can wget a file, unzip it, and then directly sync it to Amazon S3. Here is what I have so far; I have the s3cmd tool installed properly and working. This works for me:
mkdir extract; wget http://wordpress.org/latest.tar.gz; mv latest.tar.gz extract/; cd extract; tar -xvf latest.tar.gz; cd ..; s3cmd -P sync extract s3://suys.media/
How do I then go about creating a simple script where I can just use variables?
You will probably want to look at bash scripting.
This guide can help you a lot: http://bash.cyberciti.biz/guide/Main_Page
For your question:
Create a file called mysync:
#!/bin/bash
mkdir extract && cd extract
wget "$1"
# note: don't call the variable PATH, that name is reserved for the shell's command search path
dir=$(pwd)
for f in "$dir"/*
do
    tar -xvf "$f"
    s3cmd -P sync "$dir" "$2"
done
$1 and $2 are the parameters that you pass to your script. You can look here for more information about how to use command-line parameters: http://bash.cyberciti.biz/guide/How_to_use_positional_parameters
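As a small optional addition (not part of the original answer), you could guard against missing parameters at the top of mysync; just a sketch:
if [ $# -ne 2 ]; then
    echo "usage: $0 <url> <s3_address>" >&2
    exit 1
fi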
PS: #!/bin/bash is a necessity; you need to tell your script where bash is stored. It's /bin/bash on most Unix systems, but I'm not sure if it is the same on Mac OS X. You can find out by calling the which command in the terminal:
→ which bash
/bin/bash
You need to give your script executable privileges to run it:
chmod +x mysync
Then you can call it from the command line:
mysync url_to_download s3_address
PS2: I haven't tested the code above, but the idea is this. Hope this helps.