How to use MediaInfo with Amazon S3? - amazon-s3

According to the MediaInfo ChangeLog, Amazon S3 support was added in v0.7.76 and even patched in v0.7.77 (latest).
However, I can't find any documentation on how to use it. It's in neither the CLI help menu nor the SourceForge project pages. I was hoping someone here might have some insight, as the SourceForge forum is closed off.
How do I craft a MediaInfo command to use Amazon S3 with Access Key & Secret Key? I'm using the CLI.
The closest thing I could find was someone's example Java code:
http://fossies.org/linux/MediaInfo_CLI/MediaInfoLib/Source/Example/HowToUse_Dll.JNA.java
It looks like they're crafting a custom HTTP request to S3 and streaming the response to MediaInfo. I'm not sure. I don't know Java; I only know Bash, Ruby, PHP.
Has anyone successfully got MediaInfo working with S3; something like this?
mediainfo https://AWSAccessKeyId:AWSSecretAccessKey@s3.amazonaws.com/bucketname/filename

The MediaInfo executable can be built with libcurl support on a Linux distribution using the commands below (I used CentOS):
yum groupinstall 'Development Tools'
yum install libcurl-devel
yum install wget
wget http://mediaarea.net/download/binary/mediainfo/17.12/MediaInfo_CLI_17.12_GNU_FromSource.tar.xz
tar xvf MediaInfo_CLI_17.12_GNU_FromSource.tar.xz
cd MediaInfo_CLI_GNU_FromSource/
./CLI_Compile.sh --with-libcurl
cd MediaInfo/Project/GNU/CLI
./mediainfo --version
Then the following command will provide media information for an Amazon S3 URL:
mediainfo --Output=XML https://AWSAccessKeyId:AWSSecretAccessKey@s3.amazonaws.com/bucketname/filename
The above command won't work when the S3 object key (the filename) contains special characters. A pre-signed URL avoids this, because it allows special characters in the key:
aws s3 presign 's3://bucketname/testing/mini & bar™©.mp4'
mediainfo 'presigned url'
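The two pre-signed URL steps above can be combined into one helper. This is only a rough sketch: the function names s3_uri and mediainfo_s3 are made up for illustration, the 300-second expiry is an arbitrary choice, and it assumes the AWS CLI is already configured with credentials and that mediainfo was built with libcurl as described above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of MediaInfo or the AWS CLI).

# Build the s3:// URI for a bucket and an object key.
s3_uri() {
  printf 's3://%s/%s' "$1" "$2"
}

# Pre-sign the object for 5 minutes and hand the resulting URL to mediainfo.
# Assumes `aws` is configured and `mediainfo` has libcurl support.
mediainfo_s3() {
  local url
  url=$(aws s3 presign "$(s3_uri "$1" "$2")" --expires-in 300) || return 1
  mediainfo --Output=XML "$url"
}
```

Usage would look like: mediainfo_s3 bucketname 'testing/mini & bar™©.mp4'. Because the URL is passed as a single quoted argument, the shell does not interpret the & or the spaces in the key.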

The Java example shows how to download the file with Java and feed the data to MediaInfo from memory. MediaInfo now has native S3 support, so just provide this URL.
The only issue is that you must have libcurl available and MediaInfo compiled with libcurl support. This is not yet available on all platforms (e.g. on Windows you must put libcurl.dll from the libcurl website in the same folder as mediainfo).
Better delivery of such support (with libcurl provided directly, and fully tested, on all platforms) is planned but there is no ETA.

I faced the same problem. Please try this, it will work:
https://{yourAwsAccessKey}:{yourAwsSecretKey}@{yourBucketName}.s3.amazonaws.com/{file_path_in_bucket}

Related

How to download the cuDNN straight from nvidia website to my linux instance on GCP

I want to install tensorflow-gpu on my Linux machine on Google Cloud Platform. I am not using a Deep Learning VM that GCP provides, so I installed Anaconda on my Linux instance and now I want to install TensorFlow. I have already installed the NVIDIA drivers and CUDA; they can be downloaded straight to the cloud instance. But for cuDNN I have to download it to my local machine and then upload it to the cloud instance. Is there a way to download that file directly from the NVIDIA site to my cloud instance? Thank you.
EDIT
CUDNN_URL="developer.download.nvidia.com/compute/redist/cudnn/v5.1/cudnn-8.0-linux-x64-v5.1.tgz"
wget -c ${CUDNN_URL}
Using these commands we can directly download cuDNN v5.1, and I have seen links for version 6.5 as well. I tried the same link with the version I want, but it did not work. Does anyone know a way to use this CUDNN_URL to download cuDNN v7.1 or higher directly with wget or curl, without logging into an NVIDIA account?
There was a change in the naming convention of cuDNN archives.
Since version 7.2.1, NVIDIA added the full version number into the archive name instead of the previously used short one.
That means that the resulting download link for 7.2.1 is:
https://developer.download.nvidia.com/compute/redist/cudnn/v7.2.1/cudnn-9.2-linux-x64-v7.2.1.38.tgz
instead of,
https://developer.download.nvidia.com/compute/redist/cudnn/v7.2.1/cudnn-9.2-linux-x64-v7.2.tgz
You can follow this pattern:
VERSION_FULL="8.1.0.77"
VERSION="${VERSION_FULL%.*}"
CUDA_VERSION="11.2"
OS_ARCH="linux-x64"
CUDNN_URL="https://developer.download.nvidia.com/compute/redist/cudnn/v${VERSION}/cudnn-${CUDA_VERSION}-${OS_ARCH}-v${VERSION_FULL}.tgz"
wget -c ${CUDNN_URL}
The resulting link would be:
https://developer.download.nvidia.com/compute/redist/cudnn/v8.1.0/cudnn-11.2-linux-x64-v8.1.0.77.tgz
Because you need a developer account to get cuDNN, there are no direct links to download the files.
As a workaround, you can download cuDNN and other software to your local machine and then follow the documentation "Transferring files to instances" to copy the files to your VM instance:
For example, if you use Windows, I'd recommend using WinSCP to copy files to your VM.
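On Linux or macOS, the same copy can be done from the command line with gcloud compute scp. A minimal sketch, assuming the Google Cloud SDK is installed and authenticated on your local machine; the instance name, zone and archive name below are placeholders, and remote_target is a hypothetical helper used only to show the INSTANCE:PATH form.

```shell
#!/usr/bin/env bash
# Placeholder values: replace with your own instance, zone and downloaded archive.
INSTANCE="my-instance"
ZONE="us-central1-a"
ARCHIVE="cudnn-11.2-linux-x64-v8.1.0.77.tgz"

# Compose the remote target in the INSTANCE:PATH form used by gcloud compute scp.
remote_target() {
  printf '%s:%s' "$1" "$2"
}

# Copy only when gcloud is installed and the archive exists locally.
if command -v gcloud >/dev/null 2>&1 && [ -f "$ARCHIVE" ]; then
  gcloud compute scp "$ARCHIVE" "$(remote_target "$INSTANCE" "~/")" --zone "$ZONE"
fi
```

The "~/" is quoted on purpose so that it is expanded on the VM rather than on the local machine.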
In addition, have a look at the article "Deep Learning environment setup on Ubuntu (16.04) | CUDA, cuDNN, OpenCV (3.x), TensorFlow, Keras".
If you are really concerned (I was) about the data used to download the CUDA and cuDNN files to your local machine and then upload them to the GCP instance, you can set up a GUI for your GCP instance in no time. Check https://www.youtube.com/watch?v=e3RnnmcNI_E or any VNC server tutorial. After that you can download any file directly using a web browser.

Glusterfs client installation on ECS optimized Amazon Linux image

I am trying to use the ECS-optimized image (Amazon Linux) and install the GlusterFS client on it.
I followed a few documents on the internet, but all of them run into repository issues. I am unable to find the correct repo.
After trying yum install, I get a "no package found" error.
Please provide some guidance on how to achieve this.

Tess4J - Native library (linux-x86-64/libtesseract.so) not found in resource path

I'm using Tess4J (JNA wrapper around tesseract), and trying to call tess.doOCR(myFile) to OCR text from a single-page PDF.
I have GhostScript installed (by using yum install ghostscript), gs -h works correctly.
My app server is using 64-bit JVM, and I have gsdll64.dll, and the 64-bit tesseract dll's liblept168.dll and libtesseract302.dll in the class path.
When tess.doOCR(myFile) is called, this is logged:
GPL Ghostscript 8.70 (2014-09-22)
Copyright (C) 2014 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
But then it just stops there. The program doesn't go any further.
UPDATE --
It looks like the real issue is from this error:
java.lang.UnsatisfiedLinkError: Unable to load library 'tesseract': Native library (linux-x86-64/libtesseract.so) not found in resource path
After looking around a lot, I don't see a convenient place to find this libtesseract.so file, and I'm not sure what it takes to get this onto my Linux app server. I read that maybe I need to download some C++ runtime, but I don't see a Linux download for that. Any advice would be much appreciated.
Or is this something to do with a symbolic link?
The fix was simple for me: just run sudo apt-get install tesseract-ocr from the command line. On Linux you don't need to worry about the DLL libraries or the JVM version. Installing Tesseract with apt-get will do the trick.
Those DLLs are for Windows. For Linux, you'll need to install Tesseract or build it from source.
That GS version, 8.70, is quite old. The latest Ghost4J library that Tess4J uses is not compatible with that.
Tess4J should include required libraries. However, you need to extract them first.
This should do the trick:
File tmpFolder = LoadLibs.extractTessResources("win32-x86-64"); // replace platform
System.setProperty("java.library.path", tmpFolder.getPath());
You should replace the argument of extractTessResources(..) with your platform. You can find possible options by looking into the Tess4J jar file.
This way you do not need to install Tesseract on your system.
Recently I wrote a blog post about Tess4J in which I used this technique. Maybe it can help if you need further information or a running example project.
sudo apt-get update
sudo apt-get install tesseract-ocr
Download the trained data (tessdata) with git:
https://github.com/tesseract-ocr/tessdata
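After installing the package, one way to check that the dynamic linker can actually see libtesseract is to look it up in the linker cache. This is only a sketch: lib_listed is a hypothetical helper, and it assumes a glibc-based Linux where ldconfig -p is available (on some systems ldconfig lives in /sbin and is not on a normal user's PATH).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ldconfig -p` style output on stdin and
# succeed if the given library name appears in it.
lib_listed() {
  grep -q -F "$1"
}

# Only run the check where ldconfig is available.
if command -v ldconfig >/dev/null 2>&1; then
  if ldconfig -p | lib_listed "libtesseract.so"; then
    echo "libtesseract.so is in the linker cache"
  else
    echo "libtesseract.so not found; install tesseract-ocr or build Tesseract from source"
  fi
fi
```

If the library is listed but the UnsatisfiedLinkError persists, the problem is more likely in how the application's library path is configured than in the installation itself.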

Encoding Videos In cloud Using Linux scripts or some softwares installed on linux

I want to encode the videos that users upload, similar to YouTube, which converts whatever video type you upload to FLV, MP4, or WebM, because I can only play WebM (HTML5 video) in my web app.
I have tried www.zencoder.net, but it is too costly for me because I have to encode a large number of videos.
Is there a way to do this? I have an Ubuntu 12.04 server. I think it could be done by scripting, or by passing the videos to encoding software installed on the server, but I don't know how to pass the videos to the encoder or which encoder I should install on Linux.
I am using PHP as the server-side language in my cloud storage website.
You can do this by encoding the videos with FFmpeg.
On the command line you could use:
ffmpeg -i file.mp4 file.mp4.avi
You can use ffmpeg-php with PHP to get some ffmpeg features. But it looks like you will have to use the command line with the php exec function.
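$fileToFlv = "/var/www/test/input.wmv";
$fileFlv = "/var/www/test/test.flv";
// Escape both paths so spaces or shell metacharacters in file names cannot break the command.
exec("/usr/bin/ffmpeg -i " . escapeshellarg($fileToFlv) . " -ar 22050 -ab 32 -f flv -s 320x256 " . escapeshellarg($fileFlv));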
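Since the question asks for WebM specifically, a batch conversion could be scripted along these lines and run from cron or from PHP. Treat it as a sketch under assumptions: UPLOAD_DIR is a made-up path, webm_name is a hypothetical helper, the 1M video bitrate is an arbitrary starting point, and your ffmpeg build must include the libvpx and libvorbis encoders.

```shell
#!/usr/bin/env bash
# Hypothetical upload directory; override with the UPLOAD_DIR environment variable.
UPLOAD_DIR="${UPLOAD_DIR:-/var/www/test/uploads}"

# Derive the output name: same path with the extension replaced by .webm.
webm_name() {
  printf '%s.webm' "${1%.*}"
}

# Convert each upload once; skip everything if ffmpeg or the directory is missing.
if command -v ffmpeg >/dev/null 2>&1 && [ -d "$UPLOAD_DIR" ]; then
  for src in "$UPLOAD_DIR"/*.mp4 "$UPLOAD_DIR"/*.wmv; do
    [ -f "$src" ] || continue      # glob matched nothing
    out=$(webm_name "$src")
    [ -f "$out" ] && continue      # already converted
    # -nostdin keeps ffmpeg from reading the terminal when run in the background.
    ffmpeg -nostdin -i "$src" -c:v libvpx -b:v 1M -c:a libvorbis "$out"
  done
fi
```

Encoding in a background script rather than inside the upload request keeps the PHP page from blocking while ffmpeg runs.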

What AWS CLI tools are installed by default on EC2 instances?

I'm using an Amazon Linux EC2 instance and am wondering what AWS CLI tools are installed on it by default.
Is it just the EC2 CLI API tools? How can one tell? Also, where is the preferred single location on an EC2 instance to install each of the various CLI tools (RDS, CloudWatch, etc.) if they aren't installed already?
If you could answer each of these questions, I'd be greatly appreciative.
For the Amazon Linux AMI 2012.03, here's the list of installed packages.
To answer your question, here's the list of AWS tools:
aws-amitools-ec2-1.4.0.7
aws-apitools-as-1.0.61.0
aws-apitools-cfn-1.0.9
aws-apitools-common-1.1.0
aws-apitools-ec2-1.5.5.0
aws-apitools-elb-1.0.17.0
aws-apitools-iam-1.5.0
aws-apitools-mon-1.0.12.1
aws-apitools-rds-1.8.002
aws-cfn-bootstrap-1.1
aws-scripts-ses-2012.05.15
According to Amazon Linux AMI Basics:
to allow the installation of multiple versions of the API and AMI
tools, we have placed symlinks to the desired versions of these tools
in /opt/aws, as described here:
/opt/aws/bin—Symlink farm to /bin directories in each of the installed
tools directories.
/opt/aws/{apitools|amitools}—Products are installed in directories of
the form [name]-version and symlink [name] attached to the most
recently installed version.
/opt/aws/{apitools|amitools}/[name]/environment.sh—Used by
/etc/profile.d/aws-apitools-common.sh to set product-specific
environment variables (EC2_HOME, etc.).
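To see what is present on your own instance, you can query the package database and look at the symlink farm described above. A minimal sketch, assuming an RPM-based image such as Amazon Linux; aws_packages is a hypothetical helper that just filters a package list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only package names starting with "aws-" from stdin, sorted.
aws_packages() {
  grep '^aws-' | sort
}

# List installed AWS tool packages on RPM-based systems.
if command -v rpm >/dev/null 2>&1; then
  rpm -qa | aws_packages
fi

# /opt/aws/bin is the symlink directory named in Amazon Linux AMI Basics.
if [ -d /opt/aws/bin ]; then
  ls -l /opt/aws/bin
fi
```

On an image that does not ship these tools, both steps simply print nothing.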
There are no fixed standards or set rules about what is installed on AMIs in general.
Different Linux distros and different AMI publishers each decide what they want to put in their image and where.
In fact, an AMI doesn't even need to give you command line access to your instance through ssh if they don't want to.
If you have a specific AMI series in mind (Amazon Linux, Ubuntu 12.04 LTS from Canonical, CentOS 5.5 from RightScale) then update your question to include this.
For the record: the "minimal" variant of Amazon Linux does not have the full suite of CLI tools. Doing a yum install ec2-tools didn't get me what I wanted, so I just created a new instance with the non-minimal AMI. I also found that this minimal Linux isn't any more space-efficient, at least as originally configured; the additional 6 GB that would go to the root partition is left unallocated.