jclouds / Amazon S3 / Rackspace Cloud Files / Windows Azure Storage: appending content to an existing blob

I have a scenario in which multiple lines of text need to be appended to an existing text file stored as a blob. Is it possible to do this using jclouds? (That would be ideal for me, as jclouds supports a lot of cloud providers.)
Even if this is not doable using jclouds, does the native API of Amazon S3, Rackspace Cloud Files, or Azure Storage support appending content to existing blobs?
If it is doable, please point me to good working examples that show how.

This is not possible in the underlying blob stores I know of.
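Since none of the underlying stores expose an append operation, the usual workaround is read-modify-write: download the blob, append the new lines locally, and re-upload the whole object under the same name. A minimal sketch with the portable jclouds BlobStore API, assuming placeholder provider, credentials and container name (note that this is not safe against concurrent writers):

```java
import com.google.common.io.ByteStreams;
import org.jclouds.ContextBuilder;
import org.jclouds.blobstore.BlobStore;
import org.jclouds.blobstore.BlobStoreContext;
import org.jclouds.blobstore.domain.Blob;

public class AppendToBlob {
    public static void main(String[] args) throws Exception {
        // Placeholder provider and credentials; the same portable API works for
        // "aws-s3", "cloudfiles-us" or "azureblob".
        BlobStoreContext context = ContextBuilder.newBuilder("aws-s3")
                .credentials("identity", "credential")
                .buildView(BlobStoreContext.class);
        try {
            BlobStore store = context.getBlobStore();

            // 1. Download the existing blob (treat a missing blob as empty).
            byte[] existing = new byte[0];
            Blob current = store.getBlob("my-container", "log.txt");
            if (current != null) {
                existing = ByteStreams.toByteArray(current.getPayload().openStream());
            }

            // 2. Append the new lines locally.
            byte[] addition = "another line of text\n".getBytes("UTF-8");
            byte[] combined = new byte[existing.length + addition.length];
            System.arraycopy(existing, 0, combined, 0, existing.length);
            System.arraycopy(addition, 0, combined, existing.length, addition.length);

            // 3. Re-upload the whole object under the same name.
            store.putBlob("my-container",
                    store.blobBuilder("log.txt").payload(combined).build());
        } finally {
            context.close();
        }
    }
}
```

For small log-style files this is usually acceptable; for large or frequently appended blobs it gets expensive, since every append re-uploads the entire object.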

Related

Cloud file storage with file tagging and search by tags/filename

My project needs to meet the following requirements:
store a large amount of files for a reasonable price
tag individual files with custom tags
have an API method to search files by name (contains) and by tags (exact match)
do it all via the JS SDK (to keep the project serverless)
I did some work with Amazon S3 and it turned out that:
there is no search method in the JS SDK: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#listObjectsV2-property
listObjectsV2 accepts a Prefix param (i.e. filename starts with), so there is no way to search by "contains"
there is no param to search by tag at all; I can only get tags for an individual file with getObjectTagging
So the question is: what stable service can I use for file storage with the functionality described above?
Azure? Google Cloud? Backblaze B2? Something else?
Thanks!
If you use Azure blob storage, you can use Azure Search blob indexer to index both the metadata and textual content of your blobs. For a walkthrough of setting this up, see Build and query your first Azure Search index in the portal.
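Once the indexer has populated the index, queries are plain HTTPS calls, so they also work from a serverless front end. A rough sketch of a full-text query against such an index, assuming placeholder service name, index name, API version and query key:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class QueryBlobIndex {
    public static void main(String[] args) throws Exception {
        // Placeholders: replace with your search service, index and query key.
        String service = "my-search-service";
        String index = "blob-index";
        String queryKey = "YOUR-QUERY-KEY";

        // Full-text search over the indexed blob content and metadata.
        String url = "https://" + service + ".search.windows.net/indexes/" + index
                + "/docs?api-version=2016-09-01&search="
                + URLEncoder.encode("annual report", "UTF-8");

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("api-key", queryKey);
        conn.setRequestProperty("Accept", "application/json");

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // JSON result set of matching blobs
            }
        }
    }
}
```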

Extract data from MarkLogic 8.0.6 to AWS S3

I'm using MarkLogic 8.0.6, and we also have JSON documents in it. I need to extract a lot of data from MarkLogic and store it in AWS S3. We tried to run "mlcp" locally and then upload the data to AWS S3, but it's very slow because it generates a lot of files.
Our MarkLogic platform is already connected to S3 to perform backups. Is there a way to extract a specific database to AWS S3?
It would be fine for me to end up with one big file with one JSON document per line.
Thanks,
Romain.
I don't know about getting it to S3, but you can use CORB2 to extract MarkLogic documents to one big file with one JSON document per line.
s3:// is a natively supported file path in MarkLogic, so you can also iterate through all your docs and export them with xdmp:save("s3://...").
If you want to create aggregates, you may want to marry this idea with Sam's suggestion of CORB2 to control the process and help group your whole database into multiple manageable aggregate documents, then use a post-back task to run xdmp:save.
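As a rough illustration of the one-JSON-document-per-line export, here is a sketch that pages through a database with the MarkLogic Java Client API (not CORB2 itself); host, port, credentials, directory and output path are placeholders:

```java
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.document.DocumentPage;
import com.marklogic.client.document.DocumentRecord;
import com.marklogic.client.document.JSONDocumentManager;
import com.marklogic.client.io.StringHandle;
import com.marklogic.client.query.QueryManager;
import com.marklogic.client.query.StructuredQueryBuilder;
import com.marklogic.client.query.StructuredQueryDefinition;

import java.io.BufferedWriter;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ExportToNdjson {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        DatabaseClient client = DatabaseClientFactory.newClient(
                "localhost", 8000,
                new DatabaseClientFactory.DigestAuthContext("user", "password"));

        JSONDocumentManager docMgr = client.newJSONDocumentManager();
        QueryManager queryMgr = client.newQueryManager();
        StructuredQueryBuilder qb = queryMgr.newStructuredQueryBuilder();
        StructuredQueryDefinition allDocs = qb.directory(true, "/"); // everything under "/"

        docMgr.setPageLength(500); // small pages keep memory use bounded

        try (BufferedWriter out = Files.newBufferedWriter(Paths.get("export.ndjson"))) {
            long start = 1;
            DocumentPage page;
            do {
                page = docMgr.search(allDocs, start);
                for (DocumentRecord record : page) {
                    // Flatten each JSON document onto a single line.
                    String json = record.getContent(new StringHandle()).get();
                    out.write(json.replace("\n", " ").replace("\r", " "));
                    out.newLine();
                }
                start += page.getPageSize();
            } while (page.hasNextPage());
        }
        client.release();
    }
}
```

The resulting export.ndjson can then be uploaded to S3, or the content written to an s3:// path directly from the server as described above.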
Thanks guys for your answers. I didn't know about CORB2; it's a great solution! Unfortunately, due to bad I/O, I would prefer a solution that writes directly to S3.
I can use a basic ML query and dump to s3:// with the native connector, but I always run into memory errors, even when launching with the "spawn" function to generate a background process.
Do you have any XQuery example to extract each document to S3 one by one without memory errors?
Thanks

How to search a string in Amazon S3 files?

I have all types of files and documents stored in Amazon S3.
How can I perform a search on those documents using a keyword or string (full-text search, if possible)?
Is there any document-search capability built on top of it?
The list of matching documents that contain the search string should be displayed to the user for download.
Any help, please?
Searching documents in S3 is not possible.
S3 is not a document database. It is an object store, designed for storing data but inferring no "meaning" from the data -- essentially a key/value store supporting very large values. It has no sense of context. It doesn't index the content of the objects, or even the object metadata. The only way to "find" an object in S3 is to already know its key.
It is excellent for highly available and highly reliable storage, but searching is not part of its design.
The solution depends on how structured your S3 file data is.
If it is structured or semi-structured, like CSV or JSON in a columnar-like format, AWS Athena will be the best choice. With just a few clicks, you're ready to query your S3 files.
Otherwise, if the data is totally unstructured, you may want to use Elasticsearch or something similar.
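For the structured case, the Athena route boils down to defining an external table over your S3 prefix and then running ordinary SQL against it. A rough sketch with the AWS SDK for Java, assuming placeholder database, table, bucket and output location, and that the external table has already been created (e.g. in the Athena console):

```java
import com.amazonaws.services.athena.AmazonAthena;
import com.amazonaws.services.athena.AmazonAthenaClientBuilder;
import com.amazonaws.services.athena.model.QueryExecutionContext;
import com.amazonaws.services.athena.model.ResultConfiguration;
import com.amazonaws.services.athena.model.StartQueryExecutionRequest;
import com.amazonaws.services.athena.model.StartQueryExecutionResult;

public class SearchWithAthena {
    public static void main(String[] args) {
        AmazonAthena athena = AmazonAthenaClientBuilder.defaultClient();

        // "documents" is assumed to be an external table already defined over
        // s3://my-bucket/documents/ with a "body" column holding the text.
        StartQueryExecutionRequest request = new StartQueryExecutionRequest()
                .withQueryString(
                        "SELECT doc_id, body FROM documents WHERE body LIKE '%keyword%'")
                .withQueryExecutionContext(
                        new QueryExecutionContext().withDatabase("my_database"))
                .withResultConfiguration(new ResultConfiguration()
                        .withOutputLocation("s3://my-bucket/athena-results/"));

        StartQueryExecutionResult result = athena.startQueryExecution(request);
        // Poll GetQueryExecution with this id, then read the rows via GetQueryResults.
        System.out.println("Query execution id: " + result.getQueryExecutionId());
    }
}
```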
You can't search the way you wish within Amazon S3 itself,
but there is an alternative: I use the S3 Browser software for this.
Here is the link to download it: http://s3browser.com/
Download it and you will have the same kind of access as the Amazon S3 console; you can also perform search and other operations.

Best way to back up Azure Blob Storage

I would like to use Azure Blobs to store user uploaded images for a website. Upon upload the images are resized and put into folders for thumbnails, large pics and originals. These can be easily referenced from the website, so the solution works pretty well.
The problem is the backup. I understand that Azure has three copies of every blob to protect against hardware failure. If an authenticated user deletes the blob, MS will faithfully delete all three copies, which is a problem.
I couldn't find an easy way to regularly back up and restore a blob container to a point in time. Is there such a solution offered in the azure marketplace that anyone knows of? Maybe this would be better on ServerFault as I'm looking for a canned solution, but the MS link sent me over to Stack Overflow so I'm giving it a shot here.
One method is to use blob snapshots. Refer to https://msdn.microsoft.com/en-us/library/azure/ee691971.aspx
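A minimal sketch of taking a snapshot with the classic Azure Storage SDK for Java; the connection string, container and blob names are placeholders:

```java
import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.blob.CloudBlobClient;
import com.microsoft.azure.storage.blob.CloudBlobContainer;
import com.microsoft.azure.storage.blob.CloudBlockBlob;

public class SnapshotBlob {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string.
        String connection = "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=...";

        CloudStorageAccount account = CloudStorageAccount.parse(connection);
        CloudBlobClient client = account.createCloudBlobClient();
        CloudBlobContainer container = client.getContainerReference("images");
        CloudBlockBlob blob = container.getBlockBlobReference("originals/photo1.jpg");

        // Creates a read-only, point-in-time copy of the blob. The base blob can
        // only be deleted if you explicitly decide what happens to its snapshots,
        // which gives some protection against accidental deletes.
        blob.createSnapshot();
    }
}
```

Running this on a schedule, or right after each upload, gives you point-in-time copies you can copy back over the base blob if it is overwritten; a plain delete of the base blob fails while snapshots exist unless the snapshots are explicitly deleted with it.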

Get All Documents from a Couchbase Bucket without a View or N1QL

I am implementing an Express web service using Couchbase as my database. To get all documents stored in a bucket, I created a view using the web console.
My question is whether there is a way to do the same thing without creating a view or using N1QL.
I was looking at the Couchbase Server REST API, but I didn't find a way.
Thank you
You could design your schema around something like this. I am thinking specifically of a key pattern that would allow for a bulk get of a range of documents.
Beyond that, there is no way without a view or N1QL.
In Couchbase 3.0 and higher, you can also use DCP to stream all documents from a bucket. Currently the DCP protocol is only implemented in Java; you can see an example here: http://github.com/branor/couchbase-dcp-consumer
Note that there is a problem in the 1.1.0+ versions of the Couchbase core-io library, so you need to use version 1.1.0-dp (developer preview) to open a stream. DCP support in the SDK is still experimental, so I wouldn't use it in production yet.
Create a document that will hold the keys of all your documents.
While inserting a key-value pair into Couchbase, also append the new key to that document.
E.g.:
<Key1, Value1>
<Key2, Value2>
.
.
.
<Keyx, Valuex>
<All_Keys, <Key1, Key2, Key3...Keyx>>
To get all the documents,
just do a client.get("All_Keys") and then a client.getBulk() operation on the returned keys.
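A rough sketch of that pattern with the Couchbase Java SDK 2.x (the host, bucket name and the "All_Keys"/"keys" layout are placeholders matching the example above); the bulk fetch goes through the async API, which is the SDK's usual way to do a multi-get:

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonArray;
import rx.Observable;

import java.util.List;

public class FetchAllDocuments {
    public static void main(String[] args) {
        Cluster cluster = CouchbaseCluster.create("localhost"); // placeholder host
        Bucket bucket = cluster.openBucket("default");          // placeholder bucket

        // 1. Fetch the index document that lists every key in the bucket.
        //    (On every insert you would also append the new key to this array,
        //    e.g. with the sub-document API's arrayAppend.)
        JsonDocument allKeys = bucket.get("All_Keys");
        JsonArray keys = allKeys.content().getArray("keys");

        // 2. Bulk-get all listed documents through the async API.
        List<JsonDocument> docs = Observable
                .from(keys.toList())
                .flatMap(key -> bucket.async().get(key.toString()))
                .toList()
                .toBlocking()
                .single();

        for (JsonDocument doc : docs) {
            System.out.println(doc.id() + " -> " + doc.content());
        }

        cluster.disconnect();
    }
}
```

Keep in mind that the All_Keys document becomes a hot spot and an extra write on every insert, which is why a view, N1QL, or DCP is usually preferred once the bucket grows large.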