Most efficient way to rename a content stream with CMIS - filenames

Every CMIS document has:
a contentStream (for instance a video, as a binary)
a contentStreamFileName (for instance myvideo.ogv)
(well except CMIS documents that have a null content stream)
Paragraph 2.1.4.3.3 of the CMIS 1.1 specification says that contentStreamFileName is NOT updatable.
So, when a CMIS client wants to rename myvideo.ogv to cinematon.ogv, how should it do that?
Is there anything more efficient than downloading and re-uploading the same binary with a different name?
The binary can be several gigabytes.

A generic CMIS client cannot rename a content stream without replacing the content (with the same content).
Repositories do not agree on how the content stream filename is handled, which is why the property is read-only.
Some repositories allow changing the filename, but how to do it is repository-specific.

Uploading a file via JAX-RS REST Client interface, with third party server

I need to invoke a remote REST interface handler and submit a file in the request body. Please note that I don't control the server. I cannot change the request to be multipart; the client has to work in accordance with an external specification.
So far I managed to make it work like this (omitting headers etc. for brevity):
byte[] data = readFileCompletely ();
client.target (url).request ().post (Entity.entity (data, "file/mimetype"));
This works, but will fail with huge files that don't fit into memory. And since I have no restriction on filesize, this is a concern.
Question: is it somehow possible to use streams or something similar to avoid reading the whole file into memory?
If possible, I'd prefer to avoid implementation-specific extensions. If not, a solution that works with RESTEasy (on Wildfly) is also acceptable.
RESTEasy and Jersey both support InputStream entities out of the box, so simply use Entity.entity(inputStream, "application/octet-stream"), with whatever Content-Type you need to set.
You can go low-level and construct the HTTP request yourself using the plain java.net.URLConnection.
I have not tried it myself but there is example code which reads a local file and writes it to the request stream without loading it into a byte array.
Upload files from Java client to a HTTP server
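Of course this solution requires more manual coding, but it should work. Note that HttpURLConnection buffers the request body in memory by default; enabling a streaming mode (setChunkedStreamingMode or setFixedLengthStreamingMode) before writing avoids that.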
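A minimal sketch of the low-level approach, using only the JDK. The class and method names (StreamingUpload, upload) are made up for illustration, and the sketch assumes the server accepts chunked transfer encoding:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class StreamingUpload {

    /** POSTs the file as the raw request body and returns the HTTP status code. */
    static int upload(URL url, Path file, String contentType) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", contentType);
        // Without a streaming mode the connection buffers the whole body in memory.
        // A value <= 0 asks for the default chunk size.
        conn.setChunkedStreamingMode(0);

        try (InputStream in = Files.newInputStream(file);
             OutputStream out = conn.getOutputStream()) {
            // Copy through a small fixed buffer; memory use does not depend on file size.
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: StreamingUpload <url> <file>");
            return;
        }
        URL url = URI.create(args[0]).toURL();
        int status = upload(url, Paths.get(args[1]), "application/octet-stream");
        System.out.println("HTTP status: " + status);
    }
}
```

If the server insists on a Content-Length header, conn.setFixedLengthStreamingMode(Files.size(file)) can be used instead of the chunked mode; the copy loop stays the same.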

What are the options and consistency guarantees for storing file data in NTFS?

I know a file has the $DATA attribute, but I have heard that in some circumstances it does not include the whole file content.
I've also heard that delayed write operations can make this attribute not fully accurate as an indication of the file content.
So, what are the possible structures for holding file content in NTFS, and how consistent is each of them?
File data can be stored in the unnamed $DATA attribute (the default stream) and/or in alternate data streams (ADS, which are named $DATA attributes). Extended attributes (the $EA attribute in NTFS) are another option for storing additional metadata about a file. Each attribute is a data stream that is managed in the kernel by the Cache Manager (Cc* API) and the Memory Manager (Mm* API). A $DATA attribute can either live entirely inside the MFT file record (resident) or be stored in separate disk clusters (non-resident). Nevertheless, if you only use the user-mode API, you can disregard all of this: the system always gives you accurate data.

How to set http headers in dotCMS

I'm trying to create an XML data feed with dotCMS. I can easily output the correct XML document structure in a .dot "page", but the HTTP headers sent to the client still say that my page contains "text/html". How can I change them to "text/xml" or "application/xml"?
Apparently there's no way to do it using the administration console. The only way I found is to add this line of (Velocity) code
$response.setHeader("Content-Type", "application/xml")
to the top of the page template.
Your solution is the easiest. However, there are other options that take a bit more work but spare you from generating the XML in Velocity. Avoiding Velocity for XML generation is usually more robust.
dotCMS uses XStream to generate XML files (and vice versa). You could write a generic plugin that uses it as well.
A JSONContentServlet exists in dotCMS that takes a query and generates JSON or XML (depending on your parameters). It has no servlet mapping by default, but that is easy to add.

.NET ZipPackage vs DotNetZip when getting streams to entries

I have been using the ZipPackage-class in .NET for some time and I really like the simple and intuitive API it has. When reading from an entry I do entry.GetStream() and I read from this stream. When writing/updating an entry I do entry.GetStream(FileAccess.ReadWrite) and write to this stream. Very simple and useful because I can hand over the reading/writing to some other code not knowing where the Stream comes from originally.
Now, since the ZipPackage API doesn't support entry properties such as LastModified, I have been looking into other zip APIs such as DotNetZip. But I'm a bit confused about how to use it. For instance, to read from an entry I first have to extract the entire entry into a MemoryStream, seek to the beginning and hand this stream over to my other code. And to write to an entry I have to supply a stream that the ZipEntry itself can read from. This seems very backwards to me. Am I using this API in the wrong way?
Isn't it possible for the ZipEntry to deliver the file straight from the disk where it is stored and extract it as the reader reads it? Does it really need to be fully extracted into memory first? I'm no expert but it seems wrong to me.
Using the DotNetZip library does not require you to read the entire zip file into a memory stream. When you create an instance of ZipFile as shown below, the library only reads the zip file headers, which contain properties such as last modified, etc. Here is an example of opening a zip file. The DotNetZip library reads the zip file headers and constructs a list of all entries in the zip:
using (Ionic.Zip.ZipFile zipFile = Ionic.Zip.ZipFile.Read(this.FileAbsolutePath))
{
...
}
It's then up to you to extract zip entries to a stream, to the file system, etc. In the example below, I'm using the string indexer on zipFile to get the entry named SomeFile.txt. The matching ZipEntry object is then extracted to a memory stream.
MemoryStream memStr = new MemoryStream();
zipFile["SomeFile.txt"].Extract(memStr); // Response.OutputStream);
Zip entries must be read into the .NET process space in order to be inflated (decompressed); there's no way to bypass that by going straight to the file system. This is similar to how a Windows Explorer shell zip extractor works: the Windows shell extensions for 7-Zip or Windows' built-in Compressed Folders have to read entries into memory and then write them to the file system before you can open an entry.
OK, I'm answering this myself because I found the answers. There are apparently methods in DotNetZip for both of the things I wanted. For opening a read stream: myZipEntry.OpenReader(). For opening a write stream: myZipFile.UpdateEntry(e, (fn, obj) => Serialize(obj)). This works fine.

How is file upload handled in HTTP?

I am curious to know how webservers handle file uploads.
Is the entire file sent as a single chunk? Or is it streamed into the webserver - which puts it together and saves it in a temp folder for PHP etc. to use?
It's just a matter of following the encoding rules so that the receiver can easily decode (parse) it. Read the specification of the multipart/form-data encoding (the one required for HTML-based file uploads using input type="file").
Generally the parsing is done by the server-side application itself. The web server only takes care of streaming the bytes from one side to the other.
To answer the question: it's streamed. See RFC 1867 for more information.
RFC 1867 describes the mechanism.
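To make the encoding concrete, here is a rough sketch of what a client writes for a single-file multipart/form-data body. The class name, boundary, field name and file name are all invented for illustration, and a real client has to choose a boundary that does not occur in the file content:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class MultipartSketch {

    /** Writes one file part followed by the closing boundary to 'out'. */
    static void writeFilePart(OutputStream out, String boundary, String fieldName,
                              String fileName, String contentType, InputStream content)
            throws IOException {
        // Each part starts with "--" + boundary, then its own headers, then a blank line.
        String header = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "\r\n";
        out.write(header.getBytes(StandardCharsets.UTF_8));

        // The file bytes follow unmodified, copied through a small buffer.
        byte[] buffer = new byte[8192];
        int read;
        while ((read = content.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }

        // The final boundary carries two extra dashes at the end.
        out.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        InputStream file = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        writeFilePart(body, "XyZ123", "upload", "hello.txt", "text/plain", file);
        System.out.print(new String(body.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

The request itself would carry the header Content-Type: multipart/form-data; boundary=XyZ123. Because the body is written sequentially, the receiving side can parse it as the bytes arrive.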