Writing file data directly to disk/db with blobs (plone.app.blob/Archetypes, and plone.namedfile)

I have several pieces of data that need to be merged into one file (ATContentTypes blob file, Plone 4.1). The total amount of data is likely to be quite large so I really don't want to have to load it all into memory, concatenate it, and do something like o.setFile(data). If I were writing directly to the file system I could just do open(myfile, 'a') and write to it, but I'm not clear how I could do that with a blob supported content type. All of the docs and tests I've been able to look at just have it being set with a str or in-memory StringIO. Is there a way to append to this field without loading the whole thing into memory?
Similarly, I've also looked at using Dexterity with a plone.namedfile NamedBlobFile. It looks like that field just has a 'data' attribute that is basically a string. How could I append to that without loading the whole thing into memory?

It's quite old and the product has never been officially released, but it can help you: ore.bigfile.
It's well explained in this blog article: http://blog.jazkarta.com/2010/09/21/handling-large-files-in-plone-with-ore-bigfile/
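If you can get at the underlying ZODB blob, you can also open it as a file and append to it directly. A minimal sketch, assuming the field value exposes its blob (the getBlob()/_blob accessors in the comments match plone.app.blob's BlobWrapper and plone.namedfile's internals as I understand them, so verify them against your versions):

    def append_chunks(blob, chunks):
        """Append an iterable of byte chunks to a ZODB blob without
        holding the whole payload in memory."""
        handle = blob.open('a')  # file-like handle; 'a' appends to the blob
        try:
            for chunk in chunks:
                handle.write(chunk)
        finally:
            handle.close()
        # the change is persisted when the current transaction commits

    # Archetypes / plone.app.blob (assumed accessors):
    #   blob = context.getField('file').getUnwrapped(context).getBlob()
    # Dexterity / plone.namedfile (private attribute, assumed stable):
    #   blob = context.file._blob
    # append_chunks(blob, iter_my_chunks())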

Related

Why should applications read a PDF file backwards?

I am trying to wrap my head around the PDF file structure. There is a header, a body with objects, a cross-reference table and a trailer. In the official PDF reference from Adobe, section 3.4.4 about file trailer, we can read that:
The trailer of a PDF file enables an application reading the file to quickly find the cross-reference table and certain special objects. Applications should read a PDF file from its end.
This looks very inefficient to me. I can't show anything to users this way (not even the first page) before I load the whole file. Well, to be precise, I can, if my file is linearized. But that is optional and means some extra overhead both when writing and reading such a file.
Instead of that whole linearization thing, it would be easier to just put the references in front of the body (followed by the objects for page 1, page 2, page 3...). But the people at Adobe probably had their reasons for putting it after the body. I just don't see them. So...
Why is the cross-reference table placed after the body?
I would agree with the two reasons already mentioned, though not so much because of hardware limitations "back in the day" as because of scale. It's easy to think an invoice with a couple of pages of text could be structured differently, but what about a book, or a PDF with 1,000 photos?
With the trailer at the end, you can write images/text/fonts to the file as they are processed and then discard them from memory, keeping only the file offset of each object, to be used later when writing the trailer.
If the trailer had to come first, you would have to read (or even generate, in the case of an embedded font) all of these objects just to get their sizes so you could write out the trailer, and only then write the objects themselves to the file. So you would either be reading, sizing, discarding, then reading again, or trying to hold everything in RAM until you could write it out.
Write speed and RAM are still issues we contend with today, when we're running in a Docker container on a VM on shared hardware.
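To make that write-side argument concrete, here is a toy sketch in Python (illustrative only; the output is not a complete, viewable PDF): objects are streamed out one at a time, only their byte offsets are kept, and the xref and trailer are emitted last.

    def write_toy_pdf(path, objects):
        # `objects` is an iterable of bytes bodies; stream each one out,
        # remember only its offset, and build the xref at the very end.
        offsets = []
        with open(path, 'wb') as f:
            f.write(b'%PDF-1.4\n')
            for i, body in enumerate(objects, start=1):
                offsets.append(f.tell())       # record offset, discard body
                f.write(b'%d 0 obj\n' % i)
                f.write(body + b'\nendobj\n')
            xref_pos = f.tell()
            f.write(b'xref\n0 %d\n' % (len(offsets) + 1))
            f.write(b'0000000000 65535 f \n')  # the mandatory free entry
            for off in offsets:
                f.write(b'%010d 00000 n \n' % off)
            f.write(b'trailer\n<< /Size %d >>\n' % (len(offsets) + 1))
            f.write(b'startxref\n%d\n%%%%EOF\n' % xref_pos)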
PDF was invented back when hard drives were slow to write files... really s-l-o-w. By putting the xref at the end, you could quickly change a file by simply appending new objects and an updated xref to the end of the file rather than rewriting the whole thing.
Not only were the drives slow (giving rise to the argument in joelgeraci's answer), there was also much less RAM available in a typical computer. Thus, when creating a PDF, one had to write data to the file early, much earlier than one had any idea how big the file or, as a consequence, the cross references would become. Writing the cross references at the end was therefore a natural consequence.
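On the read side, the "read from the end" advice boils down to scanning a small tail of the file for the startxref keyword, which holds the byte offset of the xref table. A minimal sketch in Python, assuming a well-formed, unencrypted PDF:

    def find_xref_offset(path, tail_size=2048):
        with open(path, 'rb') as f:
            size = f.seek(0, 2)               # seek to end to get file size
            f.seek(max(0, size - tail_size))
            tail = f.read()
        # a PDF ends with: startxref\n<byte offset of xref>\n%%EOF
        idx = tail.rfind(b'startxref')
        if idx == -1:
            raise ValueError('no startxref found in file tail')
        return int(tail[idx + len(b'startxref'):].split()[0])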

How to compare and find the differences between two XML files in cocoa?

This is a bit of a two-part question about working with 40 MB XML files.
• What’s a reasonable size to store in memory for a program running continually in the background?
• How do I find what has changed in an XML file?
So on the first read the XML is loaded into NSData, then uploaded to the server.
Now instead of uploading a 40 MB XML file every time it changes, I would prefer to upload a “delta” file containing only what has changed. The program would monitor the file for changes and activate when it’s been modified. From what I can see, I would need to parse an old version of the XML file and the modified XML file, then compare them? Is it unreasonable to store 80 MB in memory like this every time the file is modified? I’m assuming this has to be done with a DOM parser, because I can’t see how you could compare two files like that with a SAX parser, since it only ever has part of the file in memory.
I'm a newbie at this so any help would be appreciated!
To compare two files:
There are several ways to do this (it depends on the files involved, so I may not be exactly right):
sdiff file1.xml file2.xml (a Unix command)
You can run this command from AppleScript, or from your app with NSTask.
-[NSFileManager contentsEqualAtPath:andPath:]
This method checks whether the two files at the given paths are the same file, then compares their sizes, and finally compares their contents.
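For what it's worth, Python's standard library has a function in the same spirit (shown only as a cross-language illustration): filecmp.cmp() with shallow=False compares the two files' contents chunk by chunk, stopping at the first difference.

    import filecmp

    # shallow=False forces a content comparison, not just a stat() check
    if filecmp.cmp('old.xml', 'new.xml', shallow=False):
        print('files are identical')
    else:
        print('files differ')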
For the other part:
On what size is reasonable for a background process: I don't think there is a hard limit, though for a foreground application it matters more. You can save the data into temporary files instead. Even Safari uses 130+ MB, as you can easily check in Activity Monitor.
NSXMLParser ended up being the most useful for this.
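As for producing the delta itself, here is what the compare-then-upload idea looks like sketched in Python (illustrative only, since the question is about Cocoa; note that this still reads both files' line lists into memory, which is usually acceptable for two 40 MB files on a desktop machine):

    import difflib

    def write_delta(old_path, new_path, delta_path):
        """Write a unified diff of two text/XML files to delta_path."""
        with open(old_path) as old, open(new_path) as new:
            delta = difflib.unified_diff(old.readlines(), new.readlines(),
                                         fromfile=old_path, tofile=new_path)
            with open(delta_path, 'w') as out:
                out.writelines(delta)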

Taking too much time to load TreeGrid in ExtJS 4

I'm facing a problem and, as I'm new to ExtJS, I'm not sure where it comes from.
I am using the TreeGrid of ExtJS 4. I have a combobox where I select an option and run a search operation; on search it populates the TreeGrid.
The problem comes when I have a huge XML file that I need to load into the TreeGrid: it takes too much time. Can anyone help me identify what the problem might be?
With a small XML file it works fine.
I too have had problems loading large files. If your files are that large, don't stick with XML.
Try the JSON format; it will perform better with large files.
To read XML you need to parse it, walk the nodes, attributes, and child nodes of the XML document, and then use the data that you've found.
With JSON it's easy to get at the data, since it's already native JavaScript. No parsers or proxies necessary: all you need to do is loop through the data. Fast and simple.
http://think2loud.com/680-json-xml/
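One quick way to see the parsing-cost difference this answer describes is to time the same payload in both formats. A throwaway Python sketch (the numbers will vary with your data and parser, and in the browser the gap also depends on the JavaScript engine):

    import json
    import time
    import xml.etree.ElementTree as ET

    n = 50000
    xml_payload = '<items>' + '<item id="1">x</item>' * n + '</items>'
    json_payload = json.dumps([{'id': 1, 'text': 'x'}] * n)

    t0 = time.perf_counter()
    ET.fromstring(xml_payload)    # builds a full element tree
    t_xml = time.perf_counter() - t0

    t0 = time.perf_counter()
    json.loads(json_payload)      # decodes straight to lists/dicts
    t_json = time.perf_counter() - t0

    print('xml: %.3fs  json: %.3fs' % (t_xml, t_json))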

How to parse and load text file with Core Data?

I resort to your expert advice because I am sort of "new" to Objective-C. I have read a couple of books and docs (namely Aaron Hillegass's and Stephen G. Kochan's books), but some things are still unclear to me, for lack of practice.
To put you in context, I have a NSDocument project that uses Core Data for storage.
I struggle with two things right now: reading/writing files, and table views ^^
So my first question is about Core Data: is it only able to save in SQLite, XML, or binary format?
Or can I use Core Data to read/write any format, according to what I declared in the plist file?
I am trying to work with .po files, and I want to display the translations in a table view containing two columns (one for the msgid and the other for the msgstr).
To read and write files in the .po format and display lines in my table view, I most likely need to parse the files using line endings and characters such as "#" as delimiters.
I haven't gotten around to doing that yet (I have no idea how to do it yet!), but I would like to know if it is possible, or if I need to restart my project without Core Data...
Please DO NOT just throw links to the Apple documentation at me; it's the most confusing thing ever and feels like it's made for experts only! I need me some human-readable explanations :)
Thanks a bunch for any help and advice you can give me!
It is possible to write a different storage format for Core Data, but it is not easy and it sounds like you are not at a level where that is a possibility (no shame there, I'm not either).
If you are only displaying data from the .po files, then there is no need to use Core Data. Core Data is meant to provide a storage solution: you create/edit data and save it with Core Data. If you have no intention of creating and editing data, then get rid of Core Data; it will only get in the way.
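Independent of the Core Data question, the parsing part is straightforward. A minimal sketch in Python (illustrative only, since the project is Objective-C; real .po files also have multi-line strings, plural forms, and escape sequences that dedicated tools such as polib handle properly):

    def parse_po(path):
        """Return a list of (msgid, msgstr) pairs for the two table columns."""
        entries, msgid, msgstr = [], None, None
        with open(path, encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith('#'):
                    continue                   # comments and blank separators
                if line.startswith('msgid '):
                    if msgid is not None:
                        entries.append((msgid, msgstr))
                    msgid = line[len('msgid '):].strip('"')
                    msgstr = None
                elif line.startswith('msgstr '):
                    msgstr = line[len('msgstr '):].strip('"')
        if msgid is not None:
            entries.append((msgid, msgstr))
        return entries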

How do I store strings permanently? After the app is closed?

I'm trying to figure out the proper way of doing this.
I've got several strings that I want to store/save permanently, even after the application is closed. How should I proceed? Do I read from and write to a text file?
I believe you're looking for a feature known as Application Settings. This feature takes care of storing settings between runs of the application. The manner in which it stores settings is ClickOnce- and user-aware, so it takes much of the problem out of the picture.
Here's a link to an overview on the topic
http://msdn.microsoft.com/en-us/library/c9db58th(VS.80).aspx
Use My.Settings
Yes, you might store it in a simple text file or use a settings file.
Take a look at Application Settings:
http://msdn.microsoft.com/en-us/library/0zszyc6e.aspx
I store what I need in a plain text file, using my own format. First line: length of the array, or the number of bytes/lines the data needs. Second line: data types. Third line: directories or path info. At the end I store the data itself.
That works because programming languages can read by characters or by lines; C++, for instance, can split its input on either whitespace or line endings.
SQL or Access is for when you need to store more complex data than just strings or arrays.
Yes, I'd store it in some form of text file; then you can read it on load. It's very easy to implement in Visual Basic, and you might even find some samples on Codemonkeys or similar sites. I'd avoid using the registry. Of course, if you want, you could also use some sort of database (Access, SQLite, etc.) to store the values, but that depends on the type of data and how much you need to read/write it.
Yes, you can write to a text file, or try SQLite, which can give your VB program database capabilities.
http://www.google.com/search?hl=en&q=visual+basic+sqlite&btnG=Search
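The plain-text approach suggested in several of these answers is only a few lines in any language. A sketch in Python for illustration (the file path is hypothetical, and the thread itself is VB.NET, where My.Settings does this for you):

    import os

    SETTINGS_PATH = os.path.expanduser('~/.myapp_strings.txt')  # hypothetical

    def save_strings(strings):
        """Write one string per line; assumes the strings contain no newlines."""
        with open(SETTINGS_PATH, 'w', encoding='utf-8') as f:
            for s in strings:
                f.write(s + '\n')

    def load_strings():
        """Read the strings back on startup; returns [] on first run."""
        if not os.path.exists(SETTINGS_PATH):
            return []
        with open(SETTINGS_PATH, encoding='utf-8') as f:
            return [line.rstrip('\n') for line in f]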