I'm trying to find a somewhat more comprehensive writeup, or an example to refer to regarding a Document Based Application which saves it's contents to a document bundle rather than a single file.
A good example of what I mean by a document bundle is how Pages saves it's documents i.e.
Document Name.myextension\
directory1\
somefile.ext
somefile2.ext
directory2\
directory3\
file1.png
file2.png
As we know apples documentation, although extensive, can, at the best of times be somewhat daunting to work with :-/
Any help would be appreciated!
Thanks
Ade
Take a look at the following Stack Overflow question:
NSDocument to hold a complete folder?
If the answer to that question isn't detailed enough, you can read the following article on file packages:
Working with Cocoa File Packages
Regarding Apple's documentation, the relevant classes to look at are NSFileWrapper, NSData, and NSDocument.
Related
I am writing a program in Java for a university project, part of the write up report states:
'You must provide listings for your program'
Can anyone provide me with some clarification on what is meant by this?
I have looked high and wide online but nothing i've come across has helped clear this up for me. I found a definition 'With computer programming, a program listing is the complete listing of a computer program, source code, and all files that make up the software program', but his hasn't helped with my understanding of what is being asked in the report.
Should I be providing screen-grabs of my code? Or a screen-grab of the folder with all related files?
Any help would be appreciated, thanks.
A listing of your program used to mean the code of your program rendered into printed form; i.e. on paper. These days, it could also mean that the source code is formatted and included as a PDF file, or a Word document or something else.
Should I be providing screen-grabs of my code?
It is unclear if that is what your lecturer wants. I don't expect so, because screenshots are harder to read than formatted text.
Or a screen-grab of the folder with all related files?
That is highly unlikely, IMO. If that is what your lecturer wanted they would have said "directory listing" not "listings for your program". (And that would be useless for assessment purposes.)
But my advice is to ask your lecturer if you are at all unclear what is required of you.
And if your lecturer is unwilling to explain, just do what you think is correct.
What you find is correct, you need to give source code and any other resources needed to build and run the softaware.
One options could be to :
- pack my project with some build manager (maven, gradle, etc)
- push it to some repository (like github) with a README.md for building and running
- give the github project reference.
If you prefer not to make public the code, just pack it and send an archive with the maven project.
They are looking for a nice printed output of your source code. In olden days (pre 2K) compilers would produce a output, well formatted (often with a list of symbols and line number to aid in understanding the code.
I am using a command line tool (ng-xi18n) to extract the i18n strings from an angular 2 app I wrote. The output of this command is a messages.xlf file. Coming from a .po background, and being not familiar with .xlf, I assumed that this file is the equivalent to the .pot file (correct me if I am wrong).
I then assumed that if I want to translate my app, I had to cp messages.xlf messages.de.xlf to have a copy (messages.de.xlf) of the template file (messages.xlf) where I can translate each message into German (hence the .de.xlf).
After translating some dummy texts and running the app, I saw that it worked as expected, so I quit translating and continued developing the app. After some time, I added more i18n strings, and eventually thought that I had to update my template. And this is where things got hardly maintainable. I updated the template messages.xlf file, and quickly was wondering how I could update the new strings to my already translated messages.de.xlf file without loosing my progress.
When I was developing using .po files, this was no problem thanks to good tools like poEdit, but I didn't find anything comparable for .xlf. After trying some tools, I thought that the best choice would be Lokalize, but I didn't find a possibility to merge the template file to already translated (but outdated) files either.
Up to now, this was rather an essay than a question, so here's a quick summary:
Is the workflow of dealing with .xlf files really comparable to .po as I initially thought (described above), or is it completely different?
How am I suppose to update my already translated files?
What are the best practices dealing with .xlf files?
What are proof of concept tools to work with .xlf?
Sidenotes:
The Lokalize handbook was not helpful at all. I see a lot of functions that sound promising, like:
"File" > "Update file from template". I did not find anything in the handbook to explain this function. If I click on this, nothing happens.
"Sync" > "Open file for sync/merge". This seems to be a function to merge two similar files (by multiple translators) rather than a tool to update the translation file from a template. Even though there is a tooltip in Lokalize's primary sync tab, notifying me about "x unmatched entries", I just couldn't find anything to append those unmatched entries to my .de.xlf file.
[Update] Turns out, I had similar issues as in this question. After downgrading my version of Lokalize to the suggested one, many issues (including the ones mentioned in the question) disappeared. However, now the "Update file from template" option is greyed out, and I don't know why.
I also tried OmegaT, which does not work at all on my platform (Ubuntu 16.04).
[Update] Virtaal works great for merging new strings from a template, but the UI in general is very poorly designed...
Googling did not help, as every hit seems to be related to XCode or something.
Thanks for any help in advance, I really appreciate it
I wrote a small npm command line tool called xliffmerge.
In principle it does the same, that Roland Oldengarm does with his gulp tasks described in his blog article.
It is free and you can have a look at it at https://github.com/martinroob/ngx-i18nsupport#readme
The best workflow automation solution I have seen described so far is from Roland Oldengarm's blog entry "Angular 2: Automated i18n workflow using gulp". To summarize, in a few dozen lines of Gulp code he created the tooling to handle some of the challenges you faced. Specifically it runs ng-xi18n to extract the messages; creates an English translation with sources copied to targets; updates existing translations by adding new trans-units, keeping existing ones, and removing missing ones; and then exposes all xlf files as TypeScript string constants. These last strings can then be imported to supply the bootstrapModule with its translation provider options.
Caveat: I have not used this exact solution (and code) myself, but I was able to expose generated xlf as TypeScript strings and use them in an app in a manner similar to what he described. As for maintaining translations, I have leveraged IntelliJ IDEA (WebStorm) file comparison features and Counterparts Lite (for Mac) for that. My own efforts are still in early stages but are working end to end for an application that is in active development.
Official Angular docs are now updated for Internationalization (i18n) at https://angular.io/docs/ts/latest/cookbook/i18n.html including a section specifically for creating a translation source file with the ng-xi18n tool.
I am trying to internationalize an iPhone app that was written few years ago and at that time unfortunately internalization was not the priority. As there are lot of nib files I have used ibtool to programmatically generate the nib files for each country. After running the ibtool utility i had to open up the project.pbxproj file and had to insert manual entries for the spanish language. Since i have to update lot of entries i did not choose this route.
I wanted to know how can i generate region specific nib file and have the entries reflected in the project.pbxproj file.
Secondly, I was unable to use gestrings option as the code doesn't use the NSLocalized method every where the text the is hard coded. while i prefer this option is there a safe mechanism to find out all the strings in the code base that can be internationalized?
Apologies for the long question. Any recommendations on how should i go about internationalizing the app?
Regards,
I solved this problem recently and came across the following links iphone-applications-localization-guide and ibtool-localization and Merging-changes-into-a-localized-nib-file
maybe it will help you.
I noticed that Apple started using zip archives to replace document packages (folders appearing as a single file in Finder) in the iWork applications. I'm considering doing the same as I keep getting support emails related to my document packages getting corrupted when copying them to a windows fileserver.
My questions is what would be the best way to do this in a NSDocument-based application?
I guess the easiest way would be to create a directory file wrapper, create an archive of it and return it in NSDocument's
- (NSFileWrapper *)fileWrapperOfType:(NSString *)typeName error:(NSError **)outError
But I fail to understand how to create a zip archive of the NSFileWrapper.
If you just want to make a zip file your format (ie, "mydoc.myextension" is actually a zip file), there's no convenient, built-in Cocoa mechanism for creating zip archives with code. Take a look at this Google Code project: ziparchive I don't believe a file wrapper will help in that case, though.
Since you cited iWork, I don't own iWork 09, but previous versions use a package format (ie, NSFileWrapper would be ideal) but zip the XML that describes the document's structure, while keeping attachments (like embedded media, images, etc.) in a resource folder, all within the package. I assume they do this because XML can be quite large for large, complicated documents, but compresses very well because it's text. This results in an overall smaller document.
If indeed Apple has moved to making the entire document one big zip archive (which I would find odd), they'd either be extracting necessary resources to a temp folder somewhere or loading the whole thing into memory (a step backward from their package-based approach, IMO). These are considerations you'll need to take into account as well.
You’ll want to take the data from the file wrapper and feed it into something like ziparchive.
Pierre-Olivier Latour has written an extension to NSData that deals with zip compression. You can get it here: http://code.google.com/p/polkit/
I know this is a little late to the party but I thought I'd offer up another link that could help anyone that comes across this post.
Looks like the ZipBrowser sample from Apple would be a good start http://developer.apple.com/library/mac/#samplecode/ZipBrowser/Introduction/Intro.html
HTH
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
Improve this question
For a small project I have to parse pdf files and take a specific part of them (a simple chain of characters). I'd like to use python to do this and I've found several libraries that are capable of doing what I want in some ways.
But now after a few researches, I'm wondering what is the real structure of a pdf file, does anyone know if there is a spec or some explanations anywhere online? I've found a link on adobe but it seems that it's a dead link :(
Here is a link to Adobe's reference material
http://www.adobe.com/devnet/pdf/pdf_reference.html
You should know though that PDF is only about presentation, not structure. Parsing will not come easy.
I found the GNU Introduction to PDF to be helpful in understanding the structure. It includes an easily readable example PDF file that they describe in complete detail.
Other helpful links:
PDF Succinctly book is longer and has helpful pictures.
Introduction to the Insides of PDF is a presentation that isn't as in-depth but gives a quick overview and has lots of pictures.
When I first started working with PDF, I found the PDF reference very hard to navigate.
It might help you to know that the overview of the file structure is found in syntax, and what Adobe call the document structure is the object structure and not the file structure. That is also found in Syntax. The description of operators is hidden away in Appendix A - very useful for understanding what is happening in content streams. If you ever have the pain of working with colour spaces you will find that hidden in Graphics! Hopefully these pointers will help you find things more quickly than I did.
If you are using windows, pdftron CosEdit allows you to browse the object structure to understand it. There is a free demo available that allows you to examine the file but not save it.
Here's the raw reference of PDF 1.7, and here's an article describing the structure of a PDF file. If you use Vim, the pdftk plugin is a good way to explore the document in an ever-so-slightly less raw form, and the pdftk utility itself (and its GPL source) is a great way to tease documents apart.
I'm trying to do pretty much the same thing. The PDF reference is a very difficult document to read. This tutorial is a better start I think.
This may help shed a little light:
(from page 11 of PDF32000.book)
PDF syntax is best understood by considering it as four parts, as shown in Figure 1:
• Objects. A PDF document is a data structure composed from a small set of basic types of data objects.
Sub-clause 7.2, "Lexical Conventions," describes the character set used to write objects and other
syntactic elements. Sub-clause 7.3, "Objects," describes the syntax and essential properties of the objects.
Sub-clause 7.3.8, "Stream Objects," provides complete details of the most complex data type, the stream
object.
• File structure. The PDF file structure determines how objects are stored in a PDF file, how they are
accessed, and how they are updated. This structure is independent of the semantics of the objects. Sub-
clause 7.5, "File Structure," describes the file structure. Sub-clause 7.6, "Encryption," describes a file-level
mechanism for protecting a document’s contents from unauthorized access.
• Document structure. The PDF document structure specifies how the basic object types are used to
represent components of a PDF document: pages, fonts, annotations, and so forth. Sub-clause 7.7,
"Document Structure," describes the overall document structure; later clauses address the detailed
semantics of the components.
• Content streams. A PDF content stream contains a sequence of instructions describing the appearance of
a page or other graphical entity. These instructions, while also represented as objects, are conceptually
distinct from the objects that represent the document structure and are described separately. Sub-clause
7.8, "Content Streams and Resources," discusses PDF content streams and their associated resources.
Looks like navigating a PDF file will require a little more than a passing effort.
If You want to parse PDF using Python please have a look at PDFMINER. This is the best library to parse PDF files till date.
Didier have a tool to parse the PDF:
http://didierstevens.com/files/software/pdf-parser_V0_4_3.zip
or here:
http://blog.didierstevens.com/programs/pdf-tools/ which cataloged several related pdf-analysis tools.
Another tool is here:
http://mshahzadlatif.wordpress.com/2011/09/28/view-pdf-structure-using-adobe-acrobat-or-a-free-tool-called-pdfxplorer/
Extracting text from PDF is a hard problem because PDF has such a layout-oriented structure. You can see the docs and source code of my barely-successful attempt on CPAN (my implementation is in Perl). The PDF data structure is very cool and well designed, but it's easier to write than read.
One way to get some clues is to create a PDF file consisting of a blank page. I have CutePDF Writer on my computer, and made a blank Wordpad document of one page. Printed to a .pdf file, and then opened the .pdf file using Notepad.
Next, use a copy of this file and eliminate lines or blocks of text that might be of interest, then reload in Acrobat Reader. You'd be surprised at how little information is needed to make a working one-page PDF document.
I'm trying to make up a spreadsheet to create a PDF form from code.
You need the PDF Reference manual to start reading about the details and structure of PDF files. I suggest to start with version 1.7.
On windows I used a free tool PDF Analyzer to see the internal structure of PDF files.
This will help in your understanding when reading the reference manual.
(I'm affiliated with PDF Analyzer, no intention to promote)
To extract text from a PDF, try this on Linux, BSD, etc. machine or use Cygwin if on Windows:
pdfinfo -layout some_pdf_file.pdf
A plain text file named some_pdf_file.txt is created. The simpler the PDF file layout, the more straightforward the .txt file output will be.
Hexadecimal characters are frequently present in the .txt file output and will look strange in text editors. These hexadecimal characters usually represent curly single and double quotes, bullet points, hyphens, etc. in the PDF.
To see the context where the hexadecimal characters appear, run this grep command, and keep the original PDF handy to see what character the codes represent in the PDF:
grep -a --color=always "\\\\[0-9][0-9][0-9]" some_pdf_file.txt
This will provide a unique list of the different octal codes in the document:
grep -ao "\\\\[0-9][0-9][0-9]" some_pdf_file.txt|sort|uniq
To convert these hexadecimal characters to ASCII equivalents, a combination of grep, sed, and bc can be used, I'll post the procedure to do that soon.