I currently have a script which processes .exr files. When you open an .exr file manually, you are given the option of opening it with transparency or with alpha. However, when scripting the opening of an .exr you are given no such option. There is no OpenOptions object like there is for, say, PDF, and as far as I can tell there is no code generated by the listener that dictates the choice between transparency and alpha. Additionally, this choice does not seem to be captured by a recorded open action.
My question is: has anyone figured out a way, in CS6 or in CC, to choose automatically whether an .exr file loaded through scripting is opened with alpha or with transparency?
Answering my own question.
As far as I can tell there is no way to script any behavior relating to opening EXR files with alpha/transparency. I worked around this with the ProEXR plugin, specifically the free EZ version, which I set to always open .exr files with alpha. It's very disappointing that even in CC .exr files lack any sort of scriptable options when opened. Hopefully Adobe will fix this in future versions.
Link to the plugin. The installer zip includes the free version. You can bring up the default options if you hold Shift when opening a file. I hope this helps someone else who may find themselves needing to interact with .exr files through Photoshop scripting.
For anyone who stumbles upon this, here's a little history on the subject of Photoshop EXR format implementation (specifically about this alpha split issue):
https://forums.adobe.com/thread/369637
The gist of it is that Adobe developers work with "straight" (unpremultiplied) alpha, which means transparency is a property of the alpha channel. The majority of visual effects software developers use a "premultiplied" alpha workflow, in which the alpha can represent anything; crucially, this is often used to represent objects that have both transparency and brightness, such as a candle flame.
An update on the answer from the asker - ProEXR is now open source, and there is an additional open source alternative called EXR-IO. Both work very well, and currently have slightly different feature sets.
I am trying to make a dynamic PDF generator as an .NET Core API. I want to take an existing PDF, or .docx file, and edit it so it replaces the current name (John Doe) with something that can be replaced like #NAME_PLACEHOLDER.
I then want to transform #NAME_PLACEHOLDER -> John Doe (or whatever is in the KeyValuePair or Dictionary<string, string>).
I am running this on a Docker environment, so I can easily execute commands and I am willing to do that as well.
So far I have tried a few things:
1) pdf2htmlEX
Executes as pdf2htmlEX file.pdf
Does the job pretty well
Can be converted back to PDF using Google Chrome headless or similar
Problem: Only the characters used in the PDF can be used to replace. So if I only use A, B, C as characters, it will make D into Times New Roman (or default font)
2) LibreOffice ODT to PDF
This was pretty nice, because I could simply unzip the .odt file, open content.xml, search and replace, then save it as an .odt file again
Could be converted into PDF rather easily using soffice --convert-to pdf
LibreOffice is quite nice
Problem 1: Microsoft Word -> Save as ODT tends to break the formatting, so we have to use LibreOffice to go and change it back again
Problem 2: We don't want to move away from Microsoft's Office suite
3) HTML to PDF using Chrome Headless
What you see is what you get
By far the best option, if we're all developers aaand have unlimited time
Problem 1: Only our developers can make changes, since our marketing department do not know HTML
Problem 2: Our existing PDFs would have to be rewritten in HTML
As you can see, I have tried a bunch of things. None of them, except Chrome Headless, has lived up to my expectations. What I really like about #3 is what you see is what you get. I can make the whole thing in HTML, press CTRL+P and see what it looks like as a finished PDF, basically.
I am looking for a better solution, though. It can be paid. It can be free. All I need is to change out words/phrases with other words dynamically, which apparently seems like a tough thing to do.
Thanks for clearly specifying what you've already found. It helps a lot in providing a succinct answer.
The conversion is always tricky - I'm sure you know Word has trouble displaying/editing some Word documents itself.
I have experience regarding point #2 "LibreOffice ODT to PDF" and can suggest a few things to test:
Don't use Microsoft Word to do the docx->odt conversion. It's not good, as you know. Use LibreOffice itself to do this step. The rest of your process remains the same.
For some documents, LibreOffice handles doc->odt much better than docx->odt. So you can work with the DOC format instead and get a better result without any other changes.
You won't be able to remove the devs from the process, but you can certainly reduce their role allowing your business/marketing teams to have more direct input simply by:
get the starting point document to the devs to run through the conversion process. The devs can "clean up" the document to make it convert nicely.
make this version of the document the "official" starting point. The business or technical teams can load it, adjust it, and put it back into the process.
if possible, expose a test-platform to the business teams so they can download, adjust, upload and render to PDF. This cycle means they will be able to achieve more and if they're good, do impressive stuff without any dev input.
the above steps simply mean don't expect perfect conversion of arbitrary complex documents. Starting from a (even complex) working baseline is great.
Some of that might show you that your #2 is actually going to get the best overall results.
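For reference, the unzip / search-and-replace / re-zip step from #2 can be sketched in Python roughly as below. This is only a sketch under assumptions: `fill_odt_template` and the file names are made up for illustration, the placeholders are assumed to appear unbroken in content.xml (the editor can split a run of text across several XML elements, in which case a plain string replace will miss it), and the values are XML-escaped before insertion.

```python
import zipfile
from xml.sax.saxutils import escape


def fill_odt_template(src_path, dst_path, replacements):
    """Copy an .odt file, replacing placeholders inside content.xml.

    `replacements` maps placeholder text (e.g. "#NAME_PLACEHOLDER")
    to the value that should appear in the output document.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        # Iterate in the original order so "mimetype" stays the first entry.
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "content.xml":
                text = data.decode("utf-8")
                for placeholder, value in replacements.items():
                    # Escape &, < and > so the result is still valid XML.
                    text = text.replace(placeholder, escape(value))
                data = text.encode("utf-8")
            # The ODF package format expects "mimetype" to be stored uncompressed.
            if item.filename == "mimetype":
                compress = zipfile.ZIP_STORED
            else:
                compress = zipfile.ZIP_DEFLATED
            dst.writestr(item, data, compress_type=compress)
```

Something like `fill_odt_template("template.odt", "out.odt", {"#NAME_PLACEHOLDER": "John Doe"})` followed by `soffice --headless --convert-to pdf out.odt` would then give you the PDF.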
I hope that helps.
To whom it may concern:
My boss and I are on GitLab and we have problems trying to differentiate between dynamic PDF files.
Normally, for code files like C# class .cs files, it's easy to double-click and have GitLab highlight the changes made between two different versions.
However, we also create dynamic XFA/PDF files in Adobe LiveCycle, and it's difficult to tell what has been changed at times, especially if the commit messages are not specific enough or too vague. We know people have suggested taking screenshots of the PDF for each version, but you can't diff text changes or format changes on image files.
We tried the program DiffPDF found here:
http://www.qtrac.eu/diffpdf.html
But we found out that it does not work with XFA/dynamic forms.
Does anyone have any suggestions on any possible programs that can diff the actual content on PDFs in GitLab?
Thank you for your time and future advice.
I agree that it is hard to see what has been changed in a PDF, but if you look at the XDP file instead, you will be able to see what code has been changed.
If that is possible for you.
I have to extract text from PDF files of invoices and bills.
The file layouts can get complex, though they are mostly filled with tables.
I've already read a few dozen articles about the PDF format, how easy it is for our brain to grasp it and how hard it is for a machine to understand its structure.
I also downloaded a few tools, like Python's pdfminer and some Java tools. Some even have rule-based layout extraction, like LA-PDFText. These are all great libraries, but they leave the final step to you.
Adobe also has an online service called exportPdf but it can't be customized
Bottom line, I understand that in order to extract text from structured pdf files and convert it to XML for example, there should be some level of manual work.
I also found A-PDF Form Data Extractor, a non-free tool with the ability to set extraction rules that claims to do the job, though it's hard to find a proper manual and it runs only on Windows.
I thought I might even try to convert those files to images and run tesseract-ocr on them, but decided to ask for advice here before I spend more time on it.
I'll be very grateful if someone with such experience gives me a hint.
I've done a lot of PDF extraction and I can confirm, as you've already discovered, that it can be a painful process to start. One of the important things to understand is that there is no concept of "tables" within a PDF, just text that happens to have lines around it. Also, there's no guarantee that the linear order of text within the PDF code actually matches the visual order when printed. In other words, there's no guarantee that "hello world" is written in that order; it could be draw 'world' at coord 20, then draw 'hello' at coord 10. Most PDF creators don't do this, but still there's no guarantee. The more creative a PDF creator is (InDesign, Illustrator, etc.), the harder the text is likely to be to get out. And actually, once a designer starts messing with fonts too much, some programs will output words one character at a time, changing the font just slightly each time.
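To illustrate the ordering problem: if your extraction library gives you text fragments with coordinates, you can rebuild a plausible reading order by grouping fragments into lines by y and sorting each line by x. The sketch below assumes hypothetical (x, y, text) tuples with y increasing up the page, as in default PDF user space; `reading_order` and `line_tolerance` are names I made up, not part of any library.

```python
def reading_order(fragments, line_tolerance=2.0):
    """Return text lines, top to bottom, from (x, y, text) fragments.

    Fragments whose y differs from the first fragment of the current
    line by at most `line_tolerance` are treated as the same line.
    """
    lines = []  # each entry: (y of the line's first fragment, [(x, text), ...])
    # Visit fragments from the top of the page (largest y) downwards.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append((y, [(x, text)]))
    # Within each line, order the fragments left to right.
    return [" ".join(t for _, t in sorted(parts)) for _, parts in lines]


# 'world' is drawn before 'hello', but sits to the right of it.
fragments = [(20, 700, "world"), (10, 700, "hello"),
             (10, 680, "second"), (50, 681, "line")]
print(reading_order(fragments))  # ['hello world', 'second line']
```

This won't cope with multi-column layouts or rotated text, but it shows why you need the coordinates and not just the order in the content stream.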
That said, I'd recommend the first one that you looked at, LA-PDFText. You can run it in discovery mode (blockify) from which you can create rules. I don't have Java installed anymore so I can't test it but it seems very promising.
Your second one, A-PDF Form Data Extractor, only really works with actual PDF forms. If this is your case I'd recommend just using an open source solution like iText/iTextSharp.
The last OCR one makes me cringe. I just can't imagine going through those hoops would get you better text representation than parsing the PDF. But then again, PDF is a visual format so maybe it would.
Personally I use iText/iTextSharp for this kind of thing but I also like to do things the hard way.
It is not clear whether you are looking for a development tool to automate data extraction from bills and invoices, or just a one-time utility that a non-developer can use.
Anyway here are some specialized tools including engines they use:
Tabula (open-source, especially designed to extract data from tables in PDF. Can export shell scripts for batch processing, runs as the localhost web service, powered by JRuby Tabula engine)
Viet OCR (open-source .NET desktop utility for text extraction from PDF and images, based on the Tesseract OCR engine)
Bytescout PDF Viewer (freeware closed source .NET utility, detects and extracts tables, including scanned invoices, powered by PDF Extractor SDK)
DISCLAIMER: I work for ByteScout.
How do I include and use new fonts in a wxWidgets project?
I am using VS2005.
I just want to print text using a new TTF font.
Thanks in advance!!
Unless you're willing to link against something like FreeType:
http://en.wikipedia.org/wiki/FreeType
...most any program is going to require the font to be installed to the operating system, by the user or by some OS-specific script. You can't just load it by filename off the cuff in your app.
Because of the platform dependence of naming and accessing custom fonts, the path of least resistance is not to try and hardcode a font...but to let the user pick one out of a dialog. You would use a wxFontDialog for this:
http://docs.wxwidgets.org/stable/wx_wxfontdialog.html
It will let you retrieve the wxFontData, from which you can get the chosen wxFont:
http://docs.wxwidgets.org/stable/wx_wxfontdata.html#wxfontdatagetchosenfont
Once you have that, you can save and reload an identity of the font via the native string interface:
http://docs.wxwidgets.org/stable/wx_wxfont.html#wxfontgetnativefontinfodesc
Trying to formulate these strings on your own or work with the "face name" is a little dodgier:
http://docs.wxwidgets.org/stable/wx_wxfont.html#wxfontsetfacename
Generally speaking a lot of the same problems arise here as dealing with fonts in HTML. If you have a very specific idea about the cross-platform appearance of some text, your best bet is often to make an image out of that text and use that instead of going through the hoops to get the font you want in the app. If you're more flexible and have a lot of text the user is interested in, then they may be interested in changing the font too. So just use a default but offer the user a choice to pick anything they want which is installed on their system.
(Note: I personally consider the handling of fonts in pretty much every OS or document system to be a disgrace. Imagine a world where in order to get a graphic to display in your program you had to register it with the operating system through a complex process and it would not copy from machine to machine when you copied a document in which it was embedded. We're dealing now with graphics that are orders of magnitude larger than font files, and yet they are handled seamlessly while people seem to accept the lack of seamless font transfer as "normal". Archaic DRM mindsets of font vendors is one side of the problem, but lame technology is another big component.)
I currently use Programmer's Notepad with the Rebol syntax scheme. It's not bad--does any insightful person have another suggestion?
For my Windows programming work I use the Zeus editor, but I'm not sure if it does Rebol?
Another windows option is TextPad. It is commercial but it is quite a useful editor.
There are 2 Rebol syntax files available from the official site
http://www.textpad.com/add-ons/synn2t.html
I also wrote a TextPad syntax file generator and uploaded it to rebol.org
http://www.rebol.org/view-script.r?script=textpad-syngen.r
It is probably quite easy to modify this script to support other editors.
vim.
Especially with the following binding in your _vimrc/.vimrc:
nnoremap <Leader>fr :w<CR>:silent ! %<CR>
In normal mode, <Leader>fr saves your current file and executes it (fr is a mnemonic for 'fast-run'):
:w<CR> saves the current file
:silent executes without messages; ! opens a shell; % expands to the current file name; <CR> is Enter
Leader is usually the \ key; I have this mapped to the spacebar. In case anyone is interested in how to do that, post a comment.
Programmer's Notepad is better than Crimson Editor, with code folding and great project management
http://www.pnotepad.org/
It's open source, so you can even modify it in C++
For Windows, there is Crimson Editor or E with the REBOL bundle.
For Mac, there is TextMate.
Emacs, I believe, has a REBOL syntax too.
Sublime Text is a really nice Windows editor (commercial, but reasonably priced) that supports TextMate configurations (well, at least for syntax and snippets) so if you manage to get a REBOL bundle from somewhere, you can use it with this.
SciTE also has REBOL syntax coloring support because the Scintilla editor component it's based on includes this.
Notepad++ should also support REBOL syntax coloring, being Scintilla based, but as it is currently distributed, the support is not compiled in. If you're so inclined, you could probably compile it yourself and add the support back in. It might be worth it because Notepad++ is quite a good editor too.
I can't include proper links because I don't have enough rep, but this should do:
www.sublimetext.com
www.scintilla.org/SciTE.html
www.scintilla.org/index.html
notepad-plus-plus.org
http://rebol.wik.is/index.php?title=Notepad%2b%2b
which is a REBOL plugin for Notepad++
I use JEdit which not only has REBOL syntax highlighting but also auto-indenting. It has most of the features you'd expect from a text editor (e.g. block selection, configurable keyboard shortcuts).
There are versions for Windows, Mac OS X and Linux, so if you choose to work cross-platform you won't need to learn a new editor. The web page is jedit.org
I use UltraEdit.
with its advanced project, syntax highlighting, macro, command-line control and total keyboard shortcut configuration, per language and project, you can program the editor to do just about all of what you need at a single click of the mouse, or keyboard.
my setup starts rebol on any file, and assigns a launch "default project script" to a shortcut, so wherever I am in the files, I still launch the project's relevant script. change project, and it will run that new project's scripts. another key for unit tests, another key for "find in all opened files", etc.
also, the actual text editing, when combined with a few macros which create functions, objects, and more using the clipboard and "currently highlighted" text, makes it much faster than any visual IDE, including MSVC.
ultra edit itself has thousands of other advanced features, and they all work... really they do.
I've tried other editors and they always fall short when I start to push them.
yeah, you have to buy it... but it's cheap (like one or two hours of your life salary ;-)
so considering you might use it for several months or years... it's a cheap investment.
also, ultra edit is now released on linux and the mac port is just around the corner.
I have used EditPlus for several years; it is not free but not expensive. It has a Rebol syntax highlighting file (downloadable from its web site).
It is especially useful and very fast if you work with huge files (over 100 MB) or with lots of files (say 300 files); find & replace takes a second.
For syntax highlighting and a simple autocomplete, you can use http://komodoide.com/komodo-edit/.
It's free and open source with several nice features, including folder browsing while editing, which I personally find very useful.
There is also a bunch of other languages supported in case you want to take a closer look and give this editor a chance.