How to use PDFMarkedContentExtractor class from pdfbox library?

I use the pdfbox library to extract text from arbitrary PDF files, and I want to know how I can extract some particular text from a PDF using this library.
As I understand it, I should use the marked content feature for this task.
There is the PDFMarkedContentExtractor class. Using its getMarkedContent method I can get a PDMarkedContent object, and then, by using the getContents method, I can get the actual content that I need.
Am I right?
If so, how can I specify which document PDFMarkedContentExtractor should use as its source?

To my understanding, PDFMarkedContentExtractor is used specifically for tagged content within the PDF. Based on your comments and your description, I believe you just want to extract the text in general. If that's the case, I believe you need to use PDFTextStripper instead.

Related

How can I define strings in a resource file and use them in QML?

I want to define a number of XML-formatted strings in a resource file and use the strings in QML code. Is it possible? How do I do it? I would really appreciate any example.
As far as I know, there is no way to store strings directly in a resource file, but you can still achieve this another way.
First way: create a language file with the Qt translation tool. As a bonus, you can store the strings in several languages.
Using it in QML is very easy:
Text {
    text: qsTr("myTextId")
}
See that link for more info.
Second way: store each string in a separate resource file.
In this case, however, you need to extend QML with a C++ plugin to be able to read files.
See that link for more info.

Two different fonts in one inline object while creating PDF

Is it technically possible to use two different fonts in the same DrawHTMLTextBox while using Debenu Quick PDF Library 10?
Is it possible with any other library that can be used in a PHP project (not preferred)?
Currently it is not possible to use two different fonts in the string that you pass to the DrawHTMLTextBox function in Debenu Quick PDF Library. If you want to use a different font for different parts of the string you'll need to use DrawHTMLText instead and change the font using SetHTMLNormalFont prior to each section of the string being drawn.
With this method you'll need to keep track of the width and height of the text you're drawing yourself, but you can do that using the GetHTMLTextHeight, GetHTMLTextWidth and GetHTMLTextLineCount functions.

Format month/year in Ektron calendar header

Version 8.0.1 SP1
Our client would like us to reformat the month/year in the calendar header. See attached image. They want "April, 2012" instead of the abbreviated "Apr, 2012". Where is this specified? I have looked at the webcalendar objects, CSS files, and XSLT files.
Any suggestions?
I haven't made that particular change, but I have faced similar requests. Your options are:
Use JavaScript (jQuery) to reformat on the client side.
Dig around in the workarea looking for formatting code, though usually the formatting code is locked up in a DLL.
DisplayXslt or EkMarkup - some controls allow you to apply XSLT or EkMarkup (Ektron markup template) for custom display. You create an XSLT, and apply it via the DisplayXslt property on the control.
Discard the control, and use the APIs to drive your own custom control.
Also, this Ektron forum post suggests having a look at this file: workarea/webcalendar/xsl/default.xsl
If this file exists (I don't have an Ektron installation currently handy), then it is likely you can use the DisplayXslt method described above.

In iOS, how can you programmatically fill out a pdf form field?

I need to take an existing pdf file and programmatically fill in a list of form fields with text and then save the pdf without ever displaying it to the user.
For instance, if the pdf file contains fields named "LastName", and "FirstName" I would like to set the value of "FirstName" to "Louis" and then save the file.
I've been searching for a long time and can't find any guidance on even where to start since the iOS documentation (and most of the questions on here) seem geared towards displaying or creating pdf content instead of modifying it.
EDIT:
My main question is: Is it possible to open a pdf stream (I know how to do this) and copy each existing pdf dictionary item into a new pdf? I have not been able to find a way to write the dictionary items to a pdf.
I doubt that kind of functionality will ever be in the iOS frameworks. The reason most of the related info you can find "seem[s] geared towards displaying or creating pdf content instead of modifying it" is because that's what the vast majority of use cases will want or need for PDF functionality.
You'll need to find a 3rd party library that can open up PDFs, fill out the AcroForm fields, and then stamp out a PDF. I use iText on Java (there is also iTextSharp for C#) but I don't know of anything for Objective-C.
Once you find that library, you'll need to integrate it into your project. There are undoubtedly several related questions/answers here on SO for whatever version of the SDK you're using.
This would be easier to do with an HTML page. If you wish to use an HTML page instead of a .pdf form, then this is how you would go about doing it:
[field1 setText:@"Field 1 Text"];
[field2 setText:@"Field 2 Text"];
NSString *result;
// Copy each native field's text into the matching HTML input, then submit the form
result = [webView stringByEvaluatingJavaScriptFromString:[NSString stringWithFormat:@"$('#field1').val('%@');", field1.text]];
result = [webView stringByEvaluatingJavaScriptFromString:[NSString stringWithFormat:@"$('#field2').val('%@');", field2.text]];
result = [webView stringByEvaluatingJavaScriptFromString:@"$('#submitbutton').click();"];
You would need to create two UILabels or UITextFields and call them "field1" and "field2" in your .h file. You can then find the ID of the HTML field you need to fill in and put it where I have put "#field1", and do the same for the second field where I have put "#field2". There also needs to be a UIWebView with the page already loaded, because this code must run after the UIWebView has finished loading the page. Maybe do the following:
-(void)webViewDidFinishLoad:(UIWebView *)webView {
// Insert above code here
}
You probably need a good understanding of JavaScript if you want to do this for the whole form, but this should get you started.
Hope that helps you.

Objective-c code formatter site to create html that can be embedded into a blog

I'm looking for a site similar to http://www.manoli.net/csharpformat/, which lets you paste in a C# code snippet and formats it as HTML, with a CSS file, that you can post to your blog.
I need one that actually does this for Objective-C.
You want the GeSHi (Generic Syntax Highlighter) library. It is excellent, supports dozens of languages (including Objective-C, with the ability to automatically linkify classes/protocols to the documentation), and has support for many popular CMSs (Django, WordPress, Drupal, Joomla, Mambo, etc).
If you'd like to see it in action, you can check out nearly any wiki page on our local CocoaHeads website. For example: http://cocoaheads.byu.edu/wiki/different-nslog
Assuming you're on a Mac, copying code from Xcode will keep the syntax coloring. Any WYSIWYG blog editor should support that.
In case your blog software isn't WYSIWYG, you can paste into TextEdit and save as HTML. It outputs pretty crappy HTML considering it's just highlighted source code, but it's nonetheless compliant HTML.
Other than that, I don't know of an online service for that.
I use pygments (Python) to generate syntax highlighting for source code examples embedded in my blog.
If your entry text is just the source code, it will work the same way for what you are after. I tested it and it highlights Objective-C as well.
I actually write plain-text blog posts in markdown syntax in a file and paste in plain-text code examples. Then I run the file through a markdown processor, which uses pygments for the highlighting, and store the result in a file.
It's as simple as:
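import markdown
# text holds the markdown source; the codehilite extension uses pygments for highlighting
html = markdown.markdown(text, extensions=['codehilite'])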
See the simple script at the link, which just takes the file name of your plain-text file and creates an HTML file.
Then I can copy/paste the resulting HTML into the blog.
You have to link to or copy the CSS as well to get the syntax highlighting, but that's easy.
I do this for Blogger; see the example of how to use markdown with pygments to do syntax highlighting.
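If your post is nothing but source code, you can also skip markdown and call pygments directly. Here is a minimal sketch, assuming the pygments package is installed. The Objective-C sample and the variable names (source, html_fragment, css_rules, page) are made up for illustration; highlight, ObjectiveCLexer and HtmlFormatter are pygments' own names.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import ObjectiveCLexer

# Hypothetical Objective-C snippet to be posted on the blog
source = '''#import <Foundation/Foundation.h>

int main(void) {
    NSLog(@"Hello, blog!");
    return 0;
}
'''

# The formatter wraps the output in <div class="highlight">...</div>
formatter = HtmlFormatter(cssclass="highlight")
html_fragment = highlight(source, ObjectiveCLexer(), formatter)

# CSS rules for the same class; without them the HTML has no colors
css_rules = formatter.get_style_defs(".highlight")

# One pasteable chunk: inline stylesheet followed by the highlighted code
page = "<style>\n" + css_rules + "\n</style>\n" + html_fragment
print(page)
```

The CSS only needs to go into the blog template once; after that, pasting html_fragment on its own into each post should be enough.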