Is there a way to force line-breaks in pdf acro fields? - pdf

As the title states, I am currently struggling to force line-breaks ("\n" respectively "\r\n") in a pdf-acro-field. I am using OpenPdf and FlyingSaucer (see dependencies for details) and managed to enable multi-line and rich-text of fields by setting the bit-flags 13 and 26 of the "Ff" property.
I was using the PDF-Reference ISO-3200_2008 as a guide (starting on page 442).https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
Issue
I read an existing PDF and fill the acro fields programmatically. Something like this:
acroFields.setField("address", content);
The content is just a simple string with some newline characters. Nothing special.
My text is too long and will have to break at some point. When multiline is enabled, the text will break, but only if it reaches the end of the line. Basically, what I want to display is something like an address and some additional information on top. But not all lines are equal in size and for example I want to have information 1 and 2 on the same line and info 4 and 5 each on separate lines.
What I had in mind
<h2>What I want</h2>
<pre>Company XYZ
A brand here</pre>
<pre>Another Section
With some more info that goes further to the right
phone, email, website</pre>
<br>
<h2>What it looks like</h2>
<p>Company XYZ
A brand here
Another Section
With some more info that goes further to the right
phone, email, website</p>
Questions
Is there a way to implement line-breaks in PDF in general?
Is there a way to implement line-breaks programmatically?
Can I do this in OpenPdf or FlyingSaucer?
If there is no simple way - what would be a viable workaround?
Dependencies
compile group: 'com.github.librepdf', name: 'openpdf', version: '1.3.3'
compile group: 'com.googlecode.juniversalchardet', name: 'juniversalchardet', version: '1.0.3'
compile 'org.xhtmlrenderer:flying-saucer-core:9.1.18'
compile 'org.xhtmlrenderer:flying-saucer-pdf-openpdf:9.1.18'

It did not work, because the content (input coming from parsed html file) had \r\n in it, but somehow it does only work with \n so I did the following:
content.replaceAll("\\\\r\\\\n", Chunk.NEWLINE.getContent()); // 2nd param is just "\n"
This fixed my issue.

Related

How do I replace part of a string with a lua filter in Pandoc, to convert from .md to .pdf?

I am writing markdown files in Obsidian.md and trying to convert them via Pandoc and LaTeX to PDF. Text itself works fine doing this, howerver, in Obsidian I use ==equal signs== to highlight something, however this doesn't work in LaTeX.
So I'd like to create a filter that either removes the equal signs entirely, or replaces it with something LaTeX can render, e.g. \hl{something}. I think this would be the same process.
I have a filter that looks like this:
return {
{
Str = function (elem)
if elem.text == "hello" then
return pandoc.Emph {pandoc.Str "hello"}
else
return elem
end
end,
}
}
this works, it replaces any instance of "hello" with an italicized version of the word. HOWEVER, it only works with whole words. e.g. if "hello" were part of a word, it wouldn't touch it. Since the equal signs are read as part of one word, it won't touch those.
How do I modify this (or, please, suggest another filter) so that it CAN replace and change parts of a word?
Thank you!
this works, it replaces any instance of "hello" with an italicized version of the word. HOWEVER, it only works with whole words. e.g. if "hello" were part of a word, it wouldn't touch it. Since the equal signs are read as part of one word, it won't touch those.
How do I modify this (or, please, suggest another filter) so that it CAN replace and change parts of a word?
Thank you!
A string like Hello, World! becomes a list of inlines in pandoc: [ Str "Hello,", Space, Str "World!" ]. Lua filters don't make matching on that particularly convenient: the best method is currently to write a filter for Inlines and then iterate over the list to find matching items.
For a complete example, see https://gist.github.com/tarleb/a0646da1834318d4f71a780edaf9f870.
Assuming we already found the highlighted text and converted it to a Span with with class mark. Then we can convert that to LaTeX with
function Span (span)
if span.classes:includes 'mark' then
return {pandoc.RawInline('latex', '\\hl{')} ..
span.content ..
{pandoc.RawInline('latex', '}')}
end
end
Note that the current development version of pandoc, which will become pandoc 3 at some point, supports highlighted text out of the box when called with
pandoc --from=markdown+mark ...
E.g.,
echo '==Hi Mom!==' | pandoc -f markdown+mark -t latex
⇒ \hl{Hi Mom!}

ways to separate passages in pdf using gap?

I have some pdf's with 2-3 passages for every page. every passage is separated by some line gap, but while reading with pymupdf, I cannot see any machine printable separator between passages. is there any other way, other library can do this?
code:
import fitz
from more_itertools import *
doc = fitz.open('IT_past.pdf',)
single_doc = doc.load_page(0) # put here the page number
text=single_doc.get_text('text')
text
page screen shot:
enter image description here
pdf
Full pdf
There is no gap as such, just for the moment as its much easier, lets look closer in your linked viewer rendering :-
   
So lets replicate what is inside the real PDF (that has no web side html <p> markers) :-
support, product design, HR Management, knowledge process outsourcing for
pharmaceutical companies and large complex projects.
Software exports make up 20 % of India's total export revenue in 2003-04, up from 4.9 %
in 1997.This figure is expected to go up to 44% of annual exports by 2010. Though India
See there is "no gap" just left aligned non justified (ragged) text that needs a style such as a font name and stretched out locations added to hold in a page de-void of line feeds nor true carriage returns. (occasionally there are some backspace or vertical/horizontal moves but generally meaningless in line printer text). Even "Tabs" "Indents" and some spatial characters are normally discarded in a PDF printout.
If you want gaps or line-wrap you need to add them.
A good alternative is export the -layout using poppler or xpdf here to - (console) or pipe it or replace that with a path/name.txt, many other options available like -nopgbrk
xpdf-tools-win-4.04\bin32>pdftotext -f 1 -l 1 -layout IT_past.pdf -

How to shorten the first line in the Intellij console?

I want my first line at the top of the console to look like this:
, but mine looks like this, and it's long:
How can I shorten it to like the first picture?
I'm watching Java tutorials on youtube and their first lines ("C:\programs Files\Java\jdk\whatever"...) are always so short and pretty with 3 cute dots in the end, but mine is long and annoying.
You can define the lines which should be folded:
Right click on that line in the console
Choose "Fold Lines Like This"
Click OK
You can also define special keywords/phrases in this menu, where lines, which contains the phrase will be folded with ellipsis (...) at the end.
See the ConsoleViewImpl.java file in the source code of IntelliJ IDEA
This line (execution command) will be automatically hidden in case the string length is more than 1000 symbols.
So, once you add something to the execution command (for example by adding libraries), the execution command is folded automatically.

How can I replace a single PDF page using Imagemagick?

I have a multi-page PDF document, and I want a version of the document with one of the pages in the middle with a scanned-in copy, as I needed a physical signature in the document. How can I do this with ImageMagick? I'm aware that ImageMagick may not necessarily be the best tool for the job. However, the resulting PDF does not need to be high quality or a high fidelity copy, so it should be sufficient for my needs.
As a specific example, I have a 9 page my-file.pdf, and I want to create a copy of the PDF with the 8th page replaced with page-8.png. It looks like I should be able to achieve this goal with the convert tool, though it's not immediately obvious what the syntax would be. How can I achieve this goal?
If I merely wanted to append the new page to the end of the file, I know I can do the following:
convert my-file.pdf page-8.png output-file.pdf
However, this end up with the original pages 1-9, then the new page 8. What I actually want is to replace the original page 8 with the new page 8. My desired output is:
[original pages 1 - 7],[new page 8],[original page 9]
A specific page or range of pages can be specified using the bracket syntax with zero-based indexing. For instance, [8] will refer to the ninth page, and [0-6] to the first seven pages. Using this, a duplicate of the PDF with the 8th page replaced can be achieved as follows:
convert my-file.pdf[0-6] page-8.png my-file.pdf[8] output-file.pdf
As indicated in the question, as well as a comment, Imagemagick is not a great tool for this task, as it rasterizes the output of the PDF. An alternate solution that avoids this would be to use pdftk to replace the page. While the inserted page will still be an image, the replaced pages will not be rasterized.
First, save the scanned-in page as a PDF using an application capable of this. Then, use the pdftk cat operation to combine the PDFs:
pdftk A=my-file.pdf B=page-8.pdf cat A1-7 B A9 output output-file.pdf

Intellij - Reformat Code - Insert whitespace between // and the comment-text?

I am working with another human being on project from that the professor expects to have uniform code-style. We have written large separate junks of code on our own, in which one has written single line comments without a white-space between the single-line-comment-token and the other one has inserted a white-space. We are working with IntelliJ and have failed to find an option to enable the Reformat Code function, to insert a white-space.
TLDR:
Can you tell us how to convert comments from that to this in IntelliJ?
// This is a load bearing comment - don't dare to remove it
//This is a load bearing comment - don't dare to remove it!
You can do a global search and replace (ctrl-shift-r on windows with default keyboard layout, or Replace in Path under the Edit/Find menu).
Check the regular expression option and enter //(\S.*) as the text to find and // $1 as the replacement. Check the whole project option, and clear any file masks. You can single step through the replacements, or simply hit the All Files option.