I'm working with python-pptx to create a portfolio of candidates in a PowerPoint presentation. There is one candidate per slide, and each of them has provided information about themselves such as name, contact details and a mini-bio (the problem I'm here to solve).
The text_frame, created with fixed height and width values, must fit on the slide but must also contain the full length of every mini-bio, which is not happening.
With a long phrase (>200 characters at font size 12), the text exceeds the size of the text box and runs off the slide, so in presentation mode or in a PDF export the overrun text is lost.
Is there any way to confine the text to the shape/size of the text_frame? (It would be extra helpful if the solution doesn't change the font size.)
I just found a parameter that led me to the answer.
When you create a text box with slide.shapes.add_textbox() and get its text_frame, setting text_frame.word_wrap = True keeps the text inside the dimensions of the text box.
The code shows it better:
from pptx.util import Cm

# create a text box with add_textbox(left, top, width, height)
txBox = slide.shapes.add_textbox(Cm(16), Cm(5), Cm(17), Cm(13))
tf = txBox.text_frame
# wrap text at the box width instead of letting it run off the slide
tf.word_wrap = True
Before word_wrap parameter
After word_wrap parameter
The short answer is "No". PowerPoint is a page-layout environment, and much like the front page of a newspaper, text "story" content needs to be trimmed to fit the allotted space.
We're perhaps not used to this because word-processing, spreadsheet, and web-page content is "flowed" into a (practically) unlimited space, but the area of a PowerPoint slide is quite finite. Also, using it for large text blocks is somewhat of an off-label use. There is a certain amount of flexibility provided by reducing the font size, but not as much as one might expect. Even accommodating 20% additional text requires what appears as a pretty radical change in font size.
I've encountered this problem again and again, and the only solution I have ever seen work reliably is hand-curating the content to fit.
python-pptx has one experimental feature to address this but its operation has never been very satisfactory and it's tricky to get working. https://python-pptx.readthedocs.io/en/latest/api/text.html#pptx.text.text.TextFrame.fit_text
The business of fitting text is the role of a rendering engine, which python-pptx is not.
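Since hand-curating is the reliable fix, a rough pre-check can at least flag which mini-bios are likely to need trimming. The following is a minimal sketch that uses only the standard library, not python-pptx. It assumes every character is about half the font size wide and that lines are spaced at 1.2 times the font size. Both figures are illustrative guesses, the function names are hypothetical, and the text frame's internal margins are ignored, so treat the result as an estimate rather than a measurement.

```python
import textwrap

PT_PER_CM = 72 / 2.54  # points per centimetre


def estimated_lines(text, box_width_cm, font_pt, avg_char_em=0.5):
    """Roughly estimate how many wrapped lines `text` needs in the box."""
    # Assumption: each character is avg_char_em * font size wide.
    chars_per_line = max(1, int(box_width_cm * PT_PER_CM / (font_pt * avg_char_em)))
    return len(textwrap.wrap(text, width=chars_per_line))


def probably_fits(text, box_width_cm, box_height_cm, font_pt, line_spacing=1.2):
    """Return True if the estimated text height stays within the box height."""
    lines = estimated_lines(text, box_width_cm, font_pt)
    needed_pt = lines * font_pt * line_spacing
    return needed_pt <= box_height_cm * PT_PER_CM
```

For example, probably_fits(minibio, 17, 13, 12) checks a mini-bio against a 17 cm by 13 cm box at 12 pt. Anything that fails the check is a candidate for manual editing.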
Related
So basically I have a textbox with a specific width and I need to know if the string I will put into it will either fit nicely in one line or take a second line. For example: I have
string v = "WERTYUIOSDFGHJKWERTYUISDFGHJKXCVBNSDFGHJ"
and a textbox that's 3000 in width.
At first I tried: if v.length = x then... where x is the length of the string that can fit into the textbox. But I soon found out that strings with mostly 'I' can fit more inside compared to a string of mostly 'M'. And that is where the problem lies. Is there a function that detects if the string is going to take/need a second line?
Another option is to create or use a 3rd-party Crystal Reports UFL (User Function Library). A list of 3rd-party UFLs is maintained by Ken Hamady here.
At least one of these UFLs provides functions that allow Crystal formulas to either:
a. specify as input the text, font name, font size, bold and italics status and get the required width (in pixels or twips), or
b. specify the available width for the text and get the maximum font size to fit, or the number of lines required for a given font size.
The advantage of using a UFL approach is that Crystal supports dynamic property expressions. That means that the result of the function call can dynamically control a property such as font size, height, position, etc.
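To illustrate what functions (a) and (b) compute, here is a minimal sketch in Python rather than Crystal syntax. It is not the code of any particular UFL. The width table is a tiny illustrative stand-in for real font metrics (thousandths of an em per character), DEFAULT_WIDTH is an assumed fallback, and all function names are hypothetical.

```python
# Illustrative advance widths in thousandths of an em.
# Real values would come from the metrics of the font actually used.
CHAR_WIDTHS = {"I": 278, "M": 833, " ": 278}
DEFAULT_WIDTH = 600  # assumed width for characters missing from the table


def text_width(text, font_size):
    """Width of `text` on a single line, in the same unit as font_size."""
    units = sum(CHAR_WIDTHS.get(ch, DEFAULT_WIDTH) for ch in text)
    return units * font_size / 1000


def fits_on_one_line(text, font_size, available_width):
    """True if the text needs no second line at this font size."""
    return text_width(text, font_size) <= available_width


def max_font_size(text, available_width, largest=72, smallest=6):
    """Largest whole font size at which the text fits on one line, else None."""
    for size in range(largest, smallest - 1, -1):
        if fits_on_one_line(text, size, available_width):
            return size
    return None
```

With this table, ten 'I' characters are about a third as wide as ten 'M' characters at the same font size, which is why comparing string lengths alone cannot answer the question.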
So I'm trying to make an object detector for this company's forms, and we have labelled the images as shown in the example image I uploaded. My question is: should we draw more accurate boxes, or are they OK as they are, since the written part we are trying to detect could be bigger?
So what I'm asking is this: in the example image, the "Descripcion" (Description) part has just 2 lines of text, but it could have more. Should the box cover only the Description title plus the 2 lines, or should we stick to what we are doing now: title, plus the 2 lines, plus all the blank space that could have been filled with more lines?
It depends on what you really want to do with the detected boxes. What are the next steps? Can the next step (e.g. extracting the text) handle all the free space, or would it be better to label just the part where something is actually written?
Besides that, in your example I find that most boxes are too big. The form is more or less already split into boxes, and it could be better to make the boxes smaller and more accurate, e.g. the box around IMPORTE and some amount in €. I would label this more tightly, so the box contains only the information you actually want and nothing else.
But as I said it really depends on the next step the boxes should be used for.
Context
Writing code to format a chart (all of which should be done by Microsoft, but that’s separate).
Am now positioning the legend. Taking a 9×9 block of possible positions, and counting the data points underneath each. As a fragment of the code: (ax.MaximumScale - ax.MinimumScale) * co.Chart.Legend.Width / co.Chart.PlotArea.InsideWidth.
Also coping with lines underlapping and text boxes overlapping the possible legend positions: same idea, more complexity.
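For readers unfamiliar with the approach, the grid search described above can be sketched as follows. This is a Python illustration, not the VBA in question. It assumes the data points, the plot area and the legend size are all already expressed in the same axis units, and the function names are hypothetical.

```python
def count_points_under(points, rect):
    """Count (x, y) points inside rect = (left, bottom, width, height)."""
    left, bottom, width, height = rect
    return sum(
        left <= x <= left + width and bottom <= y <= bottom + height
        for x, y in points
    )


def best_legend_position(points, plot, legend_w, legend_h, steps=9):
    """Try a steps x steps grid of legend positions inside
    plot = (left, bottom, width, height) and return (count, rect)
    for the first position that covers the fewest points."""
    p_left, p_bottom, p_w, p_h = plot
    best = None
    for i in range(steps):
        for j in range(steps):
            # Spread candidates evenly so the legend stays inside the plot.
            left = p_left + (p_w - legend_w) * i / (steps - 1)
            bottom = p_bottom + (p_h - legend_h) * j / (steps - 1)
            rect = (left, bottom, legend_w, legend_h)
            n = count_points_under(points, rect)
            if best is None or n < best[0]:
                best = (n, rect)
    return best
```

The smaller legend_w and legend_h are, the more grid positions end up with a count of zero.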
Question
Obviously, all this works better if the legend is as small as possible, as that gives a greater likelihood of finding a location with zero ’lapping.
If .Legend.Width is too small, then the individual legend texts (the Series.Name’s) wrap onto ≥2 lines, which isn’t wanted. So VBA could interval bisect to find the smallest .Legend.Width for which there isn’t line wrapping. But how can the VBA code ‘see’|‘detect’|‘know’ of the existence of the line wrapping?
And mutatis mutandis for .Legend.Height: if that’s too small, some legend entries aren’t shown. How can the VBA code ‘see’|‘detect’|‘know’ that a height is too small?
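To make the interval-bisection idea concrete, here is a minimal sketch in Python rather than VBA. The predicate is_ok is hypothetical: it stands for exactly the "no wrapping at this width" test that this question asks how to obtain, so the sketch shows only the search, not the detection. It assumes is_ok is monotonic (False below some threshold, True at and above it).

```python
def smallest_ok_width(is_ok, lo, hi, tolerance=0.5):
    """Bisect for the smallest width in [lo, hi] for which is_ok(width) is True.

    is_ok is a caller-supplied, hypothetical predicate such as
    "the legend shows no wrapped entries at this width".
    """
    if not is_ok(hi):
        raise ValueError("even the largest width fails the check")
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if is_ok(mid):
            hi = mid  # mid is wide enough, so try narrower
        else:
            lo = mid  # mid is too narrow, so go wider
    return hi
```

The returned width always passes the check and is within tolerance of the true threshold. The same search applies to the height, with a predicate for "all legend entries are shown".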
Thank you.
PS: I expect that the correct answer is that “VBA cannot ‘see’|‘detect’|‘know’ either of these.” Please refute this expectation.
If you create your own legend, using a text box, you have better options when it comes to sizing and flow control. This will create a new set of challenges, but it might be easier to handle.
I have a very simple use case for filling in an AcroForm. I have a single-line (non-multiline) text field, and I would like to adjust the font size so the text fits the width of the field.
The PDF spec mentions that a font size of 0 implies auto-fit to width. However, PDFBOX-1419 and PDFBOX-1402 mention that this isn’t supported in PDFBox.
Hence I have some small logic to calculate the font sizes based on the widths etc. However I’m facing problems setting the font size.
I’m seeing the behavior mentioned in PDFBOX-1419.
Starts out with incorrect font size. If I click into the field, it displays correctly. Click outside the field, it reverts back to the wrong display.
Code :
pdfFormField.getDictionary.setString(COSName.DA, "/Helv 10 Tf 0 g")
pdfFormField.setValue("Hello")
Any pointers or help would be much appreciated.
A simple example of such a PDF is here
PDFBox form field classes read the default appearance into a member variable early in their life cycle and don't follow up on later changes to the form field dictionary they are based on. Thus, when the appearance stream is created during pdfFormField.setValue("Hello"), the former DA value is used.
After setting the default appearance, therefore, you have to instantiate the form field object anew. Then set the field value using this new object.
For sample code look at this answer to How to set the text of a PDTextbox to a color?; here the existing DA value of a text field is changed to contain a color setting operation before the field value is set.
I have had this issue in multiple applications now and I am wondering if anyone has come up with a more efficient solution than mine. Essentially, my goal is to convert the content within a cell, to an HTML string to include all of its formatting. My workaround up to this point has been to loop through each character in the string to determine the font size, weight, and style, however, this can prove to be extremely slow when converting a lot of data at once.
Going through each character in turn will be very slow, but should only be necessary in extreme cases. I've tackled this same problem quite successfully using the following method.
For each relevant property (bold, italic, etc.) I build up an array that stores the position of each change in the value of that property. Then when generating the HTML, I can spit out all the text up until the next change (in any property). Where changes are infrequent, this is clearly faster.
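As an illustration of the output step, here is a minimal sketch in Python. It handles only bold and italic, assumes each property has already been reduced to a sorted list of positions where it flips, and assumes the text starts non-bold and non-italic (a change at position 0 turns the property on from the start). The function name is hypothetical.

```python
from html import escape


def runs_to_html(text, bold_changes, italic_changes):
    """Emit HTML one run at a time, where a run ends at the next change
    in any property, instead of inspecting every character."""
    boundaries = sorted(set(bold_changes) | set(italic_changes) | {0, len(text)})
    parts = []
    bold = italic = False
    for start, end in zip(boundaries, boundaries[1:]):
        # Flip whichever properties change at the start of this run.
        if start in bold_changes:
            bold = not bold
        if start in italic_changes:
            italic = not italic
        chunk = escape(text[start:end])
        if italic:
            chunk = f"<i>{chunk}</i>"
        if bold:
            chunk = f"<b>{chunk}</b>"
        parts.append(chunk)
    return "".join(parts)
```

A cell with no changes at all produces a single run, so the cost grows with the number of formatting changes rather than with the number of characters.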
Now, to arrive at the position of the changes in each property, I first test whether there are in fact any changes, and this is easy - for example, Font.Bold will return true if all the text is bold, false if it's all non bold, and null (or some other value - I can't remember) if there are both bold and non-bold parts.
So, if there's no change in the value at all, we're done already. If there is a change in the value, then I do a binary sub-division of the text into two halves and start again. Again, I might find that one half is all the same, and the other half contains a change, so I do another sub-division of the second half as before, and so on.
Since very few cells tend to have lots of changes, and many have none at all, this ends up being quite efficient. Or at least much more efficient than the character by character method.
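The subdivision can be sketched like this. It is a Python illustration, not the original implementation. The callback is_uniform(a, b) is hypothetical and stands in for a range query such as Font.Bold on part of the cell text: it returns True or False when the range text[a:b] is uniform and None when it is mixed.

```python
def find_changes(is_uniform, start, end):
    """Return sorted positions p (start < p < end) where the property at p
    differs from the property at p - 1."""
    # A single character or a uniform range contains no change.
    if end - start <= 1 or is_uniform(start, end) is not None:
        return []
    mid = (start + end) // 2
    changes = find_changes(is_uniform, start, mid)
    # A change can sit exactly on the split point, where neither half sees it.
    if is_uniform(mid - 1, mid) != is_uniform(mid, mid + 1):
        changes.append(mid)
    changes += find_changes(is_uniform, mid, end)
    return changes


# Stand-in for a real cell: one flag per character (True = bold).
flags = [False] * 5 + [True] * 3 + [False] * 4


def demo_is_uniform(a, b):
    values = set(flags[a:b])
    return values.pop() if len(values) == 1 else None


bold_changes = find_changes(demo_is_uniform, 0, len(flags))
```

Uniform halves are dismissed with a single query, so the number of queries depends mainly on how many changes the cell contains.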