Wrong barcode exported from Visual Basic Macro in MS Word to PDF - vba

I am using a macro in MS Word to export a string of 4 digits as a barcode. The whole macro runs just fine, but when I open the resulting PDF, the barcode looks corrupt.
This is the result:
In the Macro, I select the text I want to format, and change the font to "Free 3 of 9 Extended".
I have tried wrapping this number in "*" characters without success. I also tried "!". Neither works. The funny part is that if I open a Word document and type the same numbers using the same font, a clear barcode is displayed:
This is what I see when I type the same characters directly in MS Word using the same font (which is what I want to achieve in the PDF export).
My macro exports to PDF with the following code:
Public Function guardar(id As String) As String
    Dim path As String
    guardar = id
    ' Export the active document as PDF. A statement split across
    ' several lines needs a trailing " _" on every continued line.
    obj_Word.ActiveDocument.ExportAsFixedFormat OutputFileName:=guardar, _
        OptimizeFor:=wdExportOptimizeForPrint, UseISO19005_1:=True, _
        IncludeDocProps:=True, KeepIRM:=True, ExportFormat:=wdExportFormatPDF
End Function

Is it possible that the template you are given is setting the font weight to bold in that portion of the document in which you are introducing the barcode, thus modifying the way it is displayed?
I cannot think of any other reason. The code you are posting does not seem to be the culprit.

Related

How to insert code in Word, export to pdf, and keep formatting?

I've written some Python code for my thesis and want to add it to my report in Word. I have to hand the report in as a PDF. But when I convert the Word document, the code can no longer be copied, pasted and run, because the tabs are replaced by a single space.
Is there any solution for this?
I found the answer: replace all spaces with underscores (_), and possibly make them invisible by colouring them white.
If you want to copy, paste and use the code, just replace all underscores with spaces again.
Solved!
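The underscore trick can be scripted. The following is a minimal Python sketch, not part of the original answer; the names `protect_indent` and `restore_indent` are hypothetical. It assumes that only the leading indentation needs protecting, that tabs stand for four spaces, and that no line of the code starts with an underscore of its own (such a line would be damaged on restore, in which case a different single-character placeholder should be passed).

```python
def protect_indent(code: str, placeholder: str = "_", tab_width: int = 4) -> str:
    """Replace each leading space (tabs expanded first) with a visible placeholder.

    Assumption: placeholder is a single character.
    """
    out = []
    for line in code.splitlines():
        expanded = line.expandtabs(tab_width)
        body = expanded.lstrip(" ")
        indent = len(expanded) - len(body)
        out.append(placeholder * indent + body)
    return "\n".join(out)


def restore_indent(text: str, placeholder: str = "_") -> str:
    """Turn leading placeholders back into spaces after copying from the PDF."""
    out = []
    for line in text.splitlines():
        body = line.lstrip(placeholder)
        indent = len(line) - len(body)
        out.append(" " * indent + body)
    return "\n".join(out)


source = "def f(x):\n\tif x:\n\t\treturn 1\n    return 0"
protected = protect_indent(source)
print(protected)
print(restore_indent(protected))
```

Restricting the replacement to indentation keeps the body of each line readable in the report, at the cost of not protecting runs of spaces inside a line.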

Combining Rich Text Content Control Content in MS Word using VBA

I'm trying to create a form in MS Word for a non-technical user to capture some text content. The Word doc consists of several rich text content controls where the user will type or paste formatted data (bold, underlined, links, ...).
Once they get all the content entered into these various content controls I'm trying to make it easy for them to combine them together to paste in a consistent order into some podcast show notes which is in an HTML form.
So basically, I want to take three rich text content controls that have formatted data in them, combine them together into one formatted piece of content, and then copy it to the clipboard so they can then go to this web form, paste it in, and do some minor cleanup. The problem is that whenever I try to combine the RTF content it loses the formatting.
The only way I seem to be able to keep the formatting is if I copy the range object and then paste it. However, this doesn't paste just the formatted text. It pastes the whole rich text content control.
I've tried creating a blank RTF field at the bottom of the Word doc to combine everything in but I just can't figure it out. I wouldn't think this would be rocket science.
Since none of the code I've tried keeps the formatting, I'm not sure posting it here will do any good. Here's how I'm getting the value of the text object:
ActiveDocument.SelectContentControlsByTitle("txtShowNotes").Item(1).Range.Text
I also tried this:
ActiveDocument.SelectContentControlsByTitle("txtShowNotes").Item(1).Range.Copy
ActiveDocument.SelectContentControlsByTitle("txtCombinedContentSection").Item(1).Range.Paste
but this copies the whole rich text content control and not just the text.
Try something based on:
Sub Demo()
    Dim Rng As Range
    With ActiveDocument
        Set Rng = .SelectContentControlsByTitle("txtCombinedContentSection").Item(1).Range
        Rng.FormattedText = _
            .SelectContentControlsByTitle("txtShowNotes").Item(1).Range.FormattedText
        Rng.InsertAfter vbCr & vbCr
        Rng.Characters.Last.FormattedText = _
            .SelectContentControlsByTitle("txtShowNotes").Item(2).Range.FormattedText
    End With
End Sub

Is there a way to save mathematical alphanumerical symbols (the ones that are in unicode) to a PDF or Word document in VB.NET?

Basically, I need to take a question from a text file and format it as a question would be formatted in a maths exam.
Right now, I'm using PDFsharp to do this, but it always saves the alphanumerical symbols (for example, 𝑥) as boxes.
I copied from the PDFsharp sample and have this:
Dim document As New PdfSharp.Pdf.PdfDocument
Dim page As PdfSharp.Pdf.PdfPage = document.AddPage()
Dim gfx As PdfSharp.Drawing.XGraphics = PdfSharp.Drawing.XGraphics.FromPdfPage(page)
Dim tf As New PdfSharp.Drawing.Layout.XTextFormatter(gfx)
Dim options As New PdfSharp.Drawing.XPdfFontOptions(PdfSharp.Pdf.PdfFontEncoding.Unicode)
Dim font As New PdfSharp.Drawing.XFont("LastResort", 10, PdfSharp.Drawing.XFontStyle.Regular, options)
tf.Alignment = PdfSharp.Drawing.Layout.XParagraphAlignment.Left
tf.DrawString(questionArray(i), font, PdfSharp.Drawing.XBrushes.Black, New PdfSharp.Drawing.XRect(0, 0, page.Width.Point, page.Height.Point), PdfSharp.Drawing.XStringFormats.TopLeft)
Dim filename As String = "test" + Str(i).Trim + ".pdf"
document.Save(filename)
Process.Start(filename)
I know I don't need to keep repeating the "PdfSharp.Pdf" stuff, my plan was to clean it all up when I get the characters saving properly.
Last Resort is a font that contains Unicode symbols and the Mathematical Alphanumeric Symbols block, according to https://www.fileformat.info/info/unicode/block/mathematical_alphanumeric_symbols/fontsupport.htm
My end goal is to take a basic .txt file like "f(x) = 5[𝑥^2] + (k+7)𝑥 + k where k is a real constant." and save it to a PDF to resemble a real math exam question.
So, is there a better way to do this or a way to make PDFsharp do it?
Unicode support in PDFsharp works fine for characters in the range 0x0000 to 0xffff as long as you use a font that supports these characters.
Mathematical Alphanumeric Symbols are in the range U+1D400..U+1D7FF. You have to patch PDFsharp to make use of them. They are not yet supported out of the box as of today (December 2019).
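To see why these symbols fall outside the supported range, here is a small Python sketch (an illustration only, unrelated to the PDFsharp API; the helper name `utf16_units` is hypothetical). It shows that 𝑥 (U+1D465) lies above 0xFFFF and therefore needs two UTF-16 code units, a surrogate pair, whereas a plain "x" needs one.

```python
def utf16_units(ch: str) -> list:
    """Return the UTF-16 code units needed to encode a single character."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# MATHEMATICAL ITALIC SMALL X, the symbol mentioned in the question.
x_italic = chr(0x1D465)

print(hex(ord(x_italic)))                       # 0x1d465, above 0xffff
print([hex(u) for u in utf16_units(x_italic)])  # ['0xd835', '0xdc65']
print([hex(u) for u in utf16_units("x")])       # ['0x78']
```

Code that handles text one 16-bit unit at a time sees two separate surrogate values here rather than one character, which is consistent with the symbol being rendered as boxes.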
In your snippet you give "LastResort" as the name of the font. Do you have a font with that name installed in Windows? Can you use it with e.g. Word or WordPad?
Maybe try "Arial" or "Tahoma" or "Verdana" instead.
Do you see the correct strings in the debugger? Maybe the problem is with the formatting of the text file or the encoding used to open it.
Update:
All characters in LastResort look like boxes, so it is not a good choice for math exam sheets.
Please try a different font.

Section Header Range.Text Returning Empty String Instead of Actual Text

I have a PDF file that I am trying to parse text out of. I opened the file using Microsoft Word, and text I need is in the header. On the first page, the header is justified left with a center tab that has the text (plain English name document title instead of the complicated reference name) that I am trying to grab. There is a right tab that has a page number control that I don't care about.
When I try to run the following:
Debug.Print ThisDocument.Sections(1).Headers(wdHeaderFooterPrimary).Exists
it gives me True, so I know the header exists. However, when I try to run
Debug.Print ThisDocument.Sections(1).Headers(wdHeaderFooterPrimary).Range.Text
it gives me what looks like an empty string. Wrapping the call in Len(…) returns 1, so the range holds only a single character. How can I get the text out of the header?
Of note, I tried using some Adobe SDK functions which would have been easier, but I do not have the professional Acrobat suite so I do not have access to those tools. Hence the MS Word workaround.

Convert text to image in Microsoft Word

I have a large book written in Microsoft Word and want to create a macro that will find all text using a predefined style and convert that text to an inline image. This text will be in Arabic and generally no longer than 4-5 lines. Is this possible?
UPDATE: Here's an example to show what I'm referring to:
I want to replace that entire line in Arabic with an image (as if I cropped this attached image to only include the Arabic and then replaced the line in Arabic with the image).
The reason I want a macro or script to do this is because there are hundreds of such lines and updating them one by one is cumbersome plus that will make modifications difficult later on.
UPDATE2: I found an interesting option here: http://windowssecrets.com/forums/showthread.php/31344-Convert-Text-to-an-Image-of-Text-in-VBA-(Office-2000-Sr1a)
It looks like you can cut a piece of text and then "Paste Special" as an image. So if there's a way to automate that that might work.
This is not an answer although I hope it will grow into a community answer. At the moment it is an exploration of what is required to solve the problem.
I know from the discussion when this question was posted on Super User that Abdullah wishes to publish his book on Kindle. So the question is really about how to get a document in English and Arabic ready for publication as an e-Book.
The Kindle does not support Arabic. The number of languages it does support is slowly increasing but there is no evidence I can find that Amazon has plans to add Arabic in the foreseeable future.
The format behind an Amazon e-Book is a cut-down version of HTML. If a Word document containing Arabic letters is exported to HTML, the Arabic letters are included as character entities; for example: “&#64336; &#64337; &#64338; &#64339;”. Importing the original Word or the HTML version to Kindle results in the leading bits being discarded, so these characters are displayed as P, Q, R and S instead of “ﭐ ﭑ ﭒ ﭓ” (Alef Wasla isolated form, Alef Wasla final form, Beeh isolated form and Beeh final form).
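The P, Q, R, S effect can be reproduced with a few lines of Python. This is only an illustration; it assumes that "discarding the leading bits" means keeping the low 8 bits of each code point, which matches the observed output but is not confirmed Kindle behaviour.

```python
# The four characters from the example, U+FB50 to U+FB53 (decimal 64336 to 64339).
code_points = [0xFB50, 0xFB51, 0xFB52, 0xFB53]

# Character entities as they would appear in the exported HTML.
entities = [f"&#{cp};" for cp in code_points]

# Assumption: only the low 8 bits of each code point survive the import.
truncated = "".join(chr(cp & 0xFF) for cp in code_points)

print(entities)   # ['&#64336;', '&#64337;', '&#64338;', '&#64339;']
print(truncated)  # PQRS
```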
I have tried Abdullah’s idea of saving some Arabic letters in a PNG file and creating an HTML file containing <p> … </p> <img src="Arabic.png"> <p> … </p>. The appearance of this file on my Kindle 2 is perfectly acceptable so this has the potential to be a solution. The question is: how can the necessary conversions be performed?
We need to extract each Arabic string from either the Word document or its HTML equivalent and import it into a program that can convert them to PNG files.
The only way that I know of automating this would be to copy each string to a slide within PowerPoint. With PowerPoint’s SaveAs option it is possible to save each slide as a separate PNG file. The slides are named: SLIDE1.PNG, SLIDE2.PNG, SLIDE3.PNG and so on in sequence which would allow a macro to relate the results to the original strings. It would then be possible to replace the Arabic strings in the HTML file with the image elements. None of this would be too difficult to automate but there is a problem with the slides all being the size of the PowerPoint page. The page could be made smallish but what we need is for each slide to be cropped to just bigger than that slide’s text. I cannot think of any way of automating this cropping.
Does anyone have a better approach than converting each Arabic phrase to a PNG file?
I have been looking for PNG editors with some sort of command line interface but can find nothing that would be easier than using PowerPoint. Does anyone know of an alternative to PowerPoint?
Does anyone have any suggestions for automating the cropping of each image? When a string is placed in a PowerPoint slide it is possible to set its width to, say, 6.5cm (which looks good on my Kindle) and get the height determined by PowerPoint. This could be saved for later use if anyone knows how to use it.
Implementing solution
Pending any suggestions for improving the approach described above, the following outlines how I would implement it.
I would not attempt to process the Word document. I would save it as a Web Page, Filtered HTML file, which is a required step on the way to creating a Kindle eBook, and process that.
Within the HTML file created from my test document, the Arabic phrase comes out as:
<p class="MsoNormal"></p>
<p class="MsoNormal" align="center" style="text-align:center"><span dir="RTL"
style="font-size:24.0pt;font-family:Arial">
&#64336;&#64337;&#64338;&#64339;&#64340;&#64341;
&#64342;&#64343;&#65153;&#65154;&#65276;&#65275;
&#65274;&#65273;&#65246;&#65226;&#65227;&#65228;
</span><span style="font-size:24.0pt"></span></p>
<p class="MsoNormal"></p>
<p class="MsoNormal"></p>
I assume Abdullah's document will result in something similar. Note 1: the above is a random collection of Arabic letters. Note 2: they are held left-to-right in reading sequence even though, when displayed or printed, they are read right-to-left.
The whole of this block will have to be replaced with something like:
<br><img src="xxxx.png"><br>
where the file xxxx.png holds an image of the Arabic text.
The file names, such as xxxx.png, could be systematic (A001.png, A002.png, ...) but I would have thought that transliterating the first ten or twenty characters of the phrase from the Arabic to English alphabets and using the result, with a numeric suffix, as the file name would be more convenient.
I would hold the records necessary to manage the process in an Excel worksheet. I would place the VBA code in the same workbook.
The steps in the conversion process that I envisage are:
1. VBA macro to extract Arabic strings from latest HTML file and add new strings to the Excel worksheet. (More about the Excel worksheet later.)
2. VBA macro to create PowerPoint file, with one slide per new string, and use SaveAs in PNG format to create one PNG file per slide before discarding the PowerPoint file.
3. Human to crop each PNG file. (There appears to be no way of automating the cropping so this task will be minimised by use of data in the Excel worksheet.)
4. VBA macro to rename each slide from SLIDEnnn.PNG to its permanent name and to record the permanent name in the Excel worksheet.
5. VBA macro to update the latest HTML file by replacing the block containing the Arabic phrase with the appropriate HTML IMG element.
The Excel worksheet needs two columns: Arabic phrase and PNG file name. If there is any risk of the worksheet being sorted between steps 2 and 4, we may need a sequence number as well.
Macro 1 will extract an Arabic phrase from the HTML file, look down the list in the worksheet for this phrase and add the phrase at the bottom if it is not already present.
Macro 2 will look for phrases in the worksheet that do not have a PNG file name. These new phrases are the ones to be written to the PowerPoint presentation. That is, a phrase only goes into this process once.
Task 3, cropping each PNG file, will be a pain. All I can say is that it will only be once per phrase.
Macro 4 will assume that the SLIDE001.PNG, SLIDE002.PNG, … are in the sequence of phrases without PNG files in the worksheet. If this might not be true (because the worksheet has been sorted) we will either need a sequence number or to retain the PowerPoint file. The macro will assign a unique name to each new phrase, record this name in the worksheet and rename the PNG file.
Macro 5 creates a new copy of the latest HTML file using the contents of the worksheet to determine which phrase to replace with which PNG file.
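The logic of Macros 1 and 5 can be sketched as follows. The answer envisages VBA with an Excel worksheet; this sketch uses Python purely to illustrate the text processing, with a plain dictionary standing in for the worksheet. The names `RTL_BLOCK`, `normalise`, `extract_phrases` and `replace_phrases` are hypothetical, and the pattern assumes that every Arabic phrase is exported as a paragraph that starts with a `dir="RTL"` span, as in the sample block.

```python
import re

# Matches a paragraph whose first child is a right-to-left span.
# Assumption: every Arabic phrase in the filtered HTML has this shape.
RTL_BLOCK = re.compile(
    r'<p\b[^>]*>\s*<span\s+dir="RTL"[^>]*>(.*?)</span>.*?</p>',
    re.DOTALL | re.IGNORECASE,
)


def normalise(phrase: str) -> str:
    """Collapse whitespace so the same phrase always yields the same key."""
    return " ".join(phrase.split())


def extract_phrases(html: str) -> list:
    """Macro 1: list each distinct Arabic phrase in order of first appearance."""
    seen = []
    for match in RTL_BLOCK.finditer(html):
        key = normalise(match.group(1))
        if key not in seen:
            seen.append(key)
    return seen


def replace_phrases(html: str, png_names: dict) -> str:
    """Macro 5: swap each known block for an image element; keep unknown ones."""
    def swap(match):
        name = png_names.get(normalise(match.group(1)))
        return f'<br><img src="{name}"><br>' if name else match.group(0)
    return RTL_BLOCK.sub(swap, html)


html = ('<p class="MsoNormal"></p>\n'
        '<p class="MsoNormal" align="center"><span dir="RTL"\n'
        'style="font-size:24.0pt">\n&#64336;&#64337;\n&#64338;\n'
        '</span><span style="font-size:24.0pt"></span></p>\n')

phrases = extract_phrases(html)
# Systematic file names (A001.png, A002.png, ...) stand in for the worksheet column.
names = {p: f"A{n:03d}.png" for n, p in enumerate(phrases, start=1)}
print(replace_phrases(html, names))
```

Blocks whose phrase has no PNG name yet are left untouched, so the replacement can be re-run safely after new images have been cropped and named.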
This process is not ideal but it will achieve the desired result and has no obvious complications. Any suggestions for improving it?
Before you begin these instructions, press record in the Microsoft Word macro editor, so you can see what the VBA code is.
I'm wondering whether this would be easier if you convert the docx file to .rtf (rich text format) and replace that line with an image. Go to File > Save As... and name it "old.rtf", then replace the line with an image, use Save As... again and name it "new.rtf", and then use Beyond Compare or your favorite diff program to see what changed. It should be easy to do this programmatically if you choose to. I think working in text would be easier than Microsoft's binary format unless you can find a good library to modify their doc or docx formats.
Sub CopySelPasteAsPicture()
    ' Take a picture of a selection and paste it at the
    ' document end
    With Selection
        .CopyAsPicture
    End With
    ActiveDocument.Content.Select
    With Selection
        .Collapse Direction:=wdCollapseEnd
        .TypeParagraph
        .TypeParagraph
        .PasteSpecial DataType:=wdPasteMetafilePicture
    End With
End Sub