Open XML SDK 2.5 document validation: The 'smtClean' attribute is not declared - vb.net

Our current work project involves opening a Microsoft PowerPoint file (.pptx format), changing some text and chart values, and then presenting the edited version to the end user.
This works rather well so far, but I'm puzzled by what happens when I try to validate the document afterwards. Using the DocumentFormat.OpenXml.Validation.OpenXmlValidator class, I run the Validate function with the PresentationDocument passed in as the only parameter.
Dim document As PresentationDocument = PresentationDocument.Open(templateFilePath, True)
Dim validator As OpenXmlValidator = New OpenXmlValidator()
Dim errors = validator.Validate(document)
' Print the description and location of every validation error
For Each errInfo As ValidationErrorInfo In errors
    Debug.Print("Error: """ & errInfo.Description & """")
    Debug.Print("XPath: " & errInfo.Path.XPath)
Next
Validate() returns a collection of ValidationErrorInfo instances. Almost all of them give the same error description when debugging:
The 'smtClean' attribute is not declared.
The XPath for each error looks like this (numbers vary; there appears to be one error per piece of text):
/p:sldLayout[1]/p:cSld[1]/p:spTree[1]/p:sp[4]/p:txBody[1]/a:p[1]/a:fld[1]/a:rPr[1]
Every TableCell has a Paragraph, with child element Run, and this Run has child elements RunProperties and Text. I modify the Text in my scripts, but I do not touch anything else.
Searching for 'smtClean' gave me an MSDN entry for RunProperties which shows 'smtClean' as one of the possible values to be set, but if I create a new instance of DocumentFormat.OpenXml.Presentation.Drawing.RunProperties the 'smtClean' attribute is not available.
Looking around, I found threads where people mentioned merged documents to be one possible cause, but these errors occur even in an unmodified presentation with only a single slide and table in it. Using the Open XML SDK 2.5 Productivity Tool to Validate the base document, I get the same result.
The errors also occur no matter which format I ask the validator to test for: the 2007, 2010, and 2013 versions of the PowerPoint format all return the same number of errors.
Finally: The file itself works just fine in PowerPoint, even after being modified. I am curious about why the validator returns so many errors, however.
Thanks in advance for any help.

We process Office documents and remove this attribute in all types (Word, PowerPoint, Excel) without side effects. Eric White has identified this as a bug: smtClean attribute not supported
It is fixed in the current OpenXml SDK on Branch Office2016: https://github.com/OfficeDev/Open-XML-SDK/tree/Office2016
Regards...
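
For illustration, here is a minimal sketch of how stripping the attribute from a presentation might look with the Open XML SDK. It is not the answerer's actual code: the module name, StripSmtClean, and CleanPresentation are hypothetical, and it assumes the attribute is stored without a namespace, as in the a:rPr elements reported by the validator.

```vbnet
Imports System.Linq
Imports DocumentFormat.OpenXml
Imports DocumentFormat.OpenXml.Packaging

Module SmtCleanStripper

    ' Removes every un-namespaced "smtClean" attribute below the given root
    ' element and returns how many attributes were removed.
    Function StripSmtClean(root As OpenXmlElement) As Integer
        Dim removed As Integer = 0
        If root Is Nothing Then Return removed
        For Each element As OpenXmlElement In root.Descendants().ToList()
            ' Only touch elements that actually carry the attribute.
            If element.GetAttributes().Any(Function(a) a.LocalName = "smtClean" AndAlso a.NamespaceUri = "") Then
                element.RemoveAttribute("smtClean", "")
                removed += 1
            End If
        Next
        Return removed
    End Function

    ' Hypothetical entry point: cleans the slides, masters, and layouts of one file.
    Sub CleanPresentation(filePath As String)
        Using doc As PresentationDocument = PresentationDocument.Open(filePath, True)
            Dim presPart As PresentationPart = doc.PresentationPart
            For Each part As SlidePart In presPart.SlideParts
                StripSmtClean(part.Slide)
                part.Slide.Save()
            Next
            For Each master As SlideMasterPart In presPart.SlideMasterParts
                StripSmtClean(master.SlideMaster)
                master.SlideMaster.Save()
                For Each layout As SlideLayoutPart In master.SlideLayoutParts
                    StripSmtClean(layout.SlideLayout)
                    layout.SlideLayout.Save()
                Next
            Next
        End Using
    End Sub

End Module
```

Running the validator again after CleanPresentation should show whether any smtClean errors remain. The sketch covers only slides, masters, and layouts, so other parts (notes, for example) would need the same treatment.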

Smart tags were deprecated in Office 2010, and the SDK v2.5 validator doesn't support smart tag elements and therefore marks them as invalid.
Please see this MSDN article for more information.
The current developer of the productivity tool says in this thread that the smtClean validation error was a bug in some situations and has been fixed in v3 of the tool.
v3 (the Office 2016 productivity tool) can be found here, however I'm not sure how compatible it is with older versions of Office.

Related

Reading the ParagraphFormat.Style property of a Style object crashes Word every time

Word 365 ProPlus v1908 build 11929.20776 on a Win10 Pro (1903, build 18362.30) machine with an i7-6600 @ 2.6 GHz and 16 GB RAM.
I'm trying to compare range object formatting to style definitions (flagging formats that are applied without using styles). When I try to read a style object (e.g., "Normal" style), the "Style" property of the "ParagraphFormat" object causes word to crash every time:
Set vPropVal = ActiveDocument.Styles("Normal").ParagraphFormat.Style
or
Set vPropVal = CallByName(ActiveDocument.Styles("Normal").ParagraphFormat, "Style", VbGet)
(although using variables for the source object and string property name)
If I add a Watch for ActiveDocument.Styles("Normal"), and then try to expand the ParagraphFormat property, Word crashes.
If I try to run Debug.Print ActiveDocument.Styles("Normal").ParagraphFormat.Style.NameLocal in a module or in the Immediate window, Word Crashes.
I can skip over the Style property of the ParagraphFormat (and have been doing so), but it bugs me to have the problem without knowing why it happens or how to fix it.
I have not been able to find any web resources that provide insights into why the ParagraphFormat of a Style object might be problematic. I greatly appreciate any insights...
BTW - I'm not a professional coder; I just have some intermediate capabilities.
The crash occurs because of an error (bug) in Word that has been there since Word 2010. When you try to open the ParagraphFormat branch of the Style object tree in a Watch (or Locals) Window, Word will try to enumerate all the members and their values, and will fall foul of the same problem.
But the .Style value was not available from the ParagraphFormat property of a Paragraph Style object even before that. Even if it was, it's not unreasonable to expect that it would point to the same Style object as you are inspecting. If you need the NameLocal of the Style you can get it directly from ActiveDocument.Styles("stylename").NameLocal
In Word 2007, the same code to access .Style would not cause a crash but would raise error 91 ("Object Variable or With block variable not set"), and inspection in the Watch or Locals windows would tell you that the Style property was set to Nothing.
[FWIW Mac Word 2011 raises a different error - 5934, "This operation is not supported by a duplicated ParagraphFormat object."]

Mysterious Black Lines in Word When Using In-House VB.Net Application

A little backstory. I work at an organization that uses Mail Merge and SQL Databases to populate letters with names/addresses. Those letters are sent out to our donors as thank yous. These letters change frequently, and new ones come up at least 10 times a month.
To simplify our process, I created a program that allows you to copy/paste the letter body content into rich text boxes and when you press the 'Go' button, it opens a pre-made Word template and replaces bookmarks in the template with the copied body content.
The program works great with most letters, but some of them have a problem where these thick black lines are created and I'm unable to do ANYTHING to remove them. I can't right click them, I can't delete them with Backspace or Delete, and I can't highlight them.
I'm thinking that the problem may come from hidden formatting. Some of the employees that write the letters are using the Mac version of Office 2016, and I'm using Windows version. I sent an RTF file that showed the black lines for me to someone who uses the Mac version, and they said they couldn't see the lines.
My question is, is there a way to get rid of these lines or prevent them in the future? I've thought about upgrading Office version to 2019 on both ends, but there are quite a few people that have their hands in these letters and it may be difficult to upgrade everyone.
Please refer to the attached image for visual reference. Names and personal details have been removed.
EDIT: Here is the 'Go' code:
'create temp rtf files to maintain rtf
If strForm = "ANG2" Then
    txtPreD.SaveFile("\\server\AcknowledgementLetters\fptemp.rtf")
    txtPostD.SaveFile("\\server\AcknowledgementLetters\bptemp.rtf")
ElseIf strForm = "ANGL" Then
    txtPreD.SaveFile("\\server\AcknowledgementLetters\predtemp.rtf")
    txtPostD.SaveFile("\\server\AcknowledgementLetters\postdtemp.rtf")
    txtBP.SaveFile("\\server\AcknowledgementLetters\bptemp.rtf")
Else
    txtPreD.SaveFile("\\server\AcknowledgementLetters\predtemp.rtf")
    txtPostD.SaveFile("\\server\AcknowledgementLetters\postdtemp.rtf")
End If
'if bookmarks exists, insert appropriate rtf files
If odoc.Bookmarks.Exists("fp") = True Then
    goWord.ActiveDocument.Bookmarks("fp").Select()
    goWord.Selection.InsertFile(FileName:="\\server\AcknowledgementLetters\fptemp.rtf")
End If
If odoc.Bookmarks.Exists("bp") = True Then
    goWord.ActiveDocument.Bookmarks("bp").Select()
    goWord.Selection.InsertFile(FileName:="\\server\AcknowledgementLetters\bptemp.rtf")
End If
If odoc.Bookmarks.Exists("PreD") = True Then
    goWord.ActiveDocument.Bookmarks("PreD").Select()
    goWord.Selection.InsertFile(FileName:="\\server\AcknowledgementLetters\predtemp.rtf")
End If
If odoc.Bookmarks.Exists("PostD") = True Then
    goWord.ActiveDocument.Bookmarks("PostD").Select()
    goWord.Selection.InsertFile(FileName:="\\server\AcknowledgementLetters\postdtemp.rtf")
End If
Before this happens, the program checks which template it needs and opens it as a Word object (odoc). This bit of code is really the only important part. After this, I click Finish, which just saves the file once I'm done checking it for errors. Also, yes, the RTF files that it creates DO have the black lines as well. Here is another picture of the program itself just so you can get a better idea of what's going on.

How to create a program to convert unit measurements

Using Microsoft Access and Visual Basic.
I'm having a big problem doing this task.
What I have done: created a table in Access where I have put measurements in (in meters):
mile = 10000 meters, nautic mile = 1862 meters, English mile = 1652, kilometers = 1000 meters, and all the way down to millimeters.
What I have created for input:
One box takes the integer to be converted, and one box specifies the initial unit.
What I have created for output:
One box shows the integer result, and one box specifies the chosen unit of the output.
Can anyone please, please help me with the code?
Honestly I'd never really noticed the CONVERT function until today but here's a quick demo of how I'd slap together a "conversion tool" in Excel.
If you want to do the same thing in Access, the premise is the same, but it will be a bit more work since you'll have to design the form from scratch instead of using a worksheet, which is kind of meant for this kind of job.
Using Excel functions in Access
Before you are able to use Excel's CONVERT function in Access, you'll need to reference the Microsoft Excel Object Library.
In Access, open any VBA Module.
Go to Tools > References.
Check the box next to Microsoft Excel 16.0 Object Library. (The version number will vary if you have an older version of Office.)
Then you can call most Excel functions from Access VBA or queries with WorksheetFunction (the same way you would use them in Excel VBA).
For example:
MsgBox WorksheetFunction.Convert(3.7, "m", "ft")
...displays a message box with the number of feet in 3.7 metres.
The calculations will be the easy part; a couple lines of VBA in the On Change or On Exit events will trigger the calculation.
The most time-consuming part will likely be perfecting the placement and formatting of the controls on the form, which is by no means difficult (and there are several tutorials online that can provide the basics if necessary.)
Lastly, keep in mind that there are no doubt a plethora of existing conversion tools available for free download with a little Googling... (I'm confident that you're not the first person who wanted to use MS Office to convert measurements.) 😉
More Information:
Microsoft Docs : WorksheetFunction.Convert Method
Microsoft Docs : List of Worksheet Functions Available to Visual Basic
Office Support : Create a form in Access
QuackIt: Microsoft Access Tutorial
Blueclaw : Access Event Procedures
You can download the demo xlsx used above from JumpShare here.
For both comboboxes, bind them to column 2, faktorTilMeter, and set the ColumnWidths to, say: 2,542cm;0cm.
Then, assign this expression as ControlSource for your output textbox:
=TextboxInput/ComboboxFrom*ComboboxTo
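
The arithmetic behind such an expression can be sketched in VB.NET as follows. This is only an illustration under stated assumptions: the module and function names are hypothetical, the factor values are example metric units rather than the asker's table, and each factor is assumed to mean "meters per one unit". If the table stores the factor the other way round (units per meter), the multiplication and division swap places, as in the expression above.

```vbnet
Imports System
Imports System.Collections.Generic

Module UnitConverter

    ' Meters per one unit; illustrative values, not the asker's table.
    Private ReadOnly MetersPerUnit As New Dictionary(Of String, Double) From {
        {"km", 1000.0},
        {"m", 1.0},
        {"cm", 0.01},
        {"mm", 0.001}
    }

    ' Converts by going through meters: value * from-factor gives meters,
    ' and dividing by the to-factor gives the value in the target unit.
    Function ConvertLength(value As Double, fromUnit As String, toUnit As String) As Double
        Return value * MetersPerUnit(fromUnit) / MetersPerUnit(toUnit)
    End Function

    Sub Main()
        ' 2.5 km expressed in meters
        Console.WriteLine(ConvertLength(2.5, "km", "m"))
    End Sub

End Module
```

Storing one factor per unit keeps the table small: adding a unit needs one new row instead of one row per pair of units.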

Decimal parsing differences on separate environments

Evening,
I'm bashing my head against a wall with the following problem:
I'm loading numbers from cells in a Number column with size = 16 and decimal places = 2 inside a dBase III .dbf file.
These numbers, when viewed with a DbfViewer, appear as 12345.12: there is no thousands separator and the decimal separator is a period (.).
I parse the number from the cell in the database using decimal.parse(val).
I do stuff with that number.
I am using the ClosedXML library to paste the number into an .xlsx Excel file cell with the following formula: "=R[-1]C * 100/" & val where val is the value I obtained from the dBaseIII database file. This is done with the following statements:
Dim formula as String = "=R[-1]C * 100/" & project.TotalIncome(i)
cell.FormulaR1C1 = formula
I am using two programming environments:
1. A Windows 8.1 machine with Visual Studio 2013 Community and Office 2010.
2. A Windows 8.1 machine with Visual Studio 2013 Ultimate and Office 2013.
I have made sure that both environments have the same Language, Date, Time and Number format, both for Windows and Office.
When I build and execute the program from the Option 1 Environment, everything pastes fine inside the Excel file. I navigate to the cell containing the formula, and whether or not the value obtained had decimal places, the formula is there.
However, if I build and execute the program from the Option 2 Environment, Excel reports:
Removed Records: Formula from /xl/worksheets/sheet.xml part
Removed Records: Formula from /xl/calcChain.xml part (calculation properties)
I tried adding a breakpoint in Environment 2, opening the Locals window and editing those values which had decimal places and everything worked as intended, whereas when I use Environment 1 I have no trouble whatsoever when the value has decimal places.
I have tried the following (in Environment 2):
Dim nfi As NumberFormatInfo = New CultureInfo("es-ES", False).NumberFormat
nfi.NumberDecimalSeparator = ","
value = Decimal.Parse(row("VALUECOL"), nfi)
also:
value = Decimal.Parse(row("VALUECOL"), New CultureInfo("es-ES"))
To no avail.
I have opened the XML file containing the Excel Sheet info in Environment 2 and found this:
<x:c r="L101" s="41">
<x:f>L100 * 100/57125,71</x:f>
</x:c>
Whereas the definitions for the same XML file created by the Environment 1 has the following cell value:
<x:c r="L101" s="41">
<x:f>L100 * 100/57125.71</x:f>
</x:c>
So, is it a Visual Studio Locale thing (which both have the same, as far as I can see), or am I missing something else?
EDIT: Printing out the current Locale with:
Console.WriteLine(CultureInfo.CurrentCulture.Name)
yields the same es-ES on both Environment 1 and Environment 2.
EDIT 2:
Taken from: Microsoft Office XML formats. Defective by design.
To save them time, Microsoft chose to store XML using the US English
locale regardless of all settings above. [...]
Also, for Excel formulas, it means the formula names are US English
formula names, [...] it implies you are willing to work with US English
function names (plus US English separators, ...).
So basically it all boils down (I believe) to a pre localisation of the decimal value into the Excel XML taking into account something, somewhere.
In Environment 2, any other (non-formula) value I write to the Excel file appears in the XML as an en-US localised value (i.e. 12345.12). Most of them brought in by a dataTable import. However, since writing a formula requires the input of a string, and Visual Studio applies locale settings to said string, it ends up as 12345,12 in the Excel XML, which results in the previously mentioned errors.
So, what on earth is Visual Studio taking from Environment 1 that is different from Environment 2? All possible UI localisation options are exactly the same in both machines...
I had a similar issue before, and found that there was a different dll file in my project references. The dll's were named the same, I only noticed because of a file size difference. Once I manually linked to the same one on both Dev machines, I got the expected results.
Like I said, my issue was different... But it did also involve excel files, and I did have Excel 2010 on one Dev machine and 2013 on the other.
I don't even know if this qualifies as an answer since I still have no clue about where's the localisation variable that Environment 1 has different from Environment 2.
However, It seems Visual Studio -when using different localisations- deals internally with de-localised decimal variables, but with localised string variables. Even when checking the locals panel during debugging, the value of a decimal number stored in a dictionary entry will appear as its localised version on the keyValuePair entry, and as a de-localised en-US value when expanded:
Hence, when outputting a dataTable as a whole to the Excel file, it's written onto the XML as en-US values. On the other hand, when outputting a formula (a.k.a. a string) it pours over the localised version of the associated decimal value.
Conclusion: When dealing with Office files in localised systems, just write the data as de-localised (i.e. en-US) and let the software localise it for you.
Ended up doing the following dirty patch:
Dim formula As String = "=R[-1]C * 100/" & project.TotalIncome(i).ToString().Replace(",", ".")
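
A less fragile alternative to the Replace call is to format the number with the invariant culture. The sketch below assumes TotalIncome is a Decimal; the module name and BuildFormula are hypothetical, and the es-ES culture is set in Main only to imitate the problem machine. The `&` operator converts a Decimal using the current culture, which is where the comma comes from.

```vbnet
Imports System
Imports System.Globalization
Imports System.Threading

Module InvariantFormula

    ' Builds the formula text with a culture-independent number, so the
    ' decimal separator is "." whatever the regional settings are.
    Function BuildFormula(totalIncome As Decimal) As String
        Return "=R[-1]C * 100/" & totalIncome.ToString(CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        ' Imitate a Spanish-locale machine, where plain concatenation yields "57125,71".
        Thread.CurrentThread.CurrentCulture = New CultureInfo("es-ES")
        Console.WriteLine(BuildFormula(57125.71D)) ' prints =R[-1]C * 100/57125.71
    End Sub

End Module
```

Unlike Replace(",", "."), this does not depend on which separators the current culture happens to use, so it also behaves on machines where the group separator is a period.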

Why does every form field in my generated PDF end with "-0"?

So I have the following VB.NET code that creates a form field in a PDF using SyncFusion's Essential PDF module:
Dim pdfField As New Pdf.Interactive.PdfTextBoxField(pdfDoc.Pages(iPage), "txt1")
pdfField.Location = New PointF(50, 50)
pdfField.Size = New SizeF(100, 10)
pdfDoc.Form.Fields.Add(pdfField)
This works great except for one thing. When I open up the PDF in Acrobat and look at the field name I notice that it says "txt1-0". Now I can't figure out where the "-0" is coming from and how to get rid of it.
This may be a SyncFusion issue, in which case I hope I get an answer from them soon (I've asked this on their forum). But I thought it might also be a fundamental detail about PDFs and field naming that I don't know about.
Ah ha, I just found out what was causing this.
Previously I was using both the PdfLoadedDocument and PdfDocument classes. I was loading the PdfLoadedDocument into the PdfDocument via ImportPages and apparently this process will add the "-0" suffix to the field names.
I found that in my case I can get rid of the PdfDocument object and just use PdfLoadedDocument and that fixed it.
UPDATE:
Just to expand on this, I've found that it's actually the PdfDocument.Form.FieldAutoNaming property that controls this. Its default value is true, and when it's set to true it automatically adds suffixes as needed to prevent duplicate field names. If you set it to false, it won't add the "-0" suffix anymore; instead, you might get errors in your code.
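
As a rough sketch of that setting, reusing the field code from the question: the module name and the output file name are hypothetical, and setting the property before any field is added is an assumption about ordering rather than something documented here.

```vbnet
Imports System.Drawing
Imports Syncfusion.Pdf
Imports Syncfusion.Pdf.Interactive

Module FieldNamingSketch

    Sub Main()
        Dim pdfDoc As New PdfDocument()
        Dim page As PdfPage = pdfDoc.Pages.Add()

        ' Turn off automatic suffixes such as "-0"; unique names are now the caller's job.
        pdfDoc.Form.FieldAutoNaming = False

        Dim pdfField As New PdfTextBoxField(page, "txt1")
        pdfField.Location = New PointF(50, 50)
        pdfField.Size = New SizeF(100, 10)
        pdfDoc.Form.Fields.Add(pdfField)

        pdfDoc.Save("fields.pdf") ' hypothetical output path
        pdfDoc.Close(True)
    End Sub

End Module
```

With auto-naming off, each field name has to be unique within the form, which is probably the source of the errors mentioned above.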