Can I define a text property as rich text? - vb.net

VS 2013, VB, EF6
I am creating an object that will keep user input in one of its properties. I would like that user input to be stored as rich text. What's involved to make that stored text be rich text format? So,
Public Property Text as <what?>

I thought I would post what was my answer for others who might ask the question the same way I did. I begin by stating that my question was poorly formed because I didn't understand I'm not really storing RTF, I'm storing WYSIWYG text with html tags. But I think the question as phrased is useful because that's how many people think until they are taught by others.
Ultimately this process opens a serious XSS vector, but first we have to at least collect the WYSIWYG text.
First step: using a script-based editor capture the text with html tags. I used CKEditor which is easy to download on NuGet. It comes in 3 flavors: basic, standard and full. Another popular one seems to be TinyMCE also available through NuGet.
CKEditor must be 'wired in' to replace the existing input element. I replaced #html.editorfor with a < textarea > directly as follows. Model.UserPost.Body is the property into which I want to place the WYSIWYG text. The Raw helper is required so the output is NOT encoded allowing us to see our WYSIWYG text.
<textarea name="model.UserPost.Body" id="model_UserPost_Body" class="form-control text-box multi-line">
#Html.Raw(Model.UserPost.Body)
</textarea>
CKEditor is 'wired in' using a script element to replace the < textarea > element.
#Section Scripts
<script src="~/scripts/ckeditor/ckeditor.js"></script>
<script>
CKEDITOR.replace('model.UserPost.Body');
</script>
End Section
The script above can be added to all pages via _layout.vbhtml, or just the target page via a #Section Scripts section as shown above, which is often recommended and what I did, but that may also require adding to the standard _Layout the following in the < head > section such as follows.
#RenderSection("Styles", False)
In the controller POST method for the view the following code is needed to capture the WYSIWYG text otherwise the default filter will raise an exception when it detects anything that looks like an html tag.
Dim rawBody = Request.Unvalidated.Form("model.UserPost.Body")
userPost.Body = rawBody
There are some possible gotcha's; The 'body' property has to be removed from the Include:= list of the < Bind > element in the method paramter list if < Bind > is being used. Also, although not directly related to this solution, you can't have a Data Annotation like < Required() > on this property in the model because background checking won't be able to confirm that condition so the ModelState.IsValid flag won't ever go true.
Second step: before saving the input it MUST be checked for XSS. Microsoft has a nice video explaining basic XSS that I recommend viewing; it's only 11 minutes.
Mikesdotnetting has a nice explaination for dealing with XSS and shows a whitelisting algorithm toward the bottom of this page. The following code is based on his work.
To create a white listing approach, the HTML Agility Pack is useful to catalogue the HTML nodes for review. This is easily loaded from Nu Get as well. This is the code I used in the POST method to invoke the white list methods (Yes, it could be more compact, but this is easier to read for us novices):
Dim tempDoc = New HtmlDocument()
tempDoc.LoadHtml(rawBody)
RemoveNodes(tempDoc.DocumentNode, allowedTags)
userPost.Body = tempDoc.DocumentNode.OuterHtml
The allowed tags are what you will allow, which means everything else is rejected, hence whitelisting. This is just a sample list:
Dim allowedTags As New List(Of String)() From {"p", "em", "s", "ol", "ul", "li", "h1", "h2", "h3", "h4", "h5", "h6", "strong"}
These are the methods based on Mikesdotnetting page:
Private Sub RemoveNodes(ByVal node As HtmlNode, allowedTags As List(Of String))
If (node.NodeType = HtmlNodeType.Element) Then
If Not allowedTags.Contains(node.Name) Then
node.ParentNode.RemoveChild(node)
Exit Sub
End If
End If
If (node.HasChildNodes) Then
RemoveChildren(node, allowedTags)
End If
End Sub
Private Sub RemoveChildren(ByVal parent As HtmlNode, allowedTags As List(Of String))
For i = parent.ChildNodes.Count() - 1 To 0 Step -1
RemoveNodes(parent.ChildNodes(i), allowedTags)
Next
End Sub
So basically, (1) CKEditor captures user input with html tags that looks nice, (2) the raw input is specially requested in the Controller POST method and then (3) cleaned using a white list. After that it can be output directly to the page using #Html.Raw() because it can be trusted.
That's it. I've not really posted solutions like this before, so if I've missed something let me know and I'll correct or add it.

Rich Text is stored in the Rich Text Format.
The Rich Text Format specifications can be found here:
http://www.microsoft.com/en-us/download/details.aspx?id=10725
It is just an ordinary string. You can extract the string from a RichTextBox using the SaveFile function:
Private Function GetRTF(ByRef Box As RichTextBox) As String
Using ms As New IO.MemoryStream
Box.SaveFile(ms, RichTextBoxStreamType.RichText)
Return System.Text.Encoding.ASCII.GetString(ms.ToArray)
End Using
End Function
You can load text in the Rich Text Format into a RichTextBox using the LoadFile method of the RichTextBox. The text needs to be in the correct format:
Dim rtf As String = "{\rtf1 {\colortbl;\red0\green0\blue255;\red255\green0\blue0;}Guten Tag!\line{\i Dies} ist ein\line formatierter {\b Text}.\line Das {\cf1 Ende}.}"
Using ms As New IO.MemoryStream(System.Text.Encoding.ASCII.GetBytes(rtf))
RichTextBox1.LoadFile(ms, RichTextBoxStreamType.RichText)
End Using
Ordinary controls usually will not interpret this format in their text property.

Related

How can I create the resource string without a big string?

In After Effects scripts, if you want your script to be able to be docked in the program's workspace, the only way to do it as far as I know is to use a resource string like this:
var res = "Group{orientation:'column', alignment:['fill', 'fill'], alignChildren:['fill', 'fill'],\
group1: Group{orientation:'column', alignment:['fill', ''], alignChildren:['fill', ''],\
button1: Button{text: 'Button'},\
},\
}";
myPanel.grp = myPanel.add(res);
The above code creates a script UI with one button ("button1") inside a group ("group1").
I would like to know other ways to create the same resource string. Is it possible to make it using a JSON object and then stringifying it??
I know it can be done somehow, because I have inspected the Duik Bassel script that is dockable and, for example, adds elements like this:
var button1 = myPal.add( 'button' );
but I cannot understand how to do it myself.
TL;DR: I want to make a dockable scriptUI without writing a giant string all at once, but bit by bit, like a floating script.
UI container elements have an add() method which allows you to add other UI elements to them, and you can treat them as normal objects.
var grp = myPanel.add("group");
grp.orientation = "column";
grp.alignment = ['fill', 'fill'];
grp.alignChildren = ['fill', 'fill'];
var group1 = grp.add("group");
…
var button1 = group1.add("button");
button1.text = 'Button'
More details and examples here: https://extendscript.docsforadobe.dev/user-interface-tools/types-of-controls.html#containers
Also worth checking out https://scriptui.joonas.me/ which is a visual scriptUI interface builder. You have to do some work on the code it produces to get panels for AE, but it's not hard.
extendscript still uses a 20th century version of javaScript, which doesn't have JSON built-in, but I have successfully used a JSON polyfill with it.
I used json2.js to get structured data in and out of Illustrator, and it worked beautifully, but I can see there's now a json3.js which might be better for whatever reason. This stackoverflow question addresses the differences.
To load another .js file (such as a polyfill) into your script, you need to do something like
var scriptsFolder = (new File($.fileName)).parent; // hacky but effective
$.evalFile(scriptsFolder + "/lib/json2.js"); // load JSON polyfill
These file locations may differ in other Adobe apps. Not sure what it would be in AfterEffects. I seem to remember that InDesign has a different location for scripts. You can also hardcode the path, of course.
Good luck!

Selecting element using webdriver (duplicate identifiers)

I have to look at an application which I can't use the normal selectors (like "id", "name", etc - this is a design flaw) but I do have a custom tag which has been applied to elements on the page:
test-tag='x'
and this is fine, I can interact with this using (simple script)
var tag = '[test-tag="x"]';
var selector = $(tag);
However, I have now found that some elements (notably textboxes) have a title and a box element - both have the same custom tag applied. Now the text box is an input type. Anyone know how I can change the above to target specifially input types?
try this:
'input[test-tag="x"]'
for the input box
Take a look at this as well:
https://www.w3schools.com/cssref/css_selectors.asp

Word to HTML fields in header and footer

I'm using docx4j to convert a Word template to several HTML files, one per chapter.
The Word template has several custom properties mapped by several fields (DOCPROPERTY ...) represented as both simple and complex fields. I populate those properties to obtain Freemarker code when the word document is converted to HTML (like ${...} or [#... /] directives).
In a later step I look for "heading 1" paragraphs to identify chapters and then split the document in several Word documents before conversion, then these documents are converted to HTML and written to temporary files.
Each document is successfully converted to HTML and fields are correctly replaced with my markers, but it behaves wrong when it writes header and footer parts: field codes are written before field values (eg. DOCPROPERTY "PROPERTY_NAME" \* MERGEFORMAT ${constants['PROPERTY_NAME']} ) instead of field values only (eg. ${constants['PROPERTY_NAME']} ).
If I write the updated document to a docx file instead, nothing seems wrong into the generated document.
If it's useful to solve the problem, this is what I do to split the document (per chapter):
clone the updated WordprocessingMLPackage (clone method)
delete every root element before the chapter's "heading 1" element
delete every root element from the "heading 1" element of the next chapter
convert the cloned and cleaned document
(actually I don't use the clone method every time, but I write the updated document to a ByteArrayOutputStream and then read it for every chapter, inspired by the source of the clone method).
I suspect it's for a docx4j bug, did anybody else try something similar?
Finally these are my platform details:
JDK 1.6
Docx4J v3.2.2
Thanks in advance for any help
EDIT
To produce freemarker markers in place of Word fields, I set document property values as follows:
traverse the document looking for simple or complex fields with new TraversalUtil(wordMLPackage.getMainDocumentPart().getContent(), visitor);, where visitor is my custom callback for looking for fields and set properties
traversing the document I look for
FldChar elements with type BEGIN and parse them using FieldsPreprocessor.canonicalise((P) ((R) fc.getParent()).getParent(), fields); (I don't use the return value of canonicalise) where fc is the found FldChar and fields is a empty ArrayList<FieldRef>; then I extract and parse field's instrText attribute
CTSimpleField elements and parse them using FldSimpleModel fldSimpleModel = new FldSimpleModel(); fldSimpleModel.build((CTSimpleField) o, null);; then I use fldSimpleModel.getFldArgument() to get the property name
I look for the freemarker code to show in place of the current field and set it as property value using wordMLPackage.getDocPropsCustomPart().setProperty(propertyName, finalValue);
finally I do the same from step 1 for headers and footers as follows:
List<Relationship> rels = wordMLPackage.getMainDocumentPart().getRelationshipsPart().getRelationships().getRelationship();
for (Relationship rel : rels) {
Part p = wordMLPackage.getMainDocumentPart().getRelationshipsPart().getPart(rel);
if (p == null) {
continue;
}
if (p instanceof ContentAccessor) {
new TraversalUtil(((ContentAccessor) p).getContent(), visitor);
}
}
Finally I update fields as follows
FieldUpdater updater = new FieldUpdater(wordMLPackage);
try {
updater.update(true);
} catch (Docx4JException ex) {
Logger.getLogger(WorkerDocx4J.class.getName()).log(Level.SEVERE, null, ex);
}
After filling all field properties, I clone the document as previously described and convert filtered cloned instances using
HTMLSettings settings = Docx4J.createHTMLSettings();
settings.setWmlPackage(wordDoc);
settings.setImageHandler(new InlineImageHandler(myDataModel));
Docx4jProperties.setProperty("docx4j.Convert.Out.HTML.OutputMethodXML", true);
ByteArrayOutputStream os = new ByteArrayOutputStream();
os.write("[#ftl]\r\n".getBytes("UTF-8"));
Docx4J.toHTML(settings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);
String template = new String(os.toByteArray(), "UTF-8");
then I obtain in template variable the resulting freemarker template.
The following XML is the content of footer1.xml part of the document generated after updating the document properties as described: footer1.xml after field updates
The very strange thing (in my opinion) is that if some properties are not found, step 5 throws an Exception (ok), fields updating stops at the wrong field (ok) and all fields in header and footer are rendered right. In this case, this is the content for footer1.xml.
In the last case, fields are defined in a different way. I think the HTML converter handles well the last case and does something wrong in the first one.
Is there something I do wrong or I can do better?

FormBlock Server Control in Ektron

I am working in Ektron 8.6.
I have a FormBlock Server Control in my Template Page,It is having a DefualutFormID of a valid HTML form from workarea.The form in the workarea have got few form fields and their corresponding values.
While the template page is rendering I need to GET those form field values and re-set them with some other values.
In which Page –Cycle event I should do this coding?
I tried this code in Pre-Render Event,but I am unable to GET the value there,but I am able to set a value.
I tried SaveStateComplete event as well,no luck.
String s=FormBlock1.Fields["FirstName"].Value;
If(s=”some text”)
{
// Re-set as some other vale.
FormBlock1.Fields["FirstName"].Value=”Some other value”;
}
In which event I can write this piece of code?
Page_Load works fine for changing the value of a form field. The default behavior is for the Ektron server controls to load their data during Page_Init.
The real problem is how to get the default value. I tried every possible way I could find to get at the data defining an Ektron form (more specifically, a field's default value), and here's what I came up with. I'll admit, this is a bit of a hack, but it works.
var xml = XElement.Parse("<ekForm>" + cmsFormBlock.EkItem.Html + "</ekForm>");
var inputField = xml.Descendants("input").FirstOrDefault(i => i.Attribute("id").Value == "SampleTextField");
string defaultValue = inputField.Attribute("value").Value;
if (defaultValue == "The default value for this field is 42")
{
// do stuff here...
}
My FormBlock server control is defined on the ASPX side, nothing fancy:
<CMS:FormBlock runat="server" ID="cmsFormBlock" DynamicParameter="ekfrm"/>
And, of course, XElement requires the following using statement:
using System.Xml.Linq;
So basically, I wrap the HTML with a single root element so that it becomes valid XML. Ektron is pretty good about requiring content to be XHTML, so this should work. Naturally, this should be tested on a more complicated form before using this in production. I'd also recommend a healthy dose of defensive programming -- null checks, try/catch, etc.
Once it is parsed as XML, you can get the value property of the form field by getting the value attribute. For my sample form that I set up, the following was part of the form's HTML (EkItem.Html):
<input type="text" value="The default value for this field is 42" class="design_textfield" size="24" title="Sample Text Field" ektdesignns_name="SampleTextField" ektdesignns_caption="Sample Text Field" id="SampleTextField" ektdesignns_nodetype="element" name="SampleTextField" />

What is an HTMLSelectElement and an HTMLInputElement?

I'm attempting to learn VBA by reading through someone's code and understanding what happens every step of the way. However, I'm confused at to what these two elements are:
What is a HTMLSelectElement?
What is a HTMLInputElement?
See the W3C HTML Specs:
HTMLSelectElement
HTMLInputElement
I assume they correspond to select and input HTML tags. A select tag is also called a drop-down list, and input tags can be used for multiple things (checkbox, radio button, text, password).