I'm new to DXL and still learning. I want to search all attributes in my modules for certain words and change those words to italics.
Example:
specific word = change
before DXL script in attribute/column A: "This requirement should change"
after DXL script in attribute/column A: "This requirement should change" (with the word "change" in italics)
Code snippet
for itemRef in f do
{
    if (shType == "Formal")
    {
        filtering off;
        m = read(fullName(itemRef), false)
        Object o
        for o in m do
        {
            // Operation for changing words to italic
        }
        // Close the module that was opened above
        close(m);
    }
}
Updated Code
void ChangeItalic()
{
    Module m = current
    filtering off;
    Object o
    for o in m do
    {
        int i, j
        string t = o."Object Text"
        string ModuleName = m."Name"
        string ObjectName = identifier(o)
        print ModuleName "\n"
        print ObjectName "\n"
        print t
        if (matches("[Ll]astenheft", t)) {
            print "changed" "\n"
            i = start 0
            j = end 0
            t = t[0:(i-1)] "\\i " t[match 0] "\\i0 " t[j+1:]
            o."Object Text" = richText t
        }
    }
}
// Main-Method
void main(void)
{
    ChangeItalic();
}
main()
Here is the "guts" of your script:
Object o = current
int i, j
string t = o."Object Text"
if (matches("[Cc]hange", t)) {
    i = start 0
    j = end 0
    t = t[0:(i-1)] "\\i " t[match 0] "\\i0 " t[j+1:]
    o."Object Text" = richText t
}
This script operates on the current object and changes the word "change" to italics if it occurs in the Object Text. It would be different if you only wanted to display the word in italics.
Unfortunately, I don't know whether you want to display the word in italics (in a column) or to change the word to italics in the attribute itself.
Related
I have a program that loads a checked list box; the user checks the items they want and then clicks a button to say they are done. The checked items are concatenated into a single string, with a newline "\n" appended after each item. Everything works except the newline "\n", and I don't know why.
Code
private: System::Void bntSelected_Click(System::Object^ sender, System::EventArgs^ e) {
    int numSelected = this->checkedListBox1->CheckedItems->Count;
    if (numSelected != 0)
    {
        // Set input focus to the list box.
        //SetFocus(lstReceiver);
        String^ listBoxStr = "";
        // If so, loop through all checked items and print results.
        for (int x = 0; x < numSelected; x++)
        {
            listBoxStr = listBoxStr + (x + 1).ToString() + " = " + this->checkedListBox1->CheckedItems[x]->ToString() + "\n";
        }
        lstReceiver->Items->Add(listBoxStr);
    }
}
The string is being formed correctly, but the ListBox control doesn't show newlines in its list items - each item is pushed onto a single line. You can see this by debugging, or by adding a Label to the form with AutoSize set to false - the label will show the newline(s) in the string properly.
Related (C#): How to put a \n(new line) inside a list box?
Instead of ‘\n’, try ‘\r\n’. This may be a Windows thing.
I am trying to port some existing VBA code to C#. One routine controls the indentation of bullet items, and is roughly:
indentStep = 13.5
For Each parag In shp.TextRange.Paragraphs()
    parag.Parent.Ruler.Levels(parag.IndentLevel).FirstMargin = indentStep * (parag.IndentLevel - 1)
    parag.Parent.Ruler.Levels(parag.IndentLevel).LeftMargin = indentStep * (parag.IndentLevel)
Next parag
The code works, but appears to be spooky black magic. In particular, each time a particular ruler's margins are set, ALL NINE rulers' margins are actually set.
But somehow the appropriate information is being set. Unfortunately, when you do the same thing in C#, the results change. The following code has no visible effect:
const float kIndentStep = 13.5f;
foreach (PowerPoint.TextRange pg in shp.TextFrame.TextRange.Paragraphs())
{
    pg.Parent.Ruler.Levels[pg.IndentLevel].FirstMargin = kIndentStep * (pg.IndentLevel - 1);
    pg.Parent.Ruler.Levels[pg.IndentLevel].LeftMargin = kIndentStep * pg.IndentLevel;
}
This appears to be a limitation/bug when automating PowerPoint from C#. I confirm it works with VBA.
I do see an effect after the code runs: it changes the first level with each run so that, at the end, the first level has the settings that should have been assigned to the last level to be processed, but none of the other levels appear to be affected, visibly. I do see a change in the values returned during code execution, but that's all.
If the code changes only one, specific level for the text frame, it works. The problem occurs only when attempting to change multiple levels.
I tried various approaches, including late binding (reflection via InvokeMember) and putting the change in a separate procedure, but the result was always the same.
Here's my last iteration
Microsoft.Office.Interop.PowerPoint.Application pptApp = (Microsoft.Office.Interop.PowerPoint.Application) System.Runtime.InteropServices.Marshal.GetActiveObject("Powerpoint.Application"); // new Microsoft.Office.Interop.PowerPoint.Application();
//Change indent level of text
const float kIndentStep = 13.5f;
Microsoft.Office.Interop.PowerPoint.Shape shp = pptApp.ActivePresentation.Slides[2].Shapes[2];
Microsoft.Office.Interop.PowerPoint.TextFrame tf = shp.TextFrame;
object oTf = tf;
int indentLevelLast = 0;
foreach (Microsoft.Office.Interop.PowerPoint.TextRange pg in tf.TextRange.Paragraphs(-1, -1))
{
    int indentLevel = pg.IndentLevel;
    if (indentLevel > indentLevelLast)
    {
        Microsoft.Office.Interop.PowerPoint.RulerLevel rl = tf.Ruler.Levels[indentLevel];
        object oRl = rl;
        System.Diagnostics.Debug.Print(pg.Text + ": " + indentLevel + ", " + rl.FirstMargin.ToString() + ", " + rl.LeftMargin.ToString());
        object fm = oRl.GetType().InvokeMember("FirstMargin", BindingFlags.SetProperty, null, oRl, new object[] { kIndentStep * (indentLevel - 1) });
        //rl.FirstMargin = kIndentStep * (indentLevel - 1);
        object lm = oRl.GetType().InvokeMember("LeftMargin", BindingFlags.SetProperty, null, oRl, new object[] { kIndentStep * (indentLevel) });
        //rl.LeftMargin = kIndentStep * indentLevel;
        indentLevelLast = indentLevel;
        System.Diagnostics.Debug.Print(pg.Text + ": " + indentLevel + ", " + tf.Ruler.Levels[indentLevel].FirstMargin.ToString() + ", " + tf.Ruler.Levels[indentLevel].LeftMargin.ToString());
        rl = null;
    }
}
FWIW neither code snippet provided in the question compiles. The VBA snippet is missing .TextFrame. The C# snippet doesn't like Parent.Ruler so I had to change it to TextFrame.Ruler.
I am using DotNetCore.NPOI (1.2.1) in order to read an MS Excel file.
Some of the cells are of type text and contain formatted strings (e.g. some words in bold).
How do I get the formatted cell value? My final goal: Retrieve the cell text as HTML.
I tried
var cell = row.GetCell(1);
var richStringCellValue = cell.RichStringCellValue;
But this doesn't give me access to the formatted string (just the plain string without any formatting).
Does anybody have an idea or solution?
I think you'll have to take the longer route here. First preserve the cell's value formatting (date, currency and so on), then extract the style from the cell value and wrap the value in markup for that style.
The best option is to write an extension method that returns both the format and the style.
For the format, see this question: How to get the value of cell containing a date and keep the original formatting using NPOI
For the styling, first find the exact style of each run of text, then return the value inside the matching HTML tag. The method below gives an idea of how to extract the styling from a cell value. The code is untested, and you may have to add missing using directives.
public void GetStyleOfCellValue()
{
    XSSFWorkbook wb = new XSSFWorkbook("YourFile.xlsx");
    ISheet sheet = wb.GetSheetAt(0);
    ICell cell = sheet.GetRow(0).GetCell(0);
    XSSFRichTextString richText = (XSSFRichTextString)cell.RichStringCellValue;
    int formattingRuns = cell.RichStringCellValue.NumFormattingRuns;
    for (int i = 0; i < formattingRuns; i++)
    {
        int startIdx = richText.GetIndexOfFormattingRun(i);
        int length = richText.GetLengthOfFormattingRun(i);
        // Substring takes (startIndex, length), not (startIndex, endIndex).
        Console.WriteLine("Text: " + richText.String.Substring(startIdx, length));
        if (i == 0)
        {
            // The first run is assumed to use the cell style's font.
            short fontIndex = cell.CellStyle.FontIndex;
            IFont font = wb.GetFontAt(fontIndex);
            Console.WriteLine("Bold: " + (font.IsBold)); // if true, emit <b>my string</b>
            Console.WriteLine("Italics: " + font.IsItalic + "\n"); // if true, emit <i>my string</i>
            Console.WriteLine("UnderLine: " + font.Underline + "\n"); // if underlined, emit <u>my string</u>
        }
        else
        {
            IFont fontFormat = richText.GetFontOfFormattingRun(i);
            Console.WriteLine("Bold: " + (fontFormat.IsBold)); // if true, emit <b>my string</b>
            Console.WriteLine("Italics: " + fontFormat.IsItalic + "\n"); // if true, emit <i>my string</i>
        }
    }
}
Font formatting in XLSX files is stored according to the schema http://schemas.openxmlformats.org/spreadsheetml/2006/main, which has no direct relationship to HTML tags. Therefore your task is not that straightforward.
ICellStyle style = cell.CellStyle;
IFont font = style.GetFont(workbook);
// Use the IFont object to query font attributes, e.g. font.IsItalic
Then you will have to build the HTML by appending relevant HTML tags.
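The tag-appending step could look roughly like the following. This is only a sketch in plain Java (NPOI is a port of Java's Apache POI), and it does not call the library at all: `Run`, `RunsToHtml`, `escape` and `toHtml` are hypothetical names, and each `Run` stands for the text and font flags you would have read from one formatting run.

```java
import java.util.Arrays;
import java.util.List;

class RunsToHtml {
    // Hypothetical stand-in for one formatting run: its text plus font flags.
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    // Escape the characters that have special meaning in HTML.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wrap each run in <b>/<i> tags according to its flags and join the pieces.
    static String toHtml(List<Run> runs) {
        StringBuilder html = new StringBuilder();
        for (Run r : runs) {
            String piece = escape(r.text);
            if (r.italic) piece = "<i>" + piece + "</i>";
            if (r.bold) piece = "<b>" + piece + "</b>";
            html.append(piece);
        }
        return html.toString();
    }

    public static void main(String[] args) {
        List<Run> runs = Arrays.asList(
            new Run("Plain ", false, false),
            new Run("bold", true, false),
            new Run(" and ", false, false),
            new Run("both", true, true));
        System.out.println(toHtml(runs));
    }
}
```

Underline, colour and font size would need further flags on the run and further tags or inline styles; they are left out to keep the sketch short.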
I am trying to use Apache PDFBox version 2.0.2 to replace text (with the code below). In the output, some characters are not displayed, mostly capital letters. For example, after replacing with "ABCDEFGHIJKLMNOPQRSTUVWXYZ", the PDF shows "ABCDEF HIJKLM OP RST W Y ". Is this a bug, or is there a workaround for handling these characters?
public static PDDocument replaceText(PDDocument document, String searchString, String replacement) throws IOException {
    if (StringUtils.isEmpty(searchString) || StringUtils.isEmpty(replacement)) {
        return document;
    }
    PDPageTree pages = document.getDocumentCatalog().getPages();
    for (PDPage page : pages) {
        PDFStreamParser parser = new PDFStreamParser(page);
        parser.parse();
        List<Object> tokens = parser.getTokens();
        for (int j = 0; j < tokens.size(); j++) {
            Object next = tokens.get(j);
            if (next instanceof Operator) {
                Operator op = (Operator) next;
                // Tj and TJ are the two operators that display strings in a PDF
                if (op.getName().equals("Tj")) {
                    // Tj takes one operand, the string to display, so update that operand
                    COSString previous = (COSString) tokens.get(j - 1);
                    String string = previous.getString();
                    string = string.replaceFirst(searchString, replacement);
                    previous.setValue(string.getBytes());
                } else if (op.getName().equals("TJ")) {
                    // TJ takes an array of strings and positioning numbers
                    COSArray previous = (COSArray) tokens.get(j - 1);
                    for (int k = 0; k < previous.size(); k++) {
                        Object arrElement = previous.getObject(k);
                        if (arrElement instanceof COSString) {
                            COSString cosString = (COSString) arrElement;
                            String string = cosString.getString();
                            string = StringUtils.replaceOnce(string, searchString, replacement);
                            cosString.setValue(string.getBytes());
                        }
                    }
                }
            }
        }
        // now that the tokens are updated we will replace the page content stream.
        PDStream updatedStream = new PDStream(document);
        OutputStream out = updatedStream.createOutputStream();
        ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
        tokenWriter.writeTokens(tokens);
        page.setContents(updatedStream);
        out.close();
    }
    return document;
}
Quoting from
https://pdfbox.apache.org/2.0/migration.html
Why was the ReplaceText example removed?
The ReplaceText example has been removed as it gave the incorrect illusion that text can be replaced easily. Words are often split, as seen by this excerpt of a content stream:
[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ
Other problems will appear with font subsets: for example, if only the glyphs for a, b and c are used, these would be encoded as hex 0, 1 and 2, so you won’t find “abc”. Additionally, you can’t replace “c” with “d” because it isn’t part of the subset.
You could also have problems with ligatures, e.g. “ff”, “fl”, “fi”, “ffi”, “ffl”, which can be represented by a single code in many fonts. To understand this yourself, view any file with PDFDebugger and have a look at the “Contents” entry of a page.
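To make the word-splitting point concrete, here is a small sketch in plain Java that models the TJ array from the excerpt as a list of strings and numbers. It does not use PDFBox; `TjSplitDemo`, `foundInSingleFragment` and `joinFragments` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

class TjSplitDemo {
    // True if any single string operand contains the search text.
    // This mirrors a replace that looks at each array element on its own.
    static boolean foundInSingleFragment(List<Object> tjArray, String search) {
        for (Object element : tjArray) {
            if (element instanceof String && ((String) element).contains(search)) {
                return true;
            }
        }
        return false;
    }

    // Concatenates the string operands, skipping the numeric adjustments.
    static String joinFragments(List<Object> tjArray) {
        StringBuilder sb = new StringBuilder();
        for (Object element : tjArray) {
            if (element instanceof String) {
                sb.append((String) element);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Plain-Java model of: [ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ
        List<Object> tj = Arrays.<Object>asList("Do", -29, "c", -1, "umen", 30, "tation");
        System.out.println(foundInSingleFragment(tj, "Documentation"));
        System.out.println(joinFragments(tj));
    }
}
```

The word only exists once the fragments are joined, so a search that inspects each string operand separately never sees it. Rewriting it would also mean deciding what to do with the kerning numbers between the fragments.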
======================================================================
Your description suggests that the original file uses a font subset that lacks some characters, for example G, N, Q and V.
And no, there is no easy workaround. You would have to delete the text you don't want from the content stream, and then append a new content stream with the text you want with a new font at the correct place.
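The effect of a font subset can be illustrated with a toy model. This is a simplified sketch in plain Java, not how PDFBox represents fonts: `SubsetFontDemo` and `encode` are hypothetical names, and the assumption is simply that each embedded character gets the next free code, as in the "a, b, c become 0, 1, 2" example quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SubsetFontDemo {
    // Maps each embedded character to its code within the subset.
    private final Map<Character, Integer> codes = new LinkedHashMap<>();

    // Assign codes 0, 1, 2, ... in the order the characters are embedded.
    SubsetFontDemo(String embeddedChars) {
        for (char c : embeddedChars.toCharArray()) {
            codes.putIfAbsent(c, codes.size());
        }
    }

    // Returns the code sequence for the text, or null if any character
    // has no glyph in the subset and therefore cannot be shown.
    int[] encode(String text) {
        int[] result = new int[text.length()];
        for (int i = 0; i < text.length(); i++) {
            Integer code = codes.get(text.charAt(i));
            if (code == null) {
                return null;
            }
            result[i] = code;
        }
        return result;
    }

    public static void main(String[] args) {
        SubsetFontDemo font = new SubsetFontDemo("abc");
        System.out.println(java.util.Arrays.toString(font.encode("cab")));
        System.out.println(font.encode("abd") == null);
    }
}
```

In this model the stored codes bear no resemblance to the character values of the text, and a replacement that needs a character outside the subset has nothing to map to, which matches the gaps described in the question.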
P.S. the current PDFBox version is 2.0.7, not 2.0.2.
X and Requirement are existing attributes.
I want to create an attribute Z such that, for the given object,
if Requirement=True, then Z={the value of attribute X},
but if Requirement=False, then Z={Object Heading and Object Text}.
What is the DXL for making this attribute?
Thanks.
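This is untested code, but try something like this (assuming attribute Z already exists as a text attribute and Requirement is a boolean attribute whose string value is "True" or "False"):
Module m = current
Object o
for o in m do
{
    // Appending "" converts the attribute value to a string for comparison
    string req = o."Requirement" ""
    if (req == "True")
    {
        o."Z" = o."X" ""
    }
    else // Requirement = False
    {
        o."Z" = o."Object Heading" "\n" o."Object Text" ""
    }
}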