Justify text in SQL Reporting Services - sql

Is there a way of fully-justifying text in SQL Reporting Services?
I've been searching around and it seems the feature is still not supported by Reporting Services, but are there any workarounds?
I know this question has been asked before, but maybe progress has been made in the meantime.

This is not possible, at least not in SSRS 2008 and below. The only options for aligning text are Left, Center and Right.
The only workaround I could think of was enabling HTML tags in a text box, but the styling for Justify alignment is just ignored. So there really aren't any suitable workarounds AFAIK, short of using a picture with justified text (~shudder!~).
You should keep an eye on the corresponding MS feedback item and perhaps vote on it as well. It used to have 527 votes, but was reset to 0 during the move from MS Connect to this new feedback site. I found the bug report through this social.msdn thread, which has been going on for quite some time.

'picture with justified text in SSRS': you can create an AdvRichTextBox control (see the code at http://geekswithblogs.net/pvidler/archive/2003/10/14/182.aspx ) and use it in SSRS by following these steps: http://binaryworld.net/Main/CodeDetail.aspx?CodeId=4049

Here's a possible workaround : Full Text Just
It makes use of RS utility and OLE Automation to do the job.

Out of the box, SSRS does not support justified text. There are a few possible workarounds:
1. Use a third-party control that does this. (I was not able to get one to work.)
2. Call a component such as Word via COM. (This is a security issue, but it is possible.)
3. Format the text box as HTML and put small white spaces between the words. This can be done in a stored procedure.
Solution 3 takes too long to describe in detail, which is why I have put my solution on my web page as a free download.
The advantage of my solution is that no installation is necessary.
Here is the link to my solution: http://www.rupert-spaeth.de/justify/
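
To give a rough idea of what the third workaround involves, here is a minimal Java sketch of the space-distribution step. It is not the code behind the link above. It assumes every character has the same width (a real report font would need actual text measurement), and the class and method names are made up for illustration. For HTML output the extra spaces would additionally have to be emitted as non-breaking spaces, because ordinary runs of spaces collapse.

```java
import java.util.ArrayList;
import java.util.List;

public class JustifySketch {
    /** Breaks text into lines of at most `width` characters and pads all but the last line. */
    public static List<String> justify(String text, int width) {
        String[] words = text.trim().split("\\s+");
        List<String> lines = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int letters = 0; // characters in `current`, not counting spaces
        for (String word : words) {
            // current.size() is the number of single spaces needed once `word` is added
            if (!current.isEmpty() && letters + current.size() + word.length() > width) {
                lines.add(pad(current, letters, width));
                current = new ArrayList<>();
                letters = 0;
            }
            current.add(word);
            letters += word.length();
        }
        if (!current.isEmpty()) {
            lines.add(String.join(" ", current)); // the last line stays left-aligned
        }
        return lines;
    }

    /** Spreads the free space over the gaps, giving leftover spaces to the leftmost gaps. */
    private static String pad(List<String> words, int letters, int width) {
        if (words.size() == 1) {
            return words.get(0); // a single word cannot be stretched
        }
        int gaps = words.size() - 1;
        int spaces = width - letters;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.size(); i++) {
            sb.append(words.get(i));
            if (i < gaps) {
                int n = spaces / gaps + (i < spaces % gaps ? 1 : 0);
                for (int k = 0; k < n; k++) {
                    sb.append(' ');
                }
            }
        }
        return sb.toString();
    }
}
```

For example, JustifySketch.justify("aa bb cc dd", 7) stretches the first line to exactly seven characters and leaves the final line "cc dd" untouched.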

If you use <p> try with:
$("[style*='padding-bottom:10pt']").css("text-align", "justify");

The following will work if you open the .rdl file (which is XML) in a text editor.
You need a Paragraph element, if one doesn't already exist.
The example formats a number with thousands separators (U.S. style) and two digits after the decimal point.
It is then right-aligned by the TextAlign element in the paragraph's Style. (I had been looking for a "justify" tag, but the property is called TextAlign.)
<Paragraph>
  <TextRuns>
    <TextRun>
      <Value>=Format( Sum(Fields!ourField.Value, "DataSet2") , "N2") </Value>
      <Style>
        <FontFamily />
        <Color>White</Color>
      </Style>
    </TextRun>
  </TextRuns>
  <Style>
    <TextAlign>Right</TextAlign>
  </Style>
</Paragraph>

Actually, it is possible to justify text in an SSRS report if you pass the value as HTML and format the text into justified HTML beforehand. In my case I'm using .NET C# to format the passed string into justified HTML text.
But before that, we need to configure our SSRS report to accept HTML. For this we need to add a text box and create a placeholder.
To add a placeholder, click on the text box until it lets you type into it, then right-click and choose "Create placeholder..."
After you have created the placeholder, you will be prompted to enter its properties. All you need to specify is Value and Markup type.
Be sure to select HTML as the Markup type, and for the value specify the variable that will hold the justified HTML text. In our case let's call it transformedHtml.
Now we need to create a function that transforms our string into justified HTML text:
/// <summary>
/// Breaks the text into lines that fit the given width and pads every line except the last with extra spaces until it fills that width.
/// </summary>
/// <param name="text">The text that we want to justify</param>
/// <param name="width">Justified text width in pixels</param>
/// <param name="useHtmlTagsForNewLines">If true, returns the output as justified HTML; if false, returns it as a justified plain string</param>
/// <returns>Justified string</returns>
public string GetText(string text, int width, bool useHtmlTagsForNewLines = false)
{
var palabras = text.Split(' ');
var sb1 = new StringBuilder();
var sb2 = new StringBuilder();
var length = palabras.Length;
var resultado = new List<string>();
var graphics = Graphics.FromImage(new Bitmap(1, 1));
var font = new Font("Times New Roman", 11);
for (var i = 0; i < length; i++)
{
sb1.AppendFormat("{0} ", palabras[i]);
if (graphics.MeasureString(sb1.ToString(), font).Width > width)
{
resultado.Add(sb2.ToString());
sb1 = new StringBuilder();
sb2 = new StringBuilder();
i--;
}
else
{
sb2.AppendFormat("{0} ", palabras[i]);
}
}
resultado.Add(sb2.ToString());
var resultado2 = new List<string>();
string temp;
int index1, index2, salto;
string target;
var limite = resultado.Count;
foreach (var item in resultado)
{
target = " ";
temp = item.Trim();
index1 = 0; index2 = 0; salto = 2;
if (limite <= 1)
{
resultado2.Add(temp);
break;
}
while (graphics.MeasureString(temp, font).Width <= width)
{
if (temp.IndexOf(target, index2) < 0)
{
index1 = 0; index2 = 0;
target = target + " ";
salto++;
}
index1 = temp.IndexOf(target, index2);
temp = temp.Insert(temp.IndexOf(target, index2), " ");
index2 = index1 + salto;
}
limite--;
resultado2.Add(temp);
}
var res = string.Join(useHtmlTagsForNewLines ? "<br> " + Environment.NewLine : "\n", resultado2);
if (useHtmlTagsForNewLines)
res = $"<div>{res.Replace(" ", "&nbsp;").Replace("<br>&nbsp;", "<br>")}</div>";
return res;
}
Using this function, we can transform any string into justified text, and we can choose whether the output should be HTML or a plain string.
We can then call the method like this:
string text = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.";
string transformedHtml = GetText(text, 350, true);
and we get the output as follows:
[Screenshot: output in C#]
[Screenshot: output in SSRS]
This example mainly shows how to get justified text if you're passing the values from C# code to SSRS reports, but you could achieve the same result by writing the same function as a stored procedure that formats any text the same way. Hope this helps someone.

Related

How do you style a substring in a PDF (in Adobe Acrobat)?

Editing a PDF (specifically a user-editable form) using Adobe Acrobat, and using the PDF JavaScript API, is it possible to style separate substrings within a field value? Is there a markup language used, for example?
A bit of pseudocode for what I’m talking about:
This word is <red>red</red>, this word is <bold>bold</bold>. I have spoken.
The answer is to:
a) Make sure the field type is rich text, and
b) Use Adobe's JavaScript methods to set spans within the field's richValue property with the formatting desired, for example:
var field = this.getField("MyRichTextField");
var spans = new Array();
spans[0] = new Object();
spans[0].text = "This word is ";
spans[1] = new Object();
spans[1].text = "red";
spans[1].textColor = color.red;
field.richValue = spans;
Further details and a list of span properties can be found in Acrobat JavaScript Scripting Reference (https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/Acro6JS.pdf)

Illustrator variables - dynamically line up two text strings next to each other when autogenerating

I am automating the generation of several thousand labels in Adobe Illustrator. The VariableImporter script has made easy work of it so far, but now I have hit an issue that has me stumped. The original plan worked great until the powers that be requested that one line of text contain a bold text string followed by a normal-weight text string. Before, when the font weights were the same, I could have joined the two strings in the CSV file prior to loading them into the drawing, and they would have come out lying right next to each other. This is no longer possible, and I can't think of a solution that is not incredibly fussy.
I don't know illustrator very well, so I am thinking I could just be unaware of some setting that would stick an object next to another one even as the other one moves.
Okay, here is how I figured out how to do this, with help from the Adobe forums and from Vasily.
First of all, use InDesign if possible. It is better at performing a Data Merge and can do this without any scripting.
Write out <variable1> <variable2>, formatted as needed, on the same line of text.
The variables you are putting in there need to exist somewhere in the illustration. I recommend putting them in a hidden layer behind everything.
In this script, replace variable1 and variable2 with the names of your variables where getVariableContents() is called:
var idoc = app.activeDocument;
var vars = idoc.variables;
var replace1 = /<variable1>/g;
var replace2 = /<variable2>/g;
// author CarlosCanto on adobe forums
function getVariableContents(variableName) {
var idoc = app.activeDocument;
var ivar = idoc.variables.getByName(variableName);
return ivar.pageItems[0].contents;
}
var replaceWith1 = getVariableContents('variable1'), result;
var replaceWith2 = getVariableContents('variable2'), result;
// regex_changeContentsOfWordOrString_RemainFormatting.jsx
// regards pixxxel schubser
function exchangeWords(s, replacer) {
var s = s;
var replacer = replacer;
var atfs = activeDocument.textFrames;
for (var i = atfs.length - 1; i >= 0; i--) {
atf = atfs[i];
while (result = s.exec(atf.contents)) {
try {
aCon = atf.characters[result.index];
aCon.length = result[0].length;
aCon.contents = aCon.contents.replace(s, replacer);
} catch (e) {};
}
}
}
exchangeWords(replace1,replaceWith1);
exchangeWords(replace2,replaceWith2);
Finally, run the script.
There is a way to accomplish this by having a script do some processing during your batch output, together with an organizational system that adds some overhead to your file: more text boxes and possibly an extra layer in your document. Here's what you can have: a hidden layer holding all your variables in separate single point-text objects, and a layer with your regular template objects, such as point-text or area-text objects.
Your art text objects will need to be reworked to contain a string with multiple variable placeholders, like this: "Hello, <FirstName> <LastName>". The placeholders can be styled, and a processing script would then replace the <placeholder> words with your real variable values. Where are the variable values? They populate the separate text objects in your hidden layer, and the script needs to read the contents of each of those to put into the <placeholders>.
~~Those same text fields can be styled as you wish, and the script could apply the same styles to your text when it is replaced in the main text body.~~ Actually, this won't be necessary if your routine backs up the original text frame with the placeholders in it, which preserves the styling; it may be necessary if you instead keep your original text in an external text file. And of course, the script needs to back up the original text with all the <placeholders> so that it can reset the text for every new dataset during your batch process.
However, this is much easier to do in InDesign. Can you not use InDesign for your task?
I modified the script from @tucker-david-grebitus's answer. It now gets all textual variables and replaces every occurrence of their names wrapped in percent signs (for example %FirstName%) with their contents:
for (var i = activeDocument.variables.length - 1; i >= 0; i -= 1) {
var variable = activeDocument.variables[i];
if (variable.kind !== VariableKind.TEXTUAL || !variable.pageItems.length) {
continue;
}
var search = new RegExp('%' + variable.name + '%', 'g');
var value = variable.pageItems[0].contents;
for (var j = activeDocument.textFrames.length - 1; j >= 0; j -= 1) {
var textFrame = activeDocument.textFrames[j];
textFrame.contents = textFrame.contents.replace(search, value);
}
}
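
One thing to watch with name-based substitution like this: a pattern built from a variable name is interpreted as a regular expression, and characters such as "$" in the replacement text can have a special meaning. As a language-neutral illustration of the same substitution step, here is a minimal Java sketch that treats both sides literally. Plain strings stand in for Illustrator text frames, and the class and method names are hypothetical.

```java
import java.util.Map;

public class PlaceholderSketch {
    /** Replaces every %name% in the template with the value stored under that name. */
    public static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            // String.replace(CharSequence, CharSequence) is literal and replaces all
            // occurrences, so names and values need no regex escaping.
            result = result.replace("%" + entry.getKey() + "%", entry.getValue());
        }
        return result;
    }
}
```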

How to check multiple PDF files for annotations/comments?

Problem: I routinely receive PDF reports and annotate (highlight etc.) some of them. I had the bad habit of saving the annotated PDFs together with the non-annotated PDFs. I now have hundreds of PDF files in the same folder, some annotated and some not. Is there a way to check every PDF file for annotations and copy only the annotated ones to a new folder?
Thanks a lot!
I'm on Win 7 64bit, I have Adobe Acrobat XI installed and I'm able to do some beginner coding in Python and Javascript
Please ignore the following suggestion, since the answers already solved the problem.
EDIT: Following Mr. Wyss' suggestion, I created the following code for Acrobat's Javascript console to be run only once at the beginning:
counter = 1;
// Open a new report
var rep = new Report();
rep.size = 1.2;
rep.color = color.blue;
rep.writeText("Files WITH Annotations");
Then this code should be applied to all PDFs:
this.syncAnnotScan();
annots = this.getAnnots();
path = this.path;
if (annots) {
rep.color = color.black;
rep.writeText(" ");
rep.writeText(counter.toString()+"- "+path);
rep.writeText(" ");
if (counter% 20 == 0) {
rep.breakPage();
}
counter++;
}
And, at last, one code to be run only once at the end:
//Now open the report
var docRep = rep.open("files_with_annots.pdf");
There are two problems with this solution:
1. The "Action Wizard" seems to always apply the same code afresh to each PDF. That means the "counter" variable, for instance, is meaningless: it will always be 1. More importantly, the variable "rep" will be unassigned when the middle code is run on different PDFs.
2. How can I make the code that should run only once run only at the beginning or at the end, instead of running every time for every single PDF (as it does by default)?
Thank you very much again for your help!
This would be possible using the Action Wizard to put together an action.
The function to determine whether there are annotations in the document would be done in Acrobat JavaScript. Roughly, the core function would look like this:
this.syncAnnotScan() ; // updates all annots
var myAnnots = this.getAnnots() ;
if (myAnnots != null) {
// do something if there are annots
} else {
// do something if there are no annots
}
And that should get you there.
I am not completely positive, but I think there is also a Preflight check which tells you whether there are annotations in the document. If so, you would create a Preflight droplet, which would sort out the annotated and not annotated documents.
Mr. Wyss is right, here's a step-by-step guide:
In Acrobat XI Pro, go to the 'Tools' panel on the right side
Click on the 'Action Wizard' tab (you must first make it visible, though)
Click on 'Create New Action...', choose 'More tools' > 'Execute Javascript' and add it to right-hand pane > click on 'Execute Javascript' > 'Specify Settings' (uncheck 'prompt user' if you want) > paste this code:
this.syncAnnotScan();
var annots = this.getAnnots();
var fname = this.documentFileName;
fname = fname.replace(/,/g, ";"); // the g flag replaces every comma, not just the first one
var errormsg = "";
if (annots) {
try {
this.saveAs({
cPath: "/c/folder/"+fname,
bPromptToOverwrite: false //make this 'true' if you want to be prompted on overwrites
});
} catch(e) {
for (var i in e)
{errormsg+= (i + ": " + e[i]+ " / ");}
app.alert({
cMsg: "Error! Unable to save the file under this name ('"+fname+"' - possibly a Unicode string?) See this: "+errormsg,
cTitle: "Damn you Acrobat"
});
}
;}
annots = 0;
Save and run it! All your annotated PDFs will be saved to 'c:\folder' (but only if this folder already exists!).
Be sure to enable JavaScript first in 'Edit' > 'Preferences...' > 'Javascript' > 'Enable Acrobat Javascript'.
VERY IMPORTANT: Acrobat's JS has a bug that prevents documents from being saved with commas (",") in their names (e.g., "Meeting with suppliers, May 11th.pdf" will cause an error). Therefore, the code above replaces every "," in the file name with ";".

Gecko Engine in ABCPDF not finding tags

Has anyone tried to implement tags using the ABCPDF Gecko engine? I have it working fine on the MSHTML engine (Internet Explorer), but as soon as I use Gecko, which renders my HTML better, it can't find the tags specified in the HTML.
I'm using style="abcpdf-tag-visible: true;" to specify a tag which works using the default engine.
The following code produces a blank document.
[Test]
public void Tags_With_Gecko()
{
Doc theDoc = new Doc();
theDoc.Rect.Inset(100, 100);
theDoc.Rect.Top = 700;
theDoc.HtmlOptions.Engine = EngineType.Gecko;
// Tag elements with style 'abcpdf-tag-visible: true'
theDoc.HtmlOptions.ForGecko.AddTags = true;
int id = theDoc.AddImageHtml("<FONT id=\"p1\" style=\"abcpdf-tag-visible: true; font-size: 72pt\">Gallia est omnis divisa in partes tres.</FONT>");
// Frame location of the tagged element
XRect[] tagRects = theDoc.HtmlOptions.ForGecko.GetTagRects(id);
foreach (XRect theRect in tagRects)
{
theDoc.Rect.String = theRect.ToString();
theDoc.FrameRect();
}
// Output tag ID
string[] tagIds = theDoc.HtmlOptions.ForGecko.GetTagIDs(id);
theDoc.Rect.String = theDoc.MediaBox.String;
theDoc.Rect.Inset(20, 20);
theDoc.FontSize = 64;
theDoc.Color.String = "255 0 0";
theDoc.AddText("Tag ID \"" + tagIds[0] + "\":");
// Save the document
const string testFilename = @"C:\pdf\HtmlOptionsGetTagRects.pdf";
if (File.Exists(testFilename))
File.Delete(testFilename);
theDoc.Save(testFilename);
theDoc.Clear();
Process.Start(testFilename);
}
Almost identical code for the default engine produces it correctly.
I've been talking to WebSuperGoo support. Found out the documentation isn't consistent/complete.
http://www.websupergoo.com/helppdfnet/source/5-abcpdf/xhtmloptions/2-properties/addtags.htm
In Gecko, your tag has to have a visible impact on the page for it to be picked up. In my case, I had a tag that displayed a non-breaking space, and thus it wasn't found.
From their example, changing the style to the following got it to be findable:
style="abcpdf-tag-visible: true; border: 1px solid transparent"
Note that the border setting is apparently what makes this work.
Again, this fixes their demo and thus should fix Dillorscroft's example.
I have to futz a bit more to fix my problem, as I am trying to allocate blank spaces on the page (for a table of contents) so they can be updated after the html is rendered and I know where the first content page will start.

PDFBox extracting paragraphs

I am new to PDFBox and I want to extract a paragraph that matches some particular words. I am able to extract the whole PDF to text (Notepad), but I have no idea how to extract a particular paragraph in my Java program. Can anyone help me with this, or at least point me to some tutorials or examples? Thank you so much.
Text in PDF documents is absolutely positioned. So instead of words, lines and paragraphs, one only has absolutely positioned characters.
Let's say you have a paragraph:
Neque porro quisquam est qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit
Roughly speaking, in the PDF file it will be represented as the character N at some position, e a bit to the right of it, q, u, e further to the right, and so on.
PDFBox tries to guess how the characters make words, lines and paragraphs. So it will look for a lot of characters at approximately same vertical position, for groups of characters that are near to each other and similar to try and find what you need. It does that by extracting the text from the entire page and then processing it character by character to create text (it can also try and extract text from just one rectangular area inside a page). See the appropriate class PDFTextStripper (or PDFTextStripperByArea). For usage, see ExtractText.java in PDFBox sources.
That means that you cannot extract paragraphs easily using PDFBox. It also means that PDFBox can and sometimes will miss when extracting text (there are a lot of very different PDF documents out there).
What you can do is extract text from the entire page and then try and find your paragraph searching through that text. Regular expressions are usually well suited for such tasks (available in Java either through Pattern and Matcher classes, or convenience methods on String class).
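
As a minimal sketch of that last step, the following Java snippet searches already-extracted page text for paragraphs containing a keyword. It deliberately takes a plain String, so it does not depend on PDFBox. It assumes that paragraphs in the extracted text are separated by blank lines, which will not hold for every PDF, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ParagraphSearchSketch {
    // Assumption: a paragraph break shows up as a blank (or whitespace-only) line.
    private static final Pattern BLANK_LINE = Pattern.compile("\\R\\s*\\R");

    /** Returns every paragraph of pageText that contains keyword, ignoring case. */
    public static List<String> paragraphsContaining(String pageText, String keyword) {
        // Pattern.quote makes the keyword match literally, even if it contains regex characters.
        Pattern wanted = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE);
        List<String> hits = new ArrayList<>();
        for (String paragraph : BLANK_LINE.split(pageText)) {
            if (wanted.matcher(paragraph).find()) {
                hits.add(paragraph.trim());
            }
        }
        return hits;
    }
}
```

The pageText argument would typically be the string returned by the text stripper for one page.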
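import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class ParagraphDump {
    public static void main(String[] args) throws IOException {
        File file = new File("File Path");
        // try-with-resources closes the document when we are done with it
        try (PDDocument document = PDDocument.load(file)) {
            PDFTextStripper pdfStripper = new PDFTextStripper();
            // Marker string written at the start of every paragraph PDFBox detects
            pdfStripper.setParagraphStart("/t");
            pdfStripper.setSortByPosition(true);
            // Splitting on the marker yields one chunk per detected paragraph
            for (String line : pdfStripper.getText(document).split(pdfStripper.getParagraphStart())) {
                System.out.println(line);
                System.out.println("********************************************************************");
            }
        }
    }
}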
Please try the above code. It works for me with the PDFBox 2.0.8 jar.
I detected the start of a paragraph using the following approach. Read the page line by line, and for each line:
Find the last index of '.' (period) in the line.
Compare this index with the length of the input line.
If the index is less than the length minus one, the line does not end with a period, so the current paragraph has not ended yet.
If the period is the last character, the current paragraph has ended and the next line will be the beginning of a new paragraph.
Hope this helps.
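
The steps above can be sketched in Java roughly as follows. This is only an illustration of the heuristic: it works on lines that have already been extracted, trims trailing whitespace first (an assumption, since extracted lines often carry it), and uses made-up class and method names. Note that it will split too eagerly whenever a sentence happens to end at the end of a line in the middle of a paragraph.

```java
import java.util.ArrayList;
import java.util.List;

public class PeriodHeuristicSketch {
    /** Groups lines into paragraphs, closing a paragraph after each line that ends with '.'. */
    public static List<String> group(List<String> lines) {
        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(line);
            // The period is the final character exactly when its last index is length - 1.
            if (line.lastIndexOf('.') == line.length() - 1) {
                paragraphs.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            paragraphs.add(current.toString()); // trailing text without a closing period
        }
        return paragraphs;
    }
}
```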
After extracting the text, paragraphs can be constructed programmatically by considering the following points:
Every line that starts with a lowercase letter should be joined with the previous line. A line that starts with a capital letter may also need to be joined with the previous line, e.g. inside a quoted expression.
A line ending with one of the characters . ? ! " may be the end of a paragraph, but not always.
Once a paragraph has been determined programmatically, test it for an even number of quotes. These may be plain double quotes or the Unicode opening and closing double quotes.
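
Two of these checks are easy to express as small helpers. The following Java sketch is one possible reading of them, not a complete paragraph builder. The class and method names are hypothetical, and the curly-quote check requires matching counts of opening and closing quotes rather than just an even total.

```java
public class QuoteBalanceSketch {
    /** True if straight double quotes come in pairs and curly opening/closing quotes match in number. */
    public static boolean hasBalancedQuotes(String paragraph) {
        int straight = 0;
        int opening = 0;
        int closing = 0;
        for (int i = 0; i < paragraph.length(); i++) {
            char c = paragraph.charAt(i);
            if (c == '"') {
                straight++;
            } else if (c == '\u201C') { // left double quotation mark
                opening++;
            } else if (c == '\u201D') { // right double quotation mark
                closing++;
            }
        }
        return straight % 2 == 0 && opening == closing;
    }

    /** True if the line's first non-blank character is lowercase, i.e. it likely continues the previous line. */
    public static boolean startsWithLowercase(String line) {
        String trimmed = line.trim();
        return !trimmed.isEmpty() && Character.isLowerCase(trimmed.charAt(0));
    }
}
```

A candidate paragraph that fails hasBalancedQuotes probably ends inside a quotation and should be extended with the next line.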
Try this:
private static String getParagraphs(String filePath, int linecount) throws IOException {
ParagraphDetector paragraphDetector = new ParagraphDetector();
StringBuilder extracted = new StringBuilder();
LineIterator it = IOUtils.lineIterator(new BufferedReader(new FileReader(filePath)));
int i = 0;
String line;
for (int lineNumber = 0; it.hasNext(); lineNumber++) {
line = (String) it.next();
if (lineNumber == linecount) {
for (int j = 0; it.hasNext(); j++) {
extracted.append((String) it.next());
}
}
}
return paragraphDetector.SentenceSplitter(extracted.toString());
}
You can first use PDFBox's getText function to get the text. Every line ends with '\n', so you cannot segment paragraphs simply by splitting on "\n". If a line satisfies the following condition:
line.length() > 2 && (int)line.charAt(line.length()-2) == 32
then this line is the last line of its paragraph. Here 32 is the Unicode value of the space character.
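
A minimal Java sketch of grouping lines with this condition might look like the following. It assumes each line still carries its one-character line terminator, so that the second-to-last character is the one just before it. Whether the extracted text really has a space there depends on the PDF and the PDFBox version. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TrailingSpaceSketch {
    /** The condition from the answer: the second-to-last character is a space (code 32). */
    public static boolean endsParagraph(String line) {
        return line.length() > 2 && line.charAt(line.length() - 2) == ' ';
    }

    /** Concatenates lines, closing a paragraph after each line that satisfies endsParagraph. */
    public static List<String> group(List<String> lines) {
        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            current.append(line);
            if (endsParagraph(line)) {
                paragraphs.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            paragraphs.add(current.toString()); // leftover lines form a final paragraph
        }
        return paragraphs;
    }
}
```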