Does BIRT recognize RTF tags? - blob

I have a dataset that returns a BLOB field (that is how BIRT bound it in the table). In the database the column type is LONG RAW, so I need to transform the binary data to text using a generic convert function.
The problem is that BIRT does not appear to recognize the embedded RTF markup after the conversion, but maybe I'm doing something wrong.
I am using a Dynamic Text component whose expression (set in the Expression Builder) is the converted data. The content type of that field is set to RTF.
Here is what BIRT shows:
{\rtf1\ansi
\ansicpg1252\deff0{\fonttbl{\f0\fnil MS
Sans Serif;}{\f1\fnil\fcharset0 MS Sans
Serif;}}
\viewkind4\uc1\pard\qc\lang1046\b
\f0\fs16 1 x\f1\'ed\-cara de leite
\par 1 colher de sopa de fermendo em p
\'f3
\par 3 x\'ed\-caras de farinha de trigo
\par 3 x\'ed\-caras de a\'e7\'facar
\par 3 ovos
\par 4 colheres de margarina\b0\f0
\par }
As we can see, the text contains RTF tags mixed with the main content.
The idea is to make BIRT either strip the tags or render them as formatting.
Here is the output I was expecting:
1 xícara de leite
1 colher de sopa de fermento
3 xícaras de farinha de trigo

After some research I found a possible answer, though not a perfect one, because the goal was to render the RTF formatting rather than discard it. Here it is:
The first step is to convert the binary data to a string:
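function convert(byteArr) {
    // Accumulate into the same variable that is returned
    var converted = "";
    for (var i = 0; i < byteArr.length; i++) {
        // Java bytes are signed; mask to 0-255 before mapping to a character
        converted += String.fromCharCode(byteArr[i] & 0xFF);
    }
    return converted;
}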
The next step is to delete all RTF tags using regular expressions. This solution is based on the post "Regular Expression for extracting text from an RTF string".
function removeRTF(str) {
    // Groups ending in ";" (e.g. font table entries), braces and control words
    var basicRtfPattern = /\{\*?\\[^{}]+;}|[{}]|\\[A-Za-z]+\n?(?:-?\d+)?[ ]?/g;
    var newLineSlashesPattern = /\\\n/g;
    var ctrlCharPattern = /\n\\f[0-9]\s/g;
    return str
        .replace(ctrlCharPattern, "")
        .replace(basicRtfPattern, "")
        .replace(newLineSlashesPattern, "\n")
        // Map each \'xx escape to its accented character
        .replace(/\\'c9/g, "É")
        .replace(/\\'cd/g, "Í")
        // \'ed may be followed by an optional hyphen (\-), as in the sample
        .replace(/\\'ed(?:\\-)?/g, "í")
        .replace(/\\'f3/g, "ó")
        .replace(/\\'d3/g, "Ó")
        .replace(/\\'fa/g, "ú")
        .replace(/\\'da/g, "Ú")
        .replace(/\\'e7/g, "ç")
        .replace(/\\'e1/g, "á")
        .replace(/\\'e0/g, "à")
        .replace(/\\'c0/g, "À")
        .replace(/\\'c1/g, "Á")
        .trim();
}
Note that each accented character is mapped individually, so any \'xx escape that is not listed is left in the output.
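Rather than listing every accent, the \'xx escapes could be decoded generically. This is only a minimal sketch: it assumes the RTF declares \ansicpg1252 (as the sample does) and uses only bytes in the range 0xA0-0xFF, where the Windows-1252 byte value equals the Unicode code point. The name decodeRtfHexEscapes is made up for illustration and is not part of BIRT.

```javascript
// Hypothetical helper: decodes RTF \'xx hex escapes generically.
// Assumption: code page 1252 and bytes in 0xA0-0xFF, where the byte value
// equals the Unicode code point. Bytes 0x80-0x9F would need a lookup table.
function decodeRtfHexEscapes(str) {
    return str
        // Drop optional hyphens (\-), which the sample places after \'ed
        .replace(/\\-/g, "")
        // Turn \'ed into the character with code 0xED, and so on
        .replace(/\\'([0-9a-fA-F]{2})/g, function (match, hex) {
            return String.fromCharCode(parseInt(hex, 16));
        });
}
```

Under those assumptions, a single call to this function could stand in for the chain of per-accent .replace calls.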

The old ROM specification of BIRT shows that once upon a time there were plans to support RTF formatted text, but it was never implemented (and never will be implemented).
The de-facto standard for formatted text coded in a text file is now HTML.
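If the goal is to keep some of the formatting, one option is to translate a small subset of RTF into HTML and set the Dynamic Text content type to HTML instead. The following is a rough sketch under strong assumptions: flat RTF like the sample in the question, code page 1252, and only \par, \b, \b0 and \'xx being of interest. The function name rtfToHtml is hypothetical, and everything it does not recognize is simply dropped.

```javascript
// Hypothetical sketch: maps a tiny subset of RTF (\par, \b, \b0, \'xx) to HTML.
// Assumption: flat RTF like the sample above; anything else is dropped.
function rtfToHtml(rtf) {
    return rtf
        // Escape characters that are special in HTML before adding tags
        .replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;")
        // Remove the font table group, including its nested {...} entries
        .replace(/\{\\fonttbl(?:\{[^{}]*\}|[^{}])*\}/g, "")
        // Paragraph ends become line breaks (\pard must not match)
        .replace(/\\par(?![a-z])\s?/g, "<br>")
        // Bold off must be handled before bold on
        .replace(/\\b0\s?/g, "</b>")
        .replace(/\\b(?![a-z0-9])\s?/g, "<b>")
        // Optional hyphens and hex escapes (code page 1252, 0xA0-0xFF assumed)
        .replace(/\\-/g, "")
        .replace(/\\'([0-9a-fA-F]{2})/g, function (match, hex) {
            return String.fromCharCode(parseInt(hex, 16));
        })
        // Drop every remaining control word, the group braces and raw newlines
        .replace(/\\[a-z]+-?\d*\s?/gi, "")
        .replace(/[{}]/g, "")
        .replace(/[\r\n]+/g, "")
        .trim();
}
```

A real RTF document can nest groups and carry many more control words, so this should be read as an illustration of the mapping idea rather than a converter.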

Related

How to check if a field is empty or exist if the field is deep in the inheritance tag tree. (SSRS report)

I need to show a file name (NomFichier in French) only if a file is attached to the report.
Here is part of the XML, where PiecesJointes means attached files and contains several types of files. The one that interests us is ResolutionCA.
<PiecesJointes>
<AttestationRQ>
<InfoFichierJoint>
<NomFichier>readme.txt</NomFichier>
</InfoFichierJoint>
</AttestationRQ>
<ResolutionCA>
<InfoFichierJoint>
<NomFichier>test.txt</NomFichier>
</InfoFichierJoint>
</ResolutionCA>
<FormulaireCautionnement>
<InfoFichierJoint>
<NomFichier>NW2W014_20210504075509_readme.txt</NomFichier>
</InfoFichierJoint>
</FormulaireCautionnement>
</PiecesJointes>
My question:
Let's say I want to check whether NomFichier (file name), inside the InfoFichierJoint tag that is itself inside the ResolutionCA tag, exists and contains a file name. What do I have to do?
Here is what I tried without success, which led me to think that the nesting of the tags was the problem.
=IIF(IsNothing(Fields!NomFichier.Value)= "true" OR Fields!NomFichier.Value ="", "Les conditions de votre demande ne requièrent aucun document. ",
"Résolution de la personne morale, société ou autre entité qui autorise le répondant à présenter la demande de permis")

Can CrystalReports for VS2013 be programmed to export each page of the report to a separate pdf file

I am well into developing a billing program in VB2013 that needs to be able to export each customer bill to a pdf file that can then be attached to an email to the customer being billed. I have used CR for many, many years, but I have not found any way to programmatically make CR export to pdf. I have made activereports2 do so, but I am trying to get back down to just one report generator. I have had compatibility issues with Activereports2 by Datadynamics when running on some Windows Vista and later machines, so I was hoping to move everything to CR.
Well you can definitely generate Crystal Reports and convert to PDF on the fly in .NET.
Here's some sample code to help with that (this saves the PDF to the database, but you can remove that part if you don't need it):
public static int Crystal_PDFToDatabase(string reportName, object par1, object par2, object par3, string user, string DocName, string DocDesc, string DocType, int ClaimID, string exportFormatType)
{
    CrystalReportSource CrystalReportSource1 = null;
    try
    {
        CrystalReportSource1 = Crystal_SetDataSource(reportName, par1, par2, par3);
        // Set the export format (always PDF here; exportFormatType is not used)
        CrystalDecisions.Shared.ExportFormatType typ = CrystalDecisions.Shared.ExportFormatType.PortableDocFormat;
        Stream str = CrystalReportSource1.ReportDocument.ExportToStream(typ);
        if (str != null)
        {
            var memoryStream = new MemoryStream();
            str.CopyTo(memoryStream);
            int DocID = db_Docs.SaveNewDocument(DocName, DocDesc, DocType, user, memoryStream.ToArray(), ClaimID, null, null);
            return DocID;
        }
        return 0;
    }
    catch
    {
        return 0;
    }
    finally
    {
        // Clear out the cache on every path (to prevent other Crystal reports
        // from reusing old generated documents)
        if (CrystalReportSource1 != null)
        {
            CrystalReportSource1.ReportDocument.Close();
            CrystalReportSource1.Dispose();
        }
    }
}
Producing a separate PDF for each page of the report is a separate issue. I simply would not design it that way. Instead, I'd have the Crystal report generate one page, convert it to a PDF MemoryStream on the fly, and email it out, then iterate on to the next customer. That is much easier than figuring out how (if it's possible at all) to split one huge Crystal Report / PDF into many slices.

Lucene: How to index and search multiple value under single field

How do I index and search multiple values under a single field?
E.g., say I have a field processor which might hold i3,i5,i7 or i3 or i3,i5.
Now imagine laptop data as below:
data1:
name= laptop name
price = laptop price
processor=core duo
data2:
name= laptop name
price = laptop price
processor=i3,i5
data3:
name= laptop name
price = laptop price
processor=i3,i5,i7
Now, if a user wants to search only for i3 and i5 processors, it should show data2 and data3 only.
So my question is: how should I index and search with Lucene? I am using Lucene 4.4.
I checked this but could not understand it, as there was no example. An example would help.
Frankly, there isn't really much to it. Using StandardAnalyzer and the standard QueryParser, you would simply add the field to the document, in the form shown, like:
Document document = new Document();
// TextField in Lucene 4.x takes a Field.Store argument as well
document.add(new TextField("name", "laptop name", Field.Store.YES));
document.add(new TextField("processor", "i3,i5,i7", Field.Store.YES));
// Add other fields as needed...
// Assuming you've set up your writer to use StandardAnalyzer...
writer.addDocument(document);
StandardAnalyzer will tokenize on punctuation (and whitespace, etc.), indexing the tokens "i3", "i5" and "i7" in the "processor" field, so when using the standard QueryParser (see query parser syntax), the query:
processor:(i3 i5)
will find any documents with either an "i3" or "i5" in the "processor" field. To require both terms, mark each one as required: processor:(+i3 +i5)
You can take inspiration from my source code: http://git.abclinuxu.cz/?p=abclinuxu.git;a=tree;f=src/cz/abclinuxu/utils/search;h=d825ec75da1b19ca0cd6065458fec771de174be9;hb=HEAD
MyDocument is a POJO that constructs a Lucene Document. Important information is stored in a Field, so it is searchable. My document type is similar to your processor type:
Field field = new Field(TYPE, type, Field.Store.YES, Field.Index.UN_TOKENIZED);
Each processor type should be stored individually, that is, as its own Field instance, because an untokenized field is indexed as a single term.

Xpath search for .docx

I want to read specific text from a sub-table in a .docx file.
Is there an efficient way, such as XPath traversal or a similar API, supported in Java?
Currently I read the .docx using Apache POI in Java (code snippet below), but this way I have to iterate over all the nodes with the tag 'w:tr' and read each node's text value. Is there an approach to quickly retrieve the required data based on a search pattern, like XPath? Any input is highly appreciated.
File myFile = new File( "D:\\XLS-Pages\\TestSherwin.docx" );
ZipFile docxFile = new ZipFile( myFile );
ZipEntry documentXML = docxFile.getEntry( "word/document.xml" );
InputStream documentXMLIS = docxFile.getInputStream( documentXML );
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
org.w3c.dom.Document doc = dbf.newDocumentBuilder().parse( documentXMLIS );
org.w3c.dom.Element tElement = doc.getDocumentElement();
NodeList n = (NodeList) tElement.getElementsByTagName( "w:tr" );
You can use XPath in docx4j; support is based on JAXB's support for XPath, with the various limitations which that entails.

RegEx help - finding / returning a code

I must admit it's been a few years since my RegEx class and since then, I have done little with them. So I turn to the brain power of SO. . .
I have an Excel spreadsheet (2007) with some data. I want to search one of the columns for a pattern (here's the RegEx part). When I find a match I want to copy a portion of the found match to another column in the same row.
A sample of the source data is included below. Each line represents a cell in the source.
I'm looking for a regex that matches "abms feature = XXX" where XXX is a variable-length word with no spaces in it and, I think, only alphabetic characters. Once I find a match, I want to toss out the "abms feature = " portion of the match and place the code (the XXX part) into another column.
I can handle the excel coding part. I just need help with the regex.
If you can provide a solution to do this entirely within Excel - no coding required, just using native excel formula and commands - I would like to hear that, too.
Thanks!
###################################
Structure
abms feature = rl
abms feature = sta
abms feature = pc, pcc, pi, poc, pot, psc, pst, pt, radp
font = 5 abms feature = equl, equr
abms feature = bl
abms feature = tl
abms feature = prl
font = 5
###################################
I am still learning about regex myself, but I have found this site useful for getting ideas or comparing against what I came up with; it might help in the future:
http://regexlib.com/
Try this regular expression:
abms feature = (\w+)
Here is an example of how to extract the value from the capture group:
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // A verbatim string literal starts with @, so \w needs no extra escaping
        Regex regex = new Regex(@"abms feature = (\w+)",
            RegexOptions.Compiled |
            RegexOptions.CultureInvariant |
            RegexOptions.IgnoreCase);
        Match match = regex.Match("abms feature = XXX");
        if (match.Success)
        {
            // Group 1 holds the code captured by (\w+)
            Console.WriteLine(match.Groups[1].Value);
        }
    }
}
(?<=^abms feature = )[a-zA-Z]*
assuming you're not doing anything with the words after the commas
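If the codes after the commas do matter, the capture could be widened and then split. Here is a minimal JavaScript sketch of that idea; the helper name extractAbmsCodes is made up for illustration, and it assumes the codes are purely alphabetic and comma-separated, as in the sample data.

```javascript
// Hypothetical helper: returns all codes listed after "abms feature = ".
// Assumption: codes are alphabetic words separated by commas.
function extractAbmsCodes(cell) {
    var match = /abms feature = ([a-z]+(?:\s*,\s*[a-z]+)*)/i.exec(cell);
    if (match === null) {
        return []; // no "abms feature" entry in this cell
    }
    // Split the captured list on commas, ignoring surrounding whitespace
    return match[1].split(/\s*,\s*/);
}
```

Because the pattern is not anchored to the start of the cell, it also picks up entries that follow other text, such as the "font = 5 abms feature = equl, equr" line.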