Creating multiple PDFs from multiple Excel files (both .xls and .xlsx formats) in Java - Apache

Below is my code to convert an Excel file to PDF, but I don't understand how to generate multiple PDFs from multiple Excel files.
String files;
File folder = new File(dirpath);
File[] listOfFiles = folder.listFiles();
for (int i = 0; i < listOfFiles.length; i++) {
if (listOfFiles[i].isFile()) {
files = listOfFiles[i].getName();
if (files.endsWith(".xls") || files.endsWith(".xlsx")) {
// inputting files one by one
//here it should take an input one by one
System.out.println(files);
String inputR = files.toString();
FileInputStream input_document = new FileInputStream(new File("D:\\ExcelToPdfProject\\"+inputR));
// Read workbook into HSSFWorkbook
Workbook workbook = null;
if (inputR.endsWith(".xlsx")) {
workbook = new XSSFWorkbook(input_document);
System.out.println("1");
} else if (inputR.endsWith(".xls")) {
workbook = new HSSFWorkbook(input_document);
System.out.println("GO TO HELL ######");
} else {
System.out.println("GO TO HELL");
}
Sheet my_worksheet = workbook.getSheetAt(2);
// Read worksheet into HSSFSheet
// To iterate over the rows
Iterator<Row> rowIterator = my_worksheet.iterator();
//Iterator<Row> rowIterator1 = my_worksheet.iterator();
//We will create output PDF document objects at this point
Document iText_xls_2_pdf = new Document();
PdfWriter writer = PdfWriter.getInstance(iText_xls_2_pdf, new FileOutputStream("D:\\Output.pdf"));
iText_xls_2_pdf.open();
//we have two columns in the Excel sheet, so we create a PDF table with two columns
//Note: There are ways to make this dynamic in nature, if you want to.
Row row = rowIterator.next();
row.setHeight((short) 2);
int count = row.getPhysicalNumberOfCells();
PdfPTable my_table = new PdfPTable(count);
float[] columnWidths = new float[count];
my_table.setWidthPercentage(100f);
//We will use the object below to dynamically add new data to the table
PdfPCell table_cell;
I want something that can help me create a folder full of pdfs.
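One thing that stands out in the posted loop is that every workbook is written to the same fixed path, D:\\Output.pdf, so each file would overwrite the previous one. A minimal sketch of one way to give every Excel file its own PDF name is shown here. The class PdfTargets and its methods are hypothetical helpers for illustration only; they are not part of POI or iText, and the conversion itself is left out.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI or iText: decides where each PDF goes.
class PdfTargets {

    // "report.xlsx" -> "report.pdf"; returns null for names that are not Excel files.
    static String pdfNameFor(String excelName) {
        String lower = excelName.toLowerCase();
        if (!lower.endsWith(".xls") && !lower.endsWith(".xlsx")) {
            return null;
        }
        return excelName.substring(0, excelName.lastIndexOf('.')) + ".pdf";
    }

    // Pairs every Excel file in inputDir with its own output file in outputDir.
    static List<File[]> plan(File inputDir, File outputDir) {
        List<File[]> pairs = new ArrayList<>();
        File[] entries = inputDir.listFiles();
        if (entries == null) {
            return pairs; // not a directory, or not readable
        }
        for (File entry : entries) {
            String pdfName = entry.isFile() ? pdfNameFor(entry.getName()) : null;
            if (pdfName != null) {
                pairs.add(new File[] { entry, new File(outputDir, pdfName) });
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Both directory arguments are placeholders supplied by the caller.
        for (File[] pair : plan(new File(args[0]), new File(args[1]))) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Inside the existing loop, the input stream would then be opened on pair[0] and the FileOutputStream passed to PdfWriter.getInstance would be opened on pair[1], instead of the hard-coded input folder and output file.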

Related

Automatic PDF Rendering

I've read the MigraDoc/PdfSharp documentation, but it feels a bit thin. I want to render out a PDF, but not have to manually specify width and height. I just want it to align right, center, or left (of margins), and handle all the sizing for me.
Public Sub Write()
Dim document As PdfDocument = New PdfDocument()
Dim page As PdfPage = document.AddPage()
Dim gfx As XGraphics = XGraphics.FromPdfPage(page)
gfx.MUH = PdfFontEncoding.Unicode
gfx.MFEH = PdfFontEmbedding.Default
Dim font As XFont = New XFont("Verdana", 13, XFontStyle.Bold)
Dim migraDocument As New Document
Dim sec As Section = migraDocument.AddSection()
Dim quotationHeader As New Paragraph
quotationHeader.AddText("Quotation" & vbNewLine)
quotationHeader.Format.Alignment = ParagraphAlignment.Right
sec.Add(quotationHeader)
Dim dhAddressInfo As New Paragraph
dhAddressInfo.AddText("ADDRESS GOES HERE")
dhAddressInfo.Format.Alignment = ParagraphAlignment.Left
sec.Add(dhAddressInfo)
Dim quotationInfo As New Paragraph
quotationInfo.AddText("QUOTATION INFO AND DATE HERE")
quotationInfo.Format.Alignment = ParagraphAlignment.Right
sec.Add(quotationInfo)
Dim customerBilling As New Paragraph
With Customer
customerBilling.AddText("CUSTOMER BILLING OBJECT PROPERTIES HERE")
End With
customerBilling.Format.Alignment = ParagraphAlignment.Left
sec.Add(customerBilling)
Dim authorInfo As New Paragraph
authorInfo.AddText("AUTHOR INFO HERE")
authorInfo.Format.Alignment = ParagraphAlignment.Right
sec.Add(authorInfo)
Dim pricingTable As New Table
'pricingTable.Format.Alignment = ParagraphAlignment.Center
pricingTable.AddColumn("13cm")
pricingTable.AddColumn("13cm")
Dim headerRow As New Row
headerRow = pricingTable.AddRow()
headerRow.HeadingFormat = True
headerRow.Cells(0).AddParagraph("Description")
headerRow.Cells(1).AddParagraph("Amount")
For i As Integer = 0 To SelectedPrices.Count - 1
Dim row As Row = pricingTable.AddRow()
Dim price As Pricing = SelectedPrices(i)
row.Cells(0).AddParagraph(price.Item)
row.Cells(1).AddParagraph(price.Price * price.Quantity)
Next
Dim totalRow As Row = pricingTable.AddRow()
totalRow.Cells(0).AddParagraph("Total: ")
Dim total As Double = 0
For Each price As Pricing In SelectedPrices
total = total + (price.Price * price.Quantity)
Next
totalRow.Cells(1).AddParagraph(total.ToString)
sec.Add(pricingTable)
Dim docRenderer As DocumentRenderer = New DocumentRenderer(migraDocument)
docRenderer.PrepareDocument()
docRenderer.RenderObject(gfx, XUnit.FromCentimeter(0), XUnit.FromCentimeter(0), "10cm", quotationHeader)
docRenderer.RenderObject(gfx, XUnit.FromCentimeter(0), XUnit.FromCentimeter(2), "10cm", dhAddressInfo)
docRenderer.RenderObject(gfx, XUnit.FromCentimeter(5), XUnit.FromCentimeter(2), "10cm", quotationInfo)
docRenderer.RenderObject(gfx, XUnit.FromCentimeter(0), XUnit.FromCentimeter(6), "10cm", customerBilling)
docRenderer.RenderObject(gfx, XUnit.FromCentimeter(5), XUnit.FromCentimeter(6), "10cm", authorInfo)
docRenderer.RenderObject(gfx, XUnit.FromCentimeter(3), XUnit.FromCentimeter(10), "10cm", pricingTable)
document.Save(Environment.CurrentDirectory & "\test.pdf")
End Sub
Notice at the bottom I'm specifying the X and Y coordinates of each section. I just want to define spacing. Alignment should take care of the rest.
I found a different tutorial that uses PdfDocumentRenderer and shows how to correctly use it. It's not in VB, but quite easily translated. I copied it below in case the link goes dead.
http://www.c-sharpcorner.com/UploadFile/aftab_ku/create-object-model-document-and-renders-them-into-pdf/
public Document CreateDocument()
{
// Create a new MigraDoc document
this.document = new Document();
this.document.Info.Title = "";
this.document.Info.Subject = "";
this.document.Info.Author = "Aftab";
DefineStyles();
CreatePage();
FillContent();
return this.document;
}
Here, CreateDocument() in PDFform.cs creates a new MigraDoc document. It calls three functions: one defines the styles, one creates the page, and one fills the table with content.
//
void DefineStyles()
{
// Get the predefined style Normal.
Style style = this.document.Styles["Normal"];
// Because all styles are derived from Normal, the next line changes the
// font of the whole document. Or, more exactly, it changes the font of
// all styles and paragraphs that do not redefine the font.
style.Font.Name = "Verdana";
style = this.document.Styles[StyleNames.Header];
style.ParagraphFormat.AddTabStop("16cm", TabAlignment.Right);
style = this.document.Styles[StyleNames.Footer];
style.ParagraphFormat.AddTabStop("8cm", TabAlignment.Center);
// Create a new style called Table based on style Normal
style = this.document.Styles.AddStyle("Table", "Normal");
style.Font.Name = "Times New Roman";
style.Font.Size = 9;
// Create a new style called Reference based on style Normal
style = this.document.Styles.AddStyle("Reference", "Normal");
style.ParagraphFormat.SpaceBefore = "5mm";
style.ParagraphFormat.SpaceAfter = "5mm";
style.ParagraphFormat.TabStops.AddTabStop("16cm", TabAlignment.Right);
}
DefineStyles() does the job of styling the document.
void CreatePage()
{
// Each MigraDoc document needs at least one section.
Section section = this.document.AddSection();
// Put a logo in the header
Image image= section.AddImage(path);
image.Top = ShapePosition.Top;
image.Left = ShapePosition.Left;
image.WrapFormat.Style = WrapStyle.Through;
// Create footer
Paragraph paragraph = section.Footers.Primary.AddParagraph();
paragraph.AddText("Health And Social Services.");
paragraph.Format.Font.Size = 9;
paragraph.Format.Alignment = ParagraphAlignment.Center;
............
// Create the item table
this.table = section.AddTable();
this.table.Style = "Table";
this.table.Borders.Color = TableBorder;
this.table.Borders.Width = 0.25;
this.table.Borders.Left.Width = 0.5;
this.table.Borders.Right.Width = 0.5;
this.table.Rows.LeftIndent = 0;
// Before you can add a row, you must define the columns
Column column;
foreach (DataColumn col in dt.Columns)
{
column = this.table.AddColumn(Unit.FromCentimeter(3));
column.Format.Alignment = ParagraphAlignment.Center;
}
// Create the header of the table
Row row = table.AddRow();
row.HeadingFormat = true;
row.Format.Alignment = ParagraphAlignment.Center;
row.Format.Font.Bold = true;
row.Shading.Color = TableBlue;
for (int i = 0; i < dt.Columns.Count; i++)
{
row.Cells[i].AddParagraph(dt.Columns[i].ColumnName);
row.Cells[i].Format.Font.Bold = false;
row.Cells[i].Format.Alignment = ParagraphAlignment.Left;
row.Cells[i].VerticalAlignment = VerticalAlignment.Bottom;
}
this.table.SetEdge(0, 0, dt.Columns.Count, 1, Edge.Box,
BorderStyle.Single, 0.75, Color.Empty);
}
Here CreatePage() adds a header, footer, and different sections into the document and then the table is created to display the records. Columns from the datatable are added into the table inside the document and then a header row that contains the column names is added.
column = this.table.AddColumn(Unit.FromCentimeter(3));
//creates a new column and width of the column is passed as a parameter.
Row row = table.AddRow();
//A new header row is created
row.Cells[i].AddParagraph(dt.Columns[i].ColumnName);
//this will add the column name to header of the row.
this.table.SetEdge(0, 0, dt.Columns.Count, 1, Edge.Box,
BorderStyle.Single, 0.75, Color.Empty);
//sets the border of the row
void FillContent()
{
...............
Row row1;
for (int i = 0; i < dt.Rows.Count; i++)
{
row1 = this.table.AddRow();
row1.TopPadding = 1.5;
for (int j = 0; j < dt.Columns.Count; j++)
{
row1.Cells[j].Shading.Color = TableGray;
row1.Cells[j].VerticalAlignment = VerticalAlignment.Center;
row1.Cells[j].Format.Alignment = ParagraphAlignment.Left;
row1.Cells[j].Format.FirstLineIndent = 1;
row1.Cells[j].AddParagraph(dt.Rows[i][j].ToString());
this.table.SetEdge(0, this.table.Rows.Count - 2, dt.Columns.Count, 1,
Edge.Box, BorderStyle.Single, 0.75);
}
}
.............
}
FillContent() fills the rows from the datatable into the table inside the document:
row1.Cells[j].AddParagraph(dt.Rows[i][j].ToString());
//adds the value of column into the table row
The Default.aspx file contains the code for generating the PDF:
using MigraDoc.DocumentObjectModel;
using MigraDoc.Rendering;
using System.Diagnostics;
MigraDoc libraries are used for generating PDF documents, and System.Diagnostics for starting a PDF Viewer:
PDFform pdfForm = new PDFform(GetTable(), Server.MapPath("img2.gif"));
// Create a MigraDoc document
Document document = pdfForm.CreateDocument();
document.UseCmykColor = true;
// Create a renderer for PDF that uses Unicode font encoding
PdfDocumentRenderer pdfRenderer = new PdfDocumentRenderer(true);
// Set the MigraDoc document
pdfRenderer.Document = document;
// Create the PDF document
pdfRenderer.RenderDocument();
// Save the PDF document...
string filename = "PatientsDetail.pdf";
pdfRenderer.Save(filename);
// ...and start a viewer.
Process.Start(filename);
A PDFform object is created and used to generate a new MigraDoc document. PdfDocumentRenderer renders the document to PDF and saves it. Process.Start(filename) then starts a PDF viewer to open the file that was created.

How to save Excel Table as a Picture using vb.net?

I'm trying to save tables from excel sheets as pictures. Is there a way to just put that table on the clipboard and save it? This is what I've got so far but the library referenced is not there?
Thank you in advance!
-Rueben Ramirez
Public Sub extract_excelTable(ByRef data_file As String, ByRef app1 As excel.Application, ByRef sheet_name As String)
'defining new app to prevent out of scope open applications
Dim temp_app As excel.Application = app1
Dim workbook As excel.Workbook = temp_app.Workbooks.Open(Path.GetFullPath(data_file))
temp_app.Visible = False
For Each temp_table As excel.DataTable In workbook.Worksheets(sheet_name)
temp_table.Select()
'temp_app.Selection.CopyAsPicture?
Next
End Sub
I'm not going to write any code here, but I will outline a solution for you that will work. Note that this will not reproduce the formatting of the Excel document; it simply gets the data from it and puts it on an image in the same column/row order as the Excel file.
STEP 1:
My solution to this problem would be to read the data from the Excel file using an OLEDB connection as outlined in the second example of this post: Reading values from an Excel File
Alternatively, you may need to open the document in Excel and re-save it as a CSV if it's too large to fit in your computer's memory. I have some code that reads a CSV into a string list in C# that may help you:
static void Main(string[] args)
{
string Path = "C:/File.csv";
System.IO.StreamReader reader = new System.IO.StreamReader(Path);
//Ignore the header line
reader.ReadLine();
string[] vals;
while (!reader.EndOfStream)
{
string ReadText = reader.ReadLine();
vals = SplitLine(ReadText);
//Do some work here
}
}
private static string[] SplitLine(string Line)
{
string[] vals = new string[42];
string Temp = Line;
for (int i = 0; i < 42; i++)
{
if (Temp.Contains(","))
{
if (Temp.Substring(0, Temp.IndexOf(",")).Contains("\""))
{
vals[i] = Temp.Substring(1, Temp.IndexOf("\",", 1) - 1);
Temp = Temp.Substring(Temp.IndexOf("\",", 1) + 2);
}
else {
vals[i] = Temp.Substring(0, Temp.IndexOf(","));
Temp = Temp.Substring(Temp.IndexOf(",") + 1);
}
}
else
{
vals[i] = Temp.Trim();
break; // last field reached; stop so it is not copied into the remaining slots
}
}
return vals;
}
STEP 2:
Create a Bitmap object to hold the image, then use a for loop to draw all of the data from the Excel document onto it. This post has an example of using the DrawString method to do so: how do i add text to image in c# or vb.net

How to get data from xlsx sheet using dataprovider in selenium TestNG?

Can someone give me the logic of how to retrieve data from an Excel sheet (latest Excel file format) using the data provider in Selenium?
I'm mostly looking for the for loop logic inside the data provider.
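Basically, a TestNG data provider is whatever you write it to be.
If you want to read Excel files, you need to create a method or class that reads the rows and returns them as a dataset. Note that the example below uses the JExcel API (jxl), which reads only the older .xls format; for .xlsx you would need a library such as Apache POI, but the loop logic stays the same.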
@DataProvider(name = "data")
public static Object[][] returnExcelSheetData()
throws BiffException, IOException
{
String absolutePath = filePath.concat("/").concat(fileName); //path to excel file
FileInputStream file = new FileInputStream(new File(absolutePath));
Workbook workbook = Workbook.getWorkbook(file);
Sheet worksheet = workbook.getSheet(sheetName); //sheet name
int ROWS = worksheet.getRows() - 1; //if headers are present - use -1
int COLS = worksheet.getColumns(); //read all columns
Object[][] dataset = new Object[ROWS][COLS];
for (int rowCount = 0; rowCount < ROWS; rowCount++) {
for (int colCount = 0; colCount < COLS; colCount++) {
dataset[rowCount][colCount] = worksheet.getCell(colCount,
rowCount + 1).getContents();
}
}
workbook.close();
file.close();
return dataset;
}
The for loops iterate through all columns and all rows and return the dataset.
If the sheet has one header row and one data row, the data provider passes that single row to the test.
If it has two data rows, the test method that uses the data provider is invoked once per row (twice for two rows).
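To isolate the for loop logic from any particular Excel library, here is a small sketch that works on rows that have already been read into memory. The class SheetToDataset and the List<List<String>> representation of the sheet are assumptions made for illustration; in a real data provider the rows would come from whichever reader you use.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the sheet: each inner list is one row of cell text.
class SheetToDataset {

    // Skips the first (header) row and copies the rest into Object[rows][cols].
    static Object[][] toDataset(List<List<String>> sheet) {
        if (sheet.size() < 2) {
            return new Object[0][0]; // empty or header only: nothing to pass to the test
        }
        int rows = sheet.size() - 1;
        int cols = sheet.get(0).size();
        Object[][] dataset = new Object[rows][cols];
        for (int r = 0; r < rows; r++) {
            List<String> source = sheet.get(r + 1); // +1 skips the header row
            for (int c = 0; c < cols && c < source.size(); c++) {
                dataset[r][c] = source.get(c);
            }
        }
        return dataset;
    }

    public static void main(String[] args) {
        List<List<String>> sheet = Arrays.asList(
                Arrays.asList("user", "password"),
                Arrays.asList("alice", "secret1"),
                Arrays.asList("bob", "secret2"));
        for (Object[] row : toDataset(sheet)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each inner array of the returned Object[][] becomes the argument list for one invocation of the test method, which is why the outer dimension is the number of data rows and the inner dimension is the number of columns.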

Is there a way to get 'named' cells using EPPlus?

I have an Excel file that I am populating programmatically with EPPlus.
I have tried the following:
// provides access to named ranges, does not appear to work with single cells
worksheet.Names["namedCell1"].Value = "abc123";
// provides access to cells by address
worksheet.Cells["namedCell1"].Value = "abc123";
The following does work - so I know I am at least close.
worksheet.Cells["A1"].Value = "abc123";
Actually, it's a bit misleading. Named ranges are stored at the workBOOK level and not the workSHEET level. So if you do something like this:
[TestMethod]
public void Get_Named_Range_Test()
{
//http://stackoverflow.com/questions/30494913/is-there-a-way-to-get-named-cells-using-epplus
var existingFile = new FileInfo(@"c:\temp\NamedRange.xlsx");
using (var pck = new ExcelPackage(existingFile))
{
var wb = pck.Workbook; //Not workSHEET
var namedCell1 = wb.Names["namedCell1"];
Console.WriteLine("{{\"before\": {0}}}", namedCell1.Value);
namedCell1.Value = "abc123";
Console.WriteLine("{{\"after\": {0}}}", namedCell1.Value);
}
}
You get this in the output (using an Excel file with dummy data in it):
{"before": Range1 B2}
{"after": abc123}

keeping count and storing of text instances

I would like to write a simple program that finds the three most frequently recurring lines in a text file and saves each of them to another text file (these will in turn be read into AutoCAD’s variable system).
Leaving aside the AutoCAD part, which I can manage, how do I save the three most recurring lines in VB.NET, each to its own text file? See the example below.
Text file to be read reads as follows:
APG
BTR
VTS
VTS
VTS
VTS
BTR
BTR
APG
PNG
The VB.NET program would then save the text VTS to mostused.txt, BTR to 2ndmostused.txt, and APG to 3rdmostused.txt.
How can this be best achieved?
Since I'm a C# developer, I'll use C#:
var dict = new Dictionary<string, int>();
using(var sr = new StreamReader(file))
{
var line = string.Empty;
while ((line = sr.ReadLine()) != null)
{
var words = line.Split(' '); // get the words
foreach(var word in words)
{
if(!dict.ContainsKey(word)) dict.Add(word, 0);
dict[word]++; // count them
}
}
}
var query = (from d in dict orderby d.Value descending select d).Take(3); // most frequent first, top three only (needs System.Linq)
int counter = 1;
foreach(var pair in query)
{
using(var sw = new StreamWriter("file" + counter + ".txt"))
sw.Write(pair.Key);
counter++; // move on to the next output file
}
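The same counting idea can also be sketched in Java rather than VB.NET or C#; the logic carries over directly. This is only an illustration under stated assumptions: the class TopLines and the method mostFrequent are hypothetical names, whole lines are counted instead of individual words, and ties are broken by first appearance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration: counts whole lines, not words.
class TopLines {

    // Returns up to `limit` distinct lines, most frequent first.
    // Ties keep first-seen order: LinkedHashMap preserves insertion order
    // and List.sort is a stable sort.
    static List<String> mostFrequent(List<String> lines, int limit) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String key = line.trim();
            if (!key.isEmpty()) {
                counts.merge(key, 1, Integer::sum); // add 1 to the running count
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> b.getValue() - a.getValue()); // highest count first
        List<String> top = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < limit; i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        List<String> lines = Arrays.asList(
                "APG", "BTR", "VTS", "VTS", "VTS", "VTS", "BTR", "BTR", "APG", "PNG");
        System.out.println(mostFrequent(lines, 3));
    }
}
```

Each of the returned entries could then be written to its own file (mostused.txt, 2ndmostused.txt, 3rdmostused.txt), for example with java.nio.file.Files.write.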