Does EPPlus support MS Office Equations? - asp.net-core

I am interested in using EPPlus for generating some xslx report files for a system. I'm interested in whether or not it is possible for EPPlus to tap into Excel's Equation's feature. This is different from formula's. I am aware of EPPlus's ability to add formula's and that's a great feature I am also using. The Equation feature in Excel prints out equations in a nice looking format and I am interested in using that in files I generate with EPPlus.
public ActionResult OnPostDownloadExcelFile()
{
var ms = new MemoryStream();
using (var package = new OfficeOpenXml.ExcelPackage(ms))
{
package.Workbook.Properties.Title = "Excel file with an equation in it";
// Worksheet
var worksheet = package.Workbook.Worksheets.Add("Report");
worksheet.View.ShowGridLines = true;
int row = 1;
int col = 1;
worksheet.Cells[row, col].Value = "I want to put an equation here...";
}
return File(ms, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "ExcelFileExample.xlsx");
}

Unfortunately this is not supported in EPPlus. Equations are an embedded object. Embedded objects are not presently supported: https://github.com/JanKallman/EPPlus/wiki/FAQ#what-is-not-supported-by-the-library-these-are-the-most-obvious-features

Related

iTextSharp v5 - How do you concatenate PDFs in memory?

I am having trouble merging PDFs in-memory. I have 2 memory streams, a master and component stream, the idea is that as each component PDF is built up, the component PDF's bytes are added to the master stream. At the very end of all the components, we have a byte array that's a PDF.
I have the code below, but nothing is copying into my masterStream. I think the issue is with CopyPagesTo, but I'm not familiar enough and the documentation/examples are hard to find.
byte[] updated;
using (MemoryStream masterMemoryStream = new MemoryStream())
{
masterStream.WriteTo(masterMemoryStream);
// Read from master stream (ie. all existing components)
masterMemoryStream.Position = 0;
using (iText.Kernel.Pdf.PdfWriter masterPdfWriter = new iText.Kernel.Pdf.PdfWriter(masterMemoryStream))
using (iText.Kernel.Pdf.PdfDocument masterPdfDocument = new iText.Kernel.Pdf.PdfDocument(masterPdfWriter))
{
using (MemoryStream componentMemoryStream = new MemoryStream())
{
componentStream.WriteTo(componentMemoryStream);
// Read from new component
componentMemoryStream.Position = 0;
using (iText.Kernel.Pdf.PdfReader componentPdfReader = new iText.Kernel.Pdf.PdfReader(componentMemoryStream))
using (iText.Kernel.Pdf.PdfDocument componentPdfDocument = new iText.Kernel.Pdf.PdfDocument(componentPdfReader))
{
// Copy pages from component into master
componentPdfDocument.CopyPagesTo(1, componentPdfDocument.GetNumberOfPages(), masterPdfDocument);
}
}
}
updated = masterMemoryStream.GetBuffer();
}
// Write updates to master stream?
masterStream.SetLength(0);
using (MemoryStream temp = new MemoryStream(updated))
temp.WriteTo(masterStream);
Answer
This is mkl's answer with some of my corrections:
using (MemoryStream temporaryStream = new MemoryStream())
{
masterStream.Position = 0;
componentStream.Position = 0;
using (PdfDocument combinedDocument = new PdfDocument(new PdfReader(masterStream), new PdfWriter(temporaryStream)))
using (PdfDocument componentDocument = new PdfDocument(new PdfReader(componentStream)))
{
componentDocument.CopyPagesTo(1, componentDocument.GetNumberOfPages(), combinedDocument);
}
byte[] temporaryBytes = temporaryStream.ToArray();
masterStream.Position = 0;
masterStream.SetLength(temporaryBytes.Length);
masterStream.Capacity = temporaryBytes.Length;
masterStream.Write(temporaryBytes, 0, temporaryBytes.Length);
}
There are a number of issues in your code. I'll first give you a working version and then go into the issues in your code.
A working version (with an important limitation)
You can combine two PDFs given in MemoryStream instances masterStream and componentStream and get the result in the same MemoryStream instance masterStream as follows:
using (MemoryStream temporaryStream = new MemoryStream())
{
masterStream.Position = 0;
componentStream.Position = 0;
using (PdfDocument combinedDocument = new PdfDocument(new PdfReader(masterStream), new PdfWriter(temporaryStream)))
using (PdfDocument componentDocument = new PdfDocument(new PdfReader(componentStream)))
{
componentDocument.CopyPagesTo(1, componentDocument.GetNumberOfPages(), combinedDocument);
}
byte[] temporaryBytes = temporaryStream.ToArray();
masterStream.Position = 0;
masterStream.Capacity = temporaryBytes.Length;
masterStream.Write(temporaryBytes, 0, temporaryBytes.Length);
masterStream.Position = 0;
}
The limitation is that you have to have instantiated the masterStream with an expandable capacity; the MemoryStream class has a number of constructors only some of which create such an expandable instance while the others create non-resizable instances. For details read here.
Issues in your concept and code
Concatenating PDF files does not result in a valid merged PDF
You describe your concept like this
the idea is that as each component PDF is built up, the component PDF's bytes are added to the master stream
This does not work, though, the PDF format does not allow merging PDFs by simply concatenating them. In particular the (active) objects in a PDF have an identifier number which must be unique in the PDF, concatenating would result in a file with non-unique object identifiers; PDFs contain cross reference structures which map each object identifier to its offset from the file start, concatenating would get all these offsets wrong for the added PDFs; furthermore, a PDF has to have a single root object from which the other objects are referenced directly or indirectly, concatenating would result in multiple root objects.
Writing and immediately overwriting
In your code you have
masterStream.WriteTo(masterMemoryStream);
// Read from master stream (ie. all existing components)
masterMemoryStream.Position = 0;
using (iText.Kernel.Pdf.PdfWriter masterPdfWriter = new iText.Kernel.Pdf.PdfWriter(masterMemoryStream))
Here you write the contents of masterStream to masterMemoryStream, then set the masterMemoryStream position to the start and instantiate a PdfWriter which starts writing there. I.e. your original copy of the masterStream contents get overwritten, surely not what you wanted.
Using MemoryStream.GetBuffer
MemoryStream.GetBuffer does not only return the data written into the MemoryStream by design but the whole buffer; i.e. there may be a lot of trash bytes after the actual PDF in what you retrieve here
updated = masterMemoryStream.GetBuffer();
This may cause PDF processors trying to process your result PDFs to be unable to open the file: PDFs have a pointer to the last cross references at their end, so if you have trash bytes following the actual end of your PDF, PDF processors may not find that pointer.
PS
As worked out in the comments, the code above works fine in case of constantly growing stream lengths (which usually will happen in the use case at hand) but in general one needs to restrict the stream size before writing the new content, e.g. like this:
...
masterStream.Position = 0;
masterStream.SetLength(temporaryBytes.Length); // <<<<
masterStream.Capacity = temporaryBytes.Length;
...

How to achieve of pdf file convert to excel file format by programmatically via adobe(acrobat)?

How to achieve of pdf file convert to excel file format by programmatically via adobe(acrobat) ?
Creating desktop application for pdf file convert to excel file format by programmatically via adobe(acrobat) but still i'm not finding any code for convert pdf to excel via adobe(acrobat).
Have acrobat(adobe) provided that functionality or not?
I understand you are looking at how to convert a PDF to an Microsoft Excel format using Adobe Acrobat. If I may offer a paid alternative solution to using Adobe in your program, you could look at the DocumentConverter Class that is offered by the LEADTOOLS SDK. Please see the below code snippet.
https://www.leadtools.com/help/sdk/v21/dh/doxc/documentconverter.html
using (DocumentConverter documentConverter = new DocumentConverter())
{
var inFile = Path.Combine(ImagesPath.Path, #"Leadtools.docx");
var outFile = Path.Combine(ImagesPath.Path, #"output.pdf");
var format = DocumentFormat.Pdf;
var jobData = DocumentConverterJobs.CreateJobData(inFile, outFile, format);
jobData.JobName = "conversion job";
var documentWriter = new DocumentWriter();
documentConverter.SetDocumentWriterInstance(documentWriter);
var renderingEngine = new AnnWinFormsRenderingEngine();
documentConverter.SetAnnRenderingEngineInstance(renderingEngine);
var job = documentConverter.Jobs.CreateJob(jobData);
documentConverter.Jobs.RunJob(job);
if (job.Status == DocumentConverterJobStatus.Success)
{
Console.WriteLine("Success");
}
else
{
Console.WriteLine("{0} Errors", job.Status);
foreach (var error in job.Errors)
{
Console.WriteLine(" {0} at {1}: {2}", error.Operation, error.InputDocumentPageNumber, error.Error.Message);
}
}
}

Adding text items to an Existing PDF w/ Telerik DocumentProcessing Library

I want to open an existing PDF document and add different annotations to it. Namely bookmarks and some text
I am using the Telerik Document Processing Library (dpl) v2019.3.1021.40
I am new to dpl , but I believe the RadFlowDocument is the way to go.
I am having troubles creating the RadFlowDocument
FlowProvider.PdfFormatProvider provider = new FlowProvider.PdfFormatProvider();
using (Stream stream = File.OpenRead(sourceFile))
{
--> RadFlowDocument flowDoc = provider.Import(stream);
}
The line indicated w/ the arrow give the error "Import Not Supported"
There is a telerik blog post here
https://www.telerik.com/forums/radflowdocument-to-pdf-error
It seems relevant, but not 100% sure.
It cautions to be sure the providers are mated correctly, I believe they are in my example....
Again, ultimate goal is to open a PDF and add some stuff to it. I think the RadFlowDocument is the right direction. If there is a better solution, Im happy to hear that too.
I figured it out. The DPL is pretty good, but doc is still growing, hope this helps someone out...
This draws from a myriad of articles, I cant begin to cite them all.
There are 2 notions for working w/ PDFs in the DPL.
FixedDocument takes pages. I think this is meant for sewing docs together.
FlowDocument I believe lays things out like an HTML renderer would.
I am using Fixed, mainly b/c I can get that to work.
using System;
using System.IO;
using System.Windows; //nec for Size struct
using System.Diagnostics; //nec for launching the pdf at the end
using Telerik.Windows.Documents.Fixed.Model;
//if you have fixed and flow provider, you have to specify, so I make a shortcut
using FixedProvider = Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
using Telerik.Windows.Documents.Fixed.Model.Editing;
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace DocAggregator
{
[TestClass]
public class UnitTest2
{
[TestMethod]
public void EditNewFIle_SrcAsFixed_TrgAsFixed()
{
String dt = #"C:\USERS\greg\DESKTOP\DPL\";
String sourceFile = dt + "output.pdf";
//Open the sourceDoc so you can add stuff to it
RadFixedDocument sourceDoc;
//a provider parses the actual file into the model.
FixedProvider.PdfFormatProvider fixedProv = new FixedProvider.PdfFormatProvider();
using (Stream stream = File.OpenRead(sourceFile))
{
//'populate' the doc object from the file
//using the FLOW classes, I get "Import Not Supported".
sourceDoc = fixedProv.Import(stream);
}
int pages = sourceDoc.Pages.Count;
int pageCounter = 1;
int xoffset = 150;
int yoffset = 50;
//editor is the thing that lets you add elements into the source doc
//Like the provider, the Editor needs to match the document class (Fixed or Flow)
RadFixedDocumentEditor editor = new RadFixedDocumentEditor(sourceDoc);
foreach (RadFixedPage page in sourceDoc.Pages)
{
FixedContentEditor pEd = new FixedContentEditor(page);
Size ps = page.Size;
pEd.Position.Translate(ps.Width - xoffset, ps.Height - yoffset);
Block block = new Block();
block.HorizontalAlignment = Telerik.Windows.Documents.Fixed.Model.Editing.Flow.HorizontalAlignment.Center;
block.TextProperties.FontSize = 22;
block.InsertText(string.Format("Page {0} of {1} ", pageCounter, pages));
pEd.DrawBlock(block);
pageCounter++;
}
string exportFileName = "addedPageNums.pdf";
if (File.Exists(exportFileName))
{
File.Delete(exportFileName);
}
File.WriteAllBytes(exportFileName, fixedProv.Export(sourceDoc));
//launch the app
Process.Start(exportFileName);
}
}
}

replace string in PDF document (ITextSharp or PdfSharp)

We use non-manage DLL that has a funciton to replace text in PDF document (http://www.debenu.com/docs/pdf_library_reference/ReplaceTag.php).
We are trying to move to managed solution (ITextSharp or PdfSharp).
I know that this question has been asked before and that the answers are "you should not do it" or "it is not easily supported by PDF".
However there exists a solution that works for us and we just need to convert it to C#.
Any ideas how I should approach it?
According to your library reference link, you use the Debenu PDFLibrary function ReplaceTag. According to this Debenu knowledge base article
the ReplaceTag function simply replaces text in the page’s content stream, so for most documents it wouldn’t have any effect. For some simple documents it might be able to replace content, but it really depends on how the PDF was constructed. Essentially it’s the same as doing:
DPL.CombineContentStreams();
string content = DPL.GetContentStreamToString();
DPL.SetPageContentFromString(content.Replace("Moby", "Mary"));
That should be possible with any general purpose PDF library, it definitely is with iText(Sharp):
void VerySimpleReplaceText(string OrigFile, string ResultFile, string origText, string replaceText)
{
using (PdfReader reader = new PdfReader(OrigFile))
{
byte[] contentBytes = reader.GetPageContent(1);
string contentString = PdfEncodings.ConvertToString(contentBytes, PdfObject.TEXT_PDFDOCENCODING);
contentString = contentString.Replace(origText, replaceText);
reader.SetPageContent(1, PdfEncodings.ConvertToBytes(contentString, PdfObject.TEXT_PDFDOCENCODING));
new PdfStamper(reader, new FileStream(ResultFile, FileMode.Create, FileAccess.Write)).Close();
}
}
WARNING: Just like in case of the Debenu function, for most documents this code wouldn’t have any effect or would even be destructive. For some simple documents it might be able to replace content, but it really depends on how the PDF was constructed.
By the way, the Debenu knowledge base article continues:
If you created a PDF using Debenu Quick PDF Library and a standard font then the ReplaceTag function should work – however, for PDFs created with tools that do subsetted fonts or even kerning (where words will be split up) then the search text probably won’t be in the content in a simple format.
So in short, the ReplaceTag function will only work in some limited scenarios and isn’t a function that you can rely on for searching and replacing text.
Thus, if during your move to managed solution you also change the way the source documents are created, chances are that neither the Debenu PDFLibrary function ReplaceTag nor the code above will be able to change the content as desired.
for pdfsharp users heres a somewhat usable function, i copied from my project and it uses an utility method which is consumed by othere methods hence the unused result.
it ignores whitespaces created by Kerning, and therefore may mess up the result (all characters in the same space) depending on the source material
public static void ReplaceTextInPdfPage(PdfPage contentPage, string source, string target)
{
ModifyPdfContentStreams(contentPage, stream =>
{
if (!stream.TryUnfilter())
return false;
var search = string.Join("\\s*", source.Select(c => c.ToString()));
var stringStream = Encoding.Default.GetString(stream.Value, 0, stream.Length);
if (!Regex.IsMatch(stringStream, search))
return false;
stringStream = Regex.Replace(stringStream, search, target);
stream.Value = Encoding.Default.GetBytes(stringStream);
stream.Zip();
return false;
});
}
public static void ModifyPdfContentStreams(PdfPage contentPage,Func<PdfDictionary.PdfStream, bool> Modification)
{
for (var i = 0; i < contentPage.Contents.Elements.Count; i++)
if (Modification(contentPage.Contents.Elements.GetDictionary(i).Stream))
return;
var resources = contentPage.Elements?.GetDictionary("/Resources");
var xObjects = resources?.Elements.GetDictionary("/XObject");
if (xObjects == null)
return;
foreach (var item in xObjects.Elements.Values.OfType<PdfReference>())
{
var stream = (item.Value as PdfDictionary)?.Stream;
if (stream != null)
if (Modification(stream))
return;
}
}

Related on Custom Timer Job in sharepoint 2010

How to create a custom job to export an Excel file in a list which has only 2 columns(Title,Description) in a sharepoint 2010?i want the coding part of this question?
Reading Data from Excel and writing into a sharepoint list,this has to be done through custom job coding
Thanks in Advance...
Naresh
Use OpenXMLSDK - a free download that needs to be installed on the server.
[...]
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
public class OffBookAssetLibraryEventReceiver : SPItemEventReceiver
{
public override void ItemUpdated(SPItemEventProperties properties)
{
// This if statement is to work around the sharepoint issue of this event firing twice.
if (properties.AfterProperties["vti_sourcecontrolcheckedoutby"] == null && properties.BeforeProperties["vti_sourcecontrolcheckedoutby"] != null)
{
byte[] workSheetByteArray = properties.ListItem.File.OpenBinary();
Stream stream = new MemoryStream(workSheetByteArray);
Package spreadsheetPackage = Package.Open(stream, FileMode.Open, FileAccess.ReadWrite);
SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(spreadsheetPackage);
SharedStringTablePart shareStringTablePart = spreadsheetDocument.WorkbookPart.SharedStringTablePart;
Sheets sheets = spreadsheetDocument.WorkbookPart.Workbook.Sheets;
try
{
foreach (Sheet sheet in sheets)
{
var worksheetPart = (WorksheetPart)spreadsheetDocument.WorkbookPart.GetPartById(sheet.Id.Value);
IEnumerable<Row> rows = worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>();
if (rows.Count() > 0)
{
int rowNumber = 0;
foreach (Row row in rows)
{
IEnumerable<Cell> cells = row.Elements<Cell>();
Cell title = null;
Cell description = null;
title = cells.ToArray()[0];
description = cells.ToArray()[1];
// This is the code used to extract cells from excel that are NOT inline (Inline cells are decimal and dates - although dates are stored as int)
int index = int.Parse(title.CellValue.Text);
string titleString = spreadsheetDocument.WorkbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(index).InnerText;
index = int.Parse(description.CellValue.Text);
string descriptionString = spreadsheetDocument.WorkbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(index).InnerText;
//Insert into your sharepoint list here!
}
}
}
}
}
}
}
I recommend putting this code into an event receiver on the document library (as seen above).
Have you looked at Excel REader for .NET
http://exceldatareader.codeplex.com/
Open the Excel File
Take a look at the Excel Services for SharePoint 2010. There is a walkthrough explaining the required steps to open an excel file.
SharePoint Custom Timer Job
To create a custom SharePoint timer job, you have to create a class that inherits from SPJobDefinition. A complete tutorial can be found in this blog post: Creating Custom Timer Job in SharePoint 2010.