How to retrieve multiple images from the document with document id from Filenet content engine - api

I have a requirement where I need to retrieve the images from a document and place them in a share path location in byte stream format. I have access to the document ID and have established the connection to the FileNet Content Engine. Does anyone know the API calls to retrieve an image from the document? If there are multiple images, how can I iterate over them?

Look at this example, adapted from IBM's sample code:
public static void WriteContentToFile(IDocument doc, String path)
{
    String fileName = doc.Name;
    String file = Path.Combine(path, fileName);
    try
    {
        // Index 0 is the document's first content element; pass another index for the others.
        using (Stream s = doc.AccessContentStream(0))
        using (FileStream fs = new FileStream(file, FileMode.CreateNew))
        {
            // CopyTo reads until the end of the stream; a single Read call may return fewer bytes.
            s.CopyTo(fs);
        }
    }
    catch (Exception e)
    {
        System.Console.WriteLine(e.StackTrace);
    }
}
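The method above writes a single content element (index 0). For a document with several images, the same copy step has to run once per element. Below is a minimal Java sketch of that loop only. It does not use the FileNet API: the list of input streams is a hypothetical stand-in for the document's content elements, and ContentElementExporter, writeAll and baseName are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ContentElementExporter {

    // Writes each stream to "<baseName>_<index>" inside targetDir and returns the paths.
    // The streams are hypothetical stand-ins for the content elements of one document.
    static List<Path> writeAll(List<InputStream> elements, Path targetDir, String baseName) {
        List<Path> written = new ArrayList<>();
        try {
            Files.createDirectories(targetDir);
            for (int i = 0; i < elements.size(); i++) {
                Path target = targetDir.resolve(baseName + "_" + i);
                try (InputStream in = elements.get(i)) {
                    // Files.copy reads until end of stream and fails if the target
                    // already exists (similar to FileMode.CreateNew in the C# code).
                    Files.copy(in, target);
                }
                written.add(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("elements");
        List<InputStream> elements = List.of(
                new ByteArrayInputStream(new byte[] {1, 2, 3}),
                new ByteArrayInputStream(new byte[] {4, 5}));
        System.out.println(writeAll(elements, dir, "document"));
    }
}
```

With the real API you would obtain one stream per content element and pass them in the same way.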

How to read an Excel file in memory (without saving it to disk) and return its content in .NET Core

I'm working on a Web API using .NET Core that takes an Excel file as an IFormFile and reads its content. I am following the article
https://levelup.gitconnected.com/reading-an-excel-file-using-an-asp-net-core-mvc-application-2693545577db which does the same thing, except that the file there is already present on the server, whereas mine will be provided by the user.
Here is the code:
public IActionResult Test(IFormFile file)
{
List<UserModel> users = new List<UserModel>();
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
using (var stream = System.IO.File.Open(file.FileName, FileMode.Open, FileAccess.Read))
{
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
while (reader.Read()) //Each row of the file
{
users.Add(new UserModel
{
Name = reader.GetValue(0).ToString(),
Email = reader.GetValue(1).ToString(),
Phone = reader.GetValue(2).ToString()
});
}
}
}
return Ok(users);
}
}
When System.IO tries to open the file, it cannot find the path because the file does not exist on the server. How can I get the file path (which will vary with the file the user selects)? Or is there another way to make this work?
PS: I don't want to upload the file to the server first and then read it.
You're using the file.FileName property, which is only the file name the browser sends. It's good to know, but it does not refer to a real file on the server. You can use the CopyTo(Stream) method to access the data:
public IActionResult Test(IFormFile file)
{
List<UserModel> users = new List<UserModel>();
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
using (var stream = new MemoryStream())
{
file.CopyTo(stream);
stream.Position = 0;
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
while (reader.Read()) //Each row of the file
{
users.Add(new UserModel{Name = reader.GetValue(0).ToString(), Email = reader.GetValue(1).ToString(), Phone = reader.GetValue(2).ToString()});
}
}
}
return Ok(users);
}
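The same pattern (copy the upload into memory, rewind, then hand the stream to a parser) can be sketched in Java. This is only an illustration of the idea, not of the ExcelDataReader API: InMemoryUpload, buffer and readRows are hypothetical names, and the line-by-line reader is a stand-in for a real spreadsheet parser.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

final class InMemoryUpload {

    // Copies the uploaded data into memory and returns a stream positioned at the start,
    // which is what MemoryStream + "Position = 0" achieves in the C# answer.
    static ByteArrayInputStream buffer(InputStream upload) {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        try {
            upload.transferTo(copy); // requires Java 9 or later
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ByteArrayInputStream(copy.toByteArray());
    }

    // Stand-in for a parser: reads the buffered data line by line.
    static List<String> readRows(InputStream upload) {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(buffer(upload), StandardCharsets.UTF_8));
        return reader.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        InputStream upload = new ByteArrayInputStream(
                "Ann,ann@example.com\nBob,bob@example.com".getBytes(StandardCharsets.UTF_8));
        System.out.println(readRows(upload));
    }
}
```

Nothing is written to disk at any point; the data only ever lives in the byte array.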

iText FileStream overwrites

I am putting together some code that will merge PDFs based on their file name prefix. My current code, shown below, picks up the file name but overwrites the output instead of merging. I believe my problem is the FileStream placement, but if I move it out of its current location, I can't get the file name. Any suggestions? Thanks.
static void CreateMergedPDFs()
{
string srcDir = "C:/PDFin/";
string resultPDF = "C:/PDFout/";
{
var files = System.IO.Directory.GetFiles(srcDir);
string prevFileName = null;
int i = 1;
foreach (string file in files)
{
string filename = Left(Path.GetFileName(file), 8);
using (FileStream stream = new FileStream(resultPDF + filename + ".pdf", FileMode.Create))
{
if (prevFileName == null || filename == prevFileName)
{
Document pdfDoc = new Document(PageSize.A4);
PdfCopy pdf = new PdfCopy(pdfDoc, stream);
pdfDoc.Open();
{
pdf.AddDocument(new PdfReader(file));
i++;
}
if (pdfDoc != null)
pdfDoc.Close();
Console.WriteLine("Merges done!");
}
}
}
}
}
}
}
The behavior you are describing is consistent with your code. You are creating the loop in an incorrect way.
Try this:
static void CreateMergedPDFs()
{
string srcDir = "C:/PDFin/";
string resultPDF = "C:/PDFout/merged.pdf";
FileStream stream = new FileStream(resultPDF, FileMode.Create);
Document pdfDoc = new Document(PageSize.A4);
PdfCopy pdf = new PdfCopy(pdfDoc, stream);
pdfDoc.Open();
var files = System.IO.Directory.GetFiles(srcDir);
foreach (string file in files)
{
pdf.AddDocument(new PdfReader(file));
}
pdfDoc.Close();
Console.WriteLine("Merges done!");
}
That makes more sense, doesn't it?
If you want to group files based on their prefix, you should read the answer to the question "Group files in a directory based on their prefix".
In the answer to this question, it is assumed that the prefix and the rest of the filename are separated by a - character. For instance 1-abc.pdf and 1-xyz.pdf have the prefix 1 whereas 2-abc.pdf and 2-xyz.pdf have the prefix 2. In your case, it's not clear how you'd determine the prefix, but it's easy to get a list of all the files, sort them and make groups of files based on whatever algorithm you want to determine the prefix.
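The "sort them and make groups" step could look roughly like the Java sketch below. It assumes the prefix is the first 8 characters of the file name, as in the Left(..., 8) call in the question; PrefixGrouper and its method names are hypothetical. Each resulting group would then be merged into its own output PDF.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class PrefixGrouper {

    // Returns the first `length` characters, or the whole name if it is shorter.
    static String prefixOf(String fileName, int length) {
        return fileName.length() <= length ? fileName : fileName.substring(0, length);
    }

    // Sorts the names and groups them by prefix; the TreeMap keeps the prefixes ordered.
    static Map<String, List<String>> groupByPrefix(List<String> fileNames, int prefixLength) {
        return fileNames.stream()
                .sorted()
                .collect(Collectors.groupingBy(
                        name -> prefixOf(name, prefixLength),
                        TreeMap::new,
                        Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> names = List.of("INV00001_b.pdf", "INV00002_a.pdf", "INV00001_a.pdf");
        // One merged output file would be created per map entry.
        groupByPrefix(names, 8).forEach((prefix, group) ->
                System.out.println(prefix + ".pdf <- " + group));
    }
}
```

Opening one output stream per group, rather than one per input file, is what avoids the overwriting described in the question.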

iTextSharp Html Arabic Mixed Content to PDF

We are working on a project in the educational domain, in which we need to import Arabic content, images (URI or base64-encoded), and HTML tables.
We run into a problem when using HTMLWorker with stream data. It throws the error "the document has no pages".
string str="html content contains arabic font texts/images";
using (MemoryStream ms = new MemoryStream())
{
using (iTextSharp.text.Document document = new iTextSharp.text.Document(iTextSharp.text.PageSize.A4, 25, 25, 30, 30))
{
using (iTextSharp.text.pdf.PdfWriter writer = iTextSharp.text.pdf.PdfWriter.GetInstance(document,ms))
{
using (var htmlWorker = new iTextSharp.text.html.simpleparser.HTMLWorker(document))
{
//HTMLWorker doesn't read a string directly but instead needs a TextReader (which StringReader subclasses)
using (var sr = new StreamReader(str)) /// We are facing issue at this juncture where it throws error.
{
//Parse the HTML
htmlWorker.Parse(sr);
}
}
document.Close();
writer.Close();
ms.Close();
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment;filename=First_PDF_document.pdf");
Response.OutputStream.Write(ms.ToArray(), 0, ms.ToArray().Length);
}
}
}
Could you please help us with this?
I have now also tried another approach:
using (var htmlWorker = new iTextSharp.text.html.simpleparser.HTMLWorker(document))
{
using (Stream s = GenerateStreamFromString(str))
{
using (var srt = new StreamReader(s))
{
//Parse the HTML
htmlWorker.Parse(srt); //this line throws error now.
}
}
}
public Stream GenerateStreamFromString(string s)
{
MemoryStream stream = new MemoryStream();
StreamWriter writer = new StreamWriter(stream);
writer.Write(s);
writer.Flush();
stream.Position = 0;
return stream;
}
Error message:
"The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters."
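For the first snippet, this error message is consistent with the HTML string being treated as a file path: in .NET, the StreamReader(string) constructor takes a path, while StringReader wraps the string content itself, as the code comment in the question already hints. Java has the same split between FileReader and StringReader. A minimal sketch of the difference, with ReaderChoice, readAll and opensAsFile as illustrative names:

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class ReaderChoice {

    // Reads everything the reader provides and closes it afterwards.
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = reader) {
            char[] buf = new char[1024];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Tries to open the text as if it were a file name, which is what a
    // path-based reader does; returns false when no such file can be opened.
    static boolean opensAsFile(String text) {
        try (Reader r = new FileReader(text)) {
            return true;
        } catch (IOException e) { // includes FileNotFoundException
            return false;
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><p>text</p></body></html>";
        System.out.println(readAll(new StringReader(html))); // the markup itself
        System.out.println(opensAsFile(html));               // treated as a path
    }
}
```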

How to convert a file to bytes and bytes to a file in mvc4

I am using MVC4. My requirement is:
I have to convert a file into a byte array and save it to a varbinary database column.
For this I have written the code below:
public byte[] Doc { get; set; }
Document.Doc = GetFilesBytes(PostedFile);
public static byte[] GetFilesBytes(HttpPostedFileBase file)
{
MemoryStream target = new MemoryStream();
file.InputStream.CopyTo(target);
return target.ToArray();
}
I am downloading the file by using the following code:
public ActionResult Download(int id)
{
List<Document> Documents = new List<Document>();
using (SchedulingServiceInstanceManager facade = new SchedulingServiceInstanceManager("SchedulingServiceWsHttpEndPoint"))
{
Document Document = new Document();
Document.DMLType = Constant.DMLTYPE_SELECT;
Documents = facade.GetDocuments(Document);
}
var file = Documents.FirstOrDefault(f => f.DocumentID == id);
return File(file.Doc.ToArray(), "application/octet-stream", file.Name);
}
When I download a PDF file, the viewer shows the message "There was an error opening this document. The file is damaged and could not be repaired."
Is there anything else I need to do?
I also tried the following code, but with no luck:
return File(file.Doc.ToArray(), "application/pdf", file.Name);
Please help me solve this issue.
Thanks in advance.
Please try the code below in your controller:
// Read the whole file into a byte array. System.IO.File is fully qualified
// because, inside a controller, File refers to the Controller.File method.
byte[] fileBytes = System.IO.File.ReadAllBytes(@"c:\path\to\your\file\here.txt");
// Write the byte array back to a file
System.IO.File.WriteAllBytes(@"c:\path\to\your\file\here.txt", fileBytes);
This may help you.
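The round trip in the answer above (file to byte array and back) can be checked in isolation, which helps narrow down where a downloaded PDF gets damaged. A minimal Java sketch of that check, using only java.nio.file; FileBytes, toBytes and toFile are hypothetical names, and the database column is left out:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class FileBytes {

    // Reads the complete file into a byte array.
    static byte[] toBytes(Path source) {
        try {
            return Files.readAllBytes(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the byte array to the target file, replacing any existing content.
    static void toFile(byte[] data, Path target) {
        try {
            Files.write(target, data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("source", ".bin");
        Path copy = Files.createTempFile("copy", ".bin");
        Files.write(source, new byte[] {37, 80, 68, 70}); // arbitrary sample bytes
        byte[] stored = toBytes(source);                  // what would go into the database
        toFile(stored, copy);                             // what the download would return
        System.out.println(Arrays.equals(stored, toBytes(copy)));
    }
}
```

If the bytes read back from storage differ from the bytes that were uploaded, the damage happens on the way into or out of storage rather than in the download action.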

Pdfbox - adding pdf embedded File and save the PDDocument to OutputStream does not keep the embedded Files

I'm using PDFBox (1.8.8) to add attachments to a PDF. My problem is that when one of the attachments is itself a PDF and I save the PDDocument to an OutputStream, the final PDF document does not include the attachments. If I save the PDDocument to a file instead of an OutputStream, everything works fine, and if none of the attachments is a PDF, saving to either a file or an OutputStream works fine.
I would like to know if there is any way to add embedded PDF files and save the PDDocument to an OutputStream while keeping the attached files in the generated PDF.
The code I'm using is:
private void insertAttachments(OutputStream out, ArrayList<Attachment> attachmentsResources) {
final PDDocument doc;
Boolean hasPdfAttach = false;
try {
doc = PDDocument.load(new ByteArrayInputStream(((ByteArrayOutputStream) out).toByteArray()));
// final PDFTextStripper pdfStripper = new PDFTextStripper();
// final String text = pdfStripper.getText(doc);
final PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
final Map embeddedFileMap = new HashMap();
PDEmbeddedFile embeddedFile;
File file = null;
for (Attachment attach : attachmentsResources) {
// first create the file specification, which holds the embedded file
final PDComplexFileSpecification fileSpecification = new PDComplexFileSpecification();
fileSpecification.setFile(attach.getFilename());
file = AttachmentUtils.getAttachmentFile(attach);
final InputStream is = new FileInputStream(file.getAbsolutePath());
embeddedFile = new PDEmbeddedFile(doc, is);
// set some of the attributes of the embedded file
if ("application/pdf".equals(attach.getMimetype())) {
hasPdfAttach = true;
}
embeddedFile.setSubtype(attach.getMimetype());
embeddedFile.setSize((int) (long) attach.getFilesize());
fileSpecification.setEmbeddedFile(embeddedFile);
// now add the entry to the embedded file tree and set in the document.
embeddedFileMap.put(attach.getFilename(), fileSpecification);
// final String text2 = pdfStripper.getText(doc);
}
// final String text3 = pdfStripper.getText(doc);
efTree.setNames(embeddedFileMap);
// ((COSDictionary) efTree.getCOSObject()).removeItem(COSName.LIMITS); (this did not work for me)
// attachments are stored as part of the "names" dictionary in the document catalog
final PDDocumentNameDictionary names = new PDDocumentNameDictionary(doc.getDocumentCatalog());
names.setEmbeddedFiles(efTree);
doc.getDocumentCatalog().setNames(names);
// final ByteArrayOutputStream pdfboxToDocumentStream = new ByteArrayOutputStream();
final String tmpfile = "temporary.pdf";
if (hasPdfAttach) {
final File f = new File(tmpfile);
doc.save(f);
doc.close();
// I have also tried with the parser, but without success
// PDFParser parser = new PDFParser(new FileInputStream(tmpfile));
// parser.parse();
// PDDocument doc2 = parser.getPDDocument();
final PDDocument doc2 = PDDocument.loadNonSeq(f, new RandomAccessFile(new File(getHomeTMP()
+ "tempppp.pdf"), "r"));
doc2.save(out);
doc2.close();
} else {
doc.save(out);
doc.close();
}
// that does not work either
// final InputStream in = new FileInputStream(tmpfile);
// IOUtils.copy(in, out);
// out = new FileOutputStream(tmpFile);
// doc.save (out);
} catch (IOException e1) {
e1.printStackTrace();
} catch (Exception e2) {
e2.printStackTrace();
}
}
Best regards
Solution:
private void insertAttachments(OutputStream out, ArrayList<Attachment> attachmentsResources) {
final PDDocument doc;
try {
doc = PDDocument.load(new ByteArrayInputStream(((ByteArrayOutputStream) out).toByteArray()));
((ByteArrayOutputStream) out).reset();
final PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
final Map embeddedFileMap = new HashMap();
PDEmbeddedFile embeddedFile;
File file = null;
for (Attachment attach : attachmentsResources) {
// first create the file specification, which holds the embedded file
final PDComplexFileSpecification fileSpecification = new PDComplexFileSpecification();
fileSpecification.setFile(attach.getFilename());
file = AttachmentUtils.getAttachmentFile(attach);
final InputStream is = new FileInputStream(file.getAbsolutePath());
embeddedFile = new PDEmbeddedFile(doc, is);
// set some of the attributes of the embedded file
embeddedFile.setSubtype(attach.getMimetype());
embeddedFile.setSize((int) (long) attach.getFilesize());
fileSpecification.setEmbeddedFile(embeddedFile);
// now add the entry to the embedded file tree and set in the document.
embeddedFileMap.put(attach.getFilename(), fileSpecification);
}
efTree.setNames(embeddedFileMap);
((COSDictionary) efTree.getCOSObject()).removeItem(COSName.LIMITS);
// attachments are stored as part of the "names" dictionary in the document catalog
final PDDocumentNameDictionary names = new PDDocumentNameDictionary(doc.getDocumentCatalog());
names.setEmbeddedFiles(efTree);
doc.getDocumentCatalog().setNames(names);
((COSDictionary) efTree.getCOSObject()).removeItem(COSName.LIMITS);
doc.save(out);
doc.close();
} catch (IOException e1) {
e1.printStackTrace();
} catch (Exception e2) {
e2.printStackTrace();
}
}
You store the new PDF after the original PDF in out:
Look at all the uses of out in your method:
private void insertAttachments(OutputStream out, ArrayList<Attachment> attachmentsResources) {
...
doc = PDDocument.load(new ByteArrayInputStream(((ByteArrayOutputStream) out).toByteArray()));
...
doc2.save(out);
...
doc.save(out);
So you get as input a ByteArrayOutputStream and take its current content as input (i.e. the ByteArrayOutputStream is not empty but already contains a PDF) and after some processing you append the modified PDF to the ByteArrayOutputStream. Depending on the PDF viewer you present this to, you will be shown either the original or the manipulated PDF or a (very correct) error message that the file is garbage.
If you want the ByteArrayOutputStream to contain only the manipulated PDF, simply add
((ByteArrayOutputStream) out).reset();
or (if you are not sure about the state of the stream)
out = new ByteArrayOutputStream();
right after
doc = PDDocument.load(new ByteArrayInputStream(((ByteArrayOutputStream) out).toByteArray()));
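The effect described above can be reproduced without PDFBox. The Java sketch below uses short ASCII strings as stand-ins for the original and the modified PDF; ResetBeforeSave and replaceContent are hypothetical names. Without reset() the new bytes are appended to the old ones, with reset() only the new bytes remain.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class ResetBeforeSave {

    // Writes `modified` into `out`, which may already hold earlier content.
    // With resetFirst == false the new bytes are appended after the old ones.
    static byte[] replaceContent(ByteArrayOutputStream out, byte[] modified, boolean resetFirst) {
        if (resetFirst) {
            out.reset(); // discards everything written so far
        }
        out.write(modified, 0, modified.length);
        return out.toByteArray();
    }

    // Creates a stream that already contains the given ASCII text.
    static ByteArrayOutputStream streamWith(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(bytes, 0, bytes.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] modified = "NEW".getBytes(StandardCharsets.US_ASCII);
        byte[] appended = replaceContent(streamWith("OLD"), modified, false);
        byte[] replaced = replaceContent(streamWith("OLD"), modified, true);
        System.out.println(new String(appended, StandardCharsets.US_ASCII)); // prints "OLDNEW"
        System.out.println(new String(replaced, StandardCharsets.US_ASCII)); // prints "NEW"
    }
}
```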
PS: According to the comments the OP tried the above proposed changes to his code without success.
I cannot run the code as presented in the question because it is not self-contained. Thus, I reduced it to the essentials to get a self-contained test:
@Test
public void test() throws IOException, COSVisitorException
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
try (
InputStream sourceStream = getClass().getResourceAsStream("test.pdf");
InputStream attachStream = getClass().getResourceAsStream("artificial text.pdf"))
{
final PDDocument document = PDDocument.load(sourceStream);
final PDEmbeddedFile embeddedFile = new PDEmbeddedFile(document, attachStream);
embeddedFile.setSubtype("application/pdf");
embeddedFile.setSize(10993);
final PDComplexFileSpecification fileSpecification = new PDComplexFileSpecification();
fileSpecification.setFile("artificial text.pdf");
fileSpecification.setEmbeddedFile(embeddedFile);
final Map<String, PDComplexFileSpecification> embeddedFileMap = new HashMap<String, PDComplexFileSpecification>();
embeddedFileMap.put("artificial text.pdf", fileSpecification);
final PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
efTree.setNames(embeddedFileMap);
final PDDocumentNameDictionary names = new PDDocumentNameDictionary(document.getDocumentCatalog());
names.setEmbeddedFiles(efTree);
document.getDocumentCatalog().setNames(names);
document.save(baos);
document.close();
}
Files.write(Paths.get("attachment.pdf"), baos.toByteArray());
}
As you can see, PDFBox uses only streams here. The result shows that PDFBox has no problem storing a PDF into which it has embedded a PDF file attachment.
The problem, therefore, most likely has nothing to do with this workflow as such