How to submit a PDF file from a JSP to a Servlet? - sql

I have a form in my JSP that has an input field for a PDF file as well as other inputs for text and numbers. This is my line for the upload of the PDF file, and this is how my form starts:
Upload for PDF: <input type="file" name="datei" />
In my Servlet, I only get the name of the PDF file when I request it like this:
String pdf = request.getParameter("datei");
Now how do I upload a PDF file and submit it to my Servlet, so that I can get all the data and update a row with this data in my PostgreSQL database?
First, the problem is that the form must have enctype="multipart/form-data", but then I can't pass along the inputs of type="text". Nesting one form inside another is not possible either. If I leave out the enctype and just submit the form to my servlet, all I know to do is request the parameter, e.g. with String pdf = request.getParameter("file");, but then I only get the PDF file name. Additionally, I need to store the PDF file in the database. For this I will create another column pdf of type oid, but I don't know how to pass the PDF file through all the methods so that I can finally save it in PostgreSQL and assign it to a seminar.
Thank you

Related

GridFs read PDF

I am trying to build a financial dashboard with Flask and pymongo. The starting point is a flask form which saves data in a MongoDB database. One of the fields in the form is a FileField (wtforms) which allows the upload of a PDF, which is then stored in MongoDB with GridFS.
I manage to save the PDF and I can see the resulting entries in the .files and .chunks collections. Now I would like to build a function that retrieves the PDFs and analyses them with some basic NLP, but I struggle to get meaningful data out of them.
When I do:
storage = gridfs.GridFS(db, collection)
data = storage.get('some id')
a = data.read()
The result is binary data. If I continue with:
with open(data, 'rb') as f:
    b = f.read()
The result is "ValueError: embedded null byte" or sometimes an empty byte string.
Any help on this?
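A note on why this fails: open() expects a filesystem path, while read() on the GridFS result already returns the file contents as bytes. The empty byte string is probably what a second read() returns once the stream has been consumed. Below is a minimal sketch of the usual alternative, wrapping the bytes in io.BytesIO so that a PDF parser can treat them like an open file. The pdf_bytes value and the as_file_object helper are stand-ins invented for illustration, not part of the original code.

```python
import io

# Stand-in for the bytes returned by data.read() on a GridFS file;
# a real PDF payload is binary and typically contains null bytes.
pdf_bytes = b"%PDF-1.4\n\x00\x01binary payload\n%%EOF"


def as_file_object(raw: bytes) -> io.BytesIO:
    """Wrap raw bytes in a seekable, file-like object (no disk access)."""
    return io.BytesIO(raw)


# open() treats its first argument as a path, so passing file contents fails.
error_message = None
try:
    open(pdf_bytes, "rb")
except ValueError as exc:
    error_message = str(exc)

buffer = as_file_object(pdf_bytes)
header = buffer.read(8)  # first bytes of the PDF
buffer.seek(0)           # rewind before handing the buffer to a parser
```

The buffer can then be passed to any library that accepts a file-like object, so no temporary file is needed.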
To follow up on the above, I found a solution for myself that consists of two separate functions:
(1) Upon submission of the form and before uploading the files to MongoDB, I apply a function based on pdfminer that extracts the string content of the PDF and transforms it into a list of sentences using NLTK. I then store this list in the .files collection via storage.put(file, sent_list = sent_list) #sent_list being the variable name of the list of sentences.
Whenever I wish to run NLP operations on the file, I just read the "sent_list" field from MongoDB.
(2) If I wish to display the stored PDF with its original content, however, I use the following function, which I included as a separate route.
storage = GridFS(db, collection)
data = storage.get_last_version(filename)
response = make_response(data.read())
extension = data.filename.split('.')[-1]
response.headers['Content-Type'] = f'application/{extension}'
response.headers['Content-Disposition'] = f'inline; filename={data.filename}'
return response
Function (2) opens a new tab in my Flask app showing the PDF file in its original format.
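One caveat about the Content-Type line in (2): building it as application/{extension} happens to be right for .pdf, but it would be wrong for other files (for example, a .png should be image/png). A small sketch of a more general approach using the standard mimetypes module follows. The content_headers helper is a hypothetical name introduced here, and it does not escape quotes in file names.

```python
import mimetypes


def content_headers(filename: str) -> dict:
    """Build response headers for showing a stored file inline."""
    # guess_type returns (type, encoding); the type is None when unknown.
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        # Generic fallback for unrecognised or missing extensions.
        mime_type = "application/octet-stream"
    return {
        "Content-Type": mime_type,
        "Content-Disposition": f'inline; filename="{filename}"',
    }


pdf_headers = content_headers("report.pdf")
unknown_headers = content_headers("notes")
```

In the route, the returned dictionary could be applied with response.headers.update(content_headers(data.filename)).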
I hope this helps anyone coming across a similar problem in the future.

How to use one template for multiple pages in an XWPFDocument with Java

I would like to know how I can reuse one template (with one page inside and some variables) multiple times in an XWPFDocument object.
My idea is:
load the template once into an XWPFDocument as a template object
clone/create/copy the template object with all its styles, headers, etc.
fill the clone with content
add this clone to the destination XWPFDocument
I got this to work for a single page only.
When I try to clone/create/copy the template object, it loses all its style information.
How to copy a paragraph of .docx to another .docx with Java and retain the style
How to copy some content in one .docx to another .docx , using POI without losing format?
POI probably does not support this out of the box, but I have done something similar in my project poi-mail-merge. It works on the underlying XML to repeatedly replace markers in a template Microsoft Word document and combines the results into one resulting document.
So it basically duplicates the template document multiple times into the resulting document.
See here for how I do it there: basically, I work on the XML body text, do the replacements/changes there, and then append it to the result document.
POI Mail Merge probably helps in other cases, but in my case it doesn't work.
My workaround is to first update my template XWPFDocument to the needed structure, save it temporarily, and read it back into an XWPFDocument object.
Here are the steps:
Read the template file into an XWPFDocument
Read the records from the data file, e.g. CSV
Calculate the number of pages needed for the data records
Get the body element objects from the template XWPFDocument
Create new body elements (depending on the number of pages) in the template XWPFDocument and replace them with the same objects obtained before
Save the updated template XWPFDocument temporarily
Read the temporarily saved template into an XWPFDocument
Replace all placeholders with your CSV data
Hope this helps somebody

PHPExcel write html file into existing xlsx file

I have a template file that I fill using PHPExcel. But I have terms and conditions that are saved in the database with HTML tags and inline CSS. These terms and conditions are subject to change, so I can't put them into the template. So the only solution is to take them from the database and put them inside the created template, but I have no clue how to open an xlsx file and insert an .html file inside it, perhaps as a second sheet.
This is my current code:
$objPHPExcel = new PHPExcel();
$objPHPExcel = PHPExcel_IOFactory::load($inputFileName);
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'Excel2007');
$objWriter->save($outputFileName);
And of course there is a lot of code that deals specifically with writing data to the Excel file, but that is working perfectly.
Could someone please explain how I could go about doing this?
Thanks
You can't simply insert an HTML file inside an xlsx file.
The latest develop branch of PHPExcel does include an HTML to Rich Text wizard that will take a block of HTML markup and convert it to a Rich Text object that can then be stored in a cell, and /Examples/42richText.php demonstrates how it can be used. At present, this only covers basic markup tags (<br />, <font>, <b>, <i>, <em>, <strong>, <sub>, <sup>, <ins>, <del>, etc) and doesn't handle inline style in any way. However, it might provide the basis for what you want with some additional work.

PowerShell and ITextSharp - create new PDF based on template

I am using ITextSharp and PowerShell to create a PDF document.
I want to be able to load an existing template PDF file which ideally has placeholders and then replace the placeholders with values I supply.
Then I want to save the document with the changes as a new PDF.
Is this possible?
Right now here is the code I have for creating a PDF
[System.Reflection.Assembly]::LoadFrom("c:\\itextsharp.dll")
[void][iTextSharp.text.pdf.PdfWriter]::GetInstance($Doc, [System.IO.File]::Create("c:\existing.pdf") )
# Need to edit $Doc (replace values, add elements) then save as new file
$Doc.Close()
Any help is appreciated.
Thanks,
Andrew
You'll use the AcroFields.SetField method to specify the value you want in each of the fields in your fillable PDF form:
[System.Reflection.Assembly]::LoadFrom($iTextSharpLibFullname)
$reader = New-Object iTextSharp.text.pdf.PdfReader($templateFileFullname)
$stamper = New-Object iTextSharp.text.pdf.PdfStamper($reader,
[System.IO.File]::Create($outputFileFullname))
$stamper.AcroFields.SetField('Field1_Name', 'Field1_Value')
$stamper.AcroFields.SetField('Field2_Name', 'Field2_Value')
#etc. for each field in your form...
$stamper.Close()
Where:
$iTextSharpLibFullname is a reference to iTextSharp.dll
$templateFileFullname is the name of your fillable PDF template form
$outputFileFullname is the name of the PDF you'll create

Plone 4 - Get url of a file in a plone.app.blob.field.FileField

I have a custom content type with 3 FileFields (plone.app.blob.field.FileField) and I want to get their URLs, so I can put them on my custom view and people will be able to download these files.
However, when using Clouseau to test and debug, I call:
context.getFirst_file().absolute_url()
Where getFirst_file() is the accessor to the first file (field called 'first_file').
The url returned is 'http://foo/.../eat.00001', where 'eat.00001' is the object of my custom type that contains the file fields...
The interesting thing is, if I call:
context.getFirst_file().getContentType()
It returns 'application/pdf', which is correct since it's a pdf file.
I'm pretty lost here, any help is appreciated. Thanks in advance!
File fields do not support an absolute_url method; instead, through acquisition you inherit the method from the object itself, hence the results you see. Moreover, calling getFirst_field() returns the actual downloadable contents of the field, not the field itself, which could provide such information.
Instead, you should use the at_download script appended to the object URL, followed by the field id:
First File
You can also re-use the Archetypes widget for the field, by passing the field name to the widget method:
<metal:field use-macro="python:context.widget('first_field', mode='view')">
First File
</metal:field>
This will display the file size, icon (if available), the filename and the file mime type.
In both these examples, I assumed the name of the field is 'first_field'.