Apache Tika: MSG remove extra line breaks in result string

I have an MSG file with this body text:
<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;">
<div>Test message.</div>
<div> </div>
<div>More content here...</div>
<div> </div>
<div>Best regards,</div>
<div>Mr. Crowley</div></div></body></html>
I try to get the content of the file above using Apache Tika...
final InputStream input = new FileInputStream("file.html");
final ContentHandler handler = new BodyContentHandler();
final Metadata metadata = new Metadata();
final HtmlParser htmlParser = new HtmlParser();
htmlParser.parse(input, handler, metadata, new ParseContext());
String plainText = handler.toString();
System.out.println(plainText);
...and all is fine except extra linebreaks:
Test message.
More content here...
Best regards,
Mr. Crowley
<and 3 empty lines here>
Is it possible to avoid this behavior and get output without the extra line breaks?
Help me to fix this.
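One possible workaround is to post-process the extracted string rather than change how Tika parses the file. The sketch below is not a Tika feature: the class name BlankLineCleaner and the method collapseBlankLines are hypothetical, and it assumes that every empty line can be dropped, including lines that hold only spaces or non-breaking spaces (which the `<div> </div>` elements may produce).

```java
import java.util.regex.Pattern;

public class BlankLineCleaner {
    // Matches two or more consecutive line breaks, allowing spaces, tabs,
    // or non-breaking spaces on the otherwise empty lines in between.
    private static final Pattern BLANK_RUN =
            Pattern.compile("(?:[ \\t\\u00A0]*\\R){2,}");

    /** Collapses each run of empty lines into one line break and trims both ends. */
    public static String collapseBlankLines(String text) {
        String collapsed = BLANK_RUN.matcher(text).replaceAll("\n");
        return collapsed.trim();
    }

    public static void main(String[] args) {
        // Stand-in for the value returned by handler.toString() (assumed shape).
        String extracted = "Test message.\n\u00A0\n\nMore content here...\n \n\n"
                + "Best regards,\nMr. Crowley\n\n\n\n";
        System.out.println(collapseBlankLines(extracted));
    }
}
```

In the question's code this would be applied as `collapseBlankLines(handler.toString())`. If paragraph breaks should survive, replacing with "\n\n" instead of "\n" keeps one empty line per run.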

Why HtmlAgilityPack adds some characters to my html

Here is my code:
Dim input = "<div><textarea>something</div></textarea>"
Dim doc As New HtmlAgilityPack.HtmlDocument
doc.OptionOutputAsXml = True
doc.LoadHtml(Input)
Using writer As New StringWriter
doc.Save(writer)
Dim res = writer.ToString
End Using
and the value of 'res' is:
"<?xml version="1.0" encoding="windows-1255"?>
<div>
<textarea>
//<![CDATA[
something
//]]>//
</textarea>
</div>"
the result as html is: My textarea
How can I prevent it?
From my understanding of it, the reason is implied by this answer to Set textarea value with HtmlAgilityPack:
A <textarea> element doesn't have a value attribute. Its content is its own text node:
<textarea>
Some content
</textarea>
To simulate the same thing safely, HAP has to enclose the content in a //<![CDATA[ section.
The source code for HAP has this comment for the relevant line(s):
// tags whose content may be anything
ElementsFlags.Add("textarea", HtmlElementFlag.CData);
So, you can't prevent it.

Google script code formatted,colored and beautiful indent

I wrote a container-bound script and now want to make a report from it, by inserting the code into a Google Docs file. The problem is that with copy & paste from the Script Editor, the code is no longer colored or indented. I will need your help because I don't know how to make it well done.
I have this code:
function createAndSendDocument() {
// Create a new Google Doc named 'Hello, world!'
var doc = DocumentApp.create('Hello, world!');
// Access the body of the document, then add a paragraph.
doc.getBody().appendParagraph('This document was created by Google Apps Script.');
// Get the URL of the document.
var url = doc.getUrl();
// Get the email address of the active user - that's you.
var email = Session.getActiveUser().getEmail();
}
As tehhowch said, you'll need to write your own JavaScript code to do the syntax formatting and then use its output.
You can use this: https://www.w3schools.com/howto/tryit.asp?filename=tryhow_syntax_highlight The script is already in place there; you only need to encode your HTML, put it inside the div with id="myDiv", and run the JavaScript code.
<div id="myDiv">
Your encoded html goes here
</div>
Example
<div id="myDiv">
&lt;!DOCTYPE html&gt;<br>
&lt;html&gt;<br>
&lt;body&gt;<br>
<br>
&lt;h1&gt;Testing an HTML Syntax Highlighter&lt;/h1&gt;<br>
&lt;p&gt;Hello world!&lt;/p&gt;<br>
&lt;a href="https://www.w3schools.com"&gt;Back to School&lt;/a&gt;<br>
<br>
&lt;/body&gt;<br>
&lt;/html&gt;
</div>
Make sure you encode your HTML first (< becomes &lt;, > becomes &gt;, and so on).
Then you can use the output of that. Sample: https://docs.google.com/document/d/1h8oDOZ0ReTgwxnYt2JKflHWJdlianSWWuBgbWcSdJC0/edit?usp=sharing
Reference and further reading: https://www.w3schools.com/howto/tryit.asp?filename=tryhow_syntax_highlight
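The encoding step mentioned above can be sketched in code. This is only a minimal illustration in Java (the thread itself works in JavaScript, and the same replacements apply in any language); the class name HtmlEncoder and the method encode are hypothetical and not part of any library mentioned here.

```java
public class HtmlEncoder {
    /** Replaces the characters that are significant in HTML with entities. */
    public static String encode(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;  // must not be double-encoded later
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The encoded text is what would be placed inside the div with id="myDiv".
        System.out.println(encode("<p>Hello world!</p>"));
    }
}
```

Processing the input one character at a time avoids the double-encoding problem that chained string replacements can cause when "&" is not handled first.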

MVC4 C# - How to submit list of object that are being displayed to the user?

I'm working on an MVC4 C# project in VS2010.
I would like to allow the user to upload the contents of a .csv file to a database but there is a requirement to first echo the contents of the file to screen (as a final visual check) before submitting. What would be the best approach of submitting to the database as I am struggling to find a way of persisting the complex object in the view?
Here is the view where I am using a form to allow the user to upload the csv file:
@model IEnumerable<MyNamespace.Models.MyModel>
@{
ViewBag.Title = "Upload";
WebGrid grid = new WebGrid(Model, rowsPerPage: 5);
}
<h2>Upload</h2>
<form action="" method="post" enctype="multipart/form-data">
<label for="file">Filename:</label>
<input type="file" name="file" id="file" />
<input type="submit" />
</form>
<h2>Grid</h2>
@grid.GetHtml(
//Displaying Grid here)
<p>
@Html.ActionLink("Submit", "Insert")
</p>
Here is the action in the controller that processes the csv file:
[HttpPost]
public ActionResult Upload(HttpPostedFileBase file)
{
var fileName = Path.GetFileName(file.FileName);
var path = Path.Combine(Server.MapPath("~/App_Data"), fileName);
file.SaveAs(path);
//Stream reader will read test.csv file in current folder
StreamReader sr = new StreamReader(path);
//Csv reader reads the stream
CsvReader csvread = new CsvReader(sr);
List<MyModel> listMyModel = new List<MyModel>(); // creating list of model.
csvread.Configuration.RegisterClassMap<MyModelMap>(); // use mapping specified.
listMyModel = csvread.GetRecords<MyModel>().ToList();
sr.Close();
//return View();
return View(listMyModel);
}
Up until this point everything is simple, I can upload the csv to the controller, read using CsvHelper, produce a list of MyModel objects and display in the view within a grid. To reiterate my initial question, is it now possible to submit the complex object (the list of MyModel) from the view as I can't figure out a way of making it available to an action within the controller.
Thank you.
Yes, it's possible. It's "easier" if you have a model that contains the IEnumerable as a property, so you can use a naming convention like this:
Property[index].ItemProperty
for every Html input/select field.
If you want to keep the IEnumerable as Model I think the naming convention is something like this:
ItemProperty[index]
So translated in code:
@Html.TextBoxFor(t => t.Property, new { name = "Property[" + i + "]" })
where i comes from a for loop to render all items or something like that.
I have already done it but I can't find the code at the moment. KendoUI uses this scheme for its multirows edit in the grid component.
You can check their POST AJAX requests for the right naming convention.
EDIT 1:
Otherwise, you can store the model somewhere temporarily, retrieve it on every request, and update it with the user's input. It's a little more expensive but probably easier to write. An updated CSV file or a temporary database table would work.

First run FileResult to download file, then RedirectToAction

This will send an email when a button gets pushed. However I am trying to call the FileResult, SaveDocument, to download a file right before redirecting back to the button page.
I am using a hardcoded file for now to download for the sake of testing. I can run the SaveDocument() result using a test button. I can't send an email, run the SaveDocument Action and then redirect.
[HttpGet]
public ActionResult send(int thisbatch, string listofpositives)
{
MailMessage mail = new MailMessage();
SmtpClient smtpServer = new SmtpClient("smtperServer");
smtpServer.Port = 25; // Gmail works on this port
smtpServer.EnableSsl = false;
mail.From = new MailAddress("xxxe@xxx.com");
mail.To.Add("xxxe@xxx.com");
mail.Subject = "Batch Closed";
mail.Body="some info stuff here";
smtpServer.Send(mail);
SaveDocument();
return RedirectToAction("AddColiform");
}
//THIS WORKS BY ITSELF CALLED FROM A BUTTON
public FileResult SaveDocument()
{
string filePath = Server.MapPath("~/XML_positives/Test1.xml");
string contentType = "text/xml";
return File(filePath, contentType, "Test1.xml");
}
Well, no solution was found (so far) to download a file and RedirectToAction back to the initial page in the same ActionResult. If someone can come up with a better answer I will take this one off.
So based on contents of string "listofpositives" I call a new View "Has Positives" with 2 buttons: One calls the FileResult Action and one redirects back to where everything started (this is desired).
A lot clunkier than just popping up a File Save As dialog and then moving on automatically. But I need to build something and move on. I feel sorry for my users. oh well.
Here is the code after I send an email and return to desired view:
if (listofpositives == "")
{
return RedirectToAction("AddColiform");
}
else
{
return RedirectToAction("HasPositives",new { thisbatch=thisbatch, listofpositives=listofpositives});
}
Here is the code for the whole extra view:
@{
ViewBag.Title ="HasPositives" ;
}
<h2>HasPositives</h2>
<p>Batch: @ViewData["batchid"] </p>
<p>Positives: @ViewData["listofpositives"]</p>
<br /><br />
Process your XML file using XMLSampling<br /><br /><br />
<button onclick="location.href='@Url.Action("SaveDocument", "Home")';return false;">Download Positive XML File</button>
<button onclick="location.href='@Url.Action("AddColiform", "Home")';return false;">Done</button>

I need help in uploading a file to a folder on my server and then adding the file details to a database

I am trying to make a page for my site where users can upload a game and give it a name. The file should be uploaded to a folder on my site, and the file name and game name should be added to my SQL database. When I try to run it, I get this error: "System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index".
My code for this page is:
@{
var db= Database.Open("Games");
var sqlQ = "SELECT * FROM Games";
var data = db.Query(sqlQ);
}
@{
var fileName = "";
if (IsPost) {
var fileSavePath = "";
var uploadedFile = Request.Files[0];
fileName = Path.GetFileName(uploadedFile.FileName);
fileSavePath = Server.MapPath("~/App_Data/UploadedFiles/" +
fileName);
uploadedFile.SaveAs(fileSavePath);
}
var GameName="";
var GameGenre="";
var GameYear="";
if(IsPost){
GameName=Request["formName"];
var SQLINSERT = "INSERT INTO Games (Name, file_path) VALUES (@0, @1)";
db.Execute(SQLINSERT, GameName, fileName);
Response.Redirect("default.cshtml");
}
}
<h1 >Add a new game to the database</h1>
<form action="" method="post">
<p>Name:<input type="text" name="formName" /></p>
@FileUpload.GetHtml(
initialNumberOfFiles:1,
allowMoreFilesToBeAdded:false,
includeFormTag:true,
uploadText:"Add")
The error page says that the problem is with line 11, but I don't know what to change.
As Hightechrider has correctly said, your code should reside in controllers, not in views. The exception occurs because Request.Files is an empty collection, so accessing element [0] throws the ArgumentOutOfRangeException you are seeing.
I'd recommend reading the following article on uploading files in ASP.NET MVC. It presents an easier, more consistent and flexible model by putting the code in the controller action. Basically, it all boils down to this:
You create an upload form with submit button and file input.
You create an action that accepts HttpPostedFileBase variable.
You must set the proper enctype to your form if you want to upload files:
<form action="" method="post" enctype="multipart/form-data">