Creating listview using arraylist - arraylist

I have 2 arrays as follows, I want to create a list view showing the details from both array together eg
Location:-------
Magnitude:-------
How can i do that?
String des = description.toString();
Matcher matcher = Pattern.compile("(?<=Location: ).*?(?= ;)").matcher(des);
List<String> list = new LinkedList<>();
while (matcher.find()) {
list.add(matcher.group());
}
String des1 = description.toString();
Matcher matcher1 = Pattern.compile("(?<=Magnitude: ).*?(?= ;)").matcher(des1);
List<String> listmagnitude = new ArrayList<>();
while (matcher1.find()) {
listmagnitude.add(matcher1.group());
}

Try this!
int i = 0;
for(String l: list) {
System.out.format("Location: %s \t Magnitude: %s\n", l, listMagnitude[i]);
i++;
}

Related

Update PDF using pdfBox

I would like to ask if having a PDF it is possible, using pdfbox libraries, to update it at a specific point.
I am trying to use a solution already online but seems the gettoken() method does not enter code heresection the words properly to allow me to find the part I would like to modify.
This is the code(Groovy):
for( int i = 0; i < dataContext.getDataCount(); i++ ) {
InputStream is = dataContext.getStream(i);
Properties props = dataContext.getProperties(i);
String searchString= "Hours worked";
String replacement = "Hours worked: 2";
File file = new File("\\\\****\\UKDC\\GFS\\PRE\\PREPROD\\Alchemer\\Template\\***.pdf");
PDDocument doc = PDDocument.load(file);
for ( PDPage page : doc.getPages() )
{
PDFStreamParser parser = new PDFStreamParser(page);
parser.parse();
List tokens = parser.getTokens();
logger.info("in Page");
for (int j = 0; j < tokens.size(); j++)
{
logger.info("tokens:"+tokens[j]);
Object next = tokens.get(j);
//logger.info("in Object");
if (next instanceof Operator)
{
Operator op = (Operator) next;
String pstring = "";
int prej = 0;
//Tj and TJ are the two operators that display strings in a PDF
if (op.getName().equals("Tj"))
{
logger.info("in Tj");
// Tj takes one operator and that is the string to display so lets update that operator
COSString previous = (COSString) tokens.get(j - 1);
String string = previous.getString();
logger.info("previousString:"+string);
string = string.replaceFirst(searchString, replacement);
previous.setValue(string.getBytes());
} else
if (op.getName().equals("TJ"))
{
logger.info("in TJ:"+ op.getName());
COSArray previous = (COSArray) tokens.get(j - 1);
logger.info("previous:"+previous);
for (int k = 0; k < previous.size(); k++)
{
Object arrElement = previous.getObject(k);
if (arrElement instanceof COSString)
{
COSString cosString = (COSString) arrElement;
String string = cosString.getString();
logger.info("string:"+string);
if (j == prej || string.equals(" ") || string.equals(":") || string.equals("-")) {
pstring += string;
} else {
prej = j;
pstring = string;
}
}
}
logger.info("pstring:"+pstring);
if (searchString.equals(pstring.trim()))
{
logger.info("in searchString");
COSString cosString2 = (COSString) previous.getObject(0);
cosString2.setValue(replacement.getBytes());
int total = previous.size()-1;
for (int k = total; k > 0; k--) {
previous.remove(k);
}
}
}
}
}
logger.info("in updatedStream");
// now that the tokens are updated we will replace the page content stream.
PDStream updatedStream = new PDStream(doc);
OutputStream out = updatedStream.createOutputStream(COSName.FLATE_DECODE);
ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
tokenWriter.writeTokens(tokens);
logger.info("in tokenWriter");
out.close();
page.setContents(updatedStream);
doc.save("\\\\***\\UKDC\\GFS\\PRE\\PREPROD\\Alchemer\\***1.pdf");
}
Executing the code I am trying to search "Hours worked" String and update with
"Hours worked: 2"
There are 2 questions:
1.When I execute and check the logs can see the Tokens are not created properly:
enter image description here
enter image description here
So are created two different COSArrays meantime I have all in one Line:
enter image description here
and this can be a problem if I have to search a specific word.
When it find the word it seems it is working but it apply a strange char:
enter image description here
So Here 2 questions:
How to manage to specify the token behaviour (or maybe for the parser) to get an entire phrase in the same token until a special char happen?
Hot to format the new char in the new PDF?
Hope you can help me, thanks for your support.

Syntax Highlighting for go in vb.net

Ok so I have been making a simple code editor in vb.net for go.. (for personal uses)
I tried this code -
Dim tokens As String = "(break|default|func|interface|select|case|defer|go|map|struct|chan|else|goto|package|switch|const|fallthrough|if|range|type|continue|for|import|return|var)"
Dim rex As New Regex(tokens)
Dim mc As MatchCollection = rex.Matches(TextBox2.Text)
Dim StartCursorPosition As Integer = TextBox2.SelectionStart
For Each m As Match In mc
Dim startIndex As Integer = m.Index
Dim StopIndex As Integer = m.Length
TextBox2.[Select](startIndex, StopIndex)
TextBox2.SelectionColor = Color.FromArgb(0, 122, 204)
TextBox2.SelectionStart = StartCursorPosition
TextBox2.SelectionColor = Color.RebeccaPurple
Next
but I couldn't add something like print statements say I want a fmt.Println("Hello World"), that is not possible, anyone help me?
I want a simple result that will do proper syntax without glitching text colors like this current code does.
Here's a code showing how to update highlighting with strings and numbers.
You would need to tweak it further to support syntax like comments, etc.
private Regex BuildExpression()
{
string[] exprs = {
"(break|default|func|interface|select|case|defer|go|map|struct|chan|else|goto|package|switch|const|fallthrough|if|range|type|continue|for|import|return|var)",
#"([0-9]+\.[0-9]*(e|E)(\+|\-)?[0-9]+)|([0-9]+\.[0-9]*)|([0-9]+)",
"(\"\")|\"((((\\\\\")|(\"\")|[^\"])*\")|(((\\\\\")|(\"\")|[^\"])*))"
};
StringBuilder sb = new StringBuilder();
for (int i = 0; i < exprs.Length; i++)
{
string expr = exprs[i];
if ((expr != null) && (expr != string.Empty))
sb.Append(string.Format("(?<{0}>{1})", "_" + i.ToString(), expr) + "|");
}
if (sb.Length > 0)
sb.Remove(sb.Length - 1, 1);
RegexOptions options = RegexOptions.ExplicitCapture | RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline | RegexOptions.Compiled | RegexOptions.IgnoreCase;
return new Regex(sb.ToString(), options);
}
private void HighlightSyntax()
{
var colors = new Dictionary<int, Color>();
var expression = BuildExpression();
Color[] clrs = { Color.Teal, Color.Red, Color.Blue };
int[] intarray = expression.GetGroupNumbers();
foreach (int i in intarray)
{
var name = expression.GroupNameFromNumber(i);
if ((name != null) && (name.Length > 0) && (name[0] == '_'))
{
var idx = int.Parse(name.Substring(1));
if (idx < clrs.Length)
colors.Add(i, clrs[idx]);
}
}
foreach (Match match in expression.Matches(richTextBox1.Text))
{
int index = match.Index;
int length = match.Length;
richTextBox1.Select(index, length);
for (int i = 0; i < match.Groups.Count; i++)
{
if (match.Groups[i].Success)
{
if (colors.ContainsKey(i))
{
richTextBox1.SelectionColor = colors[i];
break;
}
}
}
}
}
What we found during development of our Code Editor libraries, is that the regular expression-based parsers are hard to adapt to fully support advanced syntax like contextual keywords (LINQ) or interpolated strings.
You might find a bit more information here:
https://www.alternetsoft.com/blog/code-parsing-explained
The most accurate syntax highlighting for VB.NET can be implemented using Microsoft.CodeAnalysis API, it's the same API used internally by Visual Studio text editor.
Below is sample code showing how to get classified spans for VB.NET code (every span contains start/end position within the text and classification type, i.e. keyword, string, etc.). These spans then can be used to highlight text inside a textbox.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Classification;
using Microsoft.CodeAnalysis.Host.Mef;
using Microsoft.CodeAnalysis.Text;
public class VBClassifier
{
private Workspace workspace;
private static string FileContent = #"
Public Sub Run()
Dim test as TestClass = new TestClass()
End Sub";
public void Classify()
{
var project = InitProject();
var doc = AddDocument(project, "file1.vb", FileContent);
var spans = Classify(doc);
}
protected IEnumerable<ClassifiedSpan> Classify(Document document)
{
var text = document.GetTextAsync().Result;
var span = new TextSpan(0, text.Length);
return Classifier.GetClassifiedSpansAsync(document, span).Result;
}
protected Document AddDocument(Project project, string fileName, string code)
{
var documentId = DocumentId.CreateNewId(project.Id, fileName);
ApplySolutionChanges(s => s.AddDocument(documentId, fileName, code, filePath: fileName));
return workspace.CurrentSolution.GetDocument(documentId);
}
protected virtual void ApplySolutionChanges(Func<Solution, Solution> action)
{
var solution = workspace.CurrentSolution;
solution = action(solution);
workspace.TryApplyChanges(solution);
}
protected MefHostServices GetRoslynCompositionHost()
{
IEnumerable<Assembly> assemblies = MefHostServices.DefaultAssemblies;
var compositionHost = MefHostServices.Create(assemblies);
return compositionHost;
}
protected Project CreateDefaultProject()
{
var solution = workspace.CurrentSolution;
var projectId = ProjectId.CreateNewId();
var projectName = "VBTest";
ProjectInfo projectInfo = ProjectInfo.Create(
projectId,
VersionStamp.Default,
projectName,
projectName,
LanguageNames.VisualBasic,
filePath: null);
ApplySolutionChanges(s => s.AddProject(projectInfo));
return workspace.CurrentSolution.Projects.FirstOrDefault();
}
protected Project InitProject()
{
var host = GetRoslynCompositionHost();
workspace = new AdhocWorkspace(host);
return CreateDefaultProject();
}
}
Update:
Here's a Visual Studio project demonstrating both approaches:
https://drive.google.com/file/d/1LLuzy7yDFAE-v40I7EswECYQSthxheEf/view?usp=sharing

Automating Import of large csv files(~3gb) with C#

I am a bit new to this but my goal is to import the data from a csv file into a sql table and include additional values for each row being the file name and date. I was able to accomplish this using entity frame work and iterating through each row of the file but with the size of the files it will take too long too actually complete.
I am looking for a method to accomplish this import faster. I was looking into potentially using csvhelper with sqlbulkcopy to accomplish this but was not sure if there was a way to pass in the additional values needed for each row.
public void Process(string filePath)
{
InputFilePath = filePath;
DateTime fileDate = DateTime.Today;
string[] fPath = Directory.GetFiles(InputFilePath);
foreach (var file in fPath)
{
string fileName = Path.GetFileName(file);
char[] delimiter = new char[] { '\t' };
try
{
using (var db = new DatabaseName())
{
using (var reader = new StreamReader(file))
{
string line;
int count = 0;
int sCount = 0;
reader.ReadLine();
reader.ReadLine();
while ((line = reader.ReadLine()) != null)
{
count++;
string[] row = line.Split(delimiter);
var rowload = new ImportDestinationTable()
{
ImportCol0 = row[0],
ImportCol1 = row[1],
ImportCol2 = TryParseNullable(row[2]),
ImportCol3 = row[3],
ImportCol4 = row[4],
ImportCol5 = row[5],
IMPORT_FILE_NM = fileName,
IMPORT_DT = fileDate
};
db.ImportDestinationTable.Add(rowload);
if (count > 100)
{
db.SaveChanges();
count = 0;
}
}
db.SaveChanges();
//ReadLine();
}
}
}
static int? TryParseNullable(string val)
{
int outValue;
return int.TryParse(val, out outValue) ? (int?)outValue : null;
}
}

How to add columns subsequently in a dataset using Spark within a for loop ( where for loop contains the column name)

Here trying to add subsequently column to dataset Row, the issue coming up is last column is only visible. The columns added earlier do not persist
private static void populate(Dataset<Row> res, String[] args)
{
String[] propArr = args[0].split(","); // Eg: [abc, def, ghi]
// Dataset<Row> addColToMergedData = null;
/** Here each element is the name of the column to be inserted */
for(int i = 0; i < propArr.length; i++){
// addColToMergedData = res.withColumn(propArr[i], lit(null));
}
}
the logic in the for loop is flawed hence the issue.
you can modify the program as follows :
private static void populate(Dataset<Row> res, String[] args)
{
String[] propArr = args[0].split(","); // Eg: [abc, def, ghi]
Dataset<Row> addColToMergedData = null;
/** Here each element is the name of the column to be inserted */
for(int i = 0; i < propArr.length; i++)
{
res = res.withColumn(propArr[i], lit(null));
}
addColToMergedData = res
}
Sol:
// addColToMergedData = res.withColumn(colMap.get(propArr[i]), lit(null));
should be written as :
res = res.withColumn(colMap.get(propArr[i]), lit(null));
Here's the clean and conscious way to do this by using Scala:
val columnsList = Seq("one", "two", "three", "four", "five")
val res = columnsList.foldLeft(addColToMergedData) { case (df, colName) =>
df.withColumn(colu, lit(null))
}

parser.getTokens() gives out junk data and singlecharacters PDFBox-1.8.9 version

I am new to pdfbox. I am using pdfbox-app-2.0.0-RC1 version to fetch the entire text from the pdf using PDFTextStripperByArea. Is it possible for me to get each string separately?
For Example,
In the following text,
Nomination : Name&Address
Shipper : shipper name
I need Nomination as seperate string and "Name&Address" as separate string. Instead I am getting each character separately. I have tried with different Pdfs. For most pdfs I am able to get the exact string but for few pdfs I don't.
I am using the following code to get the separate string.
for (PDPage page : doc.getPages()) {
PDFStreamParser parser = new PDFStreamParser(page);
parser.parse();
List<Object> tokens = parser.getTokens();
for (int j = 0; j < tokens.size(); j++) {
Object next = tokens.get(j);
if (next instanceof Operator) {
Operator op = (Operator) next;
if (op.getName().equals("Tj")) {
COSString previous = (COSString) tokens.get(j - 1);
String string = previous.getString();
System.out.println("string1===" + string);
if (string.contains("Plant")) {
int size = al.size();
al.add(string);
stop = false;
continue;
}
if (!string.contains("_") && !stop) {
if (string.contains("Nomination")) {
stop = true;
} else {
al.add(string);
}
}
} else if (op.getName().equals("TJ")) {
COSArray previous = (COSArray) tokens.get(j - 1);
for (int k = 0; k < previous.size(); k++) {
Object arrElement = previous.getObject(k);
if (arrElement instanceof COSString) {
COSString cosString = (COSString)arrElement;
String string = cosString.getString();
System.out.println("string2====>>"+string);
al.add(string);
}
}
}
}
}
}
I am getting the following output:
string2====>>Nom
string2====>>i
string2====>>na
string2====>>t
string2====>>i
string2====>>on
string1===
string2====>>(
string2====>>T
string2====>>o
string1===
string2====>>Loa
string2====>>di
string2====>>ng
string1===
string2====>>Fa
string2====>>c
string2====>>i
string2====>>l
string2====>>i
string2====>>t
string2====>>y
string2====>>)