Is there a way to retrieve the entire line that the FileHelpers Engine is parsing

We are using FileHelpers to parse a text file into numerous entities, based on the first character in the line. Each entity is then stored in a particular database table. We would also like to store each input string, as a whole, in addition to the parsed fields.
Is there a way to capture the input line, before it is parsed into the individual fields of the entity?

You can use events to get the full line: the event arguments passed to the BeforeReadRecord and AfterReadRecord handlers have a RecordLine property that contains it.
Here is an example: https://www.filehelpers.net/example/EventsAndNotification/ReadEvents/
[FixedLengthRecord(FixedMode.AllowVariableLength)]
[IgnoreEmptyLines]
public class OrdersFixed
{
[FieldFixedLength(7)]
public int OrderID;
[FieldFixedLength(8)]
public string CustomerID;
[FieldFixedLength(8)]
public DateTime OrderDate;
[FieldFixedLength(11)]
public decimal Freight;
}
public override void Run()
{
var engine = new FileHelperEngine<OrdersFixed>();
engine.BeforeReadRecord += BeforeEvent;
engine.AfterReadRecord += AfterEvent;
var result = engine.ReadFile("report.inp");
foreach (var value in result)
Console.WriteLine("Customer: {0} Freight: {1}", value.CustomerID, value.Freight);
}
private void BeforeEvent(EngineBase engine, BeforeReadEventArgs<OrdersFixed> e)
{
Console.Write(e.RecordLine);
}
private void AfterEvent(EngineBase engine, AfterReadEventArgs<OrdersFixed> e)
{
Console.Write(e.RecordLine);
}

Related

Trouble employing BeanItemContainer and TreeTable in Vaadin

I have reviewed multiple examples of how to construct a TreeTable, both from a Container datasource and by simply adding items while iterating over an Object[][]. Still, I'm stuck on my use case.
I have a bean like so...
public class DSRUpdateHourlyDTO implements UniquelyKeyed<AssetOwnedHourlyLocatableId>, Serializable
{
private static final long serialVersionUID = 1L;
private final AssetOwnedHourlyLocatableId id = new AssetOwnedHourlyLocatableId();
private String commitStatus;
private BigDecimal economicMax;
private BigDecimal economicMin;
public void setCommitStatus(String commitStatus) { this.commitStatus = commitStatus; }
public void setEconomicMax(BigDecimal economicMax) { this.economicMax = economicMax; }
public void setEconomicMin(BigDecimal economicMin) { this.economicMin = economicMin; }
public String getCommitStatus() { return commitStatus; }
public BigDecimal getEconomicMax() { return economicMax; }
public BigDecimal getEconomicMin() { return economicMin; }
public AssetOwnedHourlyLocatableId getId() { return id; }
@Override
public AssetOwnedHourlyLocatableId getKey() {
return getId();
}
}
The AssetOwnedHourlyLocatableId is a compound id. It looks like...
public class AssetOwnedHourlyLocatableId implements Serializable, AssetOwned, HasHour, Locatable,
UniquelyKeyed<AssetOwnedHourlyLocatableId> {
private static final long serialVersionUID = 1L;
private String location;
private String hour;
private String assetOwner;
@Override
public String getLocation() {
return location;
}
@Override
public void setLocation(final String location) {
this.location = location;
}
@Override
public String getHour() {
return hour;
}
@Override
public void setHour(final String hour) {
this.hour = hour;
}
@Override
public String getAssetOwner() {
return assetOwner;
}
@Override
public void setAssetOwner(final String assetOwner) {
this.assetOwner = assetOwner;
}
}
I want to generate a grid where the hours are pivoted into column headers and the location is the only other column header.
E.g.,
Location 1 2 3 4 5 6 ... 24
would be the column headers.
Underneath each column you might see...
> L1
> Commit Status Status1 .... Status24
> Eco Min EcoMin1 .... EcoMin24
> Eco Max EcoMax1 .... EcoMax24
> L2
> Commit Status Status1 .... Status24
> Eco Min EcoMin1 .... EcoMin24
> Eco Max EcoMax1 .... EcoMax24
So, if I'm provided a List<DSRUpdateHourlyDTO> I want to convert it into the presentation format described above.
What would be the best way to do this?
I have a few additional functional requirements.
I want to be able to toggle between read-only and editable views of the same table.
I want to be able to complete a round-trip to a datasource (e.g., JPAContainerSource).
I (will eventually) want to filter items by any part of the compound id.
My challenge is in the adaptation. I well understand the simple use case where I could take the list and simply splat it into a BeanItemContainer and use addNestedContainerProperty and setVisibleColumns. Pivoting properties into columns seems to be what's stumping me.
As it turns out this was an ill-conceived question.
For data entry purposes, one could use a BeanItemContainer, include the nested container property hour from the composite id as a column and, instead of a TreeTable, use a Table that has commitStatus, economicMin and economicMax as the other columns. Limitation: you'd only ever query for / submit one assetOwner and location's worth of data.
As for display, where you don't care to filter one assetOwner and location's worth of data, you could pivot the hour info as originally described. You could just convert the original bean into another bean suitable for display (where each hour is its own column).
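A minimal sketch of that conversion, in plain Java with no Vaadin dependency. HourlyValue is a simplified stand-in for DSRUpdateHourlyDTO (the compound id is flattened into plain fields), and PivotRow and HourlyPivot are hypothetical names made up for this illustration. The idea is one display row per location and attribute, with one cell per hour:

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for DSRUpdateHourlyDTO (compound id flattened).
final class HourlyValue {
    final String location;
    final String hour;
    final String commitStatus;
    final BigDecimal economicMin;
    final BigDecimal economicMax;

    HourlyValue(String location, String hour, String commitStatus,
                BigDecimal economicMin, BigDecimal economicMax) {
        this.location = location;
        this.hour = hour;
        this.commitStatus = commitStatus;
        this.economicMin = economicMin;
        this.economicMax = economicMax;
    }
}

// Hypothetical display bean: one row per (location, attribute), one cell per hour.
final class PivotRow {
    final String location;
    final String attribute;
    final Map<String, Object> valuesByHour = new LinkedHashMap<>();

    PivotRow(String location, String attribute) {
        this.location = location;
        this.attribute = attribute;
    }
}

final class HourlyPivot {
    // Turns the flat hourly list into display rows, kept in first-seen order.
    static List<PivotRow> pivot(List<HourlyValue> values) {
        Map<String, PivotRow> rows = new LinkedHashMap<>();
        for (HourlyValue v : values) {
            row(rows, v.location, "Commit Status").valuesByHour.put(v.hour, v.commitStatus);
            row(rows, v.location, "Eco Min").valuesByHour.put(v.hour, v.economicMin);
            row(rows, v.location, "Eco Max").valuesByHour.put(v.hour, v.economicMax);
        }
        return new ArrayList<>(rows.values());
    }

    // Looks up the row for a location/attribute pair, creating it on first use.
    private static PivotRow row(Map<String, PivotRow> rows, String location, String attribute) {
        return rows.computeIfAbsent(location + "|" + attribute,
                key -> new PivotRow(location, attribute));
    }
}
```

Each PivotRow could then be added to a plain container with one property per hour. Writing edits back would need the reverse mapping, which this sketch does not attempt.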

C# How to define a variable as global within a (Step Definition) class

Below is an extract from a Step Definition class of my SpecFlow project.
In the first method public void WhenIExtractTheReferenceNumber() I can successfully extract the text from the application under test, and I have proved this using the Console.WriteLine();
I need to be able to use this text in other methods within my class, e.g. public void WhenIPrintNumber(), but I'm not sure how to do this!
I read about get/set but I could not get it working. Is it possible to make my var result global somehow, so that I can use it at any time during the test?
namespace Application.Tests.StepDefinitions
{
[Binding]
public class AllSharedSteps
{
[When(@"I extract the reference number")]
public void WhenIExtractTheReferenceNumber()
{
Text textCaseReference = ActiveCase.CaseReferenceNumber;
Ranorex.Core.Element elem = textCaseReference;
var result = elem.GetAttributeValue("Text");
Console.WriteLine(result);
}
[When(@"I print number")]
public void WhenIPrintNumber()
{
Keyboard.Press(result);
}
}
}
Thanks in advance for any thoughts.
Here is the solution to my question. Now I can access my variable(s) from any methods within my class. I have also included code that I'm using to split my string and then use the first part of the string. In my case I need the numerical part of '12345 - some text':
namespace Application.Tests.StepDefinitions
{
[Binding]
public class AllSharedSteps
{
private string result;
public Array splitReference;
[When(@"I extract the case reference number")]
public void WhenIExtractTheCaseReferenceNumber()
{
Text textCaseReference = ActiveCase.CaseReferenceNumber;
Ranorex.Core.Element elem = textCaseReference;
result = elem.GetAttributeValue("Text").ToString();
splitReference = result.Split('-'); // example of string to be split '12345 - some text'
Console.WriteLine(splitReference.GetValue(0).ToString().Trim());
}
[When(@"I print number")]
public void WhenIPrintNumber()
{
Keyboard.Press(result); // prints full string
Keyboard.Press(splitReference.GetValue(0).ToString()); // prints first part of string i.e. in this case, a reference number
}
}
}
I hope this helps somebody else :)

Text extraction from table cells

I have a pdf. The pdf contains a table. The table contains many cells (>100). I know the exact position (x,y) and dimension (w,h) of every cell of the table.
I need to extract text from cells using iTextSharp. Using PdfReaderContentParser + FilteredTextRenderListener (with code like this: http://itextpdf.com/examples/iia.php?id=279 ) I can extract text, but I need to run the whole procedure for each cell. My PDF has many cells and the program needs too much time to run. Is there a way to extract text from a list of rectangles? I need to know the text of each rectangle. I'm looking for something like PDFTextStripperByArea in PDFBox (you can define as many regions as you need and then get the text using .getTextForRegion("region-name") ).
This option is not immediately included in the iTextSharp distribution but it is easy to realize. In the following I use the iText (Java) class, interface, and method names because I am more at home with Java. They should easily be translatable into iTextSharp (C#) names.
If you use the LocationTextExtractionStrategy, you can use its a posteriori TextChunkFilter mechanism instead of the a priori FilteredRenderListener mechanism used in the sample you linked to. This mechanism was introduced in version 5.3.3.
For this you first parse the whole page content using the LocationTextExtractionStrategy without any FilteredRenderListener filtering applied. This makes the strategy object collect TextChunk objects for all PDF text objects on the page containing the associated base line segment.
Then you call the strategy's getResultantText overload with a TextChunkFilter argument (instead of the regular no-argument overload):
public String getResultantText(TextChunkFilter chunkFilter)
You call it with a different TextChunkFilter instance for each table cell. You have to implement this filter interface which is not too difficult as it only defines one method:
public static interface TextChunkFilter
{
/**
* @param textChunk the chunk to check
* @return true if the chunk should be allowed
*/
public boolean accept(TextChunk textChunk);
}
So the accept method of the filter for a given cell must test whether the text chunk in question is inside your cell.
(Instead of separate instances for each cell you can of course also create one instance whose parameters, i.e. cell coordinates, can be changed between getResultantText calls.)
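To illustrate the idea without depending on a particular iText version, here is a minimal, self-contained sketch. Chunk and ChunkFilter are stand-ins I made up for TextChunk and TextChunkFilter, and CellFilter and CellTextDemo are hypothetical names too. It assumes (x, y) is the lower-left corner of the cell and tests only the start point of each chunk's base line. A real filter would read those coordinates from the TextChunk instead, and the strategy itself would take care of ordering and joining the chunks:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for iText's TextChunk: the text plus the start of its base line.
final class Chunk {
    final String text;
    final float x;
    final float y;

    Chunk(String text, float x, float y) {
        this.text = text;
        this.x = x;
        this.y = y;
    }
}

// Same shape as the TextChunkFilter interface above, but over the stand-in type.
interface ChunkFilter {
    boolean accept(Chunk chunk);
}

// Accepts chunks whose base line starts inside the cell rectangle (x, y, w, h).
final class CellFilter implements ChunkFilter {
    private final float left;
    private final float bottom;
    private final float right;
    private final float top;

    CellFilter(float x, float y, float w, float h) {
        this.left = x;
        this.bottom = y;
        this.right = x + w;
        this.top = y + h;
    }

    @Override
    public boolean accept(Chunk chunk) {
        return chunk.x >= left && chunk.x < right
                && chunk.y >= bottom && chunk.y < top;
    }
}

final class CellTextDemo {
    // The chunks are collected once; every cell filter is applied to the same list.
    static List<String> textPerCell(List<Chunk> chunks, List<? extends ChunkFilter> cells) {
        List<String> result = new ArrayList<>();
        for (ChunkFilter cell : cells) {
            result.add(chunks.stream()
                    .filter(cell::accept)
                    .map(chunk -> chunk.text)
                    .collect(Collectors.joining(" ")));
        }
        return result;
    }
}
```

The point of the design is that the expensive step, parsing the page, happens once, while the per-cell work is only a cheap coordinate comparison.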
PS: As mentioned by the OP, this TextChunkFilter has not yet been ported to iTextSharp. It should not be hard to do so, though, only one small interface and one method to add to the strategy.
PPS: In a comment sschuberth asked
Do you then still call PdfTextExtractor.getTextFromPage() when using getResultantText(), or does it somehow replace that call? If so, how to you then specify the page to extract to?
Actually PdfTextExtractor.getTextFromPage() internally already uses the no-argument getResultantText() overload:
public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String, ContentOperator> additionalContentOperators) throws IOException
{
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
return parser.processContent(pageNumber, strategy, additionalContentOperators).getResultantText();
}
To make use of a TextChunkFilter you could simply build a similar convenience method, e.g.
public static String getTextFromPage(PdfReader reader, int pageNumber, LocationTextExtractionStrategy strategy, Map<String, ContentOperator> additionalContentOperators, TextChunkFilter chunkFilter) throws IOException
{
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
return parser.processContent(pageNumber, strategy, additionalContentOperators).getResultantText(chunkFilter);
}
In the context at hand, though, in which we want to parse the page content only once and apply multiple filters, one for each cell, we might generalize this to:
public static List<String> getTextFromPage(PdfReader reader, int pageNumber, LocationTextExtractionStrategy strategy, Map<String, ContentOperator> additionalContentOperators, Iterable<TextChunkFilter> chunkFilters) throws IOException
{
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
parser.processContent(pageNumber, strategy, additionalContentOperators);
List<String> result = new ArrayList<>();
for (TextChunkFilter chunkFilter : chunkFilters)
{
result.add(strategy.getResultantText(chunkFilter));
}
return result;
}
(You can make this look fancier by using Java 8 collection streaming instead of the old-fashioned for loop.)
Here's my take on how to extract text from a table-like structure in a PDF using iTextSharp. It returns a collection of rows, and each row contains a collection of interpreted columns. This may work for you provided that the gap between one column and the next is greater than the average width of a single character. I also added an option to check for wrapped text within a virtual column. Your mileage may vary.
using (PdfReader pdfReader = new PdfReader(stream))
{
for (int page = 1; page <= pdfReader.NumberOfPages; page++)
{
TableExtractionStrategy tableExtractionStrategy = new TableExtractionStrategy();
string pageText = PdfTextExtractor.GetTextFromPage(pdfReader, page, tableExtractionStrategy);
var table = tableExtractionStrategy.GetTable();
}
}
public class TableExtractionStrategy : LocationTextExtractionStrategy
{
public float NextCharacterThreshold { get; set; } = 1;
public int NextLineLookAheadDepth { get; set; } = 500;
public bool AccomodateWordWrapping { get; set; } = true;
private List<TableTextChunk> Chunks { get; set; } = new List<TableTextChunk>();
public override void RenderText(TextRenderInfo renderInfo)
{
base.RenderText(renderInfo);
string text = renderInfo.GetText();
Vector bottomLeft = renderInfo.GetDescentLine().GetStartPoint();
Vector topRight = renderInfo.GetAscentLine().GetEndPoint();
Rectangle rectangle = new Rectangle(bottomLeft[Vector.I1], bottomLeft[Vector.I2], topRight[Vector.I1], topRight[Vector.I2]);
Chunks.Add(new TableTextChunk(rectangle, text));
}
public List<List<string>> GetTable()
{
List<List<string>> lines = new List<List<string>>();
List<string> currentLine = new List<string>();
float? previousBottom = null;
float? previousRight = null;
StringBuilder currentString = new StringBuilder();
// iterate through all chunks and evaluate
for (int i = 0; i < Chunks.Count; i++)
{
TableTextChunk chunk = Chunks[i];
// determine if we are processing the same row based on defined space between subsequent chunks
if (previousBottom.HasValue && previousBottom == chunk.Rectangle.Bottom)
{
if (chunk.Rectangle.Left - previousRight > 1)
{
currentLine.Add(currentString.ToString());
currentString.Clear();
}
currentString.Append(chunk.Text);
previousRight = chunk.Rectangle.Right;
}
else
{
// if we are processing a new line let's check to see if this could be word wrapping behavior
bool isNewLine = true;
if (AccomodateWordWrapping)
{
int readAheadDepth = Math.Min(i + NextLineLookAheadDepth, Chunks.Count);
if (previousBottom.HasValue)
for (int j = i; j < readAheadDepth; j++)
{
if (previousBottom == Chunks[j].Rectangle.Bottom)
{
isNewLine = false;
break;
}
}
}
// if the text was not word wrapped let's treat this as a new table row
if (isNewLine)
{
if (currentString.Length > 0)
currentLine.Add(currentString.ToString());
currentString.Clear();
previousBottom = chunk.Rectangle.Bottom;
previousRight = chunk.Rectangle.Right;
currentString.Append(chunk.Text);
if (currentLine.Count > 0)
lines.Add(currentLine);
currentLine = new List<string>();
}
else
{
if (chunk.Rectangle.Left - previousRight > 1)
{
currentLine.Add(currentString.ToString());
currentString.Clear();
}
currentString.Append(chunk.Text);
previousRight = chunk.Rectangle.Right;
}
}
}
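// flush the last cell and row, which the loop above never adds
if (currentString.Length > 0)
currentLine.Add(currentString.ToString());
if (currentLine.Count > 0)
lines.Add(currentLine);
return lines;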
}
private struct TableTextChunk
{
public Rectangle Rectangle;
public string Text;
public TableTextChunk(Rectangle rect, string text)
{
Rectangle = rect;
Text = text;
}
public override string ToString()
{
return Text + " (" + Rectangle.Left + ", " + Rectangle.Bottom + ")";
}
}
}

Jackson : Conditional select the fields

I have a scenario where I need the payload to be one of
{"authType":"PDS"}
or
{"authType":"xyz","authType2":"abc"}
or
{"authType":"xyz","authType2":"abc","authType3":"123"}
or
any other combination, as long as fields with null values are left out.
The class below has three fields, but only the ones with non-null values should be serialized.
Basically, I don't want to include any field that has a null value.
Are there any annotations I can use to get this done?
public class AuthJSONRequest {
private String authType;
private String authType2;
private String authType3;
public String getAuthType() {
return authType;
}
public void setAuthType(String authType) {
this.authType = authType;
}
public String getAuthType2() {
return authType2;
}
public void setAuthType2(String authType2) {
this.authType2 = authType2;
}
public String getAuthType3() {
return authType3;
}
public void setAuthType3(String authType3) {
this.authType3 = authType3;
}
}
Try JSON Views? See this or this. Or for more filtering features, see this blog entry (Json Filters for example).
This is exactly what the annotations @JsonInclude in Jackson 2 and @JsonSerialize in Jackson 1 are meant for.
If you want a property to show up only when it is not null, add @JsonInclude(Include.NON_NULL) in Jackson 2 or @JsonSerialize(include=Inclusion.NON_NULL) in Jackson 1.

How to rename a component column that is a foreign key?

We are using Fluent NHibernate with automapping, and we have a naming convention that the names of all foreign key columns end with "Key". So we have a convention that looks like this:
public class ForeignKeyColumnNameConvention : IReferenceConvention
{
public void Apply ( IManyToOneInstance instance )
{
// name the key field
string propertyName = instance.Property.Name;
instance.Column ( propertyName + "Key" );
}
}
This works great until we created a component in which one of its values is a foreign key. By renaming the column here it overrides the default name given to the component column which includes the ComponentPrefix which is defined in the AutomappingConfiguration. Is there a way for me to get the ComponentPrefix in this convention? or is there some other way for me to get the column name for components with a property that is a foreign key to end in the word "Key"?
After a lot of fiddling and trial & error (thus being tempted to use your solution with Reflection) I came up with the following:
This method depends on the order of the execution of the conventions. This convention-order happens via a strict hierarchy. In this example, at first, the convention of the component (IDynamicComponentConvention) is being handled and after that the conventions of the inner properties are being handled such as the References mapping (IReferenceConvention).
The strict order is where we make our strike:
We assemble the correct name of the column in the call to Apply(IDynamicComponentInstance instance) and put it on the queue. Note that a Queue<T> is used, which is a FIFO (first-in-first-out) collection type, so it preserves the order.
Almost immediately after that, Apply(IManyToOneInstance instance) is called. We check if there is anything in the queue. If there is, we take it out of the queue and set it as the column name. Note that you should not use Peek() instead of Dequeue(), as Peek() does not remove the object from the queue.
The code is as follows:
public sealed class CustomNamingConvention : IDynamicComponentConvention, IReferenceConvention {
private static Queue<string> ColumnNames = new Queue<string>();
public void Apply(IDynamicComponentInstance instance) {
foreach (var referenceInspector in instance.References) {
// All the information we need is right here
// But only to inspect, no editing yet :(
// Don't worry, just assemble the name and enqueue it
var name = string.Format("{0}_{1}",
instance.Name,
referenceInspector.Columns.Single().Name);
ColumnNames.Enqueue(name);
}
}
public void Apply(IManyToOneInstance instance) {
if (!ColumnNames.Any())
// Nothing in the queue? Just return then (^_^)
return;
// Set the retrieved string as the column name
var columnName = ColumnNames.Dequeue();
instance.Column(columnName);
// Pick a beer and celebrate the correct naming!
}
}
I have figured out a way to do this using reflection to get to the underlying mapping of the IManyToOneInspector exposed by the IComponentInstance, but I was hoping there was a better way to do this.
Here is some example code of how I achieved this:
#region IConvention<IComponentInspector, IComponentInstance> Members
public void Apply(IComponentInstance instance)
{
foreach (var manyToOneInspector in instance.References)
{
var referenceName = string.Format("{0}_{1}_{2}{3}", instance.EntityType.Name, manyToOneInspector.Property.PropertyType.Name, _autoMappingConfiguration.GetComponentColumnPrefix(instance.Property), manyToOneInspector.Property.Name);
if(manyToOneInspector.Property.PropertyType.IsSubclassOf(typeof(LookupBase)))
{
referenceName += "Lkp";
}
manyToOneInspector.Index ( string.Format ( "{0}_FK_IDX", referenceName ) );
}
}
#endregion
public static class ManyToOneInspectorExtensions
{
public static ManyToOneMapping GetMapping(this IManyToOneInspector manyToOneInspector)
{
var fieldInfo = manyToOneInspector.GetType ().GetField( "mapping", BindingFlags.NonPublic | BindingFlags.Instance );
if (fieldInfo != null)
{
var manyToOneMapping = fieldInfo.GetValue( manyToOneInspector ) as ManyToOneMapping;
return manyToOneMapping;
}
return null;
}
public static void Index(this IManyToOneInspector manyToOneInspector, string indexName)
{
var mapping = manyToOneInspector.GetMapping ();
mapping.Index ( indexName );
}
public static void Column(this IManyToOneInspector manyToOneInspector, string columnName)
{
var mapping = manyToOneInspector.GetMapping ();
mapping.Column ( columnName );
}
public static void ForeignKey(this IManyToOneInspector manyToOneInspector, string foreignKeyName)
{
var mapping = manyToOneInspector.GetMapping();
mapping.ForeignKey ( foreignKeyName );
}
}
public static class ManyToOneMappingExtensions
{
public static void Index (this ManyToOneMapping manyToOneMapping, string indexName)
{
if (manyToOneMapping.Columns.First().IsSpecified("Index"))
return;
foreach (var column in manyToOneMapping.Columns)
{
column.Index = indexName;
}
}
public static void Column(this ManyToOneMapping manyToOneMapping, string columnName)
{
if (manyToOneMapping.Columns.UserDefined.Count() > 0)
return;
var originalColumn = manyToOneMapping.Columns.FirstOrDefault();
var column = originalColumn == null ? new ColumnMapping() : originalColumn.Clone();
column.Name = columnName;
manyToOneMapping.ClearColumns();
manyToOneMapping.AddColumn(column);
}
public static void ForeignKey(this ManyToOneMapping manyToOneMapping, string foreignKeyName)
{
if (!manyToOneMapping.IsSpecified("ForeignKey"))
manyToOneMapping.ForeignKey = foreignKeyName;
}
}