Object oriented design patterns for parsing text files?

Object oriented design patterns for parsing text files? - oop

As part of a software package I'm working on, I need to implement a parser for application specific text files. I've already specified the grammar for these file on paper, but am having a hard time translating it into easily readable/updatable code (right now just it passes each line through a huge number of switch statements).
So, are there any good design patterns for implementing a parser in a Java style OO environment?

Any easy way to break a massive switch into an OO design would be to have
pseudo code
class XTokenType {
public bool isToken(string data);
}
class TokenParse {
public void parseTokens(string data) {
for each step in data {
for each tokenType in tokenTypess {
if (tokenType.isToken(step)) {
parsedTokens[len] = new tokenType(step);
}
...
}
}
...
}
}
Here your breaking each switch statement into a method on that token object to detect whether the next bit of the string is of that token type.
Previously:
class TokenParse {
public void parseTokens(string data) {
for each step in data {
switch (step) {
case x:
...
case y:
...
...
}
}
...
}
}

One suggestion is to create property file where you define rules. Load it during run time and use if else loop (since switch statements also does the same internally). This way if you want to change some parsing rules you have to change .property file not code. :)

You need to learn how to express context free grammars. You should be thinking about the GoF Interpreter and parser/generators like Bison, ANTRL, lex/yacc, etc.

Related

Enforcing read-only attributes from the metaclass

Yes, still going with this. My impression is that there's this powerful facility in Raku, which is not really easy to use, and there's so little documentation for that. I'd like to kind of mitigate that.
In this case, I'm trying to force attributes to be read-only by default, to make immutable classes. Here's my attempt:
my class MetamodelX::Frozen is Metamodel::ClassHOW {
method compose_attributes($the-obj, :$compiler_services) {
my $attribute-container = callsame;
my $new-container = Perl6::Metamodel::AttributeContainer.new(
:attributes($attribute-container.attributes),
:attribute_lookup($attribute-container.attribute_table),
:0attr_rw_by_default
);
$new-container.compose_attributes($the-obj, $compiler_services);
}
}
my package EXPORTHOW {
package DECLARE {
constant frozen = MetamodelX::Frozen;
}
}
I'm calling that from a main function that looks like this:
use Frozen;
frozen Foo {
has $.bar;
method gist() {
return "→ $!bar";
}
}
my $foo = Foo.new(:3bar);
say $foo.bar;
$foo.bar(33);
I'm trying to follow the source, that does not really give a lot of facilities to change attribute stuff, so there seems to be no other way that creating a new instance of the container. And that might fail in impredictable ways, and that's what it does:
Type check failed in binding to parameter '$the-obj'; expected Any but got Foo (Foo)
at /home/jmerelo/Code/raku/my-raku-examples/frozen.raku:7
Not clear if this is the first the-obj or the second one, but any way, some help is appreciated.

Testing state machine transitions implemented as Map

I have a state machine with a relatively small set of states and inputs and I want to test the transitions exhaustively.
Transitions are coded using a Map<State, Map<Input, State>>, the code is something like this:
enum State {
S1,
// ...
}
enum Input {
I1,
// ...
}
class StateMachine {
State current;
Map<State, Map<Input, State>> transitions = {
S1: {
I1: S2,
// ...
},
// ...
};
State changeState(Input x) {
if (transitions[current] == null)
throw Error('Unknows state ${current}');
if (transitions[current][x] == null)
throw Error('Unknown transition from state ${current} with input ${x}');
current = transitions[current][x];
return current;
}
void execute() {
// ...
}
}
To test it I see 3 approaches:
1) Write lot of boilerplate code to check every single combination
2) Automate the tests creation: this seems a better approach to me, but this would end up using a structure that is identical to the Map used in the StateMachine. What should I do? Copy the Map in the test file or import it from the implementation file? The latter would make the test file depend on the implementation and doesn't seem a good idea.
3) Test Map for equality, same problem as before: equality with itself or with a copy? This approach is essentially what I do with the other 2 but doesn't seem like a canonical test

Maybe you want to have a look at this: https://www.itemis.com/en/yakindu/state-machine/documentation/user-guide/sctunit_test-driven_statechart_development_with_sctunit
It shows, how you can do a model based and test driven development of state machines including the option to generate unit test code and measuring the test coverage.

How do I execute Dynamically (like Eval) in Dart?

Since getting started in Dart I've been watching for a way to execute Dart (Text) Source (that the same program may well be generating dynamically) as Code. Like the infamous "eval()" function.
Recently I have caught a few hints that the communication port between Isolates support some sort of "Spawn" that seems like it could allow this "trick". In Ruby there is also the possibility to load a module dynamically as a language feature, perhaps there is some way to do this in Dart?
Any clues or a simple example will be greatly appreciated.
Thanks in advance!

Ladislav Thon provided this answer on the Dart forum:
I believe it's very safe to say that Dart will never have eval. But it will have other, more structured ways of dynamically generating code (code name mirror builders). There is nothing like that right now, though.
There are two ways of spawning an isolate: spawnFunction, which runs an existing function from the existing code in a new isolate, so nothing you are looking for, and spawnUri, which downloads code from given URI and runs it in new isolate. That is essentially dynamic code loading -- but the dynamically loaded code is isolated from the existing code. It runs in a new isolate, so the only means of communicating with it is via message passing (through ports).

You can run a string as Dart code by first constructing a data URI from it and then passing it into Isolate.spawnUri.
import 'dart:isolate';
void main() async {
final uri = Uri.dataFromString(
'''
void main() {
print("Hellooooooo from the other side!");
}
''',
mimeType: 'application/dart',
);
await Isolate.spawnUri(uri, [], null);
}
Note that you can only do this in JIT mode, which means that the only place you might benefit from it is Dart VM command line apps / package:build scripts. It will not work in Flutter release builds.
To get a result back from it, you can use ports:
import 'dart:isolate';
void main() async {
final name = 'Eval Knievel';
final uri = Uri.dataFromString(
'''
import "dart:isolate";
void main(_, SendPort port) {
port.send("Nice to meet you, $name!");
}
''',
mimeType: 'application/dart',
);
final port = ReceivePort();
await Isolate.spawnUri(uri, [], port.sendPort);
final String response = await port.first;
print(response);
}
I wrote about it on my blog.

Eval(), in Ruby at least, can execute anything from a single statement (like an assignment) to complete involved programs. There is a substantial time penalty for executing many small snippets over most any other form of execution that is possible.
Looking at the problem closer, there are at least three different functions that were at the base of the various schemes where eval might be used. Dart handles at least 2 of these in at least minimal ways.
Dart does not, nor does it look like there is any plan to support "general" script execution.
However, the NoSuchMethod method can be used to effectively implement the dynamic "injection" of variables into your local class environment. It replaces an eval() with a string that would look like this: eval( "String text = 'your first name here';" );
The second function that Dart readily supports now is the invocation of a method, that would look like this: eval( "Map map = SomeClass.some_method()" );
After messing about with this it finally dawned on me that a single simple class can be used to store the information needed to invoke a method, for a class, as a string which seems to have general utility. I can replace a big maintenance prone switch statement that might otherwise be necessary to invoke a series of methods. In Ruby this was almost trivial, however in Dart there are some fairly less than intuitive calls so I wanted to get this "trick" in one place, which fits will with doing ordering and filtering on the strings such as you may need.
Here's the code to "accumulate" as many classes (a whole library?) into a map using reflection such that the class.methodName() can be called with nothing more than a key (as a string).
Note: I used a few "helper methods" to do Map & List functions, you will probably want to replace them with straight Dart. However this code is used and tested only using the functions..
Here's the code:
//The used "Helpers" here..
MAP_add(var map, var key, var value){ if(key != null){map[key] = value;}return(map);}
Object MAP_fetch(var map, var key, [var dflt = null]) {var value = map[key];if (value==null) {value = dflt;}return( value );}
class ClassMethodMapper {
Map _helperMirrorsMap, _methodMap;
void accum_class_map(Object myClass){
InstanceMirror helperMirror = reflect(myClass);
List methodsAr = helperMirror.type.methods.values;
String classNm = myClass.toString().split("'")[1]; ///#FRAGILE
MAP_add(_helperMirrorsMap, classNm, helperMirror);
methodsAr.forEach(( method) {
String key = method.simpleName;
if (key.charCodeAt(0) != 95) { //Ignore private methods
MAP_add(_methodMap, "${classNm}.${key}()", method);
}
});
}
Map invoker( String methodNm ) {
var method = MAP_fetch(_methodMap, methodNm, null);
if (method != null) {
String classNm = methodNm.split('.')[0];
InstanceMirror helperMirror = MAP_fetch(_helperMirrorsMap, classNm);
helperMirror.invoke(method.simpleName, []);
}
}
ClassMethodMapper() {
_methodMap = {};
_helperMirrorsMap = {};
}
}//END_OF_CLASS( ClassMethodMapper );
============
main() {
ClassMethodMapper cMM = new ClassMethodMapper();
cMM.accum_class_map(MyFirstExampleClass);
cMM.accum_class_map(MySecondExampleClass);
//Now you're ready to execute any method (not private as per a special line of code above)
//by simply doing this:
cMM.invoker( MyFirstExampleClass.my_example_method() );
}

Actually there some libraries in pub.dev/packages but has some limitations because are young versions, so that I can recommend you this library expressions to dart and flutter.
A library to parse and evaluate simple expressions.
This library can handle simple expressions, but no blocks of code, control flow statements and so on. It supports a syntax that is common to most programming languages.
There I create an example of code to evaluate arithmetic operations and comparations of data.
import 'package:expressions/expressions.dart';
import 'dart:math';
#override
Widget build(BuildContext context) {
final parsing = FormulaMath();
// Expression example
String condition = "(cos(x)*cos(x)+sin(x)*sin(x)==1) && respuesta_texto == 'si'";
Expression expression = Expression.parse(condition);
var context = {
"x": pi / 5,
"cos": cos,
"sin": sin,
"respuesta_texto" : 'si'
};
// Evaluate expression
final evaluator = const ExpressionEvaluator();
var r = evaluator.eval(expression, context);
print(r);
return Scaffold(
body: Container(
margin: EdgeInsets.only(top: 50.0),
child: Column(
children: [
Text(condition),
Text(r.toString())
],
),
),
);
}
I/flutter (27188): true

Optimizing a method with boolean flag

I have a method whose purpose is to retrieve collection items.
A collection can contain a mix of items, let's say: pens, pencils, and papers.
The 1st parameter allows me to tell the method to retrieve only the itemTypes I pass (e.g, just pens and pencils).
The 2nd parameter flags the function to use the collection's default item types, instead.
getCollectionItems($itemTypes,$useCollectionDefaultItemTypes) {
foreach() {
foreach() {
foreach() {
// lots of code...
if($useCollectionDefaultItemTypes) {
// get collection's items using collection->itemTypes
}
else {
// get collection's items using $itemTypes
}
// lots of code...
}
}
}
}
What feels odd is that if I set the $useCollectionDefaultItemTypes to true, there is no need for the function to use the first parameter. I was considering refactoring this method into two such as:
getCollectionItems($itemTypes); // get the items using $itemTypes
getCollectionItems(); // get the items using default settings
The problem is that the methods will have lots of duplicate code except for the if-statement area.
Is there a better way to optimize this?

Pass in $itemTypes as null when you're not using it. Have your if statement check if $itemTypes === null; if it is, use default settings.
If this is php, which I assume it is, you can make your method signature function getCollectionItems($itemTypes = null) and then you can call getCollectionItems() and it will call it as if you had typed getCollectionItems(null).

It's generally a bad idea to write methods that use flags like that. I've seen that written in several places (here at #16, Uncle Bob here and elsewhere). It makes the method hard to understand, read, and refactor.
An alternative design would be to use closures. Your code could look something like this:
$specificWayOfProcessing = function($a) {
//do something with each $a
};
getCollectionItems($processer) {
foreach() {
foreach() {
foreach() {
// lots of code...
$processor(...)
// lots of code...
}
}
}
}
getCollectionItems($specificWayOfProcessing);
This design is better because
It's more flexible. What happens when you need to decide between three different things?
You can now test the code inside the loop much easier
It is now easier to read, because the last line tells you that you are "getting collection items using a specific way of processing" - it reads like an English sentence.

Yes, there is a better way of doing this -- though this question is not an optimization question, but a style question. (Duplicated code has little effect on performance!)
The simplest way to implement this along the lines of your original idea is to make the no-argument form of getCollectionItems() define the default arguments, and then call the version of it that requires an argument:
getCollectionItems($itemTypes) {
foreach() {
foreach() {
foreach() {
// lots of code...
// get collection's items using $itemTypes
}
// lots of code...
}
}
}
getCollectionItems() {
getCollectionItems(collection->itemTypes)
}
Depending on what language you are using, you may even be able to collapse these into a single function definition with a default argument:
getCollectionItems($itemTypes = collection->itemTypes) {
foreach() {
foreach() {
foreach() {
// lots of code...
// get collection's items using $itemTypes
}
// lots of code...
}
}
}
That has the advantage of clearly expressing your original idea, which is that you use $itemTypes if provided, and collection->itemTypes if not.
(This does, of course, assume that you're talking about a single "collection", rather than having one of those foreach iterations be iterating over collections. If you are, the idea to use a null value is a good one.)

Is it a good idea to inject a TestSettings parameter to a method to make it (Unit or integration) Testable?

Is it a good practice to introduce a TestSettings class in order to provide flexible testing possibilities of a method that has many processes inside?
Maybe not a good example but can be simple: Suppose that I have this method and I want to test its sub-processes:
public void TheBigMethod(myMethodParameters parameter)
{
if(parameter.Condition1)
{
MethodForCondition1("BigMac");
}
if(parameter.Condition2)
{
MethodForCondition2("MilkShake");
}
if(parameter.Condition3)
{
MethodForCondition3("Coke");
}
SomeCommonMethod1('A');
SomeCommonMethod2('B');
SomeCommonMethod3('C');
}
And imagine that I have unit tests for all
void MethodForCondition1 (string s)
void MethodForCondition2 (string s)
void MethodForCondition3 (string s)
void SomeCommonMethod1 (char c)
void SomeCommonMethod2 (char c)
void SomeCommonMethod3 (char c)
And now i want to test the TheBigMethod itself by introducing such test methods with required Asserts in them:
TheBigMethod_MethodForCondition1_TestCaseX_DoesGood
TheBigMethod_MethodForCondition2_TestCaseY_DoesGood
TheBigMethod_MethodForCondition3_TestCaseZ_DoesGood
TheBigMethod_SomeCommonMethod1_TestCaseU_DoesGood
TheBigMethod_SomeCommonMethod2_TestCaseP_DoesGood
TheBigMethod_SomeCommonMethod3_TestCaseQ_DoesGood
So, I want TheBigMethod to be exit-able at some points if it is called by one of my integration tests above.
public void TheBigMethod(myMethodParameters parameter, TestSettings setting)
{
if(parameter.Condition1)
{
MethodForCondition1("BigMac");
if(setting.ExitAfter_MethodForCondition1)
return;
}
if(parameter.Condition2)
{
MethodForCondition2("MilkShake");
if(setting.ExitAfter_MethodForCondition2)
return;
}
if(parameter.Condition3)
{
MethodForCondition3("Coke");
if(setting.ExitAfter_MethodForCondition3)
return;
}
SomeCommonMethod1('A');
if(setting.ExitAfter_SomeCommonMethod1)
return;
SomeCommonMethod2('B');
if(setting.ExitAfter_SomeCommonMethod2)
return;
SomeCommonMethod3('C');
if(setting.ExitAfter_SomeCommonMethod3)
return;
}
Even though it looks it does what I need to introduce a TestSetting parameter can makee the code less readable and does not look nice to have testing logic and the main functionality combined to me.
Can you advise a better design for such cases so that it can replace a TestSetting parameter idea?
thanks

It would (IMO) be a very bad thing to add this TestSetting. An alternative would be to add an interface (or set of interfaces) for MethodForConditionX and SomeCommonMethodX. The test each of the MethodForConditionX & SomeCommonMethodX in isolation and pass in a stub for TheBigMethod which validates that MethodForConditionX is called with value Z.
Edit: You could also make the methods virtual if you don't want to use an interface.

A bit late to the game here, but I would concur that mixing test and production code is a big code smell to be avoided. Big methods in legacy code provide all sorts of issues. I would highly recommend reading Michael Feather's Working Effectively with Legacy Code. It's all about dealing with the myriad of problems encountered in legacy code and how to deal with them.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Object oriented design patterns for parsing text files? - oop

One suggestion is to create property file where you define rules. Load it during run time and use if else loop (since switch statements also does the same internally). This way if you want to change some parsing rules you have to change .property file not code. :)

You need to learn how to express context free grammars. You should be thinking about the GoF Interpreter and parser/generators like Bison, ANTRL, lex/yacc, etc.

Related

Enforcing read-only attributes from the metaclass

Testing state machine transitions implemented as Map

How do I execute Dynamically (like Eval) in Dart?

Optimizing a method with boolean flag

Is it a good idea to inject a TestSettings parameter to a method to make it (Unit or integration) Testable?

Categories

Resources