language for data-driven programming - language-design

I'm wondering if there is a language that supports programming in the following style that I think of as "data driven":
Imagine a language like C or Python except that it's possible to define a function whose input parameters are bound to particular variables. Then, whenever one of those input variables changes, the function is re-run.
Alternatively, the function might only ever be run when its output is needed, regardless of when its inputs are changed.
The first option is useful when things need to be kept up to date whenever something changes. The second option is useful when the computation is expensive, and you want to run it only when needed.
In my mind, this type of programming paradigm would require many stacks, perhaps one for each function that was defined in the above manner. This would allow those functions to be run in any order, and it would allow their execution to be occasionally blocked. Execution on one stack would block whenever it needed the output of a function whose inputs were not yet ready.
This would be a single-threaded application. The runtime system would take care of switching from one stack to another in an appropriate manner. Deadlocks would be possible.
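Roughly, the two evaluation modes could be mocked up in plain Python like this (ignoring the multi-stack/blocking part; Cell and Derived are just illustrative names I made up, not an existing library):

class Cell:
    """A plain variable that notifies its dependents when it changes."""
    def __init__(self, value):
        self._value = value
        self.dependents = []

    def get(self):
        return self._value

    def set(self, value):
        self._value = value
        for dep in self.dependents:
            dep.invalidate()


class Derived:
    """A function bound to input Cells.

    eager=True  -> re-run every time an input changes
    eager=False -> re-run only when get() is called and an input has changed
    """
    def __init__(self, fn, inputs, eager=True):
        self._fn, self._inputs, self._eager = fn, inputs, eager
        self._value, self._dirty = None, True
        for cell in inputs:
            cell.dependents.append(self)
        if eager:
            self.get()

    def invalidate(self):
        self._dirty = True
        if self._eager:
            self.get()           # recompute immediately on any input change

    def get(self):
        if self._dirty:          # lazy path: recompute only on demand
            self._value = self._fn(*(c.get() for c in self._inputs))
            self._dirty = False
        return self._value


a, b = Cell(2), Cell(3)

def report(x, y):
    print("sum is now", x + y)

monitor = Derived(report, [a, b])                         # eager: prints "sum is now 5"
total = Derived(lambda x, y: x + y, [a, b], eager=False)  # lazy: nothing runs yet
a.set(10)                                                 # monitor prints "sum is now 13"; total is only marked dirty
print(total.get())                                        # 13, computed only now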

Related

Are the optimization and parameters variation experiments in AnyLogic limited to 255 parameters?

In AnyLogic
You may vary only the parameters of the top-level agent.
(https://anylogic.help/anylogic/experiments/optimization.html#:~:text=Optimization%20experiment&text=If%20you%20need%20to%20run,the%20optimization%20capability%20of%20AnyLogic.)
(https://anylogic.help/anylogic/experiments/parameter-variation.html)
The top-level agent cannot have more than 255 parameters.
The number of method parameters is limited to 255 (https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.11)
The question here is not why 255 parameters are required for an optimization problem or if simulation-based optimization is the best way to handle a problem with more than 255 parameters (Decision Variables). The question is about ways to overcome this limitation.
I thought the best option would be to follow Java best practices and use a Java class (which has almost no such limitations) (Maximum number of parameters in a Agent-Type).
However, AnyLogic provides Agents, which are basically predefined classes with several built-in functionalities (https://noorjax.com/2018/11/12/an-example-on-the-use-of-classes-in-anylogic/). Therefore, it seems that using a Java class would not help. Am I missing a Java trick here? Would it be possible in any way to perform an optimization experiment in AnyLogic with more than 255 parameters?
Sorry if this question is not within the scope of Stack Overflow. I'm still trying to learn what can be asked here and what cannot.
There are several ways to avoid the limit:
Structure your parameters in 'raw' Java classes so, for example, your Main agent may have 3 parameters of type CoreParameters, ClimateVariables and EconomicVariables. (Just look at any Java tutorials for how to define simple classes which are effectively just data structures.) But now your experiments have to create appropriate instances of those classes as parameter values (and this makes things like a Parameter Variation experiment harder to define; you'd typically use a Custom Experiment instead, since then you have full control of how you set up the parameters for each run). For optimisation, you'd also have to use a Custom Experiment to define an optimisation where, for example, you might have 500 optimisation variables but your code to set up the model from them sets up your 3 model parameter class instances with those 500 values. The Custom Experiment help page has examples of this.
Use external data (e.g., Excel loaded into the AnyLogic DB) to provide some/all of the 'parameters' for your model. The issue here is that AnyLogic's multi-run experiments (including optimisation-based ones) expect to be varying top-level agent parameters. But often:
A load of 'parameters' will stay fixed so those can be supplied from this external data.
'Parameters' may be varied in related sets, so this can boil down to a single parameter which provides, say, the filename to load the relevant external data from (and you vary that across a pre-prepared set). But this requires writing some specific Java to allow you to import external data from a dynamically-defined filename into the AnyLogic DB, or the equivalent but reading directly from the Excel file. (But this is simple boilerplate code you can copy and reuse once you've 'learnt' it.)
P.S. I'd reiterate though that any optimisation involving 255+ parameters is probably pointless, with little likelihood of finding a near-optimum (and if you have a model with that many parameters --- given that you might genuinely want to vary all of them independently --- you have a model design problem).
P.P.S. Your two quoted bits of text don't contradict each other. You can write raw Java classes in AnyLogic or use, say, an Agent which just contains a set of parameters as a 'data structure class'. Agents (together with everything else) are Java classes, but that's not relevant to your question.

How to quickly analyse the impact of a program change?

Lately I need to do an impact analysis of changing a DB column definition in a widely used table (like PRODUCT, USER, etc.). I find it a very time-consuming, boring and difficult task. I would like to ask if there is any known methodology for doing this?
The question also applies to changes to applications, file systems, search engines, etc. At first, I thought this kind of functional relationship should be documented up front or somehow tracked, but then I realized that everything can change, so it would be impossible to do so.
I don't even know how this question should be tagged; please help.
Sorry for my poor English.
Sure. One can technically at least know what code touches the DB column (reads or writes it) by computing program slices.
Methodology: Find all SQL code elements in your sources and determine which ones touch the column in question. (Careful: SELECT * may touch your column, so you need to know the schema.) Determine which variables read or write that column. Follow those variables wherever they go, and determine the code and variables they affect; follow all those variables too. (This amounts to computing a forward slice.) Likewise, find the sources of the variables used to fill the column; follow them back to their code and sources, and follow those variables too. (This amounts to computing a backward slice.)
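To make the forward/backward distinction concrete, here is a toy Python sketch; the "program" is just a made-up table of which value each statement computes from which others (a real slicer works on actual control- and data-flow, not a hand-written table):

# Toy illustration only: each entry says "target is computed from these sources".
deps = {
    "net_price": {"PRODUCT.price", "discount"},
    "invoice_total": {"net_price", "tax_rate"},
    "discount": {"customer_tier"},
    "report_line": {"invoice_total"},
}

def forward_slice(var):
    """Everything `var` may influence: follow targets transitively."""
    seen, frontier = set(), {var}
    while frontier:
        frontier = {t for t, srcs in deps.items() if srcs & frontier} - seen
        seen |= frontier
    return seen

def backward_slice(var):
    """Everything that may influence `var`: follow sources transitively."""
    seen, stack = set(), [var]
    while stack:
        for src in deps.get(stack.pop(), ()):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen

print(sorted(forward_slice("PRODUCT.price")))
# ['invoice_total', 'net_price', 'report_line']
print(sorted(backward_slice("invoice_total")))
# ['PRODUCT.price', 'customer_tier', 'discount', 'net_price', 'tax_rate']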
All the elements of the slice are potentially affecting/affected by a change. There may be conditions in the slice-selected code that are clearly outside the conditions expected by your new use case, and you can eliminate that code from consideration. Everything else in the slices you may have to inspect/modify to make your change.
Now, your change may affect some other code (e.g., a new place that uses the DB column, or that combines the value from the DB column with some other value). You'll want to inspect upstream and downstream slices of the code you change, too.
You can apply this process for any change you might make to the code base, not just DB columns.
Doing this manually is not easy in a big code base, and it certainly isn't quick. There is some automation for doing this for C and C++ code, but not much for other languages.
You can get a rough approximation by running test cases that involve your desired variable or action and inspecting the test coverage. (Your approximation gets better if you also run test cases that you are sure do NOT cover your desired variable or action, and eliminate all the code they cover.)
Ultimately this task cannot be fully automated or reduced to an algorithm; otherwise there would be a tool to preview refactored changes. The better the code was written in the beginning, the easier the task.
Let me explain how to reach the answer: isolation is the key. Mapping everything to object properties can help you automate your review.
I can give you an example. If you can manage to map your specific case to the below, it will save your life.
The OR/M change pattern
Like Hibernate or Entity Framework...
A change to a database column may be simply previewed by analysing what code uses a certain object's property. Since all DB columns are mapped to object properties, and assuming no code uses raw SQL, you are good to go for your estimations.
This is a very simple pattern for change management.
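The answer names Hibernate and Entity Framework; purely to illustrate the column-to-property mapping the pattern relies on, here is the same idea with SQLAlchemy in Python (the table and column names are invented):

from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class Product(Base):
    __tablename__ = "product"
    id: Mapped[int] = mapped_column(primary_key=True)
    # The DB column "unit_price" surfaces as the attribute Product.unit_price,
    # so "find usages" on the attribute (instead of grepping SQL strings)
    # previews the code affected by changing the column.
    unit_price: Mapped[float] = mapped_column("unit_price")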
In order to reduce a file system/network or data file issue to the above pattern, you need other software patterns implemented. I mean, if you can reduce a complex scenario to a change in your objects' properties, you can leverage your IDE to detect the changes for you, including code that needs a slight modification to compile or needs to be rewritten altogether.
If you want to manage a change in a remote service when you initially write your software, wrap that service in an interface, so you will only have to modify its implementation (see the sketch after this list)
If you want to manage a possible change in a data file format (e.g. a field-length change in a positional format, or column reordering), write a service that maps that file to objects (e.g. using the BeanIO parser)
If you want to manage a possible change in file system paths, design your application to use more runtime variables
If you want to manage a possible change in cryptography algorithms, wrap them in services (e.g. HashService, CryptoService, SignService)
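As a sketch of the "wrap the remote service in an interface" point above (a Python abstract base class standing in for the interface; the service and method names are made up for illustration):

from abc import ABC, abstractmethod

class ExchangeRateService(ABC):
    """The interface the rest of the application depends on."""
    @abstractmethod
    def rate(self, from_currency: str, to_currency: str) -> float: ...

class RemoteExchangeRateService(ExchangeRateService):
    """The only class to touch if the remote API changes."""
    def rate(self, from_currency, to_currency):
        raise NotImplementedError("call the real remote endpoint here")

class FixedRateService(ExchangeRateService):
    """Stub implementation for tests; callers never know the difference."""
    def rate(self, from_currency, to_currency):
        return 1.0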
If you do the above, your manual requirements review will be easier, because the overall task is manual but can be aided by automated tools. For example, you can change the name of a class's property and see its side effects in the compiler.
Worst case
Obviously, if you need to change the name, type and length of a specific column in a database, in software with plain SQL hardcoded and scattered in multiple places around the code, where (worse) many tables have similar column names, and without project documentation (I did say worst case, right?), in a total of 10000+ classes, you have no other way than exploring your project manually, using find tools but not relying on them.
And if you don't have a test plan, which is the document from which you can hope to derive a software test suite, it will be time to make one.
Just adding my 2 cents. I'm assuming you're working in a production environment, so there must already be some form of unit tests, integration tests and system tests written.
If yes, then a good way to validate your changes is to run all these tests again and create any new tests which might be necessary.
And to state the obvious, do not integrate your code changes into the main production code base without running these tests.
Then again, changes which worked fine in a test environment may not work in a production environment.
Have some form of source code configuration management system like Subversion, Git/GitHub, CVS, etc.
This enables you to roll back your changes.

In “Given-When-Then” style BDD tests, is it OK to have multiple “When”s conjoined with an “And”?

I read Bob Martin's brilliant article on how "Given-When-Then" can actually be compared to an FSM. It got me thinking: is it OK for a BDD test to have multiple "When"s?
For example:
GIVEN my system is in a defined state
WHEN an event A occurs
AND an event B occurs
AND an event C occurs
THEN my system should behave in this manner
I personally think these should be 3 different tests for good separation of intent. But other than that, are there any compelling reasons for or against this approach?
When multiple steps (WHEN) are needed before you do your actual assertion (THEN), I prefer to group them in the initial-condition part (GIVEN) and keep only one in the WHEN section. This kind of shows that the event that really triggers the "action" of my SUT is that one, and that the previous ones are just steps to get there.
Your test would become:
GIVEN my system is in a defined state
AND an event A occurs
AND an event B occurs
WHEN an event C occurs
THEN my system should behave in this manner
but this is more of a personal preference I guess.
If you truly need to test that a system behaves in a particular manner under those specific conditions, it's a perfectly acceptable way to write a test.
I found that another limiting factor can appear in an E2E testing scenario where you would like to reuse a step multiple times. In my case, the BDD framework of my choice (pytest_bdd) is implemented so that a given step can have only a single return value, which is mapped to later steps' input parameters automagically by the name of the function bound to the given step. This design prevents reusability, which is exactly what I wanted: I needed to create objects and add them to a sequence object provided by another given step. The way I worked around this limitation was with a test fixture (which I named test_context), a Python dictionary (a hash map), used by when steps that don't have the same single-value requirement; the '(when) add object to sequence' step looked up the sequence in the context and appended the object in question to it. Now I could reuse the 'add object to sequence' action multiple times.
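A minimal sketch of that workaround (the step texts, the feature file and the shared test_context fixture below are illustrative, not taken from the original project):

import pytest
from pytest_bdd import given, when, then, scenarios

# Hypothetical feature file, e.g. sequence.feature:
#   Scenario: building a sequence from reusable steps
#     Given an empty sequence
#     When I add an object to the sequence
#     And I add an object to the sequence
#     Then the sequence contains 2 objects
scenarios("sequence.feature")

@pytest.fixture
def test_context():
    # Plain dictionary shared by every step of a scenario.
    return {}

@given("an empty sequence")
def empty_sequence(test_context):
    test_context["sequence"] = []

@when("I add an object to the sequence")
def add_object_to_sequence(test_context):
    # Reusable: looks the sequence up in the shared context instead of
    # depending on the single return value of one given step.
    test_context["sequence"].append(object())

@then("the sequence contains 2 objects")
def sequence_has_two_objects(test_context):
    assert len(test_context["sequence"]) == 2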
This requirement was tricky because BDD aims to be descriptive. I could have used a single given statement with a pickled memory map of the sequence object that I wanted to perform the test action on, BUT would it have been useful? I think not. I needed to get the sequence constructed first, and that needed reusable statements. And although this is not in the BDD bible, I think it is, in the end, a practical and pragmatic solution to a very real E2E descriptive-testing problem.

help me define process and procedure?

I have never understood the basic difference (if there is any) between the two terms "process" and "procedure"; could you help me out? It can be answered in programming terms or in any other terms you like.
A process involves procedures, because the process is the whole, while the procedure is the part. In some languages (like VB or SQL), a procedure is a routine which does not return a value, in contrast to a function, which does. Also, in computing, a process means a program that is being executed, or at least is loaded in memory.
A process is business-oriented (it can be represented by a workflow diagram) and normally includes a set of business rules, while a procedure is algorithm-oriented (it can be represented by a flow diagram).
See:
http://en.wikipedia.org/wiki/Procedure_(term)
http://en.wikipedia.org/wiki/Process_(computing)
Here are the definitions for both terms provided by the Information Technology Infrastructure Library (ITIL):
Procedure: A Document containing steps that specify how to achieve an Activity. Procedures are defined as part of Processes. See Work Instruction.
Process: A structured set of activities designed to accomplish a specific Objective. A Process takes one or more defined inputs and turns them into defined outputs. A Process may include any of the Roles, responsibilities, tools and management Controls required to reliably deliver the outputs. A Process may define Policies, Standards, Guidelines, Activities, and Work Instructions if they are needed.
I found this link, which I think sums it up: Process versus Procedures
I think the first two comparisons are crucial and give a good idea of what the rest elaborate on:
Procedures are driven by completion of the task
Processes are driven by achievement of a desired outcome
Procedures are implemented
Processes are operated
In the SICP book, there is a section: 1.2 Procedures and the Processes They Generate.
Its description of a procedure may help you understand:
A procedure is a pattern for the local evolution of a computational process. It specifies how each stage of the process is built upon the previous stage. We would like to be able to make statements about the overall, or global, behavior of a process whose local evolution has been specified by a procedure. This is very difficult to do in general, but we can at least try to describe some typical patterns of process evolution.
In my understanding, a procedure is about how you program to solve your problem in the programming language, while a process is what the computer needs to do according to your defined procedure.
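A small Python illustration of the distinction SICP is drawing (SICP itself uses Scheme; factorial is just the classic example):

def factorial_recursive(n):
    # This procedure generates a linear *recursive* process:
    # deferred multiplications pile up until the base case is reached.
    return 1 if n == 0 else n * factorial_recursive(n - 1)

def factorial_iterative(n):
    # The same task described by a procedure that generates a linear
    # *iterative* process: constant state (acc, n) updated in a loop.
    acc = 1
    while n > 0:
        acc, n = acc * n, n - 1
    return acc

print(factorial_recursive(5), factorial_iterative(5))  # 120 120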
Policy is a rule or regulation for a task.
Process is a high-level view of how to achieve the task; simply put, it is the way.
Procedure is an instruction to perform an activity within a process.

How to nest rules in HP Exstream?

I am using HP Exstream (formerly Dialogue from Exstream Software) version 5.0.x. It has a feature to define and save boolean expressions as "Rules".
It has been about 6 years since I used this, but does anybody know if you can define a rule in terms of another rule? There is a "VB-like" language in a popup window, so you are not forced to use the and/or, variable-relational expression form, but I don't have documentation handy. :-(
I would like to define a rule, "NotFoo", in terms of "Foo", instead of repeating the inverse of the whole thing. (Yes, that would be ridiculous, but that's probably what I will be forced to do, as in other examples of what I am maintaining.) Actually, nested rules would have many uses, if I could figure out how to do it.
I later found that what one needs to do in this case is create user-defined "functions", which can reference each other (so long as you avoid indirect recursion). Then use the functions to define the "rules" (and don't even bother with "library" rules instead of "inline" rules, most of the time).
I'm late to the question, but since you had to answer it yourself: there is a better way to handle it.
The issue with using functions and testing the result is that there's a good chance that you're going to be adding unnecessary processing because the engine will run through the function every time it's called. Not a big issue with a simple function but it can easily become a problem if the function is complex, especially if it's called in several places.
Depending on the timing of the function (you didn't say whether it was run-level, customer-level, or specific to particular documents), it's often better to have the function set a User Boolean variable to store the result; then, in your library rules, you can just check the value of the variable without having to run through the function every time.