Boolean return value of scripts in booggie2 - booggie

How is the Boolean return value of scripts (without an explicit return value) defined? A rule always returns either TRUE or FALSE (depending on whether it was successfully applied), which can be used to control your rule sequence. For scripts this currently doesn't work (at least in my applications).
Please note: The booggie-project does not exist anymore but led to the development of Soley Studio which covers the same functionality.

By default, scripts return a boolean value to the sequence. This is done using a bool()-cast.
Hence, if your script has no return value, it is internally interpreted as bool(None) which gives you False.
If your script has an explicit return statement, the bool() cast is applied to the returned value, which gives True for any truthy value (for example, return True).
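A minimal sketch of that behaviour in plain Python, assuming booggie's cast behaves like the built-in bool(). The three functions are hypothetical stand-ins for script bodies, not booggie API:

```python
def script_without_return():
    # No return statement: Python implicitly returns None.
    pass


def script_with_return():
    # Explicit, truthy return value.
    return "done"


def script_returning_zero():
    # Explicit but falsy return value.
    return 0


print(bool(script_without_return()))  # False
print(bool(script_with_return()))     # True
print(bool(script_returning_zero()))  # False
```

Under this assumption, an explicit but falsy value such as 0 or an empty string also casts to False, so returning True or False explicitly is the least surprising way to steer a sequence.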

Kotlin checkNotNull vs requireNotNull

As I learn new components in Kotlin, I came across requireNotNull and checkNotNull, but the only difference I've found is that requireNotNull can throw an IllegalArgumentException while checkNotNull can throw an IllegalStateException. Is this the only reason why there are two methods, or am I missing some under-the-hood implementation detail?
The exception types are the only practical difference, as far as the compiler is concerned — but there's a big difference in intent, for anyone reading the code:
• require…() functions are for checking parameters, to confirm that a function's input fulfils its contract. So you'd normally call them first thing in a function. (Of course, Kotlin's non-nullable types mean that you wouldn't need to call requireNotNull() for a single parameter; but you might need to check a more complex condition on a combination of parameters or their sub-objects.) That's why they throw IllegalArgumentException: it's checking that the arguments are legal.
• check…() functions are for checking the relevant properties, to confirm that the object or whatever is in a valid state for this function to be called now. (Again, any properties that were never null would be typed accordingly, so checkNotNull() is more appropriate for cases where a property, combination, and/or sub-property can be null, but this function mustn't be called when they are.) So they throw IllegalStateException: they're checking that the object's current state allows the function to be called.
In both cases, you could of course write a standard if check (as you would in Java). Or you could use the Elvis operator ?: to do the check the first time the possibly-null value is used. But these functions give you an alternative that's in a more declarative form: you'd normally put them at the top of the function, where they spell out what the function's contract is, in a way that's obvious to anyone glancing at the code.
As a linked answer points out, there are also assert…() functions, which again have more of a semantic difference than a practical one. Those are for detecting programming errors away from the boundary of a function call: for confirming invariants and other conditions, and for all the checks in unit tests and other automated tests.
(Assertions have another important difference: they can be enabled and disabled from the command-line. Though in my experience, that's not a very good thing. If a check is important, it should always be run: be mandatory; if not, then it should be removed, or at least moved to automated tests, once the code is debugged.)
It is a semantic difference, and hence they throw different exceptions. requireNotNull is used to check input values, typically at the beginning of a method, while checkNotNull is used anywhere to check the current state.
If you're looking for differences in implementation, the best place to go is the source code. In this case there seem to be no differences aside from the exception thrown; the source for both methods is otherwise identical.
checkNotNull
[...]
    if (value == null) {
        val message = lazyMessage()
        throw IllegalStateException(message.toString())
    } else {
        return value
    }
requireNotNull
[...]
    if (value == null) {
        val message = lazyMessage()
        throw IllegalArgumentException(message.toString())
    } else {
        return value
    }
Therefore the difference is purely semantic. The answer from @gidds details some good scenarios for using them both.

What is the difference between an Idempotent and a Deterministic function?

Are idempotent and deterministic functions both just functions that return the same result given the same inputs?
Or is there a distinction that I'm missing?
(And if there is a distinction, could you please help me understand what it is)
In more simple terms:
Pure deterministic function: The output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output. There are no side-effects or other output.
Impure deterministic function: As with a deterministic function that is a pure function: the output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output - however there is other output (side-effects).
Idempotency: The practical definition is that you can safely call the same function multiple times without fear of negative side-effects. More formally: there are no changes of state between subsequent identical calls.
Idempotency does not imply determinacy (as a function can alter state on the first call while being idempotent on subsequent calls), but all pure deterministic functions are inherently idempotent (as there is no internal state to persist between calls). Impure deterministic functions are not necessarily idempotent.
|              | Pure deterministic | Impure deterministic | Pure nondeterministic | Impure nondeterministic | Idempotent |
|--------------|--------------------|----------------------|-----------------------|-------------------------|------------|
| Input        | Only parameter arguments (incl. this) | Only parameter arguments (incl. this) | Parameter arguments and hidden state | Parameter arguments and hidden state | Any |
| Output       | Only return value | Return value or side-effects | Only return value | Return value or side-effects | Any |
| Side-effects | None | Yes | None | Yes | After 1st call: Maybe. After 2nd call: None |
| SQL example  | UCASE | CREATE TABLE | GETDATE | DROP TABLE | |
| C# example   | String.IndexOf | | DateTime.Now | | Directory.CreateDirectory(String) (footnote 1) |

Footnote 1: Directory.CreateDirectory(String) is idempotent because if the directory already exists it doesn't raise an error; instead it returns a new DirectoryInfo instance pointing to the existing filesystem directory (rather than creating the filesystem directory first and then returning a new DirectoryInfo instance pointing to it). This is just like how Win32's CreateFile can be used to open an existing file.
A temporary note on non-scalar parameters, this, and mutating input arguments:
(I'm currently unsure how instance methods in OOP languages (with their hidden this parameter) can be categorized as pure/impure or deterministic or not, especially when it comes to mutating the target of this, so I've asked the experts on CS.SE to help me come to an answer. Once I've got a satisfactory answer there I'll update this answer.)
A note on Exceptions
Many (most?) programming languages today treat thrown exceptions as either a separate "kind" of return (i.e. "return to nearest catch") or as an explicit side-effect (often due to how that language's runtime works). However, as far as this answer is concerned, a given function's ability to throw an exception does not alter its pure/impure/deterministic/non-deterministic label - ditto idempotency (in fact: throwing is often how idempotency is implemented in the first place e.g. a function can avoid causing any side-effects simply by throwing right-before it makes those state changes - but alternatively it could simply return too.).
So, for our CS-theoretical purposes, if a given function can throw an exception then you can consider the exception as simply part of that function's output. What does matter is whether the exception is thrown deterministically or not (e.g. List<T>.get(int index) deterministically throws if index < 0).
Note that things are very different for functions that catch exceptions, however.
Determinacy of Pure Functions
For example, UCASE(val) in SQL and String.IndexOf in C#/.NET are both deterministic because the output depends only on the input. Note that in instance methods (such as IndexOf) the instance object (i.e. the hidden this parameter) counts as input, even though it's "hidden":
"foo".IndexOf("o") == 1 // first call
"foo".IndexOf("o") == 1 // second call
// the third call will also be == 1
Whereas NOW() in SQL and DateTime.UtcNow in C#/.NET are not deterministic, because the output changes even though the input remains the same (note that property getters in .NET are equivalent to a method that accepts no parameters besides the implicit this parameter):
DateTime.UtcNow == 2016-10-27 18:10:01 // first call
DateTime.UtcNow == 2016-10-27 18:10:02 // second call
Idempotency
A good example in .NET is the Dispose() method: See Should IDisposable.Dispose() implementations be idempotent?
a Dispose method should be callable multiple times without throwing an exception.
So if a parent component X makes an initial call to foo.Dispose() then it will invoke the disposal operation and X can now consider foo to be disposed. Execution/control then passes to another component Y which also then tries to dispose of foo, after Y calls foo.Dispose() it too can expect foo to be disposed (which it is), even though X already disposed it. This means Y does not need to check to see if foo is already disposed, saving the developer time - and also eliminating bugs where calling Dispose a second time might throw an exception, for example.
Another (general) example is in REST: the RFC for HTTP/1.1 states that GET, HEAD, PUT, and DELETE are idempotent, but POST is not (https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html):
Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request. The methods GET, HEAD, PUT and DELETE share this property. Also, the methods OPTIONS and TRACE SHOULD NOT have side effects, and so are inherently idempotent.
So if you use DELETE then:
Client->Server: DELETE /foo/bar
// `foo/bar` is now deleted
Server->Client: 200 OK
Client->Server: DELETE /foo/bar
// `foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
// the client asks again:
Client->Server: DELETE /foo/bar
// `foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
So you see in the above example that DELETE is idempotent in that the state of the server did not change between the last two DELETE requests, but it is not deterministic because the server returned 200 for the first request but 404 for the second request.
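The DELETE exchange can be mimicked with a small in-memory sketch. ResourceStore and its delete method are hypothetical names made up for illustration, not part of any HTTP library; the integers only imitate status codes:

```python
class ResourceStore:
    """Hypothetical in-memory stand-in for the server's resources."""

    def __init__(self, paths):
        self.paths = set(paths)

    def delete(self, path):
        """Return an HTTP-like status code for DELETE <path>."""
        if path in self.paths:
            self.paths.remove(path)
            return 200  # resource existed and was removed
        return 404      # nothing to remove; state is left untouched


store = ResourceStore({"/foo/bar"})
first = store.delete("/foo/bar")
state_after_first = set(store.paths)
second = store.delete("/foo/bar")
state_after_second = set(store.paths)

print(first, second)                            # 200 404
print(state_after_first == state_after_second)  # True
```

The return value differs between the first and second call, while the stored state is identical after each of them, which is the distinction the example draws.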
A deterministic function is just a function in the mathematical sense. Given the same input, you always get the same output. On the other hand, an idempotent function is a function which satisfies the identity
f(f(x)) = f(x)
As a simple example: if UCase() is a function that converts a string to an upper-case string, then clearly UCase(UCase(s)) = UCase(s).
Idempotent functions are a subset of all functions.
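The identity f(f(x)) = f(x) can be spot-checked in a few lines of Python. This is only a sketch: is_idempotent_on is a hypothetical helper, and passing the check on a handful of samples is evidence rather than a proof:

```python
def is_idempotent_on(f, samples):
    """Check f(f(x)) == f(x) for the given sample inputs only."""
    return all(f(f(x)) == f(x) for x in samples)


print(is_idempotent_on(str.upper, ["abc", "Abc", "ABC"]))  # True
print(is_idempotent_on(abs, [-2, 0, 3]))                    # True
print(is_idempotent_on(lambda n: n + 1, [0, 1, 2]))         # False
```

Upper-casing and abs leave an already-processed value unchanged, whereas adding one keeps changing the value on every application.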
A deterministic function will return the same result for the same inputs, regardless of how many times you call it.
An idempotent function may NOT return the same result (it will return the result in the same form, but the value could be different; see the HTTP example below). It only guarantees that it will have no side effects. In other words, it will not change anything.
For example, the GET verb is meant to be idempotent in the HTTP protocol. If you call "~/employees/1" it will return the info for the employee with an ID of 1 in a specific format. It should never change anything but simply return the employee information. If you call it 10, 100 or more times, the returned format will always be the same. However, by no means can it be deterministic. Maybe when you call it the second time the employee info has changed, or perhaps the employee no longer even exists. But it should never have side effects or return the result in a different format.
My Opinion
Idempotent is a weird word but knowing the origin can be very helpful, idem meaning same and potent meaning power. In other words it means having the same power which clearly doesn't mean no side effects so not sure where that comes from. A classic example of There are only two hard things in computer science, cache invalidation and naming things. Why couldn't they just use read-only? Oh wait, they wanted to sound extra smart, perhaps? Perhaps like cyclomatic complexity?

How can I verify that a Lucene query embedded in a larger XQuery does not contain a syntax error before launching the complete XQuery I want to run?

I have an application for which I need to allow the user to perform full text search on documents, and use the Lucene Query Parser syntax if desired. The eXist database is queried from a Django backend that uses eulexistdb to talk to eXist.
The problem is that when the user uses an incorrect syntax for the full text search, this is discovered late in the game. The Django application has to query a SQL database to determine some of the parameters of the search. By the time the complete XQuery is built and eXist is accessed, the SQL query has already run, which means that the cost of the SQL query has already been spent. (I know I could marshal the data queried on the SQL side into eXist so that only eXist is queried. It's just not an option for now.)
I'd like to know ahead of time whether the Lucene query has a syntax error, so that I can avoid querying the SQL database for nothing.
I've checked the documentation of eXist, but I've not found anything in the API which would be a simple function that checks whether a full-text query is syntactically valid or not.
Here is a simple function that will return True if a Lucene query is fine, or False if there is a syntax error in the query. db must be an instance of eulexistdb.db.ExistDB and query is the Lucene query:
from eulexistdb.exceptions import ExistDBException

def check(db, query):
    try:
        # Run the query against a bogus document; only parsing matters here.
        db.query(safe_interpolate("ft:query(<doc/>, {lucene_query})",
                                  lucene_query=query))
    except ExistDBException as ex:
        if ex.message().startswith(
                "exerr:ERROR Syntax error in Lucene query string"):
            return False
        raise  # Don't swallow other problems that may occur.
    return True
This should be adaptable to any language for which there is a library that provides access to eXist. The idea is to run the query of interest against a bogus document (<doc/>). Using the bogus document avoids having to actually search the database. (An empty node sequence might seem better, but we're not running ft:query against an empty node sequence because then the XQuery optimizer could skip trying to parse and run the Lucene query since a valid query on an empty sequence will necessarily return an empty sequence, irrespective of the actual Lucene query.) It does not matter whether it returns any results or not. If the query has no errors, then there won't be an exception. If the query has a syntax error, then an exception will be raised. I've not found a more robust way than checking the error message stored with the exception to detect whether it is a Lucene syntax error or something else.
(The safe_interpolate function is a function that should interpolate lucene_query so as to avoid injections. It is up to you to decide what you need in your application.)
Here is an approach I consider complementary to the one I posted earlier. I'm using lucene-query-parser to perform the check client-side (i.e. in the browser):
define(function (require, exports, _module) {
    "use strict";
    var lqp = require("lucene-query-parser");

    function preDrawCallback() {
        // We get the content of the search field.
        var search = this.api().search();
        var good = true;
        try {
            lqp.parse(search); // Here we check whether it is syntactically valid.
        }
        catch (ex) {
            if (!(ex instanceof lqp.SyntaxError)) {
                throw ex; // Don't swallow exceptions.
            }
            good = false;
        }
        // Some work is performed here depending on whether
        // the query is good or bad.
        return good; // And finally we tell DataTables whether to inhibit the draw.
    }
    // ....
});
preDrawCallback is used with a DataTables instance. Returning false inhibits drawing the table, which also inhibits performing a query to the server. So if the query is syntactically incorrect, it won't ever make it to the backend. (The define and require calls are there because both my code and lucene-query-parser are AMD modules.)
Potential issues:
If the library that performs the check is buggy or otherwise does not support the entire syntax that Lucene supports, it will block queries that should go through. I found a few buggy (or at best severely obsolete) libraries before I settled on lucene-query-parser.
If the client-side library happens to support a construct introduced in a later version of Lucene that is not supported in the version used with eXist, it will let through queries that eXist then rejects. Keeping the backend check I show in my other answer makes sure that anything that slips through is caught there.

In data flow coverage, does returning a variable use it?

I have a small question in my mind. I researched it on the Internet but no-one is providing the exact answer. My question is:
In data flow coverage criteria, say there is a method which finally returns variable x. When drawing the graph for that method, is that return statement considered to be a use of x?
Yes, a return statement uses the value that it returns. I couldn't find an authoritative reference that says so in plain English either, but here are two arguments:
A return statement passes control from one part of a program to another, just like a method call does. The value being returned is analogous to a function parameter. return therefore is a use just like being a function parameter is a use.
The other kind of use in data flow analysis is when a value leaves the program and has some effect on the outside world, for example by being printed. If we're analyzing a method, rather than an entire program, return causes the value to leave the scope which we're analyzing. So it's a use for the same reason that printing is a use.
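As an illustration of how the definitions and uses might be annotated under that reading, here is a small hypothetical method (scale is made up for this sketch, and the comments follow the usual def/use vocabulary rather than any particular tool's output):

```python
def scale(a, b):
    x = a * b        # def of x; uses of a and b
    if x < 0:        # use of x (in a predicate)
        x = -x       # use of x, followed by a new def of x
    return x         # use of x: the value leaves the method


print(scale(2, -3))  # 6
print(scale(2, 3))   # 6
```

Each def of x reaches the return statement along some path, so counting return x as a use yields def-use pairs that end at the return.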

Remove temp variable in code with switch-block

I have the following code in my project:
Cookie CookieCreate(CookiesTypes type)
{
    Cookie user_cookie = null;
    switch (type)
    {
        case CookiesTypes.SessionId:
            user_cookie = new Cookie("session_id", Convert.ToBase64String(Guid.NewGuid().ToByteArray()));
            break;
        case CookiesTypes.ClientIp:
            HttpListenerContext context = listener.GetContext();
            user_cookie = new Cookie("client_ip", context.Request.RemoteEndPoint.ToString());
            break;
    }
    return user_cookie;
}
I understand that the temp variable user_cookie is bad style... I tried to use return inside the switch block in both cases, but I got compiler errors when I did so:
Pseudo-Code:
case ... :
//some action
return var;
Having a temporary variable that is set in a case of a switch statement and returned at the end is not bad syntax; it is also the only choice if you need to do something with user_cookie for all cases before returning it.
The only problem I see in your code is the lack of a default case, which is useful because:
either you need a default case (so that you do something in that situation),
or the switch should never reach a default case (so you should handle that situation in a special way, for example by throwing an exception).
If you blindly remove the temporary variable and return the value directly, as you are trying to do, you probably get a compiler error because not all of your branches return something (you are lacking a default clause, or a return after the switch).
Despite the fact that there's nothing inherently wrong with temporary variables, if you really want to avoid it you just need to ensure that all code paths return something.
That means (for example) changing your current return to:
return null;
and having both cases contain:
return new Cookie (whatever);
instead of the assignment.