I need help implementing some fairly complex business logic that operates on many tables and executes quite a few SQL commands. However, I want to be sure that the data will not be left in an inconsistent state, and so far I don't see a solution that would not require nested transactions. I wrote some simple pseudo-code that illustrates a scenario similar to what I want to accomplish:
Dictionary<int, bool> opSucceeded = new Dictionary<int, bool>();

for (int i = 0; i < 10; i++)
{
    try
    {
        // this operation must be atomic
        Operation(dbContext, i);
        // commit (?)
        opSucceeded[i] = true;
    }
    catch
    {
        // ignore
    }
}

try
{
    // this operation must know which Operation(i) succeeded;
    // it also must be atomic
    FinalOperation(dbContext, opSucceeded);
    // commit all
}
catch
{
    // rollback FinalOperation and every Operation(i) where opSucceeded[i] == true
}
The biggest problem for me is: how do I ensure that if FinalOperation fails, all the Operation(i) calls that succeeded are rolled back? Note that I would also like to be able to ignore failures of individual Operation(i) calls.
Is it possible to achieve this by using nested TransactionScope objects, and if not, how would you approach such a problem?
If I am following your question, you want to have a series of operations against the database, and you capture enough information to determine whether each operation succeeds or fails (the dictionary in your simplified code).
From there, you have a final operation that must roll back all of the successful operations from earlier if it fails itself.
It would seem this is exactly the type of case that a simple transaction is for. There is no need to keep track of the success or failure of the child/early operations as long as failure of the final operation rolls the entire transaction back (here assuming that FinalOperation isn't using that information for other reasons).
Simply start a transaction before you enter the block described, and commit or roll back the entire thing once you know the status of your FinalOperation. There is no need to nest the child operations as far as I can see from your current description.
Perhaps I am missing something? (Note: if you wanted to RETAIN the earlier/child operations, that would be something different entirely... but a failure of the final op rolling the whole package of operations back makes the simple transaction usable.)
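A minimal sketch of that approach, assuming dbContext enlists in the ambient System.Transactions transaction and that a failed Operation(i) does not doom the ambient transaction (Operation, FinalOperation and dbContext are the names from the question):

// Requires a reference to System.Transactions.
using (var scope = new TransactionScope())
{
    var opSucceeded = new Dictionary<int, bool>();

    for (int i = 0; i < 10; i++)
    {
        try
        {
            Operation(dbContext, i);
            opSucceeded[i] = true;
        }
        catch
        {
            // ignore failures of individual operations, as in the question
        }
    }

    // If FinalOperation throws, Complete() is never called and everything
    // performed inside the scope (including every successful Operation(i))
    // is rolled back together.
    FinalOperation(dbContext, opSucceeded);
    scope.Complete();
}

The per-operation bookkeeping (opSucceeded) is still available for FinalOperation, but the rollback guarantee comes from the single outer scope rather than from nested transactions.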
I have noticed that my Lucene index segment files (file names) are constantly changing, even when I am not performing any add, update, or delete operations. The only operations I am performing are reading and searching. So my question is: do Lucene index segment files get updated internally somehow just from reading and searching operations?
I am using Lucene.Net v4.8 beta, if that matters. Thanks!
Here is an example of how I found this issue (I wanted to get the index size). Assuming a Lucene index already exists, I used the following code to get its size:
Example:
private long GetIndexSize()
{
    var reader = GetDirectoryReader("validPath");
    long size = 0;

    foreach (var fileName in reader.Directory.ListAll())
    {
        size += reader.Directory.FileLength(fileName);
    }

    return size;
}

private DirectoryReader GetDirectoryReader(string path)
{
    var directory = FSDirectory.Open(path);
    var reader = DirectoryReader.Open(directory);
    return reader;
}
The above method is called every 5 minutes. It works fine ~98% of the time. However, the other 2% of the time I would get a file-not-found error in the foreach loop, and after debugging I saw that the file count in reader.Directory was changing. The index is updated at certain times by another service, but I can assure you that no updates were made to the index anywhere near the times when this error occurs.
Since you have multiple processes writing/reading the same set of files, it is difficult to isolate what is happening. Lucene.NET does locking and exception handling to ensure operations can be synced up between processes, but if you read the files in the directory directly without doing any locking, you need to be prepared to deal with IOExceptions.
The solution depends on how up to date you need the index size to be:
If it is okay to be a bit out of date, I would suggest using DirectoryInfo.EnumerateFiles on the directory itself. This may be a bit more up to date than Directory.ListAll() because that method stores the file names in an array, which may go stale before the loop is done. But, you still need to catch FileNotFoundException and ignore it and possibly deal with other IOExceptions.
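A rough sketch of that first option might look like this (the path parameter is a placeholder; file names can still go stale between enumeration and the length lookup, hence the catches):

private long GetIndexSizeApproximate(string indexPath)
{
    long size = 0;

    foreach (var file in new DirectoryInfo(indexPath).EnumerateFiles())
    {
        try
        {
            size += file.Length;
        }
        catch (FileNotFoundException)
        {
            // the file was deleted by a merge after enumeration; skip it
        }
        catch (IOException)
        {
            // other transient IO problems can be skipped or logged as needed
        }
    }

    return size;
}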
If you need the size to be absolutely up to date and plan to do an operation that requires the index to be that size, you need to open a write lock to prevent the files from changing while you get the value.
private long GetIndexSize()
{
    // DirectoryReader is superfluous for this example. Also,
    // using an MMapDirectory (which DirectoryReader.Open() may return)
    // will use more RAM than simply using SimpleFSDirectory.
    var directory = new SimpleFSDirectory("validPath");
    long size = 0;

    // NOTE: The lock will stay active until this is disposed,
    // so if you have any follow-on actions to perform, the lock
    // should be obtained before calling this method and disposed
    // after you have completed all of your operations.
    using Lock writeLock = directory.MakeLock(IndexWriter.WRITE_LOCK_NAME);

    // Obtain exclusive write access to the directory
    if (!writeLock.Obtain(/* optional timeout */))
    {
        // timeout failed, either throw an exception or retry...
    }

    foreach (var fileName in directory.ListAll())
    {
        size += directory.FileLength(fileName);
    }

    return size;
}
Of course, if you go that route, your IndexWriter may throw a LockObtainFailedException, and you should be prepared to handle it during the write process.
However you deal with it, you need to be catching and handling exceptions because IO by its nature has many things that can go wrong. But exactly how you deal with it depends on what your priorities are.
Original Answer
If you have an IndexWriter instance open, Lucene.NET will run a background process to merge segments based on the MergePolicy being used. The default settings can be used with most applications.
However, the settings are configurable through the IndexWriterConfig.MergePolicy property. By default, it uses the TieredMergePolicy.
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
{
    MergePolicy = new TieredMergePolicy()
};
There are several properties on TieredMergePolicy that can be used to change the thresholds that it uses to merge.
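For example, the thresholds might be adjusted like this (the property names below mirror the Java 4.8 setters and are assumed to carry over to the Lucene.NET port; the values are only illustrative):

var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
{
    MergePolicy = new TieredMergePolicy
    {
        // Assumed property names; tune the values to your own workload.
        MaxMergeAtOnce = 10,        // how many segments may be merged at a time
        SegmentsPerTier = 10,       // how many segments per tier before merging
        MaxMergedSegmentMB = 5120   // cap the size of merged segments
    }
};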
Or, it can be changed to a different MergePolicy implementation. Lucene.NET comes with:
LogByteSizeMergePolicy
LogDocMergePolicy
NoMergePolicy
TieredMergePolicy
UpgradeIndexMergePolicy
SortingMergePolicy
The NoMergePolicy class can be used to disable merging entirely.
If your application never needs to add documents to the index (for example, if the index is built as part of the application deployment), it is also possible to use an IndexReader from a Directory instance directly, which does not do any background segment merges.
The merge scheduler can also be swapped and/or configured using the IndexWriterConfig.MergeScheduler property. By default, it uses the ConcurrentMergeScheduler.
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
{
    MergePolicy = new TieredMergePolicy(),
    MergeScheduler = new ConcurrentMergeScheduler()
};
The merge schedulers that are included with Lucene.NET 4.8.0 are:
ConcurrentMergeScheduler
NoMergeScheduler
SerialMergeScheduler
The NoMergeScheduler class can be used to disable merging entirely. This has the same effect as using NoMergePolicy, but also prevents any scheduling code from being executed.
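For instance, merging could be switched off completely like this (the member names below follow the Java 4.8 API and are assumed to be the same in the .NET port):

var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
{
    // Assumed member names: NoMergePolicy exposes static instances and
    // NoMergeScheduler exposes a singleton INSTANCE, as in Java Lucene 4.8.
    MergePolicy = NoMergePolicy.COMPOUND_FILES,
    MergeScheduler = NoMergeScheduler.INSTANCE
};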
I populate an object based on the user's input from the command line.
The object needs a certain amount of data before it can proceed. My solution so far is nested if-statements that check whether the object is ready, like the example below.
Maybe three if-statements aren't so bad(?), but what if the number of if-statements starts to increase? What are my alternatives here? Let's say that X, Y and Z are three completely different things. For example, object.X could be a list of integers, object.Y a string, and Z some sort of boolean that is true only if object.Y has a certain number of values.
I'm not sure polymorphism will work in this case?
do
{
    if (object.HasX)
    {
        if (object.HasY)
        {
            if (object.HasZ)
            {
                // Object is ready to proceed.
            }
            else
            {
                // Object is missing Z. Handle it...
            }
        }
        else
        {
            // Object is missing Y. Handle it...
        }
    }
    else
    {
        // Object is missing X. Handle it...
    }
} while (!String.IsNullOrEmpty(line));
For complex logic workflows, I have found it's important for maintainability to decide which level of abstraction the logic should live in.
Will new logic/parsing rules have to be added regularly?
Unfortunately, there isn't a way to avoid explicit conditionals; they have to live somewhere.
Some things that can help keep it clean could be:
The main function is only responsible for converting command-line arguments to native datatypes, then it pushes the logic down to an object builder class. This keeps the main function stable and unchanged except for adding flag descriptions, and it keeps the logic out of the domain and centralized in the builder abstraction.
The main function is responsible for parsing and configuring the domain; this isolates all the messy conditionals in the main/parsing function and keeps the logic outside of the domain models.
Flatten the logic: if not object.HasX, return; on the next step you know HasX holds. This still has a list of conditionals, but it is flatter (see the sketch after this list).
Create a declarative DSL / rule language (more apparent when flattening). This could be a rule processor where the logic lives; the outer main function would then define the states that are necessary to proceed.
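As a rough illustration of the flattening idea, here is a guard-clause sketch; MyObject, HasX and the HandleMissing... helpers are placeholders based on the question, not an existing API:

// Guard clauses: bail out as soon as a requirement is missing, so no nesting.
bool IsReadyToProceed(MyObject obj)
{
    if (!obj.HasX) { HandleMissingX(); return false; }
    if (!obj.HasY) { HandleMissingY(); return false; }
    if (!obj.HasZ) { HandleMissingZ(); return false; }

    // Object is ready to proceed.
    return true;
}

Each new requirement then adds one guard line instead of another level of nesting.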
In the process of learning Rust, I am getting acquainted with error propagation and the choice between unwrap and the ? operator. After writing some prototype code that only uses unwrap(), I would like to remove unwrap from reusable parts, where panicking on every error is inappropriate.
How would one avoid the use of unwrap in a closure, like in this example?
// todo is VecDeque<PathBuf>
let dir = fs::read_dir(&filename).unwrap();
todo.extend(dir.map(|dirent| dirent.unwrap().path()));
The first unwrap can be easily changed to ?, as long as the containing function returns Result<(), io::Error> or similar. However, the second unwrap, the one in dirent.unwrap().path(), cannot be changed to dirent?.path() because the closure must return a PathBuf, not a Result<PathBuf, io::Error>.
One option is to change extend to an explicit loop:
let dir = fs::read_dir(&filename)?;
for dirent in dir {
    todo.push_back(dirent?.path());
}
But that feels wrong - the original extend was elegant and clearly reflected the intention of the code. (It might also have been more efficient than a sequence of push_backs.) How would an experienced Rust developer express error checking in such code?
How would one avoid the use of unwrap in a closure, like in this example?
Well, it really depends on what you wish to do upon failure.
Should failure be reported to the user or be silent?
If reported, should one failure be reported or all of them?
If a failure occurs, should it interrupt processing?
For example, you could perfectly decide to silently ignore all failures and just skip the entries that fail. In this case, the Iterator::filter_map combined with Result::ok is exactly what you are asking for.
let dir = fs::read_dir(&filename)?;
todo.extend(dir.filter_map(Result::ok).map(|dirent| dirent.path()));
The Iterator interface is full of goodies; it's definitely worth perusing when looking for tidier code.
Here is a solution based on filter_map suggested by Matthieu. It calls Result::map_err to ensure the error is "caught" and logged, sending it further to Result::ok and filter_map to remove it from iteration:
fn log_error(e: io::Error) {
    eprintln!("{}", e);
}
(|| -> Result<(), io::Error> {
    let dir = fs::read_dir(&filename)?;
    todo.extend(dir
        .filter_map(|res| res.map_err(log_error).ok())
        .map(|dirent| dirent.path()));
    Ok(())
})().unwrap_or_else(log_error);
So I've been doing some work with NHibernate, and while what I have seen is mostly great, one thing is a little frustrating. If I try to query for some objects and there is a mapping problem in or with the hbm file, there is often no indication that anything is wrong; the query just returns no results.
A simple example is if I forget to set the hbm file as an embedded resource, and then do a session.Query<Variable>().ToList(), there is no indication that there is no mapping for Variable, it just returns an empty list.
Is there any way to tell NHibernate to throw an exception or otherwise indicate that there is a problem with the mapping(s) in situations like this?
This does result in an exception:
_session.Get<Variable>(1)
But these do not:
_session.Query<Variable>().Where(e => e.VariableId == 1).ToList()
_session.CreateCriteria(typeof(Variable)).Add(Restrictions.Eq("VariableId",1)).List<Variable>();
Hopefully it's something that I'm doing wrong, or something that can be configured; otherwise it will probably be a deal-breaker for using NHibernate. I can catch these things in my own unit tests relatively easily, but I can just see this becoming a festival of bugs when other developers start touching it.
After doing a bit of digging through the code, I suspect you are right. Unmapped classes don't throw an exception on a List.
https://github.com/nhibernate/nhibernate-core/blob/master/src/NHibernate/Impl/SessionImpl.cs#L1869
string[] implementors = Factory.GetImplementors(criteria.EntityOrClassName);
int size = implementors.Length;
CriteriaLoader[] loaders = new CriteriaLoader[size];
ISet<string> spaces = new HashSet<string>();

for (int i = 0; i < size; i++)
{
    loaders[i] = new CriteriaLoader(
        GetOuterJoinLoadable(implementors[i]),
        Factory,
        criteria,
        implementors[i],
        enabledFilters
    );

    spaces.UnionWith(loaders[i].QuerySpaces);
}
Note how, if Factory.GetImplementors returns no implementors, no error is generated, but nothing is done either. That is why you are seeing an empty list come back.
If we look at SessionFactoryImpl.GetImplementors, we see that at no point is an exception thrown if no implementor is found; just an empty array of implementors is returned.
So potentially there needs to be a check so that a MappingException is thrown when Factory.GetImplementors returns no implementors.
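Until something like that exists, one way to fail fast in your own code (this is a workaround sketch, not a built-in NHibernate switch) is to check the session factory's metadata at startup; ISessionFactory.GetClassMetadata returns null for types that have no mapping:

// Hypothetical startup guard: throws if an expected entity is unmapped,
// instead of letting queries silently return empty lists.
static void AssertMapped(ISessionFactory factory, params Type[] entityTypes)
{
    foreach (var type in entityTypes)
    {
        if (factory.GetClassMetadata(type) == null)
            throw new MappingException("No NHibernate mapping found for " + type.FullName);
    }
}

// e.g. AssertMapped(sessionFactory, typeof(Variable));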
Assume you have a function that polls some kind of queue and blocks for a certain amount of time. If this time passes without anything showing up on the queue, some indication of the timeout should be delivered to the caller; otherwise, the thing that showed up should be returned.
Now you could write something like:
class Queue
{
    Thing GetThing();
};
and throw an exception in case of a timeout. Or you write
class Queue
{
    int GetThing(Thing& t);
};
and return a code indicating success or timeout.
However, the drawback of solution 1 is that on a not-so-busy queue a timeout is not an exceptional case but rather a common one. And solution 2 uses return values for errors and has ugly syntax, since you can end up with a Thing that contains nothing.
Is there another (smart) solution for that problem? What is the preferred solution in an object oriented environment?
I would use exceptions only when the error is serious enough to stop the execution of the application, or of any sufficiently large component of it. I wouldn't use exceptions for common cases after which we continue normal execution or execute the same function again; that would just be using exceptions for flow control, which is wrong.
So I suggest you either use the second solution you proposed, or do the following:
class Queue
{
    bool GetThing(Thing& t); // true on success, false on failure
    string GetLastError();
};
Of course you can stick with an int for an error code, instead of a string for the full error message. Or even better, just define class Error and have GetLastError() return it.
Why not just return null from GetThing in your first solution, changing it to return a Thing *? It seems to fit the bill, at least from the information you've given so far.
In the first and second cases, you can't do anything but throw an exception. When you return a Thing or a Thing&, you don't have the option of not returning a Thing.
If you want to fail without using an exception then you need:
class Queue
{
    // Either something like this: GetThing returns NULL on an error,
    // and GetError returns a specific error code.
    Thing* GetThing();
    int GetError();

    // This kind of pattern is also common: return a result code
    // and set ppOut to a valid Thing or NULL.
    int GetThing(Thing** ppOut);
};