StringTemplate: Probable I/O race condition detected while copying memory

Hi,
In my project I am using the Antlr.StringTemplate.StringTemplateGroup class to create a localized template. I load the .st file and set the desired attributes as below.
private StringTemplateGroup StringTemplateGroup = new StringTemplateGroup(StringTemplateGroupName);

public StringTemplate WrapValuesReportTemplateContent(string path, object value1, object value2) // parameter list assumed from usage
{
    StringTemplate stringTemplate = this.StringTemplateGroup.GetInstanceOf(path);
    stringTemplate.SetAttribute("atr1", value1);
    stringTemplate.SetAttribute("atr2", value2);
    return stringTemplate;
}
The class is used repeatedly by a manager, and because of that the following exception was triggered.
System.IndexOutOfRangeException: Probable I/O race condition detected while copying memory. The I/O package is not thread safe by default. In multithread applications, a stream must be accessed in a thread-safe way, such as a thread-safe wrapper returned by TextReader's or TextWriter's Synchronized methods. This also applies to classes like StreamWriter and StreamReader.
at System.Buffer.InternalBlockCopy(Array src, Int32 srcOffsetBytes, Array dst, Int32 dstOffsetBytes, Int32 byteCount)
at System.IO.StreamWriter.Write(Char[] buffer, Int32 index, Int32 count)
at System.IO.TextWriter.WriteLine(String value)
at System.IO.TextWriter.SyncTextWriter.WriteLine(String value)
at Antlr.StringTemplate.ConsoleErrorListener.Error(String s, Exception e)
at Antlr.StringTemplate.StringTemplate.BreakTemplateIntoChunks()
I am quite new to StringTemplate and it's not clear to me how StringTemplates really work. From the error description I understand that the .st resource is not being closed. I have the following questions:
When creating a new StringTemplate, do we create a stream for reading from and writing to the .st file, or just a new object on which we modify the attributes?
After a .st file is opened, is it closed automatically once it goes out of scope?
What is the best approach to avoid this error? Should we use locks on the resources, or wrap everything in a using block?
Any clarification would be very useful.
Thanks

You could try synchronizing accesses on the stringTemplate object.
My guess is that you only need to synchronize if one thread is modifying it while another is reading or modifying it. If all threads are only reading, it usually doesn't matter.
lock (stringTemplate)
{
    // Access thread-sensitive resources.
}
For efficiency, keep the synchronized block as small as possible, covering just the stringTemplate access.

I've recently encountered the same bug!
Look at your stacktrace: it is thrown from ConsoleErrorListener.
In short, the exception is thrown because two threads are trying to write to the Console.Out stream at the same time.
To eliminate this error you should override the error listener for your template.
The solution marked as the answer will work, but it introduces a lot of lock contention.
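For illustration, here is a minimal sketch of such a listener. The interface and property names (IStringTemplateErrorListener, Error/Warning, StringTemplateGroup.ErrorListener) are from the ST v3 C# port and may differ in your version, so treat them as assumptions to verify:

using System;
using Antlr.StringTemplate;

// A listener that serializes its own error output instead of relying
// on the (racy) default console listener.
public class ThreadSafeErrorListener : IStringTemplateErrorListener
{
    private static readonly object Sync = new object();

    public void Error(string msg, Exception e)
    {
        lock (Sync)
        {
            Console.Error.WriteLine("StringTemplate error: " + msg + " " + e);
        }
    }

    public void Warning(string msg)
    {
        lock (Sync)
        {
            Console.Error.WriteLine("StringTemplate warning: " + msg);
        }
    }
}

Wiring it up when the group is created:

this.StringTemplateGroup.ErrorListener = new ThreadSafeErrorListener();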

Related

Lucene index files changing constantly even when there is no adding, updating, or deletion operations performed on it

I have noticed that my Lucene index segment files (the file names) are constantly changing, even when I am not performing any add, update, or delete operations. The only operations I am performing are reading and searching. So, my question is: do Lucene index segment files get updated internally somehow just from reading and searching operations?
I am using Lucene.Net v4.8 beta, if that matters. Thanks!
Here is an example of how I found this issue (I wanted to get the index size). Assuming a Lucene Index already exists, I used the following code to get the index size:
Example:
private long GetIndexSize()
{
    var reader = GetDirectoryReader("validPath");
    long size = 0;
    foreach (var fileName in reader.Directory.ListAll())
    {
        size += reader.Directory.FileLength(fileName);
    }
    return size;
}

private DirectoryReader GetDirectoryReader(string path)
{
    var directory = FSDirectory.Open(path);
    var reader = DirectoryReader.Open(directory);
    return reader;
}
The above method is called every 5 minutes. It works fine ~98% of the time. However, the other 2% of the time, I get a file-not-found error in the foreach loop, and after debugging I saw that the number of files in reader.Directory was changing. The index is updated at certain times by another service, but I can assure you that no updates were made to the index anywhere near the times when this error occurs.
Since you have multiple processes writing/reading the same set of files, it is difficult to isolate what is happening. Lucene.NET does locking and exception handling to ensure operations can be synced up between processes, but if you read the files in the directory directly without doing any locking, you need to be prepared to deal with IOExceptions.
The solution depends on how up to date you need the index size to be:
If it is okay to be a bit out of date, I would suggest using DirectoryInfo.EnumerateFiles on the directory itself. This may be a bit more up to date than Directory.ListAll(), because that method stores the file names in an array, which may go stale before the loop is done. But you still need to catch FileNotFoundException and ignore it, and possibly deal with other IOExceptions (a sketch of this approach follows the locking example below).
If you need the size to be absolutely up to date and plan to do an operation that requires the index to be that size, you need to open a write lock to prevent the files from changing while you get the value.
private long GetIndexSize()
{
    // DirectoryReader is superfluous for this example. Also, using a
    // MMapDirectory (which DirectoryReader.Open() may return) will use
    // more RAM than simply using SimpleFSDirectory.
    var directory = new SimpleFSDirectory("validPath");
    long size = 0;

    // NOTE: The lock will stay active until this is disposed, so if
    // you have any follow-on actions to perform, the lock should be
    // obtained before calling this method and disposed after you have
    // completed all of your operations.
    using Lock writeLock = directory.MakeLock(IndexWriter.WRITE_LOCK_NAME);

    // Obtain exclusive write access to the directory
    if (!writeLock.Obtain(/* optional timeout */))
    {
        // timeout failed, either throw an exception or retry...
    }

    foreach (var fileName in directory.ListAll())
    {
        size += directory.FileLength(fileName);
    }
    return size;
}
Of course, if you go that route, your IndexWriter may throw a LockObtainFailedException, and you should be prepared to handle it during the write process.
However you deal with it, you need to catch and handle exceptions, because I/O by its nature has many things that can go wrong. Exactly how you deal with them depends on what your priorities are.
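If the approximate, "slightly stale" approach is enough (the first option above), a sketch might look like this; it enumerates the directory with System.IO and simply tolerates files that disappear mid-loop ("validPath" is a placeholder):

using System.IO;

private long GetIndexSizeApproximate()
{
    long size = 0;
    // Fully qualified to avoid clashing with Lucene.Net.Store.Directory.
    foreach (string fileName in System.IO.Directory.EnumerateFiles("validPath"))
    {
        try
        {
            size += new FileInfo(fileName).Length;
        }
        catch (FileNotFoundException)
        {
            // A merge deleted the file between enumeration and the
            // size read; skip it.
        }
        catch (IOException)
        {
            // Other transient I/O failures can be skipped or retried.
        }
    }
    return size;
}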
Original Answer
If you have an IndexWriter instance open, Lucene.NET will run a background process to merge segments based on the MergePolicy being used. The default settings can be used with most applications.
However, the settings are configurable through the IndexWriterConfig.MergePolicy property. By default, it uses the TieredMergePolicy.
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
{
    MergePolicy = new TieredMergePolicy()
};
There are several properties on TieredMergePolicy that can be used to change the thresholds that it uses to merge.
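For example, a sketch of tuning those thresholds (the property names mirror the Java setters; the values are purely illustrative, not recommendations):

var mergePolicy = new TieredMergePolicy
{
    MaxMergeAtOnce = 10,         // max segments merged in one normal merge
    SegmentsPerTier = 10.0,      // segments allowed per tier before merging kicks in
    MaxMergedSegmentMB = 5120.0  // approximate cap on a merged segment's size
};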
Or, it can be changed to a different MergePolicy implementation. Lucene.NET comes with:
LogByteSizeMergePolicy
LogDocMergePolicy
NoMergePolicy
TieredMergePolicy
UpgradeIndexMergePolicy
SortingMergePolicy
The NoMergePolicy class can be used to disable merging entirely.
If your application never needs to add documents to the index (for example, if the index is built as part of the application deployment), it is also possible to use an IndexReader from a Directory instance directly, which does not do any background segment merges.
The merge scheduler can also be swapped and/or configured using the IndexWriterConfig.MergeScheduler property. By default, it uses the ConcurrentMergeScheduler.
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
{
    MergePolicy = new TieredMergePolicy(),
    MergeScheduler = new ConcurrentMergeScheduler()
};
The merge schedulers that are included with Lucene.NET 4.8.0 are:
ConcurrentMergeScheduler
NoMergeScheduler
SerialMergeScheduler
The NoMergeScheduler class can be used to disable merging entirely. This has the same effect as using NoMergePolicy, but also prevents any scheduling code from being executed.
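For example, a sketch of turning merging off completely (the member names NoMergePolicy.NO_COMPOUND_FILES and NoMergeScheduler.INSTANCE mirror the Java 4.8 API that Lucene.NET follows; verify them against your build):

var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
{
    MergePolicy = NoMergePolicy.NO_COMPOUND_FILES,  // never merge segments
    MergeScheduler = NoMergeScheduler.INSTANCE      // never run merge scheduling code
};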

JavaCPP Leptonica : How to clear memory of pixClone handles

Until now, I've always used pixDestroy to clean up PIX objects in my JavaCPP/Leptonica application. However, I recently noticed a weird memory leak issue that I tracked down to a Leptonica function internally returning a pixClone result. I managed to reproduce the issue by using the following simple test:
@Test
public void test() throws InterruptedException {
    // Assumes the usual JavaCPP static imports for the Leptonica API
    // (pixRead, pixClone, pixDestroy).
    String pathImg = "...";
    for (int i = 0; i < 100; i++) {
        PIX img = pixRead(pathImg);
        PIX clone = pixClone(img);
        pixDestroy(clone);
        pixDestroy(img);
    }
    Thread.sleep(10000);
}
When the Thread.sleep is reached, the RAM usage shown in the Windows task manager (not the heap size) has increased to about 1 GB, and it is not released until the sleep ends and the test finishes.
Looking at the docs of pixClone, we see it actually creates a handle to the existing PIX:
Notes:
A "clone" is simply a handle (ptr) to an existing pix. It is implemented because (a) images can be large and hence expensive to copy, and (b) extra handles to a data structure need to be made with a simple policy to avoid both double frees and memory leaks. Pix are reference counted. The side effect of pixClone() is an increase by 1 in the ref count.
The protocol to be used is: (a) Whenever you want a new handle to an existing image, call pixClone(), which just bumps a ref count. (b) Always call pixDestroy() on all handles. This decrements the ref count, nulls the handle, and only destroys the pix when pixDestroy() has been called on all handles.
If I understand this correctly, I am indeed calling pixDestroy on all handles, so the ref count should reach zero and thus the PIX should have been destroyed. Clearly, this is not the case though. Can someone tell me what I'm doing wrong? Thanks in advance!
As an optimization for the common case where a function returns a pointer it received as an argument, JavaCPP returns the same object to the JVM. This is what happens with pixClone(): it simply returns the pointer the user passed in, so both img and clone end up referencing the same object in Java.
Now, when pixDestroy() is called on the first reference, img, Leptonica helpfully resets its address to 0, but the address is now lost on the Java side, and the second call to pixDestroy() receives that null pointer, resulting in a no-op and a memory leak.
One easy way to avoid this issue is to explicitly create a new PIX reference after each call to pixClone(), for example, in this case:
PIX clone = new PIX(pixClone(img));

System.AccessViolationException: 'Attempted to read or write protected memory' (making a wrapper for a C++ lib)

My constructor is as follows:
ScaperEngine::ScaperEngine(GrabberType grabberType, bool timing) {
    switch (grabberType)
    {
    case GrabberType::DepthSenseGrabber:
        this->interface = new pcl::DepthSenseGrabber("");
        break;
    default:
        throw new std::exception("Grabber type wasn't chosen correctly");
        break;
    }
    executionPipeline = new ExecutionPipeline();
    executionPipeline->setTiming(timing);
}
And then I have some code like:
void ScaperEngine::StartPipeline()
{
    IPCLNormalCalculator* normalCalculator = new PCLNormalCalculator(normalCalcMaxDepthChangeFactor, normalSmoothingSize);
    executionPipeline->SetPCLNormalCalculator(normalCalculator);
The strangest thing is that the constructor builds executionPipeline correctly, placing it at memory address 0x0000020ef385e830, but when my C# managed code calls StartPipeline, the executionPipeline address has changed to 0xcdcdcdcdcdcdcdcd, and in Quick Watch the text <Unable to read memory> appears for its variables.
Does anyone have a clue what's going on?
With many thanks.
The 0xcdcdcdcdcdcdcdcd you are seeing is a special fill pattern written by the Visual Studio debug runtime to mark uninitialized heap memory. A more comprehensive list of such codes is available in this StackOverflow question. In brief, it seems your C# code is calling StartPipeline() on an invalid object. This could happen, for example, if the pointer is altered to point to a random location in heap memory. Make sure your C# code (and the runtime) is properly storing the pointer to the ScaperEngine object and not corrupting it along the way.

Occasional bad data error when decrypting

I have a very strange situation.
Basically I have code that uses a decryptor created by:
Dim des3 As New TripleDESCryptoServiceProvider
des3.Mode = CipherMode.CBC
Return des3.CreateDecryptor(_encKey, _initVec)
The _encKey and _initVec are hardcoded.
I use it by calling:
Dim res() As Byte = decrypt(Convert.FromBase64String(_data))
m_transformDec.TransformFinalBlock(res, 0, res.Length)
Here _data is a string containing the encrypted value. m_transformDec is the Decryptor created previously.
Usually this works. Occasionally, I get a "bad data" error. I print out the value of _data, and it is always the same.
The code is multithreaded, which I suspect is the reason both for the problem and for it being hard to reproduce. The decryptor is created when the class is constructed, and the decryption is done in a Shared function, but I don't see anything there that is not thread-safe.
Any ideas?
You should not assume anything is safe for concurrent calls unless you have reason to believe it is. The docs carry the boilerplate note that instance members are not guaranteed to be thread-safe, so you should defensively lock the des3 object (and the transform it created) while using it.
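For example, a minimal C# sketch of that defensive locking (your code is VB, but the pattern translates directly; the class and field names here are placeholders):

using System;
using System.Security.Cryptography;

internal static class DecryptorHolder
{
    private static readonly object SyncRoot = new object();
    private static ICryptoTransform _decryptor; // created once, shared across threads

    public static byte[] Decrypt(string data)
    {
        byte[] cipherBytes = Convert.FromBase64String(data);
        // TransformFinalBlock mutates the transform's internal state
        // (CBC chaining, buffered blocks), so concurrent calls on one
        // shared transform can corrupt each other's work.
        lock (SyncRoot)
        {
            return _decryptor.TransformFinalBlock(cipherBytes, 0, cipherBytes.Length);
        }
    }
}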
You should also not be hardcoding the initialization vector; it should be chosen randomly for each encryption and stored with the encrypted data (many people prepend it to the ciphertext, then strip it off and use it for decryption; use whatever storage scheme you prefer). Reusing the same IV defeats its purpose, which is to make plaintext attacks more difficult.
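As a sketch of that scheme (C# again; key handling and error checking are omitted, and the method name is made up):

using System;
using System.Security.Cryptography;

static byte[] EncryptWithRandomIV(byte[] plaintext, byte[] key)
{
    using (var des3 = new TripleDESCryptoServiceProvider())
    {
        des3.Mode = CipherMode.CBC;
        des3.Key = key;
        des3.GenerateIV(); // fresh, random IV for every message

        using (ICryptoTransform enc = des3.CreateEncryptor())
        {
            byte[] cipher = enc.TransformFinalBlock(plaintext, 0, plaintext.Length);

            // Prepend the IV so the decrypting side can strip it off
            // and use it; the IV is not secret, only unpredictable.
            byte[] output = new byte[des3.IV.Length + cipher.Length];
            Buffer.BlockCopy(des3.IV, 0, output, 0, des3.IV.Length);
            Buffer.BlockCopy(cipher, 0, output, des3.IV.Length, cipher.Length);
            return output;
        }
    }
}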

Copy bytes in memory to an Array in VB.NET

Unfortunately, I cannot resort to C# in my current project, so I'll have to solve this without the unsafe keyword.
I've got a bitmap, and I need to access the pixels and channel values directly. I'd like to go beyond Marshal.ReadByte() and Marshal.WriteByte() (and definitely beyond GetPixel and SetPixel).
Is there a way to put all the pixel data of the bitmap into a Byte array that works on both 32 and 64 bit systems? I want the exact same layout as the original bitmap, so the padding for each row (if it exists) also needs to be included.
Marshal doesn't seem to have something akin to:
byte[] ReadBytes(IntPtr start, int offset, int count)
Unless I totally missed it...
Any help greatly appreciated,
David
ps. So far all my images are in 32BppPArgb pixelformat.
Marshal does have a method that does exactly what you are asking for. See Marshal.Copy():
public static void Copy(
    IntPtr source,
    byte[] destination,
    int startIndex,
    int length
)

Copies data from an unmanaged memory pointer to a managed 8-bit unsigned integer array.
And there are overloads to go the other direction as well
Would something like this do? (untested):
Public Shared Function BytesFromBitmap(ByVal image As Drawing.Bitmap) As Byte()
    Using buffer As New IO.MemoryStream()
        image.Save(buffer, Drawing.Imaging.ImageFormat.Bmp)
        buffer.Position = 0 ' rewind before reading the stream back
        Using rdr As New IO.BinaryReader(buffer)
            Return rdr.ReadBytes(CInt(buffer.Length))
        End Using
    End Using
End Function
It won't let you manipulate the pixels in a Drawing.Bitmap object directly, but it will let you copy that bitmap to a byte array, as per the question title.
Another option is serialization via the BinaryFormatter, but I think that will still require you to pass it through a MemoryStream.
VB does not offer methods for direct memory access. You have two choices:
Use the Marshal class
Write a small unsafe C# (or C++/CLI) library that handles only these operations and reference it from your VB code.
Alright, there is a third option. VB.NET does not inherently support direct memory access, but it can be accomplished. It's just ugly and error-prone. Nonetheless, if you're willing to put in the effort, you can try building a bitmap access library using these techniques combined with the approach referenced previously.
shf301 is right on the money, but I'd like to add a link to a comprehensive explanation/tutorial on fast pixel data access. Rather than saving the image to a stream and accessing a file-in-memory, it is better to lock the bitmap, copy the pixel data out, access it, and copy it back in. The performance of this technique is pretty good.
The code there is in C#, but the approach is language-neutral and easy to read.
http://ilab.ahemm.org/tutBitmap.html
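For reference, a condensed sketch of that lock/copy/unlock round trip (written in C#; the same LockBits and Marshal.Copy calls are available from VB):

using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

static byte[] CopyPixels(Bitmap bmp)
{
    Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
    BitmapData data = bmp.LockBits(rect, ImageLockMode.ReadWrite, bmp.PixelFormat);
    try
    {
        // Stride includes any per-row padding, so this preserves the
        // exact in-memory layout, as the question requires.
        int byteCount = Math.Abs(data.Stride) * bmp.Height;
        byte[] pixels = new byte[byteCount];
        Marshal.Copy(data.Scan0, pixels, 0, byteCount);

        // ... manipulate pixels here ...

        // Copy the (possibly modified) buffer back into the bitmap.
        Marshal.Copy(pixels, 0, data.Scan0, byteCount);
        return pixels;
    }
    finally
    {
        bmp.UnlockBits(data);
    }
}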