I am using Zivid.NET, Halcon.NET and ML.NET together. Zivid provides me with a 3D byte array (row, column, channel), Halcon uses HImages/HObjects, and my ML.NET functionality expects a 1D byte array (the same as File.ReadAllBytes() returns).
So far I have used a workaround where:
I save()'d Zivid's imageRGBA as a PNG,
which I read with Halcon's read_image(), giving me an HObject.
After some graphical work I saved the HObject again as a PNG using write_image().
Using File.ReadAllBytes() to read that PNG, I get the byte[] that my ML.NET functionality expects.
But this is far from ideal with larger amounts of data.
What I need is:
a way to convert byte[r,c,c] images to HObject/HImage.
a way to convert HObject/HImage images to byte[].
Halcon's read_image() and write_image() don't seem to have any options for this and I haven't found anything helpful so far.
To create an HImage object from bytes, you need a pointer to the array, and then it's simple:
public HImage(string type, int width, int height, IntPtr pixelPointer)
To get a pointer to, and access, the data of an HImage, the following function is needed:
IntPtr HImage.GetImagePointer1(out string type, out int width, out int height)
To convert from Zivid.NET's byte[r,c,c] to HImage one can:
var byteArr = imgRGBA.ToByteArray();

// Split the interleaved [row, column, channel] data into one plane per colour channel.
// The dimensions are hard-coded for this image size (1200 rows x 1920 columns).
byte[,] redByteArray = new byte[1200, 1920];
byte[,] greenByteArray = new byte[1200, 1920];
byte[,] blueByteArray = new byte[1200, 1920];

for (int row = 0; row < 1200; row++)
{
    for (int col = 0; col < 1920; col++)
    {
        redByteArray[row, col] = byteArr[row, col, 0];
        greenByteArray[row, col] = byteArr[row, col, 1];
        blueByteArray[row, col] = byteArr[row, col, 2];
    }
}

// Pin the managed arrays so Halcon can read them through raw pointers.
GCHandle pinnedArray_red = GCHandle.Alloc(redByteArray, GCHandleType.Pinned);
IntPtr pointer_red = pinnedArray_red.AddrOfPinnedObject();
GCHandle pinnedArray_green = GCHandle.Alloc(greenByteArray, GCHandleType.Pinned);
IntPtr pointer_green = pinnedArray_green.AddrOfPinnedObject();
GCHandle pinnedArray_blue = GCHandle.Alloc(blueByteArray, GCHandleType.Pinned);
IntPtr pointer_blue = pinnedArray_blue.AddrOfPinnedObject();

// Build a three-channel byte image from the three pinned planes.
GenImage3(out HObject imgHImage, "byte", 1920, 1200, pointer_red, pointer_green, pointer_blue);

// Release the pinned handles afterwards.
pinnedArray_red.Free();
pinnedArray_green.Free();
pinnedArray_blue.Free();
I'm still working on the second part (HImage back to byte[]).
A better / more efficient method is very welcome.
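For the second direction, a possible starting point is the GetImagePointer1() call mentioned above combined with Marshal.Copy. This is only a minimal sketch, assuming the usual HalconDotNet namespace and a single-channel 8-bit image; for the RGB(A) case you would first have to split the HImage into its channels (for example with Halcon's decompose3 operator) and copy each plane separately. The helper name is made up for the example:

using System;
using System.Runtime.InteropServices;
using HalconDotNet;

// Hypothetical helper: copies a single-channel "byte" HImage into a managed byte[].
public static byte[] HImageToByteArray(HImage image)
{
    // GetImagePointer1 returns a pointer to Halcon's internal pixel buffer
    // together with the pixel type and the image dimensions.
    IntPtr pixelPointer = image.GetImagePointer1(out string type, out int width, out int height);

    if (type != "byte")
        throw new NotSupportedException("This sketch only handles 8-bit images.");

    // Copy the unmanaged buffer into a managed 1D array (row-major, width * height bytes).
    byte[] result = new byte[width * height];
    Marshal.Copy(pixelPointer, result, 0, result.Length);
    return result;
}

Note that this returns raw pixel data, not an encoded PNG, so it is worth checking whether your ML.NET code expects file bytes or pixel bytes.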
I'm using a for loop to determine whether a long double is an integer. The inner for loop runs another long double from 2 up to final^(1/2). final is computed in an outer loop and is basically 2 raised to successive powers, minus 1. I am then checking whether final divided by the inner value is an integer. My question is: how can I get only the final values that are integers?
My explanation may have been a bit confusing, so here is my entire loop code. BTW, I am using long doubles because I plan to make these numbers much larger.
for (long double ld = 1; ld < 10; ld++) {
    long double final = powl(2, ld) - 1;
    //Would return e.g. 1, 3, 7, 15, 31, 63...etc.
    for (long double pD = 2; pD <= powl(final, 0.5); pD++) {
        //Create new long double
        long double newFinal = final / pD;
        //Check if new long double is int
        long int intPart = (long int)newFinal;
        long double newLong = newFinal - intPart;
        if (newLong == 0) {
            NSLog(@"Integer");
            //Return only the final ints?
        }
    }
}
Just cast it to an int and subtract it from itself?
long double d;
//assign a value to d
int i = (int)d;
if ((double)(d - i) == 0) {
    //d has no fractional part
}
As a note... because of the way floating point math works in programming, this == check isn't necessarily the best thing to do. Better would be to decide on a certain level of tolerance, and check whether d was within that tolerance.
For example:
if (fabs((double)(d - i)) < 0.000001) {
    //d's fractional part is close enough to 0 for your purposes
}
You can also use long long int and long double to accomplish the same thing. Just be sure you're using the right absolute value function for whatever type you're using:
fabsf(float)
fabs(double)
fabsl(long double)
EDIT... Based on clarification of the actual problem... it seems you're just trying to figure out how to return a collection from a method.
-(NSMutableArray*)yourMethodName {
    NSMutableArray *retnArr = [NSMutableArray array];
    for (/*some loop logic*/) {
        // logic to determine if the number is an int
        if (/*number is an int*/) {
            [retnArr addObject:[NSNumber numberWithInt:/*current number*/]];
        }
    }
    return retnArr;
}
Put your logic into this method. Once you've found a number you want to return, add it to the array using the [retnArr addObject:[NSNumber numberWithInt:...]] call shown above.
Once you've returned the array, access the numbers like this:
[[arrReturnedFromMethod objectAtIndex:someIndex] intValue];
Optionally, you might want to throw them into the NSNumber object as different types.
You can also use:
[NSNumber numberWithDouble:]
[NSNumber numberWithLongLong:]
And there are matching getters (doubleValue,longLongValue) to extract the number. There are lots of other methods for NSNumber, but these seem the most likely you'd want to be using.
I was recently asked to complete a task for a C++ role; however, since my application was not progressed any further, I thought I would post it here for some feedback / advice / improvements / a reminder of concepts I've forgotten.
The task was:
The following data is a time series of integer values
int timeseries[32] = {67497, 67376, 67173, 67235, 67057, 67031, 66951,
66974, 67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149, 67044,
67012, 67220, 67239, 66893, 66984, 66866, 66693, 66770, 66722, 66620,
66579, 66596, 66713, 66852, 66715};
The series might be, for example, the closing price of a stock each day
over a 32 day period.
As stored above, the data will occupy 32 x sizeof(int) bytes = 128 bytes
assuming 4 byte ints.
Using delta encoding, write a function to compress, and a function to
uncompress, data like the above.
OK, so before this point I had never looked at compression, so my solution is far from perfect. The way I approached the problem was to compress the array of integers into an array of bytes. When representing each integer as bytes I calculate the most significant byte (MSB) and keep everything up to that point, throwing the rest away. This is then added to the byte array. For negative values I increment the MSB count by 1 so that we can differentiate between positive and negative values when decoding, by keeping the leading 1 bits.
When decoding I parse this jagged byte array and simply reverse the actions performed when compressing. As mentioned, I had never looked at compression prior to this task, so I came up with my own method to compress the data. I had been looking at C++/CLI recently, had not really used it before, and decided to write it in this language for no particular reason. Below is the class, with a unit test at the very bottom. Any advice / improvements / enhancements will be much appreciated.
Thanks.
array<array<Byte>^>^ CDeltaEncoding::CompressArray(array<int>^ data)
{
    int temp = 0;
    int original;
    int size = 0;
    array<int>^ tempData = gcnew array<int>(data->Length);
    data->CopyTo(tempData, 0);
    array<array<Byte>^>^ byteArray = gcnew array<array<Byte>^>(tempData->Length);
    for (int i = 0; i < tempData->Length; ++i)
    {
        original = tempData[i];
        tempData[i] -= temp;
        temp = original;
        int msb = GetMostSignificantByte(tempData[i]);
        byteArray[i] = gcnew array<Byte>(msb);
        System::Buffer::BlockCopy(BitConverter::GetBytes(tempData[i]), 0, byteArray[i], 0, msb);
        size += byteArray[i]->Length;
    }
    return byteArray;
}
array<int>^ CDeltaEncoding::DecompressArray(array<array<Byte>^>^ buffer)
{
    System::Collections::Generic::List<int>^ decodedArray = gcnew System::Collections::Generic::List<int>();
    int temp = 0;
    for (int i = 0; i < buffer->Length; ++i)
    {
        int retrievedVal = GetValueAsInteger(buffer[i]);
        decodedArray->Add(retrievedVal);
        decodedArray[i] += temp;
        temp = decodedArray[i];
    }
    return decodedArray->ToArray();
}
int CDeltaEncoding::GetMostSignificantByte(int value)
{
    array<Byte>^ tempBuf = BitConverter::GetBytes(Math::Abs(value));
    int msb = tempBuf->Length;
    for (int i = tempBuf->Length - 1; i >= 0; --i)
    {
        if (tempBuf[i] != 0)
        {
            msb = i + 1;
            break;
        }
    }
    if (!IsPositiveInteger(value))
    {
        //We need an extra byte to differentiate the negative integers
        msb++;
    }
    return msb;
}

bool CDeltaEncoding::IsPositiveInteger(int value)
{
    return value / Math::Abs(value) == 1;
}
int CDeltaEncoding::GetValueAsInteger(array<Byte>^ buffer)
{
    array<Byte>^ tempBuf;
    if (buffer->Length % 2 == 0)
    {
        //With even integers there is no need to allocate a new byte array
        tempBuf = buffer;
    }
    else
    {
        tempBuf = gcnew array<Byte>(4);
        System::Buffer::BlockCopy(buffer, 0, tempBuf, 0, buffer->Length);
        unsigned int val = buffer[buffer->Length - 1] &= 0xFF;
        if (val == 0xFF)
        {
            //We have a negative integer compressed into 3 bytes
            //Copy over this last byte as well so we keep the negative pattern
            System::Buffer::BlockCopy(buffer, buffer->Length - 1, tempBuf, buffer->Length, 1);
        }
    }
    switch (tempBuf->Length)
    {
    case sizeof(short):
        return BitConverter::ToInt16(tempBuf, 0);
    case sizeof(int):
    default:
        return BitConverter::ToInt32(tempBuf, 0);
    }
}
And then in a test class I had:
void CTestDeltaEncoding::TestCompression()
{
    array<array<Byte>^>^ byteArray = CDeltaEncoding::CompressArray(m_testdata);
    array<int>^ decompressedArray = CDeltaEncoding::DecompressArray(byteArray);
    int totalBytes = 0;
    for (int i = 0; i < byteArray->Length; i++)
    {
        totalBytes += byteArray[i]->Length;
    }
    Assert::IsTrue(m_testdata->Length * sizeof(m_testdata) > totalBytes, "Expected the total bytes to be less than the original array!!");
    //Expected totalBytes = 53
}
This smells a lot like homework to me. The crucial phrase is: "Using delta encoding."
Delta encoding means you encode the delta (difference) between each number and the next:
67497, 67376, 67173, 67235, 67057, 67031, 66951, 66974, 67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149, 67044, 67012, 67220, 67239, 66893, 66984, 66866, 66693, 66770, 66722, 66620, 66579, 66596, 66713, 66852, 66715
would turn into:
[Base: 67497]: -121, -203, +62
and so on. Assuming 8-bit bytes, the original numbers require 3 bytes apiece (and given the number of compilers with 3-byte integer types, you're normally going to end up with 4 bytes apiece). From the looks of things, the differences will fit quite easily in 2 bytes apiece, and if you can ignore one (or possibly two) of the least significant bits, you can fit them in one byte apiece.
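To make that concrete, here is a minimal sketch of plain (lossless) delta encoding, written in C# rather than the poster's C++/CLI and not using the poster's variable-length byte scheme: it stores the first value in full and every subsequent value as a 16-bit delta, which assumes all the differences fit in a short (true for the series above):

using System.IO;

static class DeltaCodec
{
    // Encode: the first value as a full 32-bit int, then each difference as a 16-bit short.
    public static byte[] Compress(int[] values)
    {
        using var ms = new MemoryStream();
        using var writer = new BinaryWriter(ms);
        writer.Write(values[0]);                          // base value, 4 bytes
        for (int i = 1; i < values.Length; i++)
        {
            int delta = values[i] - values[i - 1];
            writer.Write(checked((short)delta));          // 2 bytes per delta; throws if it doesn't fit
        }
        writer.Flush();
        return ms.ToArray();
    }

    // Decode: read the base value, then add each stored delta back on.
    public static int[] Decompress(byte[] data, int count)
    {
        using var reader = new BinaryReader(new MemoryStream(data));
        int[] values = new int[count];
        values[0] = reader.ReadInt32();
        for (int i = 1; i < count; i++)
            values[i] = values[i - 1] + reader.ReadInt16();
        return values;
    }
}

For the 32-value series that is 4 + 31*2 = 66 bytes instead of 128.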
Delta encoding is most often used for things like sound encoding where you can "fudge" the accuracy at times without major problems. For example, if you have a change from one sample to the next that's larger than you've left space to encode, you can encode a maximum change in the current difference, and add the difference to the next delta (and if you don't mind some back-tracking, you can distribute some to the previous delta as well). This will act as a low-pass filter, limiting the gradient between samples.
For example, in the series you gave, a simple delta encoding requires ten bits to represent all the differences. By dropping the LSB, however, nearly all the samples (all but one, in fact) can be encoded in 8 bits. That one has a difference (right shifted one bit) of -173, so if we represent it as -128, we have 45 left. We can distribute that error evenly between the preceding and following sample. In that case, the output won't be an exact match for the input, but if we're talking about something like sound, the difference probably won't be particularly obvious.
I did mention that it was an exercise I had to complete, and my solution was deemed not good enough, so I wanted some constructive feedback, seeing as companies never actually tell you what you did wrong.
When the array is compressed I store the differences, not the original values (except for the first one), as that was my understanding of delta encoding. If you look at my code you'll see I provided a full solution; my question was how bad it was.
I am trying to compare two long byte arrays in VB.NET and have run into a snag. Comparing two 50 megabyte files takes almost two minutes, so I'm clearly doing something wrong. I'm on an x64 machine with tons of memory, so there are no issues there. Here is the code that I'm using at the moment and would like to change.
_Bytes and item.Bytes are the two different arrays to compare and are already the same length.
For Each B In item.Bytes
    If B <> _Bytes(I) Then
        Mismatch = True
        Exit For
    End If
    I += 1
Next
I need to be able to compare, as fast as possible, files that are potentially hundreds of megabytes and possibly even a gigabyte or two. Any suggestions or algorithms that would do this faster?
Item.Bytes is an object taken from the database/filesystem and returned for comparison because its byte length matches the item the user wants to add. By comparing the two arrays I can determine whether the user has added something new to the DB; if not, I can just map them to the existing file and not waste hard disk space.
[Update]
I converted the arrays to local variables of Byte() and then did the same comparison, same code, and it ran in about one second (I still have to benchmark it properly and compare it to the other suggestions). But if you do the same thing with local variables and use a generic array it becomes massively slower. I'm not sure why, but it raises a lot more questions for me about the use of arrays.
What is the _Bytes(I) call doing? It's not loading the file each time, is it? Even with buffering, that would be bad news!
There will be plenty of ways to micro-optimise this in terms of looking at longs at a time, potentially using unsafe code etc - but I'd just concentrate on getting reasonable performance first. Clearly there's something very odd going on.
I suggest you extract the comparison code into a separate function which takes two byte arrays. That way you know you won't be doing anything odd. I'd also use a simple For loop rather than For Each in this case - it'll be simpler. Oh, and check whether the lengths are correct first :)
EDIT: Here's the code (untested, but simple enough) that I'd use. It's in C# for the minute - I'll convert it in a sec:
public static bool Equals(byte[] first, byte[] second)
{
    if (first == second)
    {
        return true;
    }
    if (first == null || second == null)
    {
        return false;
    }
    if (first.Length != second.Length)
    {
        return false;
    }
    for (int i = 0; i < first.Length; i++)
    {
        if (first[i] != second[i])
        {
            return false;
        }
    }
    return true;
}
EDIT: And here's the VB:
Public Shared Function ArraysEqual(ByVal first As Byte(), _
                                   ByVal second As Byte()) As Boolean
    If (first Is second) Then
        Return True
    End If
    If (first Is Nothing OrElse second Is Nothing) Then
        Return False
    End If
    If (first.Length <> second.Length) Then
        Return False
    End If
    For i As Integer = 0 To first.Length - 1
        If (first(i) <> second(i)) Then
            Return False
        End If
    Next i
    Return True
End Function
The fastest way to compare two byte arrays of equal size is to use interop. Run the following code in a console application:
using System;
using System.Runtime.InteropServices;
using System.Security;

namespace CompareByteArray
{
    class Program
    {
        static void Main(string[] args)
        {
            const int SIZE = 100000;
            const int TEST_COUNT = 100;

            byte[] arrayA = new byte[SIZE];
            byte[] arrayB = new byte[SIZE];
            for (int i = 0; i < SIZE; i++)
            {
                arrayA[i] = 0x22;
                arrayB[i] = 0x22;
            }

            {
                DateTime before = DateTime.Now;
                for (int i = 0; i < TEST_COUNT; i++)
                {
                    int result = MemCmp_Safe(arrayA, arrayB, (UIntPtr)SIZE);
                    if (result != 0) throw new Exception();
                }
                DateTime after = DateTime.Now;
                Console.WriteLine("MemCmp_Safe: {0}", after - before);
            }
            {
                DateTime before = DateTime.Now;
                for (int i = 0; i < TEST_COUNT; i++)
                {
                    int result = MemCmp_Unsafe(arrayA, arrayB, (UIntPtr)SIZE);
                    if (result != 0) throw new Exception();
                }
                DateTime after = DateTime.Now;
                Console.WriteLine("MemCmp_Unsafe: {0}", after - before);
            }
            {
                DateTime before = DateTime.Now;
                for (int i = 0; i < TEST_COUNT; i++)
                {
                    int result = MemCmp_Pure(arrayA, arrayB, SIZE);
                    if (result != 0) throw new Exception();
                }
                DateTime after = DateTime.Now;
                Console.WriteLine("MemCmp_Pure: {0}", after - before);
            }
            return;
        }
[DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl, EntryPoint="memcmp", ExactSpelling=true)]
[SuppressUnmanagedCodeSecurity]
static extern int memcmp_1(byte[] b1, byte[] b2, UIntPtr count);
[DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl, EntryPoint = "memcmp", ExactSpelling = true)]
[SuppressUnmanagedCodeSecurity]
static extern unsafe int memcmp_2(byte* b1, byte* b2, UIntPtr count);
public static int MemCmp_Safe(byte[] a, byte[] b, UIntPtr count)
{
return memcmp_1(a, b, count);
}
public unsafe static int MemCmp_Unsafe(byte[] a, byte[] b, UIntPtr count)
{
fixed(byte* p_a = a)
{
fixed (byte* p_b = b)
{
return memcmp_2(p_a, p_b, count);
}
}
}
        public static int MemCmp_Pure(byte[] a, byte[] b, int count)
        {
            int result = 0;
            for (int i = 0; i < count && result == 0; i += 1)
            {
                result = a[i] - b[i];
            }
            return result;
        }
    }
}
If you don't need to know which byte differs, use 64-bit ints, which gives you 8 bytes at once. And you can still figure out the offending byte afterwards, once you've isolated it to a group of 8.
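A rough sketch of that idea, assuming the two arrays are non-null and the same length; it compares 8 bytes per iteration via BitConverter and handles the leftover tail byte by byte:

using System;

// Sketch only: assumes a.Length == b.Length and both arrays are non-null.
public static bool ArraysEqual64(byte[] a, byte[] b)
{
    int i = 0;
    int last = a.Length - (a.Length % 8);

    // Main loop: compare 64 bits per iteration.
    for (; i < last; i += 8)
    {
        if (BitConverter.ToInt64(a, i) != BitConverter.ToInt64(b, i))
            return false;   // a mismatch lies somewhere in these 8 bytes
    }

    // Tail: compare the remaining 0-7 bytes individually.
    for (; i < a.Length; i++)
    {
        if (a[i] != b[i])
            return false;
    }
    return true;
}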
Use BinaryReader:
saveTime = binReader.ReadInt32()
Or, to read a block of values into a buffer array:
Dim count As Integer = binReader.Read(testArray, 0, 3)
A better approach: if you are just trying to see whether the two are different, you can generate a hash of each byte array (as a string, say) and compare the hashes instead of walking both arrays byte by byte each time. MD5 should work fine and is pretty efficient.
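A minimal sketch of the hashing idea using the framework's MD5 class (the method name is just for the example; note that computing a hash still reads every byte, so the real saving comes when one of the hashes can be precomputed and stored, e.g. in the database):

using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper: compares two byte arrays via their MD5 hashes.
public static bool HashesMatch(byte[] first, byte[] second)
{
    using (var md5 = MD5.Create())
    {
        byte[] hashA = md5.ComputeHash(first);
        byte[] hashB = md5.ComputeHash(second);
        return hashA.SequenceEqual(hashB);   // 16-byte hashes, cheap to compare
    }
}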
I see two things that might help:
First, rather than always accessing the second array as item.Bytes, use a local variable to point directly at the array. That is, before starting the loop, do something like this:
array2 = item.Bytes
That will save the overhead of dereferencing from the object each time you want a byte. That could be expensive in Visual Basic, especially if there's a Getter method on that property.
Also, use a "definite loop" instead of "for each". You already know the length of the arrays, so just code the loop using that value. This will avoid the overhead of treating the array as a collection. The loop would look something like this:
For i = 0 To max - 1
    If array1(i) <> array2(i) Then
        Exit For
    End If
Next
Not strictly related to the comparison algorithm:
Are you sure your bottleneck is not related to the memory available and the time used to load the byte arrays? Loading two 2 GB byte arrays just to compare them could bring most machines to their knees. If the program design allows, try using streams to read smaller chunks instead.
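A sketch of that chunked approach; the 64 KB buffer size and the method name are arbitrary choices for the example, and it never holds more than two buffers in memory:

using System.IO;
using System.Linq;

// Compares two files in fixed-size chunks instead of loading them whole.
public static bool FilesEqual(string pathA, string pathB)
{
    const int BufferSize = 64 * 1024;   // arbitrary; tune for your workload
    var bufA = new byte[BufferSize];
    var bufB = new byte[BufferSize];

    using (var streamA = File.OpenRead(pathA))
    using (var streamB = File.OpenRead(pathB))
    {
        if (streamA.Length != streamB.Length)
            return false;   // different sizes can't be equal

        int readA;
        while ((readA = streamA.Read(bufA, 0, BufferSize)) > 0)
        {
            // In practice FileStream fills the buffer except on the last read.
            int readB = streamB.Read(bufB, 0, BufferSize);
            // Compare only the bytes actually read in this round.
            if (readB != readA || !bufA.Take(readA).SequenceEqual(bufB.Take(readB)))
                return false;
        }
    }
    return true;
}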
I am coding VB.NET in VS2008.
I have a comma-delimited string of numbers, e.g. 16,7,99,1456,1,3
I do this in VB:
Dim MyArr() As String = MyString.Split(",")
Will MyArr keep the items in the order they were in the string?
If I do this:
For Each S As String In MyString.Split(",")
    'Do something with S
    'Will my items be in the same order they were
    'in the string?
Next
I tested it and it appears to keep the sort order but will it ~always~ keep the order?
If it does not maintain the order then what is a good way to split a string and keep order?
I'm asking because MSDN Array documentation says: "The Array is not guaranteed to be sorted." So I'm a bit unsure.
Yes, in your example the items will stay in the original order.
The MSDN documentation indicates that an Array is not necessarily sorted just because it's an Array, but once the items are in the Array, they won't be rearranged. The Split() operation will break it down based on the given token, preserving the order.
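A quick illustration (in C#, though VB's String.Split behaves the same way); the variable names are just for the example:

string input = "16,7,99,1456,1,3";
string[] parts = input.Split(',');

// The element order matches the left-to-right order in the original string:
// parts[0] == "16", parts[1] == "7", parts[2] == "99",
// parts[3] == "1456", parts[4] == "1", parts[5] == "3"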
Yes, order will be maintained for these operations.
Yes. String.Split walks down the string, so everything will stay in order. From .NET Reflector:
private string[] InternalSplitKeepEmptyEntries(int[] sepList, int[] lengthList, int numReplaces, int count)
{
    int startIndex = 0;
    int index = 0;
    count--;
    int num3 = (numReplaces < count) ? numReplaces : count;
    string[] strArray = new string[num3 + 1];
    for (int i = 0; (i < num3) && (startIndex < this.Length); i++)
    {
        strArray[index++] = this.Substring(startIndex, sepList[i] - startIndex);
        startIndex = sepList[i] + ((lengthList == null) ? 1 : lengthList[i]);
    }
    if ((startIndex < this.Length) && (num3 >= 0))
    {
        strArray[index] = this.Substring(startIndex);
        return strArray;
    }
    if (index == num3)
    {
        strArray[index] = Empty;
    }
    return strArray;
}
In .NET strings are immutable objects. Long story short, the string S and those returned by Split(",") live in different memory.