How to bitwise-and CFBitVector - objective-c

I have two instances of CFMutableBitVector, like so:
CFBitVectorRef ref1, ref2;
How can I do bit-wise operations to these guys? For right now, I only care about and, but obviously xor, or, etc would be useful to know.
Obviously I can iterate through the bits in the vector, but that seems silly when I'm working at the bit level. I feel like there are just some Core Foundation functions that I'm missing, but I can't find them.
Thanks,
Kurt

Well a
CFBitVectorRef
is a
typedef const struct __CFBitVector *CFBitVectorRef;
which is a
struct __CFBitVector {
CFRuntimeBase _base;
CFIndex _count; /* number of bits */
CFIndex _capacity; /* maximum number of bits */
__CFBitVectorBucket *_buckets;
};
Where
/* The bucket type must be unsigned, at least one byte in size, and
a power of 2 in number of bits; bits are numbered from 0 from left
to right (bit 0 is the most significant) */
typedef uint8_t __CFBitVectorBucket;
So you can dive in a do byte wise operations which could speed things up. Of course being non-mutable might hinder things a bit :D

Related

how do I divide large number into two smaller integers and then reassemble the large number?

i have tried the below but do not seem to get the correct value in the end:
I have a number that may be larger than 32bit and hence I want to store it into two 32 bit array indices.
I broke them up like:
int[0] = lgval%(2^32);
int[1] = lgval/(2^32);
and reassembling the 64bit value I tried like:
CPU: PowerPC e500v2
lgval= ((uint64)int[0]) | (((uint64)int[1])>>32);
mind shift to right since we're on big endian. For some reason I do not get the correct value at the end, why not? What am I doing wrong here?
The ^ operator is xor, not power.
The way you want to do this is probably:
uint32_t split[2];
uint64_t lgval;
/* ... */
split[0] = lgval & 0xffffffff;
split[1] = lgval >> 32;
/* code to operate on your 32-bit array elements goes here */
lgval = ((uint64_t)split[1] << 32) | (uint64_t)(split[0]);
As Raymond Chen has mentioned, endianness is about storage. In this case, you only need to consider endianness if you want to access the bytes in your split-32-bit-int as a single 64-bit value. This probably isn't a good idea anyway.

Issue precision CGAL's Delaunay triangulation

I am having an issue that is probably due to the mismatch between the precision of a double and the infinite precision given in CGAL, but I cannot seem to solve it, and do not see anyway to set a tolerance.
I input a set of points (initially the locations are doubles).
When the points are aligned horizontally on the upper part (and only happens there) I sometimes (not always) get an issue of too many triangles (with a really small area, almost 0) being generated: see image
(*notice in the upper section, how there is a line that seems to be thicker than the rest, because there are at least 3 triangles there).
This is what I am doing in my code, I tried to set the kernel to handle the imprecision of doubles:
typedef CGAL::Simple_cartesian<double> CK;
typedef CGAL::Filtered_kernel<CK> K;
//typedef CGAL::Exact_predicates_inexact_constructions_kernel K;
typedef K::FT FT;
typedef K::Point_2 Point;
typedef K::Segment_2 Segment;
typedef CGAL::Polygon_2<K> Polygon_2;
typedef CGAL::Triangulation_vertex_base_with_info_2<unsigned long, K> Vb2;
typedef CGAL::Triangulation_data_structure_2<Vb2,Fb> Tds2;
typedef CGAL::Delaunay_triangulation_2<K,Tds2> Delaunay;
std::vector<std::vector<long> > Geometry::delaunay(std::vector<double> xs,std::vector<double> ys){
std::vector<Point> points;
std::vector<unsigned long> indices;
points.resize(xs.size());
indices.resize(xs.size());
for(long i=0;i<xs.size();i++){
indices[i]=i;
points[i]=Point(xs[i],ys[i]);
}
std::vector<long> idAs;
std::vector<long> idBs;
Delaunay dt;
dt.insert( boost::make_zip_iterator(boost::make_tuple( points.begin(),indices.begin() )),boost::make_zip_iterator(boost::make_tuple( points.end(),indices.end() ) ) );
for(Delaunay::Finite_edges_iterator it = dt.finite_edges_begin(); it != dt.finite_edges_end(); ++it)
{
Delaunay::Edge e=*it;
long i1= e.first->vertex( (e.second+1)%3 )->info();
long i2= e.first->vertex( (e.second+2)%3 )->info();
idAs.push_back(i1);
idBs.push_back(i2);
}
std::vector<std::vector<long> > result;
result.resize(2);
result[0]=(idAs);
result[1]=(idBs);
return result;
}
I am completely new to CGAL, and this code is something that I have been able to put together after a lot of looking up in web pages in the last 2 days. So if there is something else that might be improved, please, do not hesitate to mention it, the sintaxis of CGAL is not really straightforward.
*the code works perfectly for random points, and 70% of the times even for points that are aligned, but the other 30% worries me.
THE QUESTION IS, how can I set a tolerance, so CGAL does not generate triangles on top of points that are almost almost aligned??, or is there a better kernel for this? (as you see I tried also the Exact_predicates_inexact_constructions_kernel, but it is even worst).

Memcpy and Memset on structures of Short Type in C

I have a query about using memset and memcopy on structures and their reliablity. For eg:
I have a code looks like this
typedef struct
{
short a[10];
short b[10];
}tDataStruct;
tDataStruct m,n;
memset(&m, 2, sizeof(m));
memcpy(&n,&m,sizeof(m));
My question is,
1): in memset if i set to 0 it is fine. But when setting 2 i get m.a and m.b as 514 instead of 2. When I make them as char instead of short it is fine. Does it mean we cannot use memset for any initialization other than 0? Is it a limitation on short for eg
2): Is it reliable to do memcopy between two structures above of type short. I have a huge
strings of a,b,c,d,e... I need to make sure copy is perfect one to one.
3): Am I better off using memset and memcopy on individual arrays rather than collecting in a structure as above?
One more query,
In the structue above i have array of variables. But if I am passed pointer to these arrays
and I want to collect these pointers in a structure
typedef struct
{
short *pa[10];
short *pb[10];
}tDataStruct;
tDataStruct m,n;
memset(&m, 2, sizeof(m));
memcpy(&n,&m,sizeof(m));
In this case if i or memset of memcopy it only changes the address rather than value. How do i change the values instead? Is the prototype wrong?
Please suggest. Your inputs are very imp
Thanks
dsp guy
memset set's bytes, not shorts. always. 514 = (256*2) + (1*2)... 2s appearing on byte boundaries.
1.a. This does, admittedly, lessen it's usefulness for purposes such as you're trying to do (array fill).
reliable as long as both structs are of the same type. Just to be clear, these structures are NOT of "type short" as you suggest.
if I understand your question, I don't believe it matters as long as they are of the same type.
Just remember, these are byte level operations, nothing more, nothing less. See also this.
For the second part of your question, try
memset(m.pa, 0, sizeof(*(m.pa));
memset(m.pb, 0, sizeof(*(m.pb));
Note two operations to copy from two different addresses (m.pa, m.pb are effectively addresses as you recognized). Note also the sizeof: not sizeof the references, but sizeof what's being referenced. Similarly for memcopy.

How do I implement a bit array in C / Objective C

iOS / Objective-C: I have a large array of boolean values.
This is an inefficient way to store these values – at least eight bits are used for each element when only one is needed.
How can I optimise?
see CFMutableBitVector/CFBitVector for a CFType option
Try this:
#define BITOP(a,b,op) \
((a)[(size_t)(b)/(8*sizeof *(a))] op ((size_t)1<<((size_t)(b)%(8*sizeof *(a)))))
Then for any array of unsigned integer elements no larger than size_t, the BITOP macro can access the array as a bit array. For example:
unsigned char array[16] = {0};
BITOP(array, 40, |=); /* sets bit 40 */
BITOP(array, 41, ^=); /* toggles bit 41 */
if (BITOP(array, 42, &)) return 0; /* tests bit 42 */
BITOP(array, 43, &=~); /* clears bit 43 */
etc.
You use the bitwise logical operations and bit-shifting. (A Google search for these terms might give you some examples.)
Basically you declare an integer type (including int, char, etc.), then you "shift" integer values to the bit you want, then you do an OR or an AND with the integer.
Some quick illustrative examples (in C++):
inline bool bit_is_on(int bit_array, int bit_number)
{
return ((bit_array) & (1 << bit_number)) ? true : false;
}
inline void set_bit(int &bit_array, int bit_number)
{
bit_array |= (1 << bit_number);
}
inline void clear_bit(int &bit_array, int bit_number)
{
bit_array &= ~(1 << bit_number);
}
Note that this provides "bit arrays" of constant size (sizeof(int) * 8 bits). Maybe that's OK for you, or maybe you will want to build something on top of this. (Or re-use whatever some library provides.)
This will use less memory than bool arrays... HOWEVER... The code the compiler generates to access these bits will be larger and slower. So unless you have a large number of objects that need to contain these bit arrays, it might have a net-negative impact on both speed and memory usage.
#define BITOP(a,b,op) \
((a)[(size_t)(b)/(8*sizeof *(a))] op (size_t)1<<((size_t)(b)%(8*sizeof *(a))))
will not work ...
Fix:
#define BITOP(a,b,op) \
((a)[(size_t)(b)/(8*sizeof *(a))] op ((size_t)1<<((size_t)(b)%(8*sizeof *(a)))))
I came across this question as I am writing a bit array framework that is intent to manage large amounts of 'bits' similar to Java BitSet. I was looking to see if the name I decided on was in conflict with other Objective-C frameworks.
Anyway, I'm just starting this and am deciding whether to post it on SourceForge or other open source hosting sites.
Let me know if you are interested
Edit: I've created the project, called BitArray, on SourceForge. The source is in the SF SVN repository and I've also uploaded a compiled framework. This LINK will get your there.
Frank

Is there a practical limit to the size of bit masks?

There's a common way to store multiple values in one variable, by using a bitmask. For example, if a user has read, write and execute privileges on an item, that can be converted to a single number by saying read = 4 (2^2), write = 2 (2^1), execute = 1 (2^0) and then add them together to get 7.
I use this technique in several web applications, where I'd usually store the variable into a field and give it a type of MEDIUMINT or whatever, depending on the number of different values.
What I'm interested in, is whether or not there is a practical limit to the number of values you can store like this? For example, if the number was over 64, you couldn't use (64 bit) integers any more. If this was the case, what would you use? How would it affect your program logic (ie: could you still use bitwise comparisons)?
I know that once you start getting really large sets of values, a different method would be the optimal solution, but I'm interested in the boundaries of this method.
Off the top of my head, I'd write a set_bit and get_bit function that could take an array of bytes and a bit offset in the array, and use some bit-twiddling to set/get the appropriate bit in the array. Something like this (in C, but hopefully you get the idea):
// sets the n-th bit in |bytes|. num_bytes is the number of bytes in the array
// result is 0 on success, non-zero on failure (offset out-of-bounds)
int set_bit(char* bytes, unsigned long num_bytes, unsigned long offset)
{
// make sure offset is valid
if(offset < 0 || offset > (num_bytes<<3)-1) { return -1; }
//set the right bit
bytes[offset >> 3] |= (1 << (offset & 0x7));
return 0; //success
}
//gets the n-th bit in |bytes|. num_bytes is the number of bytes in the array
// returns (-1) on error, 0 if bit is "off", positive number if "on"
int get_bit(char* bytes, unsigned long num_bytes, unsigned long offset)
{
// make sure offset is valid
if(offset < 0 || offset > (num_bytes<<3)-1) { return -1; }
//get the right bit
return (bytes[offset >> 3] & (1 << (offset & 0x7));
}
I've used bit masks in filesystem code where the bit mask is many times bigger than a machine word. think of it like an "array of booleans";
(journalling masks in flash memory if you want to know)
many compilers know how to do this for you. Adda bit of OO code to have types that operate senibly and then your code starts looking like it's intent, not some bit-banging.
My 2 cents.
With a 64-bit integer, you can store values up to 2^64-1, 64 is only 2^6. So yes, there is a limit, but if you need more than 64-its worth of flags, I'd be very interested to know what they were all doing :)
How many states so you need to potentially think about? If you have 64 potential states, the number of combinations they can exist in is the full size of a 64-bit integer.
If you need to worry about 128 flags, then a pair of bit vectors would suffice (2^64 * 2).
Addition: in Programming Pearls, there is an extended discussion of using a bit array of length 10^7, implemented in integers (for holding used 800 numbers) - it's very fast, and very appropriate for the task described in that chapter.
Some languages ( I believe perl does, not sure ) permit bitwise arithmetic on strings. Giving you a much greater effective range. ( (strlen * 8bit chars ) combinations )
However, I wouldn't use a single value for superimposition of more than one /type/ of data. The basic r/w/x triplet of 3-bit ints would probably be the upper "practical" limit, not for space efficiency reasons, but for practical development reasons.
( Php uses this system to control its error-messages, and I have already found that its a bit over-the-top when you have to define values where php's constants are not resident and you have to generate the integer by hand, and to be honest, if chmod didn't support the 'ugo+rwx' style syntax I'd never want to use it because i can never remember the magic numbers )
The instant you have to crack open a constants table to debug code you know you've gone too far.
Old thread, but it's worth mentioning that there are cases requiring bloated bit masks, e.g., molecular fingerprints, which are often generated as 1024-bit arrays which we have packed in 32 bigint fields (SQL Server not supporting UInt32). Bit wise operations work fine - until your table starts to grow and you realize the sluggishness of separate function calls. The binary data type would work, were it not for T-SQL's ban on bitwise operators having two binary operands.
For example .NET uses array of integers as an internal storage for their BitArray class.
Practically there's no other way around.
That being said, in SQL you will need more than one column (or use the BLOBS) to store all the states.
You tagged this question SQL, so I think you need to consult with the documentation for your database to find the size of an integer. Then subtract one bit for the sign, just to be safe.
Edit: Your comment says you're using MySQL. The documentation for MySQL 5.0 Numeric Types states that the maximum size of a NUMERIC is 64 or 65 digits. That's 212 bits for 64 digits.
Remember that your language of choice has to be able to work with those digits, so you may be limited to a 64-bit integer anyway.