Arbitrary precision integer programming solvers? - bignum

Are there any free solvers capable of solving integer programming problems with large numbers (at least 336 bits)? All the solvers I've looked at appear to assume only double precision, and I haven't been able to find any that claim arbitrary precision.

The only one that I know of is SCIP - it's a really good solver, and they have a beta version that does exact arithmetic, although I haven't tried that side of SCIP yet:
http://scip.zib.de/exactmip.shtml
Also, the people working on SCIP all seem to be really helpful, and there is an active support mailing list too, so you are likely to get good support.
Tim

Related

Complexity of Integer vs. Binary Constraints in CPLEX

Recently, I have been trying to learn a bit about CPLEX and was hoping someone could help me understand the complexity when solving for integer vs. binary constraints.
For example, say we are trying to allocate a pie among 10 people for maximum utility, where each person has a utility that is linear in the amount of pie they receive. However, we want to introduce the constraint that at least 3 people have to get a bit of pie.
What's the difference between thinking of this as a single integer constraint (number_of_people_with_pie >= 3) vs. 10 binary variables (person_1_has_pie + person_2_has_pie + ... + person_10_has_pie >= 3)? I would imagine the former is simpler, but I wonder if there are any benefits to formulating the problem in terms of binary variables.
In addition to this, any recommended reading for better understanding MIP and CPLEX would be greatly appreciated, especially for understanding where the problem becomes NP-hard, or in what situations the simplex method struggles to find the global optimum.
Thanks!
I agree with Alex and Erwin's comment that this really depends on what you want to model. For this particular model I disagree with Alex: to me it makes more sense to use one decision variable per person, otherwise it may become hard to figure out which person gets how much of the pie.
A problem becomes NP-hard as soon as you add integrality or SOS constraints. A good read for MIP in general is Alexander Schrijver's "Theory of Linear and Integer Programming". That should cover all the topics you need for an in-depth understanding of things.
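For concreteness, here's a minimal sketch of that per-person formulation (the names pie_i, has_pie_i, the utilities u_i and the minimum share eps are mine, not from the question): let pie_i be the continuous share person i receives and has_pie_i a binary indicator.

    maximize    u_1*pie_1 + ... + u_10*pie_10
    subject to  pie_1 + ... + pie_10 = 1              (the whole pie is handed out)
                pie_i <= has_pie_i         for all i  (no pie unless the indicator is on)
                pie_i >= eps*has_pie_i     for all i  (an "on" indicator means at least a bit of pie)
                has_pie_1 + ... + has_pie_10 >= 3     (at least 3 people get some)
                pie_i >= 0, has_pie_i in {0, 1}

The first linking constraint works here because each share is at most 1; with a larger resource you would use an upper bound M instead of 1.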
It really depends on the case, but in yours I would use 1 decision variable rather than 10.
Sometimes that's not obvious, and trying and measuring can prove oneself right or wrong. That's one of the reasons why using high-level modeling languages (abstract modeling languages such as OPL) can help.
I recommend a MOOC on Cognitive Class: https://cognitiveclass.ai/courses/mathematical-optimization-for-business-problems/
and the OPL language manual: https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.studio.help/pdf/opl_languser.pdf

What is the difference between SAT and linear programming

I have an optimization problem that is subject to linear constraints.
How do I know which method is better for modelling and solving the problem?
I am generally asking about solving a problem as a satisfiability problem (SAT or SMT) vs. solving it as a linear programming problem (ILP or MILP).
I don't have much knowledge of either, so please simplify your answer if you have one.
Generally speaking, the difference is that SAT is only trying for feasible solutions, while ILP is trying to optimize something subject to constraints. I believe some ILP solvers actually use SAT solvers to get an initial feasible solution. The sensor array problem you describe in a comment is formulated as an ILP: "minimize this subject to that." A SAT version of that would instead pick a maximum acceptable number of sensors and use that as a constraint. Now, this is a satisfiability problem, but not one that's easily expressed in conjunctive normal form. I'd recommend using a solver with a theory of integers. My favorite is Z3.
However, before you give up on optimizing, you should try GMPL / GLPK. You might be surprised by how tractable your problem is. If you're not so lucky, turn it into a satisfiability problem and bring out Z3.
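If you do go the SAT/SMT route, the usual trick for recovering the optimum is to wrap the yes/no queries in a search over the bound. A minimal sketch in C, where feasible() stands in for whatever solver call you end up using (e.g. a Z3 query); the callback is an assumption of mine, not a real API:

    /* Binary-search the smallest k for which "a placement with at most k
     * sensors exists" is satisfiable. Assumes feasibility is monotone in k,
     * which it is for an upper bound on a count. */
    typedef int (*feasible_fn)(int k);   /* 1 if a solution with <= k sensors exists */

    int minimize_count(feasible_fn feasible, int lo, int hi)
    {
        while (lo < hi) {                  /* invariant: the answer lies in [lo, hi] */
            int mid = lo + (hi - lo) / 2;
            if (feasible(mid))
                hi = mid;                  /* <= mid works, try smaller */
            else
                lo = mid + 1;              /* need more than mid */
        }
        return lo;                         /* smallest feasible k */
    }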

SMT solvers for bit vector arithmetic

I'm planning some experiments in symbolic execution of C code, using an off-the-shelf SMT solver, and wondering which solver to use; looking at e.g. the SMT contest entrants, and taking only the open-source systems, narrows it down to Beaver, Boolector, CVC3, OpenSMT, Sateen, Sonolar, STP and veriT, which is still a long list.
Trying to narrow it down a little further, I notice that some of the systems advertise the ability to handle bit vector arithmetic, whereas others only advertise the ability to handle general integer arithmetic. In principle, the former is correct for C, where variables are machine words, not unbounded integers. How much difference does it make in practice? What happens if you try to use a general integer system for this kind of job? Does one of the following scenarios apply?
A bit vector system is slightly more efficient, but you can use either, no problem.
You can use a general integer system with a bit of tweaking.
A general integer system is fine for signed int (because the result of overflow is undefined) but will give the wrong answer for unsigned.
A general integer system just isn't correct for machine word arithmetic, and I can reduce my short list to only those systems that provide bit vector arithmetic.
Something else...?
I've tried to ask as specific a question as possible, but if anyone can suggest any other criteria for narrowing down the list, that would be great!
I've had good experience using STP for symbolic execution. STP was designed precisely for this task. Also, there have been a number of symbolic execution tools that have successfully used STP for this purpose, so there is reason to believe that STP doesn't suck. I would definitely recommend STP to others as a default choice for this sort of experimentation.
However, I haven't tried the other systems, so I don't know how STP compares to them.
Personally, I see STP as the baseline and the default choice for this kind of application. So, if you only have time to try one solver, trying STP seems like a pretty reasonable choice.
If I had to guess, my guess would be that bit-vector arithmetic is important to support, because any large systems code is going to have a non-trivial amount of code that performs bitwise operations. Also, I'd suspect/worry some systems code may rely upon the behavior of unsigned arithmetic to wrap modulo 2^n, and if you try to model it with integers, you will not get the semantics of C right (because, as you say, integers just aren't correct for machine-word arithmetic) and consequently, if you try to use an integer-only solver, you may experience some difficulties. However, I have no hard evidence for either of these suspicions.
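A concrete example of the wrap-around point, in C (this prints 0 twice, because unsigned arithmetic is defined to wrap modulo 2^N for an N-bit unsigned int; over mathematical integers x + 1 > x would be true):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int x = UINT_MAX;
        printf("%u\n", x + 1);         /* 0: the sum wraps around */
        printf("%d\n", (x + 1) > x);   /* 0: false, unlike for unbounded integers */
        return 0;
    }

A bit-vector theory captures this behaviour directly; an integer theory needs extra modulo constraints bolted on.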
P.S. Z3 might also be a contender to add to your list to consider. (Do you really need your solver to be open-source, as long as it is free? I'd expect that a symbolic execution tool would use it only as a blackbox, without modification.)
According to the Wikipedia article on SMT (as of 2011-08), we have:
Based on these measures, it appears that the most vibrant, well-organized projects are OpenSMT, STP and CVC4.
I'm just checking this stuff; so far, all three seem reasonable, plus the older CVC (now CVC3).

How to implement long division for enormous numbers (bignums)

I'm trying to implement long division for bignums. I can't use a library like GMP, unfortunately, due to the limitations of embedded programming. Besides, I want the intellectual exercise of learning how to implement it. So far I've got addition and multiplication done using any-length arrays of bytes (so each byte is like a base-256 digit).
I'm just trying to get started on implementing division / modulus, and I want to know where to start. I've found lots of highly-optimised (aka unreadable) code on the net, which doesn't help me, and I've found lots of highly technical mathematical whitepapers from which I can't bridge the gap between theory and implementation.
If someone could recommend a popular algorithm, and point me to a simple-to-understand explanation of it that leans towards implementation, that'd be fantastic.
-edit: I need algorithms which work when the dividend is ~4000 bits and the divisor is ~2000 bits
-edit: Will this algorithm work with base 256? http://courses.cs.vt.edu/~cs1104/BuildingBlocks/divide.030.html
-edit: Is this the algorithm (Newton division) I should really be using? http://en.wikipedia.org/wiki/Division_(digital)#Newton.E2.80.93Raphson_division
If you want to learn, then start with the pencil and paper method you used in elementary school. Believe it or not, that is essentially the same O(n^2) algorithm that is used in most bignum libraries for numbers that are in the range you are looking for. The tricky first step is called "quotient estimation", and that will probably be the hardest to understand. Once you understand that, the rest should come easy.
A good reference is Knuth's "Seminumerical Algorithms". He has many discussions about different ways to do quotient estimation both in the text and in the exercises. That book has chapters devoted to bignum implementations.
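To get a feel for it before tackling the general case, here is a minimal sketch (names are mine) of the easy special case: dividing a base-256 bignum by a single byte, where the per-digit quotient estimate is simply exact. The full multi-digit case (Knuth's Algorithm D) keeps the same outer loop but has to estimate each quotient digit and occasionally correct it:

    #include <stdint.h>
    #include <stddef.h>

    /* num[] holds base-256 digits, most significant first. Divides in place
     * by the single digit d (d != 0) and returns the remainder. */
    uint8_t short_divide(uint8_t *num, size_t len, uint8_t d)
    {
        uint16_t rem = 0;                                   /* always < d */
        for (size_t i = 0; i < len; i++) {
            uint16_t cur = (uint16_t)((rem << 8) | num[i]); /* bring down the next digit */
            num[i] = (uint8_t)(cur / d);                    /* exact quotient digit */
            rem = cur % d;
        }
        return (uint8_t)rem;
    }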
Are you using void Four1(long double[], int, int) in your code, then convolving and doing an inverse transform? I got multiplication to work that way, but when I tried to do division the same way it spat out one result and then quit, so I cannot help there. But if you have the tome called "Numerical Recipes in C++", go to near the end and you will find what you are looking for; it starts on pages 916 to 926.
This question is over 2 years old, but for numbers of this size you can look at the OpenSSL source code. It does RSA with numbers this big, so it has lots of math routines optimized for 1000- to 4000-bit numbers.

Why aren't Floating-Point Decimal numbers hardware accelerated like Floating-Point Binary numbers?

Is it worth it to implement it in hardware? If yes why? If not why not?
Sorry, I thought it was clear that I am talking about decimal rational numbers! OK, something like decNumber++ for C++, decimal for .NET... Hope it is clear now :)
The latest revision of the standard, IEEE 754-2008, does indeed define decimal floating point numbers, using the representations shown in the software referenced in the question. The previous version of the standard (IEEE 754-1985) did not provide decimal floating point numbers. Most current hardware implements the 1985 standard and not the 2008 standard, but IBM's iSeries computers using POWER6 chips have such support, and so do the z10 mainframes.
The standardization effort for decimal floating point was spearheaded by Mike Cowlishaw of IBM UK, who has a web site full of useful information (including the software in the question). It is likely that in due course, other hardware manufacturers will also introduce decimal floating point units on their chips, but I have not heard a statement of direction for when (or whether) Intel might add one. Intel does have optimized software libraries for it.
The C standards committee is looking to add support for decimal floating point; that work is ISO/IEC TR 24732.
Some IBM processors have dedicated decimal hardware included (a Decimal Floating Point, DFP, unit).
In addition to the answer by Daniel Pryden: the main reason is that DFP units need more transistors in a chip than BFP units. The reason is the BCD code used to calculate decimal numbers in a binary environment. IEEE 754-2008 has several methods to minimize the overhead. It seems that the DPD method (http://en.wikipedia.org/wiki/Densely_packed_decimal) is more effective than the BID method (http://en.wikipedia.org/wiki/Binary_Integer_Decimal).
Normally you need 4 bits to cover the decimal digits 0 to 9; the codes 10 to 15 are invalid but still representable in BCD, so they are wasted.
Therefore DPD compresses 3*4 = 12 bits into 10 bits, covering the 1000 values from 000 to 999 within the 1024 (2^10) possible codes.
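A small C illustration of the waste in plain (uncompressed) BCD, which is what DPD is compressing away; the function names are mine:

    #include <stdint.h>

    /* Plain BCD: one 4-bit nibble per decimal digit, so three digits take
     * 12 bits even though 10 bits (1024 codes) would be enough for 0..999.
     * DPD packs the same three digits into 10 bits. */
    uint16_t pack_bcd3(unsigned d2, unsigned d1, unsigned d0)   /* digits 0..9 */
    {
        return (uint16_t)((d2 << 8) | (d1 << 4) | d0);
    }

    unsigned unpack_bcd3(uint16_t bcd)                          /* back to 0..999 */
    {
        return ((bcd >> 8) & 0xF) * 100 + ((bcd >> 4) & 0xF) * 10 + (bcd & 0xF);
    }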
In general, BFP is faster than DFP, and BFP needs less space on a chip than DFP.
The question of why IBM implemented a DFP unit is quite simply answered:
They build servers for the finance market. If data represents money, it should be reliable.
With hardware-accelerated decimal arithmetic, some errors that occur in binary do not occur at all.
1/5 = 0.2 => 0.00110011001100110011... in binary, so recurring fractions can be avoided.
And the ever-present round() function in Excel would be useless :D
(-> the formula =1*(0,5-0,4-0,1), wtf!)
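The same effect in C with ordinary binary doubles (the result is not zero, because 0.4 and 0.1 have no exact binary representation; a decimal floating-point type gives exactly 0):

    #include <stdio.h>

    int main(void)
    {
        double r = 0.5 - 0.4 - 0.1;
        printf("%.17g\n", r);        /* something like -2.7755575615628914e-17 */
        printf("%d\n", r == 0.0);    /* 0: the comparison is false */
        return 0;
    }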
Hope that explains your question a little!
There is (a tiny bit of) decimal string acceleration, but...
This is a good question. My first reaction was "macro ops have always failed to prove out", but after thinking about it, what you are talking about would go a whole lot faster if implemented in a functional unit. I guess it comes down to whether those operations are done enough to matter. There is a rather sorry history of macro op and application-specific special-purpose instructions, and in particular the older attempts at decimal financial formats are just legacy baggage now. For example, I doubt if they are used much, but every PC has the Intel BCD opcodes, which consist of
DAA, AAA, AAD, AAM, DAS, AAS
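Roughly what DAA does, modelled in C for two packed-BCD digits per byte (a sketch of the idea only, not a flag-accurate model of the instruction):

    #include <stdint.h>

    /* Add two packed-BCD bytes (e.g. 0x42 means 42) and apply the DAA-style
     * correction: if a nibble overflowed past 9, add 6 to push it back into
     * the decimal range. */
    uint8_t bcd_add(uint8_t a, uint8_t b, int *carry)
    {
        unsigned sum = a + b;
        if ((sum & 0x0F) > 9 || ((a & 0x0F) + (b & 0x0F)) > 0x0F)
            sum += 0x06;                 /* fix the low digit */
        if (sum > 0x99)
            sum += 0x60;                 /* fix the high digit */
        *carry = sum > 0xFF;
        return (uint8_t)sum;
    }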
Once upon a time, decimal string instructions were common on high-end hardware. It's not clear that they ever made much of a benchmark difference. Programs spend a lot of time testing and branching and moving things and calculating addresses. It normally doesn't make sense to put macro-operations into the instruction set architecture, because overall things seem to go faster if you give the CPU the smallest number of fundamental things to do, so it can put all its resources into doing them as fast as possible.
These days, not even all the binary ops are actually in the real ISA. The CPU translates the legacy ISA into micro-ops at runtime. It's all part of going fast by specializing in core operations. For now the left-over transistors seem to be waiting for some graphics and 3D work, e.g., MMX, SSE, 3DNow!
I suppose it's possible that a clean-sheet design might do something radical and unify the current (HW) scientific and (SW) decimal floating point formats, but don't hold your breath.
No, they are very memory-inefficient, and the calculations are also not easy to implement in hardware (of course it can be done, but it can also take a lot of time).
Another disadvantage of the decimal format is that it's not widely used: the format was popular for a time, before research showed that binary-formatted numbers were more accurate. Now programmers know better. The decimal format is less efficient and more lossy, and additional hardware representations require additional instruction sets, which can lead to more difficult code.
Decimals (and more generally, fractions) are relatively easy to implement as a pair of integers. General-purpose libraries are ubiquitous and easily fast enough for most applications.
Anyone who needs the ultimate in speed is going to hand-tune their implementation (e.g. changing the divisor to suit a particular usage, algebraically combining/reordering the operations, clever use of SIMD shuffles...). Merely encoding the most common functions into a hardware ISA would surely never satisfy them; in all likelihood, it wouldn't help at all.
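A minimal sketch of the pair-of-integers idea (no overflow handling; a real library would use wider intermediates and reduce more carefully):

    #include <stdint.h>

    typedef struct { int64_t num, den; } frac;   /* value = num / den, den > 0 */

    static int64_t gcd64(int64_t a, int64_t b)
    {
        while (b != 0) { int64_t t = a % b; a = b; b = t; }
        return a < 0 ? -a : a;
    }

    /* x + y, kept in lowest terms */
    frac frac_add(frac x, frac y)
    {
        frac r = { x.num * y.den + y.num * x.den, x.den * y.den };
        int64_t g = gcd64(r.num, r.den);
        if (g > 1) { r.num /= g; r.den /= g; }
        return r;
    }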
The hardware you want used to be fairly common.
Older CPUs had hardware BCD (binary-coded decimal) arithmetic. (The little Intel chips had a little support, as noted by earlier posters.)
Hardware BCD was very good at speeding up FORTRAN, which used 80-bit BCD for numbers.
Scientific computing used to make up a significant percentage of the worldwide market.
Since everyone (relatively speaking) got a home PC running Windows, that market became a tiny percentage. So nobody does it anymore.
Since you don't mind having 64-bit doubles (binary floating point) for most things, it mostly works.
If you use 128-bit binary floating point on modern hardware vector units, it's not too bad. Still less accurate than 80-bit BCD, but you get that.
At an earlier job, a colleague formerly from JPL was astonished we still used FORTRAN. "We've converted to C and C++," he told us. I asked him how they solved the problem of lack of precision. They'd not noticed. (They also no longer have the same space-probe landing accuracy they used to have. But anyone can miss a planet.)
So, basically, 128-bit doubles in the vector unit are more okay, and widely available.
My twenty cents. Please don't represent it as a floating point number :)
The decimal floating-point standard (IEEE 754-2008) is already implemented in hardware by two companies: IBM, in its POWER6/7-based servers, and SilMinds, in its SilAx PCIe-based acceleration card.
SilMinds published a case study about converting decimal arithmetic execution to use its HW solutions. A great boost in speed and slashed energy consumption are reported.
Moreover, several publications by "Michael J. Schulte" and others reveal very positive benchmark results, and some comparison between the DPD and BID formats (both defined in the IEEE 754-2008 standard).
You can find PDFs of:
Performance analysis of decimal floating-point libraries and its impact on decimal hardware and software solutions
A survey of hardware designs for decimal arithmetic
Energy and Delay Improvement via Decimal Floating Point Units
Those 3 papers should be more than enough for your questions!
I speculate that there are no compute-intensive applications of decimal numbers. On the other hand, floating-point numbers are extensively used in engineering applications, which must handle enormous amounts of data and do not need exact results, just results within a desired precision.
The simple answer is that computers are binary machines. They don't have ten fingers, they have two. So building hardware for binary numbers is considerably faster, easier, and more efficient than building hardware for decimal numbers.
By the way: decimal and binary are number bases, while fixed-point and floating-point are mechanisms for approximating rational numbers. The two are completely orthogonal: you can have floating-point decimal numbers (.NET's System.Decimal is implemented this way) and fixed-point binary numbers (normal integers are just a special case of this).
Floating point math essentially IS an attempt to implement decimals in hardware. It's troublesome, which is why the Decimal types are created partly in software. It's a good question, why CPUs don't support more types, but I suppose it goes back to CISC vs. RISC processors -- RISC won the performance battle, so they try to keep things simple these days I guess.
Modern computers are usually general purpose. Floating point arithmetic is very general purpose, while Decimal has a far more specific purpose. I think that's part of the reason.
Do you mean the typical numeric integral types "int", "long", "short" (etc.)? Because operations on those types are definitely implemented in hardware. If you're talking about arbitrary-precision large numbers ("BigNums" and "Decimals" and such), it's probably a combination of rarity of operations using these data types and the complexity of building hardware to deal with arbitrarily large data formats.