Computing the complex eigenvectors of a sparse matrix in Java - eigenvalue

I am trying to compute the eigenvalues and eigenvectors of a potentially large and sparse non-symmetric NxN matrix (N > 10^6). I do not need all of them, perhaps only the first few. Ideally, I'd like to do so from Java, but I could move to C, C++ or Python if required.
My matrix can potentially have both complex eigenvalues and eigenvectors. For example, see the results for this Wolfram Alpha sample.
I found several ways to do this using a number of Java libraries and wrote some evaluation code for them:
Commons-Math: EigenDecomposition
JAMA: EigenvalueDecomposition
MTJ: EVD
COLT: EigenvalueDecomposition
But the problem I am facing is that these libraries do not return (or at least I found no way to get) the complex valued eigenvectors. Most of them do return the complex valued eigenvalues, but not complex eigenvectors. They typically provide the latter in the form of a "vector of reals" or "real matrix" having columns as each eigenvector.
I do, as a matter of fact, need the eigenvectors in complex form, if any.
Now, I recently started looking into Spectra (C++), which seems to support my use case. But I would first like to ask, and maybe rule out a misunderstanding on my side or something I may have missed in Java land, because I'd like to keep using a single platform/language as far as possible.
Is there anything I should be looking into? Also, if I end up moving away from Java for this task, are there any alternatives to Spectra I could look into? Thanks!
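On the "real matrix" point: JAMA's documentation describes a packed layout for non-symmetric input, where A*V = V*D holds with a real block-diagonal D, and a complex pair lambda +/- i*mu appears as the 2x2 block [lambda, mu; -mu, lambda]. Under that layout, two adjacent real columns x and y of V combine into the complex eigenvector x + i*y for the eigenvalue lambda + i*mu. Whether the other libraries listed use the same layout is an assumption to check against their documentation, and none of this helps with N > 10^6, since these are dense decompositions. A minimal sketch in Python, using a hand-built 2x2 example (the matrices and the helper name `unpack_complex_pair` are illustrative, not output from any of the libraries):

```python
def mat_vec(a, v):
    """Multiply a dense matrix (list of rows) by a vector of complex numbers."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def unpack_complex_pair(v_real, d_real, j):
    """Rebuild (eigenvalue, eigenvector) from columns j and j+1 of a packed real V.

    Assumes the JAMA-style layout: D[j][j] = lambda, D[j][j+1] = mu, and the
    eigenvector for lambda + i*mu is (column j) + i * (column j+1).
    """
    eigenvalue = complex(d_real[j][j], d_real[j][j + 1])
    eigenvector = [complex(row[j], row[j + 1]) for row in v_real]
    return eigenvalue, eigenvector


# Hand-built example: a 90-degree rotation, whose eigenvalues are +i and -i.
A = [[0.0, -1.0], [1.0, 0.0]]
V = [[1.0, 0.0], [0.0, -1.0]]   # packed real columns, chosen so that A*V = V*D
D = [[0.0, 1.0], [-1.0, 0.0]]   # block [lambda, mu; -mu, lambda], lambda=0, mu=1

lam, vec = unpack_complex_pair(V, D, 0)

# Residual of the eigen-equation A*v - lambda*v; it should vanish.
residual = [av - lam * x for av, x in zip(mat_vec(A, vec), vec)]
```

The conjugate eigenpair (lambda - i*mu, x - i*y) follows by conjugating both results.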

Just in case anyone stumbles upon this, I finally went the C++ way because none of the Java libraries provided the complex eigenvectors as I needed.
I have ended up implementing most of the stuff I need with C++ using Spectra and Eigen. Then I have built a series of native wrapper classes using SWIG.

For everyone in the future with the same question: there is a library for Java called Jeigen that can do this. It is actually a Java wrapper for the Eigen C++ library that the original poster already mentioned in his own answer.
You can find Jeigen here.


SCIP using old code

I am fairly new to SCIP. I want to use SCIP as a branch-and-price framework. I have already coded the problem in C++ and have also implemented the pricer (column generation) as a function. In fact, I have implemented the branch-and-price algorithm for the root node by linking Cplex.dll to the project; I now need to code the branching tree and have decided to use SCIP for this purpose.
What is the fastest way to solve my problem using SCIP and the code I already have? Or would using GCG be a better and faster way?
I have read the GCG documentation but don't understand whether I need to implement the pricer myself again. In fact, I don't understand the difference between the two (SCIP and GCG).
Thanks.
In GCG, you do not need to implement anything yourself. It is a generic solver for branch-and-price. You have to provide the compact formulation, that is, a model which, after applying a Dantzig-Wolfe reformulation, leads to the master problem you are solving. The reformulation also provides a MIP formulation of the pricing problem, so GCG can solve this as a sub-MIP for pricing. It is possible, however, to plug a pricing solver into GCG, to which the pricing MIP to be solved is passed (with the objective function corresponding to the current pricing round). The pricing solver can then solve this problem with any problem-specific algorithm and pass solutions back to GCG.
In SCIP, on the other hand, you create the master problem you want to solve and implement a pricer which gets dual values from the LP and solves the corresponding pricing problem. This is probably very similar to what you have already.
Additionally, if you want to do branch-and-price, you need a branching rule. GCG comes with some generic ones; in SCIP you would have to implement one yourself (since the branching decisions must be respected within your pricing procedure).
Overall, SCIP is a framework for branch-and-price, i.e., it provides the tree management, LP solving and updates, etc., but you need to implement some things yourself like a reader, the pricer, and the branching rules. GCG is a generic solver, so you can just plug in a compact model, which is reformulated and solved in a generic way. The reformulation is either provided by you via an input file or you can try to let GCG detect an appropriate structure. You do not need to implement anything. It already provides some nice features like primal heuristics that make use of the reformulation, an automatic management of which pricing problem is solved when, and more. On the other hand, the possibilities to extend it further, e.g., by a pricing solver and branching rules are restricted compared to SCIP, since you have to stick to the structure defined by GCG.
I would say that using SCIP and adding your pricer is probably the easier way and more similar to what you already have (you do not need to formulate the compact model). If you already have an idea on how your branching should work, it should also not be too hard to implement within SCIP.
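To make the pricer's role concrete: it receives the dual values of the master LP and looks for a column with negative reduced cost. The following is a minimal sketch in Python, not SCIP's or GCG's API; it assumes a minimization master problem, and the names `reduced_cost`, `price`, the dual values and the candidate columns are all made up for illustration. A real pricer would solve an optimization problem rather than scan a fixed list.

```python
def reduced_cost(cost, column, duals):
    """Reduced cost of a column: c_j - sum_i pi_i * a_ij."""
    return cost - sum(pi * a for pi, a in zip(duals, column))


def price(candidates, duals, tol=1e-9):
    """Return the candidate with the most negative reduced cost, or None.

    `candidates` is a list of (cost, column) pairs. Returning None means no
    improving column exists, so the master LP is optimal at this node.
    """
    best = min(candidates, key=lambda c: reduced_cost(c[0], c[1], duals))
    if reduced_cost(best[0], best[1], duals) < -tol:
        return best
    return None


duals = [0.5, 0.5, 1.0]              # hypothetical duals from the master LP
candidates = [
    (1.0, [1, 1, 0]),                # reduced cost 1.0 - 1.0 = 0.0
    (1.0, [0, 1, 1]),                # reduced cost 1.0 - 1.5 = -0.5
    (1.0, [1, 0, 0]),                # reduced cost 1.0 - 0.5 = 0.5
]
entering = price(candidates, duals)  # the column that would enter the master
```

Branching decisions have to be reflected here as well, for example by filtering out candidates that a branching constraint forbids.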

Converting Matlab to VB.NET

I was recently assigned the task of converting a few algorithms written in MATLAB to VB.NET (or C# if VB.NET isn't efficient).
The MATLAB code itself consists of a lot of matrix algebra. I initially searched here and found MATLAB Coder, which wraps the MATLAB code, but when I presented that option I was told it isn't desirable.
I am stuck in a sense that I don't know how to approach this with the proper tools.
Is it normally acceptable to grab libraries (http://www.codeproject.com/Articles/5835/DotNetMatrix-Simple-Matrix-Library-for-NET or http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=907&lngWId=10, these are the only ones I could find) to implement these algorithms or is that generally frowned upon?
Do I need to reinvent the wheel and implement my own algorithms for the algebra (matrix multiplication, Cholesky decomposition, etc.)?
Basically, I am not sure what the accepted way of accomplishing this task is, any input would be appreciated. I apologize if this isn't allowed in here, this is my first time posting but I am a long time lurker.
You have several possibilities.
If your application can bear the loading time of MCR, you can use Matlab .NET Builder. It will compile a .NET class, which will run MCR silently underneath. All of your clients will be forced to install MCR on their computer.
If your code must be native, you can either rewrite the code or use Matlab Coder, which will convert the code into unreadable, native C++ code.
If you choose to rewrite the code, I would recommend finding an implementation of LAPACK linear algebra routines on .NET, as Matlab is based on them.
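For a sense of what hand-writing one of these routines involves if no library turns out to be acceptable, here is a minimal Cholesky sketch. It is written in Python rather than VB.NET purely for illustration, it is the plain textbook algorithm (no blocking or pivoting, unlike a tuned LAPACK build), and the function name `cholesky` is my own.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    `a` must be a symmetric positive-definite matrix given as a list of rows.
    """
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already-computed parts of rows i and j.
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = a[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (a[i][j] - partial) / lower[j][j]
    return lower


A = [[4.0, 12.0, -16.0], [12.0, 37.0, -43.0], [-16.0, -43.0, 98.0]]
L = cholesky(A)
```

The loop structure carries over to VB.NET or C# almost line by line, but a maintained library will usually be faster and better tested.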
Code like that is published so it can be used and learned from. Just make sure the code's license (if any) is acceptable for your situation.
You can access Matlab functions from VB.NET through the COM interface:
http://www.mathworks.com/help/matlab/matlab_external/view-matlab-functions-from-visual-basic-object-browser.html

Compiler Optimization of Deterministic Functions

I was reading about deterministic execution, meaning that the same input always produces the same output. I was wondering whether any compiler writer has thought about optimizing deterministic functions at runtime.
For example, take the factorial function. If at runtime, it is detected that it is continuously being called with the same input value, the compiler can cache the output value and instead of executing the factorial function, can directly use that output value. Seems like a nice research topic. Are there any papers or work on this topic?
This is usually called memoization, and is a fairly common optimization in functional languages.
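As a small illustration of the idea, here is a sketch using `functools.lru_cache` from the Python standard library. Note that the programmer opts in explicitly here; it is not the compiler detecting repeated inputs on its own. The `call_count` counter exists only to show when the function body actually runs.

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=None)
def factorial(n):
    """Factorial with results cached per argument."""
    global call_count
    call_count += 1  # counts real executions only, not cache hits
    return 1 if n < 2 else n * factorial(n - 1)


first = factorial(10)    # runs the body for n = 10, 9, ..., 1
second = factorial(10)   # served from the cache; the body does not run again
```

This is only safe because `factorial` has no side effects and depends on nothing but its argument.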
It can be done, but as far as I know it's not common for compilers to do it. The trouble is that users can define as many types as they like and define equality in any way they like, and with heap allocation and so on it's very, very difficult to prove that a function really is deterministic. Basically, it could be done, but only if your function involves straight numerical computation, which is rare, and thus it's usually not of high value.
You're talking about referential transparency. And it's a big part of functional programming.
http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)
http://blogs.msdn.com/b/vcblog/archive/2008/11/12/pogo.aspx talks about profile-guided optimization. It does not answer your question per se, but in general it talks about using runtime behavior to optimize the assembly.

Programming languages that define the problem instead of the solution?

Are there any programming languages designed to define the solution to a given problem instead of defining instructions to solve it? So, one would define what the solution or end result should look like and the language interpreter would determine how to arrive at that result. Looking at the list of programming languages, I'm not sure how to even begin to research this.
The best examples I can currently think of to help illustrate what I'm trying to ask are SQL and MapReduce, although those are both sort of mini-languages designed to retrieve data. But, when writing SQL or MapReduce statements, you're defining the end result, and the DB decides the best course of action to arrive at the end result set.
I could see these types of languages, if they exist, being used in crunching a lot of data or finding solutions to a set of equations. The dream language would be one that could interpret the defined problem, identify which parts are parallelizable, and execute the solution across multiple processes/cores/boxes.
What about Declarative Programming? Excerpt from wikipedia article (emphasis added):
In computer science, declarative programming is a programming paradigm that expresses the logic of a computation without describing its control flow. Many languages applying this style attempt to minimize or eliminate side effects by describing what the program should accomplish, rather than describing how to go about accomplishing it. This is in contrast with imperative programming, which requires an explicitly provided algorithm.
The closest you can get to something like this is with a logic language such as Prolog. In these languages you model the problem's logic but again it's not magic.
This sounds like a description of a declarative language (specifically a logic programming language), the most well-known example of which is Prolog. I have no idea whether Prolog is parallelizable, though.
In my experience, Prolog is great for solving constraint-satisfaction problems (ones where there's a set of conditions that must be satisfied) -- you define your input set, define the constraints (e.g., an ordering that must be imposed on the previously unordered inputs) -- but pathological cases are possible, and sometimes the logical deduction process takes a very long time to complete.
If you can define your problem in terms of a Boolean formula you could throw a SAT solver at it, but note that the 3SAT problem (Boolean variable assignment over three-variable clauses) is NP-complete, and its first-order-logic big brother, the Quantified Boolean formula problem (which uses the existential quantifier as well as the universal quantifier), is PSPACE-complete.
There are some very good theorem provers written in OCaml and other FP languages; here are a whole bunch of them.
And of course there's always linear programming via the simplex method.
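To illustrate the Boolean-formula route: the problem is stated purely as data (a list of clauses), and the solver decides how to search. The sketch below is a brute-force solver in Python, so it takes exponential time and is in no way a substitute for a real SAT solver; the clause encoding (positive and negative integers for literals) and the name `solve_sat` are assumptions made for this example.

```python
from itertools import product


def solve_sat(clauses, num_vars):
    """Return a satisfying assignment as a dict {var: bool}, or None.

    `clauses` is in conjunctive normal form: each clause is a list of non-zero
    ints, where k means "variable k is true" and -k means "variable k is false".
    """
    for values in product([False, True], repeat=num_vars):
        assignment = dict(zip(range(1, num_vars + 1), values))
        if all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return assignment
    return None


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
model = solve_sat(formula, 3)

# x1 and (not x1) has no solution.
unsat = solve_sat([[1], [-1]], 1)
```

The caller never says how to find the assignment, only what a valid one looks like, which is the declarative style the question asks about.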
These languages are commonly referred to as 5th generation programming languages. There are a few examples on the Wikipedia entry I have linked to.
Let me try to answer... maybe Prolog could answer your needs.
I would say Objective Caml (OCaml) too...
This may seem flippant but in a sense that is what stackoverflow is. You declare a problem and or intended result and the community provides the solution, usually in code.
It seems immensely difficult to model dynamic open systems down to a finite number of solutions. I think there is a reason most programming languages are imperative. Not to mention there are massive P = NP problems lurking in the dark that would make such a system difficult to engineer.
Although what would be interesting is if there was a formal framework that could leverage human input to "crunch the numbers" and provide a solution, perhaps imperative code generation. The internet and google search engines are kind of that tool but very primitive.
Large problems and software are basically just a collection of smaller problems solved in code. So any system that generated code would require fairly delimited problem sets that can be mapped to more or less atomic solutions.
Lisp. There are so many Lisp systems out there defined in terms of rules not imperative commands. Google ahoy...
There are various Java-based rules engines which allow declarative programming - Drools is one that I've played with and it seems pretty interesting.
A lot of languages define more problems than solutions (don't take this one seriously).
On a serious note: one more vote for Prolog and different kinds of DSLs designed to be declarative.
I remember reading something about computation using DNA back when I was in college. You would put segments of DNA in a solution that represented segments of the problem, and define it in such a way that if the DNA fits together, it's a valid solution. Then you let the properties of chemicals solve the problem for you and look for finished strands that represent a solution. It sounds sort of like what you are referring to.
I don't recall if it was theoretical or had been done, though.
LINQ could also be considered another declarative DSL (eschewing the argument that it's too similar to SQL). Again, you declare what your solution looks like, and LINQ decides how to find it.
The beauty of these kinds of languages is that projects like PLINQ (which I just found) can spring up around them. Check out this video with the PLINQ developers (WMV direct link) on how they parallelize solution finding without modifying the LINQ language (much).
While mathematical proofs don't constitute a programming language, they do form a formal language where you simply define solutions (as long as you allow nonconstructive proofs). Of course, it's not algorithmic, so "math" might not be an acceptable answer.
Meta Discussion
What constitutes a problem or a solution is not absolute and depends on the level of abstraction that you are taking as a reference point.
Let's compare the following 3 languages: SQL, C++, and CPU instructions.
C++ vs CPU instructions
If you choose array manipulation as the desired level of abstraction, then C++ allows you to "define the problem" instead of the solution:
array[i * 2 + 3] = 5;
array[t] = array[k - m] - 1;
Note what this C++ snippet does not state: how the memory is laid out, how many bits are used by each array element, which CPU registers hold the data, and even in which order the arithmetic operations will be performed (as long as the result is the same).
The C++ compiler, however, will translate this code to lower-level CPU instructions that will contain all of these details.
At the abstraction level of array manipulation, C++ is declarative, and CPU instructions are imperative.
SQL vs C++
If you choose a sorting algorithm as the desired level of abstraction, then SQL allows you to "define the problem" instead of the solution:
select *
from table
order by key
This snippet of code is declarative with respect to the sorting algorithm's level of abstraction because it declares that the output is sorted without using lower-level concepts (like array manipulation).
If you had to sort an array in C++ (without using a library), the program would be expressed in terms of array manipulation steps of a particular sorting algorithm.
void sort(int *array, int size) {
    int key, j;
    for(int i = 1; i < size; i++) {
        key = array[i];
        j = i;
        while(j > 0 && array[j-1] > key) {
            array[j] = array[j-1];
            j--;
        }
        array[j] = key;
    }
}
This snippet is not declarative with respect to the sorting algorithm's level of abstraction because it uses concepts (such as array manipulation) that are constituents of the sorting algorithm.
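The same contrast can be shown within a single language. The sketch below uses Python rather than SQL and C++; the `rows` data and the function name `insertion_sort` are made up for the example.

```python
def insertion_sort(values):
    """Imperative at this level: spells out every array manipulation step."""
    result = list(values)
    for i in range(1, len(result)):
        key = result[i]
        j = i
        while j > 0 and result[j - 1] > key:
            result[j] = result[j - 1]
            j -= 1
        result[j] = key
    return result


rows = [{"key": 3, "name": "c"}, {"key": 1, "name": "a"}, {"key": 2, "name": "b"}]

# Declarative at this level: state the ordering, not the algorithm.
by_key = sorted(rows, key=lambda row: row["key"])

# Imperative at this level: the algorithm itself is written out above.
keys_sorted = insertion_sort([row["key"] for row in rows])
```

Both produce the same order; only the first leaves the choice of algorithm to the language runtime.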
Summary
To summarize, whether a language defines problems or solutions depends on what problems and solutions you are referring to.
Many answers here have brought up examples: SQL, LINQ, Prolog, Lisp, OCaml. I am sure there are many useful levels of abstractions with respect to which these languages are declarative.
However, do not forget that you can build a language with an even higher level of abstraction on top of them.

Really Big Numbers and Objective-C

I've been toying around with some Project Euler problems and naturally am running into a lot that require the handling of bigger than long long type numbers. I am committed to using Cocoa and Objective-C (I need to stay sharp for work) but can't find an elegant way (read: library) to handle these really big numbers.
I'd love to use GMP, but it sounds like using it with Xcode is a complete world of hurt.
Does anyone know of any other options?
If I were you, I would compile GMP outside Xcode and use just gmp.h and libgmp.a (or libgmp.dylib) in my Xcode project.
Try storing the digits in arrays.
You will have to write new functions for all your arithmetic operations, but that's how we were told to do it in college.
Plus, the calculations were noticeably faster, since the big numbers weren't really big after all, and weren't really numbers at all.
See if it helps.
regards
vBigNum in vecLib implements 1024 bit integers (signed or unsigned). Is that big enough?
If you wanted to use matlab (or anything close) you could look at my implementation of a big integer form (vpi) on the file exchange.
It is rather simple. Store each digit separately. Adds and subtracts are simple, just implement a carry operation. Multiplies are best done using convolution, then a carry. Implement divide and mod operators, then a powermod operation, useful for many of the PE problems. Powers are easy - just repeated squaring and multiplication, based on the binary representation of the exponent.
This will let you solve many PE problems.
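A minimal sketch of the digit-array scheme described above, in Python for brevity (Python integers are already arbitrary precision, so they serve here only to check the results). It assumes base-10 digits stored least significant first and non-negative numbers; the function names are my own, and division, mod and powermod are left out.

```python
def carry(digits, base=10):
    """Normalize little-endian digits so that each lies in [0, base)."""
    out, c = [], 0
    for d in digits:
        c, r = divmod(d + c, base)
        out.append(r)
    while c:
        c, r = divmod(c, base)
        out.append(r)
    while len(out) > 1 and out[-1] == 0:
        out.pop()  # drop leading zeros (stored at the end of the list)
    return out


def add(a, b):
    """Digit-wise sum followed by a single carry pass."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return carry([x + y for x, y in zip(a, b)])


def multiply(a, b):
    """Convolution of the two digit arrays, then a carry pass."""
    conv = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            conv[i + j] += x * y
    return carry(conv)


def power(a, exponent):
    """Repeated squaring driven by the binary digits of the exponent."""
    result = [1]
    while exponent:
        if exponent & 1:
            result = multiply(result, a)
        a = multiply(a, a)
        exponent >>= 1
    return result


def from_int(n):
    """Convert a non-negative int to little-endian decimal digits."""
    return [int(ch) for ch in reversed(str(n))]


def to_int(digits):
    """Convert little-endian decimal digits back to an int."""
    return int("".join(map(str, reversed(digits))))
```

For example, `sum(power(from_int(2), 1000))` gives the digit sum of 2^1000 without ever forming the number as a machine integer. In C or Objective-C the same layout works with a `malloc`ed array of small integers.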
I too got the bright idea to attempt some Project Euler problems with Cocoa/Objective-C and have found it frustrating. I previously used Java and perhaps some PHP. I posted my exact problem in this thread.
I always considered using a library cheating for this project. Just write a class with the things you need. And don't be afraid to use malloc and uint64_t and so on. NSNumber is not a good idea in many cases.
On the other hand, there are many problems where the obvious solution would require huge to enormously huge numbers, and the trick is to find a way to solve the problem without using these huge numbers. (For example, what is the sum of the last thousand digits of 1,000,000 factorial?)
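The factorial example can be settled without big numbers at all: the number of trailing zeros of n! equals the number of factors of 5 in it (Legendre's formula), and if that count is at least 1000, the last thousand digits are all zero. A short sketch in Python, with a function name of my own choosing:

```python
def trailing_zeros_of_factorial(n):
    """Count trailing decimal zeros of n! via Legendre's formula.

    Each trailing zero needs one factor 5 (factors of 2 are more plentiful),
    so the count is floor(n/5) + floor(n/25) + floor(n/125) + ...
    """
    count = 0
    power_of_five = 5
    while power_of_five <= n:
        count += n // power_of_five
        power_of_five *= 5
    return count


zeros = trailing_zeros_of_factorial(1_000_000)

# If n! ends in at least 1000 zeros, its last thousand digits are all 0.
last_thousand_digit_sum = 0 if zeros >= 1000 else None
```

Everything here fits comfortably in a 32-bit integer, so the same loop works unchanged in C or Objective-C.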