Wednesday, February 15, 2012

abusing for-loops in scala

This is not usually the case, but sometimes it's nice to avoid nested iterators. If a block of code has too many, they can become visually disruptive. For an example, take a look at line 52 of this Sudoku example.

In these cases I find Python's product generator particularly useful. Consider the output of the following code:

from itertools import product
for i, j, k in product(range(2), range(10,12), range(20,22)):
    print(i, j, k)

# Output:
# 0 10 20
# 0 10 21
# 0 11 20
# 0 11 21
# 1 10 20
# 1 10 21
# 1 11 20
# 1 11 21

This behaves just like three nested for loops. You can give it an arbitrary number of sequences, too. Scala has a similar construct built into its for loops which is quite nice. Just separate your iterables by semicolons:

for (
  a <- 0 to 1;
  b <- 10 to 11;
  c <- 20 to 21
) println(a + " " + b + " " + c)

// 0 10 20
// ...

The real power here lies in the ability to apply filters to your iterables. Here we only execute the nested loops if a is even:

for (
  a <- 0 to 9 if a % 2 == 0;
  b <- 10 to 11;
  c <- 20 to 21
) println(a + " " + b + " " + c)

// 0 10 20
// 0 10 21
// 0 11 20
// 0 11 21
// 2 10 20
// 2 10 21
// 2 11 20
// 2 11 21
// 4 10 20
// ...

Of course, the more expressive a syntactic construct, the more ripe it is for abuse. It's not as blindingly obvious what the differences between the above and the following are as it would be with nesting, especially since they have the same output:

for (
  a <- 0 to 9;
  b <- 10 to 11;
  c <- 20 to 21 if a % 2 == 0
) println(a + " " + b + " " + c)

// 0 10 20
// ...

To see this a bit more clearly, we can add tests against println(...) to the innermost loop. For the uninitiated, println(...) returns Unit (think None or null), which is the same as (). The println(...) calls are essentially side effects:

for (
  a <- 0 to 9 if a % 2 == 0;
  b <- 10 to 11;
  c <- 20 to 21 if println("iterating") == ()
) println(a + " " + b + " " + c)

// iterating
// 0 10 20
// iterating
// 0 10 21
// iterating
// 0 11 20
// iterating
// 0 11 21
// iterating
// 2 10 20
// ...

We can chain our conditions nicely. Of course we have the put the println(...) first so it always executes. Note the extra iterations before a takes the value 2:

for (
  a <- 0 to 9;
  b <- 10 to 11;
  c <- 20 to 21 if println("iterating") == () if a % 2 == 0
) println(a + " " + b + " " + c)

// iterating
// 0 10 20
// iterating
// 0 10 21
// iterating
// 0 11 20
// iterating
// 0 11 21
// iterating
// iterating
// iterating
// iterating
// iterating
// 2 10 20
// ...

Friday, January 20, 2012

python-zibopt cython port started

I've finally bitten the bullet and stated a branch of python-zibopt for purpose of porting it to Cython.  In a sense this is like starting over, though I expect it will go fairly quickly once I get in the swing of things.  The result should be a library that works in both Python 2 and 3 (currently it just supports 3) and is nearly identical to the existing API.  There will probably be a few changes, but if you've only used it in the expected manner (as a modeling interface) then I doubt you'll notice any change at all.

Cython is interesting.  It reminds me a bit of Inline::C, but not as terrifying.  It certainly has its quirks, but its developers have done an excellent job of making the transitions between using Python and C as painless as possible.  I haven't read all of the docs yet, opting for my usual method of jumping in and seeing how fast I drown, but a few points already stand out:

  • It isn't necessary to define every field on a C struct that you are using.  You can just define the ones you use or say pass.  This is extremely important, as there are so many large structs in SCIP that I thought it would be a deal-breaker.
  • Python strings and C char * are pretty much inter-operable.  Nice.
  • Constants, C functions, and other things that one yanks into a .pxd (the Cython equivalent of a header file) get treated to proper Python module encapsulation and are available using dots, like foo.bar.  Also nice.
So far I've just done the most basic of things: load SCIP and ask it to solve.  However, there are already vast improvements in its error handling mechanisms.  The C version used macros and other horrors, and it didn't catch all of SCIP's errors.

Using Cython, I was able to register an error handler with SCIP as if it were a C function (as void *, no less).  I defined a simple PY_SCIP_CALL function that I wrap all my calls to SCIP in, similar to the SCIP_CALL macro that litters the SCIP source, and have it raise a Python Exception if anything is amiss.  Suddenly I get all SCIP errors with appropriate text and even solver source code lines.  Much better.

As a quick example, this code produces the following output.  The scip.c reference in the last line shows the actual line number from ZIBopt source:
Traceback (most recent call last):
 File "<string>", line 1, in <module>
 File "scip.pyx", line 28, in zibopt.scip.test (src/zibopt/scip.c:606)
 File "scip.pyx", line 15, in zibopt.scip.PY_SCIP_CALL (src/zibopt/scip.c:513)
Exception: [src/scip/scip.c:7500] ERROR: no node selector available

Friday, January 13, 2012

normal magic squares

As a followup to yesterday's post, I created another python-zibopt example for finding Normal Magic Squares.  This is similar to the Sudoku example, except that here the number of binary variables depends on the square size.  In the case of Sudoku, each cell has 9 binary variables -- one for each potential value it might take.  For a normal magic square, there are n^2 possible values for each cell, n^2 cells, and one variable representing the row, column, and diagonal sums.  This makes a total of n^4 binary variables and one continuous variables in the model.

However, there are no big-Ms.

I think the neat part of this code is in lines 57-62.  It creates sums of the n^2 variables for each cell with their appropriate coefficients (1 to n^2) and stores those expressions to make the subsequent constraint creation simpler (try doing that in your modeling language!).  All made possible thanks to python-algebraic.

Thursday, January 12, 2012

magic squares and big-Ms

When I visited the LA PyLadies back in October of 2011, I started toying with a model for finding Magic Squares in python-zibopt.  As a modeling exercise, this is fun but not too terribly challenging.  Construct a square matrix of integer-valued variables and add the following constraints:

  1. All variables >= 1.
  2. All rows, columns, and the diagonal sum to the same value.
  3. All variables take different values.
Admittedly, I had a few bugs to fix in the code before I could get this working.  If you'd like to run it yourself, the model is here.  It works against the latest development version in svn trunk of python-zibopt and python-algebraic 0.3.1.  When python-zibopt 0.7.2-dev is tagged soon, it will be a part of that.

The first two constraint types are trivial to implement, and relatively easy for the solver.  What I do is add a single extra variable then set it equal to the sums of each row, column, and the diagonal.

It's the third that messes things up.  You can think of this as saying, for every possible pair of integer-valued variables x & y:

Either x >= y + 1 or x <= y - 1.

Why is this hard?  Because we can't add both constraints to the model and maintain feasibility.  What we have to do is add them in such a way that exactly one will be active for any any given solution.  This requires, for each pair of variables, an additional binary variable (we will call this z) and a (possibly big) constant (M).  Thus the above must be reformulated as the following before adding it to our model:

x >= (y + 1) - M*z
x <= (y - 1) + M*(1-z)

All of a sudden, here be dragons.  We may not know how big or small to make M.  Generally we want it as small as possible to avoid playing too much havoc with the LP relaxations of our integer programming model.  It contributes to rounding errors (in the magic square problem, if I make M really big, all the variables will come back as 1).  Setting M to different values may have an unpredictable effect on the solution time of a given model.  So on, so forth.

Which brings us to an interesting idea:

SCIP now supports bilinear constraints out of the box.  This means that I can make M a variable in the above model.  (Heck, I can even make it an integer variable if I'm feeling particularly insane.)

The magic square model linked to in this post (the astute reader will notice it does not solve the normal magic square problem) provides both methods.  The first command line argument it requires is the matrix size.  The second one, M, is optional.  If not given, it will leave M up to the solver.

I didn't expect this to perform as well as providing sensible values for M, but for small matrices it didn't perform too terribly worse either.  Not quite twice the run time in most of my unscientific tests.  Given the early state of MINLP development, that's pretty encouraging.

I'd love to see what one of the many far more knowledgeable OR bloggers out there has to say about this.