<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6378430709872859668</id><updated>2012-02-15T15:01:22.496-05:00</updated><category term='simulation'/><category term='processing'/><category term='hibernate'/><category term='education'/><category term='media'/><category term='zibopt'/><category term='bip'/><category term='scala'/><category term='erlang'/><category term='scip'/><category term='perl'/><category term='r'/><category term='maven'/><category term='music'/><category term='pso'/><category term='lisp'/><category term='graphs'/><category term='algorithms'/><category term='concurrency'/><category term='pycon'/><category term='scala-pso'/><category term='tsp'/><category term='formulation'/><category term='phd'/><category term='cython'/><category term='python'/><category term='pyladies'/><category term='yapc'/><category term='zimpl'/><category term='functional programming'/><category term='python-zibopt'/><category term='power set'/><category term='japh'/><category term='data fitting'/><category term='gmu'/><category term='python-algebraic'/><category term='ampl'/><category term='linear programming'/><category term='uls'/><category term='msor'/><title type='text'>adventures in optimization</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/'/><link rel='hub' 
href='http://pubsubhubbub.appspot.com/'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>55</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-6797691380847034670</id><published>2012-02-15T14:45:00.000-05:00</published><updated>2012-02-15T15:01:22.502-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>abusing for-loops in scala</title><content type='html'>&lt;p&gt;This is not usually the case, but sometimes it's nice to avoid nested iterators.  If a block of code has too many, they can become visually disruptive.  For an example, take a look at line 52 of &lt;a href="http://code.google.com/p/python-zibopt/source/browse/trunk/examples/sudoku.py?spec=svn217&amp;r=170#52"&gt;this Sudoku example&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In these cases I find &lt;a href="http://docs.python.org/py3k/library/itertools.html#itertools.product"&gt;Python's product generator&lt;/a&gt; particularly useful.  
Consider the output of the following code:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;from itertools import product&lt;br /&gt;for i, j, k in product(range(2), range(10,12), range(20,22)):&lt;br /&gt;    print(i, j, k)&lt;br /&gt;&lt;br /&gt;# Output:&lt;br /&gt;# 0 10 20&lt;br /&gt;# 0 10 21&lt;br /&gt;# 0 11 20&lt;br /&gt;# 0 11 21&lt;br /&gt;# 1 10 20&lt;br /&gt;# 1 10 21&lt;br /&gt;# 1 11 20&lt;br /&gt;# 1 11 21&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;This behaves just like three nested for loops.  You can give it an arbitrary number of sequences, too.  Scala has a similar construct built into its for loops which is quite nice.  Just separate your iterables by semicolons:&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;for (&lt;br /&gt;  a &lt;- 0 to 1;&lt;br /&gt;  b &lt;- 10 to 11;&lt;br /&gt;  c &lt;- 20 to 21&lt;br /&gt;) println(a + " " + b + " " + c)&lt;br /&gt;&lt;br /&gt;// 0 10 20&lt;br /&gt;// ...&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;The real power here lies in the ability to apply filters to your iterables.  Here we only execute the nested loops if a is even:&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;for (&lt;br /&gt;  a &lt;- 0 to 9 if a % 2 == 0;&lt;br /&gt;  b &lt;- 10 to 11;&lt;br /&gt;  c &lt;- 20 to 21&lt;br /&gt;) println(a + " " + b + " " + c)&lt;br /&gt;&lt;br /&gt;// 0 10 20&lt;br /&gt;// 0 10 21&lt;br /&gt;// 0 11 20&lt;br /&gt;// 0 11 21&lt;br /&gt;// 2 10 20&lt;br /&gt;// 2 10 21&lt;br /&gt;// 2 11 20&lt;br /&gt;// 2 11 21&lt;br /&gt;// 4 10 20&lt;br /&gt;// ...&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Of course, the more expressive a syntactic construct, the more ripe it is for abuse.  
The difference between the loop above and the following one is not as obvious as it would be with explicit nesting, especially since they produce the same output:&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;for (&lt;br /&gt;  a &lt;- 0 to 9;&lt;br /&gt;  b &lt;- 10 to 11;&lt;br /&gt;  c &lt;- 20 to 21 if a % 2 == 0&lt;br /&gt;) println(a + " " + b + " " + c)&lt;br /&gt;&lt;br /&gt;// 0 10 20&lt;br /&gt;// ...&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;To see this a bit more clearly, we can add tests against println(...) to the innermost loop.  For the uninitiated, println(...) returns Unit (think None or null), whose only value is ().  The println(...) calls are essentially side effects:&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;for (&lt;br /&gt;  a &lt;- 0 to 9 if a % 2 == 0;&lt;br /&gt;  b &lt;- 10 to 11;&lt;br /&gt;  c &lt;- 20 to 21 if println("iterating") == ()&lt;br /&gt;) println(a + " " + b + " " + c)&lt;br /&gt;&lt;br /&gt;// iterating&lt;br /&gt;// 0 10 20&lt;br /&gt;// iterating&lt;br /&gt;// 0 10 21&lt;br /&gt;// iterating&lt;br /&gt;// 0 11 20&lt;br /&gt;// iterating&lt;br /&gt;// 0 11 21&lt;br /&gt;// iterating&lt;br /&gt;// 2 10 20&lt;br /&gt;// ...&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;We can chain our conditions nicely.  Of course we have to put the println(...) first so it always executes.  
Note the extra iterations before a takes the value 2:&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;for (&lt;br /&gt;  a &lt;- 0 to 9;&lt;br /&gt;  b &lt;- 10 to 11;&lt;br /&gt;  c &lt;- 20 to 21 if println("iterating") == () if a % 2 == 0&lt;br /&gt;) println(a + " " + b + " " + c)&lt;br /&gt;&lt;br /&gt;// iterating&lt;br /&gt;// 0 10 20&lt;br /&gt;// iterating&lt;br /&gt;// 0 10 21&lt;br /&gt;// iterating&lt;br /&gt;// 0 11 20&lt;br /&gt;// iterating&lt;br /&gt;// 0 11 21&lt;br /&gt;// iterating&lt;br /&gt;// iterating&lt;br /&gt;// iterating&lt;br /&gt;// iterating&lt;br /&gt;// iterating&lt;br /&gt;// 2 10 20&lt;br /&gt;// ...&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-6797691380847034670?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/6797691380847034670/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2012/02/abusing-for-loops-in-scala.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6797691380847034670'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6797691380847034670'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2012/02/abusing-for-loops-in-scala.html' title='abusing for-loops in scala'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-2498195623905936350</id><published>2012-01-20T17:33:00.002-05:00</published><updated>2012-01-20T17:37:17.370-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='scip'/><category scheme='http://www.blogger.com/atom/ns#' term='python-algebraic'/><category scheme='http://www.blogger.com/atom/ns#' term='zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='cython'/><title type='text'>python-zibopt cython port started</title><content type='html'>I've finally bitten the bullet and started a branch of python-zibopt for the purpose of porting it to &lt;a href="http://cython.org/"&gt;Cython&lt;/a&gt;. &amp;nbsp;In a sense this is like starting over, though I expect it will go fairly quickly once I get in the swing of things. &amp;nbsp;The result should be a library that works in both Python 2 and 3 (currently it just supports 3) and has an API nearly identical to the existing one. &amp;nbsp;There will probably be a few changes, but if you've only used it in the expected manner (as a modeling interface) then I doubt you'll notice any change at all.&lt;br /&gt;&lt;br /&gt;Cython is interesting. &amp;nbsp;It reminds me a bit of&amp;nbsp;&lt;a href="http://search.cpan.org/~sisyphus/Inline-0.49/C/C.pod"&gt;Inline::C&lt;/a&gt;, but not as terrifying. &amp;nbsp;It certainly has its quirks, but its developers have done an excellent job of making the transitions between using Python and C as painless as possible. 
&amp;nbsp;I haven't read all of the docs yet, opting for my usual method of jumping in and seeing how fast I drown, but a few points already stand out:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;It isn't necessary to define every field on a C struct that you are using. &amp;nbsp;You can just define the ones you use or say pass. &amp;nbsp;This is extremely important, as there are so many large structs in SCIP that I thought it would be a deal-breaker.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Python strings and C char * are pretty much inter-operable. &amp;nbsp;Nice.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Constants, C functions, and other things that one yanks into a .pxd (the Cython equivalent of a header file) get treated to proper Python module encapsulation and are available using dots, like foo.bar. &amp;nbsp;Also nice.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;So far I've just done the most basic of things: load SCIP and ask it to solve. &amp;nbsp;However, there are already vast improvements in its error handling mechanisms. &amp;nbsp;The C version used macros and other horrors, and it didn't catch all of SCIP's errors.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Using Cython, I was able to register an error handler with SCIP as if it were a C function (as void *, no less). &amp;nbsp;I defined a simple PY_SCIP_CALL function that I wrap all my calls to SCIP in, similar to the SCIP_CALL macro that litters the SCIP source, and have it raise a Python Exception if anything is amiss. &amp;nbsp;Suddenly I get all SCIP errors with appropriate text and even solver source code lines. &amp;nbsp;Much better.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As a quick example, &lt;a href="http://code.google.com/p/python-zibopt/source/browse/branches/python-zibopt-0.8/src/zibopt/scip.pyx?spec=svn215&amp;amp;r=215"&gt;this code&lt;/a&gt; produces the following output. 
&amp;nbsp;The scip.c reference in the last line shows the actual line number from ZIBopt source:&lt;/div&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="background-color: rgba(255, 255, 255, 0.917969); color: #222222; font-family: arial, sans-serif; font-size: 13px;"&gt;Traceback (most recent call last):&lt;/span&gt;&lt;br /&gt;&lt;span style="background-color: rgba(255, 255, 255, 0.917969); color: #222222; font-family: arial, sans-serif; font-size: 13px;"&gt;&amp;nbsp;File "&amp;lt;string&amp;gt;", line 1, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="background-color: rgba(255, 255, 255, 0.917969); color: #222222; font-family: arial, sans-serif; font-size: 13px;"&gt;&amp;nbsp;File "scip.pyx", line 28, in zibopt.scip.test (src/zibopt/scip.c:606)&lt;/span&gt;&lt;br /&gt;&lt;span style="background-color: rgba(255, 255, 255, 0.917969); color: #222222; font-family: arial, sans-serif; font-size: 13px;"&gt;&amp;nbsp;File "scip.pyx", line 15, in zibopt.scip.PY_SCIP_CALL (src/zibopt/scip.c:513)&lt;/span&gt;&lt;br /&gt;&lt;span style="background-color: rgba(255, 255, 255, 0.917969); color: #222222; font-family: arial, sans-serif; font-size: 13px;"&gt;Exception: [src/scip/scip.c:7500] ERROR: no node selector available&lt;/span&gt;&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-2498195623905936350?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/2498195623905936350/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2012/01/python-zibopt-cython-port-started.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2498195623905936350'/><link 
rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2498195623905936350'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2012/01/python-zibopt-cython-port-started.html' title='python-zibopt cython port started'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-3227743859301575539</id><published>2012-01-13T11:03:00.000-05:00</published><updated>2012-01-13T11:03:00.155-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='formulation'/><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='scip'/><category scheme='http://www.blogger.com/atom/ns#' term='bip'/><category scheme='http://www.blogger.com/atom/ns#' term='python-algebraic'/><title type='text'>normal magic squares</title><content type='html'>As a followup to yesterday's post, I created &lt;a href="http://code.google.com/p/python-zibopt/source/browse/trunk/examples/normal-magic-square.py"&gt;another python-zibopt example&lt;/a&gt; for finding &lt;a href="http://adventuresinoptimization.blogspot.com/2012/01/magic-squares-and-big-ms.html"&gt;Normal Magic Squares&lt;/a&gt;. &amp;nbsp;This is similar to &lt;a href="http://code.google.com/p/python-zibopt/source/browse/trunk/examples/sudoku.py"&gt;the Sudoku example&lt;/a&gt;, except that here the number of binary variables depends on the square size. &amp;nbsp;In the case of Sudoku, each cell has 9 binary variables -- one for each potential value it might take. 
&amp;nbsp;For a normal magic square, there are n^2 possible values for each cell, n^2 cells, and one variable representing the row, column, and diagonal sums. &amp;nbsp;This makes a total of n^4 binary variables and one continuous variable in the model.&lt;br /&gt;&lt;br /&gt;However, there are no big-Ms.&lt;br /&gt;&lt;br /&gt;I think the neat part of this code is in &lt;a href="http://code.google.com/p/python-zibopt/source/browse/trunk/examples/normal-magic-square.py#57"&gt;lines 57-62&lt;/a&gt;. &amp;nbsp;It creates sums of the n^2 variables for each cell with their appropriate coefficients (1 to n^2) and stores those expressions to make the &lt;a href="http://code.google.com/p/python-zibopt/source/browse/trunk/examples/normal-magic-square.py#66"&gt;subsequent constraint creation&lt;/a&gt; simpler &lt;i&gt;(try doing that in your modeling language!)&lt;/i&gt;. &amp;nbsp;All made possible thanks to python-algebraic.&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-3227743859301575539?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/3227743859301575539/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2012/01/normal-magic-squares.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3227743859301575539'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3227743859301575539'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2012/01/normal-magic-squares.html' title='normal magic squares'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-9126644048718923420</id><published>2012-01-12T16:50:00.003-05:00</published><updated>2012-01-12T16:56:34.704-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='formulation'/><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='scip'/><category scheme='http://www.blogger.com/atom/ns#' term='bip'/><category scheme='http://www.blogger.com/atom/ns#' term='python-algebraic'/><category scheme='http://www.blogger.com/atom/ns#' term='zibopt'/><title type='text'>magic squares and big-Ms</title><content type='html'>When I visited the LA PyLadies &lt;a href="http://www.meetup.com/la-pyladies/events/34789522/"&gt;back in October&lt;/a&gt; of 2011, I started toying with a model for finding&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Magic_square"&gt;Magic Squares&lt;/a&gt;&amp;nbsp;in python-zibopt. &amp;nbsp;As a modeling exercise, this is fun but not too terribly challenging. &amp;nbsp;Construct a square matrix of integer-valued variables and add the following constraints:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;All variables &amp;gt;= 1.&lt;/li&gt;&lt;li&gt;All rows, columns, and the diagonal sum to the same value.&lt;/li&gt;&lt;li&gt;All variables take different values.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;&lt;i&gt;Admittedly, I had &lt;a href="http://code.google.com/p/python-zibopt/wiki/ChangeLog#0.7.2_dev"&gt;a few bugs&lt;/a&gt; to fix in the code before I could get this working. 
&amp;nbsp;If you'd like to run it yourself, the model is &lt;a href="http://code.google.com/p/python-zibopt/source/browse/trunk/examples/magic-square.py"&gt;here&lt;/a&gt;. &amp;nbsp;It works against the latest development version in svn trunk of python-zibopt and python-algebraic 0.3.1. &amp;nbsp;When python-zibopt 0.7.2-dev is tagged soon, it will be a part of that.&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The first two constraint types are trivial to implement, and relatively easy for the solver. &amp;nbsp;What I do is add a single extra variable, then set the sum of each row, each column, and the diagonal equal to it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's the third that messes things up. &amp;nbsp;You can think of this as saying, for every possible pair of integer-valued variables x &amp;amp; y:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;Either x &amp;gt;= y + 1 or x &amp;lt;= y - 1.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Why is this hard? &amp;nbsp;Because we can't add both constraints to the model and maintain feasibility. &amp;nbsp;What we have to do is add them in such a way that exactly one will be active for any given solution. &amp;nbsp;This requires, for each pair of variables, an additional binary variable (we will call this z) and a &lt;a href="http://www.inf.ufpr.br/aurora/disciplinas/topicosia2/livros/search/integer.pdf"&gt;(possibly big) constant (M)&lt;/a&gt;. &amp;nbsp;Thus the above must be reformulated as the following before adding it to our model:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;x &amp;gt;= (y + 1) - M*z&lt;/div&gt;&lt;div style="text-align: center;"&gt;x &amp;lt;= (y - 1) + M*(1-z)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;With z = 0 the first constraint is enforced while a sufficiently large M relaxes the second; with z = 1 the roles are reversed. &amp;nbsp;All of a sudden, &lt;a href="http://orinanobworld.blogspot.com/2011/07/perils-of-big-m.html"&gt;here be dragons&lt;/a&gt;. 
&amp;nbsp;We may not know how big or small to make M. &amp;nbsp;Generally we want it as small as possible to avoid playing too much havoc with the LP relaxations of our integer programming model. &amp;nbsp;A large M also contributes to rounding errors (in the magic square problem, if I make M really big, all the variables will come back as 1). &amp;nbsp;Setting M to different values may have an unpredictable effect on the solution time of a given model. &amp;nbsp;So on, so forth.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Which brings us to an interesting idea:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SCIP now supports bilinear constraints out of the box. &amp;nbsp;This means that I can make M a variable in the above model. &amp;nbsp;(Heck, I can even make it an integer variable if I'm feeling particularly insane.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The magic square model linked to in this post (the astute reader will notice it does not solve the &lt;i&gt;normal&lt;/i&gt; magic square problem) provides both methods. &amp;nbsp;The first command line argument it requires is the matrix size. &amp;nbsp;The second one, M, is optional. &amp;nbsp;If M is not given, the model leaves it up to the solver.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I didn't expect this to perform as well as providing sensible values for M, but for small matrices it wasn't terribly worse either. &amp;nbsp;Not quite twice the run time in most of my unscientific tests. 
&amp;nbsp;Given the early state of MINLP development, that's pretty encouraging.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'd love to see what one of the many far more knowledgeable OR bloggers out there has to say about this.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-9126644048718923420?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/9126644048718923420/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2012/01/magic-squares-and-big-ms.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/9126644048718923420'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/9126644048718923420'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2012/01/magic-squares-and-big-ms.html' title='magic squares and big-Ms'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-5686563344423099870</id><published>2011-12-13T11:47:00.000-05:00</published><updated>2011-12-13T11:47:45.442-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='education'/><category scheme='http://www.blogger.com/atom/ns#' term='phd'/><category scheme='http://www.blogger.com/atom/ns#' term='msor'/><title type='text'>in defense of getting a phd</title><content type='html'>Every so often I see a barrage of articles or posts &lt;a href="http://www.economist.com/node/17723223"&gt;discounting&lt;/a&gt; the &lt;a href="http://www.xtranormal.com/watch/7520547/so-you-want-to-get-a-phd-in-theoretical-computer-science"&gt;value&lt;/a&gt; of a PhD. &amp;nbsp;They make arguments along the lines of "you will spend the next six years earning minimum wage as slave to a bitter professor" or "the opportunity cost of not working until you are 30 far outweighs the benefits of such a degree".&lt;br /&gt;&lt;br /&gt;Rarely do I see anyone argue the opposite. &amp;nbsp;Granted, the fact that I am still far away from having finished a PhD at this point means that my opinion is probably worth less than the bits required to store it. &amp;nbsp;However, I suspect there exists a recipe that's being overlooked by PhD naysayers.&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Learn to program, learn databases, and learn the fundamentals of computer science. &amp;nbsp;Learn them Really Well.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Get several years of work experience during and after college. &amp;nbsp;Finish an MS or MA part time. 
&amp;nbsp;It won't take that much longer and you'll be a much stronger candidate for jobs and PhD programs afterward.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Now go and do research in &lt;i&gt;something else&lt;/i&gt;. &amp;nbsp;Something that interests you. &amp;nbsp;It almost doesn't matter what that is.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;For me, that something else is operations research. &amp;nbsp;But I've seen the formula work in completely different fields too, like political campaigns and journalism. &amp;nbsp;Computing experts are scarce in most areas, and are universally necessary these days. &amp;nbsp;Having multiple areas of expertise may make you the least fireable person in your organization.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In my case, getting an MSOR qualifies me to do modeling and simulation, but if I really want to solve big, hard problems (think gas pipelines or massive transportation networks) then I need that PhD. &amp;nbsp;So much so that I sometimes consider going to school full time to speed up the process and enhance my career opportunities.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-5686563344423099870?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/5686563344423099870/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/12/in-defense-of-getting-phd.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/5686563344423099870'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/5686563344423099870'/><link rel='alternate' type='text/html' 
href='http://adventuresinoptimization.blogspot.com/2011/12/in-defense-of-getting-phd.html' title='in defense of getting a phd'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-282575005886517074</id><published>2011-11-18T00:12:00.001-05:00</published><updated>2011-11-18T09:31:01.197-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pso'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='scala-pso'/><title type='text'>some comments on particle swarm optimization</title><content type='html'>&lt;p&gt;I think I'm done making changes to &lt;a href="https://github.com/rzoz/scala-pso"&gt;scala-pso&lt;/a&gt; for now.  It finds the global minimum of Griewank's Function pretty readily, despite being somewhat magical &lt;i&gt;(read: directed stochastic search)&lt;/i&gt;.  Here are a few observations, in no particular order, should anyone be interested:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;PSO does a pretty decent job of finding the minimum in this case, &lt;i&gt;when it has the right parameters&lt;/i&gt;.  Make the step sizes too large, and your swarm will diverge like a herd of angry, disoriented buffalo.  In this case I found that values of 0.7, 1.0, and 1.5 work well for the parameters when &lt;a href="http://en.wikipedia.org/wiki/Particle_swarm_optimization#Algorithm"&gt;updating the velocity vector&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Following on the previous note, the fact that these sorts of algorithms require "metaoptimization" for their parameters is slightly distressing.  
Folks who are used to deterministic modeling in particular are likely to find this idea somewhat offensive.&lt;/li&gt;&lt;li&gt;PSO doesn't make much sense in the context of constrained optimization.  To enforce boundaries, I initially just set particles on any boundary they crossed.  This did not work well.  Various papers suggested using the maximum distance outside of any boundary squared * 10^8 as a penalty instead.  This works well, again, &lt;i&gt;with the right parameters&lt;/i&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Here's what a swarm of 50 looks like over [-17.5, 2.5] in R^2.  The global minimum is at (0, 0).  I'm not including a visualization of the z-axis here -- you can find that in a &lt;a href="http://adventuresinoptimization.blogspot.com/2011/11/particle-swarm-optimization-project.html"&gt;previous post&lt;/a&gt;.  Just remember that we are minimizing; think of orange as higher than blue; and larger as, well, larger.  In short, we are looking for small dark blue particles.  These are snapshots at the start and after successive runs of 25 swarm update iterations.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-TD0wCLusQ8w/TsXpZr_ckPI/AAAAAAAAAoA/hyFP-MzqnvY/s1600/animation-iteration-000.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://2.bp.blogspot.com/-TD0wCLusQ8w/TsXpZr_ckPI/AAAAAAAAAoA/hyFP-MzqnvY/s320/animation-iteration-000.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;Beginning&lt;/i&gt;&lt;/center&gt;&lt;br/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-hGdIvG2Uvdc/TsXpcnbjbPI/AAAAAAAAAoM/aGn201urCmI/s1600/animation-iteration-025.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://1.bp.blogspot.com/-hGdIvG2Uvdc/TsXpcnbjbPI/AAAAAAAAAoM/aGn201urCmI/s320/animation-iteration-025.png" 
/&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;25 iterations&lt;/i&gt;&lt;/center&gt;&lt;br/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-PK-Lakqjuao/TsXpgOhEHHI/AAAAAAAAAoY/dkfvNtLSYy8/s1600/animation-iteration-050.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://4.bp.blogspot.com/-PK-Lakqjuao/TsXpgOhEHHI/AAAAAAAAAoY/dkfvNtLSYy8/s320/animation-iteration-050.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;50 iterations&lt;/i&gt;&lt;/center&gt;&lt;br/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-pncCYd47Zhs/TsXpjW83KCI/AAAAAAAAAok/U8PRNkHPuGE/s1600/animation-iteration-075.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://4.bp.blogspot.com/-pncCYd47Zhs/TsXpjW83KCI/AAAAAAAAAok/U8PRNkHPuGE/s320/animation-iteration-075.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;75 iterations&lt;/i&gt;&lt;/center&gt;&lt;br/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-9-eg7EMNkPo/TsXpm9GozvI/AAAAAAAAAow/LvmiilprOOE/s1600/animation-iteration-100.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://3.bp.blogspot.com/-9-eg7EMNkPo/TsXpm9GozvI/AAAAAAAAAow/LvmiilprOOE/s320/animation-iteration-100.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;100 iterations&lt;/i&gt;&lt;/center&gt;&lt;p&gt;The swarm always keeps its best found point around.  In Genetic Algorithms this is known as &lt;i&gt;elitism&lt;/i&gt;.  Some algorithms use that for updating each particle's velocity.  I am using the current population best, which promotes diversity a bit more.  
These are 95% confidence intervals of the objective (z), x1, and x2 using the same parameters, region, for 100 runs and 25 iterations &lt;i&gt;(it gets kind of boring after 25)&lt;/i&gt;:&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-wJfs68y04BE/TsXq7UeyXkI/AAAAAAAAAo8/zVNFYmYfd4s/s1600/ci-best-z.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://3.bp.blogspot.com/-wJfs68y04BE/TsXq7UeyXkI/AAAAAAAAAo8/zVNFYmYfd4s/s320/ci-best-z.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;Best objective value&lt;/i&gt;&lt;/center&gt;&lt;br/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-2sjviA5eZN8/TsXrFBFa3aI/AAAAAAAAApI/GlDzDLk14uA/s1600/ci-best-x1.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://2.bp.blogspot.com/-2sjviA5eZN8/TsXrFBFa3aI/AAAAAAAAApI/GlDzDLk14uA/s320/ci-best-x1.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;Best X1&lt;/i&gt;&lt;/center&gt;&lt;br/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-VGKga5zudM4/TsXrLPwa5dI/AAAAAAAAApU/fEH0SkAEWFU/s1600/ci-best-x2.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://2.bp.blogspot.com/-VGKga5zudM4/TsXrLPwa5dI/AAAAAAAAApU/fEH0SkAEWFU/s320/ci-best-x2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;Best X2&lt;/i&gt;&lt;/center&gt;&lt;p&gt;Finally, these are 95% CIs for every particle in each of the 100 swarms, this time for 100 iterations:&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-8ZKiX1QF-x0/TsXrRtYuhiI/AAAAAAAAApg/3bJ6K971C-I/s1600/ci-pop-z.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" 
src="http://4.bp.blogspot.com/-8ZKiX1QF-x0/TsXrRtYuhiI/AAAAAAAAApg/3bJ6K971C-I/s320/ci-pop-z.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;Population objective value&lt;/i&gt;&lt;/center&gt;&lt;br/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-yMZcTu4Ny6s/TsXrYk4ibJI/AAAAAAAAAps/TmG_rV_mUSM/s1600/ci-pop-x1.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://4.bp.blogspot.com/-yMZcTu4Ny6s/TsXrYk4ibJI/AAAAAAAAAps/TmG_rV_mUSM/s320/ci-pop-x1.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;Population X1&lt;/i&gt;&lt;/center&gt;&lt;br/&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-KIF11Ih-wL8/TsXrfjOg-NI/AAAAAAAAAp4/AUomWY4XFzU/s1600/ci-pop-x2.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" width="580" src="http://2.bp.blogspot.com/-KIF11Ih-wL8/TsXrfjOg-NI/AAAAAAAAAp4/AUomWY4XFzU/s320/ci-pop-x2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;center&gt;&lt;i&gt;Population X2&lt;/i&gt;&lt;/center&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-282575005886517074?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/282575005886517074/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/some-comments-on-particle-swarm.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/282575005886517074'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/282575005886517074'/><link rel='alternate' 
type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/some-comments-on-particle-swarm.html' title='some comments on particle swarm optimization'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-TD0wCLusQ8w/TsXpZr_ckPI/AAAAAAAAAoA/hyFP-MzqnvY/s72-c/animation-iteration-000.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-4930612762668721868</id><published>2011-11-14T00:12:00.001-05:00</published><updated>2011-11-14T00:25:47.938-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pso'/><category scheme='http://www.blogger.com/atom/ns#' term='graphs'/><category scheme='http://www.blogger.com/atom/ns#' term='processing'/><title type='text'>griewank's function in processing</title><content type='html'>&lt;p&gt;I'm a little ambivalent about using Tableau for visualizing the output of this &lt;a href="http://en.wikipedia.org/wiki/Particle_swarm_optimization"&gt;PSO&lt;/a&gt; project, so I turned to &lt;a href="http://processing.org/"&gt;Processing&lt;/a&gt;, which is a bit nicer and fairly easy.  If I can get it to create video of the particle swarm movement without too much hassle, I'll probably stick with this.  Also, the image output is higher quality.&lt;/p&gt;&lt;p&gt;Here's what Griewank's function looks like in R^2 from [-40,10] on both axes.  Processing code for doing multi-color gradients is below.  The trick is to convert between Processing's idea of the origin (where 0 is at the upper left) to the human idea of that (bottom left).  
This is what the assignments to x1 and y1 do.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/--hhbRKTNW5o/TsCkMbdz8AI/AAAAAAAAAn0/iofIeD4w7A8/s1600/griewank.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="600" width="600" src="http://3.bp.blogspot.com/--hhbRKTNW5o/TsCkMbdz8AI/AAAAAAAAAn0/iofIeD4w7A8/s320/griewank.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;Code (sadly, no &lt;a href="http://alexgorbatchev.com/SyntaxHighlighter/"&gt;SyntaxHighlighter&lt;/a&gt; support yet):&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;// Exterior point values&lt;br /&gt;float x_min = -40;&lt;br /&gt;float x_max =  10;&lt;br /&gt;float y_min = -40;&lt;br /&gt;float y_max =  10;&lt;br /&gt;&lt;br /&gt;// The coefficient in Griewank's function&lt;br /&gt;float c = 200.0;&lt;br /&gt;&lt;br /&gt;float griewank(float x, float y) {&lt;br /&gt;  // First convert from a width x height rectangle with (0, 0)&lt;br /&gt;  // at the top left to an (x_max - x_min) x (y_max - y_min) &lt;br /&gt;  // with (x_min, y_min) at the bottom left. &lt;br /&gt;  float x1 = (x/width) * (x_max-x_min) + x_min;&lt;br /&gt;  float y1 = y_max - (y/height) * (y_max-y_min);&lt;br /&gt;  // Griewank's function subtracts the product of the cosine terms&lt;br /&gt;  return (x1*x1 + y1*y1) / c - (cos(x1) * cos(y1/sqrt(2))) + 1;&lt;br /&gt;} &lt;br /&gt;&lt;br /&gt;void setup() {&lt;br /&gt;  size(600, 600);&lt;br /&gt;  background(0,0,0);&lt;br /&gt;  noLoop();&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;void draw() {&lt;br /&gt;  // Iteration 1: find the maximum.  
Yes this is ugly, but it's &lt;br /&gt;  // better than storing them in an n*n array.&lt;br /&gt;  float max_z = 0.0;&lt;br /&gt;  for (int i = 0; i &lt; width; i++) {&lt;br /&gt;    for (int j = 0; j &lt; height; j++) {&lt;br /&gt;      float z = griewank(i, j);&lt;br /&gt;      if (z &gt; max_z)&lt;br /&gt;        max_z = z;&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;  &lt;br /&gt;  // Iteration 2: color in our gradient&lt;br /&gt;  for (int i = 0; i &lt; width; i++) {&lt;br /&gt;    for (int j = 0; j &lt; height; j++) {&lt;br /&gt;      float z = griewank(i, j);&lt;br /&gt;      color c = color(&lt;br /&gt;        round(255*z/max_z), &lt;br /&gt;        round(125*z/max_z), &lt;br /&gt;        round(35*(max_z-z)/max_z)&lt;br /&gt;      );      &lt;br /&gt;      set(i, j, c);&lt;br /&gt;    } &lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  save("griewank.png");&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-4930612762668721868?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/4930612762668721868/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/griewanks-function-in-processing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4930612762668721868'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4930612762668721868'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/griewanks-function-in-processing.html' title='griewank&apos;s function in processing'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/--hhbRKTNW5o/TsCkMbdz8AI/AAAAAAAAAn0/iofIeD4w7A8/s72-c/griewank.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-426407223977201450</id><published>2011-11-10T16:42:00.001-05:00</published><updated>2011-11-10T16:47:34.392-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='pso'/><category scheme='http://www.blogger.com/atom/ns#' term='scala-pso'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>preliminary code for scala-pso</title><content type='html'>&lt;p&gt;I posted &lt;a href="https://github.com/rzoz/scala-pso"&gt;some preliminary code&lt;/a&gt; for PSO to GitHub.  The interface is fairly simple.  
Here's an example of how it works:&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;package scalapso.examples&lt;br /&gt;&lt;br /&gt;import scala.math&lt;br /&gt;import scalapso.Swarm&lt;br /&gt;&lt;br /&gt;object GriewankPSO {&lt;br /&gt;    val c = 200&lt;br /&gt;    val lowerBounds = Array.fill(2)(-100.0)&lt;br /&gt;    val upperBounds = Array.fill(2)( 100.0)&lt;br /&gt;    &lt;br /&gt;    // Simple implementation of Griewank's function&lt;br /&gt;    def griewank(x: Array[Double]): Double = {&lt;br /&gt;        var r = 0.0&lt;br /&gt;        var p = 1.0&lt;br /&gt;        for ((xi, i) &lt;- x.zipWithIndex) {&lt;br /&gt;            r += xi * xi / c&lt;br /&gt;            p *= math.cos(xi / math.sqrt(i+1))&lt;br /&gt;        }&lt;br /&gt;        &lt;br /&gt;        return r - p + 1&lt;br /&gt;    }&lt;br /&gt;    &lt;br /&gt;    def main(args: Array[String]): Unit = {&lt;br /&gt;        val swarm = new Swarm(griewank, lowerBounds, upperBounds, 100)&lt;br /&gt;        for (i &lt;- 0 until 10000) {&lt;br /&gt;            swarm.iterate()&lt;br /&gt;            println("[" + i + "] f(" + swarm.bestPoint.mkString(", ") + ") = " + swarm.bestObjective)&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Lines 8 and 9 describe the feasible region.  In this case we're looking in two dimensions through [-100, 100].  By changing the arguments to Array.fill &lt;i&gt;(they both have to match)&lt;/i&gt;, we could search as many dimensions as we want over any feasible region.&lt;/p&gt;&lt;p&gt;Lines 12 to 21 define the Griewank Function, as I posted about &lt;a href="http://adventuresinoptimization.blogspot.com/2011/11/idiom-vs-naivete-in-scala.html"&gt;before&lt;/a&gt;.  This is the most important parameter we must define for the PSO engine.  As long as it has the signature Array[Double] =&gt; Double, it can be any function, which is one of the things that's attractive about this technique.  
&lt;i&gt;(It can also be a weakness, realistically.)&lt;/i&gt;&lt;/p&gt;&lt;p&gt;In the main method, we create a swarm of 100 particles and search for 10,000 iterations.  That's about it.  It does reasonably well finding a point close to the global minimum of (0, 0):&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;[0] f(-9.907037564106162, 3.5295131972546727) = 0.845525426092787&lt;br /&gt;[1] f(-9.907037564106162, 3.5295131972546727) = 0.845525426092787&lt;br /&gt;[2] f(-2.75393756437488, 5.082915206876536) = 0.3345069817102082&lt;br /&gt;[... snip ...]&lt;br /&gt;[9998] f(-0.023389653019847856, -0.07359085931487641) = 0.0016565668346417706&lt;br /&gt;[9999] f(-0.023389653019847856, -0.07359085931487641) = 0.0016565668346417706&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Next step: create some fancy visualizations.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-426407223977201450?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/426407223977201450/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/preliminary-code-for-scala-pso.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/426407223977201450'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/426407223977201450'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/preliminary-code-for-scala-pso.html' title='preliminary code for scala-pso'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-1674678190390914243</id><published>2011-11-09T15:51:00.000-05:00</published><updated>2011-11-09T16:09:38.600-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='pso'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>idiom vs. naivete in scala</title><content type='html'>&lt;p&gt;This post examines two methods for computing the &lt;a href="http://mathworld.wolfram.com/GriewankFunction.html"&gt;Griewank Function&lt;/a&gt; in Scala.  It compares the performance of naive versus idiomatic code for this case.&lt;/p&gt;&lt;p&gt;The Griewank Function is computed as follows, given an arbitrary vector of real inputs, x, and a constant, c:&lt;/p&gt;&lt;p&gt;&lt;center&gt;$f(x) = \frac{1}{c}\sum_{i=1}^n{x_i^2} - \prod_{i=1}^n{\cos{\frac{x_i}{\sqrt{i}}}} + 1$&lt;/center&gt;&lt;/p&gt;&lt;p&gt;Obviously this will require some sort of iteration, and for my &lt;a href="http://en.wikipedia.org/wiki/Particle_swarm_optimization"&gt;PSO&lt;/a&gt; library I need it extremely fast as it will be called quite a bit.  
A naive implementation in Scala might look something like this:&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;import scala.math&lt;br /&gt;&lt;br /&gt;def griewank(x: Array[Double], c: Double): Double = {&lt;br /&gt;    var r = 0.0&lt;br /&gt;    var p = 1.0&lt;br /&gt;    for ((xi, i) &lt;- x.zipWithIndex) {&lt;br /&gt;        r += xi * xi / c&lt;br /&gt;        p *= math.cos(xi / math.sqrt(i+1))&lt;br /&gt;    }&lt;br /&gt; &lt;br /&gt;    return r - p + 1&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Here we keep two accumulator variables, one for the sum and one for the product.  We iterate once over x.  Using x.zipWithIndex returns each value of x with its associated index, allowing us to compute the square roots in the denominator.  This functions similarly to enumerate in Python, but with the order of the resulting tuples reversed.&lt;/p&gt;&lt;p&gt;While learning a new language, it's best to try and be as idiomatic as possible.  This was a main point of the classic book, &lt;a href="http://www.amazon.com/Effective-Perl-Programming-Writing-Programs/dp/0201419750"&gt;Effective Perl Programming&lt;/a&gt;.  One can write Perl, Python, Java, etc. &lt;a href="http://c2.com/cgi/wiki?RealProgrammer"&gt;in any language&lt;/a&gt;, but when in Scala one should try and do as the Scalans do.  Thus I spent a little time reformulating this implementation to use the sum and product folds that live on iterators.  Here is a more Scala-esque way of doing things:&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;def griewank2(x: Array[Double], c: Double): Double = {&lt;br /&gt;    val r2 = x.map(y =&gt; y * y / c).sum&lt;br /&gt;    val p2 = x.zipWithIndex.map(&lt;br /&gt;        y =&gt; math.cos(y._1 / math.sqrt(y._2 + 1))&lt;br /&gt;    ).product&lt;br /&gt; &lt;br /&gt;    return r2 - p2 + 1&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;For the first term, x.map takes in a lambda function and then folds it with sum.  
We do the same thing for the second term with zipWithIndex, using _1 and _2 to extract the tuple components and product for folding.&lt;/p&gt;&lt;p&gt;But there's something wrong here, which is that the latter implementation iterates twice over x.  The following benchmark shows the naive implementation is about 25% faster on my machine than the idiomatic one.  I guess I'll be sticking with the former.&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;package scalapso.examples&lt;br /&gt;&lt;br /&gt;import scala.math&lt;br /&gt;import scala.util.Random&lt;br /&gt;&lt;br /&gt;object GriewankTest {&lt;br /&gt;    // Simple implementation of Griewank's function &lt;br /&gt;    def griewank(x: Array[Double], c: Double): Double = {&lt;br /&gt;        var r = 0.0&lt;br /&gt;        var p = 1.0&lt;br /&gt;        for ((xi, i) &lt;- x.zipWithIndex) {&lt;br /&gt;            r += xi * xi / c&lt;br /&gt;            p *= math.cos(xi / math.sqrt(i+1))&lt;br /&gt;        }&lt;br /&gt;        &lt;br /&gt;        return r - p + 1&lt;br /&gt;    }&lt;br /&gt;    &lt;br /&gt;    // Alternative implementation using folding&lt;br /&gt;    def griewank2(x: Array[Double], c: Double): Double = {&lt;br /&gt;        val r2 = x.map(y =&gt; y * y / c).sum&lt;br /&gt;        val p2 = x.zipWithIndex.map(&lt;br /&gt;            y =&gt; math.cos(y._1 / math.sqrt(y._2 + 1))&lt;br /&gt;        ).product&lt;br /&gt;        &lt;br /&gt;        return r2 - p2 + 1&lt;br /&gt;    }&lt;br /&gt;    &lt;br /&gt;    def main(args: Array[String]): Unit = {&lt;br /&gt;        // Generate a bunch of random points in 100 dimensions&lt;br /&gt;        val rand = new Random&lt;br /&gt;        val points: Array[Array[Double]] = Array.ofDim(100000, 100)&lt;br /&gt;        for (point &lt;- points)&lt;br /&gt;            for (i &lt;- 0 until point.length)&lt;br /&gt;                point(i) = rand.nextDouble&lt;br /&gt;    &lt;br /&gt;        val c = 200&lt;br /&gt;                &lt;br /&gt;        // Time 
both implementations&lt;br /&gt;        val t1 = System.currentTimeMillis&lt;br /&gt;        for (point &lt;- points)&lt;br /&gt;            griewank(point, c)&lt;br /&gt;        val t2 = System.currentTimeMillis    &lt;br /&gt;        for (point &lt;- points)&lt;br /&gt;            griewank2(point, c)&lt;br /&gt;        val t3 = System.currentTimeMillis&lt;br /&gt;        &lt;br /&gt;        println("simple implementation: " + (t2-t1))&lt;br /&gt;        println("folded implementation: " + (t3-t2))&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Here's the output for good measure:&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;simple implementation: 1236&lt;br /&gt;folded implementation: 1627&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-1674678190390914243?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/1674678190390914243/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/idiom-vs-naivete-in-scala.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1674678190390914243'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1674678190390914243'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/idiom-vs-naivete-in-scala.html' title='idiom vs. naivete in scala'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-3240069814202337689</id><published>2011-11-07T14:06:00.000-05:00</published><updated>2011-11-07T14:17:33.908-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hibernate'/><category scheme='http://www.blogger.com/atom/ns#' term='maven'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>scala, maven, and hibernate quickstart</title><content type='html'>&lt;p&gt;Lately I've started a new project which will use data from pre-existing tables.  It has to target the JVM, but that's its only real technical restriction.  I'm taking this moment to modernize some of the components in our application stack, such as setting a better precedent for dependency handling and connecting to data tables.  These are notes from the last few days of my infrastructural futzing around in &lt;a href="http://www.scala-lang.org/"&gt;Scala&lt;/a&gt;, &lt;a href="http://maven.apache.org/"&gt;Maven&lt;/a&gt;, and &lt;a href="http://hibernate.org/"&gt;Hibernate&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In this case the tables I need to connect to already exist, so I don't need any of Hibernate's functionality for building out schema.  
For the sake of example, let's say a table I need to use &lt;i&gt;(we'll call it Foo)&lt;/i&gt; looks like the following:&lt;/p&gt;&lt;ul&gt;    &lt;li&gt;foo_id: integer primary key&lt;/li&gt;    &lt;li&gt;type: varchar(5)&lt;/li&gt;    &lt;li&gt;baz: integer&lt;/li&gt;    &lt;li&gt;xyzzy: date&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;I create a new Scala project called &lt;i&gt;My Awesome App&lt;/i&gt; using &lt;a href="http://www.scala-ide.org/"&gt;ScalaIDE&lt;/a&gt; and convert it to a Maven project.  I then build out my directory and package structure so it resembles this:&lt;/p&gt;&lt;ul&gt;    &lt;li&gt;pom.xml&lt;/li&gt;    &lt;li&gt;        src        &lt;ul&gt;            &lt;li&gt;                main.resources                &lt;ul&gt;&lt;li&gt;hibernate.cfg.xml&lt;/li&gt;&lt;/ul&gt;            &lt;/li&gt;            &lt;li&gt;                my.awesome.app                &lt;ul&gt;                    &lt;li&gt;Foo.scala&lt;/li&gt;                    &lt;li&gt;TestFoo.scala&lt;/li&gt;                &lt;/ul&gt;            &lt;/li&gt;        &lt;/ul&gt;    &lt;/li&gt;&lt;/ul&gt;    &lt;p&gt;I update the pom.xml file, telling Maven where to get Hibernate.  Normally one would put the database driver library here too, but I'm connecting to Oracle which does not, um, provide a Maven repository.  
So in my case I download ojdbc6.jar and add it to my libraries in Eclipse, which will do for now.&lt;/p&gt;&lt;pre class="brush: xml; toolbar: false;"&gt;&lt;project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"&gt;&lt;br /&gt;    &lt;modelVersion&gt;4.0.0&lt;/modelVersion&gt;&lt;br /&gt;    &lt;groupId&gt;my_group_id&lt;/groupId&gt;&lt;br /&gt;    &lt;artifactId&gt;my-artifact-id&lt;/artifactId&gt;&lt;br /&gt;    &lt;version&gt;0.0.1-SNAPSHOT&lt;/version&gt;&lt;br /&gt;    &lt;name&gt;My Awesome App&lt;/name&gt;&lt;br /&gt;    &lt;br /&gt;    &lt;repositories&gt;&lt;br /&gt;        &lt;repository&gt;&lt;br /&gt;            &lt;id&gt;JBoss&lt;/id&gt;&lt;br /&gt;            &lt;name&gt;JBoss Repository&lt;/name&gt;&lt;br /&gt;            &lt;layout&gt;default&lt;/layout&gt;&lt;br /&gt;            &lt;url&gt;http://repository.jboss.org/nexus/content/groups/public-jboss/&lt;/url&gt;&lt;br /&gt;        &lt;/repository&gt;&lt;br /&gt;    &lt;/repositories&gt;&lt;br /&gt;    &lt;br /&gt;    &lt;dependencies&gt;&lt;br /&gt;        &lt;dependency&gt;&lt;br /&gt;            &lt;groupId&gt;javassist&lt;/groupId&gt;&lt;br /&gt;            &lt;artifactId&gt;javassist&lt;/artifactId&gt;&lt;br /&gt;            &lt;version&gt;3.12.1.GA&lt;/version&gt;&lt;br /&gt;        &lt;/dependency&gt;&lt;br /&gt;&lt;br /&gt;        &lt;dependency&gt;&lt;br /&gt;            &lt;groupId&gt;org.hibernate&lt;/groupId&gt;&lt;br /&gt;            &lt;artifactId&gt;hibernate-core&lt;/artifactId&gt;&lt;br /&gt;            &lt;version&gt;3.6.8.Final&lt;/version&gt;&lt;br /&gt;        &lt;/dependency&gt;&lt;br /&gt;    &lt;/dependencies&gt;&lt;br /&gt;&lt;/project&gt;&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Now I create a Scala class representing the Foo table above.  
A few notes about this:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;I use val instead of var, per best practices in Scala.  In my case all data are read-only anyway.  This makes all fields final and is a lot less annoying than using getters and setters.&lt;/li&gt;&lt;li&gt;Make sure to import the JPA annotations &lt;i&gt;(from javax.persistence)&lt;/i&gt; instead of those with the same name from Hibernate.  I had Eclipse import the wrong ones and figuring out what went wrong was no fun.&lt;/li&gt;&lt;li&gt;When a database column has a name that is awkward or unusable in Scala &lt;i&gt;(type is a reserved word, for instance)&lt;/i&gt;, you can give the field whatever name you want on your class and use the @Column annotation to point it at the right database column.&lt;/li&gt;&lt;/ul&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;package my.awesome.app&lt;br /&gt;&lt;br /&gt;import java.util.Date&lt;br /&gt;import javax.persistence.{Column, Entity, GeneratedValue, Id, Table}&lt;br /&gt;&lt;br /&gt;@Entity @Table(name="Foo") class Foo {&lt;br /&gt;    @Id @GeneratedValue @Column(name="foo_id") val fooId: Long = 0&lt;br /&gt;    @Column(name="type") val fooType: String = ""  &lt;br /&gt;    val baz: Long = 0&lt;br /&gt;    val xyzzy: Date = new Date()&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Now that I have a Scala representation of this table, I can configure Hibernate.  The following is my hibernate.cfg.xml file.  
Note the mapping tag at the end that tells it about the annotated Foo class.&lt;/p&gt;&lt;pre class="brush: xml; toolbar: false;"&gt;&lt;?xml version='1.0' encoding='utf-8'?&gt;&lt;br /&gt;&lt;!DOCTYPE hibernate-configuration PUBLIC "-//Hibernate/Hibernate Configuration DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd"&gt;&lt;br /&gt;&lt;br /&gt;&lt;hibernate-configuration&gt;&lt;br /&gt;    &lt;session-factory&gt;&lt;br /&gt;        &lt;!-- Database connection settings --&gt;&lt;br /&gt;        &lt;property name="connection.driver_class"&gt;my database driver&lt;/property&gt;&lt;br /&gt;        &lt;property name="connection.url"&gt;my connection string&lt;/property&gt;&lt;br /&gt;        &lt;property name="connection.username"&gt;my database user&lt;/property&gt;&lt;br /&gt;        &lt;property name="connection.password"&gt;my database password&lt;/property&gt;&lt;br /&gt;&lt;br /&gt;        &lt;!-- JDBC connection pool (use the built-in) --&gt;&lt;br /&gt;        &lt;property name="connection.pool_size"&gt;1&lt;/property&gt;&lt;br /&gt;&lt;br /&gt;        &lt;!-- SQL dialect --&gt;&lt;br /&gt;        &lt;property name="dialect"&gt;dialect for my database&lt;/property&gt;&lt;br /&gt;&lt;br /&gt;        &lt;!-- Enable Hibernate's automatic session context management --&gt;&lt;br /&gt;        &lt;property name="current_session_context_class"&gt;thread&lt;/property&gt;&lt;br /&gt;&lt;br /&gt;        &lt;!-- Disable the second-level cache  --&gt;&lt;br /&gt;        &lt;property name="cache.provider_class"&gt;org.hibernate.cache.NoCacheProvider&lt;/property&gt;&lt;br /&gt;&lt;br /&gt;        &lt;!-- Echo all executed SQL to stdout --&gt;&lt;br /&gt;        &lt;property name="show_sql"&gt;true&lt;/property&gt;&lt;br /&gt;&lt;br /&gt;        &lt;!--  Table definitions --&gt;&lt;br /&gt;        &lt;mapping class="my.awesome.app.Foo"&gt;&lt;/mapping&gt;&lt;br /&gt;    &lt;/session-factory&gt;&lt;br /&gt;&lt;/hibernate-configuration&gt;&lt;br 
/&gt;&lt;/pre&gt;&lt;p&gt;Finally, I create a simple main method for testing this.  Importing scala.collection.JavaConversions._ allows me to iterate over a java.util.List[Foo] instance as I would any Scala iterable.&lt;/p&gt;&lt;pre class="brush: scala; toolbar: false;"&gt;package my.awesome.app&lt;br /&gt;&lt;br /&gt;import org.hibernate.cfg.Configuration&lt;br /&gt;import scala.collection.JavaConversions._&lt;br /&gt;&lt;br /&gt;object TestFoo {&lt;br /&gt;    def main(args: Array[String]): Unit = {&lt;br /&gt;        val session = new Configuration().configure().buildSessionFactory().openSession()&lt;br /&gt;        val result = session.createQuery("from Foo").list().asInstanceOf[java.util.List[Foo]]&lt;br /&gt;        for (foo &lt;- result)&lt;br /&gt;            println(foo.fooId + " | " + foo.fooType + " | " + foo.baz + " | " + foo.xyzzy)&lt;br /&gt;        session.close()&lt;br /&gt;    }        &lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Profit!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-3240069814202337689?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/3240069814202337689/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/scala-maven-and-hibernate-quickstart.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3240069814202337689'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3240069814202337689'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/scala-maven-and-hibernate-quickstart.html' title='scala, maven, and 
hibernate quickstart'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-1411184409021540453</id><published>2011-11-03T14:37:00.001-04:00</published><updated>2011-11-03T14:37:04.693-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pyladies'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>know your time complexities: part 2</title><content type='html'>&lt;p&gt;In response to &lt;a href=""&gt;this&lt;/a&gt; post, &lt;a href="http://www.indopedia.org/index.php?title=Ben_Bitdiddle"&gt;Ben Bitdiddle&lt;/a&gt; inquires:&lt;/p&gt;&lt;blockquote&gt;"I understand the concept of using a companion set to remove duplicates from a list while preserving the order of its elements.  But what should I do if these elements are composed of smaller pieces?  For instance, say I am generating &lt;a href="http://en.wikipedia.org/wiki/Combination"&gt;combinations&lt;/a&gt; of numbers in which order is unimportant.  How do I make a set recognize that [1,2,3] is the same as [3,2,1] in this case?"&lt;/blockquote&gt;&lt;p&gt;There are a couple points that should help here:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;While lists are unhashable and therefore cannot be put into sets, tuples are perfectly capable of this.  
Therefore I cannot do:&lt;pre&gt;&lt;br /&gt;&gt;&gt;&gt; s = set()&lt;br /&gt;&gt;&gt;&gt; s.add([1,2,3])&lt;br /&gt;Traceback (most recent call last):&lt;br /&gt; File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;&lt;br /&gt;TypeError: unhashable type: 'list'&lt;br /&gt;&lt;/pre&gt;But this works just fine &lt;i&gt;(extra space added for emphasis of tuple parentheses)&lt;/i&gt;:&lt;pre&gt;&lt;br /&gt;&gt;&gt;&gt; s.add( (1,2,3) )&lt;br /&gt;&lt;/pre&gt;&lt;/li&gt;&lt;li&gt;(3,2,1) and (1,2,3) may not hash to the same thing, but tuples are easily sortable.  If I sort them before adding them to a set, they look the same: &lt;pre&gt;&lt;br /&gt;&gt;&gt;&gt; tuple(sorted( (3,2,1) ))&lt;br /&gt;(1, 2, 3)&lt;br /&gt;&lt;/pre&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;If I want to be a little fancier, I can use &lt;a href="http://docs.python.org/library/itertools.html#itertools.combinations"&gt;itertools.combinations&lt;/a&gt;.  The following generates all unique 3-digit combinations of integers from 1 to 4:&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt;&gt; from itertools import combinations&lt;br /&gt;&gt;&gt;&gt; list(combinations(range(1,5), 3))&lt;br /&gt;[(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Now say I want to only find those that match some condition.  
I can add a filter to return, say, only those 3-digit combinations of integers from 1 to 6 that multiply to a number divisible by 10:&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt;&gt; list(filter(&lt;br /&gt;        lambda x: not (x[0]*x[1]*x[2]) % 10, &lt;br /&gt;        combinations(range(1, 7), 3)&lt;br /&gt;    ))&lt;br /&gt;[(1, 2, 5), (1, 4, 5), (1, 5, 6), (2, 3, 5), (2, 4, 5),&lt;br /&gt; (2, 5, 6), (3, 4, 5), (3, 5, 6), (4, 5, 6)]&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-1411184409021540453?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/1411184409021540453/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/know-your-time-complexities-part-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1411184409021540453'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1411184409021540453'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/know-your-time-complexities-part-2.html' title='know your time complexities: part 2'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-3250991730554484711</id><published>2011-11-01T16:10:00.001-04:00</published><updated>2011-11-01T16:11:17.238-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pso'/><category scheme='http://www.blogger.com/atom/ns#' term='erlang'/><category scheme='http://www.blogger.com/atom/ns#' term='scala'/><title type='text'>particle swarm optimization project</title><content type='html'>&lt;p&gt;For my Numerical Analysis &lt;a href="http://catalog.gmu.edu/preview_course.php?catoid=17&amp;coid=108800&amp;print"&gt;class&lt;/a&gt;, we've been tasked with implementing &lt;a href="http://en.wikipedia.org/wiki/Particle_swarm_optimization"&gt;Particle Swarm Optimization&lt;/a&gt; (PSO) against the &lt;a href="http://mathworld.wolfram.com/GriewankFunction.html"&gt;Griewank Function&lt;/a&gt;.  A few notes about this:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Griewank's Function is unconstrained and non-convex.  Due to the latter characteristic, its global optimum may not be easily found using traditional nonlinear optimization.&lt;/li&gt;&lt;li&gt;It takes in a vector of arbitrary dimension and returns a single output.  The global minimum is at x = 0, f(x) = 0, making it simple to verify results.  &lt;i&gt;(We're looking for the global minimum, in case that wasn't clear.)&lt;/i&gt;&lt;/li&gt;&lt;li&gt;Particle Swarm "Optimization" is a bit of a misnomer.  Like other types of swarm intelligence or evolutionary computation (EC), it would be better called a "directed stochastic search" than an optimization method.  
There's nothing wrong with that; these are really useful in problems that are non-convex or otherwise ill defined.  &lt;i&gt;(If this statement piques your interest, I point you to &lt;a href="http://cs.gmu.edu/~eclab/papers/foga2_opt.ps"&gt;this classic paper&lt;/a&gt;.)&lt;/i&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Unfortunately for me, I've about had my fill of implementing EC techniques.  I actually wish I could get back some of the hours I've spent on that, but until somebody fires up a Genetic Programming engine on a Beowulf cluster to invent a time machine, that's not going to happen.  Still, a few modifications should make the project really interesting:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Instead of using Matlab or a procedural language, implement it using message passing in either Erlang or Scala, where each particle has its own process or actor.  This will make it scale nicely over a network.&lt;/li&gt;&lt;li&gt;Compare it with a simple hill climbing algorithm that starts at discretized points, or possibly with Newton's Method.&lt;/li&gt;&lt;li&gt;Use Tableau for visualizing the output.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Here's what Griewank's Function looks like in two dimensions.  Orange corresponds to higher objective values.  
The global minimum is at x1 = x2 = 0, in the dark blue near the top right:&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-Iyhf19P1kKQ/TrBNEBP7qkI/AAAAAAAAAeA/WnB3VHIyD5s/s1600/griewank.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="640" src="http://2.bp.blogspot.com/-Iyhf19P1kKQ/TrBNEBP7qkI/AAAAAAAAAeA/WnB3VHIyD5s/s320/griewank.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;I'll post here and stick the code on github or some other place when it's coming along.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-3250991730554484711?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/3250991730554484711/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/particle-swarm-optimization-project.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3250991730554484711'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3250991730554484711'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/11/particle-swarm-optimization-project.html' title='particle swarm optimization project'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-Iyhf19P1kKQ/TrBNEBP7qkI/AAAAAAAAAeA/WnB3VHIyD5s/s72-c/griewank.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-8349101899471871547</id><published>2011-10-25T17:42:00.001-04:00</published><updated>2011-10-25T17:44:06.590-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pyladies'/><category scheme='http://www.blogger.com/atom/ns#' term='algorithms'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>know your time complexities</title><content type='html'>&lt;i&gt;This is based on a lightning talk I gave at the LA PyLadies &lt;a href="http://www.meetup.com/la-pyladies/events/34789522/"&gt;October Hackathon&lt;/a&gt;.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt; I'm actually not going to go into anything much resembling algorithmic complexity here.  What I'd like to do is present a common performance anti-pattern that I see from novice programmers about once every year or so.  If I can prevent one person from committing this error, this post will have achieved its goal.  I'd also like to show how an intuitive understanding of time required by operations in relation to the size of data they operate on can be helpful.&lt;br /&gt;&lt;br /&gt;Say you have a Big List of Things.  It doesn't particularly matter what these things are.  Often they might be objects or dictionaries of denormalized data.  In this example we'll use numbers.  
Let's generate a list of 1 million integers, each randomly chosen from the first 100 thousand natural numbers:&lt;br /&gt;&lt;pre class="brush: python; toolbar: false;"&gt;import random&lt;br /&gt;&lt;br /&gt;choices = range(100000)&lt;br /&gt;x = [random.choice(choices) for i in range(1000000)]&lt;br /&gt;&lt;/pre&gt;Now say you want to remove &lt;i&gt;(or aggregate, or structure)&lt;/i&gt; duplicate data while keeping the remaining items &lt;i&gt;in order of appearance&lt;/i&gt;.  Intuitively, this seems simple enough.  A first solution might involve creating a new empty list, iterating over x, and only appending those items that are not already in the new list:&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Bad Way:&lt;/strong&gt;&lt;br /&gt;&lt;pre class="brush: python; toolbar: false;"&gt;order = []&lt;br /&gt;for i in x:&lt;br /&gt;    if i not in order:&lt;br /&gt;        order.append(i)&lt;br /&gt;&lt;/pre&gt;Try running this.  What's wrong with it?&lt;br /&gt;&lt;br /&gt;The issue is the conditional on line 3.  In the worst case, it could look at every item in the order list for each item in x, so the total work grows with the length of x times the number of distinct values.  If the list is big, as it is in our example, that wastes a lot of cycles.  We can improve the performance of our code by replacing this conditional with something faster.&lt;br /&gt;&lt;br /&gt;Given that sets have near-constant-time membership tests, one solution is to create a companion data structure, which we'll call seen.  Being a set, it doesn't care about the order of the items, but it will allow us to test for membership quickly:&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Good Way:&lt;/strong&gt;&lt;br /&gt;&lt;pre class="brush: python; toolbar: false;"&gt;order = []&lt;br /&gt;seen = set()&lt;br /&gt;for i in x:&lt;br /&gt;    if i not in seen:&lt;br /&gt;        seen.add(i)&lt;br /&gt;        order.append(i)&lt;br /&gt;&lt;/pre&gt;Now try running this.  Better?&lt;br /&gt;&lt;br /&gt;Not that this is the best way to perform this particular action.  
If you aren't familiar with it, take a look at the &lt;a href="http://docs.python.org/library/itertools.html#itertools.groupby"&gt;groupby&lt;/a&gt; function from itertools, which is what I will sometimes reach for in a case like this.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-8349101899471871547?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/8349101899471871547/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/10/know-your-time-complexities.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8349101899471871547'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8349101899471871547'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/10/know-your-time-complexities.html' title='know your time complexities'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-5592588364351064972</id><published>2011-08-24T00:56:00.002-04:00</published><updated>2011-08-24T01:05:58.455-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='python-algebraic'/><title type='text'>python-algebraic</title><content type='html'>&lt;p&gt;After talking with a number of people and processing various bits of feedback, I've come to a few realizations regarding a shift in direction for python-zibopt.&lt;/p&gt;&lt;ol&gt;&lt;li&gt;It will be extremely painful, if not impossible, to expose solver components such as branching rules to Python using the C API.  I almost got this working, but ran headlong into the problem of exception handling.  Consider what happens when a branching rule or other component is written in Python and raises an exception.  This exception must then be captured, handed off to the C API in python-zibopt, managed by SCIP, and then handed back to Python.  Nasty stuff.&lt;/li&gt;&lt;li&gt;The only right way to target both Python 2 and 3 with a C extension is using Cython.  This should also help with the previous item.  Without porting to Cython, supporting multiple Python versions would require horrible and painful sessions of implementing unmaintainable macros.&lt;/li&gt;&lt;li&gt;The flexible expression handling syntax of python-zibopt may be useful to other libraries, so I am spinning it out into its own project called python-algebraic.  It supports both Python 2 and 3 seamlessly.  
Right now it only handles those components necessary for SCIP, but with a little work it could develop into a useful library for other solver interfaces.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Lately I've felt a little like python-zibopt is taking two steps forward and one step back, but I think porting the C extension parts to Cython will help with that somewhat.  It will certainly simplify the implementation.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-5592588364351064972?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/5592588364351064972/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/08/python-algebraic.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/5592588364351064972'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/5592588364351064972'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/08/python-algebraic.html' title='python-algebraic'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-7988998613699682045</id><published>2011-07-07T15:52:00.005-04:00</published><updated>2011-07-07T15:57:25.608-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><title type='text'>python-zibopt 0.7 released</title><content type='html'>&lt;p&gt;This is mostly a documentation and bug-fixing release.  A few notes:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;The build instructions now link python-zibopt against &lt;a href="http://www.coin-or.org/Ipopt/"&gt;Ipopt&lt;/a&gt; for improved performance on nonlinear models.&lt;/li&gt;&lt;li&gt;Module docs are now being generated using reStructuredText and Sphinx. They live &lt;a href="http://packages.python.org/python-zibopt/"&gt;here&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;The new expressions broke chained inequalities against a single variable in v0.6.  These now work as expected: 2 &lt;= x &lt;= 4.&lt;/li&gt;&lt;li&gt;Nonlinear objective functions require creating a new variable and maximizing or minimizing the value of that variable.  
These use &gt;= and &lt;= in constraints for the new variable.&lt;/li&gt;&lt;li&gt;Dividing by variables (1/x) raises an exception.&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-7988998613699682045?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/7988998613699682045/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/07/python-zibopt-07-released.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/7988998613699682045'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/7988998613699682045'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/07/python-zibopt-07-released.html' title='python-zibopt 0.7 released'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-8468893432538388818</id><published>2011-07-01T16:28:00.000-04:00</published><updated>2011-07-01T16:28:00.202-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>python-zibopt module docs</title><content type='html'>&lt;p&gt;I needed a change of pace, so I took some time to convert all the docstrings to &lt;a href="http://docutils.sourceforge.net/rst.html"&gt;reStructuredText&lt;/a&gt; and integrate with &lt;a href="http://sphinx.pocoo.org/"&gt;Sphinx&lt;/a&gt;, the documentation builder.  Apart from having to fix a few things to get Sphinx 1.1pre working in Python 3 (apparently it has to actually load a given module to generate its documentation), this was not too painful.  
I should have jumped on the reST bandwagon some time ago.&lt;/p&gt;&lt;p&gt;The module docs can be found on PyPI &lt;a href="http://packages.python.org/python-zibopt/"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-8468893432538388818?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/8468893432538388818/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/07/python-zibopt-module-docs.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8468893432538388818'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8468893432538388818'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/07/python-zibopt-module-docs.html' title='python-zibopt module docs'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-7957742782837455198</id><published>2011-06-25T12:50:00.003-04:00</published><updated>2011-06-25T15:23:00.630-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='formulation'/><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='scip'/><title type='text'>bilinear expressions in python-zibopt 0.6</title><content type='html'>&lt;p&gt;The latest &lt;a href="http://code.google.com/p/python-zibopt/downloads/detail?name=python-zibopt-0.6.dev-r157.tar.gz&amp;can=2&amp;q="&gt;development release&lt;/a&gt; of python-zibopt supports bilinear constraints and objective functions and can be built against SCIP 2.0.1 and Python 3.2.  This is very much a development version, so your mileage may vary.  Algebraic expression handling has been greatly improved, along with &lt;a href="http://code.google.com/p/python-zibopt/wiki/ChangeLog"&gt;a few other things&lt;/a&gt; such as the ability to remove constraints.&lt;/p&gt;&lt;p&gt;For 0.7 and 0.8 we are attempting to improve the build situation, providing .deb and .rpm files if possible.  There will also likely be access to such properties as shadow costs and reduced costs, in addition to the ability to start writing branching rules and other solver internals directly in Python.  In addition to its algebraic syntax, it is the last feature that will differentiate python-zibopt somewhat from similar libraries, in my opinion.&lt;/p&gt;&lt;p&gt;In 0.6 I had to completely change the way expressions are handled in order to support bilinear constraints.  
The following types of constraints are now possible:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;solver += 2*x**2 + (3*y)**2 &lt;= z**2 + 4&lt;br /&gt;solver += (x + y) * (x - y) &gt;= 0&lt;br /&gt;solver += (x/4)**2 == 3&lt;br /&gt;solver += 2 &lt;= x &lt;= 4&lt;/pre&gt;&lt;p&gt;A few notes about the new features:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Expressions can exist on either side of an inequality.  Previously, one side had to be a constant.&lt;/li&gt;&lt;li&gt;SCIP supports bilinear terms, but that is all.  x**3 is invalid.  Exponents must be integers from 0 to 2.  Similarly, x*y is allowed but x*y*z is not.&lt;/li&gt;&lt;li&gt;Dividing by variables, like 1/x, is not allowed.&lt;/li&gt;&lt;li&gt;Chained inequalities are possible when they apply to a single variable, such as 2 &lt;= x &lt;= 4, but not to multiple variables, as it can be impossible to determine lower and upper bound constants. Consider x &lt;= y &lt;= z.&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;All constraints can be stored in Python variables using this syntax and solver.constraint(...), then removed from or added back to a solver later:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;constraint = solver.constraint(2 &lt;= x &lt;= 4)&lt;br /&gt;solver -= constraint # remove from formulation&lt;br /&gt;solver += constraint # add back to formulation&lt;/pre&gt;&lt;p&gt;When SCIP variables are subjected to mathematical operators, they are wrapped up into expression objects.  These are essentially sums of terms with associated coefficients.  Terms are tuples of the multiplied variables, so x becomes (x,) and x*y becomes (x, y).  Thus the expression 2*x**2 + (3*y)**2 + 4*x*y - y is stored as:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;{&lt;br /&gt;    (x, x): 2,&lt;br /&gt;    (y, y): 9,&lt;br /&gt;    (x, y): 4,&lt;br /&gt;    (y,): -1&lt;br /&gt;}&lt;/pre&gt;&lt;p&gt;The above requires that variables be orderable so x*y is stored the same way as y*x.  
Expressions handle applying algebraic logic and storing inequalities by overriding Python's special methods such as __add__ and __le__.&lt;/p&gt;&lt;p&gt;An interesting case arose when trying to deal with chained inequalities of a single variable, such as 2 &lt;= x &lt;= 4.  Initially, I had variables return a new expression when they were subjected to &lt;=, &gt;= or ==.  These methods normally return self, so it seemed like this might work.  However, it ended up creating two expressions in this case since Python evaluates it as (2 &lt;= x) and (x &lt;= 4) instead of the more LISP-ish ((2 &lt;= x) &lt;= 4) that I was looking for.  I've gotten around this in the source trunk (0.7) by having variables subclass expressions.  This required adding some inelegant code to clear expression bounds off of variables once they are added to constraints; otherwise the logic for removing constraints could malfunction in single-variable cases.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-7957742782837455198?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/7957742782837455198/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/bilinear-expressions-in-python-zibopt.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/7957742782837455198'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/7957742782837455198'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/bilinear-expressions-in-python-zibopt.html' title='bilinear expressions in python-zibopt 
0.6'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-6240313286157009902</id><published>2011-06-20T22:23:00.002-04:00</published><updated>2011-06-20T22:33:50.515-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='formulation'/><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='scip'/><title type='text'>optimal cd laddering</title><content type='html'>&lt;p&gt;Here's an application that came up in conversation recently which I couldn't find any existing optimization models for: &lt;a href="http://www.bankrate.com/finance/savings/how-to-ladder-a-cd-portfolio.aspx"&gt;CD laddering&lt;/a&gt;.  There are general methods for doing this, but I was interested in creating one that could take in the following inputs and produce an optimal schedule to maximize expected revenue:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;A starting amount of money.&lt;/li&gt;&lt;li&gt;A minimum amount of money to have available each period (month) in cash and expiring CDs.&lt;/li&gt;&lt;li&gt;A maximum amount of money to put in any given CD in a period.&lt;/li&gt;&lt;li&gt;A list of CDs, their APYs, and their minimum investment amounts.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The model can be found &lt;a href="http://code.google.com/p/python-zibopt/source/browse/trunk/examples/cd-laddering.py"&gt;here&lt;/a&gt;.  
It takes JSON &lt;a href="http://code.google.com/p/python-zibopt/source/browse/trunk/examples/cd-laddering.json"&gt;input&lt;/a&gt; and makes the following basic assumptions, among other things:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;APYs and investment minimums stay the same.  That is, CD investment options are the same from one period to the next.&lt;/li&gt;&lt;li&gt;All CDs are compounded monthly.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Of course, a really useful attempt at something like this would try to predict future CD rates and simulate various scenarios to determine best and worst case.  It would also deal with the fact that some CDs are compounded daily while others compounded monthly.  However, given current CD rates, this probably isn't worth too much more thought at this point.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-6240313286157009902?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/6240313286157009902/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/optimal-cd-laddering.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6240313286157009902'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6240313286157009902'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/optimal-cd-laddering.html' title='optimal cd laddering'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-8996666988741474128</id><published>2011-06-20T12:42:00.004-04:00</published><updated>2011-06-20T12:44:08.198-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><title type='text'>python-zibopt google group</title><content type='html'>There is a &lt;a href="http://groups.google.com/group/python-zibopt/"&gt;new Google group&lt;/a&gt; intended for discussion of python-zibopt announcements, usage, and features.  Please consider joining if you use this API at all.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-8996666988741474128?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/8996666988741474128/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/python-zibopt-google-group.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8996666988741474128'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8996666988741474128'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/python-zibopt-google-group.html' title='python-zibopt google group'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-8021060162674853877</id><published>2011-06-13T11:44:00.002-04:00</published><updated>2011-06-13T11:47:15.526-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='r'/><title type='text'>syntax highlighting for r</title><content type='html'>Thanks to &lt;a href="http://orinanobworld.blogspot.com/"&gt;Paul Rubin&lt;/a&gt; and his blog for pointing me to &lt;a href="http://www.inside-r.org/pretty-r"&gt;Pretty R&lt;/a&gt;.  I've been using &lt;a href="http://alexgorbatchev.com/SyntaxHighlighter/"&gt;SyntaxHighlighter&lt;/a&gt; for displaying Python and C, but it doesn't have a brush for highlighting R code.  
Pretty R is a nice substitute for the interim, as you can see in &lt;a href="http://adventuresinoptimization.blogspot.com/2011/04/affine-scaling-in-r.html"&gt;this updated post&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-8021060162674853877?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/8021060162674853877/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/syntax-highlighting-for-r.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8021060162674853877'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8021060162674853877'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/syntax-highlighting-for-r.html' title='syntax highlighting for r'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-4011080233274352139</id><published>2011-06-09T15:49:00.002-04:00</published><updated>2011-06-09T15:52:14.330-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='simulation'/><title type='text'>a few notes on deterministic vs. 
stochastic simulation</title><content type='html'>&lt;p&gt;I find I have to build simulations with increasing frequency in my work and life.  Usually this indicates I'm faced with one of the following situations:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The need for a quick estimate regarding the quantitative behavior of some situation.&lt;/li&gt;&lt;li&gt;The desire to verify the result of a computation or assumption.&lt;/li&gt;&lt;li&gt;A situation which is too complex or random to effectively model or understand.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Anyone familiar at all with simulation will recognize the last item as the motivating force of the entire field.  Simulation models tend to take over when systems become so complex that understanding them is prohibitive in cost and time or entirely infeasible.  In a simulation, the modeler can focus on individual interactions between entities while still hoping for useful output in the form of descriptive statistics.&lt;/p&gt;&lt;p&gt;As such, simulations are nearly always stochastic.  The output of a simulation, whether it be the mean time to service upon entering a queue or the number of fish alive in a pond, is determined by a number of random inputs.  
It is estimated by looking at a sample of the entire, often infinite, problem space and therefore must be described in terms of mean and variance.&lt;/p&gt;&lt;p&gt;For me, simulation building usually follows a process roughly like this:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Work with a domain expert to understand the process under study.&lt;/li&gt;&lt;li&gt;Convert this process into a deterministic simulation (no randomness).&lt;/li&gt;&lt;li&gt;Verify the output of the deterministic simulation.&lt;/li&gt;&lt;li&gt;Analyze the inputs of the simulation to determine their probability distributions.&lt;/li&gt;&lt;li&gt;Convert the deterministic simulation to a stochastic simulation.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;The reason for creating a simulation without randomness first is that it can be difficult or impossible to verify its correctness otherwise.  Thus one may focus on the simulation logic first before analyzing and adding sources of randomness.&lt;/p&gt;&lt;p&gt;Where the procedure breaks down is after the third step.  Domain experts are often happy to share their knowledge about systems to aid in designing simulations, and typically can understand the resulting abstractions.  They are also invaluable in verifying simulation output.  However, they are unlikely to understand why it is necessary to add randomness to a system that they already perceive as functional.  Further, doing so can be just as difficult and time consuming as the initial model development and therefore requires justification.&lt;/p&gt;&lt;p&gt;This can be a quandary for the model builder.  How does one communicate the need to incorporate randomness to decision makers who lack understanding of probability?  It is trivially easy to construct simulations that use the same input parameters but yield drastically different outputs.  
Consider the code below, which simulates two events occurring and counts the number of times event b happens before event a.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;import random&lt;br /&gt;&lt;br /&gt;def sim_stochastic(event_a_lambda, event_b_lambda):&lt;br /&gt;    # Returns 0 if event A arrives first, 1 if event B arrives first&lt;br /&gt;&lt;br /&gt;    # Calculate next arrival time for each event randomly.&lt;br /&gt;    event_a_arrival = random.expovariate(event_a_lambda)&lt;br /&gt;    event_b_arrival = random.expovariate(event_b_lambda)&lt;br /&gt;&lt;br /&gt;    return 0.0 if event_a_arrival &lt;= event_b_arrival else 1.0&lt;br /&gt;&lt;br /&gt;def sim_deterministic(event_a_lambda, event_b_lambda):&lt;br /&gt;    # Returns 0 if event A arrives first, 1 if event B arrives first&lt;br /&gt;&lt;br /&gt;    # Calculate next arrival time for each event deterministically.&lt;br /&gt;    event_a_arrival = 1.0 / event_a_lambda&lt;br /&gt;    event_b_arrival = 1.0 / event_b_lambda&lt;br /&gt;&lt;br /&gt;    return 0.0 if event_a_arrival &lt;= event_b_arrival else 1.0&lt;br /&gt;&lt;br /&gt;if __name__ == '__main__':&lt;br /&gt;    event_a_lambda = 0.3&lt;br /&gt;    event_b_lambda = 0.5&lt;br /&gt;&lt;br /&gt;    repetitions = 10000&lt;br /&gt;&lt;br /&gt;    for sim in (sim_stochastic, sim_deterministic):&lt;br /&gt;        output = [&lt;br /&gt;            sim(event_a_lambda, event_b_lambda)&lt;br /&gt;            for _ in range(repetitions)&lt;br /&gt;        ]&lt;br /&gt;        event_b_first = 100.0 * (sum(output) / len(output))&lt;br /&gt;        print('event b is first %0.1f%% of the time' % event_b_first)&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Both simulations use the same input parameter, but the second one is essentially wrong as b will always happen first.  
In the stochastic version, we use exponential distributions for the inputs and obtain an output that verifies our basic understanding of these distributions.&lt;/p&gt;&lt;pre&gt;event b is first 63.0% of the time&lt;br /&gt;event b is first 100.0% of the time&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;How about you?  How do you discuss the need to model a random world with decision makers?&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-4011080233274352139?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/4011080233274352139/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/few-notes-on-deterministic-vs.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4011080233274352139'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4011080233274352139'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/06/few-notes-on-deterministic-vs.html' title='a few notes on deterministic vs. stochastic simulation'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-4584987205230848582</id><published>2011-05-24T17:47:00.002-04:00</published><updated>2011-05-24T17:52:24.235-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='scip'/><category scheme='http://www.blogger.com/atom/ns#' term='zibopt'/><title type='text'>new in python-zibopt: constraint removal</title><content type='html'>&lt;p&gt;At long last, python-zibopt has constraint removal and a nice interface to go with it.  The idea here is that you store constraints in variables once they are created and can remove them from the solver, and add them back in, as you see fit.&lt;/p&gt;&lt;p&gt;The feature is in trunk, so it will be part of 0.6 when that is released.  A few notes about using it:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Constraint instances are only returned using solver.constraint(...).  The algebraic syntax can't actually return the constraints, so that limits you from removing them later.&lt;/li&gt;&lt;li&gt;It is possible to reinstall a removed constraint using +=.  See below.&lt;/li&gt;&lt;li&gt;Removal or addition of constraints necessitates restarting the solver.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Here's an example.  
I have to admit, I'm getting pretty excited about this release.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;from zibopt import scip&lt;br /&gt;&lt;br /&gt;solver = scip.solver()&lt;br /&gt;&lt;br /&gt;x1 = solver.variable(scip.INTEGER)&lt;br /&gt;x2 = solver.variable(scip.INTEGER)&lt;br /&gt;&lt;br /&gt;c1 = solver.constraint(upper=4, coefficients={x1:2, x2:2})&lt;br /&gt;c2 = solver.constraint(upper=3, coefficients={x1:2, x2:2})&lt;br /&gt;&lt;br /&gt;print(solver.maximize(objective=x1+x2).objective) # 1.0&lt;br /&gt;&lt;br /&gt;solver -= c2&lt;br /&gt;print(solver.maximize(objective=x1+x2).objective) # 2.0&lt;br /&gt;&lt;br /&gt;solver += c2&lt;br /&gt;print(solver.maximize(objective=x1+x2).objective) # 1.0&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-4584987205230848582?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/4584987205230848582/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/new-in-python-zibopt-constraint-removal.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4584987205230848582'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4584987205230848582'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/new-in-python-zibopt-constraint-removal.html' title='new in python-zibopt: constraint removal'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-885040383573237831</id><published>2011-05-21T16:05:00.005-04:00</published><updated>2011-06-13T11:31:56.001-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='zibopt'/><title type='text'>python-zibopt works with python 3.2 in trunk</title><content type='html'>&lt;p&gt;python-zibopt is now ported to Python 3.2 in the repository trunk.  All the tests pass and the examples are working, so if anybody wants to get latest and play around, that should be OK.  Some notes about the process of porting the C extension, in case anyone else finds them useful:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Changing special method names in a duck typed language makes porting painful.  For instance, __nonzero__ in Python 2 became&lt;br /&gt;__bool__ in Python 3.  Without proper unit tests, I may never even have caught this, since Python will go ahead and provide default implementations for these sorts of methods, completely ignoring the existence of older versions of them.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The easiest part was getting the examples to work.  These just required changing of print statements and adding calls to list around functions that have become lazy (zip, for instance).  This is a good sign for folks that have existing model code in python-zibopt.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;init_foo should now be PyInit_foo.  Further, each C type needs a PyModuleDef and should use PyModule_Create instead of Py_InitModule3.  
The current docs show this accurately.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;It's difficult to find examples of certain, rather fundamental things in the documentation.  Particularly, I had to pull the right method for overriding __getattr__ and __setattr__ in C from the code from pyexpat.  Since strings in Python are all now unicode, what once looked like:&lt;pre class="brush: cpp; toolbar: false;"&gt;static PyObject* branching_rule_getattr(branching_rule *self, PyObject *attr_name) {&lt;br /&gt;   char *attr;&lt;br /&gt;&lt;br /&gt;   // Check and make sure we have a string as attribute name...&lt;br /&gt;   if (PyString_Check(attr_name)) {&lt;br /&gt;       attr = PyString_AsString(attr_name);&lt;br /&gt;&lt;br /&gt;       if (!strcmp(attr, "maxbounddist"))&lt;br /&gt;           return Py_BuildValue("d", SCIPbranchruleGetMaxbounddist(self-&amp;gt;branch));&lt;br /&gt;       if (!strcmp(attr, "maxdepth"))&lt;br /&gt;           return Py_BuildValue("i", SCIPbranchruleGetMaxdepth(self-&amp;gt;branch));&lt;br /&gt;       if (!strcmp(attr, "priority"))&lt;br /&gt;           return Py_BuildValue("i", SCIPbranchruleGetPriority(self-&amp;gt;branch));&lt;br /&gt;   }&lt;br /&gt;   return PyObject_GenericGetAttr(self, attr_name);&lt;br /&gt;}&lt;/pre&gt;now requires use of PyUnicode_Check and PyUnicode_CompareWithASCIIString (note the == 0 after these function calls):&lt;pre class="brush: cpp; toolbar: false;"&gt;static PyObject* branching_rule_getattr(branching_rule *self, PyObject *attr_name) {&lt;br /&gt;   // Check and make sure we have a string as attribute name...&lt;br /&gt;   if (PyUnicode_Check(attr_name)) {&lt;br /&gt;       if (PyUnicode_CompareWithASCIIString(attr_name, "maxbounddist") == 0)&lt;br /&gt;           return Py_BuildValue("d", SCIPbranchruleGetMaxbounddist(self-&amp;gt;branch));&lt;br /&gt;       if (PyUnicode_CompareWithASCIIString(attr_name, "maxdepth") == 0)&lt;br /&gt;           return Py_BuildValue("i", 
SCIPbranchruleGetMaxdepth(self-&amp;gt;branch));&lt;br /&gt;       if (PyUnicode_CompareWithASCIIString(attr_name, "priority") == 0)&lt;br /&gt;           return Py_BuildValue("i", SCIPbranchruleGetPriority(self-&amp;gt;branch));&lt;br /&gt;   }&lt;br /&gt;   return PyObject_GenericGetAttr((PyObject *) self, attr_name);&lt;br /&gt;}&lt;/pre&gt;Changes to the __setattr__ code are similar.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;SCIP provides an opt-shared makefile for Linux x86_64, but SoPlex does not, for whatever reason.  Here is a diff to create a make.linux.x86_64.gnu.opt-shared file from make.linux.x86_64.gnu.opt in SoPlex 1.5.0 in case you need that.  The existing build instructions are generally accurate for the new version of SCIP and Python 3.2, though an update will arrive with v0.6.  Your mileage may vary for now.&lt;pre&gt;4,5c4,5&lt;br /&gt;&amp;lt; CXXFLAGS= -O3 -m64 -mtune=native -fomit-frame-pointer # -malign-double -ffast-math&lt;br /&gt;&amp;lt; LDFLAGS = -lm -static&lt;br /&gt;---&lt;br /&gt;&amp;gt; CXXFLAGS= -O3 -m64 -mtune=native -fomit-frame-pointer -fPIC # -malign-double -ffast-math&lt;br /&gt;&amp;gt; LDFLAGS = -lm -Wl,-rpath,$(CURDIR)/$(LIBDIR)&lt;br /&gt;8a9,15&lt;br /&gt;&amp;gt; LIBBUILD        =       $(CXX)&lt;br /&gt;&amp;gt; LIBBUILDFLAGS   = -shared -fPIC&lt;br /&gt;&amp;gt; LIBBUILD_o      =       -o # the trailing space is important&lt;br /&gt;&amp;gt; LIBEXT  = so&lt;br /&gt;&amp;gt; ARFLAGS         =&lt;br /&gt;&amp;gt; RANLIB          =&lt;br /&gt;&amp;gt; &lt;/pre&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-885040383573237831?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/885040383573237831/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://adventuresinoptimization.blogspot.com/2011/05/python-zibopt-works-with-python-32-in.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/885040383573237831'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/885040383573237831'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/python-zibopt-works-with-python-32-in.html' title='python-zibopt works with python 3.2 in trunk'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-631888369364784777</id><published>2011-05-19T07:42:00.000-04:00</published><updated>2011-05-19T14:38:09.089-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='graphs'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='concurrency'/><title type='text'>joy in the time of the python futures module</title><content type='html'>&lt;p&gt;Starting with a dose of realism, it's possible this will turn out like the day when &lt;a href="http://docs.python.org/release/2.5/whatsnew/pep-342.html"&gt;coroutines were introduced into Python 2.5&lt;/a&gt;.  At the time I was extremely excited.  
Upon hearing the news I spent several hours trying to convince developers at my then-employer that the only way we could survive as an organization was to immediately abandon all our existing Java infrastructure, porting it to a new and beautiful world based on finite state machines implemented using Python coroutines.  After a day of hand waving over a proof of concept, we all continued about our lives.  Soon after, I left for a Python shop, but in the next half decade I still never found a good place to use this beloved feature in solving the daily challenges of my professional life.&lt;/p&gt;&lt;p&gt;As I come to terms more with switching to Python 3.2, the &lt;a href="http://docs.python.org/py3k/library/concurrent.futures.html"&gt;futures&lt;/a&gt; module emerges as a source of similar excitement.  This is one of those would-have-made-my-life-so-much-easier features I wish I'd had years ago, and is almost reason in itself to upgrade from Python 2.7.  &lt;i&gt;Who cares if none of your libraries have been ported yet?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;I think the real strength of this library springs from its ability to take any pre-existing function and distribute it over a process pool.  Here is an example that computes minimum spanning trees for fully connected graphs.  For purposes of testing, we generate 100-node fully connected graphs with random arc weights between 1 and 10.  We then find their minimum spanning trees using the &lt;a href="http://code.google.com/p/python-graph/"&gt;pygraph&lt;/a&gt; library, using groups of 4, 8, ..., 28 random graphs in serial and in parallel.  
Note how easy it is to take the minimum spanning tree function and map it over a process pool without any changes to its code.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;from concurrent import futures&lt;br /&gt;from csv import writer&lt;br /&gt;from pygraph.algorithms import minmax&lt;br /&gt;from pygraph.classes.graph import graph&lt;br /&gt;import random&lt;br /&gt;import string&lt;br /&gt;import sys&lt;br /&gt;import time&lt;br /&gt;&lt;br /&gt;def generate_graph():&lt;br /&gt;    # Generates a randomly weighted, fully-connected, undirected graph&lt;br /&gt;    nodes = [str(i) for i in range(100)]&lt;br /&gt;&lt;br /&gt;    g = graph()&lt;br /&gt;    for n in nodes:&lt;br /&gt;        g.add_node(n)&lt;br /&gt;&lt;br /&gt;    # Build an edge from each node to every other node&lt;br /&gt;    for i, n in enumerate(nodes):&lt;br /&gt;        for o in nodes[i+1:]:&lt;br /&gt;            weight = random.uniform(1, 10)&lt;br /&gt;            g.add_edge((n, o), weight)&lt;br /&gt;&lt;br /&gt;    return g&lt;br /&gt;&lt;br /&gt;def serial_test(graphs):&lt;br /&gt;    for g in graphs:&lt;br /&gt;        tree = minmax.minimal_spanning_tree(g)&lt;br /&gt;&lt;br /&gt;def parallel_test(graphs, max_workers):&lt;br /&gt;    with futures.ProcessPoolExecutor(max_workers=max_workers) as executor:&lt;br /&gt;        for tree in executor.map(minmax.minimal_spanning_tree, graphs):&lt;br /&gt;            pass # normally we'd do something with this...&lt;br /&gt;&lt;br /&gt;if __name__ == '__main__':&lt;br /&gt;    out = writer(sys.stdout)&lt;br /&gt;    out.writerow(['num graphs', 'serial time', 'parallel time'])&lt;br /&gt;&lt;br /&gt;    # Run with a number of different randomly generated graphs&lt;br /&gt;    for num_graphs in (4, 8, 12, 16, 20, 24, 28):&lt;br /&gt;        graphs = [generate_graph() for _ in range(num_graphs)]&lt;br /&gt;&lt;br /&gt;        # Wall-clock timing; time.clock() was removed in Python 3.8&lt;br /&gt;        start = time.time()&lt;br /&gt;        serial_test(graphs)&lt;br /&gt;        serial_time = time.time() - 
start&lt;br /&gt;&lt;br /&gt;        start = time.time()&lt;br /&gt;        parallel_test(graphs, 4)&lt;br /&gt;        parallel_time = time.time() - start&lt;br /&gt;&lt;br /&gt;        out.writerow([num_graphs, serial_time, parallel_time])&lt;/pre&gt;&lt;p&gt;The output of this script shows that we get a fairly linear speedup in this particular example with little to no effort.&lt;/p&gt;&lt;center&gt;&lt;img src="http://3.bp.blogspot.com/-msCaTAMeC2U/TdVaX-kQdxI/AAAAAAAAAdQ/xCIIT3mbhWw/s1600/times_minimal_spanning_tree.png" /&gt;&lt;/center&gt;&lt;p&gt;Given that the box I'm running this on has 4 cores, it's a little odd that the speedup factor is more like 2.  It's probably just that the machine has a lot going on, so it's not really worth investigating right now.  At the very least, each core is kept busy when the test forks.&lt;/p&gt;&lt;img width="100%" src="http://2.bp.blogspot.com/-VRdmUUnrj_c/TdVamvVQieI/AAAAAAAAAdY/HTxahSDg_TM/s1600/post2-performance.png" /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-631888369364784777?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/631888369364784777/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/joy-in-time-of-python-futures-module.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/631888369364784777'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/631888369364784777'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/joy-in-time-of-python-futures-module.html' title='joy in the time of the 
python futures module'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-msCaTAMeC2U/TdVaX-kQdxI/AAAAAAAAAdQ/xCIIT3mbhWw/s72-c/times_minimal_spanning_tree.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-8717388652734899981</id><published>2011-05-18T16:18:00.002-04:00</published><updated>2011-05-18T16:26:11.435-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='zibopt'/><title type='text'>python-zibopt upgrade on the way</title><content type='html'>&lt;p&gt;To anyone out there who is using &lt;a href="http://code.google.com/p/python-zibopt/"&gt;python-zibopt&lt;/a&gt;, I have begun the painful process of upgrading it to both Python 3.2 and SCIP 2.0.1 simultaneously.  I think it makes sense to do this now, because both C APIs have changed and I'd rather just have to deal with this once.&lt;/p&gt;&lt;p&gt;There is also a compelling reason to upgrade the Python API now in &lt;a href="http://www.python.org/dev/peps/pep-0384/"&gt;PEP 384: Defining a Stable ABI&lt;/a&gt;.  Hopefully, the Python C API should remain stable after this and thus not require future changes for C extension maintainers.  It's hard enough keeping up with changes to one major system, let alone the underlying interpreter.&lt;/p&gt;&lt;p&gt;With any luck I'll get this working in the next few weeks, along with the new nonlinear constraints available in SCIP.  
That will be packaged into a 0.6 release, then I can start to focus on some of the remaining issues such as figuring out a better method for building the library.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-8717388652734899981?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/8717388652734899981/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/python-zibopt-upgrade-on-way.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8717388652734899981'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8717388652734899981'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/python-zibopt-upgrade-on-way.html' title='python-zibopt upgrade on the way'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-406973647472959573</id><published>2011-05-16T19:37:00.003-04:00</published><updated>2011-05-16T19:43:36.460-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='music'/><title type='text'>rip bernard greenhouse</title><content type='html'>&lt;a href="http://www.npr.org/2011/05/16/136368728/cellist-bernard-greenhouse-dies"&gt;R.I.P. Bernard Greenhouse&lt;/a&gt;.  
As with so many other cellists of my generation, you were my teacher's teacher.  While I never saw you perform in person, my lessons were often sprinkled with references to you.  You will be missed, and it is lost on no one that you passed three weeks after you stopped practicing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-406973647472959573?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/406973647472959573/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/rip-bernard-greenhouse.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/406973647472959573'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/406973647472959573'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/rip-bernard-greenhouse.html' title='rip bernard greenhouse'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-6018924171662743643</id><published>2011-05-16T16:15:00.002-04:00</published><updated>2011-05-16T16:20:37.225-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='media'/><title type='text'>thoughts on mandarin in the media</title><content type='html'>&lt;p&gt;&lt;i&gt;Thanks, &lt;a href="http://macdiva.tumblr.com/post/4925205093/via-infographic-should-young-americans-learn"&gt;MacDiva&lt;/a&gt;, for posting a link to &lt;a href="http://asiasociety.org/blog/reasia/infographic-should-young-americans-learn-chinese"&gt;this article&lt;/a&gt;.&lt;/i&gt;&lt;/p&gt;&lt;p&gt;I believe this is a good example of the recent media movement trying to shock Americans into believing we'll fall behind as a culture if we don't all go out and learn Mandarin.  While it's very important to include second (or third) languages in the education of our children (and I desperately wish I'd taken something other than Latin in school), there's a lot that's disingenuous about the sentiments expressed here.  A few points:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;If person A learns the language of person B, it is not required for person B to reciprocate in order to establish effective communication.&lt;/li&gt;&lt;li&gt;The impact of having every young Chinese citizen learn English will hopefully be that most of them can speak English.&lt;/li&gt;&lt;li&gt;If this program is successful, there will come a time when native English speakers fluent in Mandarin are no longer necessary in such industries as tourism, trade, and international relations in China.  
Those functions will be more often assigned to Chinese citizens.  Thus, learning Mandarin at this time has limited utility.&lt;/li&gt;&lt;li&gt;When a large body of people devotes effort to learning a foreign language, that is a recognition of the importance of that language.  If anything, this should assist in the established dominance of English as an international medium for communication.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Perhaps I'm wrong in my thinking, but I don't see this as a good reason to develop a national, or even individual, effort to learn Mandarin.  One could do so entirely reasonably for personal betterment and cultural exposure, but this infographic looks like media scaremongering to me.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-6018924171662743643?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/6018924171662743643/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/thoughts-on-mandarin-in-media.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6018924171662743643'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6018924171662743643'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/05/thoughts-on-mandarin-in-media.html' title='thoughts on mandarin in the media'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-9087877609342728252</id><published>2011-04-28T18:41:00.008-04:00</published><updated>2011-04-28T18:57:04.813-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><title type='text'>zen of lisp / lisp koans</title><content type='html'>&lt;p&gt;There used to be a website of collected mythology about LISP masters and students, mostly at MIT during the 60s and 70s.  I believe it was called the Zen of LISP or something to that effect, but can no longer seem to find it.  After an extensive search, I've resigned myself to pulling as many of these as I can find out of my local fortune file.  Here are a few gems:&lt;/p&gt;&lt;p&gt;&lt;i&gt;A novice was trying to fix a broken lisp machine by turning the power off and on.  Knight, seeing what the student was doing spoke sternly, "You cannot fix a machine by just power-cycling it with no understanding of what is going wrong."  Knight turned the machine off and on.  The machine worked.&lt;/i&gt;&lt;/p&gt;&lt;p&gt;Of course, it's entirely possible to fix a machine by power-cycling if you do understand what is going on.&lt;/p&gt;&lt;p&gt;&lt;i&gt;A famous Lisp Hacker noticed an Undergraduate sitting in front of a Xerox 1108, trying to edit a complex Klone network via a browser.  Wanting to help, the Hacker clicked one of the nodes in the network with the mouse, and asked "what do you see?"  Very earnestly, the Undergraduate replied, "I see a cursor."  
The Hacker then quickly pressed the boot toggle at the back of the keyboard, while simultaneously hitting the Undergraduate over the head with a thick Interlisp Manual.  The Undergraduate was then Enlightened.&lt;/i&gt;&lt;/p&gt;&lt;p&gt;Somehow I doubt it was ever that easy.  When I hit myself over the head with &lt;a href="http://www.aic.uniovi.es/libros/cltl2e/clm.html"&gt;CLTL2E&lt;/a&gt;, it just hurts.&lt;/p&gt;&lt;p&gt;&lt;i&gt;A student, in hopes of understanding the Lambda-nature, came to Greenblatt.  As they spoke a Multics system hacker walked by.  "Is it true", asked the student, "that PL-1 has many of the same data types as Lisp?"  Almost before the student had finished his question, Greenblatt shouted, "FOO!", and hit the student with a stick.&lt;/i&gt;&lt;/p&gt;&lt;p&gt;I'm not entirely sure of the right way to approach this one.  Does it have to do with &lt;a href="http://en.wikipedia.org/wiki/Type_system#Dynamic_typing"&gt;dynamic typing&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Metasyntactic_variable"&gt;metasyntactic variables&lt;/a&gt;, both, or something else entirely?&lt;/p&gt;&lt;p&gt;There's one in particular I have in mind that has to do with garbage collection and cycles in reference counting, but I haven't found it yet.  
If you happen to know of it, please send it along.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-9087877609342728252?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/9087877609342728252/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/zen-of-lisp-lisp-koans.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/9087877609342728252'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/9087877609342728252'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/zen-of-lisp-lisp-koans.html' title='zen of lisp / lisp koans'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-3020468119809419280</id><published>2011-04-27T07:30:00.002-04:00</published><updated>2011-06-13T11:21:30.924-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='linear programming'/><category scheme='http://www.blogger.com/atom/ns#' term='r'/><title type='text'>affine scaling in r</title><content type='html'>&lt;p&gt;I recently stumbled across an implementation of the &lt;a href="http://demonstrations.wolfram.com/AffineScalingInteriorPointMethod/"&gt;affine scaling&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/Interior_point_method"&gt;interior point  method&lt;/a&gt; for solving linear programs that I'd coded up in R once upon a time.  I'm posting it here in case anyone else finds it useful.  There's not a whole lot of thought given to efficiency or numerical stability, just a demonstration of the basic algorithm.  
Still, sometimes that's exactly what one wants.&lt;/p&gt;&lt;div style="overflow:auto;"&gt;&lt;div class="geshifilter"&gt;&lt;pre class="r geshifilter-R" style="font-family:monospace;"&gt;solve.affine &lt;span style=""&gt;&amp;lt;-&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/base/function"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;function&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;A&lt;span style="color: #339933;"&gt;,&lt;/span&gt; &lt;a href="http://inside-r.org/packages/cran/RC"&gt;&lt;span style=""&gt;rc&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #339933;"&gt;,&lt;/span&gt; x&lt;span style="color: #339933;"&gt;,&lt;/span&gt; &lt;a href="http://inside-r.org/packages/cran/tolerance"&gt;&lt;span style=""&gt;tolerance&lt;/span&gt;&lt;/a&gt;=&lt;span style="color: #cc66cc;"&gt;10&lt;/span&gt;&lt;span style=""&gt;^-&lt;/span&gt;&lt;span style="color: #cc66cc;"&gt;7&lt;/span&gt;&lt;span style="color: #339933;"&gt;,&lt;/span&gt; R=&lt;span style="color: #cc66cc;"&gt;0.999&lt;/span&gt;&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt; &lt;span style="color: #009900;"&gt;&amp;#123;&lt;/span&gt;&lt;br /&gt;  &lt;span style="color: #666666; font-style: italic;"&gt;# Affine scaling method&lt;/span&gt;&lt;br /&gt;  &lt;span style="color: #000000; font-weight: bold;"&gt;while&lt;/span&gt; &lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;T&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt; &lt;span style="color: #009900;"&gt;&amp;#123;&lt;/span&gt;&lt;br /&gt;    X_diag &lt;span style=""&gt;&amp;lt;-&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/base/diag"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;diag&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;x&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;    &lt;span style="color: #666666; font-style: italic;"&gt;# Compute (A * X_diag^2 * A^t)^-1 using 
Cholesky factorization.&lt;/span&gt;&lt;br /&gt;    &lt;span style="color: #666666; font-style: italic;"&gt;# This is responsible for scaling the original problem matrix.&lt;/span&gt;&lt;br /&gt;    &lt;a href="http://inside-r.org/r-doc/base/q"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;q&lt;/span&gt;&lt;/a&gt; &lt;span style=""&gt;&amp;lt;-&lt;/span&gt; A &lt;span style=""&gt;%*%&lt;/span&gt; X_diag&lt;span style=""&gt;^&lt;/span&gt;&lt;span style="color: #cc66cc;"&gt;2&lt;/span&gt; &lt;span style=""&gt;%*%&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/base/t"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;t&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;A&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt;&lt;br /&gt;    q_inv &lt;span style=""&gt;&amp;lt;-&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/base/chol2inv"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;chol2inv&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;&lt;a href="http://inside-r.org/r-doc/base/chol"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;chol&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;&lt;a href="http://inside-r.org/r-doc/base/q"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;q&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt;&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;    &lt;span style="color: #666666; font-style: italic;"&gt;# lambda = q_inv * A * X_diag^2 * rc&lt;/span&gt;&lt;br /&gt;    lambda &lt;span style=""&gt;&amp;lt;-&lt;/span&gt; q_inv &lt;span style=""&gt;%*%&lt;/span&gt; A &lt;span style=""&gt;%*%&lt;/span&gt; X_diag&lt;span style=""&gt;^&lt;/span&gt;&lt;span style="color: #cc66cc;"&gt;2&lt;/span&gt; &lt;span style=""&gt;%*%&lt;/span&gt; &lt;a href="http://inside-r.org/packages/cran/RC"&gt;&lt;span 
style=""&gt;rc&lt;/span&gt;&lt;/a&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;    &lt;span style="color: #666666; font-style: italic;"&gt;# c - A^t * lambda is used repeatedly&lt;/span&gt;&lt;br /&gt;    foo &lt;span style=""&gt;&amp;lt;-&lt;/span&gt; &lt;a href="http://inside-r.org/packages/cran/RC"&gt;&lt;span style=""&gt;rc&lt;/span&gt;&lt;/a&gt; &lt;span style=""&gt;-&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/base/t"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;t&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;A&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt; &lt;span style=""&gt;%*%&lt;/span&gt; lambda&lt;br /&gt;&amp;nbsp;&lt;br /&gt;    &lt;span style="color: #666666; font-style: italic;"&gt;# We converge as s goes to zero&lt;/span&gt;&lt;br /&gt;    &lt;a href="http://inside-r.org/r-doc/mgcv/s"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;s&lt;/span&gt;&lt;/a&gt; &lt;span style=""&gt;&amp;lt;-&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/base/sqrt"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;sqrt&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;&lt;a href="http://inside-r.org/r-doc/base/sum"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;sum&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;X_diag &lt;span style=""&gt;%*%&lt;/span&gt; foo&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt;&lt;span style=""&gt;^&lt;/span&gt;&lt;span style="color: #cc66cc;"&gt;2&lt;/span&gt;&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt;&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;    &lt;span style="color: #666666; font-style: italic;"&gt;# Compute new x&lt;/span&gt;&lt;br /&gt;    x &lt;span style=""&gt;&amp;lt;-&lt;/span&gt; &lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;x &lt;span 
style=""&gt;+&lt;/span&gt; R &lt;span style=""&gt;*&lt;/span&gt; X_diag&lt;span style=""&gt;^&lt;/span&gt;&lt;span style="color: #cc66cc;"&gt;2&lt;/span&gt; &lt;span style=""&gt;%*%&lt;/span&gt; foo &lt;span style=""&gt;/&lt;/span&gt; &lt;a href="http://inside-r.org/r-doc/mgcv/s"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;s&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt;&lt;span style="color: #009900;"&gt;&amp;#91;&lt;/span&gt;&lt;span style="color: #339933;"&gt;,&lt;/span&gt;&lt;span style="color: #009900;"&gt;&amp;#93;&lt;/span&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;    &lt;span style="color: #666666; font-style: italic;"&gt;# If s is within our tolerance, stop.&lt;/span&gt;&lt;br /&gt;    &lt;span style="color: #000000; font-weight: bold;"&gt;if&lt;/span&gt; &lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;&lt;a href="http://inside-r.org/r-doc/base/abs"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;abs&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#40;&lt;/span&gt;&lt;a href="http://inside-r.org/r-doc/mgcv/s"&gt;&lt;span style="color: #003399; font-weight: bold;"&gt;s&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt; &lt;span style=""&gt;&amp;lt;&lt;/span&gt; &lt;a href="http://inside-r.org/packages/cran/tolerance"&gt;&lt;span style=""&gt;tolerance&lt;/span&gt;&lt;/a&gt;&lt;span style="color: #009900;"&gt;&amp;#41;&lt;/span&gt; &lt;span style="color: #000000; font-weight: bold;"&gt;break&lt;/span&gt;&lt;br /&gt;  &lt;span style="color: #009900;"&gt;&amp;#125;&lt;/span&gt;&lt;br /&gt;  x&lt;br /&gt;&lt;span style="color: #009900;"&gt;&amp;#125;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;This function accepts a matrix A that contains all technological coefficients for an LP, a vector rc containing its objective function coefficients (the reduced costs are derived from it on each iteration), and an initial point x interior to the LP's feasible region.  
Optional arguments to the function include a tolerance, for detecting when the method is within an acceptable distance from the optimal point, and a value for R, which must be strictly between 0 and 1 and controls scaling.&lt;/p&gt;&lt;p&gt;The method works by rescaling the matrix A around the current solution x.  It then computes a new x such that it remains feasible and interior, which is why R cannot be 0 or 1.  It requires a feasible interior point to start and only projects to other feasible interior points, so the right hand side of the LP is not required &lt;i&gt;(it is implicit from the starting point)&lt;/i&gt;.  The shadow prices for each iteration are captured in the vector lambda, so the gap between primal and dual solutions is easy to compute.&lt;/p&gt;&lt;p&gt;We run this function against a 3x3 LP with a known solution:&lt;/p&gt;&lt;pre&gt;max z = 5x1 + 4x2 + 3x3&lt;br /&gt;st      2x1 + 3x2 +  x3 &lt;=  5 &lt;br /&gt;        4x1 +  x2 + 2x3 &lt;= 11 &lt;br /&gt;        3x1 + 4x2 + 2x3 &lt;=  8 &lt;br /&gt;        x1, x2, x3 &gt;= 0&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;The optimal solution to this LP is:&lt;/p&gt;&lt;pre&gt;z  = 13 &lt;br /&gt;x1 =  2 &lt;br /&gt;x2 =  0 &lt;br /&gt;x3 =  1 &lt;br /&gt;&lt;/pre&gt;&lt;p&gt;This problem can be run against the affine scaling function by defining A with all necessary slack variables, and using an arbitrary feasible interior point:&lt;/p&gt;&lt;pre&gt;A  &lt;- matrix(c(&lt;br /&gt;  2,3,1,1,0,0, &lt;br /&gt;  4,1,2,0,1,0, &lt;br /&gt;  3,4,2,0,0,1&lt;br /&gt;), nrow=3, byrow=T) &lt;br /&gt;rc &lt;- c(5, 4, 3, 0, 0, 0) &lt;br /&gt;x  &lt;- c(0.5, 0.5, 0.5, 2, 7.5, 3.5) &lt;br /&gt;&lt;br /&gt;solution &lt;- solve.affine(A, rc, x) &lt;br /&gt;print(solution) &lt;br /&gt;print(sum(solution * rc)) &lt;br /&gt;&lt;/pre&gt;&lt;p&gt;This provides an output vector that is very close to the optimal primal solution shown above.  
Since interior point methods converge asymptotically to optimal solutions, it is important to note that we can only ever get &lt;i&gt;(extremely)&lt;/i&gt; close to our final optimal objective and decision variable values.&lt;/p&gt;&lt;pre&gt;&gt; print(solution) &lt;br /&gt;[1] 1.999998e+00 4.268595e-07 1.000002e+00 1.280579e-06 1.000005e+00 &lt;br /&gt;[6] 1.280579e-06 &lt;br /&gt; &lt;br /&gt;&gt; print(sum(solution * rc))&lt;br /&gt;[1] 13.00000&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-3020468119809419280?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/3020468119809419280/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/affine-scaling-in-r.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3020468119809419280'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3020468119809419280'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/affine-scaling-in-r.html' title='affine scaling in r'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-6968512471005383178</id><published>2011-04-20T11:00:00.000-04:00</published><updated>2011-04-20T11:09:28.253-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='japh'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>reformed japhs in python: scheme to python compilation</title><content type='html'>&lt;p&gt;I believe this is the final JAPH in this series.  I actually didn't have the heart to obfuscate it.  It starts with a Scheme program that prints 'just another scheme hacker', tokenizes it, parses the token stream, compiles that into Python 3.2, and executes the resulting string.  If anybody else wants to obfuscate it, be my guest.  
Otherwise I'll just let it speak for itself.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;import re&lt;br /&gt;&lt;br /&gt;def tokenize(input):&lt;br /&gt;    '''Tokenizes an input stream into a list of recognizable tokens'''&lt;br /&gt;    token_res = (&lt;br /&gt;        r'\(',      # open paren -&gt; starts expression&lt;br /&gt;        r'\)',      # close paren -&gt; ends expression&lt;br /&gt;        r'"[^"]*"', # quoted string (don't support \" yet)&lt;br /&gt;        r'[\w?]+'   # atom&lt;br /&gt;    )&lt;br /&gt;    return re.findall(r'(' + '|'.join(token_res) + ')', input)&lt;br /&gt;&lt;br /&gt;def parse(stream):&lt;br /&gt;    '''Parses a token stream into a syntax tree'''&lt;br /&gt;    if not stream:&lt;br /&gt;        return []&lt;br /&gt;&lt;br /&gt;    else:&lt;br /&gt;        # Build a list of arguments (possibly expressions) at this level&lt;br /&gt;        args = []&lt;br /&gt;        while True:&lt;br /&gt;            # Get the next token&lt;br /&gt;            try:&lt;br /&gt;                x = stream.pop(0)&lt;br /&gt;            except IndexError:&lt;br /&gt;                return args&lt;br /&gt;&lt;br /&gt;            # ( and ) control the level of the tree we're at&lt;br /&gt;            if x == '(':&lt;br /&gt;                args.append(parse(stream))&lt;br /&gt;            elif x == ')':&lt;br /&gt;                return args&lt;br /&gt;            else:&lt;br /&gt;                args.append(x)&lt;br /&gt;&lt;br /&gt;def compile(tree):&lt;br /&gt;    '''Compiles a Scheme Abstract Syntax Tree into near-Python'''&lt;br /&gt;    def compile_expr(indent, expr):&lt;br /&gt;        indent += 1&lt;br /&gt;&lt;br /&gt;        lines = [] # these will have [(indent, statement), ...] 
structure&lt;br /&gt;        while expr:&lt;br /&gt;            # Two options: expr is a string like "'" or it is a list&lt;br /&gt;            if isinstance(expr, str):&lt;br /&gt;                return [(&lt;br /&gt;                    indent,&lt;br /&gt;                    expr.replace('scheme', 'python').replace('\n', '\\n')&lt;br /&gt;                )]&lt;br /&gt;&lt;br /&gt;            else:&lt;br /&gt;                start = expr.pop(0)&lt;br /&gt;&lt;br /&gt;                if start == 'define':&lt;br /&gt;                    signature = expr.pop(0)&lt;br /&gt;                    lines.append((indent,&lt;br /&gt;                        'def %s(%s):' % (&lt;br /&gt;                            signature[0],&lt;br /&gt;                            ', '.join(signature[1:])&lt;br /&gt;                        )&lt;br /&gt;                    ))&lt;br /&gt;                    while expr:&lt;br /&gt;                        lines.extend(compile_expr(indent, expr.pop(0)))&lt;br /&gt;&lt;br /&gt;                elif start == 'if':&lt;br /&gt;                    # We don't support multi-clause conditionals yet&lt;br /&gt;                    clause = compile_expr(indent, expr.pop(0))[0][1]&lt;br /&gt;                    lines.append((indent, 'if %s:' % clause))&lt;br /&gt;&lt;br /&gt;                    if_true_lines = compile_expr(indent, expr.pop(0))&lt;br /&gt;                    if_false_lines = compile_expr(indent, expr.pop(0))&lt;br /&gt;&lt;br /&gt;                    lines.extend(if_true_lines)&lt;br /&gt;                    lines.append((indent, 'else:'))&lt;br /&gt;                    lines.extend(if_false_lines)&lt;br /&gt;&lt;br /&gt;                elif start == 'null?':&lt;br /&gt;                    # Only supports conditionals of the form (null? 
foo)&lt;br /&gt;                    if isinstance(expr[0], str):&lt;br /&gt;                        condition = expr.pop(0)&lt;br /&gt;                    else:&lt;br /&gt;                        condition = compile_expr(indent, expr.pop(0))[0][1]&lt;br /&gt;                    return [(indent, 'not %s' % condition)]&lt;br /&gt;&lt;br /&gt;                elif start == 'begin':&lt;br /&gt;                    # This is just a series of statements, so don't indent&lt;br /&gt;                    while expr:&lt;br /&gt;                        lines.extend(compile_expr(indent-1, expr.pop(0)))&lt;br /&gt;&lt;br /&gt;                elif start == 'display':&lt;br /&gt;                    arguments = []&lt;br /&gt;                    while expr:&lt;br /&gt;                        arguments.append(&lt;br /&gt;                            compile_expr(indent, expr.pop(0))[0][1]&lt;br /&gt;                        )&lt;br /&gt;                    lines.append((&lt;br /&gt;                        indent,&lt;br /&gt;                        "print(%s, end='')" % (', '.join(arguments))&lt;br /&gt;                    ))&lt;br /&gt;&lt;br /&gt;                elif start == 'car':&lt;br /&gt;                    lines.append((indent, '%s[0]' % expr.pop(0)))&lt;br /&gt;&lt;br /&gt;                elif start == 'cdr':&lt;br /&gt;                    lines.append((indent, '%s[1:]' % expr.pop(0)))&lt;br /&gt;&lt;br /&gt;                elif start == 'list':&lt;br /&gt;                    arguments = []&lt;br /&gt;                    while expr:&lt;br /&gt;                        arguments.append(&lt;br /&gt;                            compile_expr(indent, expr.pop(0))[0][1]&lt;br /&gt;                        )&lt;br /&gt;                    lines.append((indent, '[%s]' % ', '.join(arguments)))&lt;br /&gt;&lt;br /&gt;                else:&lt;br /&gt;                    # Assume this is a function call&lt;br /&gt;                    arguments = []&lt;br /&gt;                    while 
expr:&lt;br /&gt;                        arguments.append(&lt;br /&gt;                            compile_expr(indent, expr.pop(0))[0][1]&lt;br /&gt;                        )&lt;br /&gt;                    lines.append((&lt;br /&gt;                        indent,&lt;br /&gt;                        "%s(%s)" % (start, ', '.join(arguments))&lt;br /&gt;                    ))&lt;br /&gt;&lt;br /&gt;        return lines&lt;br /&gt;&lt;br /&gt;    return [compile_expr(-1, expr) for expr in tree]&lt;br /&gt;&lt;br /&gt;if __name__ == '__main__':&lt;br /&gt;    scheme = '''&lt;br /&gt;        (define (output x)&lt;br /&gt;            (if (null? x)&lt;br /&gt;                ""&lt;br /&gt;                (begin (display (car x))&lt;br /&gt;                       (if (null? (cdr x))&lt;br /&gt;                           (display "\n")&lt;br /&gt;                           (begin (display " ")&lt;br /&gt;                                  (output (cdr x)))))))&lt;br /&gt;        (output (list "just" "another" "scheme" "hacker"))&lt;br /&gt;    '''&lt;br /&gt;    python = ''&lt;br /&gt;    for expr in compile(parse(tokenize(scheme))):&lt;br /&gt;        python += '\n'.join([(' ' * 4 * x[0]) + x[1] for x in expr]) + '\n\n'&lt;br /&gt;    exec(python)&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-6968512471005383178?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/6968512471005383178/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-scheme-to.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6968512471005383178'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6968512471005383178'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-scheme-to.html' title='reformed japhs in python: scheme to python compilation'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-2216291960249205473</id><published>2011-04-18T08:00:00.000-04:00</published><updated>2011-04-18T08:50:45.649-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='japh'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>reformed japhs in python: turing machine</title><content type='html'>&lt;p&gt;This JAPH constructs a &lt;a href="http://en.wikipedia.org/wiki/Turing_machine"&gt;Turing machine&lt;/a&gt; in order to achieve its goal.  The machine accepts any string that ends in '\n' and, to assist in our purposes, allows side effects.  This lets us print the value of the tape as it encounters each character.  While the idea of using lambda functions as side effects in a Turing machine is a little bizarre on many levels, we work with what we have.  And Python is multi-paradigmatic, so what the heck.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;import re&lt;br /&gt;&lt;br /&gt;def turing(tape, transitions):&lt;br /&gt;    # The tape input comes in as a string.  
We approximate an infinite&lt;br /&gt;    # length tape via a hash, so we need to convert this to {index: value}&lt;br /&gt;    tape_hash = {i: x for i, x in enumerate(tape)}&lt;br /&gt;&lt;br /&gt;    # Start at 0 using our transition matrix&lt;br /&gt;    index = 0&lt;br /&gt;    state = 0&lt;br /&gt;    while True:&lt;br /&gt;        value = tape_hash.get(index, '')&lt;br /&gt;&lt;br /&gt;        # This is a modified Turing machine: it uses regexen&lt;br /&gt;        # and has side effects.  Oh well, I needed IO.&lt;br /&gt;        for rule in transitions[state]:&lt;br /&gt;            regex, next, direction, new_value, side_effect = rule&lt;br /&gt;            if re.match(regex, value):&lt;br /&gt;                # Terminal states&lt;br /&gt;                if new_value in ('YES', 'NO'):&lt;br /&gt;                    return new_value&lt;br /&gt;&lt;br /&gt;                tape_hash[index] = new_value&lt;br /&gt;                side_effect(value)&lt;br /&gt;                index += direction&lt;br /&gt;                state = next&lt;br /&gt;                break&lt;br /&gt;&lt;br /&gt;assert 'YES' == turing('just another python hacker\n', [&lt;br /&gt;    # This Turing machine recognizes the language of strings that end in \n.&lt;br /&gt;&lt;br /&gt;    # Regex rule, next state, left/right = -1/+1, new value, side effect.&lt;br /&gt;    [ # State 0:&lt;br /&gt;        [r'^[a-z ]$', 0, +1, '', lambda x: print(x, end='')],&lt;br /&gt;        [r'^\n$', 1, +1, '', lambda x: print(x, end='')],&lt;br /&gt;        [r'^.*$', 0, +1, 'NO', None],&lt;br /&gt;    ],&lt;br /&gt;    [ # State 1:&lt;br /&gt;        [r'^$', 1, -1, 'YES', None]&lt;br /&gt;    ]&lt;br /&gt;])&lt;/pre&gt;&lt;p&gt;Obfuscation again consists of converting the above code into lambda functions using Y combinators.  This is a pretty fantastic programming exercise, so I've left it out of this post in case anyone wants to try.  
And of course the Turing machine has to return 'YES' to indicate that it accepts the string, thus the assertion.  Our final obfuscated JAPH looks like this, amazingly in a single expression:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;assert'''YES'''==(lambda g:(lambda f:g(lambda arg:f(f)(arg)))(lambda f:g(&lt;br /&gt;lambda arg: f(f)(arg))))(lambda f: lambda q:[(lambda g:(lambda f:g(lambda&lt;br /&gt;arg:f(f)(arg)))(lambda f: g(lambda arg:f(f)(arg))))(lambda f: lambda x:(x&lt;br /&gt;[0][0]if x[0] and __import__('re').match(x[0][0][0],x[1])else f([x[0][1:]&lt;br /&gt;,x[1]]))) ([q[3][q[1]],q[2].get(q[0],'')])[4](q[2].get(q[0],'')), (lambda&lt;br /&gt;g:(lambda f:g(lambda arg:f(f)(arg))) (lambda f:g(lambda arg:f(f)(arg))))(&lt;br /&gt;lambda f:lambda x:(x[0][0]if x[0] and __import__('re').match(x[0][0][0],x&lt;br /&gt;[1])else f([x[0][1:],x[1]])))([q[3][q[1]],q[2].get(q[0],'')])[3]if(lambda&lt;br /&gt;g:(lambda f:g(lambda arg:f(f)(arg))) (lambda f:g(lambda arg:f(f)(arg))))(&lt;br /&gt;lambda f:lambda x:(x[0][0]if x[0]and __import__('re').match(x[0][0][0],x[&lt;br /&gt;1]) else f([x[0][1:],x[1]])))([q[3][q[1]],q[2].get(q[0],'')])[3]in('YES',&lt;br /&gt;'NO')else f([q[0]+(lambda g:(lambda f:g(lambda arg:f(f)(arg)))(lambda f:g&lt;br /&gt;(lambda arg:f(f)(arg))))(lambda f:lambda x:(x[0][0]if x[0]and __import__(&lt;br /&gt;'re').match(x[0][0][0],x[1])else f([x[0][1:], x[1]])))([q[3][q[1]], q[2].&lt;br /&gt;get(q[0],'')])[2],(lambda g:(lambda f:g(lambda arg: f(f)(arg)))(lambda f:&lt;br /&gt;g(lambda arg:f(f)(arg))))(lambda f:lambda x:(x[0][0]if x[0]and __import__&lt;br /&gt;('re').match(x[0][0][0],x[1])else f([x[0][1:], x[1]])))([q[3][q[1]],q[2].&lt;br /&gt;get(q[0],'')])[1],q[2],q[3]])][1])([0,0,{i:x for i,x in enumerate('just '&lt;br /&gt;'another python hacker\n')}, [[[r'^[a-z ]$',0,+1,'',lambda x:print(x,end=&lt;br /&gt;'')], [r'^\n$',1,+1,'',lambda x:print(x, end='')],[r'^.*$',0,+1,'''NO''',&lt;br /&gt;lambda x:None]], [[r'''^$''',+1,-1,'''YES''', 
lambda x: None or None]]]])&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-2216291960249205473?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/2216291960249205473/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-turing-machine.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2216291960249205473'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2216291960249205473'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-turing-machine.html' title='reformed japhs in python: turing machine'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-1250130247563995637</id><published>2011-04-14T10:30:00.000-04:00</published><updated>2011-04-14T10:30:00.990-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='japh'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>reformed japhs in python: huffman coding</title><content type='html'>&lt;p&gt;At this point, tricking python into printing strings via ever more pernicious mechanisms got a little boring.  So I switched to obfuscating fundamental computer science algorithms.  Here's a JAPH that takes in a &lt;a href="http://en.wikipedia.org/wiki/Huffman_coding"&gt;Huffman coded&lt;/a&gt; version of 'just another python hacker', decodes, and prints it.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;# Build coding tree&lt;br /&gt;def build_tree(scheme):&lt;br /&gt;    if scheme.startswith('*'):&lt;br /&gt;        left, scheme = build_tree(scheme[1:])&lt;br /&gt;        right, scheme = build_tree(scheme)&lt;br /&gt;        return (left, right), scheme&lt;br /&gt;    else:&lt;br /&gt;        return scheme[0], scheme[1:]&lt;br /&gt;&lt;br /&gt;def decode(tree, encoded):&lt;br /&gt;    ret = ''&lt;br /&gt;    node = tree&lt;br /&gt;    for direction in encoded:&lt;br /&gt;        if direction == '0':&lt;br /&gt;            node = node[0]&lt;br /&gt;        else:&lt;br /&gt;            node = node[1]&lt;br /&gt;        if isinstance(node, str):&lt;br /&gt;            ret += node&lt;br /&gt;            node = tree&lt;br /&gt;    return ret&lt;br /&gt;&lt;br /&gt;tree 
= build_tree('*****ju*sp*er***yct* h**ka*no')[0]&lt;br /&gt;print(&lt;br /&gt;    decode(tree, bin(10627344201836243859174935587).lstrip('0b').zfill(103))&lt;br /&gt;)&lt;/pre&gt;&lt;p&gt;The decoding tree is built like a true LISP-style tree as a sequence of pairs.  '*' represents a branch in the tree while other characters are leaf nodes.  This looks like ((((('j', 'u'), ('s', 'p')), ('e', 'r')), ((('y', 'c'), 't'), (' ', 'h'))), (('k', 'a'), ('n', 'o'))) after it's constructed.&lt;/p&gt;&lt;p&gt;The actual Huffman coded version of our favorite string looks like 0000000001000100101011010111011101010111001000110110000110100001010111111110011001111010100110000100011, which in base-2 encoding represents around a 50% savings in space.  Well worth all the effort, right?&lt;/p&gt;&lt;p&gt;There's a catch here, which is that this is hard to obfuscate unless we turn it into a single expression.  This means that we have to convert build_tree and decode into lambda functions.  Unfortunately, they are recursive and lambda functions don't do that easily.  Fortunately, we can use &lt;a href="http://code.activestate.com/recipes/576366-y-combinator/"&gt;Y combinators&lt;/a&gt; and then the rest is simple.  
These are worth some study since they will pop up again later.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;Y = lambda g: (&lt;br /&gt;    lambda f: g(lambda arg: f(f)(arg))) (lambda f: g(lambda arg: f(f)(arg))&lt;br /&gt;)&lt;br /&gt;&lt;br /&gt;build_tree = Y(&lt;br /&gt;    lambda f: lambda scheme: (&lt;br /&gt;        (f(scheme[1:])[0], f(f(scheme[1:])[1])[0]),&lt;br /&gt;        f(f(scheme[1:])[1])[1 ]&lt;br /&gt;    ) if scheme.startswith('*') else (scheme[0], scheme[1:])&lt;br /&gt;)&lt;br /&gt;&lt;br /&gt;decode = Y(lambda f: lambda x: x[3]+x[1] if not x[2] else (&lt;br /&gt;    f([x[0], x[0], x[2], x[3]+x[1]]) if isinstance(x[1], str) else (&lt;br /&gt;        f([x[0], x[1][0], x[2][1:], x[3]]) if x[2][0] == '0' else (&lt;br /&gt;            f([x[0], x[1][1], x[2][1:], x[3]])&lt;br /&gt;        )&lt;br /&gt;    )&lt;br /&gt;))&lt;br /&gt;&lt;br /&gt;tree = build_tree('*****ju*sp*er***yct* h**ka*no')[0]&lt;br /&gt;print(&lt;br /&gt;    decode([&lt;br /&gt;        tree,&lt;br /&gt;        tree,&lt;br /&gt;        bin(10627344201836243859174935587).lstrip('0b').zfill(103), ''&lt;br /&gt;    ])&lt;br /&gt;)&lt;/pre&gt;&lt;p&gt;The final version is really just a condensed (and expanded, weirdly) version of the above (again, make sure to use Python 3.2):&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;print((lambda t,e,s:(lambda g:(lambda f:g(lambda arg:f(f)(arg)))(lambda f:&lt;br /&gt;g(lambda arg: f(f)(arg))))(lambda f:lambda x: x[3]+x[1]if not x[2]else f([&lt;br /&gt;x[0],x[0],x[2],x[3]+x[1]])if isinstance(x[1],str)else f([x[0],x[1][0],x[2]&lt;br /&gt;[1:],x[3]])if x[2][0]=='0'else f([x[0],x[1][1],x[2][1:],x[3]]))([t,t,e,s])&lt;br /&gt;)((lambda g:(lambda f:g(lambda arg:f(f)(arg)))(lambda f:g(lambda arg:f(f)(&lt;br /&gt;arg))))(lambda f:lambda p:((f(p[1:])[0],f(f(p[1:])[1])[0]),f(f(p[1:])[1])[&lt;br /&gt;1])if p.startswith('*')else(p[0],p[1:]))('*****ju*sp*er***yct* h**ka*no')[&lt;br 
/&gt;0],bin(10627344201836243859179756385-4820798).lstrip('0b').zfill(103),''))&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-1250130247563995637?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/1250130247563995637/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-huffman-coding.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1250130247563995637'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1250130247563995637'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-huffman-coding.html' title='reformed japhs in python: huffman coding'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-4821814306491558479</id><published>2011-04-11T09:20:00.000-04:00</published><updated>2011-04-11T09:21:54.573-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='japh'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>reformed japhs in python: rolling effect</title><content type='html'>&lt;p&gt;Here's a JAPH composed solely for effect.  
For each letter in 'just another python hacker' it loops over the characters of ' abcdefghijklmnopqrstuvwxyz', printing each one.  Between characters it pauses for 0.05 seconds, then backs up and moves on to the next character if it hasn't reached the desired one yet.  This achieves a sort of rolling effect by which the final string laboriously appears on our screen.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;import string&lt;br /&gt;import sys&lt;br /&gt;import time&lt;br /&gt;&lt;br /&gt;letters = ' ' + string.ascii_lowercase&lt;br /&gt;for target in 'just another python hacker':&lt;br /&gt;    for x in letters:&lt;br /&gt;        # Show the candidate character immediately, then pause&lt;br /&gt;        print(x, end='')&lt;br /&gt;        sys.stdout.flush()&lt;br /&gt;        time.sleep(0.05)&lt;br /&gt;&lt;br /&gt;        if x == target:&lt;br /&gt;            break&lt;br /&gt;        else:&lt;br /&gt;            # Wrong character: back up so the next one overwrites it&lt;br /&gt;            print('\b', end='')&lt;br /&gt;&lt;br /&gt;print()&lt;/pre&gt;&lt;p&gt;In the obfuscated version, locating and printing each letter in the string is done via a nested list comprehension.  
At the end we actually have an extra line of code (the eval statement) that gives us our newline.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;[[(lambda x,l:str(print(x,end=''))+str(__import__(print.&lt;br /&gt;__doc__[print.__doc__.index('stdout') - 4:print.__doc__.&lt;br /&gt;index('stdout')-1]).stdout.flush()) + str(__import__(''.&lt;br /&gt;join(reversed('emit'))).sleep(0o5*1.01/0x64))+str(print(&lt;br /&gt;'\b',end='\x09'.strip())if x!=l else'*&amp;#'))(x1,l1)for x1&lt;br /&gt;in('\x20'+getattr(__import__(type('phear').__name__+'in'&lt;br /&gt;'g'),dir(__import__(type('snarf').__name__+'ing'))[13]))&lt;br /&gt;[:('\x20'+getattr(__import__(type('smear').__name__+'in'&lt;br /&gt;'g'),dir(__import__(type('slurp').__name__+'ing'))[13]))&lt;br /&gt;.index(l1)+1]]for l1 in'''just another python hacker''']&lt;br /&gt;eval('''\x20\x09eval("\x20\x09eval('\x20 print()')")''')&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-4821814306491558479?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/4821814306491558479/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-rolling-effect.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4821814306491558479'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4821814306491558479'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-rolling-effect.html' title='reformed japhs in python: rolling effect'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-5780976825402438948</id><published>2011-04-06T07:30:00.002-04:00</published><updated>2011-04-06T07:30:01.204-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='japh'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>reformed japhs in python: rot13</title><content type='html'>&lt;p&gt;No series of JAPHs would be complete without &lt;a href="http://en.wikipedia.org/wiki/ROT13"&gt;ROT13&lt;/a&gt;.  This is the example through which aspiring Perl programmers learn to use tr and its synonym y.  In Perl the basic ROT13 JAPH starts as:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;$foo = 'whfg nabgure crey unpxre';&lt;br /&gt;$foo =~ y/a-z/n-za-m/;&lt;br /&gt;print $foo;&lt;/pre&gt;&lt;p&gt;Python has nothing quite so elegant in its default namespace.  However, this does give us the opportunity to explore a little used aspect of strings: the translate method.  
If we construct a dictionary of ordinals we can accomplish the same thing with a touch more effort.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;import string&lt;br /&gt;&lt;br /&gt;# Map each lowercase ordinal to the ordinal 13 places later, wrapping around&lt;br /&gt;table = {&lt;br /&gt;    ord(x): ord(y) for x, y in zip(&lt;br /&gt;        string.ascii_lowercase,&lt;br /&gt;        string.ascii_lowercase[13:] + string.ascii_lowercase[:13]&lt;br /&gt;    )&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;print('whfg nabgure clguba unpxre'.translate(table))&lt;/pre&gt;&lt;p&gt;We obfuscate the construction of this translation dictionary and, for added measure, use getattr to look up the print function on __builtins__.  This will likely only work in Python 3.2, since it indexes into dir() listings by position and the order of those attributes matters.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;getattr(vars()[list(filter(lambda _:'\x5f\x62'in _,dir&lt;br /&gt;()))[0]], dir(vars()[list(filter(lambda _:'\x5f\x62'in&lt;br /&gt;_, dir()))[0]])[list(filter(lambda _:_ [1].startswith(&lt;br /&gt;'\x70\x72'),enumerate(dir(vars()[list(filter(lambda _:&lt;br /&gt;'\x5f\x62'in _,dir()))[0]]))))[0][0]])(getattr('whfg '&lt;br /&gt;+'''nabgure clguba unpxre''', dir('0o52')[0o107])({ _:&lt;br /&gt;(_-0o124) %0o32 +0o141 for _ in range(0o141, 0o173)}))&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-5780976825402438948?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/5780976825402438948/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-rot13.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/5780976825402438948'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/5780976825402438948'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-rot13.html' title='reformed japhs in python: rot13'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-1214166675235651821</id><published>2011-04-03T10:30:00.000-04:00</published><updated>2011-04-03T10:30:00.910-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='japh'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>reformed japhs in python: ridiculous anagram</title><content type='html'>&lt;p&gt;Here's the second in my reformed JAPH series.  It takes an anagram of 'just another python hacker' and converts it prior to printing.  It sorts the anagram by the indices of another string, in order of their associated characters.  
This is sort of like a pre-digested &lt;a href="http://en.wikipedia.org/wiki/Schwartzian_transform"&gt;Schwartzian transform&lt;/a&gt;.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;x = 'upjohn tehran hectors katy'&lt;br /&gt;y = '1D0HG6JFO9P5ICKAM87B24NL3E'&lt;br /&gt;&lt;br /&gt;print(''.join(x[i] for i in sorted(range(len(x)), key=lambda p: y[p])))&lt;/pre&gt;&lt;p&gt;Obfuscation consists mostly of using silly machinations to construct the string we use to sort the anagram.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;print(''.join('''upjohn tehran hectors katy'''[_]for _ in sorted(range&lt;br /&gt;(26),key=lambda p:(hex(29)[2:].upper()+str(3*3*3*3-3**4)+'HG'+str(sum(&lt;br /&gt;range(4)))+'JFO'+str((1+2)**(1+1))+'P'+str(35/7)[:1]+'i.c.k.'.replace(&lt;br /&gt;'.','').upper()+'AM'+str(3**2*sum(range(5))-3)+hex(0o5444)[2:].replace&lt;br /&gt;(*'\x62|\x42'.split('|'))+'NL'+hex(0o076).split('x')[1].upper())[p])))&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-1214166675235651821?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/1214166675235651821/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-ridiculous.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1214166675235651821'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1214166675235651821'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-ridiculous.html' title='reformed japhs in python: ridiculous 
anagram'/><author><name>Ryan J. O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-2408732761901175671</id><published>2011-04-01T21:40:00.008-04:00</published><updated>2011-04-01T22:17:13.414-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='japh'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>reformed japhs in python: alphabetic indexing</title><content type='html'>&lt;p&gt;Full disclosure: I used to be a Perl programmer. (Hey, &lt;a href="http://drforr.livejournal.com/"&gt;Jeff.&lt;/a&gt;)&lt;/p&gt;&lt;p&gt;One day I became disillusioned at the progress of Perl 6 and decided to &lt;a href="http://www.python.org/dev/peps/pep-0020/"&gt;import this&lt;/a&gt;.  This appears to be a fairly common story for Perl to Python converts.  While I haven't looked back much, there are a number of things I really miss about perl &lt;i&gt;(lower case intentional)&lt;/i&gt;.  Among other things, I miss having value types in a dynamic language, magical and ill-advised use of &lt;a href="http://www.foo.be/docs/tpj/issues/vol3_1/tpj0301-0003.html"&gt;cryptocontext&lt;/a&gt;, and sometimes even &lt;a href="http://perldesignpatterns.com/?PseudoHash"&gt;pseudohashes&lt;/a&gt; because they were inexcusably weird.  A language that supports so many ideas out of the box enables an extended learning curve that lasts for &lt;a href="http://silver.sucs.org/~manic/humour/languages/perlhacker.htm"&gt;many years&lt;/a&gt;.  
"Perl itself is the game."&lt;/p&gt;&lt;p&gt;Most of all I think I miss writing Perl &lt;a href="http://www.perlmonks.org/?node=Perl%20Poetry"&gt;poetry&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Just_another_Perl_hacker"&gt;JAPHs&lt;/a&gt;.  Stupidly, I didn't keep any of those I wrote, and I'm not competent enough with the language anymore to write interesting ones.  At the time I was intentionally distancing myself from a model that was largely implicit and based on understanding of archaic system internals and moving to one that was (supposedly) explicit and simple.  After switching to Python as my primary language, I used the following email signature in a nod to this change in worldview &lt;i&gt;(intended for Python 2.x)&lt;/i&gt;:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;print 'just another python hacker'&lt;/pre&gt;&lt;p&gt;Recently I've been experimenting with writing JAPHs in Python.  I think of these as reformed JAPHs.  They accomplish the same purpose as programming exercises but exist with a far more restricted context.  In some ways they are more challenging.  Creativity is difficult when functioning within a more explicit landscape.  I now have a small series of reformed JAPHs dedicated to a close friend which increase monotonically in complexity.  Here is the first one, written in plain understandable Python 3.2:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;import string&lt;br /&gt;&lt;br /&gt;letters = string.ascii_lowercase + ' '&lt;br /&gt;indices = [&lt;br /&gt;     9, 20, 18, 19, 26,  0, 13, 14, 19, 7,  4, 17, 26,&lt;br /&gt;    15, 24, 19,  7, 14, 13, 26,  7,  0, 2, 10,  4, 17&lt;br /&gt;]&lt;br /&gt;&lt;br /&gt;print(''.join(letters[i] for i in indices))&lt;/pre&gt;&lt;p&gt;This is fairly simple.  Instead of explicitly embedding the string 'just another python hacker' in the program, we assemble it using the index of its letters in the string 'abcdefghijklmnopqrstuvwxyz '.  
Obfuscation is achieved through a series of minor measures:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Instead of calling the print function, we import sys and make a call to sys.stdout.write.&lt;/li&gt;&lt;li&gt;string.ascii_lowercase + ' ' is assembled from the characters for its respective ordinal values (97 through 122, plus 32).&lt;/li&gt;&lt;li&gt;The integer indices are stored as a single string delimited by 'l' and split back into a list at run time.&lt;/li&gt;&lt;li&gt;Liberal use of ''' and the fact that adjacent string literals in Python are concatenated.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Here's the obfuscated version:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;eval("__import__('''\x73''''''\x79''''''\x73''').sTdOuT".lower()&lt;br /&gt;).write(''.join(map(lambda _:(list(map(chr,range(97,123)))+[chr(&lt;br /&gt;32)])[int(_)],('''9l20l18l19''''''l26l0l13l14l19l7l4l17l26l15'''&lt;br /&gt;'''l24l19l7l14l1''''''3l26l7l0l2l10l4l17''').split('l')))+'\n',)&lt;/pre&gt;&lt;p&gt;More could certainly be done, but that's about as far as this one held my interest.  
Stay tuned for the next JAPH.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-2408732761901175671?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/2408732761901175671/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-alphabetic.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2408732761901175671'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2408732761901175671'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/04/reformed-japhs-in-python-alphabetic.html' title='reformed japhs in python: alphabetic indexing'/><author><name>Ryan J. 
O'Neil</name><uri>http://www.blogger.com/profile/01002920068652190833</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/-7zSc00n-pqc/TbgcePz1iNI/AAAAAAAAAcc/RSMQsfpwnHo/s220/foo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-4118227637302054901</id><published>2011-03-14T17:28:00.000-04:00</published><updated>2011-03-15T00:22:10.448-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ampl'/><category scheme='http://www.blogger.com/atom/ns#' term='formulation'/><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='zimpl'/><title type='text'>what i hate about writing decision support systems: a call to arms</title><content type='html'>&lt;p&gt;I realize it's been a while since I've updated &lt;a href="http://code.google.com/p/python-zibopt/"&gt;python-zibopt&lt;/a&gt;.  While it's true that I am wracked with guilt over this&lt;sup&gt;&lt;a href="#footnote-1"&gt;1&lt;/a&gt;&lt;/sup&gt;, I can't erase a feeling that there are larger issues that should be addressed in the optimization solver API department.&lt;/p&gt;&lt;p&gt;There are two standard methods for accessing a solver: through an algebraic modeling language like &lt;a href="http://ampl.com/"&gt;AMPL&lt;/a&gt; or using some kind of API.  python-zibopt belongs to the latter category, along with a host of other technologies.  Modeling languages tend to be very good at describing algebraic models succinctly, and at loading data into these models &lt;i&gt;(provided it's in a supported format)&lt;/i&gt;.  Where they lack is in basic programming features and libraries.  Try writing an interface between a modeling system and a web site, or performing anything beyond the simplest logic within a model.  These are often difficult or impossible tasks.  
Never mind getting access to things like an object system or concurrency.&lt;/p&gt;&lt;p&gt;And these aren't things modeling language developers should have to worry about.  They already exist in most general-purpose programming languages, which is why most serious decision support systems use direct APIs for optimization.  Unfortunately, this means giving up access to most algebraic modeling features.  Consider the flow of operations for a typical decision support system:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Collect user input, probably represented in a database model.&lt;/li&gt;&lt;li&gt;Convert this input to an algebraic model and variables.&lt;/li&gt;&lt;li&gt;Hand this model off to a solver, often in the form of a big, ugly matrix.&lt;/li&gt;&lt;li&gt;Call the solver. Wait until it reaches an optimal solution, the allowed bounds gap, or a timeout.&lt;/li&gt;&lt;li&gt;Assuming a feasible or optimal solution is even found, convert it back into the application's data model.&lt;/li&gt;&lt;li&gt;Display model solution output to the user.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;This is a little bit insane.  At best, a developer has to code two representations of the data and two different conversions from one model type to another.  The pain of this process has a lot to do with the solver APIs available, and these are anything but standard.  At its &lt;a href="http://abel.ee.ucla.edu/cvxopt/examples/tutorial/lp.html"&gt;most painful&lt;/a&gt; &lt;i&gt;(and most common)&lt;/i&gt;, this requires construction of a giant matrix for feeding into the solver.  &lt;a href="http://www.gurobi.com/doc/40/quickstart/node8.html"&gt;Some APIs&lt;/a&gt; wrap constraints and "variables"&lt;sup&gt;&lt;a href="#footnote-2"&gt;2&lt;/a&gt;&lt;/sup&gt; into object classes to try to attenuate the pain.  
Others, like python-zibopt, add syntactic sugar from algebraic modeling languages, such as creating constraints by summation of decision variables over index sets, to make the modeler feel more at home.&lt;/p&gt;&lt;p&gt;Still, that's about as far as we can seem to go.  And the results still require different representations of the same information along with numerous conversions back and forth between formats.  Surely, as operations research practitioners we recognize that this is fundamentally inefficient.  As someone who straddles the worlds of CS and OR, here are a few things that I think should be easy for me to do when developing models:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Define single variables that represent data living in a database, web form, and solver, much the way web programmers use an &lt;a href="http://en.wikipedia.org/wiki/Object-relational_mapping"&gt;object-relational mapper&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Serialize an algebraic model to a simple format such as JSON.&lt;/li&gt;&lt;li&gt;Subject decision variables to the same kinds of conditionals I can use with normal variables, like "if i &lt;= 3".&lt;/li&gt;&lt;li&gt;All the things I haven't thought of yet but can do using a traditional programming language.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;It's my belief that this is difficult because general purpose languages and modeling languages treat variables in fundamentally different ways.  In most languages, the moment a variable is defined it has some sort of value, even if that value is null.  In purely functional languages, variables are prohibited from even being null.  &lt;i&gt;(They also don't vary, but that's another story.)&lt;/i&gt;  For an algebraic model, a variable represents a universe of possibilities.  Typically a variable can take on any real value immediately upon instantiation.  That universe may be limited by the addition of constraints.  
Decision variables then take on a series of values as the solver searches the resulting feasible set.&lt;/p&gt;&lt;p&gt;It seems that resolving this dichotomy could alleviate the difficulties of our current modeling situation.  I'm not sure where to start.  Perhaps the introduction of a sub-language into a general purpose language would do the trick?  This would be similar to how &lt;a href="https://github.com/perlpilot/perl6-docs/blob/master/intro/p6-regex-intro.pod"&gt;regular expressions are treated in Perl 6&lt;/a&gt;.  I'd like to hear your thoughts.&lt;/p&gt;&lt;ol&gt;&lt;a name="footnote-1"&gt;&lt;/a&gt;&lt;li&gt;Seriously.  This keeps me up at night.&lt;/li&gt;&lt;a name="footnote-2"&gt;&lt;/a&gt;&lt;li&gt;The oddity of defining another variable type within a general purpose programming language is not lost on me.&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-4118227637302054901?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/4118227637302054901/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/03/what-i-hate-about-writing-decision.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4118227637302054901'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4118227637302054901'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/03/what-i-hate-about-writing-decision.html' title='what i hate about writing decision support systems: a call to arms'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-1866176395712388683</id><published>2011-03-12T23:09:00.000-05:00</published><updated>2011-03-12T23:09:35.409-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pycon'/><category scheme='http://www.blogger.com/atom/ns#' term='yapc'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>talks that shaped me</title><content type='html'>&lt;p&gt;As this weekend's PyCon progresses, which I am unfortunately unable to attend, I find myself reminiscing about a similar conference I went to a decade ago.  At the time the .com bust hadn't entirely taken hold of the Washington economy yet, so a few coworkers and I were sent to &lt;a href="http://www.yapc.org/America/previous-years/2001/"&gt;YAPC 19101&lt;/a&gt;&lt;sup&gt;&lt;a href="#footnote-1"&gt;1&lt;/a&gt;&lt;/sup&gt;.  I was a newly minted developer and hadn't received much formal computer science education yet.  Here are a few talks that opened my eyes to the world of possibilities.&lt;/p&gt;&lt;h4&gt;&lt;a href="http://schwern.net/"&gt;Michael Schwern&lt;/a&gt;: Disciplined Programming, or How to be Lazy without Really Trying&lt;/h4&gt;&lt;p&gt;This talk was about testing.  Schwern attempted to convince us that writing unit tests, although they weren't called that yet, was fundamental to the &lt;a href="http://c2.com/cgi/wiki?LazinessImpatienceHubris"&gt;three virtues of a programmer&lt;/a&gt;.  He reasoned that he maintained a lot of well used modules on CPAN and would never be able to handle them all if he hadn't developed habits that enable laziness.  Thus automated testing engenders laziness and is therefore to be encouraged.  
This is when I realized that programming is the intersection of two totally different fields: a &lt;a href="http://en.wikipedia.org/wiki/Computer_science"&gt;science&lt;/a&gt; and a &lt;a href="http://en.wikipedia.org/wiki/Software_engineering"&gt;discipline&lt;/a&gt;.&lt;/p&gt;&lt;h4&gt;&lt;a href="http://perl.plover.com/"&gt;Mark-Jason Dominus&lt;/a&gt;: Stolen Secrets of the Wizards of the Ivory Tower&lt;/h4&gt;&lt;p&gt;MJD, one of the few programmers known by his initials, was deep into his studies of functional programming at the time and gave an in-depth preview of what would become &lt;a href="http://hop.perl.plover.com/"&gt;Higher Order Perl&lt;/a&gt;.  In addition to well known &lt;a href="http://www.catb.org/~esr/faqs/hacker-howto.html"&gt;advice from ESR&lt;/a&gt;, another TLA, this is why I learned functional programming.  It's also what piqued my interest in algorithmic complexity, which later led to a love of integer programming.&lt;/p&gt;&lt;h4&gt;&lt;a href="http://en.wikipedia.org/wiki/Damian_Conway"&gt;Damian Conway&lt;/a&gt;: Life, the Universe, and Everything&lt;/h4&gt;&lt;p&gt;This talk is mostly a blur, but that's because it was at the time too.  Conway possessed such a deep understanding of so many topics, and an ability to leap from one to another, that the experience was dizzying.  He implemented the &lt;a href="http://www.bitstorm.org/gameoflife/"&gt;Game of Life&lt;/a&gt; using &lt;a href="http://search.cpan.org/~dconway/Quantum-Superpositions-1.03/lib/Quantum/Superpositions.pm"&gt;Quantum::Superpositions&lt;/a&gt;, both &lt;a href="http://search.cpan.org/~mschwern/Lingua-tlhInganHol-yIghun-20090601/lib/Lingua/tlhInganHol/yIghun.pm"&gt;wrote Perl&lt;/a&gt; and spoke in Klingon, and, if I remember correctly, received a standing ovation at the end.  
Conway set my standard for live demos.&lt;/p&gt;&lt;p&gt;There were others that made an impression, but even these three are hazy after a decade of bit rot.&lt;/p&gt;&lt;p&gt;When I got home I was excited and in awe.  I hacked at tail recursive functions in assembly language, studied LISP, and painstakingly implemented the exercises from &lt;a href="http://mitpress.mit.edu/sicp/full-text/book/book.html"&gt;SICP&lt;/a&gt; in half a dozen languages.  I learned Erlang from the &lt;a href="http://www.erlang.org/erlang_book_toc.html"&gt;only available book at the time&lt;/a&gt; because I thought concurrent programming would one day be important&lt;sup&gt;&lt;a href="#footnote-2"&gt;2&lt;/a&gt;&lt;/sup&gt;.  I picked up Python because "Python is not the enemy, Java is."  YAPC energized me to pursue the routine of computer science autodidact.  It took almost another &lt;a href="http://norvig.com/21-days.html"&gt;ten years&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Montreal will always hold a special place in my heart.  Before I was born, my parents lived there while my father was a graduate student at McGill studying biophysical chemistry.  They were dirt poor and suffered, but whenever they speak of it they light up and are suddenly animated.  For me, it's where I learned how big the universe of knowledge is, and how little I had been exposed to.  It's where I was given the push that shaped my studies, career, and life.  Nowadays conference talks are often recorded and posted to YouTube or blip.tv, but back then it was considered cutting edge for someone to post MP3s of lightning talks.  To those of you who speak at community technical conferences, thank you, and know that your efforts do make an impact.&lt;/p&gt;&lt;ol&gt;&lt;a name="footnote-1"&gt;&lt;/a&gt;&lt;li&gt;Prior to Y2K, it was fairly common to find Perl code that assembled the current year using "'19'.(localtime())[5]".  
Of course, index 5 of the return value of localtime is the number of years since 1900, so this produced 19100 instead of 2000 and 19101 instead of 2001.  This was &lt;i&gt;really&lt;/i&gt; funny at the time.&lt;/li&gt;&lt;a name="footnote-2"&gt;&lt;/a&gt;&lt;li&gt;Though it's uncredited, I actually did submit an entry for &lt;a href="http://foldoc.org/Open+Telecom+Platform"&gt;OTP to FOLDOC&lt;/a&gt; back in August of 2001.  My claim to fame.&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-1866176395712388683?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/1866176395712388683/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/03/talks-that-shaped-me.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1866176395712388683'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1866176395712388683'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/03/talks-that-shaped-me.html' title='talks that shaped me'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-2965750707396457257</id><published>2011-02-23T17:11:00.004-05:00</published><updated>2011-05-19T14:43:48.035-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='simulation'/><category scheme='http://www.blogger.com/atom/ns#' term='media'/><category scheme='http://www.blogger.com/atom/ns#' term='r'/><title type='text'>simulating gdp growth</title><content type='html'>&lt;p&gt;&lt;i&gt;[2011-02-24 This post has been corrected.  Near the end it stated that we could be 95% confident China's GDP would overtake the USA between 2057 and 2059.  It should have stated that we could be 95% confident the mean of that year would be between 2057 and 2059.  Thanks to my overbearing subconscious for pointing this out.]&lt;/i&gt;&lt;/p&gt;&lt;p&gt;I hope you saw &lt;a href="http://www.washingtonpost.com/wp-srv/special/business/china-growth/"&gt;China’s way to the top&lt;/a&gt; on the Post's website recently.  It's a very clear presentation of their statement and is certainly worth a look.&lt;/p&gt;&lt;p&gt;So say you're an economist and you actually do need to produce a realistic estimate of when China's GDP surpasses that of the USA.  Can you use such an approach?  Not really.  There are several simplifying assumptions the Post made that are perfectly reasonable.  However, if the goal is an analytical output from a highly random system such as GDP growth, one should not assume the inputs are fixed. &lt;i&gt;(I'm not saying I have any gripe with their interactive.  This post has a different purpose.)&lt;/i&gt;&lt;/p&gt;&lt;p&gt;Why is this?  
The short answer is that randomness in any system can change its output drastically from one run to the next.  Even if the mean from a deterministic analysis is correct, it tells us nothing about the variance of our output.  We really need a confidence interval of years when China is likely to overtake the USA.&lt;/p&gt;&lt;p&gt;We'll move in the great tradition of all simulation studies.  First we prepare our input.  A CSV of GDP in current US dollars for both countries from 1960 to 2009 is available &lt;a href="http://chenoneil.com/blog/simulation/gdp.csv"&gt;here&lt;/a&gt;.  This was simply copied and transposed from the World Bank &lt;a href="http://data.worldbank.org/country/china"&gt;data&lt;/a&gt; &lt;a href="http://data.worldbank.org/country/usa"&gt;files&lt;/a&gt; &lt;i&gt;(thanks, &lt;a href="http://twitter.com/#!/kelsosCorner"&gt;Kelso&lt;/a&gt;)&lt;/i&gt;.  We read this into a data frame and calculate their growth rates year over year.  Note that the first value for growth has to be NA.&lt;/p&gt;&lt;pre&gt;&gt; gdp &lt;- read.csv('gdp.csv')&lt;br /&gt;&gt; gdp$USA.growth &lt;- rep(NA, length(gdp$USA))&lt;br /&gt;&gt; gdp$China.growth &lt;- rep(NA, length(gdp$China))&lt;br /&gt;&gt; for (i in 2:length(gdp$USA)) {&lt;br /&gt;+     gdp$USA.growth[i] &lt;- 100 * (gdp$USA[i] - gdp$USA[i-1]) / gdp$USA[i-1]&lt;br /&gt;+     gdp$China.growth[i] &lt;- 100 * (gdp$China[i] - gdp$China[i-1]) / gdp$China[i-1]&lt;br /&gt;+ }&lt;/pre&gt;&lt;p&gt;We now analyze our inputs and assign probability distributions to the annual growth rates.  In a full study this would involve comparing a number of different distributions and choosing the one that fits the input data best, but that's well beyond the scope of this post.  
Instead, we'll use the poor man's way out: plot histograms and visually verify what we hope to be true, that the distributions are normal.&lt;/p&gt;&lt;center&gt;&lt;img border="0" height="319" src="http://3.bp.blogspot.com/-gv8uC9UW_R0/TWWDO6fUkjI/AAAAAAAAACM/VlwfolsmIDo/s320/us-gdp-percent-growth-histogram.png" width="320" /&gt;&lt;br /&gt;&lt;img border="0" height="320" width="320" src="http://4.bp.blogspot.com/-P4wpC3qXY0s/TWWDcXsjIgI/AAAAAAAAACU/xDUvMIrWOwg/s320/china-gdp-percent-growth-histogram.png" /&gt;&lt;/center&gt;&lt;p&gt;And they pretty much are.  That's good enough for our purposes.  Now all we need are the distribution parameters, which are mean and standard deviation for normal distributions.&lt;/p&gt;&lt;pre&gt;&gt; mean(gdp$USA.growth[!is.na(gdp$USA.growth)])&lt;br /&gt;[1] 7.00594&lt;br /&gt;&gt; sd(gdp$USA.growth[!is.na(gdp$USA.growth)])&lt;br /&gt;[1] 2.889808&lt;br /&gt;&gt; mean(gdp$China.growth[!is.na(gdp$China.growth)])&lt;br /&gt;[1] 9.90896&lt;br /&gt;&gt; sd(gdp$China.growth[!is.na(gdp$China.growth)])&lt;br /&gt;[1] 10.5712&lt;/pre&gt;&lt;p&gt;Now our input analysis is done.  These are the inputs:&lt;/p&gt;&lt;center&gt;USA Growth ~ N(7.00594, 2.889808&lt;sup&gt;2&lt;/sup&gt;)&lt;br /&gt;China Growth ~ N(9.90896, 10.5712&lt;sup&gt;2&lt;/sup&gt;)&lt;/center&gt;&lt;p&gt;This should make the advantage of such an approach much more obvious.  Compare the standard deviations for the two countries.  China is a lot more likely to have negative GDP growth in any given year.  They're also more likely to have astronomical growth.&lt;/p&gt;&lt;p&gt;We now build and run our simulation study.  The more times we run the simulation the tighter we can make our confidence interval &lt;i&gt;(to a point)&lt;/i&gt;, so we'll pick a pretty big number somewhat arbitrarily.  
If we want to, we can be fairly scientific about determining how many iterations are necessary after we've done some runs, but we have to start somewhere.&lt;/p&gt;&lt;pre&gt;&gt; repetitions &lt;- 10000&lt;/pre&gt;&lt;p&gt;This is the code for our simulation.  For each iteration, it starts both countries at their 2009 GDPs.  It then iterates, changing GDP randomly until China's GDP is at least the same value as the USA's.  When that happens, it records the current year.&lt;/p&gt;&lt;pre&gt;&gt; results &lt;- rep(NA, repetitions)&lt;br /&gt;&gt; for (i in 1:repetitions) {&lt;br /&gt;+     usa &lt;- gdp$USA[length(gdp$USA)]&lt;br /&gt;+     china &lt;- gdp$China[length(gdp$China)]&lt;br /&gt;+     year &lt;- gdp$Year[length(gdp$Year)]&lt;br /&gt;+  &lt;br /&gt;+     while (TRUE) {&lt;br /&gt;+         year &lt;- year + 1&lt;br /&gt;+   &lt;br /&gt;+         usa.growth &lt;- rnorm(1, 7.00594, 2.889808)&lt;br /&gt;+         china.growth &lt;- rnorm(1, 9.90896, 10.5712)&lt;br /&gt;+   &lt;br /&gt;+         usa &lt;- usa * (1 + (usa.growth / 100))&lt;br /&gt;+         china &lt;- china * (1 + (china.growth / 100))&lt;br /&gt;+   &lt;br /&gt;+         if (china &gt;= usa) {&lt;br /&gt;+             results[i] &lt;- year&lt;br /&gt;+             break&lt;br /&gt;+         }&lt;br /&gt;+     }&lt;br /&gt;+ }&lt;/pre&gt;&lt;p&gt;From the results vector we see that, given the data and assumptions for this model, China should surpass the USA in 2058.  We also see that we can be 95% confident that the mean year this will happen is between 2057 and 2059.  This is not quite the same as saying we are confident this will actually happen between those years.  
The result of our simulation is a probability distribution and we are discovering information about it.&lt;/p&gt;&lt;pre&gt;&gt; mean(results)&lt;br /&gt;[1] 2058.494&lt;br /&gt;&gt; mean(results) + (sd(results) / sqrt(length(results)) * qnorm(0.025))&lt;br /&gt;[1] 2057.873&lt;br /&gt;&gt; mean(results) + (sd(results) / sqrt(length(results)) * qnorm(0.975))&lt;br /&gt;[1] 2059.114&lt;/pre&gt;&lt;p&gt;So what's wrong with this model?  Well, we had to make a number of assumptions:&lt;ol&gt;  &lt;li&gt;We assume we actually used the right data set.  This was more of a how-to than a proper analysis, so that wasn't too much of a concern.&lt;/li&gt;  &lt;li&gt;We assume future growth for the next 40-50 years resembles past growth from 1960-2009.  This is a bit ridiculous, of course, but that's the problem with forecasting.&lt;/li&gt;  &lt;li&gt;We assume growth is normally distributed and that we don't encounter heavy-tailed behaviors in our distributions.&lt;/li&gt;  &lt;li&gt;We assume each year's growth is independent of the year before it.  See the last exercise.&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;p&gt;Here are some good simulation exercises if you're looking to do more:&lt;ol&gt;  &lt;li&gt;Note how the outputs are quite a bit different from the Post graphic.  I expect that's largely due to the inclusion of data back to 1960.  Try running the simulation for yourself using just the past 10, 20, and 30 years and see how that changes the result.&lt;/li&gt;  &lt;li&gt;Write a simulation to determine the probability China's GDP surpasses the USA's in the next 25 years.  Now plot the mean GDP and 95% confidence intervals for each country per year.&lt;/li&gt;  &lt;li&gt;Assume that there are actually two distributions for growth for each country: one when the previous year had positive growth and another when it was negative.  
How does that change the output?&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-2965750707396457257?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/2965750707396457257/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/02/simulating-gdp-growth.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2965750707396457257'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2965750707396457257'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/02/simulating-gdp-growth.html' title='simulating gdp growth'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-gv8uC9UW_R0/TWWDO6fUkjI/AAAAAAAAACM/VlwfolsmIDo/s72-c/us-gdp-percent-growth-histogram.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-3355722417010029497</id><published>2011-02-15T16:13:00.000-05:00</published><updated>2011-02-15T16:13:18.254-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='data fitting'/><category scheme='http://www.blogger.com/atom/ns#' term='r'/><title type='text'>data fitting part 2a: very, very simple linear regression in r</title><content type='html'>&lt;p&gt;I thought it might be useful to follow up the &lt;a href="http://adventuresinoptimization.blogspot.com/2011/02/data-fitting-part-2-very-very-simple.html"&gt;last post&lt;/a&gt; with another one showing the same examples in R.  If you want to try the examples, first download a CSV of the input data &lt;a href="http://chenoneil.com/blog/data-fitting/example_data.csv"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;R provides a function called lm, which is similar in spirit to &lt;a href="http://numpy.scipy.org/"&gt;numpy&lt;/a&gt;'s linalg.lstsq.  
As you'll see, lm's interface is a bit more tuned to the concepts of modeling.&lt;/p&gt;&lt;p&gt;We begin by reading in the example CSV into a data frame:&lt;/p&gt;&lt;pre&gt;&gt; responses &lt;- read.csv('example_data.csv')&lt;br /&gt;&lt;br /&gt;&gt; responses&lt;br /&gt;  respondent vanilla.love strawberry.love chocolate.love&lt;br /&gt;1     Serdar            9               4              9&lt;br /&gt;2        Dan            8               6              4&lt;br /&gt;3  Nathaniel            9               4              8&lt;br /&gt;4     Lauren            3               7              9&lt;br /&gt;5        Jen            6               8              5&lt;br /&gt;6     Jackie            4               5              3&lt;br /&gt;&lt;br /&gt;  dog.love cat.love&lt;br /&gt;1        9        9&lt;br /&gt;2       10        4&lt;br /&gt;3        2        6&lt;br /&gt;4        4        6&lt;br /&gt;5        2        5&lt;br /&gt;6       10        3&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;A data frame is sort of like a matrix, but with named columns.  That is, we can refer to entire columns using the dollar sign.  We are now ready to run least squares.  We'll create the model for predicting "dog love."  To create the "cat love" model, simply use that column name instead:&lt;/p&gt;&lt;pre&gt;&gt; fit1 &lt;- lm(responses$dog.love ~ responses$vanilla.love &lt;br /&gt;    + responses$strawberry.love + responses$chocolate.love)&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;The syntax for lm is a little offputting at first.  This call tells it to create a model for "dog love" with respect to &lt;i&gt;(the ~)&lt;/i&gt; a function of the form &lt;i&gt;offset + x1*vanilla love + x2*strawberry love + x3*chocolate love&lt;/i&gt;.  Note that the offset is conveniently implied when using lm, so this is the same as the second model we created in Python.  
Now that we've computed the coefficients for our "dog love" model, we can ask R about it:&lt;/p&gt;&lt;pre&gt;&gt; summary(fit1)&lt;br /&gt;&lt;br /&gt;Call:&lt;br /&gt;lm(formula = responses$dog.love ~ responses$vanilla.love &lt;br /&gt;  + responses$strawberry.love + responses$chocolate.love)&lt;br /&gt;&lt;br /&gt;Residuals:&lt;br /&gt;      1       2       3       4       5       6 &lt;br /&gt; 3.1827  2.9436 -4.5820  0.8069 -1.9856 -0.3657 &lt;br /&gt;&lt;br /&gt;Coefficients:&lt;br /&gt;                          Estimate Std. Error t value Pr(&gt;|t|)&lt;br /&gt;(Intercept)                20.9298    15.0654   1.389    0.299&lt;br /&gt;responses$vanilla.love     -0.2783     0.9934  -0.280    0.806&lt;br /&gt;responses$strawberry.love  -1.4314     1.5905  -0.900    0.463&lt;br /&gt;responses$chocolate.love   -0.7647     0.8214  -0.931    0.450&lt;br /&gt;&lt;br /&gt;Residual standard error: 4.718 on 2 degrees of freedom&lt;br /&gt;Multiple R-squared: 0.4206,     Adjusted R-squared: -0.4485 &lt;br /&gt;F-statistic: 0.484 on 3 and 2 DF,  p-value: 0.7272 &lt;br /&gt;&lt;/pre&gt;&lt;p&gt;This gives us quite a bit of information, including the coefficients for our "dog love" model and various error metrics.  You can find the offset and coefficients under the Estimate column above.  We quickly verify this using R's vectorized arithmetic:&lt;/p&gt;&lt;pre&gt;&gt; 20.9298 - 0.2783 * responses$vanilla.love &lt;br /&gt;    - 1.4314 * responses$strawberry.love &lt;br /&gt;    - 0.7647 * responses$chocolate.love&lt;br /&gt;&lt;br /&gt;[1]  5.8172  7.0562  6.5819  3.1928  3.9853 10.3655&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;You'll notice the model is essentially the same as the one we got from numpy.  Our next step is to add in the squared inputs.  We do this by adding extra terms to the modeling formula.  The I() function allows us to easily add additional operators to columns.  That's how we accomplish the squaring.  
We could alternatively add squared input values to the data frame, but using I() is more convenient and natural.&lt;/p&gt;&lt;pre&gt;&gt; fit2 &lt;- lm(responses$dog.love ~ responses$vanilla.love &lt;br /&gt;    + I(responses$vanilla.love^2) + responses$strawberry.love &lt;br /&gt;    + I(responses$strawberry.love^2) + responses$chocolate.love &lt;br /&gt;    + I(responses$chocolate.love^2))&lt;br /&gt;&lt;br /&gt;&gt; summary(fit2)&lt;br /&gt;&lt;br /&gt;Call:&lt;br /&gt;lm(formula = responses$dog.love ~ responses$vanilla.love &lt;br /&gt;  + I(responses$vanilla.love^2) + responses$strawberry.love &lt;br /&gt;  + I(responses$strawberry.love^2) + responses$chocolate.love &lt;br /&gt;  + I(responses$chocolate.love^2))&lt;br /&gt;&lt;br /&gt;Residuals:&lt;br /&gt;ALL 6 residuals are 0: no residual degrees of freedom!&lt;br /&gt;&lt;br /&gt;Coefficients: (1 not defined because of singularities)&lt;br /&gt;                               Estimate Std. Error t value Pr(&gt;|t|)&lt;br /&gt;(Intercept)                    -357.444         NA      NA       NA&lt;br /&gt;responses$vanilla.love           72.444         NA      NA       NA&lt;br /&gt;I(responses$vanilla.love^2)      -6.111         NA      NA       NA&lt;br /&gt;responses$strawberry.love        59.500         NA      NA       NA&lt;br /&gt;I(responses$strawberry.love^2)   -5.722         NA      NA       NA&lt;br /&gt;responses$chocolate.love          7.000         NA      NA       NA&lt;br /&gt;I(responses$chocolate.love^2)        NA         NA      NA       NA&lt;br /&gt;&lt;br /&gt;Residual standard error: NaN on 0 degrees of freedom&lt;br /&gt;Multiple R-squared:     1,      Adjusted R-squared:   NaN &lt;br /&gt;F-statistic:   NaN on 5 and 0 DF,  p-value: NA &lt;br /&gt;&lt;/pre&gt;&lt;p&gt;We can see that we get the same "dog love" model as produced by the third Python version of the last post.  
Again, we quickly verify that the output is the same (minus some rounding errors):&lt;/p&gt;&lt;pre&gt;&gt; -357.444 + 72.444 * responses$vanilla.love &lt;br /&gt;    - 6.111 * responses$vanilla.love^2 &lt;br /&gt;    + 59.5 * responses$strawberry.love &lt;br /&gt;    - 5.722 * responses$strawberry.love^2 &lt;br /&gt;    + 7 * responses$chocolate.love&lt;br /&gt;&lt;br /&gt;[1]  9.009 10.012  2.009  4.011  2.016 10.006&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-3355722417010029497?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/3355722417010029497/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/02/data-fitting-part-2a-very-very-simple.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3355722417010029497'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3355722417010029497'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/02/data-fitting-part-2a-very-very-simple.html' title='data fitting part 2a: very, very simple linear regression in r'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-366137730103186682</id><published>2011-02-15T00:39:00.012-05:00</published><updated>2011-02-15T14:21:01.751-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='data fitting'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>data fitting part 2: very, very simple linear regression in python</title><content type='html'>&lt;p&gt;This post is based on a memo I sent to some former colleagues at the Post.  I've edited it for use here since it fits well as the second in a series on simple data fitting techniques.  If you're among the many enlightened individuals already using regression analysis, then this post is probably not for you.  If you aren't, then hopefully this provides everything you need to develop rudimentary predictive models that yield surprising levels of accuracy.&lt;/p&gt;&lt;p&gt;For purposes of a simple working example, we have collected six records of input data over three dimensions with the goal of predicting two outputs.  
The input data are:&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;    x1: How much a respondent likes vanilla [0-10]&lt;br /&gt;    x2: How much a respondent likes strawberry [0-10]&lt;br /&gt;    x3: How much a respondent likes chocolate [0-10]&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Output data consist of:&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;    b1: How much a respondent likes dogs [0-10]&lt;br /&gt;    b2: How much a respondent likes cats [0-10]&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Below are entirely anonymous data collected from people who bear no resemblance to certain Washington Post staffers.&lt;/p&gt;&lt;center&gt;&lt;table class="nice"&gt;&lt;tr&gt;&lt;th&gt;respondent&lt;/th&gt;&lt;th&gt;vanilla love&lt;/th&gt;&lt;th&gt;strawberry love&lt;/th&gt;&lt;th&gt;chocolate love&lt;/th&gt;&lt;th&gt;dog love&lt;/th&gt;&lt;th&gt;cat love&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Serdar&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Dan&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nathaniel&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Lauren&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Jen&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Jackie&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/center&gt;&lt;p&gt;Our input is in three dimensions.  Each output requires its own model, so we'll have one for dogs and one for cats.  
We're looking for functions, dog(x) and cat(x), that can predict b1 and b2 based on given values of x1, x2, and x3.&lt;/p&gt;&lt;p&gt;For both models we want to find parameters that minimize their squared residuals (read: errors).  There are a number of names for this.  Optimization folks like to think of it as unconstrained quadratic optimization, but it's more common to call it least squares or linear regression.  It's not necessary to entirely understand why for our purposes, but the solution that minimizes these errors is:&lt;/p&gt;&lt;center&gt;$\beta = ({A^t}A)^{-1}{A^t}b$&lt;/center&gt;&lt;p&gt;This is implemented for you in the numpy.linalg Python package, which we'll use for examples.  Much more information than you probably want can be found &lt;a href="http://en.wikipedia.org/wiki/Least_squares"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Below is a first stab at a Python version.  It runs least squares against our input and output data exactly as they are.  You can see the matrix A and outputs b1 and b2 (dog and cat love, respectively) are represented just as they are in the table.&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;# Version 1: No offset, no squared inputs&lt;br /&gt;&lt;br /&gt;import numpy&lt;br /&gt;&lt;br /&gt;A = numpy.vstack([&lt;br /&gt;    [9, 4, 9],&lt;br /&gt;    [8, 6, 4],&lt;br /&gt;    [9, 4, 8],&lt;br /&gt;    [3, 7, 9],&lt;br /&gt;    [6, 8, 5],&lt;br /&gt;    [4, 5, 3]&lt;br /&gt;])&lt;br /&gt;&lt;br /&gt;b1 = numpy.array([9, 10, 2, 4, 2, 10])&lt;br /&gt;b2 = numpy.array([9, 4, 6, 6, 5, 3])&lt;br /&gt;&lt;br /&gt;print 'dog &lt;3:', numpy.linalg.lstsq(A, b1)[0]&lt;br /&gt;print 'cat &lt;3:', numpy.linalg.lstsq(A, b2)[0]&lt;br /&gt;&lt;br /&gt;# Output:&lt;br /&gt;# dog &lt;3: [0.72548294      0.53045642     -0.29952361]&lt;br /&gt;# cat &lt;3: [2.36110929e-01  2.61934385e-05  6.26892476e-01]&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;The resulting model is:&lt;/p&gt;&lt;p&gt;dog(x) = 0.72548294 * x1 + 0.53045642 * x2 - 0.29952361 * 
x3&lt;br/&gt;cat(x) = 2.36110929e-01 * x1 + 2.61934385e-05 * x2 + 6.26892476e-01 * x3&lt;/p&gt;&lt;p&gt;The coefficients before our variables correspond to beta in the formula above.  Errors between observed and predicted data, shown below, are calculated and summed. For these six records, dog(x) has a total error of 20.76 and cat(x) has 3.74.  Not great.&lt;/p&gt;&lt;center&gt;&lt;table class="nice"&gt;&lt;tr&gt;&lt;th&gt;Person&lt;/th&gt;&lt;th&gt;predicted b1&lt;/th&gt;&lt;th&gt;b1 error&lt;/th&gt;&lt;th&gt;predicted b2&lt;/th&gt;&lt;th&gt;b2 error&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Serdar&lt;/td&gt;&lt;td&gt;5.96&lt;/td&gt;&lt;td&gt;3.04&lt;/td&gt;&lt;td&gt;7.77&lt;/td&gt;&lt;td&gt;1.23&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Dan&lt;/td&gt;&lt;td&gt;7.79&lt;/td&gt;&lt;td&gt;2.21&lt;/td&gt;&lt;td&gt;4.40&lt;/td&gt;&lt;td&gt;0.40&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nathaniel&lt;/td&gt;&lt;td&gt;6.25&lt;/td&gt;&lt;td&gt;4.25&lt;/td&gt;&lt;td&gt;7.14&lt;/td&gt;&lt;td&gt;1.14&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Lauren&lt;/td&gt;&lt;td&gt;3.19&lt;/td&gt;&lt;td&gt;0.81&lt;/td&gt;&lt;td&gt;6.35&lt;/td&gt;&lt;td&gt;0.35&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Jen&lt;/td&gt;&lt;td&gt;7.10&lt;/td&gt;&lt;td&gt;5.10&lt;/td&gt;&lt;td&gt;4.55&lt;/td&gt;&lt;td&gt;0.45&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Jackie&lt;/td&gt;&lt;td&gt;4.66&lt;/td&gt;&lt;td&gt;5.34&lt;/td&gt;&lt;td&gt;2.83&lt;/td&gt;&lt;td&gt;0.17&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Total error:&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;20.76&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;3.74&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/center&gt;&lt;p&gt;One problem with this model is that dog(x) and cat(x) are forced to pass through the origin.  &lt;i&gt;(Why is that?)&lt;/i&gt;  We can improve it somewhat if we add an offset.  This amounts to prepending 1 to every row in A and adding a constant to the resulting functions.  
You can see the very slight difference between the code for this model and that of the previous one:&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;# Version 2: Offset, no squared inputs&lt;br /&gt;&lt;br /&gt;import numpy&lt;br /&gt;&lt;br /&gt;A = numpy.vstack([&lt;br /&gt;    [1, 9, 4, 9],&lt;br /&gt;    [1, 8, 6, 4],&lt;br /&gt;    [1, 9, 4, 8],&lt;br /&gt;    [1, 3, 7, 9],&lt;br /&gt;    [1, 6, 8, 5],&lt;br /&gt;    [1, 4, 5, 3]&lt;br /&gt;])&lt;br /&gt;&lt;br /&gt;b1 = numpy.array([9, 10, 2, 4, 2, 10])&lt;br /&gt;b2 = numpy.array([9, 4, 6, 6, 5, 3])&lt;br /&gt;&lt;br /&gt;print 'dog &lt;3:', numpy.linalg.lstsq(A, b1)[0]&lt;br /&gt;print 'cat &lt;3:', numpy.linalg.lstsq(A, b2)[0]&lt;br /&gt;&lt;br /&gt;# Output:&lt;br /&gt;# dog &lt;3: [20.92975427  -0.27831197  -1.43135684  -0.76469017]&lt;br /&gt;# cat &lt;3: [-0.31744124   0.25133547   0.02978098   0.63394765]&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;This yields the second version of our models:&lt;/p&gt;&lt;p&gt;dog(x) = 20.92975427 - 0.27831197 * x1 - 1.43135684 * x2 - 0.76469017 * x3&lt;br/&gt;cat(x) = -0.31744124 + 0.25133547 * x1 + 0.02978098 * x2 + 0.63394765 * x3&lt;/p&gt;&lt;p&gt;These models have total errors of 13.87 and 3.79.  
A little better on the dog side, but still not quite usable.&lt;/p&gt;&lt;center&gt;&lt;table class="nice"&gt;&lt;tr&gt;&lt;th&gt;Person&lt;/th&gt;&lt;th&gt;predicted b1&lt;/th&gt;&lt;th&gt;b1 error&lt;/th&gt;&lt;th&gt;predicted b2&lt;/th&gt;&lt;th&gt;b2 error&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Serdar&lt;/td&gt;&lt;td&gt;5.82&lt;/td&gt;&lt;td&gt;3.18&lt;/td&gt;&lt;td&gt;7.77&lt;/td&gt;&lt;td&gt;1.23&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Dan&lt;/td&gt;&lt;td&gt;7.06&lt;/td&gt;&lt;td&gt;2.94&lt;/td&gt;&lt;td&gt;4.41&lt;/td&gt;&lt;td&gt;0.41&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nathaniel&lt;/td&gt;&lt;td&gt;6.58&lt;/td&gt;&lt;td&gt;4.58&lt;/td&gt;&lt;td&gt;7.14&lt;/td&gt;&lt;td&gt;1.14&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Lauren&lt;/td&gt;&lt;td&gt;3.19&lt;/td&gt;&lt;td&gt;0.81&lt;/td&gt;&lt;td&gt;6.35&lt;/td&gt;&lt;td&gt;0.35&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Jen&lt;/td&gt;&lt;td&gt;3.99&lt;/td&gt;&lt;td&gt;1.99&lt;/td&gt;&lt;td&gt;4.60&lt;/td&gt;&lt;td&gt;0.40&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Jackie&lt;/td&gt;&lt;td&gt;10.37&lt;/td&gt;&lt;td&gt;0.37&lt;/td&gt;&lt;td&gt;2.74&lt;/td&gt;&lt;td&gt;0.26&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Total error:&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;13.87&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;3.79&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/center&gt;&lt;p&gt;The problem is that dog(x) and cat(x) are linear functions.  Most observed data don't conform to straight lines.  Take a moment and draw the line $f(x) = x$ and the curve $f(x) = x^2$.  The former makes a poor approximation of the latter.&lt;/p&gt;&lt;p&gt;Most of the time, people just use squares of the input data to add curvature to their models. We do this in our next version of the code by just adding squares of the input row values to our A matrix.  Everything else is the same.  
(In reality, you can add any function of the input data you feel best models the data, if you understand it well enough.)&lt;/p&gt;&lt;pre class="brush: python; toolbar: false;"&gt;# Version 3: Offset with squared inputs&lt;br /&gt;&lt;br /&gt;import numpy&lt;br /&gt;&lt;br /&gt;A = numpy.vstack([&lt;br /&gt;    [1, 9, 9**2, 4, 4**2, 9, 9**2],&lt;br /&gt;    [1, 8, 8**2, 6, 6**2, 4, 4**2],&lt;br /&gt;    [1, 9, 9**2, 4, 4**2, 8, 8**2],&lt;br /&gt;    [1, 3, 3**2, 7, 7**2, 9, 9**2],&lt;br /&gt;    [1, 6, 6**2, 8, 8**2, 5, 5**2],&lt;br /&gt;    [1, 4, 4**2, 5, 5**2, 3, 3**2]&lt;br /&gt;])&lt;br /&gt;&lt;br /&gt;b1 = numpy.array([9, 10, 2, 4, 2, 10])&lt;br /&gt;b2 = numpy.array([9, 4, 6, 6, 5, 3])&lt;br /&gt;&lt;br /&gt;print 'dog &lt;3:', numpy.linalg.lstsq(A, b1)[0]&lt;br /&gt;print 'cat &lt;3:', numpy.linalg.lstsq(A, b2)[0]&lt;br /&gt;&lt;br /&gt;# dog &lt;3: [1.29368307  7.03633306  -0.44795498  9.98093332  &lt;br /&gt;#  -0.75689575  -19.00757486  1.52985734]&lt;br /&gt;# cat &lt;3: [0.47945896  5.30866067  -0.39644128 -1.28704188  &lt;br /&gt;#   0.12634295   -4.32392606  0.43081918]&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;This gives us our final version of the model:&lt;/p&gt;&lt;p&gt;dog(x) = 1.29368307 + 7.03633306 * x1 - 0.44795498 * x1**2 + 9.98093332 * x2 - 0.75689575 * x2**2 - 19.00757486 * x3 + 1.52985734 * x3**2&lt;br/&gt;cat(x) = 0.47945896 + 5.30866067 * x1 - 0.39644128 * x1**2 - 1.28704188 * x2 + 0.12634295 * x2**2 - 4.32392606 * x3 + 0.43081918 * x3**2&lt;/p&gt;&lt;p&gt;Adding curvature to our model eliminates all perceived error, at least within 1e-16.  
This may seem unbelievable, but it isn't really: the model now has seven coefficients and only six input records to fit, so least squares has enough freedom to match every observation exactly.&lt;/p&gt;&lt;center&gt;&lt;table class="nice"&gt;&lt;tr&gt;&lt;th&gt;Person&lt;/th&gt;&lt;th&gt;predicted b1&lt;/th&gt;&lt;th&gt;b1 error&lt;/th&gt;&lt;th&gt;predicted b2&lt;/th&gt;&lt;th&gt;b2 error&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Serdar&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Dan&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Nathaniel&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Lauren&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Jen&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Jackie&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Total error:&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/center&gt;&lt;p&gt;It should be fairly obvious how one can take this and extend it to much larger models.  
I hope this is useful and that least squares becomes an important part of your lives.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-366137730103186682?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/366137730103186682/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/02/data-fitting-part-2-very-very-simple.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/366137730103186682'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/366137730103186682'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2011/02/data-fitting-part-2-very-very-simple.html' title='data fitting part 2: very, very simple linear regression in python'/><author><name>Ryan J. O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-8984283133594155062</id><published>2010-11-30T23:36:00.000-05:00</published><updated>2010-11-30T23:36:02.715-05:00</updated><title type='text'>off-the-cuff online voter fraud detection</title><content type='html'>Consider this scenario:  You run a contest that accepts votes from the general Internet population.  
In order to encourage user engagement, you record any and all votes into a database over several days, storing nothing more than the competitor voted for, when each vote is cast, and a cookie set on the voter's computer along with their apparent IP address.  If a voter already has a recorded cookie set, they are denied subsequent votes.  This way you can avoid requiring site registration, a huge turnoff for your users.  Simple enough.&lt;br /&gt;&lt;br /&gt;Unfortunately, some of the competitors are wily and attached to the idea of winning.  They go so far as programming or hiring bots to cast thousands of votes for them.  Your manager wants to know which votes are real and which ones are fake Right Now.  Given very limited time, and ignoring actions that you &lt;i&gt;could&lt;/i&gt; have taken to avoid the problem, how can you tell apart sets of good votes from those that shouldn't be counted?&lt;br /&gt;&lt;br /&gt;One quick-and-dirty option involves comparing histograms of &lt;a href="http://www.ehow.com/how_5417319_calculate-interarrival-time.html"&gt;interarrival times&lt;/a&gt; for sets of votes.  Say you're concerned that all the votes during a particular period of time or from a given IP address might be fraudulent.  Put all the vote times you're concerned about into a list, sort them, and compute their differences:&lt;br /&gt;&lt;pre class="brush: python; toolbar: false;"&gt;# times is a list of datetime instances from vote records&lt;br /&gt;# sort ascending so every difference y - x is non-negative&lt;br /&gt;times.sort()&lt;br /&gt;interarrivals = [y - x for x, y in zip(times, times[1:])]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Now use matplotlib to &lt;a href="http://matplotlib.sourceforge.net/users/pyplot_tutorial.html#working-with-text"&gt;display a histogram&lt;/a&gt; of these.  Votes that occur naturally are likely to resemble an &lt;a href="http://en.wikipedia.org/wiki/Exponential_distribution"&gt;exponential distribution&lt;/a&gt; in their interarrival times.  
For instance, here are interarrival times for all votes received in a contest:&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_Cv0KZsgj0Fo/TPXFa2Ae-YI/AAAAAAAAABA/e1F04juOpQ4/s1600/all-votes.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="241" src="http://2.bp.blogspot.com/_Cv0KZsgj0Fo/TPXFa2Ae-YI/AAAAAAAAABA/e1F04juOpQ4/s320/all-votes.png" width="320" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Interarrival times for all submissions &lt;/td&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;This subset of votes is clearly fraudulent, due to the near determinism of their interarrival times.  This is most likely caused by the voting bot not taking random sleep intervals during voting.  
It casts a vote, receives a response, clears its cookies, and repeats:&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_Cv0KZsgj0Fo/TPXFd_-9ruI/AAAAAAAAABI/spov-G7AgmY/s1600/fraud-plot.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="241" src="http://3.bp.blogspot.com/_Cv0KZsgj0Fo/TPXFd_-9ruI/AAAAAAAAABI/spov-G7AgmY/s320/fraud-plot.png" width="320" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Interarrival times for clearly fraudulent votes&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;These votes, on the other hand, are most likely legitimate.  They exhibit a nice &lt;a href="http://en.wikipedia.org/wiki/Erlang_distribution"&gt;Erlang&lt;/a&gt; shape and appear to have natural interarrival times that one would expect:&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_Cv0KZsgj0Fo/TPXFcgRwidI/AAAAAAAAABE/AFfyqkX8lik/s1600/not-fraud.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="241" src="http://2.bp.blogspot.com/_Cv0KZsgj0Fo/TPXFcgRwidI/AAAAAAAAABE/AFfyqkX8lik/s320/not-fraud.png" width="320" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Proper-looking interarrival times&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;Of course this method is woefully inadequate for rigorous detection of voting fraud.  
Ideally one would find a method to compute the probability that a set of votes is generated by a bot.  This is enough to inform quick, ad hoc decisions though.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-8984283133594155062?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/8984283133594155062/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/11/off-cuff-online-voter-fraud-detection.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8984283133594155062'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8984283133594155062'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/11/off-cuff-online-voter-fraud-detection.html' title='off-the-cuff online voter fraud detection'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_Cv0KZsgj0Fo/TPXFa2Ae-YI/AAAAAAAAABA/e1F04juOpQ4/s72-c/all-votes.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-9048542292001947829</id><published>2010-11-23T18:00:00.002-05:00</published><updated>2010-11-23T23:27:51.989-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='linear programming'/><category scheme='http://www.blogger.com/atom/ns#' term='data fitting'/><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><title type='text'>data fitting part 1: linear data fitting</title><content type='html'>Data fitting is one of those tasks that everyone should have at least some exposure to.  Certainly developers and analysts will benefit from a working knowledge of its fundamentals and their implementations.  However, in my own reading I've found it difficult to locate good examples that are simple enough to pick up quickly and come with accompanying source code.&lt;br /&gt;&lt;br /&gt;This article commences an ongoing series introducing basic data fitting techniques.  With any luck they won't be overly complex, while still being useful enough to get the point across with a real example and real data.  We'll start with a binary classification problem: presented with a series of records, each containing a set number of input values describing it, determine whether or not each record exhibits some property.&lt;br /&gt;&lt;br /&gt;We'll use the cancer1.dt data from the proben1 set of test cases, which you can download &lt;a href="ftp://ftp.cs.cuhk.hk/pub/proben1/cancer/cancer1.dt"&gt;here&lt;/a&gt;.  
Each record starts with 9 data points containing physical characteristics of a tumor.  The second to last data point contains 1 if a tumor is benign and 0 if it is malignant.  We seek to find a linear function we can run on an arbitrary record that will return a value greater than zero if that record's tumor is predicted to be benign and less than zero if it is predicted to be malignant.  We will train our linear model on the first 350 records, and test it for accuracy on the remaining rows.&lt;br /&gt;&lt;br /&gt;This is similar to the data fitting problem found in &lt;a href="http://www.amazon.com/Linear-Programming-Books-Mathematical-Sciences/dp/0716715872"&gt;Chvatal&lt;/a&gt;.  Our inputs consist of a matrix of observed data, $A$, and a vector of classifications, $b$.  In order to classify a record, we require another vector $x$ such that the dot product of $x$ and that record will be either greater or less than zero depending on its predicted classification.&lt;br /&gt;&lt;br /&gt;A couple of points to note before we start:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Most observed data are noisy.  This means it may be impossible to locate a hyperplane that cleanly separates given records of one type from another.  In this case, we must resort to finding a function that minimizes our predictive error.  For the purposes of this example, we'll minimize the sum of the absolute values of the differences between the observed and predicted values.  That is, we seek the $x$ that attains $\min_x \sum_i{|a_i^T x-b_i|}$.&lt;/li&gt;&lt;li&gt;The &lt;a href="http://www.purplemath.com/modules/strtlneq.htm"&gt;slope-intercept&lt;/a&gt; form of a line, $f(x)=m^T x+b$, contains an offset.  It should be obvious that this is necessary in our model so that our function isn't required to pass through the origin.  Thus, we'll be adding an extra variable with a coefficient of 1 to represent our offset value.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;In order to model this, we use two linear constraints for each absolute value.  
Each absolute value is bounded below by a variable $v_i$, and we minimize the sum of these.  Our Linear Programming model thus looks like:&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td valign="top"&gt;$\min$&lt;/td&gt;&lt;td colspan="5"&gt;$z = \sum_i{v_i}$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;s.t.&lt;/td&gt;&lt;td align="right"&gt;$v_i$&lt;/td&gt;&lt;td&gt;$\geq$&lt;/td&gt;&lt;td&gt;$x_0 + a_i^T x - 1$&lt;/td&gt;&lt;td valign="top"&gt;$\forall$ benign tumors&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;&lt;/td&gt;&lt;td align="right"&gt;$v_i$&lt;/td&gt;&lt;td&gt;$\geq$&lt;/td&gt;&lt;td&gt;$1 - x_0 - a_i^T x$&lt;/td&gt;&lt;td valign="top"&gt;$\forall$ benign tumors&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;&lt;/td&gt;&lt;td align="right"&gt;$v_i$&lt;/td&gt;&lt;td&gt;$\geq$&lt;/td&gt;&lt;td&gt;$x_0 + a_i^T x - (-1)$&lt;/td&gt;&lt;td valign="top"&gt;$\forall$ malignant tumors&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;&lt;/td&gt;&lt;td align="right"&gt;$v_i$&lt;/td&gt;&lt;td&gt;$\geq$&lt;/td&gt;&lt;td&gt;$-1 - x_0 - a_i^T x$&lt;/td&gt;&lt;td valign="top"&gt;$\forall$ malignant tumors&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;In order to do this in Python, we use SCIP and SoPlex from the zibopt library.  
We start by setting constants for benign and malignant outputs and providing a function to read in the training and testing data sets.&lt;br /&gt;&lt;pre class="brush: python; first-line: 1; toolbar: false;"&gt;# Preferred output values for tumor categories&lt;br /&gt;BENIGN = 1&lt;br /&gt;MALIGNANT = -1&lt;br /&gt;&lt;br /&gt;def read_proben1_cancer_data(filename, train_size):&lt;br /&gt;    '''Loads a proben1 cancer file into train &amp;amp; test sets'''&lt;br /&gt;    # Number of input data points per record&lt;br /&gt;    DATA_POINTS = 9&lt;br /&gt;&lt;br /&gt;    train_data = []&lt;br /&gt;    test_data = []&lt;br /&gt;&lt;br /&gt;    with open(filename) as infile:&lt;br /&gt;        # Read in the first train_size lines to a training&lt;br /&gt;        # data list, and the others to testing data.  This&lt;br /&gt;        # allows us to test how general our model is on&lt;br /&gt;        # something other than the input data.&lt;br /&gt;        for line in infile.readlines()[7:]: # skip header&lt;br /&gt;            line = line.split()&lt;br /&gt;&lt;br /&gt;            # Inputs are the data points; offset (x0) is added later&lt;br /&gt;            input = [float(x) for x in line[:DATA_POINTS]]&lt;br /&gt;            output = BENIGN if line[-2] == '1' else MALIGNANT&lt;br /&gt;            record = {'input': input, 'output': output}&lt;br /&gt;&lt;br /&gt;            # Determine what data set to put this in&lt;br /&gt;            if len(train_data) &amp;gt;= train_size:&lt;br /&gt;                test_data.append(record)&lt;br /&gt;            else:&lt;br /&gt;                train_data.append(record)&lt;br /&gt;&lt;br /&gt;    return train_data, test_data&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The next function implements the LP model described above using SoPlex and SCIP.  It minimizes the sum of the residuals over all training records.  This amounts to summing the absolute values of the differences between predicted and observed output data.  
The following function takes in input and observed output data and returns a list of coefficients.  Our resulting model consists of taking the &lt;a href="http://en.wikipedia.org/wiki/Dot_product"&gt;dot product&lt;/a&gt; of an input record and these coefficients.  If the result is greater than or equal to zero, that record is predicted to be a benign tumor; otherwise it is predicted to be malignant.&lt;br /&gt;&lt;pre class="brush: python; first-line: 35; toolbar: false;"&gt;def train_linear_model(train_data):&lt;br /&gt;    '''&lt;br /&gt;    Accepts a set of input training data with known output&lt;br /&gt;    values.  Returns a list of coefficients to apply to&lt;br /&gt;    arbitrary records for purposes of binary categorization.&lt;br /&gt;    '''&lt;br /&gt;    from zibopt import scip&lt;br /&gt;    import sys&lt;br /&gt;&lt;br /&gt;    # Make sure we have at least one training record&lt;br /&gt;    assert len(train_data) &amp;gt; 0&lt;br /&gt;    num_variables = len(train_data[0]['input'])&lt;br /&gt;&lt;br /&gt;    # Variables are coefficients in front of the data points.&lt;br /&gt;    # It is important that these be unrestricted in sign so&lt;br /&gt;    # they can take negative values.&lt;br /&gt;    solver = scip.solver()&lt;br /&gt;    coefficients = [&lt;br /&gt;        solver.variable(lower=-sys.maxint)&lt;br /&gt;        for _ in xrange(num_variables)&lt;br /&gt;    ]&lt;br /&gt;&lt;br /&gt;    # Residual for each data row&lt;br /&gt;    residuals = [solver.variable() for _ in train_data]&lt;br /&gt;    for r, d in zip(residuals, train_data):&lt;br /&gt;        # r will be the absolute value of the difference&lt;br /&gt;        # between observed and predicted values.  
We can&lt;br /&gt;        # model absolute values such as r &amp;gt;= |foo| as:&lt;br /&gt;        #&lt;br /&gt;        #   r &amp;gt;=  foo&lt;br /&gt;        #   r &amp;gt;= -foo&lt;br /&gt;        solver += sum(&lt;br /&gt;            x * y for x, y in zip(coefficients, d['input'])&lt;br /&gt;        ) + r &amp;gt;= d['output']&lt;br /&gt;&lt;br /&gt;        solver += sum(&lt;br /&gt;            x * y for x, y in zip(coefficients, d['input'])&lt;br /&gt;        ) - r &amp;lt;= d['output']&lt;br /&gt;&lt;br /&gt;    # Find and return coefficients that min sum of residuals&lt;br /&gt;    solution = solver.minimize(objective=sum(residuals))&lt;br /&gt;    return [solution[c] for c in coefficients]&lt;br /&gt;&lt;/pre&gt;We also provide a convenience function for counting the number of correct predictions by our resulting model against either the test or training data sets.&lt;br /&gt;&lt;pre class="brush: python; first-line: 79; toolbar: false;"&gt;def count_correct(data_set, coefficients):&lt;br /&gt;    '''Returns the number of correct predictions'''&lt;br /&gt;    correct = 0&lt;br /&gt;    for d in data_set:&lt;br /&gt;        result = sum(&lt;br /&gt;            x*y for x, y in zip(coefficients, d['input'])&lt;br /&gt;        )&lt;br /&gt;&lt;br /&gt;        # Do we predict the same as the output?&lt;br /&gt;        if (result &amp;gt;= 0) == (d['output'] &amp;gt;= 0):&lt;br /&gt;            correct += 1&lt;br /&gt;&lt;br /&gt;    return correct&lt;br /&gt;&lt;/pre&gt;Finally we write a main method to read in the data, build our linear model, and test its efficacy.&lt;br /&gt;&lt;pre class="brush: python; first-line: 93; toolbar: false;"&gt;if __name__ == '__main__':&lt;br /&gt;    from pprint import pprint&lt;br /&gt;&lt;br /&gt;    # Specs for this input file&lt;br /&gt;    INPUT_FILE_NAME = 'cancer1.dt'&lt;br /&gt;    TRAIN_SIZE = 350&lt;br /&gt;&lt;br /&gt;    train_data, test_data = read_proben1_cancer_data(&lt;br /&gt;        INPUT_FILE_NAME, 
TRAIN_SIZE&lt;br /&gt;    )&lt;br /&gt;&lt;br /&gt;    # Add the offset variable to each of our data records&lt;br /&gt;    for data_set in [train_data, test_data]:&lt;br /&gt;        for row in data_set:&lt;br /&gt;            row['input'] = [1] + row['input']&lt;br /&gt;&lt;br /&gt;    coefficients = train_linear_model(train_data)&lt;br /&gt;    print 'coefficients:'&lt;br /&gt;    pprint(coefficients)&lt;br /&gt;&lt;br /&gt;    # Print % of correct predictions for each data set&lt;br /&gt;    correct = count_correct(train_data, coefficients)&lt;br /&gt;    print '%s / %s = %.02f%% correct on training set' % (&lt;br /&gt;        correct, len(train_data),&lt;br /&gt;        100 * float(correct) / len(train_data)&lt;br /&gt;    )&lt;br /&gt;&lt;br /&gt;    correct = count_correct(test_data, coefficients)&lt;br /&gt;    print '%s / %s = %.02f%% correct on testing set' % (&lt;br /&gt;        correct, len(test_data),&lt;br /&gt;        100 * float(correct) / len(test_data)&lt;br /&gt;    )&lt;br /&gt;&lt;/pre&gt;The result of running this model against the cancer1.dt data set is:&lt;br /&gt;&lt;pre&gt;coefficients:&lt;br /&gt;[1.4072882449702786,&lt;br /&gt; -0.14014055927954652,&lt;br /&gt; -0.6239513714263405,&lt;br /&gt; -0.26727681774258882,&lt;br /&gt; 0.067107753841131157,&lt;br /&gt; -0.28300216102808429,&lt;br /&gt; -1.0355594670918404,&lt;br /&gt; -0.22774451038152174,&lt;br /&gt; -0.69871243677663608,&lt;br /&gt; -0.072575089848659444]&lt;br /&gt;328 / 350 = 93.71% correct on training set&lt;br /&gt;336 / 349 = 96.28% correct on testing set&lt;br /&gt;&lt;/pre&gt;The accuracy is pretty good here against both the training and testing sets, so this particular model generalizes well.  This is about the simplest model we can implement for data fitting, and we'll get to more complicated ones later, but it's nice to see we can do so well so quickly.  
The coefficients correspond to using a function of this form, rounded to three decimal places:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-size: 70%"&gt;$$f(x) = 1.407 - 0.140 x_1 - 0.624 x_2 - 0.267 x_3 + 0.067 x_4 - 0.283 x_5 - 1.036 x_6 - 0.228 x_7 - 0.699 x_8 - 0.073 x_9$$&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-9048542292001947829?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/9048542292001947829/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/11/data-fitting-part-1-linear-data-fitting.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/9048542292001947829'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/9048542292001947829'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/11/data-fitting-part-1-linear-data-fitting.html' title='data fitting part 1: linear data fitting'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-8476928709716927868</id><published>2010-11-20T01:47:00.001-05:00</published><updated>2010-11-23T15:05:38.024-05:00</updated><title type='text'>moving on</title><content type='html'>It's now official and public, so I may as well state it here: I'm leaving the Editorial Embedded Development team at the Washington Post for a position as a simulation &amp;amp; modeling engineer at &lt;a href="http://www.mitrecaasd.org/"&gt;MITRE&lt;/a&gt;.  This represents a fundamental shift of career for me from software development to operations research, something I've hoped for since I took my first course in deterministic models at &lt;a href="http://seor.gmu.edu/"&gt;GMU&lt;/a&gt;. &amp;nbsp;That said, leaving the Post remains a difficult step. &amp;nbsp;In my tenure there, &lt;a href="http://push.cx/"&gt;I've&lt;/a&gt; &lt;a href="http://www.linkedin.com/in/convergencejournalist"&gt;worked&lt;/a&gt; &lt;a href="http://leetrout.com/"&gt;with&lt;/a&gt; &lt;a href="http://www.kat-downs.com/"&gt;many&lt;/a&gt; &lt;a href="http://www.sarahsampsel.com/"&gt;truly&lt;/a&gt; &lt;a href="http://kelsocartography.com/blog/"&gt;remarkable&lt;/a&gt; &lt;a href="http://fds.duke.edu/db/Sanford/sarah.cohen"&gt;people&lt;/a&gt; on projects of obvious importance. &amp;nbsp;I've weathered a difficult newsroom merger and been trusted with leadership of a blossoming development team. &amp;nbsp;I've worked 32 hours straight to meet editorial deadlines and make news, won journalism awards by proxy, mentored aspiring developers, and been continually humbled by the intelligence, hard work, and dedication of my coworkers to accuracy, relevance, and objectivity. 
&amp;nbsp;It is, therefore, not with a light heart that I take this step.&lt;br /&gt;&lt;br /&gt;However, I know that it is the right path. &amp;nbsp;MITRE is a great place to be for someone like me, and with any luck I'll have both time and encouragement to continue my contributions to open source optimization. &amp;nbsp;As a first step, I'm picking this blog back up and finding time to post more interesting optimization tidbits.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-8476928709716927868?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/8476928709716927868/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/11/moving-on.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8476928709716927868'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8476928709716927868'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/11/moving-on.html' title='moving on'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-2974117528533129977</id><published>2010-05-29T00:30:00.000-04:00</published><updated>2010-05-29T00:58:14.132-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='msor'/><category scheme='http://www.blogger.com/atom/ns#' term='gmu'/><title type='text'>recent work &amp; graduation</title><content type='html'>I spent the last term finishing my &lt;a href="http://seor.gmu.edu/msor/curriculum.html"&gt;MSOR&lt;/a&gt;, so I haven't had much time to post since &lt;a href="http://us.pycon.org/2010/about/"&gt;PyCon&lt;/a&gt;.  It's been a busy term, and while I'm glad my efforts of the last few years have reached their conclusion, I'm already a little nostalgic for the classroom.  Our department at George Mason has some really top-notch faculty and the program is designed well.  The fact that it requires MS students to balance between deterministic and stochastic methods is part of this.  On my own I probably wouldn't have taken much on the stochastic side, but this way I ended up challenged and delighted by courses in simulation, stochastic processes, and queuing theory.  I've met students from other institutions that don't require such a degree of breadth, and it makes me doubly happy to have studied where I did.&lt;br /&gt;&lt;br /&gt;Here are a couple of highlights of the past few months:&lt;br /&gt;&lt;br /&gt;In the &lt;a href="http://volgenau.gmu.edu/~klaskey/OR680/"&gt;capstone project&lt;/a&gt; course we assembled into teams and spent the term working for a project sponsor.  Our team of five smart guys chose to work with &lt;a href="http://volgenau.gmu.edu/~kchang/"&gt;Dr. 
Kuo-Chu Chang&lt;/a&gt;, whose interest was in evaluating financial engineering strategies against S&amp;P 500 options.  It was like opening a fire hose of information and required a huge effort through the entire term, but our result was a pretty solid system and analysis.  I've posted a copy of the project web site &lt;a href="http://chenoneil.com/projects/investment-allocation/"&gt;here&lt;/a&gt; and the final report &lt;a href="http://chenoneil.com/projects/investment-allocation/files/investment-allocation-may-2010.pdf"&gt;here&lt;/a&gt;.  I was, of course, responsible for the software powering our deterministic simulation and GUI.&lt;br /&gt;&lt;br /&gt;For &lt;a href="http://mason.gmu.edu/~jshortle/or647.html"&gt;queuing theory&lt;/a&gt; I had the opportunity to learn from a &lt;a href="http://mason.gmu.edu/~jshortle"&gt;real expert&lt;/a&gt; in the field, which is always exciting.  I've uploaded a short paper analyzing a queuing simulation of the &lt;a href="http://www.djangoproject.com/"&gt;Django&lt;/a&gt; installation we use at the Post &lt;a href="http://chenoneil.com/projects/queuing-model-web-server.pdf"&gt;here&lt;/a&gt;.  It may be of interest to web programmers and administrators.&lt;br /&gt;&lt;br /&gt;So what's next?&lt;br /&gt;&lt;br /&gt;Well, first will come a new version of python-zibopt.  After that I may spend a little time polishing PyGEP and then it's off to building our combinatorial auction software. 
Check back for updates!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-2974117528533129977?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/2974117528533129977/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/05/recent-work-graduation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2974117528533129977'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/2974117528533129977'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/05/recent-work-graduation.html' title='recent work &amp; graduation'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-8292711611623755346</id><published>2010-02-09T15:34:00.001-05:00</published><updated>2010-11-20T00:06:33.409-05:00</updated><title type='text'>my pycon picks for 2010</title><content type='html'>Friday, Feb 19:&lt;ul&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/17/"&gt;Import this, that, and the other thing: custom importers&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/29/"&gt;Python 3: The Next Generation&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/32/"&gt;Maximize your program's laziness&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/40/"&gt;How and why Python is being used to by the Military to model real-world battlefield scenarios&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;Saturday, Feb 20:&lt;ul&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/68/"&gt;Demystifying Non-Blocking and Asynchronous I/O&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/76/"&gt;Understanding the Python GIL&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/90/"&gt;To relate or not to relate, that is the question&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/100/"&gt;Threading is not a model&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;Sunday, Feb 21:&lt;ul&gt;&lt;li&gt;&lt;a href="http://us.pycon.org/2010/conference/schedule/event/126/"&gt;Dealing with unsightly data in the real world.&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a 
href="http://us.pycon.org/2010/conference/schedule/event/140/"&gt;Optimal Resource Allocation using Python&lt;/a&gt; &lt;i&gt;(of course I have to go to this one...)&lt;/i&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-8292711611623755346?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/8292711611623755346/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/02/my-pycon-picks-for-2010.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8292711611623755346'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/8292711611623755346'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/02/my-pycon-picks-for-2010.html' title='my pycon picks for 2010'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-1894830970278875287</id><published>2010-02-09T14:53:00.000-05:00</published><updated>2010-02-09T14:56:19.997-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='pycon'/><title type='text'>pycon 2010 slides</title><content type='html'>In case you'd like to see the working draft of my upcoming presentation on python-zibopt, you can find it &lt;a href="http://chenoneil.com/presentations/pycon2010/"&gt;here&lt;/a&gt;.  I'm looking forward to seeing you there!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-1894830970278875287?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/1894830970278875287/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/02/pycon-2010-slides.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1894830970278875287'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1894830970278875287'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/02/pycon-2010-slides.html' title='pycon 2010 slides'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-3843305218182420059</id><published>2010-01-03T22:03:00.002-05:00</published><updated>2010-11-23T00:08:36.356-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='pycon'/><category scheme='http://www.blogger.com/atom/ns#' term='zibopt'/><title type='text'>python-zibopt 0.4 released</title><content type='html'>This is the last release I'll be making with new features before PyCon.  That said, it's a pretty important one.  First off, the clunky syntax for adding constraints has been replaced with a much more Pythonic one that allows algebraic notation using operator overloading:&lt;br /&gt;&lt;pre class="brush: python; first-line: 1; toolbar: false;"&gt;from zibopt import scip&lt;br /&gt;solver = scip.solver()&lt;br /&gt;&lt;br /&gt;x1 = solver.variable(scip.INTEGER)&lt;br /&gt;x2 = solver.variable(scip.INTEGER)&lt;br /&gt;x3 = solver.variable(scip.INTEGER)&lt;br /&gt;&lt;br /&gt;solver += x1 &lt;= 2&lt;br /&gt;solver += x1 + x2 + 3*x3 &lt;= 3&lt;br /&gt;&lt;br /&gt;solution = solver.maximize(objective=x1 + x2 + 2*x3)&lt;br /&gt;&lt;/pre&gt;It even allows you to do fancy things with sum(...) and to set both upper and lower bounds at the same time:&lt;pre class="brush: python; first-line: 1; toolbar: false;"&gt;solver += 50 &lt;= sum(cost[i] * x[i] for i in things) &lt;= 100&lt;br /&gt;&lt;/pre&gt;The second new feature allows one to set branching priority on variables.  
This is pretty straightforward:&lt;pre class="brush: python; first-line: 1; toolbar: false;"&gt;x1.priority = 1000&lt;br /&gt;&lt;/pre&gt;I'm still consulting with the ZIB folks about simplifying the installation, but I'm feeling pretty good about the state of the API.  Time to start practicing my presentation!  You can download the new release &lt;a href="http://code.google.com/p/python-zibopt/downloads/list"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-3843305218182420059?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/3843305218182420059/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/01/python-zibopt-04-released.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3843305218182420059'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3843305218182420059'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2010/01/python-zibopt-04-released.html' title='python-zibopt 0.4 released'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-4492466189464223824</id><published>2009-11-10T23:27:00.000-05:00</published><updated>2009-11-11T00:14:05.230-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python-zibopt'/><category scheme='http://www.blogger.com/atom/ns#' term='scip'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='pycon'/><category scheme='http://www.blogger.com/atom/ns#' term='zibopt'/><title type='text'>my pycon 2010 talk: optimal resource allocation using python</title><content type='html'>I'm very excited to report that my PyCon 2010 talk has been accepted and I will be publicly presenting python-zibopt in Atlanta this coming February.  There's still some work to be done on the project, including simplifying the addition of constraints and figuring out an easy way to distribute it, but for now I give you the outline to:&lt;br /&gt;&lt;br /&gt;Optimal Resource Allocation using Python&lt;br /&gt;&lt;br /&gt;A brief introduction to modeling and solving resource allocation and scheduling problems using Python and SCIP.&lt;br /&gt;&lt;br /&gt;At times a programmer is faced with difficult, possibly NP-Hard, optimization problems such as scheduling or assignment. Sophisticated techniques exist for modeling and solving these sorts of problems which are well implemented in optimization solvers. 
This talk introduces some of these techniques using the ZIB Optimization Suite and its new Python interface.&lt;br /&gt;&lt;br /&gt;Outline&lt;ul&gt;&lt;li&gt;What is Combinatorial Optimization?&lt;/li&gt;&lt;li&gt;A few methods for finding solutions&lt;/li&gt;&lt;li&gt;OK, but how do I really do it?&lt;/li&gt;&lt;li&gt;Introducing python-zibopt&lt;/li&gt;&lt;li&gt;Conclusion&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;What is Combinatorial Optimization?&lt;ul&gt;&lt;li&gt;A brief history of Mathematical Optimization&lt;/li&gt;&lt;li&gt;Example of Linear Optimization: Production planning&lt;ul&gt;&lt;li&gt;Problem statement &amp;amp; data&lt;/li&gt;&lt;li&gt;Mathematical model&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;The Combinatorial Explosion: what happens when I can't produce 1/2 a widget?&lt;ul&gt;&lt;li&gt;Example: Uncapacitated Facility Location (UFL)&lt;/li&gt;&lt;li&gt;Example: The Traveling Salesman Problem&lt;/li&gt;&lt;li&gt;What makes these problems difficult?&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;A few methods for finding solutions&lt;ul&gt;&lt;li&gt;It's all about bounding&lt;ul&gt;&lt;li&gt;Primal bounds: I have a feasible solution and here is its objective value&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Example: Greedy algorithms&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Dual bounds: I've relaxed my problem and can do no better than this&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Example: linear relaxations&lt;/li&gt;&lt;li&gt;Example: combinatorial relaxations&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Use Primal/Dual gap for search tree pruning&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Branch-and-Bound&lt;/li&gt;&lt;li&gt;Presolving (or preprocessing) on binary integer programs&lt;/li&gt;&lt;li&gt;Cutting planes: cut off fractional solutions from LP dual&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;OK, but how do I really do it?&lt;ul&gt;&lt;li&gt;You can always code it up yourself, but...&lt;/li&gt;&lt;li&gt;All of these techniques (and more) are parts of commercial and open source 
solvers&lt;/li&gt;&lt;li&gt;Two common methods to interface with a solver:&lt;ul&gt;&lt;li&gt;Modeling language example: ZIMPL&lt;ul&gt;&lt;li&gt;The quickest way to get started&lt;/li&gt;&lt;li&gt;Closely resembles mathematical description of problem&lt;/li&gt;&lt;li&gt;Separates data from model&lt;/li&gt;&lt;li&gt;Not a full programming language... It is difficult to:&lt;ul&gt;&lt;li&gt;Automatically change the model depending on output (TSP subtour elimination)&lt;/li&gt;&lt;li&gt;Combine data from disparate sources (CSV, RDBMS, arbitrarily structured text...)&lt;/li&gt;&lt;li&gt;Base parts of the model on logic or user input&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Solver APIs&lt;ul&gt;&lt;li&gt;Give all the existing power of a solver&lt;/li&gt;&lt;li&gt;Allow you to build your own problem-specific optimization components&lt;/li&gt;&lt;li&gt;Are usually in C...&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Introducing python-zibopt&lt;ul&gt;&lt;li&gt;A brief intro to the ZIB Optimization Suite&lt;ul&gt;&lt;li&gt;Modeling Language: ZIMPL&lt;/li&gt;&lt;li&gt;Linear Programming Solver: SoPlex&lt;/li&gt;&lt;li&gt;Integer Programming Solver: SCIP&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Features of python-zibopt&lt;ul&gt;&lt;li&gt;Creating model variables&lt;/li&gt;&lt;li&gt;Describing problem constraints&lt;/li&gt;&lt;li&gt;Maximization / minimization&lt;/li&gt;&lt;li&gt;Maximum gaps and time limits on solving&lt;/li&gt;&lt;li&gt;Example: Maximizing network throughput&lt;ul&gt;&lt;li&gt;You have some kind of network (computer network, transportation, etc). Arcs have given capacities and you want to (a) calculate maximum throughput and (b) know where best to build extensions to the network.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Example: Scheduling of on-call times for IT admins&lt;ul&gt;&lt;li&gt;You have servers and services running in an IT shop. 
Services go down in the middle of the night and have to be fixed and brought back up. You have limited IT staff, each with different capabilities (apache, oracle, python, etc). Given approved vacations, you want to create a feasible on-call schedule that equitably assigns on-call duties. You would also like guidance for future hiring and training decisions.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Conclusion&lt;ul&gt;&lt;li&gt;Just because a problem is intractable doesn't mean you can't solve it&lt;/li&gt;&lt;li&gt;ZIB has an excellent suite of open source tools for just this purpose&lt;/li&gt;&lt;li&gt;Optimization is for Python programmers too!&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-4492466189464223824?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/4492466189464223824/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/11/my-pycon-2010-talk.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4492466189464223824'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4492466189464223824'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/11/my-pycon-2010-talk.html' title='my pycon 2010 talk: optimal resource allocation using python'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-5721521642083495760</id><published>2009-10-17T01:14:00.001-04:00</published><updated>2010-11-23T00:07:44.185-05:00</updated><title type='text'>python-zibopt 0.2 released</title><content type='html'>The new release has some exciting features, most importantly:&lt;br /&gt;&lt;br /&gt;You can now use gap, absgap, and time keywords when calling maximize/minimize to stop the solver before reaching optimality.  These function the same way they function for SCIP, pausing the search when any one of the conditions is met.  You can use any combination of them, and on the next call to maximize or minimize SCIP will pick up where it left off.&lt;br /&gt;&lt;pre class="brush: python; first-line: 1; toolbar: false;"&gt;solver.maximize(absgap=0.5, gap=100, time=500)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Settings for branching rules, separators, and conflict handlers can be read and modified from within Python:&lt;br /&gt;&lt;pre class="brush: python; first-line: 1; toolbar: false;"&gt;solver.branching['inference'].priority = 10000&lt;br /&gt;solver.branching['mostinf'].maxdepth += 2&lt;br /&gt;print(solver.branching['pscost'].maxbounddist)&lt;br /&gt;&lt;br /&gt;solver.separators['clique'].priority = 10000&lt;br /&gt;solver.conflict['logicor'].priority = 10000&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;You can download the new release &lt;a href="http://code.google.com/p/python-zibopt/downloads/list"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-5721521642083495760?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link 
rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/5721521642083495760/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/10/python-zibopt-02-released.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/5721521642083495760'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/5721521642083495760'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/10/python-zibopt-02-released.html' title='python-zibopt 0.2 released'/><author><name>Ryan J. O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-3135140346427208032</id><published>2009-10-08T12:11:00.008-04:00</published><updated>2010-11-23T00:04:57.194-05:00</updated><title type='text'>easy monte carlo simulation in python</title><content type='html'>One of the most useful tools one learns in an Operations Research curriculum is &lt;a href="http://en.wikipedia.org/wiki/Monte_Carlo_method"&gt;Monte Carlo Simulation&lt;/a&gt;.  Its utility lies in its simplicity: one can learn vital information about nearly any process, be it deterministic or stochastic, without wading through the grunt work of finding an analytical solution.  It can be used for off-the-cuff estimates or as a proper scientific tool.  
All one needs to know is how to simulate a given process and, if that process is stochastic, its appropriate probability distributions and parameters.&lt;br /&gt;&lt;br /&gt;Here's how it works:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Construct a simulation that, given input values, returns a value of interest.  This could be a pure quantity, like time spent waiting for a bus, or a boolean indicating whether or not a particular event occurs.&lt;/li&gt;&lt;li&gt;Run the simulation a (usually large) number of times, each time with randomly generated input variables.  Record its output values.&lt;/li&gt;&lt;li&gt;Compute sample mean and variance of the output values.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;In the case of time spent waiting for a bus, the sample mean and variance are estimators of mean and variance for one's wait time.  In the boolean case, the sample mean estimates the probability that the given event will occur.&lt;br /&gt;&lt;br /&gt;One can think of Monte Carlo Simulation as throwing darts.  Say you want to find the area under a curve without integrating.  All you must do is draw the curve on a wall and throw darts at it randomly.  After you've thrown enough darts, the area under the curve can be approximated using the fraction of darts that end up under the curve times the total area of the target region.&lt;br /&gt;&lt;br /&gt;This technique is often performed using a spreadsheet, but that can be a bit clunky and may make more complex simulations difficult.  I'd like to spend a minute showing how it can be done in Python.  Consider the following scenario:&lt;br /&gt;&lt;br /&gt;Passengers for a train arrive according to a Poisson process with a mean of 100 per hour.  The time until the next train arrives is exponentially distributed with a rate of 5 per hour.  How many passengers will be aboard the train?&lt;br /&gt;&lt;br /&gt;We can simulate this using the fact that a Poisson process can be represented as a string of events occurring with exponential inter-arrival times.  
We use the &lt;span style="font-family: courier new;"&gt;sim()&lt;/span&gt; function below to generate the number of passengers for random instances of the problem.  We then compute sample mean and variance for these values.&lt;br /&gt;&lt;pre class="brush: python; first-line: 1; toolbar: false;"&gt;import random&lt;br /&gt;&lt;br /&gt;PASSENGERS = 100.0  # passenger arrival rate (per hour)&lt;br /&gt;TRAINS     =   5.0  # train arrival rate (per hour)&lt;br /&gt;ITERATIONS = 10000  # number of simulation runs&lt;br /&gt;&lt;br /&gt;def sim():&lt;br /&gt;    passengers = 0.0&lt;br /&gt;&lt;br /&gt;    # Determine when the train arrives&lt;br /&gt;    train = random.expovariate(TRAINS)&lt;br /&gt;&lt;br /&gt;    # Count the number of passenger arrivals before the train&lt;br /&gt;    now = 0.0&lt;br /&gt;    while True:&lt;br /&gt;        now += random.expovariate(PASSENGERS)&lt;br /&gt;        if now &gt;= train:&lt;br /&gt;            break&lt;br /&gt;        passengers += 1.0&lt;br /&gt;&lt;br /&gt;    return passengers&lt;br /&gt;&lt;br /&gt;if __name__ == '__main__':&lt;br /&gt;    output = [sim() for _ in range(ITERATIONS)]&lt;br /&gt;&lt;br /&gt;    total = sum(output)&lt;br /&gt;    mean = total / len(output)&lt;br /&gt;&lt;br /&gt;    # Unbiased sample variance: (sum of squares - n * mean^2) / (n - 1)&lt;br /&gt;    sum_sqrs = sum(x*x for x in output)&lt;br /&gt;    variance = (sum_sqrs - total * mean) / (len(output) - 1)&lt;br /&gt;&lt;br /&gt;    print('E[X] = %.02f' % mean)&lt;br /&gt;    print('Var(X) = %.02f' % variance)&lt;br /&gt;&lt;/pre&gt;Running this code yields the intuitive result: we expect about 20 passengers on the train, since passengers arrive at 100 per hour and the expected wait for the train is 1/5 of an hour. 
While this is a very simple example, we could easily replace the &lt;span style="font-family: courier new;"&gt;sim()&lt;/span&gt; function with a simulator of any system of arbitrary complexity and get a pretty good estimation of its behavior.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-3135140346427208032?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/3135140346427208032/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/10/easy-monte-carlo-simulation-in-python.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3135140346427208032'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/3135140346427208032'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/10/easy-monte-carlo-simulation-in-python.html' title='easy monte carlo simulation in python'/><author><name>Ryan J. O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-4712757117556125991</id><published>2009-09-21T15:54:00.001-04:00</published><updated>2009-09-21T15:57:03.420-04:00</updated><title type='text'>python-zibopt and scip 1.2</title><content type='html'>In case you happen to be playing with python-zibopt at all, know that it has been tested and works perfectly well against ZIBOpt 1.2.  
All you must do is install the new version of ZIBOpt, update your environment variables per the python-zibopt installation page, and recompile.&lt;br /&gt;&lt;br /&gt;Special thanks go out to the ZIB folks for maintaining an excellent API across releases.  Happy hacking!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6378430709872859668-4712757117556125991?l=adventuresinoptimization.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/4712757117556125991/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/09/python-zibopt-and-scip-12.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4712757117556125991'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/4712757117556125991'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/09/python-zibopt-and-scip-12.html' title='python-zibopt and scip 1.2'/><author><name>Ryan J. O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-6897286372918044982</id><published>2009-07-10T18:09:00.001-04:00</published><updated>2009-07-10T18:09:48.021-04:00</updated><title type='text'>python-zibopt initial public release</title><content type='html'>I realize it's been a few months since my last post, but the time has been well used. 
I spent the beginning of this year solidifying my understanding of basic combinatorial optimization techniques in a graduate class taught by &lt;a href="http://iris.gmu.edu/%7Ekhoffman/"&gt;Prof. Hoffman&lt;/a&gt; at GMU.  Being a Linux user and open source advocate, I quickly found that &lt;a href="http://scip.zib.de/"&gt;SCIP&lt;/a&gt; and &lt;a href="http://zimpl.zib.de/"&gt;ZIMPL&lt;/a&gt; were my best options for solving the sometimes rather complex modeling tasks we were assigned. However, we were occasionally given problems that required either going into the C interface for SCIP or doing a significant amount of manual work.&lt;br /&gt;&lt;br /&gt;Take, for example, solving a TSP instance using subtour elimination constraints.  (See the &lt;a href="http://adventuresinoptimization.blogspot.com/2009/02/on-beauty-of-power-sets.html"&gt;earlier post&lt;/a&gt; about this.) If a TSP instance is of nontrivial size, the constraints required to describe it fully can't fit into memory. Thus, one ends up having to solve combinatorial relaxations of the problem repeatedly, adding formulation cuts after each solve until there aren't any disjoint subtours.&lt;br /&gt;&lt;br /&gt;With a background in dynamic languages, I wasn't about to dive into a C API just to do this. It began to dawn on me just how much better life would be with a Python interface to SCIP. 
Among other things, this would give:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;An easier time writing code to deal with special problem structures&lt;/li&gt;&lt;li&gt;Access to Python's rich set of libraries for things like &lt;a href="http://docs.python.org/library/csv.html"&gt;CSV&lt;/a&gt; and &lt;a href="http://docs.python.org/library/json.html"&gt;JSON&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Proper interfaces to relational databases&lt;/li&gt;&lt;li&gt;Ties to advanced web programming tools for decision support&lt;/li&gt;&lt;li&gt;Multiprocessing (for the truly insane)&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Thus, &lt;a href="http://code.google.com/p/python-zibopt/"&gt;python-zibopt&lt;/a&gt; was born.  I began coding it up in the days before attending the &lt;a href="https://sites.google.com/site/gimmemip/mip2009"&gt;MIP 2009 Workshop&lt;/a&gt; in June, where I had the pleasure of meeting some of the ZIB folks. Since then, python-zibopt has come along fairly quickly. There's a lot to be done, but I've made an initial public release available for anyone who'd like to start playing with it. Here's what it has so far:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Interfaces to SCIP solvers, variables, constraints, and solutions&lt;/li&gt;&lt;li&gt;The ability to turn solver verbosity on and off&lt;/li&gt;&lt;li&gt;Solutions that are false in case of infeasibility or unboundedness&lt;/li&gt;&lt;li&gt;Mapping from SCIP errors to Python exception types&lt;/li&gt;&lt;li&gt;Solvers can be restarted with additional constraints and variables&lt;/li&gt;&lt;li&gt;Primal solutions can be fed to the solver and are checked for feasibility&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Take a look at the &lt;a href="http://code.google.com/p/python-zibopt/source/browse/#svn/tags/python-zibopt-0.1/examples"&gt;examples&lt;/a&gt; directory to get started, particularly at tsp.py. This is a TSP solver that can be run against any JSON input file containing a lower triangular distance matrix. It works using the method outlined above. 
I've included &lt;a href="http://code.google.com/p/python-zibopt/source/browse/tags/python-zibopt-0.1/examples/tsp/us-capitals.json"&gt;a JSON file&lt;/a&gt; with the data from the &lt;a href="http://www.cse.wustl.edu/%7Echen/7102/TSP.pdf"&gt;famous 1954 TSP paper&lt;/a&gt; by Dantzig, et al.  (If you haven't read this, you really should!)</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/6897286372918044982/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/07/python-zibopt-initial-public-release.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6897286372918044982'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/6897286372918044982'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/07/python-zibopt-initial-public-release.html' title='python-zibopt initial public release'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-867814363811345617</id><published>2009-02-27T03:52:00.039-05:00</published><updated>2011-02-15T11:30:03.608-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ampl'/><category scheme='http://www.blogger.com/atom/ns#' term='formulation'/><category scheme='http://www.blogger.com/atom/ns#' term='tsp'/><category scheme='http://www.blogger.com/atom/ns#' term='zimpl'/><category scheme='http://www.blogger.com/atom/ns#' term='power set'/><category scheme='http://www.blogger.com/atom/ns#' term='zibopt'/><title type='text'>on the beauty of power sets</title><content type='html'>One of the difficulties we encounter in solving the &lt;a href="http://www.tsp.gatech.edu/"&gt;Traveling Salesman Problem&lt;/a&gt; (TSP) is that, for even a small number of cities, a complete description of the problem requires an exponential number of constraints.  This is apparent in the standard formulation used to teach the TSP to OR students.  Consider a set of $n$ cities with the distance from city $i$ to city $j$ denoted $d_{ij}$.  We attempt to minimize the total distance of a tour entering and leaving each city exactly once.  
Let $x_{ij} = 1$ if the edge from city $i$ to city $j$ is included in the tour, and $0$ otherwise:&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tr&gt;&lt;td valign="top"&gt;$\min$&lt;/td&gt;&lt;td colspan="5"&gt;$z = \sum_i \sum_{j\ne i} d_{ij} x_{ij}$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;s.t.&lt;/td&gt;&lt;td align="right"&gt;$\sum_{j\ne i} x_{ij}$&lt;/td&gt;&lt;td valign="top"&gt;$=$&lt;/td&gt;&lt;td valign="top"&gt;$1$&lt;/td&gt;&lt;td valign="top"&gt;$\forall i$&lt;/td&gt;&lt;td align="right" valign="top"&gt;leave each city once&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td align="right"&gt;$\sum_{i\ne j} x_{ij}$&lt;/td&gt;&lt;td valign="top"&gt;$=$&lt;/td&gt;&lt;td valign="top"&gt;$1$&lt;/td&gt;&lt;td valign="top"&gt;$\forall j$&lt;/td&gt;&lt;td align="right" valign="top"&gt;enter each city once&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td align="right"&gt;$x_{ij}$&lt;/td&gt;&lt;td&gt;$\in$&lt;/td&gt;&lt;td&gt;$\{0,1\}$&lt;/td&gt;&lt;td&gt;$\forall i, j$&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;This looks like a reasonable formulation until we solve it and see that our solution contains disconnected subtours.  Suppose we have four cities, labeled $A$ through $D$.  Connecting $A$ to $B$, $B$ to $A$, $C$ to $D$ and $D$ to $C$ provides a feasible solution to our formulation, but does not constitute a single cycle through all four cities.  Here is a more concrete example of two disconnected subtours {(1,5),(5,1)} and {(2,3),(3,4),(4,2)} over five cities:&lt;br /&gt;&lt;pre&gt;ampl: display x;&lt;br /&gt;x [*,*]&lt;br /&gt;:   1   2   3   4   5    :=&lt;br /&gt;1   0   0   0   0   1&lt;br /&gt;2   0   0   1   0   0&lt;br /&gt;3   0   0   0   1   0&lt;br /&gt;4   0   1   0   0   0&lt;br /&gt;5   1   0   0   0   0&lt;br /&gt;;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Realizing we just solved the &lt;a href="http://en.wikipedia.org/wiki/Assignment_problem"&gt;Assignment Problem&lt;/a&gt;, we now add subtour elimination constraints.  
These require that any proper, non-null subset $S$ of our $n$ cities is connected by at most $|S|-1$ active edges:&lt;br /&gt;&lt;br /&gt;$\sum_{i \in S} \sum_{j \in S, j \ne i} x_{ij} \leq |S|-1 \quad \forall S \subset \{1, \ldots, n\}, S \ne \emptyset$&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Indexing subtour elimination constraints over a &lt;a href="http://en.wikipedia.org/wiki/Power_set"&gt;power set&lt;/a&gt; of the cities completes the formulation. However, this requires an additional $\sum_{k=2}^{n-1} \left(\matrix{n \cr k}\right)$ rows tacked onto the end of our constraint matrix and is clearly impractical for large $n$.  The most current computers can handle using this approach &lt;a href="http://zimpl.zib.de/download/zimpl.pdf"&gt;is around 19 cities&lt;/a&gt;.  It remains an instructive tool for understanding the &lt;a href="http://en.wikipedia.org/wiki/Combinatorial_explosion"&gt;combinatorial explosion&lt;/a&gt; that occurs in problems like the TSP and is worth translating into a modeling language.  So how does one model it on a computer?&lt;br /&gt;&lt;br /&gt;Unfortunately, &lt;a href="http://ampl.com/"&gt;AMPL&lt;/a&gt;, the gold standard in mathematical modeling languages, has no built-in way to index over the subsets of a set.  Creating a power set in AMPL requires going through &lt;a href="http://tomopt.com/ampl/service/sets.php#HowcanIgetAMPLtoindexoverthepowersetconsistingofallsubsetsofaset"&gt;a few contortions&lt;/a&gt;.  The following code demonstrates power and index sets over four cities:&lt;br /&gt;&lt;pre&gt;set cities := 1 .. 4 ordered;&lt;br /&gt;&lt;br /&gt;param n := card(cities);&lt;br /&gt;set indices := 0 .. 
(2^n - 1);&lt;br /&gt;set power {i in indices} := {c in cities: (i div 2^(ord(c) - 1)) mod 2 = 1};&lt;br /&gt;&lt;br /&gt;display cities;&lt;br /&gt;display n;&lt;br /&gt;display indices;&lt;br /&gt;display power;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This yields the following output:&lt;br /&gt;&lt;pre&gt;set cities := 1 2 3 4;&lt;br /&gt;&lt;br /&gt;n = 4&lt;br /&gt;&lt;br /&gt;set indices := 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15;&lt;br /&gt;&lt;br /&gt;set power[0] := ; # empty&lt;br /&gt;set power[1] := 1;&lt;br /&gt;set power[2] := 2;&lt;br /&gt;set power[3] := 1 2;&lt;br /&gt;set power[4] := 3;&lt;br /&gt;set power[5] := 1 3;&lt;br /&gt;set power[6] := 2 3;&lt;br /&gt;set power[7] := 1 2 3;&lt;br /&gt;set power[8] := 4;&lt;br /&gt;set power[9] := 1 4;&lt;br /&gt;set power[10] := 2 4;&lt;br /&gt;set power[11] := 1 2 4;&lt;br /&gt;set power[12] := 3 4;&lt;br /&gt;set power[13] := 1 3 4;&lt;br /&gt;set power[14] := 2 3 4;&lt;br /&gt;set power[15] := 1 2 3 4;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Note how the index set contains an index for each row in our power set.  We can now generate the subtour elimination constraints:&lt;br /&gt;&lt;pre&gt;var x {cities cross cities} binary;&lt;br /&gt;s.t. 
subtours {i in indices: card(power[i]) &gt; 1 and card(power[i]) &lt; card(cities)}:&lt;br /&gt;sum {(c,k) in power[i] cross power[i]: k != c} x[c,k] &lt;= card(power[i]) - 1;&lt;br /&gt;&lt;br /&gt;expand subtours;&lt;br /&gt;&lt;br /&gt;subject to subtours[3]:  x[1,2] + x[2,1] &lt;= 1;&lt;br /&gt;subject to subtours[5]:  x[1,3] + x[3,1] &lt;= 1;&lt;br /&gt;subject to subtours[6]:  x[2,3] + x[3,2] &lt;= 1;&lt;br /&gt;subject to subtours[7]:  x[1,2] + x[1,3] + x[2,1] + x[2,3] + x[3,1] + x[3,2] &lt;= 2;&lt;br /&gt;subject to subtours[9]:  x[1,4] + x[4,1] &lt;= 1;&lt;br /&gt;subject to subtours[10]: x[2,4] + x[4,2] &lt;= 1;&lt;br /&gt;subject to subtours[11]: x[1,2] + x[1,4] + x[2,1] + x[2,4] + x[4,1] + x[4,2] &lt;= 2;&lt;br /&gt;subject to subtours[12]: x[3,4] + x[4,3] &lt;= 1;&lt;br /&gt;subject to subtours[13]: x[1,3] + x[1,4] + x[3,1] + x[3,4] + x[4,1] + x[4,3] &lt;= 2;&lt;br /&gt;subject to subtours[14]: x[2,3] + x[2,4] + x[3,2] + x[3,4] + x[4,2] + x[4,3] &lt;= 2;&lt;br /&gt;&lt;/pre&gt;While this does work, the code for generating the power set looks like &lt;a href="http://en.wikipedia.org/wiki/Voodoo_programming"&gt;voodoo&lt;/a&gt;.  Understanding it required piece-by-piece decomposition, an exercise I suggest you go through yourself if you have a copy of AMPL and 15 minutes to spare:&lt;pre&gt;set foo {c in cities} := {ord(c)};&lt;br /&gt;set bar {c in cities} := {2^(ord(c) - 1)};&lt;br /&gt;set baz {i in indices} := {c in cities: i div 2^(ord(c) - 1)};&lt;br /&gt;set qux {i in indices} := {c in cities: (i div 2^(ord(c) - 1)) mod 2 = 1};&lt;br /&gt;&lt;br /&gt;display foo;&lt;br /&gt;display bar;&lt;br /&gt;display baz;&lt;br /&gt;display qux;&lt;br /&gt;&lt;/pre&gt;This may be an instance where open source leads commercial software.  
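&lt;br /&gt;&lt;br /&gt;For anyone without AMPL at hand, the same bit arithmetic can be traced in Python. This is a rough sketch rather than a translation of the full model: the function name power_set and the variable names are assumptions made for this example, and only the membership test is carried over from the AMPL code.&lt;br /&gt;

```python
# Illustrative sketch; power_set is a made-up name, not an AMPL or ZIMPL call.
def power_set(cities):
    """List every subset of cities, indexed the same way as the AMPL sets."""
    n = len(cities)
    subsets = []
    for i in range(2 ** n):
        # The city at 1-based position k belongs to subset i when bit k-1
        # of i is set, mirroring (i div 2^(ord(c) - 1)) mod 2 = 1.
        members = [c for k, c in enumerate(cities, start=1)
                   if (i // 2 ** (k - 1)) % 2 == 1]
        subsets.append(members)
    return subsets


cities = [1, 2, 3, 4]
power = power_set(cities)
print(power[11])  # [1, 2, 4]

# Only subsets with more than one city and fewer than all of them
# receive a subtour elimination constraint.
constrained = [i for i, s in enumerate(power)
               if len(s) in range(2, len(cities))]
print(constrained)  # [3, 5, 6, 7, 9, 10, 11, 12, 13, 14]
```

&lt;br /&gt;The second list matches the indices of the expanded subtours constraints shown earlier.&lt;br /&gt;&lt;br /&gt;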
The good folks who produce the &lt;a href="http://zibopt.zib.de/"&gt;ZIB Optimization Suite&lt;/a&gt; provide an AMPL-like language called &lt;a href="http://zimpl.zib.de/"&gt;ZIMPL&lt;/a&gt; with a few additional useful features.  One of these is power sets.  Compared to the code above, doesn't this look refreshing?&lt;pre&gt;set cities := {1 to 4};&lt;br /&gt;&lt;br /&gt;set power[] := powerset(cities);&lt;br /&gt;set indices := indexset(power);&lt;br /&gt;&lt;/pre&gt;</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/867814363811345617/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/02/on-beauty-of-power-sets.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/867814363811345617'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/867814363811345617'/><link rel='alternate' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/02/on-beauty-of-power-sets.html' title='on the beauty of power sets'/><author><name>Ryan J. 
O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6378430709872859668.post-1440904683529247088</id><published>2009-02-20T03:30:00.089-05:00</published><updated>2010-11-23T23:05:31.940-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='formulation'/><category scheme='http://www.blogger.com/atom/ns#' term='bip'/><category scheme='http://www.blogger.com/atom/ns#' term='uls'/><title type='text'>uncapacitated lot-sizing formulation</title><content type='html'>Uncapacitated Lot-Sizing (ULS) is a classic &lt;a href="http://en.wikipedia.org/wiki/Operations_research"&gt;OR&lt;/a&gt; problem that seeks to minimize the cost of satisfying known demand for a product over time.  Meeting that demand incurs varying costs for production, set-up, and storage of the product.  Technically, it is a mixed binary linear program -- the key point separating it from the world of &lt;a href="http://en.wikipedia.org/wiki/Linear_programming"&gt;linear optimization&lt;/a&gt; being that production cannot occur during any period without paying that period's fixed costs for set-up.  
Thus it has continuous nonnegative variables for production and storage amounts during each period, and a binary variable for each period that determines whether or not production can actually occur.&lt;br /&gt;&lt;br /&gt;For $n$ periods with per-period fixed set-up cost $f_t$, unit production cost $p_t$, unit storage cost $h_t$, and demand $d_t$, we define decision variables related to production and storage quantities:&lt;br /&gt;&lt;br /&gt;$x_t =$ units produced in period $t$&lt;br /&gt;$s_t =$ stock at the end of period $t$&lt;br /&gt;$y_t = 1$ if production occurs in period $t$, $0$ otherwise&lt;br /&gt;&lt;br /&gt;One can minimize overall cost for satisfying all demand on time using the following model per &lt;a href="http://www.amazon.com/Integer-Programming-Laurence-Wolsey/dp/0471283665/ref=pd_bbs_sr_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1235083688&amp;amp;sr=8-1"&gt;Wolsey (1998)&lt;/a&gt;, defined slightly differently here, where $M$ is a sufficiently large constant (for example, total demand $\sum_t d_t$):&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td valign="top"&gt;$\min$&lt;/td&gt;&lt;td colspan="4"&gt;$z = \sum_t{p_t x_t} + \sum_t{h_t s_t} + \sum_t{f_t y_t}$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;s.t.&lt;/td&gt;&lt;td align="right"&gt;$x_1$&lt;/td&gt;&lt;td&gt;$=$&lt;/td&gt;&lt;td&gt;$d_1 + s_1$&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td align="right"&gt;$s_{t-1} + x_t$&lt;/td&gt;&lt;td&gt;$=$&lt;/td&gt;&lt;td&gt;$d_t + s_t$&lt;/td&gt;&lt;td&gt;$\forall t &gt; 1$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td align="right"&gt;$x_t$&lt;/td&gt;&lt;td&gt;$\leq$&lt;/td&gt;&lt;td&gt;$M y_t$&lt;/td&gt;&lt;td&gt;$\forall t$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td align="right"&gt;$s_t, x_t$&lt;/td&gt;&lt;td&gt;$\geq$&lt;/td&gt;&lt;td&gt;$0$&lt;/td&gt;&lt;td&gt;$\forall t$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td align="right"&gt;$y_t$&lt;/td&gt;&lt;td&gt;$\in$&lt;/td&gt;&lt;td&gt;$\{0,1\}$&lt;/td&gt;&lt;td&gt;$\forall 
t$&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;According to Wolsey, page 11, given that $s_t = \sum_{i=1}^t (x_i - d_i)$ and defining new constants $K = \sum_{t=1}^n h_t(\sum_{i=1}^t d_i)$ and $c_t = p_t + \sum_{i=t}^n h_i$, the objective function can be rewritten as $z = \sum_t c_t x_t + \sum _t f_t y_t - K$.  The book lacks a proof of this and it seems a bit non-obvious, so I attempt an explanation in somewhat painstaking detail here.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;table width="500"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td colspan="3" valign="top"&gt;Proof:&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="1" align="right"&gt;$\sum_t p_t x_t + \sum_t h_t s_t + \sum_t f_t y_t$&lt;/td&gt;&lt;td valign="top"&gt;$=$&lt;/td&gt;&lt;td&gt;$\sum_t c_t x_t + \sum _t f_t y_t - K$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="3" valign="top"&gt;1. Eliminate $\sum_t f_t y_t$:&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="right"&gt;$\sum_t p_t x_t + \sum_t h_t s_t$&lt;/td&gt;&lt;td valign="top"&gt;$=$&lt;/td&gt;&lt;td&gt;$\sum_t c_t x_t - K$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="3"&gt;2. Expand $K$ and $c_t$:&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="right"&gt;$\sum_t p_t x_t + \sum_t h_t s_t$&lt;/td&gt;&lt;td&gt;$=$&lt;/td&gt;&lt;td&gt;$\sum_t (p_t + \sum_{i=t}^n h_i) x_t - \sum_t h_t (\sum_{i=1}^t d_i)$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="3"&gt;3. Eliminate $\sum_t p_t x_t$:&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="right"&gt;$\sum_t h_t s_t$&lt;/td&gt;&lt;td&gt;$=$&lt;/td&gt;&lt;td&gt;$\sum_t x_t (\sum_{i=t}^n h_i) - \sum_t h_t (\sum_{i=1}^t d_i)$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="3"&gt;4. Expand $s_t$:&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="right"&gt;$\sum_t h_t (\sum_{i=1}^t x_i) - \sum_t h_t (\sum_{i=1}^t d_i)$&lt;/td&gt;&lt;td&gt;$=$&lt;/td&gt;&lt;td&gt;$\sum_t x_t (\sum_{i=t}^n h_i) - \sum_t h_t (\sum_{i=1}^t d_i)$&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="3"&gt;5. 
Eliminate $\sum_t h_t (\sum_{i=1}^t d_i)$:&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="right"&gt;$\sum_t h_t (\sum_{i=1}^t x_i)$&lt;/td&gt;&lt;td&gt;$=$&lt;/td&gt;&lt;td&gt;$\sum_t x_t (\sum_{i=t}^n h_i)$&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;The result from step 5 becomes obvious upon expanding its left- and right-hand sides:&lt;br /&gt;&lt;br /&gt;$h_1 x_1 + h_2 (x_1 + x_2) + \cdots + h_n (x_1 + \cdots + x_n) = $ $x_1 (h_1 + \cdots + h_n) + x_2 (h_2 + \cdots + h_n) + \cdots + x_n h_n$.  &lt;br /&gt;&lt;br /&gt;In matrix notation with $h$ and $x$ as column vectors in $\bf R^n$ and $L$ and $U$ being $n \times n$ lower and upper triangular matrices of ones, respectively, this can be written as:&lt;br /&gt;&lt;br /&gt;$\left(h_1 \cdots h_n\right)\left(\matrix{1 \cdots 0 \cr \vdots\ddots \vdots \cr 1 \cdots 1}\right)\left(\matrix{x_1 \cr \vdots \cr x_n}\right) = \left(x_1 \cdots x_n \right)\left(\matrix{1 \cdots 1 \cr \vdots \ddots \vdots \cr 0 \cdots 1}\right)\left(\matrix{h_1 \cr \vdots \cr h_n}\right)$ &lt;br /&gt;&lt;br /&gt;or $h^T L x = x^T U h$, which holds because $U = L^T$ and the scalar $h^T L x$ equals its own transpose.</content><link rel='replies' type='application/atom+xml' href='http://adventuresinoptimization.blogspot.com/feeds/1440904683529247088/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://adventuresinoptimization.blogspot.com/2009/02/uncapacitated-lot-sizing-formulation.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1440904683529247088'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6378430709872859668/posts/default/1440904683529247088'/><link rel='alternate' type='text/html' 
href='http://adventuresinoptimization.blogspot.com/2009/02/uncapacitated-lot-sizing-formulation.html' title='uncapacitated lot-sizing formulation'/><author><name>Ryan J. O'Neil</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://4.bp.blogspot.com/_Cv0KZsgj0Fo/TTX24yBxlSI/AAAAAAAAABQ/qLlZGYpEpT8/S220/2011-01-12-205103.jpg'/></author><thr:total>2</thr:total></entry></feed>
