Biogeme Hints and Tips and My Biogeme Workflow

October 15th, 2013

I’ve been working a lot with Biogeme, the open-source discrete choice model estimation tool.  It really is a great tool, and it is distributed free of charge.  However, it has a few “quirks” that I hit all the time.  That being said, here are a few things I’ve learned:

  • Text fields in the data file are bad.  The only quotes should be on the top line of the file.  Quoted items throughout the data file will cause Biogeme to error with “No Data In The Sample” or another similar error message.
  • Similarly, blanks are bad.  Zero-fill empty cells.  For CSV files, this is easy: open the file in Excel and replace {blank} with 0.
  • For a large data file and/or when using Network GEV simulation, run it on Linux.  Michel Bierlaire (Biogeme’s author) has advocated this many times on the Biogeme Yahoo Group.  My Ubuntu 12.04 virtual machine (running under VirtualBox) runs circles around its Windows 7 64-bit host.
  • Things are case sensitive.
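
If you would rather not do the cleanup by hand in Excel, the first two tips can be scripted.  What follows is only a rough sketch in Python, not anything that ships with Biogeme: the function names clean_rows and clean_csv are my own, and it assumes a comma-separated file with a single header line where every data cell is supposed to be numeric.  It zero-fills blank cells, writes the data rows without quotes, and stops with an error if it finds a text value.

```python
import csv
import io


def clean_rows(rows):
    """Yield the header unchanged, then each data row with blanks zero-filled.

    Raises ValueError on any data cell that is not numeric, since text
    fields in the data rows are what we are trying to get rid of.
    """
    rows = iter(rows)
    header = next(rows)
    yield header
    for row_no, row in enumerate(rows, start=2):
        cleaned = []
        for name, cell in zip(header, row):
            cell = cell.strip()
            if cell == "":
                cell = "0"  # zero-fill empty cells
            try:
                float(cell)  # only checks that the cell parses as a number
            except ValueError:
                raise ValueError(
                    f"row {row_no}, column {name!r}: non-numeric value {cell!r}"
                )
            cleaned.append(cell)
        yield cleaned


def clean_csv(src, dst):
    """Read CSV text from src and write the cleaned version to dst."""
    reader = csv.reader(src)
    # Numeric cells never need quoting, so the default writer leaves
    # the data rows free of quotes.
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerows(clean_rows(reader))


if __name__ == "__main__":
    sample = io.StringIO('"id","choice","cost"\n1,,2.5\n2,1,\n')
    out = io.StringIO()
    clean_csv(sample, out)
    print(out.getvalue())
```

For a real file, pass two open file handles instead of the StringIO objects (open both with newline="" as the csv module recommends).  Raising on text values rather than silently dropping them is deliberate: it tells you which column to recode before the estimation run.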

My Workflow

Since everything I use is in Windows (sometimes by choice, sometimes by necessity), I’ve figured out a workflow that works.  The main dataset is in Microsoft Access, where I have several queries and linked tables.

[Figure: My Biogeme Workflow]

In the graphic above, the one thing that ties the two sides together is Dropbox (which could easily be another sync service or a network drive).  This lets me view the output HTML files in a browser on either my Windows host or my Ubuntu VM.  I get the running speed of Ubuntu while all my data work stays on my Windows computer, which makes life easier because I’m far more used to the Windows tools than to their Linux equivalents.