Since I started playing EvE Six years ago, I have been a fan of spreadsheets. In fact, I never really gave Excel a second look until I started playing EvE, which, thinking back, is rather crazy in it’s own right. As I’ve gotten more proficient with Excel, and making it do the things that I want it to, I’m starting to bump into some of its limitations, both as an organizational tool and a number-crunching tool. This blog series is about the migration of some of my more complicated spreadsheets to databases, and, as a first look, I thought I would use my EvE Online spreadsheets – the main reason I have become so proficient in Excel in the first place.
The Spreadsheet
I do production in EvE online, so it should be no surprise that I have a fair number of spreadsheets to help me make good production decisions. The second generation of such a workbook is functionally complete, and I simply add to it when I get new blueprints or items that I am considering for production. It provides me with the production (and invention) costs, where I should buy the components, and even whether I can profitably produce the item at all. It also supports planetary and moon production chains based on import/export costs and fuel costs. The crux of this spreadsheet was a handful of database dumps, a script to parse eve-central api calls to prices sheets, and a master sheet at the front that provides minimum buy/produce costs and maximum sell costs for every “item of interest” and “market of interest” derived from other sheets (so, the items I produce/invent, plus all of their component items that I might need costs for). As of right now, a snapshot of this page looks like this:
So, when I want to figure out whether I should spend the time to produce/invent something, I simply look for a positive profit (and a reasonable margin) with a “Produce in New Eden” option. It even advises on my sell market, so that’s handy.
The Problem
While this spreadsheet is effectively feature complete, it is neither small nor speedy. It currently weighs in at roughly 15MB, and Excel uses almost 250MB of memory when I open it. For computation, it takes about 2 minutes to open the sheet, and about 5 minutes to recompute everything if I do anything to the sheet (add a new blueprint, update prices, or reimport the necessary database tables when the data dump gets updated). While not completely unacceptable, it has become rather cumbersome to work with, especially for prototyping new things. I also have the problem that if I want to look a new invention path (i.e. ask the question, “is it profitable for me to invent Remote Sensor Dampener IIs?”), I have to add the BP to one spreadsheet, add the T2 BP to two more, and add the final item to my master price sheet. Then, I have to make sure I have the prices for any outstanding dependencies in my eve-central export script, and reimport all the prices. At 5 minutes per step, this can take an inordinate amount of time. As a result, my research options have stagnated.
The Database
When I was looking through the main time sinks in the spreadsheet, I realized that most of my equations had vlookups (an excel function that allows you to look up a value in a table and pull a specific column out) in them. I recognized this as an obvious optimization that I would get out of a database, since, let’s face it, table operations are what databases do well. I also noted that I would be able to save a step in exporting specially-rolled queries from the eve database dumps, and, instead, just use the raw tables and roll queries around those. Then, if the raw tables get updated, I just have to reimport those tables and everything still works.
Unfortunately, there’s one thing that Excel has no problem with, that I’m having trouble getting Access to do: Not-quite circular dependencies. Specifically, the rows of the computation don’t have a circular dependency, but the tables do. For instance, in my spreadsheet, I have one master sheet that displays the minimum cost to acquire an item – whether it’s through production or buying the item on the market. It gets the production cost from my production spreadsheet, which uses the master sheet to determine the material costs from all of the components. Excel is fine with this, as it does each element update in sequence, and can detect circular dependencies (and, more importantly, that there aren’t any). When trying to do this with Access, however, a query that references itself is no good. This necessitated the use of a temp table, which, while not entirely a problem, means that I have to build that temp table for each “level” of dependence on my master table. So, for planetary material prices to percolate through to my T2 production chain correctly, I have to rerun the query and generate the temp table 5 times! Luckily, this isn’t the end of the world, but still a quirk with utilizing a database rather than Excel.
As of this writing, my database solution isn’t feature complete. I’ve got basic T1 production and planetary production complete, but I haven’t even looked at T2 production, invention, and moon reactions. So far though, things are quite a bit smoother, and, I am doing some design work so that I only have to enter any particular information in once.