An assignment I have at the moment is a simple cache simulator, as I think I’ve said before. Anyway, I thought the code was all done – all good. The results made sense, and it seemed to correspond with what dineroIV (a popular and powerful open source cache simulator, which we’re using as our reference, presuming it’s totally accurate) was saying. I emphasise seemed, because as it turns out I’d only tested the few cases where it did… for the vast majority, it didn’t. D’oh.
Long story short, I have two arrays – one of elements actually loaded into the given cache set, one of all elements that have at any time been in cache (i.e. a history). Now, when a miss occurs for an entry that’s been loaded before, you copy the matching element from the ‘history’ array into the ‘current’ array. I was doing this, of course. What I wasn’t doing was updating its access times for read/write/etc., so when LRU or similar algorithms were applied they were forever seeing the stale values from when it was loaded into the cache the very first time. D’oh d’oh d’oh d’oh!!
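Here is a minimal Python sketch of that bug and its fix. It is not the actual assignment code: the names `Entry`, `CacheSet`, `last_access` and `clock` are made up for illustration, the history is a dict keyed by tag rather than an array, and a single logical timestamp stands in for the separate read/write times.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    tag: int
    last_access: int = 0  # logical time of the most recent access (assumed field)


class CacheSet:
    """One cache set with LRU replacement (illustrative sketch only)."""

    def __init__(self, ways):
        self.ways = ways
        self.current = []   # entries loaded in the set right now
        self.history = {}   # every entry that has ever been loaded, by tag
        self.clock = 0      # logical time, bumped on every access

    def access(self, tag):
        """Return True on a hit, False on a miss."""
        self.clock += 1

        for entry in self.current:
            if entry.tag == tag:
                entry.last_access = self.clock
                return True

        # Miss: reuse the history entry if this tag has been loaded before.
        entry = self.history.get(tag)
        if entry is None:
            entry = Entry(tag)
            self.history[tag] = entry

        # The fix: refresh the access time on reload. Without this line a
        # reloaded entry keeps its old timestamp and looks like the least
        # recently used entry straight away.
        entry.last_access = self.clock

        if len(self.current) >= self.ways:
            victim = min(self.current, key=lambda e: e.last_access)
            self.current.remove(victim)
        self.current.append(entry)
        return False
```

With a 2-way set and the access sequence A, B, C, A, D, A, the final access to A should hit. If the refresh line is removed, A is reloaded with its original timestamp, gets evicted when D arrives, and the final access misses instead.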
So now it’s fixed… and now I have to run all 14100 simulations again, repopulate my PostgreSQL results database, and start over on selectively pulling my results into Excel. Oh the humanity. Luckily it’s pretty fast, if I do say so myself, and those 14100 simulations only take half an hour or so. The front end for the app is extremely powerful, so I can just give it ranges for all the variables and it’ll automagically run every valid combination of those… saves typing in the parameters by hand 14100 times. :)
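The "give it ranges, get every valid combination" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the parameter names, the example values and the validity rule are all hypothetical, not what the real front end uses.

```python
from itertools import product


def sweep(ranges, is_valid):
    """Yield every combination of the given parameter values that passes is_valid."""
    names = list(ranges)
    for values in product(*(ranges[name] for name in names)):
        config = dict(zip(names, values))
        if is_valid(config):
            yield config


# Hypothetical parameter ranges, not the real simulator's.
ranges = {
    "cache_size": [128, 1024],       # bytes
    "block_size": [16, 32, 64],      # bytes
    "associativity": [1, 2, 4],      # ways per set
    "policy": ["LRU", "FIFO"],
}


def is_valid(config):
    # Assumed rule: the cache must be able to hold at least one full set.
    return config["block_size"] * config["associativity"] <= config["cache_size"]


configs = list(sweep(ranges, is_valid))
```

Each resulting dict would then be handed to one simulation run. Here the raw product has 36 combinations, and the rule throws out the two where a 128-byte cache is asked to hold 4 ways of 64-byte blocks.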
The moral of this story – get it right the first time!