Database tuning: Triggers & materialized views
Had fun today at work tuning a Postgres database that has gotten very slow over the years as it has accumulated many gigabytes of data. This app is frequently rendered inoperable by just one or two users visiting its home page, which is obviously a bad situation. (Luckily, it’s an in-house tool used almost exclusively by a single user, which is why it hasn’t received more love before now.)
Over the past week, I’ve been recording the slowest queries, and today I started attacking them. The easiest-to-fix were the ones caused by missing indexes. Another problem I found was unnecessary overhead from two compound indexes that were indexing the same two columns with opposite orderings; I turned one into a single-column index, which should produce similar read performance and superior write performance.
A third fix I proposed was adding a field for the calculated value of md5(email). Some queries have been doing full-table searches of md5(email). I don’t understand why that’s necessary, but having to calculate md5() for every row in the table and then scanning the whole table sounds pretty inefficient. So I created a named function for calculating md5(email) and a trigger that calls the function whenever a table record is added or modified. Doing this at the database layer makes sense because Rails doesn’t need to know anything about md5(email).
I also created my first Postgres materialized view today. Another query can occasionally take 40+ seconds on our server. The same query normally runs orders of magnitude faster, so I’m not sure what causes such long delays. But it’s doing a join that involves calculating a count on a large table. My first thought was to add a counter cache, but that didn’t make sense when I looked at the table layout. I instead made a materialized view, which worked well on my static copy of the production database. But when I went to the Postgres documentation, I discovered two flaws with Postgres 9.3’s materialized view implementation: 1) Updating the materialized view is a manual process; and, 2) Updating the materialized view takes a full lock on the view. So I’m not sure it’s worth pushing to production, but I’m glad to read that Postgres devs are already working to improve the implementation of materialized views.
Posted by James on Friday, March 21, 2014