Tuesday, May 10, 2016

amcheck: Verify the logical consistency of PostgreSQL B-Tree indexes

I've created a project page on Github for amcheck, a tool for verifying the logical consistency of PostgreSQL B-Tree indexes:

https://github.com/petergeoghegan/amcheck

The tool is primarily useful for detecting index corruption in production database systems. It can do this with low overhead; most verification requires only a non-disruptive lock on the index as it is verified. The strength of the lock taken on an index as it is verified matches that of simple SELECT statements (unless the highest level of verification is requested). The locking involved will generally not block concurrent reads or writes, and will not prevent VACUUM from running concurrently.



amcheck is proposed as a contrib extension for PostgreSQL 9.7. This externally maintained version of the extension exists to support earlier versions of PostgreSQL (PostgreSQL 9.4+), and to make the tool available to those that need it sooner. While the level of verification is not totally comprehensive (in particular, there is no verification of indexes against underlying tables), the tool is still likely to detect many subtle problems in practice.

amcheck verifies that certain invariants that must hold in the structure of B-Tree indexes actually do, in fact, hold. It's fairly exhaustive. One example of a problem that the tool can detect is inconsistency arising from the recent PostgreSQL 9.5 abbreviated keys glibc issue, where the new-to-9.5 abbreviated keys performance optimization could lead to structurally inconsistent indexes due to a bug in some glibc versions. This issue created a need to get amcheck into the hands of users sooner rather than later.

It's not ideal that the tool is maintained externally, since there are complex locking protocols involved; the implementation must make sure that there cannot be false positives to be of much practical use, and so the tool ought to be considered whenever there is a question about these locking protocols. Unfortunately, we ran out of time to get amcheck into PostgreSQL 9.6. Technically there is no disadvantage to an externally maintained tool, but in my opinion amcheck should really be maintained alongside the B-Tree index code itself.


No comments:

Post a Comment