Btrfs - totting up disk usage with subvolumes and snapshots
I was struggling for disk space last night.
df -h showed my disk nearly full, which seemed insane given its size. So I reached for "du" to start narrowing down where the space had gone. My favourite variation is du --max-depth=1 -h, which gives me a nice total for each directory in the current one.
The numbers didn't add up. This was because I'm using the Btrfs file system, which supports snapshots. When you snapshot a subvolume it appears to duplicate it, but the snapshot actually shares the original's data blocks on disk (copy-on-write). If you don't change either copy, the snapshot takes up almost no extra disk space. If you do start changing things, only the changes take up space (even just parts of files).
The "du" command takes everything at face value and counts every snapshot as if it were a full copy. With all the snapshots I had, it was reporting several terabytes of data in use on a 1TB disk.
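If you want numbers that know about the sharing, btrfs-progs ships its own du, which reports exclusive and shared space separately. Here is a rough sketch, assuming the btrfs tool is installed and that you run it with enough privileges (it usually wants root). The report_usage name is just something I made up for this example, and it falls back to plain du when the Btrfs version can't be used:

```shell
# report_usage is a hypothetical helper name, not part of any tool.
report_usage() {
    path=${1:-.}
    # "btrfs filesystem du -s" summarises total, exclusive and shared space.
    # It fails on non-Btrfs paths or without permission, so errors are hidden
    # and we drop through to the fallback below.
    if command -v btrfs >/dev/null 2>&1 &&
       btrfs filesystem du -s "$path" 2>/dev/null; then
        return 0
    fi
    # Fallback: plain du, which counts shared data once per copy.
    du -sh "$path"
}

report_usage .
```

On a Btrfs path the "Exclusive" column is the interesting one, since that is roughly the space you would get back by deleting that directory or snapshot.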
My snapshots are done automatically with the rather excellent "snapper" tool which is bundled with openSUSE. It snapshots hourly and stores a configured number of snapshots. Luckily for me it does this in a predictable way and puts everything in a ".snapshots" directory. A small change to our command means we can see where our data usage really is:
du --max-depth=1 --exclude=.snapshots -h .
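To convince yourself what --exclude does without touching real data, here is a small sketch on a throwaway directory. It assumes GNU du (for -b and --exclude), and every path in it is invented for the demo. A plain copy stands in for a snapshot, so it runs on any file system:

```shell
# Throwaway tree; all names below are made up for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/data" "$demo/.snapshots/1/snapshot"

# 1 MiB of real data, plus a plain copy standing in for a snapshot.
head -c 1048576 /dev/zero > "$demo/data/file.bin"
cp "$demo/data/file.bin" "$demo/.snapshots/1/snapshot/file.bin"

# -s: a single total, -b: apparent size in bytes (GNU du).
total=$(du -sb "$demo" | cut -f1)
visible=$(du -sb --exclude=.snapshots "$demo" | cut -f1)

echo "with snapshots:    $total"
echo "without snapshots: $visible"

rm -rf "$demo"
```

The second figure should be smaller by at least the size of the copied file, which is the same effect as on the real disk, only there the snapshots add up to terabytes.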
Armed with this command I soon found out that a government open data database I had downloaded was still swamping my drive. Deleted!