To untangle the server code from the actual backup
implementation.
It would be ideal if the whole backup/ dir could become its
own crate with minimal dependencies, certainly without
depending on the actual API server. That would also make it
easier to create forensic tools for all the data
file types we have in the backup repositories.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
We need to update the atime of chunk files if they already exist,
otherwise a concurrently running GC could sweep them away.
This is protected with ChunkStore.mutex, so the fstat/unlink does not
race with touching.
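A minimal sketch of the idea, not the actual implementation: ChunkStoreSketch
and touch_if_exists are hypothetical names, and std's File::set_times stands in
for whatever call the real code uses to update the atime.

```rust
use std::fs::{FileTimes, OpenOptions};
use std::io;
use std::path::Path;
use std::sync::Mutex;
use std::time::SystemTime;

/// Hypothetical stand-in for the chunk store; only the lock matters here.
pub struct ChunkStoreSketch {
    mutex: Mutex<()>,
}

impl ChunkStoreSketch {
    pub fn new() -> Self {
        Self { mutex: Mutex::new(()) }
    }

    /// Returns Ok(true) if the chunk already existed and its atime was
    /// updated, Ok(false) if the caller still has to write the chunk.
    pub fn touch_if_exists(&self, chunk_path: &Path) -> io::Result<bool> {
        // Hold the same lock the sweep would take, so its stat/unlink
        // cannot run between our existence check and the touch.
        let _guard = self.mutex.lock().unwrap();

        match OpenOptions::new().write(true).open(chunk_path) {
            Ok(file) => {
                // Only the access time is changed; mtime is left alone.
                let times = FileTimes::new().set_accessed(SystemTime::now());
                file.set_times(times)?;
                Ok(true)
            }
            Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(false),
            Err(err) => Err(err),
        }
    }
}
```

A sweep in this sketch would have to take the same mutex before checking the
atime and unlinking, otherwise the touch gives no protection.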
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The iterator of get_chunk_iterator is extended with a third parameter
indicating whether the current file is a chunk (false) or a .bad file
(true).
Count their sizes toward the total of removed bytes, since deleting them
also frees disk space.
.bad files are only deleted if the corresponding chunk exists, i.e. has
been rewritten. Otherwise we might delete data only marked bad because
of transient errors.
While at it, also clean up and use nix::unistd::unlinkat instead of
unsafe libc calls.
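The deletion rule can be sketched roughly as follows. SweepEntry and sweep are
hypothetical names, the "<digest>.<n>.bad" naming is an assumption made for the
example, and the real code works on directory entries rather than an in-memory
list.

```rust
use std::collections::HashSet;

/// One file seen by the sweep (hypothetical type for this sketch).
pub struct SweepEntry {
    pub name: String,
    pub size: u64,
    pub atime: i64,
    /// false: regular chunk, true: .bad file (the iterator's third value).
    pub bad: bool,
}

/// Returns the names that would be removed and the bytes freed by that.
pub fn sweep(entries: &[SweepEntry], min_atime: i64) -> (Vec<String>, u64) {
    // Names of all regular chunks currently present.
    let chunks: HashSet<&str> = entries
        .iter()
        .filter(|e| !e.bad)
        .map(|e| e.name.as_str())
        .collect();

    let mut removed = Vec::new();
    let mut removed_bytes = 0u64;

    for entry in entries {
        let remove = if entry.bad {
            // Assumption: the digest is everything before the first '.'.
            let digest = entry.name.split('.').next().unwrap_or("");
            // Only drop a .bad file once its chunk has been rewritten.
            chunks.contains(digest)
        } else {
            // Regular chunks go away when they were not marked as used.
            entry.atime < min_atime
        };

        if remove {
            removed.push(entry.name.clone());
            // .bad files count toward the freed bytes as well.
            removed_bytes += entry.size;
        }
    }

    (removed, removed_bytes)
}
```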
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Used chunks are marked in phase1 of the garbage collection process by
using the atime property. Each used chunk gets touched so that the atime
gets updated (if older than 24h, see relatime).
Should there ever be a situation in which the phase1 in the GC run needs
a very long time to finish, it could happen that the grace period
calculated in phase2 is not long enough and thus the marking of the
chunks (atime) becomes invalid. This would result in the removal of
needed chunks.
Even though the likelihood of this happening is very low, using the
timestamp from right before phase1 starts to calculate the grace
period in phase2 should avoid this situation.
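To illustrate with numbers, here is a small sketch. sweep_cutoff, may_remove
and the margin parameter are hypothetical and only model the arithmetic; times
are seconds since the epoch.

```rust
/// relatime only refreshes an atime that is older than 24h.
pub const RELATIME_WINDOW_SECS: i64 = 24 * 60 * 60;

/// Oldest atime a chunk may have and still count as "marked in phase1".
/// `reference` should be the timestamp taken right before phase1 starts;
/// `margin_secs` is an optional extra safety margin (assumption).
pub fn sweep_cutoff(reference: i64, margin_secs: i64) -> i64 {
    reference - RELATIME_WINDOW_SECS - margin_secs
}

/// Phase2 may remove a chunk only if its atime is older than the cutoff.
pub fn may_remove(atime: i64, cutoff: i64) -> bool {
    atime < cutoff
}
```

Take a used chunk whose atime is 23h older than the start of phase1, so
relatime did not refresh it, and a phase1 that runs for 30h. With the cutoff
derived from the phase1 start the chunk is kept. Deriving it from the phase2
start instead moves the cutoff 30h forward and the chunk would be removed.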
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
And make verify_crc private for now. We always call load_from_reader() to
verify the CRC.
Also add load_chunk() to datastore.rs (from chunk_store::read_chunk()).
When creating a new datastore the basedir is only owned by the backup
user if it did not exist beforehand (create_path chowns only if it
creates the directory, and returns false if it did not create the
directory).
This improves the experience when adding a new datastore on a fresh
disk or existing directory (not owned by backup) - backups/pulls can
be run instead of terminating with EPERM.
Tested on my local testinstall with a new disk, and an existing directory.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
We can now use iter::from_fn() which makes for much nicer
logic. The only thing better is going to be when we can use
generators with `yield`.
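For illustration, a nested walk written with iter::from_fn(). walk_buckets is a
hypothetical example and not the iterator touched by this change; it only shows
how the cursor state lives in the closure instead of a hand-written Iterator
impl.

```rust
use std::iter;

/// Yields (bucket index, item) for every item of every bucket, in order.
pub fn walk_buckets(buckets: Vec<Vec<String>>) -> impl Iterator<Item = (usize, String)> {
    // State captured by the closure: the outer cursor and the bucket
    // currently being drained.
    let mut outer = buckets.into_iter().enumerate();
    let mut current: Option<(usize, std::vec::IntoIter<String>)> = None;

    iter::from_fn(move || loop {
        if let Some((idx, inner)) = current.as_mut() {
            if let Some(item) = inner.next() {
                return Some((*idx, item));
            }
        }
        // No bucket yet, or the current one is exhausted: move on.
        // `?` ends the iteration once the outer cursor is done.
        let (idx, bucket) = outer.next()?;
        current = Some((idx, bucket.into_iter()));
    })
}
```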
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
The protocol handler will receive chunk data plus a hash
pre-calculated by the client. It will verify the hash before
sending it up to the datastore in order to respond to the
client with an error on a mismatch, so there's no need to
recalculate the hash another time.
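A rough sketch of that flow. StoreSketch, handle_upload and
insert_with_digest are hypothetical names, and toy_digest is a
non-cryptographic stand-in used only to keep the example self-contained; it is
not the hash the real code uses.

```rust
use std::collections::HashMap;

pub type Digest = [u8; 32];

/// Stand-in digest for the example only. NOT cryptographic.
pub fn toy_digest(data: &[u8]) -> Digest {
    let mut out = [0u8; 32];
    for (i, byte) in data.iter().enumerate() {
        out[i % 32] = out[i % 32].wrapping_mul(31).wrapping_add(*byte);
    }
    out
}

/// Hypothetical in-memory datastore.
#[derive(Default)]
pub struct StoreSketch {
    chunks: HashMap<Digest, Vec<u8>>,
}

impl StoreSketch {
    /// Trusts the digest passed in and does not hash the data again.
    /// Returns true if the chunk was new, false if it already existed.
    pub fn insert_with_digest(&mut self, digest: Digest, data: Vec<u8>) -> bool {
        if self.chunks.contains_key(&digest) {
            return false;
        }
        self.chunks.insert(digest, data);
        true
    }
}

/// Protocol-handler side: verify the client's digest once, reject the
/// upload on a mismatch, otherwise pass digest and data on together.
pub fn handle_upload(
    store: &mut StoreSketch,
    claimed: Digest,
    data: Vec<u8>,
) -> Result<bool, String> {
    if toy_digest(&data) != claimed {
        return Err("chunk digest mismatch".to_string());
    }
    Ok(store.insert_with_digest(claimed, data))
}
```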
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>