since converting from i64 epoch timestamp to DateTime is not always
possible. previously, passing invalid backup-time from client to server
(or vice-versa) panicked the corresponding tokio task. now we get proper
error messages including the invalid timestamp.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
by either printing the original, out-of-range timestamp as-is, or
bailing with a proper error message instead of panicking.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
even if it can't be handled by chrono. silently replacing it with epoch
0 is confusing..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
We need to update the atime of chunk files if they already exist,
otherwise a concurrently running GC could sweep them away.
This is protected with ChunkStore.mutex, so the fstat/unlink does not
race with touching.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The iterator of get_chunk_iterator is extended with a third parameter
indicating whether the current file is a chunk (false) or a .bad file
(true).
Count their sizes to the total of removed bytes, since it also frees
disk space.
.bad files are only deleted if the corresponding chunk exists, i.e. has
been rewritten. Otherwise we might delete data only marked bad because
of transient errors.
While at it, also clean up and use nix::unistd::unlinkat instead of
unsafe libc calls.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This ensures that following backups will always upload the chunk,
thereby replacing it with a correct version again.
Format for renaming is <digest>.<counter>.bad where <counter> is used if
a chunk is found to be bad again before a GC cleans it up.
Care has been taken to deliberately only rename a chunk in conditions
where it is guaranteed to be an error in the chunk itself. Otherwise a
broken index file could lead to an unwanted mass-rename of chunks.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Save the state ("ok" or "failed") and the UPID of the respective
verify task. With this we can easily allow to open the relevant task
log and show when the last verify happened.
As we already load the manifest when listing the snapshots, just add
it there directly.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
to avoid `map_struct` which is actually unsafe because it
does not verify alignment constraints at all
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This reverts commit d53fbe2474.
The HashSet and "register" function are unnecessary, as we already know
which backup is the one we need to check: the last one, stored as
'last_backup'.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
An flock on the snapshot dir itself is used in addition to the group dir
lock. The lock is used to avoid races with forget and prune, while
having more granularity than the group lock (i.e. the group lock is
necessary to prevent more than one backup per group, but the snapshot
lock still allows backups unrelated to the currently running to be
forgotten/pruned).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Attempt to lock the backup directory to be deleted, if it works keep the
lock until the deletion is complete. This way we ensure that no other
locking operation (e.g. using a snapshot as base for another backup) can
happen concurrently.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
an encrypted Index should never reference a plain-text chunk, and an
unencrypted Index should never reference an encrypted chunk.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
If the datastore holds broken backups for some reason, do not attempt to
base following snapshots on those. This would lead to an error on
/previous, leaving the client no choice but to upload all chunks, even
though there might be potential for incremental savings.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
instead of bailing and stopping the entire GC process, warn about the
missing chunks and continue.
this results in "TASK WARNINGS: X" as the status.
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Used chunks are marked in phase1 of the garbage collection process by
using the atime property. Each used chunk gets touched so that the atime
gets updated (if older than 24h, see relatime).
Should there ever be a situation in which the phase1 in the GC run needs
a very long time to finish, it could happen that the grace period
calculated in phase2 is not long enough and thus the marking of the
chunks (atime) becomes invalid. This would result in the removal of
needed chunks.
Even though the likelyhood of this happening is very low, using the
timestamp from right before phase1 is started, to calculate the grace
period in phase2 should avoid this situation.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
regular chunks are only decoded when their contents are accessed, in
which case we need to have the key anyway and want to verify the digest.
for blobs we need to verify beforehand, since their checksums are always
calculated based on their raw content, and stored in the manifest.
manifests are also stored as blobs, but don't have a digest in the
traditional sense (they might have a signature covering parts of their
contents, but that is verified already when loading the manifest).
this commit does not cover pull/sync code which copies blobs and chunks
as-is without decoding them.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Errors while applying metadata will not be considered fatal
by default using `pxar extract` unless `--strict` was passed
in which case it'll bail out immediately.
It'll still return an error exit status if something had
failed along the way.
Note that most other errors will still cause it to bail out
(eg. errors creating files, or I/O errors while writing
the contents).
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
so that we can print a list at the end of the worker which backups
are corrupt.
this is useful if there are many snapshots and some in between had an
error. Before this patch, the task log simply says to 'look in the logs'
but if the log is very long it makes it hard to see what exactly failed.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this makes it easier to see which chunks are corrupt
(and enables us in the future to build a 'complete' list of
corrupt chunks)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This should never trigger if everything else works correctly, but it is
still a very cheap check to avoid wrongly marking a backup as "OK" when
in fact some chunks might be missing.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Multiple backups within one backup group don't really make sense, but
break all sorts of guarantees (e.g. a second backup started after a
first would use a "known-chunks" list from the previous unfinished one,
which would be empty - but using the list from the last finished one is
not a fix either, as that one could be deleted or pruned once the first
simultaneous backup is finished).
Fix it by only allowing one backup per backup group at one time. This is
done via a flock on the backup group directory, thus remaining intact
even after a reload.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
To prevent a race with a background GC operation, do not allow deletion
of backups who's index might currently be referenced as the "known chunk
list" for successive backups. Otherwise the GC could delete chunks it
thinks are no longer referenced, while at the same time telling the
client that it doesn't need to upload said chunks because they already
exist.
Additionally, prevent deletion of whole backup groups, if there are
snapshots contained that appear to be currently in-progress. This is
currently unlikely to trigger, as that function is only used for sync
jobs, but it's a useful safeguard either way.
Deleting a single snapshot has a 'force' parameter, which is necessary
to allow deleting incomplete snapshots on an aborted backup. Pruning
also sets force=true to avoid the check, since it calculates which
snapshots to keep on its own.
To avoid code duplication, the is_finished method is factored out.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
And make verify_crc private for now. We always call load_from_reader() to
verify the CRC.
Also add load_chunk() to datastore.rs (from chunk_store::read_chunk())
useful to get info like, was the previous snapshot encrypted in
libproxmox-backup-qemu
Requested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Requires updating the AsyncRead implementation to cope with byte-wise
seeks to intra-chunk positions.
Uses chunk_from_offset to get locations within chunks, but tries to
avoid it for sequential read to not reduce performance from before.
AsyncSeek needs to use the temporary seek_to_pos to avoid changing the
position in case an invalid seek is given and it needs to error in
poll_complete.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
We support using an ext4 mountpoint directly as datastore and even do
so ourself when creating one through the disk manage code.
Such ext4 ountpoints have a lost+found directory which only root can
traverse into. As the GC list images is done as backup:backup user
walkdir gets an error.
We cannot ignore just all permission errors, as they could lead to
missing some backup indexes and thus possibly sweeping more chunks
than desired. While *normally* that should not happen through our
stack, we had already user report that they do rsyncs to move a
datastore from old to new server and got the permission wrong.
So for now be still very strict, only allow a "lost+found" directory
as immediate child of the datastore base directory, nothing else.
If deemed safe, this can always be made less strict. Possibly by
filtering the known backup-types on the highest level first.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
otherwise we leak those descriptors and run into EMFILE when a backup
group contains many snapshots.
fcntl::openat and Dir::openat are not the same ;)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
JSON keys MUST be quoted. this is a one-time break in signature
validation for backups created with the broken canonicalization code.
QEMU backups are not affected, as libproxmox-backup-qemu never linked
the broken versions.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
it is nice to have a command to exit from the shell instead of
only allowing ctrl+d or ctrl+c
the api method is just for documentation/help purposes and does nothing
by itself, the real logic is directly in the read loop
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
When creating a new datastore the basedir is only owned by the backup
user if it did not exist beforehand (create_path chowns only if it
creates the directory), and returns false if it did not create the
directory).
This improves the experience when adding a new datastore on a fresh
disk or existing directory (not owned by backup) - backups/pulls can
be run instead of terminating with EPERM.
Tested on my local testinstall with a new disk, and a existing directory:
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
* don't clone hash keys, just use references
* we don't need a String, stick to Vec<u8> and use
serde_json::to_writer to avoid a temporary strings
altogether
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This is a more convenient way to pass along the key when
creating encrypted backups of unprivileged containers in PVE
where the unprivileged user namespace cannot access
`/etc/pve/priv`.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
place() is used when creating a file, as it will create
intermediate directories, only use it when actually placing
a new file.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This also replaces the recently introduced --encryption
parameter on the client with a --crypt-mode parameter.
This can be "none", "encrypt" or "sign-only".
Note that this introduces various changes in the API types
which previously did not take the above distinction into
account properly:
Both `BackupContent` and the manifest's `FileInfo`:
lose `encryption: Option<bool>`
gain `crypt_mode: Option<CryptMode>`
Within the backup manifest itself, the "crypt-mode" property
will always be set.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
And use it for fixed and dynamic index. Please note that this
changes checksums for fixed indexes, so restore older backups
will fails now (not backward compatible).
To support incremental backups (where not all chunks are sent to the
server), a new parameter "reuse-csum" is introduced on the
"create_fixed_index" API call. When set and equal to last backups'
checksum, the backup writer clones the data from the last index of this
archive file, and only updates chunks it actually receives.
In incremental mode some checks usually done on closing an index cannot
be made, since they would be inaccurate.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
we want to get a string representation of the DirEntryAttribute
like 'f' for file, etc. and since we have such a mapping already
in the CatalogEntryType, use that
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
mostly copied from BufferedDynamicReadAt from proxmox-backup-client
but the reader is wrapped in an Arc in addition to the Mutex
we will use this for local access to a pxar behind a didx file
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>