Commit Graph

165 Commits

Author SHA1 Message Date
Dietmar Maurer
ccc3896ff3 avoid type re-exports 2021-09-14 08:35:43 +02:00
Dietmar Maurer
e7d4be9d85 move datastore config to pbs_config workspace 2021-09-10 08:40:58 +02:00
Dietmar Maurer
2121174827 start new pbs-config workspace
moved src/config/domains.rs
2021-09-02 12:58:20 +02:00
Dietmar Maurer
7526d86419 use new atomic_open_or_create_file
Factor out open_backup_lockfile() method to acquire locks owned by
user backup with permission 0660.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-07-20 18:54:23 +02:00
Dominik Csapak
8e0b852f24 server/prune_job: add proper permission checks to 'prune_datastore'
checks for PRIV_DATASTORE_MODIFY, or else if the auth_id is the backup
owner, and skips the group if not.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-07-16 11:39:01 +02:00
Dominik Csapak
9751ef4b36 backup/datastore: refactor check_backup_owner there
and add a 'owns_backup' convenience function

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-07-16 11:36:02 +02:00
Wolfgang Bumiller
2f02e431b0 moving more code to pbs-datastore
prune and fixed/dynamic index

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-07-09 10:40:14 +02:00
Wolfgang Bumiller
c23192d34e move chunk_store to pbs-datastore
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-07-07 14:37:47 +02:00
Wolfgang Bumiller
770a36e53a add pbs-tools subcrate
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-07-06 15:10:37 +02:00
Dominik Csapak
4921a411ad backup/datastore: refactor chunk inode sorting to the datastore
so that we can reuse that information

the removal of the adding to the corrupted list is ok, since
'get_chunks_in_order' returns them at the end of the list
and we do the same if the loading fails later in 'verify_index_chunks'
so we still mark them corrupt
(assuming that the load will fail if the stat does)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-28 12:14:52 +02:00
Dominik Csapak
062cf75cdf proxmox-backup-proxy: fix leftover references on datastore removal
when we remove a datastore via api/cli, the proxy
has sometimes leftover references to that datastore in its
DATASTORE_MAP which includes an open filehandle on the
'.lock' file

this prevents unmounting/exporting the datastore even after removal,
only a reload/restart of the proxy did help

add a command to our command socket, which removes all non
configured datastores from the map, dropping the open filehandle

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2021-06-04 08:22:53 +02:00
Dietmar Maurer
28570d19a6 tape restore: avoid multiple stat calls for same chunk 2021-04-16 13:17:17 +02:00
Dietmar Maurer
1369bcdbba tape restore: verify if all chunks exist 2021-04-16 12:20:44 +02:00
Dominik Csapak
7f394c807b backup/verify: improve speed by sorting chunks by inode
before reading the chunks from disk in the order of the index file,
stat them first and sort them by inode number.

this can have a very positive impact on read speed on spinning disks,
even with the additional stat'ing of the chunks.

memory footprint should be tolerable, for 1_000_000 chunks
we need about ~16MiB of memory (Vec of 64bit position + 64bit inode)
(assuming 4MiB Chunks, such an index would reference 4TiB of data)

two small benchmarks (single spinner, ext4) here showed an improvement from
~430 seconds to ~330 seconds for a 32GiB fixed index
and from
~160 seconds to ~120 seconds for a 10GiB dynamic index

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-04-14 14:39:24 +02:00
Thomas Lamprecht
d1d74c4367 typo fixes all over the place
found and semi-manually replaced by using:
 codespell -L mut -L crate -i 3 -w

Mostly in comments, but also email notification and two occurrences
of misspelled  'reserved' struct member, which where not used and
cargo build did not complain about the change, soo ...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-03-10 16:39:57 +01:00
Dominik Csapak
1399c592d1 garbage_collection: only ignore 'missing chunk' errors
with the fix for #2909 (improving handling missing chunks), we
changed from bailing to warning during a garbage collection when
updating the atime of a chunk.

but, updating the atime can not only fail when the chunk is missing,
but also on other occasions, e.g. no permissions or more importantly,
no space left on the device. in that case, the atime of a valid and used
chunk cannot be updated, and the second sweep of the gc will remove that chunk.
[0] is a real world example of that happening.

instead, only warn on really missin chunks, and bail on all other
errors.

0: https://forum.proxmox.com/threads/pbs-server-full-two-days-later-almost-empty.83274/

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-02-01 09:18:59 +01:00
Fabian Grünbichler
d08cff51a4 rework GC traversal error handling
the error message don't make sense with an empty default

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-01-25 11:41:48 +01:00
Fabian Grünbichler
81b2a87232 clippy: fix Mutex with unused value
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-01-25 11:41:36 +01:00
Fabian Grünbichler
ea368a06cd clippy: misc. fixes
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-01-20 16:23:54 +01:00
Fabian Grünbichler
a6bd669854 clippy: use matches!
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-01-20 16:23:54 +01:00
Fabian Grünbichler
d8d8af9826 clippy: use chars / byte string literals
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-01-20 16:23:54 +01:00
Fabian Grünbichler
4428818412 clippy: remove unnecessary clones
and from::<T>(T)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-01-20 16:22:59 +01:00
Dominik Csapak
179145dc24 backup/datastore: move manifest locking to /run
this fixes the issue that on some filesystems, you cannot recursively
remove a directory when you hold a lock on a file inside (e.g. nfs/cifs)

it is not really backwards compatible (so during an upgrade, there
could be two daemons have the lock), but since the locking was
broken before (see previous patch) it should not really matter
(also it seems very unlikely that someone will trigger this)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-12-03 09:56:42 +01:00
Dominik Csapak
6bd0a00c46 backup/datastore: really lock manifest on delete
'lock_manifest' returns a Result<File, Error> so we always got the result,
even when we did not get the lock, but we acted like we had.

bubble the locking error up

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-12-02 14:37:05 +01:00
Dietmar Maurer
2260f065d4 cleanup: use extra file for StoreProgress 2020-12-01 06:34:33 +01:00
Dietmar Maurer
6eff8dec4f cleanup: remove unnecessary StoreProgress clone() 2020-12-01 06:29:11 +01:00
Fabian Grünbichler
f867ef9c4a progress: add format variants
for iterating over a single group, or iterating just on the group level

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-12-01 06:22:12 +01:00
Fabian Grünbichler
fc8920e35d pull: factor out interpolated progress
and add group/snapshot count info.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-12-01 06:13:11 +01:00
Fabian Grünbichler
844660036b gc: don't limit index listing to same filesystem
WalkDir does not follow symlinks by default anyway, and this behaviour
is not documented anywhere. e.g., if a sysadmin mounts 'extra storage'
for some backup group or type (not knowing that only metadata is stored
in those directories), GC will ignore all the indices contained within
and happily garbage collect their chunks..

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-12-01 06:07:09 +01:00
Fabian Grünbichler
efcac39d34 gc: remove duplicate variable
list_images already returns absolute paths, we don't need to prepend
anything.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-12-01 06:06:51 +01:00
Fabian Grünbichler
cb4b721cb0 gc: log index files found outside of expected scheme
for safety reason, GC finds and marks all index files below the
datastore base path. as a result of regular operations, only index files
within the expected scheme of <TYPE>/<ID>/<TIMESTAMP> should exist.

add a small check + warning if the index list contains index files out
side of this expected scheme, so that an admin with shell access can
investigate.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-12-01 06:06:17 +01:00
Fabian Grünbichler
7956877f14 gc: shorten progress messages
we have messages starting the phases anyway, and limit the number of
progress updates so that context remains available at all times.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-12-01 06:04:13 +01:00
Stefan Reiter
fd19256470 gc: treat .bad files like regular chunks
Simplify the phase 2 code by treating .bad files just like regular
chunks, with the exception of stat logging.

To facilitate, we need to touch .bad files in phase 1. We only do this
under the condition that 1) the original chunk is missing (as before),
and 2) the original chunk is still referenced somewhere (since the code
lives in the error handler for a failed chunk touch, it only gets called
for chunks we expect to be there, i.e. ones that are referenced).

Untouched they will then be cleaned up after 24 hours (or after the last
longer-running task finishes).

Reason 2) is also a fix for .bad files not being cleaned up at all if
the original is no longer referenced anywhere (e.g. a user deleting all
snapshots after seeing some corrupt chunks appear).

cond_touch_path is introduced to touch arbitrary paths in the chunk
store with the same logic as touching chunks.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-11-18 14:04:49 +01:00
Thomas Lamprecht
788d82d9b7 gc: mark_used_chunks: reduce implementation noise
try do reduce some unecessary lines, make match arms more precise so
one can faster see what's actually happening.

Also, avoid
> return Err(format_err!(...))
stuff, just use bail!()

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-11-02 21:08:38 +01:00
Dominik Csapak
2f0b92352d garbage collect: improve index error messages
so that in case of a broken index file, the user knows which it is

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-11-02 20:08:50 +01:00
Fabian Grünbichler
e6dc35acb8 replace Userid with Authid
in most generic places. this is accompanied by a change in
RpcEnvironment to purposefully break existing call sites.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-10-29 15:11:39 +01:00
Thomas Lamprecht
b6563f48ad GC: improve task logs
Make it more clear that removed files are chunks (not indexes or
something like that, user cannot know that we do not touch them here)

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-29 14:47:39 +01:00
Thomas Lamprecht
932390bd46 GC: fix logging leftover bad chunks
fixes commit b4fb262335, which copied
over the "Removed bad files:" block, but only adapted the log text,
not the actual variable.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-29 14:40:29 +01:00
Dietmar Maurer
d6373f3525 garbage_collection: log deduplication factor 2020-10-29 11:13:01 +01:00
Dietmar Maurer
b4fb262335 garbage_collection: log bad chunks (still_bad value) 2020-10-29 10:24:31 +01:00
Dominik Csapak
b683fd589c backup/datastore: save garbage collection status to disk
and load it again when opening it

this way we can persist the status of the last garbage collect across
daemon reloads and reboots

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-10-27 17:41:30 +01:00
Stefan Reiter
0698f78df5 fix #2988: allow verification after finishing a snapshot
To cater to the paranoid, a new datastore-wide setting "verify-new" is
introduced. When set, a verify job will be spawned right after a new
backup is added to the store (only verifying the added snapshot).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-10-20 10:51:13 +02:00
Fabian Grünbichler
115d927c15 unbreak build
and silence warning.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-10-20 09:07:32 +02:00
Stefan Reiter
df729017b4 datastore: cleanup open and load config only once
Force consumers to use the lookup_datastore method instead of
potentially opening a datastore twice, and pass the config we have
already loaded into open_with_path, removing the need for open(1).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-10-20 07:51:05 +02:00
Stefan Reiter
1a374fcfd6 datastore: add manifest locking
Avoid races when updating manifest data by flocking a lock file.
update_manifest is used to ensure updates always happen with the lock
held.

Snapshot deletion also acquires the lock, so it cannot interfere with an
outstanding manifest write.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-10-16 09:34:12 +02:00
Dietmar Maurer
e07620028d mark_used_chunks: simply ignore vanished files
In case a prune operation removed a file in the meantime.
2020-10-16 08:10:46 +02:00
Stefan Reiter
4c0ae82e23 datastore: remove individual snapshots before group
Removing a snapshot has some more safety checks which we don't want to
ignore when removing an entire group (i.e. locking the manifest and
notifying GC).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-10-15 07:51:09 +02:00
Stefan Reiter
883aa6d5a4 datastore: remove load_manifest_json
There's no point in having that as a seperate method, just parse the
thing into a struct and write it back out correctly.

Also makes further changes to the method simpler.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-10-15 07:19:32 +02:00
Stefan Reiter
238a872d1f reader: acquire shared flock on open snapshot
...to avoid it being forgotten or pruned while in use.

Update lock error message for deletions to be consistent.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-10-15 07:09:34 +02:00
Wolfgang Bumiller
8db1468952 more clippy fixups
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2020-10-14 13:58:35 +02:00