before reading the chunks from disk in the order of the index file,
stat them first and sort them by inode number.
this can have a very positive impact on read speed on spinning disks,
even with the additional stat'ing of the chunks.
memory footprint should be tolerable; for 1_000_000 chunks
we need about 16MiB of memory (a Vec of 64-bit position + 64-bit inode pairs)
(assuming 4MiB Chunks, such an index would reference 4TiB of data)
two small benchmarks (single spinner, ext4) here showed an improvement from
~430 seconds to ~330 seconds for a 32GiB fixed index
and from
~160 seconds to ~120 seconds for a 10GiB dynamic index
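
the approach in a minimal sketch (the helper name and surrounding code
are illustrative, not the actual implementation):

    use std::fs::File;
    use std::io::Read;
    use std::os::unix::fs::MetadataExt;
    use std::path::PathBuf;

    use anyhow::Error;

    fn read_chunks_in_inode_order(chunk_paths: &[PathBuf]) -> Result<(), Error> {
        // 16 bytes per entry: u64 index position + u64 inode number
        let mut order: Vec<(u64, u64)> = Vec::with_capacity(chunk_paths.len());
        for (pos, path) in chunk_paths.iter().enumerate() {
            order.push((pos as u64, std::fs::metadata(path)?.ino()));
        }
        // inode order approximates the physical layout on disk
        order.sort_unstable_by_key(|&(_, ino)| ino);

        let mut buf = Vec::new();
        for (pos, _ino) in order {
            buf.clear();
            File::open(&chunk_paths[pos as usize])?.read_to_end(&mut buf)?;
            // ... verify/process chunk `pos` here ...
        }
        Ok(())
    }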
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
found and semi-manually replaced by using:
codespell -L mut -L crate -i 3 -w
Mostly in comments, but also in an email notification and two occurrences
of a misspelled 'reserved' struct member, which were not used;
cargo build did not complain about the change, so ...
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
with the fix for #2909 (improving the handling of missing chunks), we
changed from bailing to warning during a garbage collection when
updating the atime of a chunk.
but updating the atime can fail not only when the chunk is missing,
but also for other reasons, e.g. missing permissions or, more importantly,
no space left on the device. in that case, the atime of a valid and used
chunk cannot be updated, and the second sweep of the gc will remove that chunk.
[0] is a real world example of that happening.
instead, only warn on really missing chunks, and bail on all other
errors.
0: https://forum.proxmox.com/threads/pbs-server-full-two-days-later-almost-empty.83274/
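
a sketch of the new behavior, assuming the filetime crate for the
atime update (the real code path differs):

    use std::io;
    use std::path::Path;

    use anyhow::{bail, Error};
    use filetime::{set_file_atime, FileTime};

    fn touch_chunk(path: &Path) -> Result<(), Error> {
        match set_file_atime(path, FileTime::now()) {
            Ok(()) => Ok(()),
            // a missing chunk is worth a warning, but must not abort gc
            Err(err) if err.kind() == io::ErrorKind::NotFound => {
                eprintln!("warning: chunk {:?} is missing", path);
                Ok(())
            }
            // anything else (EACCES, ENOSPC, ...) has to bail, otherwise
            // the second sweep would delete a chunk that is still in use
            Err(err) => bail!("updating atime of {:?} failed - {}", path, err),
        }
    }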
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the error message doesn't make sense with an empty default
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
this fixes the issue that on some filesystems, you cannot recursively
remove a directory when you hold a lock on a file inside (e.g. nfs/cifs)
it is not really backwards compatible (so during an upgrade, there
could be two daemons holding the lock), but since the locking was
broken before (see previous patch) it should not really matter
(also it seems very unlikely that someone will trigger this)
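
a sketch of the failure mode this avoids (flock via the nix crate;
paths and setup are illustrative):

    use std::fs::File;
    use std::os::unix::io::AsRawFd;
    use std::path::Path;

    use anyhow::Error;
    use nix::fcntl::{flock, FlockArg};

    fn remove_locked_dir(dir: &Path) -> Result<(), Error> {
        let lock_file = File::open(dir.join("lock"))?;
        flock(lock_file.as_raw_fd(), FlockArg::LockExclusiveNonblock)?;
        // works on local filesystems, but can fail on nfs/cifs while
        // the locked file inside the directory is still held open
        std::fs::remove_dir_all(dir)?;
        Ok(())
    }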
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
'lock_manifest' returns a Result<File, Error>, so we always got a result,
even when we did not get the lock, but we acted as if we had it.
bubble the locking error up
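
the change essentially boils down to (sketch, arguments elided):

    // before: the Err variant was bound like a successfully taken lock
    let _manifest_lock = lock_manifest(/* ... */);
    // after: a failed lock aborts the operation
    let _manifest_lock = lock_manifest(/* ... */)?;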
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
WalkDir does not follow symlinks by default anyway, and this behaviour
is not documented anywhere. e.g., if a sysadmin mounts 'extra storage'
for some backup group or type (not knowing that only metadata is stored
in those directories), GC will ignore all the indices contained within
and happily garbage collect their chunks..
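
a sketch of the index scan with symlink-following enabled (walkdir
crate; the surrounding collection logic is assumed):

    use std::path::{Path, PathBuf};

    use anyhow::Error;
    use walkdir::WalkDir;

    fn list_index_files(base: &Path) -> Result<Vec<PathBuf>, Error> {
        let mut list = Vec::new();
        // follow symlinks so indices behind symlinked directories
        // (e.g. 'extra storage' mounts) are found as well
        for entry in WalkDir::new(base).follow_links(true) {
            let entry = entry?;
            let path = entry.path();
            if path.extension().map_or(false, |e| e == "fidx" || e == "didx") {
                list.push(path.to_owned());
            }
        }
        Ok(list)
    }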
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
for safety reasons, GC finds and marks all index files below the
datastore base path. as a result of regular operations, only index files
within the expected scheme of <TYPE>/<ID>/<TIMESTAMP> should exist.
add a small check + warning if the index list contains index files
outside of this expected scheme, so that an admin with shell access can
investigate.
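
a sketch of such a check (hypothetical helper, not the actual code):

    use std::path::Path;

    // a well-formed index lives at <TYPE>/<ID>/<TIMESTAMP>/<file>
    // below the datastore base, i.e. exactly four path components
    fn is_expected_index_path(base: &Path, index: &Path) -> bool {
        index
            .strip_prefix(base)
            .map(|rel| rel.components().count() == 4)
            .unwrap_or(false)
    }

anything failing this check gets logged as a warning.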
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
we have messages marking the start of each phase anyway, and limit the
number of progress updates so that context remains available at all times.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Simplify the phase 2 code by treating .bad files just like regular
chunks, with the exception of stat logging.
To facilitate this, we need to touch .bad files in phase 1. We only do
this under the condition that 1) the original chunk is missing (as
before), and 2) the original chunk is still referenced somewhere (since
the code lives in the error handler for a failed chunk touch, it only
gets called for chunks we expect to be there, i.e. ones that are
referenced).
Untouched, they will then be cleaned up after 24 hours (or after the
last longer-running task finishes).
Reason 2) is also a fix for .bad files not being cleaned up at all if
the original is no longer referenced anywhere (e.g. a user deleting all
snapshots after seeing some corrupt chunks appear).
cond_touch_path is introduced to touch arbitrary paths in the chunk
store with the same logic as touching chunks.
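
a self-contained sketch; only the cond_touch_path name comes from this
commit, the filetime-based touch and the counter range are assumptions:

    use std::io;
    use std::path::Path;

    use anyhow::{bail, Error};
    use filetime::{set_file_atime, FileTime};

    // touch an arbitrary path with the same logic as touching chunks:
    // ENOENT is tolerated, any other error aborts
    fn cond_touch_path(path: &Path) -> Result<bool, Error> {
        match set_file_atime(path, FileTime::now()) {
            Ok(()) => Ok(true),
            Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(false),
            Err(err) => bail!("updating atime of {:?} failed - {}", path, err),
        }
    }

    // phase 1 error handler: the referenced chunk itself is missing,
    // so keep any of its .bad files alive for phase 2
    fn touch_bad_files(chunk_path: &Path) -> Result<(), Error> {
        for counter in 0..=9 {
            let mut bad_path = chunk_path.to_path_buf();
            bad_path.set_extension(format!("{}.bad", counter));
            cond_touch_path(&bad_path)?;
        }
        Ok(())
    }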
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
try to reduce some unnecessary lines and make match arms more precise,
so one can see faster what's actually happening.
Also, avoid
> return Err(format_err!(...))
stuff, just use bail!()
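
for example (message text illustrative):

    // before:
    return Err(format_err!("chunk {:?} not found", path));
    // after, which expands to exactly that:
    bail!("chunk {:?} not found", path);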
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
in most generic places. this is accompanied by a change in
RpcEnvironment to purposefully break existing call sites.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Make it clearer that the removed files are chunks (not indexes or
something like that; a user cannot know that we do not touch them here)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
fixes commit b4fb262335, which copied
over the "Removed bad files:" block, but only adapted the log text,
not the actual variable.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
and load it again when opening it
this way we can persist the status of the last garbage collection
across daemon reloads and reboots
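
a minimal sketch; file name, location and the JSON format are
assumptions:

    use std::path::Path;

    use anyhow::Error;
    use serde::{de::DeserializeOwned, Serialize};

    fn save_status<T: Serialize>(base: &Path, status: &T) -> Result<(), Error> {
        std::fs::write(base.join(".gc-status"), serde_json::to_vec(status)?)?;
        Ok(())
    }

    // on open, a missing or unreadable status file simply yields None
    fn load_status<T: DeserializeOwned>(base: &Path) -> Option<T> {
        let data = std::fs::read(base.join(".gc-status")).ok()?;
        serde_json::from_slice(&data).ok()
    }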
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
To cater to the paranoid, a new datastore-wide setting "verify-new" is
introduced. When set, a verify job will be spawned right after a new
backup is added to the store (only verifying the added snapshot).
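
a sketch of the hook after a backup completes; apart from the
verify-new option itself, all names here are stand-ins:

    struct DataStore { verify_new: bool }
    struct BackupDir;

    fn on_backup_added(store: &DataStore, _snapshot: &BackupDir) {
        if store.verify_new {
            // spawn a verify task restricted to the snapshot just
            // added, via the normal verify worker machinery
        }
    }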
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Force consumers to use the lookup_datastore method instead of
potentially opening a datastore twice, and pass the config we have
already loaded into open_with_path, removing the need for open(1).
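
a sketch of the lookup pattern (DataStore is a stand-in; the real
method also deals with configuration changes):

    use std::collections::HashMap;
    use std::sync::{Arc, Mutex, OnceLock};

    struct DataStore; // stand-in for the real type

    fn datastore_map() -> &'static Mutex<HashMap<String, Arc<DataStore>>> {
        static MAP: OnceLock<Mutex<HashMap<String, Arc<DataStore>>>> = OnceLock::new();
        MAP.get_or_init(|| Mutex::new(HashMap::new()))
    }

    // every consumer goes through this map, so each datastore is only
    // opened once per process
    fn lookup_datastore(name: &str) -> Arc<DataStore> {
        let mut map = datastore_map().lock().unwrap();
        map.entry(name.to_string())
            // open_with_path would receive the already-loaded config here
            .or_insert_with(|| Arc::new(DataStore))
            .clone()
    }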
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Avoid races when updating manifest data by flocking a lock file.
update_manifest is used to ensure updates always happen with the lock
held.
Snapshot deletion also acquires the lock, so it cannot interfere with an
outstanding manifest write.
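
a sketch of the flock-based serialization (lock file name and error
text are assumptions):

    use std::fs::{File, OpenOptions};
    use std::os::unix::io::AsRawFd;
    use std::path::Path;

    use anyhow::{format_err, Error};
    use nix::fcntl::{flock, FlockArg};

    fn lock_manifest(snapshot_path: &Path) -> Result<File, Error> {
        let path = snapshot_path.join(".index.json.lck");
        let file = OpenOptions::new().create(true).write(true).open(&path)?;
        flock(file.as_raw_fd(), FlockArg::LockExclusive)
            .map_err(|err| format_err!("unable to lock manifest {:?} - {}", path, err))?;
        // the returned File keeps the lock until dropped; update_manifest
        // reads, modifies and rewrites the manifest while holding it
        Ok(file)
    }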
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Removing a snapshot has some more safety checks which we don't want to
ignore when removing an entire group (i.e. locking the manifest and
notifying GC).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
There's no point in having that as a separate method; just parse the
thing into a struct and write it back out correctly.
Also makes further changes to the method simpler.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
...to avoid it being forgotten or pruned while in use.
Update lock error message for deletions to be consistent.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
To untangle the server code from the actual backup
implementation.
It would be ideal if the whole backup/ dir could become its
own crate with minimal dependencies, certainly without
depending on the actual api server. That could then also be
used more easily to create forensic tools for all the data
file types we have in the backup repositories.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
- remove chrono dependency
- depend on proxmox 0.3.8
- remove epoch_now, epoch_now_u64 and epoch_now_f64
- remove tm_editor (moved to proxmox crate)
- use new helpers from proxmox 0.3.8 (see the usage sketch below)
* epoch_i64 and epoch_f64
* parse_rfc3339
* epoch_to_rfc3339_utc
* strftime_local
- BackupDir changes:
* store epoch and rfc3339 string instead of DateTime
* backup_time_to_string now returns a Result
* remove unnecessary TryFrom<(BackupGroup, i64)> for BackupDir
- DynamicIndexHeader: change ctime to i64
- FixedIndexHeader: change ctime to i64
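
a usage sketch of the new helpers (module path assumed to match
proxmox 0.3.8):

    use anyhow::Error;
    use proxmox::tools::time::{epoch_i64, epoch_to_rfc3339_utc, parse_rfc3339};

    fn example() -> Result<(), Error> {
        let now: i64 = epoch_i64();            // replaces epoch_now_u64()
        let text = epoch_to_rfc3339_utc(now)?; // e.g. "2020-10-28T12:00:00Z"
        let back = parse_rfc3339(&text)?;      // rfc3339 string -> i64 epoch
        assert_eq!(now, back);
        Ok(())
    }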
The iterator of get_chunk_iterator is extended with a third parameter
indicating whether the current file is a chunk (false) or a .bad file
(true).
Count their sizes towards the total of removed bytes, since deleting
them also frees disk space.
.bad files are only deleted if the corresponding chunk exists, i.e. has
been rewritten. Otherwise we might delete data only marked bad because
of transient errors.
While at it, also clean up and use nix::unistd::unlinkat instead of
unsafe libc calls.
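
the unlink then looks like this (sketch; the surrounding iteration is
omitted):

    use std::os::unix::io::RawFd;

    use anyhow::Error;
    use nix::unistd::{unlinkat, UnlinkatFlags};

    // remove a stale .bad file relative to the open chunk-directory
    // fd, replacing the previous unsafe libc call
    fn remove_bad_file(dirfd: RawFd, filename: &str) -> Result<(), Error> {
        unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir)?;
        Ok(())
    }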
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
An flock on the snapshot dir itself is used in addition to the group dir
lock. The lock is used to avoid races with forget and prune, while
having more granularity than the group lock (i.e. the group lock is
necessary to prevent more than one backup per group, but the snapshot
lock still allows snapshots unrelated to the currently running backup
to be forgotten/pruned).
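
a sketch of taking such a lock on the directory itself (helper name
and error text are assumptions):

    use std::fs::File;
    use std::os::unix::io::AsRawFd;
    use std::path::Path;

    use anyhow::{format_err, Error};
    use nix::fcntl::{flock, FlockArg};

    fn lock_dir_noblock(path: &Path, what: &str) -> Result<File, Error> {
        let handle = File::open(path)?;
        flock(handle.as_raw_fd(), FlockArg::LockExclusiveNonblock).map_err(|err| {
            format_err!("unable to acquire lock on {} {:?} - {}", what, path, err)
        })?;
        // the lock is held for as long as the returned File lives
        Ok(handle)
    }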
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>