Commit Graph

562 Commits

Author SHA1 Message Date
Dietmar Maurer 4d431383d3 src/backup/data_blob.rs: expose verify_crc again 2020-09-16 10:43:42 +02:00
Stefan Reiter d10332a15d SnapshotVerifyState: use enum for state
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-15 13:06:04 +02:00
Dietmar Maurer d09db6c2e9 rename BackupDir::new_with_group to BackupDir::with_group 2020-09-15 09:40:03 +02:00
Dietmar Maurer bc871bd19d src/backup/backup_info.rs: new BackupDir::with_rfc3339 2020-09-15 09:34:46 +02:00
Dietmar Maurer 6a7be83efe avoid chrono dependency, depend on proxmox 0.3.8
- remove chrono dependency

- depend on proxmox 0.3.8

- remove epoch_now, epoch_now_u64 and epoch_now_f64

- remove tm_editor (moved to proxmox crate)

- use new helpers from proxmox 0.3.8
  * epoch_i64 and epoch_f64
  * parse_rfc3339
  * epoch_to_rfc3339_utc
  * strftime_local

- BackupDir changes:
  * store epoch and rfc3339 string instead of DateTime
  * backup_time_to_string now return a Result
  * remove unnecessary TryFrom<(BackupGroup, i64)> for BackupDir

- DynamicIndexHeader: change ctime to i64

- FixedIndexHeader: change ctime to i64
2020-09-15 07:12:57 +02:00
Fabian Grünbichler e0e5b4426a BackupDir: make constructor fallible
since converting from i64 epoch timestamp to DateTime is not always
possible. previously, passing invalid backup-time from client to server
(or vice-versa) panicked the corresponding tokio task. now we get proper
error messages including the invalid timestamp.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-09-11 15:49:35 +02:00
Fabian Grünbichler 833eca6d2f use non-panicky timestamp_opt where appropriate
by either printing the original, out-of-range timestamp as-is, or
bailing with a proper error message instead of panicking.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-09-11 15:48:24 +02:00
Fabian Grünbichler 151acf5d96 don't truncate DateTime nanoseconds
where we don't care about them anyway..

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-09-11 15:48:10 +02:00
Fabian Grünbichler 4a363fb4a7 catalog dump: preserve original mtime
even if it can't be handled by chrono. silently replacing it with epoch
0 is confusing..

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-09-11 15:43:54 +02:00
Dietmar Maurer 5656888cc9 verify: fix done count
We need to filter out benchmark group earlier
2020-09-10 09:06:33 +02:00
Dietmar Maurer 5fdc5a6f3d verify: skip benchmark directory 2020-09-10 08:44:18 +02:00
Dietmar Maurer 14db8b52dc src/backup/chunk_store.rs: use ? insteadf of unwrap 2020-09-10 06:37:37 +02:00
Stefan Reiter 597427afaf clean up .bad file handling in sweep_unused_chunks
Code cleanup, no functional change intended.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-10 06:31:22 +02:00
Stefan Reiter 068e526862 backup: touch all chunks, even if they exist
We need to update the atime of chunk files if they already exist,
otherwise a concurrently running GC could sweep them away.

This is protected with ChunkStore.mutex, so the fstat/unlink does not
race with touching.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-08 12:51:03 +02:00
Stefan Reiter a9767cf7de gc: remove .bad files on garbage collect
The iterator of get_chunk_iterator is extended with a third parameter
indicating whether the current file is a chunk (false) or a .bad file
(true).

Count their sizes to the total of removed bytes, since it also frees
disk space.

.bad files are only deleted if the corresponding chunk exists, i.e. has
been rewritten. Otherwise we might delete data only marked bad because
of transient errors.

While at it, also clean up and use nix::unistd::unlinkat instead of
unsafe libc calls.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-08 12:43:13 +02:00
Dietmar Maurer aadcc2815c cleanup rename_corrupted_chunk: avoid duplicate format macro 2020-09-08 12:29:53 +02:00
Stefan Reiter 0f3b7efa84 verify: rename corrupted chunks with .bad extension
This ensures that following backups will always upload the chunk,
thereby replacing it with a correct version again.

Format for renaming is <digest>.<counter>.bad where <counter> is used if
a chunk is found to be bad again before a GC cleans it up.

Care has been taken to deliberately only rename a chunk in conditions
where it is guaranteed to be an error in the chunk itself. Otherwise a
broken index file could lead to an unwanted mass-rename of chunks.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-08 12:20:57 +02:00
Stefan Reiter 7c77e2f94a verify: fix log units
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-08 12:10:19 +02:00
Dietmar Maurer 8317873c06 gc: improve percentage done logs 2020-09-02 10:04:18 +02:00
Dietmar Maurer deef63699e verify: also fail on server shutdown 2020-09-02 09:50:17 +02:00
Dietmar Maurer 63d9aca96f verify: log progress 2020-09-02 07:43:28 +02:00
Dietmar Maurer 4f09d31085 src/backup/verify.rs: use global hashes (instead of per group)
This makes verify more predictable.
2020-09-01 13:33:04 +02:00
Dietmar Maurer 58d73ddb1d src/backup/data_blob.rs: avoid useless &, data is already a reference 2020-09-01 12:56:25 +02:00
Dietmar Maurer 6b809ff59b src/backup/verify.rs: use separate thread to load data 2020-09-01 12:56:25 +02:00
Thomas Lamprecht 49a92084a9 gc: use human readable units for summary
and avoid the "percentage done: X %" phrase

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-27 16:06:35 +02:00
Thomas Lamprecht 3b2046d263 save last verify result in snapshot manifest
Save the state ("ok" or "failed") and the UPID of the respective
verify task. With this we can easily allow to open the relevant task
log and show when the last verify happened.

As we already load the manifest when listing the snapshots, just add
it there directly.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-26 07:35:13 +02:00
Thomas Lamprecht 1ffe030123 various typo fixes
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-25 18:52:31 +02:00
Dietmar Maurer 7ae571e7cb verify: speedup - only verify chunks once
We need to do the check before we load the chunk.
2020-08-25 08:52:24 +02:00
Dietmar Maurer 4264c5023b verify: sort backup groups 2020-08-25 08:38:47 +02:00
Wolfgang Bumiller 3fa2b983c1 add methods to allocate a DynamicIndexHeader
to avoid `map_struct` which is actually unsafe because it
does not verify alignment constraints at all

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2020-08-17 11:50:32 +02:00
Stefan Reiter 8b5f72b176 Revert "backup: ensure base snapshots are still available after backup"
This reverts commit d53fbe2474.

The HashSet and "register" function are unnecessary, as we already know
which backup is the one we need to check: the last one, stored as
'last_backup'.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-08-11 11:03:53 +02:00
Stefan Reiter f23f75433f backup: flock snapshot on backup start
An flock on the snapshot dir itself is used in addition to the group dir
lock. The lock is used to avoid races with forget and prune, while
having more granularity than the group lock (i.e. the group lock is
necessary to prevent more than one backup per group, but the snapshot
lock still allows backups unrelated to the currently running to be
forgotten/pruned).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-08-11 11:02:21 +02:00
Stefan Reiter 6d6b4e72d3 datastore: prevent in-use deletion with locks instead of heuristic
Attempt to lock the backup directory to be deleted, if it works keep the
lock until the deletion is complete. This way we ensure that no other
locking operation (e.g. using a snapshot as base for another backup) can
happen concurrently.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-08-11 11:00:29 +02:00
Dietmar Maurer e434258592 src/backup/backup_info.rs: remove BackupGroup lock()
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-08-11 10:58:35 +02:00
Fabian Grünbichler 882c082369 mark signed manifests as such
for less-confusing display in the web interface

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-08-11 09:56:53 +02:00
Fabian Grünbichler 9a38fa29c2 verify: also check chunk CryptMode
and in-line verify_stored_chunk to avoid double-loading each chunk.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-08-11 09:56:20 +02:00
Fabian Grünbichler 14f6c9cb8b chunk readers: ensure chunk/index CryptMode matches
an encrypted Index should never reference a plain-text chunk, and an
unencrypted Index should never reference an encrypted chunk.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-08-11 09:54:22 +02:00
Wolfgang Bumiller e7cb4dc50d introduce Username, Realm and Userid api types
and begin splitting up types.rs as it has grown quite large
already

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2020-08-10 12:05:01 +02:00
Stefan Reiter 4dbe129284 backup: only allow finished backups as base snapshot
If the datastore holds broken backups for some reason, do not attempt to
base following snapshots on those. This would lead to an error on
/previous, leaving the client no choice but to upload all chunks, even
though there might be potential for incremental savings.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-08-07 07:32:56 +02:00
Oguz Bektas 2f57a433b1 fix #2909: handle missing chunks gracefully in garbage collection
instead of bailing and stopping the entire GC process, warn about the
missing chunks and continue.

this results in "TASK WARNINGS: X" as the status.

Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
2020-08-06 06:36:48 +02:00
Wolfgang Bumiller 98c259b4c1 remove timer and lock functions, fix building with proxmox 0.3.2
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2020-08-04 11:33:02 +02:00
Aaron Lauterer d3d566f7bd GC: use time pre phase1 to calculate min_atime in phase2
Used chunks are marked in phase1 of the garbage collection process by
using the atime property. Each used chunk gets touched so that the atime
gets updated (if older than 24h, see relatime).

Should there ever be a situation in which the phase1 in the GC run needs
a very long time to finish, it could happen that the grace period
calculated in phase2 is not long enough and thus the marking of the
chunks (atime) becomes invalid. This would result in the removal of
needed chunks.

Even though the likelyhood of this happening is very low, using the
timestamp from right before phase1 is started, to calculate the grace
period in phase2 should avoid this situation.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2020-08-04 10:19:05 +02:00
Fabian Grünbichler 8819d1f2f5 blobs: attempt to verify on decode when possible
regular chunks are only decoded when their contents are accessed, in
which case we need to have the key anyway and want to verify the digest.

for blobs we need to verify beforehand, since their checksums are always
calculated based on their raw content, and stored in the manifest.

manifests are also stored as blobs, but don't have a digest in the
traditional sense (they might have a signature covering parts of their
contents, but that is verified already when loading the manifest).

this commit does not cover pull/sync code which copies blobs and chunks
as-is without decoding them.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-08-04 07:27:56 +02:00
Wolfgang Bumiller d9b8e2c795 pxar: better error handling on extract
Errors while applying metadata will not be considered fatal
by default using `pxar extract` unless `--strict` was passed
in which case it'll bail out immediately.

It'll still return an error exit status if something had
failed along the way.

Note that most other errors will still cause it to bail out
(eg. errors creating files, or I/O errors while writing
the contents).

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2020-08-03 09:40:55 +02:00
Dietmar Maurer ff86ef00a7 cleanup: manifest is always CryptMode::None 2020-07-31 10:25:30 +02:00
Dietmar Maurer a4acb6ef84 lock_file: return std::io::Error 2020-07-31 08:53:00 +02:00
Dietmar Maurer e443902583 src/backup/datastore.rs: add helpers to load/store manifest
We want this to modify the manifest "unprotected" data, for example
to add upload statistics, notes, ...
2020-07-31 07:45:47 +02:00
Dietmar Maurer 1fc82c41f2 src/api2/backup.rs: aquire backup lock earlier in create_locked_backup_group() 2020-07-30 11:03:05 +02:00
Dominik Csapak adfdc36936 verify: keep track and log which dirs failed the verification
so that we can print a list at the end of the worker which backups
are corrupt.

this is useful if there are many snapshots and some in between had an
error. Before this patch, the task log simply says to 'look in the logs'
but if the log is very long it makes it hard to see what exactly failed.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-07-30 09:39:37 +02:00
Dominik Csapak d8594d87f1 verify: keep also track of corrupt chunks
so that we do not have to verify a corrupt one multiple times

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-07-30 09:39:37 +02:00