Commit Graph

110 Commits

Author SHA1 Message Date
Fabian Grünbichler
7956877f14 gc: shorten progress messages
we have messages starting the phases anyway, and limit the number of
progress updates so that context remains available at all times.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-12-01 06:04:13 +01:00
Stefan Reiter
fd19256470 gc: treat .bad files like regular chunks
Simplify the phase 2 code by treating .bad files just like regular
chunks, with the exception of stat logging.

To facilitate, we need to touch .bad files in phase 1. We only do this
under the condition that 1) the original chunk is missing (as before),
and 2) the original chunk is still referenced somewhere (since the code
lives in the error handler for a failed chunk touch, it only gets called
for chunks we expect to be there, i.e. ones that are referenced).

Untouched they will then be cleaned up after 24 hours (or after the last
longer-running task finishes).

Reason 2) is also a fix for .bad files not being cleaned up at all if
the original is no longer referenced anywhere (e.g. a user deleting all
snapshots after seeing some corrupt chunks appear).

cond_touch_path is introduced to touch arbitrary paths in the chunk
store with the same logic as touching chunks.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-11-18 14:04:49 +01:00
Dominik Csapak
a2285525be backup/datastore: count still bad chunks for the status
we want to show the user that there are still bad chunks after a garbage
collection

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-10-27 17:41:30 +01:00
Wolfgang Bumiller
8db1468952 more clippy fixups
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2020-10-14 13:58:35 +02:00
Wolfgang Bumiller
f6b1d1cc66 don't require WorkerTask in backup/
To untangle the server code from the actual backup
implementation.
It would be ideal if the whole backup/ dir could become its
own crate with minimal dependencies, certainly without
depending on the actual api server. That would then also be
used more easily to create forensic tools for all the data
file types we have in the backup repositories.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2020-10-12 14:11:57 +02:00
Dietmar Maurer
12c65bacf1 src/backup/chunk_store.rs: disable debug output 2020-09-19 15:26:21 +02:00
Dietmar Maurer
14db8b52dc src/backup/chunk_store.rs: use ? insteadf of unwrap 2020-09-10 06:37:37 +02:00
Stefan Reiter
597427afaf clean up .bad file handling in sweep_unused_chunks
Code cleanup, no functional change intended.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-10 06:31:22 +02:00
Stefan Reiter
068e526862 backup: touch all chunks, even if they exist
We need to update the atime of chunk files if they already exist,
otherwise a concurrently running GC could sweep them away.

This is protected with ChunkStore.mutex, so the fstat/unlink does not
race with touching.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-08 12:51:03 +02:00
Stefan Reiter
a9767cf7de gc: remove .bad files on garbage collect
The iterator of get_chunk_iterator is extended with a third parameter
indicating whether the current file is a chunk (false) or a .bad file
(true).

Count their sizes to the total of removed bytes, since it also frees
disk space.

.bad files are only deleted if the corresponding chunk exists, i.e. has
been rewritten. Otherwise we might delete data only marked bad because
of transient errors.

While at it, also clean up and use nix::unistd::unlinkat instead of
unsafe libc calls.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-09-08 12:43:13 +02:00
Dietmar Maurer
8317873c06 gc: improve percentage done logs 2020-09-02 10:04:18 +02:00
Thomas Lamprecht
49a92084a9 gc: use human readable units for summary
and avoid the "percentage done: X %" phrase

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-27 16:06:35 +02:00
Aaron Lauterer
d3d566f7bd GC: use time pre phase1 to calculate min_atime in phase2
Used chunks are marked in phase1 of the garbage collection process by
using the atime property. Each used chunk gets touched so that the atime
gets updated (if older than 24h, see relatime).

Should there ever be a situation in which the phase1 in the GC run needs
a very long time to finish, it could happen that the grace period
calculated in phase2 is not long enough and thus the marking of the
chunks (atime) becomes invalid. This would result in the removal of
needed chunks.

Even though the likelyhood of this happening is very low, using the
timestamp from right before phase1 is started, to calculate the grace
period in phase2 should avoid this situation.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2020-08-04 10:19:05 +02:00
Dietmar Maurer
39f18b30b6 src/backup/data_blob.rs: new load_from_reader(), which verifies the CRC
And make verify_crc private for now. We always call load_from_reader() to
verify the CRC.

Also add load_chunk() to datastore.rs (from chunk_store::read_chunk())
2020-07-28 10:23:16 +02:00
Aaron Lauterer
b96b11cdb7 chunk_store: Fix typo in bail message
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2020-07-21 12:51:41 +02:00
Stoiko Ivanov
c687da9e8e datastore: chown base dir on creation
When creating a new datastore the basedir is only owned by the backup
user if it did not exist beforehand (create_path chowns only if it
creates the directory), and returns false if it did not create the
directory).

This improves the experience when adding a new datastore on a fresh
disk or existing directory (not owned by backup) - backups/pulls can
be run instead of terminating with EPERM.

Tested on my local testinstall with a new disk, and a existing directory:

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2020-07-09 18:20:16 +02:00
Dietmar Maurer
92c3fd2e22 src/backup/chunk_store.rs: allow to read name()
This is helpful for logging ...
2020-06-24 06:54:21 +02:00
Dietmar Maurer
07ce44a633 avoid compiler warnings 2020-05-19 07:03:41 +02:00
Dietmar Maurer
99641a6bbb garbage_collect: call fail_on_abort to abort GV when requested. 2020-05-05 09:06:34 +02:00
Wolfgang Bumiller
f7d4e4b506 switch from failure to anyhow
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2020-04-17 18:43:30 +02:00
Dietmar Maurer
cf459b1982 gc: log pending removals 2020-04-06 09:50:40 +02:00
Dietmar Maurer
a92830dc39 src/api2/types.rs: define and use api type GarbageCollectionStatus 2020-01-23 13:40:12 +01:00
Dietmar Maurer
2585a8a4e2 src/backup/chunk_store.rs: implement cond_touch_chunk()
This will be used by backup sync to test if a chunk already exists.
2020-01-02 13:26:28 +01:00
Dietmar Maurer
645995634a src/api2/config/datastore.rs - create: pass uid and gid instead of User 2019-12-20 09:23:58 +01:00
Dietmar Maurer
e67770d496 src/backup/chunk_store.rs - create: pass User instead of CreateOptions 2019-12-20 09:11:40 +01:00
Wolfgang Bumiller
afdcfb5bc9 let ChunkStore::create take CreateOptions
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2019-12-19 13:14:49 +01:00
Dietmar Maurer
f74a03da1f remove tools::getpwnam_ugid, impl. crate::backup::backup_user()
And use new nix::unistd::User struct.
2019-12-19 10:20:13 +01:00
Dietmar Maurer
7e210bd0b4 src/backup/chunk_store.rs: create lock file with correct owner 2019-12-19 06:55:53 +01:00
Dietmar Maurer
0b97bc6158 src/backup/chunk_store.rs: use proxmox::tools::fs::create_path 2019-12-18 12:26:43 +01:00
Oguz Bektas
14f1e63067 chunk_store: create parent directories
'datastore create storename /path/to/dir/that/may/not/exist' should
work.

Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
2019-12-17 15:39:42 +01:00
Dietmar Maurer
62ee2eb405 avoid some clippy warnings 2019-10-26 11:42:05 +02:00
Dietmar Maurer
4ee8f53d07 remove DataChunk file format - use DataBlob instead 2019-10-06 10:31:06 +02:00
Dietmar Maurer
a24e3993e0 src/backup/chunk_store.rs: coding style fixes 2019-07-04 11:39:10 +02:00
Dietmar Maurer
e4c2fbf170 src/backup/chunk_store.rs: additionally log chunk count 2019-07-04 11:27:11 +02:00
Dietmar Maurer
9850bcdf19 src/backup/chunk_store.rs: improve error reporting 2019-07-04 11:21:54 +02:00
Wolfgang Bumiller
a3f3e91da2 backup/chunk_store: rework chunk iterator
We can now use iter::from_fn() which makes for a much nicer
logic. The only thing better is going to be when we can use
generators with `yield`.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2019-07-04 10:55:17 +02:00
Dietmar Maurer
a57360983b src/backup/chunk_store.rs - get_chunk_iterator: return percentage inside iterator item 2019-07-04 09:26:44 +02:00
Dietmar Maurer
a660978c9a src/backup/datastore.rs: generic index_mark_used_chunks implementation, improve GC stats 2019-07-04 07:57:43 +02:00
Dietmar Maurer
81a6ce6fde src/backup/chunk_store.rs: new method chunk_path()
Returns the absolute path.
2019-06-28 15:48:09 +02:00
Dietmar Maurer
11515438cc Cargo.toml: use serde feature derive 2019-06-18 06:23:25 +02:00
Dietmar Maurer
bffd40d6b7 src/tools.rs: move hex_to_digest and digest_to_hex to proxmox::tools 2019-06-14 11:40:04 +02:00
Dietmar Maurer
f98ac774ee backup: Add support for client side encryption
first try ...
2019-06-13 11:47:23 +02:00
Dietmar Maurer
36898ffce6 src/backup/chunk_stream.rs: add optional chunk_size parameter 2019-05-30 13:28:24 +02:00
Dietmar Maurer
f2b99c34f7 src/api2/admin/datastore.rs: implement API to return last GC status 2019-04-11 12:04:25 +02:00
Dietmar Maurer
92da93b245 abort GC on server shutdown 2019-04-01 12:13:02 +02:00
Dietmar Maurer
11861a482d src/backup/chunk_store.rs: fix GC
Added option to get oldest_writer timestamp from ProcessLocker.
2019-03-31 17:21:36 +02:00
Dietmar Maurer
d85987aeeb fix last commit: the filename var was not ment to be removed, sorry 2019-03-31 16:16:14 +02:00
Dietmar Maurer
15a77c4c2e src/backup/chunk_store.rs: avoid create/unlink race 2019-03-31 10:03:01 +02:00
Dietmar Maurer
43b1303398 datastore: use new ProcessLocker
To make sure only one process runs garbage collection while having active writers.
2019-03-22 09:42:15 +01:00
Dietmar Maurer
141f062e08 src/backup/chunk_store.rs: use zstd compression insteadf of lz4
Provides better compressionm rate, and is still fast.
2019-03-07 11:42:59 +01:00