Commit Graph

3899 Commits

Author SHA1 Message Date
Stefan Reiter 8d72c2c32e file-restore: increase RAM for ZFS and disable ARC
Even through best efforts at keeping it small, including the ZFS tools
in the initramfs seems to have exhausted the small overhead we had left
- give it a bit more RAM to compensate.

Also disable the ZFS ARC, as it's no use in such a memory constrained
environment, and we cache on the QEMU/rust layer anyway.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-28 13:58:41 +02:00
Stefan Reiter c48c38ab8c async_lru_cache: fix handling of errors in fetch
The future needs to be removed from the pending map in any case, even if
it returned an error, else all upcoming calls to access this key will
always return the same error.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-06-28 13:48:26 +02:00
Dominik Csapak 3d3769830b tape/helpers/snapshot_reader: sort chunks by inode (per index)
sort the chunks we want to backup to tape by inode, to gain some
speed on spinning disks. this is done per index, not globally.

costs a bit memory, but not too much, about 16 bytes per chunk which
would mean ~4MiB for a 1TiB index with 4MiB chunks.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-28 12:16:14 +02:00
Dominik Csapak 4921a411ad backup/datastore: refactor chunk inode sorting to the datastore
so that we can reuse that information

the removal of the adding to the corrupted list is ok, since
'get_chunks_in_order' returns them at the end of the list
and we do the same if the loading fails later in 'verify_index_chunks'
so we still mark them corrupt
(assuming that the load will fail if the stat does)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-28 12:14:52 +02:00
Dominik Csapak 81c767efce proxmox-backup-manager: show task log on datastore create
since the output:
Result: "<UPID>"
is not really interesting, show instead the task log while
the datastore is creating, since it is now run in a worker

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-28 12:09:55 +02:00
Hannes Laimer 60abf03f05 close #3459: manager: add --ignore-verified and --outdated-after parameters
Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Reviewed-By: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Dominik Csapak <d.csapak@proxmox.com>
2021-06-28 11:03:51 +02:00
Hannes Laimer dcbf29e71b api: add ignore-verified and outdated-after to datastore verify endpoint
preparatory change for fixing #3459

Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Reviewed-By: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Dominik Csapak <d.csapak@proxmox.com>
2021-06-28 11:03:51 +02:00
Hannes Laimer 037e6c0ca8 verify-job: move snapshot filter into function
preparatory steps for fixing #3459

Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
Reviewed-By: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Dominik Csapak <d.csapak@proxmox.com>
2021-06-28 11:03:44 +02:00
Fabian Grünbichler 90ff75f85c update to zstd 0.6
compatible with libzstd from bullseye.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-06-14 13:01:43 +02:00
Dietmar Maurer 2165f0d450 api: define and use REALM_ID_SCHEMA 2021-06-10 11:10:00 +02:00
Wolfgang Bumiller 1e7639bfc4 fixup minimum lru capacity
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-06-08 10:13:46 +02:00
Stefan Reiter 4121628d99 tools/lru_cache: make minimum capacity 1
Setting this to 0 is not just useless, but breaks the logic horribly
enough to cause random segfaults - better forbid this, to avoid someone
else having to debug it again ;)

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 09:42:55 +02:00
Stefan Reiter da78b90f9c backup: remove AsyncIndexReader
superseded by CachedChunkReader, with less code and more speed

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 09:42:46 +02:00
Stefan Reiter 1ef6e8b6a7 replace AsyncIndexReader with SeekableCachedChunkReader
admin/datastore reads linearly only, so no need for cache (capacity of 1
basically means no cache except for the currently active chunk).
mount can do random access too, so cache last 8 chunks for possibly a
mild performance improvement.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 09:42:44 +02:00
Stefan Reiter 10351f7075 backup: add AsyncRead/Seek to CachedChunkReader
Implemented as a seperate struct SeekableCachedChunkReader that contains
the original as an Arc, since the read_at future captures the
CachedChunkReader, which would otherwise not work with the lifetimes
required by AsyncRead. This is also the reason we cannot use a shared
read buffer and have to allocate a new one for every read. It also means
that the struct items required for AsyncRead/Seek do not need to be
included in a regular CachedChunkReader.

This is intended as a replacement for AsyncIndexReader, so we have less
code duplication and can utilize the LRU cache there too (even though
actual request concurrency is not supported in these traits).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 09:42:40 +02:00
Stefan Reiter 70a152deb7 backup: add CachedChunkReader utilizing AsyncLruCache
Provides a fast arbitrary read implementation with full async and
concurrency support.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 09:42:37 +02:00
Stefan Reiter 5446bfbba8 tools: add AsyncLruCache as a wrapper around sync LruCache
Supports concurrent 'access' calls to the same key via a
BroadcastFuture. These are stored in a seperate HashMap, the LruCache
underneath is only modified once a valid value has been retrieved.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 09:42:34 +02:00
Stefan Reiter 400885e620 tools/BroadcastFuture: add testcase for better understanding
Explicitly test that data will stay available and can be retrieved
immediately via listen(), even if the future producing the data and
notifying the consumers was already run in the past.

Wasn't broken or anything, but helps with understanding IMO.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 09:42:29 +02:00
Dominik Csapak f960fc3b6f fix #3433: use PVE's wearout logic in PBS
in PVE, the logic how wearout gets read from the smartctl output was
changed from a vendor -> id map to a sorted list of specific
attribute field names.

copy that list to pbs (in the same order), and use that to get the
wearout

in the future we might want to split the disk logic into its own crate
and reuse it in pve

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-08 08:31:37 +02:00
Dominik Csapak d2354a16cd client/pull: log snapshots that are skipped because of time
we skip snapshots that are older than the newest snapshot of the group in
the target datastore, log it so the user can know why it is not synced

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-07 10:51:25 +02:00
Dominik Csapak 2de4dc3a81 backup/chunk_store: optionally log progress on creation
and enable it for the worker variants

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-04 09:32:09 +02:00
Dietmar Maurer b90036dadd cleanup: factor out config::datastore::lock_config() 2021-06-04 09:04:14 +02:00
Dominik Csapak 4708f4fc21 api2/config/datastore: change create datastore api call to a worker
so that longer running creates (e.g. a slow storage), does not
run in a timeout and we can follow its creation

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2021-06-04 09:02:05 +02:00
Dominik Csapak 062cf75cdf proxmox-backup-proxy: fix leftover references on datastore removal
when we remove a datastore via api/cli, the proxy
has sometimes leftover references to that datastore in its
DATASTORE_MAP which includes an open filehandle on the
'.lock' file

this prevents unmounting/exporting the datastore even after removal,
only a reload/restart of the proxy did help

add a command to our command socket, which removes all non
configured datastores from the map, dropping the open filehandle

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2021-06-04 08:22:53 +02:00
Dominik Csapak e5950360ca tape/drive: improve tape device locking behaviour
by implementing a custom error type that is either 'TimeOut' or
'Other'.

In the api, check in the worker loop for exactly 'TimeOut' errors and continue only
then. All other errors lead to a aborted task.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-02 17:08:00 +02:00
Dominik Csapak 5b358ff0b1 server/prune_job: fix locking during prune jobs
removing the backup dir must acquire the snapshot lock, else it can
happen that we remove a snapshot while it is being restored
or backed up to tape

the original commit that adds the force flag
(c9756b40d1)
mentions that the prune checks itself if the snapshot is in use,
but i could not find such code, so simply set force to false

to avoid failing and aborting the prune job, warn if it could not
and continue

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-06-02 17:04:49 +02:00
Fabian Grünbichler 3420029b5e Revert "file-restore-daemon: work around tokio DuplexStream bug"
This reverts commit 75f9f40922, which is
no longer needed now that we use tokio >= 1.6 which contains the proper
fix.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-06-01 10:31:19 +02:00
Fabian Grünbichler 3e3b505cc8 reorder serde usage/derive
this is deprecated with rustc 1.52+, and will become a hard error at
some point:

https://github.com/rust-lang/rust/issues/79202

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-05-31 14:53:08 +02:00
Dietmar Maurer 0bca966ec5 fix typo: s/dies/does/ 2021-05-31 11:01:15 +02:00
Dominik Csapak 84737fb33f lto/sg_tape/encryption: remove non lto-4 supported byte
from the SspDataEncryptionCapabilityPage

it seems we do not need it, since the EXTDECC flag is only used for
determining if the drive is capable to be configured via
ADI (Automation/Drive Interface) which we do not use at all.

this makes the call work with LTO-4 again

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-05-31 10:58:38 +02:00
Dominik Csapak 03380db560 api2/tape: add api call to list media sets
we want a 'media-set' selector in the gui, this makes it
very easy to do and is not as costly as reusing the media list,
since we do not need to iterate over all media (e.g. unassigned)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-05-26 18:10:57 +02:00
Dominik Csapak c24cb13382 api: node/journal: fix parameter extraction of /nodes/node/journal
by extracting them via the api macro into the function signature

this fixes an issue, where giving 'since' and 'until' where not
used since we tried to extract them as 'str' while they were numbers.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-25 13:26:51 +02:00
Stefan Reiter 3a804a8a20 file-restore-daemon: limit concurrent download calls
While the issue with vsock packets starving kernel memory is mostly
worked around by the '64k -> 4k buffer' patch in
'proxmox-backup-restore-image', let's be safe and also limit the number
of concurrent transfers. 8 downloads per VM seems like a fair value.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-25 11:56:43 +02:00
Stefan Reiter 1fde4167ea file-restore-daemon: watchdog: add inhibit for long downloads
The extract API call may be active for more than the watchdog timeout,
so a simple ping is not enough.

This adds an "inhibit" API, which will stop the watchdog from completing
as long as at least one WatchdogInhibitor instance is alive. Keep one in
the download task, so it will be dropped once it completes (or errors).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-25 11:56:43 +02:00
Stefan Reiter 75f9f40922 file-restore-daemon: work around tokio DuplexStream bug
See this PR for more info: https://github.com/tokio-rs/tokio/pull/3756

As a workaround use a pair of connected unix sockets - this obviously
incurs some overhead, albeit not measureable on my machine. Once tokio
includes the fix we can go back to a DuplexStream for performance and
simplicity.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-25 11:56:43 +02:00
Thomas Lamprecht e9c2638f90 apt: fix removal of non-existant http-proxy config
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-25 11:54:46 +02:00
Oguz Bektas 338c545f85 tasks: fix typos in API description
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
2021-05-25 07:54:57 +02:00
Stefan Reiter e379b4a31c file-restore-daemon: disk: add RawFs bucket type
Used to specify a filesystem placed directly on a disk, without a
partition table inbetween. Detected by simply attempting to mount the
disk itself.

A helper "make_dev_node" is extracted to avoid code duplication.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-25 07:53:22 +02:00
Stefan Reiter 3d7ca2bdb9 file-restore-daemon: disk: allow arbitrary component count per bucket
A bucket might contain multiple (or 0) layers of components in its path
specification, so allow a mapping between bucket type strings and
expected component depth. For partitions, this is 1, as there is only
the partition number layer below the "part" node.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-25 07:53:22 +02:00
Stefan Reiter d34019e246 file-restore-daemon: disk: ignore "invalid fs" error
Mainly just causes log spam, we print a more useful error in the end if
all mounts fail anyway.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-25 07:53:22 +02:00
Thomas Lamprecht 64591e731e api: status: graceful-degrade when a datastore lookup fails
This can happen if the underlying storage failed, in which case we do
not want to fail the whole API call, as it should report the status
of all datastores. So rather add the error inline to the related
store entry and continue.

Allows to nicely visualize those stores in the gui.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-21 13:20:55 +02:00
Thomas Lamprecht 64e0786aa9 api: datastore status: refactor reused rrd get-data code into closure
Nicer and shorter than just using a variable for the common parameters

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-21 13:20:55 +02:00
Thomas Lamprecht 90761f0f62 api: datastore status: code cleanup, reduce indentation level
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-21 13:20:55 +02:00
Wolfgang Bumiller 1d781c5b20 update proxmox-http dependency
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-05-17 11:29:24 +02:00
Fabian Grünbichler 7d2be91bc9 move SimpleHttp to proxmox_http
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-05-17 10:32:33 +02:00
Fabian Grünbichler 578895336a SimpleHttp: factor out product-specific bits
in preparation of moving the abstraction to proxmox_http

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-05-17 10:32:22 +02:00
Fabian Grünbichler 8c090937f5 move tools::http to proxmox_http
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-05-17 10:31:54 +02:00
Fabian Grünbichler 4229633d98 move ProxyConfig to proxmox_http
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-05-17 10:31:27 +02:00
Fabian Grünbichler 3ed7e87538 HttpsConnector: make keepalive configurable
it's the only PBS-specific part in there, so let's make it
product-agnostic before moving it off to proxmox-http.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-05-17 10:31:15 +02:00
Fabian Grünbichler 5b43cc4487 move MaybeTlsStream wrapper to proxmox_http
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-05-17 10:30:05 +02:00