sort the chunks we want to backup to tape by inode, to gain some
speed on spinning disks. this is done per index, not globally.
costs a bit memory, but not too much, about 16 bytes per chunk which
would mean ~4MiB for a 1TiB index with 4MiB chunks.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
so that we can reuse that information
the removal of the adding to the corrupted list is ok, since
'get_chunks_in_order' returns them at the end of the list
and we do the same if the loading fails later in 'verify_index_chunks'
so we still mark them corrupt
(assuming that the load will fail if the stat does)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since the output:
Result: "<UPID>"
is not really interesting, show instead the task log while
the datastore is creating, since it is now run in a worker
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Setting this to 0 is not just useless, but breaks the logic horribly
enough to cause random segfaults - better forbid this, to avoid someone
else having to debug it again ;)
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
admin/datastore reads linearly only, so no need for cache (capacity of 1
basically means no cache except for the currently active chunk).
mount can do random access too, so cache last 8 chunks for possibly a
mild performance improvement.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Implemented as a seperate struct SeekableCachedChunkReader that contains
the original as an Arc, since the read_at future captures the
CachedChunkReader, which would otherwise not work with the lifetimes
required by AsyncRead. This is also the reason we cannot use a shared
read buffer and have to allocate a new one for every read. It also means
that the struct items required for AsyncRead/Seek do not need to be
included in a regular CachedChunkReader.
This is intended as a replacement for AsyncIndexReader, so we have less
code duplication and can utilize the LRU cache there too (even though
actual request concurrency is not supported in these traits).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Supports concurrent 'access' calls to the same key via a
BroadcastFuture. These are stored in a seperate HashMap, the LruCache
underneath is only modified once a valid value has been retrieved.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Explicitly test that data will stay available and can be retrieved
immediately via listen(), even if the future producing the data and
notifying the consumers was already run in the past.
Wasn't broken or anything, but helps with understanding IMO.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
in PVE, the logic how wearout gets read from the smartctl output was
changed from a vendor -> id map to a sorted list of specific
attribute field names.
copy that list to pbs (in the same order), and use that to get the
wearout
in the future we might want to split the disk logic into its own crate
and reuse it in pve
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we skip snapshots that are older than the newest snapshot of the group in
the target datastore, log it so the user can know why it is not synced
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
so that longer running creates (e.g. a slow storage), does not
run in a timeout and we can follow its creation
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
when we remove a datastore via api/cli, the proxy
has sometimes leftover references to that datastore in its
DATASTORE_MAP which includes an open filehandle on the
'.lock' file
this prevents unmounting/exporting the datastore even after removal,
only a reload/restart of the proxy did help
add a command to our command socket, which removes all non
configured datastores from the map, dropping the open filehandle
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
by implementing a custom error type that is either 'TimeOut' or
'Other'.
In the api, check in the worker loop for exactly 'TimeOut' errors and continue only
then. All other errors lead to a aborted task.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
removing the backup dir must acquire the snapshot lock, else it can
happen that we remove a snapshot while it is being restored
or backed up to tape
the original commit that adds the force flag
(c9756b40d1)
mentions that the prune checks itself if the snapshot is in use,
but i could not find such code, so simply set force to false
to avoid failing and aborting the prune job, warn if it could not
and continue
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This reverts commit 75f9f40922, which is
no longer needed now that we use tokio >= 1.6 which contains the proper
fix.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this is deprecated with rustc 1.52+, and will become a hard error at
some point:
https://github.com/rust-lang/rust/issues/79202
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
from the SspDataEncryptionCapabilityPage
it seems we do not need it, since the EXTDECC flag is only used for
determining if the drive is capable to be configured via
ADI (Automation/Drive Interface) which we do not use at all.
this makes the call work with LTO-4 again
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we want a 'media-set' selector in the gui, this makes it
very easy to do and is not as costly as reusing the media list,
since we do not need to iterate over all media (e.g. unassigned)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by extracting them via the api macro into the function signature
this fixes an issue, where giving 'since' and 'until' where not
used since we tried to extract them as 'str' while they were numbers.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
While the issue with vsock packets starving kernel memory is mostly
worked around by the '64k -> 4k buffer' patch in
'proxmox-backup-restore-image', let's be safe and also limit the number
of concurrent transfers. 8 downloads per VM seems like a fair value.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The extract API call may be active for more than the watchdog timeout,
so a simple ping is not enough.
This adds an "inhibit" API, which will stop the watchdog from completing
as long as at least one WatchdogInhibitor instance is alive. Keep one in
the download task, so it will be dropped once it completes (or errors).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
See this PR for more info: https://github.com/tokio-rs/tokio/pull/3756
As a workaround use a pair of connected unix sockets - this obviously
incurs some overhead, albeit not measureable on my machine. Once tokio
includes the fix we can go back to a DuplexStream for performance and
simplicity.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Used to specify a filesystem placed directly on a disk, without a
partition table inbetween. Detected by simply attempting to mount the
disk itself.
A helper "make_dev_node" is extracted to avoid code duplication.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A bucket might contain multiple (or 0) layers of components in its path
specification, so allow a mapping between bucket type strings and
expected component depth. For partitions, this is 1, as there is only
the partition number layer below the "part" node.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This can happen if the underlying storage failed, in which case we do
not want to fail the whole API call, as it should report the status
of all datastores. So rather add the error inline to the related
store entry and continue.
Allows to nicely visualize those stores in the gui.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
it's the only PBS-specific part in there, so let's make it
product-agnostic before moving it off to proxmox-http.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
so that a user can delete a whole group at once, until now, the fastest
way for this was to prune to one snapshot, and delete that
code is basically a copy/paste from the snapshot delete, sans
the 'backup-time' parameter
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
so that a user can force a new media set, e.g. if he uses the
allocation policy 'continue', but wants to manually start a new
media-set.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if the account does not exist, error with its name
if file loading fails, the error includes the full path
if the content fails to parse, show file & parse error
and in each case mention that it's about loading the acme account file
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
syncs behavior with both, the displayed state in the PBS
web-interface, and the behavior of PVE/PMG.
Without this a standard setup would result in a Error like:
> TASK ERROR: no acme client configured
which was pretty confusing, as the actual error was something else
(no account configured), and the web-interface showed "default" as
selected account, so a user had no idea what actually was wrong and
how to fix it.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
- refactor the combinators,
- make it take a `&T: Serialize` instead of a Value, and
allow sending the raw string via `send_raw_command`.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
return a result with optional fingerprint instead of tuple, allowing
easy extraction of a meaningful error message.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
if the expected fingerprint and the one returned by the server don't
match, print a warning and allow confirmation and proceeding if running
interactive.
previous:
$ proxmox-backup-client ...
Error: error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1915:
new:
$ proxmox-backup-client ...
WARNING: certificate fingerprint does not match expected fingerprint!
expected: ac:cb:6a:bc:d6:b7:b4:77:3e:17:05:d6:b6:29:dd:1f:05:9c:2b:3a:df:84:3b:4d:f9:06:2c:be:da:06:52:12
fingerprint: ab:cb:6a:bc:d6:b7:b4:77:3e:17:05:d6:b6:29:dd:1f:05:9c:2b:3a:df:84:3b:4d:f9:06:2c:be:da:06:52:12
Are you sure you want to continue connecting? (y/n): n
Error: error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1915:
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this makes it possible to only restore some snapshots from a tape media-set
instead of the whole. If the user selects only a small part, this will
probably be faster (and definitely uses less space on the target
datastores).
the user has to provide a list of snapshots to restore in the form of
'store:type/group/id'
e.g. 'mystore:ct/100/2021-01-01T00:00:00Z'
we achieve this by first restoring the index to a temp dir, retrieving
a list of chunks, and using the catalog, we generate a list of
media/files that we need to (partially) restore.
finally, we copy the snapshots to the correct dir in the datastore,
and clean up the temp dir
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
and create the 'email' and 'restore_owner' variable at the beginning,
so that we can reuse them and do not have to pass the sources of those
through too many functions
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if the reader thread is already gone here, we panic here, resulting in
a nondescript error message, so simply ignore/warn in that case and
return gracefully
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
It may make sense in the future, e.g., if the built-in standalone
type is not enough, e.g., as HTTP**s**, HTTP 2 or even QUIC (HTTP 3)
is wanted in some setups, but for now there's no scenario where one
would profit from adding a new HTTP plugin, especially as it requires
the `data` property to be set, which makes no sense..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we cannot add a plugin with an existing ID so this completion helper
is rather counterproductive...
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
It will be reused in a later patch in another module which should not
depend on the actual API implementation (ugly and cyclic)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
especially for the last group, without this the progress would report:
"percentage done: 100.00% (1 of 2 groups, 1 of 1 group snapshots)"
instead of the more logical
"percentage done: 100.00% (2 of 2 groups)"
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Set PBS_QEMU_DEBUG=1 on a command that starts a VM and then connect to
the debug root shell via:
minicom -D \unix#/run/proxmox-backup/file-restore-serial-10.sock
or similar.
Note that this requires 'proxmox-backup-restore-image-debug' to work,
the postinst script is updated to also generate the corresponding image.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A PCI bus can only support up to 32 devices, so excluding built-in
devices that left us with a maximum of about 25 drives. By adding a new
PCI bridge every 32 devices (starting at bridge ID 2 to avoid conflicts
with automatic bridges), we can theoretically support up to 8096 drives.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The guest kernel requires more memory depending on how many disks are
attached. 256 seems to be enough for basically any reasonable and
unreasonable amount of disks though.
For debug instance, make it 1G, as these are never started automatically
anyway, and need at least 512MB since the initramfs (especially when
including a debug build of the daemon) is substantially bigger.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Helps to clean up a VM that has crashed, is not responding to vsock API
calls, but still has a running QEMU instance.
We always check the process commandline to ensure we don't kill a random
process that took over the PID.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
otherwise, the kernel driver exposes file names as iso 8859-1,
but we want to have them as utf8.
This mapping should always work, since UTF16 can be cleanly converted
to UTF8.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if we are given a 'naked' ipv6 without square brackets around it,
we need to add them ourselves, since the address is ambigious otherwise
when we add the port.
e.g. giving 'fe80::1' as address we arrive at the url (with the default port)
'https://fe80::1:8007/'
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
by checking the 'checked_chunks' before trying to write to disk
and by doing the existance check in the parallel handler. This way,
we do not have to check the existance of a chunk multiple times
(if multiple source datastores gets restored to the same target
datastore) and also we do not have to wait on the stat before reading
the next chunk.
We have to change the &WorkerTask to an Arc though, otherwise we
cannot log to the worker from the parallel handler
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Split out a separate function scan_chunk_archive() for catalog restores.
Note: Required, because we need to optimize restore_chunk_archive() to
write datastore in separate threads (else thape drive will stop during restore)
API like in PVE:
GET .../info => current cert information
POST .../custom => upload custom certificate
DELETE .../custom => delete custom certificate
POST .../acme/certificate => order acme certificate
PUT .../acme/certificate => renew expiring acme cert
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This is the highlevel part using proxmox-acme-rs to create
requests and our hyper code to issue them to the acme
server.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
else we sometimes forget to remove it from the 'params' variable
and use that further, running into 'invalid parameter' errors
found by giving 'output-format' paramter to proxmox-tape status
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by checking for definedness of the label (tapes without barcode
have the empty string as label-text) and falling back to the
source slot for the load action
Note: Changed the load-slot API from PUT to POST
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
This allows mounting XFS partitons with 'dirty' states, like from a
running VM. Otherwise XFS tries to write recovery information, which
fails on a read-only mount.
Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>