This lock is held during VM startup, so that multiple calls will not
start VMs twice. But this means that the timeout needs to incorporate
the time it might take a VM to boot, so increase it quite a bit.
This could previously lead to "interrupted system call" errors when
accessing backups with many disks.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
if an error occurs, the snapshot dirs will already be created, and we
do not clean them up (some might already be finished).
Warn the user that they are not cleaned up.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
According to crypt(3):
"crypt places its result in a static storage area, which will be
overwritten by subsequent calls to crypt. It is not safe to call crypt
from multiple threads simultaneously."
This means that multiple login calls as a PBS-realm user can collide and
produce intermittent authentication failures. A visible case is for
file-restore, where VMs with many disks lead to just as many auth-calls
at the same time, as the GUI tries to expand each tree element on load.
Instead, use the thread-safe variant 'crypt_r', which places the result
into a pre-allocated buffer of type 'crypt_data'. The C struct is laid
out according to 'lib/crypt.h.in' and the man page mentioned above.
Use the opportunity and make both arguments to the rust 'crypt' function
take a &[u8].
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Some changers do not like the DVCID bit when querying non-drives,
this includes when querying 'all' elements.
To circumvent this, we query each type by itself (like mtx does it),
and only add the DVCID bit for drives (Data Transfer Elements).
Reported by a user in the forum:
https://forum.proxmox.com/threads/ibm-3584-ts3500-support.92291/
and limit to 1000 elements per request.
(Because some changers limit that request with the options we set)
instead of checking if the data len was equal to the allocation_len
for getting more data, we count the returned elements and compare
that with the number we requested
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
New kernel has stricter checks on tmpfs with stick-bit on directories, so some
commands (i.e. proxmox-tape changer status) fails when executed as root, because
permission checks fails when locking the drive.
This patch move the drive locks to /run/proxmox-backup/drive-lock.
Note: This is incompatible to old locking mechmanism, so users may not
run tape backups during update (or running backup can fail).
Stored in atomically-updated 'notes' file in backup group directory.
Available via dedicated GET/PUT API calls, as well as the first line
being included in list_groups (similar to list_snapshots).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
a match expresses the fallback slightly nicer and needs no mut,
which is always nice to avoid.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
modeled like our other section config api calls
two drawbacks of doing it this way:
* we have to copy some api properties again for the update call,
since not all of them are updateable (username-claim)
* we only handle openid for now, which we would have to change
when we add ldap/ad
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
these will be used as parameters/return types for the read/create/etc.
calls for realms
for now we copy the necessary attributes (only from openid) since
our api macros/tools are not good enought to generate the necessary
api definitions for section configs
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
it's not used by the client and not part of the client, it
just makes use *of* the client, but is used on the
datastore/server...
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
So callers get more stable results. Most noticeable, the disk list in
the web UI doesn't jump around upon reloading, and while sorting could
be done directly there, like this other callers get the benefit too.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
in preparation to also get the file system type from lsblk.
Co-developed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
While the PVE one "bails" too, it has an eval around those and moves
the error to the message property, so lets do so too to ensure a user
can force an update on a too old subscription
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
the systemd config/unit parsing stays in pbs for now since
that's not usually required and uses our section config
parser
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
move key_derivation to pbs-datastore
pbs-api-types should only contain "basic" types which
* are usually required by clients
* don't depend on pbs-related code directly
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
These are mostly tokio specific "hacks" or "workarounds" we
only really need/want in our binaries without pulling it in
via our library crates.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
the dns plugin config allow for a specified amount of time to wait for
the TXT record to be set and propagated through DNS.
This patch adds a sleep for this amount of time.
The log message was taken from the perl implementation in proxmox-acme
for consistency.
Tested with the powerdns plugin in my test setup.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
During startup most of the stuff is happening in milliseconds (or
less), so the timestamp granularity of seconds made it hard to tell
if the previous command required 990ms or 1ms, which is quite the
difference in the restore daemon context.
Using micros seems not to bring too much additional information, a
millisecond is already an ok lower time resolution for logging, so
switch only to millis for now.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
fixes file restore again.
The new Memcom tracking file lives in `/run/proxmox-backup` and is
always created on REST interaction, as CachedUserInfo uses it to
efficiently track config changes, and such a cache is used in each
REST handle_request.
Further, the Memcom infra expects the base run PBS dir to exists
already, which is an OK assumption to have, but in the file-restore
daemon we have a significantly more minimal environment, and the run
dir was simply not required there, even /run isn't a tmpfs yet.
Fixes fda19dcc6f ("fix CachedUserInfo by using a shared memory version counter")
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We send it already to the user via the response body, but the
log_response does not has, nor wants to have FWIW, access to the
async body stream, so pass it through the ErrorMessageExtension
mechanism like we do else where.
Note that this is not only useful for PBS API proxy/daemon but also
the REST server of the file-restore daemon running inside the restore
VM, and it really is *very* helpful to debug things there..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Parses JSON output from 'pvs' and 'lvs' LVM utils and does two passes:
one to scan for thinpools and create a device node for their
metadata_lv, and a second to load all LVs, thin-provisioned or not.
Should support every LV-type that LVM supports, as we only parse LVM
tools and use 'vgscan --mknodes' to create device nodes for us.
Produces a two-layer BucketComponent hierarchy with VGs followed by LVs,
PVs are mapped to their respective disk node.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Prefix zpool mount paths to avoid clashing with other mount namespaces
(like LVM).
Also ignore "already-mounted" error and return it as success instead -
as we always assume that a mount path is unique, this is a safe
assumption, as nothing else could have been mounted here.
This fixes an issue where a mountpoint=legacy subvol might be available
on different disks, and thus have different Bucket instances that don't
share the mountpoint cache, which could lead to an error if the user
tried opening it multiple times on different disks.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Grünbichler <f.gruenbichler@proxmox.com>
otherwise the path ends in an array ["foo", "bar"] instead of "foo/bar"
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Grünbichler <f.gruenbichler@proxmox.com>
To support nested BucketComponents, it is necessary to dedup them, as
otherwise two components like:
/foo/bar
/foo/baz
will result in /foo being shown twice at the first hierarchy.
Also make the size property based on index and optional, as for example
/foo in the example above might not have a size, and bar/baz might have
differing sizes.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Grünbichler <f.gruenbichler@proxmox.com>
since it pulls in lots of additional linked libraries for all binaries
compiled as part of proxmox-backup. it can easily be re-enabled with
`--cfg openid` added to the RUSTFLAGS env variable.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
it's not really needed in the config module, and this makes it easier to
disable the proxmox-openid dependency linkage as a stop-gap measure.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
we try to load the correct media in a loop until we find the correct tape.
when encountering an error or wrong tape, we want to log that (and send
an email if one is set) that requests the correct tape.
while trying to avoid printing the same errors more than once in a row,
we had at least one case (starting with an empty tape in the drive)
which would not print/send any tape request.
reworking that code to use a custom 'TapeRequest' enum, which contains
the state + error message, and a helper that prints and sends an email
when the state changes
this reduces the change check/log to a single variable, instead of 4
(tried, last_media_uuid, last_error, failure_reason)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Add test code to the first locate_file command, compute locate_offset.
Subsequent locate_file commands use that offset.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
we have a static list of filesystems and their capabilities regarding
file attributes and fs features (e.g. sockets/fifos/etc) which also
includes xattrs,acls and fcaps
if we did not know a filesystem by its magic number (for example cephfs),
we did not even attempt to read xattrs, etc.
this patch adds those flags by default to unknown filesystems, and
removes them when we encounter EOPNOTSUPP (to remove the number
of syscalls)
with this, we should be able to catch xattrs/acls/fcaps on all
(unknown) fs types that support them
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
These require mounting using the regular 'mount' syscall.
Auto-generates an appropriate mount path.
Note that subvols with mountpoint=none cannot be mounted this way, and
would require setting the mountpoint property, which is not possible as
the zpools have to be imported with readonly=on.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Uses the ZFS utils to detect, import and mount zpools. These are
available as a new Bucket type 'zpool'.
Requires some minor changes to the existing disk and partiton detection
code, so the ZFS-specific part can use the information gathered in the
previous pass to associate drive names with their 'drive-xxxN.img.fidx'
node.
For detecting size, the zpool has to be imported. This is only done with
pools containing 5 or less disks, as anything else might take too long
(and should be seldomly found within VMs).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Even through best efforts at keeping it small, including the ZFS tools
in the initramfs seems to have exhausted the small overhead we had left
- give it a bit more RAM to compensate.
Also disable the ZFS ARC, as it's no use in such a memory constrained
environment, and we cache on the QEMU/rust layer anyway.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The future needs to be removed from the pending map in any case, even if
it returned an error, else all upcoming calls to access this key will
always return the same error.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
sort the chunks we want to backup to tape by inode, to gain some
speed on spinning disks. this is done per index, not globally.
costs a bit memory, but not too much, about 16 bytes per chunk which
would mean ~4MiB for a 1TiB index with 4MiB chunks.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
so that we can reuse that information
the removal of the adding to the corrupted list is ok, since
'get_chunks_in_order' returns them at the end of the list
and we do the same if the loading fails later in 'verify_index_chunks'
so we still mark them corrupt
(assuming that the load will fail if the stat does)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since the output:
Result: "<UPID>"
is not really interesting, show instead the task log while
the datastore is creating, since it is now run in a worker
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Setting this to 0 is not just useless, but breaks the logic horribly
enough to cause random segfaults - better forbid this, to avoid someone
else having to debug it again ;)
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
admin/datastore reads linearly only, so no need for cache (capacity of 1
basically means no cache except for the currently active chunk).
mount can do random access too, so cache last 8 chunks for possibly a
mild performance improvement.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Implemented as a seperate struct SeekableCachedChunkReader that contains
the original as an Arc, since the read_at future captures the
CachedChunkReader, which would otherwise not work with the lifetimes
required by AsyncRead. This is also the reason we cannot use a shared
read buffer and have to allocate a new one for every read. It also means
that the struct items required for AsyncRead/Seek do not need to be
included in a regular CachedChunkReader.
This is intended as a replacement for AsyncIndexReader, so we have less
code duplication and can utilize the LRU cache there too (even though
actual request concurrency is not supported in these traits).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Supports concurrent 'access' calls to the same key via a
BroadcastFuture. These are stored in a seperate HashMap, the LruCache
underneath is only modified once a valid value has been retrieved.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Explicitly test that data will stay available and can be retrieved
immediately via listen(), even if the future producing the data and
notifying the consumers was already run in the past.
Wasn't broken or anything, but helps with understanding IMO.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
in PVE, the logic how wearout gets read from the smartctl output was
changed from a vendor -> id map to a sorted list of specific
attribute field names.
copy that list to pbs (in the same order), and use that to get the
wearout
in the future we might want to split the disk logic into its own crate
and reuse it in pve
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we skip snapshots that are older than the newest snapshot of the group in
the target datastore, log it so the user can know why it is not synced
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
so that longer running creates (e.g. a slow storage), does not
run in a timeout and we can follow its creation
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
when we remove a datastore via api/cli, the proxy
has sometimes leftover references to that datastore in its
DATASTORE_MAP which includes an open filehandle on the
'.lock' file
this prevents unmounting/exporting the datastore even after removal,
only a reload/restart of the proxy did help
add a command to our command socket, which removes all non
configured datastores from the map, dropping the open filehandle
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
by implementing a custom error type that is either 'TimeOut' or
'Other'.
In the api, check in the worker loop for exactly 'TimeOut' errors and continue only
then. All other errors lead to a aborted task.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
removing the backup dir must acquire the snapshot lock, else it can
happen that we remove a snapshot while it is being restored
or backed up to tape
the original commit that adds the force flag
(c9756b40d1)
mentions that the prune checks itself if the snapshot is in use,
but i could not find such code, so simply set force to false
to avoid failing and aborting the prune job, warn if it could not
and continue
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This reverts commit 75f9f40922, which is
no longer needed now that we use tokio >= 1.6 which contains the proper
fix.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this is deprecated with rustc 1.52+, and will become a hard error at
some point:
https://github.com/rust-lang/rust/issues/79202
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
from the SspDataEncryptionCapabilityPage
it seems we do not need it, since the EXTDECC flag is only used for
determining if the drive is capable to be configured via
ADI (Automation/Drive Interface) which we do not use at all.
this makes the call work with LTO-4 again
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we want a 'media-set' selector in the gui, this makes it
very easy to do and is not as costly as reusing the media list,
since we do not need to iterate over all media (e.g. unassigned)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by extracting them via the api macro into the function signature
this fixes an issue, where giving 'since' and 'until' where not
used since we tried to extract them as 'str' while they were numbers.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
While the issue with vsock packets starving kernel memory is mostly
worked around by the '64k -> 4k buffer' patch in
'proxmox-backup-restore-image', let's be safe and also limit the number
of concurrent transfers. 8 downloads per VM seems like a fair value.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The extract API call may be active for more than the watchdog timeout,
so a simple ping is not enough.
This adds an "inhibit" API, which will stop the watchdog from completing
as long as at least one WatchdogInhibitor instance is alive. Keep one in
the download task, so it will be dropped once it completes (or errors).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
See this PR for more info: https://github.com/tokio-rs/tokio/pull/3756
As a workaround use a pair of connected unix sockets - this obviously
incurs some overhead, albeit not measureable on my machine. Once tokio
includes the fix we can go back to a DuplexStream for performance and
simplicity.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Used to specify a filesystem placed directly on a disk, without a
partition table inbetween. Detected by simply attempting to mount the
disk itself.
A helper "make_dev_node" is extracted to avoid code duplication.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A bucket might contain multiple (or 0) layers of components in its path
specification, so allow a mapping between bucket type strings and
expected component depth. For partitions, this is 1, as there is only
the partition number layer below the "part" node.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This can happen if the underlying storage failed, in which case we do
not want to fail the whole API call, as it should report the status
of all datastores. So rather add the error inline to the related
store entry and continue.
Allows to nicely visualize those stores in the gui.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
it's the only PBS-specific part in there, so let's make it
product-agnostic before moving it off to proxmox-http.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
so that a user can delete a whole group at once, until now, the fastest
way for this was to prune to one snapshot, and delete that
code is basically a copy/paste from the snapshot delete, sans
the 'backup-time' parameter
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
so that a user can force a new media set, e.g. if he uses the
allocation policy 'continue', but wants to manually start a new
media-set.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if the account does not exist, error with its name
if file loading fails, the error includes the full path
if the content fails to parse, show file & parse error
and in each case mention that it's about loading the acme account file
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
syncs behavior with both, the displayed state in the PBS
web-interface, and the behavior of PVE/PMG.
Without this a standard setup would result in a Error like:
> TASK ERROR: no acme client configured
which was pretty confusing, as the actual error was something else
(no account configured), and the web-interface showed "default" as
selected account, so a user had no idea what actually was wrong and
how to fix it.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
- refactor the combinators,
- make it take a `&T: Serialize` instead of a Value, and
allow sending the raw string via `send_raw_command`.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
return a result with optional fingerprint instead of tuple, allowing
easy extraction of a meaningful error message.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
if the expected fingerprint and the one returned by the server don't
match, print a warning and allow confirmation and proceeding if running
interactive.
previous:
$ proxmox-backup-client ...
Error: error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1915:
new:
$ proxmox-backup-client ...
WARNING: certificate fingerprint does not match expected fingerprint!
expected: ac:cb:6a:bc:d6:b7:b4:77:3e:17:05:d6:b6:29:dd:1f:05:9c:2b:3a:df:84:3b:4d:f9:06:2c:be:da:06:52:12
fingerprint: ab:cb:6a:bc:d6:b7:b4:77:3e:17:05:d6:b6:29:dd:1f:05:9c:2b:3a:df:84:3b:4d:f9:06:2c:be:da:06:52:12
Are you sure you want to continue connecting? (y/n): n
Error: error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1915:
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this makes it possible to only restore some snapshots from a tape media-set
instead of the whole. If the user selects only a small part, this will
probably be faster (and definitely uses less space on the target
datastores).
the user has to provide a list of snapshots to restore in the form of
'store:type/group/id'
e.g. 'mystore:ct/100/2021-01-01T00:00:00Z'
we achieve this by first restoring the index to a temp dir, retrieving
a list of chunks, and using the catalog, we generate a list of
media/files that we need to (partially) restore.
finally, we copy the snapshots to the correct dir in the datastore,
and clean up the temp dir
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
and create the 'email' and 'restore_owner' variable at the beginning,
so that we can reuse them and do not have to pass the sources of those
through too many functions
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if the reader thread is already gone here, we panic here, resulting in
a nondescript error message, so simply ignore/warn in that case and
return gracefully
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
It may make sense in the future, e.g., if the built-in standalone
type is not enough, e.g., as HTTP**s**, HTTP 2 or even QUIC (HTTP 3)
is wanted in some setups, but for now there's no scenario where one
would profit from adding a new HTTP plugin, especially as it requires
the `data` property to be set, which makes no sense..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we cannot add a plugin with an existing ID so this completion helper
is rather counterproductive...
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
It will be reused in a later patch in another module which should not
depend on the actual API implementation (ugly and cyclic)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
especially for the last group, without this the progress would report:
"percentage done: 100.00% (1 of 2 groups, 1 of 1 group snapshots)"
instead of the more logical
"percentage done: 100.00% (2 of 2 groups)"
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Set PBS_QEMU_DEBUG=1 on a command that starts a VM and then connect to
the debug root shell via:
minicom -D \unix#/run/proxmox-backup/file-restore-serial-10.sock
or similar.
Note that this requires 'proxmox-backup-restore-image-debug' to work,
the postinst script is updated to also generate the corresponding image.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A PCI bus can only support up to 32 devices, so excluding built-in
devices that left us with a maximum of about 25 drives. By adding a new
PCI bridge every 32 devices (starting at bridge ID 2 to avoid conflicts
with automatic bridges), we can theoretically support up to 8096 drives.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The guest kernel requires more memory depending on how many disks are
attached. 256 seems to be enough for basically any reasonable and
unreasonable amount of disks though.
For debug instance, make it 1G, as these are never started automatically
anyway, and need at least 512MB since the initramfs (especially when
including a debug build of the daemon) is substantially bigger.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Helps to clean up a VM that has crashed, is not responding to vsock API
calls, but still has a running QEMU instance.
We always check the process commandline to ensure we don't kill a random
process that took over the PID.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
otherwise, the kernel driver exposes file names as iso 8859-1,
but we want to have them as utf8.
This mapping should always work, since UTF16 can be cleanly converted
to UTF8.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if we are given a 'naked' ipv6 without square brackets around it,
we need to add them ourselves, since the address is ambigious otherwise
when we add the port.
e.g. giving 'fe80::1' as address we arrive at the url (with the default port)
'https://fe80::1:8007/'
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
by checking the 'checked_chunks' before trying to write to disk
and by doing the existance check in the parallel handler. This way,
we do not have to check the existance of a chunk multiple times
(if multiple source datastores gets restored to the same target
datastore) and also we do not have to wait on the stat before reading
the next chunk.
We have to change the &WorkerTask to an Arc though, otherwise we
cannot log to the worker from the parallel handler
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Split out a separate function scan_chunk_archive() for catalog restores.
Note: Required, because we need to optimize restore_chunk_archive() to
write datastore in separate threads (else thape drive will stop during restore)
API like in PVE:
GET .../info => current cert information
POST .../custom => upload custom certificate
DELETE .../custom => delete custom certificate
POST .../acme/certificate => order acme certificate
PUT .../acme/certificate => renew expiring acme cert
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This is the highlevel part using proxmox-acme-rs to create
requests and our hyper code to issue them to the acme
server.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
else we sometimes forget to remove it from the 'params' variable
and use that further, running into 'invalid parameter' errors
found by giving 'output-format' paramter to proxmox-tape status
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by checking for definedness of the label (tapes without barcode
have the empty string as label-text) and falling back to the
source slot for the load action
Note: Changed the load-slot API from PUT to POST
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
This allows mounting XFS partitons with 'dirty' states, like from a
running VM. Otherwise XFS tries to write recovery information, which
fails on a read-only mount.
Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Drive serials have a character limit of 20, longer names like
"drive-virtio0.img.fidx" or "drive-efidisk0.img.fidx" would get cut off.
Fix this by removing the suffix, it is not necessary to uniquely
identify an image.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Some drives will always return the number of bytes given in the
allocation_length field, but correctly report the data len in the mode
sense header. Simply ignore the excess data.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
include the expected and unexpected sizes in the error message,
so that it's easier to debug in case of an error
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
With the vsock-pkt-buffer fix in proxmox-backup-restore-image, we can
use way less memory for the VM without risking any crashes. 128 MiB
seems to be the lowest it will go and still be fully reliable.
While at it, add the "panic=1" argument to the kernel command line, so
in case the kernel *does* run out of memory, it will at least restart
automatically.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Read image sizes (.pxar.fidx/.img.didx) from manifest and partition
sizes from /sys/...
Requires a change to ArchiveEntry, as DirEntryAttribute::Directory
does not have a size associated with it (and that's probably good).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
this way, the api call does not error out when the file is locked
currently (which means that job is running and we do not need
to update the time)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
when a user updates a job schedule, we want to save that point in time
to calculate future runs, otherwise when a user updates a schedule to
a time that would have been between the last run and 'now' the
schedule is triggered instantly
for example:
schedule 08:00
last run today 08:00
now it is 12:00
before this patch:
update schedule to 11:00
-> triggered instantly since we calculate from 08:00
after this patch:
update schedule to 11:00
-> triggered tomorrow 11:00 since we calculate from today 12:00
the change in the enum type is ok, since by default serde does not
error on unknown fields and the new field is optional
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if a backup task failed (e.g. it was aborted), show the snapshots
which were successfully backed up in the notification
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
to make the following cryptic error:
proxmox-file-restore failed: Error: Invalid byte 46, offset 5.
more understandable:
proxmox-file-restore failed: Error: Failed base64-decoding path '/root.pxar.didx' - Invalid byte 46, offset 5.
when a user passes in a non-base64 path but sets `--base64`.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
basically the same as commit eeff085d9d
Will be required once we get to use a newer rustc, at least the
client build for archlinux was broken due to this.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
same functionality as crypto_parameters, except it keeps the file
descriptor passed as "keyfd" open (and seeks to the beginning after
reading), if one is given.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
For the actual partitions and blockdevices in a backup, which the
user sees like folders in the file-restore ui
Encoded as "None", to avoid cluttering DirEntryAttribute, where it
wouldn't make any sense to have.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
These can't be entered or restored anyway, and cause issues with catalog
files for example.
Also a clippy fix.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
in some configurations, samba stores NTFS-ACLs in this xattr[0], so
we should backup (if we can)
altough the 'security' namespace is special (e.g. in use by
selinux, etc.) this value is normally only used by samba and we
should be able to back it up.
to restore it, the user needs at least 'CAP_SYS_ADMIN' rights, otherwise
it cannot be set
0: https://www.samba.org/samba/docs/current/man-html/vfs_acl_xattr.8.html
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Some changer seem to append more data than we expect, but correctly
annotates that size in the subheader.
For each descriptor entry, read as much as the size given in the
subheader (or until the end of the reader), else our position in
the reader is wrong for the next entry, and we will parse
incorrect data.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
only check every 1024'th, which is cheaper to do than a modulo, as we
can just mask the 10 least-significant-bits and check if the result
is zero.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fixes a non-negligible performance regression from commit
7f394c807b
While we skip known-verified chunks in the stat-and-inode-sort loop,
those are only the ones from previous indexes. If there's a repeated
chunk in one index they would get re-verified more often as required.
So, add the check again explicitly to the read+verify loop.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
before reading the chunks from disk in the order of the index file,
stat them first and sort them by inode number.
this can have a very positive impact on read speed on spinning disks,
even with the additional stat'ing of the chunks.
memory footprint should be tolerable, for 1_000_000 chunks
we need about ~16MiB of memory (Vec of 64bit position + 64bit inode)
(assuming 4MiB Chunks, such an index would reference 4TiB of data)
two small benchmarks (single spinner, ext4) here showed an improvement from
~430 seconds to ~330 seconds for a 32GiB fixed index
and from
~160 seconds to ~120 seconds for a 10GiB dynamic index
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
when we get an error from the tape, we possibly want to ignore it,
i.e. when the file was incomplete, but we still want to error
out if the error came from e.g, the datastore, so we have to move
the error checking code to the 'next_chunk' call
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Treat filepaths like "/root.pxar.didx" without a trailing slash as
wanting to download the entire archive content instead of erroring. The
zip-creation code already works fine for this scenario.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
in proxmox-backup-proxy, we log and discard any errors on 'accept',
so that we can continue to server requests
in proxmox-backup-api, we just have the StaticIncoming that accepts,
which will forward any errors from the underlying TcpListener
this patch also logs and discards the errors, like in the proxy.
Otherwise it could happen that if the api-daemon has more files open
than the proxy, it will shut itself down because of a
'too many open files' error if there are many open connections
(the service should also restart on exit i think, but this is
a separate issue)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if a datastore or root is not used directly on the pool dir
(e.g. the installer creates 2 sub datasets ROOT/pbs-1), info in
/proc/self/mountinfo returns not the pool, but the path to the
dataset, which has no iostats itself in /proc/spl/kstat/zfs/
but only the pool itself
so instead of not gathering data at all, gather the info from the
underlying pool instead. if one has multiple datastores on the same
pool those rrd stats will be the same for all those datastores now
(instead of empty) similar to 'normal' directories
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The data on the restore daemon is either encoded into a pxar archive, to
provide the most accurate data for local restore, or encoded directly
into a zip file (or written out unprocessed for files), depending on the
'pxar' argument to the 'extract' API call.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Encodes an entire local directory into an AsyncWrite recursively.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
extract_sub_dir_seq, together with seq_files_extractor, allow extracting
files from a pxar Decoder, along with the existing option for an
Accessor. To facilitate code re-use, some helper functions are extracted
in the process.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Allows listing files and directories on a block device snapshot.
Hierarchy displayed is:
/archive.img.fidx/bucket/component/<path>
e.g.
/drive-scsi0.img.fidx/part/2/etc/passwd
(corresponding to /etc/passwd on the second partition of drive-scsi0)
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Includes methods to start, stop and list QEMU file-restore VMs, as well
as CLI commands do the latter two (start is implicit).
The implementation is abstracted behind the concept of a
"BlockRestoreDriver", so other methods can be implemented later (e.g.
mapping directly to loop devices on the host, using other hypervisors
then QEMU, etc...).
Starting VMs is currently unused but will be needed for further changes.
The design for the QEMU driver uses a locked 'map' file
(/run/proxmox-backup/$UID/restore-vm-map.json) containing a JSON
encoding of currently running VMs. VMs are addressed by a 'name', which
is a systemd-unit encoded combination of repository and snapshot string,
thus uniquely identifying it.
Note that currently you need to run proxmox-file-restore as root to use
this method of restoring.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Includes functionality for scanning and referring to partitions on
attached disks (i.e. snapshot images).
Fairly modular structure, so adding ZFS/LVM/etc... support in the future
should be easy.
The path is encoded as "/disk/bucket/component/path/to/file", e.g.
"/drive-scsi0/part/0/etc/passwd". See the comments for further
explanations on the design.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Add a watchdog that will automatically shut down the VM after 10
minutes, if no API call is received.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Implements the base of a small daemon to run within a file-restore VM.
The binary spawns an API server on a virtio-vsock socket, listening for
connections from the host. This happens mostly manually via the standard
Unix socket API, since tokio/hyper do not have support for vsock built
in. Once we have the accept'ed file descriptor, we can create a
UnixStream and use our tower service implementation for that.
The binary is deliberately not installed in the usual $PATH location,
since it shouldn't be executed on the host by a user anyway.
For now, only the API calls 'status' and 'stop' are implemented, to
demonstrate and test proxmox::api functionality.
Authorization is provided via a custom ApiAuth only checking a header
value against a static /ticket file.
Since the REST server implementation uses the log!() macro, we can
redirect its output to stdout by registering env_logger as the logging
target. env_logger is already in our dependency tree via zstd/bindgen.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This allows switching the base user identification/authentication method
in the rest server. Will initially be used for single file restore VMs,
where authentication is based on a ticket file, not the PBS user
backend (PAM/local).
To avoid putting generic types into the RestServer type for this, we
merge the two calls "extract_auth_data" and "check_auth" into a single
one, which can use whatever type it wants internally.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
For now it only supports 'list' and 'extract' commands for 'pxar.didx'
files. This should be the foundation for a general file-restore
interface that is shared with block-level snapshots.
This is packaged as a seperate .deb file, since for block level restore
it will need to depend on pve-qemu-kvm, which we want to seperate from
proxmox-backup-client.
[original code for proxmox-file-restore.rs]
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[code cleanups/clippy, use helpers::list_dir_content/ArchiveEntry, no
/block subdir for .fidx files, seperate binary and package]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
I made some comparision with bombardier[0], the one listed here are
30s looped requests with two concurrent clients:
[ static download of ext-all.js ]:
lvl avg / stdev / max
none 1.98 MiB 100 % 5.17ms / 1.30ms / 32.38ms
fastest 813.14 KiB 42 % 20.53ms / 2.85ms / 58.71ms
default 626.35 KiB 30 % 39.70ms / 3.98ms / 85.47ms
[ deterministic (pre-defined data), but real API call ]:
lvl avg / stdev / max
none 129.09 KiB 100 % 2.70ms / 471.58us / 26.93ms
fastest 42.12 KiB 33 % 3.47ms / 606.46us / 32.42ms
default 34.82 KiB 27 % 4.28ms / 737.99us / 33.75ms
The reduction is quite better with default, but it's also slower, but
only when testing over unconstrained network. For real world
scenarios where compression actually matters, e.g., when using a
spotty train connection, we will be faster again with better
compression.
A GPRS limited connection (Firefox developer console) requires the
following load (until the DOMContentLoaded event triggered) times:
lvl t x faster
none 9m 18.6s x 1.0
fastest 3m 20.0s x 2.8
default 2m 30.0s x 3.7
So for worst case using sligthly more CPU time on the server has a
tremendous effect on the client load time.
Using a more realistical example and limiting for "Good 2G" gives:
none 1m 1.8s x 1.0
fastest 22.6s x 2.7
default 16.6s x 3.7
16s is somewhat OK, >1m just isn't...
So, use default level to ensure we get bearable load times on
clients, and if we want to improve transmission size AND speed then
we could always use a in-memory cache, only a few MiB would be
required for the compressable static files we server.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
by using our DeflateEncoder
for this to work, we have to create wrapper reader that generates the crc32
checksum while reading.
also we need to put the target writer in an Option, so that we can take
it out of self and move it into the DeflateEncoder while writing
compressed
we can drop the internal buffer then, since that is managed by the
deflate encoder now
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
implements a deflate encoder that can compress anything that implements
AsyncRead + Unpin into a file with the helper 'compress'
if the inner type is a Stream, it implements Stream itself, this way
some streaming data can be streamed compressed
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Previously we did not store GROUP_OBJ ACL entries for
directories, this means that these were lost which may
potentially elevate group permissions if they were masked
before via ACLs, so we also show a warning.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Don't override `group_obj` with `None` when handling
`ACL_TYPE_DEFAULT` entries for directories.
Reproducer: /var/log/journal ends up without a `MASK` type
entry making it invalid as it has `USER` and `GROUP`
entries.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Add a new module containing key-related functions and schemata from all
over, code moved is not changed as much as possible.
Requires adapting some 'use' statements across proxmox-backup-client and
putting the XDG helpers quite cozily into proxmox_client_tools/mod.rs
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Pass in an optional auth tag, which will be passed as an Authorization
header on every subsequent call.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This way we get a better rendering in the api-viewer.
before:
[<string>, ... ]
after:
[(<source>=)?<target>, ... ]
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by changing the 'store' parameter of the restore api call to a
list of mappings (or a single default datastore)
for example giving:
a=b,c=d,e
would restore
datastore 'a' from tape to local datastore 'b'
datastore 'c' from tape to local datastore 'e'
all other datastores to 'e'
this way, only a single datastore can also be restored, by only
giving a single mapping, e.g. 'a=b'
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The text 'had to upload [KMG]iB' implies that this is the size we
actually had to send to the server, while in reality it is the
raw data size before compression.
Count the size of the compressed chunks and print it separately.
Split the average speed into its own line so they do not get too long.
Rename 'uploaded' into 'size_dirty' and 'vsize_h' into 'size'
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
in commit `asyncify pxar create_archive`, we changed from a
separate thread for creating a pxar to using async code, but the
StdChannelWriter used for both pxar and catalog can block, which
may block the tokio runtime for single (and probably dual) core
environments
this patch adds a wrapper struct for any writer that implements
'std::io::Write' and wraps the write calls with 'block_in_place'
so that if called in a tokio runtime, it knows that this code
potentially blocks
Fixes: 6afb60abf5 ("asyncify pxar create_archive")
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This is just an optimization, avoiding to read the catalog into memory.
We also expose create_temporary_database_file() now (will be
used for catalog restore).
- new helper: lock_media_set()
- MediaPool: lock media set
- Expose Inventory::new() to avoid double loading
- do not lock pool on restore (only lock media-set)
- change pool lock name to ".pool-{name}"
so that a user can schedule multiple backup jobs onto a single
media pool without having to consider timing them apart
this makes sense since we can backup multiple datastores onto
the same media-set but can only specify one datastore per backup job
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the default escape handler is handlebars::html_escape, but this are
plain text emails and we manually escape them for the html part, so
set the default escape handler to 'no_escape'
this avoids double html escape for the characters: '&"<>' in emails
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
instead print an error and continue, the rendering functions will error
out if one of the templates could not be registered
if we `.unwrap()` here, it can lead to problems if the templates are
not correct, i.e. we could panic while holding a lock, if something holds
a mutex while this is called for the first time
add a test to catch registration issues during package build
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>