The first rotation is normally the one still opened by one or more
processes for writing, so it must NOT be replaced, removed, ..., as
this then makes the remaining logging, until those processes are
noticed that they should reopen the logfile due to rotation, goes
into nirvana, which is far from ideal for a log.
Only rotating (renaming) is OK for this active file, as this does not
invalidates the file and keeps open FDs intact.
So start compressing with the second rotation, which should be clear
to use, as all writers must have been told to reopen the log during
the last rotation, reopen is a fast operation and normally triggered
at least day ago (at least if one did not dropped the state file
manually), so we are fine to archive that one for real.
If we plan to allow faster rotation the whole rotation+reopen should
be locked, so that we can guarantee that all writers switched over,
but this is unlikely to be needed.
Again, this is was logrotate sanely does by default since forever.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this is not the job of logrotate, and the real 20+ years battle
tested log rotate binary does not do so either as it's actually
pretty dangerous.
If we "replace" the file we break any logger which already opened a
new one here, e.g., a dameon starting up, and thus that writer would
log to nirvana.
It's the job of a logger to create a file if not existing, it makes
no sense to do it here.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
To cater to the paranoid, a new datastore-wide setting "verify-new" is
introduced. When set, a verify job will be spawned right after a new
backup is added to the store (only verifying the added snapshot).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Force consumers to use the lookup_datastore method instead of
potentially opening a datastore twice, and pass the config we have
already loaded into open_with_path, removing the need for open(1).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Commit 9070d11f4c introduced this change for other call sites,
assuming it is correct, this one was missed.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
and use that in ApiConfig to avoid that it is owned by root if the
proxmox-backup-api process creates it first.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
for now log auth errors also to the syslog, on a protected (LAN
and/or firewalled) setup this should normally happen due to
missconfiguration, not tries to break in.
This reduces syslog noise *a lot*. A current full journal output from
the current boot here has 72066 lines, of which 71444 (>99% !!) are
"successful auth for user ..." messages
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
reuse the FileLogger module in append mode.
As it implements write, which is not thread safe (mutable self) and
we use it in a async context we need to serialize access using a
mutex.
Try to use the same format we do in pveproxy, namely the one which is
also used in apache or nginx by default.
Use the response extensions to pass up the userid, if we extract it
from a ticket.
The privileged and unprivileged dameons log both to the same file, to
have a unified view, and avoiding the need to handle more log files.
We avoid extra intra-process locking by reusing the fact that a write
smaller than PIPE_BUF (4k on linux) is atomic for files opened with
the 'O_APPEND' flag. For now the logged request path is not yet
guaranteed to be smaller than that, this will be improved in a future
patch.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Add a generous limit now and return the correct error (414 URI Too
Long). Otherwise we could to pretty larger GET requests, 64 KiB and
possible bigger (at 64 KiB my simple curl test failed due to
shell/curl limitations).
For now allow a 3072 characters as combined length of URI path and
query.
This is conform with the HTTP/1.1 RFCs (e.g., RFC 7231, 6.5.12 and
RFC 2616, 3.2.1) which do not specify any limits, upper or lower, but
require that all server accessible resources mus be reachable without
getting 414, which is normally fulfilled as we have various length
limits for stuff which could be in an URI, in place, e.g.:
* user id: max. 64 chars
* datastore: max. 32 chars
The only known problematic API endpoint is the catalog one, used in
the GUI's pxar file browser:
GET /api2/json/admin/datastore/<id>/catalog?..&filepath=<path>
The <path> is the encoded archive path, and can be arbitrary long.
But, this is a flawed design, as even without this new limit one can
easily generate archives which cannot be browsed anymore, as hyper
only accepts requests with max. 64 KiB in the URI.
So rather, we should move that to a GET-as-POST call, which has no
such limitations (and would not need to base32 encode the path).
Note: This change was inspired by adding a request access log, which
profits from such limits as we can then rely on certain atomicity
guarantees when writing requests to the log.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
needs new proxmox dependency to get the RpcEnvironment changes,
adding client_ip getter and setter.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Rewrite most of the documentation to be more readable and correct
(according to the current implementations).
Add a table visualizing all different locks used to synchronize
concurrent operations.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Contains a link to the 'backup' module's doc, as that explains a lot
about the inner workings of PBS and probably marks a good entry point
for new readers.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Avoid races when updating manifest data by flocking a lock file.
update_manifest is used to ensure updates always happen with the lock
held.
Snapshot deletion also acquires the lock, so it cannot interfere with an
outstanding manifest write.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The 'Ok::<_, Self::Error>(res)' type annotation was from a time where
we could not use async, and had a combinator here which needed
explicity type information. We switched over to async in commit
91e4587343 and, as the type annotation
is already included in the Future type, we can safely drop it.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Given the .pxarexclude file
foo
/bar
The following happens:
exclude: /foo
exclude: /bar
exclude: /subdir/foo
include: /subdir/bar
since the `/bar` line is an absolute path
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Removing a snapshot has some more safety checks which we don't want to
ignore when removing an entire group (i.e. locking the manifest and
notifying GC).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
There's no point in having that as a seperate method, just parse the
thing into a struct and write it back out correctly.
Also makes further changes to the method simpler.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
If we can't acquire a lock (either because the snapshot disappeared, it
is about to be forgotten/pruned, or it is currently still running) skip
the snapshot. Hold the lock during verification, so that it cannot be
deleted while we are still verifying.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
...to avoid it being forgotten or pruned while in use.
Update lock error message for deletions to be consistent.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
To allow other reading operations on the base snapshot as well. No
semantic changes with this patch alone, as all other locks on snapshots
are exclusive.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A removal can fail if the snapshot is already gone (this is fine, our
job is done either way) or we couldn't get a lock (also fine, it can't
be removed then, just warn the user so he knows what happened and why it
wasn't removed) - keep going either way.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A snapshot that's currently being read can still appear in the prune
list, but should not be removed.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This adds a change-owner command to proxmox-backup-client,
that allows a caller with datastore modify privileges
to change the owner of a backup-group.
Signed-off-by: Dylan Whyte <d.whyte@proxmox.com>
To untangle the server code from the actual backup
implementation.
It would be ideal if the whole backup/ dir could become its
own crate with minimal dependencies, certainly without
depending on the actual api server. That would then also be
used more easily to create forensic tools for all the data
file types we have in the backup repositories.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Used to not require access to the WorkerTask struct outside
the `server` and `api2` module, so it'll be easier to
separate those backup/server/client parts into separate
crates.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This is only acquired in those two methods, both as shared. So it has
no use.
It seems, that it was planned in the past that the index deletion
should take the exclusive, while read and write takes the shared
flock on the index, as one can guess from the lock comments in commit
0465218953
But then later, in commit c8ec450e37)
the documented semantics where changed to use a temp file and do an
atomic rename instead for atomicity.
The reader shared flock on the index file was done inbetween,
probably as preparatory step, but was not removed again when strategy
was changed to using the file rename instead.
Do so now, to avoid confusion of readers and a useless flock.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
via HTTP2/backup reader protocol. they already could do so via the plain
HTTP download-file/.. API calls that the GUI uses, but the reader
environment required READ permission on the whole datastore instead of
just BACKUP on the backup group itself.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
a reader connection should not be allowed to read arbitrary chunks in
the datastore, but only those that were previously registered by opening
the corresponding index files.
this mechanism is needed to allow unprivileged users (that don't have
full READ permissions on the whole datastore) access to their own
backups via a reader environment.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
not triggered by any current code, but this would lead to a stack
exhaustion since borrow would call deref which would call borrow again..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Previously only Datastore.Modify was required for creating a new
datastore.
But, that endpoint allows one to pass an arbitrary path, of which all
parent directories will be created, this can allow any user with the
"Datastore Admin" role on "/datastores" to do some damage to the
system. Further, it is effectively a side channel for revealing the
systems directory structure through educated guessing and error
handling.
Add a new privilege "Datastore.Allocate" which, for now, is used
specifically for the create datastore API endpoint.
Add it only to the "Admin" role.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
avoiding the need for reshuffling all bits when a new privilege is
added at the start or in the middle of this definition.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
If a fuse_loop instance dies suddenly (e.g. SIGKILL), the FUSE mount and
loop device assignment are left behind. We can determine this scenario
on specific unmap, when the PID file is either missing or contains a PID
of a non-running process, but the backing file and potentially loop
device are still there.
If that's the case, do an "emergency cleanup", by unassigning the
loopdev, calling 'fusermount -u' and then cleaning any leftover files
manually.
With this in place, pretty much any situation is now recoverable via
only the 'proxmox-backup-client' binary, by either calling 'unmap' with
or without parameters.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
On unmap, only report success if the instance we are killing actually
terminates. This is especially important so that cleanup routines can be
assured that /run files are actually cleaned up after calling
cleanup_unused_run_files.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A 'map' call will only clean up what it needs, that is only leftover
files or dangling instances of it's own name.
For a full cleanup the user can call 'unmap' without any arguments.
The 'cleanup on error' behaviour of map_loop is removed. It is no longer
needed (since the next call will clean up anyway), and in fact fixes a
bug where trying to map an image twice would result in an error, but
also cleanup the .pid file of the running instance, causing 'unmap' to
fail afterwards.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
So user doesn't need to remember which loop devices he has mapped to
what.
systemd unit encoding is used to transform a unique identifier for the
mapped image into a suitable name. The files created in /run/pbs-loopdev
will be named accordingly.
The encoding all happens outside fuse_loop.rs, so the fuse_loop module
does not need to care about encodings - it can always assume a name is a
valid filename.
'unmap' without parameter displays all current mappings. It's
autocompletion handler will list the names of all currently mapped
images for easy selection. Unmap by /dev/loopX or loopdev number is
maintained, as those can be distinguished from mapping names.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
the same as the regular TaskState, but without its fields, so that
we can use the api macro and use it as api call parameter
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Allows mapping fixed-index .img files (usually from VM backups) to be
mapped to a local loopback device.
The architecture uses a FUSE-backed temp file mapped to a loopdev:
/dev/loopX -> FUSE /run/pbs-loopdev/xxx -> backup client -> PBS
Since unmapping requires some cleanup (unmap the loopdev, stop FUSE,
remove the temp files) a special 'unmap' command is added, which uses a
PID file to send SIGINT to the backup-client instance started with
'map', which will handle the cleanup itself.
The polling with select! in mount.rs needs to be split in two, since we
have a chicken and egg problem between running FUSE and setting up the
loop device - so we need to do them concurrently, until the loopdev is
assigned, at which point we can report success and daemonize, and then
continue polling the FUSE loop future.
A loopdev module is added to tools containing all required functions for
mapping a loop device to the FUSE file, with the ioctls moved into an
inline module to avoid exposing them directly.
The client code is placed in the 'mount' module, which, while
admittedly a loose fit, allows reuse of the daemonizing code.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
if the archive file does not exist yet, we cannot rotate it, but it's not
actually an error, so just return Ok(false) to indicate no rotation took
place
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
while we probably do not add much more to them, it still looks ugly.
If this was made so that adding a World readable API call is "hard"
and not done by accident, it rather should be done as a test on build
time. But, IMO, the API permission schema definitions are easy to
review, and not often changed/added - so any wrong World readable API
call will normally still caught.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In theory, one can do std::mem::forget, and ignore the drop handler. With
the lifetime hack, this could result in a crash.
So we simply require 'static lifetime now (futures also needs that).
Causes a panic if last_update is smaller than RRD_DATA_ENTRIES*reso,
which (I believe) can happen when inserting the first value for a DB.
Clamp the value to 0 in that case.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Fix a potential bug where errors that happen after the SendHandle has
been dropped while doing the thread join might have been ignored.
Requires internal check_abort to be moved out of 'impl SendHandle' since
we only have the Mutex left, not the SendHandle.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This can slow things down by a lot on setups with (relatively) high
seek time, in the order of doubling the backup times if cache isn't
populated with the last backups chunk inode info.
Effectively there's nothing known this protects us from in the
codebase. The only thing which was theorized about was the case
where a really long running backup job (over 24 hours) is still
running and writing new chunks, not indexed yet anywhere, then an
update (or manual action) triggers a reload of the proxy. There was
some theory that then a GC in the new daemon would not know about the
oldest writer in the old one, and thus use a less strict atime limit
for chunk sweeping - opening up a window for deleting chunks from the
long running backup.
But, this simply cannot happen as we have a per datastore process
wide flock, which is acquired shared by backup jobs and exclusive by
GC. In the same process GC and backup can both get it, as it has a
process locking granularity. If there's an old daemon with a writer,
that also has the lock open shared, and so no GC in the new process
can get exclusive access to it.
So, with that confirmed we have no need for a "half-assed"
verification in the backup finish step. Rather, we plan to add an
opt-in "full verify each backup on finish" option (see #2988)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We forgot to put braces around the DNS_NAME regex, and in
DNS_NAME_OR_IP_REGEX
this is wrong because the regex
^foo|bar$
matches 'foo' at the beginning and 'bar' at the end, so either
foobaz
bazbar
would match. only
^(foo|bar)$
matches only 'foo' and 'bar'
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
* add square brackets to ipv6 adresses in BackupRepository if they not
already have some (we save them without in the remote config)
* in get_pull_parameters, we now create a BackupRepository first and use
those values (which does the [] mapping), this also has the advantage
that we have one place less were we hardcode 8007 as port
* in the ui, add square brackets for ipv6 adresses for remotes
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
when upgrading from a version where we stored all tasks in the 'active' file,
we did not completly account for finished tasks still there
we should update the file when encountering any finished task in
'active' as well as filter them out on the api call (if they get through)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>