stratis-storage / stratis-docs
Stratis Documentation
Create a document describing the stratis-cli command-line arguments and options.
I see that storaged is mentioned in the stratis-interactions.svg diagram, where it uses the Stratis D-Bus API, but there is no other explanation in the document of how storaged relates to Stratis.
Do I understand correctly that you expect storaged/udisks to implement support for Stratis-managed storage for Cockpit, GNOME, and other components that use storaged? Moreover, since storaged is a daemon that provides a D-Bus API, would it make sense to compare its design with Stratis's?
We have agreed that since devicemapper is heavily used and also authored by the Stratis team, it is a good thing to split it out from the other external use declarations. The style doc should document this decision. cc @mulkieran
API will not be stable for 1.0.
Use explicit comparisons: is None, == [], etc. This way, when an object of an unexpected type arrives, you get a failure at the comparison, not a subtle bug which may hang around for literally years before being discovered. Even if we're not stabilizing the API yet, let's document them for our own benefit for the time being even so.
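The distinction drawn above can be sketched in a few lines of Python (the function and values here are illustrative, not taken from stratis-cli): an explicit is None / == [] check distinguishes cases that a bare truthiness test would conflate.

```python
def handle(result):
    # Explicit checks distinguish "no value" from "empty list";
    # a bare `if not result:` would treat None, [], 0, and "" identically.
    if result is None:
        return "missing"
    if result == []:
        return "empty"
    return "has data"

print(handle(None))   # missing
print(handle([]))     # empty
print(handle([1]))    # has data
```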
XFS doesn't support shrinking at the moment. And in the design document, you state that:
Stratis requires the filesystem used have online resize (or at least online
grow) capabilities
Would it make sense to explain in more detail why the lack of shrinking is not a problem for Stratis? Going through the design document, it seems that Stratis would grow the filesystem automatically, while removing block devices would still be supported.
Seems like "Implementation Details" is better.
One of the limitations of the current LVM2 + XFS or MD + XFS lash-ups is that the block layer cannot talk up the stack to the filesystem, and the filesystem treats the block layer as a dumb pile of storage.
I'm not familiar with btrfs in this respect, but with ZFS, one of the huge advantages you get from a unified filesystem and volume manager is that when all redundant copies of a given block are unreadable or corrupt, it doesn't just mark the pool as degraded and demand a disk swap. Instead, it tells you exactly which file is affected. If you remove or replace that file, the problem instantly goes away, because those damaged blocks are no longer involved with that file.
This is partly thanks to CoW, but independent of it, since even in a filesystem that overwrites blocks in place, that overwrite operation may be able to repair the damage. If it's just soft corruption, simply overwriting the blocks fixes it. If it's on-rust corruption, the drive's sector remapper is likely to swap in a fresh sector when the write occurs, and if not, the filesystem can potentially react to a failed write and mark that sector out in the same way badblocks does.
Will Stratis have this tight interoperation between the filesystem and the block layer, or will the fact that it is built atop several "foreign" code bases mean it will remain unable to diagnose and repair itself in the way ZFS can?
Updated Note: problem is easy to reproduce on dell-r730-011.dsal.lab.eng.rdu2.redhat.com - need to create at least 25 filesystems.
The attached script causes the following on the 3rd filesystem:
stratis filesystem create pool1 24
Execution failed: 5: Command failed: cmd: "/usr/sbin/mkfs.xfs" "-f" "-q" "/dev/dm-16" "-m" "uuid=1cf50752-f0e3-4648-b107-31a3c0761ccb", stdout: stderr: mkfs.xfs: libxfs_device_zero write failed: No space left on device
#!/bin/bash
set -x
set -e

# Reproducer: create one pool, then FS_COUNT filesystems on it.
DEVS1="<list of block devices here>"
FS_COUNT=25
POOL_NAME=pool1

# $DEVS1 is deliberately left unquoted so each device becomes its own argument.
stratis pool create "$POOL_NAME" $DEVS1

for ((fs = 1; fs <= FS_COUNT; fs++)); do
    echo "create $POOL_NAME-fs-$fs"
    stratis filesystem create "$POOL_NAME" "$fs"
done
Problem happens every time on dell-r730-011. Have not reproduced on other systems.
Current plan/goal is to maintain symlinks to filesystems under /dev/stratis/poolname/filesystemname.
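That naming plan can be illustrated with a tiny sketch; the helper function is hypothetical, and only the /dev/stratis/poolname/filesystemname layout comes from the plan stated above.

```python
import os

def stratis_symlink(pool, filesystem):
    # Build the planned symlink path: /dev/stratis/<pool>/<filesystem>.
    return os.path.join("/dev/stratis", pool, filesystem)

print(stratis_symlink("pool1", "fs1"))  # /dev/stratis/pool1/fs1
```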
I infer from your use of XFS that Stratis does not support copy-on-write today, one of the key advantages of ZFS and btrfs relative to the current RHEL storage stack, namely the near-manual LVM2 + MD + XFS lash-up.
I understand that CoW-on-XFS is coming in some unspecified future version of XFS.
I would like the document to clarify this situation:
State that CoW-on-XFS is expected to arrive sometime during the Stratis development plan, and say which versions are expected to coincide. That is, will we have it for Stratis 1.0, 2.0, 3.0...?
If you cannot make such a claim, this should be called out in the document as a known limitation of Stratis relative to ZFS and btrfs.
stratis-storage/stratisd#978 has revealed that the treatment of these sectors is not properly specified.
Properties
- State: listed as a s, but it is actually a q (uint16)
- InitializationTime: documented as a s (string), but it is actually a t (uint64)
- Tier: not documented; currently it is a q

Methods
- SetUserId is actually SetUserInfo
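For reference, the signature codes in the corrections above map to the following concrete types per the D-Bus specification; a small lookup table makes the corrections concrete.

```python
# D-Bus signature codes used in the corrections above.
DBUS_TYPES = {
    "s": "string",
    "q": "uint16",
    "t": "uint64",
}

# State is a q, not an s; InitializationTime is a t, not an s.
print(DBUS_TYPES["q"])  # uint16
print(DBUS_TYPES["t"])  # uint64
```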
We need version 37 for dm event poll support.
10.2.5 Progressively throttling writes: is this implemented?
10.2.7 Will Stratis support changing the passphrase for a snapshot of an encrypted FS?
10.2.8 What if a FS has no idle periods? Will Stratis eventually force an fstrim, or will the fstrim starve? Will dm stats be used for the historical system I/O activity levels?
10.3.4 Should the BDA format include a version field?
Note: It would be good to have more detail on the write sequence, and on the algorithm generally, when multiple devices exist.
10.3.5 Should the MDA JSON format include a version field?
10.3.7 Would it be useful to have a background or periodic task that updates the physical devices' metadata to keep it current? With the current design, when a box comes up, you are required to read every physical disk to ensure you have found the latest metadata, correct?
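The scan described in 10.3.7 might look like the following sketch. The seq field and the dict-based device map are assumptions made for illustration; they are not the actual on-disk MDA format.

```python
def latest_metadata(devices):
    """Read every device's metadata copy and keep the most recent one.

    `devices` maps device name -> metadata dict (or None if the device
    carries no copy); the copy with the highest assumed `seq` wins.
    """
    best = None
    for dev, meta in devices.items():
        if meta is None:
            continue  # device has no metadata copy
        if best is None or meta["seq"] > best["seq"]:
            best = meta
    return best

devs = {
    "sda": {"seq": 7, "pools": ["pool1"]},
    "sdb": {"seq": 9, "pools": ["pool1", "pool2"]},  # most recent copy
    "sdc": None,
}
print(latest_metadata(devs))  # the seq=9 copy from sdb
```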
We should specify the scheme we are using for dm names for all devices we are creating in the design doc.
The "References" section would have to be newly created.
Document the DBus API, since it is not in the design doc any more.
These are badly out of date and will be very hard to maintain.
It might make sense to dump in the current, most up-to-date introspection information, and then perhaps add a little further information about principles, if any.
Demo of this approach now available in docs.
Part I contains a requirements section, but Parts II and III should describe how Stratis works, rather than stating what requirements a solution must meet.
This really just means searching for requirements words like "must" and "should" and replacing them with present-tense wording that describes Stratis's implementation. The document already does this partially, so it's just a matter of fixing a few spots.
The scheduling section at the end of the document worries me. 1.0 looks like it is scheduled for the first release of RHEL 8, simply based on how long it's been since RHEL 7 came out. That section says we won't have any form of RAID until Stratis 2.0. Is this reading correct?
Without at least mirroring RAID, I cannot see Stratis as any kind of replacement for btrfs.
Is there any possibility of moving the RAID bits alone into Stratis 1.0, or 1.1? My main concern is that Stratis-with-RAID be available in some version of RHEL 8, preferably the first version.
Potentially long-running operations, e.g. creating many filesystems in one call: how sure are we that the method will complete before the timeout between the client and service expires? Or is the client expected to set the D-Bus client timeout to infinity?
When you supply a list of things to be created, what happens if some succeed and some fail, e.g. in CreateFileSystems or DestroyFileSystems? Or is the contract all-or-none?
Clarification: if you repeatedly set the name of a resource to the same name, is the result (False, 0, "")?
1.1.2 Typically, in D-Bus calls, an optional argument is provided via a hashmap, so that the existence of a key indicates the argument's presence along with its associated value. If a method instead takes an optional argument positionally, what is the convention for what its value should be when it is to be ignored?
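The hashmap convention referred to in 1.1.2 can be sketched as follows. The method name and the size option are hypothetical, and a real implementation would pass the dict as a D-Bus a{sv}; the point is that a key's presence, not a sentinel value, signals that the option was supplied.

```python
def create_filesystem(pool, name, options=None):
    # Hypothetical sketch of the a{sv} "options" convention for optional
    # D-Bus arguments: presence of a key means the option was supplied.
    options = options or {}
    if "size" in options:           # 'size' is an assumed key for illustration
        size = options["size"]
    else:
        size = 1 << 40              # assumed default: 1 TiB thin filesystem
    return (name, size)

print(create_filesystem("pool1", "fs1"))                     # ('fs1', 1099511627776)
print(create_filesystem("pool1", "fs2", {"size": 2 << 40}))  # ('fs2', 2199023255552)
```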
3.0 Would it be better to be consistent, with the return code always in element 0, the message in element 1, and then the result in element 2 if present? Perhaps I'm missing a benefit of the documented approach.
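The consistent layout proposed above would let every caller unpack replies the same way; a hypothetical helper, with an invented object path purely for illustration:

```python
def unpack(reply):
    # Proposed layout: element 0 is the return code, element 1 the message,
    # element 2 (if present) the result.
    rc, message = reply[0], reply[1]
    result = reply[2] if len(reply) > 2 else None
    return rc, message, result

print(unpack((0, "", "/example/stratis/pool1")))  # (0, '', '/example/stratis/pool1')
print(unpack((1, "pool not found")))              # (1, 'pool not found', None)
```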
After we determine the usage model for snapshots that we'll be implementing, document this in the design doc, along with implementation details.
We are calling many functions that can fail. We should document guidelines for when a failure should result in:
as well as when debug, warn, and critical log messages should be emitted.
When should one leave debug!() and other logging levels in the code? The issue is not just that putting this in the style guide would keep us all consistent, but also coming up with the policy itself.
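One possible shape for such a policy, sketched in Python's logging terms rather than Rust's log macros; the function, conditions, and metadata fields are invented for illustration, not taken from stratisd.

```python
import logging

log = logging.getLogger("stratisd-sketch")

def read_metadata(dev, raw):
    # One possible policy sketch (assumptions, not the project's rules):
    # - debug: expected, recoverable conditions useful only when tracing
    # - warning: something is off but the operation can continue
    # - error: the operation failed and the caller must handle it
    if raw is None:
        log.debug("device %s has no metadata yet; treating as new", dev)
        return None
    if raw.get("version") != 1:
        log.warning("device %s has metadata version %s", dev, raw.get("version"))
    if "pool_uuid" not in raw:
        log.error("device %s metadata lacks pool_uuid; refusing to use it", dev)
        raise ValueError("invalid metadata")
    return raw["pool_uuid"]
```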
Some documents seem to say that it is, and others disagree. It may be a question of the particular version of the spec.
We should remove it from the appendix and use it to actually validate our metadata, somehow, in a test of stratisd.
We've chosen to use LyX to generate our design doc and to-be-written user's guide, but we want to make these easily readable via HTML. One possibility would be to generate HTML output from a commit hook, publish it to our project's GitHub Pages (see https://pages.github.com/), and then point people there from the README.md or wherever.
In moving away from Google Docs, we also have lost its nifty comment-adding feature. We should document what readers should do if they have comments or questions about our docs. (I guess open an issue?)
We have an initial size requirement for meta_dev and data_dev (for the thinpool) and also for the mdv, but I cannot find anything about these in the design doc.
We should have a page on the website that answers common questions about using Stratis, such as why it shows 1 TiB for filesystems in df.
Right now, we know this:
What this leaves open is understanding the probability of getting wrong data when reading metadata during setup.
The documentation talks about the stratisd monitoring daemon but does not talk about SMART at all. Will stratisd functionally replace smartd?
For example, is stratisd expected to make a decision about a rapidly climbing reallocated-sector count on a drive and mark the drive as troubled, maybe even automatically swap in a spare when that functionality lands?
docs/faq/FAQ.md.