Re: Slowly, slowly
> Linux is getting the tech from Solaris: ZFS, DTrace, SMF (systemd), Containers (dockers), etc etc etc
ZFS on Linux is based on OpenZFS, which is years behind what Solaris ZFS now provides.
btrfs in the latest kernel, 4.9-rc2, oopses on my laptop within the first few seconds after mounting the root fs, with possible recursive locking detected:
[ 19.847017] =============================================
[ 19.848461] [ INFO: possible recursive locking detected ]
[ 19.849904] 4.9.0-0.rc2.git2.1.fc26.x86_64 #1 Not tainted
[ 19.851360] ---------------------------------------------
[ 19.852823] systemd-journal/491 is trying to acquire lock:
[ 19.854283] (
[ 19.854298] &ei->log_mutex
[ 19.855740] ){+.+...}
[ 19.855751] , at:
[ 19.857234] [<ffffffffc0594372>] btrfs_log_inode+0x162/0x1190 [btrfs]
[ 19.858627]
but task is already holding lock:
[ 19.861547] (
[ 19.861562] &ei->log_mutex
[ 19.863042] ){+.+...}
[ 19.863053] , at:
[ 19.864566] [<ffffffffc0594372>] btrfs_log_inode+0x162/0x1190 [btrfs]
[ 19.866079]
other info that might help us debug this:
[ 19.869067] Possible unsafe locking scenario:
[ 19.871622] CPU0
[ 19.873098] ----
[ 19.874558] lock(
[ 19.874575] &ei->log_mutex
[ 19.876023] );
[ 19.877432] lock(
[ 19.877449] &ei->log_mutex
[ 19.878900] );
[ 19.880319]
*** DEADLOCK ***
[ 19.884367] May be due to missing lock nesting notation
[ 19.887001] 3 locks held by systemd-journal/491:
[ 19.888316] #0:
[ 19.888332] (
[ 19.889627] &sb->s_type->i_mutex_key
[ 19.889644] #12
[ 19.890938] ){+.+.+.}
[ 19.890948] , at:
[ 19.892278] [<ffffffffc0562da3>] btrfs_sync_file+0x163/0x4c0 [btrfs]
[ 19.893624] #1:
[ 19.893640] (
[ 19.894959] sb_internal
[ 19.894970] ){.+.+.+}
[ 19.896274] , at:
[ 19.896307] [<ffffffffc05499f6>] start_transaction+0x2f6/0x530 [btrfs]
[ 19.897660] #2:
[ 19.897676] (
[ 19.899034] &ei->log_mutex
[ 19.899046] ){+.+...}
[ 19.900396] , at:
[ 19.900431] [<ffffffffc0594372>] btrfs_log_inode+0x162/0x1190 [btrfs]
[ 19.901819]
stack backtrace:
[ 19.904602] CPU: 2 PID: 491 Comm: systemd-journal Not tainted 4.9.0-0.rc2.git2.1.fc26.x86_64 #1
[ 19.906037] Hardware name: Sony Corporation VPCSB2M9E/VAIO, BIOS R2087H4 06/15/2012
[ 19.907493] ffffaf42c165b820 ffffffffb746cb43 ffffffffb8be9350 ffff9ff58ada0000
[ 19.908930] ffffaf42c165b8e8 ffffffffb7111cbe 0000000000000002 ffffffff00000003
[ 19.910363] 00000000ca125543 ffffffffb84bf600 a2c5096b94a6fc27 ffff9ff58ada0ca8
[ 19.911768] Call Trace:
[ 19.913071] [<ffffffffb746cb43>] dump_stack+0x86/0xc3
[ 19.914393] [<ffffffffb7111cbe>] __lock_acquire+0x78e/0x1290
[ 19.915713] [<ffffffffb70ece67>] ? sched_clock_cpu+0xa7/0xc0
[ 19.917019] [<ffffffffb7907a5e>] ? mutex_unlock+0xe/0x10
[ 19.918318] [<ffffffffb7112c26>] lock_acquire+0xf6/0x1f0
[ 19.919662] [<ffffffffc0594372>] ? btrfs_log_inode+0x162/0x1190 [btrfs]
[ 19.920966] [<ffffffffb7906de6>] mutex_lock_nested+0x86/0x3f0
[ 19.922289] [<ffffffffc0594372>] ? btrfs_log_inode+0x162/0x1190 [btrfs]
[ 19.923613] [<ffffffffc05aaa85>] ? __btrfs_release_delayed_node+0x75/0x1c0 [btrfs]
[ 19.924936] [<ffffffffc0594372>] ? btrfs_log_inode+0x162/0x1190 [btrfs]
[ 19.926268] [<ffffffffc05ac919>] ? btrfs_commit_inode_delayed_inode+0xe9/0x130 [btrfs]
[ 19.927620] [<ffffffffc0594372>] btrfs_log_inode+0x162/0x1190 [btrfs]
[ 19.928951] [<ffffffffb70e0f8a>] ? __might_sleep+0x4a/0x80
[ 19.930290] [<ffffffffc0594f28>] btrfs_log_inode+0xd18/0x1190 [btrfs]
[ 19.931607] [<ffffffffb7037de9>] ? sched_clock+0x9/0x10
[ 19.932940] [<ffffffffc0595670>] btrfs_log_inode_parent+0x240/0x940 [btrfs]
[ 19.934264] [<ffffffffb72c6279>] ? dget_parent+0x99/0x2a0
[ 19.935628] [<ffffffffc0596d52>] btrfs_log_dentry_safe+0x62/0x80 [btrfs]
[ 19.937010] [<ffffffffc0562f52>] btrfs_sync_file+0x312/0x4c0 [btrfs]
[ 19.938385] [<ffffffffb72e70eb>] vfs_fsync_range+0x4b/0xb0
[ 19.939768] [<ffffffffb72e71ad>] do_fsync+0x3d/0x70
[ 19.941092] [<ffffffffb72e7470>] SyS_fsync+0x10/0x20
[ 19.942251] [<ffffffffb7003eec>] do_syscall_64+0x6c/0x1f0
[ 19.943665] [<ffffffffb790b949>] entry_SYSCALL64_slow_path+0x25/0x25
The problem started about two months ago (https://bugzilla.redhat.com/show_bug.cgi?id=1366869) and it seems no one in the Linux community is interested in sorting it out.
Nevertheless, btrfs is not even close to ZFS because it does not use free lists.
DTrace and OBP?
Comparing those two is kind of a joke.
- Injected OBP code does not have basic sanity checks, so it is quite easy to hang the whole system
- There are no clearly defined providers, so forget about building a library of OBP scripts that will still be usable over the next few years
- The namespace of tracepoints is messy. Every subsystem has its own convention. Sometimes tracepoint names end in begin/end, sometimes in enter/exit, and sometimes begin/end is in the middle. It looks like no one controls this or rejects random new naming conventions. Some people have started forming hierarchies in provider names with slashes :-0
# perf list | awk -F: '{print $1}' | sort | uniq | wc -l
160
So this is roughly the number of perf "providers".
And on top of that:
# perf list | wc -l
1743
That is only a bit less than 2k tracepoints, while a typical Solaris system now has 80-90k available.
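The two counts above take every line of `perf list`, not just tracepoints. A rough sketch of how to narrow that down and tally the naming mess, assuming `perf list` prints tracepoints as `subsystem:event` followed by a `[Tracepoint event]` tag (the function names are made up for illustration and are not part of perf):

```shell
# Hypothetical helpers; both read `perf list`-style text on stdin.

# Count distinct subsystem prefixes, looking only at tracepoint lines.
count_tracepoint_subsystems() {
    awk '/\[Tracepoint event\]/ { split($1, part, ":"); seen[part[1]] = 1 }
         END { n = 0; for (s in seen) n++; print n }'
}

# Tally how many tracepoint names END in begin/end/enter/exit.
# Names with the keyword in the middle (e.g. sys_enter_fsync) are not
# counted, which is exactly the inconsistency complained about above.
tally_suffixes() {
    awk '/\[Tracepoint event\]/ {
             if (match($1, /_(begin|end|enter|exit)$/))
                 count[substr($1, RSTART + 1)]++
         }
         END { for (s in count) print s, count[s] }' | sort
}
```

Usage would be something like `perf list | count_tracepoint_subsystems` and `perf list | tally_suffixes`. The numbers will differ per kernel and per perf version.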
SMF vs systemd?
On Solaris, adding SMF did not cause any changes in init. On Linux, systemd theoretically does everything from init duties to crond, PM, logging, user login sessions and a few other things.
Not even in Fedora rawhide do the systemd services let you create service instances as easily as with SMF on Solaris.
The Linux problem is constantly the same and is called NIH (Not Invented Here): the initial impulse to implement something new that already works well on another OS ends with most people being attracted to implementing bells and whistles instead of the base functionality.
More than 10 years after the Solaris 10 release, Linux is still deep in the woods.
Containers and docker?
Solaris has rock-solid non-global zones and integrated kernel zones. Linux has Xen and KVM, but neither of them is even close to kernel zones (K-Z) in terms of, for example, network layer overhead.
Solaris has started to provide good process isolation on top of SMF. Linux SELinux is still useless here because it provides only global AVLs, without the possibility of caging a process on top of a service definition.
Docker is now integrated in Solaris as well.
In other words, Linux is still only a bit better than a toy which can blow up in your face at almost any time thanks to some New-Brilliant-Idea-Of-Doing-Old-Things-A-New-Way.