hurr raidzone

Today: Reiserfs craps out on fileserver trying to free previously
freed objects, and kupdated tries to deref a null pointer and oopses.
Userland stops, but instead of halting, kernel continues to run other
threads, including nfsd, which was apparently completely unaware
that anything was amiss.

Writing to a filesystem which has already given up for the day should
probably be considered harmful, I think.

Someone somewhere probably thinks that continuing to run after an oops
in kupdated is a feature.

7 responses to “hurr raidzone”

  1. Not sure if I follow — nfsd is running in kernel space? It sounds like the central design failure is in what’s in kernel and what’s not. Mostly what’s in. But as I say, I’m not sure if I follow.

  2. Yeah, standard nfs in Linux is in-kernel, and has been since the late ’90s. I’m not sure that that’s a problem in itself; while Linux’s NFS implementation is a bit immature across the board, putting nfsd in-kernel seems to me a reasonable solution to the problem of locking and a nice way to avoid unnecessary context switches, especially on machines that are dedicated NFS servers.

    In any case, it’s not like nfsd is the only kernel thread that I’d want to stop when the userland stops. I’d hate to have my firewall keep passing packets but stop logging, for instance, or my loadbalancer decide to continue working with that moment’s load snapshot. If part of the kernel thinks that operations from then on will have undefined results, the rest of the kernel ought to believe that and halt the damn machine. :-)

    (Having investigated further since posting, I see that the userland hadn’t stopped anyhow, although most of it had incidentally ground to a halt.)

  3. NFS really scares me. I haven’t had much experience with it on the admin side, but as a user of it, it has done some very very odd things to me indeed.

  4. No.

    Just plain no.

    That’s the kind of made-up failure that doesn’t get anything accomplished.

    A real problem is that when the kernel takes a critical fault, there’s no mechanism for it to tell the subsystem which faulted that it should try to avoid doing any further damage. So the filesystem takes a fault, and the kernel yells “hey, woah, fault here!” to the user, but chances are ReiserFS doesn’t actually even know something has gone wrong [1]. When nfsd tries to issue another command to the fs, the fs goes ahead and tries to process it, because it doesn’t know it shouldn’t.

    This is almost certainly a problem for userland processes as well in some scenarios, although not in this specific case.

    [1] Arguably, it should know anyway. But if it knew that, it shouldn’t have faulted…

  5. That is absolutely the problem. I have experienced nothing but grief and ulcers with the Linux in-kernel nfsd ever since it was introduced. Compared to the old nfs-server userland package, it’s just more fragile and less compatible, and every bug means your whole system oopses. Solaris 2.2 client trying to connect? Crash! Two clients accessing the same file at just the same time? Crash! One of the bytes in your network’s address is “0”? Crash! (I swear I’m not making that one up.) Last time I checked, the whole point of a separate kernel was to make the machine more stable… if you’re just going to put everything in the kernel, you might as well just be running DOS. And when it comes to a tradeoff between slower performance and occasionally blowing up the machine and pissing all over the disk, I think I know which I’d choose. knfsd was a bad idea.

    Phew. Off-topic rant mode off. Granted, a lot of the fatal bugs in knfsd have been fixed, and it’s been getting a lot better over the years, as one might expect. Still a bad concept, though.

  6. Maybe that’s your problem, but it’s not this problem. If the kernel doesn’t halt on an oops, nfsd is supposed to keep running, and if writes are returning successfully, then nfsd is supposed to keep writing. The kernel didn’t halt, so userland was still running if it could, just like the other kernel threads were running if they could. A userland nfsd would not matter at all here.

    When reiserfs gives up but doesn’t panic and halt, it’s not nfsd pissing all over the disk, it’s reiserfs.

    I don’t want every kernel module and user process to have its own heuristics to determine whether or not they should keep running because the kernel isn’t smart enough to halt on a filesystem oops.

    The problem here is that the kernel reached a state where disk writes were resulting in undefined behavior (and anything that wrote in the few hours between oops and reboot helpfully provided evidence, sigh) and let them continue anyhow. That’s probably a reiserfs bug, but it might also be a kupdated bug. It’s certainly not an nfsd bug, because the problem wasn’t nfs-related except that no-one told nfs to stop trying.

  7. No, it wouldn’t have made a difference in your case… hence why I described it as “off-topic”. I was responding to HJ’s “central design failure” comment. Nothing gets me going nonlinear faster than mentioning knfsd. :-)