Linux Headquarters

Linux 2.0 Unofficial Patches (http://www.dandelion.com/Linux/deadlock.patch.4)

Linux 2.0 SMP Lockup (deadlock) Patch

Leonard N. Zubkoff (lnz@dandelion.com)
Sun Jun 15 09:06:00 1997


I spent last weekend tracking down some of the remaining SMP deadlocks in Linux 2.0.30. The latest version of my deadlock patch now corrects all the problems I've found to date. Since it has grown a bit larger, the patch is now available from my Linux web page at http://www.dandelion.com/Linux/. This version implements the following improvements:

  1. The earlier versions of the patch were only effective on systems where the boot CPU was CPU #0. This version correctly handles a non-zero boot CPU. The Tyan Titan Pro and Tomcat IIID seem to always have CPU #0 as the boot CPU, whereas many of the AIR, SuperMicro, and ASUS boards have CPU #1 as the boot CPU.

  2. An additional form of deadlock occurs when kernel code running on a non-boot CPU waits for the jiffies variable to be incremented. This deadlock is now avoided by having the spin loops in ENTER_KERNEL increment jiffies approximately every 10 milliseconds. This approach avoids having to track down every place in the kernel where such waiting loops occur.

  3. Finally, if approximately 60 seconds elapse while waiting for the kernel lock, a message will be printed if possible to indicate that a deadlock has been detected. This will help differentiate between SMP lockups and hardware lockups.
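
The behaviour described in (2) and (3) can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions and is not code from the patch: the names sim_kernel, enter_kernel_sim, bump_jiffies and give_up_us are hypothetical, time is simulated by counting spin iterations at an assumed cost of 10 microseconds each, and the "lock holder" is modelled as code that releases the lock once jiffies reaches a target value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* All names and constants here are hypothetical; none come from the patch. */
#define SPIN_STEP_US      10UL        /* assumed cost of one spin iteration */
#define TICK_INTERVAL_US  10000UL     /* ~10 ms between simulated jiffies bumps */
#define DEADLOCK_LIMIT_US 60000000UL  /* ~60 s before a deadlock is reported */

struct sim_kernel {
    bool lock_held;                      /* stands in for the kernel lock */
    unsigned long jiffies;               /* stands in for the tick counter */
    unsigned long holder_wakeup_jiffies; /* holder releases at this value */
    bool deadlock_reported;              /* message is printed only once */
};

/*
 * Spin until the simulated kernel lock is free. While spinning, optionally
 * advance jiffies about every 10 ms so that a holder waiting on jiffies can
 * still make progress, and print one message after about 60 s of waiting.
 * Returns true if the lock was acquired, false if give_up_us elapsed first.
 */
static bool enter_kernel_sim(struct sim_kernel *k, bool bump_jiffies,
                             unsigned long give_up_us)
{
    unsigned long waited_us = 0;
    unsigned long since_tick_us = 0;

    assert(k != NULL);
    while (k->lock_held) {
        /* Simulated holder: it leaves once jiffies has advanced far enough. */
        if (k->jiffies >= k->holder_wakeup_jiffies) {
            k->lock_held = false;
            break;
        }
        if (waited_us >= give_up_us)
            return false;               /* simulation bound, not in the patch */

        waited_us += SPIN_STEP_US;
        since_tick_us += SPIN_STEP_US;

        if (bump_jiffies && since_tick_us >= TICK_INTERVAL_US) {
            k->jiffies++;               /* keep time moving while we spin */
            since_tick_us = 0;
        }
        if (!k->deadlock_reported && waited_us >= DEADLOCK_LIMIT_US) {
            fprintf(stderr, "possible SMP deadlock: waited ~60 s for lock\n");
            k->deadlock_reported = true;
        }
    }
    k->lock_held = true;                /* the caller now owns the lock */
    return true;
}
```

With bump_jiffies set, a holder waiting for jiffies to advance by 5 is released after about 50 simulated milliseconds and the spinner acquires the lock. With it cleared, jiffies never moves, the wait runs past the 60-second mark, and the deadlock message is printed once before the simulation gives up.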

I suspect (1) is the reason that earlier versions of this patch seemed to be effective for some people and yet completely ineffective for others.

If people still encounter lockups with this patch, I also fixed the bug in Ingo's deadlock detection patch, so we should be able to use that for further debugging.

12-Apr-97

Enclosed below is my patch to Linux 2.0.30 to avoid the interrupt/paging deadlocks that have been reported in Linux 2.0.x/SMP. Look in the patched "linux/kernel/sched.c" for a large comment with a full explanation of the deadlock conditions I've addressed and how they are resolved. Please report whether this resolves any lockups you've seen, or whether you have any problems with it installed. With luck, this will be reliable enough for 2.0.31 and will remove the black mark from Linux/SMP.

Thanks to Bill Reynolds for his "kill_kernel" package which made testing this easier, and to Linus for the 2.1.28/29 SMP fix which provided the locations for the necessary allow_interrupts calls.

For information regarding copying and distribution of this material see the COPYING document.
Compilation ©1998-2008 Linux Headquarters, Inc.