Linux: Memory Error Handling

Submitted by Jeremy
on June 16, 2005 - 5:17am

Russ Anderson posted a Request For Comment document on the lkml discussing the development of a common infrastructure for the Linux kernel for dealing with memory errors. He explained that the effort began with work on the ia64 architecture, "there has been considerable work on recovering from Machine Check Aborts (MCAs) in arch/ia64. One result is that many memory errors encountered by user applications no longer cause a kernel panic. The application is terminated, but linux and other applications keep running. Additional improvements are becoming dependent on mainline linux support."

The document begins by listing the many types of memory failures, including transient errors which return bad data on one read and good data on another, soft errors in which memory has been corrupted but can be rewritten, and hard errors in which the memory storage cell itself has gone bad and can no longer be written to. It then briefly discusses various methods for handling memory errors, broken into two sections: corrected error handling and uncorrected error handling. Finally, the RFC suggests several forms of common infrastructure that could be implemented in the kernel, including page flags used to mark areas in memory that have gone bad and should no longer be used, a /proc interface for tuning the functionality and for communicating with the BIOS/SAL, and "pseudo task switching" to handle non-maskable interrupts sent on some architectures to signal memory errors.


From: Russ Anderson [email blocked]
To:  linux-kernel
Subject: [RCF] Linux memory error handling
Date:	Wed, 15 Jun 2005 09:30:13 -0500 (CDT)

		[RCF] Linux memory error handling.

Summary: One of the most common hardware failures in a computer 
	is a memory failure.   There has been efforts in various
	architectures to support recover from memory errors.  This
	is an attempt to define a common support infrastructure
	in Linux to support memory error handling.

Background:  There has been considerable work on recovering from
	Machine Check Aborts (MCAs) in arch/ia64.  One result is
	that many memory errors encountered by user applications
	not longer cause a kernel panic.  The application is 
	terminated, but linux and other applications keep running.
	Additional improvements are becoming dependent on mainline
	linux support.  That requires involvement of lkml, not
	just linux-ia64.

Types of memory failures:

	Memory hardware failures are very hardware implementation 
	specific, but there are some general characteristics.

	    Corrected errors: Error Correction Codes (ECC) in memory 
		hardware can correct Single Bit Errors (SBEs).  

	    Uncorrected errors: Parity errors and Multiple Bit Errors (MBEs)
		are errors that hardware cannot correct.  In this case the
		data in memory is no longer valid and cannot be used.

	There are different types of memory errors:

	    Transient errors: The bit showed up bad, but re-reading the
		data returns the correct data.

	    Soft errors: A bit in memory has changed state, but the 
		the underlying memory cell still works.  For example
		a particle strike can sometimes cause a bit to switch.
		In this case, re-writing the data corrects the error.

	    Hard errors:  The memory storage cell cannot hold the bit.  
		The underlying memory cell could be stuck at 0 or 1.

	A common question is whether single bit (corrected) errors will 
	turn into double bit (uncorrected) errors.  The answer is it
	depends on the underlying cause of the memory error.  There are
	some errors that show up as single bits, especially transient 
	and soft errors, that do not degrade over time.  There are other
	failures that do degrade over time.  The details of the memory
	technology are implementation specific and too detailed for
	this discussion.

Handling memory errors:

	Some memory error handling functionality is common to
	most architectures.

	Corrected error handling:

	    Logging:  When ECC hardware corrects a Single Bit Error (SBE),
		an interrupt is generated to inform linux that there is 
		a corrected error record available for logging.

	    Polling Threshold:  A solid single bit error can cause a burst
		of correctable errors that can cause a significant logging
		overhead.  SBE thresholding counts the number of SBEs for
		a given page and if too many SBEs are detected in a given
		period of time, the interrupt is disabled and instead 
		linux periodically polls for corrected errors.

	    Data Migration:  If a page of memory has too many single bit
		errors, it may be prudent to move the data off that
		physical page before the correctable SBE turns into an
		uncorrectable MBE. 

	    Memory handling parameters:

		Since memory failure modes are due to specific DIMM
		failure characteristics, there is will be no way to 
		reach agreement on one set of thresholds that will
		be appropriate for all configurations.  Therefore there
		needs to be a way to modify the thresholds.  One alternative
		is a /proc/sys/kernel/ interface to control settings, such
		as polling thresholds.  That provides an easy standard
		way of modifying thresholds to match the characteristics
		of the specific DIMM type.

	Uncorrected error handling:

	    Kill the application:  One recovery technique to avoid a kernel
		panic when an application process hits an uncorrectable 
		memory error is to SIGKILL the application.  The page is 
		marked PG_reserved to avoid re-use.  A (new) PG_hard_error
		flag would be useful to indicate that the physical page has
		a hard memory error.

	    Disable memory for next reboot:  When a hard error is detected,
		notify SAL/BIOS of the bad physical memory.  SAL/BIOS can
		save the bad addresses and, when building the EFI map after
		reset/reboot, mark the bad pages as EFI_UNUSABLE_MEMORY,
		and type = 0, so Linux will ignore granules contains these 
		pages.

	    Dumping:  Dump programs should not try to dump pages with bad
		memory.  A PG_hard_error flag would indicate to dump
		programs which pages have bad memory.

	Memory DIMM information & settings:

	    Use a /proc/dimm_info interface to pass DIMM information to Linux.
	    Hardware vendors could add their hardware specific settings.

Linux infrastructure:

	Some infrastructure that could be added to linux that would be
	useful to various architectures.

	Page Flags:  When a page is discarded, PG_reserved is set so that the
		page is no longer used.  A PG_hard_error flag could be added
		to indicate the physical page has bad memory.

	/proc interfaces:  Use /proc interfaces to change thresholds and
		pass information to/from BIOS/SAL.  

	Pseudo task switching:  Some architectures signal memory errors via
		non maskable interrupts, with unusual calling sequences into
		the OS.  It is often easier to process these non-maskable
		errors on a stack that is separate from the normal kernel
		stacks.  This requires non-blocking scheduler interfaces
		to obtain the current running task, to modify the pointer
		to the current running task and to reset that pointer when
		the memory error has been processed.

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc [email blocked]
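
The "Polling Threshold" item in the RFC above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the class name, the threshold of three errors and the 60-second window are hypothetical and appear nowhere in the RFC, and a real implementation would live in architecture-specific kernel code.

```python
from collections import defaultdict, deque


class SbeThreshold:
    """Hypothetical per-page corrected-error counter with a sliding time window."""

    def __init__(self, max_errors=3, window_seconds=60.0):
        self.max_errors = max_errors          # assumed tunable, e.g. via /proc
        self.window_seconds = window_seconds  # assumed tunable
        self.events = defaultdict(deque)      # page frame number -> timestamps
        self.mode = "interrupt"               # becomes "polling" once exceeded

    def record(self, pfn, now):
        """Record one corrected error on page `pfn` at time `now` (seconds)."""
        stamps = self.events[pfn]
        stamps.append(now)
        # Forget errors that have fallen out of the time window.
        while stamps and now - stamps[0] > self.window_seconds:
            stamps.popleft()
        # Too many errors on one page in the window: stop taking interrupts.
        if len(stamps) > self.max_errors:
            self.mode = "polling"
        return self.mode
```

With these assumed defaults, four errors on the same page within a minute switch the model to polling, while the same four errors spread several minutes apart leave it in interrupt mode.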


From: Andi Kleen [email blocked]
Subject: Re: [RCF] Linux memory error handling
Date: Wed, 15 Jun 2005 17:08:05 +0200

Russ Anderson [email blocked] writes:

>         [RCF] Linux memory error handling.

RCF? RFC?

>
> Summary: One of the most common hardware failures in a computer
>     is a memory failure.   There has been efforts in various
>     architectures to support recover from memory errors.  This
>     is an attempt to define a common support infrastructure
>     in Linux to support memory error handling.

Yes that is badly needed. With rmap we can do much better than
we used to do. That code should be common though, not specific
to an architecture.

>     Corrected error handling:
>
>         Logging:  When ECC hardware corrects a Single Bit Error (SBE),
>             an interrupt is generated to inform linux that there is
>             a corrected error record available for logging.

I don't think it makes sense to commonize this - many platforms
want to log these errors to platform specific firmware logs (like
IA64 or PPC). Others who don't have such powerful firmware need
to do their own thing (like x86-64's mcelog). But I don't see much
commodiality.

>
>         Polling Threshold:  A solid single bit error can cause a burst
>             of correctable errors that can cause a significant logging
>             overhead.  SBE thresholding counts the number of SBEs for
>             a given page and if too many SBEs are detected in a given
>             period of time, the interrupt is disabled and instead
>             linux periodically polls for corrected errors.

I don't see how this could be sanely done in common code. It is deeply
architecture specific.

>
>         Data Migration:  If a page of memory has too many single bit
>             errors, it may be prudent to move the data off that
>             physical page before the correctable SBE turns into an
>             uncorrectable MBE.

This should be common code indeed.

Similar for handling uncorrectable errors; e.g. swap the page
in again from disk if possible or kill the application. That should
be imho all common code

I did a prototype of this some time ago, but ran out of time
and it wasn't that useful on my platform anyways so I gave it up.

>
>         Memory handling parameters:
>
>             Since memory failure modes are due to specific DIMM
>             failure characteristics, there is will be no way to
>             reach agreement on one set of thresholds that will
>             be appropriate for all configurations.  Therefore there
>             needs to be a way to modify the thresholds.  One alternative
>             is a /proc/sys/kernel/ interface to control settings, such
>             as polling thresholds.  That provides an easy standard
>             way of modifying thresholds to match the characteristics
>             of the specific DIMM type.

This is deeply architecture and even platform specific.

>
>     Uncorrected error handling:
>
>         Kill the application:  One recovery technique to avoid a kernel
>             panic when an application process hits an uncorrectable
>             memory error is to SIGKILL the application.  The page is
>             marked PG_reserved to avoid re-use.  A (new) PG_hard_error
>             flag would be useful to indicate that the physical page has
>             a hard memory error.

No need for a new flag, just allocate it. This should be indeed common
code using the rmap infrastructure.

>         Disable memory for next reboot:  When a hard error is detected,
>             notify SAL/BIOS of the bad physical memory.  SAL/BIOS can
>             save the bad addresses and, when building the EFI map after
>             reset/reboot, mark the bad pages as EFI_UNUSABLE_MEMORY,
>             and type = 0, so Linux will ignore granules contains these
>             pages.

Deeply hardware specific.

>         Dumping:  Dump programs should not try to dump pages with bad
>             memory.  A PG_hard_error flag would indicate to dump
>             programs which pages have bad memory.

There is no dump program in mainline. I have no problem with the flag,
but for some reason the struct page bits seem to be very contended
and 32bit will run out of them in the forseeable future.

>
>     Memory DIMM information & settings:
>
>         Use a /proc/dimm_info interface to pass DIMM information to Linux.
>         Hardware vendors could add their hardware specific settings.

I don't think it makes sense to put any of this in common code.

>     Page Flags:  When a page is discarded, PG_reserved is set so that the
>             page is no longer used.  A PG_hard_error flag could be added

That is not quite how PG_reserved works...

>             to indicate the physical page has bad memory.
>
>     Pseudo task switching:  Some architectures signal memory errors via
>             non maskable interrupts, with unusual calling sequences into
>             the OS.  It is often easier to process these non-maskable
>             errors on a stack that is separate from the normal kernel
>             stacks.  This requires non-blocking scheduler interfaces
>             to obtain the current running task, to modify the pointer
>             to the current running task and to reset that pointer when
>             the memory error has been processed.

A "non blocking interface to obtain the current task"? aka "current"?

I sense some confusion here ;-)

Doing all the rmap process lookup etc. needed for the advanced handling
needs to take sleep locks. No way around that.

What I did in my x86-64 prototype to handle this was to raise a
"self interrupt" (kind of a IPI to the current CPU that would raise next
time interrupts were enabled or immediately in user space etc.) and then
in the self interrupt where you have a defined context queue work for
a CPU workqueue. The workqueue would then take the mm locks and look
up the processes mapping the page and kill them etc.

Basically the trick is to keep the tricky fully lockless part of the
MCE handler as small as possible and "bootstrap" yourself in multiple
steps to a defined process context where you can use the rest of
the kernel sanely.

This implies the actual machine check is processed a bit later. That is
fine because near all CPUs seem to cause machine checks asynchronously
to the normal instruction stream anyways (so you are already "too late")
and adding a bit more delay is not too different. Trying to complicate
everything and processing the MCE immediately thus does not help too much.
For the common case of the MCE happening in user space it will be always
immediately after the exception anyways.

-Andi
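
The multi-step "bootstrap" Andi describes (a minimal lockless handler, then a self interrupt, then a workqueue in process context) can be modelled in user space. The sketch below is an assumption-laden illustration, not his prototype: every name is hypothetical, a plain dict stands in for the rmap lookup, and "killing" a process is just recorded in a list.

```python
from collections import deque


class DeferredMceHandler:
    """Toy model of handling a machine check in three stages."""

    def __init__(self, rmap):
        self.rmap = rmap          # pfn -> set of pids; stand-in for rmap lookup
        self.pending = deque()    # filled by the minimal machine-check stage
        self.workqueue = deque()  # filled by the "self interrupt" stage
        self.killed = []          # (pid, pfn) pairs the worker acted on

    def machine_check(self, pfn):
        # Stage 1: do as little as possible, take no locks, just note the page.
        self.pending.append(pfn)

    def self_interrupt(self):
        # Stage 2: a defined interrupt context; hand the work to a workqueue.
        while self.pending:
            self.workqueue.append(self.pending.popleft())

    def run_workqueue(self):
        # Stage 3: process context, where sleeping locks would be allowed.
        while self.workqueue:
            pfn = self.workqueue.popleft()
            for pid in sorted(self.rmap.get(pfn, ())):
                self.killed.append((pid, pfn))
```

Nothing is killed until the third stage runs, which mirrors the point that the machine check ends up being processed a bit later than it was raised.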

From: Russ Anderson [email blocked]
Subject: Re: [RCF] Linux memory error handling
Date: Wed, 15 Jun 2005 11:36:28 -0500 (CDT)

Andi Kleen wrote:
> Russ Anderson [email blocked] writes:
>
> >         [RCF] Linux memory error handling.
>
> RCF? RFC?

(sigh)  RFC.  I ran the document through the spellchecker and still
missed the first three letters.

> > Summary: One of the most common hardware failures in a computer
> >     is a memory failure.   There has been efforts in various
> >     architectures to support recover from memory errors.  This
> >     is an attempt to define a common support infrastructure
> >     in Linux to support memory error handling.
>
> Yes that is badly needed. With rmap we can do much better than
> we used to do. That code should be common though, not specific
> to an architecture.
>
> >     Corrected error handling:
> >
> >         Logging:  When ECC hardware corrects a Single Bit Error (SBE),
> >             an interrupt is generated to inform linux that there is
> >             a corrected error record available for logging.
>
> I don't think it makes sense to commonize this - many platforms
> want to log these errors to platform specific firmware logs (like
> IA64 or PPC). Others who don't have such powerful firmware need
> to do their own thing (like x86-64's mcelog). But I don't see much
> commodiality.

Sure.  It should only be common code when it makes sense.

> >         Polling Threshold:  A solid single bit error can cause a burst
> >             of correctable errors that can cause a significant logging
> >             overhead.  SBE thresholding counts the number of SBEs for
> >             a given page and if too many SBEs are detected in a given
> >             period of time, the interrupt is disabled and instead
> >             linux periodically polls for corrected errors.
>
> I don't see how this could be sanely done in common code. It is deeply
> architecture specific.

This is what could be used to trigger the data migration (common) code.
It's the interface from arch specific to common code that has pushed
me from linux-ia64 to lkml.

> >         Data Migration:  If a page of memory has too many single bit
> >             errors, it may be prudent to move the data off that
> >             physical page before the correctable SBE turns into an
> >             uncorrectable MBE.
>
> This should be common code indeed.
>
> Similar for handling uncorrectable errors; e.g. swap the page
> in again from disk if possible or kill the application. That should
> be imho all common code

Yup.

> I did a prototype of this some time ago, but ran out of time
> and it wasn't that useful on my platform anyways so I gave it up.
>
> >
> >         Memory handling parameters:
> >
> >             Since memory failure modes are due to specific DIMM
> >             failure characteristics, there is will be no way to
> >             reach agreement on one set of thresholds that will
> >             be appropriate for all configurations.  Therefore there
> >             needs to be a way to modify the thresholds.  One alternative
> >             is a /proc/sys/kernel/ interface to control settings, such
> >             as polling thresholds.  That provides an easy standard
> >             way of modifying thresholds to match the characteristics
> >             of the specific DIMM type.
>
> This is deeply architecture and even platform specific.

The implementation is arch specific, but the external interface
could be common.  If common doesn't make sense, I'll just add it
in linux-ia64 and be done with it.  :-)

> >     Uncorrected error handling:
> >
> >         Kill the application:  One recovery technique to avoid a kernel
> >             panic when an application process hits an uncorrectable
> >             memory error is to SIGKILL the application.  The page is
> >             marked PG_reserved to avoid re-use.  A (new) PG_hard_error
> >             flag would be useful to indicate that the physical page has
> >             a hard memory error.
>
> No need for a new flag, just allocate it.

That is what the current code does.  Looking ahead, it would be nice
to keep track of the bad memory, so that other processes, such as a
dump program, does not try to access it.  The PG_hard_error flag
is one idea, but others may have a better idea.  Conversely, a diag
program may want to access it to do additional analysys.

The hot-plug people, working on page migration, were wondering how
to deal with pages marked reserved.  Bad data on bad memory pages
does not need to be migrated.  They need to know what data not
to migrate.

>                                           This should be indeed common
> code using the rmap infrastructure.
>
> >         Disable memory for next reboot:  When a hard error is detected,
> >             notify SAL/BIOS of the bad physical memory.  SAL/BIOS can
> >             save the bad addresses and, when building the EFI map after
> >             reset/reboot, mark the bad pages as EFI_UNUSABLE_MEMORY,
> >             and type = 0, so Linux will ignore granules contains these
> >             pages.
>
> Deeply hardware specific.

My intent was a common interface to tell EFI of a bad address.
In ia64, I could add a SAL call to tell our SAL(SGI PROM) of
the bad address, to get this functionality.  Very platform specific.
Perhaps a more generic interface would add more value for more
platforms.  That was my intent.

> >         Dumping:  Dump programs should not try to dump pages with bad
> >             memory.  A PG_hard_error flag would indicate to dump
> >             programs which pages have bad memory.
>
> There is no dump program in mainline. I have no problem with the flag,
> but for some reason the struct page bits seem to be very contended
> and 32bit will run out of them in the forseeable future.

Add more bits.  :-)

I realize that flag bits are more limited with 32bit, but adding
a page flag is a lkml issue, not a linux-ia64 issue.  So I need
to discuss this issue here.  Perhaps there is an alternative way to
achieve the needed functionality.

> >     Memory DIMM information & settings:
> >
> >         Use a /proc/dimm_info interface to pass DIMM information to Linux.
> >         Hardware vendors could add their hardware specific settings.
>
> I don't think it makes sense to put any of this in common code.
>
> >     Page Flags:  When a page is discarded, PG_reserved is set so that the
> >             page is no longer used.  A PG_hard_error flag could be added
>
> That is not quite how PG_reserved works...

That's why it needs improvement.

> >             to indicate the physical page has bad memory.
> >
> >     Pseudo task switching:  Some architectures signal memory errors via
> >             non maskable interrupts, with unusual calling sequences into
> >             the OS.  It is often easier to process these non-maskable
> >             errors on a stack that is separate from the normal kernel
> >             stacks.  This requires non-blocking scheduler interfaces
> >             to obtain the current running task, to modify the pointer
> >             to the current running task and to reset that pointer when
> >             the memory error has been processed.
>
>
> A "non blocking interface to obtain the current task"? aka "current"?
>
> I sense some confusion here ;-)

See "[RFD] Separating struct task and the kernel stacks"
http://www.gelato.unsw.edu.au/linux-ia64/0506/14426.html

Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc [email blocked]
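
To make the PG_hard_error discussion concrete, here is a minimal sketch of how a dump or migration tool might consult such a flag. It rests on assumptions: the bit positions are purely illustrative, HARD_ERROR models the flag proposed in the RFC rather than an existing kernel flag, and the helper name is hypothetical.

```python
from enum import IntFlag


class PageFlag(IntFlag):
    """Illustrative page flags; bit positions are assumptions, not kernel values."""
    RESERVED = 1 << 0    # models PG_reserved
    HARD_ERROR = 1 << 1  # models the proposed PG_hard_error


def pages_safe_to_read(page_flags):
    """Return the page frame numbers a dump or migration tool may touch.

    `page_flags` maps pfn -> PageFlag. Pages with HARD_ERROR set are skipped,
    since reading them could trigger another uncorrectable error.
    """
    return [pfn for pfn, flags in sorted(page_flags.items())
            if not flags & PageFlag.HARD_ERROR]
```

A reserved page without the error bit is still returned here, which reflects the distinction Russ draws: "reserved" alone does not say whether the underlying memory is bad.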

From: Maciej W. Rozycki [email blocked]
Subject: Re: [RCF] Linux memory error handling
Date: Wed, 15 Jun 2005 16:26:13 +0100 (BST)

On Wed, 15 Jun 2005, Russ Anderson wrote:

> Handling memory errors:
>
>     Some memory error handling functionality is common to
>     most architectures.
>
>     Corrected error handling:
>
>         Logging:  When ECC hardware corrects a Single Bit Error (SBE),
>             an interrupt is generated to inform linux that there is
>             a corrected error record available for logging.
>
>         Polling Threshold:  A solid single bit error can cause a burst
>             of correctable errors that can cause a significant logging
>             overhead.  SBE thresholding counts the number of SBEs for
>             a given page and if too many SBEs are detected in a given
>             period of time, the interrupt is disabled and instead
>             linux periodically polls for corrected errors.

 This is highly undesirable if the same interrupt is used for MBEs.  A
page that causes an excessive number of SBEs should rather be removed from
the available pool instead.  Logging should probably take recent events
into account anyway and take care of not overloading the system, e.g. by
keeping only statistical data instead of detailed information about each
event under load.

>         Data Migration:  If a page of memory has too many single bit
>             errors, it may be prudent to move the data off that
>             physical page before the correctable SBE turns into an
>             uncorrectable MBE.
>
>         Memory handling parameters:
>
>             Since memory failure modes are due to specific DIMM
>             failure characteristics, there is will be no way to
>             reach agreement on one set of thresholds that will
>             be appropriate for all configurations.  Therefore there
>             needs to be a way to modify the thresholds.  One alternative
>             is a /proc/sys/kernel/ interface to control settings, such
>             as polling thresholds.  That provides an easy standard
>             way of modifying thresholds to match the characteristics
>             of the specific DIMM type.

 Note that scrubbing may also be required depending on hardware
capabilities as data could have been corrected on the fly for the purpose
of providing a correct value for the bus transaction, but memory may still
hold corrupted data.  And of course not all memory is DIMM!

>     Uncorrected error handling:
>
>         Kill the application:  One recovery technique to avoid a kernel
>             panic when an application process hits an uncorrectable
>             memory error is to SIGKILL the application.  The page is
>             marked PG_reserved to avoid re-use.  A (new) PG_hard_error
>             flag would be useful to indicate that the physical page has
>             a hard memory error.

 Note we have some infrastructure for that in the MIPS port -- we kill the
triggering process, but we don't mark the problematic memory page as
unusable (which is an area for improvement).  This is of course the case
for faults occurring synchronously in the user mode -- when in the kernel
mode or when happening asynchronously (e.g. because of being triggered by
a DMA transaction rather than one involving a CPU) you often cannot
determine whether killing a process is good enough for system safety even
if you are able to narrow the fault down to a potential victim.

>         Disable memory for next reboot:  When a hard error is detected,
>             notify SAL/BIOS of the bad physical memory.  SAL/BIOS can
>             save the bad addresses and, when building the EFI map after
>             reset/reboot, mark the bad pages as EFI_UNUSABLE_MEMORY,
>             and type = 0, so Linux will ignore granules contains these
>             pages.
>
>         Dumping:  Dump programs should not try to dump pages with bad
>             memory.  A PG_hard_error flag would indicate to dump
>             programs which pages have bad memory.
>
>     Memory DIMM information & settings:
>
>         Use a /proc/dimm_info interface to pass DIMM information to Linux.
>         Hardware vendors could add their hardware specific settings.

 I'd recommend a more generic name rather than "dimm_info" if that is to
be reused universally.

  Maciej

From: Russell King [email blocked]
Subject: Re: [RCF] Linux memory error handling
Date: Wed, 15 Jun 2005 20:46:59 +0100

On Wed, Jun 15, 2005 at 04:26:13PM +0100, Maciej W. Rozycki wrote:
> On Wed, 15 Jun 2005, Russ Anderson wrote:
> >     Memory DIMM information & settings:
> >
> >         Use a /proc/dimm_info interface to pass DIMM information to Linux.
> >         Hardware vendors could add their hardware specific settings.
>
>  I'd recommend a more generic name rather than "dimm_info" if that is to
> be reused universally.

Agree.

I'd also suggest that there be some method to tell the kernel from
architecture code about this "dimm_info" stuff - many embedded
platforms already know their memory organisation.

BTW, Russ, could we have a better description of what information is
intended to be supplied?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

From: Russ Anderson [email blocked]
Subject: Re: [RFC] Linux memory error handling
Date: Wed, 15 Jun 2005 15:28:56 -0500 (CDT)

Russell King wrote:
> On Wed, Jun 15, 2005 at 04:26:13PM +0100, Maciej W. Rozycki wrote:
> > On Wed, 15 Jun 2005, Russ Anderson wrote:
> > >     Memory DIMM information & settings:
> > >
> > >         Use a /proc/dimm_info interface to pass DIMM information to Linux.
> > >         Hardware vendors could add their hardware specific settings.
> >
> >  I'd recommend a more generic name rather than "dimm_info" if that is to
> > be reused universally.
>
> Agree.

I really don't care what it's called, as long as it's descriptive.
/proc/meminfo is taken.  :-)

One idea would follow the concept of /proc/bus/ and have /proc/memory/
with different memory types.

    /proc/memory/dimm0
    /proc/memory/dimm1
    /proc/memory/flash0
    .

> I'd also suggest that there be some method to tell the kernel from
> architecture code about this "dimm_info" stuff - many embedded
> platforms already know their memory organisation.
>
> BTW, Russ, could we have a better description of what information is
> intended to be supplied?

Part tracking info and configuration info.

For example, we were doing some experiments to determine the relationship
between refresh rates and memory errors.  Could increasing the refresh
rate reduce the number of memory errors, therefor making memory more
reliable for customers?  Could decreasing the refresh rate in manufacturing
be used to identify questionable DIMMs?  Having a convient interface to
read the current refresh rate setting and write a new setting would
be useful.

This type info, not necessarily in this format:

------------------------------------------------------------------------------
EEPROM     JEDEC-SPD Info           Part Number        Rev  Speed  SGI      BC
---------- ------------------------ ------------------ ---- ------ -------- --
DIMM0 N0 L CE0000000000000006071D84 M3 12L6423DT0-CB3  0D   6.0    09/02/03 00
DIMM1 N0 L CE0000000000000006051CB2 M3 12L6423DT0-CB3  0D   6.0    09/02/03 00
DIMM2 N0 L no hardware detected
DIMM3 N0 L no hardware detected

-- 
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc [email blocked]

From: Wang, Zhenyu [email blocked]
Subject: Re: [RCF] Linux memory error handling
Date: Thu, 16 Jun 2005 10:54:14 +0800

On 2005.06.15 09:30:13 +0000, Russ Anderson wrote:
>         [RCF] Linux memory error handling.
>
> Summary: One of the most common hardware failures in a computer
>     is a memory failure.   There has been efforts in various
>     architectures to support recover from memory errors.  This
>     is an attempt to define a common support infrastructure
>     in Linux to support memory error handling.
>
> Background:  There has been considerable work on recovering from
>     Machine Check Aborts (MCAs) in arch/ia64.  One result is
>     that many memory errors encountered by user applications
>     not longer cause a kernel panic.  The application is
>     terminated, but linux and other applications keep running.
>     Additional improvements are becoming dependent on mainline
>     linux support.  That requires involvement of lkml, not
>     just linux-ia64.

Good RFC! Actually on x86 arch, 'bluesmoke' - http://bluesmoke.sf.net - is
out there for some simple mem ECC error handling already. It's inspired by
the old linux-ecc project. Current capability is limited to detect, report,
configuable for polling and UE panic.

Bluesmoke contains a driver core which is used to host infos for each mem
controller, like dimm info, and currently only polling method is taken for
registered controller. Others are all the specific chipset drivers, which
is mostly platform depend, e.g e7520, 82875P, etc. Those platforms have
also been tested, bluesmoke's webpage contains some test method if you
really want to try.

nmi handling is still under work, Dave and Corey's patch is on sourceforge
page, and
http://lkml.org/lkml/2004/8/19/140
http://lkml.org/lkml/2005/3/21/11

Those nmi callbacks have not been added to chipset driver yet, but some
initial testing failed, still don't know why...

thanks
-zhen

Error handling ?

nico (not verified)
on
June 16, 2005 - 6:16am

Does that mean the beginning of an error recovery framework?

A system where you could "save" the memory of a process and restart the process "elsewhere". That's the step after killing a process on bad memory (the error should be MEMORY ERROR and not segmentation violation, because that looks like a problem with software instead of hardware).

Don't think so. Dumping a pro

Johannes (not verified)
on
June 17, 2005 - 12:51am

Don't think so. Dumping a process to restart it later on requires not only dumping all of its memory but also all corresponding data structures inside the kernel.

Also

Chris Jones (not verified)
on
June 17, 2005 - 1:50am

If you are dumping the program because of a memory error, you may well not want to migrate it somewhere else because its data may now be corrupted.

A typical program uses much m

om (not verified)
on
June 17, 2005 - 2:13am

A typical program uses much more resources than just memory. Think file descriptors, sockets, file locks, and more.

So to answer your question: no, this is not the beginning of an error recovery framework.

questionable memory use as disk cache

sq5bpf (not verified)
on
June 17, 2005 - 2:19am

mark pages as "good", "questionable", "has SBEs" and "bad"

when you see uncorrectable errors mark a page as "bad".

when you see too many SBEs, mark the page "questionable", and temporarily swap its contents with some "good" page used for cache. if the problem goes away after the page has been used for some time mark it "good" again. if the problem persists mark it with "has SBEs" (the kernel will use it only for cache). if there are uncorrectable errors mark it "bad", and drop it (whatever was there will get reread into a different memory page when it's used).

this way your memory won't be starved by temporary recoverable memory failures that come in bursts (like from a failing nuclear reactor nearby :)

jacek
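
The four page states proposed in this comment form a small state machine, sketched below. The event names and the choice to treat "bad" as a terminal state are assumptions made for illustration; the comment itself only names the states and the conditions for moving between them.

```python
from enum import Enum


class PageState(Enum):
    GOOD = "good"
    QUESTIONABLE = "questionable"
    HAS_SBES = "has SBEs"
    BAD = "bad"


class Event(Enum):
    TOO_MANY_SBES = "too many SBEs"
    PROBLEM_GONE = "problem went away"
    PROBLEM_PERSISTS = "problem persists"
    UNCORRECTABLE = "uncorrectable error"


def next_state(state, event):
    """Return the page state after `event`, following the comment's rules."""
    if state is PageState.BAD:
        return PageState.BAD              # assumption: bad pages never recover
    if event is Event.UNCORRECTABLE:
        return PageState.BAD              # drop the page on any uncorrectable error
    if state is PageState.GOOD and event is Event.TOO_MANY_SBES:
        return PageState.QUESTIONABLE     # demote and use only as cache for a while
    if state is PageState.QUESTIONABLE and event is Event.PROBLEM_GONE:
        return PageState.GOOD
    if state is PageState.QUESTIONABLE and event is Event.PROBLEM_PERSISTS:
        return PageState.HAS_SBES         # keep using the page for cache only
    return state                          # all other combinations change nothing
```

A page hit by a burst of transient errors would go good, questionable, good again, which is the recovery path the comment is after.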

I'd rather have Linux finally

anonym/0001 (not verified)
on
June 17, 2005 - 7:46am

I'd rather have Linux finally being able to return NULL from malloc(), when there is no memory available...
That simple thing seems to be very very low priority to Linux developers... Nice results in some synthetic benchmark are more important obviously. Or as here - some obscure interface needed by 1% of installed systems. But obviously its development is paid by server companies.
Who pays - got all the features.

Considered disabling overcomm

Antti S. Lankila (not verified)
on
June 17, 2005 - 1:38pm

Considered disabling overcommitting?

Isn't that what "echo 2 > /pr

cbcbcb (not verified)
on
June 17, 2005 - 4:21pm

Isn't that what "echo 2 > /proc/sys/vm/overcommit_memory" gives you?
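
For reference, the value written by that command selects one of three overcommit policies. The short sketch below maps the value to its meaning and reads the current setting; the function names are hypothetical, and changing the setting (as opposed to reading it) normally requires root.

```python
# Meaning of the values accepted by /proc/sys/vm/overcommit_memory.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (the default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Map the text read from overcommit_memory to a short description."""
    try:
        return OVERCOMMIT_MODES.get(int(raw.strip()), "unknown mode")
    except ValueError:
        return "unknown mode"


def current_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Describe the running system's policy, or return None if unreadable."""
    try:
        with open(path) as f:
            return describe_overcommit(f.read())
    except OSError:
        return None
```

In mode 2 an allocation that would exceed the commit limit fails up front, so malloc() can return NULL instead of the process being killed later.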
