Howard Goldstein
2007-Aug-13 18:36 UTC
3ware RAID ctrlr: twe0: twe_map_request: malloc failed. Informational or am I screwed?
Would someone confirm (or disabuse!) me of the belief that these 3ware Escalade 8106-LP2 (2 port sata raid controllers) related messages are informational and the UFS device is retrying after it sees ENOMEM causing these?

Aug 13 10:29:48 cally kernel: twe0: twe_map_request: malloc failed
Aug 13 10:33:04 cally kernel: twe0: twe_map_request: malloc failed
Aug 13 11:05:37 cally kernel: twe0: twe_map_request: malloc failed
Aug 13 11:14:55 cally kernel: twe0: twe_map_request: malloc failed
Aug 13 12:07:44 cally kernel: twe0: twe_map_request: malloc failed
Aug 13 12:28:05 cally kernel: twe0: twe_map_request: malloc failed
Aug 13 14:29:59 cally kernel: twe0: twe_map_request: malloc failed

(Either way but particularly in the event the data are being lost is there a sysctl or kernel build option to supply a bit more memory in advance?)
Scott Long
2007-Aug-13 18:51 UTC
3ware RAID ctrlr: twe0: twe_map_request: malloc failed. Informational or am I screwed?
Howard Goldstein wrote:
> Would someone confirm (or disabuse!) me of the belief that these 3ware
> Escalade 8106-LP2 (2 port sata raid controllers) related messages are
> informational and the UFS device is retrying after it sees ENOMEM
> causing these?
>
> Aug 13 10:29:48 cally kernel: twe0: twe_map_request: malloc failed
> Aug 13 10:33:04 cally kernel: twe0: twe_map_request: malloc failed
> Aug 13 11:05:37 cally kernel: twe0: twe_map_request: malloc failed
> Aug 13 11:14:55 cally kernel: twe0: twe_map_request: malloc failed
> Aug 13 12:07:44 cally kernel: twe0: twe_map_request: malloc failed
> Aug 13 12:28:05 cally kernel: twe0: twe_map_request: malloc failed
> Aug 13 14:29:59 cally kernel: twe0: twe_map_request: malloc failed
>
> (Either way but particularly in the event the data are being lost is
> there a sysctl or kernel build option to supply a bit more memory in
> advance?)

The system is doing unaligned I/O to the array (either via an app that has a file opened O_DIRECT, or via I/O directly to the device node), and there is corresponding memory pressure that is making the driver fail at the re-alignment process. The problem is likely temporary and the operation eventually succeeds after a few retries, but success is not guaranteed, and failure will almost certainly cause system instability.

The fix is to rewrite the alignment code in the driver to use the automatic, failure-free, alignment service that busdma provides. I'm happy to provide direction to anyone who wants to tackle this.

Scott
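
For readers wondering what "unaligned I/O" means from the application side, here is a minimal userland sketch in C. It is not the driver fix Scott describes. It only shows how a program doing O_DIRECT or raw device I/O could hand the kernel a buffer that is already aligned, which would presumably let the driver skip its re-alignment malloc. The 512-byte figure (ASSUMED_ALIGNMENT) and the helper names is_aligned, alloc_aligned and alignment_demo are assumptions made for illustration; check the driver source for the alignment it actually requires.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the controller wants sector-aligned buffers.
 * 512 is illustrative, not taken from the thread. */
#define ASSUMED_ALIGNMENT 512

/* Return nonzero if p sits on an `alignment`-byte boundary. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Allocate `length` bytes starting on an `alignment`-byte boundary.
 * `alignment` must be a power of two and a multiple of sizeof(void *).
 * Returns NULL on failure; release the buffer with free(). */
void *alloc_aligned(size_t length, size_t alignment)
{
    void *buf = NULL;

    if (posix_memalign(&buf, alignment, length) != 0)
        return NULL;
    return buf;
}

/* Small self-check: an aligned buffer passes the test, and the same
 * buffer offset by one byte does not. Returns 0 on success, -1 if the
 * allocation itself failed. */
int alignment_demo(void)
{
    char *buf = alloc_aligned(4096, ASSUMED_ALIGNMENT);
    int off_by_one_aligned;

    if (buf == NULL)
        return -1;
    assert(is_aligned(buf, ASSUMED_ALIGNMENT));
    off_by_one_aligned = is_aligned(buf + 1, ASSUMED_ALIGNMENT);
    free(buf);
    return off_by_one_aligned;
}
```

A buffer obtained from alloc_aligned would then be passed to read(2) or write(2) on the descriptor opened with O_DIRECT, with transfer sizes and file offsets kept to multiples of the same alignment. This only avoids triggering the problem from one application; the driver rewrite on top of busdma remains the real fix.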