Werner Kuballa
2009-Aug-27 17:21 UTC
[Xen-users] iSCSI initiator problems accessing Infortrend storage
I am trying to use two Infortrend iSCSI systems (A16E-G2130-4, each with 16x750GB in RAID 6) for our Oracle VM environment (three big Dell R900 servers). When I copy files on the Infortrend I receive SCSI errors, resulting in loss of connection. I read some postings to this user group indicating that others are successfully using Infortrend storage systems. However, I have tried pretty much everything I can think of, but cannot get the Infortrend working reliably with Dom0.

This is what I have tried:

- simple iSCSI network connection, just one server connected with only one 1Gb Ethernet NIC to the Infortrend (no other connections on the Infortrend)
- when installing Oracle Enterprise Linux 5.3 natively on the R900, the Infortrend works fine in all the different network configurations (single connection, bonded NICs, etc.)
- accessing an EMC AX4 iSCSI system works without any problems

No matter what I tried, when utilizing the Infortrend in Dom0 I always get errors as shown below. And it gets worse when I am using "ocfs2" (which I eventually have to use): ocfs2 fences off the system when these errors occur and reboots the server.

Infortrend Support has no solution for this problem; they maintain that XEN is not a supported platform.

Any suggestions would be appreciated.

Regards,
Werner

Aug 27 05:16:24 ovm3 multipathd: IFT110: load table [0 8787689472 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:48 100]
Aug 27 05:16:24 ovm3 multipathd: IFT110: event checker started
Aug 27 05:16:24 ovm3 multipathd: dm-0: add map (uevent)
Aug 27 05:16:24 ovm3 multipathd: dm-0: devmap already registered
Aug 27 05:16:24 ovm3 multipathd: dm-1: add map (uevent)
Aug 27 05:16:24 ovm3 multipathd: dm-1: devmap already registered
Aug 27 05:16:25 ovm3 iscsid: received iferror -38
Aug 27 05:16:25 ovm3 last message repeated 2 times
Aug 27 05:16:25 ovm3 iscsid: connection1:0 is operational now
Aug 27 05:18:08 ovm3 kernel: kjournald starting. Commit interval 5 seconds
Aug 27 05:18:08 ovm3 kernel: EXT3 FS on dm-1, internal journal
Aug 27 05:18:08 ovm3 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Aug 27 05:19:16 ovm3 kernel: ping timeout of 5 secs expired, last rx 13940, last ping 15190, now 16440
Aug 27 05:19:16 ovm3 kernel: connection1:0: iscsi: detected conn error (1011)
Aug 27 05:19:17 ovm3 kernel: sd 4:0:0:1: SCSI error: return code = 0x00020000
Aug 27 05:19:17 ovm3 kernel: end_request: I/O error, dev sdd, sector 49529536
Aug 27 05:19:17 ovm3 kernel: device-mapper: multipath: Failing path 8:48.
Aug 27 05:19:17 ovm3 kernel: sd 4:0:0:1: SCSI error: return code = 0x00020000
Aug 27 05:19:17 ovm3 kernel: end_request: I/O error, dev sdd, sector 1569783144
Aug 27 05:19:17 ovm3 kernel: sd 4:0:0:1: SCSI error: return code = 0x00020000
Aug 27 05:19:17 ovm3 kernel: end_request: I/O error, dev sdd, sector 1569784168
Aug 27 05:19:17 ovm3 kernel: sd 4:0:0:1: SCSI error: return code = 0x00020000
Aug 27 05:19:17 ovm3 kernel: end_request: I/O error, dev sdd, sector 1569785192
Aug 27 05:19:17 ovm3 kernel: sd 4:0:0:1: SCSI error: return code = 0x00020000
Aug 27 05:19:17 ovm3 kernel: end_request: I/O error, dev sdd, sector 1569786216
Aug 27 05:19:17 ovm3 kernel: sd 4:0:0:1: SCSI error: return code = 0x00020000
Aug 27 05:19:17 ovm3 kernel: end_request: I/O error, dev sdd, sector 1569786224
Aug 27 05:19:17 ovm3 kernel: sd 4:0:0:1: SCSI error: return code = 0x00020000
Aug 27 05:19:17 ovm3 kernel: end_request: I/O error, dev sdd, sector 1569787248
Aug 27 05:19:17 ovm3 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Aug 27 05:19:17 ovm3 kernel: sd 4:0:0:1: SCSI error: return code = 0x00020000
Aug 27 05:19:17 ovm3 kernel: end_request: I/O error, dev sdd, sector 1569788272

Save Paper.
Think before you print.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
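The log in the message above can be read a little more closely with plain shell arithmetic. The sketch below is not from the thread: it splits the reported return code 0x00020000 into the four bytes of a Linux SCSI result word (driver, host, message, status) and subtracts the counters on the "ping timeout" line. The helper names decode_scsi_result and ticks_between are made up for illustration, and treating the three counters as kernel ticks (jiffies) is an assumption.

```shell
#!/bin/sh
# Split a Linux SCSI result word into its four bytes.
# Layout: driver<<24 | host<<16 | msg<<8 | status.
# (decode_scsi_result is a hypothetical helper name.)
decode_scsi_result() {
    result=$(( $1 ))
    printf 'driver=0x%02x host=0x%02x msg=0x%02x status=0x%02x\n' \
        $(( (result >> 24) & 0xff )) \
        $(( (result >> 16) & 0xff )) \
        $(( (result >> 8) & 0xff )) \
        $(( result & 0xff ))
}

# Difference between two counters from the "ping timeout" log line.
# Assumption: the counters are jiffies, so the result is in ticks.
ticks_between() {
    echo $(( $2 - $1 ))
}

decode_scsi_result 0x00020000   # return code from the "SCSI error" lines
ticks_between 13940 15190       # last rx   -> last ping
ticks_between 15190 16440       # last ping -> now
```

Host byte 0x02 corresponds to DID_BUS_BUSY in the kernel's SCSI headers, and the status byte is zero, which suggests the failures were raised on the initiator or transport side rather than reported by the target as a SCSI status. Both gaps come out at 1250 ticks; if the second one is the 5-second ping timeout, the kernel would be counting 250 ticks per second.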
Pasi Kärkkäinen
2009-Aug-27 20:35 UTC
Re: [Xen-users] iSCSI initiator problems accessing Infortrend storage
On Thu, Aug 27, 2009 at 10:21:46AM -0700, Werner Kuballa wrote:
>
> I am trying to use two Infortrend iSCSI systems (A16E-G2130-4, each with 16x750GB in RAID 6) for our Oracle VM environment (three big Dell R900 servers). When I copy files on the Infortrend I receive SCSI errors, resulting in loss of connection. I read some postings to this user group indicating that others are successfully using Infortrend storage systems. However, I have tried pretty much everything I can think of, but cannot get the Infortrend working reliably with Dom0. This is what I have tried:
>
> - simple iSCSI network connection, just one server connected with only one 1Gb Ethernet NIC to the Infortrend (no other connections on the Infortrend)
>
> - when installing Oracle Enterprise Linux 5.3 natively on the R900, the Infortrend works fine in all the different network configurations (single connection, bonded NICs, etc.)
>

What is your dom0 OS? Oracle Enterprise Linux 5.3 and the Xen included in it? Or something else?

> - accessing an EMC AX4 iSCSI system works without any problems
>

That's weird.

> No matter what I tried, when utilizing the Infortrend in Dom0 I always get errors as shown below. And it gets worse when I am using "ocfs2" (which I eventually have to use): ocfs2 fences off the system when these errors occur and reboots the server.
>
> Infortrend Support has no solution for this problem; they maintain that XEN is not a supported platform.
>
> Any suggestions would be appreciated.
>

Does this happen when only dom0 is running, i.e. no domUs?

If it happens when you have load on the server, try giving dom0 more weight (and dedicating a CPU core to it) so that it can always get the CPU time needed for iSCSI processing.

-- Pasi
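The suggestion to give dom0 more scheduler weight and pin it to a core could look roughly like the following with the xm toolstack. This is only a sketch: the weight of 512 and physical CPU 0 are assumed example values, the run and tune_dom0 helpers are hypothetical names, and the script prints the commands by default instead of running them.

```shell
#!/bin/sh
# Dry run by default: print the xm commands instead of executing them.
# Set DRY_RUN=0 on a real dom0 to actually run them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: raise dom0's credit-scheduler weight and pin VCPU 0.
tune_dom0() {
    weight=$1   # credit scheduler weight; 256 is the default, 512 is an assumed example
    pcpu=$2     # physical CPU for dom0's VCPU 0 (assumed: 0)
    run xm sched-credit -d Domain-0 -w "$weight"
    run xm vcpu-pin Domain-0 0 "$pcpu"
}

tune_dom0 512 0
```

Pinning dom0 alone does not keep guests off that core; to dedicate it, the domUs would also need to be pinned to other CPUs. With no domUs running, as in this thread, the effect is likely to be small.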
Werner Kuballa
2009-Aug-27 21:18 UTC
Re: [Xen-users] iSCSI initiator problems accessing Infortrend storage
Yes, I understand that Oracle VM is based on Red Hat.

This is what Dom0 shows:

root@hsarcovmsp3:~# ls -lhd /etc/*releas*
-rw-r--r-- 1 root root 31 Mar 12 11:20 /etc/enterprise-release
-rw-r--r-- 1 root root 31 Mar 12 11:20 /etc/ovs-release
-rw-r--r-- 1 root root 31 Mar 12 11:20 /etc/redhat-release
root@hsarcovmsp3:~# cat /etc/*releas*
Oracle VM server release 2.1.5
Oracle VM server release 2.1.5
Oracle VM server release 2.1.5
root@hsarcovmsp3:~# uname -a
Linux hsarcovmsp3 2.6.18-8.1.15.5.1.el5xen #1 SMP Thu Aug 6 15:25:40 EDT 2009 i686 i686 i386 GNU/Linux
root@hsarcovmsp3:~#

At this point in time I have only Dom0 running, no DomUs. There is obviously no shortage of CPUs (the system has 16 cores :-). I did increase the memory for Dom0 to 3.5GB, but that didn't change anything.

I do have a tcpdump capture of the traffic when the error occurred, but I am not an expert in the iSCSI protocol, and it definitely does not appear to be something one can pick up quickly...

I guess I am looking for some configuration tricks or whatever... something I have not thought of. Maybe somebody experienced similar problems and found some workaround?

Regards,
Werner
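A capture like the one mentioned above can be narrowed down without knowing the iSCSI protocol itself, by first looking only at TCP packets that tear the connection down. The sketch below builds such a pcap filter and prints the matching tcpdump command. The file name iscsi-error.pcap and the helper names are hypothetical, and port 3260 (the registered iSCSI port) is an assumption about how the Infortrend target is configured.

```shell
#!/bin/sh
# Build a pcap filter for iSCSI traffic carrying TCP RST or FIN flags.
# Port 3260 is the registered iSCSI port; pass another if the target differs.
# (iscsi_teardown_filter is a hypothetical helper name.)
iscsi_teardown_filter() {
    port=${1:-3260}
    echo "tcp port $port and (tcp[tcpflags] & (tcp-rst|tcp-fin) != 0)"
}

# Print the tcpdump command that reads a saved capture with that filter.
# -nn disables name resolution, -tttt prints full date and time stamps.
show_cmd() {
    echo "tcpdump -nn -tttt -r $1 '$(iscsi_teardown_filter "$2")'"
}

show_cmd iscsi-error.pcap 3260
```

Which side sends the first RST or FIN around the time of the "detected conn error (1011)" message would indicate whether the initiator or the target closed the session.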
Pasi Kärkkäinen
2009-Aug-28 07:00 UTC
Re: [Xen-users] iSCSI initiator problems accessing Infortrend storage
On Thu, Aug 27, 2009 at 02:18:29PM -0700, Werner Kuballa wrote:
>
> Yes, I understand that Oracle VM is based on Red Hat.
>
> This is what Dom0 shows:
>
> root@hsarcovmsp3:~# ls -lhd /etc/*releas*
> -rw-r--r-- 1 root root 31 Mar 12 11:20 /etc/enterprise-release
> -rw-r--r-- 1 root root 31 Mar 12 11:20 /etc/ovs-release
> -rw-r--r-- 1 root root 31 Mar 12 11:20 /etc/redhat-release
> root@hsarcovmsp3:~# cat /etc/*releas*
> Oracle VM server release 2.1.5
> Oracle VM server release 2.1.5
> Oracle VM server release 2.1.5
> root@hsarcovmsp3:~# uname -a
> Linux hsarcovmsp3 2.6.18-8.1.15.5.1.el5xen #1 SMP Thu Aug 6 15:25:40 EDT 2009 i686 i686 i386 GNU/Linux
> root@hsarcovmsp3:~#
>

That's a really old release; Red Hat has made a lot of bugfixes related to the open-iscsi initiator since then.

> At this point in time I have only Dom0 running, no DomUs. There is obviously no shortage of CPUs (the system has 16 cores :-). I did increase the memory for Dom0 to 3.5GB, but that didn't change anything.
>
> I do have a tcpdump capture of the traffic when the error occurred, but I am not an expert in the iSCSI protocol, and it definitely does not appear to be something one can pick up quickly...
>
> I guess I am looking for some configuration tricks or whatever... something I have not thought of. Maybe somebody experienced similar problems and found some workaround?
>

If there's a way to upgrade to the latest kernel-xen, i.e. 2.6.18-128, then do that first. It might fix your problems.

-- Pasi
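To make the "really old release" comparison concrete, the running kernel string from the uname output can be compared with the 2.6.18-128 release named in the reply. This is a minimal sketch assuming GNU sort with the -V (version sort) option is available; is_older is a hypothetical helper name, and a plain string comparison cannot see vendor backports, so it is only a rough guide for Oracle's own kernel numbering.

```shell
#!/bin/sh
# Return success if version $1 sorts strictly before version $2.
# Relies on GNU sort -V (version sort), an assumption about the host.
# (is_older is a hypothetical helper name.)
is_older() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

running="2.6.18-8.1.15.5.1.el5xen"   # from the uname -a output in the thread
target="2.6.18-128"                  # release named in the reply

if is_older "$running" "$target"; then
    echo "running kernel $running is older than $target"
else
    echo "running kernel $running is $target or newer"
fi
```

On a live system, running="$(uname -r)" could replace the hard-coded string.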