Hi,

just got a quote from our campus reseller saying that readzilla and logzilla
are not available for the X4540 - hmm, strange... Anyway, I'm wondering
whether it is possible/supported/would make sense to use a Sun Flash
Accelerator F20 PCIe card in an X4540 instead of 2.5" SSDs?

If so, is it possible to "partition" the F20, e.g. into a 36 GB "logzilla"
and a 60 GB "readzilla" (also interesting for other Xnnnn servers)?

Wrt. supercapacitors: I would guess that, at least on the X4540, it doesn't
give one more protection, since if power is lost the HDDs do not respond
anymore, and thus it doesn't matter whether the log cache is protected for
a short time or not. Is this correct?

Regards,
jel.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768
Jens Elkner wrote:
> just got a quote from our campus reseller saying that readzilla and logzilla
> are not available for the X4540 - hmm, strange... Anyway, I'm wondering
> whether it is possible/supported/would make sense to use a Sun Flash
> Accelerator F20 PCIe card in an X4540 instead of 2.5" SSDs?
>
> If so, is it possible to "partition" the F20, e.g. into a 36 GB "logzilla"
> and a 60 GB "readzilla" (also interesting for other Xnnnn servers)?

IIRC the card presents 4x LUNs, so you could use each of them for a
different purpose. You could also use different slices.

> [...] it doesn't matter whether the log cache is protected for a short
> time or not. Is this correct?

It still does. The capacitor is not for flushing data to disk drives!
The card has a small amount of DRAM on it which is flushed to flash;
the capacitor is there to make sure that actually happens if power is lost.
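Just to illustrate the partitioning idea (pool and device names below are
placeholders only - the actual cXtYdZ names depend on how the F20's flash
modules enumerate on the host): after carving a module into two slices with
format(1M), something like

    # slice s0 as a separate intent-log device, slice s1 as L2ARC read cache
    zpool add tank log c4t0d0s0
    zpool add tank cache c4t0d0s1

should do, or one could simply dedicate whole LUNs to each role instead of
slices.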
As to whether it makes sense (as opposed to two distinct physical devices),
you would have read cache hits competing with log writes for bandwidth.
I doubt both will be pleased :-)

On 12/12/09, Robert Milkowski <milek at task.gda.pl> wrote:
> IIRC the card presents 4x LUNs, so you could use each of them for a
> different purpose. You could also use different slices.
>
> It still does. The capacitor is not for flushing data to disk drives!
> The card has a small amount of DRAM on it which is flushed to flash;
> the capacitor is there to make sure that actually happens if power is lost.

-- 
Regards,
Andrey
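P.S.: One way to see whether that contention actually shows up (assuming
the F20 modules appear as ordinary devices) is to watch them under load,
e.g.

    iostat -xn 10

and compare read vs. write throughput and %b on the log and cache devices.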
Andrey Kuzmin wrote:
> As to whether it makes sense (as opposed to two distinct physical devices),
> you would have read cache hits competing with log writes for bandwidth.
> I doubt both will be pleased :-)

As usual, it depends on your workload. In many real-life scenarios the
bandwidth probably won't be an issue. Then also keep in mind that you can
put up to 4 SSD modules on it, and each module IIRC is presented as a
separate device anyway. So in order to get all the performance, you need
to make sure to issue I/O to all modules.

-- 
Robert Milkowski
http://milek.blogspot.com
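Spreading the roles over all four modules could then look roughly like this
(pool and device names are again placeholders only):

    # two modules as (non-mirrored) log devices, two as L2ARC cache
    zpool add tank log c4t0d0 c4t1d0
    zpool add tank cache c4t2d0 c4t3d0

ZFS spreads intent-log writes across multiple log devices and read-cache
data across multiple cache devices, so all four modules end up seeing I/O.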
On Sat, Dec 12, 2009 at 03:28:29PM +0000, Robert Milkowski wrote:

Hi Robert,

> IIRC the card presents 4x LUNs, so you could use each of them for a
> different purpose. You could also use different slices.

Oh, cool - IMHO this would be sufficient for our purposes (see next posting).

> It still does. The capacitor is not for flushing data to disk drives!
> The card has a small amount of DRAM on it which is flushed to flash;
> the capacitor is there to make sure that actually happens if power is lost.

Yepp - found the specs. (BTW: it was probably too late to think about the
term "Flash Accelerator" with DRAM Prestoserve in mind ;-)).

Thanx,
jel.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768
On Sat, Dec 12, 2009 at 04:23:21PM +0000, Andrey Kuzmin wrote:
> As to whether it makes sense (as opposed to two distinct physical devices),
> you would have read cache hits competing with log writes for bandwidth.
> I doubt both will be pleased :-)

Hmm - good point. What I'm trying to accomplish:

Actually our current prototype thumper setup is:
        root pool (1x 2-way mirror SATA)
        hotspare  (2x SATA, shared)
        pool1 (12x 2-way mirror SATA)   ~25% used       user homes
        pool2 (10x 2-way mirror SATA)   ~25% used       mm files, archives, ISOs

So pool2 is not really a problem - it delivers about 600 MB/s uncached,
about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8 GB ISO),
and is not continuously stressed. However, sync write is only ~200 MB/s,
i.e. ~20 MB/s per mirror.

Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice,
usually via NFS and a little bit via Samba -> a lot of more or less small
files, probably widely spread over the platters. E.g. checking out a
project from a svn|* repository into a home takes "hours". Also, having
one's workspace on NFS isn't fun (compared to a Linux xfs-driven local
soft 2-way mirror).

So data are currently coming in/going out via 1 Gbps aggregated NICs; for
the X4540 we plan to use one 10 Gbps NIC (and may experiment with two some
time later). So max. 2 GB/s read and write. This still leaves 2 GB/s in
and out for the last PCIe 8x slot - the F20. Since the IO55 is bound with
4 GB/s bidirectional HT to Mezzanine Connector 1, in theory those 2 GB/s
to and from the F20 should be possible.

So IMHO wrt. bandwidth it basically makes no real difference whether one
puts 4 SSDs into HDD slots or uses the 4 flash modules on the F20 (even
when distributing the SSDs over the IO55(2) and MCP55).

However, having it on a separate HT link than the HDDs might be an
advantage. Also one would be much more flexible/able to "scale
immediately", i.e. one doesn't need to re-organize the pools because of
now "unavailable" slots and is still able to use all HDD slots for normal
HDDs (we are certainly going to upgrade the X4500 to an X4540 next year ...).
(And if Sun makes an F40 - dropping the SAS ports and putting 4 more flash
modules on it, or managing to get flash modules with double the speed -
one could probably really get ~1.2 GB/s write and ~2 GB/s read.)

So, this seems to be a really interesting thing, and I expect at least wrt.
user homes a real improvement, no matter how the final configuration will
look.

Maybe the experts at the source are able to do some 4x SSD vs. 1x F20
benchmarks? I guess at least if they turn out to be good enough, it
wouldn't hurt ;-)

> > Jens Elkner wrote:
...
> >> whether it is possible/supported/would make sense to use a Sun Flash
> >> Accelerator F20 PCIe Card in a X4540 instead of 2.5" SSDs?

Regards,
jel.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768
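A sketch of how the F20 could be added to that layout (purely illustrative -
real device names will differ, and whether to mirror the log is a trade-off
between protection against a module failure and log write bandwidth):

    # pool1 (user homes): mirrored log for safety, remaining modules as read cache
    zpool add pool1 log mirror c4t0d0 c4t1d0
    zpool add pool1 cache c4t2d0 c4t3d0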
On Dec 13, 2009, at 5:04 PM, Jens Elkner wrote:
> Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice,
> usually via NFS and a little bit via Samba -> a lot of more or less small
> files, probably widely spread over the platters. E.g. checking out a
> project from a svn|* repository into a home takes "hours". Also, having
> one's workspace on NFS isn't fun (compared to a Linux xfs-driven local
> soft 2-way mirror).

This is probably a latency problem, not a bandwidth problem. Use zilstat
to see how much ZIL traffic you have and, if the number is significant,
consider using the F20 for a separate log device.
 -- richard
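For example (the exact zilstat options depend on the version of the script;
the device name below is only a placeholder):

    # sample ZIL activity every 10 seconds while the home directories are busy
    ./zilstat.ksh 10

If the reported ZIL bytes/s and ops stay consistently high, a dedicated log
device should pay off, e.g.

    zpool add pool1 log c4t0d0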
On Mon, Dec 14, 2009 at 4:04 AM, Jens Elkner
<jel+zfs at cs.uni-magdeburg.de> wrote:
> Actually our current prototype thumper setup is:
>         root pool (1x 2-way mirror SATA)
>         hotspare  (2x SATA, shared)
>         pool1 (12x 2-way mirror SATA)   ~25% used       user homes
>         pool2 (10x 2-way mirror SATA)   ~25% used       mm files, archives, ISOs
>
> Problem is pool1 - user homes! [...] a lot of more or less small
> files, probably widely spread over the platters. E.g. checking out a
> project from a svn|* repository into a home takes "hours".

Flash-based read cache should help here by minimizing (metadata) read
latency, and a flash-based log would bring down write latency. The only
drawback of using a single F20 is that you're trying to minimize both
with the same device.

> Maybe the experts at the source are able to do some 4x SSD vs. 1x F20
> benchmarks? I guess at least if they turn out to be good enough, it
> wouldn't hurt ;-)

Would be interesting indeed.

Regards,
Andrey
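(Once a cache device is in place, whether the read side actually benefits
can be checked via the ARC kstats, e.g.

    kstat -p zfs:0:arcstats:l2_hits zfs:0:arcstats:l2_misses zfs:0:arcstats:l2_size

a rising l2_hits relative to l2_misses would indicate the L2ARC is earning
its keep.)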
On Mon, Dec 14, 2009 at 01:29:50PM +0300, Andrey Kuzmin wrote:
> On Mon, Dec 14, 2009 at 4:04 AM, Jens Elkner
> <jel+zfs at cs.uni-magdeburg.de> wrote:
...
> > Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice
...
> Flash-based read cache should help here by minimizing (metadata) read
> latency, and a flash-based log would bring down write latency.

Hmmm, not yet sure - I think writing via NFS is the biggest problem.
Anyway, I've almost finished the work on a 'generic collector' and data
visualizer, which allows us to better correlate the numbers with each
other on the fly (i.e. no rrd pain) and hopefully understand them a
little bit better ;-).

> The only drawback of using a single F20 is that you're trying to
> minimize both with the same device.

Yepp. But would that scenario change much if one puts 4 SSDs into HDD
slots instead? I guess not really - it might even be worse, because it
"disturbs" the data path from/to the HDD controllers. Anyway, I'll try
that out next year, when those neat toys are officially supported (and
the budget for this has got its final approval, of course).

Regards,
jel.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768