Can Lustre be used to store data like streaming audio / video?

I've been scolded about considering it for DB storage but I'm looking at the relative merits of Lustre vs HDFS.

I'm moving to a clustered DB setup and wondering about Cassandra / Lustre vs Hadoop (i.e. HBase / HDFS). One offers flexibility in terms of mixing hardware components while the other is a "one stop shop".

Not trying to elicit a religious war - and yes, I've been reading as much as I can find about this. Just hoping for the opinion(s) of this side of the table.
Hi,

On 12/07/2012 10:26 AM, Jon Yeargers wrote:
> Can Lustre be used to store data like streaming audio / video?

Yes.

> I've been scolded about considering it for DB storage but I'm looking
> at the relative merits of Lustre vs HDFS.

DB reads and writes tend to produce small I/O, which Lustre does not handle as well as large I/O.

> I'm moving to a clustered DB setup and wondering about Cassandra /
> Lustre vs Hadoop (i.e. HBase / HDFS). One offers flexibility in terms of
> mixing hardware components while the other is a "one stop shop".

Honestly not sure. If you do perform some benchmarking between the two, I, and I'm sure others, would be greatly interested in seeing how the various FS technologies stack up!

> Not trying to elicit a religious war - and yes, I've been reading as
> much as I can find about this. Just hoping for the opinion(s) of this
> side of the table.

I don't think you'll find that here =)

-cf
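For anyone who wants a first rough look at the small-versus-large I/O point before doing a real benchmark, something like the following Python sketch may be a starting place. It is not from the thread: the function name `write_throughput_mb_s`, the 8 MiB default size, and the two block sizes are all assumptions chosen for illustration. It only times sequential writes with a single fsync at the end, so it says nothing about reads, metadata load, or the frequent syncs a database would issue.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, block_size, total_bytes=8 * 1024 * 1024):
    """Write total_bytes to a scratch file in `directory` using writes of
    block_size bytes, fsync once, and return the throughput in MB/s.

    Hypothetical helper for illustration; not part of any Lustre or Hadoop tool.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        # buffering=0 so every write() call goes to the OS at block_size.
        with os.fdopen(fd, "wb", buffering=0) as f:
            written = 0
            while written < total_bytes:
                written += f.write(block)
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Assumption: replace with the directory to test, e.g. a Lustre mount point.
    target = tempfile.gettempdir()
    for size in (4 * 1024, 1024 * 1024):
        rate = write_throughput_mb_s(target, size)
        print(f"{size:>8} byte writes: {rate:.1f} MB/s")
```

Running it once against a local disk directory and once against a Lustre mount, with a much larger `total_bytes` than the client cache, would give a crude comparison of how each handles 4 KiB versus 1 MiB writes.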
Dilger, Andreas
2012-Dec-07 17:34 UTC
[Lustre-discuss] Applications of Lustre - streaming?
On 2012-12-07, at 10:26, Jon Yeargers <yeargers at ohsu.edu> wrote:
> Can Lustre be used to store data like streaming audio / video? I've been
> scolded about considering it for DB storage but I'm looking at the
> relative merits of Lustre vs HDFS.

I've been using Lustre for years with my home MythTV (Linux PVR) setup. The only major change I made was to reduce the readahead window size so that there wasn't lag when videos first start playing due to the large readahead window being filled.

Of course, the suitability for a given workload depends on the hardware being used. Lustre will definitely give you better performance for the same hardware than HDFS, but if you need highly available data, the storage needs to be able to fail over between servers.

Cheers, Andreas
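The readahead remark can be put into rough numbers. The Python sketch below is only an illustration under stated assumptions: it models the startup lag as the time needed to fill the whole readahead window at a steady client throughput, which is a simplification, and the 40 MB, 4 MB and 10 MB/s figures are made up. The message does not say which setting was changed; `llite.*.max_read_ahead_mb` is the client-side Lustre tunable assumed here, and the helper only builds the `lctl` command without running it.

```python
def readahead_fill_seconds(window_mb, throughput_mb_s):
    """Approximate time to fill a readahead window of window_mb megabytes
    at a sustained read throughput of throughput_mb_s MB/s.

    Simplified model (an assumption): playback waits for the full window.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    return window_mb / throughput_mb_s


def set_readahead_command(window_mb):
    """Build, but do not run, an lctl command that caps client readahead.

    Assumes llite.*.max_read_ahead_mb is the relevant tunable.
    """
    return ["lctl", "set_param", f"llite.*.max_read_ahead_mb={int(window_mb)}"]


if __name__ == "__main__":
    # Hypothetical figures: a 10 MB/s link, large versus small window.
    for window in (40, 4):
        delay = readahead_fill_seconds(window, 10)
        print(f"{window} MB window: about {delay:.1f} s to fill")
    print(" ".join(set_readahead_command(4)))
```

Under these made-up figures the larger window takes ten times longer to fill, which matches the kind of startup lag described above; the right value for a real setup would have to be found by testing.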
Hello,

The question of HDFS storage via Lustre has been in the foreground of my thinking. The Hadoop HDFS processes are not aware of block devices: they only know of a filesystem mount point where they begin storing HDFS data.

Thus: if we provide a filesystem interface (say, a Lustre mount point) whose latency and throughput approach those of local disk storage (say, via InfiniBand), could we not have the various Hadoop nodes store their data in the Lustre filesystem? Would Hadoop even care?

I realize that this may not be a good place to bring it up. But there you go...

One of these days (with all of my ample spare time), I will benchmark it, and report, of course...

--jason
The redundancy of HDFS is very appealing. I've been weighing the merits of this vs a RAID-6 / server on Lustre. HDFS recommends avoiding RAID for the very reason that the data is (typically) saved in several locations.
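The trade-off between replication and RAID-6 can be made concrete with a little arithmetic. The Python sketch below is an illustration, not something from the thread: the function names are hypothetical, a replication factor of 3 is used because that is the usual HDFS default, and the 10-disk array is an arbitrary example.

```python
def usable_fraction_replication(replicas):
    """Fraction of raw capacity left when every block is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("need at least one copy")
    return 1 / replicas


def usable_fraction_raid6(disks):
    """Fraction of raw capacity left in one RAID-6 array of `disks` drives
    (two drives' worth of space goes to parity)."""
    if disks < 4:
        raise ValueError("RAID-6 needs at least 4 disks")
    return (disks - 2) / disks


if __name__ == "__main__":
    print(f"3x replication: {usable_fraction_replication(3):.2f} of raw capacity")
    print(f"10-disk RAID-6: {usable_fraction_raid6(10):.2f} of raw capacity")
```

Capacity is only half of the picture. Three replicas on different hosts survive the loss of two whole hosts for a given block, while RAID-6 survives two failed disks inside one array but not the loss of the server itself, which is why failover between servers matters on the Lustre side.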
If it weren't for the positive aspects of HDFS I wouldn't really be considering HBase (over Cassandra).

Any notion of the merits of Lustre's kernel-based mounts vs a FUSE-based mount (HDFS)? Whichever filesystem I go with, I will need to store "flat files" in it.
I have used FUSE for other filesystems: it's great if all you need is access to the data, but the performance is HORRIBLE.

--jason

On Friday, December 7, 2012 9:42 AM, Jon Yeargers <yeargers at ohsu.edu> wrote:
> If it weren't for the positive aspects of HDFS I wouldn't really be
> considering HBase (over Cassandra).
>
> Any notion of the merits of Lustre's kernel-based mounts vs a FUSE-based
> mount (HDFS)? Whichever filesystem I go with I will need to store "flat
> files" in.
It is the question of how to handle redundancy that stops me from immediately testing this idea of mine. Well, that and time, wherewithal, etc...

Hadoop is great because it uses the speed and latency of local disks to work with data, and does not require systems to be homogeneous. With the data replicated x times, it not only has the storage redundancy, but also x-1 other hosts that can work with the data if a host goes down. The downside appears to be that Hadoop can't really handle many nodes going down ungracefully.

My personal goal with my idea is to make a host a "dumb compute node" that I can shut down with impunity.

On 12/7/12 9:37 AM, "Jon Yeargers" <yeargers at ohsu.edu> wrote:
> The redundancy of HDFS is very appealing. I've been weighing the merits
> of this vs a RAID-6 / server on Lustre. HDFS recommends avoiding RAID for
> the very reason that the data is (typically) saved in several locations.
On 12/7/12 9:34 AM, Dilger, Andreas wrote:
> I've been using Lustre for years with my home MythTV (Linux PVR) setup.

Nerd. :)

--
Jeff Johnson
Co-Founder
Aeon Computing
jeff.johnson at aeoncomputing.com
www.aeoncomputing.com
Lustre works great for a Zoneminder installation. Zoneminder has no problems recording indefinitely on four different cameras to the Lustre mount point. Plus I can view it remotely through a virtual session at 30fps on all four cameras for as long as necessary.

+1 for MythTV and Lustre. Can't beat recording indefinitely on eight different tuners to 8TB of space. It seems to take a long time to fill up that much space with audio and video. Works great for archiving all my movies and audio tracks.

Backup and redundancy in my case come from another box with four 2TB drives raided together. I copy big files at around 105Mb/s, and smaller ones like .frm files from database directories in their own good time, mainly because there seem to be hundreds of thousands of them.

I even take Lustre on the road with MythTV and Zoneminder running in the Coach. Works great for providing entertainment along the way and over-the-road security from all four sides as I'm travelling. Plus it captures some nice videos and stills of the memories.