Hello,

I have been reading http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf on setting up Hadoop over Lustre.

In a typical Hadoop setup, we have one NameNode and a number of DataNodes. For the same setup with Lustre as the backend, the document says:

".............Our experiments run on cluster with 8 nodes in total, one is mds/namenode, the rest are OSS/DataNode".

Where does the Lustre client fit in here?

For Hadoop to work, we specify a file system parameter, in particular /lustre. We don't have /lustre mounted on the OSS nodes. How is this possible?
Hi,

In general, Lustre clients are separate nodes that act only as clients of the larger file system (MGS, MDS, and the OSSs with their OSTs).

You can run a client mount on any of the server nodes; however, it's not recommended, as it can lead to deadlock and memory contention problems.

Ideally, your Hadoop data source from Lustre would be a Lustre client. This can be a network-booted machine which then mounts the Lustre file system via its interconnect.

So think of it like this:

(LUSTRE: [MGS <-> MDS] <-> [OSS { OST OST OST }]) <-> Client A (mount /mnt/lustre)
|______________________________________^

Then Client A acts as the data store, exporting /mnt/lustre to your Hadoop cluster.

I hope this makes sense =)

-cf

On 03/11/2013 11:33 AM, linux freaker wrote:
> I wonder where does the Lustre Client fit here?
>
> For Hadoop to work , we mention filesystem parameter esp /lustre here.
> We dont have /lustre on OSS.
> How is it possible?
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
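As a rough illustration of the point that /mnt/lustre exists only on a node with a client mount, here is a minimal Python sketch that lists Lustre mounts from text in /proc/mounts format. The function name lustre_mounts, the NID 10.0.0.1@tcp, and the file system name in the sample are assumptions made up for this example; they do not come from the thread.

```python
def lustre_mounts(mounts_text):
    """Return the mount points whose filesystem type is 'lustre'."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "lustre":
            points.append(fields[1])
    return points


# Hypothetical sample in /proc/mounts format; the MGS NID "10.0.0.1@tcp"
# and the file system name "lustre" are invented for illustration.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
10.0.0.1@tcp:/lustre /mnt/lustre lustre rw,flock 0 0
"""

if __name__ == "__main__":
    print(lustre_mounts(SAMPLE))
```

On a real client, the same check could be run as lustre_mounts(open("/proc/mounts").read()). On an OSS or MDS without a client mount, the returned list would presumably be empty, which matches the observation that /lustre is not present there.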
Hi Colin,

Thanks for the response. So you mean I need to install Hadoop on just one Lustre client. Is that enough to set up Hadoop? But then how will I start the DataNode, and where will it run? Do I need to keep the OSS and MDS nodes untouched for Hadoop?

I have to test a wordcount application. Can you suggest the steps to configure Hadoop under this setup?

On Tue, Mar 12, 2013 at 12:23 AM, Colin Faber <colin_faber at xyratex.com> wrote:
> Then Client A acts as data store exporting /mnt/lustre to your hadoop
> cluster.
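For the configuration question above, the following Python sketch generates a minimal core-site.xml that points Hadoop at a POSIX mount instead of HDFS. It is only a sketch under stated assumptions: it uses the Hadoop 1.x property names fs.default.name and hadoop.tmp.dir, it assumes the Lustre client mount is at /mnt/lustre, and the hadoop-tmp directory name is hypothetical. The whitepaper may prescribe different or additional settings, and per-node directories may need to differ in practice.

```python
import xml.etree.ElementTree as ET


def build_core_site(lustre_dir):
    """Build a core-site.xml tree that points Hadoop at a POSIX mount."""
    props = {
        # file:/// selects Hadoop's local (POSIX) file system instead of hdfs://
        "fs.default.name": "file:///",
        # Assumption: keep Hadoop's working data on the shared Lustre mount.
        # The "hadoop-tmp" directory name is hypothetical.
        "hadoop.tmp.dir": lustre_dir + "/hadoop-tmp",
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


def to_xml(root):
    """Serialize the configuration tree to a string."""
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_xml(build_core_site("/mnt/lustre")))
```

With file:/// as the default file system, every Hadoop node that runs tasks would need the same Lustre client mount, so that all of them see the same paths.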
Can you please suggest?

On Mon, Mar 18, 2013 at 1:26 PM, linux freaker <linuxfreaker at gmail.com> wrote:
> I have wordcount application testing to be done. Can you suggest me
> the steps to configure Hadoop under this setup?
---------- Forwarded message ----------
From: linux freaker
Date: Monday, March 18, 2013
Subject: Understanding lustre setup ..
To: Colin Faber <colin_faber at xyratex.com>

On Monday, March 18, 2013, Colin Faber <colin_faber at xyratex.com> wrote:
> Hi,
>
> On 03/15/2013 10:56 AM, linux freaker wrote:
>> Let me explain in brief the clear picture I would need to understand:
>>
>> Aim: Comparing Hadoop over HDFS vs. Hadoop over Lustre
>>
>> Things I tried:
>>
>> I took 5 machines: 1 MDS, 2 OSS/OST and 2 Lustre clients (each with 4 GB RAM and a 700 GB hard disk).
>> I created around 6 OSTs on each OSS, 6 GB each, through LVM.
>
> Using LVM-based OSTs can slow things down a lot. Really, for best performance you should be using unpartitioned raw disk for ldiskfs formatting.
>
>> On Wed, Mar 13, 2013 at 8:01 AM, linux freaker <linuxfreaker at gmail.com> wrote:
>> > On Wednesday, March 13, 2013, Colin Faber <colin_faber at xyratex.com> wrote:
>> >> Hi,
>> >>
>> >> I'm sorry, it was a busy day for me. I will try and respond appropriately
>> >> to your questions tomorrow.
>> >>
>> >> -cf
>> >>
>> >> On 03/12/2013 07:45 PM, linux freaker wrote:
>> >>> On Tuesday, March 12, 2013, linux freaker <linuxfreaker at gmail.com> wrote:
>> >>> > On Tuesday, March 12, 2013, linux freaker <linuxfreaker at gmail.com> wrote:
>> >>> >> On Tuesday, March 12, 2013, Colin Faber <colin_faber at xyratex.com> wrote:
>> >>> >>> Hi,
>> >>> >>>
>> >>> >>> On 03/11/2013 08:13 PM, linux freaker wrote:
>> >>> >>>> On Tuesday, March 12, 2013, linux freaker <linuxfreaker at gmail.com> wrote:

Thanks for the response. I got what you suggested, so I will have 1 MDS, 2 OSS and 2 Lustre clients. I will install Hadoop on both Lustre clients.

May I know the steps to configure Hadoop over Lustre? I tried to configure core-site.xml, but I have no idea which other files I need. Can you share the steps?

Also, I am trying to run the wordcount example over Lustre. Please suggest.
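To make the wordcount test mentioned above concrete, here is a small Python sketch of what such a job computes, written as a map step and a reduce step over plain files. This is not Hadoop's wordcount example and does not use Hadoop at all; the function names and the regular-expression tokenizer are choices made for this illustration, and any input directory under the Lustre mount is a hypothetical path.

```python
import os
import re
from collections import Counter


def map_words(text):
    """Map step: return every lowercase word found in the text."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_words(texts):
    """Reduce step: sum the occurrences of each word across all inputs."""
    totals = Counter()
    for text in texts:
        totals.update(map_words(text))
    return totals


def read_dir(path):
    """Yield the contents of every regular file directly under path."""
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isfile(full):
            with open(full, encoding="utf-8", errors="replace") as handle:
                yield handle.read()


if __name__ == "__main__":
    sample = ["hadoop over lustre", "hadoop over hdfs"]
    print(count_words(sample).most_common(2))
```

On a Lustre client, count_words(read_dir("/mnt/lustre/input")) would count words in files under a hypothetical input directory. Comparing its totals with the output of the Hadoop job on the same files is one way to sanity-check the setup, though tokenization differences can produce slightly different counts.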