Hi everyone, David R. here; I''m new to this mailing list but I''ve been incredibly wrapped up in Ruby on Rails for the last two or three months. I''ve built a fully functional web application and we''re about to roll it out to our customers. However, my supervisor would like to add just one more feature to it: listing the number of online users for a particular ISP. Here''s the problem: the table that I need to pull data from has over 7,000,000 entries in it, and my rails app hangs when trying to process all this data, and the server it''s on needs to be rebooted. This application will service a very small number of users, so we have placed my app on a weaker box; I know that upgrading the box to 2GB of RAM and a stronger processor would probably fix this problem, but the process would still take a few minutes to run successfully even on a machine that could handle it. My question is this: Is there a graceful Rails way to alleviate this problem? Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060615/b0c8070c/attachment.html
On Jun 15, 2006, at 9:24 AM, David R. wrote:> Hi everyone, David R. here; I''m new to this mailing list but I''ve > been incredibly wrapped up in Ruby on Rails for the last two or > three months. > > I''ve built a fully functional web application and we''re about to > roll it out to our customers. However, my supervisor would like to > add just one more feature to it: listing the number of online users > for a particular ISP. > > Here''s the problem: the table that I need to pull data from has > over 7,000,000 entries in it, and my rails app hangs when trying to > process all this data, and the server it''s on needs to be > rebooted. This application will service a very small number of > users, so we have placed my app on a weaker box; I know that > upgrading the box to 2GB of RAM and a stronger processor would > probably fix this problem, but the process would still take a few > minutes to run successfully even on a machine that could handle it. > > My question is this: Is there a graceful Rails way to alleviate > this problem? > > Thanks, > David > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/railsHey David- Welcome to the list! I think one solution to your problem would be to create a script that uses ActiveRecord outside of your rails app. Then you could have this script run every 5 or 10 minutes and write the results to disk. THen when someone hits your rails app for the information you can just serve up the cached results. Cheers- -Ezra -------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060615/4fd6d392/attachment-0001.html
On 15-Jun-06, at 9:24 AM, David R. wrote:> Hi everyone, David R. here; I''m new to this mailing list but I''ve > been incredibly wrapped up in Ruby on Rails for the last two or > three months. > > I''ve built a fully functional web application and we''re about to > roll it out to our customers. However, my supervisor would like to > add just one more feature to it: listing the number of online users > for a particular ISP. > > Here''s the problem: the table that I need to pull data from has > over 7,000,000 entries in it, and my rails app hangs when trying to > process all this data, and the server it''s on needs to be > rebooted. This application will service a very small number of > users, so we have placed my app on a weaker box; I know that > upgrading the box to 2GB of RAM and a stronger processor would > probably fix this problem, but the process would still take a few > minutes to run successfully even on a machine that could handle it. > > My question is this: Is there a graceful Rails way to alleviate > this problem? >Hey, this is kind of a vague question. You might want to show a) exactly what you''re trying to achieve and (more importantly) b) how you''re currently doing it That said, you''re not actually pulling in all 7 million records are you? You say "list the number of online users" and then say "hangs trying to process all this data". It sounds like you''re doing: SevenMillion.find(:all).length when this would be a tad faster: SevenMillion.count Provide some more (targeted/complete) information and you''ll probably get a speedy and accurate answer from members of the list. Regards, Trevor -- Trevor Squires http://somethinglearned.com> Thanks, > David > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails-------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060615/5d094e9c/attachment.html
Much thanks to everyone who responded. Here''s what I''m trying to do exactly: In our database, there is a Calls table that holds all of the calls made for roughly 12 ISPs that our company provides ports to. The Calls table holds the following data that''s important to me: The modem box the customer dialed into, the customer''s username, the customer''s status type (I''ll elaborate on this in a moment), and the date/time the call was made. Each customer gets assigned a "status type" of 1 when they sign on successfully, a status type of 3 when they''ve been logged on for a period of time (not sure what this interval is) and are still active and thus they are not kicked offline for being idle, and a status type of 2 when they sign off. Unfortunately, our system doesn''t have a database that only has currently online users, so I can''t pull from any such luxurious place. In order to find out who is currently online, I have to: 1. Grab all of the calls from the Calls table where the status type is equal to 2 (because these are users that have signed out) and store them in an array. 2. Becuase each user will have many entries in the Calls table where they have signed out (status type 2), I find those entries in the array (this is a gigantic array now) that are listed an even number of times and delete them from the array. (Reason: if the user has an odd number of signouts, then that user is currently online.) 3. I''m left with a smaller array of nothing but usernames of users who are currently online. Now I use array.uniq to shrink the array one last time to just one entry per user. I can then just list these in a table in the view. The problem is that it has to do this seven million (or more) times. On 6/15/06, Trevor Squires <trevor@protocool.com> wrote:> > On 15-Jun-06, at 9:24 AM, David R. wrote: > > Hi everyone, David R. here; I''m new to this mailing list but I''ve been > incredibly wrapped up in Ruby on Rails for the last two or three months. > > I''ve built a fully functional web application and we''re about to roll it > out to our customers. However, my supervisor would like to add just one > more feature to it: listing the number of online users for a particular > ISP. > > Here''s the problem: the table that I need to pull data from has over > 7,000,000 entries in it, and my rails app hangs when trying to process all > this data, and the server it''s on needs to be rebooted. This application > will service a very small number of users, so we have placed my app on a > weaker box; I know that upgrading the box to 2GB of RAM and a stronger > processor would probably fix this problem, but the process would still take > a few minutes to run successfully even on a machine that could handle it. > > My question is this: Is there a graceful Rails way to alleviate this > problem? > > > Hey, > > this is kind of a vague question. You might want to show a) exactly what > you''re trying to achieve and (more importantly) b) how you''re currently > doing it > > That said, you''re not actually pulling in all 7 million records are you? > > You say "list the number of online users" and then say "hangs trying to > process all this data". It sounds like you''re doing: > > SevenMillion.find(:all).length > > when this would be a tad faster: > > SevenMillion.count > > Provide some more (targeted/complete) information and you''ll probably get > a speedy and accurate answer from members of the list. > > Regards, > Trevor > -- > Trevor Squires > http://somethinglearned.com > > > > Thanks, > David > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails > > > > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060615/f9b524a9/attachment-0001.html
On Jun 15, 2006, at 10:28 AM, David R. wrote:> Much thanks to everyone who responded. Here''s what I''m trying to > do exactly: > > 1. Grab all of the calls from the Calls table where the status > type is equal to 2 (because these are users that have signed out) > and store them in an array. > > 2. Becuase each user will have many entries in the Calls table > where they have signed out (status type 2), I find those entries in > the array (this is a gigantic array now) that are listed an even > number of times and delete them from the array. (Reason: if the > user has an odd number of signouts, then that user is currently > online.) > > 3. I''m left with a smaller array of nothing but usernames of users > who are currently online. Now I use array.uniq to shrink the array > one last time to just one entry per user. I can then just list > these in a table in the view. > > The problem is that it has to do this seven million (or more) times.Yeah that is going to be a problem. Creating that big of an array and then doing operations on it in ruby is the bottleneck. Is there any way you can get this information with pure sql? That would probably be the better way to do it. ANd also for something liek this you probably don''t want to use activerecord objects because they are expensive to instantiate. You may want to use just the mysql-ruby bindings(or whatever db you are using) to do the queries directly. Regardless trying to work with arrays that large all in memory at once is going to choke ruby. SQL is much better at manipulating large data sets like this. Good Luck- -Ezra
David R. wrote:> Much thanks to everyone who responded. Here''s what I''m trying to do > exactly: > > In our database, there is a Calls table that holds all of the calls > made for roughly 12 ISPs that our company provides ports to. The > Calls table holds the following data that''s important to me: > > The modem box the customer dialed into, the customer''s username, the > customer''s status type (I''ll elaborate on this in a moment), and the > date/time the call was made. > > Each customer gets assigned a "status type" of 1 when they sign on > successfully, a status type of 3 when they''ve been logged on for a > period of time (not sure what this interval is) and are still active > and thus they are not kicked offline for being idle, and a status type > of 2 when they sign off. > > Unfortunately, our system doesn''t have a database that only has > currently online users, so I can''t pull from any such luxurious > place. In order to find out who is currently online, I have to: > > 1. Grab all of the calls from the Calls table where the status type > is equal to 2 (because these are users that have signed out) and store > them in an array. > > 2. Becuase each user will have many entries in the Calls table where > they have signed out (status type 2), I find those entries in the > array (this is a gigantic array now) that are listed an even number of > times and delete them from the array. (Reason: if the user has an odd > number of signouts, then that user is currently online.) > > 3. I''m left with a smaller array of nothing but usernames of users > who are currently online. Now I use array.uniq to shrink the array > one last time to just one entry per user. I can then just list these > in a table in the view. > > The problem is that it has to do this seven million (or more) times. >I think you really want to do this with find_by_sql. All of your conditions can be done database side which should be several orders of magnitude faster. -- Jack Christensen jackc@hylesanderson.edu
On Thu, 15 Jun 2006, David R. wrote:> 1. Grab all of the calls from the Calls table where the status type is > equal to 2 (because these are users that have signed out) and store them in > an array. > > 2. Becuase each user will have many entries in the Calls table where they > have signed out (status type 2), I find those entries in the array (this is > a gigantic array now) that are listed an even number of times and delete > them from the array. (Reason: if the user has an odd number of signouts, > then that user is currently online.) >I don''t follow your logic here -- you''ve stated above if they''ve logged out then they have a status of 2 then they''ve logged out. Why then the count of the odd number of statuses? Also, why not do a query such as select user, count(status) group by user having status = 2 Of course, it depends on your background database, but .... This way you would still need to go through and figure who was even or odd, but there''d be a whole lot less data to process!! Here''s an example of something along those lines: http://www.w3schools.com/sql/sql_groupby.asp Why not let the database do the work for you? Matt
Find_by_sql and the link Matt provided look to be exactly what I''m looking for. I have two days to clean up my app''s code and get it rolled out to customers, and if in that time I have a window of opportunity to implement this idea in version 1.0 I''ll definitely be using these suggestions. Thanks guys, if anything arises I''ll keep you posted. (This mailing list is a bag of pure awesome.) On 6/15/06, Matthew Williams <matt@harpstar.com> wrote:> > > > On Thu, 15 Jun 2006, David R. wrote: > > > 1. Grab all of the calls from the Calls table where the status type is > > equal to 2 (because these are users that have signed out) and store them > in > > an array. > > > > 2. Becuase each user will have many entries in the Calls table where > they > > have signed out (status type 2), I find those entries in the array (this > is > > a gigantic array now) that are listed an even number of times and delete > > them from the array. (Reason: if the user has an odd number of > signouts, > > then that user is currently online.) > > > > I don''t follow your logic here -- you''ve stated above if they''ve logged > out then they have a status of 2 then they''ve logged out. Why then the > count of the odd number of statuses? > > Also, why not do a query such as > > select user, count(status) group by user having status = 2 > > Of course, it depends on your background database, but .... This way you > would still need to go through and figure who was even or odd, but there''d > be a whole lot less data to process!! > > Here''s an example of something along those lines: > > http://www.w3schools.com/sql/sql_groupby.asp > > Why not let the database do the work for you? > > Matt > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails >-------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060615/ef29025c/attachment.html
test data: Chris | 5 Bob | 4 Jack | 3 select username, count(status) as total_logoffs from users where status = 2 group by username having total_logoffs % 2 = 1 returns: Chris | 5 Jack | 3 this works under mysql, but should work for other db''s as well with minor modifications (if any) hope this helps On 6/15/06, David R. <davidr64@gmail.com> wrote:> > Find_by_sql and the link Matt provided look to be exactly what I''m looking > for. I have two days to clean up my app''s code and get it rolled out to > customers, and if in that time I have a window of opportunity to implement > this idea in version 1.0 I''ll definitely be using these suggestions. > > Thanks guys, if anything arises I''ll keep you posted. (This mailing list > is a bag of pure awesome.) > > > On 6/15/06, Matthew Williams <matt@harpstar.com> wrote: > > > > > > > > On Thu, 15 Jun 2006, David R. wrote: > > > > > 1. Grab all of the calls from the Calls table where the status type > > is > > > equal to 2 (because these are users that have signed out) and store > > them in > > > an array. > > > > > > 2. Becuase each user will have many entries in the Calls table where > > they > > > have signed out (status type 2), I find those entries in the array > > (this is > > > a gigantic array now) that are listed an even number of times and > > delete > > > them from the array. (Reason: if the user has an odd number of > > signouts, > > > then that user is currently online.) > > > > > > > I don''t follow your logic here -- you''ve stated above if they''ve logged > > out then they have a status of 2 then they''ve logged out. Why then the > > count of the odd number of statuses? > > > > Also, why not do a query such as > > > > select user, count(status) group by user having status = 2 > > > > Of course, it depends on your background database, but .... This way > > you > > would still need to go through and figure who was even or odd, but > > there''d > > be a whole lot less data to process!! > > > > Here''s an example of something along those lines: > > > > http://www.w3schools.com/sql/sql_groupby.asp > > > > Why not let the database do the work for you? > > > > Matt > > _______________________________________________ > > Rails mailing list > > Rails@lists.rubyonrails.org > > http://lists.rubyonrails.org/mailman/listinfo/rails > > > > > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060615/9c94d746/attachment.html
oops, test data was shown incorrectly i have 5 entries in db for Chris | 2 4 entries for Bob | 2 and 3 entries for Jack | 2 results are shown correctly On 6/15/06, Chris Hall <christopher.k.hall@gmail.com> wrote:> > test data: > > Chris | 5 > Bob | 4 > Jack | 3 > > select username, count(status) as total_logoffs from users where status > 2 group by username having total_logoffs % 2 = 1 > > returns: > > Chris | 5 > Jack | 3 > > this works under mysql, but should work for other db''s as well with minor > modifications (if any) > > hope this helps > > > On 6/15/06, David R. < davidr64@gmail.com> wrote: > > > > Find_by_sql and the link Matt provided look to be exactly what I''m > > looking for. I have two days to clean up my app''s code and get it rolled > > out to customers, and if in that time I have a window of opportunity to > > implement this idea in version 1.0 I''ll definitely be using these > > suggestions. > > > > Thanks guys, if anything arises I''ll keep you posted. (This mailing > > list is a bag of pure awesome.) > > > > > > On 6/15/06, Matthew Williams <matt@harpstar.com> wrote: > > > > > > > > > > > > On Thu, 15 Jun 2006, David R. wrote: > > > > > > > 1. Grab all of the calls from the Calls table where the status type > > > is > > > > equal to 2 (because these are users that have signed out) and store > > > them in > > > > an array. > > > > > > > > 2. Becuase each user will have many entries in the Calls table > > > where they > > > > have signed out (status type 2), I find those entries in the array > > > (this is > > > > a gigantic array now) that are listed an even number of times and > > > delete > > > > them from the array. (Reason: if the user has an odd number of > > > signouts, > > > > then that user is currently online.) > > > > > > > > > > I don''t follow your logic here -- you''ve stated above if they''ve > > > logged > > > out then they have a status of 2 then they''ve logged out. Why then > > > the > > > count of the odd number of statuses? > > > > > > Also, why not do a query such as > > > > > > select user, count(status) group by user having status = 2 > > > > > > Of course, it depends on your background database, but .... This way > > > you > > > would still need to go through and figure who was even or odd, but > > > there''d > > > be a whole lot less data to process!! > > > > > > Here''s an example of something along those lines: > > > > > > http://www.w3schools.com/sql/sql_groupby.asp > > > > > > Why not let the database do the work for you? > > > > > > Matt > > > _______________________________________________ > > > Rails mailing list > > > Rails@lists.rubyonrails.org > > > http://lists.rubyonrails.org/mailman/listinfo/rails > > > > > > > > > _______________________________________________ > > Rails mailing list > > Rails@lists.rubyonrails.org > > http://lists.rubyonrails.org/mailman/listinfo/rails > > > > > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060615/3a324db7/attachment-0001.html
On 6/15/06, David R. <davidr64@gmail.com> wrote:> Each customer gets assigned a "status type" of 1 when they sign on > successfully, a status type of 3 when they''ve been logged on for a period of > time (not sure what this interval is) and are still active and thus they are > not kicked offline for being idle, and a status type of 2 when they sign > off.I''m not sure exactly how the data is stored, but here''s some example SQL. If there is a single row for each call that gets updated as the status changes (the most intelligent design), it should be as simple as: SELECT count(1) FROM calls WHERE status_type = 1 OR status_type = 3; Assuming a properly indexed table this should be an extremely fast query. If there are multiple rows for each call (each status type in a separate row, very ugly): SELECT count(1) FROM (SELECT user_id FROM calls WHERE status_type = 1 OR status_type = 2 GROUP BY user_id HAVING sum(CASE WHEN status_type = 1 THEN 1 ELSE 0 END) = sum(CASE WHEN status_type = 2 THEN 1 ELSE 0 END) + 1 ) AS foo; This will be quite a bit slower, but still much faster than using ruby.
On Thursday, June 15, 2006, at 11:24 AM, David R. wrote:>Hi everyone, David R. here; I''m new to this mailing list but I''ve been >incredibly wrapped up in Ruby on Rails for the last two or three months. > >I''ve built a fully functional web application and we''re about to >roll it out >to our customers. However, my supervisor would like to add just one >more feature >to it: listing the number of online users for a particular ISP. > >Here''s the problem: the table that I need to pull data from has over >7,000,000 entries in it, and my rails app hangs when trying to process all >this data, and the server it''s on needs to be rebooted. This application >will service a very small number of users, so we have placed my app on a >weaker box; I know that upgrading the box to 2GB of RAM and a stronger >processor would probably fix this problem, but the process would still take >a few minutes to run successfully even on a machine that could handle it. > >My question is this: Is there a graceful Rails way to alleviate this >problem? > >Thanks, >David > > >_______________________________________________ >Rails mailing list >Rails@lists.rubyonrails.org >http://lists.rubyonrails.org/mailman/listinfo/rails >You might also consider indexing the status field. -- Posted with http://DevLists.com. Sign up and save your mailbox.