thr3ads.net - Rails - Highload project on RoR [Jan 2013]

Vitaly Zemlyansky

2013-Jan-16 03:30 UTC

Highload project on RoR

I need advice: what is the best practice for scaling a RoR project?

My current stack is:

   - Ruby 1.9.3
   - RoR 3.2.8
   - Redis DB for caching and some hot data
   - PostgreSQL
   - Resque for background jobs  

I am expecting high load on my project in several months, and I need to
change my architecture.
Primarily, I want to change my backend architecture. I have read about
high-load projects like FB and Twitter, and found that they all use a
distributed DB such as Cassandra or Riak.
So one of my questions is: which is better to use, Cassandra or Riak? I
prefer Riak, but I would like to hear the pros and cons of both from anybody
who uses them.
I also know that projects like FB and Twitter use both a distributed DB and
a relational DB (MySQL), and I wonder why they use both models. What
benefits do you get from this approach?
My last question is about caching. Currently I use Redis, but Redis
doesn't have master-master replication or any clustering mechanism.
What can I do in this case? Maybe there are some gems or other techniques
for scaling Redis?

-- 
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Talk" group.
To post to this group, send email to
rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to
rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit
https://groups.google.com/d/msg/rubyonrails-talk/-/Ioc_9-hQNXYJ.
For more options, visit https://groups.google.com/groups/opt_out.

Peter Hickman

2013-Jan-16 15:15 UTC

Re: Highload project on RoR

The first question is where this load is going to hit your system. No point
boosting Redis if PostgreSQL is the problem, no point boosting PostgreSQL if
the server is underpowered.

You could waste a lot of time and effort if you don't measure the problem
first.

Vitaly Zemlyansky

2013-Jan-16 16:06 UTC

Re: Highload project on RoR

Currently, I don't have problems. As I wrote, I am just expecting that most
problems will be with the DBs.
I read about these problems in the architecture of high-load projects. And I
know that the relational model can't scale to millions of simultaneous
requests. All projects use a distributed DB for this problem, like Cassandra
or Riak.
And I know that for an in-memory DB you should build a cluster. Memcached,
for example, has this functionality, but I am trying to find something
similar for Redis.
My question is not about a concrete problem with these databases; it's
about best practices, or maybe successful experience, in scaling a similar
stack of technologies.

Peter Hickman

2013-Jan-16 16:57 UTC

Re: Highload project on RoR

On 16 January 2013 16:06, Vitaly Zemlyansky
<vitozemlya-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Currently, I don't have problems. As I wrote, I am just expecting that
> most problems will be with DB's.
> I read about this problems in architecture of highload projects. And I
> know that relational model can't scale for millions simultaneously
> requests.

Do you have a good reason to believe that you will have millions of
simultaneous requests to deal with? What sort of requests will they be:
will they be 99% reads and 1% writes, or 50/50? The answer to that question
will determine how you approach the problem.

You should be able to determine the ratio to some degree from the existing
system - assuming you have a sufficient number of users.
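
One rough way to get that ratio from an existing Rails app is to count
request verbs in the log. The sketch below is only an illustration: it
assumes the standard Rails 3.x "Started GET ..." log line, treats GET and
HEAD as reads and every other verb as a write (a proxy, since a GET can
still write to the database), and runs on made-up sample lines.

```ruby
# Estimate the read/write mix of an app from its Rails request log.
# Assumption: GET/HEAD count as reads, all other verbs as writes.
READ_VERBS = %w[GET HEAD].freeze

# Rails 3.x logs each request with a line such as:
#   Started GET "/posts" for 127.0.0.1 at 2013-01-16 16:06:00 +0000
REQUEST_LINE = /Started (GET|HEAD|POST|PUT|PATCH|DELETE) "/

def read_write_ratio(lines)
  reads = 0
  writes = 0
  lines.each do |line|
    match = REQUEST_LINE.match(line)
    next unless match # skip lines that do not start a request

    if READ_VERBS.include?(match[1])
      reads += 1
    else
      writes += 1
    end
  end
  total = reads + writes
  read_percent = total.zero? ? nil : (100.0 * reads / total).round(1)
  { reads: reads, writes: writes, read_percent: read_percent }
end

# Made-up sample data: 7 reads and 3 writes.
sample_log =
  ['Started GET "/posts" for 127.0.0.1 at 2013-01-16 16:06:00 +0000'] * 7 +
  ['Started POST "/posts" for 127.0.0.1 at 2013-01-16 16:06:01 +0000'] * 3 +
  ["Processing by PostsController#index as HTML"]

puts read_write_ratio(sample_log).inspect
```

Against a real application you would pass the log file instead, for example
`read_write_ratio(File.foreach("log/production.log"))`.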

The best practice is to measure the issues and then work forward. We could
spend our lives talking about how best to optimise an application, but it
comes to nothing if we do not have real data to work with. The
measurements you take will also let you see what improvement has happened
and whether it was worth the effort.

For reference, one of our systems is Ruby 1.8.7, Rails 2.3, PostgreSQL 8.4,
512 MB RAM and memcached, and it handles from 800,000 to 1,700,000 requests
per day (which works out to an average of about 9 to 20 requests per second).
If you are expecting to handle 1,000,000 simultaneous requests (that is,
1,000,000 requests per second) you would be looking at around 86,400,000,000
requests a day!
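
The conversion between daily and per-second request rates is easy to check.
This is just the arithmetic, using the figures quoted in this message; the
helper names are made up for the example.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60 # 86_400

# Average requests per second for a given daily total.
def per_second(requests_per_day)
  requests_per_day.to_f / SECONDS_PER_DAY
end

# Daily total for a sustained per-second rate.
def per_day(requests_per_second)
  requests_per_second * SECONDS_PER_DAY
end

low   = per_second(800_000)
high  = per_second(1_700_000)
daily = per_day(1_000_000)

puts format("%.1f to %.1f requests per second on average", low, high)
puts "#{daily} requests per day at 1,000,000 requests per second"
```

Note that these are averages over a whole day; peak traffic will be higher
than the average, which is one more reason to measure the real system.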

Most places, outside of Facebook, Google or international banks, are very
unlikely to experience the sort of load that you are anticipating. If you
really do plan to handle millions of simultaneous requests you will have
to start writing some very large cheques for the hardware you will need to
run this all on.

Actually, millions of simultaneous requests is starting to sound like you
shouldn't be using Ruby at all! Get a programmer from a bank to do this in
Erlang for you.

Vitaly Zemlyansky

2013-Jan-17 06:40 UTC

Re: Highload project on RoR

Thanks, Peter, for answering. The requests will mostly be reads,
approximately 70/30.
I have also thought about Erlang, but I don't want to lose the benefits of
Ruby and RoR. What I want to create is a hybrid system with several
languages. For example, Twitter still uses RoR, but for the back end they
use Scala. I want to reach an architecture similar to Twitter's, but using
Erlang instead of Scala.

Simon Riggs

2013-Jan-17 09:08 UTC

Re: Highload project on RoR

On 16 January 2013 16:06, Vitaly Zemlyansky
<vitozemlya-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Also I know that projects like FB or Twitter use both distributed DB and
> relational DB (MySQL), and I wonder why do they use both models? What
> benefits do you get by this approach?

The relational model allows you to store more complex data, arranging
it into multiple tables so that it can be easily joined using SQL.
If you have a need for more complex data then you need relational.

Databases that support relational also support other kinds of data
structures. For example, PostgreSQL allows you to store XML and JSON, and
also key/value data using hstore.

Companies use multiple technologies because they have multiple
different needs. In startups, having a single database is common. As
companies evolve they need more applications and databases, often of
different kinds.
> I read about this problems in architecture of highload projects. And I know
> that relational model can't scale for millions simultaneously requests. All
> projects use distributed DB for this problem, like Cassandra or Riak.
> And I know, that for in-memory db, you should build a cluster. Memcached,
> for example, has this functionality. But I am trying to find similar for
> Redis.

It isn't true that the "relational model can't scale", though I have
seen that comment before.

PostgreSQL supports multiple read-only standby nodes (a feature called hot
standby), which lets you scale out the number of copies of the database
and so scale reads. Achieving >100,000 requests per second per
node is possible, so handling millions can work too.
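
On the application side, using standbys for read scaling means sending
writes to the primary and spreading reads over the copies. The following is
a minimal sketch of that routing decision only; the class and node names are
hypothetical, and a real application would hold database connections rather
than strings. It also assumes the application can tolerate slightly stale
reads, because standbys replicate asynchronously by default.

```ruby
# Route writes to the primary and rotate reads across hot standbys.
# Node names are placeholders for real connection objects.
class ReadWriteRouter
  def initialize(primary, standbys)
    @primary = primary
    @standbys = standbys
    @counter = 0
    @lock = Mutex.new
  end

  # Hot standbys are read-only, so every write goes to the primary.
  def node_for_write
    @primary
  end

  # Reads rotate round-robin; with no standbys they fall back to the primary.
  def node_for_read
    return @primary if @standbys.empty?

    @lock.synchronize do
      node = @standbys[@counter % @standbys.size]
      @counter += 1
      node
    end
  end
end

router = ReadWriteRouter.new("primary", %w[standby_1 standby_2])
puts router.node_for_write
3.times { puts router.node_for_read }
```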

The key point is usually database writes. Scaling writes requires
multiple write nodes. The technique that everybody uses is sharding,
that is, placing a subset of the data on each of several nodes. As soon as
you use sharding you need to limit yourself to simple key read/write
operations. Complex operations that require access to multiple nodes
don't scale well, whether or not they use the relational model or
relational databases. It's the type of request that is the problem, not
the underlying technology.
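
As an illustration of key-based sharding, the sketch below hashes a key and
uses the result to pick one of several nodes, so that all reads and writes
for that key land on the same node. The shard names and the `shard_for`
helper are hypothetical. The same idea can be applied on the client side to
a set of independent Redis instances, which is one possible answer to the
Redis question earlier in the thread.

```ruby
require "zlib"

# Hypothetical shard names; in practice these would be connection settings.
SHARDS = %w[shard_0 shard_1 shard_2 shard_3].freeze

# Pick a shard from a stable hash of the key, so the same key always
# maps to the same node.
def shard_for(key, shards = SHARDS)
  shards[Zlib.crc32(key.to_s) % shards.size]
end

%w[user:1 user:2 user:3].each do |key|
  puts "#{key} -> #{shard_for(key)}"
end
```

One design caveat: with plain modulo hashing, changing the number of shards
remaps most keys. Consistent hashing is a common alternative when nodes are
expected to be added or removed.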

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Matt Jones

2013-Jan-17 16:00 UTC

Re: Highload project on RoR

On Wednesday, 16 January 2013 11:06:56 UTC-5, Vitaly Zemlyansky wrote:
> Currently, I don't have problems.

Oh, you've got a problem alright - and it's called "premature
optimization".

> As I wrote, I am just expecting that most problems will be with DB's.

Stop expecting, and start MEASURING. Figure out how to load-test your 
application, and work from there.
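
A very small load-test harness can be sketched in plain Ruby, as below. It
is an illustration, not a replacement for a dedicated load-testing tool: the
helper names are made up, and the workload here is a stub `sleep` so the
example runs anywhere. To test a real app you would replace the stub with an
HTTP call against a staging copy of the application.

```ruby
require "benchmark"

# Run `total` calls of the given block across `concurrency` threads and
# return the wall-clock latency of each call, in seconds.
def measure_latencies(total, concurrency, &work)
  jobs = Queue.new
  total.times { |i| jobs << i }
  results = Queue.new

  workers = Array.new(concurrency) do
    Thread.new do
      loop do
        job = begin
          jobs.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          nil
        end
        break if job.nil?

        results << Benchmark.realtime { work.call(job) }
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

# Nearest-rank style percentile over an already sorted array.
def percentile(sorted, pct)
  return nil if sorted.empty?

  sorted[((pct / 100.0) * (sorted.size - 1)).round]
end

# Stub workload. For a real test, swap in something like
#   Net::HTTP.get_response(URI("http://localhost:3000/"))
# (needs require "net/http"; the URL is a placeholder).
latencies = measure_latencies(50, 5) { |_i| sleep(0.001) }.sort

puts format("requests: %d  p50: %.4fs  p95: %.4fs  max: %.4fs",
            latencies.size, percentile(latencies, 50),
            percentile(latencies, 95), latencies.last)
```

Running the same script before and after a change gives you numbers to
compare, which is the point being made in this thread.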
 
> I read about this problems in architecture of highload projects. And I
> know that relational model can't scale for millions
> simultaneously requests. All projects use distributed DB for this problem,
> like Cassandra or Riak.

Wow, the MySQL team at Facebook is going to be really surprised that
they're not using MySQL:

https://www.facebook.com/notes/facebook-engineering/under-the-hood-automated-backups/10151239431923920

All snark aside, measurement is the ONLY way to solve this problem.

--Matt Jones
