Matthijs Langenberg
2011-Nov-21 21:54 UTC
Getting the most of our caches when dealing with external HTTP services
Dear all,

I am dealing with a group of pages that display data from a few different HTTP resources. To make these pages perform well, I want to cache as much as possible at the front of my Rails application.

Another requirement is that we never show stale data to the user. These pages aren't getting huge amounts of requests per second, but when there are no changes I want everything to feel snappy. If the first requests take a bit more time to set up the caches, that's okay.

Fortunately HTTP ships with something awesome since the 80's: conditional HTTP requests. Just send an 'If-Modified-Since' or 'If-None-Match' header along with the request and the server returns either a '304 Not Modified' or the full response.

It is simple to use the Rails cache store to cache HTTP responses. But the most time-consuming part is actually building an object model from the response and generating the HTML fragments.

Therefore I am looking for an API that allows me to conditionally execute render code, based on the response of external HTTP requests.

[browser] --[GET /page]--> [Rails app][view][controller][model] --[GET /resource]--> [external service]

I tried to come up with a first proposal: https://gist.github.com/1383983

What do you guys think? Does this make any sense? Are there any other approaches that I could try?

At least it provides a way through the different layers without leaking knowledge. The downside is of course that every finder needs to support a block, where it normally would just return a result value.

This approach makes it impossible to cache different HTML fragments of the same resource, but I think I can mitigate that in a follow-up proposal.

Thank you for providing feedback,

Matthijs

-- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
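For readers who have not issued a conditional request from Ruby before, the mechanism described in the message above could be sketched roughly as follows with the standard library's Net::HTTP. The method names `conditional_get_request` and `fetch_resource` are made up for illustration and are not taken from the linked gist; the return convention (`:not_modified` or an `[etag, body]` pair) is likewise just an assumption.

```ruby
require 'net/http'
require 'uri'

# Builds a GET request carrying whatever validators we stored from an
# earlier response. (Hypothetical helper, for illustration only.)
def conditional_get_request(uri, etag: nil, last_modified: nil)
  request = Net::HTTP::Get.new(uri)
  request['If-None-Match'] = etag if etag
  request['If-Modified-Since'] = last_modified if last_modified
  request
end

# Performs the request. Returns :not_modified on a 304, or an
# [etag, body] pair when the server sends a full response.
def fetch_resource(uri, etag: nil)
  request = conditional_get_request(uri, etag: etag)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end

  case response
  when Net::HTTPNotModified then :not_modified
  when Net::HTTPSuccess     then [response['ETag'], response.body]
  else raise "Unexpected response: #{response.code}"
  end
end
```

A caller would keep the ETag from the first full response and pass it back as `etag:` on the next call; only a changed resource then costs a full download and parse.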
Frederick Cheung
2011-Nov-21 23:04 UTC
Re: Getting the most of our caches when dealing with external HTTP services
On Nov 21, 9:54 pm, Matthijs Langenberg wrote:

> Fortunately HTTP ships with something awesome since the 80's:
> conditional HTTP requests. Just send an 'If-Modified-Since' or
> 'If-None-Match' header along with the request and the server returns
> a '304 Not Modified' or the full response.

Nitpicker's corner: the first version of HTTP, 0.9, dates from 1991, and if my reading is correct HTTP 1.0 is the one that added If-Modified-Since etc.

> I tried to come up with a first proposal: https://gist.github.com/1383983
>
> What do you guys think? Does this make any sense? Are there any other
> approaches that I could try?

Could you be using Action Controller's stale? / fresh_when methods? If so, you're probably going to want to put something like rack-cache, Varnish etc. in front of Rails, since otherwise you'd only be able to return a 304 if that particular client had already requested the data (which may or may not be a problem).

If you only want to cache bits of the page, you could also use bog-standard fragment caching, using the ETag / Last-Modified etc. of the remote response as part of the cache key.

Fred
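The fragment-caching suggestion above boils down to putting the remote validator into the cache key, so that a changed ETag naturally misses the old fragment. A rough, Rails-free sketch of that idea follows; `FragmentStore` is a made-up in-memory stand-in for a real cache store, and the key layout `posts/<etag>/<name>` is only an assumption.

```ruby
# Tiny in-memory stand-in for a cache store (hypothetical, illustration only).
class FragmentStore
  def initialize
    @fragments = {}
  end

  # Returns the stored fragment for +key+, or runs the block once and
  # stores its result under that key.
  def fetch(key)
    @fragments.fetch(key) { @fragments[key] = yield }
  end
end

# Renders a named fragment for a resource. The expensive rendering block
# only runs when no fragment exists for this exact ETag.
def cached_fragment(store, etag, name, &render)
  store.fetch("posts/#{etag}/#{name}", &render)
end
```

In a Rails view the same effect would come from the `cache` helper with the ETag as part of the key array. Because the key changes whenever the remote ETag changes, stale fragments are never served; they simply stop being looked up and can be left to expire.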
Matthijs Langenberg
2011-Nov-22 12:40 UTC
Re: Getting the most of our caches when dealing with external HTTP services
Hi Fred,

Thanks for getting back to me.

On Nov 22, 12:04 am, Frederick Cheung wrote:

> Nitpicker's corner: the first version of HTTP, 0.9, dates from 1991,
> and if my reading is correct HTTP 1.0 is the one that added
> If-Modified-Since etc.

You are totally correct. What I was trying to say is that we often reinvent the wheel while there are beautiful gems inside existing standards such as HTTP that give us a lot of functionality. I was exaggerating the history of HTTP. Thanks for putting that right.

> Could you be using Action Controller's stale? / fresh_when methods?
> If so, you're probably going to want to put something like rack-cache,
> Varnish etc. in front of Rails, since otherwise you'd only be able to
> return a 304 if that particular client had already requested the data
> (which may or may not be a problem). If you only want to cache bits of
> the page, you could also use bog-standard fragment caching, using the
> ETag / Last-Modified etc. of the remote response as part of the
> cache key.

I wonder how that would look.

To be clear, I am not interested in sending a '304 Not Modified' to the user's browser. For my application that would not be worth the effort. Different users view the same page once. So the second user should be served a cached response from Rails if possible. And by cached response I mean little HTML fragments stitched together.

The request would still go through Rails. But based on a '304 Not Modified' from the external HTTP services, it would skip parsing XML responses and rendering expensive partials.

Am I right that in your approach, an ActiveResource finder would attach the returned ETag or Last-Modified date to the returned object, so it can be used inside a view?

I agree with you that the cache key should be chosen from the Rails view. There should be another place that stores the last fetched ETag for that particular resource. And it actually needs to know if there is a cached HTML fragment for that resource.

Looks like a Catch-22 situation to me.

Matthijs
Matthijs Langenberg
2011-Nov-23 21:17 UTC
Re: Getting the most of our caches when dealing with external HTTP services
Alright, I think I can solve the issues by having two separate caches and implementing lazy loading.

In the view layer I want to be able to do:

    <% cache [@post.cache_key, 'author'] do %>
      <h1><%= @post.author %></h1>
    <% end %>

    Hello, <%= current_user.name %>, this is not cached.

    <% cache [@post.cache_key, 'body'] do %>
      <p><%= @post.body %></p>
    <% end %>

Then the API model can look something like this:

    require 'nokogiri'

    class Api::Post
      def self.first
        etag = $cache.read('data:etag')
        response = fetch_first(etag)
        if response == :not_modified
          puts 'cache HIT'
        else
          puts 'cache MISS'
          # Use the fresh ETag from the response, not the stale (nil) one.
          etag, xml = response
          $cache.write 'data:etag', etag
          $cache.write 'data', xml
        end
        new(etag)
      end

      # Stand-in for the real conditional GET against the external service.
      def self.fetch_first(etag)
        if etag.nil?
          puts 'Fetch XML'
          ["1449ee0ec320e5bf5ed7a9949d4771d9", "<post>Hallo!</post>"]
        else
          :not_modified
        end
      end

      attr_reader :etag
      # The view calls cache_key; the ETag serves as that key here.
      alias_method :cache_key, :etag

      def initialize(etag)
        @etag = etag
        @document = nil
      end

      def body
        parse if @document.nil? # Lazy loading: only parse when a fragment misses
        @document.children.first.text
      end

      def parse
        @document = Nokogiri.parse($cache.read('data'))
      end
    end

Two questions:

1. What can I do when the API returns a collection? I cannot return an Array.
2. How can I wrap a domain layer around the Api::Post class?

- Matthijs
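On the first question above, one possible direction (only a sketch, not something proposed in the thread) is to return an Enumerable object instead of an Array: it carries the collection's ETag for use in cache keys and defers the expensive parsing until something actually iterates. The class name `LazyCollection`, the `cache_key` format and the loader block are all assumptions made for illustration.

```ruby
# Array-like wrapper that knows its ETag up front but only loads and
# parses its items on first iteration. (Hypothetical, illustration only.)
class LazyCollection
  include Enumerable

  attr_reader :etag

  # The block does the expensive work, e.g. reading the cached XML,
  # parsing it and building one object per entry.
  def initialize(etag, &loader)
    @etag = etag
    @loader = loader
    @items = nil
  end

  # Usable in a fragment cache key without triggering a load.
  def cache_key
    "collection/#{etag}"
  end

  def loaded?
    !@items.nil?
  end

  # Enumerable builds map, select, first, to_a etc. on top of each.
  def each(&block)
    return enum_for(:each) unless block
    load_items.each(&block)
    self
  end

  private

  # Runs the loader at most once and memoizes the result.
  def load_items
    @items ||= @loader.call
  end
end
```

A view that finds all its fragments in the cache would only ever touch `cache_key`, so the loader block never runs; a view that misses would iterate with `each` and pay the parsing cost once.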