Hi

Suppose one has a db to track bookmarks for multiple users. Let's assume that there are tens of thousands of users, and that an average user may have 100 bookmarks. (Assume here that no two bookmarks are the same.)

Is it common practice to have a single bookmark table for ALL users, thereby producing a table that has on the order of one million rows? Or is it better to somehow group the users so the bookmark tables are smaller?

If tables are split, is there a rule of thumb as to how many tables a db should have?

Thanks
--
Jim Freeze
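PS. To make the question concrete, by "a bookmark table for ALL users" I mean one shared table along the lines of the migration below. The column names are just placeholders I made up:

    # One bookmarks table shared by every user. Each row points back to
    # its owner through user_id, and the index on user_id is what keeps
    # per-user lookups fast as the table grows.
    class CreateBookmarks < ActiveRecord::Migration
      def self.up
        create_table :bookmarks do |t|
          t.column :user_id,    :integer,  :null => false
          t.column :url,        :string,   :null => false
          t.column :title,      :string
          t.column :created_at, :datetime
        end
        add_index :bookmarks, :user_id
      end

      def self.down
        drop_table :bookmarks
      end
    end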
Databases are *really* good at handling tables with millions of rows, and at joins with other tables to retrieve sets of rows (assuming you have the correct indexes set up). The only time you would worry about 'too much data' is when you get so many users that you swamp the database machine with requests. Only then would you look at partitioning the data, and, in your case, partitioning by user would make sense. The partitioning would be across database servers, not between tables on the same server: you would have the same schema on each server, and some way of deciding which server a particular user's data would be saved on.
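A minimal sketch of that "which server" decision, assuming a fixed list of shard hosts (the host names and the simple modulo scheme are placeholders, not a recommendation):

    # Map a user to one of N database servers. Every server holds the
    # same schema; only the rows differ. user_id % N is the simplest
    # possible scheme, though it makes adding servers later painful
    # because most users would then land on a different shard.
    SHARD_HOSTS = ['db1.example.com', 'db2.example.com', 'db3.example.com']

    def shard_for(user_id)
      SHARD_HOSTS[user_id % SHARD_HOSTS.length]
    end

    shard_for(42)   # => 'db1.example.com' (42 % 3 == 0)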
One million records really isn't that big. Assuming 1 KB/record, that's only 1 GB. Heck, that might even fit completely in RAM. That said, here are some things to consider:

- Some old OSs have file system limits (4 GB on 32-bit systems) that preclude super-large tables, but that is largely a thing of the past unless you are working on a legacy system.
- You have to use indexes on large tables, but not too many, because too many indexes slow down inserts.
- Consider moving seldom-used fields into a separate table if those fields are allowed to be NULL, making the bookmarks table as 'slim' as possible.
- Consider changing all VARCHAR fields to CHAR fields, since searches tend to go faster when the record size is fixed.
- Consider moving all old records that haven't been accessed in X months to an 'archive' table that is only SELECTed if a record is not found in the main bookmarks table (see the sketch at the end of this message).
- Add an 'inactive' flag to each record and set it if a record has been inactive for Y months, so you can skip those records easily.
- Some databases (namely, Oracle) will allow you to spread a single table over several spindles, which speeds things up a lot.
- If you do a lot of sorts on the records, it may be easier on the db to pull all the records off it and sort on a client machine.

Regardless, I wouldn't make many efficiency decisions right now until it becomes obvious that a single table isn't going to give you response times as fast as you require.
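Here is a minimal sketch of the archive idea, assuming a bookmarks_archive table with the same columns as bookmarks and an accessed_at timestamp (all of those names, and the 6-month cutoff, are made up for illustration):

    # Move bookmarks untouched for 6 months into an archive table so
    # the hot table stays slim. Plain SQL through the ActiveRecord
    # connection; the transaction keeps a failure from leaving the
    # same rows in both tables.
    cutoff = 6.months.ago.to_s(:db)
    ActiveRecord::Base.transaction do
      ActiveRecord::Base.connection.execute(
        "INSERT INTO bookmarks_archive
           SELECT * FROM bookmarks WHERE accessed_at < '#{cutoff}'")
      ActiveRecord::Base.connection.execute(
        "DELETE FROM bookmarks WHERE accessed_at < '#{cutoff}'")
    end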
My two cents' worth. I used to work on a database with 3 million rows. The reason that dbs are so good at this is that they use indexes, and indexes search in a roughly binary way. Powers of 2 are really interesting numbers. Notice how slowly the exponent rises as the number grows:

2^1 = 2
2^5 = 32
2^10 = 1,024 (about a thousand)
2^20 = 1,048,576 (about a million)
2^21 = 2,097,152 (about two million)

Thus when you double the records from 1M to 2M, you add only about one twentieth to the time taken to find an indexed record. And as the number of records gets bigger, the increase in workload per search gets progressively smaller. So to a db, 2M records is a joke and would be an indiscernible difference from 1M records. I doubt very much that the cost of programming extra tables is worth whatever you would save on hardware. My 3 million row db used to work like a charm. Lightning fast, brilliant searches, combining 5 and 6 tables at a time from a collection of 65 (mysql, by the way). I recommend deep breaths and much relaxation. Your db can handle the job. Dbs really are incredible examples of fine programming and the extraordinary power of modern computers.

bruce

PS. 4M is 4x 1M, but the total workload on the db to find any given indexed record only increases from about 20 comparisons to about 22, an increase of 10%. Don't you find that awesome?
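You can check those numbers yourself in plain Ruby (nothing Rails-specific here):

    # The comparisons needed for a binary/B-tree-style index lookup
    # grow with log2 of the row count, not with the row count itself.
    [1_000_000, 2_000_000, 4_000_000].each do |rows|
      depth = (Math.log(rows) / Math.log(2)).ceil
      puts "#{rows} rows: about #{depth} comparisons"
    end
    # 1000000 rows: about 20 comparisons
    # 2000000 rows: about 21 comparisons
    # 4000000 rows: about 22 comparisons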