valrhona at gmail.com
2010-Jun-28 19:33 UTC
[zfs-discuss] Dedup RAM requirements, vs. L2ARC?
I'm putting together a new server, based on a Dell PowerEdge T410. I have a simple SAS controller with six 2TB Hitachi DeskStar 7200 RPM SATA drives. The processor is a quad-core 2 GHz Core i7-based Xeon. I will run the drives as three mirror pairs striped together, for 6 TB of homogeneous storage.

I'd like to run dedup, but right now the server has only 4 GB of RAM. It has been pointed out to me several times that this is far too little. So how much should I buy? A few considerations:

1. I would like to run dedup on old copies of backups (the dedup ratio for these filesystems is 3+). Basically I have a few years of backups on tape, and will consolidate these. I need to have the data there on disk, but I rarely need to access it (maybe once a month). So those filesystems can be exported, and effectively shut off. Am I correct in guessing that, if a filesystem has been exported, its dedup table is not in RAM, and therefore is not relevant to RAM requirements? I don't mind if it's really slow to do the first and only copy to the filesystem, as I can let it run for a week without a problem.

2. Are the RAM requirements for ZFS with dedup based on the total available zpool size (I'm not using thin provisioning), or just on how much data is in the filesystem being deduped? That is, if I have 500 GB of deduped data but 6 TB of possible storage, which number is relevant for calculating RAM requirements?

3. What are the RAM requirements for ZFS in the absence of dedup? That is, if I only have deduped filesystems in an exported state, and all that is active is non-deduped, is 4 GB enough?

4. How does the L2ARC come into play? I can afford to buy a fast Intel X25-M G2, for instance, or any of the newer SandForce-based MLC SSDs to cache the dedup table. But does it work that way? It's not really affordable for me to get more than 16 GB of RAM on this system, because there are only four slots available, and the 8 GB DIMMs are a bit pricey.

5. Could I use one of the PCIe-based SSD cards for this purpose, such as the brand-new OCZ Revo? That should be somewhere between a SATA-based SSD and RAM.

Thanks in advance for all of your advice and help.
--
This message posted from opensolaris.org
Roy Sigurd Karlsbakk
2010-Jun-28 19:53 UTC
[zfs-discuss] Dedup RAM requirements, vs. L2ARC?
> 2. Are the RAM requirements for ZFS with dedup based on the total
> available zpool size (I'm not using thin provisioning), or just on how
> much data is in the filesystem being deduped? That is, if I have 500
> GB of deduped data but 6 TB of possible storage, which number is
> relevant for calculating RAM requirements?

It's based on the data stored in the zpool. You'll need about 200 bytes per DDT (data deduplication table) entry, meaning about 1.2 GB per 1 TB stored on 128 kB blocks. With smaller blocks (smaller files are stored in smaller blocks), that means more memory. With only large files, 1.2 GB to 1.5 GB per 1 TB of stored data should suffice.

> 3. What are the RAM requirements for ZFS in the absence of dedup? That
> is, if I only have deduped filesystems in an exported state, and all
> that is active is non-deduped, is 4 GB enough?

Probably not, see above.

> 4. How does the L2ARC come into play? I can afford to buy a fast Intel
> X25-M G2, for instance, or any of the newer SandForce-based MLC SSDs to
> cache the dedup table. But does it work that way? It's not really
> affordable for me to get more than 16 GB of RAM on this system,
> because there are only four slots available, and the 8 GB DIMMs are a
> bit pricey.

L2ARC will buffer the DDT along with the data, so if you get some good SSDs (such as the Crucial RealSSD C300), this will speed things up quite a bit.

> 5. Could I use one of the PCIe-based SSD cards for this purpose, such
> as the brand-new OCZ Revo? That should be somewhere between a
> SATA-based SSD and RAM.

If your budget is low, as it may seem, good SATA SSDs will probably be the best. They can help out quite a bit.

Just remember that dedup on OpenSolaris is not thoroughly tested yet. It works, but AFAIK there are still issues with long hangs in case of unexpected reboots.

Disclaimer: I'm not an Oracle (nor Sun) employee - this is just my advice to you based on testing dedup on my test systems.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
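[Editor's note: a minimal back-of-the-envelope estimator for turning the rule of thumb above into numbers, assuming roughly 200 bytes of RAM per DDT entry and a uniform average block size. Both figures are rough planning assumptions from this thread, not exact constants from the ZFS source; note that at exactly 200 bytes per entry the arithmetic comes out closer to 1.6 GB per TB at 128 kB blocks than the 1.2 GB quoted. The function name ddt_ram_bytes is invented for illustration.]

    # Rough DDT RAM estimator, assuming ~200 bytes per DDT entry
    # (the figure quoted in this thread; the real per-entry cost varies
    # by release and is an assumption here, not an exact constant).

    def ddt_ram_bytes(stored_bytes, avg_block_bytes, bytes_per_entry=200):
        """Estimate RAM needed to hold the dedup table for stored_bytes
        of unique (post-dedup) data at an average block size."""
        n_entries = stored_bytes / avg_block_bytes   # one DDT entry per unique block
        return n_entries * bytes_per_entry

    TB = 1 << 40
    GB = 1 << 30

    # 1 TB of unique data in 128 kB blocks -> roughly 1.6 GB of DDT
    print("128k blocks: %.1f GB" % (ddt_ram_bytes(1 * TB, 128 * 1024) / GB))

    # The same 1 TB in 4 kB blocks needs ~32x more entries (about 50 GB)
    print("  4k blocks: %.1f GB" % (ddt_ram_bytes(1 * TB, 4 * 1024) / GB))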
On 6/28/2010 12:33 PM, valrhona at gmail.com wrote:

> I'm putting together a new server, based on a Dell PowerEdge T410.
>
> I have a simple SAS controller with six 2TB Hitachi DeskStar 7200 RPM SATA drives. The processor is a quad-core 2 GHz Core i7-based Xeon.
>
> I will run the drives as three mirror pairs striped together, for 6 TB of homogeneous storage.
>
> I'd like to run dedup, but right now the server has only 4 GB of RAM. It has been pointed out to me several times that this is far too little. So how much should I buy? A few considerations:
>
> 1. I would like to run dedup on old copies of backups (the dedup ratio for these filesystems is 3+). Basically I have a few years of backups on tape, and will consolidate these. I need to have the data there on disk, but I rarely need to access it (maybe once a month). So those filesystems can be exported, and effectively shut off. Am I correct in guessing that, if a filesystem has been exported, its dedup table is not in RAM, and therefore is not relevant to RAM requirements? I don't mind if it's really slow to do the first and only copy to the filesystem, as I can let it run for a week without a problem.

That's correct. An exported pool is effectively ignored by the system, so it won't contribute to any ARC requirements.

> 2. Are the RAM requirements for ZFS with dedup based on the total available zpool size (I'm not using thin provisioning), or just on how much data is in the filesystem being deduped? That is, if I have 500 GB of deduped data but 6 TB of possible storage, which number is relevant for calculating RAM requirements?

Requirements are based on *current* BLOCK usage, after dedup has occurred. That is, ZFS needs an entry in the DDT for each block actually allocated in the filesystem. The number of times that block is referenced won't influence the DDT size, nor will the *potential* size of the pool matter (other than for capacity planning). Remember that ZFS uses variable-size blocks, so you need to determine what your average block size is in order to estimate your DDT usage.

> 3. What are the RAM requirements for ZFS in the absence of dedup? That is, if I only have deduped filesystems in an exported state, and all that is active is non-deduped, is 4 GB enough?

It of course depends heavily on your usage pattern, and the kind of files you are serving up. ZFS requires at a bare minimum a couple of dozen MB for its own usage. Everything above that is caching. Heavy write I/O will also eat up RAM, as ZFS needs to cache the writes in RAM before doing a large write I/O to the backing store. Take a look at the amount of data you expect to be using heavily - your RAM should probably exceed this amount, plus an additional 1 GB or so for OS/ZFS/kernel/etc. use. That is assuming you are doing nothing but file serving on the system.

> 4. How does the L2ARC come into play? I can afford to buy a fast Intel X25-M G2, for instance, or any of the newer SandForce-based MLC SSDs to cache the dedup table. But does it work that way? It's not really affordable for me to get more than 16 GB of RAM on this system, because there are only four slots available, and the 8 GB DIMMs are a bit pricey.

L2ARC is "secondary" ARC. ZFS attempts to cache all reads in the ARC (Adaptive Replacement Cache) - should it find that it doesn't have enough space in the ARC (which is RAM-resident), it will evict some data over to the L2ARC (which in turn will simply dump the least-recently-used data when it runs out of space).
Remember, however, that every time something gets written to the L2ARC, a little bit of space is taken up in the ARC itself (a pointer to the L2ARC entry needs to be kept in the ARC). So, it's not possible to have a giant L2ARC and a tiny ARC. As a rule of thumb, I try not to have my L2ARC exceed my main RAM by more than 10-15x (with really big-memory machines, I'm a bit looser and allow 20-25x or so, but still...). So, if you are thinking of getting a 160 GB SSD, it would be wise to go for at minimum 8 GB of RAM.

Once again, the amount of ARC space reserved for an L2ARC entry is fixed, and independent of the actual block size stored in the L2ARC. The gist of this is that tiny files eat up a disproportionate amount of system resources for their size (smaller size = larger % overhead vis-a-vis large files).

> 5. Could I use one of the PCIe-based SSD cards for this purpose, such as the brand-new OCZ Revo? That should be somewhere between a SATA-based SSD and RAM.
>
> Thanks in advance for all of your advice and help.

ZFS doesn't care what you use for the L2ARC. Some of us actually use hard drives, so a PCIe Flash card is entirely possible. The Revo is possibly the first PCIe Flash card that isn't massively expensive; otherwise, I don't think they'd be a good option. They're going to be more expensive than even an SLC SSD, however. In addition, given that L2ARC is heavily read-biased, cheap MLC SSDs are hard to beat for performance/$. The biggest problem with PCIe cards is that they require OS-specific drivers, and OpenSolaris doesn't always make the cut for support.

In your specific case, I'd consider upgrading to 8 GB RAM, and looking at an 80 GB MLC SSD. That's just blind guessing, since I don't know what your usage (and file) pattern is.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
On 6/28/2010 12:53 PM, Roy Sigurd Karlsbakk wrote:

>> 2. Are the RAM requirements for ZFS with dedup based on the total
>> available zpool size (I'm not using thin provisioning), or just on how
>> much data is in the filesystem being deduped? That is, if I have 500
>> GB of deduped data but 6 TB of possible storage, which number is
>> relevant for calculating RAM requirements?
>>
> It's based on the data stored in the zpool. You'll need about 200 bytes per DDT (data deduplication table) entry, meaning about 1.2 GB per 1 TB stored on 128 kB blocks. With smaller blocks (smaller files are stored in smaller blocks), that means more memory. With only large files, 1.2 GB to 1.5 GB per 1 TB of stored data should suffice.

Actually, I think the rule of thumb is 270 bytes per DDT entry. It's 200 bytes of ARC for every L2ARC entry. The DDT doesn't count toward this ARC space usage.

E.g.: I have 1 TB of 4k files that are to be deduped, and it turns out that I have about a 5:1 dedup ratio. I'd also like to see how much ARC usage I eat up with a 160 GB L2ARC.

(1) How many entries are there in the DDT?
    1 TB of 4k files means there are about 268 million blocks (2^28). However, at a 5:1 dedup ratio, I'm only actually storing 20% of that, so I have about 54 million unique blocks.
    Thus, I need a DDT of about 270 * 54 million =~ 14 GB in size.
(2) My L2ARC is 160 GB in size, but I'm using 14 GB of it for the DDT. Thus, I have about 146 GB free for use as a data cache.
    146 GB / 4k =~ 37 million blocks can be stored in the remaining L2ARC space.
    However, referencing those 37 million blocks takes up: 200 * 37 million =~ 7 GB of space in ARC.
    Thus, I'd better have at least 7 GB of RAM allocated solely for L2ARC reference pointers, and no other use.

>> 4. How does the L2ARC come into play? I can afford to buy a fast Intel
>> X25-M G2, for instance, or any of the newer SandForce-based MLC SSDs to
>> cache the dedup table. But does it work that way? It's not really
>> affordable for me to get more than 16 GB of RAM on this system,
>> because there are only four slots available, and the 8 GB DIMMs are a
>> bit pricey.
>>
> L2ARC will buffer the DDT along with the data, so if you get some good SSDs (such as the Crucial RealSSD C300), this will speed things up quite a bit.
>
>> 5. Could I use one of the PCIe-based SSD cards for this purpose, such
>> as the brand-new OCZ Revo? That should be somewhere between a
>> SATA-based SSD and RAM.
>>
> If your budget is low, as it may seem, good SATA SSDs will probably be the best. They can help out quite a bit.
>
> Just remember that dedup on OpenSolaris is not thoroughly tested yet. It works, but AFAIK there are still issues with long hangs in case of unexpected reboots.
>
> Disclaimer: I'm not an Oracle (nor Sun) employee - this is just my advice to you based on testing dedup on my test systems.
>
> Vennlige hilsener / Best regards
>
> roy

While I'm an Oracle employee, I don't have any insider knowledge on this. It's solely my experience talking.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
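[Editor's note: a short sketch of the same arithmetic, so the estimate can be rerun for other block sizes, dedup ratios, or SSD sizes. It assumes the thread's rules of thumb (~270 bytes per DDT entry and ~200 bytes of ARC per block held in L2ARC); those are planning figures, not exact constants, and the function name plan is invented for illustration.]

    # Back-of-the-envelope L2ARC/DDT planning, using the rules of thumb
    # quoted in this thread (~270 bytes per DDT entry, ~200 bytes of ARC
    # per block held in L2ARC). These are assumptions for rough planning.

    GB = 1 << 30
    TB = 1 << 40

    def plan(logical_bytes, avg_block, dedup_ratio, l2arc_bytes,
             ddt_entry=270, l2arc_header=200):
        unique_blocks = (logical_bytes / avg_block) / dedup_ratio
        ddt_bytes = unique_blocks * ddt_entry          # DDT size if fully cached
        cache_bytes = max(l2arc_bytes - ddt_bytes, 0)  # L2ARC left for data blocks
        cached_blocks = cache_bytes / avg_block
        arc_overhead = cached_blocks * l2arc_header    # ARC consumed by L2ARC headers
        return ddt_bytes, cache_bytes, arc_overhead

    ddt, cache, arc = plan(1 * TB, 4096, 5.0, 160 * GB)
    print("DDT ~%.1f GB, data cache ~%.1f GB, ARC overhead ~%.1f GB"
          % (ddt / GB, cache / GB, arc / GB))
    # -> roughly: DDT ~13.5 GB, data cache ~146 GB, ARC overhead ~7 GB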
valrhona at gmail.com
2010-Jun-30 20:35 UTC
[zfs-discuss] Dedup RAM requirements, vs. L2ARC?
Thanks to everyone for such helpful and detailed answers. Contrary to some of the trolls in other threads, I've had a fantastic experience here, and am grateful to the community.

Based on the feedback, I'll upgrade my machine to 8 GB of RAM. I only have two free slots on the motherboard, so I can either add two 2 GB DIMMs to the two already there, or throw those away and start over with 4 GB DIMMs, which is not something I'm quite ready to do yet (at least not before this is all working).

Now, for the SSD, Crucial appears to have their (recommended above) C300 64 GB drive for $150, which seems like a good deal. Intel's X25-M G2 is $200 for 80 GB. Does anyone have a strong opinion as to which would work better for the L2ARC? I am having a hard time understanding, from the performance numbers given, which would be a better choice.

Finally, for my purposes, it doesn't seem like a ZIL device is necessary? I'm the only user of the fileserver, so there probably won't be more than two or three computers, maximum, accessing stuff (and writing stuff) remotely.

But, from what I can gather, by spending a little under $400, I should substantially increase the performance of my system with dedup? Many thanks, again, in advance.
--
This message posted from opensolaris.org
valrhona at gmail.com
2010-Jun-30 21:01 UTC
[zfs-discuss] Dedup RAM requirements, vs. L2ARC?
Another question on SSDs in terms of performance vs. capacity.

Between $150 and $200, there are at least four SSDs that would fit the rough specifications for the L2ARC on my system:

1. Crucial C300, 64 GB: $150: medium performance, medium capacity.
2. OCZ Vertex 2, 50 GB: $180: higher performance, lower capacity. (The Agility 2 is similar, but $15 cheaper.)
3. Corsair Force 60 GB, $195: similar performance, slightly higher capacity (more over-provisioning with the same SandForce controller).
4. Intel X25-M G2, 80 GB: $200: largest capacity, probably lowest(?) performance.

So which would be the best choice for L2ARC? Is it size, or is it throughput, that really matters for this?

Within this range, price doesn't make much difference. Thanks, as always, for the guidance.
--
This message posted from opensolaris.org
On Wed, Jun 30, 2010 at 01:35:31PM -0700, valrhona at gmail.com wrote:
> Finally, for my purposes, it doesn't seem like a ZIL is necessary? I'm
> the only user of the fileserver, so there probably won't be more than
> two or three computers, maximum, accessing stuff (and writing stuff)
> remotely.

It depends on what you're doing.

The perennial complaint about NFS is the synchronous open()/close() operations and the fact that archivers (tar, ...) will generally unpack archives in a single-threaded manner, which means all those synchronous ops punctuate the archiver's performance with pauses. This is a load type for which ZIL devices come in quite handy. If you write lots of small files often and in single-threaded ways _and_ want to guarantee you don't lose transactions, then you want a ZIL device. (The recent knob for controlling whether synchronous I/O gets done asynchronously would help you if you don't care about losing a few seconds' worth of writes, assuming that feature makes it into any release of Solaris.)

> But, from what I can gather, by spending a little under $400, I should
> substantially increase the performance of my system with dedup? Many
> thanks, again, in advance.

If you have deduplicatious data, yes.

Nico
--
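[Editor's note: to make the workload concrete, here is a small illustration - not anything from the thread - of the pattern Nico describes: a single thread creating many small files and fsync()ing each one, so every file waits on a synchronous commit before the next begins. Over NFS, or locally with sync writes, this is exactly the case where a fast separate log device helps. The function name, file count, and sizes are arbitrary.]

    import os
    import tempfile
    import time

    # Single-threaded, fsync-per-file workload: each small file forces a
    # synchronous commit before the next one starts, so total time is
    # dominated by sync-write latency - the case a fast slog device helps.
    def unpack_like_workload(directory, n_files=1000, size=4096):
        payload = b"x" * size
        start = time.time()
        for i in range(n_files):
            path = os.path.join(directory, "file%05d" % i)
            fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
            try:
                os.write(fd, payload)
                os.fsync(fd)          # wait for the data to be on stable storage
            finally:
                os.close(fd)
        return time.time() - start

    with tempfile.TemporaryDirectory() as d:
        elapsed = unpack_like_workload(d)
        print("1000 fsync'd 4k files in %.2f s" % elapsed)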
On Wed, 2010-06-30 at 16:41 -0500, Nicolas Williams wrote:
> On Wed, Jun 30, 2010 at 01:35:31PM -0700, valrhona at gmail.com wrote:
> > Finally, for my purposes, it doesn't seem like a ZIL is necessary? I'm
> > the only user of the fileserver, so there probably won't be more than
> > two or three computers, maximum, accessing stuff (and writing stuff)
> > remotely.
>
> It depends on what you're doing.
>
> The perennial complaint about NFS is the synchronous open()/close()
> operations and the fact that archivers (tar, ...) will generally unpack
> archives in a single-threaded manner, which means all those synchronous
> ops punctuate the archiver's performance with pauses. This is a load
> type for which ZIL devices come in quite handy. If you write lots of
> small files often and in single-threaded ways _and_ want to guarantee
> you don't lose transactions, then you want a ZIL device. (The recent
> knob for controlling whether synchronous I/O gets done asynchronously
> would help you if you don't care about losing a few seconds' worth of
> writes, assuming that feature makes it into any release of Solaris.)

Btw, that feature will be in the NexentaStor 3.0.4 release (which is currently in late development/early QA, and should be out soon).

Archivers are not the only thing that acts this way, btw. Databases, and even something as benign as compiling a large software suite, can have similar implications where a fast slog device can help.

- Garrett
On 6/30/2010 2:01 PM, valrhona at gmail.com wrote:
> Another question on SSDs in terms of performance vs. capacity.
>
> Between $150 and $200, there are at least four SSDs that would fit the rough specifications for the L2ARC on my system:
>
> 1. Crucial C300, 64 GB: $150: medium performance, medium capacity.
> 2. OCZ Vertex 2, 50 GB: $180: higher performance, lower capacity. (The Agility 2 is similar, but $15 cheaper.)
> 3. Corsair Force 60 GB, $195: similar performance, slightly higher capacity (more over-provisioning with the same SandForce controller).
> 4. Intel X25-M G2, 80 GB: $200: largest capacity, probably lowest(?) performance.
>
> So which would be the best choice for L2ARC? Is it size, or is it throughput, that really matters for this?
>
> Within this range, price doesn't make much difference. Thanks, as always, for the guidance.

For L2ARC, random read performance is the primary important factor. Given what I've seen of the C300's random read performance for non-4k block sizes, and for short queue depths (both of which are highly likely for an L2ARC device), my recommendation of the above four is #4, the Intel device. #2 is possibly the best performing for ZIL usage, should you choose to use a portion of the device for that purpose.

For what you've said your usage pattern is, I think the Intel X25-M is the best fit for good performance and size for the dollar.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
> Actually, I think the rule of thumb is 270 bytes per DDT entry. It's
> 200 bytes of ARC for every L2ARC entry.
>
> The DDT doesn't count toward this ARC space usage.
>
> E.g.: I have 1 TB of 4k files that are to be deduped, and it turns
> out that I have about a 5:1 dedup ratio. I'd also like to see how much
> ARC usage I eat up with a 160 GB L2ARC.
>
> (1) How many entries are there in the DDT?
>     1 TB of 4k files means there are about 268 million blocks.
>     However, at a 5:1 dedup ratio, I'm only actually storing
>     20% of that, so I have about 54 million unique blocks.
>     Thus, I need a DDT of about 270 * 54 million =~ 14 GB in size.
> (2) My L2ARC is 160 GB in size, but I'm using 14 GB of it for the DDT. Thus,
>     I have about 146 GB free for use as a data cache.
>     146 GB / 4k =~ 37 million blocks can be stored in the
>     remaining L2ARC space.
>     However, referencing those 37 million blocks takes up:
>     200 * 37 million =~ 7 GB of space in ARC.
>     Thus, I'd better have at least 7 GB of RAM allocated
>     solely for L2ARC reference pointers, and no other use.

Hi Erik.

Are you saying the DDT will automatically look to be stored in an L2ARC device if one exists in the pool, instead of using ARC?

Or is there some sort of memory pressure point where the DDT gets moved from ARC to L2ARC?

Thanks,

Geoff
--
This message posted from opensolaris.org
On 7/1/2010 9:23 PM, Geoff Nordli wrote:
> Hi Erik.
>
> Are you saying the DDT will automatically look to be stored in an L2ARC device if one exists in the pool, instead of using ARC?
>
> Or is there some sort of memory pressure point where the DDT gets moved from ARC to L2ARC?
>
> Thanks,
>
> Geoff

Good question, and I don't know. My educated guess is the latter (initially stored in ARC, then moved to L2ARC as size increases).

Anyone?

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
On 07/01/10 22:33, Erik Trimble wrote:
> On 7/1/2010 9:23 PM, Geoff Nordli wrote:
>> Hi Erik.
>>
>> Are you saying the DDT will automatically look to be stored in an
>> L2ARC device if one exists in the pool, instead of using ARC?
>>
>> Or is there some sort of memory pressure point where the DDT gets
>> moved from ARC to L2ARC?
>>
>> Thanks,
>>
>> Geoff
>
> Good question, and I don't know. My educated guess is the latter
> (initially stored in ARC, then moved to L2ARC as size increases).
>
> Anyone?

The L2ARC just holds blocks that have been evicted from the ARC due to memory pressure. The DDT is no different than any other object (e.g. a file). So when looking for a block, ZFS checks first in the ARC, then the L2ARC, and if neither succeeds, reads from the main pool.

- Anyone.
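[Editor's note: a purely conceptual sketch of the lookup order Neil describes - not the actual ZFS code; the class and method names are invented for illustration. The ARC is checked first, then the L2ARC, and only then the pool, with evictions flowing from ARC to L2ARC. Dedup table blocks move through the same path, which is why a DDT that no longer fits in RAM degrades to SSD-speed, and then pool-speed, lookups.]

    # Conceptual model of the ARC / L2ARC / pool lookup order described
    # above. This is an illustration only - names and structure are
    # invented here and do not mirror the real ZFS implementation.

    class CacheModel:
        def __init__(self, read_from_pool):
            self.arc = {}              # in-RAM cache: block id -> data
            self.l2arc = {}            # SSD cache: holds blocks evicted from ARC
            self.read_from_pool = read_from_pool

        def read_block(self, blkid):
            if blkid in self.arc:               # 1. check ARC (RAM)
                return self.arc[blkid]
            if blkid in self.l2arc:             # 2. check L2ARC (SSD)
                data = self.l2arc[blkid]
            else:                               # 3. fall back to the main pool
                data = self.read_from_pool(blkid)
            self.arc[blkid] = data              # newly read blocks land in the ARC
            return data

        def evict_from_arc(self, blkid):
            # Under memory pressure, blocks leave the ARC and land in L2ARC.
            # DDT blocks are treated like any other block here.
            self.l2arc[blkid] = self.arc.pop(blkid)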
On 7/1/2010 10:17 PM, Neil Perrin wrote:
> On 07/01/10 22:33, Erik Trimble wrote:
>> On 7/1/2010 9:23 PM, Geoff Nordli wrote:
>>> Hi Erik.
>>>
>>> Are you saying the DDT will automatically look to be stored in an
>>> L2ARC device if one exists in the pool, instead of using ARC?
>>>
>>> Or is there some sort of memory pressure point where the DDT gets
>>> moved from ARC to L2ARC?
>>>
>>> Thanks,
>>>
>>> Geoff
>>
>> Good question, and I don't know. My educated guess is the latter
>> (initially stored in ARC, then moved to L2ARC as size increases).
>>
>> Anyone?
>
> The L2ARC just holds blocks that have been evicted from the ARC due
> to memory pressure. The DDT is no different than any other object
> (e.g. a file). So when looking for a block, ZFS checks first in the ARC,
> then the L2ARC, and if neither succeeds, reads from the main pool.
>
> - Anyone.

That's what I assumed. One further thought, though: is the DDT treated as a single entity - so it's *all* either in the ARC or in the L2ARC? Or does it move one entry at a time into the L2ARC as it fills the ARC?

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
On 07/02/10 00:57, Erik Trimble wrote:
> On 7/1/2010 10:17 PM, Neil Perrin wrote:
>> On 07/01/10 22:33, Erik Trimble wrote:
>>> On 7/1/2010 9:23 PM, Geoff Nordli wrote:
>>>> Hi Erik.
>>>>
>>>> Are you saying the DDT will automatically look to be stored in an
>>>> L2ARC device if one exists in the pool, instead of using ARC?
>>>>
>>>> Or is there some sort of memory pressure point where the DDT gets
>>>> moved from ARC to L2ARC?
>>>>
>>>> Thanks,
>>>>
>>>> Geoff
>>>
>>> Good question, and I don't know. My educated guess is the latter
>>> (initially stored in ARC, then moved to L2ARC as size increases).
>>>
>>> Anyone?
>>
>> The L2ARC just holds blocks that have been evicted from the ARC due
>> to memory pressure. The DDT is no different than any other object
>> (e.g. a file). So when looking for a block, ZFS checks first in the ARC,
>> then the L2ARC, and if neither succeeds, reads from the main pool.
>>
>> - Anyone.
>
> That's what I assumed. One further thought, though: is the DDT
> treated as a single entity - so it's *all* either in the ARC or in the
> L2ARC? Or does it move one entry at a time into the L2ARC as it fills
> the ARC?

It's not treated as a single entity; it moves a block at a time.

Neil.
On 7/2/2010 6:30 AM, Neil Perrin wrote:
> On 07/02/10 00:57, Erik Trimble wrote:
>> That's what I assumed. One further thought, though: is the DDT
>> treated as a single entity - so it's *all* either in the ARC or in
>> the L2ARC? Or does it move one entry at a time into the L2ARC as it
>> fills the ARC?
>
> It's not treated as a single entity; it moves a block at a time.
>
> Neil.

Where 1 block = ?

I'm assuming that more than one DDT entry will fit in a block (since DDT entries are ~270 bytes) - but how big does the block get? Depending on the total size of the DDT? Or does it use fixed-size blocks (I'd assume the smallest block possible, in this case)?

Which reminds me: the current DDT is stored on disk - correct? - so that when I boot up, ZFS loads a complete DDT into the ARC when the pool is mounted? Or is it all constructed on the fly?

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
>>>>> "np" == Neil Perrin <neil.perrin at oracle.com> writes:np> The L2ARC just holds blocks that have been evicted from the np> ARC due to memory pressure. The DDT is no different than any np> other object (e.g. file). The other cacheable objects require pointers to stay in the ARC pointing to blocks in the L2ARC. If the DDT required this, L2ARC-ification would be pointless since DDT entries aren''t much smaller than ARC->L2ARC pointers, so from what I hear it is actually special in some way though I don''t know precisely what way. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20100702/e4524624/attachment.bin>
On 07/02/10 11:14, Erik Trimble wrote:
> On 7/2/2010 6:30 AM, Neil Perrin wrote:
>> On 07/02/10 00:57, Erik Trimble wrote:
>>> That's what I assumed. One further thought, though: is the DDT
>>> treated as a single entity - so it's *all* either in the ARC or in
>>> the L2ARC? Or does it move one entry at a time into the L2ARC as it
>>> fills the ARC?
>>
>> It's not treated as a single entity; it moves a block at a time.
>>
>> Neil.
>
> Where 1 block = ?
> I'm assuming that more than one DDT entry will fit in a block (since
> DDT entries are ~270 bytes) - but how big does the block get?
> Depending on the total size of the DDT? Or does it use fixed-size
> blocks (I'd assume the smallest block possible, in this case)?

- Yes, a pool block will contain many DDT entries. They are stored as ZAP entries. I assume, but I'm not sure, that zap blocks grow to the maximum SPA block size (currently 128KB).

> Which reminds me: the current DDT is stored on disk - correct? - so
> that when I boot up, ZFS loads a complete DDT into the ARC when the
> pool is mounted? Or is it all constructed on the fly?

- It's read as needed on the fly.

Neil.