On 08/22/08 10:49, Stan Park wrote:
> I'm working on doing I/O working with only the DMU and ZIL interfaces
> from the ZFS-FUSE source. I think I've got things working except for ZIL
> replays. For large writes and when slogging is not available, the itx
> write state is set to WR_INDIRECT. If I understand the ZIL correctly, this
> means that the actual data to be written isn't a part of the object's
> associated ZIL entry until the ZIL is being flushed through a zil_commit()
> call. In the WR_INDIRECT case, a blkptr is retrieved through a
> zfs_get_data() -> dmu_sync() call and stored in the ZIL entry on disk.
Correct. A large block is written directly into the main pool, and its block
pointer is saved in the ZIL record that is written to the intent log. When
the txg later commits, the block is simply linked into the txg's tree of blocks.
If a crash/power fail occurs before this, then
>
> When I force a crash and try to run through a replay, the
> zfs_replay_write() call does this: 'char *data = (char *)(lr + 1); /* data
> follows lr_write_t */' It seems to me that this only works when the
> WR_COPIED state is used since the data should be pointed to by
> lr->lr_blkptr in the WR_INDIRECT case. Indeed, when running a replay of
> WR_INDIRECT txs, all the write data is missing. Do WR_COPIED and
> WR_INDIRECT use different code paths during replay?
To make things easier at replay time, the data is read from the block while
the log is being parsed to extract the log records, so the replay function
always finds the data right after the record. This read happens in
zil_replay_log_record(); see the code after the comment: "If this is a
TX_WRITE with a blkptr, suck in the data".
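As a rough illustration of that idea (not the actual ZFS code), here is a
small standalone sketch. Every name in it (sketch_lr_write_t,
sketch_parse_record(), sketch_replay_write(), SKETCH_NO_BLOCK, and the
array-of-strings "pool") is made up for the example, and a plain array index
stands in for a real blkptr. It only shows the shape: the parse step copies
the data in behind the record for both cases, so the replay step can keep
using (lr + 1).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; not the real ZFS types. */
#define SKETCH_NO_BLOCK UINT32_MAX

typedef struct sketch_lr_write {
	uint64_t offset;  /* file offset of the write */
	uint64_t length;  /* number of bytes written */
	uint32_t blk;     /* index into the fake pool, or SKETCH_NO_BLOCK */
} sketch_lr_write_t;

/*
 * Parse step: build an in-memory record whose data always follows the
 * header. For an "indirect" record the data is fetched from the fake pool;
 * for a "copied" record it already follows the header in the log.
 * The caller frees the result.
 */
sketch_lr_write_t *
sketch_parse_record(const void *ondisk, const char *const *pool)
{
	sketch_lr_write_t hdr;
	sketch_lr_write_t *lr;
	char *data;

	assert(ondisk != NULL);
	memcpy(&hdr, ondisk, sizeof (hdr));

	lr = malloc(sizeof (hdr) + hdr.length);
	if (lr == NULL)
		return (NULL);
	*lr = hdr;
	data = (char *)(lr + 1);

	if (hdr.blk != SKETCH_NO_BLOCK) {
		/* Indirect: pull the data in from the block it points at. */
		memcpy(data, pool[hdr.blk], hdr.length);
	} else {
		/* Copied: the data sits right after the header in the log. */
		memcpy(data, (const char *)ondisk + sizeof (hdr), hdr.length);
	}
	return (lr);
}

/* Replay step: identical for both cases, data follows the record. */
void
sketch_replay_write(const sketch_lr_write_t *lr, char *file)
{
	const char *data = (const char *)(lr + 1);

	memcpy(file + lr->offset, data, lr->length);
}
```

With this layout the replay function does not need to know which write state
produced the record; only the parse step has to tell the two cases apart.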
BTW, we developed a tool to test the ZIL (see the attached script, ziltest),
which you may be able to adapt to check out your work.
Neil.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ziltest
URL: <http://mail.opensolaris.org/pipermail/zfs-code/attachments/20080903/7d051bd4/attachment.ksh>