wangyan
2020-Jan-08 09:23 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans()
I found a NULL pointer dereference in ocfs2_update_inode_fsync_trans().
handle->h_transaction may be NULL in the following call path:
ocfs2_file_write_iter
  ->__generic_file_write_iter
    ->generic_perform_write
      ->ocfs2_write_begin
        ->ocfs2_write_begin_nolock
          ->ocfs2_write_cluster_by_desc
            ->ocfs2_write_cluster
              ->ocfs2_mark_extent_written
                ->ocfs2_change_extent_flag
                  ->ocfs2_split_extent
                    ->ocfs2_try_to_merge_extent
                      ->ocfs2_extend_rotate_transaction
                        ->ocfs2_extend_trans
                          ->jbd2_journal_restart
                            ->jbd2__journal_restart
                              // handle->h_transaction is NULL here
                              ->handle->h_transaction = NULL;
                              ->start_this_handle
                                /* journal aborted due to storage
                                   network disconnection, return error */
                                ->return -EROFS;
                      /* line 3806 in ocfs2_try_to_merge_extent (),
                         it will ignore ret error. */
                      ->ret = 0;
      ->...
      ->ocfs2_write_end
        ->ocfs2_write_end_nolock
          ->ocfs2_update_inode_fsync_trans
            // NULL pointer dereference
            ->oi->i_sync_tid = handle->h_transaction->t_tid;
The NULL pointer dereference produces the following kernel log:
JBD2: Detected IO errors while flushing file data on dm-11-45
Aborting journal on device dm-11-45.
JBD2: Error -5 detected when updating journal superblock for dm-11-45.
(dd,22081,3):ocfs2_extend_trans:474 ERROR: status = -30
(dd,22081,3):ocfs2_try_to_merge_extent:3877 ERROR: status = -30
Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000008
Mem abort info:
ESR = 0x96000004
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000e74e1338
[0000000000000008] pgd=0000000000000000
Internal error: Oops: 96000004 [#1] SMP
Process dd (pid: 22081, stack limit = 0x00000000584f35a9)
CPU: 3 PID: 22081 Comm: dd Kdump: loaded
Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 0.98 08/25/2019
pstate: 60400009 (nZCv daif +PAN -UAO)
pc : ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
lr : ocfs2_write_end_nolock+0x2a0/0x550 [ocfs2]
sp : ffff0000459fba70
x29: ffff0000459fba70 x28: 0000000000000000
x27: ffff807ccf7f1000 x26: 0000000000000001
x25: ffff807bdff57970 x24: ffff807caf1d4000
x23: ffff807cc79e9000 x22: 0000000000001000
x21: 000000006c6cd000 x20: ffff0000091d9000
x19: ffff807ccb239db0 x18: ffffffffffffffff
x17: 000000000000000e x16: 0000000000000007
x15: ffff807c5e15bd78 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000001
x9 : 0000000000000228 x8 : 000000000000000c
x7 : 0000000000000fff x6 : ffff807a308ed6b0
x5 : ffff7e01f10967c0 x4 : 0000000000000018
x3 : d0bc661572445600 x2 : 0000000000000000
x1 : 000000001b2e0200 x0 : 0000000000000000
Call trace:
ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
ocfs2_write_end+0x4c/0x80 [ocfs2]
generic_perform_write+0x108/0x1a8
__generic_file_write_iter+0x158/0x1c8
ocfs2_file_write_iter+0x668/0x950 [ocfs2]
__vfs_write+0x11c/0x190
vfs_write+0xac/0x1c0
ksys_write+0x6c/0xd8
__arm64_sys_write+0x24/0x30
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc
To prevent the NULL pointer dereference in this situation, check
is_handle_aborted() before dereferencing handle->h_transaction to read t_tid.
Signed-off-by: Yan Wang <wangyan122 at huawei.com>
Reviewed-by: Jun Piao <piaojun at huawei.com>
---
fs/ocfs2/journal.h | 8 +++++---
fs/ocfs2/namei.c | 3 +--
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index 3103ba7f97a2..bfe611ed1b1d 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -597,9 +597,11 @@ static inline void ocfs2_update_inode_fsync_trans(handle_t *handle,
 {
     struct ocfs2_inode_info *oi = OCFS2_I(inode);

-    oi->i_sync_tid = handle->h_transaction->t_tid;
-    if (datasync)
-        oi->i_datasync_tid = handle->h_transaction->t_tid;
+    if (!is_handle_aborted(handle)) {
+        oi->i_sync_tid = handle->h_transaction->t_tid;
+        if (datasync)
+            oi->i_datasync_tid = handle->h_transaction->t_tid;
+    }
 }

 #endif /* OCFS2_JOURNAL_H */
diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index 8ea51cf27b97..da65251ef815 100644
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -586,8 +586,7 @@ static int __ocfs2_mknod_locked(struct inode *dir,
             mlog_errno(status);
     }

-    oi->i_sync_tid = handle->h_transaction->t_tid;
-    oi->i_datasync_tid = handle->h_transaction->t_tid;
+    ocfs2_update_inode_fsync_trans(handle, inode, 1);

 leave:
     if (status < 0) {
--
2.19.1
Joseph Qi
2020-Jan-08 11:31 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans()
On 20/1/8 17:23, wangyan wrote:
> [...]
> diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
> index 3103ba7f97a2..bfe611ed1b1d 100644
> --- a/fs/ocfs2/journal.h
> +++ b/fs/ocfs2/journal.h
> @@ -597,9 +597,11 @@ static inline void ocfs2_update_inode_fsync_trans(handle_t *handle,
>  {
>      struct ocfs2_inode_info *oi = OCFS2_I(inode);
>
> -    oi->i_sync_tid = handle->h_transaction->t_tid;
> -    if (datasync)
> -        oi->i_datasync_tid = handle->h_transaction->t_tid;
> +    if (!is_handle_aborted(handle)) {
> +        oi->i_sync_tid = handle->h_transaction->t_tid;
> +        if (datasync)
> +            oi->i_datasync_tid = handle->h_transaction->t_tid;

Use tab instead of space, please.

> +    }
>  }
>
>  #endif /* OCFS2_JOURNAL_H */
> diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
> index 8ea51cf27b97..da65251ef815 100644
> --- a/fs/ocfs2/namei.c
> +++ b/fs/ocfs2/namei.c
> @@ -586,8 +586,7 @@ static int __ocfs2_mknod_locked(struct inode *dir,
>              mlog_errno(status);
>      }
>
> -    oi->i_sync_tid = handle->h_transaction->t_tid;
> -    oi->i_datasync_tid = handle->h_transaction->t_tid;
> +    ocfs2_update_inode_fsync_trans(handle, inode, 1);
>

I don't see any reason why we have to check handle here.

Thanks,
Joseph

>  leave:
>      if (status < 0) {
Changwei Ge
2020-Jan-09 01:57 UTC
[Ocfs2-devel] [PATCH] ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans()
On 1/8/20 5:23 PM, wangyan wrote:
> [...]
> diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
> index 3103ba7f97a2..bfe611ed1b1d 100644
> --- a/fs/ocfs2/journal.h
> +++ b/fs/ocfs2/journal.h
> @@ -597,9 +597,11 @@ static inline void ocfs2_update_inode_fsync_trans(handle_t *handle,
>  {
>      struct ocfs2_inode_info *oi = OCFS2_I(inode);
>
> -    oi->i_sync_tid = handle->h_transaction->t_tid;
> -    if (datasync)
> -        oi->i_datasync_tid = handle->h_transaction->t_tid;
> +    if (!is_handle_aborted(handle)) {
> +        oi->i_sync_tid = handle->h_transaction->t_tid;
> +        if (datasync)
> +            oi->i_datasync_tid = handle->h_transaction->t_tid;
> +    }

I don't think your way can fix the issue you reported completely.
Even if you check whether the journal is ABORTED, you still face a race
that can end up accessing a NULL h_transaction. Otherwise, you need help
from a synchronization mechanism.

Besides, if the journal is aborted, won't ocfs2 fence the machine by
resetting it?

Thanks,
Changwei

>  }
>
>  #endif /* OCFS2_JOURNAL_H */
> diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
> index 8ea51cf27b97..da65251ef815 100644
> --- a/fs/ocfs2/namei.c
> +++ b/fs/ocfs2/namei.c
> @@ -586,8 +586,7 @@ static int __ocfs2_mknod_locked(struct inode *dir,
>              mlog_errno(status);
>      }
>
> -    oi->i_sync_tid = handle->h_transaction->t_tid;
> -    oi->i_datasync_tid = handle->h_transaction->t_tid;
> +    ocfs2_update_inode_fsync_trans(handle, inode, 1);
>
>  leave:
>      if (status < 0) {