Greg KH
2022-Oct-21 05:37 UTC
[Ocfs2-devel] [PATCH 00/11] fix memory leak while kset_register() fails
On Fri, Oct 21, 2022 at 01:29:31AM -0400, Luben Tuikov wrote:
> On 2022-10-20 22:20, Yang Yingliang wrote:
> > The previous discussion link:
> > https://lore.kernel.org/lkml/0db486eb-6927-927e-3629-958f8f211194@huawei.com/T/
>
> The very first discussion on this was here:
>
> https://www.spinics.net/lists/dri-devel/msg368077.html
>
> Please use this link, and not the one up there which you quoted above,
> and whose commit description is taken verbatim from this link.
>
> > kset_register() is currently used in some places without calling
> > kset_put() in error path, because the callers think it should be
> > kset internal thing to do, but the driver core can not know what
> > caller doing with that memory at times. The memory could be freed
> > both in kset_put() and error path of caller, if it is called in
> > kset_register().
>
> As I explained in the link above, the reason there's
> a memory leak is that one cannot call kset_register() without
> the kset->kobj.name being set--kobject_add_internal() returns -EINVAL,
> in this case, i.e. kset_register() fails with -EINVAL.
>
> Thus, the most common usage is something like this:
>
>     kobject_set_name(&kset->kobj, format, ...);
>     kset->kobj.kset = parent_kset;
>     kset->kobj.ktype = ktype;
>     res = kset_register(kset);
>
> So, what is being leaked, is the memory allocated in kobject_set_name(),
> by the common idiom shown above. This needs to be mentioned in
> the documentation, at least, in case, in the future this is absolved
> in kset_register() redesign, etc.

Based on this, can kset_register() just clean up after itself when an
error happens? Ideally that would be the case, as the odds of a kset
being embedded in a larger structure are probably slim, but we would have
to search the tree to make sure.

thanks,

greg k-h
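To make the leak in the quoted idiom concrete, here is a minimal userspace sketch. It is not kernel code: `model_kobject`, `model_kset`, `model_set_name()`, `model_kset_register()`, `model_kset_put()`, `setup_kset()` and the `live_names` counter are all hypothetical stand-ins, and the `inject_error` flag is an assumption that imitates the error injection mentioned in the thread. The only behaviour it tries to mirror is the one described above: the name is allocated before registration, and a failed registration does not free it.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins for illustration; these are NOT the kernel structs. */
struct model_kobject {
	char *name;    /* heap string, as a name-setting helper would allocate */
	int refcount;
};

struct model_kset {
	struct model_kobject kobj;
};

int live_names; /* outstanding name allocations, used to spot the leak */

/* Models the name allocation done before registration. */
int model_set_name(struct model_kobject *kobj, const char *name)
{
	char *copy = malloc(strlen(name) + 1);

	if (!copy)
		return -ENOMEM;
	strcpy(copy, name);
	kobj->name = copy;
	live_names++;
	return 0;
}

/* Drops one reference; the name is freed when the last one goes away. */
void model_kset_put(struct model_kset *kset)
{
	if (--kset->kobj.refcount == 0) {
		free(kset->kobj.name);
		kset->kobj.name = NULL;
		live_names--;
	}
}

/* Registration that can be forced to fail and does no cleanup itself. */
int model_kset_register(struct model_kset *kset, int inject_error)
{
	kset->kobj.refcount = 1;
	if (!kset->kobj.name || !kset->kobj.name[0])
		return -EINVAL;
	return inject_error ? -ENOMEM : 0;
}

/* The common idiom, with an optional caller-side put on failure. */
int setup_kset(struct model_kset *kset, int inject_error, int put_on_error)
{
	int res = model_set_name(&kset->kobj, "example");

	if (res)
		return res;
	res = model_kset_register(kset, inject_error);
	if (res && put_on_error)
		model_kset_put(kset); /* without this, the name stays allocated */
	return res;
}
```

Calling `setup_kset(&kset, 1, 0)` leaves `live_names` at 1 (the leak), while `setup_kset(&kset, 1, 1)` brings it back down, which is the caller-side fix the patch series applies.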
Luben Tuikov
2022-Oct-21 07:55 UTC
[Ocfs2-devel] [PATCH 00/11] fix memory leak while kset_register() fails
On 2022-10-21 01:37, Greg KH wrote:
> [...]
> Based on this, can kset_register() just clean up after itself when an
> error happens? Ideally that would be the case, as the odds of a kset
> being embedded in a larger structure are probably slim, but we would have
> to search the tree to make sure.

Looking at kset_register(), we can add kset_put() in the error path,
when kobject_add_internal(&kset->kobj) fails.

See the attached patch. It needs to be tested with the same error injection
as Yang has been doing.

Now, struct kset is being embedded in larger structs--see amdgpu_discovery.c
starting at line 575. If you're on an AMD system, it gets you the tree
structure you'll see when you run "tree /sys/class/drm/card0/device/ip_discovery/".
That shouldn't be a problem though.

Regards,
Luben
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-kobject-Add-kset_put-if-kset_register-fails.patch
Type: text/x-patch
Size: 1221 bytes
Desc: not available
URL: <http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20221021/746437c0/attachment-0001.bin>
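The attached patch is not reproduced in the archive, so the following is only a guess at the shape of the idea, written as a userspace model rather than kernel code. All `model_*` names and the `names_freed` counter are hypothetical, and `inject_error` is an assumed stand-in for error injection. It shows registration dropping its own reference when the internal add step fails, so the caller has nothing left to clean up.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins for illustration; these are NOT the kernel structs. */
struct model_kobject {
	char *name;
	int refcount;
};

struct model_kset {
	struct model_kobject kobj;
};

int names_freed; /* how many names the put path has released */

/* Models the name allocation done before registration. */
int model_set_name(struct model_kobject *kobj, const char *name)
{
	char *copy = malloc(strlen(name) + 1);

	if (!copy)
		return -ENOMEM;
	strcpy(copy, name);
	kobj->name = copy;
	return 0;
}

/* Drops one reference and frees the name on the final put. */
void model_kobject_put(struct model_kobject *kobj)
{
	if (--kobj->refcount > 0)
		return;
	if (kobj->name) {
		free(kobj->name);
		kobj->name = NULL;
		names_freed++;
	}
}

/* Models the internal add step: rejects a missing name, can be forced to fail. */
int model_add_internal(struct model_kobject *kobj, int inject_error)
{
	if (!kobj->name || !kobj->name[0])
		return -EINVAL;
	return inject_error ? -ENOMEM : 0;
}

/* Registration that cleans up after itself on failure (the proposed change). */
int model_kset_register(struct model_kset *kset, int inject_error)
{
	int err;

	kset->kobj.refcount = 1;
	err = model_add_internal(&kset->kobj, inject_error);
	if (err)
		model_kobject_put(&kset->kobj); /* releases the name on error */
	return err;
}
```

With this shape a caller only checks the return value. Whether that is safe depends on what the final put releases, which is the question about embedded ksets raised in the thread.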
Greg KH
2022-Oct-21 08:18 UTC
[Ocfs2-devel] [PATCH 00/11] fix memory leak while kset_register() fails
On Fri, Oct 21, 2022 at 03:55:18AM -0400, Luben Tuikov wrote:
> [...]
> Looking at kset_register(), we can add kset_put() in the error path,
> when kobject_add_internal(&kset->kobj) fails.
>
> See the attached patch. It needs to be tested with the same error injection
> as Yang has been doing.
>
> Now, struct kset is being embedded in larger structs--see amdgpu_discovery.c
> starting at line 575. If you're on an AMD system, it gets you the tree
> structure you'll see when you run "tree /sys/class/drm/card0/device/ip_discovery/".
> That shouldn't be a problem though.

Yes, that shouldn't be an issue as the kobject embedded in a kset is
ONLY for that kset itself, the kset structure should not be controlling
the lifespan of the object it is embedded in, right?

Note, the use of ksets by a device driver like you are doing here in the
amd driver is BROKEN and will cause problems for userspace tools. Don't
do that please, just use a single subdirectory for an attribute. Doing
deeper stuff like this is sure to cause problems and be a headache.

thanks,

greg k-h
Yang Yingliang
2022-Oct-21 08:24 UTC
[Ocfs2-devel] [PATCH 00/11] fix memory leak while kset_register() fails
On 2022/10/21 13:37, Greg KH wrote:
> [...]
> Based on this, can kset_register() just clean up after itself when an
> error happens? Ideally that would be the case, as the odds of a kset
> being embedded in a larger structure are probably slim, but we would have
> to search the tree to make sure.

I have searched the whole tree. The ksets used in bus_register() - patch #3,
kset_create_and_add() - patch #4, __class_register() - patch #5,
fw_cfg_build_symlink() - patch #6 and amdgpu_discovery.c - patch #10 are
embedded in a larger structure. In these cases, we can not call kset_put()
in the error path in kset_register() itself.

Thanks,
Yang
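The hazard with embedded ksets can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel's code: `model_bus`, `model_bus_release()`, `model_bus_register()` and the other `model_*` names are hypothetical, registration is made to fail unconditionally, and a `times_freed` counter replaces a real `free()` so the double release can be observed safely. The assumed setup is a release callback that frees the containing object plus a caller error path that frees it as well.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for illustration; these are NOT the kernel structs. */
struct model_kobject {
	int refcount;
	void (*release)(struct model_kobject *kobj); /* plays the role of a release hook */
};

struct model_kset {
	struct model_kobject kobj;
};

/* Hypothetical larger object that embeds a kset. */
struct model_bus {
	struct model_kset subsys;
	int times_freed; /* counts frees instead of really freeing */
};

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

void model_bus_free(struct model_bus *bus)
{
	bus->times_freed++;
}

/* Release hook: the final put frees the whole containing object. */
void model_bus_release(struct model_kobject *kobj)
{
	struct model_kset *kset = model_container_of(kobj, struct model_kset, kobj);
	struct model_bus *bus = model_container_of(kset, struct model_bus, subsys);

	model_bus_free(bus);
}

void model_kset_put(struct model_kset *kset)
{
	if (--kset->kobj.refcount == 0 && kset->kobj.release)
		kset->kobj.release(&kset->kobj);
}

/* Always fails; put_on_error selects whether it cleans up internally. */
int model_kset_register(struct model_kset *kset, int put_on_error)
{
	kset->kobj.refcount = 1;
	if (put_on_error)
		model_kset_put(kset);
	return -ENOMEM;
}

/* Caller whose existing error path already frees the container. */
int model_bus_register(struct model_bus *bus, int put_on_error)
{
	int err;

	bus->subsys.kobj.release = model_bus_release;
	err = model_kset_register(&bus->subsys, put_on_error);
	if (err)
		model_bus_free(bus); /* second free if the put already released it */
	return err;
}
```

With `put_on_error` set to 0 the container is released once; with 1 it is released twice, which in real code would be a double free. That is why callers of this kind need their own error handling adjusted rather than a blanket put inside registration.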