Greg KH
2022-Oct-21 08:18 UTC
[Ocfs2-devel] [PATCH 00/11] fix memory leak while kset_register() fails
On Fri, Oct 21, 2022 at 03:55:18AM -0400, Luben Tuikov wrote:> On 2022-10-21 01:37, Greg KH wrote: > > On Fri, Oct 21, 2022 at 01:29:31AM -0400, Luben Tuikov wrote: > >> On 2022-10-20 22:20, Yang Yingliang wrote: > >>> The previous discussion link: > >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F0db486eb-6927-927e-3629-958f8f211194%40huawei.com%2FT%2F&data=05%7C01%7Cluben.tuikov%40amd.com%7C65b33f087ef245a9f23708dab3264840%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638019274318153227%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=1ZoieEob62iU9kI8fvpp20qGut9EeHKIHtCAT01t%2Bz8%3D&reserved=0 > >> > >> The very first discussion on this was here: > >> > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Fdri-devel%2Fmsg368077.html&data=05%7C01%7Cluben.tuikov%40amd.com%7C65b33f087ef245a9f23708dab3264840%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638019274318153227%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=9joWxGLUxZZMvrfkxCR8KbkoXifsqoMK0vGR%2FyEG62w%3D&reserved=0 > >> > >> Please use this link, and not the that one up there you which quoted above, > >> and whose commit description is taken verbatim from the this link. > >> > >>> > >>> kset_register() is currently used in some places without calling > >>> kset_put() in error path, because the callers think it should be > >>> kset internal thing to do, but the driver core can not know what > >>> caller doing with that memory at times. The memory could be freed > >>> both in kset_put() and error path of caller, if it is called in > >>> kset_register(). > >> > >> As I explained in the link above, the reason there's > >> a memory leak is that one cannot call kset_register() without > >> the kset->kobj.name being set--kobj_add_internal() returns -EINVAL, > >> in this case, i.e. kset_register() fails with -EINVAL. > >> > >> Thus, the most common usage is something like this: > >> > >> kobj_set_name(&kset->kobj, format, ...); > >> kset->kobj.kset = parent_kset; > >> kset->kobj.ktype = ktype; > >> res = kset_register(kset); > >> > >> So, what is being leaked, is the memory allocated in kobj_set_name(), > >> by the common idiom shown above. This needs to be mentioned in > >> the documentation, at least, in case, in the future this is absolved > >> in kset_register() redesign, etc. > > > > Based on this, can kset_register() just clean up from itself when an > > error happens? Ideally that would be the case, as the odds of a kset > > being embedded in a larger structure is probably slim, but we would have > > to search the tree to make sure. > > Looking at kset_register(), we can add kset_put() in the error path, > when kobject_add_internal(&kset->kobj) fails. > > See the attached patch. It needs to be tested with the same error injection > as Yang has been doing. > > Now, struct kset is being embedded in larger structs--see amdgpu_discovery.c > starting at line 575. If you're on an AMD system, it gets you the tree > structure you'll see when you run "tree /sys/class/drm/card0/device/ip_discovery/". > That shouldn't be a problem though.Yes, that shouldn't be an issue as the kobject embedded in a kset is ONLY for that kset itself, the kset structure should not be controling the lifespan of the object it is embedded in, right? Note, the use of ksets by a device driver like you are doing here in the amd driver is BROKEN and will cause problems by userspace tools. Don't do that please, just use a single subdirectory for an attribute. Doing deeper stuff like this is sure to cause problems and be a headache. thanks, greg k-h
Luben Tuikov
2022-Oct-21 08:24 UTC
[Ocfs2-devel] [PATCH 00/11] fix memory leak while kset_register() fails
On 2022-10-21 04:18, Greg KH wrote:> On Fri, Oct 21, 2022 at 03:55:18AM -0400, Luben Tuikov wrote: >> On 2022-10-21 01:37, Greg KH wrote: >>> On Fri, Oct 21, 2022 at 01:29:31AM -0400, Luben Tuikov wrote: >>>> On 2022-10-20 22:20, Yang Yingliang wrote: >>>>> The previous discussion link: >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F0db486eb-6927-927e-3629-958f8f211194%40huawei.com%2FT%2F&data=05%7C01%7Cluben.tuikov%40amd.com%7Cd41da3fd6449492d01f808dab33cdb75%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638019371236833115%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=C%2Bj1THkHpzVGks5eqB%2Fm%2FPAkMRohR7CYvRnOCqUqdcM%3D&reserved=0 >>>> >>>> The very first discussion on this was here: >>>> >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Fdri-devel%2Fmsg368077.html&data=05%7C01%7Cluben.tuikov%40amd.com%7Cd41da3fd6449492d01f808dab33cdb75%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638019371236833115%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pSR10abmK8nAMvKSezqWC0SPUBL4qEwtCCizyIKW7Dc%3D&reserved=0 >>>> >>>> Please use this link, and not the that one up there you which quoted above, >>>> and whose commit description is taken verbatim from the this link. >>>> >>>>> >>>>> kset_register() is currently used in some places without calling >>>>> kset_put() in error path, because the callers think it should be >>>>> kset internal thing to do, but the driver core can not know what >>>>> caller doing with that memory at times. The memory could be freed >>>>> both in kset_put() and error path of caller, if it is called in >>>>> kset_register(). >>>> >>>> As I explained in the link above, the reason there's >>>> a memory leak is that one cannot call kset_register() without >>>> the kset->kobj.name being set--kobj_add_internal() returns -EINVAL, >>>> in this case, i.e. kset_register() fails with -EINVAL. >>>> >>>> Thus, the most common usage is something like this: >>>> >>>> kobj_set_name(&kset->kobj, format, ...); >>>> kset->kobj.kset = parent_kset; >>>> kset->kobj.ktype = ktype; >>>> res = kset_register(kset); >>>> >>>> So, what is being leaked, is the memory allocated in kobj_set_name(), >>>> by the common idiom shown above. This needs to be mentioned in >>>> the documentation, at least, in case, in the future this is absolved >>>> in kset_register() redesign, etc. >>> >>> Based on this, can kset_register() just clean up from itself when an >>> error happens? Ideally that would be the case, as the odds of a kset >>> being embedded in a larger structure is probably slim, but we would have >>> to search the tree to make sure. >> >> Looking at kset_register(), we can add kset_put() in the error path, >> when kobject_add_internal(&kset->kobj) fails. >> >> See the attached patch. It needs to be tested with the same error injection >> as Yang has been doing. >> >> Now, struct kset is being embedded in larger structs--see amdgpu_discovery.c >> starting at line 575. If you're on an AMD system, it gets you the tree >> structure you'll see when you run "tree /sys/class/drm/card0/device/ip_discovery/". >> That shouldn't be a problem though. > > Yes, that shouldn't be an issue as the kobject embedded in a kset is > ONLY for that kset itself, the kset structure should not be controling > the lifespan of the object it is embedded in, right?Yes, and it doesn't. It only does a kobject_get(parent) and kobject_put(parent). So that's fine and natural. Yang, do you want to try the patch in my previous email in this thread, since you've got the error injection set up already? Regards, Luben