Daniel P. Berrangé
2019-Apr-18 15:55 UTC
Re: [libvirt-users] [libvirt] JVM crashes during GC
On Thu, Apr 18, 2019 at 05:51:06PM +0200, Michal Prívozník wrote:> On 4/17/19 10:24 AM, Sachin Soman wrote: > > Hi, > > > > Could you tell me if the following is some known issue? > > > > While performing the following simple test, I see my JVM crashing > > (consistently): > > 1. Open a connection to an ESXi driver/host (passing ConnectAuthDefault > > instance). > > 2. Close the connection. > > 3. Invoke GC > > > > When GC is triggered, at some point, some unallocated native memory is > > being tried to release. That's failing. > > > > The error thrown is: > > > > java(78745,0x70000241e000) malloc: *** error for object 0x7fd5df561390: > > pointer being freed was not allocated > > > > *** set a breakpoint in malloc_error_break to debug > > > > > > Frames from core dump: > > > > frame #0: 0x00007fff5b274b66 libsystem_kernel.dylib`__pthread_kill + 10 > > > > frame #1: 0x00007fff5b43f080 libsystem_pthread.dylib`pthread_kill + 333 > > > > frame #2: 0x00007fff5b1d01ae libsystem_c.dylib`abort + 127 > > > > frame #3: 0x00007fff5b2ce8a6 libsystem_malloc.dylib`free + 521 > > > > frame #4: 0x00000001127f43a7 > > > > frame #5: 0x00000001127e3ffd > > > > frame #6: 0x00000001127e3ffd > > > > frame #7: 0x00000001127e3ffd > > > > frame #8: 0x00000001127e3ffd > > > > frame #9: 0x00000001127e4042 > > > > frame #10: 0x00000001127e3ffd > > > > frame #11: 0x00000001127e3ffd > > > > frame #12: 0x00000001127dc4e7 > > > > frame #13: 0x000000010c0e235e > > libjvm.dylib`JavaCalls::call_helper(JavaValue*, methodHandle*, > > JavaCallArguments*, Thread*) + 1710 > > > > frame #14: 0x000000010c0e2b02 > > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, > > Symbol*, JavaCallArguments*, Thread*) + 356 > > > > frame #15: 0x000000010c0e2cae > > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, > > Symbol*, Symbol*, Thread*) + 74 > > > > frame #16: 0x000000010c1208ee libjvm.dylib`thread_entry(JavaThread*, > > Thread*) + 124 > > > > frame #17: 0x000000010c33e84d > > libjvm.dylib`JavaThread::thread_main_inner() + 155 > > > > frame #18: 0x000000010c33ff12 libjvm.dylib`JavaThread::run() + 448 > > > > frame #19: 0x000000010c26058a libjvm.dylib`java_start(Thread*) + 246 > > > > frame #20: 0x00007fff5b43c661 libsystem_pthread.dylib`_pthread_body + > > 340 > > > > frame #21: 0x00007fff5b43c50d libsystem_pthread.dylib`_pthread_start + > > 377 > > > > frame #22: 0x00007fff5b43bbf9 libsystem_pthread.dylib`thread_start + 13 > > > > > > I have installed Libvirt 5.2.0. > > Java bindings libvirt-java 0.5.1 > > JNA 4.0.0 > > Tested Java environments: Oracle Java 8 and OpenJDK 8 on MAC, OpenJDK 11 on > > Ubuntu 16 > > The backtrace does not suggest it's libvirt related, but I wouldn't be > surprised if our Java bindings mangled memory somewhere. They are > heavily unmaintained.It could just as easily be a memory corruption bug in the ESX libvirt driver, since that runs directly in the applicatin process as it is a stateless client side driver. We would probably need to have an small demo program that can reproduce the problem in an isolated fashion, in order to try to debug it, along with full libvirt debug logs. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
I am attaching the execution results. At the top of each file I have mentioned the environment details. Following is the test program I have used: ================================================= *package* org.libvirt; *import* org.libvirt.jna.Libvirt; *public* *class* LibvirtCrashTest { *void* createAndDestroyDefaultAuthConnection() { ConnectAuth ca = *new* ConnectAuthDefault(); *try* { System.*out*.println("Starting new connection with default auth"); Connect connect = *new* Connect("esx://x.x.x.x/?no_verify=1", ca, 0); Thread.*sleep*(1000); System.*out*.println("Explicit connection closure"); connect.close(); Thread.*sleep*(5000); } *catch* (Exception e) { e.printStackTrace(); } } *public* *static* *void* main(String[] args) *throws* Exception { LibvirtCrashTest testInstance = *new* LibvirtCrashTest(); *for*(*int* counter = 0; counter < 3; counter++) { testInstance.createAndDestroyDefaultAuthConnection(); System.*out*.println("gc'ing"); System.*gc*(); System.*out*.println("gc'd"); *int* tCounter = 0; *while*(tCounter++ < 20) { System.*out*.println("waiting.. " + tCounter); Thread.*sleep*(1000); } } System.*out*.println("Going down..."); } } ================================================= Thanks & Regards Sachin Soman On Thu, Apr 18, 2019 at 9:25 PM Daniel P. Berrangé <berrange@redhat.com> wrote:> On Thu, Apr 18, 2019 at 05:51:06PM +0200, Michal Prívozník wrote: > > On 4/17/19 10:24 AM, Sachin Soman wrote: > > > Hi, > > > > > > Could you tell me if the following is some known issue? > > > > > > While performing the following simple test, I see my JVM crashing > > > (consistently): > > > 1. Open a connection to an ESXi driver/host (passing ConnectAuthDefault > > > instance). > > > 2. Close the connection. > > > 3. Invoke GC > > > > > > When GC is triggered, at some point, some unallocated native memory is > > > being tried to release. That's failing. > > > > > > The error thrown is: > > > > > > java(78745,0x70000241e000) malloc: *** error for object 0x7fd5df561390: > > > pointer being freed was not allocated > > > > > > *** set a breakpoint in malloc_error_break to debug > > > > > > > > > Frames from core dump: > > > > > > frame #0: 0x00007fff5b274b66 libsystem_kernel.dylib`__pthread_kill > + 10 > > > > > > frame #1: 0x00007fff5b43f080 libsystem_pthread.dylib`pthread_kill > + 333 > > > > > > frame #2: 0x00007fff5b1d01ae libsystem_c.dylib`abort + 127 > > > > > > frame #3: 0x00007fff5b2ce8a6 libsystem_malloc.dylib`free + 521 > > > > > > frame #4: 0x00000001127f43a7 > > > > > > frame #5: 0x00000001127e3ffd > > > > > > frame #6: 0x00000001127e3ffd > > > > > > frame #7: 0x00000001127e3ffd > > > > > > frame #8: 0x00000001127e3ffd > > > > > > frame #9: 0x00000001127e4042 > > > > > > frame #10: 0x00000001127e3ffd > > > > > > frame #11: 0x00000001127e3ffd > > > > > > frame #12: 0x00000001127dc4e7 > > > > > > frame #13: 0x000000010c0e235e > > > libjvm.dylib`JavaCalls::call_helper(JavaValue*, methodHandle*, > > > JavaCallArguments*, Thread*) + 1710 > > > > > > frame #14: 0x000000010c0e2b02 > > > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, > > > Symbol*, JavaCallArguments*, Thread*) + 356 > > > > > > frame #15: 0x000000010c0e2cae > > > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, > > > Symbol*, Symbol*, Thread*) + 74 > > > > > > frame #16: 0x000000010c1208ee > libjvm.dylib`thread_entry(JavaThread*, > > > Thread*) + 124 > > > > > > frame #17: 0x000000010c33e84d > > > libjvm.dylib`JavaThread::thread_main_inner() + 155 > > > > > > frame #18: 0x000000010c33ff12 libjvm.dylib`JavaThread::run() + 448 > > > > > > frame #19: 0x000000010c26058a libjvm.dylib`java_start(Thread*) + > 246 > > > > > > frame #20: 0x00007fff5b43c661 > libsystem_pthread.dylib`_pthread_body + > > > 340 > > > > > > frame #21: 0x00007fff5b43c50d > libsystem_pthread.dylib`_pthread_start + > > > 377 > > > > > > frame #22: 0x00007fff5b43bbf9 libsystem_pthread.dylib`thread_start > + 13 > > > > > > > > > I have installed Libvirt 5.2.0. > > > Java bindings libvirt-java 0.5.1 > > > JNA 4.0.0 > > > Tested Java environments: Oracle Java 8 and OpenJDK 8 on MAC, OpenJDK > 11 on > > > Ubuntu 16 > > > > The backtrace does not suggest it's libvirt related, but I wouldn't be > > surprised if our Java bindings mangled memory somewhere. They are > > heavily unmaintained. > > It could just as easily be a memory corruption bug in the ESX libvirt > driver, since that runs directly in the applicatin process as it is a > stateless client side driver. > > We would probably need to have an small demo program that can reproduce > the problem in an isolated fashion, in order to try to debug it, along > with full libvirt debug logs. > > > Regards, > Daniel > -- > |: https://berrange.com -o- > https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- > https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- > https://www.instagram.com/dberrange :| >
Note: A couple of times I have seen errors while closing the connection (the trace ending with virFree). Also, a few times I have seen backtraces which show the flow going via esx driver and finally failing to close connection. Unfortunately I dont have those logs anymore. The execution results I have shared have been obtained using Libvirt built from source using the following config parameters: # configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-esx=yes The libvirt java bindings I have taken from : https://github.com/libvirt/libvirt-java I have added a sysout in the "finalize" method of Connect class in there, thats why the "Connect finalizing.." messages in execution results. Thanks & Regards Sachin Soman On Thu, Apr 18, 2019 at 10:46 PM Sachin Soman <sachonline.soman@gmail.com> wrote:> I am attaching the execution results. At the top of each file I have > mentioned the environment details. > > Following is the test program I have used: > > =================================================> > *package* org.libvirt; > > > *import* org.libvirt.jna.Libvirt; > > > *public* *class* LibvirtCrashTest { > > *void* createAndDestroyDefaultAuthConnection() { > > ConnectAuth ca = *new* ConnectAuthDefault(); > > *try* { > > System.*out*.println("Starting new connection with default auth"); > > Connect connect = *new* Connect("esx://x.x.x.x/?no_verify=1", ca, 0); > > Thread.*sleep*(1000); > > System.*out*.println("Explicit connection closure"); > > connect.close(); > > Thread.*sleep*(5000); > > } *catch* (Exception e) { > > e.printStackTrace(); > > } > > } > > > *public* *static* *void* main(String[] args) *throws* Exception { > > LibvirtCrashTest testInstance = *new* LibvirtCrashTest(); > > > *for*(*int* counter = 0; counter < 3; counter++) { > > testInstance.createAndDestroyDefaultAuthConnection(); > > System.*out*.println("gc'ing"); > > System.*gc*(); > > System.*out*.println("gc'd"); > > *int* tCounter = 0; > > *while*(tCounter++ < 20) { > > System.*out*.println("waiting.. " + tCounter); > > Thread.*sleep*(1000); > > } > > } > > System.*out*.println("Going down..."); > > } > > > } > =================================================> > > Thanks & Regards > Sachin Soman > > > > > On Thu, Apr 18, 2019 at 9:25 PM Daniel P. Berrangé <berrange@redhat.com> > wrote: > >> On Thu, Apr 18, 2019 at 05:51:06PM +0200, Michal Prívozník wrote: >> > On 4/17/19 10:24 AM, Sachin Soman wrote: >> > > Hi, >> > > >> > > Could you tell me if the following is some known issue? >> > > >> > > While performing the following simple test, I see my JVM crashing >> > > (consistently): >> > > 1. Open a connection to an ESXi driver/host (passing >> ConnectAuthDefault >> > > instance). >> > > 2. Close the connection. >> > > 3. Invoke GC >> > > >> > > When GC is triggered, at some point, some unallocated native memory is >> > > being tried to release. That's failing. >> > > >> > > The error thrown is: >> > > >> > > java(78745,0x70000241e000) malloc: *** error for object >> 0x7fd5df561390: >> > > pointer being freed was not allocated >> > > >> > > *** set a breakpoint in malloc_error_break to debug >> > > >> > > >> > > Frames from core dump: >> > > >> > > frame #0: 0x00007fff5b274b66 >> libsystem_kernel.dylib`__pthread_kill + 10 >> > > >> > > frame #1: 0x00007fff5b43f080 libsystem_pthread.dylib`pthread_kill >> + 333 >> > > >> > > frame #2: 0x00007fff5b1d01ae libsystem_c.dylib`abort + 127 >> > > >> > > frame #3: 0x00007fff5b2ce8a6 libsystem_malloc.dylib`free + 521 >> > > >> > > frame #4: 0x00000001127f43a7 >> > > >> > > frame #5: 0x00000001127e3ffd >> > > >> > > frame #6: 0x00000001127e3ffd >> > > >> > > frame #7: 0x00000001127e3ffd >> > > >> > > frame #8: 0x00000001127e3ffd >> > > >> > > frame #9: 0x00000001127e4042 >> > > >> > > frame #10: 0x00000001127e3ffd >> > > >> > > frame #11: 0x00000001127e3ffd >> > > >> > > frame #12: 0x00000001127dc4e7 >> > > >> > > frame #13: 0x000000010c0e235e >> > > libjvm.dylib`JavaCalls::call_helper(JavaValue*, methodHandle*, >> > > JavaCallArguments*, Thread*) + 1710 >> > > >> > > frame #14: 0x000000010c0e2b02 >> > > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, >> > > Symbol*, JavaCallArguments*, Thread*) + 356 >> > > >> > > frame #15: 0x000000010c0e2cae >> > > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, >> > > Symbol*, Symbol*, Thread*) + 74 >> > > >> > > frame #16: 0x000000010c1208ee >> libjvm.dylib`thread_entry(JavaThread*, >> > > Thread*) + 124 >> > > >> > > frame #17: 0x000000010c33e84d >> > > libjvm.dylib`JavaThread::thread_main_inner() + 155 >> > > >> > > frame #18: 0x000000010c33ff12 libjvm.dylib`JavaThread::run() + 448 >> > > >> > > frame #19: 0x000000010c26058a libjvm.dylib`java_start(Thread*) + >> 246 >> > > >> > > frame #20: 0x00007fff5b43c661 >> libsystem_pthread.dylib`_pthread_body + >> > > 340 >> > > >> > > frame #21: 0x00007fff5b43c50d >> libsystem_pthread.dylib`_pthread_start + >> > > 377 >> > > >> > > frame #22: 0x00007fff5b43bbf9 >> libsystem_pthread.dylib`thread_start + 13 >> > > >> > > >> > > I have installed Libvirt 5.2.0. >> > > Java bindings libvirt-java 0.5.1 >> > > JNA 4.0.0 >> > > Tested Java environments: Oracle Java 8 and OpenJDK 8 on MAC, OpenJDK >> 11 on >> > > Ubuntu 16 >> > >> > The backtrace does not suggest it's libvirt related, but I wouldn't be >> > surprised if our Java bindings mangled memory somewhere. They are >> > heavily unmaintained. >> >> It could just as easily be a memory corruption bug in the ESX libvirt >> driver, since that runs directly in the applicatin process as it is a >> stateless client side driver. >> >> We would probably need to have an small demo program that can reproduce >> the problem in an isolated fashion, in order to try to debug it, along >> with full libvirt debug logs. >> >> >> Regards, >> Daniel >> -- >> |: https://berrange.com -o- >> https://www.flickr.com/photos/dberrange :| >> |: https://libvirt.org -o- >> https://fstop138.berrange.com :| >> |: https://entangle-photo.org -o- >> https://www.instagram.com/dberrange :| >> >
Daniel P. Berrangé
2019-Apr-18 17:33 UTC
Re: [libvirt-users] [libvirt] JVM crashes during GC
On Thu, Apr 18, 2019 at 10:46:19PM +0530, Sachin Soman wrote:> I am attaching the execution results. At the top of each file I have > mentioned the environment details. > > Following is the test program I have used: > > =================================================> > *package* org.libvirt; > > > *import* org.libvirt.jna.Libvirt; > > > *public* *class* LibvirtCrashTest { > > *void* createAndDestroyDefaultAuthConnection() { > > ConnectAuth ca = *new* ConnectAuthDefault(); > > *try* { > > System.*out*.println("Starting new connection with default auth"); > > Connect connect = *new* Connect("esx://x.x.x.x/?no_verify=1", ca, 0);It could be interesting to try different libvirt drivers. eg "test:///default" this could help identify if its a bug in libvirt common code vs a bug in only the ESX driver code.> > Thread.*sleep*(1000); > > System.*out*.println("Explicit connection closure"); > > connect.close(); > > Thread.*sleep*(5000); > > } *catch* (Exception e) { > > e.printStackTrace(); > > } > > } > > > *public* *static* *void* main(String[] args) *throws* Exception { > > LibvirtCrashTest testInstance = *new* LibvirtCrashTest(); > > > *for*(*int* counter = 0; counter < 3; counter++) { > > testInstance.createAndDestroyDefaultAuthConnection(); > > System.*out*.println("gc'ing"); > > System.*gc*(); > > System.*out*.println("gc'd"); > > *int* tCounter = 0; > > *while*(tCounter++ < 20) { > > System.*out*.println("waiting.. " + tCounter); > > Thread.*sleep*(1000); > > } > > } > > System.*out*.println("Going down..."); > > } > > > } > =================================================> > > Thanks & Regards > Sachin Soman > > > > > On Thu, Apr 18, 2019 at 9:25 PM Daniel P. Berrangé <berrange@redhat.com> > wrote: > > > On Thu, Apr 18, 2019 at 05:51:06PM +0200, Michal Prívozník wrote: > > > On 4/17/19 10:24 AM, Sachin Soman wrote: > > > > Hi, > > > > > > > > Could you tell me if the following is some known issue? > > > > > > > > While performing the following simple test, I see my JVM crashing > > > > (consistently): > > > > 1. Open a connection to an ESXi driver/host (passing ConnectAuthDefault > > > > instance). > > > > 2. Close the connection. > > > > 3. Invoke GC > > > > > > > > When GC is triggered, at some point, some unallocated native memory is > > > > being tried to release. That's failing. > > > > > > > > The error thrown is: > > > > > > > > java(78745,0x70000241e000) malloc: *** error for object 0x7fd5df561390: > > > > pointer being freed was not allocated > > > > > > > > *** set a breakpoint in malloc_error_break to debug > > > > > > > > > > > > Frames from core dump: > > > > > > > > frame #0: 0x00007fff5b274b66 libsystem_kernel.dylib`__pthread_kill > > + 10 > > > > > > > > frame #1: 0x00007fff5b43f080 libsystem_pthread.dylib`pthread_kill > > + 333 > > > > > > > > frame #2: 0x00007fff5b1d01ae libsystem_c.dylib`abort + 127 > > > > > > > > frame #3: 0x00007fff5b2ce8a6 libsystem_malloc.dylib`free + 521 > > > > > > > > frame #4: 0x00000001127f43a7 > > > > > > > > frame #5: 0x00000001127e3ffd > > > > > > > > frame #6: 0x00000001127e3ffd > > > > > > > > frame #7: 0x00000001127e3ffd > > > > > > > > frame #8: 0x00000001127e3ffd > > > > > > > > frame #9: 0x00000001127e4042 > > > > > > > > frame #10: 0x00000001127e3ffd > > > > > > > > frame #11: 0x00000001127e3ffd > > > > > > > > frame #12: 0x00000001127dc4e7 > > > > > > > > frame #13: 0x000000010c0e235e > > > > libjvm.dylib`JavaCalls::call_helper(JavaValue*, methodHandle*, > > > > JavaCallArguments*, Thread*) + 1710 > > > > > > > > frame #14: 0x000000010c0e2b02 > > > > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, > > > > Symbol*, JavaCallArguments*, Thread*) + 356 > > > > > > > > frame #15: 0x000000010c0e2cae > > > > libjvm.dylib`JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, > > > > Symbol*, Symbol*, Thread*) + 74 > > > > > > > > frame #16: 0x000000010c1208ee > > libjvm.dylib`thread_entry(JavaThread*, > > > > Thread*) + 124 > > > > > > > > frame #17: 0x000000010c33e84d > > > > libjvm.dylib`JavaThread::thread_main_inner() + 155 > > > > > > > > frame #18: 0x000000010c33ff12 libjvm.dylib`JavaThread::run() + 448 > > > > > > > > frame #19: 0x000000010c26058a libjvm.dylib`java_start(Thread*) + > > 246 > > > > > > > > frame #20: 0x00007fff5b43c661 > > libsystem_pthread.dylib`_pthread_body + > > > > 340 > > > > > > > > frame #21: 0x00007fff5b43c50d > > libsystem_pthread.dylib`_pthread_start + > > > > 377 > > > > > > > > frame #22: 0x00007fff5b43bbf9 libsystem_pthread.dylib`thread_start > > + 13 > > > > > > > > > > > > I have installed Libvirt 5.2.0. > > > > Java bindings libvirt-java 0.5.1 > > > > JNA 4.0.0 > > > > Tested Java environments: Oracle Java 8 and OpenJDK 8 on MAC, OpenJDK > > 11 on > > > > Ubuntu 16 > > > > > > The backtrace does not suggest it's libvirt related, but I wouldn't be > > > surprised if our Java bindings mangled memory somewhere. They are > > > heavily unmaintained. > > > > It could just as easily be a memory corruption bug in the ESX libvirt > > driver, since that runs directly in the applicatin process as it is a > > stateless client side driver. > > > > We would probably need to have an small demo program that can reproduce > > the problem in an isolated fashion, in order to try to debug it, along > > with full libvirt debug logs. > > > > > > Regards, > > Daniel > > -- > > |: https://berrange.com -o- > > https://www.flickr.com/photos/dberrange :| > > |: https://libvirt.org -o- > > https://fstop138.berrange.com :| > > |: https://entangle-photo.org -o- > > https://www.instagram.com/dberrange :| > >Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|