Gaurav Dasgupta
2013-Feb-08 06:57 UTC
[Rd] ClassNotFoundException when running distributed job using rJava package
Hi,

I have MapReduce code written in Java, which I call from R using rJava. I have built the R package and tested it successfully. But when I deploy the package on a cluster and run it, I get a ClassNotFoundException. If I run the same job directly, without integrating with R, it runs perfectly.

Here is my R code:

    library(rJava)
    muMstSpark <- function(mesosMaster = NULL, input = NULL, output = NULL,
                           scalaLib = NULL, sparkCore = NULL, inputSplits = 8) {
      if (is.null(mesosMaster) || is.null(input) || is.null(output) ||
          is.null(scalaLib) || is.null(sparkCore)) {
        stop("Usage: muMST(<mesosMaster>, <input>, <output>, <scalaLib>, <sparkCore>, [<inputSplits>]")
      }

      # Gets the absolute path of the external Scala and Java JARS
      pkgPath = paste(system.file(package="MuMstBig"), "/jars", sep="")

      # Initializes the JVM specifying the directory where the main Java class resides:
      .jinit("pkgPath")

      # Adds all the required JARs to the class path:
      .jaddClassPath(paste(pkgPath, "Prims.jar", sep="/"))
      .jaddClassPath(paste(pkgPath, "MSTInSpark.jar", sep="/"))
      .jaddClassPath(scalaLib)
      .jaddClassPath(sparkCore)

      # Creates the R object for the main Java class:
      obj <- .jnew("MSTInSpark")

      # Calls the Java main class
      .jcall(obj, "V", "mst", c(mesosMaster, input, output, inputSplits))
    }

Here is the error log:

    13/02/08 00:54:48 INFO cluster.TaskSetManager: Loss was due to java.lang.ClassNotFoundException: Prims$$anonfun$PrimsExecute$1
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at spark.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:20)
        at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1574)
        at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1495)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1731)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:435)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:435)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
        at spark.JavaDeserializationStream.readObject(JavaSerializer.scala:23)
        at spark.JavaSerializerInstance.deserialize(JavaSerializer.scala:45)
        at spark.executor.Executor$TaskRunner.run(Executor.scala:93)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

I think R is unable to find the classes on the classpath. But I have set the classpath in the script using the absolute paths of the JARs in the package, and the package is installed across the cluster. Any idea what is going wrong?

Thanks,
Prof Brian Ripley
2013-Feb-08 07:16 UTC
[Rd] ClassNotFoundException when running distributed job using rJava package
This is not the rJava support list: that is at http://www.rosuda.org/lists.shtml.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
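
A note on reading the trace in the original post: the failing Class.forName call sits under spark.executor.Executor$TaskRunner.run, during task deserialization. That suggests the class is being looked up in a worker's JVM rather than in the JVM that rJava starts inside R, so JARs added with .jaddClassPath on the R side may not be visible there. A rough way to check what a given JVM can see is a small probe like the one below. This is only a sketch: the class ClassPathCheck and its isLoadable method are hypothetical helpers written for illustration and are not part of rJava or Spark.

```java
// Hypothetical diagnostic helper (not part of rJava or Spark):
// reports whether a class name resolves in the JVM that runs this code.
public class ClassPathCheck {

    // Returns true if `className` can be found by the current thread's
    // context class loader. The class is located but not initialized.
    public static boolean isLoadable(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names to probe; defaults to the one named in the error log.
        String[] names = args.length > 0
                ? args
                : new String[] {"Prims$$anonfun$PrimsExecute$1"};
        for (String name : names) {
            String status = isLoadable(name) ? "found" : "NOT found";
            System.out.println(name + " -> " + status);
        }
    }
}
```

Running the probe with the same classpath a worker process uses would show whether Prims.jar is visible there. Separately, .jinit("pkgPath") in the posted code passes the literal string "pkgPath" as the classpath rather than the value of the variable pkgPath; that is probably unrelated to this exception, but it is worth correcting.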