Gaurav Dasgupta
2013-Sep-23 12:00 UTC
[Rd] Unable to execute Java MapReduce (Hadoop) code from R using rJava
Hi All,

I have written a Java MapReduce job that runs on Hadoop. My intention is to create an R package which will call the Java code and execute the job, so I have written a corresponding R function. But when I call this function from the R terminal, the Hadoop job does not run: it just prints a few warning lines and does nothing further. Here is the execution scenario:

> source("mueda.R")
> mueda(analysis = "eda", input = "/user/root/sample1.txt", output = "/user/root/eda_test", columns = c("0", "1"), columnSeparator = ",")
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
>

The same warning messages also appear when the Hadoop job is run normally (without R), but in that case the job then executes properly. What might be the cause that I am unable to execute the job from R using rJava, and why is there no error message either?
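One way to see whether the Java side is failing silently is to disable rJava's automatic exception check on the call and then inspect any pending Java exception yourself. This is only a minimal diagnostic sketch using rJava's standard `check` argument of `.jcall()` and `.jgetEx()`; it assumes `obj`, `input`, `output` and `col` are set up exactly as in the `mueda()` function below:

# Sketch: surface a Java exception that may otherwise be swallowed.
# Assumes obj was created with .jnew("EDA") and the call arguments
# are the same ones mueda() builds.
.jcall(obj, "V", "edaExecute",
       c("eda", input, output, col), check = FALSE)
ex <- .jgetEx(clear = TRUE)     # fetch and clear any pending Java exception
if (!is.null(ex)) {
  ex$printStackTrace()          # print the Java stack trace to the R console
}

If a stack trace appears, the job is throwing on the Java side (e.g. a classpath or Hadoop configuration problem) rather than hanging.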
Below is the R code using rJava that calls the Java code in the backend:

library(rJava)

mueda <- function(analysis = "eda",
                  input = NULL,
                  output = NULL,
                  columns = NULL,
                  columnSeparator = NULL,
                  cat.cut.off = 50,
                  percentilePoints = c(0, 0.01, 0.05, 0.1, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99, 1),
                  histogram = FALSE) {

  if (is.null(input) || is.null(output) || is.null(columns) || is.null(columnSeparator)) {
    stop("Usage: mueda(<analysis>, <input>, <output>, <columns>, <columnSeparator>, [<cat.cut.off>], [<percentilePoints>], [<histogram>])")
  }

  # Gets the absolute path of the external JARs
  # pkgPath <- paste(system.file(package = "muEDA"), "/jars", sep = "")
  pkgPath <- paste("../inst", "/jars", sep = "")

  # Initializes the JVM with the directory where the main Java class resides
  # (note: pass the variable pkgPath, not the literal string "pkgPath")
  .jinit(pkgPath)

  # Adds all the required JARs to the class path
  jars <- c("Eda.jar",
            "commons-cli-1.2.jar",
            "hadoop-hdfs-2.0.0-cdh4.3.0.jar",
            "slf4j-log4j12-1.6.1.jar",
            "commons-configuration-1.6.jar",
            "guava-11.0.2.jar",
            "hadoop-mapreduce-client-core-2.0.0-cdh4.3.0.jar",
            "commons-lang-2.5.jar",
            "hadoop-auth-2.0.0-cdh4.3.0.jar",
            "log4j-1.2.17.jar",
            "commons-logging-1.1.1.jar",
            "hadoop-common-2.0.0-cdh4.3.0.jar",
            "slf4j-api-1.6.1.jar")
  for (jar in jars) {
    .jaddClassPath(paste(pkgPath, jar, sep = "/"))
  }

  # Creates the R object for the main Java class
  obj <- .jnew("EDA")

  # Concatenates the column names into one comma-separated argument for Java
  # (done before the branching so it is also available for "freq")
  col <- paste(columns, collapse = ",")

  if (analysis == "eda" || analysis == "univ") {
    switch(analysis,
           # Calls the Java main class with the "return type", "method name",
           # "parameters to pass" to perform EDA
           eda  = .jcall(obj, "V", "edaExecute", c("eda", input, output, col)),
           # Calls the Java main class with the "return type", "method name",
           # "parameters to pass" to perform Univariate Analysis
           univ = .jcall(obj, "V", "edaExecute", c("univ", input, output, col)))
  } else if (analysis == "freq") {
    # Calls the Java main class with the "return type", "method name",
    # "parameters to pass" to perform Frequency Analysis
    .jcall(obj, "V", "edaExecute", c("freq", input, output, col))
  } else {
    stop("Please provide either \"eda\" or \"univ\" or \"freq\" for the <analysis> argument")
  }
}

Regards,
Gaurav