Taylor, Ronald C
2017-Feb-01 00:30 UTC
[R] need help in trying out sparklyr - spark_connect will not work
Hello R-help list,

I am a new list member. My first question: I was trying out sparklyr (in R version 3.3.2) on my Red Hat Linux workstation, following the instructions at spark.rstudio.com on how to download and use a local copy of Spark. The Spark download appears to work. However, when I try to do the spark_connect to get started, I get the error messages that you see below. I cannot find any guidance on how to fix this. Quite frustrating.

Can somebody give me a bit of help? Does something need to be added to my PATH env var in my .mycshrc file, for example? Is there a closed port problem? Has anybody run into this type of error message? Do I need to do something additional to start up the local copy of Spark that is not mentioned in the RStudio online documentation?

- Ron

%%%%%%%%%%%%%%%%%%%%
> spark_install(version = "1.6.2")
Installing Spark 1.6.2 for Hadoop 2.6 or later.
Downloading from:
- 'https://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz'
Installing to:
- '~/.cache/spark/spark-1.6.2-bin-hadoop2.6'
trying URL 'https://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz'
Content type 'application/x-tar' length 278057117 bytes (265.2 MB)
=================================================
downloaded 265.2 MB

Installation complete.
>
> sc <- spark_connect(master = "local")
Error in force(code) :
  Failed while connecting to sparklyr to port (8880) for sessionid (3689): Gateway in port (8880) did not respond.
    Path: /home/rtaylor/.cache/spark/spark-1.6.2-bin-hadoop2.6/bin/spark-submit
    Parameters: --class, sparklyr.Backend, --jars, '/usr/lib64/R/library/sparklyr/java/spark-csv_2.11-1.3.0.jar','/usr/lib64/R/library/sparklyr/java/commons-csv-1.1.jar','/usr/lib64/R/library/sparklyr/java/univocity-parsers-1.5.1.jar', '/usr/lib64/R/library/sparklyr/java/sparklyr-1.6-2.10.jar', 8880, 3689

---- Output Log ----
/home/rtaylor/.cache/spark/spark-1.6.2-bin-hadoop2.6/bin/spark-class: line 86: /usr/local/bin/bin/java: No such file or directory

---- Error Log ----
>
%%%%%%%%%%%%%%%%%%

Full screen output of my R session, from the R invocation on:

sidney115% R

R version 3.3.2 (2016-10-31) -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

>
> library(sparklyr)
>
> ls(pos = "package:sparklyr")
  [1] "%>%"
  [2] "compile_package_jars"
  [3] "connection_config"
  [4] "connection_is_open"
  [5] "copy_to"
  [6] "ensure_scalar_boolean"
  [7] "ensure_scalar_character"
  [8] "ensure_scalar_double"
  [9] "ensure_scalar_integer"
 [10] "find_scalac"
 [11] "ft_binarizer"
 [12] "ft_bucketizer"
 [13] "ft_discrete_cosine_transform"
 [14] "ft_elementwise_product"
 [15] "ft_index_to_string"
 [16] "ft_one_hot_encoder"
 [17] "ft_quantile_discretizer"
 [18] "ft_regex_tokenizer"
 [19] "ft_sql_transformer"
 [20] "ft_string_indexer"
 [21] "ft_tokenizer"
 [22] "ft_vector_assembler"
 [23] "hive_context"
 [24] "invoke"
 [25] "invoke_method"
 [26] "invoke_new"
 [27] "invoke_static"
 [28] "java_context"
 [29] "livy_available_versions"
 [30] "livy_config"
 [31] "livy_home_dir"
 [32] "livy_install"
 [33] "livy_install_dir"
 [34] "livy_installed_versions"
 [35] "livy_service_start"
 [36] "livy_service_stop"
 [37] "ml_als_factorization"
 [38] "ml_binary_classification_eval"
 [39] "ml_classification_eval"
 [40] "ml_create_dummy_variables"
 [41] "ml_decision_tree"
 [42] "ml_generalized_linear_regression"
 [43] "ml_gradient_boosted_trees"
 [44] "ml_kmeans"
 [45] "ml_lda"
 [46] "ml_linear_regression"
 [47] "ml_load"
 [48] "ml_logistic_regression"
 [49] "ml_model"
 [50] "ml_multilayer_perceptron"
 [51] "ml_naive_bayes"
 [52] "ml_one_vs_rest"
 [53] "ml_options"
 [54] "ml_pca"
 [55] "ml_prepare_dataframe"
 [56] "ml_prepare_features"
 [57] "ml_prepare_response_features_intercept"
 [58] "ml_random_forest"
 [59] "ml_save"
 [60] "ml_survival_regression"
 [61] "ml_tree_feature_importance"
 [62] "na.replace"
 [63] "print_jobj"
 [64] "register_extension"
 [65] "registered_extensions"
 [66] "sdf_copy_to"
 [67] "sdf_import"
 [68] "sdf_load_parquet"
 [69] "sdf_load_table"
 [70] "sdf_mutate"
 [71] "sdf_mutate_"
 [72] "sdf_partition"
 [73] "sdf_persist"
 [74] "sdf_predict"
 [75] "sdf_quantile"
 [76] "sdf_read_column"
 [77] "sdf_register"
 [78] "sdf_sample"
 [79] "sdf_save_parquet"
 [80] "sdf_save_table"
 [81] "sdf_schema"
 [82] "sdf_sort"
 [83] "sdf_with_unique_id"
 [84] "spark_available_versions"
 [85] "spark_compilation_spec"
 [86] "spark_compile"
 [87] "spark_config"
 [88] "spark_connect"
 [89] "spark_connection"
 [90] "spark_connection_is_open"
 [91] "spark_context"
 [92] "spark_dataframe"
 [93] "spark_default_compilation_spec"
 [94] "spark_dependency"
 [95] "spark_disconnect"
 [96] "spark_disconnect_all"
 [97] "spark_home_dir"
 [98] "spark_install"
 [99] "spark_install_dir"
[100] "spark_install_tar"
[101] "spark_installed_versions"
[102] "spark_jobj"
[103] "spark_load_table"
[104] "spark_log"
[105] "spark_read_csv"
[106] "spark_read_json"
[107] "spark_read_parquet"
[108] "spark_save_table"
[109] "spark_session"
[110] "spark_uninstall"
[111] "spark_version"
[112] "spark_version_from_home"
[113] "spark_web"
[114] "spark_write_csv"
[115] "spark_write_json"
[116] "spark_write_parquet"
[117] "tbl_cache"
[118] "tbl_uncache"
>
>
>
> spark_install(version = "1.6.2")
Installing Spark 1.6.2 for Hadoop 2.6 or later.
Downloading from:
- 'https://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz'
Installing to:
- '~/.cache/spark/spark-1.6.2-bin-hadoop2.6'
trying URL 'https://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz'
Content type 'application/x-tar' length 278057117 bytes (265.2 MB)
=================================================
downloaded 265.2 MB

Installation complete.
>
> sc <- spark_connect(master = "local")
Error in force(code) :
  Failed while connecting to sparklyr to port (8880) for sessionid (3689): Gateway in port (8880) did not respond.
    Path: /home/rtaylor/.cache/spark/spark-1.6.2-bin-hadoop2.6/bin/spark-submit
    Parameters: --class, sparklyr.Backend, --jars, '/usr/lib64/R/library/sparklyr/java/spark-csv_2.11-1.3.0.jar','/usr/lib64/R/library/sparklyr/java/commons-csv-1.1.jar','/usr/lib64/R/library/sparklyr/java/univocity-parsers-1.5.1.jar', '/usr/lib64/R/library/sparklyr/java/sparklyr-1.6-2.10.jar', 8880, 3689

---- Output Log ----
/home/rtaylor/.cache/spark/spark-1.6.2-bin-hadoop2.6/bin/spark-class: line 86: /usr/local/bin/bin/java: No such file or directory

---- Error Log ----
>
%%%%%%%%%%%%%%%%%%

Ronald C. Taylor, Ph.D.
Computational Biology & Bioinformatics Group
Pacific Northwest National Laboratory (U.S. Dept of Energy/Battelle)
Richland, WA 99352
phone: (509) 372-6568, email: ronald.taylor at pnnl.gov
web page: http://www.pnnl.gov/science/staff/staff_info.asp?staff_num=7048
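
One detail in the output log may be worth checking first: the path /usr/local/bin/bin/java contains "bin" twice, which would be consistent with the JAVA_HOME environment variable being set to /usr/local/bin rather than to the root of a Java installation. The sketch below is a diagnostic only, not a confirmed fix. It assumes, based on that log line, that spark-class runs "$JAVA_HOME/bin/java" when JAVA_HOME is set and otherwise falls back to whatever `java` is on the PATH. The function names resolve_java_runner and check_java_runner are hypothetical helpers written for this illustration.

```shell
#!/bin/sh
# Diagnostic sketch (assumption: spark-class uses "$JAVA_HOME/bin/java" when
# JAVA_HOME is set, and plain `java` from the PATH otherwise).

# Print the java command that would be used under that assumption.
resolve_java_runner() {
    if [ -n "${JAVA_HOME:-}" ]; then
        printf '%s\n' "${JAVA_HOME}/bin/java"
    else
        printf '%s\n' "java"
    fi
}

# Report whether that command actually exists and is executable.
check_java_runner() {
    runner=$(resolve_java_runner)
    found=no
    case "$runner" in
        */*) [ -x "$runner" ] && found=yes ;;                      # explicit path
        *)   command -v "$runner" >/dev/null 2>&1 && found=yes ;;  # bare name on PATH
    esac
    if [ "$found" = yes ]; then
        echo "OK: $runner"
    else
        echo "MISSING: $runner (JAVA_HOME='${JAVA_HOME:-}')"
        return 1
    fi
}

# Run the check in the current environment; never abort the calling shell.
check_java_runner || true
```

If the check reports MISSING with JAVA_HOME='/usr/local/bin', the variable probably needs to point at the directory that contains bin/java. In a csh startup file such as .mycshrc that is done with a `setenv JAVA_HOME ...` line; the correct directory depends on where Java is installed on the workstation.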