thr3ads.net - similar to: "Using R with Hadoop/Hive for Big Data"

Displaying 20 results from an estimated 1000 matches similar to: "Using R with Hadoop/Hive for Big Data"

Large data sets with R (binding to hadoop available?)

2008 Aug 21

Dear R community, I find R fantastic and use R whenever I can for my data analytic needs. Certain data sets, however, are so large that other tools seem to be needed to pre-process data such that it can be brought into R for further analysis. Questions I have for the many expert contributors on this list are: 1. How do others handle situations of large data sets (gigabytes, terabytes)

How to make xapian run in hadoop

2019 Nov 21

Hi all, We use xapian as the backend of our system. The amount of data to index keeps growing, and the local mode is hard to maintain, so we plan to move the index builder to hadoop. We are trying to make xapian run in hadoop, and have hit a problem: there are many seek operations when xapian writes the index files, but the seek() method in the hadoop C API only supports reads, and we are blocked by

Glusterfs-Hadoop

2013 May 20

Hi, Where can I find glusterfs-hadoop-0.20.2-0.1.x86_64.rpm? The following link is from the Gluster FS Admin Guide, but it doesn't exist: http://download.gluster.com/pub/gluster/glusterfs/qa-releases/3.3-beta-2/glusterfs-hadoop-0.20.2-0.1.x86_64.rpm Thanks!

R + Hadoop on Amazon

2012 Nov 07

Hello all, I am having some issues with my local machine, so I need to move to Amazon to run R and Hadoop on an Amazon instance. After a lot of searching, I still can't decide which image to choose for the instance. Is anyone using R + Hadoop on Amazon? Thanks

database table merging tips with R

2008 Sep 11

I have not devoted time to setting up ROracle since binaries are not available and it seems to require some effort to compile (see http://cran.r-project.org/web/packages/ROracle/index.html). On the other hand, RODBC worked more or less magically once I set up the data sources. What has your experience been with ROracle, and why would it be preferable to RODBC? -Avram On Thursday, September 11,

SVM hadoop

2015 Dec 09

Good morning, does anyone know whether there is a way to implement a support vector machine (SVM) with R-hadoop? I am interested in doing big data processing with SVM. I know that R has the packages {RtextTools} and {e1071}, which provide SVMs. But I am not sure whether the algorithm is parallelizable, that is, whether it can run in parallel on the R-hadoop platform. Many

Hadoop Hive output read into R

2011 Jun 13

All, I am using a pretty crude method to get data out of HDFS via Hive and into R and was curious about alternatives that the group has explored. Basically, I run a system command that runs a hive statement and writes the returned data to a delimited file. Then, I read that file into an object and continue. For example: hive.script <- "select * from orders where date =

SVM hadoop

2015 Dec 10

Hi, You can set up RStudio on Amazon, install "caret" and off you go... I don't know whether what Amazon can offer will be enough for your problem... I think so... ;-) Or do it directly here, where this whole installation is already set up: http://www.teraproc.com/front-page-posts/r-on-demand/ Thanks, Carlos. On 10 December 2015 at 14:43, MªLuz Morales <mlzmrls

SVM hadoop

2015 Dec 10

Dear all, I once read something at the following link, but I never used it. http://blog.revolutionanalytics.com/2015/06/using-hadoop-with-r-it-depends.html Javier Rubén Marcuzzi From: Carlos J. Gil Bellosta Sent: Wednesday, 9 December 2015 14:33 To: MªLuz Morales CC: r-help-es Subject: Re: [R-es] SVM hadoop No, they will not run in parallel if you use the SVMs from packages such as e1071. No

SVM hadoop

2015 Dec 11

Hi Mª Luz, Here is my view: the first thing is to be clear about what exactly I want to do in parallel. I can think of 3 scenarios: (1) Applying a model, in this case an SVM, to very large data, which is why I need hadoop/spark (2) Fitting many SVM models on small data (for example one per user), which is why I need hadoop/spark to parallelize these processes

Running scripts in hadoop

2010 Dec 24

R-help group, I'm looking for some assistance on using an R-script to read STDIN from hadoop. Example, say I have two tables. One is a student table, the other is a class roster table (tables join on student_id). Student SAT score is in the student table, whether the student passed or not is in the roster table. So to determine if a student passed or failed based on their SAT score, I'd

Sparse KMeans/KDE/Nearest Neighbors?

2010 Feb 24

hi, I have a dataset (the netflix dataset) which is basically ~18k columns and, well, a variable number of rows, but let's assume 25 thousand for now. The dataset is very sparse. I was wondering how to do kmeans/nearest neighbors or kernel density estimation on it. I tried using the spMatrix function in the "Matrix" package. I think I'm able to create the matrix but as soon as I pass

Hadoop Cluster on Xen

2009 Nov 06

Hi all, Has anyone created a Xen cluster to run a hadoop vm cluster? I would be interested in how it performs. Thanks Lance

help converting for loop to vector operation

2009 Apr 29

Dear List, I have a wrapper function that draws a graph that I'd like to use in a vector-like manner. The for-loop version I currently use is below. library(ggplot2) data(economics) h <- 600 w <- 800 #---------------------------------------------------------- draw_metric_by_date <- function( df, i, smooth=FALSE, BASEPATH ) { mlabel <- names(df)[i] qmetric

Association rules on a Hadoop cluster

2018 Mar 07

Hi, This may not be the most appropriate forum, but it is worth a try. Is there any R implementation of association rule models that can run the computation in parallel on a Hadoop cluster? I have seen the packages that 'parallelize' R, but they mention nothing about association rule models. Regards -- Oscar Benitez

Wiki article on installing hadoop

2010 Mar 12

Hey I would like to write a little article on installing hadoop on CentOS. Comments? Can I just go forward with this? Cheers Didi -- Hoffmann Geerd-Dietger http://contact.ribalba.de

Hadoop

2016 Jun 15

Hello, I was wondering whether any of you use hadoop Spark day to day and whether you could recommend a good course to get started. I attended the Rspark talk at the Madrid meetup a few months ago and it was good; now I was wondering whether it is possible to go deeper. I would like recommendations for any material you can suggest: Coursera courses you have taken, books you have read,

Help with maps

2008 Dec 03

A few questions about maps... (1) How can I find a listing of the internal data sets that map() from the maps library contains? For example, "usa", "county", "state", "nz" all work. Are there any others? (2) Is there an easier, more generalized way to produce this (http://www.ai.rug.nl/~hedderik/R/US2004/ ) type of plot than this

Unable to execute Java MapReduce (Hadoop) code from R using rJava

2013 Sep 23

Hi All, I have written Java MapReduce code that runs on Hadoop. My intention is to create an R package which will call the Java code and execute the job. Hence, I have written a similar R function. But when I call this function from the R terminal, the Hadoop job does not run. It just prints a few lines of warning messages and does nothing further. Here is the execution scenario: *>

How to make xapian run in hadoop

2019 Nov 22

On Thu, Nov 21, 2019 at 10:20:19AM +0800, ??? wrote: > We use xapian as the backend of our system. The amount of data to index keeps growing, and the local mode is hard to maintain, so we plan to move the index builder to hadoop. We are trying to make xapian run in hadoop, and have hit a problem: there are many seek operations when xapian writes the index files, but
