Displaying 20 results from an estimated 3000 matches similar to: "Large data sets with R (binding to hadoop available?)"
2009 Jul 31
1
Using R with Hadoop/Hive for Big Data
Hive <http://hadoop.apache.org/hive/> is a data warehouse infrastructure
built on top of Hadoop that provides tools to enable easy data
summarization, ad hoc querying and analysis of large datasets stored in
Hadoop files. It provides a mechanism to put structure on this data, and it
also provides a simple query language called QL, which is based on SQL and
which enables users familiar with
2008 Sep 11
2
database table merging tips with R
I have not devoted time to setting up ROracle since binaries are not available and it seems to require some effort to compile (see http://cran.r-project.org/web/packages/ROracle/index.html). On the other hand, RODBC worked more or less magically once I set up the data sources.
What has your experience been with ROracle, and why would it be preferable to RODBC?
-Avram
On Thursday, September 11,
2012 Nov 07
2
R + Hadoop on Amazon
Hello All,
Because of problems with my local machine, I need to move to Amazon
and run R and Hadoop on an Amazon instance. After a lot of searching, I
still can't decide which image to choose for the instance. Is anyone
using R + Hadoop on Amazon?
Thanks
2009 Apr 29
2
help converting for loop to vector operation
Dear List,
I have a wrapper function that draws a graph that I'd like to use in a vector-like manner. The for-loop version I currently use is below.
library(ggplot2)
data(economics)
h <- 600
w <- 800
#----------------------------------------------------------
draw_metric_by_date <- function( df, i, smooth=FALSE, BASEPATH ) {
mlabel <- names(df)[i]
qmetric
2013 Sep 23
0
Unable to execute Java MapReduce (Hadoop) code from R using rJava
Hi All,
I have written Java MapReduce code that runs on Hadoop. My intention is
to create an R package that will call the Java code and execute the job.
To that end, I have written a corresponding R function. But when I call this
function from the R terminal, the Hadoop job does not run. It just prints a
few lines of warning messages and does nothing further. Here is the execution
scenario:
*>
2013 Oct 09
2
Error while running MR using rmr2
Hi,
I am trying to run a simple MR program using rmr2 on a single-node Hadoop
cluster. Here is the environment for the setup:
Ubuntu 12.04 (32 bit)
R (Ubuntu comes with 2.14.1, so updated to 3.0.2)
Installed the latest rmr2 and rhdfs from
here <https://github.com/RevolutionAnalytics/RHadoop/wiki/Downloads> and
the corresponding dependencies
Hadoop 1.2.1
Now I am trying to run a simple MR
2003 Dec 04
1
assigning colors to barplot when beside=TRUE
dear list,
i am having trouble coloring the bars in a barplot. my data have two
groups, which i would like to plot side by side. within each group i
want to sort the observations in decreasing order, like a pareto
chart. the bar colors would reflect the value of a third variable.
below i have generated a reproducible example. the bar heights are a
given pig's "gain",
2019 Nov 21
2
How to make xapian run in hadoop
Hi all,
We use Xapian as the backend of our system. The amount of data to be indexed keeps growing and local mode is hard to maintain, so we plan to move the index builder to Hadoop. We are trying to get Xapian to run in Hadoop and have hit a problem: Xapian performs many seek operations when it writes the index files, but the seek() method in the Hadoop C API only supports reads, and we are blocked by
2005 Oct 20
4
creating a derived variable in a data frame
Hello,
I have read through the manuals and can't seem to find an answer.
I have a categorical, character variable that has hundreds of values. I want to group the existing values of this variable into a new, derived (categorical) variable by applying conditions to the values in the data.
For example, suppose I have a data frame with variables: date, country, x, y, and z.
x,y,z are
2013 May 20
1
Glusterfs-Hadoop
Hi,
Where can I find glusterfs-hadoop-0.20.2-0.1.x86_64.rpm?
The following link is from the Gluster FS Admin Guide, but it doesn't exist:
http://download.gluster.com/pub/gluster/glusterfs/qa-releases/3.3-beta-2/glusterfs-hadoop-0.20.2-0.1.x86_64.rpm
Thanks!
2015 Dec 09
2
SVM hadoop
Good morning,
Does anyone know whether there is a way to implement a support vector
machine (SVM) with R-hadoop?
I am interested in big data processing with SVM. I know that R has the
packages {RtextTools} and {e1071}, which support SVM, but I am not sure
whether the algorithm can be parallelized, that is, whether it can run in
parallel on the R-hadoop platform.
Many
2015 Dec 10
2
SVM hadoop
Hello,
You can set up RStudio on Amazon, install "caret", and off you go....
I don't know whether what Amazon can offer will be enough for your
problem... I think so... ;-)....
Or do it directly here, where this whole installation is already done:
http://www.teraproc.com/front-page-posts/r-on-demand/
Thanks,
Carlos.
On 10 December 2015, 14:43, MªLuz Morales <mlzmrls
2015 Dec 10
3
SVM hadoop
Dear all,
I once read something at the following link, but I never used it.
http://blog.revolutionanalytics.com/2015/06/using-hadoop-with-r-it-depends.html
Javier Rubén Marcuzzi
From: Carlos J. Gil Bellosta
Sent: Wednesday, 9 December 2015 14:33
To: MªLuz Morales
CC: r-help-es
Subject: Re: [R-es] SVM hadoop
No, they will not run in parallel if you use the SVMs from packages such as e1071.
No
2015 Dec 11
2
SVM hadoop
Hello Mª Luz,
Here is my view:
First of all, be clear about what exactly you want to do in parallel.
I can think of 3 scenarios:
(1) Apply a model, SVM in this case, to very large data, which is why
I need hadoop/spark
(2) Fit many SVM models on small data (for example, one per user), which
is why I need hadoop/spark to parallelize these processes
2010 Dec 24
1
Running scripts in hadoop
R-help group,
I'm looking for some assistance on using an R-script to read STDIN from
hadoop.
For example, say I have two tables. One is a student table, the other is a class
roster table (tables join on student_id). Student SAT score is in the
student table, whether the student passed or not is in the roster table.
So to determine if a student passed or failed based on their SAT score, I'd
2008 Dec 03
3
Help with maps
A few questions about maps...
(1) How can I find a listing of the internal data sets that map() from the maps library contains?
For example, "usa", "county", "state", "nz" all work. Are there any others?
(2) Is there an easier, more generalized way to produce this (http://www.ai.rug.nl/~hedderik/R/US2004/ ) type of plot than this
2009 Feb 26
1
bottom legends in ggplot2 ?
Has anyone had success producing a legend for a qplot graph that is placed at the bottom, under the abscissa, rather than on the right-hand side?
The following doesn't move the legend:
library(ggplot2)
qplot(mpg, wt, data=mtcars, colour=cyl, gpar(legend.position="bottom") )
I am using ggplot2_0.8.2.
Thanks in advance,
Avram
2009 Nov 06
4
Hadoop Cluster on Xen
Hi all,
Has anyone created a Xen cluster to run a hadoop vm cluster?
I would be interested in how it performs
Thanks
Lance
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
2018 Mar 07
2
Association rules on a Hadoop cluster
Hello,
Perhaps this is not the most appropriate forum, but it is worth a try.
Is there an R implementation of association rule mining that can run the
computation in parallel on a Hadoop cluster?
I have seen the packages that 'parallelize' R, but they say nothing about
association rule models.
Regards
--
Oscar Benitez
2016 Feb 16
5
Error in a mapreduce job in RStudio on Ubuntu
Good morning,
I have a cluster installed on a virtual machine, and I have installed R and
RStudio (on Ubuntu Server 14.04, 64-bit). From the console I can start R
and run a mapreduce example without problems. But when I try to do the
same from RStudio, I get this error:
16/02/16 10:37:00 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
Error in mr(map = map, reduce =