2013 May 20
Where can I find glusterfs-hadoop-0.20.2-0.1.x86_64.rpm?
The following link is from the Gluster FS Admin Guide, but it doesn't exist:
2009 May 07
problem with conditionals
I'm new to puppet. I'm trying to use some real case examples to better
understand how Puppet works.
Here's my case:
exec { "usermod -d /home/hadoop -s /bin/bash hadoop":
unless => "test `grep ^hadoop /etc/passwd | awk -F: '{print $6}'` == '/home/hadoop'"
The idea is that the usermod would only get executed if the user's home
directory was something other than /home/hadoop...
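A minimal sketch of the check the exec above is aiming for, written in Python purely for illustration (it is not Puppet code; the function names home_dir and needs_usermod and the sample passwd text are hypothetical). It assumes the standard /etc/passwd layout, where the home directory is the sixth colon-separated field. It matches the user name exactly, whereas grep ^hadoop would also match a name such as hadoop2.

```python
def home_dir(passwd_text, user):
    """Return the home directory (6th colon-separated field) of `user`, or None."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # A passwd entry has 7 fields: name, password, uid, gid, gecos, home, shell.
        if len(fields) >= 7 and fields[0] == user:
            return fields[5]
    return None


def needs_usermod(passwd_text, user, wanted_home):
    # Puppet's `unless` skips the exec when its check succeeds, so the
    # command should run only when the current home differs from the wanted one.
    return home_dir(passwd_text, user) != wanted_home


# Hypothetical passwd content used only for this example.
sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "hadoop:x:1001:1001::/var/lib/hadoop:/bin/sh\n"
)

print(needs_usermod(sample, "hadoop", "/home/hadoop"))  # prints True
```

Note that this sketch also reports True when the user does not exist at all, a case in which usermod itself would fail.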
2013 Oct 09
Error while running MR using rmr2
I am trying to run a simple MR program using rmr2 on a single-node Hadoop
cluster. Here is the environment for the setup:
Ubuntu 12.04 (32 bit)
R (Ubuntu comes with 2.14.1, so updated to 3.0.2)
Installed the latest rmr2 and rhdfs from
the corresponding dependencies
Hadoop 1.2.1
Now I am trying...
2013 Mar 11
Understanding lustre setup ..
I have been reading
http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf for setting up
Hadoop over lustre.
Generally, in a Hadoop setup, we have 1 NameNode and a varying number of DataNodes.
If I want to set up the same with Lustre as the backend, in the document
it is mentioned that:
".............Our experiments run on cluster with 8 nodes in total,
2019 Nov 21
How to make xapian run in hadoop
Hi all,
We use Xapian as the backend of our system. The amount of data to be indexed keeps growing, and local mode is hard to maintain, so we plan to move the index builder to Hadoop. While trying to make Xapian run in Hadoop, we hit a problem: Xapian performs many seek operations when it writes the index files, but the seek() method in the Hadoop C API only supports reads, so we are blocked by that now. It looks like a lot of work to rewrite the Xapian database backend to adapt to the ha...
2008 Aug 21
Large data sets with R (binding to hadoop available?)
into R for further analysis.
Questions I have for the many expert contributors on this list are:
1. How do others handle situations of large data sets (gigabytes,
terabytes) for analysis in R?
2. Are there existing ways or plans to devise ways to use the R
language to interact with Hadoop or PIG? The Hadoop project by
Apache has been successful at processing data on a large scale using
the map-reduce algorithm. A sister project uses an emerging language
called "PIG-latin" or simply "PIG" for using the Hadoop framework in
a manner reminiscent of the look and feel of R. Is...
2011 Oct 19
gluster map/reduce performance..
Hi, all,
I am trying to check the Map/Reduce performance of the Gluster file system.
Mapper-side speed is quite good, and it is sometimes faster than Hadoop's map job.
But the reduce-side job is much slower than Hadoop's.
I analyzed the results and found that the primary reason for the slow speed is poor performance in the merging stage.
Do you have any suggestions for this issue?
FYI, check the blog http://storage4com.blogspot.com/
2009 Jul 31
Using R with Hadoop/Hive for Big Data
Hive <http://hadoop.apache.org/hive/> is a data warehouse infrastructure
built on top of Hadoop that provides tools to enable easy data
summarization, ad hoc querying, and analysis of large datasets stored in
Hadoop files. It provides a mechanism to put structure on this data and it
also provides a simple query...
2013 Nov 20
How come that module is not executed in Windows?
I have the following in my Vagrantfile on a Windows system.
config.vm.provision :puppet do |puppet|
puppet.manifests_path = "manifests"
puppet.manifest_file = "base-hadoop.pp"
puppet.module_path = "modules"
When I run vagrant provision, I see that the manifests and modules folders are
mounted, and when I ssh into the VM I can find the files in the following paths:
[default] -- /tmp/vagrant-puppet/manifests
[default] -- /tmp/vagrant-puppet/modules-0
I do see bas...
2010 Oct 08
New user - Issue using Generic::Mkuser in the ghoneycutt/generic module.
...requirement for ssh keys
to work. Here is my issue. I am getting this error from the agent. The
SSH part works fine, but it will not create the user due to a
dependency issue. I do not know how to debug this.
err: Could not run Puppet configuration client: Could not find
dependency Generic::Mkuser[hadoop] for Ssh::Authorized_keys[hadoop]
at /etc/puppet/manifests/templates.pp:5
Here are my files
node "ns1.colo.networkedinsights.com" inherits "default" {
include ntp::server
ssh::authorized_keys { "hadoop":
users => [...
2011 Jan 04
Allowing puppet to drop privileges for a manifest
...root-owned cfengine
client running every 15 minutes from cron contacting a single cfservd
server. Additionally, our employees start their own cfengine and
puppet instances on some servers running under their various
service accounts to manage their own software configurations (for
example, the Hadoop team does not have root access, and runs as the
'hadoop' user with a puppet instance running as 'hadoop'). Having
multiple configuration management daemons causes increased system load
and it generally seems wrong.
I'd like the ability to have one pupp...
2015 Dec 11
SVM hadoop
Hello Mª Luz,
Here is my view:
First of all, you need to be clear about what exactly you want to do in parallel.
I can think of 3 scenarios:
(1) Applying one model, in this case an SVM, to very large data, which is
why I need hadoop/spark
(2) Building many SVM models on small data (for example, one per
user), which is why I need hadoop/spark to parallelize these processes
across many machines and finish in finite time.
(3) With a model already built locally on a sample, I want to make
predictions "pred...
2015 Dec 10
SVM hadoop
One day I read something at the following link, but I never used it.
Javier Rubén Marcuzzi
From: Carlos J. Gil Bellosta
Sent: Wednesday, December 9, 2015 14:33
To: MªLuz Morales
CC: r-help-es
Subject: Re: [R-es] SVM hadoop
No, they will not run in parallel if you use the SVMs from packages such as e1071.
However, you have, on the one hand, the...
2015 Dec 10
SVM hadoop
>>> thank you for your earlier replies. They are interesting, although some
>>> questions have come up. For example, regarding the e1071 package: in
>>> this link they do seem to use it to build a support vector machine on
>>> hadoop.
>>> http://stackoverflow.com/questions/17731261/r-hadoop-rmr2-svm-model-conver-result-list-class-to-original-class-sv?rq=1
>>> Carlos, why did you say that the SVMs from the e1071 package will not
>>> run in parallel?
>>> Thanks
2013 Sep 23
Unable to execute Java MapReduce (Hadoop) code from R using rJava
Hi All,
I have written a Java MapReduce code that runs on Hadoop. My intention is
to create an R package which will call the Java code and execute the job.
Hence, I have written a similar R function. But when I call this function
from the R terminal, the Hadoop job does not run. It just prints a few lines
of warning messages and does nothing further. Here is the...
2015 Dec 09
SVM hadoop
Good morning,
does anyone know whether there is a way to implement a support vector
machine (SVM) with R-hadoop?
I am interested in doing big data processing with SVM. I know that R has
the packages {RtextTools} and {e1071}, which can fit SVMs. But I am not
sure whether the algorithm is parallelizable, that is, whether it can run in
parallel on the R-hadoop platform.
Many thanks
Regards
2012 Nov 07
R + Hadoop on Amazon
Hello All,
I am having some issues with my local machine, so I need to move to Amazon
and run R and Hadoop on an Amazon instance. After a lot of searching, I
still cannot decide which image to choose for the Amazon instance. Is
anyone using R + Hadoop on Amazon?
2009 Nov 06
Hadoop Cluster on Xen
Hi all,
Has anyone created a Xen cluster to run a hadoop vm cluster?
I would be interested in how it performs
2019 Nov 22
How to make xapian run in hadoop
On Thu, Nov 21, 2019 at 10:20:19AM +0800, ??? wrote:
> We use xapian as the backend of our system. Now the data need be
> indexed ever-increasing, and the local mode is hard to maintain, so we
> plan to move the index builder to hadoop. We try to make xapian can be
> run in hadoop, and now met a problem that there are many seek
> operations when xapian writes the index files, but the method seek()
> in hadoop c api only support read, and we blocked by that now
Updating a glass backend database pretty fundamentally requi...
2010 Dec 24
Running scripts in hadoop
R-help group,
I'm looking for some assistance on using an R-script to read STDIN from
For example, say I have two tables. One is a student table, the other is a class
roster table (tables join on student_id). Student SAT score is in the
student table, whether the student passed or not is in the roster table.
So to determine if a student passed or failed based on their SAT score, I'...