Displaying 20 results from an estimated 40000 matches similar to: "Merging and Updating Data Frames with Unequal Size"
2013 May 02
3
R issue with unequal large data frames with multiple columns
I'm a bit of an amateur R programmer. I can do simple R scenarios but my
handle on complex grammatical issues isn't steady.
I have 12 CSV files that I've read into dataframes. Each has 8 columns and
over 2000000 rows. Each dataframe has data associated by time component
and a date component in the format of:
X.DATE and then X.TIME
X.DATE is in the format of MMDDYYYY and X.TIME is
2012 Oct 26
2
Stata Database & R
Dear All,
I am given some data to analyze. The data is in the form of a Stata
database (.dta file).
What is the best way to import it into an R dataframe?
Is there any particular caveat I should be aware of?
Many thanks
Lorenzo
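
One possible route, sketched under assumptions: the `foreign` package, which ships with standard R installations, provides `read.dta()`. The file below is a throwaway example written with `write.dta()` so the snippet runs without external data; the column names `id` and `score` are made up for illustration.

```r
library(foreign)  # recommended package, normally installed alongside R

# Hypothetical stand-in for the real .dta file: write a tiny one to a temp path
path <- tempfile(fileext = ".dta")
write.dta(data.frame(id = 1:3, score = c(2.5, 3.0, 4.5)), path)

# Read the Stata file back into an R data frame
dat <- read.dta(path)
str(dat)
```

Two caveats worth checking: `read.dta()` documents support for Stata formats 5 to 12 only, so files saved by newer Stata releases may need another reader such as `haven::read_dta()`; and by default (`convert.factors = TRUE`) Stata value labels are turned into R factors.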
2010 Nov 29
2
Setting Values of Elements in a Dataframe
Dear All,
I am experiencing some problems in resetting the values of some selected
elements in a dataframe.
Consider
d<-seq(-1,1,length=16)
dim(d)<-c(4,4)
d<-as.data.frame(d)
sel_pos<-which(d>0, arr.ind=TRUE)
d[sel_pos]<- -9
which returns the error
Error in `[<-.data.frame`(`*tmp*`, sel_pos, value = -9) :
only logical matrix subscripts are allowed in replacement
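
The error message itself suggests one way out: index with a logical matrix rather than the numeric matrix returned by `which(..., arr.ind=TRUE)`. A minimal sketch, reusing the data frame from the post:

```r
# Rebuild the 4 x 4 data frame from the post
d <- seq(-1, 1, length = 16)
dim(d) <- c(4, 4)
d <- as.data.frame(d)

# d > 0 is a logical matrix with the same shape as d,
# which is a form of subscript the replacement method accepts
d[d > 0] <- -9
```

If the positions are needed elsewhere, `which(d > 0, arr.ind=TRUE)` can still be computed separately; it just is not used as the subscript here.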
2013 Feb 09
3
Addressing Columns in a Data Frame
Dear All,
Probably a one liner, but I am banging my head against the floor.
Consider the following
DF <- data.frame(
x=1:10,
y=10:1,
z=rep(5,10),
a=11:20
)
mn<-names(DF)
but then I cannot retrieve a column by doing, e.g.,
DF$mn[2]
I tried to play with the quotes and so on, but so far to no avail.
Any suggestion is welcome.
Cheers
Lorenzo
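
A short sketch of one way to do this: `$` takes the name after it literally, so `DF$mn` looks for a column called "mn". Double-bracket or matrix-style indexing accepts a character value instead. The names `col_y` and `sub` are introduced here only for illustration.

```r
DF <- data.frame(
  x = 1:10,
  y = 10:1,
  z = rep(5, 10),
  a = 11:20
)
mn <- names(DF)

# [[ ]] evaluates its argument, so the column name can come from a variable
col_y <- DF[[mn[2]]]    # the column "y" as a plain vector

# Several columns at once, returned as a data frame
sub <- DF[, mn[2:3]]    # columns "y" and "z"
```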
2005 Jul 18
1
dataframes of unequal size
I have two dataframes C and C1. Each has three columns viz. state, psu
and weight. The dataframes are of unequal size i.e. C1 could be
2/25/50 rows and C has 42000 rows. C1 is the master table i.e.
C1$state, C1$psu and C1$weight are never the same. This
is not so for C.
For example
C
state, psu,weight
A. P., Urban, 0
Mah., Rural, 0
W.B., Rural,0
Ass., Rural,0
M. P., Urban,0
A. P.,
2009 Oct 07
1
merging dataframes with an unequal number of variables
Hallo Everyone
I have the kind of problem that one should never have because one must
always plan well and communicate with your team. But now I haven't so here
is my problem.
I have data coming in on a daily basis from surveys in 10 towns. The
questionnaire has 62 variables but some of the regions have used older
versions of the questionnaire that have a few variables less. I want to
combine
2013 Feb 03
3
RandomForest, Party and Memory Management
Dear All,
For a data mining project, I am relying heavily on the RandomForest and
Party packages.
Due to the large size of the data set, I often have memory problems (in
particular with the Party package; RandomForest seems to use less memory).
I really have two questions at this point
1) Please see how I am using the Party and RandomForest packages. Any
comment is welcome and useful.
2019 Jul 09
3
[R] Curl4, Quantmod, tseries and forecast
Hi Ralf,
I tried the following
> install.packages("RCurl")
which went OK, but then same story when I tried to install tseries.
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK:
2013 Mar 24
3
Parallelizing GBM
Dear All,
I am far from being a guru about parallel programming.
Most of the time, I rely on randomForest for data mining large datasets.
I would like to give a try also to the gradient boosted methods in GBM,
but I have a need for parallelization.
I normally rely on gbm.fit for speed reasons, and I usually call it this
way
gbm_model <- gbm.fit(trainRF,prices_train,
offset = NULL,
misc =
2009 Jul 20
3
Histograms on a log scale
Dear All,
I would like to be able to plot histograms/densities on a semi-log or
log-log scale.
I found several suggestions online
http://tolstoy.newcastle.edu.au/R/help/05/09/12044.html
https://stat.ethz.ch/pipermail/r-help/2002-June/022295.html
http://www.harding.edu/fmccown/R/#histograms
Now, consider the code snippet taken from
http://www.harding.edu/fmccown/R/#histograms
# Get a random
2011 Dec 15
3
From Distance Matrix to 2D coordinates
Dear All,
I am struggling with the following problem: I am given a NxN symmetric
matrix P ( P[i,i]=0, i=1...N and P[i,j]>0 for i!=j) which stands for the
relative distances of N points.
I would like to use it to get the coordinates of the N points in a 2D
plane. Of course, the solution is not unique (given one solution, I can
translate or rotate all the points by the same amount and generate
2007 Aug 08
2
Relocating Axis Label/Title --2
Apologies for the previous mail (I sent it off too early by mistake).
This is the correct example:
rm(list=ls())
D_mean<-seq(-5,5,length=100)
y<-exp(-D_mean^2/5)
pdf("my.pdf")
plot(D_mean,y,type="l",yaxt="n",lty=2,lwd=2,col="black",
ylab = list(expression(paste(dN/dlogD[agg]," ["*cm^-3*"]"))),
xlab = expression(paste(D[agg],"
2012 Oct 05
2
Test for Random Points on a Sphere
Dear All,
I implemented an algorithm for (uniform) random rotations.
In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian
coordinates.
The result is supposed to be a set of random, uniformly distributed,
points on a sphere (not the point of the algorithm, but a way to test it).
This is what the points look like when I plot them, but other than
eyeballing them, can anyone
2013 Mar 25
2
Reassign Multiple Factors to same Factor Value
Dear All,
Probably something very easy, but I am looking for the most efficient ways
to achieve this.
Consider the following snippet
y<-c('a','b','c','d','e','f','g')
x<-rnorm(length(y))
df<-data.frame(y,x)
leading to
> df$y
[1] a b c d e f g
Levels: a b c d e f g
Now, I would like to replace levels
2013 Jan 28
1
RandomForest and Missing Values
Dear All,
I would like to use a randomForest algorithm on a dataset.
The set is not particularly large/difficult to handle, but it has some
missing values (both factors and numerical values).
According to what I found
https://stat.ethz.ch/pipermail/r-help/2005-September/078880.html
https://stat.ethz.ch/pipermail/r-help/2007-January/123117.html
the randomForest package has a problem with missing
2016 Apr 19
3
Problem with X11
Dear All,
I have never had this problem before. I run debian testing on my box
and I have recently updated my R environment.
Now, see what happens when I try the most trivial of all plots
> plot(seq(22))
Error in (function (display = "", width, height, pointsize, gamma, bg,
:
X11 module cannot be loaded
In addition: Warning message:
In (function (display = "", width,
2010 Feb 26
3
Plotting a Trivial Matrix
Dear All,
Consider a matrix (N x N) where each entry is either zero or one (can
hardly get any simpler).
Now, I would like to plot it as a 'chessboard' where every matrix entry
is a black (1) or white (0) square.
Whatever tool I use to plot it, it should not try to interpolate the
data at all.
I found some online references
http://www.phaget4.org/R/image_matrix.html
but probably I can
2010 Oct 24
6
Contour Plot on a non Rectangular Grid
Dear All,
I would like to plot a scalar (e.g. a temperature) on a non-rectangular
domain (or even better: I would simply like to be able to draw a contour
plot on an arbitrary 2D domain). I wonder if there is any tool to
achieve that with R. I did some online search in particular on the list
archives, found several queries similar to this one but was not able to
find any conclusive answer.
I
2007 Apr 05
17
Reasons to Use R
Dear All,
The institute I work for is organizing an internal workshop for High
Performance Computing (HPC).
I am planning to attend it and talk a bit about fluid dynamics, but
there is also quite a lot of interest devoted to data post-processing
and management of huge data sets.
A lot of people are interested in image processing/pattern recognition
and statistics applied to geography/ecology, but I
2019 Jul 07
2
Curl4, Quantmod, tseries and forecast
Dear All,
I have just upgraded to Debian stable 10 and rebuilt most of the R
packages.
I use the R backported packages from here
https://cran.r-project.org/bin/linux/debian/#debian-buster-testing
for the core system.
I encounter some issues when updating quantmod, tseries and forecast.
For instance, see the following
> install.packages("tseries")
which finally fails with the