Displaying 20 results from an estimated 70000 matches similar to: "manupulating a data frame column"
2008 Mar 29
1
Tabulating Sparse Contingency Table
I have a sparse contingency table (most cells are 0):
> xtabs(~.,data[,idx:(idx+4)])
, , x3 = 1, x4 = 1, x5 = 1
   x2
x1    1   2   3
  1   0   0  31
  2   0   0 112
  3   0   0  94
, , x3 = 2, x4 = 1, x5 = 1
   x2
x1    1   2   3
  1   0   0   0
  2   0   0   0
  3   0   0   0
, , x3 = 3, x4 = 1, x5 = 1
   x2
x1    1   2   3
  1   0   0   0
  2   0   0   0
  3   0   0   0
, , x3 = 1, x4
2003 Aug 05
1
code speed help? -- example and results provided
I have the following piece of code that combines lists comprised of components of varying length into a list with components of constant length. I have found 2 ways to do it, and the faster of the two is posted below along with sample results. Do you have any suggestions on how to decrease the calculation time by modifying the code?
> ####Function###########
>
2012 Aug 01
2
subsetting a data frame with binomial responses
Hi everyone,
Let me have a data frame named "mydata", created as below:
> n=c(5,5,5,5) #number of trials
> x1=c(2,3,1,3) #number of successes
> x2=c(5,5,5,5) #number of successes
> x3=c(0,0,0,0) #number of successes
> x4=c(5,0,5,0) #number of successes
> mydata=data.frame(n,x1,x2,x3,x4)
> mydata
n x1 x2 x3 x4
1 5 2 5 0 5
2 5 3 5 0 0
3 5 1 5 0 5
4 5 3 5 0
2007 Oct 29
1
lm design matrix bug?
Hi All
Maybe I don't understand it, but I would have expected the design matrix to have
as many rows as there were observations available to fit the model.
Below, a small artificial dataset is created, then one model is fitted and the design
matrix is output, having 27 rows. Then I delete 6 obs and fit the model on
the remaining 21 obs, but the design matrix that comes out has 26 rows?
Thanks for your
2010 Sep 24
1
Standard Error for difference in predicted probabilities
Is there a way to estimate the standard error for the difference in
predicted probabilities obtained from a logistic regression model?
For example, this code gives the difference for the predicted
probability of when x2==1 vs. when x2==0, holding x1 constant at its
mean:
y=rbinom(100,1,.4)
x1=rnorm(100, 3, 2)
x2=rbinom(100, 1, .7)
mod=glm(y ~ x1 + x2, family=binomial)
pred=predict(mod,
2009 Apr 15
3
excluding a column from a data frame
Dear R People:
Suppose I have the following data frame:
x1 x2 x3
1 -0.1582116 0.06635783 1.765448
2 -1.1407422 0.47235664 0.615931
3 0.8702362 2.32301341 2.653805
> str(xx)
'data.frame': 3 obs. of 3 variables:
$ x1: num -0.158 -1.141 0.87
$ x2: num 0.0664 0.4724 2.323
$ x3: num 1.765 0.616 2.654
I can exclude the second column nicely via:
>
2011 Sep 01
1
vector output loop or function
Dear all
Sorry for the simple question:
I want to put the following operation into a loop, as the number of X
variables is large (1000):
X1 <- sample(c(1,2, 3, 4),10, replace = T, prob = c(0.4, 0.2, 0.2, 0.2))
cv1 <- round(runif(2, 1, 10))
# X2 is copy of X1
X2 <- X1
# now X2 is different in cv1 random positions
X2[cv1] <- 5
cv2 <- round(runif(2, 1, 10))
# X3 is copy of X2
X3 <- X2
2011 Jun 01
3
error in model specification for cfa with lavaan-package
Dear R-List,
(I am not sure whether this list is the right place for my question...)
I have a dataframe df.cfa
2007 Mar 01
1
how to apply the function cut( ) to many columns in a data.frame?
Dear useRs,
In a data.frame (df) I have several columns (x1, x2, x3....xn) containing
data as a continuous numerical response:
df
var x1 x2 x3
1 143 147 137
2 93 93 117
3 164 39 101
4 123 118 97
5 63 125 97
6 129 83 124
7 123 93 136
8 123 80 79
9 89 107 150
10 78 95 121
I want to
2009 Jan 20
2
Summing Select Columns of a Data Frame?
Hi,
I would like to operate on certain columns in a dataframe, but not
others. My data looks like this:
x1 x2 x3
1 2 3
4 5 6
7 8 9
I want to create a new column named x4 that is the sum of x1 and x2,
but NOT x3. I looked at colSums and apply, but those functions seem to
use all the columns in a dataframe. How do I only use select columns?
If it helps, in Stata this would be gen x4
2012 Sep 12
3
how to create a subtraction matrix (subtract a row of every column from the same row in other columns)
Hello
I have data like this
x1 x2 x3 x4 x5
I want to create a matrix similar to a correlation matrix, but with the
difference between the two values, like this
x1 x2 x3 x4 x5
x1 x2-x1 x3-x1 x4-x1 x5-x1
x2 x3-x2 x4-x2 x5-x2
x3 x4-x3 x5-x3
x4 x5-x4
x5
Then I
2011 Feb 17
2
sort by column and row names
Hello, All,
How can one sort on column and row names? For example:
How can this
X1 X3 X2
X1 1 0 0
X3 0 1 0
X2 0 0 1
become this?
X1 X2 X3
X1 1 0 0
X2 0 1 0
X3 0 0 1
Thank you for your time!
Jim
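
One possible approach to the question above, offered as a minimal sketch rather than the answer given in the thread: index the matrix by the sorted order of its row and column names. The matrix `m` is a hypothetical reconstruction of the example in the post, and `sorted` is a name introduced here for illustration.

```r
# Hypothetical matrix matching the example in the question
m <- diag(3)
dimnames(m) <- list(c("X1", "X3", "X2"), c("X1", "X3", "X2"))

# Reorder rows and columns by the sorted order of their names
sorted <- m[order(rownames(m)), order(colnames(m))]
sorted
```

Note that order() on character names sorts them as strings, so with more than nine variables "X10" would come before "X2"; in that case the numeric part of the names would need to be extracted first.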
2011 Oct 22
1
Data frame manipulation by eliminating rows containing extreme values
Dear All,
I have obtained the limits for removing extreme values for each variable using
the following function.
f=function(x){quantile(x, c(0.25, 0.75),na.rm = TRUE) - matrix(IQR(x,na.rm =
TRUE) * c(1.5), nrow = 1) %*% c(-1, 1)}
#Example:
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <-
2011 Mar 12
1
Column order in stacking/unstacking
Dear R users,
I'm having some problems with the stack() and unstack() functions, and
wondered if you could help.
I have a large data frame (400 rows x 2000 columns), which I need to reduce
to a single column of values (and therefore 800000 rows), so that I can use
it in other operations (e.g., generating predictions from a GLM object).
However, the problem I'm having can be reproduced
2005 Jan 26
3
Still avoiding loops
Dear all,
I have a matrix X with 47 lines and say 500 columns - values are in {0,1}.
I'd like to compare lines.
For that, I first did:
for (i in 1:(dim(X)[1]-1))
for (j in (i+1):dim(X)[1]) {
Y <- X[i,]+X[j,]
etc.
but, since it takes a long time, I would prefer avoiding loops;
for that, my first idea was to add this matrix:
X1=X[,rep(1:46,46:1)]
to this one:
res=NULL
for (i in
2008 Jan 08
2
problem when extracting columns of a data frame in a new data frame
Dear R-users,
I would like to create a new data frame composed of 2 columns of another
data frame. But it does not give me what I want...
> casesCNST[1:10,]
case X1 X2 X3 X4 expected
1 A1 0 0 0 0 E
2 A2 0 0 0 1 C
3 A3 0 0 0 2 C
4 A4 0 0 0 3 C
5 A5 0 0 0 4 C
6 A6 0 0 1 0 C
7 A7 0 0 1 1 C
8
2009 Jul 14
2
How to provide list as an argument for the data.frame()
Hi R-users,
I have a table as described below. I'm reading the numeric values presented in this table to populate a list.
#table
#============
#X A B C
#x1 2 3 4
#x2 5 7 10
#x4 2 3 5
#============
rawData <- read.table("raw_data.txt",header=T, sep="\t")
myList=list()
counter=0
for (i in c(1:length(rawData$X)))
{
print (i)
2005 Apr 22
3
as.data.frame: Error in "names<-.default" (PR#7808)
Hello,
I found a potential problem in R 2.1.0 (and R 2.0.1)
I expect that
> tmp <- FUN(x1, x2, x3, x4)
> as.data.frame(tmp)
is the same as
> as.data.frame(FUN(x1, x2, x3, x4))
since the tmp variable in this case is unnecessary.
However, below I will demonstrate that under an odd set of conditions, I
can correctly perform as.data.frame(tmp), but not as.data.frame(FUN(x1,
x2, x3,
2010 Aug 21
2
t.tests on a data.frame using an apply-type function
I have a data.frame with ~250 observations (rows) in each of ~50
categories (columns). I would like to perform t.tests on subsets of
observations within each column, with the subsets according to index
vectors contained in other columns of the data.frame.
My data.frame looks something like this:
x<-data.frame(matrix(rnorm(200,mean=5,sd=.5),nrow=20))
colnames(x)<-c("site",
2008 Jan 30
1
restricting points in a data frame
useR's,
Consider some variables and a data frame of points:
x1 <- c(1,2,3)
x2 <- c(3,4,5)
xk1 <- seq(min(x1)-.5, max(x1)+.5,.5)
xk2 <- seq(min(x2)-.5, max(x2)+.5,.5)
expand.grid(xk1=xk1,xk2=xk2)
xk1 xk2
1 0.5 2.5
2 1.0 2.5
3 1.5 2.5
4 2.0 2.5
5 2.5 2.5
6 3.0 2.5
7 3.5 2.5
...
46 2.0 5.5
47 2.5 5.5
48 3.0 5.5
49 3.5 5.5
I want to restrict the data frame to only contain