thr3ads.net - search: "ruser2006"

Displaying 20 results from an estimated 21 matches for "ruser2006".

R: R: [Re:] function to replace missing values with median value?]]

2006 May 04

...<environment: namespace:base> Stefano >-----Original Message----- >From: r-help-bounces at stat.math.ethz.ch >[mailto:r-help-bounces at stat.math.ethz.ch]On behalf of >Guazzetti Stefano >Sent: Thursday, May 04, 2006 07:55 AM >To: isaia at econ.unito.it; ruser2006 at yahoo.com >Cc: r-help at stat.math.ethz.ch >Subject: [R] R: [Re:] function to replace missing values with median >value?]] > > >there is also a replace function > > > >-----Original Message----- > >From: r-help-bounces at sta...

Converting from a dataset to a single "column"

2006 Jan 23

I have a dataset of 3 "columns" and 5 "rows". temp<-data.frame(col1=c(5,10,14,56,7),col2=c(4,2,8,3,34),col3=c(28,4,52,34,67)) I wish to convert this to a single "column", with column 1 on "top" and column 3 on "bottom". i.e. 5 10 14 56 7 4 2 8 3 34 28 4 52 34 67 Are there any functions that do this, and that will work well on much larger datasets (e.g. 1000 rows, 6000 columns)?
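One possible approach, sketched here rather than taken from the thread: a data frame is stored as a list of columns, so unlist() concatenates them in column order. The names stacked and single_col are made up for this illustration, and the sketch assumes every column has the same type (all numeric here).

```r
# Example data copied from the question above.
temp <- data.frame(col1 = c(5, 10, 14, 56, 7),
                   col2 = c(4, 2, 8, 3, 34),
                   col3 = c(28, 4, 52, 34, 67))

# unlist() walks the columns in order: col1 on top, col3 at the bottom.
# use.names = FALSE skips building a name for every element.
stacked <- unlist(temp, use.names = FALSE)

# Wrap the vector in a one-column data frame if a "column" is needed.
single_col <- data.frame(value = stacked)
```

For all-numeric data, as.vector(as.matrix(temp)) should give the same ordering, since matrices are stored column by column.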

importing a VERY LARGE database from Microsoft SQL into R

2006 Jan 24

I am using R 2.1.1 in a Windows XP environment. I need to import a large database from Microsoft SQL into R. I am currently using the "sqlQuery" function/command. This works, but I sometimes run out of memory if my database is too big, or it takes quite a long time for the data to import into R. Is there a better way to bring a large SQL database into R? Is there an efficient way to convert

question re: "summary.lm" and NA values

2006 Aug 15

Is there a way to get the following code to include NA values where the coefficients are "NA"? ((summary(reg))$coefficients) explanation: Using a loop, I am running regressions on several "subsets" of "data1". "reg <- ( lm(lm(data1[,1] ~., data1[,2:l])) )" My regression has 10 independent variables, and I therefore expect 11 coefficients. After each regression, I wish to save the

converting code into a function - separating a data frame with n columns into n individual vectors

2006 May 05

I have many very large dataframes with 20 columns each. In order to conserve memory, I wish to separate each data frame into 20 vectors, each named with the name of the dataframe followed by .1, .2, .3, ..., .20. (For example purposes, one data frame is named "testa".) e.g. testa.1, testa.2, testa.3 I have written the code to do this (see below). I am trying to convert this into a function that I can reuse.

basic question re lm()

2006 Aug 10

I am using R in a Windows environment. I have a basic question regarding lm(). I have a dataframe "data1" with ncol=w. I know that my dependent variable is in column 1. Is there a way to write the regression formula so that I can use columns 2 thru w as my independent variables? e.g. something like: "lm(data1[,1] ~ data1[,2:w])" Thanks
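A minimal sketch of one way this could be written, not taken from the thread: build the formula from the column names with reformulate(), treating the first column as the response. The data frame below is invented for illustration, and the sketch assumes the column names are syntactically valid R names.

```r
# Made-up data: response in column 1, predictors in columns 2 through w.
set.seed(1)
data1 <- data.frame(y  = rnorm(20),
                    x1 = rnorm(20),
                    x2 = rnorm(20),
                    x3 = rnorm(20))

# Build "y ~ x1 + x2 + x3" from the column names instead of typing it out.
f <- reformulate(names(data1)[-1], response = names(data1)[1])

fit <- lm(f, data = data1)
```

When every remaining column is a predictor, the shorthand lm(y ~ ., data = data1) should fit the same model; the reformulate() version helps when the response name is only known by position.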

exporting dates into Microsoft SQL Server

2006 Jan 23

I am running R 2.1.1 in a Windows XP environment. I wish to use the sqlSave command to export a dataframe into Microsoft SQL. My dataframe is called temp and has 2 "columns", "monthenddate" and "value". Monthenddate is in 'POSIXct' format. (i.e. 'POSIXct', format: chr "1984-01-31" "1984-01-31" "1984-01-31" "1984-01-31" ...). How can I

using a value in a column to "lookup" data in a certain column of a dataset?

2006 Mar 14

I have a dataset with 20 columns and ~600,000 rows. Column 1 has a number from 2-19. This number tells me, for each row, which column has the "applicable" data. (i.e. the data that I wish to use for each individual row) I want to create a vector that contains the data from the value in column 1. e.g. If column 1, row 1, has a value of "6", I want to obtain the value in column 6, row 1. If

"Conditional" match?

2006 Jan 27

I have two datasets, big and small. s_date<-c("2005-12-02", "2005-12-01", "2004-11-02","2002-10-05","2000-12-15") s_id<-c("a","a","b","c","d") b_date<- c("2005-12-31", "2005-12-31", "2004-12-31","2002-10-05","2001-10-31","1999-12-31") b_id<-c("a","b","c","d","e","c") small<-data.frame(date_=as.Date(s_date),id=s_id) big<-data.frame(date_=as.Date(b_date),id=b_id) For each row

paste - eliminate spaces?

2006 Jan 25

I am trying to combine the value of a variable and text. e.g. I want "test1", with no spaces. I try: h=1 paste("test",1) But get: [1] "test 1" (i.e. there is a space between "test" and "1") Is there a way to eliminate the space?
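A short sketch of the usual fix, added for illustration: paste() joins its arguments with a single space by default, and the sep argument overrides that. The variable names joined and joined0 are made up here.

```r
h <- 1

# sep = "" replaces the default single-space separator with nothing.
joined <- paste("test", h, sep = "")

# paste0() is shorthand for the same call; it only exists in newer
# R releases (2.15.0 onward), so it would not be available in R 2.1.1.
joined0 <- paste0("test", h)
```

Both calls should return "test1" with no space.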

matrix math

2006 Jan 04

I am using R 2.1.1 in a Windows XP environment. I have 2 dataframes, temp1 and temp2. Each dataframe has 20 variables ("columns") and 525 observations ("rows"). All variables are numeric. I want to create a new dataframe that also has 20 columns and 525 rows. The values in this dataframe should be the sum of the 2 other dataframes. (i.e. temp1$column

vector math: calculating a rolling 12 row product?

2006 Feb 28

I have a dataframe of numeric values with 30 "rows" and 7 "columns". For each column, beginning at "row" 12 and down to "row" 30, I wish to calculate the "rolling 12 row product". I.e., within each column, I wish to multiply all the values in rows 1:12, 2:13, ..., 19:30. I wish to save the results as a new dataframe, which will have 19 rows and 7 columns.

getting sapply to skip columns with non-numeric data?

2006 Aug 17

getting s-apply to skip columns with non-numeric data? I have a dataframe "x" of w columns. Some columns are numeric, some are not. I wish to create a function to calculate the mean and standard deviation of each numeric column, and then "bind" the column mean and standard deviation to the bottom of the dataframe. e.g. tempmean <- apply(data.frame(x), 2, mean, na.rm = T) xnew <-

For loop gets exponentially slower as dataset gets larger...

2006 Jan 03

I am running R 2.1.1 in a Microsoft Windows XP environment. I have a matrix with three vectors (“columns”) and ~2 million “rows”. The three vectors are date_, id, and price. The data is ordered (sorted) by code and date_. (The matrix contains daily prices for several thousand stocks, and has ~2 million “rows”. If a stock did not trade on a particular date, its price is set to “NA”)

memory management under Windows XP

2006 Feb 23

I am using R 2.2.1 in a Windows XP environment. I work with very large datasets, and occasionally run out of memory. I have modified my boot.ini file to use the "/3gb switch". I also run the following line after I launch R (I am unsure if it is helpful). "memory.limit(size = 4095)" Please point me to useful references on how to better manage memory, or suggest other

rowVars

2006 Mar 31

I am using R 2.2.1 in a Windows XP environment. I have a dataframe with 12 columns and 1,000 rows. (Some of the rows have 1 or fewer values.) I am trying to use rowVars to calculate the variance of each row. I am getting the following message: "Error in na.remove.default(x) : length of 'dimnames' [1] not equal to array extent" Is there a good work-around?

array vs matrix vs dataframe?

2006 Oct 01

What is the difference among an array, a dataframe and a matrix? Why is the size of a dataframe so much larger? (see example below) a<-c(rep(1:1000000,1)) b<-c(rep(1:1000000,1)) c1<-cbind(a,b) cdf<-as.data.frame(cbind(a,b)) cm<-as.matrix(cbind(a,b)) object.size(a)/1000000 object.size(b)/1000000 object.size(c1)/1000000 object.size(cdf)/1000000 object.size(cm)/1000000

"renaming" dataframe1 using "column" names from dataframe2?

2006 Mar 17

I have a dataframe named "temp", and another dataframe named "descriptions". I wish to "rename" temp, and to "call" it the names of a certain column in the dataframe "descriptions". Is there a good way to do this? A similar question: I am using a "for loop" to create several new dataframes. e.g. for(j in 1:9){ .. I'd like each dataframe to be named d1, d2, d3, with the number being tied to

calculating a trailing 12 column mean in a dataframe?

2006 Mar 29

I have a dataframe of 25 columns and 100,000 rows called "testdf". I wish to build a new dataframe, with 14 columns and 100,000 rows. I wish the new dataframe to have the "trailing 12 column" mean. That is, I want column 1 of the new dataframe to have something like: "mean(testdf[,1:12],na.rm=T)" What is the best way to accomplish this?

average by group...

2006 May 30

I have a dataframe with 700,000 rows and 2 vectors (columns): "group" and "score". I wish to calculate a third vector of length 700000: the average score by group. Even though the average value will repeat, I wish to return the average for that particular group for each row. (I know I can do this by calculating each group's average and then using the merge command, but as my calculations get
