thr3ads.net - similar to: "More difficulties in getting data into R"

Failing on reading a "slightly big" dataset

2004 Jul 05

2

I have a file with 4 columns per line, all pipe delimited.
$ wc -l cmie_firm_data.text
89325 cmie_firm_data.text
$ ls -al cmie_firm_data.text
-rw-r--r-- 1 ajayshah ajayshah 4415637 Jul 5 15:25 cmie_firm_data.text
$ awk -F\| '(NF != 4)' cmie_firm_data.text
$ head cmie_firm_data.text
All figures are for the year 20030331|||
Company|GVA Less Interest (Rs. thousand)|Interest (Rs.

Did I use "step" function correctly? (Is R's step() function reliable?)

2006 Mar 16

3

Hi all, I put up an exhaustive model to use R's "step" function:
mygam=gam(col1 ~ 1 + col2 + col3 + col4
  + col2 ^ 2 + col3 ^ 2 + col4 ^ 2
  + col2 ^ 3 + col3 ^ 3 + col4 ^ 3
  + s(col2, 1) + s(col3, 1) + s(col4, 1)
  + s(col2, 2) + s(col3, 2) + s(col4, 2)
  + s(col2, 3) + s(col3, 3) + s(col4, 3)
  + s(col2, 4) + s(col3, 4) + s(col4, 4)
  + s(col2, 5) + s(col3,

selecting dataframe columns based on substring of col name(s)

2017 Jun 21

4

Suppose I have the following sort of dataframe, where each column name has a common structure: prefix, followed by a number (for this example, col1, col2, col3 and col4):
d = data.frame( col1=runif(10), col2=runif(10), col3=runif(10),col4=runif(10))
What I haven't been able to suss out is how to efficiently 'extract/manipulate/play with' columns from the data frame, making use

selecting dataframe columns based on substring of col name(s)

2017 Jun 21

0

> On Jun 21, 2017, at 9:11 AM, Evan Cooch <evan.cooch at gmail.com> wrote:
>
> Suppose I have the following sort of dataframe, where each column name has a common structure: prefix, followed by a number (for this example, col1, col2, col3 and col4):
>
> d = data.frame( col1=runif(10), col2=runif(10), col3=runif(10),col4=runif(10))
>
> What I haven't been able to

help needed with boxplot

2010 Mar 22

1

I am new to R, can anyone help with boxplot for a dataset like:
file1
col1 col2 col3 col4 col5
050350005 101 56.625 48.318 RED
051010002 106 50.625 46.990 GREEN
051190007 25 65.875 74.545 BLUE
051191002 246 52.875 57.070 RED
220050004 55 70 80.274 BLUE
220150008 75 67.750 62.749 RED
220170001 77 65.750 54.307 GREEN
file2
col1 col2 col3 col4 col5
050350005 101 56.625 57 RED
051010002 106 50.625 77

How to find moving averages within each subgroup of a data frame

2009 Oct 22

2

Dear all, If I have the following data frame:
> set.seed(21)
> df1 <- data.frame(col1=c(rep('a',5), rep('b',5), rep('c',5)), col4=rnorm(1:15))
   col1 col4
1  a  0.793013171
2  a  0.522251264
3  a  1.746222241
4  a -1.271336123
5  a  2.197389533
6  b  0.433130777
7  b -1.570199630
8  b -0.934905667
9  b  0.063493345
10 b

union data in column

2010 Jul 24

2

Is there any function/way to merge/unite the following data
GENEID col1 col2 col3 col4
G234064 1 0 0 0
G234064 1 0 0 0
G234064 1 0 0 0
G234064 0 1

How to loop through all the columns in dataframe

2008 Mar 16

2

Hi: Can anyone advise me on how to loop through all the columns and perform a calculation? Here's my data:
xd <- c(2.2024,2.4216,1.4672,1.4817,1.4957,1.4431,1.5676)
pd <- c(0.017046,0.018504,0.012157,0.012253,0.012348,0.011997,0.012825)
td <- c(160524,163565,143973,111956,89677,95269,81558)
mydf <- data.frame(xd,pd,td)
trans <- t(mydf)
trans
I have these values that I need to

Split rows depending on time frame

2010 Oct 11

2

Hi, I have the following data frame, where col2 is a start date and col3 an end date:
COL1 COL2 COL3
A 40462 40482
B 40462 40478
I would like to split the above 3-week timeframe into weeks, like this:
COL1 COL2 COL3 COL4
A 40462 40468 1
A 40469 40475 1
A 40476 40482 1
B

by function

2010 Apr 29

2

Hello, I have a data.frame:
name col1 col2 col3 col4
AA 23 54 0.999 0.78
BB 123 5 1 0.99
AA 203 98 0.79 0.99
I want to get mean value data.frame in terms of name:
name col1 col2 col3 col4
AA 113.0000 76.0000 0.8945 0.8850
BB 123.00 5.00 1.00 0.99
I tried to use by function:
> aa<-by(test[,2:5], feature, mean)

compare two matrices

2010 Sep 27

1

Hi everyone: I have a kinda easy question, but I do not know how to solve it in a simple way. I want to compare the rows of two matrices.
col1 <- c(1,2,3,4,5,6)
col2 <- c(6,5,4,3,2,1)
m <- cbind(col1, col2)
col3 <- c(1,3,2,6)
col4 <- c(6,3,5,1)
n <- cbind(col3, col4)
In matrix n, for example the first row is (1,6), it is also some row

Remove rows in a matrix that match rows in another matrix

2009 Dec 20

2

Dear R Community, The following seems like a simple problem, but I've been stuck on it for some time, with no luck using matching or subsetting functions. I'm trying to remove the rows from a large matrix that match rows in another large matrix. A (small scale) example:
col1<-c("A", "B", "C", "D")
col2<-c("A",

Better way to change the name of a column in a dataframe?

2006 Dec 14

5

Hello R users -- If I have a dataframe such as the following, named "frame" with the columns intended to be named col1 through col6,
> frame
     col1 col2 cmlo3 col4 col5 col6
[1,]    3   10     2    6    5    7
[2,]    6    8     4   10    7    1
[3,]    7    5     1    3    1    8
[4,]   10    6     5    4    9    2
and I want to correct or otherwise change the

Reading a txt file in chunks

2019 Feb 12

7

Dear eRRers, I have a txt file that I want to import into R, but it is not in a format suitable for the usual tools, such as read.csv(). The format is something like this:
time 1
col1 col2 col3 col4
data data data data
data data data data
data data data data
data data data data
data data data data
end
time 2
col1 col2 col3 col4
data data data data
data data data data
data data data data
data

help with regexp mass substitution

2009 Oct 02

3

Hello, I have to rename a lot of variables, and, given that they have regular name constructs, I would like to use regexps. Here's a dump of my head(names(df)):
varnames <- c("id.quest", "txt.1.3", "col1.1.3", "col2.1.3", "col3.1.3", "col4.1.3", "col5.1.3", "txt.2.3", "col1.2.3",

Replace values according to conditions

2008 Apr 09

1

Greetings R-users, I have the following data called mydata in a data.frame:
Col1 Col2 Col3 Col4 Col5
1 2 4 6 7
8 8 7 3 5
4 4 5 6 7
I want to replace the data according to the following conditions:
Condition 1: if data <= 3, replace with -1
Condition 2: if data >= 6, replace with 1
Condition 3: if data = 4 or data = 5, replace with 0

Adding rows to column

2010 Oct 21

2

I'm new to R. I'm extracting important columns from a single table using the following code:
File2<-"file.txt"
table2<- read.delim(File2, skip=19, sep=";", header=F, na.strings=NA, fill=T)
#extracting column 7 where rows match "ID"
col1<- table2[grep("ID", table2[,1]),7]
#similarly extracting column 9,11,13,15
col2<-

speeding read.table

2012 Oct 18

4

R 2.15.1
OS X
Colleagues, I am reading a 1 GB file into R using read.table. The file consists of 100 tables, each of which is headed by two lines of characters. The first of these lines is:
TABLE NO. 1
The second is a list of column headers. For example:
TABLE NO. 1
COL1 COL2 COL3 COL4 COL5 COL6 COL7 COL8 COL9 COL10

Linear programming problem, RGPLK - "no feasible solution".

2011 Oct 10

1

In my post at https://stat.ethz.ch/pipermail/r-help/2011-October/292019.html I included an undefined term "ej". The problem code should be as follows. It seems like a simple linear programming problem, but for some reason my code is not finding the solution.
obj <- c(rep(0,3),1)
col1 <-c(1,0,0,1,0,0,1,-2.330078923,0)
col2 <-c(0,1,0,0,1,0,1,-2.057855981,0)
col3

replicate lines of data frame

2011 Aug 25

2

Greetings! I am just now learning to use R for my dissertation project. I need to manipulate a lot of text and numeric data. I created a data frame that has 7 columns and 127 unique rows. Now I need to replicate each line 6 times and then later change values in the first 2 columns. I am trying to figure out how to accomplish this. I think that I need to use rep(my.df, each=6) but it does