thr3ads.net - similar to: "editing a big file"

Displaying 20 results from an estimated 10000 matches similar to: "editing a big file"

2006 Jun 30

data extraction

Dear mailing list, I have a data set with 20,000 rows and 20 columns. I want to extract every 10th row only, for example the 10th, 20th, 30th, 40th, ..., 20,000th. Can you please help me do that? Thank you. An example is below. Input: AG GG GG AG CC CC CC CC CT CC CT CT GG GG GG GG CC CC CC CC GG GG GG GG CC CC CC CC GG CG CG GG GG GG GG GG *CC CC CC CC* AA AG AG AA AA AA AA AA GG AG AG GG GG AG AG
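One way to approach the request above is to build the row indices with `seq()` and subset with them. This is only a minimal sketch: the data frame `dat` and its random contents are hypothetical stand-ins for the poster's file, which is not shown in full.

```r
# Hypothetical stand-in for the poster's data: 20,000 rows, 20 columns of genotype codes.
set.seed(1)
genotypes <- c("AA", "AG", "GG", "CC", "CT", "TT")
dat <- as.data.frame(
  matrix(sample(genotypes, 20000 * 20, replace = TRUE), nrow = 20000),
  stringsAsFactors = FALSE
)

# Row indices 10, 20, 30, ..., up to the last multiple of 10 in the data.
idx <- seq(10, nrow(dat), by = 10)

# drop = FALSE keeps the result a data frame even if only one column remains.
every_tenth <- dat[idx, , drop = FALSE]
dim(every_tenth)
```

The same indexing works on a matrix; only the way the file is read in would differ.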

managing data

2006 Jun 17

Dear mailing list, could someone kindly help me solve the following problem? I am trying to write code that will combine two tables, "x" and "y". The first column of each table is a unique identifier for the rows. The first column of table "X" is a subset of the first column of "Y". I need to find the matching rows in both tables by looking at their

spliting

2006 Jul 28

Dear mailing list, I have a big data frame in which each element consists of two letters. I want to split those letters so that each element has one letter and the number of columns is doubled. Can someone help with the code? An example of what I want to split: Input (three columns) GG AG AG CC CC CC CC CC CC AG
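A possible sketch of the split described above, using `substr()` to take the first and second letter of every cell and interleaving the results. The 3 x 3 matrix `m` is an assumed layout built from the first nine values of the excerpt; the object names are illustrative only.

```r
# Assumed layout: the first nine values of the excerpt arranged as 3 rows x 3 columns.
m <- matrix(c("GG", "AG", "AG",
              "CC", "CC", "CC",
              "CC", "CC", "CC"),
            nrow = 3, byrow = TRUE)

# First and second letter of every cell.
first  <- substr(m, 1, 1)
second <- substr(m, 2, 2)

# Interleave: original column j becomes output columns 2j - 1 and 2j.
out <- matrix(NA_character_, nrow = nrow(m), ncol = 2 * ncol(m))
out[, seq(1, by = 2, length.out = ncol(m))] <- first
out[, seq(2, by = 2, length.out = ncol(m))] <- second
out
```

For a data frame, converting with `as.matrix()` first would allow the same steps, assuming every cell holds exactly two characters.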

transposing a big data file

2006 May 09

I have a very big data set of 67 columns and 25,000 rows and would like to transpose it. The R help did not give me enough information because I am not a programmer and am a first-time R user. Can you give me some hints on the coding? From AA TT GG GG CC AA TT GG GG CC AA TT GG GG CC AA TT GG GG CC AA TT GG GG CC to AA AA AA AA AA TT TT TT TT TT GG GG GG GG GG GG GG GG GG GG CC CC CC CC CC
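The transposition in the example above can be sketched with base R's `t()`. The small matrix `m` below reproduces the 5 x 5 example from the post; how the real 25,000-row file would be read in is not shown here and is left as an assumption.

```r
# The 5 x 5 example from the post: every row reads AA TT GG GG CC.
m <- matrix(rep(c("AA", "TT", "GG", "GG", "CC"), times = 5),
            nrow = 5, byrow = TRUE)

# t() swaps rows and columns, so each original column becomes a row.
tm <- t(m)
tm
```

Note that calling `t()` on a data frame returns a matrix, so a result that must be a data frame would need `as.data.frame()` afterwards.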

replacing a factor value in a data frame

2005 Oct 28

Hi All, I have the following problem, that's driving me mad. I have a dataframe of factors, from a genetic scan of SNPs. I DO have NAs in the dataframe, which would look like:

      V4 V5 V6 V7 V8   V9 V10
    1 TT GG TT AC AG   AG  TT
    2 AT CC TT AA AA   AA  TT
    3 AT CC TT AC AA <NA>  TT
    4 TT CC TT AA AA   AA  TT
    5 AT CG TT CC AA   AA  TT
    6 TT CC TT AA AA   AA  TT
    7 AT CC

column to row

2006 Aug 14

Dear mailing list, I have data in two columns. How can I convert it to one row? Thank you in advance. Input: 1 2 3 4 5 6 7 8 9 1 Output: 1 2 3 4 5 6 7 8 9 1
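The excerpt does not show how the ten values are arranged, so the following sketch assumes five rows of two columns read row by row (1 2, 3 4, and so on). Under that assumption, transposing before flattening gives the values in reading order; `two_col` and `one_row` are illustrative names.

```r
# Assumed layout: 5 rows x 2 columns, read row by row as 1 2 / 3 4 / 5 6 / 7 8 / 9 1.
two_col <- data.frame(a = c(1, 3, 5, 7, 9),
                      b = c(2, 4, 6, 8, 1))

# as.vector() flattens column by column, so transpose first to get row-wise order.
flat <- as.vector(t(as.matrix(two_col)))

# A matrix with a single row holding all ten values.
one_row <- matrix(flat, nrow = 1)
one_row
```

If the desired order were instead column by column, `unlist(two_col)` without the transpose would be the closer fit.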

how to count "A","C","T","G" in each row in a big data.frame?

2013 Jan 09

Dear All, I have a data.frame like this: structure(list(name = c("Gga_rs10722041", "Gga_rs10722249", "Gga_rs10722565", "Gga_rs10723082", "Gga_rs10723993", "Gga_rs10724555", "Gga_rs10726238", "Gga_rs10726461", "Gga_rs10726774", "Gga_rs10726967", "Gga_rs10727581", "Gga_rs10728004",

data managment

2006 Jun 14

First, I would really like to thank the mailing list for the help I got in the past. As someone new to R, I really need some support on how to code the following problem. I am trying to sort some data I have in a big file. The file has 4 columns and 19,000 rows. An example of it looks like this: G 0.892 A 0.108 G 0.883 T 0.117 T 0.5 C

replace string values with numbers

2012 Sep 26

Hi everyone, I have a data frame Gene with SNPs, e.g.

    P1 P2 P3
    CG CG GG
    -- -- AC
    -- AC CC
    AC -- AC

I tried to replace all the GG with a value 3: Gene[Gene=="GG"]<-3 It always gives me: Warning in `[<-.factor`(`*tmp*`, thisvar, value = 3) : invalid factor level, NAs generated Does anyone know if there is anything wrong with my code? Thanks, Zhengyu
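The warning quoted above appears when a value that is not an existing level is assigned into a factor column. One common workaround, sketched here under the assumption that the columns were read in as factors, is to convert them to character before replacing. The data frame below is rebuilt from the four rows in the post.

```r
# The four rows from the post, with columns deliberately created as factors.
Gene <- data.frame(P1 = c("CG", "--", "--", "AC"),
                   P2 = c("CG", "--", "AC", "--"),
                   P3 = c("GG", "AC", "CC", "AC"),
                   stringsAsFactors = TRUE)

# Convert every column from factor to character; Gene[] keeps the data frame structure.
Gene[] <- lapply(Gene, as.character)

# The replacement now works without producing NAs.
Gene[Gene == "GG"] <- "3"
Gene
```

Reading the file with `stringsAsFactors = FALSE` in the first place would avoid the conversion step.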

Sum of character vector

2009 Mar 30

Dear list, I am trying to evaluate how many elements in a vector equal a certain value. The vectors are the columns of a data.frame, read in using read.table():

    > dim(data)
    [1] 2600  742
    > data[1:5,1:5]
      SNP001 SNP002 SNP003 SNP004 SNP005
    1     GG     AA     TT     TT     GG
    2     GG     AA     TC     TT     GG
    3     GG     AC     CC     TT     GG
    4     AG     AA     TT     TT     GG
    5

Filling in empty arrays/lists from using "paste" function

2009 Aug 25

Dear R users, I am trying to fill in arrays (5 different ones, according to distinct "id") from objects produced from the arbitrary data set below. a <-

rho stat from a fasta sequence file

2012 Jan 16

Hi all, I have a sequence file (fasta format) and want to calculate the rho statistic for dinucleotide abundance on my data. The code I use (with the seqinr library and the current working directory) is seq_info<-read.fasta("gene.txt") rho(seq_info[1],2) but it yields only the dinucleotides, not their rho values, i.e., > rho(seq_info[1],2) aa ac ag at ca cc cg ct ga gc

Help needed!

2011 Apr 20

Hi everyone, I have a question. I am reading the source code of the package "ssfcov". The source code is as follows. I cannot find the source code of the function "myss2d" anywhere in the package. Can anyone give me a hint on how to find it in the package? Thanks a lot!! > ssfcov function (time, x, subject, nbasis = 5, centered = FALSE, noDiag = TRUE) {

Counting occurances of a letter by a factor

2010 Sep 10

I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string grouped by the factor levels in another column of my data frame. Ex. > DF<-data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L",

overlay two histograms ggplot

2017 Dec 13

Hi all, How can I overlay these two histograms? ggplot(gg, aes(gg$Alz, fill = gg$veg)) + geom_histogram(alpha = 0.2) ggplot(tt, aes(tt$Cont, fill = tt$veg)) + geom_histogram(alpha = 0.2) thanks for any help! Elahe

questions about string handling

2010 Aug 05

Hi, I have a question about data handling. I have a dataset as follows:

    ID   snp1 snp2 snp3
    1001 0/0  1/1  1/1
    1002 2/2  3/3  1/1
    1003 4/4  3/3  2/2

I want to convert the dataset to the following format:

    ID   snp1 snp2 snp3
    1001 00   AA   AA
    1002 GG

Determining variance components of classed covariates

2009 Jan 12

Hi - I am interested in solving variance components for the data below with respect to the response variable, Expression, within R. However, the covariates aren't independent, and they also have a class (the total variance explained by the covariates in each class is what I am most interested in). Very naively, I have tried to look at each individual covariate's variance like this >

reposTools

2007 Jan 28

Dear List, I tested the example in the reposTools vignette: library(reposTools); Loading required package: tools genRepos("Test Repository", "http://biowww.dfci.harvard.edu/~jgentry/","newRepos"); Error in rep.int(colnames(x), nr) : unimplemented type 'NULL' in 'rep' Could someone help me out with this one? I'd appreciate all help.... I am

strsplit for multiple columns

2009 Jun 03

Hi, I am trying to split multiple columns. One column works just fine, but I want to do it for multiple columns. Example:

    > a
            ID V2 V3 V4 V5 V6 V7 V8 V9 V10
    1 PBBA0644 -- GG AA -- AA -- AA GG  GG
    2 PBBA1010 -- GG AA -- AA -- AA GG  GG
    3 0127ATPR -- GG AA -- AA -- AA GG  GG
    4 0128EHAB -- GG AA -- AG -- AA AG  GG
    5 PBBA0829 -- GG AA -- AA -- AA GG  AG

Discrete choice model maximum likelihood estimation

2012 May 13

Hello, I am new to R and I am trying to estimate a discrete model with three choices. I am stuck at a point and cannot find a solution. I have probability functions for the occurrence of these choices; I then build the likelihood functions associated with these choices, and finally I build the general log-likelihood function. There are four parameters in the model, three of them are associated with