thr3ads.net - similar to: "subsetting a data frame"

Displaying 20 results from an estimated 30000 matches similar to: "subsetting a data frame"

2008 Feb 02

transforming one column into 2 columns

Hello I have a data frame and one of its columns is as follows: Col chr1:71310034 chr14:23354088 chr15:37759058 chr22:18262638 chrUn:31337214 chr10_random:4369261 chrUn:3545097 I would like to get rid of colon (:) and replace this column with two new columns containing the terms on each side of the colon. The new columns should look as follows: Col_a Col_b chr1
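A minimal sketch of one way to do this split in base R, using a few of the values quoted in the post. The data frame name `df` and the choice to store the second column as numeric are assumptions, since the post is cut off before the expected output is shown in full.

```r
# Sample values from the post; `df` is a hypothetical name.
df <- data.frame(
  Col = c("chr1:71310034", "chr14:23354088", "chr10_random:4369261"),
  stringsAsFactors = FALSE
)

# Split each entry on the literal colon; the result is a list of length-2 vectors.
parts <- strsplit(df$Col, ":", fixed = TRUE)

# Take the first and second piece of every entry.
df$Col_a <- vapply(parts, `[`, character(1), 1)
df$Col_b <- as.numeric(vapply(parts, `[`, character(1), 2))  # numeric is an assumption

# Drop the original combined column.
df$Col <- NULL
```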

counting sequence mismatches

2008 Feb 23

Hello I have 2 columns of short sequences that I would like to compare and count the number of mismatches and record the number of mismatches in a new column. The sequences are part of a data frame that looks like this: seq1=c("CGGTGTAGAGGAAAAAAAGGAAACAGGAGTTC","CGGTGGTCAGTCTGGGACCTGGGCAGCAGGCT", "CGGGCCTCTCGGCCTGCAGCCCCCAACAGCCA")

difference of two data frames

2008 Sep 14

Hello I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1: DF1= data.frame(V1=1:6, V2= letters[1:6]) DF2= data.frame(V1=1:3, V2= letters[1:3]) How do I create a new data frame of the difference between DF1 and DF2? newDF=data.frame(V1=4:6, V2= letters[4:6]) In my real data, the rows are not in order as in the example I provided. Thanks much Joseph
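One possible approach, sketched in base R with the data from the post: build a single text key per row and keep the rows of DF1 whose key does not occur in DF2. This does not depend on row order. The separator `"\r"` is an assumption; it only needs to be a character that never appears in the data.

```r
DF1 <- data.frame(V1 = 1:6, V2 = letters[1:6], stringsAsFactors = FALSE)
DF2 <- data.frame(V1 = 1:3, V2 = letters[1:3], stringsAsFactors = FALSE)

# Paste all columns of each row into one key string.
# The separator is assumed not to occur in any column value.
key1 <- do.call(paste, c(DF1, sep = "\r"))
key2 <- do.call(paste, c(DF2, sep = "\r"))

# Keep the rows of DF1 that have no matching row in DF2.
newDF <- DF1[!(key1 %in% key2), , drop = FALSE]
rownames(newDF) <- NULL
```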

inserting text lines in a data frame

2008 Feb 06

Hi Jim I am trying to prepare a bed file to load as a custom track on the UCSC genome browser. I have a data frame that looks like the one below. > x V1 V2 V3 1 chr1 11255 55 2 chr1 11320 29 3 chr1 11400 45 4 chr2 21680 35 5 chr2 21750 84 6 chr2 21820 29 7 chr2 31890 46 8 chr3 32100 29 9 chr3 52380 29 10 chr3 66450 46 I would like to insert the following 4 lines at the beginning:

help with gsub and date pattern

2009 May 21

Dear List, I am having a problem using gsub to remove dates from a date/time string. For example: x<-c("5/31/2009 12:34:00","6/1/2009 1:14:00") I would like to remove the date and have just the time. I have tried: gsub("[0-9+]/[0-9+]/[0-9+]","",x) and various versions. I think my problem is that the / is a special character and is telling it
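A hedged sketch of a pattern that handles the two example strings. In the attempt quoted above, `[0-9+]` is a character class that matches a single digit or a plus sign, so the quantifier likely needs to sit outside the brackets, as in `[0-9]+`. The forward slash is not special in R regular expressions. The variable name `times` is an assumption.

```r
x <- c("5/31/2009 12:34:00", "6/1/2009 1:14:00")

# Remove a leading month/day/year plus the spaces that follow it.
# "[0-9]+" means one or more digits; "^" anchors the match at the start.
times <- sub("^[0-9]+/[0-9]+/[0-9]+ +", "", x)
```

For real date-time work, parsing with `as.POSIXct()` and formatting with `format(..., "%H:%M:%S")` may be more robust than a regular expression.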

merging more than 2 data frames

2008 Feb 12

Hi, merge() takes only 2 data frames. What can you do to make it take more than two data frames? Or is there another function that does that? Thanks joseph
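A common idiom for this is to fold `merge()` over a list of data frames with `Reduce()`. The sketch below uses three made-up data frames (`df_a`, `df_b`, `df_c`) that share a hypothetical key column `id`; none of these names come from the post.

```r
# Hypothetical example data sharing the key column "id".
df_a <- data.frame(id = 1:3, x = c(10, 20, 30))
df_b <- data.frame(id = 2:4, y = c("b", "c", "d"))
df_c <- data.frame(id = 1:3, z = c(TRUE, FALSE, TRUE))

# Merge pairwise from left to right; merge() defaults to an inner join,
# so only ids present in every data frame survive.
merged <- Reduce(
  function(left, right) merge(left, right, by = "id"),
  list(df_a, df_b, df_c)
)
```

Passing `all = TRUE` to `merge()` inside the function would keep unmatched rows instead.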

data frame question

2008 Feb 10

Hello I have 2 data frames df1 and df2. I would like to create a new data frame new_df which will contain only the common rows based on the first 2 columns (chrN and start). The column score in the new data frame should be replaced with a column containing the average score (average_score) from df1 and df2. df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),

Subsetting a large number into smaller numbers and find the largest product

2013 Apr 18

Hello, I have a big number, let's say of around a hundred digits. I want to split that big number into consecutive 5-digit numbers and find the product of those 5 digits. For example, my first 5-digit number would be 73167. I need to check the product of the individual digits in 73167 and so on. The sample number is as follows:

64-bit R on Mac OS X 10.5.4

2008 Jul 27

Hi Matt Your method is the easiest way for me to install the 64-bit R. I followed the directions on your web site and then did the following: R --arch=x86_64 source("http://bioconductor.org/biocLite.R") biocLite(type = "source",lib = "/Library/Frameworks/R.framework/Versions/2.8/Resources/RLib64") I got many errors and warnings which I copied to the attached file.

naming components of a list

2008 May 25

Hi I have a character vector with thousands of names which looks like this: > V=c("Fred", "Mary", "SAM") > V [1] "Fred" "Mary" "SAM" > class(V) [1] "character" I would like to change it to a list: > L=as.list(V) > L [[1]] [1] "Fred" [[2]] [1] "Mary" [[3]] [1] "SAM" but I need to

Question regarding subsetting

2005 Jul 22

I run R 2.1.1 in a Linux environment (RedHat 9) although my question is not platform-specific. Consider the following: > A <- c("Prefix-aaa", "Prefix-bbb", "Prefix-ccc") > B <- strsplit(A, "-") > B [[1]] [1] "Prefix" "aaa" [[2]] [1] "Prefix" "bbb" [[3]] [1] "Prefix" "ccc" How

subsetting character vector into groups of numerics

2002 Oct 28

I'm sure there's a simple way to do this, but I can only think of complicated ones. I have a number of character vectors that look something like this: "12 78 23 9 76 43 2 15 41 81 92 5(92 12) (81 78 5 76 9 41) (23 2 15 43)" I wish to get it into a list of numerical vectors like this: $Group [1] 12 78 23 9 76 43 2 15 41 81 92 5 $Subgroup1 [1] 92 12 $Subgroup2 [1] 81 78 5

remove column names from a data frame

2008 Feb 18

I want to remove the column names from a data frame. I do it the long way; can anybody show me a better way? df= data.frame(chrN= c("chr1", "chr2", "chr3"), start= c(1, 2, 3), end= c(4, 5, 6), score= c(7, 8, 9)) df #I write a txt file without row or column names write.table(df,"df1.txt",sep='\t',quote=FALSE,row.names=F,col.names=F) #then I read it with the header = F

data.frame question

2010 Mar 07

hello can you show me how to create a data.frame from two factors x and y. column 1 should be equal to x and column 2 is 1 if it is common to y and 0 if it is not. x=factor(c("A","B","C","D","E","F","G")) y=factor(c("B","C","G")) the output should look like this: A 0 B 1 C 1 D 0 E
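A minimal sketch using `%in%`, which tests each element of `x` for membership in `y`. The result name `result` and the column name `in_y` are assumptions; the post does not name the second column.

```r
x <- factor(c("A", "B", "C", "D", "E", "F", "G"))
y <- factor(c("B", "C", "G"))

# x %in% y gives TRUE/FALSE per element of x; as.integer() turns that into 1/0.
result <- data.frame(x = x, in_y = as.integer(x %in% y))
```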

converting a data frame column from character to numeric

2008 Feb 08

I have a data.frame with all character columns. I would like to convert the last two columns into numeric. > x[1:5, ] chrN start end 1 chr1 71310034 71310064 2 chr14 23354088 23354118 3 chr14 71310034 71310064 4 chr15 37759058 37759088 5 chr22 18262638 18262668 > apply(x, 2, FUN = mode) chrN start end

Another subsetting enigma

2007 Feb 19

Hello again, I'm trying to do the following: subset(dataframe,list %in% strsplit(dataframe[[Field]],",")) But this always returns the complete dataframe, since the strsplit(dataframe[[Field]],",") is evaluated as one big list for the whole data frame rather than one list per row. How can I have this evaluated on a per row basis? After 1.5 h hitting head against wall -

counting identical data in a column

2008 Feb 04

Hi Peter I have the following data frame with chromosome name, start and end positions: chrN start end 1 chr1 11122333 11122633 2 chr1 11122333 11122633 3 chr3 11122333 11122633 8 chr3 111273334 111273634 7 chr2 12122334 12122634 4 chr1 21122377 21122677 5 chr2 33122355 33122655 6 chr2 33122355 33122655 I would like to count the positions that have the same start and

numbers as part of long character

2008 Jun 12

Hi, I'm looking for some way to pick up the numbers which are contained and buried in a long character. For example, outtree.new="(((B:1204.25,E:1204.25):7581.11,F:8785.36):8353.85,C:17139.21);" num.char =

str(data.frame) after subsetting reflects original structure, not subsetted structure?

2009 Jul 24

I find that after subsetting (you may prefer "conditional selection") a data frame and assigning it to a new object, the str(new object) reflects the original data frame, not the new one: A <- rnorm(20) B <- factor(rep(c("t", "g"), 10)) C <- factor(rep(c("h", "l"), 10)) D <- data.frame(A, B, C) str(D) # reports correctly E <-

Finding (swapped) repetitions of numbers pairs across two columns

2012 Dec 27

Hi, I've had this problem for a while and tackled it in a quite dirty way, so I'm wondering if a better solution exists: If we have two vectors: v1 = c(0,1,2,3,4) v2 = c(5,3,2,1,0) How to remove one instance of the "3,1" / "1,3" double? At the moment I'm using the following solution, which is quite horrible: v1 = c(0,1,2,3,4) v2 = c(5,3,2,1,0) ft <-
