thr3ads.net - search: "fasta"

2017 Jun 17

3

write.dna command

Hi all, I am learning R by "doing", and this is my first post. I want to use R to (1) fetch a DNA sequence from a databank (see below) and (2) store it as a FASTA file. The problem: no error is raised, yet the FASTA file is not created. Testing the code (see below), I notice that everything works until the "write.dna" command, which does not create the FASTA file. Here is my code: ####Get gene sequence from GenBank and store it as fasta...

write.dna command

2017 Jun 17

0

...se excuse my brevity. On June 17, 2017 7:26:42 AM PDT, Mogjib Salek <mogjibs at gmail.com> wrote:
> Hi all,
>
> I am learning R by "doing". And this is my first post.
>
> I want to use R: 1- to fetch a DNA sequence from a databank (see below) and 2- store it as FASTA file.
>
> The problem: neither an error is prompted nor the fasta file is created. Testing the code (see below), I notice that everything works until the "write.dna" command - which is not creating the fasta file.
>
> Here is my code:
>
> ####Get gene seq...

DO NOT REPLY [Bug 5963] New: rsync fails with errors that make no sense...

2008 Dec 11

3

...nux) system. We're transferring from one directory tree to another with a script that looks like:

    for i in /data2/* /data3/*
    do
        /usr/local/bin/rsync -a --delete ${i}/ /data1/`basename $i`
    done

For some files/dirs, rsync is failing:

    rsync: rename "/data1/bordner/Pfam/version_22.0/fasta/.PF09720.fa.8z0Imb" -> "Pfam/version_22.0/fasta/PF09720.fa": File too large (27)
    rsync: rename "/data1/bordner/Pfam/version_22.0/fasta/.PF09721.fa.0OoTTG" -> "Pfam/version_22.0/fasta/PF09721.fa": File too large (27)
    rsync: rename "/data1/bordner/Pfam/ve...

rho stat from a fasta sequence file

2012 Jan 16

1

Hi all, I have a sequence file (FASTA format) and want to calculate the rho statistic for dinucleotide abundance on my data. The code I use (with the seqinr library and the current working directory) is: seq_info<-read.fasta("gene.txt") rho(seq_info[1],2) but it yields only the dinucleotides, not their rho values,...

Memory leak with character arrays?

2007 Jan 17

4

...havior on Mac OS X, Linux for AMD_64 and X86_64, and the R versions are 2.4, 2.4 and 2.2, respectively. So, it would seem that this is platform and R version independent. The file that I'm reading contains the upstream regions of the yeast genome, with each upstream region labeled using a FASTA header, i.e.: FASTA header for gene 1 upstream region..... ..... .... FASTA header for gene 2 upstream.... .... The script I use - code below - opens the file, parses for a FASTA header, and then parses the header for the gene name. Once this is done, it reads the f...

How learn a probabilities matrix from a large fasta file in R?

2005 Jul 27

0

Hi everybody, I have a large FASTA file (15M) which contains a lot of DNA sequences in FASTA format, and I want to get the probability matrix for a 2nd-order Markov chain from this background. There is a web tool, http://tandem.bu.edu/markov.html, but I cannot upload a large file to it. I wonder if there is an R package that can do this. Thanks in...

write.fasta (seqinr package)

2009 Jan 22

0

Hi, I would like to use 'write.fasta(sequences, names, nbchar = 60, file.out, open = "w")' to convert a DNA sequence in a text file to FASTA format. How do I read the text file to prepare the argument 'sequences' of the function? The DNA sequence in the text file is one line, as below: ATCACACAACGACACTCACCCTGG...
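
For a one-line text file like the one described above, one possible approach is sketched here in base R. It is a stand-in that does not call seqinr's write.fasta(); the helper name write_fasta_base, the sequence name, and the temporary file names are assumptions made up for illustration, and the 60-character line width simply mirrors nbchar = 60 from the question.

```r
# Hypothetical helper (not part of seqinr): write one sequence as FASTA using base R only.
write_fasta_base <- function(sequence, name, file_out, width = 60L) {
  stopifnot(nchar(sequence) > 0L)
  # Start position of each output line: 1, 61, 121, ...
  starts <- seq(1L, nchar(sequence), by = width)
  # substring() truncates the last chunk at the end of the sequence.
  chunks <- substring(sequence, starts, starts + width - 1L)
  writeLines(c(paste0(">", name), chunks), file_out)
}

# Usage with made-up file names: the input holds the sequence on a single line.
in_file <- tempfile(fileext = ".txt")
writeLines("ATCACACAACGACACTCACCCTGG", in_file)  # only the bases visible in the post
dna <- readLines(in_file, n = 1L)
out_file <- tempfile(fileext = ".fasta")
write_fasta_base(dna, "my_sequence", out_file)
```

The same character string read with readLines() could presumably also be handed to write.fasta() as its 'sequences' argument, but that is not verified here.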

FASTA annot issue

2012 Sep 19

0

I am trying to pull a subset from a large group of FASTA sequences. I need to pull them based on the annot and write them out with write.fasta. I have my subset annot titles in a .csv. What is the way to go about this? I tried pulling the sequences from a .csv, but then MEGA 5 was not happy when I tried to put them back, and I need to use MEGA to keep my data uniform. A...

AMOVA error: 'bin' must be numeric or a factor

2012 Feb 11

1

Hi! I am trying to analyse my data using amova (http://www.oga-lab.net/RGM2/func.php?rd_id=pegas:amova). My input to R is a DNA sequence file, format=fasta:
dna<- read.dna("XX.fasta", format="fasta") #left other options as default
d<- dist.dna(dna, model="raw")
g<- read.table("XXX.design")
Load necessary libraries:
library(pegas)
Loading required package: adegenet
Loading required package...

seqinr updated : release 1.0-5

2006 Jul 25

0

...or uco() when computing RSCU on sequences where an amino-acid is missing. There is now a new argument NA.rscu that allows the user to force the missing values to his favorite magic value. http://pbil.univ-lyon1.fr/software/SeqinR/SEQINR_CRAN/DOC/html/uco.html
o There was a bug in read.fasta(): some sequence names were truncated; this is now fixed (thanks to Marcus G. Daniels for pointing this out). In order to be more consistent with standard functions such as read.table() or scan(), the file argument now starts with a lower case letter (i.e. "file") in function re...

How to find data that includes certain values

2011 Jan 21

3

I am trying to return an index for a data set by searching using filenames. The name may be ANG_AUT.N.0734C70411A-1_1sA_0734C70411A.fasta, but I'd just like to search for it using the term "0734C70411", as the file may be 0734C70411A or 0734C70411C or 0734C70411D. Is there any way to do this other than something like the following, where 0734C70411A is part of matrix list[,8]? samp=paste("ANG_AUT.N.",list[i,8],"-1_1sA_&...

get compressed data via a socket connection

2006 Nov 08

1

...o","k=btg@") # do a query (it means: give me the sequences associated with the keywords beginning with btg, save the list in toto)
> socket <- banknameSocket$socket # here I save the socket in "socket"
> request="extractseqs&lrank=2&format=\"fasta\"&operation=\"simple\"&zlib=F" # writing a request
> # (it means: give me the sequence information of the list of rank 2 (toto) in fasta format, not compressed)
> writeLines(request, socket, sep = "\n") # put the request into the socket
> seq &l...

R

2011 Jul 28

3

Good afternoon. I am a master's student at the University of Porto in Portugal. I am just starting to use R, so I have some questions. The aim of my analysis is to calculate a pairwise FST matrix from a FASTA file and create a principal component analysis with the adegenet package (I use the seqinr and ape packages to read this file, then I convert it into a genind object with the DNA2genind function provided in the adegenet package). After converting my file, the pairwise.fst function is not found. If you could help...

New version of seqinR released

2007 Dec 12

0

...or instance with a sequence like NNNNNNNNNNNN). The argument oldGC is now deprecated and a warning is issued. Functions GC1(), GC2(), GC3() are now simple wrappers for the more general GCpos() function. The new argument frame allows the frame to be taken into account for CDS.
o Function read.fasta() now supports comment lines starting with a semicolon character in FASTA files. An example of such a file is provided in sequences/legacy.fasta. The argument File is now deprecated. There is a new argument seqonly to import just the sequences without names, annotations and coercion attempt...

How to emulate perl style of reading files ?

2008 Jul 03

2

I tried the following; obviously it didn't work. I hope you get my point: how do I do this in R? My objective is to read a large FASTA file (without storing the entire data in memory) and compute some sequence composition statistics. while(a <- readLines("test1") != EOF) print(a)
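
The Perl-style loop asked about above can be approximated in R by opening a connection and calling readLines() with a fixed n until it returns nothing. The following is only a minimal sketch under some assumptions: the helper name count_bases and the chunk size are made up for illustration, header lines are taken to start with ">", and only A, C, G and T are counted (other symbols such as N are ignored).

```r
# Hypothetical helper: count nucleotides in a FASTA file chunk by chunk,
# so that only chunk_size lines are held in memory at any time.
count_bases <- function(path, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  counts <- c(A = 0, C = 0, G = 0, T = 0)
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break                 # end of file reached
    seq_lines <- lines[!startsWith(lines, ">")]    # drop FASTA header lines
    chars <- strsplit(toupper(paste(seq_lines, collapse = "")), "", fixed = TRUE)[[1]]
    # Tabulate against fixed levels so every chunk yields A, C, G, T in order.
    tab <- table(factor(chars, levels = names(counts)))
    counts <- counts + as.vector(tab)
  }
  counts
}
```

Because the connection stays open between calls, each readLines() call continues where the previous one stopped, which is what replaces the EOF test in the original attempt.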

how to load only lines that start with a particular symbol

2009 Sep 15

3

Dear all, I have DNA sequence data which are FASTA-formatted as
>gene A;.....
AAAAACCCC
TTTTTGGGG
CCCTTTTTT
>gene B;....
CCCCCAAAA
GGGGGTTTT
I want to load only the lines that start with ">" where the annotation information for the gene is contained. In principle, I can remove the sequences before loading or after loading all t...
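
A minimal base R sketch of keeping only the ">" lines follows. The helper name read_fasta_headers is hypothetical, and note that this version still reads every line before discarding the sequence lines, so for very large files it would need to be combined with chunked reading from a connection.

```r
# Hypothetical helper: return the annotation (header) lines of a FASTA file,
# with the leading ">" removed.
read_fasta_headers <- function(path) {
  lines <- readLines(path)
  headers <- lines[startsWith(lines, ">")]  # keep only lines beginning with ">"
  sub("^>", "", headers)
}

# Usage on a small file shaped like the example in the post.
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(">gene A;.....", "AAAAACCCC", "TTTTTGGGG", "CCCTTTTTT",
             ">gene B;....", "CCCCCAAAA", "GGGGGTTTT"), fasta_file)
headers <- read_fasta_headers(fasta_file)
```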

motif search

2008 Dec 09

2

Hi, I am very new to R and wanted to know if there is a package that, given very long nucleotide sequences, searches for and identifies short (7-10 nt) motifs. I would like to look for enrichment of certain motifs in genomic sequences. I tried using MEME (not an R package, I know), but the online version only allows sequences up to a maximum of 60000 nucleotides, and that's too short for my needs.
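
For locating a known motif (as opposed to discovering motifs de novo, which is what MEME does), a base R sketch is possible with gregexpr(). The helper name find_motif is an assumption for illustration; the motif is treated as a regular expression, and the lookahead form is used so that overlapping occurrences are reported too.

```r
# Hypothetical helper: 1-based start positions of every (possibly overlapping)
# occurrence of `motif` in `sequence`. The motif is interpreted as a regex.
find_motif <- function(sequence, motif) {
  # A zero-width lookahead matches at each position where the motif begins.
  hits <- gregexpr(paste0("(?=", motif, ")"), sequence, perl = TRUE)[[1]]
  pos <- as.integer(hits)
  pos[pos > 0L]  # gregexpr() reports -1 when there is no match
}

# Usage on a toy sequence; length(positions) gives the occurrence count.
positions <- find_motif("ATATATGCATAT", "ATAT")
```

Counting occurrences this way gives raw frequencies only; judging enrichment would still need a background model, which this sketch does not attempt.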

How can I avoid a for-loop through sapply or lapply ?

2009 Sep 29

4

By converting a miRNA file from FASTA to character format, I get a vector which looks like the following:
> nml
[1] "hsa-let-7a MIMAT0000062 Homo sapiens let-7a"
[2] "hsa-let-7b MIMAT0000063 Homo sapiens let-7b"
[3] "hsa-let-7c MIMAT0000064 Homo sapiens let-7c"...
