thr3ads.net - R devel - [Rd] pairwiseAlignment Improvements [Apr 2017]

Dario Strbenac

2017-Apr-28 06:00 UTC

[Rd] pairwiseAlignment Improvements

Good day,

The location of indels can be retrieved from a PairwiseAlignmentsSingleSubject
object by using indel. Determining every difference between the two sequences,
including substitutions, is neither quick nor easy. I suppose that summary
displays details of the mismatches, but the object it returns is of class
PairwiseAlignmentsSingleSubjectSummary, which has no documented accessors. So,
the code to access the information has to reach into the slots directly and
looks bad.

> summaryAlign@mismatchSummary[["subject"]]
  SubjectPosition Subject Pattern Count Probability
1               2       T       A     1           1
2               3       T       A     1           1

This could be improved with accessors for end users.
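
As a rough sketch of what such an accessor might look like: the class name MockSummary and the function name subjectMismatches below are made up for illustration and are not part of Biostrings. Only the slot name mismatchSummary and its "subject" element are taken from the code above.

```r
library(methods)

# Hypothetical stand-in for PairwiseAlignmentsSingleSubjectSummary; only the
# slot name mismatchSummary is taken from the real object shown above.
setClass("MockSummary", representation(mismatchSummary = "list"))

# Hypothetical accessor; Biostrings does not export a function with this name.
setGeneric("subjectMismatches",
           function(x) standardGeneric("subjectMismatches"))
setMethod("subjectMismatches", "MockSummary",
          function(x) x@mismatchSummary[["subject"]])

# Rebuild the table printed above and wrap it in the mock object.
mismatches <- data.frame(SubjectPosition = c(2L, 3L),
                         Subject = c("T", "T"),
                         Pattern = c("A", "A"),
                         Count = c(1L, 1L),
                         Probability = c(1, 1))
summaryAlign <- new("MockSummary",
                    mismatchSummary = list(subject = mismatches))

# End users call the accessor instead of touching slots.
subjectMismatches(summaryAlign)
```

With something along these lines, user code would not depend on the internal slot layout, which could then change without breaking anyone.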

Also, instead of being a data.frame, this would be better stored as IRanges with
associated metadata columns, accessible with mcols, so that methods like reduce
could easily be used to look for contiguous blocks of differences.
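
To illustrate, here is a minimal sketch that builds such an object by hand from the two mismatches listed above. The variable names are my own, and the positions are typed in rather than extracted from a real alignment.

```r
library(IRanges)
library(S4Vectors)  # provides DataFrame(); normally attached by IRanges already

# Mismatch positions copied from the table above; one width-1 range each.
positions <- c(2L, 3L)
mismatchRanges <- IRanges(start = positions, end = positions)

# The remaining columns of the table travel along as metadata columns.
mcols(mismatchRanges) <- DataFrame(Subject = c("T", "T"),
                                   Pattern = c("A", "A"),
                                   Count = c(1L, 1L),
                                   Probability = c(1, 1))

# reduce() merges overlapping and adjacent ranges (and drops the metadata
# columns), leaving one range per contiguous block of differences.
blocks <- reduce(mismatchRanges)
blocks
```

Here the two adjacent mismatches at positions 2 and 3 collapse into a single block spanning 2 to 3.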

Is there a reason why the show method for the summary only shows mismatches,
even if there are indels contained in it? This seems arbitrary and also
misleading, because it always gives a false impression that there are no indels.

Could the return data types consistently be made to be IRanges? Sometimes
it's IntegerList, sometimes it's IRangesList. For example,

> A
  11-letter "DNAString" instance
seq: GAACGAGGACC
> B
  8-letter "DNAString" instance
seq: GGACGAGC
> alignment <- pairwiseAlignment(A, B, gapOpening = 0, gapExtension = 1,
                                 substitutionMatrix = substitutions)
> alignment@subject@mismatch
IntegerList of length 1
[[1]] 2
> alignment@subject@indel
IRangesList of length 1
[[1]]
[[1]]
IRanges of length 1
    start end width
[1]     8   9     2
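
For what it's worth, the conversion is short if done by hand. The sketch below uses a hand-built IntegerList as a stand-in for the mismatch slot shown above, so it does not need an actual alignment. The variable names are my own.

```r
library(IRanges)

# Stand-in for alignment@subject@mismatch from the session above.
mismatchPositions <- IntegerList(2L)

# Turn each vector of positions into width-1 ranges, so that mismatches and
# indels would share the IRangesList type.
mismatchRanges <- do.call(
  IRangesList,
  lapply(as.list(mismatchPositions),
         function(pos) IRanges(start = pos, end = pos))
)
mismatchRanges
```

Having both slots return the same type would let one piece of downstream code handle mismatches and indels alike.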

Lastly, why are functions like insertion, deletion, and indel documented in
Numeric Summary Methods? Unlike nchar and score, they are not numerical
summaries of the data.

It'd be nice to see this part of Biostrings thoroughly refactored with more
focus on UX.

--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia

Dario Strbenac

2017-Apr-28 06:05 UTC

[Rd] pairwiseAlignment Improvements

I have re-sent this to the Bioconductor development mailing list, so it may be
deleted from this archive if that is possible.
