thr3ads.net - R help - [R] Random resampling of columns in species association matrices [May 2012]

mariasve

2012-May-09 14:34 UTC

[R] Random resampling of columns in species association matrices

I have a host-parasite association matrix in which parasite species are rows
and host species columns and cells contain the frequency of interactions.
Some parasites are associated with many hosts, and some hosts harbor several
parasites, and I want to repeatedly select a single representative host
per "generalized" (multi-host) parasite to create a new matrix in which no
hosts are repeated. That is, I want multiple randomly generated symmetric
matrices in which each host and each parasite species appears only once.
Furthermore, I want to weight the probability of selecting a particular host
for a parasite by the frequency of interactions between the two. Finally, a
handful of parasites associate with only one single host. I do not want to
lose these from the matrix, but rather fix these associations and only
randomly select hosts for the generalized parasite species.

My goal is to eventually perform generalized least squares regressions
between a parasite trait and several host traits, but the first major hurdle
for me to get over is how to randomly select only one host per parasite with
no repetition of species in the matrix. I am also generally interested in
how to resample columns instead of rows (in the package boot, for instance)
because of another analysis I'm working on, and I have been unable to find a
solution to this when searching the R help site.
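
On the side question of resampling columns rather than rows, a minimal base R sketch: draw column indices with replacement and subset with them. The matrix `m` is a made-up toy example, not data from this thread. Because the columns of `m` are the rows of `t(m)`, the same idea should carry over to row-based tools by passing them the transposed matrix.

```r
# Hypothetical toy matrix: 3 rows, 5 named columns
set.seed(1)
m <- matrix(1:15, nrow = 3,
            dimnames = list(paste0("r", 1:3), paste0("c", 1:5)))

# Bootstrap the columns: draw column indices with replacement
idx <- sample(ncol(m), replace = TRUE)
m_boot <- m[, idx, drop = FALSE]

# Same resample expressed on rows of the transposed matrix
m_boot_t <- t(m)[idx, , drop = FALSE]
```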

Any suggestions would be most welcome.

Maria

--
View this message in context:
http://r.789695.n4.nabble.com/Random-resampling-of-columns-in-species-association-matrices-tp4620618.html
Sent from the R help mailing list archive at Nabble.com.

David L Carlson

2012-May-09 16:01 UTC

[R] Random resampling of columns in species association matrices

Sample data would make it possible to explore the options in more detail,
but here are two possibilities:

1. Convert each row of the matrix to row proportions and then take the
cumulative sum. Now draw a random uniform number between 0 and 1 and find
the first column whose cumulative sum is larger than the random number. That
column is your randomly selected host. If a parasite has only one host, the
cumulative sums will be zero until you reach that column and then jump to 1,
so you will always select that host.

2. For each parasite, create a vector of host names with each host repeated
by the number of interactions with that host. Use sample() to randomly draw
a host. You'll probably want to combine the vectors into a list to automate
the process over all parasites.
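
Both possibilities can be sketched in a few lines of R for a single parasite. This is only an illustration: the function names `pick_cumsum` and `pick_rep` are made up here, and `counts` is an illustrative vector of interaction counts per host. The last line shows that `sample()` with its `prob` argument should give the same weighting without building the repeated vector.

```r
# Interaction counts for one parasite across hosts (illustrative values)
counts <- c(AUTINF = 0, GLYSPI = 2, FORCOL = 0, HYLPOE = 0,
            HYLNAE = 4, MYRMYO = 5, THAARD = 8)

# Possibility 1: cumulative proportions plus one uniform random number
pick_cumsum <- function(counts) {
  cum <- cumsum(counts) / sum(counts)  # last value is exactly 1
  u <- runif(1)                        # uniform draw between 0 and 1
  names(counts)[which(cum > u)[1]]     # first host whose cumsum exceeds u
}

# Possibility 2: repeat each host name by its count, then sample one name
pick_rep <- function(counts) {
  pool <- rep(names(counts), times = counts)
  sample(pool, 1)
}

set.seed(123)
pick_cumsum(counts)
pick_rep(counts)

# Shortcut with the same weighting: pass the counts as sampling weights
sample(names(counts), 1, prob = counts)
```

Hosts with a count of zero can never be drawn by either function, and a parasite with a single host always gets that host.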

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

mariasve

2012-May-10 13:46 UTC

[R] Random resampling of columns in species association matrices

Hi David,

Thank you for your suggestions. I am quite the beginner at R and don't
understand how to actually implement your suggestions, so I am hoping for
some further advice on that, if possible.

This is a subset of my data. Rows are host species, and columns parasite
species. Three of the parasites are generalists, but P4L is a strict
specialist on FORCOL (27 individuals have this parasite).

        H17L  P25L  P41L  P4L
AUTINF    39     0     0    0
GLYSPI    16     2    15    0
FORCOL     1     0     0   27
HYLPOE     3     0     2    0
HYLNAE     1     4     2    0
MYRMYO     2     5     2    0
THAARD     0     8     0    0

This is a list of host trait values for each of the hosts:
        abundance  weight  survival
AUTINF        488    38        0.48
GLYSPI        827    14.1      0.59
FORCOL        156    44.3      0.55
HYLPOE        322    17.5      0.54
HYLNAE        309    14.5      0.73
MYRMYO        475    20.8      0.59
THAARD        429    18.4      0.67

And this is an estimate of host specificity of the parasites, incorporating
prevalence and phylogeny:

      Specificity
H17L         2.08
P25L         1.72
P41L         2.19
P4L          0

I want to determine whether specificity of the parasites relates to any of
the host traits. For this, I would like to do a multiple regression. To
avoid pseudoreplication, I want to include a host species only once in the
matrix. So, for H17L, I could pick any of the hosts (except THAARD),
etc., but once a host is picked for one parasite, it cannot be picked for
another. For example, if I pick GLYSPI for H17L, GLYSPI has to be removed as
a choice for P25L and P41L. Thus, I also have to randomize which parasite
has its host picked first. In all cases, I want to lock FORCOL and P4L, so
FORCOL will not be an option for H17L anymore. This last part I'm still
uncertain about; I might just randomly pick hosts for all parasites and then
risk losing the strict host species specialists from some matrices.
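
One way this selection could be sketched in R, using the association matrix above (hosts as rows, parasites as columns). The function name `pick_hosts` is hypothetical. It first locks each strict specialist to its only host, then visits the generalists in random order and draws one of the remaining hosts with probability proportional to the interaction counts. It assumes no two specialists share the same host, and it leaves `NA` for a parasite if none of its hosts is still available.

```r
hosts <- c("AUTINF", "GLYSPI", "FORCOL", "HYLPOE", "HYLNAE", "MYRMYO", "THAARD")
assoc <- matrix(
  c(39, 16,  1, 3, 1, 2, 0,   # H17L
     0,  2,  0, 0, 4, 5, 8,   # P25L
     0, 15,  0, 2, 2, 2, 0,   # P41L
     0,  0, 27, 0, 0, 0, 0),  # P4L
  nrow = 7, dimnames = list(hosts, c("H17L", "P25L", "P41L", "P4L")))

pick_hosts <- function(assoc) {
  picks <- setNames(rep(NA_character_, ncol(assoc)), colnames(assoc))
  n_hosts <- colSums(assoc > 0)

  # Lock strict specialists (exactly one host with a nonzero count)
  for (p in names(n_hosts)[n_hosts == 1]) {
    picks[p] <- rownames(assoc)[assoc[, p] > 0]
  }

  # Visit the generalists in random order
  for (p in sample(names(n_hosts)[n_hosts > 1])) {
    w <- assoc[, p]
    w[rownames(assoc) %in% picks] <- 0   # hosts already used are unavailable
    if (sum(w) == 0) next                # no host left: stays NA
    picks[p] <- sample(rownames(assoc), 1, prob = w)
  }
  picks
}

set.seed(2012)
pick_hosts(assoc)                         # one random selection
many <- replicate(1000, pick_hosts(assoc))  # 4 x 1000 character matrix
```

Each column of `many` is one random host-parasite selection with no host repeated.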

If I make 2 random selections I might end up with:
      Random1  Random2
H17L  AUTINF   GLYSPI
P25L  GLYSPI   HYLNAE
P41L  HYLPOE   MYRMYO
P4L   FORCOL   FORCOL

For the first random table I would then do a multiple regression on the
dependent specificity variable and independent host trait values:
Specificity  abundance  weight  survival
       2.08        488    38        0.48
       1.72        827    14.1      0.59
       2.19        322    17.5      0.54
       0           156    44.3      0.55
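
A possible way to link a selection to the actual values is to index the host trait table by the picked host names and bind the parasite specificity alongside. The sketch below uses the values posted in this message; `build_table` is a hypothetical helper name. The `lm()` call is only a stand-in for the eventual regression, and it uses a single predictor because four rows are too few to fit all three host traits at once.

```r
traits <- data.frame(
  abundance = c(488, 827, 156, 322, 309, 475, 429),
  weight    = c(38, 14.1, 44.3, 17.5, 14.5, 20.8, 18.4),
  survival  = c(0.48, 0.59, 0.55, 0.54, 0.73, 0.59, 0.67),
  row.names = c("AUTINF", "GLYSPI", "FORCOL", "HYLPOE",
                "HYLNAE", "MYRMYO", "THAARD"))
specificity <- c(H17L = 2.08, P25L = 1.72, P41L = 2.19, P4L = 0)

# The two example selections (parasite -> picked host)
selections <- list(
  Random1 = c(H17L = "AUTINF", P25L = "GLYSPI", P41L = "HYLPOE", P4L = "FORCOL"),
  Random2 = c(H17L = "GLYSPI", P25L = "HYLNAE", P41L = "MYRMYO", P4L = "FORCOL"))

# Look up host traits by host name and attach each parasite's specificity
build_table <- function(picks) {
  out <- cbind(Specificity = specificity[names(picks)],
               traits[unname(picks), ])
  rownames(out) <- names(picks)
  out
}

tables <- lapply(selections, build_table)
tables$Random1

# Illustrative single-predictor fit on one table
fit <- lm(Specificity ~ weight, data = tables$Random1)
coef(fit)
```

With many selections stored in a list, the same `lapply()` pattern would give one table, and then one fitted model, per selection.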

If I generate 1000 randomly selected host-parasite combinations, I would
have 1000 such tables, on which I would have to run 1000 independent
regressions. Since I'm using model selection and multimodel inference to
estimate parameter values, I will end up doing the model selection 1000
times.

Your second suggestion makes most sense to me, but I don't understand how to
implement it. Would you (or someone else) please give me some advice on
that? Also, once I have the 1000 random host-parasite matrices, how do I
link these to the tables of actual values (host traits and parasite
specificity)?

Thanks so much!
Maria

--
View this message in context:
http://r.789695.n4.nabble.com/Random-resampling-of-columns-in-species-association-matrices-tp4620618p4623563.html
Sent from the R help mailing list archive at Nabble.com.
