thr3ads.net - R help - [R] Temporal clustering of factors [Aug 2013]

Chris McOwen

2013-Aug-15 09:20 UTC

[R] Temporal clustering of factors

Dear list,

I have 50 sites where information was recorded over a 45-year period. Each
record takes one of four forms: Fishing effort, Environmental, Both, or
Inconclusive.

I am aiming to cluster the sites based on their similarity through time. I view
this as similar to building a phylogeny, where instead of a genetic sequence I
have a sequence of factors.
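
One way to make the phylogeny analogy concrete is to treat each site's history as a string with one character per year and compare sites with an edit distance, much as sequences are compared before a tree is built. The sketch below is only an illustration under assumptions: the toy sequences, the single-letter codes (E, F, B) and the names `make_seq`, `site_seqs`, `edit_d` and `site_tree` are made up here, and base R's `adist()` (generalized Levenshtein distance) stands in for whatever sequence distance suits the real data.

```r
# Hypothetical toy sequences, one character per year (46 years):
# E = Environmental, F = Fishing Effort, B = Both.
# The run lengths are illustrative assumptions, not the full data set.
make_seq <- function(states, times) paste(rep(states, times), collapse = "")
site_seqs <- c(
  A = make_seq(c("E", "F", "B", "F"), c(12, 8, 10, 16)),
  B = make_seq(c("E", "F"),           c(37, 9)),
  C = make_seq(c("E", "F"),           c(10, 36)),
  D = make_seq(c("F", "E", "B", "F"), c(6, 15, 9, 16))
)
stopifnot(all(nchar(site_seqs) == 46))

# Edit distance between every pair of sites. adist() counts insertions,
# deletions and substitutions, so the same pattern shifted by a few
# years costs only a few edits.
edit_d <- adist(site_seqs)
dimnames(edit_d) <- list(names(site_seqs), names(site_seqs))

# Average-linkage tree built on the edit distances.
site_tree <- hclust(as.dist(edit_d), method = "average")
```

Calling `plot(site_tree)` would draw the resulting dendrogram. Unit costs for all edits are an arbitrary choice here; `adist()` accepts a `costs` argument if insertions, deletions and substitutions should be weighted differently.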

I was thinking of using Gower distance to create a dissimilarity matrix and
going from there, but I don't think this captures what I am looking for.
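
For purely nominal data with no missing values, Gower dissimilarity reduces to simple matching: the share of years in which two sites carry different labels. The following base R sketch, on hypothetical toy data (the run lengths and the names `runs`, `site_year`, `simple_matching`, `gower_d` and `shuffled_d` are assumptions for illustration), computes that measure and shows that it is unchanged when the year columns are shuffled, i.e. it compares sites year by year and ignores the order of the years.

```r
# Hypothetical toy data: 4 sites x 46 years, one label per site-year.
# Run lengths are illustrative assumptions, not the full data set.
lev <- c("Both", "Environmental", "Fishing Effort")
runs <- list(
  A = rep(lev[c(2, 3, 1, 3)], times = c(12, 8, 10, 16)),
  B = rep(lev[c(2, 3)],       times = c(37, 9)),
  C = rep(lev[c(2, 3)],       times = c(10, 36)),
  D = rep(lev[c(3, 2, 1, 3)], times = c(6, 15, 9, 16))
)

# Site x year matrix of labels (rows = sites, columns = years).
site_year <- do.call(rbind, runs)
colnames(site_year) <- 1959:2004

# Simple matching dissimilarity: proportion of years with differing labels.
# With only nominal columns and no NAs, Gower distance gives the same values
# (e.g. from cluster::daisy(..., metric = "gower") on factor columns).
simple_matching <- function(m) {
  n <- nrow(m)
  d <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) d[i, j] <- mean(m[i, ] != m[j, ])
  }
  d
}
gower_d <- simple_matching(site_year)
site_tree <- hclust(as.dist(gower_d), method = "average")

# Year order is ignored: permuting the columns leaves the distances unchanged.
set.seed(1)
shuffled_d <- simple_matching(site_year[, sample(ncol(site_year))])
```

Here `all.equal(gower_d, shuffled_d)` holds, which may be one reason the measure does not feel like it captures similarity "through time".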

Any suggestions would be gratefully received.


To save space, I have restricted the sample data to 4 sites:

temporal_sites <- structure(list(Year = c(1959L, 1960L, 1961L, 1962L, 1963L,
1964L,
1965L, 1966L, 1967L, 1968L, 1969L, 1970L, 1971L, 1972L, 1973L,
1974L, 1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L,
1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L,
1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L,
2001L, 2002L, 2003L, 2004L, 1959L, 1960L, 1961L, 1962L, 1963L,
1964L, 1965L, 1966L, 1967L, 1968L, 1969L, 1970L, 1971L, 1972L,
1973L, 1974L, 1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L,
1982L, 1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L,
1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L,
2000L, 2001L, 2002L, 2003L, 2004L, 1959L, 1960L, 1961L, 1962L,
1963L, 1964L, 1965L, 1966L, 1967L, 1968L, 1969L, 1970L, 1971L,
1972L, 1973L, 1974L, 1975L, 1976L, 1977L, 1978L, 1979L, 1980L,
1981L, 1982L, 1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L,
1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L,
1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 1959L, 1960L, 1961L,
1962L, 1963L, 1964L, 1965L, 1966L, 1967L, 1968L, 1969L, 1970L,
1971L, 1972L, 1973L, 1974L, 1975L, 1976L, 1977L, 1978L, 1979L,
1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 1986L, 1987L, 1988L,
1989L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L,
1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L), Site = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("A", "B",
"C", "D"), class = "factor"),
    Factor = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
    2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
    2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Both",
    "Environmental", "Fishing Effort"), class = "factor")),
    .Names = c("Year", "Site", "Factor"),
    class = "data.frame", row.names = c(NA, -184L))





Bert Gunter

2013-Aug-15 14:35 UTC


[R] Temporal clustering of factors

Post to R-sig-ecology, not here.

-- Bert

On Thu, Aug 15, 2013 at 2:20 AM, Chris McOwen
<Chris.McOwen at unep-wcmc.org> wrote:
> Dear list,
>
> I have 50 sites where information was recorded over a 45 year time period.
> The recorded data could take one of four forms: Fishing effort, Environmental,
> Both or Inconclusive.
>
> What i am aiming to do is cluster sites based on their similarity through
> time, essentially i view this as being similar to making a phylogeny, where
> instead of a genetic sequence i have a sequence of factors.
>
> I was thinking of using Gower distance to create a dissimilarity matrix and
> go from there but i don't think this captures what i am looking for?
>
> Any suggestions would be gratefully received.


-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
