Dear All,

I am not an expert on time series, but I have been given one to analyze. The series records the list of individuals in contact with a given individual at each time t_i, where the ID of every individual is an integer. (Let us not worry for now about what "being in contact" means in this context, since it does not matter for the discussion.)

To fix ideas, consider the following, where I am tracking the contacts of the individual whose ID is 1000:

    c(1000,1100), c(1000,1100,1200), c(1000), c(NA), c(1000,1400)
    t_1,          t_2,               t_3,     t_4,   t_5

That is, at time t_1 individual 1000 is in contact with individual 1100; at time t_2 he is also in contact with individual 1200; at time t_3 he is by himself (represented as the individual being in contact only with himself); at time t_4 I have no information about his state (missing data); and finally at time t_5 he is in contact with individual 1400.

How would you analyze this series? I do not have a single number at every time, so I cannot treat the series as the typical succession {x_i} at times {t_i}.

Replacing the list of individuals at time t_i with just the number of individuals in contact with individual 1000 at that time throws away valuable information (I can no longer distinguish the situation at time t_1 from that at time t_5). If I use a hash (like those provided by the digest package), I can squeeze every list at time t_i into a string, but again I lose information (e.g. I can no longer tell that there is considerable overlap between the situations at times t_1 and t_2).

Finally, I would like to stress that, strictly speaking, I do not have a vector at every time t_i: I do not have an object I can vary continuously (individual 1000 either is in contact with individual 1100 or he is not), and on top of that I do not have an obvious or uniquely defined notion of distance between the state of the series at t_i and the one at t_j.

Any suggestions are appreciated.

Many thanks,

Lorenzo
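
P.S. For concreteness, the example above could be held in R as follows. This is only a minimal sketch, assuming a plain named list of integer vectors; the object names `contacts` and `n_contacts` are made up for illustration. It shows the information loss mentioned above when each list is replaced by its size.

```r
# Hypothetical object names; the contact lists are the ones from the example.
contacts <- list(
  t_1 = c(1000L, 1100L),
  t_2 = c(1000L, 1100L, 1200L),
  t_3 = c(1000L),
  t_4 = NA_integer_,            # missing information
  t_5 = c(1000L, 1400L)
)

# Reduce each list to its size, keeping NA where the state is unknown.
n_contacts <- sapply(contacts, function(x) {
  if (all(is.na(x))) NA_integer_ else length(x)
})

# After the reduction, t_1 and t_5 can no longer be told apart ...
n_contacts[["t_1"]] == n_contacts[["t_5"]]   # TRUE

# ... although the underlying lists differ,
identical(contacts$t_1, contacts$t_5)        # FALSE

# and the overlap between t_1 and t_2 is only visible on the full lists.
intersect(contacts$t_1, contacts$t_2)        # 1000 1100
```

The size at each time here includes individual 1000 himself, so t_3 has size 1; subtracting one would give the number of other individuals instead.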