and see duplicate files...
http://cluster.biodiversitylibrary.org/n/naturalistslibra30jardrich/
P
On Wed, Oct 27, 2010 at 12:23 PM, phil cryer <phil at cryer.us> wrote:
> We're building our cluster of data, downloading book data from
> Internet Archive. I've come across one that looks like this:
> http://cluster.biodiversitylibrary.org/n/naturwissenschaft19deut/
>
> Almost all the files appear to be there twice, but have the same name,
> timestamp and inode! What could be causing this, and how can we fix
> it? At issue is space; it appears that we're using far more space than
> we should, and a `du -h` or `ls -lsh` both say this directory takes
> 3.9G when it should really be about 1/2 that. If it has done this on
> many of the directories, it could explain how we're using 78T of 97T
> of space already.
>
> P
> --
> http://philcryer.com
>
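For anyone wanting to check their own directories for the same symptom, here is a rough Python sketch, not a diagnosis of the cause. The function names `duplicate_entries` and `usage_by_inode` are made up for illustration. The first reports any name that the directory listing returns more than once; the second compares the size you get by counting every entry against the size you get by counting each (device, inode) pair only once. It assumes a POSIX system and only looks at regular files in a single directory.

```python
import os
import stat
from collections import Counter


def duplicate_entries(path):
    """Return {name: count} for names the directory listing reports more than once."""
    counts = Counter(os.listdir(path))
    return {name: n for name, n in counts.items() if n > 1}


def usage_by_inode(path):
    """Return (naive_bytes, unique_bytes) for regular files directly in path.

    naive_bytes adds st_size once per directory entry.
    unique_bytes adds st_size once per (device, inode) pair, so a file
    that shows up under several entries is only counted once.
    """
    naive = 0
    seen = {}
    for name in os.listdir(path):
        st = os.lstat(os.path.join(path, name))
        if not stat.S_ISREG(st.st_mode):
            continue  # skip subdirectories, symlinks, devices
        naive += st.st_size
        seen[(st.st_dev, st.st_ino)] = st.st_size
    return naive, sum(seen.values())
```

If `duplicate_entries` comes back non-empty, the listing itself is repeating names. If the two numbers from `usage_by_inode` differ, the per-entry total is counting the same inode more than once, which would be consistent with the directory looking about twice as large as it should.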
--
http://philcryer.com