This is among the things we need to do when a user leaves, and it's a larger question than it sounds. Our office has many servers, with a good number of fileservers for projects, with large filesystems (i.e. tens of TB). Can anyone think of a way *other* than running what's probably a many-hour-long find / -user on all our systems, which is really intensive, to find all the files owned by a given user?

Locate would be great, but from the man pages and what I can find online, it only stores filenames and paths.

mark
On Wed, 1 Aug 2018, mark wrote:

> Can anyone think of a way *other* than running what's probably a
> many-hour long find / -user on all our systems, which is really
> intensive, to find all the files owned by a given user?
>
> Locate would be great, but from the man pages and what I can find
> online, it only stores filenames and paths.

The only way I know is to keep an updated database of metadata, which may be a security vulnerability depending on its accessibility and the nature of your work. The Robinhood engine was written for this sort of purpose:

https://github.com/cea-hpc/robinhood/wiki

That said, we use Robinhood on a single Lustre filesystem. I don't know whether you can set up a central instance across several file servers or whether each filesystem would need its own engine.

-- 
Paul Heinlein
heinlein at madboa.com
45°38' N, 122°6' W
On 08/01/18 10:10, mark wrote:

> Can anyone think of a way *other* than running what's probably a
> many-hour long find / -user on all our systems, which is really intensive,
> to find all the files owned by a given user?
>
> Locate would be great, but from the man pages and what I can find online,
> it only stores filenames and paths.

If you want to be rigorous with the result (and I for one would), avoid locate: it uses a database which is updated how often? *hmm*, once a week.

find is the only command I will use for the task. I will definitely use -uid instead of -user, just in case I have already deleted the user on one of the boxes I am searching; the numeric user ID is what is actually stored in the file/directory attributes. I will also look for stuff owned by the user's individual group (a separate command with the -gid argument, as I may want to deal with these differently).

Just my $0.02

Valeri

-- 
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
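For anyone who would rather script the numeric-ID search than call find directly, here is a minimal Python sketch of the same idea. It is not the find invocation from the message above, just an os.walk equivalent: it compares the numeric owner and group of each entry and does not descend into other mounted filesystems (similar in spirit to find's -xdev). The function name find_owned and its parameters are made up for illustration.

```python
import os
from typing import Iterator, Optional


def _matches(st: os.stat_result, uid: Optional[int], gid: Optional[int]) -> bool:
    """True if the entry's numeric owner or group equals the requested ID."""
    return (uid is not None and st.st_uid == uid) or (
        gid is not None and st.st_gid == gid
    )


def find_owned(
    root: str, uid: Optional[int] = None, gid: Optional[int] = None
) -> Iterator[str]:
    """Yield paths under root (hypothetical helper) owned by uid or gid.

    Uses lstat, so symlinks are reported by their own ownership and
    never followed. Directories on another filesystem are reported
    but not entered.
    """
    root_st = os.lstat(root)
    if _matches(root_st, uid, gid):
        yield root
    for dirpath, dirnames, filenames in os.walk(root):
        keep = []
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # vanished or unreadable; skip it
            if _matches(st, uid, gid):
                yield path
            if st.st_dev == root_st.st_dev:
                keep.append(name)  # same filesystem: keep descending
        dirnames[:] = keep  # prune mount points and unreadable entries
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            if _matches(st, uid, gid):
                yield path
```

Calling find_owned("/srv/projects", uid=1234) and, separately, find_owned("/srv/projects", gid=1234) would keep the two result sets apart, as suggested above; the path and the IDs here are placeholders. This walks the whole tree just as find does, so it is no cheaper on large filesystems.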
Valeri Galtsev wrote:

> If you want to be rigorous with result (and I for one would), avoid
> locate: that one is using database which is updated how often? *hmm*,
> once a week.
>
> find is the only command I will use for the task (and I definitely will
> use -uid instead of -user [...]). I also will look for stuff owned by
> user's individual group (separate command with -gid argument, as I may
> want to deal with these differently).

Well, we do have to be rigorous, or we'll get dinged by security about it, and as a federal agency, there are laws. That being said, we're talking about every time someone leaves, and we can't just delete, since these were for projects: programs, scripts, data and results should *not* go away, but need to be given to someone else.

I found the other post, about Robinhood, interesting, but it doesn't seem to be in any of the standard repos.

I'm not worried about the locate database being updated only once a week (actually, I thought that happened every night), since we've got over a month to deal with it. One solution that hit me was to fork locate, add UID and GID, and update that weekly or so, which would work.

Still hoping for a package; I don't need to reinvent the wheel, and I can't believe no one's run into this before.
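The "locate plus UID and GID" idea could be prototyped without forking locate at all. Below is a rough Python sketch, under several assumptions: one SQLite file per host, rebuilt from scratch by a periodic job, readable only by its owner (given the earlier point about such a database being sensitive). The names build_index and owned_by, the files table, and the database path are all invented for this example; none of this is an existing package.

```python
import os
import sqlite3
from typing import Iterable, Iterator, List, Optional, Tuple


def _walk_rows(roots: Iterable[str]) -> Iterator[Tuple[bytes, int, int]]:
    """Yield (path, uid, gid) for each root and everything beneath it."""
    for root in roots:
        candidates = [root]
        for dirpath, dirnames, filenames in os.walk(root):
            candidates.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
            while candidates:
                path = candidates.pop()
                try:
                    st = os.lstat(path)
                except OSError:
                    continue  # vanished or unreadable; skip it
                # Store raw bytes so odd filenames cannot break the insert.
                yield (os.fsencode(path), st.st_uid, st.st_gid)


def build_index(db_path: str, roots: Iterable[str]) -> None:
    """Rebuild the ownership index (hypothetical schema) from scratch."""
    conn = sqlite3.connect(db_path)
    try:
        os.chmod(db_path, 0o600)  # the index itself is sensitive metadata
        with conn:
            conn.execute("DROP TABLE IF EXISTS files")
            conn.execute(
                "CREATE TABLE files (path BLOB PRIMARY KEY, uid INTEGER, gid INTEGER)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?)", _walk_rows(roots)
            )
            conn.execute("CREATE INDEX files_uid ON files (uid)")
            conn.execute("CREATE INDEX files_gid ON files (gid)")
    finally:
        conn.close()


def owned_by(
    db_path: str, uid: Optional[int] = None, gid: Optional[int] = None
) -> List[str]:
    """Return indexed paths whose numeric owner or group matches."""
    conn = sqlite3.connect(db_path)
    try:
        # A None argument becomes NULL, which never compares equal.
        cur = conn.execute(
            "SELECT path FROM files WHERE uid = ? OR gid = ?", (uid, gid)
        )
        return [os.fsdecode(p) for (p,) in cur]
    finally:
        conn.close()
```

The expensive walk then happens once per rebuild instead of once per departing user, and a lookup such as owned_by("/var/lib/owners.db", uid=1234) (placeholder path and ID) is a single indexed query. The results are only as fresh as the last rebuild, which fits the month-long window mentioned above.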
On Wed, Aug 01, 2018 at 10:33:33AM -0500, Valeri Galtsev wrote:

> If you want to be rigorous with result (and I for one would), avoid locate:
> that one is using database which is updated how often? *hmm*, once a week.

Daily.