Sorry if this isn't the right place to ask this question.

I'm thinking of writing a module that intercepts Samba requests to keep only a subset of a file store on a local server, with the rest being stored in the cloud. It would be an open source module similar to these products:

https://aws.amazon.com/storagegateway/

https://docs.microsoft.com/en-us/azure/storage/files/storage-sync-files-deployment-guide?tabs=azure-portal

I'm thinking that I would have to do the following.

1. Keep my own directory lists, probably in MongoDB
2. Intercept File Open, Dir, and Stat requests from Samba
3. When a request to open is received for a file that doesn't exist locally, create a thread to retrieve the file from the cloud, blocking the request until the file is retrieved.
4. Have a background process that archives files to the cloud when necessary
5. Have a background process that retrieves files from the cloud when the local store needs to be rebuilt.

If Samba does any file access operations outside of interceptable VFS requests, then I guess I'm out of luck.

The reasons I'm interested in doing this are:

1. Won't have to incessantly worry about disk size on servers
2. Can rebuild servers seamlessly from the cloud when a local server is compromised.

Is this idea wrong? Has it already been done? Is there a better alternative?

Thanks,
Mark
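To make step 3 concrete, here is a minimal Python sketch of the "block the open until the file is local" idea. It is not Samba code and does not use the VFS API; `OnDemandCache` and the `fetch(name, dest_path)` callable are hypothetical names invented for illustration. As a simplification, the first caller that misses performs the download itself instead of spawning a separate thread, and any other callers asking for the same file wait on it so the file is only fetched once.

```python
import os
import threading


class OnDemandCache:
    """Toy model of step 3: an open blocks until the file exists locally."""

    def __init__(self, local_root, fetch):
        self.local_root = local_root
        self.fetch = fetch          # hypothetical callable: fetch(name, dest_path)
        self._lock = threading.Lock()
        self._in_flight = {}        # name -> Event, set when that fetch finishes

    def open(self, name, mode="rb"):
        path = os.path.join(self.local_root, name)
        with self._lock:
            if os.path.exists(path):
                event, leader = None, False          # already local
            elif name in self._in_flight:
                event, leader = self._in_flight[name], False   # someone is fetching
            else:
                event, leader = threading.Event(), True        # we fetch it
                self._in_flight[name] = event

        if leader:
            try:
                os.makedirs(os.path.dirname(path), exist_ok=True)
                tmp = path + ".partial"
                self.fetch(name, tmp)    # download to a temporary name first
                os.replace(tmp, path)    # then publish it atomically
            finally:
                with self._lock:
                    del self._in_flight[name]
                event.set()              # wake every waiter, even on failure
        elif event is not None:
            event.wait()                 # block until the leader is done

        # Raises FileNotFoundError if the fetch failed.
        return open(path, mode)
```

Downloading to a `.partial` name and renaming afterwards means a concurrent open never sees a half-written file. A real module would also need timeouts and error reporting back to the SMB client, which this sketch leaves out.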
Jeremy Allison
2019-Apr-10 20:50 UTC
[Samba] Am I nuts to attempt a VFS local cache module?
On Wed, Apr 10, 2019 at 01:26:59PM -0700, Mark Winslow via samba wrote:
> Sorry if this isn't the right place to ask this question.
>
> I'm thinking of writing a module that intercepts Samba requests to keep only
> a subset of a file store on a local server, with the rest being stored in the
> cloud. It would be an open source module similar to these products:
>
> https://aws.amazon.com/storagegateway/
>
> https://docs.microsoft.com/en-us/azure/storage/files/storage-sync-files-deployment-guide?tabs=azure-portal
>
> I'm thinking that I would have to do the following.
>
> 1. Keep my own directory lists, probably in MongoDB
> 2. Intercept File Open, Dir, and Stat requests from Samba
> 3. When a request to open is received for a file that doesn't exist locally,
> create a thread to retrieve the file from the cloud, blocking the request
> until the file is retrieved.
> 4. Have a background process that archives files to the cloud when necessary
> 5. Have a background process that retrieves files from the cloud when the
> local store needs to be rebuilt.
>
> If Samba does any file access operations outside of interceptable VFS
> requests, then I guess I'm out of luck.

No it doesn't. *Everything* goes through the VFS for
just this reason. You're in luck :-).

> The reasons I'm interested in doing this are:
>
> 1. Won't have to incessantly worry about disk size on servers
> 2. Can rebuild servers seamlessly from the cloud when a local server is
> compromised.
>
> Is this idea wrong?

No, it's a great idea! See here:

https://www.snia.org/sites/default/files/SDC15_presentations/smb/JeremyAllison_The_Future_is_Cloudy.pdf

for a possible design (from 2015 no less :-).

> Has it already been done? Is there a better
> alternative?

Had the design, no time to write it. I'm happy
to help with anything you want to discuss :-).

Loughborough University in the UK wrote this code
but never released it back to the community (I
asked, they ignored me :-( ).

Jeremy.
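Because every file operation passes through the VFS, directory and stat requests for files that are not on local disk would have to be answered from the module's own directory lists (steps 1 and 2 of the plan) rather than from the filesystem. The Python sketch below illustrates that idea only; it is not Samba code, and `Catalog` and `Entry` are hypothetical names, with an in-memory dict standing in for whatever database is eventually used.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    size: int
    mtime: float
    is_local: bool    # False means the data currently lives only in the cloud


class Catalog:
    """Hypothetical in-memory stand-in for the directory database."""

    def __init__(self):
        self._entries = {}    # "dir/name" -> Entry

    def add(self, path, entry):
        self._entries[path] = entry

    def stat(self, path):
        # Answered from metadata alone; no download is triggered.
        return self._entries[path]    # raises KeyError for unknown paths

    def listdir(self, directory):
        # Return the immediate children of `directory` ("" is the share root).
        prefix = directory.rstrip("/") + "/" if directory else ""
        names = set()
        for path in self._entries:
            if path.startswith(prefix):
                names.add(path[len(prefix):].split("/", 1)[0])
        return sorted(names)
```

With this split, only an actual open of a non-local file needs to touch the cloud; listing a directory or checking a file's size stays a local lookup.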
Thanks, Jeremy.

I'm going to look at the links you provided and try to do some initial coding, probably over a couple of weeks. It's not that I think I can produce the thing in a couple of weeks, but I'm the type of programmer who doesn't understand anything until I've written some code on it.

Thanks