Noob Centos Admin
2010-Jan-24 17:09 UTC
[CentOS] Centos/Linux Disk Caching, might be OT in some ways
I'm trying to optimize some database app running on a CentOS server and wanted to confirm some things about the disk/file caching mechanism.

From what I've read, Linux has a Virtual Filesystem (VFS) layer that sits between the physical file system and everything else, so no matter what FS is used, applications still go through the VFS. Because of this, disk caching is done on an inode/block basis. I'm assuming that this is still the case in CentOS, or am I badly mistaken?

If that is correct, then here is my scenario and hypothesis. Assume the server has xxx MB of free memory and the database consists of several tables, each more than xxx MB in size, so no table will fit entirely into memory. Also assume other processes do not interfere with the caching behaviour or available memory. Given the block-level caching behaviour, if the DBMS only accesses a set of blocks totalling less than xxx MB, is it therefore likely to be hitting the cache most of the time, and hence faster?

My thought is that if this is the case, then I could likely speed up the application if I further split the tables into parts that are frequently accessed and parts that are rarely touched. E.g. a table may currently have rows with 20 fields totalling 1 KB/row, but very often only 5 of the 20 fields are used in actual processing. Reading x rows from this table may touch more blocks than would fit into the cache/memory. However, if I break the table into two parts with those 5 fields in a smaller table, there should be a speed increase, since reading the same x rows would touch only a fraction of the blocks. Furthermore, those blocks would be more likely to fit into the disk/memory cache for even faster access.

Or would I simply be duplicating what the DBMS's index files would already be doing, and therefore see no improvement?
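As a rough back-of-the-envelope check of the splitting idea above, here is a quick sketch in plain Python. All the numbers (cache size, row count, row widths) are made-up assumptions for illustration; the real values depend on the server and the DBMS's on-disk layout.

```python
# Rough cache-fit arithmetic for the vertical-split idea above.
# All numbers are illustrative assumptions, not measurements.

PAGE_SIZE = 4096            # Linux page cache granularity, bytes
CACHE_MB = 512              # memory assumed available for caching
cache_pages = CACHE_MB * 1024 * 1024 // PAGE_SIZE

ROWS = 2_000_000            # rows in the table (assumed)

def pages_needed(row_bytes: int, rows: int) -> int:
    """Pages touched by a full scan, assuming rows are packed densely."""
    return (rows * row_bytes + PAGE_SIZE - 1) // PAGE_SIZE

wide = pages_needed(1024, ROWS)   # full 20-field row, ~1 KB/row
narrow = pages_needed(256, ROWS)  # 5 hot fields split out, ~256 B/row

print(f"cache holds {cache_pages} pages")
print(f"wide table:   {wide} pages, fits in cache: {wide <= cache_pages}")
print(f"narrow table: {narrow} pages, fits in cache: {narrow <= cache_pages}")
```

With these assumed numbers, the wide table needs roughly four times the available cache while the narrow hot-column table fits entirely, which is exactly the effect the split is hoping for.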
Noob Centos Admin wrote:
> I'm trying to optimize some database app running on a CentOS server
> and wanted to confirm some things about the disk/file caching
> mechanism.

If you want a fast database, forget about file system caching: use Direct I/O and put your memory to better use - application level caching.

nate
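A minimal sketch of the application-level caching nate mentions, in plain Python. The `load_row` function here is a purely illustrative stand-in for the real database fetch; the class name and capacity are assumptions, not anything from a specific DBMS.

```python
from collections import OrderedDict

class RowCache:
    """Tiny LRU cache: the application decides what stays hot,
    instead of leaving it to the kernel page cache."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, loader):
        if key in self._data:
            self._data.move_to_end(key)       # mark as recently used
            return self._data[key]
        value = loader(key)                   # cache miss: hit the DB
        self._data[key] = value
        if len(self._data) > self.capacity:   # evict least recently used
            self._data.popitem(last=False)
        return value

# Illustrative loader standing in for a real (Direct I/O) table read.
def load_row(key):
    return {"id": key, "payload": f"row-{key}"}

cache = RowCache(capacity=2)
cache.get(1, load_row)
cache.get(2, load_row)
cache.get(1, load_row)      # hit: key 1 becomes most recent
cache.get(3, load_row)      # evicts key 2, the least recently used
print(2 in cache._data)     # False
```

The point of pairing this with Direct I/O is that memory is spent once, on data the application knows is hot, rather than letting the kernel cache blocks the DBMS will never reread.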
On Mon, 2010-01-25 at 01:09 +0800, Noob Centos Admin wrote:
> e.g. the table may currently have rows with 20 fields and total
> 1KB/row, but very often say only 5/20 fields are used in actual
> processing. Reading x rows from this table may access x inodes which
> would not fit into the cache/memory.

20 fields or columns is really nothing. BUT that's dependent on the type of data being inserted.

> However if now I break the table into two parts with those 5 fields
> into a smaller table, there would be a speed increase since the
> reading the same x rows from this table would only access 1/x inodes.
> Further more, these would more likely fit into the disk/memory cache
> for even faster access.

OK, so you break the one table down and create two or more; then you will have joins and clustered indexes, possibly slowing you down even more. That is greatly dependent on your select, delete, and update scripts.

> Or would I simply be duplicating what the DBMS's index files would
> already be doing and therefore see no improvement?

Possibly, but Nate is also very right about how you are accessing the DB, i.e. direct I/O. Your fastest access comes from optimized SPROCs, triggers and TSQL. Slam enough memory into the server and load it into memory. If speed is what you're after, why are you worried about the VFS? CentOS does support raw disk access (no filesystem).

John
Ross Walker
2010-Jan-28 14:48 UTC
[CentOS] Centos/Linux Disk Caching, might be OT in some ways
On Jan 27, 2010, at 7:50 PM, Christopher Chan <christopher.chan at bradbury.edu.hk> wrote:

>> Sorry to be the bearer of bad news, but on top of LVM on CentOS/RHEL
>> the best assurance you're going to get is fsync(), meaning the data is
>> out of the kernel, but probably still in the disk write cache. Make sure
>> you have a good UPS setup, so the disks can flush after main power
>> loss.
>
> Or turn off write caching...

Have you tried doing any kind of write with write caching turned off? It is so horribly slow as to be almost useless.

If you need to turn write caching off then I would start looking at SSD drives with capacitor-backed caches.

-Ross
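The fsync() guarantee being debated can be sketched like this in plain Python against a throwaway temp file. As Christopher notes, fsync() only pushes the data out of the kernel; with the drive's volatile write cache enabled it may still be lost on power failure, which is where the UPS or capacitor-backed cache comes in.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    data = b"COMMIT RECORD\n"
    os.write(fd, data)   # data now lives in the kernel page cache only
    os.fsync(fd)         # force it out of the kernel; with write
                         # caching on, it may still sit in the drive's
                         # volatile cache at this point
    os.close(fd)

    with open(path, "rb") as f:
        readback = f.read()
    print(readback == data)   # True
finally:
    os.remove(path)
```

The performance problem Ross describes comes from doing that fsync() (or a full cache flush, with write caching off) on every transaction commit, turning each commit into a synchronous disk round trip.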
Les Mikesell
2010-Jan-28 16:18 UTC
[CentOS] Centos/Linux Disk Caching, might be OT in some ways
On 1/28/2010 8:48 AM, Ross Walker wrote:
>>> Sorry to be the bearer of bad news, but on top of LVM on CentOS/RHEL
>>> the best assurance you're going to get is fsync(), meaning the data is
>>> out of the kernel, but probably still in the disk write cache. Make sure
>>> you have a good UPS setup, so the disks can flush after main power
>>> loss.
>>
>> Or turn off write caching...
>
> Have you tried doing any kind of write with write caching turned off?
> It is so horribly slow as to be almost useless.
>
> If you need to turn write caching off then I would start looking at
> SSD drives with capacitor-backed caches.

I wonder if the generally horrible handling that Linux has always done for fsync() is the real reason Oracle spun off their own distro? Do they get it better?

--
Les Mikesell
lesmikesell at gmail.com
Christopher Chan
2010-Jan-29 00:27 UTC
[CentOS] Centos/Linux Disk Caching, might be OT in some ways
On Thursday, January 28, 2010 10:48 PM, Ross Walker wrote:
> On Jan 27, 2010, at 7:50 PM, Christopher Chan <christopher.chan at bradbury.edu.hk> wrote:
>
>>> Sorry to be the bearer of bad news, but on top of LVM on CentOS/RHEL
>>> the best assurance you're going to get is fsync(), meaning the data is
>>> out of the kernel, but probably still in the disk write cache. Make sure
>>> you have a good UPS setup, so the disks can flush after main power
>>> loss.
>>
>> Or turn off write caching...
>
> Have you tried doing any kind of write with write caching turned off?
> It is so horribly slow as to be almost useless.

If they needed the performance in the first place, I doubt they would be using md raid1. You want performance and reliability? Hardware RAID + BBU cache. Otherwise, turn off write caching unless the I/O path supports barriers.

> If you need to turn write caching off then I would start looking at
> SSD drives with capacitor-backed caches.

How do those compare with BBU NVRAM cards for external data + metadata journaling?