On 11/3/18 12:44 AM, yf chu wrote:
> I wonder whether the performance will be affected if there are too many
> files and directories on the server.
With XFS on modern CentOS systems, you probably don't need to worry:
https://www.youtube.com/watch?v=FegjLbCnoBw
For older systems, as best I understand it: as the directory tree grows,
the answer to your question depends on how many entries are in the
directories, how deep the directory structure is, and how random the
access pattern is. Ultimately, you want to minimize the number of
individual disk reads required.
Directories with lots of entries are one situation where you may see
performance degrade. Typically, around the time the directory grows
larger than the maximum size of the direct block list [1] (48 KB),
reading the directory starts to take a little longer. Past the maximum
size of the single indirect block list (4 MB), it tends to get slower
again. File names are stored in the directory itself, so average
filename length factors into directory size, as well as the number of
files.
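As a rough illustration, here is how you might estimate when a
directory crosses that 48 KB threshold. The entry layout below assumes
the classic ext2/ext3 on-disk format (8-byte header plus the name
padded to a 4-byte boundary) and 12 direct blocks of 4 KB each; other
filesystems lay directories out differently, so treat the numbers as a
sketch, not a rule:

```python
# Rough estimate of directory size growth, assuming the ext2/ext3
# on-disk directory entry format (other filesystems differ).
DIRENT_HEADER = 8  # inode (4) + rec_len (2) + name_len (1) + file_type (1)

def dirent_size(name: str) -> int:
    """On-disk size of one directory entry: header plus name,
    padded to a 4-byte boundary."""
    return DIRENT_HEADER + ((len(name) + 3) // 4) * 4

def directory_bytes(names) -> int:
    """Total bytes the directory needs to hold all the entries."""
    return sum(dirent_size(n) for n in names)

# With 12 direct blocks of 4 KB, the direct block list covers 48 KB.
DIRECT_LIMIT = 12 * 4096

# Example: 3000 files with 14-character names (24 bytes per entry).
names = [f"file{i:06d}.dat" for i in range(3000)]
size = directory_bytes(names)
print(size, size > DIRECT_LIMIT)  # already past the direct block list
```

With 14-character names, each entry takes 24 bytes, so a directory
passes 48 KB at roughly 2000 files; longer names lower that count.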
A given file lookup needs to read each of the parent directories to
locate the next item in the path. If your path is very deep, then your
directories are likely to be smaller on average, but you're trading a
shorter block list per directory for more parent-directory lookups. It
might make your worst case better, but your best case is probably
worse.
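That trade-off can be sketched with a toy cost model. Everything here
is illustrative: it charges one unit per path component, adds a linear
scan term proportional to entries per directory, and assumes the tree
is perfectly balanced:

```python
def lookup_cost(total_files: int, depth: int, scan_cost: float = 0.001) -> float:
    """Toy model: one unit per path component traversed, plus a linear
    scan term for the entries in each directory (illustrative only)."""
    # Balanced-tree assumption: every directory has the same fan-out.
    entries_per_dir = total_files ** (1 / depth)
    return depth * (1 + scan_cost * entries_per_dir)

# One flat directory is dominated by the scan; very deep trees are
# dominated by the per-component lookups; the sweet spot is in between.
for depth in (1, 2, 3, 6):
    print(depth, round(lookup_cost(1_000_000, depth), 2))
```

Under these made-up constants the cost is huge at depth 1, bottoms out
around depth 2-3, and creeps back up as depth grows, which matches the
intuition above: deeper paths shrink each directory but add lookups.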
The system's cache means that repeatedly accessing a few files in a
large structure is not as expensive as accessing random files in a
large structure. If you have a large structure, but users tend to
access mostly the same files at any given time, then the system won't
be reading the disk for every lookup. If accesses aren't random, then
structure size becomes less important.
A hashed name directory structure has been mentioned, and those can be
useful if you have a very large number of objects to store and they
all share the same permission set. A hashed name structure typically
requires that you store, in a database, a map between the original
names (that users see) and the hashed names. You could hash each name
at lookup time instead, but that doesn't give you a good mechanism for
dealing with collisions. Hashed name directory structures typically
have worse best-case performance because of the extra lookup, but they
offer predictable, even growth in lookup times for every file. Where a
free-form directory structure might have a large difference between
the best-case and worst-case lookup, a hashed name directory structure
should have roughly the same access time for all files.
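A minimal sketch of such a layout follows. The two-level hex-prefix
split, the SHA-256 choice, the `/srv/objects` root, and the in-memory
dict standing in for the database table are all assumptions of the
sketch, not a recommendation:

```python
import hashlib
import os

# Stand-in for the database table mapping user-visible names to hashed
# on-disk names (a real deployment would use a real database so the
# map survives restarts and can record collision resolutions).
name_to_hash = {}

def store_path(root: str, original_name: str) -> str:
    """Record the name mapping and return the on-disk path,
    fanned out by hash prefix."""
    digest = hashlib.sha256(original_name.encode()).hexdigest()
    name_to_hash[original_name] = digest
    # Two levels of 2-hex-char directories -> 65536 leaf directories,
    # so directory sizes grow evenly regardless of the original names.
    return os.path.join(root, digest[:2], digest[2:4], digest)

def lookup_path(root: str, original_name: str) -> str:
    """Resolve a user-visible name through the stored map."""
    digest = name_to_hash[original_name]
    return os.path.join(root, digest[:2], digest[2:4], digest)

p = store_path("/srv/objects", "report-2018.pdf")
assert p == lookup_path("/srv/objects", "report-2018.pdf")
```

Because the hash spreads names uniformly, every leaf directory stays
small and every lookup walks the same number of components, which is
where the "roughly the same access time for all files" property comes
from.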
[1] https://en.wikipedia.org/wiki/Inode_pointer_structure