Displaying 12 results from an estimated 12 matches for "tomfit".
2018 Jan 18
2
Blocking IO when hot tier promotion daemon runs
...e from a later point in time. The issue is hit
earlier than what is available in the logs. I need the logs
from an earlier time.
And along with the entire tier logs, can you send the glusterd and
brick logs too?
Rest of the comments are inline
On Wed, Jan 10, 2018 at 9:03 PM, Tom Fite <tomfite at gmail.com> wrote:
> I should add that additional testing has shown that only accessing files is
> held up, IO is not interrupted for existing transfers. I think this points
> to the heat metadata in the sqlite DB for the tier, is it possible that a
> table is temporarily locked w...
2018 Jan 10
0
Blocking IO when hot tier promotion daemon runs
...nterrupted for existing transfers. I think this points
to the heat metadata in the sqlite DB for the tier, is it possible that a
table is temporarily locked while the promotion daemon runs so the calls to
update the access count on files are blocked?
On Wed, Jan 10, 2018 at 10:17 AM, Tom Fite <tomfite at gmail.com> wrote:
> The sizes of the files are extremely varied, there are millions of small
> (<1 MB) files and thousands of files larger than 1 GB.
>
> Attached is the tier log for gluster1 and gluster2. These are full of
> "demotion failed" messages, which is...
2018 Jan 18
0
Blocking IO when hot tier promotion daemon runs
...e is hit
> earlier than what is available in the logs. I need the logs
> from an earlier time.
> And along with the entire tier logs, can you send the glusterd and
> brick logs too?
>
> Rest of the comments are inline
>
> On Wed, Jan 10, 2018 at 9:03 PM, Tom Fite <tomfite at gmail.com> wrote:
> > I should add that additional testing has shown that only accessing files
> is
> > held up, IO is not interrupted for existing transfers. I think this
> points
> > to the heat metadata in the sqlite DB for the tier, is it possible that a
> >...
2018 Jan 10
2
Blocking IO when hot tier promotion daemon runs
...Tue, Jan 9, 2018 at 10:33 PM, Hari Gowtham <hgowtham at redhat.com> wrote:
> Hi,
>
> Can you send the volume info, and volume status output and the tier logs.
> And I need to know the size of the files that are being stored.
>
> On Tue, Jan 9, 2018 at 9:51 PM, Tom Fite <tomfite at gmail.com> wrote:
> > I've recently enabled an SSD backed 2 TB hot tier on my 150 TB 2 server
> / 3
> > bricks per server distributed replicated volume.
> >
> > I'm seeing IO get blocked across all client FUSE threads for 10 to 15
> > seconds while th...
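For reference, the information requested here can be gathered with the standard gluster CLI. The brick path is one of the gv0 bricks mentioned elsewhere in these results, and the log directory is the usual default, so treat both as assumptions for other setups.

gluster volume info gv0                  # volume layout and options
gluster volume status gv0 detail         # per-brick status, disk space, inode counts
gluster volume tier gv0 status           # tiering counters (promote/demote stats)
ls /var/log/glusterfs/ | grep -Ei 'tier|glusterd'   # locate the tier and glusterd logs
# Rough feel for file sizes on one brick: the ten largest regular files
find /data/brick1/gv0 -type f -printf '%s %p\n' | sort -n | tail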
2018 Jan 02
0
Wrong volume size with df
...I added a hot tier to the pool, the brick
sizes are now reporting the correct size of all bricks combined instead of
just one brick.
Not sure if that gives you any clues for this... maybe adding another brick
to the pool would have a similar effect?
On Thu, Dec 21, 2017 at 11:44 AM, Tom Fite <tomfite at gmail.com> wrote:
> Sure!
>
> > 1 - output of gluster volume heal <volname> info
>
> Brick pod-sjc1-gluster1:/data/brick1/gv0
> Status: Connected
> Number of entries: 0
>
> Brick pod-sjc1-gluster2:/data/brick1/gv0
> Status: Connected
> Number of ent...
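One way to cross-check the df numbers in this thread is to compare the FUSE mount view against the underlying brick filesystems and against what glusterd itself reports; /mnt/gv0 is an assumed mount point.

df -h /mnt/gv0                            # capacity as seen through the FUSE mount
df -h /data/brick1/gv0 /data/brick2/gv0   # capacity of the underlying brick filesystems
gluster volume status gv0 detail | grep -E 'Brick|Disk Space'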
2018 Jan 10
0
Blocking IO when hot tier promotion daemon runs
Hi,
Can you send the volume info, and volume status output and the tier logs.
And I need to know the size of the files that are being stored.
On Tue, Jan 9, 2018 at 9:51 PM, Tom Fite <tomfite at gmail.com> wrote:
> I've recently enabled an SSD backed 2 TB hot tier on my 150 TB 2 server / 3
> bricks per server distributed replicated volume.
>
> I'm seeing IO get blocked across all client FUSE threads for 10 to 15
> seconds while the promotion daemon runs. I see...
2017 Dec 21
3
Wrong volume size with df
Sure!
> 1 - output of gluster volume heal <volname> info
Brick pod-sjc1-gluster1:/data/brick1/gv0
Status: Connected
Number of entries: 0
Brick pod-sjc1-gluster2:/data/brick1/gv0
Status: Connected
Number of entries: 0
Brick pod-sjc1-gluster1:/data/brick2/gv0
Status: Connected
Number of entries: 0
Brick pod-sjc1-gluster2:/data/brick2/gv0
Status: Connected
Number of entries: 0
Brick
2018 Jan 09
2
Blocking IO when hot tier promotion daemon runs
I've recently enabled an SSD backed 2 TB hot tier on my 150 TB 2 server / 3
bricks per server distributed replicated volume.
I'm seeing IO get blocked across all client FUSE threads for 10 to 15
seconds while the promotion daemon runs. I see the 'glustertierpro' thread
jump to 99% CPU usage on both boxes when these delays occur and they happen
every 25 minutes (my
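The promotion/demotion cadence and watermarks are controlled per volume; the option names below are the standard 3.x tiering tunables, and the 'set' value at the end is purely illustrative, not a recommendation.

# Inspect the current tiering schedule and watermarks
gluster volume get gv0 cluster.tier-mode
gluster volume get gv0 cluster.tier-promote-frequency
gluster volume get gv0 cluster.tier-demote-frequency
gluster volume get gv0 cluster.watermark-hi
gluster volume get gv0 cluster.watermark-low

# Example only: promote less frequently (value is in seconds)
gluster volume set gv0 cluster.tier-promote-frequency 1800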
2018 Feb 27
2
Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first
Any updates on this one?
On Mon, Feb 5, 2018 at 8:18 AM, Tom Fite <tomfite at gmail.com> wrote:
> Hi all,
>
> I have seen this issue as well, on Gluster 3.12.1. (3 bricks per box, 2
> boxes, distributed-replicate) My testing shows the same thing -- running a
> find on a directory dramatically increases lstat performance. To add
> another clue, the p...
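The observation above suggests a simple workaround: warm the directory metadata with a scan before the rsync run. A minimal sketch, assuming the volume is mounted at /mnt/gv0 and the directory names are placeholders:

# Trigger readdirp() over the target tree so stat data is cached, then sync as usual
find /mnt/gv0/target-dir > /dev/null
rsync -a /local/source/ /mnt/gv0/target-dir/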
2017 Dec 05
1
Slow seek times on stat calls to glusterfs metadata
Hi all,
I have a distributed / replicated pool consisting of 2 boxes, with 3 bricks
apiece. Each brick is mounted via a RAID 6 array consisting of eleven 6 TB
disks. I'm running CentOS 7 with XFS and LVM. The 150 TB pool is loaded
with about 15 TB of data. Clients are connected via FUSE. I'm using
glusterfs 3.12.1.
I've found that large rsyncs to populate the pool are taking a
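To put numbers on the slow stat calls, one option is to run a short rsync under strace and look at the per-syscall summary; the paths are placeholders and the syscall list may need adjusting for the local libc.

# Count and time the metadata syscalls issued by rsync against the FUSE mount
strace -f -c -e trace=lstat,stat,getdents rsync -a /local/source/ /mnt/gv0/dest/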
2018 Feb 05
0
Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first
Hi all,
I have seen this issue as well, on Gluster 3.12.1. (3 bricks per box, 2
boxes, distributed-replicate) My testing shows the same thing -- running a
find on a directory dramatically increases lstat performance. To add
another clue, the performance degrades again after issuing a call to reset
the system's cache of dentries and inodes:
# sync; echo 2 > /proc/sys/vm/drop_caches
I
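For context on the command quoted above: drop_caches accepts 1 (page cache), 2 (dentries and inodes) or 3 (both), and sync is run first so dirty pages are written back before the caches are discarded.

sync                                  # write back dirty pages first
echo 1 > /proc/sys/vm/drop_caches     # drop page cache only
echo 2 > /proc/sys/vm/drop_caches     # drop dentries and inodes (as used above)
echo 3 > /proc/sys/vm/drop_caches     # drop both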
2018 Feb 05
2
Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first
Thanks for the report Artem,
Looks like the issue is about cache warming up. Specifically, I suspect rsync
is doing a 'readdir(), stat(), file operations' loop, whereas when a find or
ls is issued, we get a 'readdirp()' request, which contains the stat
information along with the entries and also makes sure the cache is up to date
(at the md-cache layer).
Note that this is just an off-the-memory
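The md-cache behaviour described here is tunable per volume; the options below are the commonly documented md-cache and cache-invalidation settings for 3.12-era releases, with illustrative timeout values rather than prescribed ones.

gluster volume set gv0 performance.stat-prefetch on
gluster volume set gv0 performance.md-cache-timeout 600
gluster volume set gv0 performance.cache-invalidation on
gluster volume set gv0 features.cache-invalidation on
gluster volume set gv0 features.cache-invalidation-timeout 600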