BXadmin:AFS performance testing

From CCGB

Revision as of 13:45, 22 June 2010

Conversation with Matt Benjamin detailing testing work to be done for cache-bypass on 1.5.x:

(12:58:08) matt: Well, then 1.5.x is your branch.  I'd be interested in getting comparative results first with the unmodified branch, on memcache with appropriate
chunksize (e.g., 18?), with and without cache bypass enabled.  You can use one cm binary, built with --enable-cache-bypass, setting a low fs bypassthreshold to enable bypassing.  Read vs. mixed read-write workloads are interesting--the latter should have a significant penalty; ideally, the workload should be read-heavy. Then, repeating with cache-bypass refcounting patch in gerrit.
(12:58:30) matt: When we have a connection pooling patch worth running, repeating with thiat.
(12:58:32) matt: that
(12:59:31) matt: The refcounting patch should have no measurable effect on performance, starting from 1.5.x + refcounting would probably be reasonable.
(13:00:34) phalenor: do you care about disk cache performance?
(13:02:14) phalenor: and when you say with and without cache bypass, do you mean with and without --enable-cache-bypass at build time?
(13:02:59) matt: with and without:  no, with fs bypassthresh default (0?, -1?  don't recall...) or vs. with some small value (which enables bypassing)
(13:05:02) phalenor: okay. so the only stumbling block I see then is I don't have any machines handy with new enough autoconf, etc to run regen.sh, though that could be rectified I suppose
(13:05:52) matt: But as regards disk.  You get massive "improvement" but relative to memcache, the results are unrealistically scaled.  But it should work regardless, and reduces disk workload, obviously, though it's now background work, since Simon's changes of summer.
(13:06:01) matt: Yeah, you just need to put an autoconf somewhere...
(13:07:05) matt: And parallel fetching is still happening, of course.  I'm just admitting that I barely ran cache-bypassing with disk cache.
(13:07:49) phalenor: for the most part, our 'big' machines run with around a half gig of memcache because they have the memory to spare (some 16G, most 32, one 64), workstations and machines on 100Mb are still disk cache, as even with 1.4 disc cache becomes network bound
(13:08:02) matt: Yes, that's the ticket.
(13:08:29) matt: Oh, and you want to increase -daemons, esp. with more calls patch to come.
(13:09:00) phalenor: right now we're running with 12, so more than that?
(13:09:26) shadow@gmail.com/owl1EA1D463: -daemons 12 is probably fine for now.
(13:09:28) matt: Actually, that's probably fine.  Worth looking into, perhaps.
(13:10:09) phalenor: I haven't tested while varying that number, but I suppose I could fiddle with it a bit
(13:10:15) matt: With clones, I never used more than 4xRX_MAXCALLS = 12 anyway, btw.
(13:10:33) matt: So there would be no improvement, unless we were starving something else.
(13:10:55) matt: sorry, 3x
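
The test procedure discussed above can be sketched as shell commands. This is a hedged sketch, not a verified recipe: the specific values (chunksize 18, -daemons 12, a low bypass threshold) are the suggestions from the conversation, and the exact default for `fs bypassthreshold` is left unresolved above, so only the "small value enables bypassing" case is shown.

```shell
# Build the 1.5.x cache manager with cache-bypass support.
./regen.sh
./configure --enable-cache-bypass
make

# Start the client with a memory cache and chunksize 18 (2^18 = 256 KB chunks),
# and -daemons 12 as discussed above.
afsd -memcache -chunksize 18 -daemons 12

# Baseline run: leave the bypass threshold at its default (bypassing disabled);
# run the read-heavy (and then mixed read-write) benchmark here.

# Bypass run: set a small threshold so fetches above it bypass the cache,
# then repeat the same benchmarks and compare.
fs bypassthreshold 1
```

The same binary serves both runs, since bypassing is toggled at runtime via the threshold rather than at build time; repeating the matrix with the refcounting patch (and later the connection-pooling patch) applied gives the comparative series Matt asks for.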