BXadmin:AFS performance testing
From CCGB
Conversation with Matt Benjamin detailing testing work to be done for cache-bypass on 1.5.x:
(12:58:08) matt: Well, then 1.5.x is your branch. I'd be interested in getting comparative results first with the unmodified branch, on memcache with appropriate chunksize (e.g., 18?), with and without cache bypass enabled. You can use one cm binary, built with --enable-cache-bypass, setting a low fs bypasthreshold to enable bypassing. Read vs. mixed read-write workloads are interesting--the latter should have a significant penalty, should ideally, workload should be read-heavy. Then, repeating with cache-bypass refcounting patch in gerrit.
(12:58:30) matt: When we have a connection pooling patch worth running, repeating with thiat.
(12:58:32) matt: that
(12:59:31) matt: The refcounting patch should have no measurable effect on performance, starting from 1.5.x + refcounting would probably be reasonable.
(13:00:34) phalenor: do you care about disk cache performance?
(13:02:14) phalenor: and when you say with and without cache bypass, do you mean with and without --enable-cache-bypass at build time?
(13:02:59) matt: with and without: no, with fs bypassthresh default (0?, -1? don't recall...) or vs. with some small value (which enables bypassing)
(13:05:02) phalenor: okay. so the only stumbling block I see then is I don't have any machines handy with new enough autoconf, etc to run regen.sh, though that could be rectified I suppose
(13:05:52) matt: But as regards disk. You get massive "improvement" but relative to memcache, the results are unrealistically scaled. But it should work regardless, and reduces disk workload, obviously, though it's now background work, since Simon's changes of summer.
(13:06:01) matt: Yeah, you just need to put an autoconf somewhere...
(13:07:05) matt: And parallel fetching is still happening, of course. I'm just admitting that I barely ran cache-bypassing with disk cache.
(13:07:49) phalenor: for the most part, our 'big' machines run with around a half gig of memcache because they have the memory to spare (some 16G, most 32, one 64), workstations and machines on 100Mb are still disk cache, as even with 1.4 disc cache becomes network bound
(13:08:02) matt: Yes, that's the ticket.
(13:08:29) matt: Oh, and you want to increase -daemons, esp. with more calls patch to come.
(13:09:00) phalenor: right now we're running with 12, so more than that?
(13:09:26) shadow@gmail.com/owl1EA1D463: -daemons 12 is probably fine for now.
(13:09:28) matt: Actually, that's probably fine. Worth looking into, perhaps.
(13:10:09) phalenor: I haven't tested while varying that number, but I suppose I could fiddle with it a bit
(13:10:15) matt: With clones, I never used more than 4xRX_MAXCALLS = 12 anyway, btw.
(13:10:33) matt: So there would be no improvement, unless we were starving something else.
(13:10:55) matt: sorry, 3x
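The build and runtime steps discussed above can be sketched roughly as follows. This is a sketch, not a verified procedure: the configure flag and `fs bypassthreshold` subcommand are as named in the conversation, but the threshold values and the default used to disable bypassing should be checked against your build (the chat itself is unsure whether the default is 0 or -1).

```shell
# Build a single cache manager binary with cache-bypass support compiled in
# (1.5.x source tree; regen.sh needs a reasonably recent autoconf).
./regen.sh
./configure --enable-cache-bypass
make

# Bypassing is controlled at runtime. Setting a low threshold enables it
# for files larger than that size (value here is illustrative):
fs bypassthreshold 1048576

# Restore the default threshold to disable bypassing again
# (default may be 0 or -1; verify on your build):
fs bypassthreshold -1
```

Since one binary carries both behaviors, the with/without comparison is purely a runtime toggle, which keeps the two test configurations otherwise identical.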
iozone
http://www.bx.psu.edu/~phalenor/afs_performance_results/
/afs/bx.psu.edu/user/phalenor/afs_performance/results
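A read-heavy iozone run of the kind discussed above might look like the following. File size, record size, and target path are illustrative; record size would normally be chosen to relate to the cache manager chunksize under test.

```shell
# Sequential write then sequential read (-i 0 -i 1) of a 1GB file with
# 512KB records, against a file in AFS. Repeat with bypassing enabled
# and disabled, and with memcache vs. disk cache clients.
iozone -i 0 -i 1 -s 1g -r 512k \
    -f /afs/bx.psu.edu/user/phalenor/iozone.tmp \
    -Rb iozone_results.xls
```

The `-Rb` option writes an Excel-format summary, which is convenient for collecting the comparative numbers the conversation asks for.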
Apache
Testing Apache performance with 1.4.x vs. 1.5.x clients.
web-1 and web-2 are 32-bit CentOS 5.5 VMs running under VMware ESXi.
Tests were performed with wget against a 500MB ISO in a public_html directory in my home directory, served by a 1.4.11 fileserver on Solaris 10u8 (ZFS on 4GFC).
- web-1: 1.4.12.1, 1GB 2GFC disk cache, 4 vCPUs, pcnet32, -stat 9600 -daemons 6 -volumes 512 -chunksize 19
  - wget: 2.03MB/s
  - iperf: ~80Mb/s
  - (high load average, likely because of pcnet32 emulation)
- web-2: 1.5.77, 1GB 2GFC disk cache, 4 vCPUs, vmxnet, -stat 9600 -daemons 12 -volumes 512 -chunksize 19 -rxpck 2048
  - wget: 14.0MB/s
  - iperf: >400Mb/s
- web-1: 1.4.12.1, 1GB 2GFC disk cache, 4 vCPUs, vmxnet, -stat 9600 -daemons 6 -volumes 512 -chunksize 19
  - iperf: >400Mb/s
  - wget: 2.99MB/s
- web-2: bypassthreshold=1 (crashed)
  - wget: 44.9MB/s
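For reproducibility, the wget and iperf measurements above can be sketched as below. Hostnames and the URL are illustrative, and `fs flushall` (to force cold reads between runs) may not exist on older clients; flushing the relevant volume is an alternative.

```shell
# On the web VM: drop cached chunks so each run is a cold fetch from
# the fileserver (command availability depends on client version).
fs flushall

# From a test client: single-stream HTTP fetch; wget reports average
# throughput itself. Discard the data so local disk is not a factor.
wget -O /dev/null http://web-1/~phalenor/test-500m.iso

# Raw network ceiling for comparison (run "iperf -s" on the web VM first).
iperf -c web-1 -t 30
```

Comparing the wget rate against the iperf ceiling separates client/cache overhead (e.g. the 2.03MB/s vs. ~80Mb/s pcnet32 case) from raw network limits.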