Difference between revisions of "BXadmin:Network/Galaxy"
From CCGB
(cisco nat for private ips) |
|||
(13 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
+ | == Notes == | ||
+ | |||
+ | Useful stuff: | ||
+ | |||
+ | * <tt>/afs/bx.psu.edu/service/rancid/prod/var/bx/config/*</tt> contains router configs pulled from the hardware by rancid, super useful. | ||
+ | * Per the HP Procurve 2510G config/admin guide, ports are set to allow jumbos when they are added to a vlan that supports jumbos, and they are set to disallow jumbos when they are removed from a vlan that supports jumbos. Three ports on procurve-8 were not allowing jumbos and I had to move them to a different vlan and back to 270 to make them work. | ||
+ | |||
+ | Unexpected problems I encountered: | ||
+ | |||
+ | * <tt>show run</tt> on the Cisco does not show you the commands used to create the vlans. Since <tt>int vlan '''''id'''''</tt> can be executed without ever having executed <tt>vlan '''''id'''''</tt>, the parts of <tt>show run</tt> relevant to your new vlan will look just like the parts for existing vlans, but your new vlan will not pass traffic. It's not until you execute <tt>show vlan</tt> that you will realize that your new vlan does not actually exist (the rancid config dump above ''does'' execute <tt>show vlan</tt>!). '''Moral:''' don't forget to <tt>vlan '''''id'''''</tt>. | ||
+ | * <tt>pbs_server</tt> and <tt>pbs_mom</tt> refused to communicate in the following scenario: | ||
+ | ** The server has interfaces on both the old and the new networks | ||
+ | ** The DNS name for the server points to the server's address on the old network | ||
+ | ** A mom is on the new network | ||
+ | ** Packets go: mom -> router -> pbs_server -> mom | ||
+ | ** TORQUE does not like this. So you have to put the new address for the server in the mom's /etc/hosts until DNS changes. | ||
+ | * Likewise with NFS: once DNS for the server changes, be aware of which address packets from multihomed hosts will appear to come from. | ||
+ | * The LDAP server has ACLs that limit access to the Group OU but not (the necessary attributes in) the People OU. So if you don't modify the ACLs, users work but groups don't. | ||
+ | * RCC has a firewall that has to be updated if the PBS submit host (main) changes IPs and if the NFS servers change. | ||
+ | * You have to think really hard about how packets are going to route once you make certain changes, e.g. DNS. And never forget that just because you send them one way does not mean they'll want to come back the same way. | ||
+ | * The only interface in the main-web1 zone with a <tt>defrouter</tt> specified was still not being used as *the* default route because: | ||
+ | ** bigsky had an address on the old public subnet | ||
+ | ** main-web1 had an address on both the old and new public subnets, and <tt>defrouter</tt> specified on the new network | ||
+ | ** Because main-web1 had an interface on the old subnet, it was still using that default route (since it was bigsky's default route) | ||
+ | ** Okay, even more problems: the routing table is completely global, so because I removed bigsky and main-db1's public IPs, they had a default route of 172.18.2.1. Since main-web1 also has a 172.18.2.0/25 interface, it was just picking one or the other default route and so cyberstar connections were frequently failing. I re-added bigsky and main-db1's public IPs so the 172.18.2.1 default route could go away (this is now possible since main-web1's 128.118.200.0/23 interface is down). Probably I should just get rid of the zones. | ||
+ | ** More on this: | ||
+ | *** [https://blogs.oracle.com/stw/entry/solaris_zones_and_networking_common https://blogs.oracle.com/stw/entry/solaris_zones_and_networking_common] | ||
+ | *** [https://blogs.oracle.com/stw/entry/guidelines_on_zones_with_shared https://blogs.oracle.com/stw/entry/guidelines_on_zones_with_shared] | ||
+ | |||
== Initial Configuration == | == Initial Configuration == | ||
Galaxy has its own subnet; this is the configuration that was done to create it: | Galaxy has its own subnet; this is the configuration that was done to create it: | ||
Line 12: | Line 41: | ||
=== switch-cisco-3750-1 === | === switch-cisco-3750-1 === | ||
− | <pre>switch-cisco-3750-1(config)#int vlan 140 | + | <pre>switch-cisco-3750-1(config)#vlan 140 |
+ | switch-cisco-3750-1(config-vlan)#name GALAXY_PUBLIC | ||
+ | switch-cisco-3750-1(config-vlan)#exit | ||
+ | switch-cisco-3750-1(config)#vlan 270 | ||
+ | switch-cisco-3750-1(config-vlan)#name GALAXY_PRIVATE | ||
+ | switch-cisco-3750-1(config-vlan)#exit | ||
+ | switch-cisco-3750-1(config)#int vlan 140 | ||
switch-cisco-3750-1(config-if)#description GALAXY_PUBLIC | switch-cisco-3750-1(config-if)#description GALAXY_PUBLIC | ||
switch-cisco-3750-1(config-if)#no ip address | switch-cisco-3750-1(config-if)#no ip address | ||
Line 21: | Line 56: | ||
switch-cisco-3750-1(config-if)#exit | switch-cisco-3750-1(config-if)#exit | ||
switch-cisco-3750-1(config)#ip route 128.118.250.0 255.255.255.224 10.1.7.2 | switch-cisco-3750-1(config)#ip route 128.118.250.0 255.255.255.224 10.1.7.2 | ||
− | switch-cisco-3750-1(config)#ip route 172.18.2.0 255.255.255. | + | switch-cisco-3750-1(config)#ip route 172.18.2.0 255.255.255.128 10.1.7.2 |
</pre> | </pre> | ||
+ | |||
+ | Also, established connections to <pre>172.18.0.0 0.0.15.255</pre> had to be added to the inbound access-list. | ||
=== switch-dell-powerconnect-6248-1 === | === switch-dell-powerconnect-6248-1 === | ||
Line 42: | Line 79: | ||
switch-dell-powerconnect-1(config)#interface vlan 270 | switch-dell-powerconnect-1(config)#interface vlan 270 | ||
switch-dell-powerconnect-1(config-if-vlan270)#name "GALAXY_PRIVATE" | switch-dell-powerconnect-1(config-if-vlan270)#name "GALAXY_PRIVATE" | ||
− | switch-dell-powerconnect-1(config-if-vlan270)#ip address 172.18.2.1 255.255.255. | + | switch-dell-powerconnect-1(config-if-vlan270)#ip address 172.18.2.1 255.255.255.128 |
switch-dell-powerconnect-1(config-if-vlan270)#routing | switch-dell-powerconnect-1(config-if-vlan270)#routing | ||
switch-dell-powerconnect-1(config-if-vlan270)#no ip redirects | switch-dell-powerconnect-1(config-if-vlan270)#no ip redirects | ||
Line 52: | Line 89: | ||
switch-dell-powerconnect-1(config-if-ch2)#exit | switch-dell-powerconnect-1(config-if-ch2)#exit | ||
switch-dell-powerconnect-1(config)#interface port-channel 4 | switch-dell-powerconnect-1(config)#interface port-channel 4 | ||
− | switch-dell-powerconnect-1(config-if-ch4)#switchport general allowed vlan add 140 tagged | + | switch-dell-powerconnect-1(config-if-ch4)#switchport general allowed vlan add 140,270 tagged |
Warning: The use of large numbers of VLANs or interfaces may cause significant | Warning: The use of large numbers of VLANs or interfaces may cause significant | ||
delays in applying the configuration. | delays in applying the configuration. | ||
Line 60: | Line 97: | ||
delays in applying the configuration. | delays in applying the configuration. | ||
</pre> | </pre> | ||
+ | |||
+ | Also add 140,270 tagged to bigsky, thumper, rochefort, westmalle, orval | ||
=== switch-hp-procurve-8.net.bx.psu.edu === | === switch-hp-procurve-8.net.bx.psu.edu === | ||
Line 74: | Line 113: | ||
<pre># touch /etc/hostname.aggr140001 | <pre># touch /etc/hostname.aggr140001 | ||
+ | # echo 'bigsky.g2.bx.psu.edu mtu 9000' > /etc/hostname.aggr270001 | ||
+ | # cat /dev/null > /etc/hostname.aggr1 | ||
# ifconfig aggr140001 plumb | # ifconfig aggr140001 plumb | ||
+ | # ifconfig aggr270001 plumb | ||
# zonecfg -z main-web1 | # zonecfg -z main-web1 | ||
zonecfg:main-web1> add net | zonecfg:main-web1> add net | ||
zonecfg:main-web1:net> set physical=aggr140001 | zonecfg:main-web1:net> set physical=aggr140001 | ||
zonecfg:main-web1:net> set address=128.118.250.4/27 | zonecfg:main-web1:net> set address=128.118.250.4/27 | ||
+ | zonecfg:main-web1:net> end | ||
+ | zonecfg:main-web1> add net | ||
+ | zonecfg:main-web1:net> set physical=aggr270001 | ||
+ | zonecfg:main-web1:net> set address=172.18.2.20/25 | ||
zonecfg:main-web1:net> end | zonecfg:main-web1:net> end | ||
zonecfg:main-web1> verify | zonecfg:main-web1> verify | ||
zonecfg:main-web1> commit | zonecfg:main-web1> commit | ||
zonecfg:main-web1> exit | zonecfg:main-web1> exit | ||
+ | # echo '172.18.2.0 255.255.255.128' >> /etc/netmasks | ||
+ | # ifconfig aggr270001 plumb 172.18.2.20 netmask + broadcast + up | ||
+ | # ifconfig aggr270001 addif 172.18.2.100/27 zone main-web1 up | ||
+ | # ifconfig aggr270001 addif 172.18.2.101/27 zone main-db1 up | ||
# ifconfig aggr140001 addif 128.118.250.4/27 zone main-web1 | # ifconfig aggr140001 addif 128.118.250.4/27 zone main-web1 | ||
# ifconfig aggr140001:1 up | # ifconfig aggr140001:1 up | ||
</pre> | </pre> | ||
− | + | See Notes above for a discussion of route problems here. | |
+ | |||
+ | === frisell === | ||
+ | |||
+ | Changed interfaces in ESXi, changed IPs in /etc/hosts, deleted public IPs from test-db1, and set defrouter for test-db1. | ||
+ | |||
+ | === rochefort/westmalle/orval === | ||
+ | |||
+ | <pre># dladm create-vlan -l aggr0 -v 140 vlan140 | ||
+ | # dladm create-vlan -l aggr0 -v 270 vlan270 | ||
+ | # ipadm create-if vlan140 | ||
+ | # ipadm create-if vlan270 | ||
+ | # ipadm create-addr -T static -a 128.118.250.XXX/27 vlan140/v4 | ||
+ | # ipadm create-addr -T static -a 172.18.2.XXX/25 vlan270/v4 | ||
+ | </pre> |
Latest revision as of 15:36, 12 March 2012
Notes
Useful stuff:
- /afs/bx.psu.edu/service/rancid/prod/var/bx/config/* contains router configs pulled from the hardware by rancid, super useful.
- Per the HP Procurve 2510G config/admin guide, ports are set to allow jumbos when they are added to a vlan that supports jumbos, and they are set to disallow jumbos when they are removed from a vlan that supports jumbos. Three ports on procurve-8 were not allowing jumbos and I had to move them to a different vlan and back to 270 to make them work.
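One way to confirm that a port really passes jumbos is to send a ping that is not allowed to fragment and whose payload exactly fills the MTU. The arithmetic is sketched below, assuming IPv4 with no IP options (20-byte header) and an 8-byte ICMP echo header. The function name is illustrative, and the ping flags for setting the payload size and forbidding fragmentation vary by OS, so they are not shown.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small to carry an ICMP echo")
    return payload


if __name__ == "__main__":
    # 1500 is the usual Ethernet MTU; 9000 is the jumbo MTU used on vlan 270.
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: payload {icmp_payload_for_mtu(mtu)} bytes")
```

If a ping with that payload size fails on vlan 270 while one sized for a 1500-byte MTU succeeds, some port on the path is probably still refusing jumbos.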
Unexpected problems I encountered:
- show run on the Cisco does not show you the commands used to create the vlans. Since int vlan id can be executed without ever having executed vlan id, the parts of show run relevant to your new vlan will look just like the parts for existing vlans, but your new vlan will not pass traffic. It's not until you execute show vlan that you will realize that your new vlan does not actually exist (the rancid config dump above does execute show vlan!). Moral: don't forget to vlan id.
- pbs_server and pbs_mom refused to communicate in the following scenario:
  - The server has interfaces on both the old and the new networks
  - The DNS name for the server points to the server's address on the old network
  - A mom is on the new network
  - Packets go: mom -> router -> pbs_server -> mom
  - TORQUE does not like this. So you have to put the new address for the server in the mom's /etc/hosts until DNS changes.
- Likewise with NFS: once DNS for the server changes, be aware of which address packets from multihomed hosts will appear to come from.
- The LDAP server has ACLs that limit access to the Group OU but not (the necessary attributes in) the People OU. So if you don't modify the ACLs, users work but groups don't.
- RCC has a firewall that has to be updated if the PBS submit host (main) changes IPs and if the NFS servers change.
- You have to think really hard about how packets are going to route once you make certain changes, e.g. DNS. And never forget that just because you send them one way does not mean they'll want to come back the same way.
- The only interface in the main-web1 zone with a defrouter specified was still not being used as *the* default route because:
  - bigsky had an address on the old public subnet
  - main-web1 had an address on both the old and new public subnets, and defrouter specified on the new network
  - Because main-web1 had an interface on the old subnet, it was still using that default route (since it was bigsky's default route)
  - Okay, even more problems: the routing table is completely global, so because I removed bigsky and main-db1's public IPs, they had a default route of 172.18.2.1. Since main-web1 also has a 172.18.2.0/25 interface, it was just picking one or the other default route and so cyberstar connections were frequently failing. I re-added bigsky and main-db1's public IPs so the 172.18.2.1 default route could go away (this is now possible since main-web1's 128.118.200.0/23 interface is down). Probably I should just get rid of the zones.
  - More on this:
    - https://blogs.oracle.com/stw/entry/solaris_zones_and_networking_common
    - https://blogs.oracle.com/stw/entry/guidelines_on_zones_with_shared
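For the multihomed-host problems above, it can help to ask the kernel which local address it would pick for a given destination before touching DNS. The following is a minimal sketch, not something from the original setup: the function name and the choice of port 9 are arbitrary, and it assumes Python is available on the host in question. Connecting a UDP socket only performs a route lookup and sends no packets.

```python
import socket
import sys


def source_address_for(dest_ip: str, port: int = 9) -> str:
    """Return the local IPv4 address the kernel would use to reach dest_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # UDP connect() binds the socket to the outgoing interface chosen by
        # the routing table; nothing is transmitted.
        sock.connect((dest_ip, port))
        return sock.getsockname()[0]


if __name__ == "__main__":
    # Usage: python source_addr.py <dest-ip> [<dest-ip> ...]
    for dest in sys.argv[1:]:
        print(f"{dest}: packets would leave from {source_address_for(dest)}")
```

This only answers the outbound half of the question. The return path is decided by the other host's routing table, which is exactly the asymmetry that upset TORQUE.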
Initial Configuration
Galaxy has its own subnet; this is the configuration that was done to create it:
asa
ciscoasa(config)# access-list Outside_access_in extended permit ip any 128.118.250.0 255.255.255.224
ciscoasa(config)# route Bioinformatics 128.118.250.0 255.255.255.224 172.28.90.18 1
ciscoasa(config)# nat (Bioinformatics) 1 172.18.0.0 255.255.240.0
ciscoasa(config)# route Bioinformatics 172.18.0.0 255.255.240.0 172.28.90.18 1
switch-cisco-3750-1
switch-cisco-3750-1(config)#vlan 140
switch-cisco-3750-1(config-vlan)#name GALAXY_PUBLIC
switch-cisco-3750-1(config-vlan)#exit
switch-cisco-3750-1(config)#vlan 270
switch-cisco-3750-1(config-vlan)#name GALAXY_PRIVATE
switch-cisco-3750-1(config-vlan)#exit
switch-cisco-3750-1(config)#int vlan 140
switch-cisco-3750-1(config-if)#description GALAXY_PUBLIC
switch-cisco-3750-1(config-if)#no ip address
switch-cisco-3750-1(config-if)#exit
switch-cisco-3750-1(config)#int vlan 270
switch-cisco-3750-1(config-if)#description GALAXY_PRIVATE
switch-cisco-3750-1(config-if)#no ip address
switch-cisco-3750-1(config-if)#exit
switch-cisco-3750-1(config)#ip route 128.118.250.0 255.255.255.224 10.1.7.2
switch-cisco-3750-1(config)#ip route 172.18.2.0 255.255.255.128 10.1.7.2

Also, established connections to 172.18.0.0 0.0.15.255 had to be added to the inbound access-list.
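The configs on this page mix netmasks (ASA, static routes) with a wildcard mask (the access-list entry), so it is worth checking that they describe the networks you expect. The sketch below uses Python's standard ipaddress module, which accepts both notations; the variable names are illustrative and the addresses are copied from the commands above.

```python
import ipaddress

# Netmask form, as used in the ASA nat/route commands.
asa_private = ipaddress.ip_network("172.18.0.0/255.255.240.0")
# Wildcard (host mask) form, as used in the Cisco access-list entry.
acl_private = ipaddress.ip_network("172.18.0.0/0.0.15.255")
# The two Galaxy subnets routed via 10.1.7.2.
galaxy_public = ipaddress.ip_network("128.118.250.0/255.255.255.224")
galaxy_private = ipaddress.ip_network("172.18.2.0/255.255.255.128")

if __name__ == "__main__":
    print("ASA private range: ", asa_private)
    print("ACL private range: ", acl_private)
    print("GALAXY_PUBLIC:     ", galaxy_public)
    print("GALAXY_PRIVATE:    ", galaxy_private)
    # The private Galaxy subnet should fall inside the range that is NATed.
    print("private inside NAT range:", galaxy_private.subnet_of(asa_private))
```

The netmask 255.255.240.0 and the wildcard 0.0.15.255 are bitwise complements, so both name the same /20.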
switch-dell-powerconnect-6248-1
switch-dell-powerconnect-1(config)#vlan database
switch-dell-powerconnect-1(config-vlan)#vlan 140
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.
switch-dell-powerconnect-1(config-vlan)#vlan 270
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.
switch-dell-powerconnect-1(config-vlan)#exit
switch-dell-powerconnect-1(config)#interface vlan 140
switch-dell-powerconnect-1(config-if-vlan140)#name "GALAXY_PUBLIC"
switch-dell-powerconnect-1(config-if-vlan140)#ip address 128.118.250.1 255.255.255.224
switch-dell-powerconnect-1(config-if-vlan140)#routing
switch-dell-powerconnect-1(config-if-vlan140)#no ip redirects
switch-dell-powerconnect-1(config-if-vlan140)#exit
switch-dell-powerconnect-1(config)#interface vlan 270
switch-dell-powerconnect-1(config-if-vlan270)#name "GALAXY_PRIVATE"
switch-dell-powerconnect-1(config-if-vlan270)#ip address 172.18.2.1 255.255.255.128
switch-dell-powerconnect-1(config-if-vlan270)#routing
switch-dell-powerconnect-1(config-if-vlan270)#no ip redirects
switch-dell-powerconnect-1(config-if-vlan270)#exit
switch-dell-powerconnect-1(config)#interface port-channel 2
switch-dell-powerconnect-1(config-if-ch2)#switchport general allowed vlan add 140,270 tagged
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.
switch-dell-powerconnect-1(config-if-ch2)#exit
switch-dell-powerconnect-1(config)#interface port-channel 4
switch-dell-powerconnect-1(config-if-ch4)#switchport general allowed vlan add 140,270 tagged
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.

switch-dell-powerconnect-1(config)#interface port-channel 1
switch-dell-powerconnect-1(config-if-ch1)#switchport general allowed vlan add 270 tagged
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.
Also add 140,270 tagged to bigsky, thumper, rochefort, westmalle, orval
switch-hp-procurve-8.net.bx.psu.edu
switch-hp-procurve-8(config)# vlan 270
switch-hp-procurve-8(vlan-270)# name GALAXY_PRIVATE
String GALAXY_PR... too long. Allowed length is 12.
switch-hp-procurve-8(vlan-270)# name GALAXY_PRIV
switch-hp-procurve-8(vlan-270)# tagged trk1
switch-hp-procurve-8(vlan-270)# exit
bigsky
# touch /etc/hostname.aggr140001
# echo 'bigsky.g2.bx.psu.edu mtu 9000' > /etc/hostname.aggr270001
# cat /dev/null > /etc/hostname.aggr1
# ifconfig aggr140001 plumb
# ifconfig aggr270001 plumb
# zonecfg -z main-web1
zonecfg:main-web1> add net
zonecfg:main-web1:net> set physical=aggr140001
zonecfg:main-web1:net> set address=128.118.250.4/27
zonecfg:main-web1:net> end
zonecfg:main-web1> add net
zonecfg:main-web1:net> set physical=aggr270001
zonecfg:main-web1:net> set address=172.18.2.20/25
zonecfg:main-web1:net> end
zonecfg:main-web1> verify
zonecfg:main-web1> commit
zonecfg:main-web1> exit
# echo '172.18.2.0 255.255.255.128' >> /etc/netmasks
# ifconfig aggr270001 plumb 172.18.2.20 netmask + broadcast + up
# ifconfig aggr270001 addif 172.18.2.100/27 zone main-web1 up
# ifconfig aggr270001 addif 172.18.2.101/27 zone main-db1 up
# ifconfig aggr140001 addif 128.118.250.4/27 zone main-web1
# ifconfig aggr140001:1 up
See Notes above for a discussion of route problems here.
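Note that the transcript adds the zone addresses on aggr270001 with a /27 prefix, while /etc/netmasks and the switch define 172.18.2.0 as a /25. Whether that was intentional is not recorded here. The sketch below only illustrates what the difference would mean if the /27 took effect: the helper name is hypothetical, and the addresses are taken from the commands above.

```python
import ipaddress


def on_link(interface_cidr: str, peer: str) -> bool:
    """True if peer falls inside the subnet implied by interface_cidr."""
    network = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(peer) in network


if __name__ == "__main__":
    router = "172.18.2.1"  # the Dell switch's vlan 270 address
    for cidr in ("172.18.2.20/25", "172.18.2.100/27", "172.18.2.101/27"):
        network = ipaddress.ip_interface(cidr).network
        print(f"{cidr}: subnet {network}, router on-link: {on_link(cidr, router)}")
```

With a /27 prefix, 172.18.2.100 sits in 172.18.2.96/27, which does not contain the router at 172.18.2.1, so this is one more thing to check when default routes behave unexpectedly.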
frisell
Changed interfaces in ESXi, changed IPs in /etc/hosts, deleted public IPs from test-db1, and set defrouter for test-db1.
rochefort/westmalle/orval
# dladm create-vlan -l aggr0 -v 140 vlan140
# dladm create-vlan -l aggr0 -v 270 vlan270
# ipadm create-if vlan140
# ipadm create-if vlan270
# ipadm create-addr -T static -a 128.118.250.XXX/27 vlan140/v4
# ipadm create-addr -T static -a 172.18.2.XXX/25 vlan270/v4