BXadmin:Network/Galaxy

From CCGB
Revision as of 17:32, 24 February 2012 by Nate (talk | contribs) (Notes)

Notes

Useful stuff:

  • /afs/bx.psu.edu/service/rancid/prod/var/bx/config/* contains router configs pulled from the hardware by rancid, super useful.

Unexpected problems I encountered:

  • show run on the Cisco does not show the commands used to create VLANs. Since int vlan id can be executed without vlan id ever having been executed, the parts of show run relevant to your new VLAN will look just like the parts for existing VLANs, but your new VLAN will not pass traffic. It's not until you execute show vlan that you realize your new VLAN does not actually exist (the rancid config dump above does execute show vlan!). Moral: don't forget to run vlan id.
  • pbs_server and pbs_mom refused to communicate in the following scenario:
    • The server has interfaces on both the old and the new networks
    • The DNS name for the server points to the server's address on the old network
    • A mom is on the new network
    • Packets go: mom -> router -> pbs_server -> mom
    • TORQUE does not like this. So you have to put the new address for the server in the mom's /etc/hosts until DNS changes.
  • Likewise with NFS, be aware of where your packets will appear to come from on multihomed hosts once DNS for the server changes.
  • The LDAP server has ACLs that limit access to the Group OU but not (the necessary attributes in) the People OU. So if you don't modify the ACLs, users work but groups don't.
  • RCC has a firewall that has to be updated if the PBS submit host (main) changes IPs or if the NFS servers change.
  • You have to think really hard about how packets are going to route once you make certain changes, e.g. DNS. And never forget that just because you send them one way does not mean they'll want to come back the same way.
  • The only interface in the main-web1 zone with a defrouter specified was still not being used as *the* default route because:
    • bigsky had an address on the old public subnet
    • main-web1 had an address on both the old and new public subnets, and defrouter specified on the new network
    • Because main-web1 had an interface on the old subnet, it was still using that default route (since it was bigsky's default route)
    • Okay, even more problems: the routing table is completely global, so because I removed bigsky and main-db1's public IPs, they had a default route of 172.18.2.1. Since main-web1 also has a 172.18.2.0/25 interface, it was just picking one or the other default route, and so cyberstar connections were frequently failing. I re-added bigsky and main-db1's public IPs so the 172.18.2.1 default route could go away (this is now possible since main-web1's 128.118.200.0/23 interface is down). Probably I should just get rid of the zones.
    • More on this:

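The multihomed-host notes above (TORQUE, NFS, return paths) come down to two questions: what address does the server's name resolve to, and which local address will this host use as the source when talking to it? The following is a minimal sketch of that check in Python, using only the standard library. The function names source_address_for and check_path are made up for illustration, and the loopback address in the demo is only there so the example runs anywhere; on a real host you would pass the server's DNS name and the subnet you expect both ends to be on.

```python
import ipaddress
import socket


def source_address_for(dest_ip: str, port: int = 9) -> str:
    """Return the local IP the kernel would pick as source for dest_ip.

    Connecting a UDP socket sends no packets; it only makes the kernel
    consult its routing table and bind the matching local address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((dest_ip, port))
        return s.getsockname()[0]


def check_path(server_name: str, expected_net: str) -> dict:
    """Resolve server_name and report whether both ends sit in expected_net."""
    net = ipaddress.ip_network(expected_net)
    server_ip = socket.gethostbyname(server_name)  # IPv4 lookup (hosts file or DNS)
    local_ip = source_address_for(server_ip)
    return {
        "server_ip": server_ip,
        "local_ip": local_ip,
        "server_in_net": ipaddress.ip_address(server_ip) in net,
        "local_in_net": ipaddress.ip_address(local_ip) in net,
    }


if __name__ == "__main__":
    # Placeholder values: a real check would use the server's DNS name
    # and the new network, not loopback.
    print(check_path("127.0.0.1", "127.0.0.0/8"))
```

If server_in_net and local_in_net disagree, the request and the reply are likely to take different paths, which is the situation described in the notes above.
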
Initial Configuration

Galaxy has its own subnet; this is the configuration that was done to create it:

asa

ciscoasa(config)# access-list Outside_access_in extended permit ip any 128.118.250.0 255.255.255.224 
ciscoasa(config)# route Bioinformatics 128.118.250.0 255.255.255.224 172.28.90.18 1
ciscoasa(config)# nat (Bioinformatics) 1 172.18.0.0 255.255.240.0
ciscoasa(config)# route Bioinformatics 172.18.0.0 255.255.240.0 172.28.90.18 1

switch-cisco-3750-1

switch-cisco-3750-1(config)#vlan 140
switch-cisco-3750-1(config-vlan)#name GALAXY_PUBLIC
switch-cisco-3750-1(config-vlan)#exit
switch-cisco-3750-1(config)#vlan 270
switch-cisco-3750-1(config-vlan)#name GALAXY_PRIVATE
switch-cisco-3750-1(config-vlan)#exit
switch-cisco-3750-1(config)#int vlan 140
switch-cisco-3750-1(config-if)#description GALAXY_PUBLIC
switch-cisco-3750-1(config-if)#no ip address
switch-cisco-3750-1(config-if)#exit
switch-cisco-3750-1(config)#int vlan 270
switch-cisco-3750-1(config-if)#description GALAXY_PRIVATE
switch-cisco-3750-1(config-if)#no ip address
switch-cisco-3750-1(config-if)#exit
switch-cisco-3750-1(config)#ip route 128.118.250.0 255.255.255.224 10.1.7.2
switch-cisco-3750-1(config)#ip route 172.18.2.0 255.255.255.128 10.1.7.2

Also, established connections to 172.18.0.0 0.0.15.255 had to be added to the inbound access-list.
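
The access-list entry above is written with a wildcard (inverse) mask, while the ASA lines use a regular netmask for what appears to be the same range. A quick way to confirm that the two notations agree, and that the Galaxy private subnet falls inside the range, is Python's standard ipaddress module, which accepts either form after the slash. This is only a sanity-check sketch; the variable names are illustrative.

```python
import ipaddress

# Address and wildcard mask as written in the access-list note.
acl_range = ipaddress.ip_network("172.18.0.0/0.0.15.255")

# The same address with the netmask used in the ASA nat/route lines.
asa_range = ipaddress.ip_network("172.18.0.0/255.255.240.0")

# The Galaxy private subnet from the switch route statement.
galaxy_private = ipaddress.ip_network("172.18.2.0/255.255.255.128")

print(acl_range)                            # 172.18.0.0/20
print(acl_range == asa_range)               # True
print(galaxy_private.subnet_of(acl_range))  # True
```
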

switch-dell-powerconnect-6248-1

switch-dell-powerconnect-1(config)#vlan database 
switch-dell-powerconnect-1(config-vlan)#vlan 140
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.
switch-dell-powerconnect-1(config-vlan)#vlan 270
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.
switch-dell-powerconnect-1(config-vlan)#exit
switch-dell-powerconnect-1(config)#interface vlan 140 
switch-dell-powerconnect-1(config-if-vlan140)#name "GALAXY_PUBLIC"
switch-dell-powerconnect-1(config-if-vlan140)#ip address 128.118.250.1 255.255.255.224
switch-dell-powerconnect-1(config-if-vlan140)#routing
switch-dell-powerconnect-1(config-if-vlan140)#no ip redirects
switch-dell-powerconnect-1(config-if-vlan140)#exit
switch-dell-powerconnect-1(config)#interface vlan 270
switch-dell-powerconnect-1(config-if-vlan270)#name "GALAXY_PRIVATE"
switch-dell-powerconnect-1(config-if-vlan270)#ip address 172.18.2.1 255.255.255.128
switch-dell-powerconnect-1(config-if-vlan270)#routing
switch-dell-powerconnect-1(config-if-vlan270)#no ip redirects
switch-dell-powerconnect-1(config-if-vlan270)#exit
switch-dell-powerconnect-1(config)#interface port-channel 2
switch-dell-powerconnect-1(config-if-ch2)#switchport general allowed vlan add 140,270 tagged
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.
switch-dell-powerconnect-1(config-if-ch2)#exit
switch-dell-powerconnect-1(config)#interface port-channel 4
switch-dell-powerconnect-1(config-if-ch4)#switchport general allowed vlan add 140,270 tagged
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.
switch-dell-powerconnect-1(config-if-ch4)#exit
switch-dell-powerconnect-1(config)#interface port-channel 1
switch-dell-powerconnect-1(config-if-ch1)#switchport general allowed vlan add 270 tagged
Warning: The use of large numbers of VLANs or interfaces may cause significant
delays in applying the configuration.

Also add VLANs 140 and 270 as tagged for bigsky, thumper, rochefort, westmalle, and orval.

switch-hp-procurve-8.net.bx.psu.edu

switch-hp-procurve-8(config)# vlan 270
switch-hp-procurve-8(vlan-270)# name GALAXY_PRIVATE
String GALAXY_PR... too long. Allowed length is 12.
switch-hp-procurve-8(vlan-270)# name GALAXY_PRIV
switch-hp-procurve-8(vlan-270)# tagged trk1
switch-hp-procurve-8(vlan-270)# exit

bigsky

# touch /etc/hostname.aggr140001
# echo 'bigsky.g2.bx.psu.edu mtu 9000' > /etc/hostname.aggr270001
# cat /dev/null > /etc/hostname.aggr1
# ifconfig aggr140001 plumb
# ifconfig aggr270001 plumb
# zonecfg -z main-web1
zonecfg:main-web1> add net
zonecfg:main-web1:net> set physical=aggr140001
zonecfg:main-web1:net> set address=128.118.250.4/27
zonecfg:main-web1:net> end
zonecfg:main-web1> add net
zonecfg:main-web1:net> set physical=aggr270001
zonecfg:main-web1:net> set address=172.18.2.20/25
zonecfg:main-web1:net> end
zonecfg:main-web1> verify
zonecfg:main-web1> commit
zonecfg:main-web1> exit
# echo '172.18.2.0        255.255.255.128' >> /etc/netmasks
# ifconfig aggr270001 plumb 172.18.2.20 netmask + broadcast + up
# ifconfig aggr270001 addif 172.18.2.100/27 zone main-web1 up
# ifconfig aggr270001 addif 172.18.2.101/27 zone main-db1 up
# ifconfig aggr140001 addif 128.118.250.4/27 zone main-web1
# ifconfig aggr140001:1 up

See Notes above for a discussion of route problems here.

frisell

Changed interfaces in ESXi, changed IPs in /etc/hosts, deleted public IPs from test-db1, and set defrouter for test-db1.

rochefort/westmalle/orval

# dladm create-vlan -l aggr0 -v 140 vlan140
# dladm create-vlan -l aggr0 -v 270 vlan270
# ipadm create-if vlan140
# ipadm create-if vlan270
# ipadm create-addr -T static -a 128.118.250.XXX/27 vlan140/v4
# ipadm create-addr -T static -a 172.18.2.XXX/25 vlan270/v4
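
The XXX placeholders above are per-host values and are left as written. As a rough aid when choosing them, the sketch below (Python, standard library only) prints the usable host range of each Galaxy subnet and checks whether a candidate last octet is assignable. The helper names usable_range and valid_last_octet are made up for illustration, and the code assumes each subnet fits inside a single /24, which holds for the /27 and /25 used here. The .1 address in each subnet is passed as reserved because the Dell switch's VLAN interfaces use it.

```python
import ipaddress


def usable_range(cidr: str):
    """Return (first, last) usable host addresses of the subnet."""
    net = ipaddress.ip_network(cidr)
    hosts = list(net.hosts())
    return hosts[0], hosts[-1]


def valid_last_octet(cidr: str, octet: int, reserved=()) -> bool:
    """True if replacing XXX with `octet` gives an assignable address.

    Assumes the subnet lies within a single /24.
    """
    net = ipaddress.ip_network(cidr)
    if not 0 <= octet <= 255:
        return False
    base = int(net.network_address) & 0xFFFFFF00
    candidate = ipaddress.ip_address(base | octet)
    if candidate in (net.network_address, net.broadcast_address):
        return False
    return candidate in net and str(candidate) not in reserved


if __name__ == "__main__":
    for cidr, gateway in (("128.118.250.0/27", "128.118.250.1"),
                          ("172.18.2.0/25", "172.18.2.1")):
        first, last = usable_range(cidr)
        print(f"{cidr}: {first} - {last}")
        print("  .1 assignable?", valid_last_octet(cidr, 1, reserved=(gateway,)))
```
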