Verifying 1st party NEXT_HOP


At JANOG42, which I attended the other day, there was a session on the BGP NEXT_HOP attribute.

BGP NEXT_HOP Attribute

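In it, a topic called 1st Party NEXT_HOP starts at P35 of the slides. It was behavior I had never paid any attention to, so I got curious and asked the presenter, Tsuchiya-san, who told me: "Cisco rewrites it on its own even without next-hop-unchanged."
What!? So Arista and Cisco behave differently!?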

So I set up several BGP-speaking nodes in my GNS3 environment and checked how they actually behave.

What to prepare: two advertising routers

First, prepare the nodes corresponding to R1 and R3 in the slides. Anything works here, so I used the familiar IOS.

As the node corresponding to R1 in the slides, I set up R1 (AS65001, advertised prefix 1.1.1.0/24) with the following configuration.

interface Loopback0
 ip address 1.1.1.1 255.255.255.0
interface FastEthernet0/0
 ip address 192.168.0.1 255.255.255.0
 no shutdown
router bgp 65001
 network 1.1.1.0 mask 255.255.255.0
 neighbor 192.168.0.10 remote-as 65010
 neighbor 192.168.0.20 remote-as 65020
 neighbor 192.168.0.30 remote-as 65030
 neighbor 192.168.0.40 remote-as 65040
 neighbor 192.168.0.50 remote-as 65050

As the node corresponding to R3 in the slides, I set up R2 (AS65002, advertised prefix 2.2.2.0/24) with the following configuration.

interface Loopback0
 ip address 2.2.2.2 255.255.255.0
interface FastEthernet0/0
 ip address 192.168.0.2 255.255.255.0
router bgp 65002
 network 2.2.2.0 mask 255.255.255.0
 neighbor 192.168.0.10 remote-as 65010
 neighbor 192.168.0.20 remote-as 65020
 neighbor 192.168.0.30 remote-as 65030
 neighbor 192.168.0.40 remote-as 65040
 neighbor 192.168.0.50 remote-as 65050

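With that ready, I started the test in a topology where each relay node has eBGP sessions with different peers on the same segment. The connection layout looks like this.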

IOS

As AS65010, this node relays 1.1.1.0/24 and 2.2.2.0/24.

interface Loopback0
 ip address 10.10.10.10 255.255.255.255
interface FastEthernet0/0
 ip address 192.168.0.10 255.255.255.0
 no shutdown
router bgp 65010
 neighbor 192.168.0.1 remote-as 65001
 neighbor 192.168.0.2 remote-as 65002

Here are the BGP table and routing table entries on the receiving R1 and R2.

R1:

R1#show ip bgp 2.2.2.0/24
BGP routing table entry for 2.2.2.0/24, version 3
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 2
  65010 65002
    192.168.0.2 from 192.168.0.10 (10.10.10.10)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R1#show ip route 2.2.2.0
Routing entry for 2.2.2.0/24
  Known via "bgp 65001", distance 20, metric 0
  Tag 65010, type external
  Last update from 192.168.0.2 00:03:26 ago
  Routing Descriptor Blocks:
  * 192.168.0.2, from 192.168.0.10, 00:03:26 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65010
      MPLS label: none

R2:

R2#show ip bgp 1.1.1.0/24
BGP routing table entry for 1.1.1.0/24, version 2
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 2
  65010 65001
    192.168.0.1 from 192.168.0.10 (10.10.10.10)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R2#show ip route 1.1.1.0
Routing entry for 1.1.1.0/24
  Known via "bgp 65002", distance 20, metric 0
  Tag 65010, type external
  Last update from 192.168.0.1 00:04:14 ago
  Routing Descriptor Blocks:
  * 192.168.0.1, from 192.168.0.10, 00:04:14 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65010
      MPLS label: none

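Just as I was told, the BGP NEXT_HOP is not the address of the relaying AS65010 router (192.168.0.10) but the address of the originating router (AS65001/AS65002). As a result, the routing table also shows 192.168.0.1 or 192.168.0.2 as the next hop.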
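Comparing the NEXT_HOP with the relaying peer by eye gets tedious across several platforms, so here is a minimal Python sketch that pulls both addresses out of the `show ip bgp <prefix>` output. The function names and the regular expression are my own assumptions based only on the output format shown above, not an official parser.

```python
import re

# Path line of IOS "show ip bgp <prefix>" output, for example:
#   "    192.168.0.2 from 192.168.0.10 (10.10.10.10)"
# The pattern is an assumption derived from the output in this post.
PATH_LINE = re.compile(
    r"^\s*(?P<next_hop>\d+(?:\.\d+){3}) from (?P<peer>\d+(?:\.\d+){3}) "
    r"\((?P<router_id>\d+(?:\.\d+){3})\)\s*$"
)


def next_hop_and_peer(show_ip_bgp_output):
    """Return (next_hop, peer) from the first path line found."""
    for line in show_ip_bgp_output.splitlines():
        match = PATH_LINE.match(line)
        if match:
            return match["next_hop"], match["peer"]
    raise ValueError("no path line found")


def describe(show_ip_bgp_output):
    """Say whether the NEXT_HOP is the relaying peer or another router."""
    next_hop, peer = next_hop_and_peer(show_ip_bgp_output)
    if next_hop == peer:
        return f"NEXT_HOP {next_hop} is the relaying peer itself"
    return f"NEXT_HOP {next_hop} differs from the relaying peer {peer}"


SAMPLE = """\
BGP routing table entry for 2.2.2.0/24, version 3
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 2
  65010 65002
    192.168.0.2 from 192.168.0.10 (10.10.10.10)
      Origin IGP, localpref 100, valid, external, best
"""

print(describe(SAMPLE))
# NEXT_HOP 192.168.0.2 differs from the relaying peer 192.168.0.10
```

The same function can be fed the R1/R2 output for each relay node in turn.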

Incidentally, running debug ip bgp updates out on the AS65010 IOS router shows the message "NEXT_HOP is on same subnet as the bgp peer and set to 192.168.0.2 for net 2.2.2.0/24", so there seems to be no doubt that IOS is setting this on its own.

BGP(0): 192.168.0.1 NEXT_HOP is on same subnet as the bgp peer and set to 192.168.0.2 for net 2.2.2.0/24, flags 200, sb: C0A80000, mask: FFFFFF00
BGP(0): (base) 192.168.0.1 send UPDATE (format) 2.2.2.0/24, next 192.168.0.2, metric 0, path 65002
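The `sb` and `mask` fields at the end of the first debug line look like the shared subnet and its mask in hex. That reading is my assumption, not something taken from Cisco documentation. A small Python sketch under that assumption decodes them and checks that both the peer and the chosen NEXT_HOP sit inside that subnet:

```python
import ipaddress
import re

# Pattern for the IOS debug line shown above. Interpreting "sb" as the
# subnet base address is an assumption made for this sketch.
DEBUG_LINE = re.compile(
    r"BGP\(\d+\): (?P<peer>\S+) NEXT_HOP is on same subnet as the bgp peer "
    r"and set to (?P<next_hop>\S+) for net (?P<prefix>[^,\s]+), "
    r"flags (?P<flags>\S+), sb: (?P<sb>[0-9A-Fa-f]{8}), "
    r"mask: (?P<mask>[0-9A-Fa-f]{8})"
)


def parse_debug_line(line):
    """Decode peer, NEXT_HOP, prefix and the hex subnet/mask fields."""
    match = DEBUG_LINE.search(line)
    if match is None:
        raise ValueError("not a NEXT_HOP debug line")
    base = ipaddress.IPv4Address(int(match["sb"], 16))
    mask = ipaddress.IPv4Address(int(match["mask"], 16))
    network = ipaddress.IPv4Network(f"{base}/{mask}")
    peer = ipaddress.IPv4Address(match["peer"])
    next_hop = ipaddress.IPv4Address(match["next_hop"])
    return {
        "peer": peer,
        "next_hop": next_hop,
        "prefix": ipaddress.IPv4Network(match["prefix"]),
        "network": network,
        "both_on_subnet": peer in network and next_hop in network,
    }


LINE = (
    "BGP(0): 192.168.0.1 NEXT_HOP is on same subnet as the bgp peer and set "
    "to 192.168.0.2 for net 2.2.2.0/24, flags 200, sb: C0A80000, mask: FFFFFF00"
)
info = parse_debug_line(LINE)
print(info["network"], info["both_on_subnet"])
# 192.168.0.0/24 True
```

C0A80000 decodes to 192.168.0.0 and FFFFFF00 to 255.255.255.0, which matches the segment used in this lab.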

IOS-XRv

As AS65020, this node relays 1.1.1.0/24 and 2.2.2.0/24.

RP/0/0/CPU0:ios(config)#show config
Tue Jul 17 06:01:25.660 UTC
Building configuration...
!! IOS XR Configuration 6.1.2
interface Loopback0
 ipv4 address 20.20.20.20 255.255.255.255
!
interface GigabitEthernet0/0/0/0
 ipv4 address 192.168.0.20 255.255.255.0
 no shutdown
!
!
route-policy PASS
  pass
end-policy
!
router bgp 65020
 address-family ipv4 unicast
 !
 neighbor 192.168.0.1
  remote-as 65001
  address-family ipv4 unicast
   route-policy PASS in
   route-policy PASS out
  !
 !
 neighbor 192.168.0.2
  remote-as 65002
  address-family ipv4 unicast
   route-policy PASS in
   route-policy PASS out
  !
 !
!
end

Here are the BGP table and routing table entries on the receiving R1 and R2.

R1:

R1#show ip bgp 2.2.2.0/24
BGP routing table entry for 2.2.2.0/24, version 7
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65020 65002
    192.168.0.2 from 192.168.0.20 (20.20.20.20)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R1#show ip route 2.2.2.0
Routing entry for 2.2.2.0/24
  Known via "bgp 65001", distance 20, metric 0
  Tag 65020, type external
  Last update from 192.168.0.2 00:01:04 ago
  Routing Descriptor Blocks:
  * 192.168.0.2, from 192.168.0.20, 00:01:04 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65020
      MPLS label: none

R2:

R2#show ip bgp 1.1.1.0/24
BGP routing table entry for 1.1.1.0/24, version 6
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65020 65001
    192.168.0.1 from 192.168.0.20 (20.20.20.20)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R2#show ip route 1.1.1.0
Routing entry for 1.1.1.0/24
  Known via "bgp 65002", distance 20, metric 0
  Tag 65020, type external
  Last update from 192.168.0.1 00:00:29 ago
  Routing Descriptor Blocks:
  * 192.168.0.1, from 192.168.0.20, 00:00:29 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65020
      MPLS label: none

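Here too, just as I was told, the BGP NEXT_HOP is not the address of the relaying AS65020 router (192.168.0.20) but the address of the originating router (AS65001/AS65002). As a result, the routing table also shows 192.168.0.1 or 192.168.0.2 as the next hop.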

vEOS

As AS65030, this node relays 1.1.1.0/24 and 2.2.2.0/24.

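Incidentally, EOS supports two operating styles: one where changes take effect immediately, as in IOS, and a commit-based one, as in Junos or IOS-XR. If you want to check the diff before applying, commit-based is the way to go!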

This is summarized in the following EOS Central entry (also written by the presenter, Tsuchiya-san).

Arista Community

The configuration is as follows.

localhost(config-s-1)#show session-config diffs
--- system:/running-config
+++ session:/1-session-config
@@ -32,8 +32,20 @@
 !
 interface Ethernet12
 !
+interface Loopback0
+   ip address 30.30.30.30/32
+!
 interface Management1
 !
-no ip routing
+interface Vlan1
+   ip address 192.168.0.30/24
+!
+ip routing
+!
+router bgp 65030
+   neighbor 192.168.0.1 remote-as 65001
+   neighbor 192.168.0.1 maximum-routes 12000
+   neighbor 192.168.0.2 remote-as 65002
+   neighbor 192.168.0.2 maximum-routes 12000
 !
 end

Here are the BGP table and routing table entries on the receiving R1 and R2.

R1:

R1#show ip bgp 2.2.2.0/24
BGP routing table entry for 2.2.2.0/24, version 3
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65030 65002
    192.168.0.30 from 192.168.0.30 (30.30.30.30)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R1#show ip route 2.2.2.0
Routing entry for 2.2.2.0/24
  Known via "bgp 65001", distance 20, metric 0
  Tag 65030, type external
  Last update from 192.168.0.30 00:00:28 ago
  Routing Descriptor Blocks:
  * 192.168.0.30, from 192.168.0.30, 00:00:28 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65030
      MPLS label: none

R2:

R2#show ip bgp 1.1.1.0/24
BGP routing table entry for 1.1.1.0/24, version 3
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65030 65001
    192.168.0.30 from 192.168.0.30 (30.30.30.30)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R2#show ip route 1.1.1.0
Routing entry for 1.1.1.0/24
  Known via "bgp 65002", distance 20, metric 0
  Tag 65030, type external
  Last update from 192.168.0.30 00:01:31 ago
  Routing Descriptor Blocks:
  * 192.168.0.30, from 192.168.0.30, 00:01:31 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65030
      MPLS label: none

This also matches what I was told: the BGP NEXT_HOP is the address of the relaying AS65030 router (192.168.0.30).

vSRX

As AS65040, this node relays 1.1.1.0/24 and 2.2.2.0/24.

root> show configuration | display set
set version 12.1X47-D20.7
(omitted)
set interfaces ge-0/0/0 unit 0 family inet address 192.168.0.40/24
set routing-options router-id 40.40.40.40
set routing-options autonomous-system 65040
set protocols bgp group ebgp type external
set protocols bgp group ebgp neighbor 192.168.0.1 peer-as 65001
set protocols bgp group ebgp neighbor 192.168.0.2 peer-as 65002
set security forwarding-options family inet6 mode packet-based
set security forwarding-options family mpls mode packet-based

Here are the BGP table and routing table entries on the receiving R1 and R2.

R1:

R1#show ip bgp 2.2.2.0/24
BGP routing table entry for 2.2.2.0/24, version 3
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65040 65002
    192.168.0.2 from 192.168.0.40 (40.40.40.40)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R1#show ip route 2.2.2.0
Routing entry for 2.2.2.0/24
  Known via "bgp 65001", distance 20, metric 0
  Tag 65040, type external
  Last update from 192.168.0.2 00:00:38 ago
  Routing Descriptor Blocks:
  * 192.168.0.2, from 192.168.0.40, 00:00:38 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65040
      MPLS label: none

R2:

R2#show ip bgp 1.1.1.0/24
BGP routing table entry for 1.1.1.0/24, version 2
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65040 65001
    192.168.0.1 from 192.168.0.40 (40.40.40.40)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R2#show ip route 1.1.1.0
Routing entry for 1.1.1.0/24
  Known via "bgp 65002", distance 20, metric 0
  Tag 65040, type external
  Last update from 192.168.0.1 00:00:20 ago
  Routing Descriptor Blocks:
  * 192.168.0.1, from 192.168.0.40, 00:00:20 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65040
      MPLS label: none

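Oh? As with IOS and IOS-XR, the BGP NEXT_HOP is not the address of the relaying AS65040 router (192.168.0.40) but the address of the originating router (AS65001/AS65002).
Maybe Junos belongs to the "rewrites it on its own" group.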

BIRD

BIRD is a software router that can handle a variety of routing protocols. It is released as a GNS3 appliance, so it is fairly easy to use. It's lightweight too, which is nice.


As AS65050, this node relays 1.1.1.0/24 and 2.2.2.0/24.

First, configure an address on the interface and confirm reachability to R1/R2.

gns3@box:~$ sudo ifconfig eth0 192.168.0.50 netmask 255.255.255.0 up
gns3@box:~$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 0C:62:34:9B:AB:00
          inet addr:192.168.0.50  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::e62:34ff:fe9b:ab00/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:67 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4020 (3.9 KiB)  TX bytes:818 (818.0 B)
gns3@box:~$ ping -c 3 192.168.0.1
PING 192.168.0.1 (192.168.0.1): 56 data bytes
64 bytes from 192.168.0.1: seq=0 ttl=255 time=0.983 ms
64 bytes from 192.168.0.1: seq=1 ttl=255 time=0.818 ms
64 bytes from 192.168.0.1: seq=2 ttl=255 time=1.783 ms

--- 192.168.0.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.818/1.194/1.783 ms
gns3@box:~$ ping -c 3 192.168.0.2
PING 192.168.0.2 (192.168.0.2): 56 data bytes
64 bytes from 192.168.0.2: seq=0 ttl=255 time=0.880 ms
64 bytes from 192.168.0.2: seq=1 ttl=255 time=0.723 ms
64 bytes from 192.168.0.2: seq=2 ttl=255 time=0.826 ms

--- 192.168.0.2 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.723/0.809/0.880 ms

Then add the following configuration to /usr/local/etc/bird.conf.

router id 50.50.50.50;

protocol bgp R1 {
  export filter { accept; };
  import filter { accept; };
  local as 65050;
  neighbor 192.168.0.1 as 65001;
}

protocol bgp R2 {
  export filter { accept; };
  import filter { accept; };
  local as 65050;
  neighbor 192.168.0.2 as 65002;
}

Start the bird daemon and check the configuration. The configuration is fine, and the prefixes advertised by R1/R2 are being received properly.

gns3@box:/usr/local/etc$ sudo /usr/local/sbin/bird -u gns3 -g staff
gns3@box:/usr/local/etc$ birdc
BIRD 1.5.0 ready.
bird> config check
Reading configuration from /usr/local/etc/bird.conf
Configuration OK
bird> show route
1.1.1.0/24         via 192.168.0.1 on eth0 [R1 05:55:45] * (100) [AS65001i]
2.2.2.0/24         via 192.168.0.2 on eth0 [R2 05:55:46] * (100) [AS65002i]

Here are the BGP table and routing table entries on the receiving R1 and R2.

R1:

R1#show ip bgp 2.2.2.0/24
BGP routing table entry for 2.2.2.0/24, version 3
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65050 65002
    192.168.0.2 from 192.168.0.50 (50.50.50.50)
      Origin IGP, metric 0, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R1#show ip route 2.2.2.0
Routing entry for 2.2.2.0/24
  Known via "bgp 65001", distance 20, metric 0
  Tag 65050, type external
  Last update from 192.168.0.2 00:01:18 ago
  Routing Descriptor Blocks:
  * 192.168.0.2, from 192.168.0.50, 00:01:18 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65050
      MPLS label: none

R2:

R2#show ip bgp 1.1.1.0/24
BGP routing table entry for 1.1.1.0/24, version 3
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65050 65001
    192.168.0.1 from 192.168.0.50 (50.50.50.50)
      Origin IGP, metric 0, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R2#show ip route 1.1.1.0
Routing entry for 1.1.1.0/24
  Known via "bgp 65002", distance 20, metric 0
  Tag 65050, type external
  Last update from 192.168.0.1 00:01:53 ago
  Routing Descriptor Blocks:
  * 192.168.0.1, from 192.168.0.50, 00:01:53 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65050
      MPLS label: none

Huh? Is BIRD in the "rewrites it on its own" group too?

Results

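Surprisingly, every implementation except Arista vEOS set the NEXT_HOP to the originating router's address instead of its own. Seriously!
I'd like to set up and try other implementations as well... the ones that look feasible are vyOS, goBGPd, and quagga, I guess?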

Reference: RFC 4271 5.1.3 (2)

As the slides above also mention, the BGP NEXT_HOP attribute is described in section 5.1.3 (2) of RFC 4271.

RFC 4271: A Border Gateway Protocol 4 (BGP-4)
 2) When sending a message to an external peer, X, and the peer is
    one IP hop away from the speaker:

    - If the route being announced was learned from an internal
      peer or is locally originated, the BGP speaker can use an
      interface address of the internal peer router (or the
      internal router) through which the announced network is
      reachable for the speaker for the NEXT_HOP attribute,
      provided that peer X shares a common subnet with this
      address.  This is a form of "third party" NEXT_HOP attribute.

    - Otherwise, if the route being announced was learned from an
      external peer, the speaker can use an IP address of any
      adjacent router (known from the received NEXT_HOP attribute)
      that the speaker itself uses for local route calculation in
      the NEXT_HOP attribute, provided that peer X shares a common
      subnet with this address.  This is a second form of "third
      party" NEXT_HOP attribute.

    - Otherwise, if the external peer to which the route is being
      advertised shares a common subnet with one of the interfaces
      of the announcing BGP speaker, the speaker MAY use the IP
      address associated with such an interface in the NEXT_HOP
      attribute.  This is known as a "first party" NEXT_HOP
      attribute.

    - By default (if none of the above conditions apply), the BGP
      speaker SHOULD use the IP address of the interface that the
      speaker uses to establish the BGP connection to peer X in the
      NEXT_HOP attribute.

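Reading this against the results, vEOS using its own address matches the third bullet (the "first party" NEXT_HOP, a MAY), while the other implementations keeping the originating router's address match the second bullet (the second form of "third party" NEXT_HOP). So I guess the second bullet is the one most implementations apply by default?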
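To make the second and third bullets of the quoted text concrete, here is a minimal Python sketch of the choice a relaying speaker faces for a route learned from an external peer. The function name, its parameters, and the `use_third_party` switch are hypothetical, and the sketch simplifies "shares a common subnet" to "is inside the local interface's subnet". It is not how any of the tested implementations is actually written.

```python
import ipaddress


def choose_next_hop(received_next_hop, peer_x, local_interface,
                    use_third_party=True):
    """Pick a NEXT_HOP for a route learned from an external peer.

    local_interface is in CIDR form, e.g. "192.168.0.10/24", and is
    assumed to be the interface used for the session with peer_x.
    use_third_party is a hypothetical switch: both bullets are optional
    in the RFC, so an implementation may skip the second one.
    """
    iface = ipaddress.ip_interface(local_interface)
    peer = ipaddress.ip_address(peer_x)
    learned = ipaddress.ip_address(received_next_hop)

    if peer not in iface.network:
        # Fourth bullet: fall back to the session interface address.
        return "default", str(iface.ip)
    if use_third_party and learned in iface.network:
        # Second bullet: keep the received NEXT_HOP ("third party").
        return "third party", str(learned)
    # Third bullet: use our own interface address ("first party").
    return "first party", str(iface.ip)


# Relay 192.168.0.10 advertising 2.2.2.0/24 (learned from 192.168.0.2) to R1.
print(choose_next_hop("192.168.0.2", "192.168.0.1", "192.168.0.10/24"))
# ('third party', '192.168.0.2')

# Same situation on 192.168.0.30 with the second bullet skipped.
print(choose_next_hop("192.168.0.2", "192.168.0.1", "192.168.0.30/24",
                      use_third_party=False))
# ('first party', '192.168.0.30')
```

The first call reproduces what R1 saw from the IOS relay, and the second reproduces what R1 saw from vEOS.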

Addendum: vyOS

I had forgotten that I had vyOS 1.1.7 lying around, so I set it up as AS65060 and tested it.

vyos@vyos# compare
[edit interfaces ethernet eth0]
+address 192.168.0.60/24
[edit]
+protocols {
+    bgp 65060 {
+        neighbor 192.168.0.1 {
+            remote-as 65001
+        }
+        neighbor 192.168.0.2 {
+            remote-as 65002
+        }
+        parameters {
+            router-id 60.60.60.60
+        }
+    }
+}
[edit]

R1:

R1#show ip bgp 2.2.2.0/24
BGP routing table entry for 2.2.2.0/24, version 5
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65060 65002
    192.168.0.2 from 192.168.0.60 (60.60.60.60)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R1#show ip route 2.2.2.0
Routing entry for 2.2.2.0/24
  Known via "bgp 65001", distance 20, metric 0
  Tag 65060, type external
  Last update from 192.168.0.2 00:01:06 ago
  Routing Descriptor Blocks:
  * 192.168.0.2, from 192.168.0.60, 00:01:06 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65060
      MPLS label: none

R2:

R2#show ip bgp 1.1.1.0/24
BGP routing table entry for 1.1.1.0/24, version 5
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  65060 65001
    192.168.0.1 from 192.168.0.60 (60.60.60.60)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
R2#show ip route 1.1.1.0
Routing entry for 1.1.1.0/24
  Known via "bgp 65002", distance 20, metric 0
  Tag 65060, type external
  Last update from 192.168.0.1 00:03:46 ago
  Routing Descriptor Blocks:
  * 192.168.0.1, from 192.168.0.60, 00:03:46 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65060
      MPLS label: none

So vyOS is in the "rewrites it on its own" group as well...
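Pulling all six results together, here is a small Python sketch that tabulates what R1 saw for 2.2.2.0/24 from each relay. The addresses are copied from the outputs in this post. The data layout and function name are just my own choice for illustration.

```python
# (platform, relay address, NEXT_HOP that R1 saw for 2.2.2.0/24),
# copied from the "show ip bgp 2.2.2.0/24" outputs in this post.
OBSERVATIONS = [
    ("IOS", "192.168.0.10", "192.168.0.2"),
    ("IOS-XRv", "192.168.0.20", "192.168.0.2"),
    ("vEOS", "192.168.0.30", "192.168.0.30"),
    ("vSRX", "192.168.0.40", "192.168.0.2"),
    ("BIRD", "192.168.0.50", "192.168.0.2"),
    ("vyOS", "192.168.0.60", "192.168.0.2"),
]


def summarize(observations):
    """Map each platform to whose address ended up in NEXT_HOP."""
    summary = {}
    for platform, relay, next_hop in observations:
        if next_hop == relay:
            summary[platform] = "relay's own address"
        else:
            summary[platform] = "originating router's address"
    return summary


for platform, result in summarize(OBSERVATIONS).items():
    print(f"{platform:8} {result}")
```

Only vEOS ends up in the "relay's own address" row. Every other platform tested here passes along the originating router's address.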
