TED topologies and SR paths

In previous posts we have dealt with the TED. It is a database containing the network topology used to build traffic-engineered paths. The TED is what RSVP uses to build an LSP with CSPF.

Within TED we have all the information like link colors (admin groups), bandwidth, TE metric and so on…

Up to now, we have always thought of TED as a single instance.

This is not totally true. It is true that we have a single TED per router but that TED has multiple instances.

Today, we are going to have a look at one of those other instances: l3-unicast.

To better understand why we have more topologies and how they are used, we can get some help from this image:

The IGP, as we already know, populates TED.

RSVP relies on TED standard topology, the only one we have known up to now.

The new topology, l3-unicast, is used by the SRTE client. This means that the SRTE client relies on this topology to build segment routing paths.

If you remember, in past posts like this one, we had this line in our config:

set protocols isis traffic-engineering l3-unicast-topology

What this command did was to download IGP data (ISIS here) into the l3-unicast topology so that the SRTE client could find the data it needs to build LSPs.

For now, we assume that setting is not part of our configuration.

Simply put, when we define an SRTE policy using DCSPF (as described here), providing details like “include color premium and reserve bandwidth X”, the SRTE client will look into the TED l3-unicast topology to find that information and build the path.
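
To picture what “include color premium and reserve bandwidth X” means for path computation, here is a minimal Python sketch of the general CSPF idea: prune the links that do not satisfy the constraints, then run a shortest path on what is left. It is not Junos code; the four-link topology, the `cspf` function and its parameter names are all made up for illustration.

```python
import heapq


def cspf(links, src, dst, include_color=None, min_bw=0):
    """Constrained shortest path: prune links first, then run Dijkstra.

    links: list of dicts with keys a, b, metric, colors (set), avail_bw.
    Returns (cost, path) or None when no path satisfies the constraints.
    """
    # Step 1: keep only links that satisfy color and bandwidth constraints.
    usable = [
        link for link in links
        if link["avail_bw"] >= min_bw
        and (include_color is None or include_color in link["colors"])
    ]
    adjacency = {}
    for link in usable:
        adjacency.setdefault(link["a"], []).append((link["b"], link["metric"]))

    # Step 2: plain Dijkstra on the pruned topology.
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, metric in adjacency.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (cost + metric, nxt, path + [nxt]))
    return None


# Hypothetical topology: the r1-r3-r4 path is "premium" but has a higher metric.
links = [
    {"a": "r1", "b": "r2", "metric": 10, "colors": set(), "avail_bw": 1000},
    {"a": "r2", "b": "r4", "metric": 10, "colors": set(), "avail_bw": 1000},
    {"a": "r1", "b": "r3", "metric": 10, "colors": {"premium"}, "avail_bw": 500},
    {"a": "r3", "b": "r4", "metric": 20, "colors": {"premium"}, "avail_bw": 500},
]

print(cspf(links, "r1", "r4"))                           # unconstrained
print(cspf(links, "r1", "r4", include_color="premium"))  # premium links only
```

The point is simply that every input of the pruning step (colors, available bandwidth, metrics) has to come from somewhere, and for an SRTE path that somewhere is the l3-unicast topology.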

But…do we always need l3-unicast topology?

Short answer…no!

Let’s look at two different examples to better understand the role of l3-unicast topology.

We are going to define 2 SRTE paths.

This one:

set protocols source-packet-routing segment-list r4a-via-r3a r3 label 1103
set protocols source-packet-routing segment-list r4a-via-r3a r4 label 1104
set protocols source-packet-routing source-routing-path r4a-colored to 100.4.4.4
set protocols source-packet-routing source-routing-path r4a-colored color 555
set protocols source-packet-routing source-routing-path r4a-colored binding-sid 1000555
set protocols source-packet-routing source-routing-path r4a-colored primary r4a-via-r3a bfd-liveness-detection sbfd remote-discriminator 14
set protocols source-packet-routing source-routing-path r4a-colored primary r4a-via-r3a bfd-liveness-detection minimum-interval 1000
set protocols source-packet-routing source-routing-path r4a-colored primary r4a-via-r3a bfd-liveness-detection multiplier 3


And this one:

set protocols source-packet-routing source-routing-path r4-minimal to 100.4.4.4
set protocols source-packet-routing source-routing-path r4-minimal primary r4a-minimal

set protocols source-packet-routing segment-list r4a-minimal auto-translate
set protocols source-packet-routing segment-list r4a-minimal r3 ip-address 100.3.3.3
set protocols source-packet-routing segment-list r4a-minimal r3 label-type node
set protocols source-packet-routing segment-list r4a-minimal r4 ip-address 100.4.4.4
set protocols source-packet-routing segment-list r4a-minimal r4 label-type node

set protocols source-packet-routing inherit-label-nexthops

We commit and check LSP status:

root@r1a# run show spring-traffic-engineering lsp detail
Name: r4-minimal
  Tunnel-source: Static configuration
  To: 100.4.4.4
  State: Down
    Path: r4a-minimal
    Path Status: NA
    Outgoing interface: NA
    Auto-translate status: Enabled Auto-translate result: TED is empty
    Compute Status:Disabled , Compute Result:N/A , Compute-Profile Name:N/A
    BFD status: N/A BFD name: N/A
    ERO Valid: true
      SR-ERO hop count: 2
        Hop 1 (Loose):
          NAI: IPv4 Node ID, Node address: 100.3.3.3
          SID type: None
        Hop 2 (Loose):
          NAI: IPv4 Node ID, Node address: 100.4.4.4
          SID type: None

Name: r4a-colored
  Tunnel-source: Static configuration
  To: 100.4.4.4-555<c>
  State: Up
    Path: r4a-via-r3a
    Path Status: NA
    Outgoing interface: NA
    Auto-translate status: Disabled Auto-translate result: N/A
    Compute Status:Disabled , Compute Result:N/A , Compute-Profile Name:N/A
    BFD status: Up BFD name: V4-srte_bfd_session-8
    ERO Valid: true
      SR-ERO hop count: 2
        Hop 1 (Strict):
          NAI: None
          SID type: 20-bit label, Value: 1103
        Hop 2 (Strict):
          NAI: None
          SID type: 20-bit label, Value: 1104

The first LSP (r4a-colored) is up while the second one (r4-minimal) is down.

The LSP which is down complains about the TED being empty. The default TED topology is not empty (ISIS populates it by default); the TED l3-unicast topology is:

root@r1a# run show ted database topology-type l3-unicast
TED database: 0 ISIS nodes 0 INET nodes 0 INET6 nodes

Based on what we have said before, the SRTE client relies on the TED l3-unicast topology, so it makes sense to have that LSP down.

So why is the other LSP up and running?

Here comes the interesting part! Let’s compare provided segment lists.

The LSP that is up was defined with this segment list:

set protocols source-packet-routing segment-list r4a-via-r3a r3 label 1103
set protocols source-packet-routing segment-list r4a-via-r3a r4 label 1104

We provided the full label stack. In this scenario, Junos does not need to validate anything as the label stack is already there. For this reason, l3-unicast topology interaction is skipped.

On the other hand:

set protocols source-packet-routing segment-list r4a-minimal auto-translate
set protocols source-packet-routing segment-list r4a-minimal r3 ip-address 100.3.3.3
set protocols source-packet-routing segment-list r4a-minimal r3 label-type node
set protocols source-packet-routing segment-list r4a-minimal r4 ip-address 100.4.4.4
set protocols source-packet-routing segment-list r4a-minimal r4 label-type node

Here, Junos needs to translate IP addresses into node SIDs. This requires TED interaction which, this being an SRTE path, means looking inside the l3-unicast topology.

The same holds for DCSPF where we have compute profiles.

To sum up, l3-unicast comes into play as soon as auto-translate or compute profiles are involved. When the label stack is fully provided (a sort of static route), checking that TED topology is not needed.
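
The decision above can be sketched in a few lines of Python. This is only a mental model of the behavior we observed, not how Junos implements it: `Hop`, `resolve_segment_list` and the dictionary standing in for the l3-unicast topology are hypothetical, and mapping 100.3.3.3 and 100.4.4.4 to 1103 and 1104 is an assumption based on the labels used in the static segment list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hop:
    name: str
    label: Optional[int] = None        # explicit label, e.g. "r3 label 1103"
    ip_address: Optional[str] = None   # e.g. "r3 ip-address 100.3.3.3"


def resolve_segment_list(hops, auto_translate, l3_unicast):
    """Return (label_stack, status) for a segment list.

    l3_unicast is a hypothetical {loopback: node SID} view of the
    l3-unicast topology; an empty dict models an empty topology.
    """
    if not auto_translate and all(h.label is not None for h in hops):
        # Full label stack provided: nothing to look up.
        return [h.label for h in hops], "TED not consulted"
    if not l3_unicast:
        return None, "TED is empty"
    stack = []
    for h in hops:
        if h.label is not None:
            stack.append(h.label)
        elif h.ip_address in l3_unicast:
            stack.append(l3_unicast[h.ip_address])
        else:
            return None, f"no SID for {h.ip_address}"
    return stack, "translated"


static_hops = [Hop("r3", label=1103), Hop("r4", label=1104)]
translated_hops = [
    Hop("r3", ip_address="100.3.3.3"),
    Hop("r4", ip_address="100.4.4.4"),
]

# Empty l3-unicast topology: the static list works, the translated one does not.
print(resolve_segment_list(static_hops, False, {}))
print(resolve_segment_list(translated_hops, True, {}))
```

With an empty topology the static list still resolves, while the auto-translated one stops at “TED is empty”, which is the same split we saw between r4a-colored and r4-minimal.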

To solve our issue, we simply add:

root@r1a# activate protocols isis traffic-engineering l3-unicast-topology

And magically:

[edit]
root@r1a# commit
commit complete

[edit]
root@r1a# run show spring-traffic-engineering lsp
To              State     LSPname
100.4.4.4       Up        r4-minimal
100.4.4.4-555<c> Up       r4a-colored


Total displayed LSPs: 2 (Up: 2, Down: 0)

Is l3-unicast just for that? Nope! We will see it again when bringing EPE adjacencies and express segments into our networks.

Ciao
IoSonoUmberto

Link State on steroids with BGP-LS for Inter-domain TE

Most common IGP protocols, OSPF and ISIS, are so-called link state protocols.

Link state protocols are designed so that all the nodes have the complete view of the network. In other words, each node sees the whole topology and, on that topology, builds shortest paths to different destinations.

Historically, we were told to be careful with link state protocols. Knowing the whole topology on every node is a cool thing but, at the same time, it brings scalability issues. The bigger the network, the bigger the link state database on the device. This was a real issue, let’s say, 20 years ago, when boxes were not as powerful as today. For that reason, solutions like areas (NSSA, stub) and levels were invented. The golden rule was something like “do not make your IGP grow too much”.

Anyhow, that strict requirement is no longer true. Boxes are more powerful and can handle bigger link state databases.

Moreover, new trends and needs emerged. One of them is the possibility to provide advanced traffic engineering in order to build services able to guarantee certain SLAs.
As networks are interconnected, it is not far-fetched to think of SLAs that span multiple domains (inter-AS). This means having inter-AS TE capabilities.

We have already seen something like this when we had a look at BGP-CT.

Here, we are going to see another solution.

Let’s think of RSVP. We want to create an LSP starting in AS200 with the egress node in AS100. Our router in AS200 needs to know about routes and topology coming from AS100. We need to extend the link state database to include info from the other AS. We no longer want to “limit” the link state; we want to expand it with data from another domain.

This is accomplished, once again, via BGP, through its flavor called BGP-LS (Link State).

What we want to build is the following:

Through BGP-LS we are going to exchange link state routing info between ASs. Our ultimate goal is to create a RSVP LSP from 200.1.1.1 (r1b) to 100.1.1.1 (r1a).

We said we are going to extend the link state database. This might sound like importing remote routes into the local IGP. That is not true. Data from the other AS will be imported and installed within the TED: Traffic Engineering Database. TED is the entity used by default (unless no-cspf is configured) by RSVP to build LSPs.

In this solution, IGP, TED and BGP-LS will all work together to make things work.

This image shows the different interactions:

Let’s understand how this works:

  • IGP has its own routing table and routes typically end up in inet.0
  • if TE is enabled, the IGP topology is “downloaded” into the TED
  • as said, TED is a special database used to build TE-oriented paths
  • inside TED we have all the nodes and links connecting them, along with additional TE attributes like bandwidth or admin-groups
  • TED data can be exported towards BGP by configuring an import policy on the TED db
  • this might sound confusing but we have to think of TED import/export policies from the BGP perspective
  • BGP-LS introduces a new routing table called lsdist.0
  • TED db export policy means “export data from lsdist.0 to TED”
  • at the same time, TED db import policy means “import data from TED to lsdist.0”
  • IGP data is downloaded to TED and, from there, via a TED import policy is copied into lsdist.0 and made available to BGP-LS
  • similarly, remote LS data is received over BGP-LS and stored into lsdist.0
  • from there, through a TED export policy, it is downloaded to TED
  • as a result, TED includes both local data and remote data
  • RSVP could now build a LSP from one AS to another
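
Since the import/export direction is easy to mix up, here is a tiny Python model of the flow in the list above, seen from a border router. It is a sketch under simplifying assumptions: TED and lsdist.0 are plain sets of node addresses, the policies accept everything, and `sync_border_router` is a made-up name.

```python
def sync_border_router(local_igp, bgp_ls_received,
                       import_enabled=True, export_enabled=True):
    """Model how TED and lsdist.0 end up populated on a border router.

    local_igp:        entries the local IGP downloaded into TED
    bgp_ls_received:  entries learned from the remote AS over BGP-LS
    import_enabled:   TED import policy active (TED -> lsdist.0)
    export_enabled:   TED export policy active (lsdist.0 -> TED)
    """
    ted = set(local_igp)             # the IGP always feeds TED
    lsdist = set(bgp_ls_received)    # BGP-LS routes always land in lsdist.0
    if import_enabled:
        lsdist |= set(local_igp)     # local data becomes available to BGP-LS
    if export_enabled:
        ted |= lsdist                # remote data becomes visible to RSVP
    return ted, lsdist


local = {"200.1.1.1", "200.4.4.4"}
remote = {"100.1.1.1", "100.4.4.4"}

ted, lsdist = sync_border_router(local, remote)
print(sorted(ted))     # local and remote nodes
print(sorted(lsdist))  # local and remote nodes
```

Calling it with `export_enabled=False` leaves TED with local entries only, while `import_enabled=False` leaves lsdist.0 with remote entries only, so nothing local would be advertised to the other AS.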

Now things should be a bit clearer. We can jump into building our BGP-LS network.

AS 100 uses ISIS L2 only. Configuration is standard. I’m going to provide r4a (border router) config only:

set protocols isis interface ge-0/0/0.0 level 1 disable
set protocols isis interface ge-0/0/0.0 point-to-point
set protocols isis interface ge-0/0/1.0 level 1 disable
set protocols isis interface ge-0/0/1.0 point-to-point
set protocols isis interface ge-0/0/2.0 level 1 disable
set protocols isis interface ge-0/0/2.0 point-to-point
set protocols isis interface ge-0/0/2.0 passive remote-node-iso 0000.0000.0024
set protocols isis interface ge-0/0/2.0 passive remote-node-id 192.168.45.1
set protocols isis interface lo0.0

No OSPF runs in AS 100.

AS 200, instead, uses OSPF area 0. Here is config for r4b (border router) only:

set protocols ospf traffic-engineering
set protocols ospf area 0.0.0.0 interface ge-0/0/0.0 interface-type p2p
set protocols ospf area 0.0.0.0 interface ge-0/0/1.0 interface-type p2p
set protocols ospf area 0.0.0.0 interface lo0.0 passive
set protocols ospf area 0.0.0.0 interface ge-0/0/2.0 interface-type p2p
set protocols ospf area 0.0.0.0 interface ge-0/0/2.0 passive traffic-engineering remote-node-id 192.168.45.0
set protocols ospf area 0.0.0.0 interface ge-0/0/2.0 passive traffic-engineering remote-node-router-id 100.4.4.4

Unlike ISIS, OSPF needs TE to be enabled explicitly, otherwise the OSPF topology will not be downloaded into TED.

No ISIS runs in AS 200.

Now we deal with TED db.

On both ASs we configure:

set protocols mpls traffic-engineering database import policy ted-to-bgp
set protocols mpls traffic-engineering database import bgp-ls-identifier <N>
set protocols mpls traffic-engineering database export policy bgp-to-ted

Where N is just an identifier. We choose 100 for AS 100 and 200 for AS 200.

Import policy differs as the two ASs use different IGPs:

r4b (AS200)
set policy-options policy-statement ted-to-bgp term ok from protocol ospf
set policy-options policy-statement ted-to-bgp term ok then accept
set policy-options policy-statement ted-to-bgp then reject

r4a (AS100)
set policy-options policy-statement ted-to-bgp term ok from protocol isis
set policy-options policy-statement ted-to-bgp term ok then accept
set policy-options policy-statement ted-to-bgp then reject

Export policy, instead, is identical on both border routers:

set policy-options policy-statement bgp-to-ted term ok from family traffic-engineering
set policy-options policy-statement bgp-to-ted term ok then accept

Family traffic-engineering is a new BGP family we are seeing for the first time. It is the family associated with BGP-LS.

Let’s configure BGP-LS:

set protocols bgp group bgp-ls type external
set protocols bgp group bgp-ls family traffic-engineering unicast
set protocols bgp group bgp-ls export exp-bgp-ls
set protocols bgp group bgp-ls peer-as 100
set protocols bgp group bgp-ls neighbor 192.168.45.0

That was r4b. Router r4a is identical, apart from the neighbor IP and the peer AS (200).

The export policy looks like this:

set policy-options policy-statement exp-bgp-ls term ok from family traffic-engineering
set policy-options policy-statement exp-bgp-ls term ok then accept
set policy-options policy-statement exp-bgp-ls then reject

Routes are exported from lsdist.0 routing table.

We also have iBGP inside each AS. This is on r4b towards r1b:

set protocols bgp group ibgp-ls type internal
set protocols bgp group ibgp-ls local-address 200.4.4.4
set protocols bgp group ibgp-ls family traffic-engineering unicast
set protocols bgp group ibgp-ls neighbor 200.1.1.1

iBGP is needed so that

  • r4b (AS 200) learns LS data from r4a (AS100)
  • that data is advertised via iBGP to r1b
  • this way r1b has both local and remote data

Notice, there is nothing to do about local data at r1b. As both r4b and r1b are part of the same IGP domain, they share the same local data, since OSPF and ISIS are link state protocols. The only purpose of iBGP is to bring LS routes from border routers (r4b and r4a) to other PE nodes within their AS.

For this reason, on r1a and r1b there is no need to have a BGP-LS export policy. Having it would mean exporting local TED data (like we do at border routers) but it is useless to create such NLRIs as all the other routers within the AS already have that info. It is enough to have just one router exporting local TED data to BGP-LS and it makes sense for that router to be the border router, the one responsible for advertising local AS LS routes to remote ASs.

On r4b, we check lsdist.0 gets populated by both AS100 and AS200 routes:

root@r4b# run show route table lsdist.0 brief | match AS:100
NODE { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 ISIS-L1:0 }/1216
NODE { AS:100 BGP-LS ID:100 ISO:0000.0000.0011.00 ISIS-L2:0 }/1216
NODE { AS:100 BGP-LS ID:100 ISO:0000.0000.0012.00 ISIS-L2:0 }/1216
NODE { AS:100 BGP-LS ID:100 ISO:0000.0000.0013.00 ISIS-L2:0 }/1216
NODE { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0011.00 }.{ IPv4:192.168.12.0 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0012.00 }.{ IPv4:192.168.12.1 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0011.00 }.{ IPv4:192.168.13.0 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0013.00 }.{ IPv4:192.168.13.1 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0012.00 }.{ IPv4:192.168.12.1 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0011.00 }.{ IPv4:192.168.12.0 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0012.00 }.{ IPv4:192.168.23.0 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0013.00 }.{ IPv4:192.168.23.1 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0012.00 }.{ IPv4:192.168.24.0 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 }.{ IPv4:192.168.24.1 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0013.00 }.{ IPv4:192.168.13.1 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0011.00 }.{ IPv4:192.168.13.0 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0013.00 }.{ IPv4:192.168.23.1 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0012.00 }.{ IPv4:192.168.23.0 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0013.00 }.{ IPv4:192.168.34.0 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 }.{ IPv4:192.168.34.1 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 }.{ IPv4:192.168.24.1 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0012.00 }.{ IPv4:192.168.24.0 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 }.{ IPv4:192.168.34.1 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0013.00 }.{ IPv4:192.168.34.0 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 }.{ IPv4:192.168.45.0 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0024.00 }.{ IPv4:192.168.45.1 } ISIS-L2:0 }/1216
PREFIX { Node { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 } { IPv4:100.4.4.4/32 } ISIS-L1:0 }/1216
PREFIX { Node { AS:100 BGP-LS ID:100 ISO:0000.0000.0011.00 } { IPv4:100.1.1.1/32 } ISIS-L2:0 }/1216
PREFIX { Node { AS:100 BGP-LS ID:100 ISO:0000.0000.0012.00 } { IPv4:100.2.2.2/32 } ISIS-L2:0 }/1216
PREFIX { Node { AS:100 BGP-LS ID:100 ISO:0000.0000.0013.00 } { IPv4:100.3.3.3/32 } ISIS-L2:0 }/1216
PREFIX { Node { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 } { IPv4:100.4.4.4/32 } ISIS-L2:0 }/1216

[edit]
root@r4b# run show route table lsdist.0 brief | match AS:200
NODE { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.1.1.1 OSPF:0 }/1216
NODE { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.2.2.2 OSPF:0 }/1216
NODE { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.3.3.3 OSPF:0 }/1216
NODE { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.4.4.4 OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.1.1.1 }.{ IPv4:192.168.112.0 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.2.2.2 }.{ IPv4:192.168.112.1 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.1.1.1 }.{ IPv4:192.168.113.0 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.3.3.3 }.{ IPv4:192.168.113.1 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.2.2.2 }.{ IPv4:192.168.112.1 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.1.1.1 }.{ IPv4:192.168.112.0 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.2.2.2 }.{ IPv4:192.168.123.0 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.3.3.3 }.{ IPv4:192.168.123.1 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.2.2.2 }.{ IPv4:192.168.124.0 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.4.4.4 }.{ IPv4:192.168.124.1 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.3.3.3 }.{ IPv4:192.168.113.1 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.1.1.1 }.{ IPv4:192.168.113.0 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.3.3.3 }.{ IPv4:192.168.123.1 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.2.2.2 }.{ IPv4:192.168.123.0 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.3.3.3 }.{ IPv4:192.168.134.0 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.4.4.4 }.{ IPv4:192.168.134.1 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.4.4.4 }.{ IPv4:192.168.45.1 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:100.4.4.4 }.{ IPv4:192.168.45.0 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.4.4.4 }.{ IPv4:192.168.124.1 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.2.2.2 }.{ IPv4:192.168.124.0 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.4.4.4 }.{ IPv4:192.168.134.1 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.3.3.3 }.{ IPv4:192.168.134.0 } OSPF:0 }/1216

Now, let’s have a closer look at the TED. For simplicity we are going to look at links only.

We are going to check this from r4b perspective.

When BGP-LS is up we see this:

root@r4b# run show ted link
ID                         ->ID                          LocalPath LocalBW
0000.0000.0011.00(100.1.1.1) 0000.0000.0012.00(100.2.2.2)        0 0bps
0000.0000.0011.00(100.1.1.1) 0000.0000.0013.00(100.3.3.3)        0 0bps
0000.0000.0012.00(100.2.2.2) 0000.0000.0014.00(100.4.4.4)        0 0bps
0000.0000.0012.00(100.2.2.2) 0000.0000.0011.00(100.1.1.1)        0 0bps
0000.0000.0012.00(100.2.2.2) 0000.0000.0013.00(100.3.3.3)        0 0bps
0000.0000.0013.00(100.3.3.3) 0000.0000.0014.00(100.4.4.4)        0 0bps
0000.0000.0013.00(100.3.3.3) 0000.0000.0011.00(100.1.1.1)        0 0bps
0000.0000.0013.00(100.3.3.3) 0000.0000.0012.00(100.2.2.2)        0 0bps
0000.0000.0014.00(100.4.4.4) 0000.0000.0024.00(200.4.4.4)        0 0bps
0000.0000.0014.00(100.4.4.4) 0000.0000.0012.00(100.2.2.2)        0 0bps
0000.0000.0014.00(100.4.4.4) 0000.0000.0013.00(100.3.3.3)        0 0bps
0000.0000.0024.00(200.4.4.4) 200.2.2.2                           0 0bps
0000.0000.0024.00(200.4.4.4) 200.3.3.3                           0 0bps
0000.0000.0024.00(200.4.4.4) 0000.0000.0014.00(100.4.4.4)        0 0bps
200.1.1.1                    200.2.2.2                           0 0bps
200.1.1.1                    200.3.3.3                           0 0bps
200.2.2.2                    200.1.1.1                           0 0bps
200.2.2.2                    200.3.3.3                           0 0bps
200.2.2.2                    0000.0000.0024.00(200.4.4.4)        0 0bps
200.3.3.3                    200.1.1.1                           0 0bps
200.3.3.3                    200.2.2.2                           0 0bps
200.3.3.3                    0000.0000.0024.00(200.4.4.4)        0 0bps

All the links are there.

Now, we deactivate TED db export policy (from bgp to TED):

root@r4b# deactivate protocols mpls traffic-engineering database export

[edit]
root@r4b# commit
commit complete

root@r4b# run show ted link
ID                         ->ID                          LocalPath LocalBW
0000.0000.0024.00(200.4.4.4) 200.2.2.2                           0 0bps
0000.0000.0024.00(200.4.4.4) 200.3.3.3                           0 0bps
0000.0000.0024.00(200.4.4.4) 0000.0000.0014.00(100.4.4.4)        0 0bps
200.1.1.1                    200.2.2.2                           0 0bps
200.1.1.1                    200.3.3.3                           0 0bps
200.2.2.2                    200.1.1.1                           0 0bps
200.2.2.2                    200.3.3.3                           0 0bps
200.2.2.2                    0000.0000.0024.00(200.4.4.4)        0 0bps
200.3.3.3                    200.1.1.1                           0 0bps
200.3.3.3                    200.2.2.2                           0 0bps
200.3.3.3                    0000.0000.0024.00(200.4.4.4)        0 0bps

root@r4b# run show route table lsdist.0 | match AS:100 | match 100.1.1.1
PREFIX { Node { AS:100 BGP-LS ID:100 ISO:0000.0000.0011.00 } { IPv4:100.1.1.1/32 } ISIS-L2:0 }/1216

As a result, only local links are present even if remote links are inside lsdist.0. This shows us what TED db export policy does.

We do the same with import policy:

root@r4b# deactivate protocols mpls traffic-engineering database import

[edit]
root@r4b# commit
commit complete

root@r4b# run show route table lsdist.0 | match AS:200

[edit]
root@r4b#

This time, data is not copied from TED to lsdist.0, meaning I have no AS200 routes there. As a consequence, AS100 is receiving nothing via BGP-LS.

Now, we go to a PE, r1b. Again, we check TED links:

root@r1b# run show ted link
ID                         ->ID                          LocalPath LocalBW
0000.0000.0011.00(100.1.1.1) 0000.0000.0013.00(100.3.3.3)        0 0bps
0000.0000.0011.00(100.1.1.1) 0000.0000.0012.00(100.2.2.2)        0 0bps
0000.0000.0012.00(100.2.2.2) 0000.0000.0014.00(100.4.4.4)        0 0bps
0000.0000.0012.00(100.2.2.2) 0000.0000.0013.00(100.3.3.3)        0 0bps
0000.0000.0012.00(100.2.2.2) 0000.0000.0011.00(100.1.1.1)        0 0bps
0000.0000.0013.00(100.3.3.3) 0000.0000.0014.00(100.4.4.4)        0 0bps
0000.0000.0013.00(100.3.3.3) 0000.0000.0012.00(100.2.2.2)        0 0bps
0000.0000.0013.00(100.3.3.3) 0000.0000.0011.00(100.1.1.1)        0 0bps
0000.0000.0014.00(100.4.4.4) 0000.0000.0013.00(100.3.3.3)        0 0bps
0000.0000.0014.00(100.4.4.4) 0000.0000.0012.00(100.2.2.2)        0 0bps
0000.0000.0014.00(100.4.4.4) 0000.0000.0024.00                   0 0bps
200.1.1.1                    200.2.2.2                           0 0bps
200.1.1.1                    200.3.3.3                           1 0bps
200.2.2.2                    200.1.1.1                           0 0bps
200.2.2.2                    200.3.3.3                           0 0bps
200.2.2.2                    200.4.4.4                           0 0bps
200.3.3.3                    200.1.1.1                           0 0bps
200.3.3.3                    200.2.2.2                           0 0bps
200.3.3.3                    200.4.4.4                           1 0bps
200.4.4.4                    200.2.2.2                           0 0bps
200.4.4.4                    200.3.3.3                           0 0bps
200.4.4.4                    0000.0000.0014.00(100.4.4.4)        0 0bps

All the links are there.

Remote links are received via iBGP-LS from r4b and copied into TED (TED db export policy).

This is iBGP relevant config on r1b:

set policy-options policy-statement bgp-to-ted term ok from family traffic-engineering
set policy-options policy-statement bgp-to-ted term ok then accept
set policy-options policy-statement bgp-to-ted then reject
set policy-options policy-statement lb then load-balance per-packet
set protocols bgp group ibgp-ls type internal
set protocols bgp group ibgp-ls local-address 200.1.1.1
set protocols bgp group ibgp-ls family traffic-engineering unicast
set protocols bgp group ibgp-ls neighbor 200.4.4.4
set protocols mpls traffic-engineering database export policy bgp-to-ted

As you can see, there is no BGP export policy, for the reasons we explained before.

At the same time, we have no TED db import policy. That is a sort of consequence of not needing the BGP export policy; if I have nothing to export via BGP, there is no need to copy routes from TED to lsdist.0.

Everything seems ready but it is not.

TED now includes both local OSPF entries and remote (exported) ISIS ones.

root@r1b# run show ted protocol
Protocol name        Credibility  Self node
Exported ISIS-L2(1)  347
Exported ISIS-L1(2)  346
OSPF(0)              0            200.1.1.1

By default, RSVP prefers local entries to build LSPs. This is imposed via credibility values. Simply put, a local protocol is “more credible” than an external one.

This prevents RSVP from building inter-AS LSPs. To overcome this we add:

set protocols mpls cross-credibility-cspf
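
One way to picture what this knob changes is the Python sketch below. It is a simplified mental model, not the actual Junos algorithm: I assume that without the knob the path computation only uses links learned from one protocol at a time, and that with it links from all protocols are merged. The three-link topology and the function names are hypothetical.

```python
from collections import deque


def reachable(links, src, dst):
    """Breadth-first search over directed (from, to) links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def cspf_finds_path(ted_links, src, dst, cross_credibility):
    """ted_links: list of (protocol, from, to) tuples."""
    if cross_credibility:
        # Merge links from every protocol into a single graph.
        return reachable([(a, b) for _, a, b in ted_links], src, dst)
    # Assumed default: try each protocol's links in isolation.
    protocols = {p for p, _, _ in ted_links}
    return any(
        reachable([(a, b) for p, a, b in ted_links if p == proto], src, dst)
        for proto in protocols
    )


# Heavily simplified TED: local OSPF links plus one exported ISIS link.
ted_links = [
    ("OSPF", "r1b", "r4b"),
    ("OSPF", "r4b", "r4a"),
    ("Exported ISIS-L2", "r4a", "r1a"),
]

print(cspf_finds_path(ted_links, "r1b", "r1a", cross_credibility=False))
print(cspf_finds_path(ted_links, "r1b", "r1a", cross_credibility=True))
```

In this toy model the inter-AS destination is only reachable when links of different credibility can be combined, while an intra-AS destination such as r4b is reachable either way.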

Next, we configure 2 lsps: one to r4b (intra-as) and one to r1a (inter-as):

set protocols mpls label-switched-path r1a to 100.1.1.1
set protocols mpls label-switched-path r4b to 200.4.4.4

root@r1b# run show mpls lsp
Ingress LSP: 2 sessions
To              From            State Rt P     ActivePath       LSPname
100.1.1.1       0.0.0.0         Dn     0       -                r1a
200.4.4.4       200.1.1.1       Up     0 *                      r4b
Total 2 displayed, Up 1, Down 1

The inter-AS one failed to come up, even though we configured cross credibility. The issue must be somewhere else.

Let’s find out more:

100.1.1.1
  From: 0.0.0.0, State: Dn, ActiveRoute: 0, LSPname: r1a, LSPid: 11
  ActivePath: (none)
  LSPtype: Static Configured, Penultimate hop popping
  LoadBalance: Random
  Follow destination IGP metric
  Encoding type: Packet, Switching type: Packet, GPID: IPv4
  LSP Self-ping Status : Enabled
  Primary                    State: Dn
    Priorities: 7 0
    SmartOptimizeTimer: 180
    Flap Count: 1
    MBB Count: 2
    Will be enqueued for recomputation in 24 second(s).
   48 Jun 27 13:40:34.027 CSPF failed: no route toward 100.1.1.1[6 times, first Jun 27 12:43:28.749]

root@r1b# run show ted database 100.1.1.1
TED database: 5 ISIS nodes 8 INET nodes 0 INET6 nodes
ID                            Type Age(s) LnkIn LnkOut Protocol
0000.0000.0011.00(100.1.1.1)  Rtr    2974     2      2 Exported ISIS-L2(1)
    To: 0000.0000.0013.00(100.3.3.3), Local: 192.168.13.0, Remote: 192.168.13.1
      Local interface index: 0, Remote interface index: 0
    To: 0000.0000.0012.00(100.2.2.2), Local: 192.168.12.0, Remote: 192.168.12.1
      Local interface index: 0, Remote interface index: 0

It complains about not finding a route towards 100.1.1.1 (r1a) even if 100.1.1.1 is present within the TED db.

I enable traceoptions and find this:

Jun 27 13:40:34.027798  Reverse Link for 192.168.45.1(200.4.4.4:335)->192.168.45.0(100.4.4.4:0) not found

The link between ASs seems to be the problem.

More checking:

root@r1b# run show ted link detail | match 45
0000.0000.0014.00(100.4.4.4)->0000.0000.0024.00, Local: 192.168.45.0, Remote: 192.168.45.1
200.4.4.4->0000.0000.0014.00(100.4.4.4), Local: 192.168.45.1, Remote: 192.168.45.0

root@r4b# run show route table lsdist.0 | match 45 | match AS
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 }.{ IPv4:192.168.45.0 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0024.00 }.{ IPv4:192.168.45.1 } ISIS-L2:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.4.4.4 }.{ IPv4:192.168.45.1 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:100.4.4.4 }.{ IPv4:192.168.45.0 } OSPF:0 }/1216

Junos complains about not finding a reverse link. However, we can see that both directions are there; the catch is that one comes from OSPF and the other from ISIS. Currently, Junos requires both directions of a link to come from the same protocol, so even if TED has an entry for both r4a->r4b and r4b->r4a, it is not able to match them.
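
The check can be modeled with a few lines of Python. This is just my reading of the behavior described above, not Junos internals: `TedLink` and `has_reverse_link` are hypothetical names, and the addresses are the ones from the inter-AS link in the outputs.

```python
from typing import NamedTuple


class TedLink(NamedTuple):
    protocol: str   # "isis" or "ospf"
    local: str      # local interface address
    remote: str     # remote interface address


def has_reverse_link(link, ted_links):
    """A reverse link swaps local/remote and comes from the same protocol."""
    return any(
        other.protocol == link.protocol
        and other.local == link.remote
        and other.remote == link.local
        for other in ted_links
    )


# What TED holds for the inter-AS link: one direction per protocol.
ted_links = [
    TedLink("isis", "192.168.45.0", "192.168.45.1"),  # r4a -> r4b, from AS100
    TedLink("ospf", "192.168.45.1", "192.168.45.0"),  # r4b -> r4a, from AS200
]
forward = ted_links[1]

print(has_reverse_link(forward, ted_links))  # no OSPF link in the other direction

# An OSPF entry for the r4a -> r4b direction would satisfy the check.
with_ospf_reverse = ted_links + [TedLink("ospf", "192.168.45.0", "192.168.45.1")]
print(has_reverse_link(forward, with_ospf_reverse))
```

In other words, having both directions is not sufficient; they must be described by the same protocol.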

As a result, we have to enable a second IGP on one of the ASs.

Here, we chose to enable OSPF in AS100:

set protocols ospf traffic-engineering
set protocols ospf area 0.0.0.0 interface ge-0/0/2.0 interface-type p2p
set protocols ospf area 0.0.0.0 interface ge-0/0/2.0 passive traffic-engineering remote-node-id 192.168.45.1
set protocols ospf area 0.0.0.0 interface ge-0/0/2.0 passive traffic-engineering remote-node-router-id 200.4.4.4
set policy-options policy-statement ted-to-bgp term ok from protocol ospf

As you may notice, we also need to update the ted-to-bgp policy so that the new OSPF link is copied into lsdist.0 as well.

Now, I have one more link:

root@r1b# run show ted link detail | match 45
0000.0000.0014.00(100.4.4.4)->200.4.4.4, Local: 192.168.45.0, Remote: 192.168.45.1
0000.0000.0014.00(100.4.4.4)->0000.0000.0024.00, Local: 192.168.45.0, Remote: 192.168.45.1
200.4.4.4->0000.0000.0014.00(100.4.4.4), Local: 192.168.45.1, Remote: 192.168.45.0

root@r4b# run show route table lsdist.0 | match 45 | match AS
LINK { Local { AS:100 BGP-LS ID:100 ISO:0000.0000.0014.00 }.{ IPv4:192.168.45.0 } Remote { AS:100 BGP-LS ID:100 ISO:0000.0000.0024.00 }.{ IPv4:192.168.45.1 } ISIS-L2:0 }/1216
LINK { Local { AS:100 BGP-LS ID:100 Area:0.0.0.0 IPv4:100.4.4.4 }.{ IPv4:192.168.45.0 } Remote { AS:100 BGP-LS ID:100 Area:0.0.0.0 IPv4:200.4.4.4 }.{ IPv4:192.168.45.1 } OSPF:0 }/1216
LINK { Local { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:200.4.4.4 }.{ IPv4:192.168.45.1 } Remote { AS:200 BGP-LS ID:200 Area:0.0.0.0 IPv4:100.4.4.4 }.{ IPv4:192.168.45.0 } OSPF:0 }/1216

Two of them come from OSPF, one in each direction, and that is enough to bring our LSP up:

root@r1b# run show mpls lsp ingress
Ingress LSP: 2 sessions
To              From            State Rt P     ActivePath       LSPname
100.1.1.1       200.1.1.1       Up     0 *                      r1a
200.4.4.4       200.1.1.1       Up     0 *                      r4b
Total 2 displayed, Up 2, Down 0

100.1.1.1
  From: 200.1.1.1, State: Up, ActiveRoute: 0, LSPname: r1a, LSPid: 11
  ActivePath:  (primary)
  LSPtype: Static Configured, Penultimate hop popping
  LoadBalance: Random
  Follow destination IGP metric
  Encoding type: Packet, Switching type: Packet, GPID: IPv4
  LSP Self-ping Status : Enabled
 *Primary                    State: Up
    Priorities: 7 0
    SmartOptimizeTimer: 180
    Flap Count: 2
    MBB Count: 2
    Computed ERO (S [L] denotes strict [loose] hops): (CSPF metric: 23)
 192.168.113.1 S 192.168.134.1 S 192.168.45.0 S 192.168.34.0 S 192.168.13.0 S
    Received RRO (ProtectionFlag 1=Available 2=InUse 4=B/W 8=Node 10=SoftPreempt 20=Node-ID):
          192.168.113.1(Label=26) 192.168.134.1(Label=34) 192.168.45.0(Label=33) 192.168.34.0(Label=26) 192.168.13.0(Label=3)

That’s it! We managed to create our inter-AS RSVP LSP!

Is it enough? It might be but while we are here let’s do one more thing.

BGP-LS is designed to support TE use-cases. It means it is designed to bring all those TE attributes we are familiar with.

On r1a (AS100) I configure admin groups:

root@r1a# set protocols mpls admin-groups premium 1
root@r1a# set protocols mpls interface ge-0/0/0 admin-group premium
root@r1a# set protocols isis traffic-engineering advertisement always

Advertisement always is needed with ISIS to have colors advertised into TED.

On an AS200 router I see this:

0000.0000.0011.00(100.1.1.1)->0000.0000.0012.00(100.2.2.2), Local: 192.168.12.0, Remote: 192.168.12.1
  Local interface index: 0, Remote interface index: 0
  LocalPath: 0, Metric: 10, IGP metric: 10, StaticBW: 1000Mbps, AvailBW: 1000Mbps
      Color: 0x2 1
  localBW [0] 0bps  [1] 0bps  [2] 0bps  [3] 0bps
  localBW [4] 0bps  [5] 0bps  [6] 0bps  [7] 0bps
  AvailBW [0] 1000Mbps  [1] 1000Mbps  [2] 1000Mbps  [3] 1000Mbps
  AvailBW [4] 1000Mbps  [5] 1000Mbps  [6] 1000Mbps  [7] 1000Mbps

Color information is preserved via BGP-LS. This means we can use that data to build lsps.
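
The `Color: 0x2 1` line can be read as a bitmask: admin group value N sets bit N, so premium (value 1) gives 0x2. A small Python sketch of that encoding follows; the function names are made up for illustration.

```python
def admin_group_mask(group_values):
    """Build the 32-bit color mask: admin-group value N sets bit N."""
    mask = 0
    for value in group_values:
        if not 0 <= value <= 31:
            raise ValueError("admin-group values range from 0 to 31")
        mask |= 1 << value
    return mask


def groups_in_mask(mask):
    """Return the admin-group values whose bit is set in the mask."""
    return [bit for bit in range(32) if mask & (1 << bit)]


premium = 1  # value used in "set protocols mpls admin-groups premium 1"
print(hex(admin_group_mask([premium])))  # 0x2
print(groups_in_mask(0x2))  # [1]
```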

I configure admin-group premium throughout the network.

I define another LSP:

root@r1b# set protocols mpls label-switched-path r1a-premium to 100.1.1.1 admin-group include-any premium

100.1.1.1
  From: 200.1.1.1, State: Up, ActiveRoute: 0, LSPname: r1a-premium, LSPid: 13
  ActivePath:  (primary)
  LSPtype: Static Configured, Penultimate hop popping
  LoadBalance: Random
  Follow destination IGP metric
  Encoding type: Packet, Switching type: Packet, GPID: IPv4
  LSP Self-ping Status : Enabled
 *Primary                    State: Up
    Priorities: 7 0
    SmartOptimizeTimer: 180
          Include Any: premium
    Flap Count: 0
    MBB Count: 0
    Computed ERO (S [L] denotes strict [loose] hops): (CSPF metric: 23)
 192.168.112.1 S 192.168.124.1 S 192.168.45.0 S 192.168.24.0 S 192.168.12.0 S
    Received RRO (ProtectionFlag 1=Available 2=InUse 4=B/W 8=Node 10=SoftPreempt 20=Node-ID):
          192.168.112.1(Label=32) 192.168.124.1(Label=36) 192.168.45.0(Label=35) 192.168.24.0(Label=24) 192.168.12.0(Label=3)

LSP is up and traverses routers r2a and r2b, the ones where group premium is configured.
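
The include-any constraint can be pictured as a bitmask test applied to each TED link before the path is computed. This is a minimal Python sketch under that assumption; the link names and their colors are hypothetical values chosen to mirror the path above, not data taken from the lab.

```python
PREMIUM = 1 << 1  # admin-group premium has value 1, so it sets bit 1 (0x2)


def passes_include_any(link_color, include_any):
    """A link survives if no constraint is set or it shares at least one group."""
    return include_any == 0 or (link_color & include_any) != 0


# Hypothetical link colors: only the links through r2b are assumed premium.
links = {
    "r1b-r2b": PREMIUM,
    "r1b-r3b": 0,
    "r2b-r4b": PREMIUM,
    "r3b-r4b": 0,
}

usable = [name for name, color in links.items() if passes_include_any(color, PREMIUM)]
print(usable)  # ['r1b-r2b', 'r2b-r4b']
```

With no include-any constraint every link remains usable, which is why the first LSP was free to take the other route.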

Let’s compare the two RROs:

Non premium lsp
192.168.113.1(Label=26) 192.168.134.1(Label=34) 192.168.45.0(Label=33) 192.168.34.0(Label=26) 192.168.13.0(Label=3)

Premium lsp
192.168.112.1(Label=32) 192.168.124.1(Label=36) 192.168.45.0(Label=35) 192.168.24.0(Label=24) 192.168.12.0(Label=3)

They take different routes based on the different user constraints.
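
For a quick comparison, the two RROs can be parsed and intersected. This is just a convenience sketch in Python (the `rro_hops` helper is made up for illustration); it shows that the only hop the two lsps share is the inter-AS link.

```python
import re


def rro_hops(rro):
    """Return the hop addresses of an RRO string, dropping the labels."""
    return re.findall(r"(\d+\.\d+\.\d+\.\d+)\(Label=\d+\)", rro)


non_premium = (
    "192.168.113.1(Label=26) 192.168.134.1(Label=34) 192.168.45.0(Label=33) "
    "192.168.34.0(Label=26) 192.168.13.0(Label=3)"
)
premium = (
    "192.168.112.1(Label=32) 192.168.124.1(Label=36) 192.168.45.0(Label=35) "
    "192.168.24.0(Label=24) 192.168.12.0(Label=3)"
)

premium_hops = rro_hops(premium)
shared = [hop for hop in rro_hops(non_premium) if hop in premium_hops]
print(shared)  # ['192.168.45.0']
```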

We really have just extended TE and LS to multiple domains.

Ciao
IoSonoUmberto

Transport classes auto creation

In previous posts, we have always explicitly configured transport classes.

However, this approach might not be ideal as we need to pre-provision all the TCs we might need in the future.

An alternative approach is to have transport classes on-demand.

This is our reference topology:

Router r1b (rightmost one) advertises IPv4 prefix 200.200.200.201 with community color:0:666.

Transport class for that color is not configured anywhere.

Now, on both ASs, we configure some lsps to use color 666. What uses color 666 is not important here: it might be a flex algo, an rsvp lsp or an srte lsp. Choose what you want. What matters is that there are some 666-colored lsps within both ASs.

For example, in AS 100 (left one) I have on r1a (leftmost router):

set routing-options flex-algorithm 129 color 666

To better understand how this auto creation works, we temporarily disable algo 129:

root@r1a# delete protocols isis source-packet-routing flex-algorithm 129

Next, we enable auto-creation:

root@r1a# set routing-options transport-class auto-create

Let’s check:

root@r1a# run show route receive-protocol bgp 100.0.0.100 table inet.0 hidden extensive

inet.0: 20 destinations, 20 routes (19 active, 0 holddown, 1 hidden)
  200.200.200.201/32 (1 entry, 0 announced)
     Accepted
     Nexthop: 200.1.1.1
     Localpref: 100
     AS path: 200 I
     Communities: color:0:666

root@r1a# run show routing transport-class all
Transport Class: junos-tc-555  Configured name: tc555
  Color: 555, References: 4
  Transport Endpoints: IPv4 4  IPv6 0
  Mapping community: color:0:555
  Route Target: transport-target:0:555
  Routing instance: junos-rti-tc-555
root@r1a# run show route 200.1.1.1 table inet.0

[edit]
root@r1a#

The route is hidden. Why? Even if the route has color community color:0:666, there is no TC 666, as we can see. At this point, Junos tries to resolve it as a standard IPv4 route, but there is no inet.0 route to remote PNH 200.1.1.1.

Now, we temporarily disable 200.200.200.201 advertisement on r1b.

We enable algo 129 on r1a:

root@r1a# set protocols isis source-packet-routing flex-algorithm 129

Now:

root@r1a# run show routing transport-class all
Transport Class: junos-tc-555  Configured name: tc555
  Color: 555, References: 4
  Transport Endpoints: IPv4 4  IPv6 0
  Mapping community: color:0:555
  Route Target: transport-target:0:555
  Routing instance: junos-rti-tc-555

Transport Class: junos-tc-666
  Color: 666, References: 2
  Transport Endpoints: IPv4 4  IPv6 0
  Mapping community: color:0:666
  Route Target: transport-target:0:666
  Routing instance: junos-rti-tc-666

[edit]
root@r1a# run show route table junos-rti-tc-666

junos-rti-tc-666.inet.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.3.3.3/32       *[L-ISIS/7] 00:01:51, metric 10
                    >  to 192.168.13.1 via ge-0/0/1.0
100.4.4.4/32       *[L-ISIS/7] 00:01:51, metric 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1124
200.1.1.1/32       *[BGP/167] 00:01:51, MED 20, localpref 100, from 100.4.4.4
                      AS path: 200 200 I, validation-state: unverified
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 38, Push 1124(top)
200.3.3.3/32       *[BGP/167] 00:01:51, MED 10, localpref 100, from 100.4.4.4
                      AS path: 200 200 I, validation-state: unverified
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 39, Push 1124(top)

The transport class was automatically created and already includes BGP-CT routes coming from the other AS (where we have some lsps using the same color 666).

This tells us one important thing: transport class is dynamically created as soon as we have some active lsps for a given color. It is not required to have a BGP route with a color community to trigger TC creation. This way, as soon as the BGP route arrives, everything is in place to resolve it.
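
The trigger can be summarised with a tiny model. The Python sketch below only mirrors the behaviour observed in the outputs above (a class appears when auto-create is on and at least one active lsp carries the color); the function name and the data structures are made up for illustration and say nothing about how Junos implements it.

```python
def transport_classes(configured, active_lsp_colors, auto_create):
    """Return {color: class name} for the transport classes present on the router."""
    classes = dict(configured)
    if auto_create:
        for color in sorted(active_lsp_colors):
            # Explicitly configured classes are kept as they are.
            classes.setdefault(color, f"junos-tc-{color}")
    return classes


configured = {555: "junos-tc-555"}

# Flex algo 129 disabled: no active lsp uses color 666, so no TC 666.
print(transport_classes(configured, {555}, auto_create=True))
# {555: 'junos-tc-555'}

# Flex algo 129 enabled: color 666 lsps exist and the class is created,
# even though no BGP route with color:0:666 has been received yet.
print(transport_classes(configured, {555, 666}, auto_create=True))
# {555: 'junos-tc-555', 666: 'junos-tc-666'}
```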

Let’s re-enable 200.200.200.201 advertisement on r1b to verify it gets resolved:

root@r1a# run show route 200.200.200.201

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

200.200.200.201/32 *[BGP/170] 00:00:23, localpref 100, from 100.0.0.100
                      AS path: 200 I, validation-state: unverified
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 38, Push 1124(top)

[edit]
root@r1a# run show route 200.200.200.201 extensive | match "comm|orig|Protocol nex"
                Protocol next hop: 200.1.1.1
                Communities: color:0:666
                        Protocol next hop: 200.1.1.1 Metric: 20
                                200.1.1.1/32 Originating RIB: junos-rti-tc-666.inet.3
                                Protocol next hop: 100.4.4.4 Metric: 20
                                        100.4.4.4/32 Originating RIB: junos-rti-tc-666.inet.3

That’s it, it works!

There is one thing worth looking at:

root@r1a# run show route table bgp.transport.3 match-prefix *200.1.1.1

bgp.transport.3: 8 destinations, 8 routes (8 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

200.4.4.4:8:200.1.1.1/96
                   *[BGP/167] 00:28:46, MED 20, localpref 100, from 100.4.4.4
                      AS path: 200 I, validation-state: unverified
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 33, Push 1114(top)
200.4.4.4:9:200.1.1.1/96
                   *[BGP/167] 00:08:27, MED 20, localpref 100, from 100.4.4.4
                      AS path: 200 200 I, validation-state: unverified
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 38, Push 1124(top)

[edit]
root@r1a# run show route table bgp.transport.3 match-prefix *200.1.1.1 extensive | match comm
                Communities: transport-target:0:555
                Communities: transport-target:0:666

root@r1a# run show route receive-protocol bgp 100.4.4.4 table bgp.transport.3 extensive match-prefix *200.1.1.1

bgp.transport.3: 8 destinations, 8 routes (8 active, 0 holddown, 0 hidden)
* 200.4.4.4:8:200.1.1.1/96 (1 entry, 0 announced)
     Import Accepted
     Route Distinguisher: 200.4.4.4:8
     Route Label: 33
     Nexthop: 100.4.4.4
     MED: 20
     Localpref: 100
     AS path: 200 I
     Communities: transport-target:0:555

* 200.4.4.4:9:200.1.1.1/96 (1 entry, 0 announced)
     Import Accepted
     Route Distinguisher: 200.4.4.4:9
     Route Label: 38
     Nexthop: 100.4.4.4
     MED: 20
     Localpref: 100
     AS path: 200 200 I
     Communities: transport-target:0:666

There are now two CT routes to remote PNH 200.1.1.1. The endpoint router is physically the same, but logically they are two different entities: one reachable via a tc555 path, the other via a tc666 path. We need two different routes because each one carries a different label, and the label is what tells the border router which lsp the traffic must be sent to.

After all, it is still good old Inter-AS option C with BGP-LU. We only have to think about 200.1.1.1-tc-666 and 200.1.1.1-tc-555 as two loopbacks of two different remote PEs. This is not a totally abstract idea: with transport classes and SR we slice networks into multiple layers, and each layer can be treated as an independent network.

Ciao
IoSonoUmberto

BGP-CT for Inter-AS TE

We should be confident with transport classes and how they work with BGP colored routes.

However, so far we have only seen them in action within a single domain (single AS). We can refer to that as the Intra-AS TC scenario.

What if we wanted to extend TCs to multiple ASs? That is possible and allows us to create an end to end path.

Just a few words about terminology. What kind of end to end path is this?
Informally, we might talk about a TE end to end path.
However, what we build is not a single TE path. It is more like stitching multiple TE paths together via labels.
For this reason, it might be more useful to talk about an end to end SLA path.
After all, what we do is “connect” different TCs belonging to multiple domains. TCs, as we have seen, are logical representations of SLAs and those SLAs are turned into something real through TE lsps.
Let’s agree on talking about an end to end path…which is an SLA path, not a TE one.

We might define 3 transport classes: gold, silver and bronze. The gold one might include high-bw redundant lsps while silver and bronze might include more expensive lsps. Through the color community I can map a route to a given TC, hence to a given service. It is no surprise that colored routes are also known as service routes.

The idea is to give each route the adequate service. For instance, the route towards a critical web server might be assigned to the gold tc while the one towards a mail server gets just a bronze service.

Another option is to sell VPN to customers and, based on the service they purchase, map their traffic to a specific TC.

This is nothing more than good old Traffic Engineering.

We have already said many times that one of the SR benefits is to bring TE to another level by allowing us to be more flexible and granular when providing TE paths. This is possible by combining SRTE paths (explicit, dcspf, …), flex algo and, with transport classes, by leveraging classic rsvp tunnels as well.

What if we had some traffic spanning multiple ASs that needs an end to end path providing certain SLAs?
Let’s consider this simple scenario: AS1 and AS2. Locally, they both have a premium TC. There is some traffic going from AS1 to AS2 and we would like that traffic to follow an end to end premium path. This means “stitching” the two premium paths (each internal to a single AS) so as to create an end to end one (an SLA path composed of two local TE paths, each of them part of the same TC).

This is made possible, guess what, by BGP, which introduces a new family: inet transport. This BGP flavor is known as BGP-CT: BGP Classful Transport. Basically, we want to have the same transport class on both ASs and seamlessly create an end to end path from AS1 to AS2 and vice versa. BGP-CT is the one responsible for stitching the premium path in AS1 with the premium path in AS2.

We are going to work on this topology:

but first, we will focus on the bgp-ct part to create the end to end tc555 path:

What we want to achieve is to have a path from 100.1.1.1 to 200.1.1.1 (and vice versa) in the tc 555 rib.

Both ASs are configured similarly (ISIS L2 as IGP, flex algo, RSVP, …) and we omit this part.

Most importantly, they both have TC 555 configured!

If you remember from a previous post, we saw this:

root@r1a# run show route resolution scheme all
Resolution scheme: junos-resol-schem-tc-555-v4-service
  References: 1
  Mapping community: color:0:555
  Resolution Tree index 1, Nodes: 5
  Policy: [__resol-schem-common-import-policy__]
  Contributing routing tables: junos-rti-tc-555.inet.3 inet.3

Resolution scheme: junos-resol-schem-tc-555-v4-transport
  References: 1
  Mapping community: transport-target:0:555
  Resolution Tree index 2, Nodes: 4
  Policy: [__resol-schem-common-import-policy__]
  Contributing routing tables: junos-rti-tc-555.inet.3

We have a service schema and a transport schema. Up to now, we have dealt with service routes, tagged with color community.

Now, we are going to focus on transport routes tagged with transport-target community.
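
The two schemes can be read as a lookup from community to the list of tables in which the protocol next hop is searched. The Python sketch below takes the table lists from the output above; treating the list order as a preference order, as well as the sample rib contents, are assumptions made for illustration.

```python
# Community -> contributing routing tables, as shown by the resolution schemes.
RESOLUTION_SCHEMES = {
    "color:0:555": ["junos-rti-tc-555.inet.3", "inet.3"],
    "transport-target:0:555": ["junos-rti-tc-555.inet.3"],
}


def resolve_next_hop(community, pnh, ribs):
    """Return the first contributing table holding a route to the PNH, else None."""
    for rib in RESOLUTION_SCHEMES.get(community, []):
        if pnh in ribs.get(rib, set()):
            return rib
    return None


# Hypothetical table contents, reduced to the set of reachable endpoints.
ribs = {
    "junos-rti-tc-555.inet.3": {"100.2.2.2", "100.4.4.4"},
    "inet.3": {"100.2.2.2", "100.3.3.3", "100.4.4.4"},
}

print(resolve_next_hop("color:0:555", "100.4.4.4", ribs))  # junos-rti-tc-555.inet.3
print(resolve_next_hop("color:0:555", "100.3.3.3", ribs))  # inet.3
print(resolve_next_hop("transport-target:0:555", "100.3.3.3", ribs))  # None
```

The service scheme can fall back to inet.3, while the transport scheme only looks into the tc rib.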

We have seen that tc rib includes all the routers it can reach with lsps associated to the tc.

root@r1a# run show route table junos-rti-tc-555

junos-rti-tc-555.inet.3: 4 destinations, 6 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.2.2.2/32       *[L-ISIS/7] 02:05:04, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
100.4.4.4/32       *[L-ISIS/7] 02:05:04, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1114
                    [SPRING-TE/8] 02:11:49, metric 1, metric2 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104
                    [RSVP/9/1] 02:14:11, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, label-switched-path rsvp-r4a

This tells us tc 555 can reach r2a (100.2.2.2) via a flex algo route and r4a (100.4.4.4) via three routes (flex algo, rsvp, srte).

The active route for each destination is copied into another table called bgp.transport.3:

root@r1a# run show route table bgp.transport.3

bgp.transport.3: 5 destinations, 5 routes (5 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.1.1.1:8:100.2.2.2/96
                   *[L-ISIS/7] 02:06:44, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
100.1.1.1:8:100.4.4.4/96
                   *[L-ISIS/7] 02:06:44, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1114
100.1.1.1:9:100.3.3.3/96
                   *[L-ISIS/7] 02:06:44, metric 10
                    >  to 192.168.13.1 via ge-0/0/1.0

root@r1a# run show route table bgp.transport.3 extensive | match comm
                Communities: transport-target:0:555
                Communities: transport-target:0:555

Those routes, even if they are not BGP-learned routes (the protocol is L-ISIS), are tagged with a transport-target community.

We might say that, as a starting point, bgp.transport.3 includes routes towards all the intra-as routers it can reach within that tc.

Moreover, you can see those routes have an MP-BGP-like NLRI structure. They start with an RD. That RD was automatically generated, and that is why we needed to configure route-distinguisher-id under routing-options when enabling transport classes.
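
To make the structure explicit, here is a small Python sketch that splits one of those entries into RD and endpoint. It assumes the /96 shown in the table counts the 64-bit RD plus a 32-bit IPv4 prefix, which matches the loopback addresses in the output; the `parse_ct_nlri` helper is a made-up name.

```python
import ipaddress

RD_BITS = 64  # a route distinguisher is 8 bytes long


def parse_ct_nlri(nlri):
    """Split 'admin:number:prefix/len' into RD, IPv4 prefix and IPv4 prefix length."""
    body, length = nlri.rsplit("/", 1)
    rd_admin, rd_number, prefix = body.split(":")
    ipaddress.ip_address(rd_admin)  # raises ValueError if not a valid address
    ipaddress.ip_address(prefix)
    return {
        "rd": f"{rd_admin}:{rd_number}",
        "prefix": prefix,
        "prefix_bits": int(length) - RD_BITS,
    }


print(parse_ct_nlri("100.1.1.1:8:100.4.4.4/96"))
# {'rd': '100.1.1.1:8', 'prefix': '100.4.4.4', 'prefix_bits': 32}
```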

Similarly, we can see this on r4a:

root@r4a> show route table bgp.transport.3

bgp.transport.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.4.4.4:9:100.1.1.1/96
                   *[L-ISIS/14] 18:54:15, metric 20
                    >  to 192.168.24.0 via ge-0/0/0.0, Push 1111
100.4.4.4:9:100.2.2.2/96
                   *[L-ISIS/14] 18:54:15, metric 10
                    >  to 192.168.24.0 via ge-0/0/0.0

R4a is a special router; it is a border router between AS 100 and AS 200. It has the information about the endpoints it can reach within tc 555 stored in bgp.transport.3. The “bgp.” in the table name is there for a reason: the idea is to use BGP to advertise those routes to r4b, the border router of AS 200.

We are going to use eBGP-CT. This is the configuration on r4a (r4b mirrors it):

set protocols bgp group ebgp type external
set protocols bgp group ebgp import imp-ebgp
set protocols bgp group ebgp family inet transport
set protocols bgp group ebgp export exp-ebgp
set protocols bgp group ebgp peer-as 200
set protocols bgp group ebgp neighbor 192.168.45.1

set policy-options policy-statement exp-ebgp term tc from community tc555
set policy-options policy-statement exp-ebgp term tc then accept

set policy-options policy-statement imp-ebgp term ok from protocol bgp
set policy-options policy-statement imp-ebgp term ok then accept

The key here is to enable family inet transport (BGP-CT). Then, we have an export policy to advertise transport routes tagged with transport-target:0:555 (the transport rt associated with tc555) and an import policy to accept bgp routes. Here we do not filter on any specific transport rt but accept anything; filtering can be achieved with the usual policy manipulation.

As a result:

root@r4a> show route advertising-protocol bgp 192.168.45.1 table bgp.transport.3 extensive

bgp.transport.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
* 100.4.4.4:9:100.1.1.1/96 (1 entry, 1 announced)
 BGP group ebgp type External
     Route Distinguisher: 100.4.4.4:9
     Route Label: 18
     Nexthop: Self
     Flags: Nexthop Change
     MED: 20
     AS path: [100] I
     Communities: transport-target:0:555

* 100.4.4.4:9:100.2.2.2/96 (1 entry, 1 announced)
 BGP group ebgp type External
     Route Distinguisher: 100.4.4.4:9
     Route Label: 19
     Nexthop: Self
     Flags: Nexthop Change
     MED: 10
     AS path: [100] I
     Communities: transport-target:0:555

As you can see, a label is assigned as well.

Next, we move to r4b, the border router in AS 200.

root@r4b> show route table bgp.transport.3

bgp.transport.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

200.4.4.4:8:200.1.1.1/96
                   *[L-ISIS/14] 19:07:54, metric 20
                    >  to 192.168.124.0 via ge-0/0/0.0, Push 1211
200.4.4.4:8:200.2.2.2/96
                   *[L-ISIS/14] 19:08:04, metric 10
                    >  to 192.168.124.0 via ge-0/0/0.0
100.4.4.4:9:100.1.1.1/96
                   *[BGP/167] 18:42:20, MED 20, localpref 100
                      AS path: 100 I, validation-state: unverified
                    >  to 192.168.45.0 via ge-0/0/2.0, Push 18
100.4.4.4:9:100.2.2.2/96
                   *[BGP/167] 18:42:20, MED 10, localpref 100
                      AS path: 100 I, validation-state: unverified
                    >  to 192.168.45.0 via ge-0/0/2.0, Push 19

Here, table bgp.transport.3 has internal routes plus BGP-CT routes to reach endpoints in AS 100. Notice that, to reach the remote endpoints, the labels r4a advertised are pushed.

The last step is to advertise those remote endpoints internally, to r1b. This is done via iBGP-CT.

On r4b:

set protocols bgp group ibgp type internal
set protocols bgp group ibgp local-address 200.4.4.4
set protocols bgp group ibgp family inet transport
set protocols bgp group ibgp neighbor 200.1.1.1

On r1b:

set protocols bgp group ibgp type internal
set protocols bgp group ibgp local-address 200.1.1.1
set protocols bgp group ibgp import imp-ct
set protocols bgp group ibgp family inet transport
set protocols bgp group ibgp neighbor 200.4.4.4
set policy-options policy-statement imp-ct term ok from protocol bgp
set policy-options policy-statement imp-ct term ok from community tc555
set policy-options policy-statement imp-ct term ok then accept

This is what we have on r1b bgp.transport.3:

root@r1b> show route receive-protocol bgp 200.4.4.4 table bgp.transport.3 extensive

bgp.transport.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
* 100.4.4.4:9:100.1.1.1/96 (1 entry, 0 announced)
     Import Accepted
     Route Distinguisher: 100.4.4.4:9
     Route Label: 30
     Nexthop: 200.4.4.4
     MED: 20
     Localpref: 100
     AS path: 100 I
     Communities: transport-target:0:555

* 100.4.4.4:9:100.2.2.2/96 (1 entry, 0 announced)
     Import Accepted
     Route Distinguisher: 100.4.4.4:9
     Route Label: 29
     Nexthop: 200.4.4.4
     MED: 10
     Localpref: 100
     AS path: 100 I
     Communities: transport-target:0:555

root@r1b> show route table bgp.transport.3

bgp.transport.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

200.1.1.1:8:200.2.2.2/96
                   *[L-ISIS/14] 19:18:29, metric 10
                    >  to 192.168.112.1 via ge-0/0/0.0
200.1.1.1:8:200.4.4.4/96
                   *[L-ISIS/14] 19:16:26, metric 20
                    >  to 192.168.112.1 via ge-0/0/0.0, Push 1214
100.4.4.4:9:100.1.1.1/96
                   *[BGP/167] 18:50:41, MED 20, localpref 100, from 200.4.4.4
                      AS path: 100 I, validation-state: unverified
                    >  to 192.168.112.1 via ge-0/0/0.0, Push 30, Push 1214(top)
100.4.4.4:9:100.2.2.2/96
                   *[BGP/167] 18:50:41, MED 10, localpref 100, from 200.4.4.4
                      AS path: 100 I, validation-state: unverified
                    >  to 192.168.112.1 via ge-0/0/0.0, Push 29, Push 1214(top)

R1b can reach both endpoints in its own AS and remote endpoints (r1a and r2a). Again, r3a is not reachable as tc 555 in AS 100 has no route to that router.

To reach r1a (PNH 100.1.1.1) a double push is performed. Why?

To understand it, we need to recall the resolution schema and what the BGP advertisement contains:

* 100.4.4.4:9:100.1.1.1/96 (1 entry, 0 announced)
     Import Accepted
     Route Distinguisher: 100.4.4.4:9
     Route Label: 30
     Nexthop: 200.4.4.4
     MED: 20
     Localpref: 100
     AS path: 100 I
     Communities: transport-target:0:555

Route has community transport-target:0:555. Resolution scheme for that community says to resolve the route in tc 555 rib. There we should find a route for the BGP protocol next-hop (200.4.4.4):

root@r1b> show route 200.4.4.4 table junos-rti-tc-555

junos-rti-tc-555.inet.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

200.4.4.4/32       *[L-ISIS/14] 19:21:23, metric 20
                    >  to 192.168.112.1 via ge-0/0/0.0, Push 1214

There it is! Top label is the label to reach PNH (200.4.4.4) while bottom label is CT route label.

Our end to end path is ready!

Let’s follow it from r1b to r1a.

As just seen, first step is to reach r4b (AS 200 border router). We follow the lsp to r4b alg 128 node-sid (1214).

On r4b we have label 30 as top label. This is how it is processed:

root@r4b> show route label 30

mpls.0: 25 destinations, 25 routes (25 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

30                 *[VPN/170] 19:01:18
                    >  to 192.168.45.0 via ge-0/0/2.0, Swap 18

A label swap is performed. New label is the route label advertised over eBGP-CT by r4a.

Next, we are in AS 100. R4a processes label 18:

root@r4a> show route label 18

mpls.0: 29 destinations, 29 routes (29 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

18                 *[VPN/170] 19:28:27
                    >  to 192.168.24.0 via ge-0/0/0.0, Swap 1111

Label swap to r1a node sid to reach our final destination (100.1.1.1).
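
The whole walk can be replayed with a small table-driven simulation in Python. The swap entries come from the outputs above; the two pop entries and the router names r2b and r2a assume penultimate hop popping on the intermediate routers, which is not shown in the outputs.

```python
# Per-router label actions: {router: {incoming top label: (action, new label)}}
ACTIONS = {
    "r2b": {1214: ("pop", None)},   # assumed PHP towards r4b
    "r4b": {30: ("swap", 18)},      # from "show route label 30" on r4b
    "r4a": {18: ("swap", 1111)},    # from "show route label 18" on r4a
    "r2a": {1111: ("pop", None)},   # assumed PHP towards r1a
}


def forward(stack, routers):
    """Apply each router's action to the top label; the top is the last element."""
    stack = list(stack)
    trace = []
    for router in routers:
        action, new_label = ACTIONS[router][stack[-1]]
        if action == "swap":
            stack[-1] = new_label
        else:
            stack.pop()
        trace.append((router, action, list(stack)))
    return trace


# r1b pushes CT label 30 first, then node-sid 1214 on top.
for step in forward([30, 1214], ["r2b", "r4b", "r4a", "r2a"]):
    print(step)
# ('r2b', 'pop', [30])
# ('r4b', 'swap', [18])
# ('r4a', 'swap', [1111])
# ('r2a', 'pop', [])
```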

Route to 100.1.1.1 on r1b is also available in tc 555 rib:

root@r1b> show route 100.1.1.1

junos-rti-tc-555.inet.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.1.1.1/32       *[BGP/167] 19:04:33, MED 20, localpref 100, from 200.4.4.4
                      AS path: 100 I, validation-state: unverified
                    >  to 192.168.112.1 via ge-0/0/0.0, Push 30, Push 1214(top)

Why is that? Suppose a BGP route with PNH 100.1.1.1 and community color:0:555 arrives at r1b. As we have tc 555 configured, that BGP route is resolved in the tc 555 rib, so we need a route there towards 100.1.1.1.

Now, the difference between service routes and transport routes should be clear.

Transport routes, tagged with a transport-target community, are used to build end to end tunnels across ASs. They allow routers in an AS to reach endpoints in remote ASs.

Service routes, instead, are non infra routes (e.g. customer routes) that rely on transport routes to find a valid inter-as end to end tunnel to reach BGP PNH.

There are some interesting considerations to be made about their inter-dependence and fallback but we will see them later.

You have probably already thought about it: building inter-as end to end labelled paths is not something new. It looks like an evolved inter-as option C.
The comparison is correct. BGP-CT replaces BGP-LU while intra-AS LDP/RSVP lsps are replaced by a mix of SRTE/RSVP/FlexAlgo lsps to provide better TE capabilities. On top of that, transport classes allow us to further slice the network to provide different SLAs.

One difference between this BGP-CT driven model and a BGP-LU one is that remote PNH are not installed into inet.0 but into tc ribs (and only those tc ribs that share the same tc with remote endpoint). This has a consequence: it is less straightforward to build that PE-PE multihop MP-eBGP session used to carry service routes.

However, this is not a big deal as, commonly, we do not have those PE-PE sessions but go through the mediation of route reflectors.

Here we will use this model:

  • both ASs have their RR (rr100 and rr200)
  • rr100 peers with r1a
  • rr200 peers with r1b
  • rr100 and rr200 have an eBGP multihop session to exchange routes without modifying the next-hop
  • rrs must have reachability somehow (this is out of the scope of this post)

Here is a sample minimal config for rr100 (rr200 mirrors it):

set interfaces ge-0/0/0 unit 0 family inet address 192.168.100.1/31
set interfaces ge-0/0/0 unit 0 family iso
set interfaces ge-0/0/0 unit 0 family mpls
set interfaces lo0 unit 0 family inet address 100.0.0.100/32
set interfaces lo0 unit 0 family iso address 49.0001.0000.0000.0100.00
set routing-options rib inet.3 static route 0.0.0.0/0 discard
set routing-options autonomous-system 100
set routing-options static route 200.0.0.0/8 next-hop 192.168.100.0
set protocols bgp family inet unicast
set protocols bgp family inet-vpn unicast
set protocols bgp group rr type internal
set protocols bgp group rr local-address 100.0.0.100
set protocols bgp group rr cluster 0.0.0.100
set protocols bgp group rr neighbor 100.1.1.1
set protocols bgp group rr neighbor 100.4.4.4
set protocols bgp group otherrr type external
set protocols bgp group otherrr local-address 100.0.0.100
set protocols bgp group otherrr peer-as 200
set protocols bgp group otherrr neighbor 200.0.0.200 multihop no-nexthop-change
set protocols isis interface ge-0/0/0.0 level 1 disable
set protocols isis interface ge-0/0/0.0 point-to-point
set protocols isis interface lo0.0
set protocols isis level 2 wide-metrics-only

In a previous post we already set up a bgp session between r1a/r4a and rr100 to exchange inet unicast and inet-vpn routes. Now, we do the same on r1b (towards rr200) and extend the VPN to r1b as well (we omit the bgp session to the rr and the vrf configuration on r1b as they are identical to what was done in the previous post).

As a result, on r1a, we have this for an inet route:

root@r1a# run show route 200.200.200.200

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

200.200.200.200/32 *[BGP/170] 03:22:10, localpref 100, from 100.0.0.100
                      AS path: 200 I, validation-state: unverified
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 33, Push 1114(top)

[edit]
root@r1a# run show route 200.200.200.200 extensive | match "comm|orig|Protocol nex"
                Protocol next hop: 200.1.1.1
                Communities: color:0:555
                        Protocol next hop: 200.1.1.1 Metric: 20
                                200.1.1.1/32 Originating RIB: junos-rti-tc-555.inet.3
                                Protocol next hop: 100.4.4.4 Metric: 20
                                        100.4.4.4/32 Originating RIB: junos-rti-tc-555.inet.3

and this for a vpn route:

root@r1a# run show route table vpn100.inet.0

vpn100.inet.0: 3 destinations, 3 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

1.2.3.4/32         *[Direct/0] 08:53:23
                    >  via lo0.100
4.3.2.1/32         *[BGP/170] 02:31:11, localpref 100, from 100.0.0.100
                      AS path: I, validation-state: unverified
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 35, Push 1114(top)
5.6.7.8/32         *[BGP/170] 03:23:56, localpref 100, from 100.0.0.100
                      AS path: 200 I, validation-state: unverified
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 18, Push 33, Push 1114(top)

[edit]
root@r1a# run show route table vpn100.inet.0 5.6.7.8 extensive | match "comm|orig|Protocol nex"
                Protocol next hop: 200.1.1.1
                Communities: target:1:100 color:0:555
                        Protocol next hop: 200.1.1.1 Metric: 20
                                200.1.1.1/32 Originating RIB: junos-rti-tc-555.inet.3
                                Protocol next hop: 100.4.4.4 Metric: 20
                                        100.4.4.4/32 Originating RIB: junos-rti-tc-555.inet.3

As you can see, the VPN route requires 3 labels to be pushed, which is what we had with classic inter-as option C. Here, we might have more if the intra-AS lsp was, for example, an SRTE path built with Adj SIDs.

As already said, we might see this BGP-CT scenario as the enabler for an evolved Inter-AS option C use-case leveraging the advanced TE capabilities transport classes and SR can provide.

Ciao
IoSonoUmberto

Get to know transport classes

Remember SR color communities? We used them to provide granular and advanced TE to traffic.

Routes tagged with them relied on the inetcolor.0 table and extended color next-hop resolution.

Well, networks are fast and, sometimes, technical enhancements are too.

So guess what? We have a new model to work with color communities. It relies on a new object: transport class.

Let’s start discovering this new model.

We will use a simple topology:

First, we provision our network:

  • ISIS L2 only
  • RSVP
  • Flex algo 128 and 129
  • SR

We omit basic configuration of those protocols and focus on what’s new.

First, we configure a SRTE path from r1a to r4a with color 555:

set protocols source-packet-routing segment-list r4a-via-r3a r3 label 1103
set protocols source-packet-routing segment-list r4a-via-r3a r4 label 1104
set protocols source-packet-routing source-routing-path r4a-colored to 100.4.4.4
set protocols source-packet-routing source-routing-path r4a-colored color 555
set protocols source-packet-routing source-routing-path r4a-colored binding-sid 1000555
set protocols source-packet-routing source-routing-path r4a-colored primary r4a-via-r3a bfd-liveness-detection sbfd remote-discriminator 14
set protocols source-packet-routing source-routing-path r4a-colored primary r4a-via-r3a bfd-liveness-detection minimum-interval 1000
set protocols source-packet-routing source-routing-path r4a-colored primary r4a-via-r3a bfd-liveness-detection multiplier 3

By default, as already seen here, the route is placed into inetcolor.0:

root@r1a# run show route table inetcolor.0

inetcolor.0: 1 destinations, 1 routes (1 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.4.4.4-555<c>/64
                   *[SPRING-TE/8] 00:00:14, metric 1, metric2 10
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104

Next, we check algo 128 routes. They end up in inetcolor.0 as well:

root@r1a# run show route table inetcolor.0

inetcolor.0: 3 destinations, 3 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.2.2.2-128<c>/64
                   *[L-ISIS/14] 00:00:06, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
100.4.4.4-555<c>/64
                   *[SPRING-TE/8] 00:02:18, metric 1, metric2 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104
100.4.4.4-128<c>/64
                   *[L-ISIS/14] 00:00:06, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1204

We configure another color for Algo 128:

set routing-options flex-algorithm 128 color 555

As a result:

root@r1a# run show route table inetcolor.0

inetcolor.0: 2 destinations, 3 routes (2 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.2.2.2-555<c>/64
                   *[L-ISIS/14] 00:00:02, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
100.4.4.4-555<c>/64
                   *[SPRING-TE/8] 00:02:46, metric 1, metric2 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104
                    [L-ISIS/14] 00:00:02, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1204

Algo 128 routes are now “mapped” to color 555.

Now, we define the transport class:

set routing-options route-distinguisher-id 100.1.1.1
set routing-options transport-class name tc555 color 555

It is mandatory to configure route-distinguisher-id in order to autogenerate RDs for transport routes. We will talk about this later when moving to an inter-as scenario.

Transport class gets created:

root@r1a# run show routing transport-class all
Transport Class: junos-tc-555  Configured name: tc555
  Color: 555, References: 3
  Transport Endpoints: IPv4 2  IPv6 0
  Mapping community: color:0:555
  Route Target: transport-target:0:555
  Routing instance: junos-rti-tc-555

As you can see TC is mapped to 2 communities:

  • color:0:555, classic color community that will no longer rely on inetcolor.0
  • transport-target:0:555, route target used for transport routes. This will come into play in an inter-as scenario. We can forget about it right now

TC has its own resolution scheme:

root@r1a# run show route resolution scheme all
Resolution scheme: junos-resol-schem-tc-555-v4-service
  References: 1
  Mapping community: color:0:555
  Resolution Tree index 5, Nodes: 3
  Policy: [__resol-schem-common-import-policy__]
  Contributing routing tables: junos-rti-tc-555.inet.3 inet.3

Resolution scheme: junos-resol-schem-tc-555-v4-transport
  References: 1
  Mapping community: transport-target:0:555
  Resolution Tree index 6, Nodes: 2
  Policy: [__resol-schem-common-import-policy__]
  Contributing routing tables: junos-rti-tc-555.inet.3

Junos creates a new routing table: junos-rti-tc-555.inet.3.

Let’s focus on the “service” scheme, the one mapped to color:0:555.
Please notice this scheme also has inet.3 among its contributing ribs. This is a fallback mechanism: a service route is resolved into junos-rti-tc-555.inet.3 first; if there is no match, it looks for a fallback route in inet.3.
This fallback mechanism is not available for transport routes. Again, we will talk about this later.
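
To make the fallback idea concrete, here is a minimal Python sketch of the two schemes. It only models what the output above shows (an ordered list of contributing tables per mapping community). The table contents and the resolve helper are made up for illustration and this is not how Junos implements resolution internally.

```python
# Hypothetical model of the two resolution schemes shown above.
# Table contents are illustrative, not taken from a live router.
RIBS = {
    "junos-rti-tc-555.inet.3": {"100.4.4.4"},
    "inet.3": {"100.4.4.4", "100.3.3.3"},
}

SCHEMES = {
    # mapping community -> contributing tables, in lookup order
    "color:0:555": ["junos-rti-tc-555.inet.3", "inet.3"],
    "transport-target:0:555": ["junos-rti-tc-555.inet.3"],
}


def resolve(next_hop, community):
    """Return the first contributing table that holds next_hop, or None."""
    for table in SCHEMES.get(community, []):
        if next_hop in RIBS[table]:
            return table
    return None


print(resolve("100.4.4.4", "color:0:555"))             # found in the TC rib
print(resolve("100.3.3.3", "color:0:555"))             # falls back to inet.3
print(resolve("100.3.3.3", "transport-target:0:555"))  # no fallback for transport
```

In this toy model the service scheme finds 100.3.3.3 only thanks to inet.3, while the transport scheme has nowhere to fall back to.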

To populate the new routing table we have to map lsps to transport classes.
We can map Flex algo lsps, SRTE lsps or RSVP lsps.

On r1a:

set routing-options flex-algorithm 128 use-transport-class
set protocols source-packet-routing use-transport-class
set protocols mpls label-switched-path rsvp-r4a transport-class tc555 primary r1a-r2a-r3a-r4a
set protocols mpls label-switched-path rsvp-r4a to 100.4.4.4
set protocols mpls path r1a-r2a-r3a-r4a 100.2.2.2 strict
set protocols mpls path r1a-r2a-r3a-r4a 100.3.3.3 strict
set protocols mpls path r1a-r2a-r3a-r4a 100.4.4.4 strict

As a result:

root@r1a# run show route table junos-rti-tc-555.inet.3

junos-rti-tc-555.inet.3: 2 destinations, 4 routes (2 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.2.2.2/32       *[L-ISIS/14] 00:01:51, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
100.4.4.4/32       *[RSVP/7/1] 00:05:38, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, label-switched-path rsvp-r4a
                    [SPRING-TE/8] 00:04:35, metric 1, metric2 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104
                    [L-ISIS/14] 00:01:51, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1204

What do we have?

  • one single route to 100.2.2.2 (r2a) coming from flex algo 128 (r2a participates in algo 128)
  • three routes to 100.4.4.4 (r4a): flex algo 128 + rsvp + SRTE colored lsp
  • no route to 100.3.3.3 (r3a) as r3a does not participate in algo 128 and we have no rsvp tunnels or SRTE lsps to it mapped to transport class 555

Let’s move to r4a where we configured the same transport class (555). Let’s check tc table there:

root@r4a> show route table junos-rti-tc-555

junos-rti-tc-555.inet.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.1.1.1/32       *[L-ISIS/14] 15:42:50, metric 20
                    >  to 192.168.24.0 via ge-0/0/0.0, Push 1111
100.2.2.2/32       *[L-ISIS/14] 15:42:50, metric 10
                    >  to 192.168.24.0 via ge-0/0/0.0

We have routes to r1a and r2a that belong to algo 128.

Both r1a and r4a can reach r2a but not r3a (no algo 128 there and no srte or rsvp lsps). They can also reach each other.

To reach r1a, r4a only has one route, the Algo 128 one. If that route goes down, r4a cannot reach r1a anymore within TC 555.

On the other hand, r1a has multiple lsps within TC 555 to reach r4a. This means it is more resilient and fault tolerant.

We start understanding some principles and advantages of transport classes:

  • each TC is associated to a color
  • each TC has its own routing table
  • that routing table contains lsps towards other nodes in the network
  • those lsps are the ones that, somehow, were associated to the TC (SRTE lsp with the right color, rsvp lsp mapped to the TC, flex algo with the right color)
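
The principles above can be sketched in a few lines of Python. This is just a mental model, not Junos code: the LSP inventory mirrors what we saw on r1a, the unmapped RSVP lsp to r3a is a hypothetical entry added to show what gets left out, and build_tc_ribs is a name I made up.

```python
from collections import defaultdict

# Illustrative LSP inventory for r1a, mirroring the outputs above.
# "color" is the color / transport class the lsp was mapped to (None = unmapped).
LSPS = [
    {"proto": "L-ISIS", "to": "100.2.2.2", "color": 555},     # flex algo 128
    {"proto": "L-ISIS", "to": "100.4.4.4", "color": 555},     # flex algo 128
    {"proto": "SPRING-TE", "to": "100.4.4.4", "color": 555},  # r4a-colored
    {"proto": "RSVP", "to": "100.4.4.4", "color": 555},       # rsvp-r4a
    {"proto": "RSVP", "to": "100.3.3.3", "color": None},      # hypothetical, unmapped
]


def build_tc_ribs(lsps):
    """Group lsps by color; each color gets its own table keyed by destination."""
    ribs = defaultdict(lambda: defaultdict(list))
    for lsp in lsps:
        if lsp["color"] is not None:
            ribs[lsp["color"]][lsp["to"]].append(lsp["proto"])
    return ribs


tc555 = build_tc_ribs(LSPS)[555]
for dest in sorted(tc555):
    print(dest, sorted(tc555[dest]))
```

The resulting table has one entry for 100.2.2.2, three for 100.4.4.4 and nothing for 100.3.3.3, which is the same shape as junos-rti-tc-555.inet.3 on r1a.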

One difference from the inetcolor.0 model is that we can include rsvp lsps as well (with inetcolor.0 only SRTE lsps and flex algo could be given a color). This means higher flexibility and more TE potential.

Till here, we kinda built the “underlay”, made of lsps, of the transport class. Now, it is time to use it.

We add another router acting as RR, peering with r1a and r4a. Only “family inet unicast” is enabled.

Let’s check bgp conf on r1a:

set protocols bgp group rr type internal
set protocols bgp group rr local-address 100.1.1.1
set protocols bgp group rr family inet unicast
set protocols bgp group rr export exp-rr
set protocols bgp group rr neighbor 100.0.0.100
set policy-options policy-statement exp-rr term stc-inet from protocol static
set policy-options policy-statement exp-rr term stc-inet from route-filter 100.100.100.100/32 exact
set policy-options policy-statement exp-rr term stc-inet then community add color555
set policy-options policy-statement exp-rr term stc-inet then accept
set policy-options community color555 members color:0:555
set routing-options static route 100.100.100.100/32 discard

It is a very standard bgp conf. Please notice, we do not need the extended color next-hop knob to resolve color communities. This might seem a small difference from the inetcolor.0 model, but it has its advantages, which will be clearer later.

How does junos resolve colored next-hops without that knob? It uses the resolution scheme we saw before, the one with color:0:555 as mapping community:

Resolution scheme: junos-resol-schem-tc-555-v4-service
  References: 1
  Mapping community: color:0:555
  Resolution Tree index 1, Nodes: 5
  Policy: [__resol-schem-common-import-policy__]
  Contributing routing tables: junos-rti-tc-555.inet.3 inet.3

Basically, this tells junos to resolve routes tagged with color:0:555 in the TC routing table (or inet.3 as a fallback).

Similarly to r1a, r4a advertises 123.4.4.41 with community color:0:555.

Let’s check it on r1a:

root@r1a# run show route 123.4.4.41

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

123.4.4.41/32      *[BGP/170] 00:03:17, localpref 100, from 100.0.0.100
                      AS path: I, validation-state: unverified
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104

[edit]
root@r1a# run show route 123.4.4.41 extensive | match "comm|Orig|Protocol next"
                Protocol next hop: 100.4.4.4
                AS path: I  (Originator)
                Originator ID: 100.4.4.4
                Communities: color:0:555
                        Protocol next hop: 100.4.4.4 Metric: 20
                                100.4.4.4/32 Originating RIB: junos-rti-tc-555.inet.3
                                Protocol next hop: 1103 Metric: 10
                                        1103 /52 Originating RIB: mpls.0

[edit]
root@r1a# run show route 100.4.4.4 active-path

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.4.4.4/32       *[IS-IS/18] 16:38:59, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0
                       to 192.168.13.1 via ge-0/0/1.0

inet.3: 3 destinations, 4 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.4.4.4/32       *[L-ISIS/14] 16:38:59, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1104
                       to 192.168.13.1 via ge-0/0/1.0, Push 1104

junos-rti-tc-555.inet.3: 4 destinations, 6 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.4.4.4/32       *[SPRING-TE/8] 00:01:25, metric 1, metric2 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104

As you can see route is resolved within TC rib by using the SRTE path.

TC rib has 3 routes to PNH (protocol next-hop) 100.4.4.4:

junos-rti-tc-555.inet.3: 4 destinations, 6 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.4.4.4/32       *[SPRING-TE/8] 00:03:38, metric 1, metric2 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104
                    [RSVP/9/1] 00:06:00, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, label-switched-path rsvp-r4a
                    [L-ISIS/14] 00:08:28, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1114

We can influence which lsp is used, for example by playing with preferences:

root@r1a# set protocols isis level 2 flex-algorithm-preference 7

[edit]
root@r1a# commit
commit complete

[edit]
root@r1a# run show route table junos-rti-tc-555 100.4.4.4

junos-rti-tc-555.inet.3: 4 destinations, 6 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100.4.4.4/32       *[L-ISIS/7] 00:00:04, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1114
                    [SPRING-TE/8] 00:06:49, metric 1, metric2 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1104
                    [RSVP/9/1] 00:09:11, metric 20
                    >  to 192.168.12.1 via ge-0/0/0.0, label-switched-path rsvp-r4a

[edit]
root@r1a# run show route 123.4.4.41

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

123.4.4.41/32      *[BGP/170] 00:01:31, localpref 100, from 100.0.0.100
                      AS path: I, validation-state: unverified
                    >  to 192.168.12.1 via ge-0/0/0.0, Push 1114
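
The preference game we just played can be modelled with a tiny Python sketch. It assumes only what the outputs show: among the candidates in the TC rib, the one with the lowest preference becomes active. Real Junos route selection has more tie-breakers that are not modelled here, and active_route is a hypothetical helper.

```python
def active_route(candidates):
    """Pick the candidate with the lowest preference value."""
    return min(candidates, key=lambda r: r["pref"])


# Candidates for 100.4.4.4 in junos-rti-tc-555.inet.3 (values from the output above).
routes = [
    {"proto": "SPRING-TE", "pref": 8},
    {"proto": "RSVP", "pref": 9},
    {"proto": "L-ISIS", "pref": 14},
]
before = active_route(routes)["proto"]

# Effect of "flex-algorithm-preference 7": the L-ISIS candidate now has preference 7.
routes[2]["pref"] = 7
after = active_route(routes)["proto"]

print(before, "->", after)
```

Before the change the SRTE path wins; after it, the flex algo route takes over, and the BGP route follows because it resolves over whatever is active in the TC rib.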

This is the transport color model replacing the inetcolor one.

Summing up:

  • we slice our network into transport classes
  • each transport class has its own routing table
  • that table contains lsps to other routers in the network
  • those lsps might come from rsvp, flex algo, srte path where appropriate tagging is configured
  • BGP routes still rely on color community
  • BGP colored routes are resolved into TC rib (or inet.3 as fallback) with no need of the extended color nexthop knob

Are we done? Of course, not…but for today it is enough

Ciao
IoSonoUmberto

Taking shortcuts with IGP

IGP protocols, like ISIS and OSPF, are based on the implementation of the shortest path first (SPF) algorithm. This means that a given node locally computes the best path towards all the remote destinations.

ISIS and OSPF are link-state protocols, meaning each node has the complete view of the network.

By default, LSPs (e.g. RSVP lsps) are not considered when running the SPF algo.

There are two ways to have LSP taken into consideration.

One, is to establish so-called forwarding adjacencies between source and destination of a lsp. This is more intrusive but makes the whole network aware of the lsp. By that, I mean that all the nodes learn about the lsp (and its cost) and can use it when running SPF.

Two, is to use so-called shortcuts. This means leveraging lsps as a way to “move faster” (like a shortcut) through the SPF from a source to a destination. Unlike forwarding adjacencies, shortcuts are a “local measure”.

Based on Juniper official documentation, this is what happens:

  • first, standard SPF is run
  • then, “second computation is performed considering only LSPs as a logical interface. Each LSP’s egress router is considered. The list of destinations whose shortest path traverses the egress router (established during the first computation) is placed in the inet.3 routing table.”

There are two key things we learn from the second bullet:

  • shortcuts are used to create routes within inet.3 table, so available for BGP next-hops
  • a given lsp is used as shortcut for all the destinations traversing the lsp egress router. This means that, for example, to reach node X only a lsp terminating on a node along the shortest path to X can be used

With that in mind, we can start playing a bit in order to understand how these shortcuts work.

I will use this topology and focus on path R1-R5:

By default, R1 has two ECMP paths to R5.

root@r1# run show route 5.5.5.5

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:01:06, metric 30
                    >  to 192.168.12.1 via ge-0/0/1.0
                       to 192.168.13.1 via ge-0/0/2.0

Next, I create a RSVP lsp from R1 to R4 via R3:

set protocols mpls label-switched-path r4viar3 to 4.4.4.4
set protocols mpls label-switched-path r4viar3 primary r3strict
set protocols mpls path r3strict 3.3.3.3 strict

root@r1# run show mpls lsp
Ingress LSP: 1 sessions
To              From            State Rt P     ActivePath       LSPname
4.4.4.4         1.1.1.1         Up     0 *     r3strict         r4viar3
Total 1 displayed, Up 1, Down 0

No change in RIB:

root@r1# run show route 5.5.5.5

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:08:56, metric 30
                    >  to 192.168.12.1 via ge-0/0/1.0
                       to 192.168.13.1 via ge-0/0/2.0

We enable shortcuts on R1:

root@r1# set protocols isis traffic-engineering family inet shortcuts

ISIS now knows about the lsp:

root@r1# run show isis route
 IS-IS routing table             Current version: L1: 2 L2: 22
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
2.2.2.2/32         2      22       10 int  ge-0/0/1.0      IPV4 r2
3.3.3.3/32         2      22       10 int  ge-0/0/2.0      IPV4 r3
4.4.4.4/32         2      22       20 int  ge-0/0/1.0      IPV4 r2
                                           ge-0/0/2.0      IPV4 r3
                                           ge-0/0/2.0      LSP  r4viar3
5.5.5.5/32         2      22       30 int  ge-0/0/1.0      IPV4 r2
                                           ge-0/0/2.0      IPV4 r3
                                           ge-0/0/2.0      LSP  r4viar3
192.168.24.0/31    2      22       20 int  ge-0/0/1.0      IPV4 r2
192.168.34.0/31    2      22       20 int  ge-0/0/2.0      IPV4 r3
192.168.45.0/31    2      22       30 int  ge-0/0/1.0      IPV4 r2
                                           ge-0/0/2.0      IPV4 r3
                                           ge-0/0/2.0      LSP  r4viar3

And an inet.3 route appears:

root@r1# run show route 5.5.5.5

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:00:10, metric 30
                    >  to 192.168.12.1 via ge-0/0/1.0
                       to 192.168.13.1 via ge-0/0/2.0

inet.3: 3 destinations, 4 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:00:10, metric 30
                    >  to 192.168.13.1 via ge-0/0/2.0, label-switched-path r4viar3

Notice, only R1 knows about the lsp and uses it when running SPF. As said before, shortcuts are a local mechanism.

Inet.3 route has metric 30. We can change it by setting a custom metric on the lsp:

root@r1# set protocols mpls label-switched-path r4viar3 metric 5

[edit]
root@r1# run show route 5.5.5.5

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:01:46, metric 30
                    >  to 192.168.12.1 via ge-0/0/1.0
                       to 192.168.13.1 via ge-0/0/2.0

inet.3: 3 destinations, 4 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:00:02, metric 15
                    >  to 192.168.13.1 via ge-0/0/2.0, label-switched-path r4viar3

Now, we move to R2 and create a lsp to R4 going through R1 and R3:

set protocols mpls label-switched-path r4viar3 to 4.4.4.4
set protocols mpls label-switched-path r4viar3 primary r3strict
set protocols mpls path r3strict 1.1.1.1 strict
set protocols mpls path r3strict 3.3.3.3 strict
set protocols mpls label-switched-path r4viar3 metric 1

We enable shortcuts and check routes:

root@r2# run show route 5.5.5.5

inet.0: 19 destinations, 19 routes (19 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:00:07, metric 20
                    >  to 192.168.24.1 via ge-0/0/1.0

inet.3: 6 destinations, 7 routes (6 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:00:07, metric 11
                    >  to 192.168.12.0 via ge-0/0/0.0, label-switched-path r4viar3

Here we see something “weird”.

SPF from R2 to R5 is R2-R4-R5. The LSP we configured takes a longer path to reach R4 but, as R4 is part of the shortest path from R2 to R5 and the LSP has a very low metric, a longer route is chosen.

The SPF result, once the LSP is considered, is not the real shortest path in terms of hops.

Now, is it really like that? It depends. If we only look at the number of hops, it is a sub-optimal choice. Anyhow, it might be that the R2-R4 link is very expensive (low bw or other reasons) so going through that longer path is actually better. If so, that’s why we set the metric to a very low number (1) and that’s why shortcuts exist (one might argue “why not simply increase the link metric to influence standard SPF?” which makes sense but, again, as usual with networks, it depends…and, after all, it is called traffic engineering, it is not an exact science :)).

Let’s make things interesting. This is what we have right now:

  • on R1, we set lsp metric to R4 to 12
  • this means it costs 22 to reach R5
  • on R2, we have a lsp to R4 with cost 1 (11 to reach R5 from R2)
  • ideally, the cheapest path from R1 to R5 is to go to R2 and take the lsp. Total cost is 21

Anyhow:

root@r1# run show route 5.5.5.5 table inet.3

inet.3: 3 destinations, 4 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:01:09, metric 22
                    >  to 192.168.13.1 via ge-0/0/2.0, label-switched-path r4viar3

Why that? As said before, shortcuts are local. LSP info is not advertised all over the network so R1 does not know about lsps on R2. R1 can only make use of its local lsps!

Let’s keep experimenting:

On R1 we define a new lsp:

set protocols mpls label-switched-path r3direct to 3.3.3.3
set protocols mpls label-switched-path r3direct metric 1
set protocols mpls label-switched-path r3direct primary r3long
set protocols mpls path r3long 2.2.2.2 strict
set protocols mpls path r3long 4.4.4.4 strict

Checking route:

root@r1# run show route 5.5.5.5

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:00:22, metric 30
                    >  to 192.168.12.1 via ge-0/0/1.0
                       to 192.168.13.1 via ge-0/0/2.0

inet.3: 5 destinations, 7 routes (5 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:00:22, metric 21
                    >  to 192.168.13.1 via ge-0/0/2.0, label-switched-path r3long

root@r1# run show mpls lsp ingress
Ingress LSP: 2 sessions
To              From            State Rt P     ActivePath       LSPname
3.3.3.3         1.1.1.1         Up     0 *                      r3long
4.4.4.4         1.1.1.1         Up     0 *     r3strict         r4viar3
Total 2 displayed, Up 2, Down 0

New lsp is used! Again, even if, hop-wise, path is longer, total cost is lower. This works as both R3 and R4 are valid hops over the SPF from R1 to R5 (we saw at the beginning we have 2 ECMP paths).

Last thing we do is to remove one of those ecmp paths:

set protocols isis interface ge-0/0/1 level 2 metric 9 (R1)
set protocols isis interface ge-0/0/0 level 2 metric 9 (R2)

We get this:

root@r1# run show mpls lsp ingress
Ingress LSP: 2 sessions
To              From            State Rt P     ActivePath       LSPname
3.3.3.3         1.1.1.1         Up     0 *     r3long           r3long
4.4.4.4         1.1.1.1         Up     0 *     r3strict         r4viar3
Total 2 displayed, Up 2, Down 0

root@r1# run show route 5.5.5.5

inet.0: 20 destinations, 20 routes (20 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:01:12, metric 29
                    >  to 192.168.12.1 via ge-0/0/1.0

inet.3: 5 destinations, 7 routes (5 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:01:12, metric 22
                    >  to 192.168.13.1 via ge-0/0/2.0, label-switched-path r4viar3

The other lsp is used now. What happened?

The inet.0 route helps us here. No more ecmp; the route through R3 is no longer over the shortest path.

As a result, any lsp to R3, regardless of the actual path it follows, cannot be used as R3 is not along the standard SPF.

This confirms what we said initially: shortcuts must lead to a node which already is along the shortest path. They cannot be used to take totally new paths. This also helps us avoid loops.
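
To double check the rule, here is a small Python model of it. It is a simplification under stated assumptions, not the actual Junos computation: the topology and the default metric of 10 are inferred from the CLI outputs, the lsp names follow the show commands, and a shortcut route costs "lsp metric + IGP cost from the lsp egress to the destination". The helper names are mine.

```python
import heapq


def make_graph(links):
    """Turn {(a, b): cost} into an undirected adjacency map."""
    graph = {}
    for (a, b), w in links.items():
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def dijkstra(graph, src):
    """Shortest distances from src to every node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def best_shortcut(links, src, dst, lsps):
    """Return (lsp name, metric) of the cheapest usable local lsp, or None.

    lsps maps name -> (egress, lsp metric). An lsp is usable only if its
    egress sits on a shortest IGP path from src to dst.
    """
    graph = make_graph(links)
    from_src = dijkstra(graph, src)
    usable = []
    for name, (egress, metric) in lsps.items():
        tail = dijkstra(graph, egress)[dst]
        if from_src[egress] + tail == from_src[dst]:
            usable.append((metric + tail, name))
    if not usable:
        return None
    cost, name = min(usable)
    return name, cost


# Topology inferred from the outputs (an assumption): all links cost 10.
ECMP = {("R1", "R2"): 10, ("R1", "R3"): 10, ("R2", "R4"): 10,
        ("R3", "R4"): 10, ("R4", "R5"): 10}
NO_ECMP = dict(ECMP)
NO_ECMP[("R1", "R2")] = 9  # the metric change applied on R1 and R2

LSPS = {"r3long": ("R3", 1), "r4viar3": ("R4", 12)}

print(best_shortcut(ECMP, "R1", "R5", LSPS))
print(best_shortcut(NO_ECMP, "R1", "R5", LSPS))
```

With both ECMP paths in place the model picks r3long with metric 21; once R1-R2 is set to 9, R3 drops off the shortest path and only r4viar3 (metric 22) remains, matching the inet.3 routes above.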

Now it should be clear when I can take a shortcut.

Ciao
IoSonoUmberto

One topology to rule them all: LFA, RLFA and TILFA

Being able to react to failures fast is a key element when designing your network.

Network protocols are able to re-compute paths when failures occur. Anyhow, this takes time, the well-known convergence time.

To overcome this “delay” and reduce downtime as much as possible, fast recovery techniques have been developed over the years so that routers have a backup path ready for use.

This normally means pre-computing a backup path and installing it into the forwarding table (with a higher weight) so that, as soon as the primary path goes down, the backup path is already there.

This backup path is not optimal and might not be the post convergence path. Anyhow, this is not important. What matters here is to have an alternative way to route packets if an unexpected failure takes place.

Over the years, multiple fast reroute techniques have been invented. This was required because one solution was not able to cover all the possible failures in a network. At the same time, some techniques were not feasible in a destination-based routing framework but became an option in a source-based network.
To add more complexity, one technique might be enough for certain topologies but might not cover all the possible failures in other ones.

Here, I’m going to use this simple topology:

The goal is to provide a backup path at R1 to reach R5. The protected resource, here, is the R1-R4 link.

Network uses ISIS L2 as IGP + LDP.

What I’m going to show is how playing with link metrics affects the backup solution we need.
We will start with a scenario where pure LFA is enough. Then, by changing a link metric, we will have to move to R-LFA and, finally, another metric change will force us to look at TI-LFA (and SR).

Let’s start simple:

All the links have default metric 10, except link R1-R2 that has metric 20.

If R1-R4 link goes down, R2 is a valid backup next-hop as R2 does not use the failed link to reach R5. R2 shortest path to R5 is R2-R3-R4-R5 (cost 30). This is enough to compute and have a LFA path.

Let’s check computed backup path for R5:

r5.00
  Primary next-hop: ge-0/0/1.0, IPV4, r4, SNPA:  56:3:1:0:b6:77
    Root: r4, Root Metric: 10, Metric: 10, Root Preference: 0x0
      Not eligible, IPV4, Reason: Primary next-hop link fate sharing
    Root: r2, Root Metric: 20, Metric: 30, Root Preference: 0x0
      track-item: r4.00-00
      Eligible, Backup next-hop: ge-0/0/0.0, IPV4, r2, SNPA:  56:3:1:0:a2:8f,Prefixes: 1

root@r1_re# run show route 5.5.5.5

inet.0: 19 destinations, 19 routes (19 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:09:42, metric 20
                    >  to 192.168.14.1 via ge-0/0/1.0
                       to 192.168.12.1 via ge-0/0/0.0

inet.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[LDP/9] 00:09:42, metric 1
                    >  to 192.168.14.1 via ge-0/0/1.0, Push 299824
                       to 192.168.12.1 via ge-0/0/0.0, Push 299824

Now, let’s set link R1-R2 metric to 10. On both routers:

deactivate protocols isis interface ge-0/0/0.0 level 2 metric

Do we have a backup path?

 IS-IS level 2 SPF results:
r5_re.00
  Primary next-hop: ge-0/0/1.0, IPV4, r4_re, SNPA:  56:3:1:0:b6:77
    Root: r4_re, Root Metric: 10, Metric: 10, Root Preference: 0x0
      Not eligible, IPV4, Reason: Primary next-hop link fate sharing
    Root: r2_re, Root Metric: 10, Metric: 30, Root Preference: 0x0
      track-item: r4_re.00-00
      track-item: r1_re.00-00
      Not eligible, IPV4, Reason: Path loops

No, there is no LFA path to R5. R2 is the only eligible next-hop but, now, with R1-R2 metric equal to 10, R2 uses R1 as next-hop to reach R5. Actually, R2 has 2 ECMP routes to R5: one via R1 and one via R3. As one of them goes through the protected link, LFA is not possible due to potential loop.
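
The check behind "Path loops" can be reproduced with the basic loop-free condition from RFC 5286: a neighbor N is a valid alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D). Below is a minimal Python sketch. The topology is my reading of the text (the picture is not reproduced here), so treat link list and helper names as assumptions.

```python
from itertools import product


def all_pairs(links):
    """Floyd-Warshall over an undirected graph given as {(a, b): cost}."""
    nodes = sorted({n for pair in links for n in pair})
    inf = float("inf")
    dist = {(a, b): (0 if a == b else inf) for a, b in product(nodes, nodes)}
    for (a, b), w in links.items():
        dist[a, b] = dist[b, a] = w
    for k, i, j in product(nodes, nodes, nodes):  # k is the outermost loop
        if dist[i, k] + dist[k, j] < dist[i, j]:
            dist[i, j] = dist[i, k] + dist[k, j]
    return dist


def is_lfa(links, s, n, d):
    """Basic loop-free condition: dist(N, D) < dist(N, S) + dist(S, D)."""
    dist = all_pairs(links)
    return dist[n, d] < dist[n, s] + dist[s, d]


def topology(r1_r2_metric):
    """Assumed topology: every link costs 10 except R1-R2, which is a parameter."""
    return {("R1", "R2"): r1_r2_metric, ("R1", "R4"): 10, ("R2", "R3"): 10,
            ("R3", "R4"): 10, ("R4", "R5"): 10}


print(is_lfa(topology(20), "R1", "R2", "R5"))  # R1-R2 metric 20
print(is_lfa(topology(10), "R1", "R2", "R5"))  # R1-R2 metric 10
```

With metric 20 the inequality is 30 < 40, so R2 is loop-free. With metric 10 it becomes 30 < 30, which fails because one of R2’s equal-cost paths to R5 goes back through R1.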

LFA is not able to provide coverage for this failure with this topology.

What can we do? Here comes R-LFA into play!

To understand how R-LFA works, we need some theory.

First, let’s define some “names”:

  • S: source (here, R1)
  • PLR: point of local repair, where we have the failure (here, R1)
  • PR: protected resource (here link R1-R4)
  • N: S neighbors, excluding the one reachable via PR (here, R2)
  • D: destination (here, R5)

Now, we need to define some so-called “spaces”:

  • P space: nodes reachable from S without going through the PR. Here, we have R2 only. R3 is not part of R1 P-space as R1 has 2 ECMP routes to R3 and one of them goes through the protected link. Similar considerations for R4 and R5 (even simpler as there is no ECMP here, shortest path is just via protected link)
  • Extended P space: nodes reachable from S neighbors without going through the PR. Here, R3 is ok. R4 is not ok as R2 has 2 ECMP paths to R4, one of them through the protected link. Similar considerations for R5
  • Q space: nodes that can reach D without going through the PR. Here, it includes R3 and R4 that reach R5 without going through the protected link.

Summing up:

  • P space: R2 (+ R1)
  • extended P space: (R1), R2 and R3
  • Q space: R3 and R4

Now, we look at the intersection between extended P space and Q space. Here, the result only includes R3. R3 is called PQ node.
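
The three spaces can be computed with a short Python sketch. It follows the definitions given in this post (Q space is taken relative to D, as above) and treats a node as "going through the PR" as soon as one of its equal-cost shortest paths crosses the protected link. The topology is inferred from the text and the function names are hypothetical.

```python
from itertools import product


def all_pairs(links):
    """Floyd-Warshall over an undirected graph given as {(a, b): cost}."""
    nodes = sorted({n for pair in links for n in pair})
    inf = float("inf")
    dist = {(a, b): (0 if a == b else inf) for a, b in product(nodes, nodes)}
    for (a, b), w in links.items():
        dist[a, b] = dist[b, a] = w
    for k, i, j in product(nodes, nodes, nodes):  # k is the outermost loop
        if dist[i, k] + dist[k, j] < dist[i, j]:
            dist[i, j] = dist[i, k] + dist[k, j]
    return dist


def spaces(links, s, e, d):
    """Return (P, extended P, Q) for protected link s-e and destination d."""
    dist = all_pairs(links)
    nodes = sorted({n for pair in links for n in pair})
    cost = links[(s, e)] if (s, e) in links else links[(e, s)]
    neighbors = {b if a == s else a for a, b in links if s in (a, b)} - {e}

    def avoids(src, dst):
        # True when no shortest path src -> dst crosses the link s -> e
        return dist[src, dst] < dist[src, s] + cost + dist[e, dst]

    p = {x for x in nodes if x != s and avoids(s, x)}
    ext_p = p | {x for n in neighbors for x in nodes if x != s and avoids(n, x)}
    q = {x for x in nodes if x != d and avoids(x, d)}
    return p, ext_p, q


# Assumed topology: all links cost 10 (R1-R2 already lowered to 10).
LINKS = {("R1", "R2"): 10, ("R1", "R4"): 10, ("R2", "R3"): 10,
         ("R3", "R4"): 10, ("R4", "R5"): 10}

p, ext_p, q = spaces(LINKS, "R1", "R4", "R5")
print("P:", sorted(p))
print("extended P:", sorted(ext_p))
print("Q:", sorted(q))
print("PQ:", sorted(ext_p & q))
```

S and D themselves are left out of the sets for readability. The intersection comes out as R3 only, the PQ node we found by hand.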

A PQ node is the key to have R-LFA working. Why is a PQ node so important?

If we manage to reach a PQ node then, from there, it is trivial to reach D without going through the protected link (PQ is part of Q space). In other words, a path through a PQ node is the backup path we are looking for!

How do we instruct R1 to reach R3 as they are not directly connected? We use targeted LDP sessions.

Time to look at configs.

We add these lines on all our routers:

set protocols isis backup-spf-options remote-backup-calculation
set protocols ldp auto-targeted-session
set protocols ldp auto-targeted-session teardown-delay 60
set protocols ldp auto-targeted-session maximum-sessions 20

Result?

r5_re.00
  Primary next-hop: ge-0/0/1.0, IPV4, r4_re, SNPA:  56:3:1:0:b6:77
    Root: r4_re, Root Metric: 10, Metric: 10, Root Preference: 0x0
      Not eligible, IPV4, Reason: Primary next-hop link fate sharing
    Root: r3_re, Root Metric: 20, Metric: 20, Root Preference: 0x0
      track-item: r4_re.00-00
      track-item: r3_re.00-00
      Eligible, Backup next-hop: ge-0/0/0.0, LSP, LDP->r3_re(3.3.3.3), Prefixes: 1
    Root: r2_re, Root Metric: 10, Metric: 30, Root Preference: 0x0
      track-item: r4_re.00-00
      track-item: r1_re.00-00
      Not eligible, IPV4, Reason: Path loops

Junos looks for alternate paths to reach R5 from R1:

  • R4 is not feasible as it is the primary (protected) next-hop
  • R2 is not ok due to potential loop (reason why LFA is not an option)
  • R3 is our backup next-hop and we have to reach it via a LSP

We now have a backup route:

root@r1_re# run show isis route 5.5.5.5
 IS-IS routing table             Current version: L1: 0 L2: 446
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
5.5.5.5/32         2     446       20 int  ge-0/0/1.0      IPV4 r4_re
                                           ge-0/0/0.0      LSP  LDP->r3_re(3.3.3.3)

A targeted LDP session is created:

root@r1_re# run show ldp session auto-targeted
  Address                           State       Connection  Hold time  Adv. Mode
3.3.3.3                             Operational Open          23         DU

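Why do we need that session? R1 and R3 are not directly connected, so the targeted session is what allows R1 to learn the label R3 (PQ) uses to reach R5 (D). The label to reach R3 itself comes from R2 over the regular LDP session.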

Having a look at LDP database is helpful:

root@r1_re# run show ldp database session 3.3.3.3 | match 5.5.5.5
 299824      5.5.5.5/32
 299824      5.5.5.5/32

root@r1_re# run show ldp database session 2.2.2.2 | match 3.3.3.3
 299776      3.3.3.3/32
 299792      3.3.3.3/32

R3 (PQ) advertises label 299824 to reach 5.5.5.5 (D).

R2, which is between R1 and R3, advertises label 299776 to reach R3 (PQ).

Let’s check route to 5.5.5.5:

root@r1_re# run show route 5.5.5.5

inet.0: 19 destinations, 19 routes (19 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:05:11, metric 20
                    >  to 192.168.14.1 via ge-0/0/1.0
                       to 192.168.12.1 via ge-0/0/0.0, Push 299776

inet.3: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[LDP/9] 00:05:10, metric 1
                    >  to 192.168.14.1 via ge-0/0/1.0, Push 299824
                       to 192.168.12.1 via ge-0/0/0.0, Push 299824, Push 299776(top)

We have 2 labels in the stack:

  • top label, 299776 to reach R3 from R2 (R2 reachable via ge-0/0/0)
  • bottom label, 299824 to reach R5 from R3

Having 2 labels has to do with P and extended-P spaces.

Top label is the LDP label advertised by R2 to reach R3. LDP relies on IGP so that label means “go to R3 via R2, following the IGP path”. R2 is part of the P space so we know we will not go through the protected link.
Once at R2, label 299776 is popped (R3 advertises label 3, implicit null, so R2 performs PHP) and the packet is sent to R3. As R3 is part of the extended P space, we know R2 reaches R3 without going through the protected link.
Last, R3 processes label 299824 and knows it has to send the packet to R5. As R3 is a PQ node, we know it can reach R5 without going through the protected link.

Now, it should be clear how all the pieces come together. A PQ node is fundamental as we can leverage extended P space to safely go from S to PQ; at the same time, we can leverage Q space once at the PQ node to reach D.
The only additional piece we need is the targeted LDP session so that S learns the label used by PQ node to reach D.

Once again, we change metrics so that link R2-R3 costs 1000.

As you can see, that super expensive link causes the (extended) P space and the Q space not to overlap. As a consequence, R-LFA cannot help us. This shows us that even R-LFA cannot provide 100% coverage all the time!

The solution here is to use TI-LFA. TI-LFA does one super simple thing: compute the post convergence path and use it as backup path. Simple right? Yes but…to use it we need to migrate to a segment routing network as we need our network to be source based in order to build the adequate label stack.
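
Computing the post convergence path boils down to removing the protected link and running SPF again. Here is a minimal sketch of that idea, assuming the five-router lab is a square R1-R2-R3-R4 with R5 behind R4, metric 10 everywhere except R2-R3 set to 1000. Topology and function names are assumptions made for illustration, and the translation of the resulting path into a SID stack is left out.

```python
import heapq

# Assumed topology with symmetric metrics; R2-R3 is the expensive link.
LINKS = {
    ("R1", "R2"): 10,
    ("R2", "R3"): 1000,
    ("R3", "R4"): 10,
    ("R1", "R4"): 10,  # the protected link
    ("R4", "R5"): 10,
}


def shortest_path(links, source, dest, failed_link=None):
    """Dijkstra returning (cost, path); failed_link, if given, is left out."""
    graph = {}
    for (a, b), metric in links.items():
        if failed_link is not None and {a, b} == set(failed_link):
            continue  # simulate the failure of the protected link
        graph.setdefault(a, {})[b] = metric
        graph.setdefault(b, {})[a] = metric
    heap = [(0, source, [source])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dest:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neigh, metric in graph.get(node, {}).items():
            if neigh not in visited:
                heapq.heappush(heap, (cost + metric, neigh, path + [neigh]))
    return None  # destination unreachable


primary = shortest_path(LINKS, "R1", "R5")
backup = shortest_path(LINKS, "R1", "R5", failed_link=("R1", "R4"))
print("primary:", primary)
print("post-convergence:", backup)
```

With these assumed metrics the post convergence path is R1, R2, R3, R4, R5: it has to cross the expensive R2-R3 link, which plain IGP forwarding at R2 would never choose. That is why the backup needs a source-routed label stack.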

Check here and here how to enable SR and perform basic config. TI-LFA was also covered here.

Once SR is ready, add these configs on all the routers:

deactivate protocols isis interface ge-0/0/1.0 link-protection
deactivate protocols isis backup-spf-options remote-backup-calculation
set protocols isis backup-spf-options use-post-convergence-lfa
set protocols isis backup-spf-options use-source-packet-routing

On R1 add this to enable TI-LFA:

set protocols isis interface ge-0/0/1.0 level 2 post-convergence-lfa

You may have noticed we had to deactivate link protection (LFA) and remote backup calculation (R-LFA). If we use TI-LFA we have to give up on them.

As a result, we now have a backup path:

root@r1_re# run show route 5.5.5.5

inet.0: 19 destinations, 19 routes (19 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[IS-IS/18] 00:08:31, metric 20
                    >  to 192.168.14.1 via ge-0/0/1.0
                       to 192.168.12.1 via ge-0/0/0.0, Push 20

inet.3: 4 destinations, 8 routes (4 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[LDP/9] 00:08:31, metric 1
                    >  to 192.168.14.1 via ge-0/0/1.0, Push 19
                    [L-ISIS/14] 00:08:31, metric 20
                    >  to 192.168.14.1 via ge-0/0/1.0, Push 1105
                       to 192.168.12.1 via ge-0/0/0.0, Push 1105, Push 20(top)

Stack includes label 20 (Adj SID to reach R3 from R2) and label 1105 (to reach R5 once at R3).

You can easily see that L-ISIS (SR) route has a backup path while LDP does not. LDP route is not protected. Is this a problem? It depends. It is true not having protection is, generally speaking, an issue. At the same time, one might argue that if you enable SR, why should you keep LDP on? It is a complex topic that deserves more space…but not today. I think we had enough for today 🙂

What do we take home?
LFA is not perfect. It depends on the topology but might not cover all failures.
R-LFA can help where LFA fails but R-LFA is not perfect either. It also depends on the topology and cannot cover all failures.
TI-LFA provides 100% coverage but it requires our network to run SR.

What do we do?
As always, it depends… Study your topology, test LFA and R-LFA to see if they provide enough coverage for the failures you would like to protect (it might happen that R-LFA cannot cover a specific failure but you are not interested in protecting that failure. If so, R-LFA might not provide 100% coverage on that topo but it provides 100% coverage for what you need).
If some failures are not covered, think of TI-LFA, knowing SR will come into play.

Or…just move to SR to embrace the future 🙂

Ciao
IoSonoUmberto

Colored LSPs and fallback routes

We have seen it is possible to create colored LSPs from one router to another.

A colored route can come from an explicitly configured LSP or from Flex Algos, to make some examples.

Colored routes end up in a special table called inetcolor.0.

Then, we can use good old BGP to attach a color community to an advertised prefix and have our router resolve the next-hop taking the color into account as well.

To refresh these concepts, think of our topology:

  • there is a colored LSP from R1 to R8 (color red)
  • R8 advertises prefix 1.2.3.4/32 to R1 with community color red
  • R1, instead of resolving the BGP route with an inet.3 route (non colored), is configured to take the color into account as well
  • R1 will use the red colored route found in inetcolor.0 to resolve 1.2.3.4/32

To resolve BGP routes using color community we need this:

set protocols bgp group ibgp family inet unicast extended-nexthop-color

Once enabled, if a colored route comes, Junos tries to resolve it using colored routes only.

Now, consider this scenario:

  • R8 advertises route 5.6.7.8/32 with color purple
  • R1 has 3 routes within inetcolor.0, each with a different color: red, blue, yellow
  • as there is no purple lsp, 5.6.7.8/32 will remain hidden

Once we enable color resolution it is a bit like “all or nothing”.

Is there anything we can do?

Most likely, we have alternative routes in inet.3: uncolored LSPs or classic LDP LSPs.

What we can do is leak those routes into inetcolor.0 so that they act as fallback routes.

Let’s see how.

In this example R8 is advertising 8.3.2.1/32 with a color community that does not correspond to any inetcolor.0 route at R1:

root@r1_re> show route receive-protocol bgp 8.8.8.8 hidden table inet.0 extensive

inet.0: 29 destinations, 29 routes (28 active, 0 holddown, 1 hidden)
  8.3.2.1/32 (1 entry, 0 announced)
     Accepted
     Nexthop: 8.8.8.8
     Localpref: 100
     AS path: I
     Communities: color:0:4321

root@r1_re> show route table inetcolor.0 match-prefix *-4321*

inetcolor.0: 15 destinations, 15 routes (15 active, 0 holddown, 0 hidden)

root@r1_re>

As you can see, there is no route for color 4321.

The solution is pretty easy and relies on rib-groups:

set routing-options rib-groups ldp-to-inetcolor import-rib inet.3
set routing-options rib-groups ldp-to-inetcolor import-rib inetcolor.0
set routing-options rib-groups ldp-to-inetcolor import-policy ldp-to-inetcolor

set policy-options policy-statement ldp-to-inetcolor term ldp from protocol ldp
set policy-options policy-statement ldp-to-inetcolor term ldp then accept
set policy-options policy-statement ldp-to-inetcolor then reject

set protocols ldp rib-group ldp-to-inetcolor

Basically, we tell Junos to copy routes from inet.3 to inetcolor.0.
Via policy, we tell junos to only accept ldp routes.
Last, we apply the rib group to ldp (which makes “from protocol ldp” within the policy redundant).

As a result:

root@r1_re> show route table inetcolor.0 match-prefix *-4321*

inetcolor.0: 22 destinations, 22 routes (22 active, 0 holddown, 0 hidden)

root@r1_re> show route 8.3.2.1 extensive | match "Protocol "
                Protocol next hop: 8.8.8.8-4321<c>
                        Protocol next hop: 8.8.8.8-4321<c> Metric: 1

                Composite next hops: 1
                        Protocol next hop: 8.8.8.8-4321<c> Metric: 1
                        Composite next hop: 0x6ade6b0 571 INH Session ID: 0x0
                        Indirect next hop: 0x7411884 1048582 INH Session ID: 0x0
                        Indirect path forwarding next hops: 1
                                Next hop type: Router
                                Next hop: 192.168.14.1 via ge-0/0/2.0 weight 0x1
                                Session Id: 0x0
                                8.8.8.8-0<c>/32 Originating RIB: inetcolor.0
                                  Metric: 1 Node path count: 1
                                  Forwarding nexthops: 1
                                        Next hop type: Router
                                        Next hop: 192.168.14.1 via ge-0/0/2.0 weight 0x1
                                        Session Id: 0x0

Still no routes for color 4321.

Anyhow, Junos tells us the route was resolved using 8.8.8.8-4321<c>….which does not exist.

What’s the trick?
Behind 8.8.8.8-4321<c> there is 8.8.8.8-0<c>. What’s that route?

root@r1_re> show route table inetcolor.0 protocol ldp

inetcolor.0: 22 destinations, 22 routes (22 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

2.2.2.2-0<c>/32
                   *[LDP/9] 01:17:27, metric 1
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 24
3.3.3.3-0<c>/32
                   *[LDP/9] 01:17:27, metric 1
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 23
4.4.4.4-0<c>/32
                   *[LDP/9] 01:17:27, metric 1
                    >  to 192.168.14.1 via ge-0/0/2.0
5.5.5.5-0<c>/32
                   *[LDP/9] 01:17:27, metric 1
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 25
6.6.6.6-0<c>/32
                   *[LDP/9] 01:17:27, metric 1
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 28
7.7.7.7-0<c>/32
                   *[LDP/9] 01:17:27, metric 1
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 27
8.8.8.8-0<c>/32
                   *[LDP/9] 01:17:27, metric 1
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 26

It’s the ldp route copied from inet.3.

As, natively, it does not have a color associated, it uses color 0 which acts as a wildcard color. This allows those routes to work as a fallback option.

Junos will always try to perform longest prefix match into inetcolor.0 for a colored route. This means finding the best match for the ip:color pair.
If it cannot find it, then it will do a “softer” longest prefix match using “wildcard” routes that are ip:0 (color 0 as we have just seen).
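
The two-step lookup can be pictured with a toy model. This is only a sketch of the behaviour described above, not Junos internals: the table is a plain dictionary, its entries are invented for illustration and the function name is hypothetical.

```python
# Toy model of inetcolor.0: keys are (endpoint, color) pairs.
# Entries are illustrative; color 555 stands for some explicitly colored LSP.
INETCOLOR = {
    ("8.8.8.8", 555): "colored LSP",
    ("8.8.8.8", 0): "LDP route leaked from inet.3",
}


def resolve(table, next_hop, color):
    """Try the exact endpoint:color pair first, then the color-0 fallback."""
    for key in ((next_hop, color), (next_hop, 0)):
        if key in table:
            return key, table[key]
    return None  # nothing matches: the BGP route stays hidden


print(resolve(INETCOLOR, "8.8.8.8", 555))   # exact color match
print(resolve(INETCOLOR, "8.8.8.8", 4321))  # falls back to color 0
```

Removing the color-0 entry from the dictionary brings back the original problem: the lookup for color 4321 returns nothing and the prefix would stay hidden.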

And here we are! Fallback routes and no more hidden prefixes!

Ciao
IoSonoUmberto

Making IGP flexible with Flex Algo

IGP is a key element when designing a network. By default, all the routers within a domain are part of the same IGP topology (let’s not consider things like stub areas or similar).

Let’s recall our lab topology:

We chose ISIS as our IGP. As a consequence, all the 8 routers belong to the same IGP instance. It means that all the routers and all the links build the network ISIS database used to compute shortest paths.

With segment routing, we have learned that one keyword is “flexibility”. This flexibility does not mean only TE paths combined with colors. The wave of fresh air brought by SR involved IGP as well. Not surprisingly, this new feature is called “Flex Algo”.

As said before, by default, all our routers belong to the same IGP topology; our network has one IGP topology.

With flex algo we can slice our network into multiple topologies, each of them running SPF separately.

Let’s make an example to better understand this. We slice our lab network like this:

We define two “sub-topologies”: red and blue. Those new topologies do not replace the default one (the one all the 8 routers belong to; that one still exists), they are additional ones.

As a result, it is like having 3 logical topologies into one physical network:

  • default one
  • red (flex algo)
  • blue (flex algo)

Each of those two topologies is associated with a flex algo. A flex algo is represented by a number, starting from 128 (the valid range is 128 to 255).

Here, we are going to configure flex algo 128 (red) and 129 (blue).

When configuring flex algos, we need to split nodes belonging to the same algo into two groups:

  • participating node: every node belonging to a flex algo is a participating node
  • FAD node: nodes telling other nodes how this algo works (we will see in a bit what this means). Only some nodes belonging to an algo are configured this way. Normally, two FAD nodes might be a good choice so to have redundancy in case of failures (we use priority values to elect primary and backup FAD nodes)
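
The election among FAD nodes can be pictured as “highest priority wins”. The sketch below assumes the tie-breaker is the highest system ID, which is what the Flex Algo specification (RFC 9350) describes for IS-IS; the priorities mirror the ones used in this lab, while the system IDs and the names in the code are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fad:
    node: str
    system_id: str  # hypothetical values, only used as tie-breaker
    priority: int


def elect_fad(definitions):
    """Highest priority wins; on a tie, the highest system ID wins (assumption)."""
    return max(definitions, key=lambda fad: (fad.priority, fad.system_id))


algo_128 = [
    Fad("r3_re", "0000.0000.0003", 100),
    Fad("r5_re", "0000.0000.0005", 200),
]
winner = elect_fad(algo_128)
print("winner:", winner.node)
```

If the node with priority 200 disappears, running the same function on the remaining definitions returns the backup FAD node, which is the redundancy we are after.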

Let’s start the configuration.

We make R1, R2, R7 and R8 part of both algo 128 and algo 129:

set protocols isis source-packet-routing flex-algorithm 128
set protocols isis source-packet-routing flex-algorithm 129

Similarly, we configure algo 128 only on R3 and R5 and algo 129 only on R4 and R6.

R3 and R5 are FAD nodes for algo 128. This is R3 config (R5 config is identical but with a higher priority, 200):

set routing-options flex-algorithm 128 definition priority 100
set routing-options flex-algorithm 128 color 128
set protocols isis source-packet-routing flex-algorithm 128

Similarly, we configure R4 and R6 as FAD nodes for algo 129.

What does FAD configuration bring in?

First, we define the identifier of the algo: here, 128 or 129. Those numbers are mapped to color communities, yes the ones we have already seen when looking at colored sr lsps.

Additionally, we might set:

  • admin groups constraints so to tell a given algo to only consider certain links
  • spf method: normal or strict
  • spf metric: by default the IGP metric is used (making it like standard IGP but on subset of nodes, the ones participating to the algo) but, alternatively, we might have the flex algo to compute shortest paths based on te-metric or delay (relying on twamp)

Here, we use standard IGP metric. This means same spf as standard IGP but not all the nodes are available to a given flex algo.

To better visualize it, this is the topology flex algo 128 works on:

while this is the one for flex algo 129:

Next, our nodes are currently advertising their node SIDs into ISIS “full” topology. We are going to call that topology Algo 0.

With flex algo, we can have our nodes advertise a per-flex-algo node SID. This is achieved by updating the ISIS export policy we use to inject the node SID. This is for R1 (part of both algos 128 and 129) but config for other nodes can be derived easily:

set policy-options policy-statement exp-anycast-sr term nodesid from interface lo0.0
set policy-options policy-statement exp-anycast-sr term nodesid from route-filter 1.1.1.1/32 exact
set policy-options policy-statement exp-anycast-sr term nodesid then prefix-segment algorithm 128 index 181
set policy-options policy-statement exp-anycast-sr term nodesid then prefix-segment algorithm 128 node-segment
set policy-options policy-statement exp-anycast-sr term nodesid then prefix-segment algorithm 129 index 191
set policy-options policy-statement exp-anycast-sr term nodesid then prefix-segment algorithm 129 node-segment
set policy-options policy-statement exp-anycast-sr term nodesid then prefix-segment index 101
set policy-options policy-statement exp-anycast-sr term nodesid then prefix-segment node-segment
set policy-options policy-statement exp-anycast-sr term nodesid then accept

Time to verify flex algo is operational.

First, we check router capabilities by looking at the ISIS database:

    Router Capability:  Router ID 1.1.1.1, Flags: 0x00
      SPRING Capability - Flags: 0xc0(I:1,V:1), Range: 1000, SID-Label: 1000
      SPRING Algorithm - Algo: 0
      SPRING Algorithm - Algo: 1
      SPRING Algorithm - Algo: 128
      SPRING Algorithm - Algo: 129

    Router Capability:  Router ID 3.3.3.3, Flags: 0x00
      SPRING Capability - Flags: 0xc0(I:1,V:1), Range: 1000, SID-Label: 1000
      SPRING Algorithm - Algo: 0
      SPRING Algorithm - Algo: 1
      SPRING Algorithm - Algo: 128
      Flex Algo: 128, Len: 4, Metric: 0, Calc: 0, Prio: 100

    Router Capability:  Router ID 4.4.4.4, Flags: 0x00
      SPRING Capability - Flags: 0xc0(I:1,V:1), Range: 1000, SID-Label: 1000
      SPRING Algorithm - Algo: 0
      SPRING Algorithm - Algo: 1
      SPRING Algorithm - Algo: 129
      Flex Algo: 129, Len: 4, Metric: 0, Calc: 0, Prio: 100

    Router Capability:  Router ID 5.5.5.5, Flags: 0x00
      SPRING Capability - Flags: 0xc0(I:1,V:1), Range: 1000, SID-Label: 1000
      SPRING Algorithm - Algo: 0
      SPRING Algorithm - Algo: 1
      SPRING Algorithm - Algo: 128
      Flex Algo: 128, Len: 4, Metric: 0, Calc: 0, Prio: 200

    Router Capability:  Router ID 6.6.6.6, Flags: 0x00
      SPRING Capability - Flags: 0xc0(I:1,V:1), Range: 1000, SID-Label: 1000
      SPRING Algorithm - Algo: 0
      SPRING Algorithm - Algo: 1
      SPRING Algorithm - Algo: 129
      Flex Algo: 129, Len: 4, Metric: 0, Calc: 0, Prio: 200

As you can see all the nodes belong to Algo 0 (default topology). We also have algo 1 which we ignore for now. Next, based on what we configured, nodes advertise their membership to algo 128 and/or algo 129.

Last, nodes R3 and R5 have FAD for algo 128 (R5 with higher priority).
Nodes R4 and R6 do the same for algo 129.

We can see some details for each algo:

root@r1_re> show isis spring flex-algorithm flex-algorithm-id 128
Flex Algo: 128
  Level: 1, Color: 128, Not Participating, No Definitions
    ...
  Level: 2, Color: 128, Participating, FAD supported
      Winner: r5_re, Metric: 0, Calc: 0, Prio: 200, FAD supported
    Spf Version: 31
    ...
      r3_re, Metric: 0, Calc: 0, Prio: 100, FAD supported
    Full SPFs: 31, Partial SPFs: 0

root@r1_re> show isis spring flex-algorithm flex-algorithm-id 129
Flex Algo: 129
  Level: 1, Color: 129, Not Participating, No Definitions
    ...
  Level: 2, Color: 129, Participating, FAD supported
      Winner: r6_re, Metric: 0, Calc: 0, Prio: 200, FAD supported
    Spf Version: 30
    ...
      r4_re, Metric: 0, Calc: 0, Prio: 100, FAD supported
    Full SPFs: 30, Partial SPFs: 0

Here we can see if the node is participating in an algo and at which level (our network is a L2 only ISIS network so L1 shows “Not Participating”). FAD nodes are displayed, along with the elected winner.

We said a different node SID is advertised for each flex algo. R7 shows:

root@r1_re# run show isis database r7 extensive | match node
  Node Segment Blocks Advertised:
      Node SID, Flags: 0x40(R:0,N:1,P:0,E:0,V:0,L:0), Algo: SPF(0), Value: 107
      Node SID, Flags: 0x40(R:0,N:1,P:0,E:0,V:0,L:0), Algo: Flex-Algo(129), Value: 197
      Node SID, Flags: 0x40(R:0,N:1,P:0,E:0,V:0,L:0), Algo: Flex-Algo(128), Value: 187

R7 advertises 3 Node SIDs (labels are obtained using the same SRGB block so think of this to avoid conflicts):

  • SID 107, algo 0, label 1107
  • SID 187, algo 128, label 1187
  • SID 197, algo 129, label 1197
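
The arithmetic behind those labels is just SRGB base plus SID index. A minimal sketch follows, taking base and range from the “SID-Label: 1000” and “Range: 1000” fields of the router capability shown earlier and the indexes from the R1 policy and the R7 output; function names are made up, and the duplicate check is only a planning aid, not something Junos runs.

```python
SRGB_BASE = 1000   # "SID-Label: 1000" in the router capability
SRGB_RANGE = 1000  # "Range: 1000"


def sid_to_label(index, base=SRGB_BASE, size=SRGB_RANGE):
    """Label = SRGB base + SID index, as long as the index fits the block."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} does not fit in the SRGB")
    return base + index


def duplicated_indexes(assignments):
    """Return the SID indexes assigned to more than one (node, algo) pair."""
    seen = {}
    for owner, index in assignments.items():
        seen.setdefault(index, []).append(owner)
    return {idx: owners for idx, owners in seen.items() if len(owners) > 1}


# (node, algo) -> SID index, as configured on R1 and advertised by R7
ASSIGNMENTS = {
    ("R1", 0): 101, ("R1", 128): 181, ("R1", 129): 191,
    ("R7", 0): 107, ("R7", 128): 187, ("R7", 129): 197,
}

labels = {owner: sid_to_label(idx) for owner, idx in ASSIGNMENTS.items()}
print(labels[("R7", 128)])
print(duplicated_indexes(ASSIGNMENTS))
```

Since all the algos draw from the same SRGB, a check like duplicated_indexes is a cheap way to validate an index plan before pushing it to the routers.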

All those labels are available into mpls.0:

root@r1_re# run show route table mpls.0 label 1107

mpls.0: 46 destinations, 46 routes (46 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

1107               *[L-ISIS/14] 02:28:45, metric 30
                    >  to 192.168.14.1 via ge-0/0/2.0, Swap 1107

[edit]
root@r1_re# run show route table mpls.0 label 1187

mpls.0: 46 destinations, 46 routes (46 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

1187               *[L-ISIS/18] 02:28:49, metric 80
                    >  to 192.168.12.1 via ge-0/0/0.0, Swap 1187

[edit]
root@r1_re# run show route table mpls.0 label 1197

mpls.0: 46 destinations, 46 routes (46 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

1197               *[L-ISIS/18] 03:21:07, metric 30
                    >  to 192.168.14.1 via ge-0/0/2.0, Swap 1197

This means flex algo provides control plane separation but the data plane is still one. Both labels 1107 and 1187 will reach R7 but label 1107 will follow a path over the full topology while label 1187 will follow a path computed over flex algo 128 topology (red one). We will see later how to map those to routes and prefixes.

You may have noticed flex algo routes have a higher preference value: 18 instead of 14. This happens if we add:

set protocols isis level 2 flex-algorithm-preference 18

Now, it might be interesting to see how R1 can reach R7 (7.7.7.7):

root@r1_re# run show route match-prefix 7.7.7.*

inet.0: 26 destinations, 26 routes (26 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

7.7.7.7/32         *[IS-IS/18] 00:06:16, metric 30
                    >  to 192.168.13.1 via ge-0/0/1.0
                       to 192.168.14.1 via ge-0/0/2.0

inet.3: 8 destinations, 15 routes (8 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

7.7.7.7/32         *[LDP/9] 00:06:16, metric 1
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 26
                       to 192.168.14.1 via ge-0/0/2.0, Push 27
                    [L-ISIS/14] 00:06:16, metric 30
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1107
                       to 192.168.14.1 via ge-0/0/2.0, Push 1107

inetcolor.0: 15 destinations, 15 routes (15 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

7.7.7.7-128<c>/64
                   *[L-ISIS/18] 00:06:16, metric 30
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1187
7.7.7.7-129<c>/64
                   *[L-ISIS/18] 00:23:43, metric 30
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1197

Let’s go table by table:

  • inet.0: this is “plain” ISIS route (2 ECMP next-hops)
  • inet.3: here we have both the LDP route (LDP is active in our network) and the default SR (algo 0) route (label is 1107)
  • inetcolor.0: here we have 2 routes one using label 1187 (algo 128) and one using 1197 (algo 129).

We have already seen table inetcolor.0. We will come back to this later.

Now, let’s check all the routes we have for algo 128 (prefix *-128*) and algo 129 (prefix *-129*):

root@r1_re# run show route table inetcolor.0 match-prefix *-128*

inetcolor.0: 15 destinations, 15 routes (15 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

2.2.2.2-128<c>/64
                   *[L-ISIS/14] 03:32:05, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
3.3.3.3-128<c>/64
                   *[L-ISIS/14] 00:01:32, metric 10
                    >  to 192.168.13.1 via ge-0/0/1.0
5.5.5.5-128<c>/64
                   *[L-ISIS/14] 00:01:32, metric 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1185
7.7.7.7-128<c>/64
                   *[L-ISIS/14] 00:01:32, metric 30
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1187
8.8.8.8-128<c>/64
                   *[L-ISIS/14] 00:01:32, metric 30
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1188

[edit]
root@r1_re# run show route table inetcolor.0 match-prefix *-129*

inetcolor.0: 15 destinations, 15 routes (15 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

2.2.2.2-129<c>/64
                   *[L-ISIS/14] 03:32:10, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
4.4.4.4-129<c>/64
                   *[L-ISIS/14] 03:32:10, metric 10
                    >  to 192.168.14.1 via ge-0/0/2.0
6.6.6.6-129<c>/64
                   *[L-ISIS/14] 03:32:10, metric 20
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1196
7.7.7.7-129<c>/64
                   *[L-ISIS/14] 03:32:10, metric 30
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1197
8.8.8.8-129<c>/64
                   *[L-ISIS/14] 03:32:10, metric 30
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1198

As you can see, both algos have routes to 2.2.2.2, 7.7.7.7 and 8.8.8.8 (1.1.1.1 is R1 itself).
Anyhow, routes to 3.3.3.3 and 5.5.5.5 are available for algo 128 only while routes to 4.4.4.4 and 6.6.6.6 are available for algo 129 only. This is correct as R3 and R5 only belong to algo 128 while R4 and R6 only to algo 129. This might be seen as a proof that flex algos are working properly. From algo 128 perspective, R4 and R6 do not exist.

Actually, it might be that, for example, R4 is not advertising any node SID for algo 128.

To really be sure R4 is not considered by algo 128 we can check ISIS route:

root@r1_re# run show isis route flex-algorithm-id 128
 IS-IS routing table             Current version: L1: 4837 L2: 5937
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
1.1.1.1/32         2      33        0 int
2.2.2.2/32         2      33       10 int  ge-0/0/0.0      IPV4 r2_re
3.3.3.3/32         2      33       10 int  ge-0/0/1.0      IPV4 r3_re
4.4.4.4/32         2      33 4261412865 int
5.5.5.5/32         2      33       20 int  ge-0/0/1.0      IPV4 r3_re
6.6.6.6/32         2      33 4261412865 int
7.7.7.7/32         2      33       30 int  ge-0/0/1.0      IPV4 r3_re
8.8.8.8/32         2      33       30 int  ge-0/0/1.0      IPV4 r3_re

R4 and R6 metric is 4261412865 that, in SR world, means unreachable.

We can do the same with algo 129 to see R3 and R5 are unreachable:

root@r1_re# run show isis route flex-algorithm-id 129
 IS-IS routing table             Current version: L1: 4837 L2: 5937
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
1.1.1.1/32         2      31        0 int
2.2.2.2/32         2      31       10 int  ge-0/0/0.0      IPV4 r2_re
3.3.3.3/32         2      31 4261412865 int
4.4.4.4/32         2      31       10 int  ge-0/0/2.0      IPV4 r4_re
5.5.5.5/32         2      31 4261412865 int
6.6.6.6/32         2      31       20 int  ge-0/0/2.0      IPV4 r4_re
7.7.7.7/32         2      31       30 int  ge-0/0/2.0      IPV4 r4_re
8.8.8.8/32         2      31       30 int  ge-0/0/2.0      IPV4 r4_re

What if we have a failure? TI-LFA is still the way to go! TI-LFA is flex algo aware so it will compute backup paths considering the specific topology a destination belongs to.
What does this mean?
Consider R1. It has 3 labels to reach R7 (algo 0, algo 128 and algo 129). Now, assume we configure TI-LFA to protect R1-R3 link.

That link is not part of algo 129 so we do not expect any algo 129 path to go through that link. As a result, we can ignore algo 129 in case of R1-R3 link failure (to algo 129 that link simply does not exist).

Algo 128 must be considered. Let’s see what happens. We configure TI-LFA:

set protocols isis interface ge-0/0/1.0 level 2 post-convergence-lfa

And check routes:

root@r1_re# run show route table inetcolor.0 match-prefix *-128*

inetcolor.0: 15 destinations, 15 routes (15 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

2.2.2.2-128<c>/64
                   *[L-ISIS/18] 00:02:28, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
3.3.3.3-128<c>/64
                   *[L-ISIS/18] 00:00:24, metric 10
                    >  to 192.168.13.1 via ge-0/0/1.0
                       to 192.168.12.1 via ge-0/0/0.0, Push 1183
5.5.5.5-128<c>/64
                   *[L-ISIS/18] 00:00:24, metric 20
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1185
                       to 192.168.12.1 via ge-0/0/0.0, Push 1185
7.7.7.7-128<c>/64
                   *[L-ISIS/18] 00:00:24, metric 30
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1187
                       to 192.168.12.1 via ge-0/0/0.0, Push 1187
8.8.8.8-128<c>/64
                   *[L-ISIS/18] 00:00:24, metric 30
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1188
                       to 192.168.12.1 via ge-0/0/0.0, Push 1188

They all have a backup path through R2 (R1 only has R2 as alternative next-hop; R4 does not participate to algo 128).

As expected, algo 129 is unaffected:

root@r1_re# run show route table inetcolor.0 match-prefix *-129*

inetcolor.0: 15 destinations, 15 routes (15 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

2.2.2.2-129<c>/64
                   *[L-ISIS/18] 00:02:46, metric 10
                    >  to 192.168.12.1 via ge-0/0/0.0
4.4.4.4-129<c>/64
                   *[L-ISIS/18] 00:02:46, metric 10
                    >  to 192.168.14.1 via ge-0/0/2.0
6.6.6.6-129<c>/64
                   *[L-ISIS/18] 00:02:46, metric 20
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1196
7.7.7.7-129<c>/64
                   *[L-ISIS/18] 00:02:46, metric 30
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1197
8.8.8.8-129<c>/64
                   *[L-ISIS/18] 00:02:46, metric 30
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1198

No algo 129 routes go through the protected link so there is no need to compute any backup path.

What about the default topology (algo 0)? Playing with TI-LFA is a nice way to better understand and see how different control planes (different algos) are independent.

Let’s increase the metric of link R1-R2:

root@r1_re# set protocols isis interface ge-0/0/0 level 2 metric 50
root@r2_re# set protocols isis interface ge-0/0/0 level 2 metric 50

Now, we are going to see how R1 reaches R7.

Default topology has 2 ECMP next-hops via R3 and R4:

root@r1_re# run show isis route 7.7.7.7
 IS-IS routing table             Current version: L1: 4844 L2: 5954
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
7.7.7.7/32         2    5954       30 int  ge-0/0/1.0      IPV4 r3_re
                                           ge-0/0/2.0      IPV4 r4_re

Instead, algo 128 has a primary path via R3 and a backup path via R2:

root@r1_re# run show isis route 7.7.7.7 flex-algorithm-id 128
 IS-IS routing table             Current version: L1: 4844 L2: 5954
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
7.7.7.7/32         2      41       30 int  ge-0/0/1.0      IPV4 r3_re
                                           ge-0/0/0.0      MPLS Direct->r2_re(2.2.2.2)

Algo 129 only has its route via R4:


root@r1_re# run show isis route 7.7.7.7 flex-algorithm-id 129
 IS-IS routing table             Current version: L1: 4851 L2: 5970
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
7.7.7.7/32         2      44       30 int  ge-0/0/2.0      IPV4 r4_re

Even just by looking at those outputs we realize the algos are independent. To compute its backup path, algo 128 takes a longer path: R1->R2->R3 costs 60 while R1->R4->R3 costs 20, but algo 128 cannot use R1->R4 because R4 does not exist for it.
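
The same reasoning can be sketched in a few lines of Python. This is only an illustration, not how Junos computes backup paths: the four-router graph is a hypothetical reduction of the lab, the R1-R3 metric of 10 is an assumption, the other metrics are derived from the costs quoted above, and spf_cost is a made-up helper name.

```python
import heapq

# Link metrics. R1-R2 = 50 comes from the config above; R2-R3, R1-R4 and R4-R3
# are derived from the quoted path costs (60 and 20); R1-R3 = 10 is an assumption.
LINKS = {
    ("R1", "R2"): 50,
    ("R2", "R3"): 10,
    ("R1", "R3"): 10,
    ("R1", "R4"): 10,
    ("R4", "R3"): 10,
}

def spf_cost(members, src, dst, failed_link=None):
    """Dijkstra restricted to the routers in `members`, optionally skipping one link."""
    adj = {}
    for (a, b), metric in LINKS.items():
        if a not in members or b not in members:
            continue  # a router outside the algo does not exist for it
        if failed_link and {a, b} == set(failed_link):
            continue  # pretend the protected link is down
        adj.setdefault(a, []).append((b, metric))
        adj.setdefault(b, []).append((a, metric))
    best = {src: 0}
    heap = [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, metric in adj.get(node, []):
            new_cost = cost + metric
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None  # unreachable inside this topology

default_topology = {"R1", "R2", "R3", "R4"}
algo_128 = {"R1", "R2", "R3"}  # R4 is not part of algo 128

# Cost of the path around the protected R1-R3 link, per topology.
backup_default = spf_cost(default_topology, "R1", "R3", failed_link=("R1", "R3"))
backup_128 = spf_cost(algo_128, "R1", "R3", failed_link=("R1", "R3"))
print(backup_default, backup_128)
```

The function and the metrics are identical for both runs; only the set of member routers changes, and that alone is enough to produce two different backup costs.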

Now, we emulate link R1-R3 failure:

root@r3_re# set interfaces ge-0/0/0 disable

And we check again.

We omit algo 129 as it is unaffected by this failure.

The default topology is left with only one of those 2 ECMP paths:

root@r1_re# run show isis route 7.7.7.7
 IS-IS routing table             Current version: L1: 4845 L2: 5958
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
7.7.7.7/32         2    5958       30 int  ge-0/0/2.0      IPV4 r4_re

The algo 128 post-convergence path is via R2, using the expensive R1-R2 link (but it is its only possibility!):

root@r1_re# run show isis route 7.7.7.7 flex-algorithm-id 128
 IS-IS routing table             Current version: L1: 4845 L2: 5958
IPv4/IPv6 Routes
----------------
Prefix             L Version   Metric Type Interface       NH   Via                 Backup Score
7.7.7.7/32         2      43       80 int  ge-0/0/0.0      IPV4 r2_re

Are we done? Not yet, this is a long one.

We still miss one thing.

If we think about it, why do we associate flex algo with SR? At the end of the day, flex algo is just a way to slice the IGP into multiple sub-topologies. Thinking of a Python script, it is like using multiprocessing to run the same script many times in parallel, each run using different input values (in our case the different input is the algo-specific topology along with the FAD parameters). Flex algo is just running the IGP on multiple logical instances, right?

So why do we “link” it to SR?
Remember when we configured the FAD? There, we mapped a given flex algo to a given color. That color is nothing more than a color community. Flex algo routes are nothing more than colored LSPs. No surprise we found those routes in inetcolor.0.

Now, it should be clear. Flex algo is another way to create TE LSPs that can be used by end-to-end routes.

Colored LSPs allowed us to build an LSP based on a segment list (or DCSPF).

Flex algo allows us to build LSPs by partitioning our network into smaller portions and having the IGP run on them. It is TE because building flex algo topologies is doing TE! By saying “this router belongs to algo 128 while this other one belongs to algo 129” we are doing traffic engineering but, unlike colored LSPs, after this initial partition we leave the responsibility of building LSPs to the IGP. Moreover, flex algos are flexible enough to let us specify additional constraints like admin-groups, or to have the IGP run SPF based on delay or TE metrics.

At this point, it should be clear how to map prefixes to flex algo routes. It is no different than what we have done here and with VPNs.

As an example, R7 has this config:

set policy-options policy-statement exp-bgp term algo128 from protocol static
set policy-options policy-statement exp-bgp term algo128 from route-filter 7.0.0.128/32 exact
set policy-options policy-statement exp-bgp term algo128 then community add algo128
set policy-options policy-statement exp-bgp term algo128 then accept
set policy-options policy-statement exp-bgp term algo129 from protocol static
set policy-options policy-statement exp-bgp term algo129 from route-filter 7.0.0.129/32 exact
set policy-options policy-statement exp-bgp term algo129 then community add algo129
set policy-options policy-statement exp-bgp term algo129 then accept
set policy-options community algo128 members color:0:128
set policy-options community algo129 members color:0:129
set routing-options static route 7.0.0.128/32 discard
set routing-options static route 7.0.0.129/32 discard

As a result, on R1:

root@r1_re# run show route match-prefix 7.0.* active-path

inet.0: 28 destinations, 28 routes (28 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

7.0.0.128/32       *[BGP/170] 00:00:28, localpref 100, from 7.7.7.7
                      AS path: I, validation-state: unverified
                    >  to 192.168.13.1 via ge-0/0/1.0, Push 1187
7.0.0.129/32       *[BGP/170] 00:00:28, localpref 100, from 7.7.7.7
                      AS path: I, validation-state: unverified
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1197

Each route uses the LSP from the right flex algo.

Again, we see that flex algo lives in the control plane. The forwarding plane is still a single one: in the same table we have routes using algo 128 next-hops and routes using algo 129 next-hops.
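
The color-based resolution can be pictured with a small Python sketch. It is a simplification under stated assumptions: INETCOLOR is a hypothetical, flattened view of R1's inetcolor.0 holding only the two entries seen in the outputs above, resolve is a made-up helper, and fallback to uncolored tables is not modelled.

```python
# Hypothetical, simplified view of R1's inetcolor.0: (egress, color) -> forwarding info.
# Interfaces and labels are copied from the outputs above.
INETCOLOR = {
    ("7.7.7.7", 128): {"via": "ge-0/0/1.0", "push": 1187},
    ("7.7.7.7", 129): {"via": "ge-0/0/2.0", "push": 1197},
}

def resolve(protocol_next_hop, communities):
    """Return the colored next hop matching the first usable color community, if any."""
    for community in communities:
        parts = community.split(":")
        if len(parts) != 3 or parts[0] != "color":
            continue  # not a color community, ignore it
        entry = INETCOLOR.get((protocol_next_hop, int(parts[2])))
        if entry is not None:
            return entry
    return None  # no colored LSP found; fallback behaviour is not modelled here

for prefix, color in (("7.0.0.128/32", "color:0:128"), ("7.0.0.129/32", "color:0:129")):
    print(prefix, resolve("7.7.7.7", [color]))
```

Both prefixes share the same protocol next hop (7.7.7.7); the color community alone decides which flex algo LSP is used.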

Pretty flexible…

Ciao
IoSonoUmberto

Building dynamic SR-TE tunnels

Up to now, we have seen how to manually provision a SR-TE LSP. This required the definition of one or more segment lists, referenced by a source routing path.

All those LSPs were bound to a specific egress IP and associated with a given color.

Alternatively, we can have Junos dynamically create SR-TE paths based on received BGP routes.

The principle is no different from what Junos already implemented for dynamic UDP or GRE tunnels.

Let’s see how it works. The core configuration is the dynamic tunnels themselves:

set routing-options dynamic-tunnels dyn-seven spring-te source-routing-path-template color7-template color 777
set routing-options dynamic-tunnels dyn-seven spring-te destination-networks 7.0.0.0/8

It is pretty simple. BGP routes whose next hop (here, the advertising peer itself) belongs to 7.0.0.0/8 are taken into consideration.
When such a route arrives, it is analyzed. If it is a colored route and the color is 777, then Junos dynamically builds a SR-TE path based on the content of the referenced template.
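
The two conditions can be summarized with a short Python sketch. It only mirrors the logic described above, not Junos internals: triggers_tunnel is a hypothetical helper, and the check is made against the route's next hop, which in this lab is also the advertising peer (7.7.7.7).

```python
import ipaddress

# Values taken from the dynamic-tunnels configuration above.
DESTINATION_NETWORK = ipaddress.ip_network("7.0.0.0/8")
TEMPLATE_COLOR = 777

def triggers_tunnel(next_hop, communities):
    """True if the route matches both the destination network and the template color."""
    in_range = ipaddress.ip_address(next_hop) in DESTINATION_NETWORK
    has_color = f"color:0:{TEMPLATE_COLOR}" in communities
    return in_range and has_color

print(triggers_tunnel("7.7.7.7", ["color:0:777"]))  # in range, right color
print(triggers_tunnel("7.7.7.7", ["color:0:555"]))  # in range, wrong color
print(triggers_tunnel("8.8.8.8", ["color:0:777"]))  # right color, out of range
```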

Let’s have a look at the template:

set protocols source-packet-routing source-routing-path-template color7-template primary dyn7sl

set protocols source-packet-routing segment-list dyn7sl inherit-label-nexthops
set protocols source-packet-routing segment-list dyn7sl dynamic
set protocols source-packet-routing segment-list dyn7sl hop1 ip-address 5.5.5.5
set protocols source-packet-routing segment-list dyn7sl hop1 label-type node

The template is similar to a SR-TE path definition. This one is really simple as it only has a primary path (no secondary path, no BFD and so on). The path is built from a segment list. The segment list simply says to go through R5 (IP 5.5.5.5 is R5’s lo0, which is translated to its node SID because we specified label-type node).

A remote node sends us this route:

root@r1_re# run show route receive-protocol bgp 7.7.7.7 extensive

inet.0: 26 destinations, 26 routes (26 active, 0 holddown, 0 hidden)
* 7.1.2.3/32 (1 entry, 1 announced)
     Accepted
     Nexthop: 7.7.7.7
     Localpref: 100
     AS path: I
     Communities: color:0:777

The remote node IP is within 7.0.0.0/8 and the advertised route carries color 777.

Junos builds the dynamic tunnel:

root@r1_re# run show dynamic-tunnels database
*- Signal Tunnels #- PFE-down
Table: inetcolor.0

Destination-network: 7.0.0.0-0<c>/8

Tunnel to: 7.7.7.7-777<c>/64
  Reference count: 1
  Next-hop type: spring-te
      7.7.7.7:309:dt-srte-dyn-seven
      State: Established

root@r1_re# run show spring-traffic-engineering lsp name 7.7.7.7:309:dt-srte-dyn-seven detail
Name: 7.7.7.7:309:dt-srte-dyn-seven
  Tunnel-source: Dynamic Tunnel Module(DTM)
  Tunnel-template: color7-template
  To: 7.7.7.7-777<c>
  State: Up
    Path: dyn7sl
    Path Status: NA
    Outgoing interface: NA
    Auto-translate status: Enabled Auto-translate result: Success
    Compute Status:Disabled , Compute Result:N/A , Compute-Profile Name:N/A
    BFD status: N/A BFD name: N/A
    ERO Valid: true
      SR-ERO hop count: 2
        Hop 1 (Loose):
          NAI: IPv4 Node ID, Node address: 5.5.5.5
          SID type: 20-bit label, Value: 1105
        Hop 2 (Loose):
          NAI: IPv4 Node ID, Node address: 7.7.7.7
          SID type: 20-bit label, Value: 1107

There it is, a SR-TE path to R7 via R5.
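
How the template turns into that ERO can be sketched in Python. This is a reading of the output above rather than documented behaviour: NODE_SID is a hypothetical stand-in for the SID information learned via the IGP, build_ero is a made-up helper, and appending the tunnel destination as the last hop is an assumption based on Hop 2 in the ERO.

```python
# Node SIDs as seen in the ERO above; a hypothetical stand-in for the IGP SID database.
NODE_SID = {"5.5.5.5": 1105, "7.7.7.7": 1107}

def build_ero(template_hops, tunnel_destination):
    """Translate each IP hop to its node SID and end the list at the tunnel destination."""
    hops = list(template_hops)
    if not hops or hops[-1] != tunnel_destination:
        # Assumption: the dynamic tunnel endpoint is added as the final segment.
        hops.append(tunnel_destination)
    return [NODE_SID[ip] for ip in hops]

ero = build_ero(["5.5.5.5"], "7.7.7.7")
print(ero)
```

The template itself never mentions 7.7.7.7: the destination comes from the BGP route that triggered the tunnel, which is why one template can serve many egress routers.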

We can do more! Instead of using segment lists, we can leverage compute profiles and DCSPF.

Config changes as follows:

set protocols source-packet-routing compute-profile dyn-7-ent admin-group include-any enterprise
set protocols source-packet-routing compute-profile dyn-7-ent maximum-computed-segment-lists 1

set protocols source-packet-routing source-routing-path-template color7-template primary dyn7sl
deactivate protocols source-packet-routing source-routing-path-template color7-template primary dyn7sl
set protocols source-packet-routing source-routing-path-template color7-template primary compute compute dyn-7-ent

Compute profile adds a new constraint: only use links belonging to enterprise admin-group.

Here are those links in our network:

As you can see, we are forced to go through R4.

root@r1_re# run show spring-traffic-engineering lsp name 7.7.7.7:309:dt-srte-dyn-seven detail
Name: 7.7.7.7:309:dt-srte-dyn-seven
  Tunnel-source: Dynamic Tunnel Module(DTM)
  Tunnel-template: color7-template
  To: 7.7.7.7-777<c>
  State: Up
    Path: compute
    Path Status: NA
    Outgoing interface: NA
    Auto-translate status: Disabled Auto-translate result: N/A
    Compute Status:Enabled , Compute Result:success , Compute-Profile Name:dyn-7-ent
    Total number of computed paths: 1
    Computed-path-index: 1
      BFD status: N/A BFD name: N/A
      TE metric: 30, IGP metric: 30; Metric optimized by type: TE
      computed segments count: 3
        computed segment : 1 (computed-node-segment):
          node segment label: 1104
          router-id: 4.4.4.4
        computed segment : 2 (computed-node-segment):
          node segment label: 1105
          router-id: 5.5.5.5
        computed segment : 3 (computed-node-segment):
          node segment label: 1107
          router-id: 7.7.7.7

DCSPF found a feasible path, adding a new label (1104) that forces traffic to go through R4.
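
The effect of the include-any constraint can be illustrated with a minimal Python sketch. The link coloring below is entirely hypothetical (the lab's actual admin-groups are not listed in the text): the three enterprise links are assumed to be the direct links along the computed path, and the R1-R3 and R3-R5 entries are invented just to have something to prune. prune and neighbours are made-up helpers.

```python
# Hypothetical link coloring; none of these admin-group assignments come from the lab.
LINK_GROUPS = {
    ("R1", "R4"): {"enterprise"},
    ("R4", "R5"): {"enterprise"},
    ("R5", "R7"): {"enterprise"},
    ("R1", "R3"): set(),          # invented: uncolored link
    ("R3", "R5"): {"premium"},    # invented: link in another admin-group
}

def prune(links, include_any):
    """Keep a link only if it carries at least one of the requested admin-groups."""
    return [link for link, groups in links.items() if groups & include_any]

def neighbours(links, node):
    """Routers directly reachable from `node` over the given links."""
    return {b if a == node else a for a, b in links if node in (a, b)}

usable = prune(LINK_GROUPS, {"enterprise"})
print(neighbours(usable, "R1"))
```

Once the non-matching links are pruned, the path computation only sees what is left, so from R1 the only way out is towards R4.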

Finally, let’s check the route:

root@r1_re# run show route 7.1.2.3

inet.0: 26 destinations, 26 routes (26 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

7.1.2.3/32         *[BGP/170] 00:34:24, localpref 100, from 7.7.7.7
                      AS path: I, validation-state: unverified
                    >  to 192.168.14.1 via ge-0/0/2.0, Push 1107, Push 1105(top)

root@r1_re# run show route 7.1.2.3 extensive | match 777
                Protocol next hop: 7.7.7.7-777<c>
                Communities: color:0:777
                        Protocol next hop: 7.7.7.7-777<c> Metric: 30
                                7.7.7.7-777<c>/64 Originating RIB: inetcolor.0

All good!
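
One detail worth a closer look: the computed path has three segments (1104, 1105, 1107) while the route only pushes two labels. A plausible reading, sketched below in Python, is that the first segment points to a directly connected neighbor (R4 via ge-0/0/2.0), so it is resolved to the outgoing interface instead of being pushed. This is an interpretation of the outputs, not documented behaviour; ADJACENT_NODE_SIDS and to_forwarding_entry are hypothetical names.

```python
# Node SIDs of R1's direct neighbours, keyed by label. The 1104 / ge-0/0/2.0
# pairing is inferred from the outputs above; the table is a hypothetical stand-in.
ADJACENT_NODE_SIDS = {1104: "ge-0/0/2.0"}

def to_forwarding_entry(segments):
    """Split a computed segment list into an outgoing interface and a label stack."""
    first, rest = segments[0], segments[1:]
    if first in ADJACENT_NODE_SIDS:
        # The first segment is a directly connected node: no label is needed to reach it.
        return ADJACENT_NODE_SIDS[first], rest
    return None, list(segments)  # first hop resolved some other way (not modelled)

interface, stack = to_forwarding_entry([1104, 1105, 1107])
print(interface, stack)  # stack is listed top label first
```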

Ciao
IoSonoUmberto