Troubleshooting CGNAT from inside the MS-MPC

Recently, I bumped into a CGNAT issue. The symptom was pretty self-explanatory: all the traffic hitting the service card got dropped.

We were talking about a small number of users, around 5k, and a small number of total sessions, around 60k.

We were using PBA (Port Block Allocation), meaning that each private address was assigned a port block. The port block size was set to 1008, so every private IP could open up to 1008 simultaneous sessions. If it tried to open more, the session was dropped, as there were no more available ports for that IP.
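
To make the port block mechanics concrete, here is a quick back-of-the-envelope sketch in Python; the 1024-65535 usable port range per public address is my assumption, not a value taken from the router:

# Rough PBA math for this scenario (illustrative only)
port_block_size = 1008            # ports per block = max concurrent sessions per private IP
usable_ports = 65535 - 1024 + 1   # assumed usable ports per public address (64512)

blocks_per_public_ip = usable_ports // port_block_size
print(blocks_per_public_ip)       # 64 -> up to 64 subscribers can share one public address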

If a user attempted a DNS query, the traffic was blocked by the MS-MPC:

X0178879@cgnatrm2> show services sessions service-set NAT-SET-1 destination-prefix 8.8.8.8/32
Mar 17 17:59:23
ms-8/0/0
Service Set: NAT-SET-1, Session: 1073806714, ALG: none, Flags: 0x100, IP Action: no, Offload: no, Asymmetric: no
UDP          100.65.0.1:58090  ->         8.8.8.8:53     Drop     I               1

To understand more, we monitored the NAT statistics for the NPU handling the traffic:

X0178879@cgnatrm2> show services nat statistics interface ms-8/0/0 | match "NAT allocation F"
Mar 17 18:03:02
    NAT allocation Failures                                 :23471950

{master}
X0178879@cgnatrm2> show services sessions service-set NAT-SET-1 destination-prefix 8.8.8.8/32
Mar 17 18:03:11

{master}
X0178879@cgnatrm2> show services sessions service-set NAT-SET-1 destination-prefix 8.8.8.8/32
Mar 17 18:03:16
ms-8/0/0
Service Set: NAT-SET-1, Session: 67131703, ALG: none, Flags: 0x100, IP Action: no, Offload: no, Asymmetric: no
UDP          100.65.0.1:56445  ->         8.8.8.8:53     Drop     I               1

{master}
X0178879@cgnatrm2> show services nat statistics interface ms-8/0/0 | match "NAT allocation F"
Mar 17 18:03:18
    NAT allocation Failures                                 :23471952

There was a counter increasing: NAT allocation failures.

This counter typically means that the MS-MPC was not able to find an available port to create the session. It normally comes into play when the port block assigned to a user is full; in our case, that would mean IP 100.65.0.1 had opened more than 1008 sessions at the same time.

This felt suspicious. As I said, we had about 5k users and 60k sessions, meaning that, on average, each subscriber was opening about 12 sessions, far from 1008. Sure, one of those 5k users could have hit that limit, but we were dropping 100% of the sessions.

At that point, we had one hint, an increasing counter, but it was not helping at all; quite the opposite, it was reporting something that was not actually happening.

To understand more, we had to look for information somewhere else. The Junos CLI was not enough, so we entered the MS-MPC NPU directly.

In order to do that:

CGNAT > start shell
CGNAT # telnet fpc8.pic0

Access takes place from the shell using telnet. In the example above, we log into the MS-MPC in slot 8, PIC 0 (the corresponding interface is ms-8/0/0).

When prompted for a username, simply type “root”; no password is required.

Once there, we access the NPU CLI:

mspdbg-cli -ps

There, we have several commands we can use to understand better what’s going on.

Here is a list of some useful ones:

show msp plugins
show msp plugins pkt-cntrs
show msp service-sets
show msp service-set service-set-id <N> svc-id 2
show msp tcplib statistics
show msp tcp-stack-plugin statistics
show msp stats ctrl
show msp services-options
show msp ip-stats
show msp shm mum
show msp jbuf pools

show svcs-xlp intf counters
show svcs-xlp fifo
show svcs-xlp cpu stats
show svcs-xlp poe stats
show svcs-xlp sae stats
show svcs-xlp jbuf-stats
show msp jbuf pools
show svcs-xlp exceptions

plugin nat show nat statistics terse
plugin nat show natlib statistics
plugin nat show nat rules
plugin nat show nat pool details

First, we verified the NAT plugin was there and active:

MSPMAND-CLI> show msp plugins
Plugins registered: 20
Next PID          : 20
Plugin Mask       : 0x000fffff
Name                    ID      Data handler    Control handler
...
junos-nat               1       0x1084520f0     0x108454868
  Class : 1 Provider ID : 0x00000000 Gencfg APP ID: 0
  Flags : 0x0000160a TCP Flags : 0x00000001
  Event class base : 36
  Event class names :
...

Next, we checked the plugin counters and noticed junos-nat was reporting discards, many discards:

MSPMAND-CLI> show msp plugins pkt-cntrs
...

Plugin ID: 1
Plugin Name: junos-nat
         Packets Received                       : 0
         Packets Forwarded                      : 0
         Packets Dropped                        : 0
         Session Discarded                      : 77212009
...

At that point, we wanted to understand more about those discards, so we looked at the NAT statistics on the NPU:

MSPMAND-CLI> plugin nat show nat statistics terse
Session statistics
        Total Session Interest events            :12050331
        Total Session Destroy events             :12050329
        Total Session Discards                   :12050331
        Pkt Dst in NAT Route                     :314064
        NAT rule lookup failed                   :314064

NAT Allocation/free statistics
        NAT Allocation Failures                  :23471938
...

There, we see that all the sessions were discarded (Interest events = Discards).

It also reports “Pkt Dst in NAT Route” and “NAT rule lookup failed” events and, more importantly, NAT allocation failures.

Those numbers confirm what we had already seen from the Junos CLI but do not add much.

Another command helps us:

MSPMAND-CLI> plugin nat show natlib statistics

jsf-natlib-counter (total 502), lsys_id: 0.
        -- source nat control ----------          0
        ...
        -- source nat rt ---------------          0
        SRC_ALLOC                          58489214
        SRC_ALLOC_SUCCESS                         0
        SRC_ALLOC_FAIL                     58489214
        ...
        SRC_ALLOC_DETERM                   58489214

There seem to be some issues with source allocation.

Yet, this does not tell us much: fine, “NAT allocation failures” have to do with the source, but we already said we were definitely not exceeding the PBA limit (certainly not for all 5k users).

We checked the NAT rules and pools to verify the source data was correct:

MSPMAND-CLI> plugin nat show nat rules
...
    -> src addr
          mask 255.255.240.0           addr 100.65.0.0 to 100.65.15.255
...
    -> src addr
          mask 255.255.240.0           addr 100.65.0.0 to 100.65.15.255
...
    -> src addr
          mask 255.255.240.0           addr 100.65.0.0 to 100.65.15.255
...
    -> src addr
          mask 255.255.240.0           addr 100.65.0.0 to 100.65.15.255
...

MSPMAND-CLI> plugin nat show nat pool details
...
 Source Address(es):
          mask 255.255.240.0           addr 100.65.0.0 to 100.65.15.255
 Failures: 58489214
    Out of addresses : 0
    Out of ports : 0
    APP Out of ports : 0

The source is correct everywhere, as our private pool is 100.65-something.

The second command shows us something interesting: no “out of addresses” or “out of ports” errors. So it is something source related, yet we are not exhausting the port block.

Anyhow, the source address seems correct.

At this point, we decided to enable tracing.

First, we identified the trace handle IDs for the two plugins involved:

MSPMAND-CLI> show msp trace-handles
MSP_TRACE target: 0
Handle           ID    Level    Map              Plugin
junos-nat        11    0        0x00000000(0)    -1
jsf nat lib      29    0        0x00000000(1)    0

Then, we enabled logging for them:

set msp trace handle 11 level 8
set msp trace handle 29 level 8
set msp trace handle 11 mask 0xffffffff
set msp trace handle 29 mask 0xffffffff

We let some traffic reach the NPU and get dropped. Next, we checked the trace.

The trace can be found on the Junos filesystem under /var/log. The file name is trace-ms<SLOT><NPU#>.txt; in this case, trace-ms80.txt.
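
The naming rule is simple enough to express as a quick sketch (illustrative only):

# Build the trace file name from the MS-MPC slot and PIC/NPU number
slot, pic = 8, 0
print(f"/var/log/trace-ms{slot}{pic}.txt")   # -> /var/log/trace-ms80.txt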

Scrolling through the file we find this:

Mar 17 20:18:33.594123 : [20] Event ID 21 passed to NAT plugin
Mar 17 20:18:33.594157 : [20] msvcs_nat_process_pre_session_alloc_validate, NAT Policy is obtained for service set 2
Mar 17 20:18:33.594206 : [20] Event ID 3 passed to NAT plugin
Mar 17 20:18:33.594231 : [20] nat_data_evt_handler: Session interest from parent session 100679671
Mar 17 20:18:33.594283 : [20] msvcs_nat_process_session_interest: Setting ALG id to 0 mapping_refresh 0
Mar 17 20:18:33.594320 : [20] jsf_nat_subs_ext_set_state: State change TIMER -> VALID Success
Mar 17 20:18:33.594348 : [20] ctx[0x0:0x2], conn 100.65.0.1/50939 -> 8.8.8.8/53 17
Mar 17 20:18:33.594382 : [20] op 1, src-pool-id 4/0x4, tunnel-id 0, sw_src_addr 0.0.0.0, ifl-idx 53, port-num 1, desired 0.0.0.0/0, alloc-flag 0x80, dst-pool-id 0/0x0, ip-base 0.0.0.0, dst-ip-wing0 0.0.0.0,protocol 17
Mar 17 20:18:33.594427 : [20] nat_src_xlate_alloc_: pool addr assign 0x20
Mar 17 20:18:33.594456 : [20] src_determ_host_ip2host_offset: get host offset from range min 10.65.0.1 max 10.65.15.254 for src_ip 100.65.0.1
Mar 17 20:18:33.594488 : [20] src_determ_host_ip2host_offset: get current summary offset 4094.
Mar 17 20:18:33.594513 : [20] src_determ_alloc_process: get host offset failed by host ip 100.65.0.1.
Mar 17 20:18:33.594545 : [20] src_pool_generate_log: src nat pool log send
Mar 17 20:18:33.594570 : [20] nat_src_xlate_alloc: fail to alloc nat resource, pool-id 4/0x4, xlated-src 0.0.0.0/0, ri-id 0, flag 0x1
Mar 17 20:18:33.594607 : [20] msvcs_nat_xlate_alloc: Failed1 to alloc IP from JSF NATLIB for svc_set_id 2 from natpool Pu-NAT-POOL-1.
Mar 17 20:18:33.594636 : [20] msvcs_nat_process_session_interest: Could not allocate NAT mappings for svc_set_id 2. Session will be discarded.

There is one line which is very useful:

Mar 17 20:18:33.594456 : [20] src_determ_host_ip2host_offset: get host offset from range min 10.65.0.1 max 10.65.15.254 for src_ip 100.65.0.1

Even though we verified the private pool information is correct, somewhere else in the NPU data structures there must be incorrect data, as the NPU expects source IPs to be in the 10.65-something range, not the 100.65-something one.

That’s the problem! That’s why it complains about the source.
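
To see why that single trace line explains everything, here is a rough Python sketch of the logic the deterministic source NAT allocation appears to follow (an illustration, not the actual MS-MPC code):

import ipaddress

# Range still programmed in the NPU allocation path (stale data)
range_min = int(ipaddress.ip_address("10.65.0.1"))
range_max = int(ipaddress.ip_address("10.65.15.254"))

# Real subscriber address hitting the service card
src_ip = int(ipaddress.ip_address("100.65.0.1"))

# Deterministic NAT derives the subscriber's offset within the private range,
# then uses that offset to pick the public address and port block.
offset = src_ip - range_min
if offset < 0 or offset > (range_max - range_min):
    # 100.65.0.1 is far outside 10.65.0.1-10.65.15.254, so the offset lookup
    # fails and the session is discarded ("NAT Allocation Failures" increments).
    print("get host offset failed -> session discarded")

Since every subscriber sits in 100.65.0.0/20, every single session fails this check, which matches the 100% drop rate.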

Where did this issue come from?

The NAT configuration learns the private pool addresses from the NAT rule configuration, where we reference a prefix list.

Checking commit history, we found this:

set policy-options prefix-list Pr-POOL-1 10.65.0.0/20
delete policy-options prefix-list Pr-POOL-1
set policy-options prefix-list Pr-POOL-1 100.65.0.0/20
commit

Basically, the private pool was initially configured incorrectly and was later updated.

Internally, the MS-MPC NPU holds the private pool information in two different data structures.

Somehow, when Junos committed the updated configuration, the MS-MPC did not update the private pool data in both data structures, leading to an internal misalignment that caused those source allocation errors.
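
As a purely illustrative model of that misalignment (the real NPU structures are internal), you can picture the two views like this:

# Hypothetical model of the two views after the commit (illustrative only)
nat_rule_and_pool_view = {"private_prefix": "100.65.0.0/20"}   # updated by the commit; this is
                                                               # what "show nat rules/pool" reports
determ_allocation_view = {"range_min": "10.65.0.1",            # stale pre-commit data still used
                          "range_max": "10.65.15.254"}         # by the allocation path

# The first view looks correct, but the second one is what actually drives the
# port allocation, so every source lookup fails despite a "correct" configuration.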

To recover from it, we had to deactivate and re-activate the service set so that the NPU could re-program itself correctly and update all the required data structures with the correct data.

This is because some operations are considered critical and, to be safe, should always be followed by either a service-set deactivate/re-activate or an NPU restart.

Among these operations, we have configuration changes involving:

  • private pool
  • public pool addresses
  • port block size

In the end, solving the issue was very easy, and it highlighted a new best practice to keep in mind for the future.

Anyhow, this gave us the opportunity to have a look inside the NPU, understand how to find information there, and go beyond the classic CLI commands.

Ciao
IoSonoUmberto
