Resource Monitoring (resmon) Overview

Monitoring system resources is crucial for analyzing device health. RBFS has a dedicated daemon called resmond that discovers and monitors the device resources. Resmond polls the system resources to gather their status and stores this data in resource-specific BDS tables.

The Resource Monitoring (resmon) functionality of RBFS provides support for monitoring the following components:

  • CPU

  • Memory

  • Processes

  • Disks

  • Sensor

  • System Clock

  • Optics

CPU

Resmond collects CPU hardware information and stores it in the global.chassis_0.resource.cpu table. In addition, Resmond calculates CPU usage dynamically and stores this information in the global.chassis_0.resource.cpu_usage table.
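
Because Resmond stores its results in BDS tables, the tables named throughout this section can be inspected directly from the CLI. The following is a minimal sketch, assuming the generic BDS inspection command show datastore is available and that resmond hosts the table (the exact command form and output columns may vary by release); the same pattern applies to the other resource tables described in this section.

supervisor@rtbrick: op> show datastore resmond table global.chassis_0.resource.cpu_usage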

Memory

Resmond collects RAM hardware information and stores it in the global.chassis_0.resource.mem table. In addition, Resmond gathers memory usage information and stores it in the global.chassis_0.resource.mem_usage table.

Processes

Resmond collects process usage information for the Brick Daemons (BDs) that run in RBFS and stores it in the global.chassis_0.resource.proc_usage table. It dynamically gathers process information and calculates the CPU and memory usage of each individual Brick Daemon.

Disks

Resmond collects disk hardware information and stores it in the global.chassis_0.resource.disk table. In addition, Resmond collects disk usage information and stores it in the global.chassis_0.resource.disk_usage table.

Sensor

Resmond collects reading data from hardware sensors such as the temperature, fan, power supply, and system LED sensors. The collected sensor data is stored in the global.chassis_0.resource.sensor table.

System Clock

Resmond monitors the system clock to verify that it stays in sync with the NTP server clock. This ensures that the deviation from the NTP server clock always remains within acceptable limits. Resmond stores the system clock information in the global.os.timex table.

Optical Modules

Resmond monitors optical transceivers plugged into the chassis. It reads the transceivers' EEPROM (Electrically Erasable Programmable Read-Only Memory) data and translates it into the respective fields of the BDS tables.

Resmond provides the following functionalities for monitoring optical transceivers:

  • Provides a mechanism to discover and monitor optics modules. Supported optics modules include SFP, SFP+, QSFP, QSFP+, and QSFP28 (DAC is not supported).

  • Provides CLIs to write to optics modules

  • Provides show commands to display the optics inventory and the status of each module (see the example following this list)

  • Logs the status of the optics module

The RBFS implementation supports monitoring of pluggable optics modules on white box switches only.
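
As a hedged illustration of the show commands mentioned above (the exact command names depend on your RBFS release; consult the CLI reference for your version), querying the optics inventory might look like this:

supervisor@rtbrick: op> show optics inventory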

The Resmond application performs the following important tasks, among others:

  • Optics inventory: Identifies the following summary information for a discovered optics module and stores it in the global.chassis_0.resource.optics.inventory table.

    • Port

    • Type

    • Vendor

    • Serial Number

    • Part Number

  • Reads the following optics data from a module and stores it in the global.chassis_0.resource.optics.module table:

    • RX/TX alarming (loss of light and loss of signal)

    • RX/TX power status

    • Voltage and bias status

    • Temperature

  • Writes optics data to an optics module:

    • Enables the high power class on QSFP28

    • Shuts down lasers (QSFP28, SFP+, and SFP)

Optics Logging

Resmond can log the following optics module events:

  • Temperature high alarm

  • Temperature high warning

  • Temperature low alarm

  • Temperature low warning

  • Voltage high alarm

  • Voltage high warning

  • Voltage low alarm

  • Voltage low warning

  • Lane power high alarm

  • Lane power high warning

  • Lane power low alarm

  • Lane power low warning

  • Lane bias high alarm

  • Lane bias high warning

  • Lane bias low alarm

  • Lane bias low warning

Q2C Resource Monitoring

Q2C platform resource-specific usage metrics are stored in the BDS table local.bcm.q2c.resource.monitor. Resource usage information enables you to understand the scale of services that the device performs and how efficiently it uses the available hardware resources.
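
As a hedged sketch of how this table can be read, assume the generic show datastore command is available and that the forwarding daemon (fibd) hosts this local table; both the command form and the hosting daemon are assumptions that may differ by release. The same pattern applies to the QAX table local.bcm.qax.resource.monitor described later in this section.

supervisor@rtbrick: op> show datastore fibd table local.bcm.q2c.resource.monitor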

The following list describes the resource types supported for monitoring in RBFS on Q2C platforms.

EEDB_L2TP

EEDB is the Egress Encapsulation Database. This resource is consumed when L2TP subscribers are created in hardware.

EEDB_MPLS_TUNNEL

This resource is consumed when MPLS tunnels are created in the chip.

EEDB_PPPOE

EEDB_PPPOE is used for PPPoE encapsulation. This resource is consumed when PPPoE subscribers are created in hardware.

EEDB_PWE

This resource is consumed when L2X or cross-connection sessions are created in hardware at egress.

IN_LIF_FORMAT_PWE

This resource is consumed when a pseudowire is created in the hardware at ingress.

IN_AC_C_C_VLAN_DB

This resource is consumed when an ingress logical interface for a double-tagged VLAN is created.

IN_AC_C_VLAN_DB

This resource is consumed when an ingress logical interface for a single-tagged VLAN is created.

IN_AC_UNTAGGED_DB

This resource is consumed when an ingress logical interface for untagged traffic is created.

IPV4_MULTICAST_PRIVATE_LPM_FORWARD

LPM stands for Longest Prefix Match. This resource is consumed for multicast (source, group) entries.

IPV4_UNICAST_PRIVATE_LPM_FORWARD

This resource is consumed for non-default VRF instance IPv4 prefixes.

IPV4_UNICAST_PRIVATE_LPM_FORWARD_2

This resource is consumed for default VRF instance IPv4 prefixes.

IPV6_UNICAST_PRIVATE_LPM_FORWARD

This resource is consumed for non-default VRF instance IPv6 prefixes.

IPV6_UNICAST_PRIVATE_LPM_FORWARD_2

This resource is consumed for default VRF instance IPv6 prefixes.

L3_RIF

This resource is consumed for the L3 interfaces.

L2TPV2_DATA_MESSAGE_TT

This resource is consumed when the L2TP subscribers are created in hardware.

MPLS_FWD

This resource is consumed for MPLS entries for which forwarding actions are involved.

MPLS_TERMINATION_SINGLE_LABEL_DB

This resource is consumed for MPLS entries for which label termination is required.

MULTICAST_MCDB

This resource is consumed for multicast groups created in hardware.

PPPOE_O_ETH_TUNNEL_FULL_SA

This resource is consumed for PPPoE subscribers in hardware.

Example: Logical table information for the resource type EEDB_L2TP

supervisor@rtbrick: dbg> bcm "dbal table info table=EEDB_L2TP"

Logical table info  EEDB_L2TP
=============================

        Access method: MDB
        Table type: DIRECT
        Touched status: Initialized
        Entries Status: Max Capacity: HW dependent (see mapping), Committed 0
        Bulk mode range NOT supported
        Maturity_level: HIGH
        Table Labels: L2, L3, MPLS, EEDB
        Core mode: SBC
        Max key value: 1048575
        Max payload size in bits: 101
<...>

Example: Logical table information for the resource type EEDB_MPLS_TUNNEL

supervisor@rtbrick: dbg> bcm "dbal table info table=EEDB_MPLS_TUNNEL"

Logical table info  EEDB_MPLS_TUNNEL
====================================

        Access method: MDB
        Table type: DIRECT
        Touched status: Initialized
        Entries Status: Max Capacity: HW dependent (see mapping), Committed 18
        Bulk mode range NOT supported
        Maturity_level: HIGH
        Table Labels: L2, L3, MPLS, EEDB
        Core mode: SBC
        Max key value: 1048575
        Max payload size in bits: 147
<...>

QAX Resource Monitoring

QAX platform resource-specific usage metrics are stored in the BDS table local.bcm.qax.resource.monitor. Resource usage information enables you to understand the scale of services that the device performs and how efficiently it uses the available hardware resources.

The following list describes the resource types supported for monitoring on QAX platforms.

ECMP ID

It represents an index in the table that maintains the paths for a particular ECMP route.

Egress Failover ID

It identifies failover paths for traffic leaving the device.

FEC Failover ID

It is used to manage failover paths for FECs (Forwarding Equivalence Classes).

FECs for Global Use

It represents the number of FECs that are available for general use across the device.

Field Direct Extraction Entry ID

It represents the entries in the FDE table, which are used for direct extraction of fields.

Field Entry ID

It identifies a specific entry used for field-based operations.

Ingress Failover ID

It identifies backup paths for traffic entering the device.

Local Common InLif

It represents the number of common InLifs (Ingress Logical Interface) allocated for incoming traffic.

Local OutLif

It represents the number of logical interfaces allocated for outgoing traffic.

Local Wide InLif

Wide InLifs are used when additional metadata or special handling is required for incoming traffic. It represents the number of wide InLifs allocated for such purposes.

Number of Meters in Processor A

Meters are used for rate limiting or policing traffic, ensuring that traffic adheres to predefined bandwidth limits. This represents the number of meters available and used in processor A.

Profiles for PON Use

Profiles related to Passive Optical Networks (PON) are used in the management and configuration of PON interfaces. It represents the number of profiles allocated for PON operations.

QOS EGRESS DSCP/EXP MARKING PROFILE IDs

These IDs represent the QoS marking profiles used to mark outgoing traffic based on DSCP or EXP bits, ensuring the prioritization of outgoing traffic.

QOS EGRESS L2 I TAG PROFILE IDs

These IDs are used for profiles that manage Layer 2 (L2) I-Tagging in egress traffic. It represents the number of available profiles for L2 I-Tagging.

QOS EGRESS MPLS PHP QOS IDs

The IDs identify QoS profiles used for MPLS PHP (Penultimate Hop Popping) operations.

QOS EGRESS REMARK QOS IDs

This is used to re-mark packets with a new QoS value as they exit the device. These IDs represent the QoS profiles that apply new markings to egress traffic.

QOS INGRESS COS OPCODE IDs

COS (Class of Service) opcodes are used to classify incoming traffic based on criteria such as type of service. The IDs represent different COS opcode profiles for classifying ingress traffic.

QOS INGRESS LABEL MAP ID

Label maps are used to associate incoming packets with specific labels for QoS processing. It represents the number of available label maps for ingress traffic.

QOS INGRESS LIF/COS IDs

These IDs are used to associate incoming traffic with specific LIF (Logical Interface) and COS profiles. It represents the number of IDs available for mapping ingress traffic to LIF/COS profiles.

QOS INGRESS PCP PROFILE IDs

Priority Code Point (PCP) profiles are used to prioritize packets at the ingress based on their PCP value. The IDs represent available PCP profiles for prioritizing ingress traffic.

SW Handles of Policer

It represents the number of software handles available for configuring policers.

Trill Virtual Nickname

It represents the number of virtual nicknames allocated for TRILL protocol operations, which create a loop-free, multi-path Ethernet network.

VLAN Translation Egress Usage

The number of entries used for VLAN translation operations on outgoing traffic.

VLAN Translation Ingress Usage

The number of entries used for VLAN translation on incoming traffic.

VSIs for MSTP

The number of VSIs (Virtual Switch Instances) allocated for MSTP operations.

VSIs for TB VLANS

The number of VSIs allocated for handling TB VLANs.

Supported Platforms

Not all features are necessarily supported on each hardware platform. Refer to the Platform Guide for the features and the sub-features that are or are not supported by each platform.