Use Cases

Monitoring Network Devices Using SNMP

Overview

Unmonitored network devices can lead to unexpected failures, performance bottlenecks, and costly downtime. From Ethernet switches to network interfaces, these components must function optimally to ensure seamless communication.

MetricsHub® provides comprehensive SNMP-based monitoring, allowing you to track key performance metrics and detect issues before they impact your network.

Why Network Device Monitoring Matters

Monitoring network devices helps:

  • Detect performance issues by identifying slowdowns caused by congestion or high bandwidth usage
  • Identify failures and connectivity issues by detecting unplugged cables, failed ports, or degraded network links
  • Optimize resource utilization by monitoring bandwidth consumption to balance traffic loads
  • Troubleshoot network problems faster with real-time alerts, allowing you to diagnose and resolve issues before they cause downtime

Common Network Device Issues Detected by MetricsHub

With MetricsHub, you can monitor and address common network device issues that cause downtime, performance degradation, increased latency, or bottlenecks, all of which directly affect data flow and overall efficiency.

The table below outlines frequent issues and the corresponding metrics that help diagnose and resolve them:

| Issue | Description | Metrics |
|---|---|---|
| Hardware Failures | Degraded fans, power supplies, overheating, or faulty modules. | `hw.status{hw.type="<component>", state="degraded\|failed\|ok"}`, `hw.fan.speed`, `hw.fan.speed.limit{limit_type="low.degraded"}`, `hw.temperature.limit{limit_type="high.critical"}` |
| Network Congestion | High bandwidth usage leading to slowdowns. | `hw.network.bandwidth.limit`, `hw.network.io{direction="receive"}`, `hw.network.io{direction="transmit"}` |
| Packet Loss | High error rates affecting network communication. | `hw.network.packets{direction="receive"}`, `hw.network.packets{direction="transmit"}` |
| Transmission Errors | Packet corruption due to faulty cables, failing ports, or electrical interference. | `hw.errors{hw.type="network"}` |
| Interface Down | Network interface failure due to unplugged cables, administrative shutdown, or hardware faults. | `hw.network.up`, `hw.status{hw.type="network", state="degraded\|failed\|ok"}` |
| Cooling Inefficiency | Fans running too slow, reducing cooling effectiveness. | `hw.fan.speed`, `hw.fan.speed.limit{limit_type="low.degraded"}` |
| Outdated Firmware | Device running on old firmware, causing security risks. | `bios_version` |
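To illustrate how one of these metrics could drive an alert, here is a minimal sketch of a Prometheus alerting rule for the "Interface Down" case. It rests on several assumptions that are not stated in this page: that the metrics are exported to a Prometheus-compatible backend, that dots in metric and attribute names are converted to underscores (`hw.status` becoming `hw_status`), and that the state series reports 1 while the state is active. The group name, alert name, and 5-minute delay are hypothetical and should be adapted to your environment.

```yaml
# Sketch only: metric and label names assume a dot-to-underscore conversion
# on export; verify the actual names in your metrics backend.
groups:
  - name: network-device-alerts        # hypothetical group name
    rules:
      - alert: NetworkInterfaceFailed  # hypothetical alert name
        # Fires when a network interface reports the "failed" state
        expr: hw_status{hw_type="network", state="failed"} == 1
        for: 5m                        # illustrative delay to avoid flapping
        labels:
          severity: critical
        annotations:
          summary: "Network interface failed on {{ $labels.instance }}"
```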

Configuring MetricsHub for Network Interface Monitoring

In the example below, we configure MetricsHub to monitor a switch using SNMP v2c.

To monitor our switch:

  1. In the config/metricshub.yaml file, we create the resource alcatel-switch with the following attributes:

    • hostname: alcatel-switch-01
    • host type: network

      resources:
        alcatel-switch:
          attributes:
            host.name: alcatel-switch-01
            host.type: network

  2. We configure MetricsHub to connect to the network switch using the SNMP v2c protocol and the public community:

      protocols:
        snmp:
          version: v2c
          community: public
    

Here is the complete YAML configuration to be added to config/metricshub.yaml to monitor the switch:

  resources:
    alcatel-switch:
      attributes:
        host.name: alcatel-switch-01
        host.type: network
      protocols:
        snmp:
          version: v2c
          community: public
          community: public
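
If the switch answers SNMP on a non-standard port or responds slowly, the `snmp` block may need more than a version and a community. The variant below is a sketch that assumes `port` and `timeout` are accepted under `snmp` in your MetricsHub version; the values shown are illustrative (161 is the standard SNMP port) and should be checked against the MetricsHub configuration reference before use.

```yaml
resources:
  alcatel-switch:
    attributes:
      host.name: alcatel-switch-01
      host.type: network
    protocols:
      snmp:
        version: v2c
        community: public
        port: 161        # assumed setting: UDP port the SNMP agent listens on
        timeout: 120s    # assumed setting: how long to wait for a response
```

Note that `public` is the conventional default community string on many devices, so a dedicated read-only community is generally preferable outside of a test setup.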