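The UFM platform helps research and industry data center operators efficiently provision, monitor, manage, preventively troubleshoot, and maintain InfiniBand data center networks. It comprises multiple levels of solutions and a comprehensive feature set to meet a broad range of modern scale-out data center requirements. With UFM, you can achieve higher utilization of network resources, gain a competitive advantage, and reduce operating expenses.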
Version: UFM Enterprise Appliance Gen 3.0 (NDR) HW UM
Description:
NVIDIA UFM Appliance 3.0 for UFM Telemetry or UFM Enterprise, 1U server with 2x ConnectX-7 NDR single-port 400Gb/s InfiniBand adapter, Secured-boot; UFM software package sold separately
NVIDIA UNIFIED FABRIC MANAGER (UFM) PORTFOLIO
AI-Powered Cyber Intelligence and Analytics Platforms
Data centers host many users and applications and have become the competitive advantage for research organizations and manufacturing companies. Keeping the data center intact and healthy is critical: a data center shutdown can mean the loss of millions of dollars. What’s more, malicious users often exploit data center access to misuse compute resources, for example by running prohibited applications, which raises operating costs.
NVIDIA® UFM® platforms revolutionize InfiniBand network management. By combining enhanced, real-time network telemetry with AI-powered cyber intelligence and analytics, the UFM platforms empower you to discover operation anomalies and predict network failures for preventive maintenance. UFM platforms comprise multiple levels of solutions and capabilities to suit your data center’s needs and requirements. At the basic level, the UFM Telemetry platform provides network validation tools and monitors network performance and conditions. It captures, for example, rich real-time network telemetry information, workload usage data, and system configuration, and streams them to a defined on-premises or cloud-based database for further analysis.
The mid-tier UFM Enterprise platform adds enhanced network monitoring, management, workload optimizations, and periodic configuration checks. In addition to including all of the UFM Telemetry services, it provides network setup, connectivity validation, secure cable management, automated network discovery and provisioning, traffic monitoring, and congestion discovery. UFM Enterprise also enables job scheduler provisioning and integration with Slurm and Platform LSF, as well as network provisioning and integration with OpenStack, Azure Cloud, and VMware.
The enhanced UFM Cyber-AI platform includes all of the UFM Telemetry and UFM Enterprise services. The unique advantages of the Cyber-AI platform are based on capturing rich InfiniBand telemetry information over time and utilizing deep learning algorithms. The platform learns the data center’s “heartbeat,” operation mode, conditions, usage, and workload network signatures. It builds an enhanced database of telemetry information and discovers correlations between events. It detects performance degradations, usage and profile changes over time, and alerts to abnormal system and application behavior, and potential system failures. The Cyber-AI platform can also perform corrective actions.
In addition to detecting past and current events, the Cyber-AI platform can indicate future performance degradations or abnormal usage of the data center computing resources by translating and correlating changes in the data center heartbeat. Such changes and correlations trigger predictive analytics and initiate alerts that indicate abnormal system and application behavior, as well as potential system failures. System administrators can quickly detect and respond to potential security threats and address upcoming failures efficiently, saving OPEX and maintaining end-user SLAs. Prediction accuracy improves over time as additional system data is collected.
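The idea of learning a baseline "heartbeat" and alerting on deviations can be illustrated with a minimal sketch. This is not the Cyber-AI platform's method, which the text describes only as deep learning over telemetry collected over time; it substitutes a simple rolling z-score, and the function name, the sample counter values, and the window and threshold settings are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from a rolling baseline.

    Illustrative only: a rolling z-score stands in for the learned
    "heartbeat"; it is not the UFM Cyber-AI algorithm.
    """
    baseline = deque(maxlen=window)  # most recent samples considered normal
    anomalies = []
    for i, value in enumerate(samples):
        if len(baseline) == window:
            mu = mean(baseline)
            sigma = stdev(baseline)
            # A perfectly constant baseline (sigma == 0) is skipped here.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(i)
                continue  # keep the outlier out of the baseline
        baseline.append(value)
    return anomalies


# Hypothetical per-interval throughput readings for one port (Gb/s).
port_rx_gbps = [100, 102, 99, 101, 100, 98, 250, 101, 99]
print(find_anomalies(port_rx_gbps))  # prints [6]
```

Excluding flagged samples from the baseline keeps a single spike from inflating the variance and masking later deviations; a production system would need a far richer model and many more signals than one counter.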
UFM ENTERPRISE Visualization Dashboard
UFM Features by Edition