Industrial AI Cloud - Network Engineer (REF5506H)

🇭🇺 Hungary

As Hungary’s most attractive employer in 2025 (according to Randstad’s representative survey), Deutsche Telekom IT Solutions is a subsidiary of the Deutsche Telekom Group. The company provides a wide portfolio of IT and telecommunications services with more than 5,300 employees. We serve hundreds of large corporate customers in Germany and other European countries.

DT-ITS received the Best in Educational Cooperation award from HIPA in 2019 and was acknowledged as the Most Ethical Multinational Company in the same year. The company continuously develops its four sites in Budapest, Debrecen, Pécs and Szeged and is looking for skilled IT professionals to join its team.

NVIDIA and Deutsche Telekom are jointly developing the world’s first industrial AI cloud for European manufacturers. This AI factory in Germany will host 10,000 GPUs across NVIDIA DGX B200 systems and RTX Pro Servers. Deutsche Telekom provides secure, sovereign and fast infrastructure, including data centers, operations, security, and AI solutions.

Role Overview

We are seeking a Network Engineer to join a new networking team responsible for the automation and operation of network components such as switches, firewalls, routers and border gateways, which form part of the core environment of the Industrial AI Cloud. In this role you will provision and manage the above-mentioned stack, implement and fine-tune monitoring, and deploy additional components if necessary. You will work with and coordinate between multiple teams (such as Infrastructure and Platform) to deliver and continuously improve infrastructure services following ITIL processes.

Detailed scope of the operations:

98x MQM9700-NS2F (InfiniBand 400G)

138x SN2201 (1G Spectrum based Ethernet switch, Cumulus OS)

8x SN5400 (100G Spectrum-3 based Ethernet switch, Cumulus OS)

101x SN5610 (800G Spectrum-4 based Ethernet switch, Cumulus OS)

4x FortiGate FG-201G

2x FortiGate 4801F-EU

2x Border Gateways - Cisco CR-8608 or Juniper PTX 10004

2x NVIDIA UFM appliance

 

Proprietary technologies used for managing the above scope: InfiniBand, Cumulus OS, RoCE, UFM, FortiGate firewalls, Cisco border gateways.

 

Key Responsibilities

  • Coordinate Operations with the Data Center, IaaS & PaaS layers: Coordinate and support network lifecycle activities (installs, upgrades, changes, firmware updates) and manage network interconnections and related documentation.
  • Switch & Firewall Management: Provision and maintain InfiniBand switches according to ITIL standards.
  • Automation: Develop and maintain automation scripts to orchestrate the overall scope. Fine-tune and apply configuration changes throughout the project lifetime.
  • OS & Firmware Management: Maintain network-based environments, apply patches, and manage firmware upgrades at scale.
  • Monitoring & Observability: Implement and fine-tune monitoring for the network stack.
  • ITIL Processes: Follow and improve incident, problem, and change management workflows; document runbooks and standard operating procedures. Adhere to ZERO Outage guidelines.
  • Cross-Team Collaboration: Work closely with Platform Engineers and AI solution teams to ensure smooth deployments and operations.
  • High-Speed Fabric: Manage a unified network fabric utilizing both InfiniBand and Ethernet / RoCE technologies.
  • Management Network: Manage a separate 1 Gbps Ethernet network and serial console access for out-of-band (OOB) network management.
  • PE/CE Data Center Connectivity: CE routers, firewalls.

What We Offer

  • Work on Europe’s first industrial AI cloud with cutting-edge technologies.
  • Direct collaboration with NVIDIA and Deutsche Telekom experts.
  • Hybrid working model, training opportunities, and career progression.

Required Skills and Qualifications

  • Experience in network installation, maintenance, and operations.
  • Deep understanding of InfiniBand architecture, RDMA over Converged Ethernet (RoCE), and low-latency, high-throughput networking for AI/HPC workloads.
  • Experience with NVIDIA/Mellanox switch configuration and UFM (Unified Fabric Manager) management.
  • Data Center Routing & Border Gateway Protocols: Understanding of Cisco or Juniper routers (e.g., CR-8608, PTX 10004) and BGP/OSPF routing. Knowledge of ASNs, IP transit, peering, and failover connectivity.
  • Linux Networking (Cumulus OS / Ubuntu / Debian): Command-line networking skills on Linux-based systems, especially Cumulus Linux. Experience configuring bridges, bonds, VLANs, and routing tables.
  • Experience using tools such as iperf, ethtool, nvidia-smi for network devices, perfquery, and Mellanox/NVIDIA diagnostics.
  • Skilled in continuous monitoring, incident detection, and root-cause analysis for large-scale data center networks.
  • Familiar with NOC/SOC operational procedures and on-call rotation models.
  • Firewall & Security Management: Proficient in FortiGate firewall administration, including policies, NAT, VPNs, IDS/IPS, and HA configuration. Understanding of security segmentation, DDoS mitigation, and zero-trust networking.
  • Configuration & Lifecycle Management: Hands-on experience in switch provisioning, firmware/OS upgrades, patch management, and configuration backups.
  • Working knowledge of ITIL processes (incident, problem, change).

 

    You will be working in the European Union to meet our customers' data security and privacy requirements.

    * Please note that remote working is only available within Hungary due to European taxation regulations.

    by @maxrusakovic