Job description
At Exness, we are not just a leading trading broker; we’ve reimagined what it takes to be a leader. With 40M+ trades a day and 2,000+ people across 13 countries, we combine scale, care, and real tech to make trading better for 1M+ clients worldwide. Recognised globally as a Best Place to Work, we’re a people-first company where long-term wins always matter more. As part of our team, you will shape the future of fintech with real technology, care, and purpose.
Responsibilities
- Ensure stable, high-availability operation of compute, storage, and virtualization platforms across multiple data centers.
- Plan and execute lifecycle operations for hardware, virtualization clusters, and storage systems, ensuring predictable and low-risk maintenance windows.
- Diagnose and resolve infrastructure issues across hardware, virtualization, and storage layers, collaborating with vendors and internal teams when required.
- Continuously monitor platform performance and capacity usage; identify bottlenecks, forecast resource growth, and provide recommendations for scaling infrastructure.
- Participate in evaluating new technologies, preparing technical designs, and documenting infrastructure standards, troubleshooting guides, operational procedures, and recovery playbooks to ensure operational consistency.
- Develop and maintain automation tools to reduce operational effort and increase platform consistency.
- Support backup and recovery systems, ensuring resilience, data integrity, and the ability to restore services in multi-site environments.
- Act as a subject-matter expert in servers, virtualization, and infrastructure software; assist other teams in troubleshooting and optimizing their systems.
Requirements
- Bachelor’s degree in Computer Science/IT or equivalent.
- 3+ years of experience in IT systems/infrastructure.
- VMware/storage/virtualization certifications (e.g., VCP) are a plus.
- Expertise in VMware ESXi, vCenter, clusters, HA/DRS, patching, upgrades, and virtualization lifecycle operations in high-load, distributed environments.
- Strong technical knowledge across compute, storage, virtualization, networking, and automation; server hardware lifecycle management and performance troubleshooting.
- Experience with Linux administration (RHEL, CentOS, Ubuntu), alternative virtualization platforms (Hyper-V, OpenStack), and cloud platforms (AWS, GCP) is advantageous.
- Solid understanding of enterprise storage, capacity planning, monitoring, LUNs, snapshots, replication, backup, and disaster recovery, plus hands-on familiarity with storage systems (Pure Storage, Dell/EMC, Lenovo).
- Proficient in L2/L3 networking, Fibre Channel fabrics, and network monitoring for virtualization, storage, and multi-site environments.
- Skilled in automation and scripting (Ansible, Shell, Python, PowerShell, Go) and monitoring/observability tools (Zabbix, Grafana, Prometheus).
- Strong operational engineering skills: root-cause analysis, documentation, infrastructure security, change management, physical data center operations, and multi-DC HA environments.
- Analytical, detail-oriented, proactive, calm under pressure, and focused on continuous improvement, reliability, and resilience.
- Collaborative team player with an ownership mindset, effective cross-team and vendor communication, and the ability to adapt.