Senior SRE (blockchain networks)

P2P · remote · senior · full-time
crypto · dev · web3 · Kubernetes · Terraform · Linux · GCP · AWS · Azure · OCI · Prometheus · Grafana · Loki · Go · Python
Job description
We are looking for a Senior Site Reliability Engineer (SRE 3) to join the Launch team at P2P.org. This team is responsible for bringing new blockchain networks into production—from initial design and deployment to ensuring they are stable, observable, and production-ready.
Responsibilities
### Network Launch & Operations

- Lead the end-to-end launch of new blockchain networks, from testnet to mainnet
- Design and implement deployment architectures for validators, full nodes, RPCs, and supporting services
- Ensure all new networks meet production readiness standards: monitoring, alerting, backups, failover, and security
- Collaborate with protocol teams to understand network-specific requirements, risks, and failure modes
- Create repeatable launch patterns and runbooks to reduce time-to-market for new networks

### Infrastructure & Reliability Engineering

- Build and operate infrastructure across cloud and bare-metal environments
- Improve automation and standardisation of deployments using Terraform, Helm, and internal tooling
- Contribute to the internal platform by aligning launches with existing Kubernetes, observability, and delivery standards
- Implement high-availability and fault-tolerant setups for validator infrastructure
- Continuously improve SLOs, SLIs, and alerting for newly launched networks

### Observability & Incident Response

- Ensure all services are fully observable: metrics, logs, and traces
- Define and implement alerts that are actionable and low-noise
- Participate in on-call rotations and incident response
- Lead or contribute to post-incident reviews, focusing on systemic improvements
- Proactively identify and fix reliability risks before they impact production

### Security & Best Practices

- Apply security best practices to all deployments: secrets management, access control, and network isolation
- Ensure compliance with internal standards and contribute to SOC 2-aligned practices
- Support secure key management practices for validator infrastructure

### Collaboration & Ownership

- Work closely with Infrastructure, Core Networks, and Security teams
- Take ownership of deliverables, from design to production
- Contribute to documentation, runbooks, and knowledge sharing
- Support and mentor more junior engineers when needed
Requirements
### Required

- 5+ years of experience in SRE, DevOps, or infrastructure engineering
- Strong experience operating production systems at scale
- Hands-on experience with:
  - Kubernetes (deployment, troubleshooting, operations)
  - Terraform (infrastructure as code)
  - Linux systems and networking fundamentals
- Experience with at least one cloud provider (GCP preferred, AWS, Azure, OCI)
- Experience with observability tooling (Prometheus, Grafana, Loki, or similar)
- Familiarity with CI/CD systems and GitOps workflows (e.g., ArgoCD)
- Solid scripting or programming skills (Go, Python, or similar)
- Experience working in distributed systems or high-availability environments
- Strong debugging and problem-solving skills under pressure
- Good communication skills and ability to work across teams (English B2 minimum)

### Nice to have

- Experience with blockchain infrastructure (validators, RPC nodes, staking systems)
- Experience with bare-metal environments
- Experience with distributed tracing or advanced observability setups
- Exposure to security and compliance frameworks (SOC 2, ISO 27001)
About P2P
P2P GROUP PLC is a dormant public limited company recently incorporated in the UK with no active trading activities. Its nature of business is classified as dormant under SIC code 99999.
AI · Dover, United Kingdom · Founded 2024