Job description
Alpaca is a US-headquartered self-clearing broker-dealer and brokerage infrastructure for stocks, ETFs, options, crypto, fixed income, 24/5 trading, and more. Our recent Series D funding round brought our total investment to over $320 million, fueling our ambitious vision.
Through our subsidiaries, Alpaca is a licensed financial services company, serving hundreds of financial institutions across 40 countries with our institutional-grade APIs. These clients include broker-dealers, investment advisors, wealth managers, hedge funds, and crypto exchanges, together accounting for over 9 million brokerage accounts.
Our global team is a diverse group of experienced engineers, traders, and brokerage professionals who are working to achieve our mission of opening financial services to everyone on the planet. We're deeply committed to open-source contributions and fostering a vibrant community, continuously enhancing our award-winning, developer-friendly API and the robust infrastructure behind it.
Responsibilities
### Your Role:
As a Site Reliability Engineer at Alpaca, you'll help keep our brokerage platform reliable, observable, and operable as we grow - working across our cloud infrastructure, Kubernetes platform, observability stack, messaging layer, and data layer.
### Things You Get To Do:
- Operate production day-to-day - on-call, incident response, postmortems, and the follow-ups that actually close the loop.
- Own reliability practice - define and refine SLIs/SLOs and error budgets, and help product teams live within them.
- Strengthen our observability across metrics, logs, traces, and alerting.
- Ship infrastructure through code in a GitOps workflow - cloud resources and Kubernetes workloads alike.
- Look after PostgreSQL: performance tuning, schema and migration review, online migrations on large tables, HA/DR, and CDC pipelines.
- Mentor engineers on reliability and database fundamentals through code review, design review, and pairing.
Requirements
### Who You Are (must-haves):
- 4+ years in SRE, DevOps, Platform/Infrastructure, or backend engineering with significant production operations ownership.
- Hands-on experience operating production services on Kubernetes, and shipping infrastructure as code in a GitOps workflow.
- Solid working knowledge of PostgreSQL in production - query plans, pg_stat_*, indexing and schema trade-offs, and what a safe online migration looks like on a non-trivial table.
- Cloud networking fundamentals (VPCs, routing, L4/L7 load balancing, DNS, TLS) and comfort debugging cross-service connectivity.
- Comfortable with a modern observability stack and proficient with Linux at the operator level.
- Practiced in incident response - calm under pressure, structured debugging, postmortems that drive change.
- At least working proficiency in Go or Python, plus strong written and verbal communication.
- Genuine interest in databases and in growing your PostgreSQL/DBA expertise.
Conditions
### How We Take Care of You:
- Competitive Salary & Stock Options
- Health Benefits
- New Hire Home-Office Setup: One-time USD $500
- Monthly Stipend: USD $150 via a Brex Card
Alpaca is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse workforce.
About Alpaca
Alpaca is a global leader in brokerage infrastructure APIs, providing access to stocks, ETFs, options, fixed income, and crypto, along with embeddable finance solutions like tokenization and securities lending. It serves retail traders, institutional investors, app developers, and fintech companies worldwide through its API-first platform. Founded in 2015 as a database and machine learning firm, the company is headquartered in San Mateo, California.