Sr DevOps Engineer
Botco.ai
We're looking for a senior DevOps engineer to be our sole infrastructure owner — the person who keeps our platform running, scales our delivery pipeline, and responds when things go wrong in production. This isn't a role where you hand off tickets to a platform team. You'll own the infrastructure end-to-end, work closely with engineers on deployment and reliability, and be the on-call anchor for critical incidents. You're someone who combines deep technical expertise with a collaborative, process-minded approach — you keep the team in the loop, follow change control procedures, and understand that good infrastructure work is as much about communication as it is about execution.
What you'll do
- Own and evolve our cloud infrastructure on AWS (plus some Azure): architecture, provisioning, cost, and security
- Manage and improve our Kubernetes clusters and containerized workloads (Docker)
- Build, maintain, and optimize CI/CD pipelines using GitHub Actions
- Drive infrastructure-as-code practices — everything in version control, nothing snowflaked
- Maintain and improve observability: logging (Grafana + Loki), error tracking (Sentry), alerting, and dashboards
- Serve as the primary on-call engineer for critical production incidents
- Partner with the engineering team on deployment strategy, environment management, and reliability improvements
- Identify gaps proactively and bring well-reasoned recommendations to the team before acting
What we're looking for
- 5+ years of DevOps or infrastructure engineering experience
- Deep hands-on expertise with AWS
- Strong Kubernetes experience: cluster management, networking, resource configuration, troubleshooting
- Docker proficiency — building images, optimizing layers, managing registries
- Solid GitHub Actions experience
- Infrastructure-as-code fluency
- Comfortable being the sole DevOps presence on a team
- On-call ready: you can respond to a production incident outside business hours when needed, stay calm under pressure, and lead the resolution
- Process-minded: you respect change control, loop in the right people before making impactful changes, and document your work as a matter of habit
- Clear, concise communicator: you can explain an outage, a trade-off, or a migration plan to our technical VP of Engineering
- Startup experience — you know how to prioritize ruthlessly and deliver with limited resources
- Comfortable using AI coding assistants (e.g. Claude Code, Copilot) to accelerate infrastructure work
Nice to have
- Experience with HIPAA and/or SOC 2 compliance
- Familiarity with Grafana, Loki, or Sentry in production
- Background supporting Java microservices deployments
- Experience with database operations — RDS, backups, failover, migrations
- Scripting proficiency in Python or Bash for automation
How we work
- You'll be the only DevOps engineer — high autonomy, high ownership, direct impact on the platform
- We follow change control procedures for production — changes are coordinated, communicated, and documented as a team practice
- Code review is a first-class practice here — every PR gets a thorough review, and we expect engineers to engage seriously with feedback and explain their reasoning
- Remote-first and async-friendly; the team spans the Americas
- Linear for project management, GitHub for version control
- Small team, no bureaucracy