Site Reliability Engineer (USA Only - 100% Remote)
Close
ABOUT US
Close https://close.com/ is a bootstrapped, profitable, 100% remote, ~100 person team of thoughtful individuals who prioritize taking ownership and making a meaningful impact. We’re eager to make a product our customers fall in love with over and over again.
We ❤️ small scaling businesses. Since 2013, we’ve been building a CRM that focuses on better communication, without the hassle of manual data entry or a complex UI. We are out to supercharge sales productivity with the most modern, thoughtfully designed, all-in-one, communication-focused CRM.
Our backend tech stack https://stackshare.io/close-crm/close consists primarily of Python Flask web apps, with our TaskTiger https://github.com/closeio/tasktiger scheduler handling much of the backend asynchronous task processing. Our data stores include MongoDB, PostgreSQL, Elasticsearch, and Redis. The underlying infrastructure runs on AWS using a combination of managed services like EKS, MSK, RDS, and ElastiCache and non-managed services running on EC2 instances. We have CI/CD pipelines that build Docker images, run automated tests, and deploy to Kubernetes clusters. We also use these images in our local development environment, which lets us code locally against all of our services. We have a well-documented public API https://developer.close.com/ that is consumed by our front-end JavaScript app as well as numerous integrations. Our infrastructure is heavily automated using Terraform, Ansible, and other AWS tools.
We love open sourcing our code and ideas on our GitHub https://github.com/closeio and on The Making of Close https://making.close.com/, our behind-the-scenes Product & Engineering blog. Check out our open source projects like close-mongo-ops-manager https://github.com/closeio/close-mongo-ops-manager, SocketShark https://github.com/closeio/socketshark, TaskTiger https://github.com/closeio/tasktiger, LimitLion https://github.com/closeio/limitlion and ciso8601 https://github.com/closeio/ciso8601.
ABOUT THE ROLE
You will be joining the Infrastructure Team at Close. This team builds and maintains the platform that runs all Close systems (and do we have a lot of those). Work with us and you’ll be working with:
- Multi-terabyte MongoDB https://www.mongodb.com/, PostgreSQL https://www.postgresql.org/, and Elasticsearch https://www.elastic.co/elasticsearch clusters
- Telemetry systems built on Grafana’s LGTM https://grafana.com/go/webinar/getting-started-with-grafana-lgtm-stack/ stack and ClickHouse https://clickhouse.com/ processing over 130 TB per month
- Multiple Kubernetes https://kubernetes.io/ clusters running tens of thousands of pods
- GitHub Actions https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners and ArgoCD https://argo-cd.readthedocs.io/en/stable/ powered CI/CD that can go from merged, to production, to rolled back in 10 minutes
- A system that is stable, up to date, and hasn’t needed scheduled downt