Senior Cloud Resilience Architect

Blink Health
10 days ago
Full-time
Remote
Worldwide
Remote Other

Company Overview:

Blink Health is the fastest-growing healthcare technology company that builds products to make prescriptions accessible and affordable for everybody. Our two primary products – BlinkRx and Quick Save – remove traditional roadblocks within the current prescription supply chain, resulting in better access to critical medications and improved health outcomes for patients.

BlinkRx is the world’s first pharma-to-patient cloud that offers a digital concierge service for patients who are prescribed branded medications. Patients benefit from transparent low prices, free home delivery, and world-class support on this first-of-its-kind centralized platform. With BlinkRx, never again will a patient show up at the pharmacy only to discover that they can’t afford their medication, their doctor needs to fill out a form for them, or the pharmacy doesn’t have the medication in stock. 

We are a highly collaborative team of builders and operators who invent new ways of working in an industry that has historically resisted innovation. Join us!

Responsibilities

  • Evaluate and mature the organization’s disaster recovery posture, including recovery objectives (RTO/RPO), dependency mapping, and failure domain analysis across applications, data, and infrastructure.

  • Define, document, and establish disaster recovery standards and best practices across cloud infrastructure, platforms, and application architectures.

  • Partner with SRE, platform, security, and product engineering teams to design and implement resilient, fault-tolerant systems, progressing from backup-based recovery to multi-region and active-active architectures.

  • Lead the disaster recovery roadmap, balancing technical feasibility, cost, risk, and business priorities.

  • Design and recommend reference architectures for disaster recovery patterns, including pilot-light, warm standby, hot standby, and active-active.

  • Drive adoption of active-active disaster recovery for critical systems, including traffic management, data replication, consistency models, and automated failover.

  • Define and operationalize testing strategies for DR, including game