Site Reliability Engineer

About Us

We are a passionate team of open source developers with a desire to build a successful and sustainable business that can impact the world at large. Our mission is to create open source, enterprise-grade products that help individuals and organizations unlock their potential and become top performers in their respective domains. To achieve this, we are building a suite of tools that span the entire web development lifecycle, ranging from a best-in-class local development experience all the way through multi-cloud, high-availability hosting (PaaS or self-hosted). To learn more, please visit https://www.ddev.com/, our GitHub (https://github.com/drud/), and governance (https://github.com/drud/community) pages.

Roles and Responsibilities

  • Be professional, courteous, kind and responsive to others you engage with.
  • Integrate with a fast-paced engineering team to design, develop and deliver our local development and hosting products.
  • Help maintain 24×7 uptime on public cloud-based infrastructure.
  • Be a first responder during outages for managed hosting clients and for self-hosting clients with a support package.
  • Help design, build, and maintain solutions around logging, networking, monitoring, security, disaster recovery, etc.

Requirements

A team-centric philosophy and strong emotional intelligence are absolute musts. We have a strong affinity for cloud-native technologies, and so should you. You must love highly distributed, mission-critical computing using modern technologies and languages.

Qualifications

  • Experience managing production Kubernetes clusters.
  • Experience running applications in production on Kubernetes.
  • Must be fluent in at least one programming language such as Python, Go or Ruby.
  • 3+ years in a combination of DevOps, SRE, Software Development or Systems Operations roles.
  • 1+ years of experience managing production workloads on a major cloud provider such as AWS, GCP, Azure or DigitalOcean.
  • Demonstrated understanding of containers and container orchestration.
  • Troubleshooting skills that span systems, network (TCP/IP), and code.
  • Must have experience building or managing large-scale systems and application architectures.
  • Solid understanding of system performance and monitoring.
  • Working knowledge of cloud computing including virtualization, hosted services, multi-tenant cloud infrastructures, distributed storage systems and content delivery networks.
  • Experience working with source control management tools; GitHub experience is a huge plus.
  • Excellent verbal and written communication skills.

Nice to Haves

  • Experience building extensions to the Kubernetes API such as Custom Resource Definitions using tools such as Kubebuilder, Operator SDK or Aggregated API Server.
  • A demonstrated history of working on and contributing to open source projects.
  • Experience working on remotely distributed teams.
  • Experience with load balancers such as Elastic Load Balancer, NGINX, Envoy, HAProxy or Google Cloud Load Balancer.
  • Experience with Cloud Native ecosystem projects such as Cluster Autoscaler, CoreDNS, Pod Autoscaler, etc.
  • Experience with infrastructure configuration and automation processes and tools: Ansible, Fabric, Terraform, Puppet, Chef.
  • Experience with hosting content management applications such as Gatsby, Drupal, TYPO3, etc.
  • Experience with monitoring solutions: Prometheus, ELK, Splunk, Sumo Logic, Nagios or Fluentd.
  • Experience with various data technologies including relational and nonrelational databases and message queues.
  • Experience with distributed storage systems: S3, Ceph, GlusterFS, EFS, EBS or Rook.

Benefits

  • Flexible vacation/time-off.
  • Competitive salaries and performance-based raises.
  • Health, vision and dental insurance.
  • Professional development opportunities.
  • An amazing team of like-minded individuals to create with.

Applications (including a resume, a cover letter, and any additional information that would be relevant to the position) can be sent to careers@drud.com.
