At Canonical it is our mission to make open source software available to people everywhere. We believe the best way to fuel innovation is to give the innovators the technology they need. As a Systems Reliability Engineer (SRE) for the Information Services (IS) team you'll play a key role in driving this mission and helping to define the future of free software.
SREs work closely with development teams to build and maintain the extraordinary infrastructure required to run all of Canonical and Ubuntu’s systems and services. The scope of our responsibility combined with the overall size of our environment means that our SREs face new challenges every day. From developing automated processes for faster, more reliable deployments to building large and scalable cloud environments, every day at Canonical is an opportunity to learn something new and collaborate with some of the most talented technical minds in the industry.
IS supports and maintains all of Canonical’s production services, and IS team members use real-life operational experience to contribute to product improvements. As an SRE you’ll be in a unique position to provide critical feedback to developers by writing code, submitting bugs, and working with others within the company to ensure that Canonical products are as good as they can be. You will also be able to develop and submit fixes and enhancements directly.
KEY RESPONSIBILITIES & ACCOUNTABILITIES
SREs rotate through three roles:
1. Maintaining all core services, networks, and infrastructure (including public and private clouds). The ability to work under pressure and to demonstrate sound problem-solving skills in a fast-paced and complex environment is key here.
2. Working directly with a variety of development teams within Canonical in a DevOps role to test, deploy, monitor, and maintain services running on our production clouds. This requires an overlap of development and administration skills, as you will help write and review the code you then use to deploy and maintain services using Canonical's cloud products.
3. Larger project work, currently focused on large-scale cloud deployments and overall process improvements. This role gives SREs the ability to utilize development and architecting skills in a focused manner that is unique to Canonical.
REQUIRED SKILLS & EXPERIENCE
- Prior experience working in a large, highly available environment
- Flexible and adaptable, with the ability to learn new things quickly
- Strong development skills and experience writing code (Python, Go, Ruby, etc.)
- Heavily focused on automation, preferably with experience in building and maintaining self-service tools
- Authoritative understanding of, and experience with, the administration of infrastructure services such as DNS, DHCP, SSH, Apache/Nginx, HAProxy, Squid/Varnish, PostgreSQL/MySQL, etc.
- Practical knowledge of IP networking and routing
- Strong security focus, including knowledge of network, operating system, and application-level practices
- Familiarity with software development and code review practices, including use of DVCS (e.g. git or bzr)
- Experience deploying, administering and maintaining services in a cloud computing environment
- Able to communicate clearly in English, especially using email and IRC
- College degree in a relevant technical field or equivalent experience
- Self-driven and able to troubleshoot, find answers, and ask others when appropriate
- Motivated, organised, and willing and able to work well remotely within a distributed team
- Able to participate in our weekend on-call rotation (approximately one weekend every 18 weeks)
DESIRED SKILLS & EXPERIENCE
- Prior experience administering OpenStack
- Familiarity with Juju and MAAS
- Familiarity with Ubuntu or Debian
- Prior experience with configuration management tools (Puppet, Chef, CFEngine, etc.)
- Prior experience maintaining and configuring routers and firewalls (Cisco, iptables)
Canonical is an equal opportunity employer.