Site Reliability Engineer - Games

Redmond, WA
Posted By: Stephen Charles-Kendall

Job Description:

343 Industries, the studio developing titles in the Halo universe, is looking for a talented, passionate, and effective Site Reliability Engineer to join the Infrastructure team working on Halo Infinite!

At 343 Industries, our mission is to inspire heroes and deliver wonder. As a developer on the Infrastructure team, you will be part of the backbone of the studio, making sure all teams are able to install game builds and tools and access internal services wherever they are in the world. The ideal candidate will have experience with a broad range of disciplines including modern build pipelines, building scalable services, writing reliable automation, and designing workflows to make other developers’ lives better. We want to empower everyone in the studio to work as efficiently as possible so we can create this amazing Halo universe together.



Responsibilities

Participate in the support rotation with other members of the Infrastructure team.
Handle day-to-day support operations on a large-scale computer farm.
Partner with the 343 IT team to improve farm stability, performance, and maintainability.
Plan, carry out, and support the computer farm's migration to Microsoft Azure where applicable.
Support 343 Infrastructure and partner teams on development initiatives.
Investigate, report, and resolve farm and Infrastructure team issues.
Create, adjust, and monitor Infrastructure team SLAs, SLIs, and SLOs, and work toward resolving any failing indicators.
Create telemetry and dashboards to visualize farm health.

Network Requirements
When working from home or remotely, the following is required for this type of position:

Internet speed of at least 500 Mbps or 1 Gbps, wired connection preferred, to ensure work is not impacted by connectivity or latency issues
Run speed tests repeatedly throughout the day, as speeds can fluctuate during peak hours
Candidates must provide a screenshot of their speed test result
Agencies must include the speed test screenshot as part of the submission process for consideration
If the candidate does not currently meet the network requirement:
The candidate commits to meeting the network requirement by their start date if an offer is received and mutually accepted
Internet upgrade reimbursement eligibility criteria and guidelines are available on the 343 Agency Hub for reference

Minimum Qualifications and Skills

3+ years' experience in software development, OR a Bachelor's Degree in Computer Science, OR comparable
Extensive experience debugging, troubleshooting, and fixing problems in a Windows environment
Proficiency in creating tools in PowerShell, Python or C#
Knowledge of source control systems
Familiarity with cloud platforms and cloud provisioning tools
Experience with configuration management tools

Preferred Qualifications and Skills

Experience with Git and Perforce
Experience with Azure DevOps, Azure Monitor Workbooks, Kusto, and App Insights
Experience with configuration management tools: Puppet, Chef, Ansible, Terraform, Packer
Experience with Kubernetes, Docker, and other container technologies
Experience with SQL / NoSQL databases