Site Reliability Engineer

Permanent

Interactive Business Systems

Site Reliability Engineer, Kansas City, MO

Planet Technology is teaming up with a Kansas City, MO area client to locate and bring on a Site Reliability Engineer for a Direct Hire Opportunity!

What You'll Do:

  • Leading software lifecycle activities and reviewing changes to determine how they may affect site reliability
  • Proactively anticipating production issues, such as outages, slowness, processing delays, errors, and failures, and taking corrective action to prevent them
  • Leading and improving the incident response process, which includes investigating issues, communicating key findings appropriately, and resolving issues, collaborating with subject matter experts as needed throughout the process
  • Designing, implementing, and improving company monitoring and alerting solutions, such as Splunk, Datadog, and CloudWatch
  • Identifying and tracking Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
  • Maintaining a record of system incidents and reliability metrics
  • Creating and improving data-driven analyses that identify patterns and recommend preventative measures against future incidents
  • Leading performance tests to identify bottlenecks, opportunities for optimization, and capacity demands
  • Designing, developing, updating, and supporting software solutions and automation using Python
  • Designing, building, managing, and supporting resources in Amazon Web Services
  • Creating and updating company processes and procedures, documentation, knowledge base articles, and other resources related to Site Reliability Engineering and the incident management process
  • Actively researching new technology and sharing knowledge with team members
  • Participating in on-call rotations and triaging or resolving critical production issues

What You'll Bring to the Role:

  • Bachelor's degree in Computer Science, Engineering, or another relevant field, or relevant work or military experience
  • 6-9 years of experience in IT in a Site Reliability role or a position with similar responsibilities, such as incident management, IT support, or infrastructure and application hosting
  • Passion for building and managing highly available software and systems
  • Experience working with cutting-edge AWS cloud technology in a fast-paced environment
  • Must be proficient in software development, specifically in Python
  • Must have exposure to and proficiency with data analytics
  • Must be AWS Certified

Interested candidates should send a resume and a brief cover note to the attention of Chris Douglas at Planet Technology (cdouglas@planet-technology.com).

About IBS: IBS, Interactive Business Systems, Inc., is an IT solutions and staffing company known for achieving business objectives and bottom-line results through the smart architecting, implementation and management of technology. In three decades of developing the technology applications, tools, environments and teams that foster top business performance, we have become an industry-leading IT services provider. 
