Site Reliability Engineer

Orlando, Florida

Site Reliability Engineer Job Opening in Orlando, Florida - Job Posting ID: CWD14552

Role: Site Reliability Engineer

Location: US-Orlando, FL

Duration: 9 months

Skype Interview

Photo ID

Description:

Site Reliability Engineering (SRE) is an engineering discipline that combines software engineering and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. An SRE within the Engineering Excellence team will focus on expanding our tooling and automation and improving the availability of our systems.

Responsibilities

• Build tools to quickly triage issues and discover failures across hardware, software, applications and network

• Perform in-depth analysis of service trends and implement adjustments to mitigate risk and prevent issue recurrence

• Maintain production systems by measuring and monitoring availability, latency and overall system health

• Provide guidance to software engineers on design patterns that are resistant to failure

• Support 24x7 on-call response to critical operational issues

Basic Qualifications

• Strong technical knowledge of the full digital environment stack, including Mobile, Web, APIs, Messaging, Databases, Networks and their interactions

• Knowledge and understanding of SDLC principles and key controls

• Experience working with and contributing to open source code or frameworks using Git version control

• Strong knowledge of AWS Cloud solutions and product offerings

• Experience with container technologies (e.g. Docker, Kubernetes)

• Strong understanding of monitoring methodologies and proactive monitoring using APM solutions (e.g. AppDynamics, New Relic) or other monitoring and instrumentation technologies

• Required knowledge and understanding of technical architecture, application systems design and integration in a large heterogeneous enterprise environment, with hands-on experience in SOA, Angular/Node, Java/J2EE, Oracle or MySQL/MariaDB programming methodologies

• Experience working in an Agile environment (e.g. Scrum, Kanban)

Preferred Qualifications

• 3+ years programming in one or more of: Java, Node, Python, Perl or C

• 2+ years UNIX systems knowledge and/or systems administration background

• Interest in designing, analyzing and troubleshooting large-scale distributed systems

• Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive

• Experience debugging, optimizing code and automating routine tasks
