ID: RF01308 (96090518)
Lead Operational Readiness Engineer/System Administrator with web development/support, Tomcat, WebLogic/WebSphere, AEM, Unix/Linux, Java/Python/Shell scripting, and DevOps experience
Duration: 12 months
• 5+ years of industry experience supporting high-traffic, scalable, consumer-facing web applications
• 10+ years of industry experience building, developing, and supporting three-tier enterprise applications.
• Strong understanding of Operational Readiness Requirements in the context of large internet-based web applications.
• Strong understanding of monitoring systems and operational needs
• Experience administering and working with web technologies: Tomcat, Apache, WebSphere, WebLogic, AEM (web content systems), Unix/Linux, etc.
• Experience in software/IT process or project management, with direct involvement in implementing processes and systems in a large IT environment
• Proficiency in coding in one or more of the following: Java (ideal), Python, advanced shell scripting
• Understanding of DevOps principles, and the experience to participate readily in a 24/7 on-call rotation when required
• Strong analytical and problem-solving skills and ability to prioritize tasks and work independently.
• Strong interpersonal communication skills and ability to work well in diverse, cross-functional teams of operational engineers, developers, product managers, etc.
• B.S. or M.S. in Computer Science or a related technical discipline preferred
Primary Responsibilities:
• Understand KP.ORG features, the underlying platform, and support requirements
• Oversee and execute the ORR process across multiple groups. This includes integration with testing/release teams, tracking releases and identifying operational risks, and providing feedback for go/no-go decisions.
• Facilitate the update of support documentation
• Update processes and procedures to improve operational team awareness and capabilities for new releases.
• Maintain services once they are in production by measuring and monitoring availability, latency and overall system health.
• Troubleshoot issues across the entire application stack
• Scale systems sustainably through mechanisms such as automation, and evolve systems by pushing for changes that improve reliability and by creating the automation needed for management and visibility of KP.ORG services.
• Drive standardization and documentation efforts across all KP.ORG-wide services in conjunction with other operations support teams across the organization
• Facilitate the knowledge transfer and training of Level 1.5 & 2 support resources on new features/changes