ID: RF01309 (96090518)
Application Operations Engineering Analyst with BlueMix, Cloud Native, AWS, Azure, Google Cloud, Splunk, Foglight, Dynatrace, HP BSM, Java/Python/Shell and Tomcat/WebSphere experience
Pleasanton, CA (ka.org)
Duration: 12 months
• 5+ years of industry experience engineering and supporting high-traffic, scalable, consumer-facing web applications, PaaS, or cloud technologies.
• 10+ years of industry experience building, developing and supporting 3-tier internet enterprise applications.
• Experience in application operations in the cloud: BlueMix or Cloud Native, AWS, Microsoft Azure, Google Cloud, etc.; strong experience designing and managing fault-tolerant web platforms with 99.9%+ availability.
• Strong practical use of statistics and creation of statistics-based models in previous related IT engagements.
• Very proficient in manipulating and analyzing logs and monitoring data from tools such as Splunk, Foglight, Dynatrace, HP BSM, or similar.
• Good interpersonal communication skills and ability to work well in diverse and cross-functional teams of other SREs, infrastructure engineers, developers, IT product managers, etc.
• Understands and can factor in the various caching technologies in use when creating predictive models.
• Strong understanding of capacity management principles in static and dynamic models.
• Strong proficiency coding in one or more of the following: Java (ideal), Python, advanced shell scripting.
• Experience configuring and performance tuning one or more of the following web platforms: Tomcat, Apache, WebSphere, etc.
• B.S. or M.S. in Computer Science or a related technical discipline preferred.
Primary Responsibilities –
• Creates an application system usage projection model based on collecting and analyzing log and monitoring data. Tests and adjusts the model as needed. Models include predictive models of system transactions and user-level transactions, as well as predictive models that help identify potential tipping points or thresholds of the application.
• Measures and predicts traffic and its impact on infrastructure components so that we are not surprised by demand. This allows for mitigation or response, whether the demand relates to single-site usage or other causes.
• Maintains an overall understanding of current system operating behavior and trends.
• Evaluates performance results on new and modified IT services for general understanding and to give feedback. These results also serve as inputs to the models.
• Proactively works with other BlueMix (Cloud Native) application and support engineers to advise on dynamic application capacity and failover techniques/provisioning.
• Monitors growth in the current and future cloud platforms and looks for performance improvements in general, including resource requirements that increase suddenly or seem disproportionate to the services being performed.
• Analyzes and correlates incidents and problems with the models created, in part to head off potential conflicts that could affect the application.
• Assists in assessing the capacity impact of new architectures and of major or potentially impactful environment or application changes.
• Proactively uses application capacity-related data, comparing actual levels against targets and addressing shortfalls or overcapacity. This can aid in the continuous reduction of resources that are no longer needed (for workloads that are moving to other platforms).
• Able to gather and analyze data and create models for both the portal-based legacy environment and the newer IBM BlueMix PaaS-based environments.