Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.
Responsibilities
- Drive the Site Reliability Engineering agenda forward at an Enterprise Level to improve availability, reliability, and performance of services.
- Drive cross-team efforts in resiliency assessment exercises and reporting
- Draft and/or contribute to internal SRE training materials
- Support services before they go live through activities such as Chaos testing (failure injection), system design inputs, developing software platforms and frameworks, capacity planning and launch reviews.
- Engage with product engineering teams to test their services against the relevant Chaos Engineering toolkit.
- Good understanding of CI/CD pipelines and SDLC (application delivery)
- Assist application teams in setting up SLIs, SLOs and error budgets for their systems.
- Participate in Blameless Incident Retrospectives and follow up on action items
- Work with application teams on observability, automated monitoring and auto-remediation of known issues.
- Programming and scripting to automate failure scenarios, integration with pipelines and developing self-service portals.
- Work with teams located across Asia Pacific.
Good to have
- Experience in SRE transformation and adoption for large scale environments
- Experience in one or more of the following: JavaScript, Java and Python.
- Very good analytical and problem-solving skills, with a good understanding of the technical risks arising from architecture decisions.
- Understands key SRE concepts such as Error Budgets, MTTD, MTTR and Launch Control
- Development skills with experience in real time, distributed and highly secured environments.
- Experience with developing test cases and ensuring appropriate test coverage through unit and automated testing.
- Experience with one or more of ELK, Grafana, Prometheus, Dynatrace and AppDynamics.
- Experience with proxies and load balancers such as HAProxy and Nginx.
- Experience with CI/CD pipelines and release strategies
- Systematic problem-solving approach coupled with effective communication skills and a sense of ownership and drive.
- Bachelor's or Master's degree in Computer Science, a related technical field that involves programming, or equivalent practical experience.
- Minimum of 10 years of technology experience (preferably in the financial industry).
- Highly motivated, pro-active and capable of working under pressure without compromising development processes and productivity.
- Strong, committed and reliable team player, able to take direction but also willing to contribute to discussions on design and strategy.
- Strong interpersonal and communication skills, with the ability to build good relationships with the business and other technology groups through day-to-day support and project work.
- Experience with developing applications and setting up automations in a Linux environment, with sound knowledge of algorithms, data structures, complexity analysis and software design.
- Understands complex architectures and is well versed in design patterns.
- Ability to debug and optimize code and to automate routine tasks.
- Interest in financial technologies, new technology tools and the ability to learn.
We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.