Vice President, Site Reliability Engineer Lead, Application Production Services & Engineering
Bank of America
Singapore, Singapore
(on-site)
Posted
9 hours ago
Job Function
Financial Services
Description
At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.
Being a Great Place to Work and providing a culture of caring is core to how we drive Responsible Growth. We are intentional about fostering an inclusive workplace where every teammate has the opportunity to succeed, build a career and contribute to our shared success. This includes attracting and developing exceptional talent, recognizing and rewarding performance, and supporting our teammates' physical, emotional, and financial wellness through affordable, competitive and flexible benefits.
We value the unique perspectives individuals bring from all backgrounds and career paths, whether shaped by military service, community college education, or a wide range of work and life experiences. These journeys foster resilience, leadership and innovation, strengthening our workforce and positively impacting the communities we serve.
Bank of America is committed to an in-office culture that supports collaboration, engagement, and career development. Our approach includes clear in-office expectations, while providing an appropriate level of flexibility based on role-specific responsibilities and business needs.
At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!
Job Description:
This job is responsible for partnering with engineering and technology teams to implement measures prescribed by the Site Reliability Engineer teams it leads. Key responsibilities include ensuring appropriate instrumentation, tooling, ticketing, alerting and on-call routines are in place for key services, demonstrating technical expertise within domains, and decomposing objectives into work units. Job expectations include advancing efficient solution delivery practices and promoting exceptional design, engineering, and organizational practices.
Responsibilities:
- Collaborate with Development and Infrastructure teams to understand technical solutions and implement monitoring capabilities outlined in the application and system monitoring designs put forward by the Senior Site Reliability Engineer (SRE)
- Develop and maintain reliability scripts, tools, and libraries, and leverage them for common instrumentation, automation, and operational needs and when mentoring SRE resources on reliability practices and established tools and capabilities
- Partner to implement code changes that make use of common reliability libraries and tools, and help Application Production Services and Application Development teammates understand how to use them
- Participate regularly in architecture community of practice meetings and in communication via other channels
- Identify vulnerabilities and opportunities for reliability improvement, such as investigating low-level error rates and 'noise' in monitoring, and define solutions to reduce manual support effort and/or improve system reliability
- Engage as a subject matter expert in major incident triage efforts and failure scenario modelling, and work with the Problem Manager to diagnose root causes for major incident and problem management investigations
- Define and maintain a multi-year stability roadmap aligned with business objectives and technology strategy
- Identify critical dependencies, risks, and mitigation strategies across infrastructure, applications, and services
- Work with the architects to develop and adhere to the enterprise architectural patterns and frameworks that enhance system reliability and fault tolerance
- Ensure designs adhere to best practices for high availability, disaster recovery, and performance optimization
- Establish stability metrics, KPIs, and compliance standards for technology teams
- Drive adoption of reliability engineering principles across development and operations
- Partner with engineering, operations, and product teams to embed stability into the software development lifecycle
- Act as a trusted advisor to senior leadership on stability-related initiatives and investments
- Monitor emerging technologies and industry trends to enhance stability strategies
- Lead post-incident reviews and ensure lessons learned are incorporated into future designs
- Develop and maintain a catalog of extensible reliability scripts, tools, and libraries that can be leveraged for common instrumentation, automation, and operational needs
- Partner with infrastructure engineers and application teams to implement the code changes needed to use common reliability libraries and tools, and help Application Production Services (APS) and Application Development teammates understand how to use them
Required Skills:
- 8+ years in technology architecture, reliability engineering, or infrastructure strategy roles
- Proven track record of delivering stability-focused initiatives in large-scale environments
- Strong knowledge of distributed systems, cloud architecture (AWS, Azure, GCP), and microservices
- Experience with reliability engineering, chaos testing, and observability tools
- Ability to influence cross-functional teams and communicate complex concepts to non-technical stakeholders
Desired Skills:
- SRE Certification
- Automation
- Collaboration
- Influence
- Production Support
- Result Orientation
- Analytical Thinking
- Application Development
- Architecture
- Solution Design
- Stakeholder Management
- Adaptability
- DevOps Practices
- Project Management
- Risk Management
- Solution Delivery Process
Job ID: 84371082
Investing in our teammates’ wellness and long-term career growth.
Bank of America has always been the bank of opportunity for our shareholders, our clients and customers, our communities and our teammates.
We’re committed to connecting our nation’s military to the training, education and resources that put them on the path to financial stability. We employ thousands of veterans and military spouses. Building on that, since 2014 we have hired more than 10,000 service members. We also finance and partner with organizatio...
