Systems Development Engineer - Data Center Operations
DESCRIPTION Amazon is building some of the largest distributed systems in the world, and we are looking for talented engineers to support and engineer the next generation of compute and storage platforms. Amazon's Data Center Operations Strategic Engineering (DCOSE) group provides data center support worldwide with a focus on continuous improvement. We have high standards for our infrastructure as well as our team; our systems are highly reliable and available, and they turn scale into an advantage for our business and an asset to our customers. Our teams work collectively, are driven to serve customers, and are fun to work with.
Data Center Operations DevOps Engineering (DCODE) is a team of crafty engineers within DCOSE that focuses on continuous improvement through infrastructure automation and tool development. As an engineer on the team, you will focus on developing internal systems written in Python, Java, C, or a similar language. You should also be sensitive to clients' needs and able to develop effective relationships with them, both AWS customers and the other engineers at AWS who will benefit from the automation you create. This position also requires diving deep into data, analyzing it for trends and systemic issues, and then following our Software Development Life Cycle (SDLC) to develop solutions or effect changes that eliminate problems from our environment. You will also develop front-end and back-end applications and dashboards that enable the business to make informed decisions. Your creativity and understanding of business needs will drive agile development to keep up with ever-changing customer demand. You will also support the underlying infrastructure that hosts our applications through Availability, Performance, and Capacity Management. You will work directly with the various service owners and hardware design teams to collaborate on hardware issues within the fleet. You think proactively and work to prevent support issues before they arise. At the same time, you will work with other Amazon leaders to share ideas and improve support within the company.
• Develop new applications, system management tools, and processes, and enhance existing ones, to reduce manual effort and increase overall efficiency.
• Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic.
• Monitor the health of the fleet, automating system health checks, maintenance tasks, and reporting as needed.
• Perform various system maintenance tasks, including deployments and improving availability and performance of tools.
• Assist in developing methods for incident reduction.
• Monitor various data sources for unidentified fleet issues.
• Manage directly assigned tasks and on-call duties gracefully.
• Collaborate with outside teams to resolve customer issues.
• Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, MIS, or 5+ years of equivalent technology experience.
• 2+ years of experience operating in a Linux environment, including configuration of networking and security.
• 3+ years of experience with Software Development Life Cycle processes.
• 3+ years of experience with software architecture and implementing solutions on Amazon Web Services.
• Experience with some aspect(s) of computer security: network security, application security, security protocols.
• Experience deploying or managing servers in large-scale, geographically diverse environments.
• Deep understanding of enterprise-level server and storage hardware components.
• Excellent troubleshooting and documentation skills.
• Able to show good judgment and instincts in decision making.
• Able to prioritize and perform in complex, fast-paced situations.
• A drive to take ownership of problems and solve them.
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer, and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or other legally protected status.