Senior Site Reliability Engineer, Trello
By Atlassian At San Francisco
3+ years of hands-on experience with public cloud offerings such as AWS, GCP, or Azure
Familiarity with incident management, post-incident analysis, and participation in an on-call rotation
3+ years of experience operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring, tweaking dashboards, defining alerts, writing runbooks, etc.
Engineering microservices and tools across one or more programming languages (e.g. Go, Python, Bash)
Automation and Infrastructure-as-Code projects and tooling (e.g. Ansible, Puppet, Terraform)
Build and maintain a continuous integration and delivery pipeline (e.g. Bamboo, Bitbucket Pipelines, GitHub Actions)
Site Reliability Engineer Jobs
By Lawrence Berkeley National Laboratory At San Francisco Bay Area, CA $9,739 - $11,905 a month
Minimum of three years of experience with UNIX or Linux, networking, and IT infrastructure, plus management experience in a distributed-computing environment.
Knowledge of the processes for standard operating procedures, and best practices for implementation and change management.
Past experience with Incident Management and a good understanding of IT service management.
Experience with network security: configuring/maintaining ACLs, knowledge of firewalls
Bachelor’s degree in Computer Science or a similar discipline, or equivalent years of experience.
Strong hands-on knowledge of the Linux shell and working in a command-line (e.g. SSH) environment.
Staff Site Reliability Engineer
By Netskope At Santa Clara, CA
You will be part of a high caliber engineering team in the exciting space of cloud tools and infrastructure management.
Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis.
You will solve complex, exciting challenges and improve the depth and breadth of your technical and analytical skills
Partner closely with our development teams and product managers to architect and build features that are highly available, performant and secure
Gain deep knowledge of our application stack
Experience improving the performance of microservices and solving scaling/performance issues
Senior System Reliability Engineer
By NVIDIA At Santa Clara, CA $132,000 - $212,750 a year
Good project management skills and ability to balance multiple simultaneous projects during development and production stages.
BS (or equivalent experience) in Engineering, Material Science, Physics, or a related field. MS or PhD preferred.
Deep understanding of, and hands-on experience with, theoretical and practical reliability concepts as they relate to high-tech electronic enterprise and consumer products.
Hands-on experience with Reliability demonstration & testing along with accelerated life methods for components, subassemblies, and complete products.
Good verbal and writing skills as well as the ability to communicate at a high level.
ASQ certification is desired but not a must.
Principal Site Reliability Engineer
By Oracle At Redwood City, CA 94065
Develop and implement various database life-cycle management flows.
Certification of Database products for cloud integration
Participate in Product Feature Review, Certification experiments and User Document reviews.
Research and acquire skills on new technologies as needed from time to time
6-14 years of Oracle database administration experience in large production environments
Hands-on database skills, especially in database and system troubleshooting and administration
Senior Site Reliability Engineer (SRE)
By Apple At San Diego, CA
Experience in a DevOps or SRE role
Experience with modern web-scale services including servers, VIPs, load balancers, proxies
Highly experienced with one of these: Puppet, Chef, SaltStack, Ansible
Bonus: Native Kubernetes implementation including CNI, Kafka, etcd experience
Bonus: Experience with Cisco, Juniper, or Arista routing and switching hardware (+OS), including wireless
Able to write software needed to build and operate a large scale platform 24x7 including the development and staging platforms.
Sr. Site Reliability Engineer
By Rockset At San Mateo, CA $140,000 - $185,000 a year
Experience with Terraform, Salt, Chef, Packer, or similar configuration management tools
Willing to learn new skills and technologies
Bachelor's or Master's degree in Computer Science or a related field, or relevant work experience
Experience as an SRE for 3+ years
Experience building and operating public-facing 24x7 web applications at scale
Experience working with cloud infrastructure and patterns (AWS preferred)
Staff Site Reliability Engineer
By Collective Health At San Mateo, CA 94401 $140,000 - $210,000 a year
Expertise in management and use of relational databases including.
10+ years of work experience in DevOps, Site Reliability Engineering, or Software Engineering.
Experience creating and monitoring SLIs and SLOs in order to set and remain within error budgets.
Experience in supporting customer-facing production systems and responding to incidents as part of an on-call rotation.
Knowledge of data structures, algorithms, distributed systems, and information retrieval.
Experience in diagnosing and resolving incidents that involve application, OS, network, infrastructure, partners, people, and process.
Manager, Site Reliability Engineer - Remote
By KPMG-UnitedStates At San Diego, CA
Experience in supporting various enterprise class solutions and services including Windows server administration and security issue remediation
Be the Technical Lead representing the SRE/Tier 3 team for operational initiatives or project support
Improve reliability, quality, and time-to-market of our suite of software solutions
Create sustainable systems and services through automation and uplifts
Bachelor's degree from an accredited college or university is preferred
Senior Engineer II - Digital Site Reliability
By Lululemon At Seattle $132,300 - $173,500 a year
Contribute to engineering automation, management or development of pre-prod and production systems
Mentor and guide junior team members, sharing knowledge and expertise to foster a culture of learning and continuous improvement.
Eight+ years of engineering experience
Five+ years experience with CI/CD tools, GitLab preferred
Proficiency in at least one programming language (e.g., Python, Go, Java) and experience with scripting and automation.
Acknowledge the presence of choice in every moment and take personal responsibility for your life.
Reliability CAE Senior Engineer I
By Honda Dev. and Mfg of Am., LLC At Raymond
Experience in data analysis and communication of complex information to engineering management is desired.
Experience with the following software or similar is desired
Ability to communicate concerns and ideas in a remote work environment
Education reimbursement for continued learning
BS in Mechanical / Automotive Engineering
Proficient in Microsoft Excel, Word, and PowerPoint
Senior Reliability Engineer Jobs
By Digital Diagnostics, Inc. At Remote
Location – Chicago, IL | Coralville, IA | or Remote-US
What We Have to Offer
Lead or participate in deploying updates or improvements as needed.
Lead or participate in support activities.
Identify performance and scalability bottlenecks in Digital Diagnostics’ global technical infrastructure.
Identify and work to eliminate waste in cloud infrastructure costs.
Senior Site Reliability Engineer/DevOps Engineer
By Zillow At Remote
Knowledge and experience working with microservices
Leverage your knowledge to build technical consensus around architecture and technology choices
Build and manage StreetEasy's cloud infrastructure, contributing to our commitment to reliability and efficiency
A Bachelor's degree in Computer Science or a related technical field, or equivalent practical experience
1-3 years of experience in site reliability engineering, DevOps, or a related field
Experience with cloud service providers, preferably AWS
Senior Site Reliability Engineer
By Adyen At Chicago
Have a good understanding of Infrastructure as Code and experience with configuration management and automation tools such as Puppet and Ansible;
Strong familiarity with SRE practices and methodologies such as defining SLOs, change management processes and incident response;
Together with the team, lead the way in continuously improving our incident management and on-call processes
Have experience with building, operating and troubleshooting large-scale distributed systems spanning multiple data centers across the globe;
Skilled in one or more programming or scripting languages such as Python, Java or Bash;
We use SLOs to drive platform stability and innovation
Cloud Senior Site Reliability Engineer
By Bank of America At New York, NY
Perform deep dives into systemic and latent reliability issues, incident management, problem management
Understanding of cost management, inventory management, FinOps model
Identifying, analyzing, and resolving infrastructure vulnerabilities and application deployment issues.
Evaluating and automating the scaling and capacity requirements within Azure environments
BS/MS degree in Computer Science or related technical field involving systems or equivalent practical experience.
8+ years of hands-on experience maintaining cloud platforms on a major cloud service provider.
Site Reliability Engineer (SRE) - Mid/Senior
By Vanilla Technologies Inc. At Remote
Project management tools such as Jira, Git, and Confluence
Accounting for and addressing software vulnerabilities
Securing infrastructure, applications, and code
Ensuring high SLA for uptime & security
Quick, continuous automation and deployment of updates
Preserving infrastructure and stability of code
Senior Site Reliability Engineer
By NVIDIA At California, United States
BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent experience
Technical leadership beyond development that includes scoping, requirements capturing, leading and influencing multiple teams of engineers on broad development initiatives.
Experience with the ELK and Prometheus stacks as a power user and administrator.
Prior experience driving production issues and helping with on-call support.
Experience with CUDA, PyTorch, TensorRT, TensorFlow, and/or Triton.
Experience with StackStorm and similar automation platforms is a bonus.
Senior Site Reliability Engineer
By Business Wire At United States
Strong experience with AWS cloud infrastructure and container orchestration (Kubernetes, Docker)
Strong experience with monitoring and alerting systems such as Prometheus, Grafana, Nagios, etc.
Strong experience with at least one programming language. Java is highly preferred but other languages such as Python will be considered
Advanced experience with Linux system administration, Java based applications, and network architecture
Ability to work remotely 100%
Excellent health benefits that begin on your first day of employment
Senior Site Reliability Engineer (Remote)
By The Hartford At Hartford, CT
Progressively implement preventative controls and drive increased automation and self-healing capabilities. Continue to improve cost efficiency baselines
Hands-on experience with performance and observability tools such as Dynatrace, Splunk, TrueSight, CloudWatch, CloudTrail, and related tools.
Experience with continuous integration and DevOps methodologies; preferred tools include GitHub, Jenkins, Nexus, Rally, SonarQube, etc.
Knowledge of complex traditional and modern enterprise architectures and systems (understand more than the component itself).
Strong hybrid cloud experience (private and public) across various service delivery models – IaaS, PaaS, SaaS.
Strong communication (verbal and written), collaboration, and negotiation skills, working in a diverse team across business units
Senior Site Reliability Engineer
By Humana At Phoenix, AZ 85050
Detail oriented with excellent organizational and project management skills
Bachelor's degree or equivalent experience
Experience coding in Java, Python, or a similar language
3+ years of experience working with voice technologies / IVR
2+ years of project leadership experience
Project-based experience driving changes and improvements to IVR solutions.
Senior Site Manager Jobs
By RSPB At Boardman, OH, United States
Head up & develop the site's management team to effectively deliver agreed objectives.
Be responsible for ensuring the safe management of all site operations.
The ability & experience to be a natural leader with excellent communication skills, who can successfully engage & influence.
Have applied or can apply a long-term vision approach to ensure best use of resources, balancing conservation management with visitor operations.
Able to understand and manipulate data to inform operational decision-making and ensure sustainable business management.
Have led and monitored compliance in land management obligations and health and safety.
Senior Site Reliability Engineer
By Dremio At Seattle, WA $166,304 - $225,000 a year
Have moderate-advanced experience in Python/Go, and at least reading knowledge of Java.
10+ years of relevant experience in the following areas: SRE, DevOps, Distributed Systems, Cloud Operations, Software Engineering.
Have a systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
Hands-on experience with large-scale production Kubernetes clusters (<=1000 nodes).
Hands-on experience using Honeycomb for OpenTelemetry trace analysis.
Drive continuous improvements to our usage of Kubernetes, our Operators, and the GitOps deployment paradigm.
Senior Site Reliability Engineer
By Akamai, $113,430 - $170,043 a year
Acting as an escalation point for operational, network and managerial teams to ensure network/customer issues are resolved
Have 4 years of relevant experience and a Bachelor's degree in Engineering, Computer Science, or related discipline
Experience in SQL, working in UNIX/Linux environments along with managing and running RDBMS (MySQL, PostgreSQL, etc.) clusters
Possess good experience with Internet protocols (DNS, HTTP, TLS, TCP/IP, SSH, etc.)
Experience with Web Services & Cloud API, cloud computing, hosting, and networking
Have experience in Python and/or Bash scripting
Senior Site Reliability Engineer
By Abbott Laboratories At Abbott Park, IL
Experience with Microsoft Azure DevOps, Release Management Tools
Experience with Windows Server Configuration Management
Experience in IIS Configuration Management
EDUCATION AND EXPERIENCE YOU’LL BRING
Produces and manages infrastructure as code
Manages development, QA, and production environment configuration
Senior Site Reliability Engineer - Apple Maps
By Apple At Cupertino, CA
Incident management experience is a plus.
Cloud Native SRE experience (ideally 5-10 years).
Experience setting up and managing services running on Kubernetes.
Multi-cloud environment experience, such as AWS and Google Cloud, is preferred but not required.
Ability to learn and adapt. Experience matters but curiosity and adaptability are even more important.
Linux System and Network Administration.
Senior Service Reliability Engineer
By Amadeus At Portsmouth, NH 03801
Change and Release Management: Manage and execute change, release and test processes and drive automation of these processes
Develop standardized automation to control, build artifact and deploy managed services
Leverage, improve, design, and implement services that automate application provisioning and manage the underlying infrastructure as a service
Manage application operations for Amadeus Core Services end to end.
Manage the full application stack (OS, Data Bases and Data Stores,
Play a key role in accelerating the organization's ability to deliver changes reliably and consistently to Amadeus Hospitality customers
Senior Site Reliability Engineer
By Microsoft At Redmond, WA 98052 $112,000 - $218,400 a year
6+ years of experience in a site reliability engineering role with large-scale, distributed infrastructures
5+ years’ experience with scripting languages such as PowerShell, Python, etc.
6+ years’ experience troubleshooting, investigating, and fixing production issues in large scale cloud and/or hosted environments
4+ years’ experience building infrastructure using Microsoft Azure technology
Technical Knowledge and Domain-Specific Expertise
This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Are you an experienced Senior Site Reliability Engineer looking for a new challenge? We are looking for a motivated individual to join our team and help us ensure our systems are running smoothly and efficiently. You will be responsible for developing and maintaining our infrastructure, monitoring system performance, and troubleshooting any issues that arise. If you are passionate about technology and have a keen eye for detail, this could be the perfect opportunity for you!

What Skills Are Required for a Senior Site Reliability Engineer?

•Strong knowledge of Linux/Unix administration
•Experience with scripting languages such as Bash, Python, Ruby, etc.
•Experience with automation/configuration management using tools such as Chef, Puppet, Ansible, etc.
•Experience with cloud technologies such as AWS, Azure, Google Cloud Platform, etc.
•Experience with container technologies such as Docker, Kubernetes, etc.
•Experience with monitoring tools such as Nagios, Zabbix, etc.
•Experience with version control systems such as Git, SVN, etc.
•Strong troubleshooting and problem-solving skills
•Excellent written and verbal communication skills

What Qualifications Does a Senior Site Reliability Engineer Need?

•Bachelor’s degree in Computer Science, Information Technology, or related field
•5+ years of experience in a Site Reliability Engineer role
•Experience with DevOps practices and tools
•Experience with database technologies such as MySQL, PostgreSQL, etc.

What Knowledge Does a Senior Site Reliability Engineer Need?

•Knowledge of ITIL best practices
•Knowledge of network protocols and technologies
•Knowledge of security best practices
•Knowledge of software development lifecycle

What Experience Does a Senior Site Reliability Engineer Need?

•Experience with large-scale distributed systems