Site Reliability Engineer Jobs in California, Employment

SaaS Site Reliability Engineer and Automation Developer

By Siemens Digital Industries Software At Costa Mesa, 92627 $116,900 - $210,400 a year

Develop and maintain automation tools, scripts, and frameworks to streamline deployment, configuration management, and monitoring processes.

Design and implement infrastructure solutions using configuration management tools, such as Ansible, Puppet, or Chef.

Proficiency in automation and configuration management tools (e.g., Ansible, Puppet, Chef).

In-depth knowledge and hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud and their scalability features.

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent work experience).

Strong programming skills in languages such as Python, Go, or Ruby.

Senior Site Reliability Engineer, Trello

By Atlassian At San Francisco

3+ years of hands-on experience with public cloud offerings such as AWS, GCP, or Azure

Familiarity with incident management, post-incident analysis, and participation in an on-call rotation

3+ years experience operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring, tweaking dashboards, defining alerts, writing runbooks, etc.

Engineering microservices and tools across one or more programming languages (e.g. Go, Python, Bash)

Automation and Infrastructure-as-Code projects and tooling (e.g. Ansible, Puppet, Terraform)

Build and maintain a continuous integration and delivery pipeline (e.g. Bamboo, Bitbucket Pipelines, GitHub Actions)

Site Reliability Engineer, Product - USDS

By TikTok At Los Angeles $119,000 - $289,000 a year

Gain a solid understanding of the various components and services that power the TikTok experience

Maintain services to meet service-level agreements (SLAs) and service-level objectives (SLOs) by measuring and monitoring availability, performance, and overall system health

Scale systems sustainably through mechanisms such as automation; evolve system reliability, efficiency, and velocity by pushing for changes

Provide user support, incident response, and postmortems

Site Reliability Engineer, Systems

By Anthropic At San Francisco, CA

Automate operations and infrastructure management

Have significant experience with Kubernetes and cloud-native infrastructure

Have strong communication skills to work with a range of technical and non-technical colleagues

Python and Linux SysAdmin skills

Significant experience with Kubernetes architecture and administration

Strong Linux skills and cloud infrastructure expertise

Site Reliability Engineer (L4/5) - Core

By Netflix At Los Gatos, CA

Experience in risk management and/or analysis

Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks

Read signals and metrics to develop deeper insights into our customers’ quality of experience to help inform business decisions

Strong writing and presentation skills

Development experience with Java, JavaScript/Node.js, Python, Go

Knowledge of cloud platforms (e.g. AWS, GCP) and microservices architecture

Software Engineer III, Site Reliability Engineering, Google Cloud

By Google At Sunnyvale, CA, United States

Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.

2 years of experience with data structures/algorithms and software development in one or more programming languages.

2 years of experience designing, analyzing, and troubleshooting large-scale distributed systems.

Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback.

Master's degree in Computer Science or Engineering.

Write product or system development code.

Site Reliability Engineer Jobs

By Sohum Inc At San Francisco Bay Area, United States

Full time opportunity that offers excellent benefits.

• Configuration Management and IaC - Salt, Pulumi (Terraform will work)

• Bachelor’s degree in CS / other highly technical discipline, or equivalent experience

• 5+ years of experience and 3+ years of experience as a Site Reliability Engineer

• Strong networking and firewall knowledge

• Exceptional problem solving and troubleshooting skills

Site Reliability Engineer Jobs

By WalkWater Technologies At Cupertino, CA, United States

Experience with SSL/mTLS and certificate management

Hands-on experience with cloud orchestration platforms such as Kubernetes or Nomad

Setting up CI/CD pipelines using GitHub hooks, TeamCity, Docker, and Artifactory

Familiarity with load balancers, traffic-envoys, and proxies

Familiarity with Java runtime / JVM

Familiarity with observability systems such as Prometheus or OpenMetrics

Staff Site Reliability Engineer

By Netskope At Santa Clara, CA

You will be part of a high caliber engineering team in the exciting space of cloud tools and infrastructure management.

Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis.

You will solve complex, exciting challenges and improve the depth and breadth of your technical and analytical skills

Partner closely with our development teams and product managers to architect and build features that are highly available, performant and secure

Gain deep knowledge of our application stack

Experience improving the performance of microservices and solving scaling/performance issues

Site Reliability Engineer Jobs

By Lawrence Berkeley National Laboratory At San Francisco Bay Area, CA $9,739 - $11,905 a month

Minimum of three years of experience in UNIX or Linux, networking, and IT infrastructure, plus management experience in a distributed-computing environment.

Knowledge of the processes for standard operating procedures, and best practices for implementation and change management.

Past experience with Incident Management and a good understanding of IT service management.

Experience with network security: configuring/maintaining ACLs, knowledge of firewalls

Bachelor’s degree in Computer Science or a similar discipline, or equivalent years of experience.

Strong hands-on knowledge of the Linux shell and working in a command-line (e.g. SSH) environment.

Principal Site Reliability Engineer

By Oracle At Redwood City, 94065, CA

Develop and implement various database life-cycle management flows.

Certification of Database products for cloud integration

Participate in Product Feature Review, Certification experiments and User Document reviews.

Research and acquire skills on new technologies as needed from time to time

6-14 years of Oracle database administration experience in large production environments

Hands-on database skills, especially in database and system troubleshooting and administration

Senior Site Reliability Engineer (SRE)

By Apple At San Diego, CA

Experience in a DevOps or SRE role

Experience with modern web-scale services including servers, VIPs, load balancers, proxies

Highly experienced with one of these: Puppet, Chef, SaltStack, Ansible

Bonus: Native Kubernetes implementation including CNI, Kafka, etcd experience

Bonus: Experience with Cisco, Juniper, or Arista routing and switching hardware (+OS), including wireless

Able to write software needed to build and operate a large scale platform 24x7 including the development and staging platforms.

Sr. Site Reliability Engineer

By Rockset At San Mateo, CA $140,000 - $185,000 a year

Experience with Terraform, Salt, Chef, Packer, or similar configuration management tools

Willing to learn new skills and technologies

Bachelor's or Master's degree in Computer Science or a related field, or relevant work experience

Experience as an SRE for 3+ years

Experience building and operating public-facing 24x7 web applications at scale

Experience working with cloud infrastructure and patterns (AWS preferred)

Staff Site Reliability Engineer

By Collective Health At San Mateo, 94401, CA $140,000 - $210,000 a year

Expertise in management and use of relational databases.

10+ years of work experience in DevOps, Site Reliability Engineering, or Software Engineering.

Experience creating and monitoring SLIs and SLOs in order to set and remain within error budgets.

Experience in supporting customer-facing production systems and responding to incidents as part of an on-call rotation.

Knowledge of data structures, algorithms, distributed systems, and information retrieval.

Experience in diagnosing and resolving incidents that involve application, OS, network, infrastructure, partners, people, and process.

Manager, Site Reliability Engineer - Remote

By KPMG-UnitedStates At San Diego, CA

Experience in supporting various enterprise class solutions and services including Windows server administration and security issue remediation

Be the Technical Lead representing the SRE/Tier 3 team for operational initiatives or project support

Improve reliability, quality, and time-to-market of our suite of software solutions

Create sustainable systems and services through automation and uplifts

Bachelor's degree from an accredited college or university is preferred

Are you looking for an opportunity to join a fast-paced and innovative team? We are looking for a Site Reliability Engineer to join our team and help us ensure our systems are running smoothly and efficiently. You will be responsible for monitoring, troubleshooting, and resolving any issues that arise with our systems. You will also be responsible for developing and implementing strategies to improve system reliability and performance. If you are a self-starter with a passion for problem-solving and a knack for automation, then this is the job for you!

A Site Reliability Engineer (SRE) is responsible for ensuring the reliability, performance, and availability of a company’s websites, applications, and services. They are responsible for developing and maintaining automation tools, monitoring systems, and other processes to ensure the reliability of the company’s services.

What Skills Does a Site Reliability Engineer Need?

• Knowledge of Linux/Unix systems

• Knowledge of scripting languages such as Bash, Python, and Ruby

• Knowledge of distributed systems and cloud computing

• Knowledge of monitoring and logging tools such as Nagios, Splunk, and ELK

• Knowledge of configuration management tools such as Chef, Puppet, and Ansible

• Knowledge of container technologies such as Docker and Kubernetes

• Knowledge of version control systems such as Git

• Ability to troubleshoot and debug complex systems

• Ability to work in a fast-paced environment

What Qualifications Does a Site Reliability Engineer Need?

• Bachelor’s degree in Computer Science, Information Technology, or related field

• 5+ years of experience in a DevOps or SRE role

• Experience with automation and configuration management tools

• Experience with monitoring and logging tools

• Experience with container technologies

• Experience with version control systems
