System Reliability Operations Engineer

Company: Disney
Address: Lake Buena Vista
Expires 2023-10-09
Posted 8 months ago
Job Description

Job ID: 10050587
Location: Lake Buena Vista, Florida, United States
Business: The Walt Disney Company (Corporate)
Date posted: Jul. 31, 2023

Job Summary:

Within Disney Enterprise Technology, the Disney Technology Operations Command Center (DTOC) is a 24x7x365 critical services operations center responsible for service availability, with a primary focus on rapidly responding to, correlating, and reducing the impact of outages. We are accountable for identifying and facilitating the resolution of service-impacting events, and for collaborating with other technology teams to prevent future impact through proactive event management and incident and problem analysis. DTOC drives the execution of the major incident process, including communication to executives and key partners and ownership and implementation of Crisis Management plans and processes. DTOC also provides ongoing first- and second-level technical support for requests, performs validation procedures for routine system/service checks, and proactively monitors significant business events.

System Reliability Operations (SRO) Engineers ensure all processes and functions within our environment operate correctly and efficiently by monitoring, identifying issues, and coordinating with other technologists across segments to fine-tune system operations and resolve service interruptions. This role is responsible for the end-to-end reliability and operations of IT services and for providing consultation and training to other clients and segments across Disney. SROs consistently and reliably triage reported or automated incidents, apply recovery procedures, and engage domain experts to restore steady-state operations. Additionally, this position will drive service improvement initiatives through proactive monitoring and through enhancement actions that address gaps identified through analytics and problem management.

Responsibilities:

  • Proactively identify, diagnose, fix, and resolve infrastructure, application, and IT operations issues in collaboration with other IT support teams
  • Implement and maintain technology observability and alerting solutions to provide real-time insights into system health, performance, and compliance
  • Effectively apply Problem & Incident Analysis techniques during an incident and post-incident
  • Ensure that all DTOC services are designed to deliver the levels of availability required by the business
  • Develop, implement, and maintain automation tools and scripts to improve the efficiency and reliability of IT operations and infrastructure
  • Perform DR/BCP activities for critical events and emergency onsite response
  • Identify and drive service availability improvement opportunities by promoting leading practices
  • Monitor the performance and availability of enterprise applications, systems, and infrastructure, ensuring they meet or exceed established service level objectives (SLOs)
  • Identify service improvement opportunities through trend analysis, proactive techniques, and after-action reviews
  • Address outages in a timely fashion, ensuring work streams progress toward resolution in line with department procedures while communicating business impacts
  • Analyze and publish operational utilization and service performance metrics

Required:

  • 2+ years of incident recovery experience, including demonstrated experience with Service and Event Management tools
  • Demonstrated experience in systems integration, application infrastructure support, and middleware operations
  • Experience in enterprise IT operations including system administration, application platforms, infrastructure, networking fundamentals, and IT service management
  • Experience working in a 24x7 IT operations environment
  • Experience with hands-on support of cloud operations (AWS, Google Cloud, Azure)
  • BA/BS in Computer Science, Engineering or related field; or equivalent work experience
  • Solid understanding of observability, monitoring, and alerting tools (e.g., Splunk, New Relic, Grafana, ELK Stack, Datadog)
  • 2+ years of experience supporting converged infrastructure stacks including application, compute, storage, and networking
  • Experience with network technologies (WAN/LAN, wireless infrastructure, DNS/DHCP, load balancers, accelerators)
  • Proficiency in one or more scripting/automation languages (e.g., Python, PowerShell, Bash, Ruby)
  • Experience with x86 hardware technology, Windows, Linux, RISC operating systems, P-Series hardware, SAN, NAS, and data protection technologies
  • Strong technology problem-solving and analytical skills, with the ability to quickly diagnose and resolve technical issues

Preferred:

  • Master’s degree in a technical field
  • Certification(s) in Kepner-Tregoe, ITIL Foundations (V3), operating systems, visualization, and/or hardware platforms