
Critical Environment Technician Manager

Microsoft
United States, Virginia, Boydton
Nov 16, 2024
Overview

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day and we need you as a Data Center Critical Environment Technician Manager. Microsoft's Cloud Operations & Innovation (CO+I) is the engine that powers our cloud services. As a CO+I Critical Environment Technician Manager, you will perform a key role in delivering the core infrastructure and foundational technologies for Microsoft's online services including Bing, Office 365, Xbox, OneDrive, and the Microsoft Azure platform.

As a group, CO+I is focused on the personal and professional development for all employees and offers trainings and growth opportunities including Career Rotation Programs, Diversity & Inclusion trainings and events, and professional certifications.

Our infrastructure is comprised of a large global portfolio of more than 200 datacenters in 32 countries and millions of servers. Our foundation is built upon and managed by a team of subject matter experts working to support services for more than 1 billion customers and 20 million businesses in over 90 countries worldwide. With environmental sustainability and optimization at the forefront of our datacenter design and operations, we continue to grow and evolve as we meet the ever-changing business demands that hold Microsoft as a world-class cloud provider.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

People Management

Managers deliver success through empowerment and accountability by modeling, coaching, and caring.
Model - Live our culture; Embody our values; Practice our leadership principles.
Coach - Define team objectives and outcomes; Enable success across boundaries; Help the team adapt and learn.
Care - Attract and retain great people; Know each individual's capabilities and aspirations; Invest in the growth of others.

Equipment and Systems Operations

Works on complex, advanced tasks (e.g., stabilization, resolution, recovery) independently. Serves as an operations specialist in a major area of operations (e.g., electrical, mechanical, controls, generators) and provides oversight and training/mentorship to team members on tasks regarding these subsystems. Demonstrates an understanding of and operates equipment and systems across all disciplines (e.g., electrical, mechanical, controls) with knowledge of the interactions between them and overall operation of a data center. Operates all systems and equipment in a safe and professional manner.

Takes part in, and oversees team in the inspection and supervision of critical environment-related facility equipment (e.g., controls, heating, ventilation, and air conditioning [HVAC], mechanical systems), building, and grounds regularly for unsafe or abnormal conditions to develop and analyze trends. Understands critical system alarms for multiple discipline(s) of equipment, their meanings, and engages with appropriate escalation processes or procedures. Recognizes circumstances where execution would be considered safe to proceed. Performs various inspections and validations of equipment performance. Monitors the performance from central monitoring locations (i.e., Facility Operations Centers) of maintenance and operations of equipment (e.g., electrical, mechanical, fire/life safety) and understands risks or impacts to other subsystems across the data center. Escalates per applicable policies and standards.

Utilizes telemetry, control systems, and other platforms to monitor site status, analyze past and current events, as well as other processes, and can identify all alarms. Uses technical expertise, prior experience, and device analytics to recognize trends with equipment behavior and checks potential issues as they arise. Advises less experienced colleagues on issues found while monitoring applicable CE systems. Ensures all monitoring equipment repair, replacement, and maintenance work meets or exceeds Microsoft Service Level Agreement (SLA) requirements. Uses data trends to develop or produce predictive analyses of equipment performance.

Utilizes internal computerized maintenance management system (CMMS) to track all equipment assets and to complete work order requests for maintenance work. Tracks utilization and time tracking results for team members, within applicable task management systems. Reviews and validates added required data, documents, changes logged, and procedures upkeep details (related to building management systems and reports) conducted by team members. Guides and coaches team in CMMS usage best practices. Audits and reviews team's data to ensure performance adherence to assigned work tasks. Generates reporting from CMMS for outstanding and ongoing work orders. Generates ad hoc or custom CMMS reports as needed for management. Reviews spare equipment and parts, updates stock, and signals stock replenishment with suppliers and/or coordination with Materials Handling partners.

Safely and quickly responds to and leads an onsite incident response team for all abnormal conditions that impact operations, and coordinates with other critical facilities professionals to perform corrective repairs, without supervision. Gathers necessary information and creates incident timelines/data, root-cause analyses, and/or action items following an abnormal condition as required.
Identifies and contacts/engages appropriate parties to mitigate incidents as they occur, and collects or provides appropriate escalation details or context to support incident management requirements. Develops new or follows preexisting emergency operating procedures (EOPs), methods of procedure (MOPs), standard operating procedures (SOPs), and digital methods of operating procedures (DMOPs) in relation to incidents. Directly provides, leads, and coordinates emergency monitoring response plans for irregular or malfunctioning conditions. Oversees team during emergency incident management, assigning tasks for corrective repair as appropriate. Communicates directly to leadership, advises the business on the current conditions, safety concerns, and next steps for incident mitigation and/or resolution. Oversees incident detection, analysis, and containment efforts. Serves as technical expert in ensuring emergency operating procedures (EOPs) are consistent with proper incident response.

Equipment and Systems Maintenance

Guides, oversees, and performs various types of maintenance (e.g., planned, predictive, corrective) and repairs for multiple disciplines and multiple equipment types of increasing complexity with no supervision, while serving as a subject matter expert for one discipline - in consideration of Task Hazard Analysis (THA), Method Statement of Work (MSOW), or varying permit requirements. Communicates and/or escalates maintenance activities per established process and procedure. Prioritizes maintenance activities as required and/or appropriate. Documents tasks or issues during maintenance activities within appropriate systems per process and procedure as needed. Provides consultation to colleagues on maintenance and repairs through deep understanding of equipment, systems and their interrelations. Follows recommended maintenance schedules. Oversees everyday, complex, large-scale tasks for a single discipline or equipment across disciplines.
Ensures follow-up action items are addressed in a timely manner. Masters the maintenance of all systems and equipment in a safe and professional manner and understands levels of risk (LORs) associated with varying types of maintenance across all disciplines and supports team members with subject matter expertise. Plans, coordinates, and presents maintenance items for review and approval in their area of responsibility.

Acts as a subject matter expert, performing troubleshooting independently for multiple equipment, systems, subsystems, and component types. Documents issues found in troubleshooting process within appropriate systems per process and procedure as needed. Ensures equipment and system settings are consistent with established parameters and designs. Determines when troubleshooting efforts are deemed adequate and communicates or escalates to suppliers, engineers, or more experienced colleagues as needed. Has a hands-on understanding of how equipment in all disciplines works and how to troubleshoot to subsystem level. Provides consultation to less experienced colleagues with troubleshooting systems and problems. Ensures troubleshooting completion paths are escalated appropriately and successfully by their team members. Oversees and advises their team members on troubleshooting systems and/or investigates root causes.

Provides and/or assigns team members to provide necessary escort to third-party contractors, sub-contractors, vendors, and service providers on site for all procedure levels of risk (LOR). May take part in getting third-party work underway (e.g., making sure systems are properly energized/deenergized), ensuring the work is started and completed in a safe manner in accordance with standard practices, procedures, and Authority Having Jurisdiction (AHJ) regulations. Ensures work performed by suppliers/vendors is performed to scope, all documentation is performed correctly, and escalates as appropriate.
Engages to address and resolve circumstances when supplier/vendor work has been stopped to address potential and/or identified concerns. Coordinates and/or assigns team members to coordinate across all LOR applicable to preventative and/or corrective maintenance. Identifies and recommends procedure corrections if/when errors are detected or when appropriate. Coordinates and schedules supplier/vendor on-site activities. Coordinates with vendor to schedule maintenance and determines availability of equipment/parts, as directed. Resolves or escalates observed vendor quality issues. Reviews and approves vendor supplier field service reports, invoices, and work orders.

Prepares and submits highly complex reports as assigned following preexisting scripts and templates, or using ad hoc methods required to support trending and analysis (e.g., Root Cause Analysis [RCA] reports) and may review prior reports delivered by less experienced team members. Develops methods of operating procedure (MOPs), standard operating procedures (SOPs), and/or digital methods of operating procedures (DMOPs) for highly complex and/or interdependent equipment and disciplines to ensure safe and reliable execution. Reviews completed work using approved tools and procedural templates from less experienced technicians for accuracy and completeness. Provides assignment and completion oversight of team members for mandatory, technical, and procedural training material. Assigns and oversees procedure development and reporting to team members. Analyzes findings from reports and documents observations.

Processes method statement of work (MSOW) documents. Leads and provides guidance on coordinating activities and establishing associated schedules with contractors. Performs inspections of equipment in a facility. Participates in testing and commissioning activities. Advises engineer partners or project management colleagues on project scope process or execution methodology.
Presents for review and approval MSOW in their area of responsibility.

Critical Environment Culture

Understands, follows, and ensures safety and security requirements (e.g., job hazard assessments [JHAs], toolbox talks), and business processes and procedures are met, to properly perform work in a safe, quality, and reliable manner in accordance with applicable Authority Having Jurisdiction (AHJ) regulations, and Microsoft requirements. Recognizes safe versus unsafe working conditions and responds accordingly (e.g., stop/pause tasks, stand down vendors where necessary). Escalates immediately when unsafe working conditions are observed and promotes a safe working culture to empower less experienced team members. Participates in required meetings, trainings, and necessary handoffs. Assesses and identifies appropriate resources and equipment necessary to fully support environmental health and safety (EHS) objectives. Actively maintains safe working conditions at all times. Proactively ensures safety and security requirements are followed and met for the work of themselves and others.

Physical Requirements (Applies to but is not limited to US-based Data Center roles)

Occasional climbing of ladders.
Frequent climbing of stairs and/or ramps.
Prolonged standing.
Occasional lifting 50 lbs. / 22.5 kg.
Occasional push or pull 50-75 lbs. / 22.5-34 kg. with assistive device.
*Normal visual acuity (near, far and peripheral with correction).
*Normal color vision for electrical work.
*Normal is defined via standard medical terms and applicable criteria.

Other

Embody our culture and values