
Research Software Engineer

Oak Ridge National Laboratory
United States, Tennessee, Oak Ridge
1 Bethel Valley Road
Apr 21, 2026

Requisition Id: 16231

Overview:

The National Center for Computational Sciences (NCCS) at Oak Ridge National Laboratory (ORNL), which hosts several of the world's most powerful computer systems, is seeking highly qualified individuals to play a key role in developing and deploying data management tools and persistent services that support scientific and AI/ML campaigns running on NCCS computing infrastructure, including the world's first exaflop system, Frontier.

The Team:

As a Research Software Engineer (RSE) in the Data and Platform Services (DAPS) group, you will work within the HPC Operations Section and collaborate closely with other Lead Software Engineers on various software engineering projects. The DAPS group designs and operates data management platforms, tools, and services for the end-to-end data lifecycle, from ingestion to publication, and supports several large initiatives and facilities at ORNL. Our primary development and deployment platform is the Oak Ridge Leadership Computing Facility (OLCF) Slate Service, built on Kubernetes and Rancher, which provides a container orchestration service for running critical operation applications and user-managed persistent applications alongside our OLCF supercomputer systems and other OLCF-managed HPC clusters.

The Role:

As a Research Software Engineer, you will implement, operate, and maintain federated data platforms, data management portals, data processing pipelines, API gateways, and persistent services for the entire data lifecycle on our on-premises Kubernetes clusters, with a strong focus on scalability, reliability, and maintainability. You will also assist with AI initiatives at OLCF, integrate key data engineering and MLOps technologies, be an individual contributor for small to medium-sized projects, and collaborate with platform engineers in delivering a robust set of production services for OLCF users. This role requires significant experience with full-stack application and API development and working knowledge of Kubernetes and containerization. Knowledge of current AI/ML tools and workflows is preferred but not required.

Major Duties/Responsibilities:

Application Development and Deployment

  • Gain working expertise with tools and technologies for federated data management (e.g., Pelican, Rucio), data catalogs (e.g., CKAN, DKAN, Schema.org), streaming data (e.g., Kafka), and data movement (e.g., XRootD, Globus, S3).
  • Design and implement web portals and API services for data management using a combination of modern web technologies.
  • Develop, implement, and maintain Kubernetes deployment recipes for data portals, catalogs, API gateways, and other ancillary services like key-value stores and databases.
  • Implement solutions for MLOps, including model lifecycle management and storage, as well as integration with existing platforms like MLflow.

Collaboration

  • Partner closely with internal platforms, cybersecurity, and account management teams to ensure the platform meets security, compliance, role-based access controls, and usability expectations.
  • Participate in cross-functional projects related to platform enhancements and cluster lifecycle automation.
  • Represent the DAPS team to internal collaborators and partners across the lab.

Basic Qualifications:

  • BS degree and 3+ years of relevant experience, or equivalent experience.
  • At least two years of experience with data management platform and tools development.
  • At least two years of experience with full-stack application and API development.
  • Experience with CI/CD tooling, GitOps, and Kubernetes.
  • Experience with code review and familiarity with tools like Git, GitHub, and GitLab.

Preferred Qualifications:

  • Excellent interpersonal and communication skills and the ability to work as part of a team.
  • Experience implementing and maintaining highly available systems/services.
  • Experience with PHP, Python, and modern JavaScript frameworks (React, AngularJS, Node.js).
  • 5+ years of experience in addition to the degree.
  • Experience with modern software practices such as test-driven development and Agile software development, and a firm, proven knowledge of software development lifecycles.
  • Demonstrated activity within the broader open-source software community.

This position will remain open for a minimum of 5 days, after which it will close when a qualified candidate is identified and/or hired.

We accept Word (.doc, .docx), Adobe (unsecured .pdf), Rich Text Format (.rtf), and HTML (.htm, .html) up to 5MB in size. Resumes from third party vendors will not be accepted; these resumes will be deleted and the candidates submitted will not be considered for employment.

If you have trouble applying for a position, please email ORNLRecruiting@ornl.gov.

ORNL is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. UT-Battelle is an E-Verify employer.
