Assignment Description

Our client in the automotive industry is currently seeking a Storage and Compute Engineer.

You will be at the forefront of High-Performance Computing (HPC) infrastructure in the Autonomous Drive sector. Your primary responsibility will be ensuring the seamless daily operations of their HPC environments and providing support to users within the business to maximize the value and performance of the solution.

The team currently manages an existing cluster and is embarking on a project for the next-generation High Throughput Cluster (HTC), featuring over 100 PiB of storage and CPU/GPU nodes.

Responsibilities:

  • Collaborate with technical engineers in various business areas to enhance methods, workflows, application optimization, automation, etc., thereby improving overall user experience and efficiency.
  • Serve as a local Subject Matter Expert (SME) for all matters related to HPC/HTC, including Linux, configuration management, schedulers, hardware, and applications.
  • Assist in developing standards and procedures and work with external providers to implement them.
  • Participate as an SME in both internal IT and business projects.
  • Work with external service providers to assist in advanced troubleshooting, review and approve root cause analysis reports, etc.
  • Participate as an SME and team member in the development of the next-generation HPC/HTC solution.
  • Collaborate closely with Autonomous Drive & Advanced Driver Assistance Systems (AD and ADAS) developers and engineers.
  • Optimize code and strategies for HPC execution as an HPC SME.
  • Recommend best practices and conduct workshops and training sessions.
  • Gather and document requirements, identify solutions, and set up proofs of concept.
  • Assist in workload management and related tasks.
  • Contribute to the continuous improvement and expansion of Storage and Compute (S&C) solutions.
  • Engage in ongoing operations with S&C suppliers.
  • Operate with a high level of trust and empowerment to resolve issues and ensure the business receives the HPC services it needs.

Requirements:

To succeed in this role, you should be proficient in Linux and fluent in Python and/or other scripting languages. You should have experience working with HPC technologies and be familiar with schedulers such as Slurm, PBS, or LSF, container tools such as Docker or Singularity, and infrastructure automation tools such as Terraform or Ansible.

Preferred Qualifications:

  • Relevant Linux experience.
  • Experience in HPC administration in an enterprise environment, including network, server, storage, and client administration.
  • Degree in IT, Computer Science, or equivalent experience.
  • Experience with storage concepts such as scale-out NAS (NFS), SMB, parallel filesystems (Lustre), POSIX, and object storage.
  • Experience with parallel computing (HPC/HTC), performance troubleshooting & optimization, and operating environments at large scale.
  • Professional proficiency in English.

Desired Attributes:

  • Excellent communication skills and a positive attitude.
  • Organizational skills with a focus on results and quality.
  • Ability to coordinate and manage multiple tasks simultaneously.
  • Curiosity and willingness to learn and improve.
  • Experience with Agile methodologies.

Details

Reference: 40229

Location: Göteborg

Extent: 100%

Start date: Immediately

End date: 2024-12-31

Consultant broker

This position is no longer open for applications.