Responsible for administering, designing, and implementing Linux High Performance Computing (HPC) clusters. Performs system administration duties on a Linux HPC cluster, including cluster management, virtualization, cluster usage monitoring, health monitoring, job scheduling, application integration and installation (open source as well as vendor supported), and application performance. Improves cluster performance through kernel changes, firmware updates, library stack changes, and application container management with tools such as Singularity or Docker. The professional also leads, oversees, and maintains the multiuser computing environment as per the requirements of the organization. The individual in this position must have strong technical knowledge of virtual machines and of the IaaS, PaaS, and SaaS models. Individuals with Bright Computing experience are preferred.
- Performs system administration duties on a Linux HPC cluster, including cluster management, virtualization, cluster usage monitoring, health monitoring, job scheduling, and application integration and installation.
- Manages hardware and software applications in the production environment provided to HPC users.
- Installs software and updates, and facilitates the acquisition of hardware and software products and services for the HPC cluster.
- Knowledge of SLURM and other open-source job schedulers.
- Compiles, configures, and integrates open-source applications into the HPC environment.
- Builds, customizes, and deploys clusters for customers end to end (as a turnkey solution).
- As a cloud administrator, assists in setting up public and private (hybrid) cloud systems, and manages operating systems and specialized software such as MATLAB, TensorFlow, Ansys Fluid, etc.
Qualification and skill requirements
- Proven experience as an HPC administrator.
- Degree in computer science or electronics engineering from a reputable institute, with good grades.
- Experience with operating systems (Red Hat Linux, CentOS, SUSE, Windows).
- Must have a strong command of SLURM, Bright Computing, Spectrum LSF, Spectrum Scale, Lustre, TrinityX, OpenPBS, and BeeGFS.
- Must have a command of GPU slicing, PCI passthrough, GPU libraries, and drivers.
- Should be conversant with configuring VMs and ESXi.
- Scripting in Bash, Perl, PowerShell, and Python.
- Experience with Docker, Kubernetes and other container runtimes.
- Experience with High Performance Data Analytics (HPDA) technologies.
- Strong organizational and project management skills, preferably conversant with Jira, Redmine, etc.
- Excellent verbal communication skills.
- Good problem-solving skills.
- Certifications: Bright Computing, RHCSA, Oracle HPC Associate.
- 4 to 5 years in the field of system administration, with a minimum of 3 years of demonstrable experience in HPC administration.
- Foreign-qualified, experienced professionals are preferred.
- The hired individual will undergo a training period of 2 to 3 months, during which the engineer will be on probation. The final appointment will be confirmed after completion of the training period.
- Members of the Pakistani diaspora are highly encouraged to apply.
- All jobs are contractual (with a career-track option).