Principal Software Engineer - Azure Managed Lustre Filesystem

Microsoft Pittsburgh, PA

Company

Microsoft

Location

Pittsburgh, PA

Type

Full Time

Job Description

The Azure Managed Lustre Filesystem (AMLFS) team leads development, deployment, and monitoring of the most popular High-Performance Computing (HPC) parallel filesystem in the world: Lustre. This is a Principal Software Engineer position for the Filesystem subteam of AMLFS whose mission includes but is not limited to: cluster architecture and design of Lustre in the Azure ecosystem, development of novel Hierarchical Storage Management (HSM) technology between Lustre and other Azure storage platforms, performance analysis and optimization of AMLFS, and customer support for the most challenging parallel filesystem bugs or performance anomalies that arise within our product.

As a Principal Software Engineer on the AMLFS Filesystem team, you will lead design and development of key features for releases of our product, aligned with the subteam mission described above. You will work primarily within the Lustre filesystem itself, Lustre userspace libraries, and a host of first- and third-party software that surrounds and supports the filesystem, especially as it relates to HSM. You will also perform kernel and userspace debugging for problems in these components, whether they arise internally or from customer escalations, and drive these problems to root cause and solution in concert with other engineers. You will also engage with open-source Lustre community members to raise problems, file bugs, perform reviews, and upstream changes our team makes to Lustre itself. This opportunity will allow you to further develop expertise in HPC filesystem design, implementation, and debugging, work with a wide audience to help decide direction and prioritization of key product features, and hone leadership qualities as you work with other engineers to implement such features. The AMLFS team is located in Pittsburgh, Pennsylvania; however, we remain open to highly aligned and experienced candidates working fully remotely.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities:

  • Partners with appropriate stakeholders to determine user requirements for a set of scenarios.
  • Leads identification of dependencies and the development of design documents for a product, application, service, or platform.
  • Leads by example and mentors others to produce extensible and maintainable code used across products.
  • Leverages subject-matter expertise of cross-product features with appropriate stakeholders (e.g., project managers) to drive multiple groups' project plans, release plans, and work items.
  • Holds accountability as a Designated Responsible Individual (DRI), mentoring engineers across products/solutions, working on-call to monitor system/product/service for degradation, downtime, or interruptions.
  • Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale and shares knowledge with other engineers.

Qualifications:

Required/Minimum Qualifications:

  • Bachelor's Degree in Computer Science or related technical discipline AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, or Python
    • OR equivalent experience.
  • 4+ years of software engineering experience with the Lustre parallel file system OR an equivalent parallel file system.
  • 1+ years of experience debugging Linux kernel software, including but not limited to: use of GNU Debugger (gdb) to analyze a kernel memory/crash dump and general knowledge of the tools available during kernel debugging (e.g., Diagnostic Messages (dmesg), System Control (sysctl), kernel boot parameters, module loading, and kernel tracing facilities).

Other Requirements:

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred/Additional Qualifications:

  • Bachelor's Degree in Computer Science or related technical field AND 10+ years technical engineering experience with coding in languages including, but not limited to, C, C++, or Python OR Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, or Python
    • OR equivalent experience.
  • Experience working, developing, and debugging within a Linux operating system environment and at least broad understanding of Linux kernel fundamentals.
  • Experience with filesystem design, development, and debugging.
  • General exposure to high-performance computing OR distributed systems in an industry or academic setting.
  • Deep experience debugging kernel software, including but not limited to: use of kgdb/gdb to analyze a kernel memory/crash dump, use of ftrace, kprobes, uprobes, or eBPF to perform kernel tracing/logging, and general knowledge of the tools available during kernel debugging (e.g., dmesg, sysctl, kernel boot parameters, module loading, etc.).
  • Experience with performance analysis and root-cause analysis of distributed or complex systems.

Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $133,600 - $256,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $173,200 - $282,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

#azurecorejobs

Date Posted

06/06/2023

