Applied Scientist - LLMs

Remote

Full Time

Experienced

Semantic Health is on a mission to improve care delivery and reduce operational inefficiencies by transforming the use of unstructured data in healthcare's revenue cycle. Our machine-learning-powered medical coding and auditing platform uses cutting-edge deep learning to streamline manual and error-prone medical coding and auditing processes in health organizations. We help health organizations improve data quality, optimize reimbursements, and enable real-time access to actionable data for use across the health system.

At Semantic Health, we combine the clinical and business expertise of doctors and successful entrepreneurs with the technical skill set of top ML researchers. We are backed by leading institutional investors who have driven companies our size to multi-billion-dollar valuations.

We’re seeking a motivated and driven scientist to work on our foundation language models. We value scrappy, product-focused thinkers who thrive on solving complex and challenging problems.

As an Applied Scientist - LLMs, you will:

Implement, train, fine-tune, and optimize LLMs for specific applications in healthcare, ensuring their accuracy and efficiency.
Utilize Kubernetes and containerization for the deployment and scaling of LLMs across multiple GPUs and servers.
Oversee the process of data preprocessing, ensuring the quality and relevancy of data fed into the LLMs.
Apply methods such as Retrieval-Augmented Generation (RAG) and Direct Preference Optimization (DPO) for enhancing model outputs.
Stay current with the latest advancements in open-source LLM technologies.
Work closely with ML engineers, software engineers, and healthcare professionals to integrate LLM solutions into practical applications.

You will have experience in:

Applying and refining large language models, preferably with 2+ years in the field.
Kubernetes, container technologies, and multi-GPU deployment strategies.
Python, with experience in frameworks like PyTorch or TensorFlow. Familiarity with CUDA for GPU acceleration.
Data cleaning, preprocessing, and manipulation, with at least 2 years of work in this area.
Fine-tuning LLMs for specific tasks or industries, particularly in a Q&A context.
LLM frameworks and models like LangChain, OpenLLM, vLLM, and Llama 2.

Bonus points if you have experience with:

Healthcare datasets, medical terminology, and compliance requirements (e.g., HIPAA).

Compensation will be competitive with the market and will consist of cash and stock options, as well as benefits. We are currently a remote team and plan to continue working remotely.

We are an Equal Opportunity Employer. This company does not and will not discriminate in employment and personnel practices on the basis of race, sex, age, disability, religion, national origin, or any other basis prohibited by applicable law. Hiring, transfer, and promotion practices are performed without regard to the above-listed items.
