Merck, a leading pharmaceutical company, is seeking interns to join its Computational Toxicology group. Successful candidates will work on projects that apply machine learning and natural language processing to improve drug safety assessment and reduce reliance on animal testing.
Responsibilities
Build and evaluate predictive models for toxicity risk and mechanistic hypothesis generation using multimodal pharmaceutical data.
Develop NLP/LLM pipelines (prompting, fine-tuning, RAG) to mine unstructured reports and literature.
Prepare dashboards (e.g., Streamlit) and present results to scientists and leadership.
Qualifications
Required
Pursuing BS/MS/PhD in Computational Science, Data Science, Bioinformatics, Computational Biology/Chemistry, Computational Linguistics, or related field.
Must be proficient in Python, with a solid grounding in statistics/ML and data wrangling (pandas/SQL).
Must have hands-on experience with scikit-learn and at least one deep learning framework (PyTorch or TensorFlow).
Must be comfortable with Git/GitHub and reproducible workflows.
Must have experience with NLP and LLM tools (e.g., Hugging Face, spaCy, NLTK) and techniques such as prompt design, fine-tuning, and retrieval-augmented generation (RAG).
Must be available for 12 weeks, beginning June 2026.
Preferred
Benefits
Merck is a biopharmaceutical company that offers medicines and vaccines for various diseases.