RELX is a provider of information-based analytics for professional and business customers. As a Data Scientist Health Intern, you will engage in the full project cycle, collaborating with data scientists to analyze large datasets, build models, and create strategies while being mentored through two projects.
Responsibilities
Analyzing large datasets including cleaning, processing, and analyzing data to extract meaningful insights.
Building predictive models and testing their accuracy and effectiveness.
Assisting with the creation of new strategies to address business challenges or opportunities.
Writing code to implement models and strategies, debug technical problems, and document improvements.
Making an impact by delivering features, refactoring existing code, and actively maintaining and extending our current systems.
Working closely with other developers and stakeholders to learn and understand our goals and workflows, your tasks, and our agile-based development approach.
Qualifications
Required
Be a current student pursuing a bachelor's or master's degree in a related field (e.g., computer science, data science, computational biology, bioinformatics, computational linguistics, physics, mathematics, or statistics) and graduating May 2026 or later.
Proven hands-on development experience in relevant implementation platforms for ML/NLP tasks.
Demonstrate experience or keen interest in working with 'big data' and applying advanced algorithms specifically within the Health Sciences domain.
Be familiar with Unix systems, open-source software, Jupyter Notebook hubs, libraries, and cloud computing environments.
Be proficient in supervised and unsupervised learning, with expertise in model building, validation, and testing using state-of-the-art ML algorithms (e.g., Random Forest, SVM, Logistic Regression, Bayesian modeling).
Possess the ability to collaborate effectively with various stakeholders and convey complex technical concepts to non-technical professionals through solid communication and documentation skills.
Adopt pragmatic approaches when choosing and implementing the right technologies to solve problems, and develop with success metrics in mind.
Preferred
Strong proficiency in Python (preferred), SQL, and R.
Experience in building and deploying deep learning models, neural networks, and advanced transformer language models, with particular emphasis on generative AI models such as GPT and Claude.
Benefits
RELX is a provider of information-based analytics for professional and business customers.