Senior Data Scientist, Brain Tumor Institute
United States, D.C., Washington
Description
The Brain Tumor Institute (BTI) Bioinformatics Core at Children's National Hospital is seeking a highly skilled Senior Bioinformatics Scientist/Engineer to join our team. This position will play a critical role in advancing the research of multiple PIs focused on uncovering oncogenic mechanisms in pediatric brain tumors and identifying novel therapeutic targets. The Senior Bioinformatics Scientist will engage in basic and translational research projects and contribute to tool development, such as interactive applications for visualizing complex genomic data. The role involves close collaboration with researchers and clinicians both within Children's National and at external partners.

The successful candidate will report to the Director of the BTI Bioinformatics Core and lead workflow creation and implementation using CWL and/or Nextflow, benchmark new core pipelines, contribute bioinformatics analyses to focused projects based on PI needs, participate in collaborative activities in the BTI such as code review and/or workshop training, and contribute to grant applications and scientific manuscripts. In addition, this candidate will support core engineering needs such as database/API/UI development and automation.

Key Responsibilities:
* Collaborate with bioinformatics scientists and PIs to benchmark and optimize new production-scale analysis pipelines and workflows that generate high-quality, high-integrity outputs.
* Support project-specific engineering needs, such as database/API/UI development.
* Collaborate with IT to ensure AWS IAM and bucket security and to optimize resource use.
* Create and maintain clear documentation for data engineering workflows, including codebases, data pipelines, validation, testing, and CI/CD processes.
* Perform high-quality bioinformatics analyses on pediatric oncology datasets, including genomic, transcriptomic, and epigenomic data.
* Design and implement downstream analytical workflows for high-throughput data using GitHub, Docker, and AWS infrastructure, focusing on reproducibility, code efficiency, and scalability.
* Utilize cloud-computing environments (e.g., AWS EC2) and/or high-performance computing (HPC) to support large-scale or memory-intensive analyses.
* Actively and positively participate in sprints and code reviews, ensuring high standards for reproducibility and documentation.
* Engage with multidisciplinary teams, providing bioinformatics expertise to support collaborative research initiatives.

Application Process:
This position will be remote. Candidates should be prepared to share their GitHub handle and present a recent project as part of the interview process.

Build scalable, production-ready machine learning and statistical models to improve healthcare data latency through automation. This role will focus on advanced statistical and machine learning solutions: collecting, cleansing, and interpreting large volumes of data from varying sources; designing and delivering production-ready models; and monitoring and maintaining models' health in production, all while communicating key findings to stakeholders.

Qualifications
Preferred Skills:
* Ph.D. in Bioinformatics, Computational Biology, or a related field, or equivalent industry experience.
* At least ten years of experience in bioinformatics, including cancer, with expertise in Bash, R or Python, and RShiny and/or Python GUI applications.
* Proficiency with cloud-based or high-performance computing environments for bioinformatics workflows.
* Strong experience with tools and best practices for reproducibility, including Git and Docker.
* Proven experience with genomic data types such as single nucleotide variants (SNVs), copy number variants, fusions, RNA expression, methylation, proteomics, splicing, and single-cell datasets.
* Commitment to open science practices, including sharing and collaborating on code, data, and documentation.
* Extensive experience with current standard parallel computing and data processing workflows (e.g., Snakemake, Nextflow, CWL, WDL).
* Experience diagnosing and troubleshooting pipeline errors and unexpected behaviors. This includes taking initiative, whether by debugging, searching online, contacting software authors, or otherwise seeking assistance as needed.
* Experience with reproducible pipeline development, including software version control, use and creation of Docker and/or Singularity images, and collaborative code review.
* Demonstrated ability to develop and implement best practices for bioinformatics systems integration, testing, and deployment is required.
* Interest in learning AWS cloud architecture, design, and automation.
* Strong organizational and project management skills, with the ability to work on multiple projects and teams.
* Excellent communication skills, with the ability to work in cross-disciplinary teams.

Minimum Education
Organizational Accountabilities
Teamwork/Communication
Performance Improvement/Problem-solving
Cost Management/Financial Responsibility
Safety
Primary Location: District of Columbia-Washington
Work Locations: Research & Innovation Campus, 7144 13th Place NW, Washington 20012
Job: Information Technology
Organization: Ctr Cancer & Immunology Rsrch
Position Status: R (Regular) - FT - Full-Time
Shift: Day
Work Schedule: 9:00-5:30 PM
Job Posting: Mar 27, 2025, 6:13:13 PM
Full-Time Salary Range: 109116.8 - 181854.4