General Dynamics Information Technology Bioinformatics Data Scientist - Cancer Data Indexing (remote) in Wichita, Kansas
Type of Requisition: Regular
Clearance Level Must Be Able to Obtain: None
Public Trust/Other Required: NACI (T1)
Job Family: Bioinformatics
We are seeking a Bioinformatics Data Scientist to support the National Cancer Institute’s (NCI’s) Cancer Data Aggregator (CDA) activities. As a Bioinformatics Data Scientist, you will perform high-impact dataset identification, manipulation, indexing, and loading (i.e., Extract, Transform, and Load, or ETL) so that cancer researchers can efficiently find a wide range of data types (genomic, proteomic, clinical, imaging, …) on the Google and Amazon clouds for their research.
You will be an integral member of a prestigious multi-institute team of researchers who are driving how NCI and NIH-funded researchers worldwide find and use cancer data on commercial clouds. You will also support NCI's mission to lead, conduct, and support cancer research across the nation and the world to advance scientific knowledge and help all people live longer, healthier lives.
You will work with a small, Agile team to productize a search interface that brings together data from multiple NCI sources. Your scientific acumen and bioinformatics data manipulation and programming skills will provide the unique insights that positively impact the continued success of the project.
IN THIS ROLE, A TYPICAL DAY WILL INCLUDE:
Programmatically exploring datasets and annotations available at NCI-based data stores
Interacting with data repositories to understand the nuances of their data indexing and improving how data is received and used
Establishing professional relationships with a broad spectrum of respected scientific teams
Analyzing, organizing, and programmatically transforming indexed dataset descriptions for loading into a production cloud-based indexing system
Leading the development and maintenance of productized processes and workflows on Google Cloud that allow newly released data to be indexed quickly and robustly
Collaborating with the CDA Development and NCI Data Standards Service teams to build a shared model for search across multiple NCI-based data stores
Maintaining and updating the CDA ETL code and GitHub repository
Documenting ETL processes
REQUIRED QUALIFICATIONS AND SKILLS:
BS or equivalent and five years of related experience
Experience in programmatically utilizing and manipulating multiple types of datasets
Proficiency in Python, UNIX/LINUX, and SQL
Experience programmatically manipulating data and making it usable for querying (ETL)
Ability to obtain an NIH Public Trust
DESIRED QUALIFICATIONS AND SKILLS:
Master's or Ph.D. in bioinformatics, computational biology, computer science, or a related field
Experience programming in commercial cloud environments – preferably Google
Experience productizing research software into robust, maintainable processes
Fluency with cancer research data
COVID-19 Vaccination: GDIT does not have a vaccination mandate applicable to all employees. To protect the health and safety of its employees and to comply with customer requirements, however, GDIT may require employees in certain positions to be fully vaccinated against COVID-19. Vaccination requirements will depend on the status of the federal contractor mandate and customer site requirements.
We are GDIT. The people supporting some of the most complex government, defense, and intelligence projects across the country. We deliver. Bringing the expertise needed to understand and advance critical missions. We transform. Shifting the ways clients invest in, integrate, and innovate technology solutions. We ensure today is safe and tomorrow is smarter. We are there. On the ground, beside our clients, in the lab, and everywhere in between. Offering the technology transformations, strategy, and mission services needed to get the job done.
GDIT is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status, or any other protected class.