Data Scientist, Senior - TS/SCI
Job Description
Overview
BigBear.ai is seeking a Data Scientist to support data collection, ingestion, validation, and loading of optimized data into the appropriate data stores. The candidate identifies and implements solutions for the data requirements, including building pipelines to collect data from disparate external sources and implementing rules to validate that expected data is received, cleansed, transformed, and delivered in an optimized output format for the data store.
This position allows some remote work but reports on-site at one of the customer's locations in the D.C., MD, or VA area, and requires a TS/SCI with the ability to obtain a CI Poly.
What you will do
- Analyze unstructured and semi-structured data using techniques such as latent semantic indexing (LSI), entity identification and tagging, complex event processing (CEP), and natural language processing (NLP), applying analysis algorithms on distributed, clustered, and cloud-based high-performance infrastructures.
- Exercise creativity in applying non-traditional approaches to large-scale analysis of unstructured data, supporting high-value use cases visualized through multi-dimensional interfaces.
- Perform exploratory data analysis (EDA): analyze and investigate data sets and summarize their main characteristics, often using data visualization methods.
- Handle processing and index requests against high-volume collections of data and high-velocity data streams.
- Research and test novel machine learning approaches for analyzing large-scale distributed computing applications.
- Suggest innovative and creative concepts and ideas that would improve the overall platform.
- Maintain a strong understanding of recent trends in the telecom industry and technology adoption.
- Apply strong technical and computational skills (engineering, physics, mathematics) and the ability to design, develop, and deploy sophisticated applications and prototypes using advanced unstructured and semi-structured data analysis techniques in high-performance computing environments.
- Utilize advanced tools and computational skills to interpret, connect, predict, and make discoveries in complex data and deliver recommendations for business and analytic decisions.
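The EDA duty above can be sketched in a few lines of Pandas. This is an illustrative example only; the dataset and column names (`source`, `latency_ms`) are hypothetical, not from any customer system.

```python
# Minimal EDA sketch: summarize a data set's main characteristics
# (dtypes, missing values, distinct values) using Pandas.
# The sample data below is invented for illustration.
import pandas as pd

def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Return a per-column summary: dtype, missing count, unique count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),
    })

events = pd.DataFrame({
    "source": ["feed_a", "feed_b", "feed_a", None],
    "latency_ms": [12.5, 40.0, 7.25, 19.0],
})
print(summarize(events))
```

In a Jupyter Notebook, the same summary would typically be followed by visualization calls (histograms, box plots) to inspect distributions before modeling.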
What you need to have
- Bachelor's Degree and 5 to 8 years of relevant experience.
- Clearance: Active TS/SCI with ability to obtain a CI Poly.
- Full time work in a SCIF is required, but hours are flexible.
- Experience with data transport and transformation APIs and technologies such as JSON, XML, XSLT, JDBC, and REST.
- Experience with cloud-based data analysis tools including Hadoop, Mahout, Accumulo, Hive, Impala, Pig, and similar.
- Experience with visual analytic tools like Microsoft Pivot, Palantir, or Visual Analytics.
- Experience with open-source text processing tools such as Lucene, Sphinx, Nutch, or Solr.
- Experience with entity extraction and conceptual search technologies such as LSI and LDA.
- Experience with machine learning, algorithm analysis, and data clustering.
- Experience with Python scripting, Pandas, and Excel.
- Experience with Jupyter Notebooks.
- Experience with data analysis, engineering, and enrichment.
- Strong testing and verification skills.
- Ability to work in a fast-paced, high visibility environment.
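The data transport and validation requirements above can be sketched with the standard-library `json` module: parse an incoming record, check that expected fields are present and well-typed, and emit a cleansed row. The schema here (`id`, `source`, `value`) is hypothetical, not an actual customer schema.

```python
# Illustrative ingest-and-validate sketch: parse a JSON record,
# verify expected fields and types, and return a cleansed row.
# EXPECTED is an invented schema for demonstration purposes.
import json

EXPECTED = {"id": int, "source": str, "value": float}

def validate_record(raw: str) -> dict:
    """Parse a JSON record and verify it carries the expected fields."""
    record = json.loads(raw)
    for field, ftype in EXPECTED.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], ftype):
            raise TypeError(f"wrong type for field: {field}")
    # Cleanse: keep only the expected fields, normalize the source name.
    return {
        "id": record["id"],
        "source": record["source"].strip().lower(),
        "value": record["value"],
    }

row = validate_record('{"id": 7, "source": " Feed_A ", "value": 3.5}')
print(row)
```

In a production pipeline the same pattern would apply per message against a high-velocity stream, with failed records routed to a quarantine store rather than raising.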
What we'd like you to have
- Security+ or another currently active IAT Level II/III certification.
- ArcGIS automation and modeling
- Experience with SAFEHOUSE web GIS platform
About BigBear.ai
BigBear.ai delivers AI-powered analytics and cyber engineering solutions to support mission-critical operations and decision-making in complex, real-world environments. BigBear.ai's customers, which include the US Intelligence Community, Department of Defense, the US Federal Government, as well as customers in manufacturing, healthcare, commercial space, and other sectors, rely on BigBear.ai's solutions to see and shape their world through reliable, predictive insights and goal-oriented advice. Headquartered in Columbia, Maryland, BigBear.ai is a global, public company traded on the NYSE under the symbol BBAI. For more information, please visit: http://bigbear.ai/ and follow BigBear.ai on Twitter: @BigBearai.
Date Posted
12/12/2023