Data/AI (DA)
DCWF 423
Data Scientist
Uncovers and explains actionable insights from data by combining scientific method, math and statistics, specialized programming, advanced analytics, AI, and storytelling.
Tasks
The concrete work activities defined for this role in the DCWF v5.1 spreadsheet. Core tasks are required for the role; additional tasks are associated but not mandatory.
- T5030 additional Analyze data sources to provide actionable recommendations.
- T506 additional Design, develop, and modify software systems, using scientific analysis and mathematical models to predict and measure the outcomes and consequences of design.
- T5120 additional Conduct hypothesis testing using statistical processes.
- T5550 additional Program custom algorithms.
- T5640 additional Utilize technical documentation or resources to implement a new mathematical, data science, or computer science method.
- T5853 additional Build predictive, prescriptive, or descriptive models in collaboration with stakeholders.
- T5854 additional Collaborate with appropriate personnel to address Personal Health Information (PHI), Personally Identifiable Information (PII), and other data privacy and data reusability concerns for AI solutions.
- T5884 additional Evaluate energy implications (graphics processing unit, tensor processing unit, etc.) when designing AI solutions.
- T5896 additional Maintain current knowledge of advancements in DoD AI Ethical Principles and Responsible AI.
- T5906 additional Plan and conduct complex analytical, mathematical, and statistical research that informs operational requirements.
- T5907 additional Plan, coordinate, and execute complex studies using advanced data modeling techniques and procedures, data trend analysis, and data algorithms.
- T5924 additional Train and evaluate machine learning models.
- T5927 additional Write and document reproducible code.
- T6790A additional Utilize open source languages, as appropriate, and apply quantitative techniques (e.g., descriptive and inferential statistics, sampling, experimental design, parametric and non-parametric tests of difference, ordinary least squares regression, general line).
- T7002 additional Assist integrated project teams in identifying, curating, and managing test data.
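As an illustration of what T5120 (hypothesis testing) and T5927 (reproducible code) can look like in practice, the following is a minimal sketch of a two-sided permutation test, one kind of non-parametric test of difference. It is not part of the DCWF. The function name `permutation_test`, its parameters, and the sample values are assumptions made up for this example, and the fixed seed is one possible way to keep a resampling result repeatable.

```python
import random
from statistics import mean


def permutation_test(sample_a, sample_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    observed = mean(sample_a) - mean(sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)

    extreme = 0
    for _ in range(n_resamples):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical measurements, invented for illustration only.
baseline = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
treated = [12.9, 13.1, 12.7, 13.0, 12.8, 13.2]
observed_diff, p = permutation_test(treated, baseline)
```

A small estimated p-value suggests the observed difference would be unusual if group labels were interchangeable. Choosing the number of resamples and the significance threshold remains an analyst decision that this sketch does not settle.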
Knowledge, Skills, and Abilities
KSA statements define what a person filling this role knows or can do. "Knowledge" is what they must know, "Skill" is what they can perform, and "Ability" is a durable capacity they bring to the work.
- A1120 ability core Ability to interpret and incorporate data from multiple tool sources.
- A3080 ability core Ability to use and understand complex mathematical concepts (e.g., discrete math).
- A6050 ability core Ability to build complex data structures and high-level programming languages.
- A6060 ability core Ability to collect, verify, and validate test data.
- A6120 ability core Ability to dissect a problem and examine the interrelationships between data that may appear unrelated.
- K0102 knowledge core Knowledge of programming language structures and logic.
- K021A knowledge core Knowledge of statistical/machine learning algorithms.
- K075A knowledge core Knowledge of mathematics, including logarithms, trigonometry, linear algebra, calculus, statistics, and operational analysis.
- K7029 knowledge core Knowledge of how to collect, store, and monitor data.
- S166 skill core Skill in conducting queries and developing algorithms to analyze data structures.
- S172 skill core Skill in creating and utilizing mathematical or statistical models.
- S3756 skill core Skill in developing or recommending analytic approaches or solutions to problems and situations for which information is incomplete or for which no precedent exists.
- S6490 skill core Skill in assessing the predictive power and subsequent generalizability of a model.
- S6570 skill core Skill in identifying hidden patterns or relationships.
- S6651 skill core Skill in Regression Analysis (e.g., Hierarchical Stepwise, Generalized Linear Model, Ordinary Least Squares, Tree-Based Methods, Logistic).
- S6750 skill core Skill in using outlier identification and removal techniques.
- S6760 skill core Skill in writing scripts using R, Python, PIG, HIVE, SQL, etc.
- S7071 skill core Skill in labeling data to make it more discoverable and understandable.
- K0035 knowledge additional Knowledge of digital rights management.
- K0942 knowledge additional Knowledge of the organization's core business/mission processes.
- K1034A knowledge additional Knowledge of Personally Identifiable Information (PII) data security standards.
- K1034C knowledge additional Knowledge of Personal Health Information (PHI) data security standards.
- K6290 knowledge additional Knowledge of how to leverage government research and development centers, think tanks, academic research, and industry systems.
- K7020 knowledge additional Knowledge of DoD AI Ethical Principles (e.g., responsible, equitable, traceable, reliable, and governable).
- K7036 knowledge additional Knowledge of laws, regulations, and policies related to AI, data security/privacy, and use of publicly procured data for government.
- S7078 skill additional Skill in using deep learning approaches to build machine learning models.
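To make two of the core skills above concrete, here is a minimal sketch combining S6750 (outlier identification and removal) with the ordinary least squares case of S6651. It is not part of the DCWF. The helper names `tukey_fences` and `ols_fit`, the 1.5 multiplier, and the data are assumptions chosen for illustration. Applying the fences to the raw response values is a deliberate simplification that works here only because the invented outlier is extreme. Residual-based screening is often a better fit when the data carry a trend.

```python
from statistics import mean, quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def ols_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical observations, invented for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 40.0, 14.1, 16.0]

# Split the points into those inside and outside the fences.
low, high = tukey_fences(ys)
kept = [(x, y) for x, y in zip(xs, ys) if low <= y <= high]
removed = [(x, y) for x, y in zip(xs, ys) if not low <= y <= high]

# Fit the line on the retained points only.
slope, intercept = ols_fit([x for x, _ in kept], [y for _, y in kept])
```

Keeping the removed points in their own list, rather than discarding them silently, leaves a record that can be reviewed and reported alongside the fitted model.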