Data/AI (DA)
DCWF 672
AI Test & Evaluation Specialist
Performs testing, evaluation, verification, and validation on AI solutions to ensure they are developed to be, and remain, robust, resilient, responsible, secure, and trustworthy, and communicates results and concerns to leadership.
Tasks
The concrete work activities defined for this role in the DCWF v5.1 spreadsheet. Core tasks are required for the role; additional tasks are associated with it but not mandatory.
- T508 additional Determine level of assurance of developed capabilities based on test results.
- T5120 additional Conduct hypothesis testing using statistical processes.
- T550 additional Develop test plans to address specifications and requirements.
- T5848 additional Assess technical risks and limitations of planned tests on AI systems.
- T5850 additional Assist integrated project teams to identify, curate, and manage data.
- T5851 additional Build assurance cases for AI systems that support the needs of different stakeholders (e.g., acquisition community, commanders, and operators).
- T5858 additional Conduct AI risk assessments to ensure models and/or other solutions are performing as designed.
- T5866 additional Create or customize existing Test and Evaluation Master Plans (TEMPs) for AI systems.
- T5873 additional Determine methods and metrics for quantitative and qualitative measurement of AI risks so that sensitivity, specificity, likelihood, confidence levels, and other metrics are identified, documented, and applied.
- T5876 additional Develop machine learning code testing and validation procedures.
- T5877 additional Develop possible solutions for technical risks and limitations of planned tests on AI solutions.
- T5889 additional Identify and submit exemplary AI use cases, best practices, failure modes, and risk mitigation strategies, including after-action reports.
- T5896 additional Maintain current knowledge of advancements in DoD AI Ethical Principles and Responsible AI.
- T5901 additional Measure the effectiveness, security, robustness, and trustworthiness of AI tools.
- T5910 additional Provide quality assurance of AI products throughout their lifecycle.
- T5914 additional Report test and evaluation deficiencies and possible solutions to appropriate personnel.
- T5916 additional Select and use the appropriate models and prediction methods for evaluating AI performance.
- T5919 additional Test AI tools against adversarial attacks in operationally realistic environments.
- T5920 additional Test components to ensure they work as intended in a variety of scenarios for all aspects of the AI application.
- T5921 additional Test how users interact with AI solutions.
- T5922 additional Test the reliability, functionality, security, and compatibility of AI tools within systems.
- T5923 additional Test the trustworthiness of AI solutions.
- T5926 additional Use models and other methods for evaluating AI performance.
- T694 additional Make recommendations based on test results.
- T765B additional Perform AI architecture security reviews, identify gaps, and develop a risk management plan to address issues.
- T858A additional Test, evaluate, and verify hardware and/or software to determine compliance with defined specifications and requirements.
- T858B additional Record and manage test data.
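Several tasks above describe work that lends itself to a concrete illustration. For T5120 (conduct hypothesis testing using statistical processes), one common pattern in AI test and evaluation is checking whether a model's measured accuracy on a test set significantly exceeds a required baseline. The sketch below is a minimal, hypothetical example using a one-sided two-proportion z-test; it is not part of the DCWF specification, and the function name and thresholds are illustrative assumptions.

```python
import math

def accuracy_z_test(n_correct, n_trials, baseline, alpha=0.05):
    """One-sided z-test of H0: accuracy <= baseline vs. H1: accuracy > baseline.

    n_correct: number of correct predictions observed on the test set
    n_trials:  number of test-set examples
    baseline:  required accuracy under the null hypothesis
    Returns (z statistic, one-sided p-value, reject-H0 decision at alpha).
    """
    p_hat = n_correct / n_trials
    # Standard error of the proportion under the null hypothesis.
    se = math.sqrt(baseline * (1 - baseline) / n_trials)
    z = (p_hat - baseline) / se
    # One-sided upper-tail p-value via the normal CDF (expressed with erf).
    p_value = 0.5 * (1 - math.erf(z / math.sqrt(2)))
    return z, p_value, p_value < alpha
```

For example, a model scoring 912/1000 against a 0.88 baseline rejects the null at alpha = 0.05, while 885/1000 does not; in practice the evaluator would also document the metric, sample size, and confidence level per T5873.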
Knowledge, Skills, and Abilities
KSA statements define what a person filling this role knows or can do. "Knowledge" is what they must know, "Skill" is what they can perform, and "Ability" is a durable capacity they bring to the work.
- A6060 ability core Ability to collect, verify, and validate test data.
- A6170 ability core Ability to translate data and test results into evaluative conclusions.
- K6311 knowledge core Knowledge of machine learning theory and principles.
- K7003 knowledge core Knowledge of AI security risks, threats, and vulnerabilities and potential risk mitigation solutions.
- K7004 knowledge core Knowledge of AI Test & Evaluation frameworks.
- K7006 knowledge core Knowledge of best practices from industry and academia in test design activities for verification and validation of AI and machine learning systems.
- K7009 knowledge core Knowledge of coding and scripting in languages that support AI development and use.
- K7020 knowledge core Knowledge of DoD AI Ethical Principles (e.g., responsible, equitable, traceable, reliable, and governable).
- K7024 knowledge core Knowledge of how AI is developed and operated.
- K7025 knowledge core Knowledge of how AI solutions integrate with cloud or other IT infrastructure.
- K7028 knowledge core Knowledge of how to automate development, testing, security, and deployment of AI/machine learning-enabled software.
- K7029 knowledge core Knowledge of how to collect, store, and monitor data.
- K7030 knowledge core Knowledge of how to deploy test infrastructures with AI systems.
- K7034 knowledge core Knowledge of interactions and integration of DataOps, MLOps, and DevSecOps in AI.
- K7036 knowledge core Knowledge of laws, regulations, and policies related to AI, data security/privacy, and use of publicly procured data for government.
- K7037 knowledge core Knowledge of machine learning operations (MLOps) processes and best practices.
- K7038 knowledge core Knowledge of metrics to evaluate the effectiveness of machine learning models.
- K7041 knowledge core Knowledge of remedies against unintended bias in AI solutions.
- K7044 knowledge core Knowledge of testing, evaluation, validation, and verification (T&E V&V) tools and procedures to ensure systems are working as intended.
- K7045 knowledge core Knowledge of the AI lifecycle.
- K7048 knowledge core Knowledge of the benefits and limitations of AI capabilities.
- K7051 knowledge core Knowledge of the possible impacts of machine learning blind spots and edge cases.
- K7053 knowledge core Knowledge of the user experience (e.g., decision-making, user design, and human-computer interaction) as it relates to AI systems.
- K7054 knowledge core Knowledge of tools for testing the robustness and resilience of AI products and solutions.
- S182 skill core Skill in determining an appropriate level of test rigor for a given system.
- S6490 skill core Skill in assessing the predictive power and subsequent generalizability of a model.
- S6630 skill core Skill in preparing Test & Evaluation reports.
- S6641 skill core Skill in providing Test & Evaluation resource estimates.
- S7065 skill core Skill in explaining AI concepts and terminology.
- S7067 skill core Skill in identifying low-probability, high-impact risks in machine learning training data sets.
- S7069 skill core Skill in identifying risk over the lifespan of an AI solution.
- S7070 skill core Skill in integrating AI Test & Evaluation frameworks into test strategies for specific projects.
- S7075 skill core Skill in testing and evaluating machine learning algorithms and/or AI solutions.
- S7076 skill core Skill in testing for bias in data sets and AI system outputs, as well as determining that historically or often underrepresented and marginalized groups are properly represented in the training, testing, and validation data sets and AI system outputs.
- S7077 skill core Skill in translating operational requirements for AI systems into testing requirements.
- K0040 knowledge additional Knowledge of organization's evaluation and validation requirements.
- K0942 knowledge additional Knowledge of the organization's core business/mission processes.
- K1133 knowledge additional Knowledge of service management concepts for networks and related standards (e.g., Information Technology Infrastructure Library [ITIL], current version).
- K7012 knowledge additional Knowledge of current test standards and safety standards that are applicable to AI (e.g., MIL-STD-882E, DO-178C, ISO 26262).
- K7040 knowledge additional Knowledge of Personal Health Information (PHI), Personally Identifiable Information (PII), and other data privacy and data reusability considerations for AI solutions.
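K7038 (metrics to evaluate the effectiveness of machine learning models) and T5873 (sensitivity, specificity, and related metrics) can be made concrete with a small sketch. The function below computes common classification metrics from a confusion matrix; it is an illustrative stdlib-only example, not an official DCWF artifact, and the function and field names are assumptions.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard evaluation metrics from confusion-matrix counts.

    tp/fp/fn/tn: true positive, false positive, false negative,
    and true negative counts from a binary classifier's test run.
    Returns a dict of metrics; undefined ratios (zero denominator) map to 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fp + fn + tn)
    precision = ratio(tp, tp + fp)          # positive predictive value
    recall = ratio(tp, tp + fn)             # a.k.a. sensitivity (T5873)
    specificity = ratio(tn, tn + fp)        # true negative rate (T5873)
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }
```

In a T&E report (S6630), these figures would typically be presented alongside the test conditions and sample sizes so the level of assurance (T508) can be judged from the same evidence.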