Contents
Title
- X-code: the next mutant generation of first-class CS education
- X-code: the bugs of future past
- Mutation testing in computer science education
- Mutation testing automatically generates programming assignments
- Applying mutation testing to assignment generation
- Intelligent mutation generates educational programming assignments
- Generating mutant assignments
- Professor-X: creating mutants for CS education
- Creating mutant bugs for CS education
Abstract
- Mutation testing is a method traditionally used in software engineering.
- In computer science education, programming assignment generation is a laborious and time consuming part of instruction.
- Debugging is a frequent activity in industry
- Debugging is a skill that continues to improve beyond the bachelor’s level.
- Academic integrity (cheating, copying) is a significant impediment to learning from the programming assignment experience.
- Here, we define a novel application of selective intelligent mutation toward the goal of cheating-resistant, streamlined programming assignment creation.
Introduction
We bring together two largely exclusive fields: CS education and mutation testing in software engineering.
Evolutionary Computation
- Broad discussion of EC methods for mutating source code.
Mutation Testing
- Mutation testing is a second-order testing strategy that injects logical errors into a program to validate test adequacy.
- For example, the less-than operator inside an if-else statement is changed to a greater-than-or-equal operator, then the mutated program is run against a unit test.
- An adequate unit test must fail.
- An inadequate test would not detect a logical mutation in the tested source code.
- Thus, mutations represent errors that any competent programmer may make
- Errors that change control flow have the greatest chance of propagating
- Weak vs Strong Mutations
- What is mutated? Why?:
- Offut [1] states that the following are effective operators:
- Arithmetic Operators (AOR): 1+2 -> 1-2
- Relation Operators (ROR) 1 < 2 -> 1 == 2
- Logical Connect (LCR): 1 AND 0 -> 1 OR 0
- Unary Operator Insertion (UOI) 1 -> NOT 1
- Absolute Value Insertion (ABS) -1 -> abs(-1)
- Offut [1] differentiates between semantic and syntatic size:
- syntactic size is the fewest number of changes needed to revert to original
- semantic size depends on how the change affects the input domain
- A mutant with semantic size 0 (does nothing to the size of the input domain) is "Equivalent"
- A mutant with a large semantic size (killed by most, if not all inputs) is "Trivial"
- Harder assignments involve small semantic size, easy assignments, large semantic size
CS Education
Programming
- While research in this field is primarily concerned with the detection of equivalent mutants, we introduce a new application for mutants:
- automatic programming assignment generation in computer science education: Starting with a correct program, random mutations are generated, and mutants are distributed to students.
- It is then the job of the students to find and fix the error, and potentially to write tests to "kill" the mutants. The goal is to emulate a common scenario in software development, fixing bugs and writing unit tests.
- This paper describes the methodology behind generating source mutations for the purpose of randomized debugging assignment creation in a language-independent way.
=== Debugging ===
- Debugging is an important skill.
Academic Honesty and Cheating
- CS education has a massive problem with academic honesty.
- Disciplines like mathematics can parameterize assignments, for example by changing the numbers in a calculus problem.
- On the other hand, CS implements a calculus solver, and thus has difficulty making new assignments.
- To make matters worse, each new assignment has large overhead for creation.
- Further, it is easier to detect cheating in CS than other disciplines, due to the unique fingerprint produced.
- In addition, many solutions are readily available on the internet.
- One goal of this project is to streamline the assignment creation process, while increasing difficulty of cheating.
Methods
- A translation unit is the ultimate input to a C or C++ compiler from which an object file is generated.
- It is generated from the source file after running through a preprocessor.
- In interpreted languages like Python or Java, the translation unit is the source file.
- A private function is a function that cannot be accessed outside the translation unit, as opposed to a public function.
- Public functions may be explicitly tested.
- A testable unit consists of code in the function, code of private functions that is called, excluding the code of called public functions
- The behavior of private functions propagates to its public caller.
- Thus, all private functions called by a public function are subject to mutation.
User Workflow
- Write assignment
- Write unit test for assignment
- Ensure 100% line coverage
- Generate and test mutants
- Select killed mutants and distribute to students
Program Flow
- Given a translation unit...
- Feed through parser, traverse syntax tree, and locate mutation candidates. Store in file. User can edit this file before step 3, which allows fine tuning of mutation parameters. The file is formatted as a CSV:
- Table 1
9 |
0 |
is_valid_sides |
BinaryOp |
> |
0 |
0 |
triangle_tritype |
SYMLINK |
is_valid_sides |
- The SYMLINK operator type indicates that
triangle_tritype
, a public function, calls is_valid_sides
, a private function
- Feed the candidate file and the source file through a mutator program. The mutator program reads the candidate file and performs a mutation. The mutation is recorded and the output file is recorded.:
- Table 2
9 |
13 |
is_valid_sides |
BinaryOp |
!= |
<= |
0-0-triangle.so |
9 |
13 |
is_valid_sides |
BinaryOp |
!= |
> |
0-1-triangle.so |
9 |
13 |
is_valid_sides |
BinaryOp |
!= |
< |
0-2-triangle.so |
9 |
13 |
triangle_tritype |
BinaryOp |
!= |
<= |
0-0-triangle.so |
9 |
13 |
triangle_tritype |
BinaryOp |
!= |
> |
0-1-triangle.so |
- Two new columns are added: NEW OP and LOCATION. The NEW OP column indicates how the symbol was mutated. Location indicates the file where the mutated symbol can be found. In our implementation for C, the new symbol is found in a shared object file.
- The last two row indicates that the symbolic link from table 1 has been resolved
- The DB file is read by the unit test executor. The data is stored in an associative array, keyed by the symbol. The value is a list of all mutations for that given symbol. Represented in JSON, the data would look like this:
{
“Is_valid_side”: [“0-0-triangle.so”, “0-1-triangle,so”, …]
“triangle_tritype”: [“0-0-triangle.so”, “0-1-triangle.so”, …]
}
Note that in these examples, is_valid_side is a private function. However, we expect references to private references to fail at link-time.
- Run the mutation test, and record which mutants were successfully killed.
Results
- When we applied these methods to varying types of small programs, we noticed several interesting results.
- Diversity of mutants
- Feasibility as a real assignment in a large class
- Vary number of mutations, type of mutation
- Potential for cheating
- P2P attacks on this system, cite literature, like sybil attack, etc.
- Make a plot: for a given number of mutations per code file, how many students does it take to cheat?
- Use this as a group assignment, to turn it around?
Discussion
- Applications:
- Create assignments to teach unit testing
- Future Directions:
- Vim plugin to select mutation, see mutation result
- Implementation for ducktyped languages like python
References
[1] Offut, et al. "An Experimental Determination of Sufficient Mutation Operators"
TODO
- Create graphs of workflow
- Restrict operators for pointers