About benchmarks in N. Caniart's CAV 2010 paper
- Merit was designed by N. Caniart and its caracteristics are detailed in his CAV 2010 Paper.
- Merit benchmarks are Here.
- The Aspic column entails some imprecisions :
- '?' actually means that Aspic gives the "Don't Know" results.
- Some of the 'D' are wrong because the bad region of the codes are not disjunctive at all !
- However, if the bad region is disjunctive, the user can try to prove separately all the disjuncts.
- simplegood.fst is safe only if you consider that x is integer. Aspic doesn't.
- Aspic will NEVER prove that a program is unsafe nor prove that a program is safe because of an arithmetic reason.
- In summary, Aspic is able to prove 38 of 64 safe programs (considering the disjunctive bad regions), fails with DK for 14 programs, we found 6 bugs in the parsers and 6 other bugs (mostly exceptions thrown during in the acceleration algorithms)
- All unsafe programs get DK result
- Details in pdf