Generated: November 24, 2025
Total Images Evaluated: 246 (Originals + Edge Cases)
| Category | Images | Accuracy |
|---|---|---|
| All Images (Originals + Edge Cases) | 246 | 80.9% |
| Edge Cases Only (All Manipulations) | 190 | 82.6% |
| Edge Cases Excluding Photo-of-Photo (AI edits, Photoshop, screenshots, real re-captures) | 137 | 98.5% |
| Photo-of-Photo Only (extreme synthetic re-capture with moiré) | 53 | 41.5% |
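The category figures above are internally consistent: the 190 edge cases split into 137 non-photo-of-photo images and 53 photo-of-photo images, and the 56 originals account for the rest of the 246. Below is a minimal sketch of that arithmetic, using the 75.0% accuracy on originals from the per-folder table that follows; the rounded per-subset correct counts are an assumption for illustration, not figures from the original results.

```python
# Consistency check of the summary table (rounding is assumed, not from the report).
subsets = {
    "edge_excl_photo_of_photo": (137, 0.985),
    "photo_of_photo":           (53,  0.415),
    "originals":                (56,  0.750),
}

# Implied number of correctly classified images per subset.
correct = {name: round(n * acc) for name, (n, acc) in subsets.items()}

edge_correct = correct["edge_excl_photo_of_photo"] + correct["photo_of_photo"]
total_correct = sum(correct.values())
print(f"edge cases: {edge_correct}/190 = {edge_correct / 190:.1%}")    # ~82.6%
print(f"overall:    {total_correct}/246 = {total_correct / 246:.1%}")  # ~80.9%
```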
| Folder | Total | Real | Altered | Accuracy |
|---|---|---|---|---|
| generated_edge_cases | 174 | 0 | 174 | 82.2% |
| test best images | 56 | 56 | 0 | 75.0% |
| test edge cases | 16 | 0 | 16 | 87.5% |
Distribution of overall_score.grade by folder:
| Folder | A | B | C | D | F |
|---|---|---|---|---|---|
| generated_edge_cases | 30 | 1 | 41 | 94 | 8 |
| test best images | 28 | 14 | 12 | 1 | 1 |
| test edge cases | 1 | 1 | 5 | 6 | 3 |
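The two per-folder tables above read like aggregations of per-image results. A minimal sketch of how similar tables could be produced with pandas is shown below; the input file name and the folder, is_real, and correct column names are assumptions, while overall_score.grade is the field name from the original grade table.

```python
import pandas as pd

# Hypothetical per-image export; column names other than "overall_score.grade"
# are assumptions, not the actual analysis schema.
results = pd.read_csv("proof_results.csv")

# Per-folder accuracy (Total / Real / Altered / Accuracy).
acc = results.groupby("folder").agg(
    Total=("is_real", "size"),
    Real=("is_real", "sum"),
    Accuracy=("correct", "mean"),
)
acc["Altered"] = acc["Total"] - acc["Real"]
acc = acc[["Total", "Real", "Altered", "Accuracy"]]

# Grade distribution by folder (counts of overall_score.grade).
grades = (
    results.groupby("folder")["overall_score.grade"]
    .value_counts()
    .unstack(fill_value=0)
    .reindex(columns=list("ABCDF"), fill_value=0)
)

print(acc.to_markdown())
print(grades.to_markdown())
```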
This benchmark was performed on an independently curated dataset of images designed to challenge the Proof system. Outdoor nature photos make up the majority of the set; each was randomly transformed through a pipeline to represent AI edits, Photoshop edits, and photos of photographs. Additional real-world data, including screenshots and pictures of screens, was included for variety.
The transformations were deliberately made as challenging as possible in order to capture edge cases, since the system already performs extremely well on typical real-world manipulations.
The full dataset used for this analysis can be found here:
https://drive.google.com/drive/folders/1KsRjrDaIsCsRwbZ13qhGZonz6PGT0KoX
Conclusion:
Proof identifies the majority of altered or non-original photos with a high level of accuracy, including many of the most difficult cases. Overall, the system performs at a high level and accomplishes its intended purpose.
— Dallin Munger, MS
Independent Security Researcher