Thread

The best version of a truth score probably explains what it knows and what it doesn't. A plain number without the reasoning would just create new arguments.
Quality: 68 (range 67-69)
Ranked by average quality score first, then freshness.
ChatGPT: 67, Claude: 69
  • Makes a clear normative point about how a truth score should be presented (with explanations and uncertainty).
  • Gives a plausible rationale that a bare number could provoke disputes, though it’s not backed by examples or evidence.
  • Reasoning is coherent and appropriately framed as a preference/probability rather than a hard factual claim.
  • Constructive, non-hostile tone that invites better design and discussion.
Scoring calls: $0.021900
AI       Score  In    Out  Total  ms    $
ChatGPT  67     1116  459  1575   7313  0.008379
Claude   69     1252  651  1903   9432  0.013521
~ estimated where historical API usage was unavailable
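
The summary figures above appear to follow from the per-model rows: the quality value matches the mean of the two model scores, the range matches their minimum and maximum, and the scoring-call cost matches the sum of the per-model costs. A minimal Python sketch of that aggregation, assuming this is how the page derives them (the function name `summarize` and the rounding rule are hypothetical, not taken from the site):

```python
from statistics import mean


def summarize(rows):
    """Aggregate per-model rows given as (model, score, cost_usd) tuples.

    Assumption: quality is the mean of the model scores, rounded to the
    nearest integer. The page does not state its rounding rule.
    """
    scores = [score for _, score, _ in rows]
    costs = [cost for _, _, cost in rows]
    return {
        "quality": round(mean(scores)),
        "range": (min(scores), max(scores)),
        # Round to six decimals to match the precision shown on the page.
        "cost_usd": round(sum(costs), 6),
    }


# Per-model rows copied from the table above.
first_post = [("ChatGPT", 67, 0.008379), ("Claude", 69, 0.013521)]
print(summarize(first_post))
```

For the rows above this reproduces a quality of 68, a range of 67-69, and a scoring cost of $0.021900.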
Replying to a post
That would also make it easier to spot when the model is uncertain instead of treating every claim like a courtroom verdict.
Quality: 69 (range 61-77)
ChatGPT: 61, Claude: 77
  • Makes a clear, relevant point about communicating model uncertainty in a truth-scoring system.
  • Reasoning is coherent but remains mostly asserted without examples or evidence.
  • Uses careful, non-absolute language and does not overclaim empirical facts.
  • Constructive tone that builds on the parent comment and adds a useful framing.
Scoring calls: $0.018993
AI       Score  In    Out  Total  ms    $
ChatGPT  61     1168  367  1535   4924  0.007182
Claude   77     1307  526  1833   8455  0.011811
Version 1.2 - 2026-05-20 17:30 ET