NoCha measures how well long-context language models can verify claims written about fictional books. Check out our paper and GitHub repo for more details.
About the benchmark: NoCha contains 1001 narrative minimal pairs written about recently-published novels, where one claim is true and the other is false. Given the book text and a claim, a model is instructed to verify whether the claim is true or false. The model only gets credit for a pair if it correctly labels both the true and false claim.
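The pair-level scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's official scoring code: the function name `pair_accuracy`, the `(pair_id, gold_label, predicted_label)` record layout, and the choice to count a pair as attempted once any of its claims has a prediction are all assumptions made for this example.

```python
from collections import defaultdict


def pair_accuracy(predictions):
    """Score predictions at the pair level.

    `predictions` is an iterable of (pair_id, gold_label, predicted_label)
    tuples, with labels given as booleans. This layout is hypothetical.
    Returns (correct_pairs, attempted_pairs, accuracy).
    """
    results_by_pair = defaultdict(list)
    for pair_id, gold_label, predicted_label in predictions:
        results_by_pair[pair_id].append(gold_label == predicted_label)

    # A pair earns credit only if both of its claims were labeled correctly.
    correct_pairs = sum(
        1
        for results in results_by_pair.values()
        if len(results) == 2 and all(results)
    )
    # Assumption: a pair is "attempted" if at least one claim was predicted.
    attempted_pairs = len(results_by_pair)
    accuracy = correct_pairs / attempted_pairs if attempted_pairs else 0.0
    return correct_pairs, attempted_pairs, accuracy


example = [
    ("pair-1", True, True),    # true claim labeled TRUE
    ("pair-1", False, False),  # false claim labeled FALSE: pair earns credit
    ("pair-2", True, True),    # true claim labeled TRUE
    ("pair-2", False, True),   # false claim labeled TRUE: pair earns no credit
]
correct, attempted, accuracy = pair_accuracy(example)
```

Under this rule, a model that answers TRUE for every claim labels half of all claims correctly but earns credit for no pairs, because it always misses the false claim.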
Rank | Model | Accuracy | # Correct pairs | # Attempted pairs |
---|---|---|---|---|
Book: "The Tainted Cup" by Robert Jackson Bennett
True claim: Despite her skills as an Apoth, Nusis is unable to reverse engineer the type of portal opened by the reagents key found in Rona's wooden chest.
False claim: By using her skills as an Apoth, Nusis is able to reverse engineer the type of portal opened by the reagents key found in Rona's wooden chest.
Human-written explanation from NoCha: The reagents key is in fact not a key at all but the cure for dappleglass poisoning, which explains why Nusis is unable to figure out what type of portal it opens.
You are provided with a context and a statement. Your task is to carefully read the context and then determine whether the statement is true or false. Answer TRUE if the statement is true in its entirety based on the context provided. Answer FALSE if any part of the statement is false based on the context provided.
<context> {book text} </context>
<statement> {claim} </statement>
<question> Based on the context provided, is the above statement TRUE or FALSE? </question>
First provide an explanation of your decision-making process in at most one paragraph, and then provide your final answer. Use the following format:
<explanation> YOUR EXPLANATION </explanation>
<answer> YOUR ANSWER </answer>
You are provided with a context and a statement. Your task is to carefully read the context and then determine whether the statement is true or false. Answer TRUE if the statement is true in its entirety based on the context provided. Answer FALSE if any part of the statement is false based on the context provided.
<context> {book text} </context>
<statement> {claim} </statement>
<question> Based on the context provided, is the above statement TRUE or FALSE? </question>
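The two prompt variants above differ only in whether the model is asked for an explanation and a tagged answer. The sketch below shows one way to fill in the template and read back the label. It is an illustration under stated assumptions, not the authors' evaluation code: the names `build_prompt` and `parse_answer`, the `with_explanation` flag, and the fallback for untagged replies are all hypothetical.

```python
import re

INSTRUCTIONS = (
    "You are provided with a context and a statement. Your task is to "
    "carefully read the context and then determine whether the statement is "
    "true or false. Answer TRUE if the statement is true in its entirety "
    "based on the context provided. Answer FALSE if any part of the "
    "statement is false based on the context provided."
)
QUESTION = (
    "<question> Based on the context provided, is the above statement "
    "TRUE or FALSE? </question>"
)
EXPLANATION_FORMAT = (
    "First provide an explanation of your decision-making process in at "
    "most one paragraph, and then provide your final answer. Use the "
    "following format:\n"
    "<explanation> YOUR EXPLANATION </explanation>\n"
    "<answer> YOUR ANSWER </answer>"
)


def build_prompt(book_text, claim, with_explanation=True):
    """Fill the prompt template with a book text and a single claim."""
    parts = [
        INSTRUCTIONS,
        f"<context> {book_text} </context>",
        f"<statement> {claim} </statement>",
        QUESTION,
    ]
    if with_explanation:
        parts.append(EXPLANATION_FORMAT)
    return "\n".join(parts)


TAGGED_ANSWER = re.compile(r"<answer>\s*(TRUE|FALSE)\s*</answer>", re.IGNORECASE)
BARE_ANSWER = re.compile(r"\s*(TRUE|FALSE)\b", re.IGNORECASE)


def parse_answer(response):
    """Return True or False for a parsed label, or None if none is found."""
    # Prefer the tagged answer; fall back to a reply that starts with the
    # label (assumed here to be how an answer-only reply would look).
    match = TAGGED_ANSWER.search(response) or BARE_ANSWER.match(response)
    if match is None:
        return None
    return match.group(1).upper() == "TRUE"


prompt = build_prompt("Once upon a time.", "The sky is green.")
label = parse_answer("<explanation> Not stated. </explanation> <answer> FALSE </answer>")
```

Returning `None` for unparseable replies keeps them distinguishable from a FALSE label, so a caller can decide separately whether such a claim counts as attempted.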