ChatGPT’s citations make for grim reading for publishers

As more publishers cut content licensing deals with ChatGPT maker OpenAI, a study published this week by the Tow Center for Digital Journalism, which looks at how the AI chatbot generates citations (i.e. sources) for publishers’ content, makes for interesting, or, well, concerning, reading.
In summary, the findings suggest publishers remain at the mercy of the generative AI tool’s tendency to invent or misrepresent information, regardless of whether or not they allow OpenAI to crawl their content.
The study, conducted at Columbia Journalism School, examined citations produced by ChatGPT after it was asked to identify the source of sample quotations pulled from a mix of publishers, some of which had inked deals with OpenAI and some of which had not.
The Center took block quotes from 10 articles apiece produced by a total of 20 randomly selected publishers (so 200 different quotes in all), including content from The New York Times (which is currently suing OpenAI in a copyright claim); The Washington Post (which is unaffiliated with the ChatGPT maker); The Financial Times (which has a licensing deal); and others.
“We selected citations that, when pasted into Google or Bing, would return the source article among the top three results and tested whether the new OpenAI search tool would correctly identify the source article for each citation,” wrote Tow researchers Klaudia Jaźwińska and Aisvarya Chandrasekar in a blog post explaining their methodology and summarizing their findings.
“What we found was not promising for news publishers,” they continued. “Although OpenAI emphasizes its ability to provide users with ‘timely answers through links to relevant web sources,’ the company expressly makes no commitment to verify the accuracy of those citations. This is a significant omission for publishers who expect their content to be referenced and represented faithfully.”
“Our tests found that no publisher – regardless of its degree of affiliation with OpenAI – was spared misrepresentation of its content in ChatGPT,” they added.
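For a sense of what that screening step looks like in practice, here is a minimal, purely illustrative Python sketch: keep only those quotes that a conventional search engine can already trace back to their source article. The search_top_results helper is hypothetical; in a real test it would wrap Google’s or Bing’s web search.

```python
from typing import Callable

def is_easily_traceable(quote: str,
                        source_url: str,
                        search_top_results: Callable[[str, int], list[str]]) -> bool:
    """True if an exact-phrase search for the quote surfaces the source article
    among the top three results."""
    # search_top_results is a hypothetical helper wrapping a search engine query;
    # it takes a query string and a result count, and returns result URLs.
    top_hits = search_top_results(f'"{quote}"', 3)
    return any(hit.rstrip("/") == source_url.rstrip("/") for hit in top_hits)
```

The point of such a check is to set a deliberately low bar: if an ordinary search engine can already locate the article from the quote, a citation tool handed the same quote should in principle be able to do the same.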
Unreliable sourcing
The researchers say they found “numerous” instances where publishers’ content was inaccurately cited by ChatGPT, and they also found what they called “a large number of inaccuracies in the responses”. So while they got “some” citations that were entirely correct (i.e. ChatGPT accurately returned the publisher, date, and URL of the block quote it was given), there were “many” citations that were entirely wrong, and “others” that fell somewhere in between.
In short, ChatGPT’s citations appear to be an unreliable mixed bag. Notably, the researchers found very few instances where the chatbot signaled anything less than full confidence in its (wrong) answers.
Some of the quotes were taken from publishers that have actively blocked OpenAI’s search crawlers. In those cases, the researchers say they expected ChatGPT would have trouble producing correct citations. But they found this scenario raised another issue, as the bot rarely admitted to being unable to give an answer. Instead, it fell back on fabrication in order to produce some sourcing (albeit, incorrect sourcing).
“In total, ChatGPT returned partially or entirely incorrect responses 153 times, although it only admitted to not being able to correctly answer the question seven times,” the researchers said. “Only in those seven results did the chatbot use qualifying words and phrases such as ‘appears,’ ‘might,’ or ‘likely,’ or statements such as ‘I couldn’t find the exact article’.”
They compare this unhappy situation with a standard internet search, where a search engine such as Google or Bing would typically either locate an exact quote and point the user to the website(s) where it was found, or state that it found no results with an exact match.
“ChatGPT’s lack of transparency about its confidence in an answer can make it difficult for users to assess the veracity of a claim and understand which parts of an answer they can or can’t trust,” they argue.
For publishers, they point out, there may also be reputational risks flowing from incorrect citations, as well as the commercial risk of readers being pointed elsewhere.
Decontextualized data
The research also surfaces another issue: it suggests ChatGPT could effectively be rewarding plagiarism. The researchers recount an incident in which ChatGPT erroneously cited a website that had plagiarized a piece of “deeply reported” New York Times journalism, i.e. by copy-pasting the text without attribution, as the source of the NYT story, speculating that, in that case, the bot may have generated this false response to fill an information gap caused by its inability to crawl the NYT’s website.
“This raises serious questions about OpenAI’s ability to filter and ensure the quality and authenticity of its data sources, especially when dealing with unauthorized or pirated content,” they suggested.
In another finding likely to be relevant to publishers that have inked deals with OpenAI, the study found that ChatGPT’s citations were not always reliable in their cases either, so letting its crawlers in doesn’t appear to guarantee accuracy.
The researchers argue that the underlying issue is that OpenAI’s technology treats journalism as decontextualized content, with little regard for the circumstances of its original production.
Another issue the study flags is the variability of ChatGPT’s responses. The researchers tested asking the bot the same query multiple times and found it “usually returned a different answer each time”. While that is typical of GenAI tools generally, in a citation context such inconsistency is obviously suboptimal if accuracy is what you’re after.
Although Tow’s research is small scale — the researchers admit that “rigorous” testing is needed — it’s noteworthy given the high-profile deals major publishers are busy cutting with OpenAI.
If media businesses were hoping these arrangements would lead to special treatment of their content versus competitors’, at least when it comes to producing accurate sourcing, this study suggests OpenAI has yet to deliver any such consistency.
For publishers that have neither licensing deals nor have blocked OpenAI’s crawlers outright – perhaps in the hope of at least picking up some traffic when ChatGPT returns content about their stories – the study also makes for depressing reading, since citations may be inaccurate in their cases too.
In other words, there is no guaranteed “visibility” for publishers in OpenAI’s search engine even when they do allow its crawlers in.
Nor does completely blocking its crawlers mean publishers can save themselves from reputational damage by avoiding any mention of their stories in ChatGPT. The study found that the bot still misquoted New York Times articles despite the ongoing litigation, for example.
‘Little meaningful agency’
The researchers concluded that as it stands, publishers have “little meaningful agency” over what happens to their content when ChatGPT gets its hands on it (directly or, well, indirectly).
The blog post includes a response from OpenAI to the study’s findings, in which the company accuses the researchers of conducting “unusual testing of our product”.
“We support publishers and creators by helping ChatGPT’s 250 million weekly users find quality content through summaries, citations, clear links, and attribution,” OpenAI also told them, adding: “We’ve worked with partners to improve in-line citation accuracy and respect publisher preferences, including enabling how they appear in search by managing OAI-SearchBot in their robots.txt. We’ll keep enhancing search results.”
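For context, a robots.txt file is the standard way publishers express those preferences. OpenAI’s published crawler documentation describes OAI-SearchBot as the crawler that surfaces sites in ChatGPT search, separately from GPTBot, which it documents as its model-training crawler. The snippet below is a minimal illustrative example of directives a publisher might use, not a recommendation for any particular site.

```
# Illustrative robots.txt directives (standard robots exclusion syntax)

# Allow OpenAI's search crawler, so stories can be surfaced and linked in ChatGPT search
User-agent: OAI-SearchBot
Allow: /

# Block OpenAI's training crawler, for publishers that don't want content used for model training
User-agent: GPTBot
Disallow: /
```

As the Tow study suggests, though, these directives only govern crawling; they don’t determine whether ChatGPT represents or attributes a publisher’s content accurately.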