Illustration: Eniola Odetunde/Axios
The Washington Post on Sunday published its first-ever story built on the work of a new AI tool called Haystacker that allows journalists to sift through large data sets, whether video, photo or text, to find newsworthy trends or patterns.
Why it matters: In an interview, the Post's chief technology officer Vineet Khosla said the company is committed to building many AI tools in-house because they can address the specific needs of trained journalists.
- "It's a far superior product than just the general purpose stuff you get from Big Tech," he said.
Zoom out: That ethos is reminiscent of the Post's effort nearly a decade ago to build ArcXP, an in-house content management system that serves the special needs of news publishers.
- Asked whether the Post would ever license Haystacker to other newsrooms, Khosla said that's not the company's focus right now.
- But, he added, "I'm pretty sure this, or some variation of it, is going to make it back to the industry at large. ... There is no intention of keeping it just for us."
State of play: Haystacker was built by the Post's engineering team in conjunction with its newsroom.
- The tool, which took more than a year to build, is used primarily by the Post's visual forensics and data journalism teams.
In the story published Sunday, the Post analyzed over 700 presidential and down-ballot campaign ads that mention immigration from the first half of the year.
- Using Haystacker, journalists found that nearly 20% of the ads use footage and photos "that are outdated, lack context, or are paired with voice-overs and text that do not accurately depict what is shown on the screen."
- Some ads use footage of buildings being blown up in Gaza.
Of note: Haystacker can be used across any large data set that's available to the Post through a public API, or backend interface, or through a data partnership in which the data is given to Post journalists or is licensed by the newsroom.
- While Haystacker aims to help journalists examine big data sets, it can also summarize video footage quickly to save journalists time. For example, it can condense an hours-long City Council meeting into a short summary.
Zoom out: Video makes up a huge portion of internet traffic, but without AI, it's nearly impossible for journalists to study large volumes of video data.
The big picture: This is the third major AI tool the Post has debuted in the past few months.
- Last month, it launched an AI-driven chatbot on its site that responds to user queries about climate with answers pulled from Post articles.
- It also debuted a new product that uses generative AI to summarize a given article. Khosla said the company will ramp up its investment in the summary product as the election draws nearer.
What to watch: The Post has yet to strike a deal to license its content to an AI firm, although it works with major large language models to build its internal tools.
- Khosla told Axios last month that the Post "will talk to any company that helps us expand our journalism, but we also want it to be fair."