At O’Reilly, we don’t just create educational materials about AI; we also use AI to build new kinds of learning experiences. One of those is Answers, a generative AI-based feature that answers questions within your learning flow. It’s included in every book, on-demand course, and video, and will eventually be available across our entire learning platform. To use it, click the “Answers” icon (the last item in the list on the right side of the screen).
Learn faster. Dig deeper. See farther.
Answers enables active learning. Rather than passively consuming a stream of content from a book or video, you interact with the content by asking questions and getting answers. When you’re solving business problems, learning is part of the workflow, and it’s natural to have questions while working on something. Those who remember print books will remember stacks of them spread face-down on a desk, saving your place as you dug deeper into a problem. Something similar happens online: while searching for answers, you open so many tabs that you can’t remember which is which. Why can’t you just ask a question and get an answer? Now you can.
Here are some insights into the decisions we made while building Answers. Of course, everything is subject to change; that’s the first thing you need to realize before starting an AI project. This is unknown territory, and everything is an experiment. You don’t know how people will use your application until you build and deploy it. There are many questions we’re still waiting to answer. While it’s important to be cautious when deploying AI applications, it’s also important to recognize that all AI is experimental.
The core of Answers was built in collaboration with partners who provided the AI expertise. This is an important principle, especially for small companies: don’t build it alone when you can collaborate with others. Developing the expertise to build and train models ourselves would have been very difficult; it was much more effective to work with a company that already had that expertise. Your staff will still have plenty of decisions to make and problems to solve. At least for your first few products, leave the heavy AI lifting to someone else, and focus on understanding the problem you’re trying to solve. What is your specific use case? What kinds of answers do users expect? What kinds of answers do you want to give? Think about how the answers to these questions affect your business model.
If you’re building a chat-like service, you need to think seriously about how it will be used: what kinds of prompts to expect and what kinds of answers to return. Answers places few limits on the questions you can ask. Most people think of O’Reilly as a resource for software developers and IT departments, but our platform includes many other kinds of information. Answers can respond to questions about any topic on our platform, including chemistry, biology, climate change, and more. However, it differs from chat applications like ChatGPT in several ways. First, it is limited to questions and answers. It suggests follow-up questions, but it isn’t conversational; each new question starts a fresh context. Many of the companies experimenting with AI seem to treat conversation as an end in itself, perhaps with the goal of monopolizing the user’s attention. We want our users to learn, and to get on with solving their technical problems. Open-ended conversation isn’t suited to this use case; we want our interactions to be short, direct, and to the point.
Limiting Answers to Q&A also minimizes abuse: it’s harder to get an AI system to go off the rails when it’s limited to answering questions. (Honeycomb, one of the first companies to integrate ChatGPT into a software product, made a similar decision.)
Unlike many AI-based products, Answers will actually tell you when there is no answer. For example, if you ask “Who won the World Series?” it responds, “There is not enough information to answer this question.” If you ask a question it can’t answer but for which the platform may have relevant information, it will point you to that information. This design decision was simple but incredibly important: few AI systems will tell you that they can’t answer a question, and that inability is a significant source of hallucinations, errors, and other misinformation. Most AI engines can’t say, “I’m sorry, I don’t know.” Ours can, and does.
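The refusal behavior described above can be sketched as a simple guard at the front of a retrieval pipeline: if nothing sufficiently relevant is retrieved, return a fixed “not enough information” response instead of calling the model at all. The function parameters and the relevance threshold below are illustrative assumptions, not O’Reilly’s actual implementation.

```python
NO_ANSWER = "There is not enough information to answer this question."

def answer(question, retrieve, generate, min_score=0.75):
    """Answer a question only when supporting documents exist.

    `retrieve` and `generate` are stand-ins for the real retrieval and
    language-model calls; `min_score` is a hypothetical relevance cutoff.
    """
    docs = [d for d in retrieve(question) if d["score"] >= min_score]
    if not docs:
        # Refuse rather than let the model hallucinate an answer.
        return {"text": NO_ANSWER, "sources": []}
    return {"text": generate(question, docs),
            "sources": [d["id"] for d in docs]}
```

The key design point is that the refusal happens deterministically in code, before the language model ever sees the question, so it can’t be talked out of it.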
Answers are always attributed to specific content, which allows us to compensate our authors and partner publishers. Designing the compensation plan was an important part of the project: we are committed to treating authors fairly, not just to generating answers from their content. When a user asks a question, Answers generates a short response and provides links to the resources the information came from. That data feeds into a compensation model designed to be revenue-neutral, so authors aren’t penalized when we generate answers from their material.
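One simple way to make such a scheme revenue-neutral is to split a fixed pool in proportion to how often each work is cited in answers; the shares then always sum to the pool, regardless of how many answers are generated. The proportional rule below is an illustrative assumption, not O’Reilly’s actual compensation formula.

```python
from collections import Counter

def royalty_shares(attributions, pool):
    """Split a fixed revenue pool across works in proportion to how often
    each work was cited as a source in generated answers.

    `attributions` is a list of work IDs, one entry per citation.
    Revenue-neutral: the shares always sum to `pool`.
    """
    counts = Counter(attributions)
    total = sum(counts.values())
    return {work: pool * n / total for work, n in counts.items()}
```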
Answers’ design is more complex than you might expect, which is why it’s important for organizations embarking on AI projects to understand that “the simplest thing that could possibly work” may not work. From the beginning, we knew we couldn’t simply use a model like GPT or Gemini. Not only are raw models error-prone, they provide no mechanism for reporting which data an answer was constructed from, and that data is the input to the compensation model. This requirement led us directly to the retrieval-augmented generation (RAG) pattern. With RAG, the program generates a prompt that contains both the question and the data needed to answer it; the augmented prompt is sent to a language model, which produces the answer. Because we know which data was used to create each answer, we can compensate our authors.
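In outline, the prompt-augmentation step of RAG looks like this. The prompt template and the document fields (`title`, `text`) are illustrative assumptions; the real system’s prompts are certainly more elaborate.

```python
def build_rag_prompt(question, documents):
    """Assemble an augmented prompt: retrieved passages plus the question.

    Each document is a dict with hypothetical `title` and `text` fields.
    Numbering the sources lets the model cite them, which is what makes
    attribution (and hence compensation) possible downstream.
    """
    context = "\n\n".join(
        f"[{i}] {d['title']}\n{d['text']}" for i, d in enumerate(documents, 1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources by number.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```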
Using RAG raises an obvious question: where do the documents come from? One AI model searches the platform’s content database to generate “candidate” documents. Another model ranks the candidates and selects those that seem most useful. A third model re-evaluates each candidate to confirm that it is actually relevant and useful. Finally, the selected documents are truncated to minimize content that isn’t relevant to the question. This process serves two purposes. First, it minimizes hallucinations: the data sent to the model is exactly what’s needed to answer the question. Second, it minimizes the context: the more context you send, the longer it takes to get an answer and the more it costs to run the model. Most of the models we use are small, open source models: fast, effective, and cheap.
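The four-stage selection process described above might be sketched as follows. Only the pipeline shape comes from the text; the individual stages (`search`, `rank`, `is_relevant`, `truncate`) are stand-ins for the separate models, and `top_k` is a hypothetical parameter.

```python
def select_documents(question, search, rank, is_relevant, truncate, top_k=5):
    """Four-stage document selection for a RAG pipeline.

    1. `search` generates candidate documents from the content database.
    2. `rank` orders candidates by estimated usefulness; keep the top_k.
    3. `is_relevant` re-checks each surviving candidate.
    4. `truncate` trims each survivor to the question-relevant passage,
       keeping the final context (and model cost) small.
    """
    candidates = search(question)                               # stage 1
    ranked = rank(question, candidates)[:top_k]                 # stage 2
    relevant = [d for d in ranked if is_relevant(question, d)]  # stage 3
    return [truncate(question, d) for d in relevant]            # stage 4
```

The point of the separate re-evaluation stage is that ranking and relevance are different judgments: a document can score well against the query yet still not contain the answer.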
In addition to minimizing hallucinations and enabling attribution of content to its creators (and the assignment of royalties), this design makes it easy to add new content. We’re constantly adding new content to the platform, thousands of items every year. With a model like GPT, adding content would require a time-consuming and expensive training process. RAG makes it simple: when anything is added to the platform, the relevant content is added to the database from which documents are selected. This process isn’t computationally intensive and happens almost instantaneously, effectively in real time, so Answers never lags behind the rest of the platform. Users will never see a message like “This model was trained only on data up to July 2023.”
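Incremental indexing, the property that lets new content show up in answers almost immediately, amounts to embedding each new item and upserting it into the retrieval index as it is published. The in-memory index and the `embed` function below are toy assumptions standing in for a real embedding model and vector database.

```python
class ContentIndex:
    """Toy retrieval index that accepts new items as they're published."""

    def __init__(self, embed):
        self.embed = embed  # text -> vector (stand-in for a real model)
        self.items = {}     # item_id -> (vector, text)

    def add(self, item_id, text):
        # Called whenever content is added to the platform. No retraining,
        # just one embedding computation, so it's effectively real time.
        self.items[item_id] = (self.embed(text), text)

    def search(self, query, k=3):
        """Return the IDs of the k items most similar to the query."""
        qv = self.embed(query)
        scored = sorted(
            self.items.items(),
            key=lambda kv: sum(x * y for x, y in zip(qv, kv[1][0])),
            reverse=True,
        )
        return [item_id for item_id, _ in scored[:k]]
```

Contrast this with fine-tuning: adding one book to a fine-tuned model means another training run, while here it means one `add()` call.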
Answers is a product, but it’s just one part of the ecosystem of tools we’re building. All of these tools are designed to provide a learning experience: to help our users and enterprise customers develop the skills they need to stay relevant in a changing world. That is the goal, and it’s also the key to building successful applications with generative AI. What is your goal? What is your real goal? It isn’t to impress customers with your AI expertise; it’s to solve a problem. For us, that problem is helping learners acquire new skills more efficiently. Focus on that goal, not on the AI. AI is an important tool, perhaps the most important tool, but it isn’t an end in itself.