At the recent CloudWorld event, Oracle introduced Oracle Database 23c, its next-generation database, which incorporates AI capabilities through the addition of AI vector search to its converged database. This vector search feature allows businesses to run multimodal queries that integrate various data types, enhancing the usefulness of GenAI in business applications. With Oracle Database 23c, there's no need for a separate database to store and query AI-driven data. By supporting vector storage alongside relational tables, graphs, and other data types, Oracle 23c becomes a powerful tool for developers building business applications, especially for semantic search needs.
In this two-part blog series, we'll explore the basics of vectors and embeddings, explain how the Oracle vector database works, and develop a Retrieval-Augmented Generation (RAG) application to enhance a local LLM.
In this post, we will cover the following steps:
Before we dive into the installation and configuration process, let's clarify a few concepts, such as embeddings and vectors in Generative AI. If you are already familiar with these concepts, feel free to skip this section.
Let's start with the term vector. In mathematics, a vector is an object that represents both the value and direction of a quantity in any dimension.
In the context of Generative AI, vectors are used to represent text or data in numerical form so that a model can process it. This is necessary because machines only understand numbers: text and images must be converted into vectors before a Large Language Model (LLM) can work with them.
The following pseudocode converts a motivational text into tokens using the Phi-2 model. We use a tokenizer class from Hugging Face to encode the text into vectors and decode it back into text.
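Since the original snippet is not reproduced here, the self-contained sketch below illustrates the same encode/decode round trip with a hand-made vocabulary. The `ToyTokenizer` class and its word list are purely illustrative stand-ins, not the real Phi-2 tokenizer from Hugging Face.

```python
# Toy illustration of what a tokenizer does: text -> integer ids -> text.
# ToyTokenizer is a hypothetical stand-in for a real Hugging Face tokenizer.
class ToyTokenizer:
    def __init__(self, corpus):
        words = sorted({w for text in corpus for w in text.lower().split()})
        self.word_to_id = {w: i for i, w in enumerate(words)}
        self.id_to_word = {i: w for w, i in self.word_to_id.items()}

    def encode(self, text):
        return [self.word_to_id[w] for w in text.lower().split()]

    def decode(self, ids):
        return " ".join(self.id_to_word[i] for i in ids)

tokenizer = ToyTokenizer(["never give up on your dreams"])
ids = tokenizer.encode("never give up")
print(ids)                    # a list of integer token ids
print(tokenizer.decode(ids))  # "never give up"
```

A real tokenizer works on subwords rather than whole words and has a vocabulary of tens of thousands of entries, but the encode/decode contract is the same.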
Vectors alone are not sufficient for LLMs because they only capture basic numerical features of a token, without encoding its rich semantic meaning. Vectors are simply a mathematical representation that can be fed into the model. To capture the semantic relationships between tokens, we need something more: embeddings.
An embedding is a more sophisticated version of a vector, usually generated through training on large datasets. Unlike raw vectors, embeddings capture semantic relationships between tokens. This means that tokens with similar meanings will have similar embeddings, even if they appear in different contexts.
Embeddings are what enable LLMs to grasp the subtleties of language, including context, nuance, and the meanings of words and phrases. They arise from the model's learning process, as it absorbs vast amounts of text data and encodes not just the identity of individual tokens but also their relationships with other tokens.
Typically, embeddings are generated through techniques such as Word2Vec, GloVe, or using sentence-transformers. Here's an example of how OpenAI Embeddings can be used to generate embeddings from input texts: Lion, Tiger, and iPhone.
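The OpenAI API call itself is not reproduced here; as an offline stand-in, the sketch below uses hand-picked three-dimensional vectors (purely illustrative — real embedding models return vectors with hundreds or thousands of dimensions) to show the property embeddings give us: Lion and Tiger end up close together, while iPhone lands far away.

```python
import math

# Hypothetical 3-d "embeddings" -- real models produce far larger vectors.
embeddings = {
    "Lion":   [0.90, 0.80, 0.10],
    "Tiger":  [0.85, 0.82, 0.12],
    "iPhone": [0.10, 0.20, 0.95],
}

def cosine_similarity(a, b):
    # 1.0 means identical direction; values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity(embeddings["Lion"], embeddings["Tiger"]))   # close to 1.0
print(cosine_similarity(embeddings["Lion"], embeddings["iPhone"]))  # much lower
```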
Also known as a similarity search engine, a vector database is a specialized database designed to store and efficiently retrieve vectors. These databases are optimized for performing nearest-neighbor searches (i.e., finding the most similar items based on their embeddings) in high-dimensional vector spaces. Unlike traditional relational databases, vector databases can compare vectors directly without requiring explicit attribute-based queries.
Let's look at an example:
If you query the database with the embedding of Movie A, it will also return Movie B because their embeddings are close in the vector space, indicating they have similar content.
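A vector database performs that comparison for you; conceptually, the movie lookup is a nearest-neighbor scan, which can be sketched in a few lines (the movie names and vectors below are made up for illustration):

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Made-up embeddings: Movie A and Movie B share similar content.
movies = {
    "Movie A": [0.90, 0.10, 0.30],
    "Movie B": [0.88, 0.12, 0.28],
    "Movie C": [0.10, 0.90, 0.70],
}

def nearest_neighbors(query, store, k=2):
    # Brute-force scan; real vector databases use indexes (e.g. HNSW)
    # to avoid comparing the query against every stored vector.
    ranked = sorted(store, key=lambda name: cosine_distance(query, store[name]))
    return ranked[:k]

print(nearest_neighbors(movies["Movie A"], movies))  # ['Movie A', 'Movie B']
```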
Vector databases can be used in various scenarios:
Now that we have covered enough theory, let's explore how to use Oracle Database to store and query embeddings.
I will use a local mini server powered by Proxmox to install and configure the Oracle Autonomous Database Free (ADB-Free) in a Docker container. My setup is outlined below:
Oracle Database Actions, a web application similar to SQL Developer, will be accessible at https://IP_ADDRESS:8443/ords/sql-developer.
You can use this web application to manipulate the database, such as creating schemas and users or querying the database.
Oracle ADB-Free is not directly accessible: we need a wallet to communicate with the database. Create a new directory on your host machine named "/scratch/tls_wallet" and copy the wallet out of the container with the following command:
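A copy command along these lines should work, assuming the container is named `adb-free` and the wallet sits in the location Oracle documents for the ADB-Free container image (`/u01/app/oracle/wallets/tls_wallet`); adjust the container name to match your setup:

```shell
# Create the target directory on the host, then copy the
# auto-generated TLS wallet out of the ADB-Free container.
mkdir -p /scratch/tls_wallet
docker cp adb-free:/u01/app/oracle/wallets/tls_wallet/. /scratch/tls_wallet/
```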
In my case, since I plan to connect remotely, I need to replace "localhost" in the wallet configuration with the remote host's FQDN.
Log in to the database as the administrator user:
Verify you can connect with the new user. Use the following command from any terminal:
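The verification could look like the following, assuming the user is named `vector_user` and the TNS alias comes from the wallet's `tnsnames.ora` (the user name, password placeholder, and alias here are illustrative, not values from the original post):

```shell
# Point SQL*Plus at the wallet directory, then connect as the new user.
export TNS_ADMIN=/scratch/tls_wallet
sqlplus vector_user/<password>@<tns_alias_from_tnsnames.ora>
```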
I will use Jupyter Notebook with Miniconda to run the Python application; however, you can use your preferred IDE, such as Visual Studio Code, to execute it.
Create a new Python Jupyter Notebook and add the following statement.
This command installs all the necessary packages.
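The exact package list is not shown above; a plausible minimal set for this walkthrough is the Oracle Python driver plus an embedding-model library (this list is an assumption — adjust it to whatever the rest of your notebook imports):

```shell
# Inside a notebook, prefix with "%" (i.e. %pip install ...).
pip install oracledb sentence-transformers
```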
Download the file from the link and save it in the same directory as your Jupyter Notebook file.
Note that a fragment of the Python code was taken from the blog post, "Setting Up Vector Embeddings and Oracle Generative AI with Oracle Database 23ai," and remains unchanged.
Tokenize and embed the contents of the file as shown below.
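The full code comes from the referenced blog post; in outline, it reads the file, embeds each chunk with an embedding model, and inserts the vectors into a table with a VECTOR column. The condensed sketch below follows that outline, but the file name, table and column names, model choice, and connection details are all assumptions, not the original code:

```python
import array

import oracledb
from sentence_transformers import SentenceTransformer

# Model choice and connection details are illustrative assumptions.
model = SentenceTransformer("all-MiniLM-L6-v2")
conn = oracledb.connect(
    user="vector_user", password="<password>", dsn="<tns_alias>",
    config_dir="/scratch/tls_wallet",
    wallet_location="/scratch/tls_wallet", wallet_password="<wallet_password>",
)
cursor = conn.cursor()

# Split the downloaded file into chunks (here: blank-line separated).
with open("<downloaded_file>") as f:
    chunks = [c.strip() for c in f.read().split("\n\n") if c.strip()]

for i, chunk in enumerate(chunks):
    # python-oracledb binds an array.array of 32-bit floats to a VECTOR column.
    vec = array.array("f", model.encode(chunk))
    cursor.execute(
        "INSERT INTO faqs (id, payload, vector) VALUES (:1, :2, :3)",
        [i, chunk, vec],
    )
conn.commit()
```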
Here, you can print any row from the notebook or connect to the database to explore the rows of the table.
Let's try a semantic search over the table data by adding the following SQL query:
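The query itself is not reproduced above; a typical Oracle Database 23ai similarity search uses the `VECTOR_DISTANCE` function and looks roughly like this (the table and column names and the bind variable are assumptions carried over from the earlier sketch, not the original post's code):

```sql
SELECT payload
FROM   faqs
ORDER  BY VECTOR_DISTANCE(vector, :query_vector, COSINE)
FETCH  FIRST 3 ROWS ONLY;
```

Here `:query_vector` is the embedding of the user's question, produced with the same model used to populate the table; the rows whose stored vectors are closest in cosine distance come back first.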
Now, we can ask any question related to our question-and-answer file about Generative AI and get a semantically relevant result.
The code above should return a result that closely resembles the one shown below:
At this point, I encourage you to experiment with different distance metrics and with various files to work with vectors and semantic search.
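Different metrics can rank the same vectors differently: cosine distance ignores magnitude, while Euclidean distance does not. The toy comparison below (made-up two-dimensional vectors) shows this divergence, which is worth keeping in mind when choosing a metric for your search.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

a = [1.0, 0.0]
b = [10.0, 0.0]   # same direction as a, much larger magnitude
c = [0.0, 1.0]    # orthogonal to a

# Cosine calls a and b identical (distance 0) even though they are
# 9 units apart; it calls a and c maximally different despite their
# small Euclidean gap.
print(cosine_distance(a, b), euclidean(a, b))  # 0.0 9.0
print(cosine_distance(a, c), euclidean(a, c))  # 1.0 ~1.414
```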
In the follow-up blog post, we will move forward and add functionality to communicate with a local LLM for developing the RAG application.