# LangChain: An Introduction

![](../images/langchain_stack.svg)
*This image is from the [LangChain official documentation](https://python.langchain.com/docs/get_started/introduction).*

:::{contents}

:::

## What is LangChain?

LangChain is an open-source framework for developers building applications with AI. It facilitates the integration of large language models (LLMs) like GPT-4 with external sources of computation and data. Here's a breakdown of LangChain's key components and functionalities:

- Integration of Large Language Models (LLMs):
  - LangChain allows developers to connect LLMs such as GPT-4 to external data sources and computation platforms.
  - This integration lets developers combine the broad knowledge and capabilities of LLMs with their own data and applications.

- Addressing Specific Information Needs:
  - While LLMs like GPT-4 possess extensive general knowledge, LangChain addresses the need for specific information from proprietary or domain-specific data sources.
  - Developers can use LangChain to connect LLMs to their own datasets, including documents, PDF files, or proprietary databases.

- Dynamic Data Referencing:
  - Instead of manually pasting snippets of text into chat prompts, developers can use LangChain's systematic ways of designing prompt templates for interacting with LLMs.
  - Developers can segment their data into smaller chunks and store them in a vector database as embeddings, enabling efficient referencing and retrieval.

- Key Components of LangChain:
  - **LLM Wrappers:** Facilitate connection to LLMs like GPT-4.
  - **Prompt Templates:** Dynamically generate prompts for LLMs based on user input.
  - **Indexes:** Extract relevant information from datasets for LLM processing.
  - **Chains:** Combine multiple components to build LLM applications following a specific task.
  - **Agents:** Enable LLMs to interact with external APIs for additional functionality.

- Pipeline for Language Model Applications:
  - User input: Initial questions or queries from users.
  - Language model interaction: Sending user input to the LLM for processing.
  - Similarity search: Matching user queries with relevant data chunks in the vector database.
  - Action or response: Providing answers or taking actions based on the combined information from the LLM and vector database.



## Install

In [None]:
# !pip install langchain
# !pip install langchain-community
# !pip install langchain-core
# !pip install -U langchain-openai
# !pip install langchain openai weaviate-client



## API Setup

:::{note}
Please check the official documentation, [Quick Start](https://platform.openai.com/docs/quickstart), for how to obtain an OpenAI API key.
:::

To save environment variables in a `.env` file and use the `dotenv` library in Python to load them, follow these steps:

### Saving Environment Variables in a `.env` File:
1. Create a new file in your project directory and name it `.env`. This file will store your environment variables.
2. Add your environment variables to the `.env` file in the format `VARIABLE_NAME=variable_value`. For example:
   ```
   OPENAI_API_KEY=your_api_key
   DATABASE_URL=your_database_url
   ```

### Using `dotenv` in Python to Load Environment Variables:
3. Install the `python-dotenv` package (imported as `dotenv`) if you haven't already. You can install it using pip:
   ```
   pip install python-dotenv
   ```

4. In your Python script, import the `dotenv` module:
   ```python
   from dotenv import load_dotenv
   ```

5. Load the environment variables from the `.env` file using the `load_dotenv()` function. Place this line at the beginning of your script:
   ```python
   load_dotenv()
   ```

6. Access the environment variables in your Python script using the `os.environ` dictionary. For example:
   ```python
   import os

   api_key = os.environ.get('OPENAI_API_KEY')
   database_url = os.environ.get('DATABASE_URL')

   print("API Key:", api_key)
   print("Database URL:", database_url)
   ```


:::{note}

- Make sure to add the `.env` file to your project's `.gitignore` file to prevent sensitive information from being exposed.
- You can also specify the path to the `.env` file if it's located in a different directory:
  ```python
  load_dotenv('/path/to/your/env/file/.env')
  ```

By following these steps, you can save environment variables in a `.env` file and use the `dotenv` library in Python to load them into your script. This approach helps keep sensitive information separate from your codebase and makes it easier to manage environment variables in your projects.
:::
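Because `os.environ.get()` silently returns `None` for a variable that is missing, it can help to fail early with a clear message. The helper below is only a minimal sketch: `require_env` and `DEMO_API_KEY` are hypothetical names introduced for illustration and are not part of `python-dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set; check your .env file.")
    return value


# Demo only: set a placeholder so the example runs without a real key.
os.environ["DEMO_API_KEY"] = "placeholder"
api_key = require_env("DEMO_API_KEY")
print("Key loaded:", bool(api_key))
```

In a real script, the call would come right after `load_dotenv()` and use the actual variable name, such as `OPENAI_API_KEY`.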

In [3]:
# Load environment variables

from dotenv import load_dotenv,find_dotenv
load_dotenv(find_dotenv())
load_dotenv('/Users/alvinchen/.env')

True

In [2]:
find_dotenv()

'/Users/alvinchen/.env'

## Basic Query

![](https://python.langchain.com/assets/images/model_io-e6fc0045b7eae0377a4ddeb90dc8cdb8.jpg)
*This image is from the [LangChain official documentation](https://python.langchain.com/docs/modules/model_io/).*

In [6]:
## initialize Chat model
import langchain
from langchain_openai import ChatOpenAI
chat = ChatOpenAI(model_name="gpt-4",temperature=0.3)



In [7]:
print(langchain.__version__)

0.1.11


:::{warning}
LangChain 0.2 introduced some major changes. The code in this notebook was run with version 0.1.11, so please check for compatibility issues if you are using a more recent version.
:::

In [4]:
## Interact with the Chat model immediately
response = chat.invoke("explain large language models in one sentence")
print(response.content,end='\n')


Large language models are machine learning algorithms trained on vast amounts of text data to understand and generate human-like text.


In [11]:
## Testing Hugging Face model
from dotenv import load_dotenv, find_dotenv
from langchain_community.llms import HuggingFaceEndpoint
import os 

## loading environment variables
load_dotenv()
HUGGINGFACEHUB_API_TOKEN = os.getenv('HUGGINGFACEHUB_API_TOKEN')


## Hugging Face model ID
repo_id = "mistralai/Mistral-7B-Instruct-v0.2"

## Initialize Hugging Face model
llm = HuggingFaceEndpoint(
    repo_id=repo_id,  temperature=0.5)

# llm = HuggingFaceHub(repo_id='tiiuae/falcon-7b-instruct', huggingfacehub_api_token=huggingfacehub_api_token)
# llm = HuggingFaceEndpoint(
#     repo_id="meta-llama/Meta-Llama-3-70B-Instruct",
#     task="text-generation",
#     max_new_tokens=512,
#     do_sample=False,
#     repetition_penalty=1.03,
# )

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /Users/alvinchen/.cache/huggingface/token
Login successful


In [13]:
llm.invoke('explain large language models in one sentence')

'\n\nA large language model is a type of artificial intelligence that uses deep learning techniques to analyze vast amounts of text data and generate human-like responses to textual inputs.'

## Messages

In LangChain, `SystemMessage`, `HumanMessage`, and `AIMessage` are classes used to represent different types of messages that can be exchanged during interactions with a language model. These distinctions help structure and contextualize conversations or workflows involving LLMs. 

- `SystemMessage`: This is used to provide context or instructions to the language model. These messages typically set the stage for how the model should behave or what role it should assume.
- `HumanMessage`: It represents input or queries from a user. These messages simulate the interaction coming from a human participant in the conversation.
- `AIMessage`: This represents the responses generated by the language model. These messages encapsulate the output provided by the AI in response to human or system messages.
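To make the three roles concrete, here is a plain-Python sketch (it does not use the LangChain classes) of how such messages might be laid out as role/content pairs before being sent to a chat model. The `Message` dataclass, `ROLE_MAP`, and `to_chat_payload` are hypothetical names introduced only for illustration; the target role names follow the OpenAI chat convention of `system`, `user`, and `assistant`.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "system", "human", or "ai"
    content: str


# Hypothetical mapping from the three message kinds to OpenAI-style chat roles.
ROLE_MAP = {"system": "system", "human": "user", "ai": "assistant"}


def to_chat_payload(messages):
    """Convert a list of Message objects into role/content dictionaries."""
    return [{"role": ROLE_MAP[m.role], "content": m.content} for m in messages]


conversation = [
    Message("system", "You are an expert data scientist"),
    Message("human", "What is overfitting?"),
    Message("ai", "Overfitting is when a model memorizes noise in the training data."),
]
payload = to_chat_payload(conversation)
print(payload[1])
```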

In [5]:
# Import the message classes used to build conversations for chat models such as GPT-3.5-turbo or GPT-4
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

In [7]:
## LLM
chat = ChatOpenAI(model_name="gpt-3.5-turbo",temperature=0.3)

## Messages
messages = [
    SystemMessage(content="You are an expert data scientist"),
    HumanMessage(content="Write a Python script that trains a neural network on simulated data ")
]

## Response
response=chat.invoke(messages)

## Print out
print(response.content,end='\n')

Sure, here is an example Python script using the popular deep learning library TensorFlow to train a simple neural network on simulated data:

```python
import numpy as np
import tensorflow as tf

# Generate simulated data
np.random.seed(0)
X = np.random.rand(100, 2)
y = np.random.randint(0, 2, 100)

# Define the neural network architecture
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, activation='relu', input_shape=(2,)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=10, batch_size=32)

# Evaluate the model
loss, accuracy = model.evaluate(X, y)
print(f'Loss: {loss}, Accuracy: {accuracy}')
```

In this script, we first generate simulated data with 2 features and binary labels. We then define a simple neural network with one hidden layer of 10 neurons and an output layer with a sigmoid activation function. We compil

In [8]:
## check the class type of `response`
print(type(response))

<class 'langchain_core.messages.ai.AIMessage'>


## Prompt Template

In LangChain, a PromptTemplate is a structured way to create and manage prompts that will be sent to a language model. It allows you to define a template with placeholders and then fill in these placeholders with actual values when generating a prompt. This approach is useful for ensuring consistency and reusability in the prompts you use for different tasks or interactions with the language model.
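The placeholder mechanism can be illustrated with Python's built-in string formatting. This is a simplified sketch of the idea, not LangChain's implementation; `placeholders` and `fill` are hypothetical helper names.

```python
from string import Formatter


def placeholders(template: str) -> list[str]:
    """List the named placeholders that appear in a format-style template."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]


def fill(template: str, **values: str) -> str:
    """Fill the template, raising a clear error if a placeholder has no value."""
    missing = [name for name in placeholders(template) if name not in values]
    if missing:
        raise KeyError(f"Missing values for: {missing}")
    return template.format(**values)


toy_template = "Explain {topic} to a {audience} in one sentence."
print(placeholders(toy_template))
print(fill(toy_template, topic="backpropagation", audience="five-year-old"))
```

Keeping the template separate from its values is what makes the same prompt reusable across many questions.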

:::{important}
- The quality of the prompts is crucial to the performance of the LLM. A good place to start is the set of [prompt examples provided by OpenAI](https://platform.openai.com/docs/examples).
- Please check the [Prompt Engineering Guide](https://www.promptingguide.ai/) for more advanced prompting techniques.
:::

In [9]:
# Import prompt and define PromptTemplate

from langchain_core.prompts import PromptTemplate



In [10]:
## Create raw template
template = """
You are a college professor with an expertise in building deep learning models. 
Please provide the answer of {question} like I am five.
"""

## Initialize Prompt Template
prompt = PromptTemplate.from_template(
    template=template,
)

In [11]:
# Run LLM with PromptTemplate
response = chat.invoke(prompt.format(question="What is backpropagation?"))

In [12]:
print(response.content,end='\n')


Backpropagation is like a teacher helping a student correct their mistakes. Imagine you're doing a math problem. First, you try to solve it yourself. Then, your teacher checks your work. If you made a mistake, your teacher doesn't just tell you the answer, but shows you where you went wrong and how to correct it. You then use this feedback to fix your mistake and get the right answer. 

In the world of computers and AI, backpropagation is a similar process. It's a way for the computer to learn from its mistakes. The computer makes a guess, checks if it's right or wrong, and then adjusts its guess based on the feedback. This process is repeated many times until the computer gets better at making the right guess.


:::{note}
**Few-shot learning** is a machine learning approach that enables models to learn and generalize from a small number of training examples. Unlike traditional methods that require large amounts of labeled data, few-shot learning aims to perform tasks and make predictions accurately with only a few examples per class. This is particularly valuable in scenarios where data collection is expensive, time-consuming, or impractical.

In the context of LLMs, providing the model with a few examples of inputs and outputs is called few-shotting. It is a simple yet powerful way to guide generation and, in some cases, drastically improve model performance. With the convenience of prompt templates, we can inject seed examples into the prompt in a systematic way.

Please see [How to use few-shot examples in chat models](https://python.langchain.com/v0.2/docs/how_to/few_shot_examples_chat/) and [How to use few shot examples](https://python.langchain.com/v0.2/docs/how_to/few_shot_examples/) for more detail.

:::
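As a rough illustration of few-shotting, the sketch below assembles an instruction, a couple of worked examples, and a new query into one prompt string. It is plain Python rather than LangChain's few-shot template classes; `build_few_shot_prompt` and the example sentences are made up for this illustration.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


examples = [
    ("The movie was wonderful.", "positive"),
    ("I want my money back.", "negative"),
]
few_shot_prompt = build_few_shot_prompt(
    "Classify the sentiment of each input as positive or negative.",
    examples,
    "The plot dragged, but the acting was superb.",
)
print(few_shot_prompt)
```

The prompt ends with an open `Output:` line, so the model is nudged to continue in the same format as the examples.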

:::{note}
Instead of directly prompting the LLM for an answer, the **Chain-of-Thought** technique encourages the model to think through a series of logical steps or intermediate thoughts. This helps in tackling complex problems that require reasoning and multi-step solutions.
:::

## Chain

In LangChain, a **Chain** is a structured sequence of operations or steps that are executed in order to achieve a specific task. Chains can be thought of as workflows that combine various components, such as prompts, models, and data processing steps, to automate complex tasks or interactions with language models.

Key Points About Chains:

- **Sequential Steps**: Chains consist of a series of steps that are executed one after another. Each step can involve different operations like generating text, processing data, or interacting with other services.

- **Modularity**: Chains allow you to break down complex tasks into smaller, manageable components. Each step in a chain can be a reusable module, making it easier to build and maintain sophisticated workflows.

- **Flexibility**: Chains can be customized to fit specific use cases. You can define the sequence of operations and specify the inputs and outputs for each step.
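The idea of composing steps with the `|` operator can be sketched in a few lines of plain Python. This toy `Step` class is an assumption made for illustration and is much simpler than LangChain's runnables; the "model" here is a stand-in function that merely upper-cases its input.

```python
class Step:
    """Wrap a function so that steps can be composed with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Run this step first, then feed its output to the next step.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Three toy steps standing in for a prompt, a model, and an output parser.
format_prompt = Step(lambda d: f"Explain {d['question']} simply.")
fake_model = Step(lambda prompt_text: {"content": prompt_text.upper()})
parse_output = Step(lambda message: message["content"])

toy_chain = format_prompt | fake_model | parse_output
print(toy_chain.invoke({"question": "gradient descent"}))
```

Each step only needs to agree with its neighbors on input and output types, which is what makes the pieces reusable.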


In [13]:
## create chain of prompt and chat
chain = prompt | chat

In [14]:
## interact with chain
response = chain.invoke({"question": "What is gradient descent?"})


In [15]:
print(response.content,end='\n')

Imagine you're playing a game where you're blindfolded and you're on top of a hill. Your goal is to reach the bottom of the hill. Now, you don't know where to go, but you can feel if you're moving up or down by taking small steps. You decide the best strategy is to always take a step in the direction where the hill is steepest downwards. This is exactly what gradient descent does. It's a method to find the lowest point in a valley, starting from a random point, by always taking a step in the steepest downhill direction. In the context of machine learning, the 'hill' is the error of the model, and 'reaching the bottom' means finding the best parameters for the model that make the error as small as possible.


In [27]:
from langchain_core.output_parsers import StrOutputParser

## create another chain
chain2 = prompt | chat | StrOutputParser()

In [15]:
## invoke the chain with a dictionary input
chain2.invoke({"question": "What is gradient descent?"})


'Imagine you are trying to find the bottom of a big slide in a playground. Gradient descent is like taking small steps down the slide until you reach the bottom. It helps us find the best way to adjust our deep learning model to make it work better.'

## Chaining A Series of Prompts

In [33]:
# Import LLMChain and define chain with language model and prompt as arguments.

from langchain.chains import LLMChain
chain = LLMChain(llm=chat, prompt=prompt)

# Run the chain only specifying the input variable.
print(chain.invoke({"question": "what is derivative?"}))

{'question': 'what is derivative?', 'text': "A derivative is like a speedometer in a car. When you're driving, the speedometer tells you how fast you're going at any given moment. Similarly, in math, a derivative tells you how fast something is changing at any given point. For example, if you're looking at a graph of a hill, the derivative would tell you how steep the hill is at any point."}


In [34]:
# Define a second prompt 

second_prompt = PromptTemplate(
    input_variables=["prev_ans"],
    template="Translate the answer description of {prev_ans} in traditional Chinese",
)

## Chain
chain_two = LLMChain(llm=chat, 
                     prompt=second_prompt)

In [38]:
# Define a sequential chain using the two chains above: the second chain takes the output of the first chain as input

from langchain.chains import SimpleSequentialChain

overall_chain = SimpleSequentialChain(chains=[chain, chain_two], verbose=True)

# Run the chain specifying only the input variable for the first chain.
explanation = overall_chain.invoke({"input":"what is derivative?"})
print(explanation)



> Entering new SimpleSequentialChain chain...
A derivative is like a speedometer in a car. It tells you how fast you're going and in what direction. If you're driving straight, the derivative is just your speed. But if you're going up a hill or down a hill, the derivative also tells you how steep the hill is. So, in a way, it's like a super speedometer that tells you more than just your speed, but also how your speed is changing.
導數就像汽車上的速度計。它告訴你你正在以多快的速度前進以及前進的方向。如果你正在直行，導數就是你的速度。但如果你正在上坡或下坡，導數還會告訴你山坡有多陡。所以，從某種程度上說，它就像一個超級速度計，不僅告訴你你的速度，還告訴你你的速度是如何變化的。

> Finished chain.
{'input': 'what is derivative?', 'output': '導數就像汽車上的速度計。它告訴你你正在以多快的速度前進以及前進的方向。如果你正在直行，導數就是你的速度。但如果你正在上坡或下坡，導數還會告訴你山坡有多陡。所以，從某種程度上說，它就像一個超級速度計，不僅告訴你你的速度，還告訴你你的速度是如何變化的。'}


In [42]:
print(explanation['output'])

導數就像汽車上的速度計。它告訴你你正在以多快的速度前進以及前進的方向。如果你正在直行，導數就是你的速度。但如果你正在上坡或下坡，導數還會告訴你山坡有多陡。所以，從某種程度上說，它就像一個超級速度計，不僅告訴你你的速度，還告訴你你的速度是如何變化的。


## Text Splitting

In [51]:
# Import utility for splitting up texts and split up the explanation given above into document chunks

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 50,
    chunk_overlap  = 20,
)

texts = text_splitter.create_documents([explanation['output']])


In [52]:
# Individual text chunks can be accessed with "page_content"

print(texts[0].page_content)
print(texts[1].page_content)
print(texts[2].page_content)

導數就像汽車上的速度計。它告訴你你正在以多快的速度前進以及前進的方向。如果你正在直行，導數就是你的速
進的方向。如果你正在直行，導數就是你的速度。但如果你正在上坡或下坡，導數還會告訴你山坡有多陡。所以，
或下坡，導數還會告訴你山坡有多陡。所以，從某種程度上說，它就像一個超級速度計，不僅告訴你你的速度，還
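The overlap visible in the chunks above (each chunk repeats the tail of the previous one) can be reproduced with a simple sliding window. The function below is a simplified sketch with a hypothetical name; `RecursiveCharacterTextSplitter` additionally tries to break at separators such as newlines and spaces, so its chunk boundaries can differ from a fixed window.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size character windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    stride = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
toy_chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(toy_chunks)
```

The overlap keeps a sentence that straddles a boundary at least partly intact in both neighboring chunks, at the cost of storing some text twice.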


## Retrieval-Augmented Generation

![](../images/rag_indexing.png)
![](../images/rag_retrieval_generation.png)
*These images are from the [LangChain official documentation](https://python.langchain.com/).*


**Retrieval-Augmented Generation (RAG)** is a technique that combines information retrieval with text generation to enhance the capabilities of language models, especially for tasks requiring up-to-date or domain-specific knowledge.

RAG was first proposed by [Lewis et al. (2020)](https://arxiv.org/abs/2005.11401) as an end-to-end approach that combined a pre-trained retriever and a pre-trained generator. At that time, its main goal was to improve performance through model fine-tuning.

- **Retrieval**: The system first retrieves relevant documents or information from a large database or corpus based on a user's query or input. The aim is to gather contextually relevant data that the generation model can use to produce more accurate and informed responses.

- **Augmentation**: The retrieved information is then used to augment the input to the generation model. This ensures that the model has access to external knowledge, which can be particularly useful for questions that require specific or factual information not present in the model's training data.

- **Generation**: The language model generates a response or output based on the augmented input, which includes the original query and the retrieved documents. The result is a more accurate, contextually relevant, and informative response.
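The three stages can be sketched end to end in plain Python. Everything here is a toy stand-in chosen for illustration: `retrieve` ranks passages by shared words instead of embeddings, and `generate` is a placeholder for an LLM call that only reports the size of the prompt it received.

```python
def retrieve(query, corpus, k=1):
    """Rank passages by how many lowercase words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def augment(query, passages):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


def generate(prompt_text):
    """Stand-in for an LLM call; it only reports the size of the prompt."""
    return f"[model would answer using a prompt of {len(prompt_text)} characters]"


corpus = [
    "The course uses Python as its main coding language.",
    "Assignments are submitted via Moodle.",
]
question = "Which coding language does the course use?"
rag_prompt = augment(question, retrieve(question, corpus))
print(rag_prompt)
print(generate(rag_prompt))
```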

The goal of RAG is to enable LLMs to generate more accurate and contextual responses while minimizing factual inaccuracies, known as **hallucinations**, especially when prompted with queries that require knowledge beyond their pre-trained data. 

Unlike traditional **fine-tuning** methods, which are computationally expensive and less adaptable to evolving information, RAG combines a generative model with a retriever module. 

This approach allows the model to access non-parametric knowledge stored in an external knowledge source, such as a **vector database**, which can be updated more easily. 

In essence, RAG operates similarly to an **open-book exam for humans**, where reference materials can be used to supplement reasoning skills.

- Five important components for RAG:
    - Load: DocumentLoaders
    - Split: TextSplitters 
    - Store: VectorStore and Embeddings model.
    - Retrieve: Retriever
    - Generate: A ChatModel/LLM produces an answer using a prompt that includes the question and the retrieved data

:::{note}
Vector stores are specialized databases designed to store and retrieve high-dimensional vectors, which are commonly used in machine learning and data science for tasks like nearest neighbor search, similarity search, and clustering. Different vector stores offer varying features, optimizations, and capabilities. 
:::
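At its core, a similarity search compares a query vector against stored vectors and returns the closest ones. The sketch below does this by brute force with cosine similarity; the three-dimensional "embeddings" are made up for the example, and real vector stores rely on much longer vectors and optimized indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vector, store, k=1):
    """Return the texts of the k stored (text, vector) items most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Made-up toy vectors; an embedding model would normally produce these.
toy_store = [
    ("reference books", [0.9, 0.1, 0.0]),
    ("assignment deadlines", [0.1, 0.9, 0.1]),
    ("python setup", [0.0, 0.2, 0.9]),
]
print(top_k([0.8, 0.2, 0.1], toy_store, k=1))
```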

In [56]:
# ## Load documents from PDF
# from langchain_community.document_loaders import PyPDFLoader

# loader = PyPDFLoader("../../../../ENC2045_demo_data/ENC2045Syllabus.pdf")
# pages = loader.load()
# pages


In [76]:
## Load documents from webpages 
import bs4
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://alvinntnu.github.io/NTNU_ENC2045_LECTURES/intro.html")
pages = loader.load()


In [77]:
pages

[Document(page_content='\n\n\n\n\n\nENC2045 Computational Linguistics — ENC2045 Computational Linguistics\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nSkip to main content\n\n\n\n\n\n\n\n\n\n\nCtrl+K\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nINTRODUCTION\n\nNatural Language Processing: A Primer\nNLP Pipeline\n\nPreprocessing\n\nText Preprocessing\nText Normalization\nText Tokenization\nText Enrichment\nChinese Word Segmentation\nGoogle Colab\n\n\n\nText Vectorization\n\nText Vectorization Using Traditional Methods\n\nMachine Learning Basics\n\nMachine Learning: Overview\nMachine Learning: A Simple Example\nClassification Models\n\nMachine-Learning NLP\n\nCommon NLP Tasks\nSentiment Analysis Using Bag-of-Words\nEmsemble Learning\nTopic Modeling: A Naive Example\n\nDeep Learning NLP\n\nNeural Network From Scratch\nDeep Learning: A Simple Example\nDeep Learning: Sentiment Analysis\n\nNeural Language Model and Embeddings\n\nSequence Models Intuition\nNe

In [91]:
## Split and Store

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

## Initialize splitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1000, 
                                               chunk_overlap = 50)

## Initialize embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

## Split documents
documents = text_splitter.split_documents(pages)

## Vectorize documents into vector store
vector = FAISS.from_documents(documents,  ## documents
                              embeddings) ## embedding model


In [92]:
documents

[Document(page_content='ENC2045 Computational Linguistics — ENC2045 Computational Linguistics\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nSkip to main content\n\n\n\n\n\n\n\n\n\n\nCtrl+K\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nINTRODUCTION\n\nNatural Language Processing: A Primer\nNLP Pipeline\n\nPreprocessing\n\nText Preprocessing\nText Normalization\nText Tokenization\nText Enrichment\nChinese Word Segmentation\nGoogle Colab\n\n\n\nText Vectorization\n\nText Vectorization Using Traditional Methods\n\nMachine Learning Basics\n\nMachine Learning: Overview\nMachine Learning: A Simple Example\nClassification Models\n\nMachine-Learning NLP\n\nCommon NLP Tasks\nSentiment Analysis Using Bag-of-Words\nEmsemble Learning\nTopic Modeling: A Naive Example\n\nDeep Learning NLP\n\nNeural Network From Scratch\nDeep Learning: A Simple Example\nDeep Learning: Sentiment Analysis\n\nNeural Language Model and Embeddings\n\nSequence Models Intuition\nNeural Languag

In [93]:
## Given a query, find relevant documents from vector store
docs = vector.similarity_search("What are the important reference books?", k=1)


In [94]:
## print out
for doc in docs:
    print(str(doc.metadata["title"]) + ":", doc.page_content[:500])

ENC2045 Computational Linguistics — ENC2045 Computational Linguistics: 5
Steven Bird, Ewan Klein, and Edward Loper. Natural language processing with Python: analyzing text with the natural language toolkit (See https://www.nltk.org/book/). " O'Reilly Media, Inc.", 2009.

6
Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, and Harshit Surana. Practical Natural Language Processing: A Comprehensive Guide to Building Real-World NLP Systems. O'Reilly Media, 2020.

7
Jacob Perkins. Python 3 text processing with NLTK 3 cookbook. Packt Publishing Ltd, 2014.

8
Bhargav Srin


### RAG Using Chain

- `create_stuff_documents_chain()`: This chain takes a list of documents, formats them all into a single prompt, and passes that prompt to an LLM. Because it passes ALL the documents, you should make sure the prompt fits within the context window of the LLM you are using.
- `create_retrieval_chain()`: This chain takes in a user query and passes it to the retriever to fetch relevant documents. Those documents and the original input are then passed to a document chain (here, the one created by `create_stuff_documents_chain()`), which prompts an LLM to generate a response.
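The "stuffing" strategy, and the reason the context window matters, can be sketched as follows. This is plain Python, not the LangChain function; `stuff_documents` and `max_chars` are hypothetical names, and a character count is only a crude proxy for the token limit of a real model.

```python
def stuff_documents(docs, question, max_chars=2000):
    """Concatenate all documents into one prompt and refuse if it exceeds a size budget."""
    context = "\n\n".join(docs)
    prompt_text = (
        "Answer the following question based only on the provided context:\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
    )
    if len(prompt_text) > max_chars:
        raise ValueError(
            f"Prompt has {len(prompt_text)} characters, over the budget of {max_chars}"
        )
    return prompt_text


toy_docs = ["Chunk one about course topics.", "Chunk two about assignments."]
stuffed = stuff_documents(toy_docs, "What does the course cover?")
print(stuffed)
```

When the budget is exceeded, the usual remedy is to retrieve fewer or smaller chunks rather than to pass every document.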

In [95]:
documents[:7]

[Document(page_content='ENC2045 Computational Linguistics — ENC2045 Computational Linguistics\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nSkip to main content\n\n\n\n\n\n\n\n\n\n\nCtrl+K\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nINTRODUCTION\n\nNatural Language Processing: A Primer\nNLP Pipeline\n\nPreprocessing\n\nText Preprocessing\nText Normalization\nText Tokenization\nText Enrichment\nChinese Word Segmentation\nGoogle Colab\n\n\n\nText Vectorization\n\nText Vectorization Using Traditional Methods\n\nMachine Learning Basics\n\nMachine Learning: Overview\nMachine Learning: A Simple Example\nClassification Models\n\nMachine-Learning NLP\n\nCommon NLP Tasks\nSentiment Analysis Using Bag-of-Words\nEmsemble Learning\nTopic Modeling: A Naive Example\n\nDeep Learning NLP\n\nNeural Network From Scratch\nDeep Learning: A Simple Example\nDeep Learning: Sentiment Analysis\n\nNeural Language Model and Embeddings\n\nSequence Models Intuition\nNeural Languag

In [98]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.documents import Document

## define prompt template
prompt = PromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")


## create chain
document_chain = create_stuff_documents_chain(chat, prompt)

## When invoking the chain, pass the documents as `context`
document_chain.invoke({
    "input": "What can students learn from the course, ENC2045?",
    "context": documents
})

'The ENC2045 course, Computational Linguistics, focuses on computational text analytics, leveraging computational tools, techniques, and algorithms to process and understand natural language data. The course covers a range of topics including Natural Language Processing, Text Normalization, Text Tokenization, Parsing and Chunking, Chinese Processing, Machine Learning Basics, Feature Engineering and Text Vectorization, Traditional Machine Learning, Classification Models, Sentiment Analysis, Text Clustering and Topic Modeling, Ensemble Learning, Deep Learning NLP, Neural Language Model, Sequence Models, Sequence-to-sequence Model, Attention-based Models, Explainable Artificial Intelligence and Computational Linguistics, Large Language Models, Generative AI, Applications of LLM, Transfer Learning and Fine-Tuning, Prompt Engineering, Retrieval-Augmented Generation, and Multimodal Data Processing. The main coding language used in this course is Python.'

In [100]:
from langchain.chains import create_retrieval_chain

retriever = vector.as_retriever()

## Specific setting for retriever
# retriever = vector.as_retriever(
#     search_type="similarity_score_threshold", 
#     search_kwargs={"score_threshold": 0.3, "k":3})

retrieval_chain = create_retrieval_chain(retriever, document_chain)

In [102]:
response = retrieval_chain.invoke({"input": "What are the topics for the course ENC2045?"})
print(response["answer"])

The topics for the course ENC2045 include Natural Language Processing, Text Preprocessing, Text Normalization, Text Tokenization, Text Enrichment, Chinese Word Segmentation, Text Vectorization, Machine Learning Basics, Machine-Learning NLP, Deep Learning NLP, Neural Language Model and Embeddings, and Sequence Models, Attention, Transformers. The course also includes a section on Python Setup.


## Chat History Management

- In addition to retrieving external documents as context, the LLM also needs to take the conversation history into account to give more precise answers.
- `create_history_aware_retriever()`: This chain takes in conversation history and then uses that to generate a search query which is passed to the underlying retriever.
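To see what the model is given when it rewrites a follow-up question into a search query, the sketch below lays out the history, the new input, and the rewriting instruction as simple (role, text) pairs. It is a plain-Python illustration under that assumption, with a hypothetical `build_query_prompt` helper, and it makes no LLM call.

```python
def build_query_prompt(chat_history, user_input):
    """Lay out history, the new input, and the rewriting instruction as (role, text) pairs."""
    instruction = (
        "Given the above conversation, generate a search query to look up "
        "in order to get information relevant to the conversation"
    )
    return list(chat_history) + [("user", user_input), ("user", instruction)]


toy_history = [
    ("user", "How many assignments do students need to do?"),
    ("assistant", "Four."),
]
query_messages = build_query_prompt(toy_history, "What are the four assignments?")
for role, text in query_messages:
    print(f"{role}: {text}")
```

Without the history, a follow-up such as "What are the four assignments?" would reach the retriever with no indication of which course or conversation it refers to.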

In [103]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder, ChatPromptTemplate

# First we need a prompt that we can pass into an LLM to generate this search query

prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation")
])


history_chain = create_history_aware_retriever(chat, retriever, prompt)

In [104]:
chat_history = [HumanMessage(content="How many assignments do students need to do?"), 
                AIMessage(content="Four.")]

history_chain.invoke({
    "chat_history": chat_history,
    "input": "What are the four assignments?"
})

[Document(page_content='Coding Assignments Reminders#\n\nYou have to submit your assignments via Moodle.\nPlease name your files in the following format: Assignment-X-NAME.ipynb and Assignment-X-NAME.html.\nPlease always submit both the Jupyter notebook file and its HTML version.\nOn the assignment due date, students will present their solutions to the exercises, with each presentation limited to 20-30 minutes.\nThe class will be divided into small groups for these exercise/assignment presentations, with the group size determined by the final enrollment. (Each group is expected to make two presentations throughout the semester.)\n\n\nAttention\nUnless otherwise specified in class, all assignments will be due on the date/time given on Moodle. The due is usually a week after the topic is finished. Late work within 7 calendar days of the original due date will be accepted by the instructor at the instructor’s discretion. After that, no late work will be accepted.\n\n\n\nAnnotated Bibliogr

In [105]:
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the user's questions based on the below context:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])

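## document chain: answers the question from the retrieved context and the chat history
document_chain = create_stuff_documents_chain(chat, prompt)

## combine the history-aware retriever with the document chain
retrieval_chain = create_retrieval_chain(history_chain, document_chain)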

In [106]:
retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "Are you saying four?"
})

{'chat_history': [HumanMessage(content='How many assignments do students need to do?'),
  AIMessage(content='Four.')],
 'input': 'Are you saying four?',
 'context': [Document(page_content='Coding Assignments Reminders#\n\nYou have to submit your assignments via Moodle.\nPlease name your files in the following format: Assignment-X-NAME.ipynb and Assignment-X-NAME.html.\nPlease always submit both the Jupyter notebook file and its HTML version.\nOn the assignment due date, students will present their solutions to the exercises, with each presentation limited to 20-30 minutes.\nThe class will be divided into small groups for these exercise/assignment presentations, with the group size determined by the final enrollment. (Each group is expected to make two presentations throughout the semester.)\n\n\nAttention\nUnless otherwise specified in class, all assignments will be due on the date/time given on Moodle. The due is usually a week after the topic is finished. Late work within 7 calendar 

In [107]:
chat_history = [HumanMessage(content="How many assignments do students need to do?"), 
                AIMessage(content="Four."),
                HumanMessage(content='I am telling you that these four assignments include coding, reviewing, testing, and presentation.'),
                AIMessage(content='Thank you for the information.')]

In [108]:
retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "Can you repeat the four assignments?"
})

{'chat_history': [HumanMessage(content='How many assignments do students need to do?'),
  AIMessage(content='Four.'),
  HumanMessage(content='I am telling you that these four assignments include coding, reviewing, testing, and presentation.'),
  AIMessage(content='Thank you for the information.')],
 'input': 'Can you repeat the four assignments?',
 'context': [Document(page_content='Coding Assignments Reminders#\n\nYou have to submit your assignments via Moodle.\nPlease name your files in the following format: Assignment-X-NAME.ipynb and Assignment-X-NAME.html.\nPlease always submit both the Jupyter notebook file and its HTML version.\nOn the assignment due date, students will present their solutions to the exercises, with each presentation limited to 20-30 minutes.\nThe class will be divided into small groups for these exercise/assignment presentations, with the group size determined by the final enrollment. (Each group is expected to make two presentations throughout the semester.)\n

## Memory

- The `Memory` module is still under active development.
- To work with memory, we use the legacy chain `langchain.chains.LLMChain()`, whose compatibility with the [LCEL](https://python.langchain.com/docs/expression_language) framework is still being worked on.

In [109]:
from langchain_openai import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory


# Notice that "chat_history" is present in the prompt template
template = """You are a nice college professor having a conversation with a student.

Previous conversation:
{chat_history}

New student's question: {question}
Response:"""

prompt = PromptTemplate.from_template(template)
# Notice that `memory_key` must match the `{chat_history}` variable in the template
memory = ConversationBufferMemory(memory_key="chat_history", k = 1)
conversation = LLMChain(
    llm=chat,
    prompt=prompt,
    verbose=True, ## see the original prompts
    memory=memory
)



In [110]:
conversation.invoke("what is your name?")



> Entering new LLMChain chain...
Prompt after formatting:
You are a nice college professor having a conversation with a student.

Previous conversation:


New student's question: what is your name?
Response:

> Finished chain.


{'question': 'what is your name?',
 'chat_history': '',
 'text': 'My name is Professor Johnson. How can I assist you today?'}

In [111]:
memory.chat_memory.add_user_message("I think your name is Alvin Chen, right?")
memory.chat_memory.add_ai_message("Yes. My name is Alvin Chen.")

In [112]:
conversation.invoke("So what is your name really?")



> Entering new LLMChain chain...
Prompt after formatting:
You are a nice college professor having a conversation with a student.

Previous conversation:
Human: what is your name?
AI: My name is Professor Johnson. How can I assist you today?
Human: I think your name is Alvin Chen, right?
AI: Yes. My name is Alvin Chen.

New student's question: So what is your name really?
Response:

> Finished chain.


{'question': 'So what is your name really?',
 'chat_history': 'Human: what is your name?\nAI: My name is Professor Johnson. How can I assist you today?\nHuman: I think your name is Alvin Chen, right?\nAI: Yes. My name is Alvin Chen.',
 'text': 'My real name is Alvin Chen.'}
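
The two calls above suggest a simple loop: before each call the stored messages are rendered into `{chat_history}`, and after the call the new question and answer are appended. The following plain-Python sketch imitates that loop under stated assumptions. `BufferMemory`, `fake_llm`, and `invoke` are hypothetical names, and the canned answer replaces a real model call; it is not how `LLMChain` is actually implemented.

```python
TEMPLATE = """You are a nice college professor having a conversation with a student.

Previous conversation:
{chat_history}

New student's question: {question}
Response:"""


class BufferMemory:
    """Hypothetical, simplified stand-in for a conversation buffer."""

    def __init__(self):
        self.messages = []  # list of (speaker, text) tuples

    def add_user_message(self, text):
        self.messages.append(("Human", text))

    def add_ai_message(self, text):
        self.messages.append(("AI", text))

    def as_text(self):
        """Render the whole history as 'Speaker: text' lines."""
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.messages)


def fake_llm(prompt):
    """Placeholder for the model call; always returns a canned answer."""
    return "My real name is Alvin Chen."


def invoke(memory, question, llm=fake_llm):
    history_text = memory.as_text()                  # 1. load the history
    prompt = TEMPLATE.format(chat_history=history_text, question=question)
    answer = llm(prompt)                             # 2. call the (fake) model
    memory.add_user_message(question)                # 3. save the new exchange
    memory.add_ai_message(answer)
    return {"question": question, "chat_history": history_text, "text": answer}


memory = BufferMemory()
memory.add_user_message("I think your name is Alvin Chen, right?")
memory.add_ai_message("Yes. My name is Alvin Chen.")
result = invoke(memory, "So what is your name really?")
print(result["chat_history"])
print(len(memory.messages))
```

Note that the returned `chat_history` reflects the state before the call, while the memory itself already contains the new exchange afterwards.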

In [113]:
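## The full history is returned: ConversationBufferMemory keeps every message, so `k` has no effect here.
## (A window size `k` is a parameter of ConversationBufferWindowMemory instead.)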
memory.load_memory_variables({})

{'chat_history': 'Human: what is your name?\nAI: My name is Professor Johnson. How can I assist you today?\nHuman: I think your name is Alvin Chen, right?\nAI: Yes. My name is Alvin Chen.\nHuman: So what is your name really?\nAI: My real name is Alvin Chen.'}
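
For comparison, this is what a window of `k = 1` would be expected to keep: only the most recent Human/AI exchange. The sketch below is plain Python and only illustrates the idea; `last_k_exchanges` is a hypothetical helper, and it assumes the messages strictly alternate between Human and AI.

```python
def last_k_exchanges(messages, k):
    """Return the messages of the most recent k Human/AI exchanges.

    Assumes messages strictly alternate: Human, AI, Human, AI, ...
    """
    if k <= 0:
        return []
    return messages[-2 * k:]


messages = [
    ("Human", "what is your name?"),
    ("AI", "My name is Professor Johnson. How can I assist you today?"),
    ("Human", "I think your name is Alvin Chen, right?"),
    ("AI", "Yes. My name is Alvin Chen."),
    ("Human", "So what is your name really?"),
    ("AI", "My real name is Alvin Chen."),
]

windowed = last_k_exchanges(messages, k=1)
print("\n".join(f"{speaker}: {text}" for speaker, text in windowed))
# Human: So what is your name really?
# AI: My real name is Alvin Chen.
```

A window like this keeps the prompt short, at the cost of forgetting earlier turns such as the first question about the name.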

## References

- [Langchain Crash Course for Beginners](https://youtu.be/lG7Uxts9SXs?si=07gr6zeB9tDkHjGm)
- [Langchain Documentation](https://python.langchain.com/docs/get_started/introduction)
- [Langchain Quickstart](https://python.langchain.com/docs/get_started/quickstart)