Generative AI - My Findings
References
Langchain useful links
TIP
Watch this video on LCEL before starting with Langchain
📚 Cheatsheet
Glossary
General
- AI > Machine Learning > Deep Learning > Generative AI
Prompting
- Zero-shot prompting: No examples are given before the prompt; the model must answer from the instruction alone.
- One-shot prompting: A single example is given before the actual prompt.
- Few-shot prompting: Several examples are given before the actual prompt (see the sketch below).
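For few-shot prompting, LangChain has a `FewShotPromptTemplate`. A minimal sketch (the word/antonym examples are just made up for illustration):

```py
from langchain.prompts import FewShotPromptTemplate, PromptTemplate

# Each example is a dict whose keys match the example prompt's variables
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template="Word: {word}\nAntonym: {antonym}",
)

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input.",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
)

print(few_shot_prompt.format(input="big"))  # Prefix, the two examples, then the new word
```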
Basics
```py
# Imports
from langchain_community.chat_models.ollama import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
llm = ChatOllama(model="llama2") # Create language model instance
prompt = ChatPromptTemplate.from_template("Tell me nice story on {topic}") # Create prompt template
chain = prompt | llm | StrOutputParser() # Create chain
response = chain.invoke({"topic": "Space travel"}) # Invoke chain
print(response) # Print response
```
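Here the `|` (pipe) operator is LCEL composition: the prompt template turns the input dict into chat messages, the model generates a reply, and `StrOutputParser` pulls the plain text out of that message, so `response` is just a string.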
Prompt Templates
INFO
You can also use the prompt pipelining feature (`PipelinePromptTemplate`), but at the moment I'm not aware of its use cases; a minimal sketch is included at the end of this section.
```py
from langchain.prompts import PromptTemplate, ChatPromptTemplate, ChatMessagePromptTemplate, HumanMessagePromptTemplate
from langchain_core.messages import SystemMessage

# `PromptTemplate`
prompt = PromptTemplate(
    input_variables=["country"],
    template="What is the capital of {country}?"
)

# `PromptTemplate.from_template`
prompt = PromptTemplate.from_template(
    "What is the capital of {country}?"
)

# `ChatPromptTemplate`
# 1. 2-tuple representation of (type, content)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world-class technical documentation writer."),
    ("user", "{input}")
])

# 2. Using instances of `MessagePromptTemplate` or `BaseMessage`
prompt = ChatPromptTemplate.from_messages([
    SystemMessage(content="You are a world-class technical documentation writer."),
    HumanMessagePromptTemplate.from_template("{input}"),
])

# You can invoke the prompt with the following
chain = prompt | llm
chain.invoke({"input": "how can langsmith help with testing?"})

# ---

# `ChatMessagePromptTemplate` - Create a `ChatMessage` with a custom role from a template
prompt = "May the {subject} be with you"
chat_message_prompt = ChatMessagePromptTemplate.from_template(
    role="Jedi", template=prompt
)
chat_message_prompt.format(subject="force")  # Returns an instance of `ChatMessage`
```
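As for the pipelining feature mentioned in the note above, here's a minimal sketch of `PipelinePromptTemplate` composing smaller prompts into one final prompt (the prompt texts and variable names are made up):

```py
from langchain.prompts import PromptTemplate
from langchain.prompts.pipeline import PipelinePromptTemplate

full_prompt = PromptTemplate.from_template("{intro}\n\n{task}")
intro_prompt = PromptTemplate.from_template("You are impersonating {person}.")
task_prompt = PromptTemplate.from_template("Answer this question: {question}")

pipeline_prompt = PipelinePromptTemplate(
    final_prompt=full_prompt,
    pipeline_prompts=[("intro", intro_prompt), ("task", task_prompt)],
)

# Each named sub-prompt is formatted first, then injected into the final prompt
print(pipeline_prompt.format(person="Yoda", question="What is the Force?"))
```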
Partial Prompt Templates
```py
# 1. With strings
prompt = PromptTemplate.from_template("You are expert in {lang}. Help me with {task}.")
partial_prompt = prompt.partial(lang="Python")
print(partial_prompt.format(task="Writing retry decorator"))  # "You are expert in Python. Help me with Writing retry decorator."
print(partial_prompt.format(task="Debug below query..."))  # "You are expert in Python. Help me with Debug below query..."

# 2. With strings + `partial_variables` parameter
prompt = PromptTemplate(
    template="You are expert in {lang}. Help me with {task}.",
    input_variables=["task"],
    partial_variables={"lang": "Python"}
)
print(prompt.format(task="Writing retry decorator"))  # "You are expert in Python. Help me with Writing retry decorator."

# 3. With functions
from datetime import datetime

def _get_datetime():
    now = datetime.now()
    return now.strftime("%m/%d/%Y, %H:%M:%S")

prompt = PromptTemplate(
    template="Tell me a {adjective} joke about the day {date}",
    input_variables=["adjective", "date"],
)
partial_prompt = prompt.partial(date=_get_datetime)
print(partial_prompt.format(adjective="funny"))  # "Tell me a funny joke about the day 10/12/2021, 14:30:00"

# 4. With functions + `partial_variables` parameter
partial_prompt = PromptTemplate(
    template="Tell me a {adjective} joke about the day {date}",
    input_variables=["adjective"],
    partial_variables={"date": _get_datetime}
)
print(partial_prompt.format(adjective="funny"))  # "Tell me a funny joke about the day 10/12/2021, 14:30:00"
```
Chain
Invocation
```py
# 1. Invoke chain to get full response
response = chain.invoke({"topic": "Space travel"})
print(response)
# 2. Stream the response chunk by chunk
for chunk in chain.stream({"topic": "Space travel"}):
    print(chunk, end="", flush=True)

# 3. Async stream the response chunk by chunk (run inside an async function / event loop)
async for chunk in chain.astream({"topic": "Space travel"}):
    print(chunk, end="", flush=True)
```
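Besides `invoke` and `stream`, runnables also expose `batch` (and `abatch`) for running several inputs in one call. A small sketch reusing the same chain:

```py
# Run the chain over multiple inputs at once
responses = chain.batch([{"topic": "Space travel"}, {"topic": "Deep sea"}])
for response in responses:
    print(response)
```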
Caching
```py
from langchain.cache import InMemoryCache, SQLiteCache
from langchain.globals import set_llm_cache

# Pick one of the two:
# 1. In-memory cache (lives only for the current process)
set_llm_cache(InMemoryCache())

# 2. SQLite cache (persisted on disk, survives restarts)
set_llm_cache(SQLiteCache(database_path=".langchain.db"))

llm.predict("Tell me a joke")  # First call hits the model, so it can take time
llm.predict("Tell me a joke")  # Cached response, so it will be faster
```
Memory
Conversation Buffer
This stores the whole conversation history in memory.
WARNING
It's not recommended for long conversations: the entire history is sent with every request, so it consumes more tokens and can get expensive.
```py
from operator import itemgetter

from langchain.memory import ConversationBufferMemory
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

memory = ConversationBufferMemory()

chain = (
    RunnablePassthrough.assign(
        history=RunnableLambda(memory.load_memory_variables) | itemgetter("history")
    )
    | prompt
    | model
)
```
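Note that this chain only reads from memory; you still have to write each turn back yourself with `save_context`. A minimal sketch of one full turn (assuming `prompt` expects `history` and `input` variables, and `model` is a chat model):

```py
inputs = {"input": "Hi, my name is Sam."}
response = chain.invoke(inputs)

# Persist the human message and the AI reply so the next turn sees them
memory.save_context(inputs, {"output": response.content})

print(memory.load_memory_variables({}))  # e.g. {'history': 'Human: Hi, my name is Sam.\nAI: ...'}
```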
Conversation Buffer Window
Only preserves the last `n` messages in memory.
```py
from langchain.memory import ConversationBufferWindowMemory

# Only the last 5 messages will be stored in memory
memory = ConversationBufferWindowMemory(k=5)

chain = (
    RunnablePassthrough.assign(
        history=RunnableLambda(memory.load_memory_variables) | itemgetter("history")
    )
    | prompt
    | model
)
```