In the previous guides, we built chatbots where the conversation history was stored as a list of tuple pairs. It is also possible to use the more flexible messages format, which is fully compatible with LLM API providers such as Hugging Face Text Generation Inference, the Llama.cpp server, and OpenAI's chat completions API.
To use this format, set the `type` parameter of `gr.Chatbot` or `gr.ChatInterface` to `"messages"`. This expects a list of dictionaries with `content` and `role` keys.

The `role` key should be `"assistant"` for the bot/LLM and `"user"` for the human.
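For example, a short conversation in this format might look like the following (a minimal sketch):

```python
history = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
]
```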
The `content` key can be one of three things:

- A string, which is rendered as markdown.
- A dictionary (conforming to `gr.FileData`) to display a file. At minimum, this dictionary should contain a `path` key corresponding to the path to the file. Full documentation of this dictionary is in the appendix of this guide.
- A Gradio component. Currently `gr.Plot`, `gr.Image`, `gr.Gallery`, `gr.Video`, and `gr.Audio` are supported.
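For instance, an assistant message that displays a file could look like this (a sketch; the path is a hypothetical local file):

```python
file_message = {
    "role": "assistant",
    # Hypothetical path; at minimum, a file content dictionary needs a "path" key
    "content": {"path": "outputs/chart.png"},
}
```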
For better type hinting and auto-completion in your IDE, you can use the `gr.ChatMessage` dataclass:

```python
from gradio import ChatMessage
def chat_function(message, history):
    history.append(ChatMessage(role="user", content=message))
    history.append(ChatMessage(role="assistant", content="Hello, how can I help you?"))
    return history
```

The following chatbot will always greet the user with "Hello!":
```python
import gradio as gr

def chat_greeter(msg, history):
    history.append({"role": "assistant", "content": "Hello!"})
    return history

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(type="messages")
    msg = gr.Textbox()
    clear = gr.ClearButton([msg, chatbot])
    msg.submit(chat_greeter, [msg, chatbot], [chatbot])

demo.launch()
```

The messages format lets us seamlessly stream responses from the Hugging Face Inference API:
```python
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")

def respond(message, history: list[dict]):
    # Append the new user message to the existing messages-format history
    messages = history + [{"role": "user", "content": message}]
    response = {"role": "assistant", "content": ""}
    for chunk in client.chat_completion(
        messages,
        max_tokens=512,
        stream=True,
        temperature=0.7,
        top_p=0.95,
    ):
        token = chunk.choices[0].delta.content
        if token:  # the final streamed chunk may carry no content
            response["content"] += token
        yield response

demo = gr.ChatInterface(respond, type="messages")

if __name__ == "__main__":
    demo.launch()
```

The full contents of the dictionary format for files are documented here:
```python
from typing import Literal, Optional, TypedDict
from typing_extensions import NotRequired

class FileDataDict(TypedDict):
    path: str  # server filepath
    url: NotRequired[Optional[str]]  # normalised server url
    size: NotRequired[Optional[int]]  # size in bytes
    orig_name: NotRequired[Optional[str]]  # original filename
    mime_type: NotRequired[Optional[str]]
    is_stream: NotRequired[bool]
    meta: dict[Literal["_type"], Literal["gradio.FileData"]]
```
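For example, a chat message whose content points at an audio file might be constructed like this (a sketch; the filepath and filename are hypothetical):

```python
file_content = {
    "path": "/tmp/sample.wav",      # hypothetical server filepath
    "orig_name": "sample.wav",      # original filename shown to the user
    "mime_type": "audio/wav",
    "meta": {"_type": "gradio.FileData"},
}

message = {"role": "assistant", "content": file_content}
```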