State-of-the-art models like GPT-4 and PaLM 2 have demonstrated the ability to perform complex tasks requiring reasoning and decision-making, pushing the boundaries of automated processes. Adding to this advancement, OpenAI’s recent API update empowers developers to define functions and parameters when prompting the ‘gpt-4’ and ‘gpt-3.5-turbo’ models, making the automation of tasks more practical.
This development paves the way for creating active bots that interact with users and execute actions based on the conversation, automating manual processes without the need for complex dependencies. In this article, we will explore these function-calling capabilities with a practical example of automating AWS S3 tasks in a conversation-based implementation.
Fundamentals
Before diving into the implementation details, it is essential to understand the underlying mechanisms of these LLMs. PaLM 2 and GPT-4 are large language models built upon the transformer architecture, a type of neural network designed for natural language processing tasks and handling long contextual inputs. Transformers can effectively capture relationships between words in natural language text and manage long-range dependencies by utilizing self-attention mechanisms that weigh the importance of different words in a sequence.
Few-shot learning is an approach employed with large language models, where the model learns a new task from just a few examples. Unlike traditional machine learning methods, LLMs do not require massive datasets of task-specific examples to learn effectively. Instead, they are initially trained on extensive corpora of text and code, enabling them to comprehend the statistical relationships between words and generalize to new tasks from only a few examples. This is what allows the model to associate user input with the most suitable function using only the given descriptions, without additional re-training.
The Value of Function Calling
OpenAI released a function-calling capability for its GPT-based models, allowing them to connect with external tools and APIs. By describing the functions and their arguments, developers can leverage the model’s ability to detect when a function needs to be called based on user input. This opens up numerous possibilities for automation, such as:
1. Virtual Assistants with External API Integration: developers can create chatbots that go beyond answering questions by integrating external APIs and performing actions such as scheduling appointments. For example, a chatbot integrated with a CRM can send personalized emails through a send_email function.
2. Natural Language Access to Data Sources: it becomes easier to query data sources as part of the chatbot conversation. For example, users can ask questions like “Who are my top customers this month?” and have the system translate the query into a call to a defined function, such as get_customers_by_revenue(start_date, end_date, limit), to receive the desired information in real time (a sketch of such a function definition appears below).
3. Automatic Task Delegation: by integrating GPT models with project management tools, users can delegate tasks to team members through natural language commands. For instance, the user says, “Assign Adam to finish the presentation draft by Friday,” and the model identifies the relevant function and arguments to execute the task.
These are just a few automation scenarios among countless possibilities. While automation was previously achievable through multiple dependencies and action chaining, the latest GPT update streamlines the process as part of the ongoing conversation.
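To make the second scenario concrete, a function definition for it could be described to the model along the following lines. The name get_customers_by_revenue comes from the example above, while the parameter details are illustrative assumptions mirroring the schema format used later in this article:

# Hypothetical schema for the data-fetching scenario; the parameter
# details are illustrative, not a real API definition.
customers_function = {
    "name": "get_customers_by_revenue",
    "description": "Return the top customers ranked by revenue within a date range",
    "parameters": {
        "type": "object",
        "properties": {
            "start_date": {"type": "string", "description": "Range start date, YYYY-MM-DD"},
            "end_date": {"type": "string", "description": "Range end date, YYYY-MM-DD"},
            "limit": {"type": "integer", "description": "Maximum number of customers to return"},
        },
        "required": ["start_date", "end_date"],
    },
}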
Practical Example: AWS S3 Integration
To demonstrate the capabilities of OpenAI’s function calling, let’s walk through a practical example involving AWS S3, a widely used cloud storage service. We will use Python code to interact with S3 and perform various operations through a conversation-driven chatbot. The user input goes through three stages: first, it is sent to the model together with the function definitions so the model can select the most relevant one; second, we extract the chosen function and its arguments from the model output and execute the intended operation; finally, the function output is passed to a summarization call that formalizes the response.
Step 1: Importing Libraries and Setting Up Credentials
Start by importing the relevant libraries and setting up the credentials needed for AWS S3 and OpenAI access.
functions_init.py
import os
import json
import boto3
import openai
from dotenv import load_dotenv

# Load the API keys and AWS credentials from a local .env file
load_dotenv()

# S3 client object
s3_client = boto3.client('s3')

# OpenAI API key
openai.api_key = os.environ.get("OPENAI_API_KEY")
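The load_dotenv call above assumes the credentials live in a local .env file next to the script. A minimal layout might look like the following; all values are placeholders, and boto3 can equally pick up AWS credentials from ~/.aws/credentials instead:

# .env (assumed layout, placeholder values)
OPENAI_API_KEY=sk-...
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_DEFAULT_REGION=eu-west-1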
Step 2: Defining Functions for S3 Operations
Next, create functions that encapsulate the different S3 operations: listing buckets, listing objects within a bucket, downloading files, uploading files, and searching for objects. These functions interact directly with the S3 API and consolidate the required parameters.
def list_buckets():
    # Return all buckets as JSON, serializing datetime fields
    response = s3_client.list_buckets()
    return json.dumps(response['Buckets'], default=datetime_converter)

def list_objects(bucket, prefix=''):
    # List objects under an optional prefix; empty list if none found
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return json.dumps(response.get('Contents', []), default=datetime_converter)

def download_file(bucket, key, directory):
    # Download the object to the target directory, keeping its base filename
    filename = os.path.basename(key)
    destination = os.path.join(directory, filename)
    s3_client.download_file(bucket, key, destination)
    return json.dumps({"status": "success"})

def upload_file(source, bucket, key, is_remote_url=False):
    # ....

def search_s3_objects(search_name, bucket=None, prefix='', exact_match=True):
    # ....
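The helpers above serialize boto3’s datetime values through a datetime_converter callback that is not shown in the listing. A minimal version, following the standard json.dumps default-hook contract, could look like this:

from datetime import datetime

def datetime_converter(obj):
    # Convert datetime objects from boto3 responses into ISO-8601 strings
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Type {type(obj)} is not JSON serializable")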
It is essential to provide a description dictionary that gives the model the details and requirements of each function and its arguments, with the following fields:
- name: the exact function name, which the model returns when the conversation triggers a function call.
- description: a string describing the action the function is expected to perform.
- parameters: a JSON Schema object listing all the input properties and whether each one is mandatory.
functions = [
{
"name": "list_buckets",
"description": "List all available S3 buckets",
"parameters": {
"type": "object",
"properties": {}
}
},
{
"name": "list_objects",
"description": "List the objects or files inside a given S3 bucket",
"parameters": {
"type": "object",
"properties": {
"bucket": {"type": "string", "description": "The name of the S3 bucket"},
"prefix": {"type": "string", "description": "The folder path in the S3 bucket"},
},
"required": ["bucket"],
},
}, # ... the rest in the git repo.
]
Step 3: Implementing the Conversation Flow
In this step we will design a conversation flow using OpenAI’s chat completions API, where users engage with the chatbot to execute various S3 operations. As users enter commands or ask questions, the chatbot identifies when a suitable function should be called and generates JSON objects conforming to the specified function signatures.
The chat completion function:
def chat_completion_request(messages, functions=None, function_call='auto',
                            model_name='gpt-3.5-turbo-0613'):
    # Include the function schemas only when they are provided
    if functions is not None:
        return openai.ChatCompletion.create(
            model=model_name,
            messages=messages,
            functions=functions,
            function_call=function_call)
    return openai.ChatCompletion.create(
        model=model_name,
        messages=messages)
The conversation logic starts by sending the user’s input and the function definitions to ChatGPT; the model then either selects the most suitable function to return or asks the user for extra clarification. When the model returns a function and its arguments, we look up the matching Python implementation in a dictionary keyed by function name and execute it.
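The available_functions dictionary used below simply maps each schema name to the corresponding S3 helper from Step 2:

# Dispatch table: schema names mapped to their Python implementations
available_functions = {
    "list_buckets": list_buckets,
    "list_objects": list_objects,
    "download_file": download_file,
    "upload_file": upload_file,
    "search_s3_objects": search_s3_objects,
}

With that in place, the full conversation handler looks like this: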
def run_manual_conversation(user_input, topic="s3 bucket functions."):
    system_message = ("Don't make assumptions about what values to plug into functions. "
                      "Ask for clarification if a user request is ambiguous. "
                      f"If the user asks a question not related to {topic}, "
                      f"respond that your scope is {topic} only.")
    messages = [{"role": "system", "content": system_message},
                {"role": "user", "content": user_input}]
    # Call the model to get a response
    response = chat_completion_request(messages, functions=functions)
    response_message = response['choices'][0]['message']
    # Check whether GPT wants to call a function
    if response_message.get("function_call"):
        function_name = response_message['function_call']['name']
        function_args = json.loads(response_message['function_call']['arguments'])
        # Execute the matching function from the available_functions dispatch table
        function_response = available_functions[function_name](**function_args)
        # Add the function result to the conversation
        messages.append(response_message)
        messages.append({
            "role": "function",
            "name": function_name,
            "content": function_response,
        })
        # Call the model again to summarize the results
        second_response = chat_completion_request(messages)
        final_message = second_response['choices'][0]['message']['content']
    else:
        final_message = response_message['content']
    return final_message
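A quick way to sanity-check the flow end to end, assuming valid credentials and at least one existing bucket:

# Hypothetical smoke test of the conversation flow
if __name__ == "__main__":
    print(run_manual_conversation("list my s3 buckets"))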
For the full S3 bucket automation code, check this GitHub repo.
Examples of the function-calling chatbot’s output
Let’s explore the chatbot’s responses, including how it asks to clarify missing information before executing the requested action:
user: list my s3 buckets
bot: Here are your S3 buckets along with their creation dates:
1. “bakcup-ahm-test” – Creation Date: 2023-06-17
2. “elasticbeanstalk-eu-west-1” – Creation Date: 2023-03-17
3. “illinois-paper” – Creation Date: 2023-06-17

user: search for a file reflections.zip in all buckets
bot: I found the file “reflections.zip” in the following S3 buckets:
1. Bucket: “bakcup-ahm-test”, Last Modified: “2023-06-17T16:05:16+00:00”, Size: 4862255 bytes, Storage Class: STANDARD
2. Bucket: “illinois-paper”, Last Modified: “2023-06-17T12:38:24+00:00”, Size: 4862255 bytes, Storage Class: STANDARD

user: search for a file
bot: please provide the name of the file and the name of the S3 bucket?

user: what is the weather today
bot: I am only able to assist with S3 bucket functions.

user: download Reflections.zip from illinois-paper bucket to current directory
bot: I have successfully downloaded the file “Reflections.zip” from the “illinois-paper” bucket.

user: upload ./resource/Reflections.zip to bakcup-ahm-test bucket
bot: The file `Reflections.zip` is successfully uploaded to the bucket.
GPT functions help me manage my S3 bucket tasks without a third-party agent, reliably extracting the arguments from user messages and calling the relevant actions.
Evaluation and Future Directions
During the evaluation, the model successfully identified the appropriate function from the semantic meaning of the input message. It was also able to avoid responding to off-topic commands or questions. One issue I encountered was that the model occasionally populated arguments with unexpected values when the user did not provide one, despite being instructed not to do so. However, this behavior was infrequent, and in most cases ChatGPT came back with questions when a parameter was missing.
It is crucial to assess the effectiveness of the GPT models’ function calling and to mitigate any potential risks by keeping a human in the loop, at least at the beginning, to review the output actions before they are executed. Ongoing research into the performance and reliability of the function-calling feature is imperative before applying it in production. While integrating with external tools and APIs offers automation benefits, safety must remain the priority.
Author
Ahmad Albarqawi