Agent Skills: Cognitive Planning with LLMs

From "Clean the Room" to ROS Actions​

Large Language Models (LLMs) such as GPT-4 or Gemini can act as a robot's "prefrontal cortex," decomposing a high-level command like "clean the room" into a sequence of low-level, executable actions.

Prompt Engineering for Robots

We provide the LLM with the list of available skills (ROS 2 actions) and ask it to generate a plan that sequences them.

System Prompt Example

```text
You are a robot planner. You have the following skills:
- navigate_to(location)
- pick_up(object)
- put_down(location)

User Command: "Move the apple from the table to the kitchen."

Output a JSON plan.
```
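In practice, this prompt is assembled in code and sent to the model with the response constrained to JSON. Here is a minimal sketch using the OpenAI Python client; the model name `gpt-4o` and the `request_plan` helper are assumptions, and any chat-completion API with a JSON mode works the same way:

```python
import json
from openai import OpenAI  # pip install openai (>=1.0)

# System prompt from the example above: the skill list defines the planner's vocabulary.
SYSTEM_PROMPT = """You are a robot planner. You have the following skills:
- navigate_to(location)
- pick_up(object)
- put_down(location)

Output a JSON plan."""

def request_plan(command: str) -> dict:
    """Hypothetical helper: ask the LLM to plan for one natural-language command."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name; substitute your own deployment
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f'User Command: "{command}"'},
        ],
        response_format={"type": "json_object"},  # force syntactically valid JSON
    )
    return json.loads(response.choices[0].message.content)

plan = request_plan("Move the apple from the table to the kitchen.")
```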

LLM Response

```json
{
  "plan": [
    {"skill": "navigate_to", "args": ["table"]},
    {"skill": "pick_up", "args": ["apple"]},
    {"skill": "navigate_to", "args": ["kitchen"]},
    {"skill": "put_down", "args": ["kitchen_counter"]}
  ]
}
```
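The response should not be trusted blindly: models occasionally hallucinate skills or pass the wrong number of arguments. A small validation sketch against the skill list from the prompt (the arity table is an assumed convention, not part of the original prompt):

```python
# Skill registry mirroring the system prompt: skill name -> expected argument count.
SKILLS = {"navigate_to": 1, "pick_up": 1, "put_down": 1}

def validate_plan(plan: dict) -> list[dict]:
    """Return the step list if every step names a known skill with the right arity."""
    steps = plan.get("plan")
    if not isinstance(steps, list) or not steps:
        raise ValueError("plan must contain a non-empty 'plan' list")
    for i, step in enumerate(steps):
        skill, args = step.get("skill"), step.get("args", [])
        if skill not in SKILLS:
            raise ValueError(f"step {i}: unknown skill {skill!r}")
        if len(args) != SKILLS[skill]:
            raise ValueError(f"step {i}: {skill} expects {SKILLS[skill]} arg(s), got {len(args)}")
    return steps
```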

Execution Loop

The robot's "Executive Node" iterates through this JSON list, calling the corresponding ROS 2 action for each step and waiting for it to succeed before proceeding; if a step fails, the executive aborts the plan (or requests a new one).
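Below is a minimal rclpy sketch of such an executive node. The action package `robot_skills` and its `NavigateTo`, `PickUp`, and `PutDown` interfaces are hypothetical (each assumed to carry a single string `target` field in its goal); substitute the action definitions your skill servers actually expose:

```python
import rclpy
from rclpy.action import ActionClient
from rclpy.node import Node
from action_msgs.msg import GoalStatus

from robot_skills.action import NavigateTo, PickUp, PutDown  # hypothetical action package

class ExecutiveNode(Node):
    """Executes a JSON plan by dispatching each step to its ROS 2 action server."""

    def __init__(self):
        super().__init__("executive_node")
        # One (action type, client) pair per skill advertised in the system prompt.
        self._skills = {
            "navigate_to": (NavigateTo, ActionClient(self, NavigateTo, "navigate_to")),
            "pick_up": (PickUp, ActionClient(self, PickUp, "pick_up")),
            "put_down": (PutDown, ActionClient(self, PutDown, "put_down")),
        }

    def execute_plan(self, steps: list[dict]) -> bool:
        for step in steps:
            if not self._execute_step(step["skill"], step["args"]):
                self.get_logger().error(f"Step failed, aborting plan: {step}")
                return False  # a real system would replan or ask the LLM to recover
        return True

    def _execute_step(self, skill: str, args: list[str]) -> bool:
        action_type, client = self._skills[skill]
        client.wait_for_server()
        goal = action_type.Goal()
        goal.target = args[0]  # assumes each hypothetical goal has one string field
        send_future = client.send_goal_async(goal)
        rclpy.spin_until_future_complete(self, send_future)
        goal_handle = send_future.result()
        if not goal_handle.accepted:
            return False
        # Block until the action server reports a terminal status.
        result_future = goal_handle.get_result_async()
        rclpy.spin_until_future_complete(self, result_future)
        return result_future.result().status == GoalStatus.STATUS_SUCCEEDED

def main():
    rclpy.init()
    node = ExecutiveNode()
    # In the full pipeline, 'steps' comes from the validated LLM response above.
    steps = [
        {"skill": "navigate_to", "args": ["table"]},
        {"skill": "pick_up", "args": ["apple"]},
        {"skill": "navigate_to", "args": ["kitchen"]},
        {"skill": "put_down", "args": ["kitchen_counter"]},
    ]
    node.execute_plan(steps)
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```

Note that `spin_until_future_complete` keeps the sketch strictly sequential, matching the "wait for success before proceeding" loop; a production executive would typically run under an executor or behavior tree and add feedback handling, timeouts, and replanning.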