Agent Skills: Cognitive Planning with LLMs

From "Clean the Room" to ROS Actions​

Large Language Models (LLMs) such as GPT-4 or Gemini can act as a robot's "prefrontal cortex," decomposing a high-level command like "clean the room" into a sequence of low-level, executable actions.

Prompt Engineering for Robots

We provide the LLM with the list of available skills (ROS 2 actions) and ask it to generate a plan that sequences them.

System Prompt Example

```text
You are a robot planner. You have the following skills:
- navigate_to(location)
- pick_up(object)
- put_down(location)

User Command: "Move the apple from the table to the kitchen."

Output a JSON plan.
```
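In practice, this prompt is assembled in code and sent to the model with the response constrained to JSON. Here is a minimal sketch using the OpenAI Python client; the model name `gpt-4o` and the `request_plan` helper are assumptions, and any chat-completion API with a JSON mode works the same way:

```python
import json
from openai import OpenAI  # pip install openai (>=1.0)

# System prompt from the example above: the skill list defines the planner's vocabulary.
SYSTEM_PROMPT = """You are a robot planner. You have the following skills:
- navigate_to(location)
- pick_up(object)
- put_down(location)

Output a JSON plan."""

def request_plan(command: str) -> dict:
    """Hypothetical helper: ask the LLM to plan for one natural-language command."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name; substitute your own deployment
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f'User Command: "{command}"'},
        ],
        response_format={"type": "json_object"},  # force syntactically valid JSON
    )
    return json.loads(response.choices[0].message.content)

plan = request_plan("Move the apple from the table to the kitchen.")
```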

LLM Response

```json
{
  "plan": [
    {"skill": "navigate_to", "args": ["table"]},
    {"skill": "pick_up", "args": ["apple"]},
    {"skill": "navigate_to", "args": ["kitchen"]},
    {"skill": "put_down", "args": ["kitchen_counter"]}
  ]
}
```
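The response should not be trusted blindly: models occasionally hallucinate skills or pass the wrong number of arguments. A small validation sketch against the skill list from the prompt (the arity table is an assumed convention, not part of the original prompt):

```python
# Skill registry mirroring the system prompt: skill name -> expected argument count.
SKILLS = {"navigate_to": 1, "pick_up": 1, "put_down": 1}

def validate_plan(plan: dict) -> list[dict]:
    """Return the step list if every step names a known skill with the right arity."""
    steps = plan.get("plan")
    if not isinstance(steps, list) or not steps:
        raise ValueError("plan must contain a non-empty 'plan' list")
    for i, step in enumerate(steps):
        skill, args = step.get("skill"), step.get("args", [])
        if skill not in SKILLS:
            raise ValueError(f"step {i}: unknown skill {skill!r}")
        if len(args) != SKILLS[skill]:
            raise ValueError(f"step {i}: {skill} expects {SKILLS[skill]} arg(s), got {len(args)}")
    return steps
```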

Execution Loop

The robot's "Executive Node" iterates through this JSON list, calling the corresponding ROS 2 action for each step and waiting for it to succeed before proceeding; if a step fails, the executive aborts the plan (or requests a new one).
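Below is a minimal rclpy sketch of such an executive node. The action package `robot_skills` and its `NavigateTo`, `PickUp`, and `PutDown` interfaces are hypothetical (each assumed to carry a single string `target` field in its goal); substitute the action definitions your skill servers actually expose:

```python
import rclpy
from rclpy.action import ActionClient
from rclpy.node import Node
from action_msgs.msg import GoalStatus

from robot_skills.action import NavigateTo, PickUp, PutDown  # hypothetical action package

class ExecutiveNode(Node):
    """Executes a JSON plan by dispatching each step to its ROS 2 action server."""

    def __init__(self):
        super().__init__("executive_node")
        # One (action type, client) pair per skill advertised in the system prompt.
        self._skills = {
            "navigate_to": (NavigateTo, ActionClient(self, NavigateTo, "navigate_to")),
            "pick_up": (PickUp, ActionClient(self, PickUp, "pick_up")),
            "put_down": (PutDown, ActionClient(self, PutDown, "put_down")),
        }

    def execute_plan(self, steps: list[dict]) -> bool:
        for step in steps:
            if not self._execute_step(step["skill"], step["args"]):
                self.get_logger().error(f"Step failed, aborting plan: {step}")
                return False  # a real system would replan or ask the LLM to recover
        return True

    def _execute_step(self, skill: str, args: list[str]) -> bool:
        action_type, client = self._skills[skill]
        client.wait_for_server()
        goal = action_type.Goal()
        goal.target = args[0]  # assumes each hypothetical goal has one string field
        send_future = client.send_goal_async(goal)
        rclpy.spin_until_future_complete(self, send_future)
        goal_handle = send_future.result()
        if not goal_handle.accepted:
            return False
        # Block until the action server reports a terminal status.
        result_future = goal_handle.get_result_async()
        rclpy.spin_until_future_complete(self, result_future)
        return result_future.result().status == GoalStatus.STATUS_SUCCEEDED

def main():
    rclpy.init()
    node = ExecutiveNode()
    # In the full pipeline, 'steps' comes from the validated LLM response above.
    steps = [
        {"skill": "navigate_to", "args": ["table"]},
        {"skill": "pick_up", "args": ["apple"]},
        {"skill": "navigate_to", "args": ["kitchen"]},
        {"skill": "put_down", "args": ["kitchen_counter"]},
    ]
    node.execute_plan(steps)
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```

Note that `spin_until_future_complete` keeps the sketch strictly sequential, matching the "wait for success before proceeding" loop; a production executive would typically run under an executor or behavior tree and add feedback handling, timeouts, and replanning.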