Core Concept: Use GPT-4o-mini to generate instructions for the computer-use agent based on natural language queries
What We've Built:
โ Dynamic prompt generation - GPT-4o-mini creates task-specific instructions โ Dynamic model creation - Pydantic models generated on-the-fly for any data structure โ WebKit browser automation - Launches and navigates automatically
Proof of Concept:
Understand ANY query -> Generate its own instructions -> Navigate websites with computer vision -> Extract relevant data VISUALLY -> Handle different data types -> Save structured results ->
Result: successfully navigated Newegg (with all its anti-bot measures) using computer-use-preview openai api model and playwright and extracted real product data without websearch
User query : on Newegg find the price of acer laptop under 1000 dollars, list the price and website link for the specific laptop
Agent output length: 871 First 300 chars: { "found_items": [ { "title": "Acer Aspire 15 15.6" FHD Intel Core i9-13900H Laptop 16GB Memory 1TB SSD Windows 11 Home A15-51M-9386", "position": "Product page", "url": "https://www.newegg.com/p/N82E168343060376?quicklink=true", "snippet": "Lowest Price in 30 days", ...
==================================================================================================== Generated task configuration: Find Acer Laptops under $1000 on Newegg โ Research agent configured! ๐ Task: Find Acer Laptops under $1000 on Newegg ๐ Search terms: Acer laptop under 1000, Acer laptop price, buy Acer laptop ๐ Will extract: laptop_name, price, link ๐ฏ Success criteria: Found at least 1 item with partial data