OpenAI has introduced Operator, its first AI agent, designed to complete tasks for users within a web browser. It uses GPT-4o’s vision capabilities to browse and search the web. Operator can perform tasks such as making dinner reservations, filling out forms, and handling other web-based tasks.
Operator is available as a ‘research preview’ to ChatGPT Pro subscribers in the US and marks OpenAI’s first venture into AI agents. OpenAI’s blog post highlights Operator’s ability to carry out assigned tasks independently and hints that more agents may follow. Powered by the Computer-Using Agent (CUA) model, Operator combines vision and reasoning to interact with browser elements.
While Operator shows promise in its ability to ‘see’ and ‘interact’ with a browser interface, its real-world usefulness may depend on further refinement and wider availability. Operator can currently be tested on OpenAI’s website by ChatGPT Pro subscribers in the US, and it is expected to expand to other countries and subscription tiers in the future. The launch marks a notable step for AI agents and their potential to improve user experiences.