
xAI Grok Vision seen explaining a screenshot of program code


xAI has yet to release vision capabilities for its Grok large language model (LLM), but a screenshot shows the model in action, explaining program code. Grok-1.5 Vision is the company's first-generation multimodal model. It can process a range of visual information, including documents, diagrams, charts, screenshots, and photographs.

It can write code from a diagram and estimate the calories in a pictured meal. The model can build a story from a toddler's sketch and explain a meme. It can also convert a table to CSV format and offer suggestions about a scenario shown in an image. Given a screenshot of a coding problem, the model can solve it with Python code and turn the code in the image into text.


An image shared by X user @Lohansimpson shows a user asking Grok to explain the code in an uploaded image. The model identifies the code and replies with a summary, along with the commands, structure, and values used in it.

xAI Grok 1.5 Vision explaining program code (Image Source – @Lohansimpson/X)

The chatbot has a standard user interface (UI) with a prompt bar for asking questions and receiving answers. Grok is currently integrated into the social media site X for premium users, and the company only allows text conversations.

However, xAI has developed a feature for uploading files to a Grok conversation. You can upload a screenshot or image to Grok and add a caption asking for an explanation.


Although Grok 1.5 has now been released to most users, access to Vision is still limited to a few testers. Overall, processing screenshots and images via Grok Vision is a major feature and should be released soon to all X premium users.
