xAI Grok launches meme, image, screenshot, flowchart analysis feature
Generative AI firm xAI is rolling out vision capabilities to Grok users on the X social media platform, including image and screenshot analysis.
In April, xAI announced the Grok-1.5 Vision preview, its first-generation multimodal model for processing visual data, including documents, diagrams, charts, screenshots, and photographs. Yet it took months for these capabilities to reach the public.
Now the company has finally launched the Vision Suite for X Premium and Premium+ subscribers. In my testing, the feature handled most tasks well.
For example, I uploaded an image of the sign board from Silent Hill, a supernatural horror movie, and asked, “How should I get there?” The chatbot analyzed the image and replied with the place’s name and full directions.
The chatbot can also read rough diagrams, including screenshots of them, and turn them into working code. You can also ask it to explain memes.
These capabilities are part of xAI’s latest Grok 2 series models. The company has increased processing power with its new supercomputer cluster and expanded how much context a single request can handle. Image processing is new for Grok users, and it is quite useful.
Note that the current Grok version doesn’t support document file processing. For now, you also need to upload images in a supported format such as JPG or PNG. The company may add more file formats in the future.
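If you want to check a file before uploading, here is a minimal Python sketch. It assumes JPG and PNG are the only accepted formats, since those are the two named above, and it identifies them by their standard file signatures rather than by extension. The function names are my own for illustration and are not part of any Grok or X tooling.

```python
from typing import Optional

# Standard leading bytes ("magic numbers") for the two formats named in the
# article. Treating these as the full supported list is an assumption.
SIGNATURES = {
    "JPG": b"\xff\xd8\xff",
    "PNG": b"\x89PNG\r\n\x1a\n",
}


def detect_image_format(data: bytes) -> Optional[str]:
    """Return "JPG" or "PNG" if the bytes start with a known signature."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def is_supported_image(path: str) -> bool:
    """Read the first few bytes of a file and check them against SIGNATURES."""
    with open(path, "rb") as fh:
        header = fh.read(8)  # 8 bytes covers the longest signature (PNG)
    return detect_image_format(header) is not None
```

Checking the file contents rather than the extension catches cases like a renamed PDF, which would fail to upload despite having a .png name.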
How to Use
To process your image, meme, or screenshot, open the X app or web version and tap “Grok”. Then follow the steps below.
Tap the image icon on the left side of the text bar to select images. On mobile, you will see a + icon for uploading.
Once the image is uploaded, you can send it directly to Grok or add instructions to make your query about it more specific. The results appear at the bottom.