Computer Vision
Modern computer vision systems have superhuman accuracy when it comes to image recognition and analysis, but they don’t really understand what they see. At IBM Research, we’re designing AI systems with the ability to see the world like we do.
A new way to generate synthetic data for pretraining computer vision models
IBM's Task2Sim churns out synthetic images tailored for specific AI tasks to reduce the need for real data. From chatbots to spellcheckers, modern AI was built on real data.
Visual Prompting is an innovative Artificial Intelligence (AI) approach for computer vision that enables users to transform an unlabelled dataset into a powerful segmentation model, with just a few clicks. In the field of AI, prompting generally refers to the action of providing textual input to an AI model, so that the model performs a certain task. For example, we can prompt a chatbot to ...
This is partly due to the community-based engagement of AI researchers on the continent, who focus on specific research areas such as NLP, computer vision, and geospatial ML. The AI research community in Africa is leading the responsible AI efforts globally.
It is the first “any-to-any” multi-modal generative AI model for Earth observation. This means it can self-generate additional training data from other modalities, a technique IBM researchers call “Thinking-in-Modalities” (TiM) tuning. TiM is a novel approach for computer vision models, similar to chain-of-thought in language models.
IBM’s new vision-language model for enterprise AI can extract knowledge locked away in tables, charts, and other graphics, bringing enterprises closer to automating a range of document understanding tasks.
General AI for computer vision is experiencing a surge of innovation fueled by the advent of vision transformers and Large Vision Models (LVMs). At IBM Research, we extended this technology to work for enterprise visual inspection, such as for infrastructure, automotive, manufacturing lines, quality control, and other domains where defects are often small and rare. We designed new algorithms ...
Earth observation (EO) differs fundamentally from other computer vision (CV) problems. Unlike tasks such as reading credit card characters or detecting people in images, EO applications in agriculture, environmental monitoring, and disaster response have complex needs that RGB (red, green, blue) data alone cannot meet. That's why we've focused on innovations that address the unique demands of EO data, advancing the ...
To address the issue, our IBM Research team based in Zurich has developed an AI model that uses computer vision to detect tiny cracks in high-resolution images collected by drones. We’ve teamed up with the Canton of Zurich, the drone operations company Pixmap, and Dubendorf, the military airport on the outskirts of Zurich, to inspect the airport’s runway surface. The project will test out ...