Facebook’s automatic alternative text (AAT) technology, introduced in 2016, makes it possible for the visually impaired to enjoy their news feed by providing them with descriptions of photos on demand. The social networking giant has now unveiled the “next generation of AAT,” which brings multiple technological advances to improve the photo experience for blind or visually impaired users.
Facebook says it trained the latest version of AAT using weak supervision on billions of Instagram photos and their hashtags. The resulting models are said to be not just more accurate but also culturally and demographically inclusive. For example, they can now accurately identify wedding photos from around the world, based (in part) on traditional apparel.
While AAT could initially recognize just 100 concepts, the latest iteration can recognize more than 1,200, including national monuments, food types, and selfies. Facebook also says it has achieved an industry first by enabling its models to identify the position and relative size of elements in photos.
Despite all the advancements, however, Facebook admits that there is still a margin for error. It has intentionally omitted concepts that couldn’t be reliably identified and starts every description with “Maybe.”
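To make the hedging behavior concrete, here is a minimal Python sketch of how a description could be assembled from model output. It is an illustration only, not Facebook's implementation: the `Detection` class, the `describe` function, and the 0.8 confidence threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """A hypothetical model output: a concept label and its confidence score."""
    concept: str
    confidence: float  # assumed to lie in [0, 1]


def describe(detections, threshold=0.8):
    """Build a hedged description from detections.

    Concepts scoring below `threshold` (an assumed cutoff) are omitted,
    and the result always starts with "Maybe".
    """
    # Most confident concepts first, dropping anything below the cutoff.
    ranked = sorted(detections, key=lambda d: d.confidence, reverse=True)
    kept = [d.concept for d in ranked if d.confidence >= threshold]
    if not kept:
        # Fallback wording is an assumption for this sketch.
        return "Maybe a photo"
    return "Maybe " + ", ".join(kept)


example = [
    Detection("wedding", 0.95),
    Detection("cake", 0.40),
    Detection("2 people", 0.90),
]
print(describe(example))
```

In this sketch the low-confidence "cake" label is left out rather than reported with a caveat, mirroring the choice to omit concepts that cannot be identified reliably.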