Amazon And Its Robot Army Declare War On Barcodes
Amazon calls the system "multimodal" because it draws on several kinds of data about an object, such as its appearance and dimensions, to identify it. The goal is to let robots pick up and move items without having to find and scan a barcode. Amazon currently uses the technology in warehouses in Hamburg and Barcelona, where it monitors products rolling down a conveyor belt on their way to being shipped. So-called physical mismatches, in which the items on the belt don't match the ones listed in inventory, are rare, but they add up when you're operating at the scale of an Amazon warehouse.
As with any machine learning system, MMID (short for multimodal identification) needed a lot of training data to get started, and that proved to be a problem. According to Amazon, there weren't many images of products as they appear in a warehouse, so the company had to start by building an image library. Each image was paired with data about the product to help the machine learning model identify what it sees. In early experiments, MMID reached 75-85% accuracy, but it's now correct 99% of the time.
There have been some bumps along the way. For example, MMID was unable to differentiate between two colors of Amazon's Echo Dot speaker. The packaging looks almost identical, save for a colored dot that was too small a cue for the system to pick up. This led to the inclusion of a confidence score. A high score indicates a potential mismatch, which requires a human to get involved. A low score indicates the product should be allowed through.
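The confidence-score routing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Amazon's implementation: the `BeltItem` type, the `route` function, the 0-to-1 score range, and the 0.9 cutoff are all assumptions made for the example, since Amazon has not published those details here.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the value Amazon actually uses is not stated.
MISMATCH_THRESHOLD = 0.9


@dataclass
class BeltItem:
    expected_product: str   # what inventory says should be on the belt
    mismatch_score: float   # assumed range 0.0 to 1.0; higher means a likelier mismatch


def route(item: BeltItem, threshold: float = MISMATCH_THRESHOLD) -> str:
    """Send high-scoring items to a human; let everything else through."""
    if not 0.0 <= item.mismatch_score <= 1.0:
        raise ValueError("mismatch_score must be between 0 and 1")
    if item.mismatch_score >= threshold:
        return "human_review"
    return "pass"


if __name__ == "__main__":
    # Two made-up items: one confidently flagged, one not.
    for item in (BeltItem("echo-dot-a", 0.95), BeltItem("echo-dot-b", 0.12)):
        print(item.expected_product, route(item))
```

Where the cutoff sits is a trade-off: a lower threshold catches more mismatches but pulls humans in more often, while a higher one keeps the belt moving at the cost of letting some mismatches slip by.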
This is a necessary first step toward Amazon's ultimate goal, which is to allow a robot to identify objects in any situation and pick them up. Having objects roll past on a conveyor belt keeps the data nice and consistent, but the real world is much messier. Amazon is confident this technology will eventually become another factor in getting packages out the door as fast as possible.