Posted by Adam Zewe | MIT News
Anyone who has ever tried to pack a family-sized load of luggage into a sedan-sized trunk knows this is a hard problem. Robots struggle with dense packing tasks, too.
For robots, solving the packing problem requires meeting many constraints, such as stacking suitcases so they don’t tip over in the trunk, ensuring heavy items don’t rest on light items, and avoiding collisions between the robot arm and the car’s bumper.
Some traditional methods tackle this problem sequentially, guessing a partial solution that satisfies one constraint at a time and then checking whether any other constraints are violated. With a long sequence of actions to take and a pile of luggage to pack, this process can be impractically time consuming.
MIT researchers used a form of generative AI called a diffusion model to solve this problem more efficiently. Their method uses a collection of machine learning models, each trained to represent one specific type of constraint. These models are combined to generate a global solution to the packing problem by considering all constraints at once.
Their method was able to generate effective solutions faster than other techniques and produced a greater number of successful solutions in the same amount of time. Importantly, their technique can also solve problems with new combinations of constraints and larger numbers of objects that the model did not see during training.
Because of this generalizability, their technique could be used to teach robots how to understand and meet the overall constraints of a packing problem, such as the importance of avoiding collisions or the desire for one object to be next to another. Robots trained in this way could be applied to a wide range of complex tasks in a variety of environments, from order fulfillment in a warehouse to organizing bookshelves at home.
“My vision is to enable robots to perform more complex tasks that have more geometric constraints and require more continuous decisions. These are the kinds of problems that service robots face in unstructured and diverse human environments. With a powerful tool called compositional diffusion models, we can now solve these more complex problems and achieve good generalization results,” says Zhutian Yang, a graduate student in electrical engineering and computer science and lead author of the paper on this new machine learning technique.
Her co-authors include MIT graduate students Jiayuan Mao and Yilun Du; Jiajun Wu, assistant professor of computer science at Stanford University; Joshua B. Tenenbaum, professor of brain and cognitive sciences at MIT and member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); Tomás Lozano-Pérez, professor of computer science and engineering at MIT and CSAIL member; and senior author Leslie Kaelbling, the Panasonic Professor of Computer Science and Engineering at MIT and member of CSAIL. The research will be presented at the Conference on Robot Learning.
Constraint complications
Continuous constraint satisfaction problems are particularly difficult for robots. These problems appear in multi-step robotic manipulation tasks, such as packing items into a box or setting a table. They often involve satisfying several constraints at once: geometric constraints, such as avoiding collisions between the robot arm and the environment; physical constraints, such as stacking objects so they are stable; and qualitative constraints, such as placing a spoon to the right of a knife.
There may be many constraints, and they vary across problems and environments depending on the geometry of the objects and on human-specified requirements.
To efficiently solve these problems, MIT researchers developed a machine learning technique called Diffusion-CCSP. Diffusion models learn to iteratively improve their output to generate new data samples that are similar to samples in the training dataset.
To achieve this, diffusion models learn a procedure for making small improvements to a potential solution. Then, to solve a problem, they start with a random, very bad solution and gradually improve it.
For example, imagine randomly placing plates and cutlery on a simulated table, allowing them to physically overlap. The collision-free constraints between objects push them away from each other, while qualitative constraints drag the plate to the center, align the salad fork with the dinner fork, and so on.
Diffusion models are well suited to this kind of continuous constraint satisfaction problem because the influences of multiple models on the pose of one object can be composed to encourage the satisfaction of all constraints, Yang explains. By starting from a random initial guess each time, the models can obtain a diverse set of good solutions.
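The compose-and-refine idea can be illustrated with a toy Python sketch. It is not the researchers' implementation: the learned diffusion models are replaced here by two hand-written correction functions, one per constraint type (circles must not overlap, and circles must stay inside a square box). The function names, the circular objects, the step size, and the noise schedule are all assumptions made for illustration.

```python
import math
import random

def no_overlap_step(poses, radius):
    """Hypothetical stand-in for a 'collision-free' model: push overlapping circles apart."""
    steps = [[0.0, 0.0] for _ in poses]
    for i in range(len(poses)):
        for j in range(i + 1, len(poses)):
            dx = poses[i][0] - poses[j][0]
            dy = poses[i][1] - poses[j][1]
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            overlap = 2 * radius - dist
            if overlap > 0:
                push_x = 0.5 * overlap * dx / dist
                push_y = 0.5 * overlap * dy / dist
                steps[i][0] += push_x
                steps[i][1] += push_y
                steps[j][0] -= push_x
                steps[j][1] -= push_y
    return steps

def in_box_step(poses, radius, size):
    """Hypothetical stand-in for an 'inside the box' model: pull stray circles back in."""
    lo, hi = radius, size - radius
    return [[min(max(x, lo), hi) - x, min(max(y, lo), hi) - y] for x, y in poses]

def count_violations(poses, radius, size, tol=1e-3):
    """Count circles outside the box plus overlapping pairs, within a small tolerance."""
    lo, hi = radius - tol, size - radius + tol
    bad = sum(1 for x, y in poses if not (lo <= x <= hi and lo <= y <= hi))
    for i in range(len(poses)):
        for j in range(i + 1, len(poses)):
            if math.dist(poses[i], poses[j]) < 2 * radius - tol:
                bad += 1
    return bad

def solve(n=3, radius=0.5, size=4.0, iters=400, step=0.5, seed=0):
    """Start from a random layout and refine it with the summed corrections."""
    rng = random.Random(seed)
    poses = [[rng.uniform(0, size), rng.uniform(0, size)] for _ in range(n)]
    for t in range(iters):
        # Noise shrinks to zero halfway through, so the second half is pure refinement.
        sigma = 0.05 * max(0.0, 1.0 - 2.0 * t / iters)
        pushes = no_overlap_step(poses, radius)
        pulls = in_box_step(poses, radius, size)
        for p, a, b in zip(poses, pushes, pulls):
            # Composition: each constraint contributes its own correction to the same pose.
            p[0] += step * (a[0] + b[0]) + rng.gauss(0.0, sigma)
            p[1] += step * (a[1] + b[1]) + rng.gauss(0.0, sigma)
    return poses
```

Calling solve with different seed values starts from different random layouts and so tends to end at different valid arrangements, which loosely mirrors the point about obtaining a diverse set of solutions.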
Working together
For Diffusion-CCSP, the researchers wanted to capture the interconnectedness of the constraints. In packing, for example, one constraint might require that a particular object be next to another object, while a second constraint might specify where one of those objects must be located.
Diffusion-CCSP learns a family of diffusion models, one for each constraint type. Because the models are trained together, they share some knowledge, such as the geometry of the objects to be packed.
The models then work together to find solutions that jointly satisfy the constraints, in this case the locations where the objects should be placed.
“We don’t always find a solution on the first guess. But if we keep refining the solution and some violation occurs, it should lead to a better solution. You get guidance from getting something wrong,” she says.
Training separate models for each constraint type and then combining them to make predictions significantly reduces the amount of training data required compared to other approaches.
However, training these models still requires a large amount of data that demonstrate solved problems. The cost of generating such data would be prohibitive, Yang says, because humans would need to solve each problem using traditional, slow methods.
Instead, the researchers reversed the process by coming up with solutions first. They used fast algorithms to generate segmented boxes and fit a diverse set of 3D objects into each segment, ensuring tight packing, stable poses, and collision-free solutions.
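The solution-first idea can be sketched in a few lines of Python. The article does not describe the researchers' actual generator, so everything here is an assumption for illustration: the sketch works in 2D with axis-aligned rectangles, the hypothetical split_box helper cuts the longer side at a random point, and each object is simply a slightly shrunken copy of its segment. Because the segments tile the box without overlapping, the resulting layout is collision-free by construction.

```python
import random

def split_box(x, y, w, h, depth, rng):
    """Recursively cut a rectangle (x, y, w, h) into 2**depth smaller segments."""
    if depth == 0:
        return [(x, y, w, h)]
    frac = rng.uniform(0.3, 0.7)  # where to cut, as a fraction of the side
    if w >= h:  # cut across the longer side to avoid thin slivers
        w1 = w * frac
        return (split_box(x, y, w1, h, depth - 1, rng)
                + split_box(x + w1, y, w - w1, h, depth - 1, rng))
    h1 = h * frac
    return (split_box(x, y, w, h1, depth - 1, rng)
            + split_box(x, y + h1, w, h - h1, depth - 1, rng))

def make_example(width, height, depth=2, margin=0.05, seed=0):
    """Build one solved packing example: segment the box, then fit an object in each segment."""
    rng = random.Random(seed)
    objects = []
    for rx, ry, rw, rh in split_box(0.0, 0.0, width, height, depth, rng):
        # Shrink the segment slightly so the object sits strictly inside it.
        objects.append((rx + margin * rw, ry + margin * rh,
                        rw * (1 - 2 * margin), rh * (1 - 2 * margin)))
    return objects

def overlaps(a, b):
    """True if two axis-aligned rectangles (x, y, w, h) share interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah
```

Each call to make_example with a new seed yields a different box layout whose solution is already known, which is the property that makes this kind of training data cheap to produce.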
“With this process, data generation from simulations is almost instantaneous. We can create tens of thousands of environments where we know the problem is solvable,” she says.
Diffusion models trained on these data work together to determine where the robotic gripper should place each object so that the packing task is achieved while meeting all of the constraints.
The researchers conducted feasibility studies and then demonstrated Diffusion-CCSP with a real robot solving several difficult problems, including fitting 2D triangles into a box, packing 2D shapes with spatial relationship constraints, stacking 3D objects with stability constraints, and packing 3D objects with a robotic arm.
Their method outperformed other techniques in many experiments, producing a greater number of effective solutions that were both stable and collision-free.
In the future, Yang and her colleagues would like to test Diffusion-CCSP in more complex situations, such as robots that can move around a room. They also want to be able to use Diffusion-CCSP to solve problems in a variety of domains without having to retrain on new data.
“Diffusion-CCSP is a machine learning solution that builds on existing powerful generative models,” says Danfei Xu, assistant professor in the School of Interactive Computing at Georgia Institute of Technology and research scientist at NVIDIA AI, who was not involved with this work. “By composing known individual constraint models, it can quickly generate solutions that simultaneously satisfy multiple constraints. Although it is still in the early stages of development, we expect that the continued evolution of this approach will enable more efficient, safer and more reliable autonomous systems in a variety of applications.”
This research was supported by the National Science Foundation, Air Force Office of Scientific Research, Office of Naval Research, MIT-IBM Watson AI Lab, MIT Quest for Intelligence, Center for Brains, Minds, and Machines, Boston Dynamics Artificial Intelligence Institute, Stanford Institute for Human-Centered Artificial Intelligence, Analog Devices, JPMorgan Chase and Co., and Salesforce.