
Robots Learning to Cook by Watching YouTube Videos

 

In the hierarchy of things that I want robots to do for me, cooking dinner is right up there with doing the laundry and driving my car. And writing all my articles. For now, the best we can do is just watch progress being made toward getting all of these things to work reliably (and affordably). We’ve seen plenty of examples of robots that can cook, but generally, they’re all following some level of pre-programmed instructions. Telling robots what to do and how to do it is one of the trickiest things about robotics, especially for end users, so it’s a good thing we can all just sit back and let them learn things by watching videos on YouTube.
This project is taking place at the University of Maryland, and the accompanying video does a very good job of not really saying all that much over the course of two minutes.

The research we’re talking about here is from a paper titled “Robot Learning Manipulation Action Plans by ‘Watching’ Unconstrained Videos from the World Wide Web.” The paper is really about visual processing: watching a human interacting with objects in a video, and then figuring out what that human is doing and how they’re doing it, with a final step of replicating those actions using the manipulation capabilities of a robot.
The University of Michigan has a dataset called YouCook, which consists of 88 open-source, third-person YouTube cooking videos. Each video was given a set of unconstrained natural language descriptions by humans, and each video also has frame-by-frame object and action annotations. Using these data, the UMD researchers developed two convolutional neural networks: one to recognize and classify the objects in the videos, and the other to recognize and classify the grasps that the human is using.
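Neither network’s architecture is spelled out here, but to make the two-classifier setup concrete, below is a minimal sketch in PyTorch (our choice of framework, not necessarily the paper’s); the layer sizes and input resolution are placeholder assumptions, not the networks the UMD team actually trained.

import torch
import torch.nn as nn

class FrameClassifier(nn.Module):
    """A small CNN mapping a video frame to per-class scores."""
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # LazyLinear infers the flattened feature size on first use.
        self.classifier = nn.Sequential(nn.Flatten(), nn.LazyLinear(num_classes))

    def forward(self, x):
        return self.classifier(self.features(x))

# One network per task, mirroring the paper's two-classifier setup:
object_net = FrameClassifier(num_classes=48)  # "apple" through "whisk"
grasp_net = FrameClassifier(num_classes=6)    # the six grasp categories

frame = torch.randn(1, 3, 128, 128)           # stand-in for one video frame
print(object_net(frame).shape)                # torch.Size([1, 48])
print(grasp_net(frame).shape)                 # torch.Size([1, 6])

The point of the sketch is simply that the same kind of network is trained twice, once against object labels and once against grasp labels.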
While object recognition is a familiar problem, recognizing grasps is important because the robot may have different end effectors for different grasping purposes, and because different grasps can provide hints about what actions might happen next. From the paper:

The grasp contains information about the action itself, and it can be used for prediction or as a feature for recognition. It also contains information about the beginning and end of action segments, thus it can be used to segment videos in time. If we are to perform the action with a robot, knowledge about how to grasp the object is necessary so the robot can arrange its effectors. For example, consider a humanoid with one parallel gripper and one vacuum gripper. When a power grasp is desired, the robot should select the vacuum gripper for a stable grasp, but when a precision grasp is desired, the parallel gripper is a better choice.
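
As a toy illustration of that effector-selection rule, here is a short sketch; the grasp-type labels and gripper names are hypothetical stand-ins of our own, but the power-to-vacuum and precision-to-parallel mapping follows the quote above.

def select_gripper(grasp_type):
    """Pick an end effector from the recognized grasp category
    (the labels here are our own hypothetical naming scheme)."""
    if grasp_type.startswith("power"):
        return "vacuum_gripper"    # stable hold for power grasps
    if grasp_type.startswith("precision"):
        return "parallel_gripper"  # fine control for precision grasps
    raise ValueError(f"unknown grasp type: {grasp_type!r}")

print(select_gripper("power_large"))      # vacuum_gripper
print(select_gripper("precision_small"))  # parallel_gripper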

For this particular case, grasps were divided into six types: power grasps and precision grasps, each for small, large, or spherical objects. Objects, meanwhile, were divided into 48 classes, ranging from “apple” to “whisk.” On the YouCook dataset, the system demonstrated an overall recognition accuracy of 83 percent, with a 68 percent success rate at translating grasp and object combinations into commands that a robot could then execute.
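To make that label space explicit, here is a sketch that enumerates the six grasp categories and shows one hypothetical way a recognized grasp/object pair, plus an action, could be assembled into a robot command; the paper’s actual command grammar is richer than this.

from itertools import product

# Two grasp kinds crossed with three object-size classes = six categories.
GRASP_TYPES = [f"{kind}_{size}"
               for kind, size in product(("power", "precision"),
                                         ("small", "large", "spherical"))]

def to_command(grasp_type, obj, action):
    """Assemble a (hypothetical) executable command string from the
    recognized grasp, object, and action."""
    assert grasp_type in GRASP_TYPES, "not one of the six grasp categories"
    return f"grasp({obj}, {grasp_type}); {action}({obj})"

print(GRASP_TYPES)
print(to_command("precision_small", "knife", "cut"))
# grasp(knife, precision_small); cut(knife)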
In future work, the researchers would like to develop finer grasp categorizations (more than just the six based on object size and whether power or precision is required), and then use those categorizations to better predict what action is happening in the video, or (ideally) what action is probably going to come next. By which we think the researchers are saying that they’re scouring YouTube for a meal that they can sit back and watch their robots cook for them.




By:Prayag nao                                            Code to Make your Computer like Jarvis New Speech macro..>> Choose Advanced and change the code like this.. <speechMacros>   <command>     <listenFor></listenFor>   </command> </speechMacros> You have to add a commands  <listenFor>........</listenFor> - computer listens the words you specify here and respond accordingly. <speak>............</speak> - computer speaks what is written in this field according to the command which it got. Similarly, You can Edit more commands in the same way.   <speechMacros> <command> <listenFor>What's going on dude</listenFor> <speak>Nothing special tony</speak> </command> </speechMacros> This is just a basic command,If you want more advanced commands.you have to use Java Scripts and VB scripts . Tell me Time : This is d