It’s hard to look at the OpenAI Gym and not feel like you’re peeking in at a kindergarten for future robots.
Over there in one corner, they’re playing video games and stumbling through Frogger and Pac-Man. Over in the other corner, the nose pickers are staring at the screen and randomly shouting when the wind blows. Up front sits the kid who gets everything right eventually, eyes bright and focused, picking up every book in the library.
OpenAI Gym is obviously not an actual school for robots, but it’s close. Their blog’s launch post describes it as:
“a toolkit for developing and comparing reinforcement learning (RL) algorithms. It consists of a growing suite of environments (from simulated robots to Atari games), and a site for comparing and reproducing results.”
Looking through some of the simulations, you can see the output of specialized learning algorithms in a bunch of different software ‘environments.’ Each of the environments looks simple: a retro video game, a few rounded shapes moving around, maybe some plain text. But those environments are all stand-ins for real-world things. A self-driving car is going to be solving some of the same problems as an algorithm playing Frogger. An automated factory will be doing a lot of the same things as an algorithm walking safely around a frozen pond.
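To make the idea of an ‘environment’ a little more concrete, here is a minimal sketch in Python of the frozen pond scenario. This is not Gym’s actual API or its real pond layout; the FrozenPond class, the grid, and the run_episode and random_policy helpers are all made-up names for illustration. What it does show is the general shape of the loop: the environment hands over an observation, the algorithm picks an action, and the environment answers with a new observation, a reward, and a flag saying whether the episode is over.

```python
import random


class FrozenPond:
    """Hypothetical toy environment (not Gym's own): S start, F ice, H hole, G goal."""

    GRID = ["SFFF",
            "FHFH",
            "FFFH",
            "HFFG"]
    # action -> (row delta, column delta): 0 left, 1 down, 2 right, 3 up
    MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

    def reset(self):
        """Put the walker back on the start tile and return its position."""
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        """Apply one move; return (observation, reward, done)."""
        dr, dc = self.MOVES[action]
        # Clamp to the grid so walking into the edge leaves the walker in place.
        row = min(max(self.pos[0] + dr, 0), 3)
        col = min(max(self.pos[1] + dc, 0), 3)
        self.pos = (row, col)
        tile = self.GRID[row][col]
        done = tile in "HG"                    # a hole or the goal ends the episode
        reward = 1.0 if tile == "G" else 0.0   # reward only for reaching the goal
        return self.pos, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode with the given policy and return the total reward."""
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, done = env.step(policy(obs))
        total += reward
        if done:
            break
    return total


def random_policy(obs):
    """The 'randomly shouting' student: ignores the observation entirely."""
    return random.choice([0, 1, 2, 3])


if __name__ == "__main__":
    random.seed(0)
    wins = sum(run_episode(FrozenPond(), random_policy) for _ in range(1000))
    print(f"random agent reached the goal in {wins:.0f} of 1000 episodes")
```

A learning algorithm would slot in where random_policy is, using the rewards it collects to choose better moves over time. The environment itself never needs to change, which is what makes comparing algorithms against each other possible.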
The biggest benefit: getting a two-legged robot to walk around without falling over requires a lot of work, but working out the rules in a virtual space means you never break expensive hardware along the way.
And since this is an open source effort, the people crafting the algorithms can see and build off each other’s work, pushing everyone to a better spot over the long run. (It serves as another example of open source development leading to accelerated gains for all.)
What other sorts of uniquely human institutions exist that could be adapted to teaching our tools?
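Taking a sideways look at this same type of development, the Open Source Ecology (OSE) project is a collective of people who build plans for machines that build other machines. They call the collection the Global Village Construction Set (GVCS), and describe it like this: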
“The current practical implementation of the GVCS is a life size LEGO set of powerful, self-replicating production tools for distributed production. The Set includes fabrication and automated machines that make other machines. Through the GVCS, OSE intends to build not individual machines – but machine construction systems that can be used to build any machine whatsoever. Because new machines can be built from existing machines, the GVCS is intended to be a kernel for building infrastructures of modern civilization.”
What are the possibilities if these two approaches are combined? A toolkit of battle-tested algorithms, each built to handle general use cases, and the machinery to put those rules to practical application?