OpenAI can translate English to code with its new Codex machine learning software
AI research firm OpenAI is releasing a new machine learning tool that translates the English language into code. The software is called Codex and is designed to speed up the work of professional programmers, as well as to help hobbyists get started with coding.
In demos of Codex, OpenAI shows how the software can be used to build simple websites and rudimentary games using natural language, as well as translate between different programming languages and handle data science queries. Users type English commands into the software, such as “create a web page with a menu on the side and a title at the top”, and Codex translates them into code. The software is far from foolproof and takes a little patience to operate, but could prove invaluable in making coding faster and more accessible.
“We see this as a tool to multiply programmers,” Greg Brockman, CTO and co-founder of OpenAI, told The Verge. There are two parts to programming, he says: you need to “seriously think about a problem and try to figure it out” and “map those little pieces to existing code, whether it’s a library, a function, or an API.” The second part is tedious, he says, but it is what Codex does best. “It takes people who are already programmers and takes away the tedious work.”
OpenAI used an earlier version of Codex to create a tool called Copilot for GitHub, a code repository owned by Microsoft, which itself is a close partner of OpenAI. Copilot is similar to Gmail’s autocomplete tools, offering suggestions on how to complete lines of code as users type them. The new version of Codex, however, is much more advanced and flexible, able not only to complete code but to create it.
Codex is built on top of GPT-3, OpenAI’s language generation model, which was trained on a large portion of the internet and as a result can generate and analyze written text in impressive ways. One application users found for GPT-3 was generating code, but Codex improves on its predecessor’s capabilities and is trained specifically on open source code repositories pulled from the web.
This last point has led many coders to complain that OpenAI is taking unfair advantage of their work. OpenAI’s Copilot tool often suggests snippets of code written by others, for example, and the program’s entire knowledge base is ultimately derived from open source work, which was shared for the benefit of individuals, not companies. The same criticisms will likely be directed at Codex, although OpenAI says its use of this data is legally protected under fair use.
Asked about these complaints, Brockman replies, “New technology is coming, we need this debate, and there will be things that we will do that the community has great points about and we will take feedback and do things differently.” He argues, however, that the coding community at large will ultimately benefit from OpenAI’s work. “The actual net effect is of great value to the ecosystem,” says Brockman. “Ultimately these types of technologies, I think, can reshape our economy and create a better world for all of us.”
Codex will certainly also create value for OpenAI and its investors. Although the company started life as a nonprofit lab in 2015, it switched to a “capped profit” model in 2019 to attract outside funding. And while Codex is initially being released as a free API, OpenAI will start charging for access at some point in the future.
OpenAI says it doesn’t want to build its own tools using Codex because it is better placed to improve the underlying model. “We realized that if we pursued any one of them, we would cut off all of our other routes,” says Brockman. “As a startup, you can choose to be the best at one thing. And for us, there is no doubt that that is creating better versions of all of these models.”
Of course, while Codex sounds extremely exciting, it’s hard to judge the extent of its capabilities until real programmers have had a chance to try it. I’m not a coder myself, but I have seen Codex in action and have some thoughts on the software.
OpenAI’s Brockman and Codex lead Wojciech Zaremba demonstrated the program to me online, using Codex to build first a simple website and then a rudimentary game. In the game demo, Brockman found a silhouette of a person on Google Images, then told Codex to “add this image of a person to the page” before pasting in the URL. The figure appeared on screen and Brockman then changed its size (“make the person a little bigger”) before making it controllable (“now make it controllable with the left and right arrow keys”).
Everything worked very smoothly. The figure started to move around the screen, but we quickly ran into a problem: it kept disappearing off the screen. To stop this, Brockman gave the computer an additional instruction: “Constantly check whether the person is off the page and put them back on the page if so.” That kept the figure from disappearing, but I was curious how precise these instructions need to be. I suggested we try a different one: “Make sure the person can’t leave the page.” That worked too, but for reasons neither Brockman nor Zaremba could explain, it also altered the width of the figure, squashing it flat on the screen.
“Sometimes it doesn’t know exactly what you’re asking for,” Brockman laughs. He tries a few more times and then comes up with a command that works without this unwanted change. “So you had to think a little bit about what was going on, but not very deeply,” he says.
That is fine in our little demo, but it says a lot about the limitations of this kind of program. It’s not some magical genius that can read your mind and turn every command into flawless code, and OpenAI doesn’t claim that it is. Instead, it takes some thought and a bit of trial and error to use. Codex won’t turn non-coders into expert programmers overnight, but it is certainly much more accessible than any other programming language.
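The “keep the person on the page” behavior described in the demo can be illustrated with a small sketch. This is not the code Codex generated, which the demo did not show in detail; the `Sprite` class, the function names, and the 800-pixel page width are all hypothetical, and Python is used here for brevity even though the demo ran in a web page. The idea is to clamp the figure’s horizontal position to the page while leaving its width alone, which is the part the second instruction got wrong.

```python
from dataclasses import dataclass


@dataclass
class Sprite:
    """Hypothetical stand-in for the on-screen figure."""
    x: float      # left edge, in pixels
    width: float  # sprite width, in pixels


def move(sprite: Sprite, dx: float) -> Sprite:
    """Shift the sprite horizontally; nothing here stops it leaving the page."""
    return Sprite(sprite.x + dx, sprite.width)


def keep_on_page(sprite: Sprite, page_width: float) -> Sprite:
    """Clamp the left edge so the whole sprite stays visible.

    Only the position changes; the width is passed through untouched.
    """
    max_x = max(0.0, page_width - sprite.width)
    clamped_x = min(max(sprite.x, 0.0), max_x)
    return Sprite(clamped_x, sprite.width)


page_width = 800                    # assumed page width, for illustration only
person = Sprite(x=780, width=50)
person = move(person, 40)           # x is now 820, past the right edge
person = keep_on_page(person, page_width)
print(person)
```

After the final call the figure sits at x = 750 with its width still 50, so it is flush with the right edge rather than squashed.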
OpenAI is optimistic about the potential of Codex to change programming and computing more generally. Brockman says it could help solve the programmer shortage in the United States, while Zaremba sees it as the next step in the historic evolution of coding.
“What is happening with Codex has happened several times already,” he says. In the early days of computing, programming was done by creating physical punch cards that had to be fed into machines; then people invented the first programming languages and began to refine them. “These programming languages, they started to look like English, using vocabulary like ‘print’ or ‘exit’ and so more people became able to program.” The next part of this trajectory is to do away with specialized coding languages altogether and replace them with English commands.
Codex also has the ability to control other programs. In one demo, Brockman shows how the software can be used to create a voice interface for Microsoft Word. Because Word has its own API, Codex can feed it instructions in code created from the user’s spoken commands. Brockman copies a poem into a Word document, then tells Word (via Codex) to first remove all the indentation, then number the lines, then count the frequency of certain words, and so on. It’s extremely smooth, although it’s hard to say how well it would work outside the confines of a pre-arranged demo.
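To make the three text operations in that demo concrete, here is a minimal sketch of what each one amounts to on plain text. It does not use the Word API or anything Codex produced; the function names and the sample poem are invented for illustration, and the word-matching rule (lowercase letters and apostrophes) is an assumption.

```python
import re
from collections import Counter


def remove_indentation(text: str) -> str:
    """Strip leading spaces and tabs from every line."""
    return "\n".join(line.lstrip() for line in text.splitlines())


def number_lines(text: str) -> str:
    """Prefix each line with its 1-based line number."""
    return "\n".join(
        f"{i}. {line}" for i, line in enumerate(text.splitlines(), start=1)
    )


def word_frequency(text: str, words: list[str]) -> dict[str, int]:
    """Count case-insensitive occurrences of the requested words."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return {word: counts[word.lower()] for word in words}


# Made-up sample text standing in for the poem used in the demo.
poem = "    The rose is red\n  the violet blue\n\tthe rose is sweet"
flat = remove_indentation(poem)
print(number_lines(flat))
print(word_frequency(flat, ["rose", "the"]))
```

Each spoken command in the demo corresponds to one small, well-defined step like these, which is part of why the sequence appears so smooth.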
If successful, Codex could not only help programmers but become a new interface between users and computers. OpenAI says it has tested Codex’s ability to control not only Word but other programs like Spotify and Google Calendar. And while the Word demo is just a proof of concept, Brockman says, Microsoft is apparently already interested in exploring the software’s possibilities. “They are very excited about the model in general and you should expect to see a lot of Codex applications being created,” he says.