Web development is not generally seen as an area at high risk of automation by artificial intelligence. It requires careful, often highly custom work by skilled developers, rather than the kind of repetitive task usually considered easy to automate.

However, a project is underway that, when complete, could see a core part of a web developer’s work automated, turning a task that currently takes hours or days into something that can be completed in a matter of seconds. 


The product in question is known as pix2code, and is under development by Uizard.io. While it’s currently just a proof of concept, once it reaches a marketable stage it could dramatically cut the time it takes to complete web projects.

The current workflow: what AI could replace

Uizard is not targeting the entire web development process, but is instead looking to automate the first part of the development cycle, where a developer takes a visual wireframe mockup created by a designer and turns it into working code. 


“In the current workflow you will have designers creating on screens the layout of the application, and then from there the developer will have to sit down and convert it into code before they can actually start implementing all the functionality, all the features and the logic of the application,” explains Uizard.io founder and CEO Tony Beltramelli, in a talk at London’s Deep Learning Summit.

“The problem, though: implementing the interfaces in HTML and CSS for the web, it's quite a frustrating, time-consuming process.”


Uizard’s solution is to create AI that can take a supplied image and use it to automatically produce an HTML and CSS version of the mockup, allowing developers to skip this initial step and get straight to implementing functionality, features and logic.

Making AI work for code

While the end product is not yet ready for use, Uizard has already taken considerable steps towards creating it, which has involved making an AI that can ‘read’ the input image, identifying different graphical components and turning them into working code.


“We're trying to identify graphical components on the user interface, such as a button, and we have the task of generating code describing this button, this component,” says Beltramelli.


To achieve this, Uizard looked at other AI systems designed to ‘read’ images, such as those used to provide accurate image captions.


“In image captioning you try to train an algorithm to generate an English description when given a photograph of the real world,” Beltramelli explains. 


“And the assumption was: if we can generate an English description from a given photograph, can we then generate computer code from a given graphical user interface? In both cases we're trying to produce a textual output from a visual input.”
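
To make the analogy concrete, the sketch below shows what such an encoder-decoder might look like. It is purely illustrative, not Uizard’s actual architecture, and assumes PyTorch as the framework: a small CNN encodes the screenshot into a feature vector, and an LSTM decodes it into a sequence of code tokens, exactly as a captioning model decodes into English words.

```python
# A minimal, hypothetical sketch of the image-captioning idea applied to
# code generation: encode the GUI screenshot with a CNN, then decode a
# token sequence with an LSTM conditioned on the image features.
import torch
import torch.nn as nn

VOCAB_SIZE = 20    # illustrative: a tiny vocabulary of code tokens
EMBED_DIM = 64
HIDDEN_DIM = 256

class CaptionStyleCodeGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        # Vision encoder: screenshot -> fixed-size feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, HIDDEN_DIM),
        )
        # Language decoder: predicts the next token given the tokens so far.
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM + HIDDEN_DIM, HIDDEN_DIM, batch_first=True)
        self.to_vocab = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, image, token_ids):
        img_feat = self.encoder(image)                   # (batch, HIDDEN_DIM)
        tok_emb = self.embed(token_ids)                  # (batch, seq, EMBED_DIM)
        # Condition every decoding step on the image features.
        img_seq = img_feat.unsqueeze(1).expand(-1, tok_emb.size(1), -1)
        states, _ = self.lstm(torch.cat([tok_emb, img_seq], dim=-1))
        return self.to_vocab(states)                     # logits over next tokens

# Example: one 256x256 screenshot and a 5-token prefix of generated code.
logits = CaptionStyleCodeGenerator()(
    torch.randn(1, 3, 256, 256), torch.zeros(1, 5, dtype=torch.long)
)
```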

However, these systems typically make use of pre-trained word vectors: ready-made numerical representations of English words that the AI can use for reference. The challenge in this instance was that no such equivalent was available for computer languages, an area that has seldom been a focus of AI research.


Generating a complete set of these word vectors for the languages in question would have been immensely time-consuming, so instead Uizard created a far simpler domain-specific language that functions as a middleman for the AI.


“We designed a domain-specific language, which is extremely simple and the only goal of this language is to describe a user interface with very simple tokens. So a user interface is composed of buttons, labels, stacks, rows of elements and so on,” he explains.


“Once we have this domain-specific language, which is much simpler than HTML, we can use this and compile it to HTML, depending on our task. And the nice thing about this is that this language is less complex, which is going to make the task of training the neural network to learn this language easier.”
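
As a rough illustration of that compilation step, the snippet below maps a handful of hypothetical tokens (stack, row, btn, label; these stand in for Uizard’s real DSL, which the talk does not detail) onto HTML templates:

```python
# A minimal sketch of compiling a simple UI description language to HTML.
# The token names and templates are illustrative assumptions.
DSL_TO_HTML = {
    "stack": '<div class="stack">{}</div>',   # vertical container
    "row":   '<div class="row">{}</div>',     # horizontal container
    "btn":   '<button>Button</button>',       # leaf element
    "label": '<span>Label</span>',            # leaf element
}

def compile_dsl(node):
    """Recursively render a (token, children) tree into HTML."""
    token, children = node
    inner = "".join(compile_dsl(child) for child in children)
    template = DSL_TO_HTML[token]
    return template.format(inner) if "{}" in template else template

# A tiny UI: a stack containing one row with a button and a label.
ui = ("stack", [("row", [("btn", []), ("label", [])])])
print(compile_dsl(ui))
# <div class="stack"><div class="row"><button>Button</button><span>Label</span></div></div>
```

Since the DSL is compiled “depending on our task”, the same tree could in principle be rendered to targets other than HTML, keeping what the network has to learn decoupled from the final output format.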

Towards a final product: the limitations and benefits of the technology

While the work is still in development, the resulting software, pix2code, has already produced some fairly successful results, turning simple designs into basic but functional web pages, albeit with some margin of error.


In some examples produced so far, the result of training the AI on a fairly small dataset, simple user interfaces have been reproduced entirely successfully, while others have been reproduced only in part, with some elements, such as buttons, missing.


“This is very early work, it's just a proof of concept that we can apply deep learning to automate part of the front-end development workflow,” says Beltramelli.


Now the challenge is to improve the AI, in part by training it on much larger datasets. However, with the web full of potential datasets, this should be entirely possible. 

Nevertheless, there will always be some designs that will pose a problem. 


“It is always a matter of training data,” explains Beltramelli. “So it will always fail if it’s a completely exotic user interface; if designers create crazy buttons with really exotic shapes it will always fail because the network will have never seen this kind of test design in the past.”


However, with web development so often conforming to relatively predictable designs, the technology will likely prove to be hugely beneficial once it is at a marketable stage. And thanks to its approach, it won’t mean designers are locked into any one piece of software to use it. 


“The nice thing about this very simple architecture is that, because we're dealing with images as inputs, we can imagine in theory using the same approach for graphical user interfaces produced with any software,” he says. “If designers tomorrow decided the new hipster tool to use is Microsoft Paint, then in theory it could still work in a similar manner.”


In time, it could even work without the need for any type of design software at all, allowing designers to produce a working initial webpage directly from a sketch drawn in a client meeting.


“The long-term vision would be that you could have the designer hand-drawing a user interface on paper with the customer, and then directly generate code out of it.”
