Turning web design mockups into code with Deep Learning

chrisfosterelli · on Jan 10, 2018

A lot of people in this thread seem to think that this is a neural network that takes an image and produces HTML, when that's not the case here at all.

This is a neural network that takes an image and predicts very simple blocks (like BODY, TEXT, BTN-GREEN in the bootstrap example) and then uses a map to convert them to well-formed HTML. While I think it's a great learning example I think it's important to note that this does generalize at all to any other websites -- you are NOT going to replace an actual person writing HTML with anything like this.

You can see the mapping here: https://github.com/emilwallner/Screenshot-to-code-in-Keras/b...

houqp · on Jan 10, 2018

You are correct. But like the other comment mentioned, the cool part about this is it automatically learns the mapping from image to sequence of tokens. To handle arbitrary html, we just need to extend it a little bit and convert all possible html input into tokens. I think the take away is this might be a feasible approach towards automatic UI code gen.

chrisfosterelli · on Jan 11, 2018

Automatically learning the mapping from an image to a sequence of tokens is a very fundamental task for CNNs and not particularly new.

I don't think it's clear, or likely, that this can extend to all possible html input tokens. As you add more tokens, it becomes more difficult for the network to choose among them accurately. Additionally, as the token set becomes more fine-grained the size of the output space will grow exponentially and the network will likely struggle to learn from the training examples as well as output valid structure.

I think you can compare this to approaches that receive an image as input and provide a caption of the image as output. Works surprisingly well in simple cases but no where near fully functional or actually capable of understanding all inputs.

I agree that this might be a feasible approach toward automatic UI code generation eventually, but this is several significant levels of complication away from that.

houqp · on Jan 11, 2018

Agree with you that this is by no means a complete solution and there is a long way to go to make it actually usable.

I think one big problem with image captioning could be lack of high quality training data. While in this case we can generate lots of good training data. Whether we will be able to generate enough good data and have enough compute power to train on them is something that we need to find out.

Playing go was considered a problem too complex to solve couple years ago, but it's now a solved problem. So I am hoping we can get a breakthrough on this sooner than we think.

fnl · on Jan 11, 2018

Not wanting to dampen your high hopes, yet, the rules of Go seem a lot simpler than the rules and grammar of the current hypertext markup "language", particularly if taking the "browser dialects" into account, which are crucial for professional pages....

samfisher83 · on Jan 10, 2018

The cool part is its not programmed to do that. It is learning how to do that.

CM30 · on Jan 10, 2018

Well, this seems like it could be useful for the times you just want a quick prototype mocked up and can't be bothered to code it, or when you're dealing with sites that don't need much in the way of dynamic functionality.

However, it also makes me think that:

A: Maybe developers (and software engineers in general) should stop thinking their own jobs are necessarily safe from automation, since something like this could be the first step to the field going the same way as lorry or taxi driving will in future.

B: That agencies might start seeing their basic web development work start drying up in the not too distant future.

nostrademons · on Jan 10, 2018

Some of the very first programs written for computers (assemblers and compilers) were so that programmers could automate their own jobs. Nearly every advance in system software since then - recursive functions, software libraries, virtual memory, high-level languages, garbage collection, lexical closures, object-oriented programming, constructors & destructors, smart pointers, version control, package managers - has involved automating progressively higher-level tasks so that programmers can focus on new stuff.

Dreamweaver and FrontPage have existed for over 20 years. The use-case listed here isn't all that useful - if you just want a quick prototype website built from a WYSIWYG graphical editor, there are already very good tools for that, starting with the above two and going through web services like SquareSpace, Wix, and Weebly. I found the discussion of his approach fascinating, though, because it could be applied to lots of other, unsolved problems. Imagine being able to publish a website by taking a photo of a poster using your smartphone.

swyx · on Jan 11, 2018

as a web developer my perception is dreamweaver and frontpage are now no longer relevant. is this actually true or am I just biased because of my job?

nostrademons · on Jan 11, 2018

Probably true, but it's because their market (small businesses who just want a basic HTML-only site) has been taken over by website builders like Wix, Weebly, Wordpress, and Squarespace. Agencies & professional web developers moved up-market to the customers who require dynamic app-like behavior, which Dreamweaver and Frontpage sucked at, while the bottom end of the market moved to the aforementioned hosted services that could also run the site for them, handle maintenance, and easily integrate add-ons like email/blogs/commenting/RSS/analytics/etc.

iambateman · on Jan 10, 2018

B. Yes.

A. No. Good developers can run up the abstraction chain to keep providing value at a higher level. Most good devs fundamental value prop isn’t “I can make something in HTML”.

If anything, this will mean that more people can do the redundant stuff faster.

As a frontend guy, I can’t wait for the day that a tool can generate presentational stuff.

nicolashahn · on Jan 10, 2018

I'd argue in part you're both right about A. Good devs (those can keep running up the abstraction chain) will always be useful. But the bar for what a "good dev" is keeps getting higher as we introduce more and more abstraction. The sheer amount of knowledge and understanding that the average dev needs in order to be productive AND understand what's going on enough under the hood (to the point where they're not doing stupid things constantly) keeps getting larger.

Think back to CS in the 70s - AFAIK, the majority of the field that was understood is now covered before an undergraduate gets to their upper division classes. We're now expected to understand things at a much higher level of abstraction, but we ALSO need to know how logic gates and machine code works in order to fully understand what we're doing.

The 'incubation period' to produce a good dev is going to keep growing. The ones that make it to full dev maturity are going to be more productive than ever before, but fewer people will make it there.

iambateman · on Jan 10, 2018

Sure, there is a divergence of skill sets...We're going to have "doctor" developers – people who carry an incredible breadth of knowledge and are paid accordingly.

Beside them will be "nurse" developers – people who are just as professional, but less learned in the full range of computer science ideas.

alexanderdmitri · on Jan 11, 2018

Though the abstract nature of the field often benefits from analogical representations, I think this one detracts. I don't see how our current hodgepodge list of role titles we've already borrowed from other fields (architect, developer, designer, scientist, ops, product manager, evangelist(?!), quality assurance, security specialist, chief information officer (not to mention wizards, code monkeys, Java baristas (aka the bespeckled missionaries (they don't C# for whatever reason) who walk through your neighborhood every once in a while to knock on doors with warm smiles hoping to share their beliefs on the one true Oracle and some bizarre philosophy of enterprise scale spiritual development involving complex and brittle structures and subservience to feudal and [frankly] regressive class structures that most favor a nepotistic inheritance procedure designed to chiefly empower their children and a small circle of similarly interested functionaries that seem hellbent on oppressing the rest of the system), unicorn jockeys, tech debt lawyers, and whatever you call that long-gone contractor who years ago thrust a bizarre moonscript backend into the core of your team's architecture like Excalibur waiting for an Arthur everyone hopes won't happen along because rock that asshole stuck the sword in happened to be the kingdom's cornerstone and even the slightest nudge will cause the whole system to precariously sway ...)) can be summarized with a doctor / nurse dichotomy.

Even if writing, practicing and administering software legally required a degree and license, I think this still detracts. Maybe useful once software powers organic, higher-order life forms, at which point I imagine our doctors and nurses will be entirely software driven anyway[0].

[0] http://memory-alpha.wikia.com/wiki/Emergency_Medical_Hologra...

iambateman · on Jan 11, 2018

Your reply is dizzying. Glad my commment could help you get some angst out of your system!

Hopefully the sword will be more Excalibur and less Sword of Damocles.

^^ I have no idea what that means, but it seemed relevant. :)

nicolashahn · on Jan 10, 2018

What will the "nurse" developers do such that the "doctor" developers don't just write something to abstract away the nurses?

mcknz · on Jan 10, 2018

Everyone knows the doctor developers rarely show up, and the nurse developers end up doing all the work.

Doctor is a long-running process that requires a callback from nurse.

taneq · on Jan 11, 2018

Whatever the "doctor" developers write to abstract away the nurses, it's still not going to be able to communicate with nontechnical managers. Or at least, once it can do that, the "doctor" developers will be obsolete soon anyway.

jdminhbg · on Jan 11, 2018

Until we have AGI that replaces all developers, someone will have to do the work of converting business requirements into LOB apps.

dreamfactored · on Jan 10, 2018

The level of the 'hood' rises too. Used to be that in-depth knowledge of assembler, register efficiency and compilers was considered a requirement.

wwweston · on Jan 10, 2018

> As a frontend guy, I can’t wait for the day that a tool can generate presentational stuff.

Qualified agreement from this sometimes front-end guy, but that's assuming the generated presentational stuff is of equal or higher quality as that of the other front-end devs.

Being the front-end guy who has to debug the auto-generated code that doesn't quite work in a corner case or two sounds potentially worse than the status quo.

iambateman · on Jan 10, 2018

For sure. I'm sure that autogen won't be on par with a human, just faster.

Squarespace is simultaneously worse than what _any-decent-designer_ would put together AND better than an average marketing website.

macintux · on Jan 10, 2018

I agree with your response to A in the short term. In the long term, however, much like a mountain in a flood people can only scramble so high before they run out of oxygen, or run out of room.

iambateman · on Jan 10, 2018

To take your metaphor a bit too far, it’s fortunate that we have more than one mountain. ;)

Sure, the days in which the web guy was viewed as a special guru won’t last forever. But even with auto-generated mockups, we’re still a long way from computers taking over everything.

msla · on Jan 10, 2018

As the baseline of tooling gets raised, the baseline of expectation gets raised.

Programmers will use these tools the way we always have: To automate the simple work on the way to building a finished product that's more complex, and done more quickly, than would have been possible previously.

Some markets die, yes. It's no longer possible to make a business out of selling Unix clones for commodity hardware, like you still could in the 1980s. It's no longer possible to make a living just making static HTML websites, like you could in the 1990s.

Ultimately, what programmers trade in is a specific kind of problem-solving design sense. Not just solving this single problem as fast as we can, but creating a more general solution, which composes well and can be understood and changed later on. Once programs can do that... well, we'd better be willing to offer them citizenship and recognize their civil rights as thinking beings.

mr_toad · on Jan 10, 2018

The top of this metaphorical mountain involves learning new APIs, frameworks, languages and programming paradigms. An AI that can do that will be a threat to everyone.

nxsynonym · on Jan 10, 2018

In regards to B:

Wouldn't the agencies most affected by this type to image-to-website tech already be affected by WYSIWYG operations, (squarespace, wix, wordpress, etc.)? I am curious how many agencies are developing simple html/css enough that this type of thing would dry up work.

As far as point A: I don't know if there will ever be a "final state" of web development. For every automation fix that is introduced to the web dev space, 10 new technologies are developed that provided an added layer of complexity.

I think the web is bit different from traditional jobs being taken over by automation in that (as far as we know) there isn't a hard limit on how far we can push web technology and use. If anything, the people who would be replaced by these automation services are (presumably) the ones who would be developing the automation systems and integration to begin with.

snowwrestler · on Jan 10, 2018

With regard to B, at least in the DC area it is all dried up and agencies are climbing the value chain in one or both directions:

- As a communications/marketing/PR agency who provides the website as a component of a larger campaign. Many of these folks are already using platforms like Wix, Squarespace, etc., or Wordpress with purchased templates.

- Doing more complex site development that requires a lot of IA work and institutional change management. For example: migrating ancient federal HTML websites (yes some still exist) to Drupal.

neovive · on Jan 10, 2018

With the advent of standardized web component libraries, design frameworks, css grids, and website builders; the task of building a simple website feels more like assembling the pieces of a puzzle. It's hard to know how long innovations in web development will outpace automation / deep learning, but it's an interesting space to follow.

Personally, I think bespoke web application development will be a strong area of innovation for many years to come as browsers and devices continue to increase in power. I'm more worried about keeping up with the fast pace of change than the entire industry drying up.

realusername · on Jan 11, 2018

> A: Maybe developers (and software engineers in general) should stop thinking their own jobs are necessarily safe from automation, since something like this could be the first step to the field going the same way as lorry or taxi driving will in future.

I've seen hundreds of products to automate this kind of tasks and all of them failed one by one. The simple answer is that if you would have a software which could replace code, it would have a learning curve as complex as learning to code ans so being then worthless to learn.

gedy · on Jan 10, 2018

Agreed, but also feels like the very common mockup->developer handoff is inefficient, and in many cases should just start from a working prototype vs a drawing.

seanmcdirmid · on Jan 11, 2018

Drawing a website can be much harder than just coding it up, especially if lots of repetition is involved. This is a nice trick, but, given the fixed vocabulary, isn't really that useful to non coders (who could accomplish faster results with drag and drop under similar restrictions).

matte_black · on Jan 10, 2018

A: Developers jobs were never safe from automation, they are just likely the last ones to be fully automated.

tekromancr · on Jan 10, 2018

Damn, it's almost like developers are going to need to learn to focus on using code to solve business problems, rather than just focusing on being able to write code and calling it a day. XD

jamesjyu · on Jan 10, 2018

This is really great.

Many years ago, I was neck deep in frontend coding. I got so good that when given a mockup, I could code the skeleton of the page "blind" without viewing it in the browser. I would load the page at the end and grade myself on how accurate I was. I've always wanted to do a contest with other frontend coders to see who could get closest to a complex layout—like the NYTimes—in one go.

One day, this type of skill will become a curiosity—a relic of the past similar to horseback riding. I would be happy to see the automation of pure translation of boxes into code.

janneklouman · on Jan 10, 2018

Sorry if this is OT, but these types of contests exist! I went to one of these[1] maybe two years ago in Stockholm and I had a blast. I think the format was 32 people competing, 8 on stage at a time being shown a design which they had 15 minutes to mimic without previewing, then the crowd voted for winners using their smartphones. Two winners from each group of 8 went on to the finals. The competitors screens were mirrored on screens facing the crowd, so everyone could see how people tackled the problem in real-time. One guy did an ASCII representation of the design, which the crowd enjoyed enough to send him to the finals.

A large audience, smoke machine and lots of lasers, hard techno, and free beer. They did an amazing job setting the mood. They even have their own IDE[2] that everyone competing is required to use (with combo-counter and cool rave-y visual effects).

[1] http://codeinthedark.com/

[2] https://github.com/codeinthedark/editor

jamesjyu · on Jan 10, 2018

That is incredible! I knew there would be the off chance that someone would have put something like this on, but this looks like they went all in.

codecamper · on Jan 10, 2018

I'm an old guy & Kraftwerk is all I need for a good mood programming. Or some Plaid.

oliv__ · on Jan 11, 2018

Oh my god why didn't anyone tell me about this? This looks freakin amazing!

_khhm · on Jan 10, 2018

I still like horseback riding. It's pretty fun!

jamesjyu · on Jan 10, 2018

Definitely not deriding horseback riding (no pun intended). Humans will always do things for pleasure that they used to do purely for practical reasons. Just look at the mere act of jogging.

But, one thing I know that I'll never do for fun is debugging IE6 browser compatibility issues. I'll never get that part of my brain back again.

fineline · on Jan 11, 2018

Many people still enjoy horses without using them as primary transport. I reckon it will be the same with ICE cars - various people with the money and inclination will still enjoy classic cars even when their daily driver is electric.

trafnar · on Jan 10, 2018

I can't find it, but there is a contest where people are challenged to re-create a mockup in front of a live audience, without seeing a browser until they are finished. I'd also like to do that :)

edit: someone else found it!

toddmorey · on Jan 10, 2018

I'm still waiting for the generation of tools that are native to the medium—fluid, responsive, interactive, and aware of the final context for the design. So we don't start with static mockups that have to be converted, but rather build web-native from the beginning.

There are some promising moves that direction (such as responsive resizing in Sketch and projects to automatically convert sketch files to React components), but we're not there yet—most designers I know are still producing largely static designs that have to be put through development cycles.

And I know you can live design with HTML and CSS, skipping the static mockup phase. That's my preferred method at the moment, but I'm here to tell you it's still pretty slow and tedious and makes me long for better web design tooling.

oliv__ · on Jan 11, 2018

Honestly, I actually find it really really fast. I don't know how long you've been doing this and I suspect that may be the issue here, but for me it just comes so naturally now that I can't think of any other tool someone could use to outrun me using CSS/HTML.

More broadly, to me, that combo is one of the tools that makes me feel most connected to the output, as the results are near instantaneous.

goatlover · on Jan 11, 2018

Have you ever watched a good designer mockup something using Indesign, PowerPoint or a Wordpress Theme like Divi? They're superfast with the visual tools. I doubt you can match that in CSS/HTML.

There's a reason Visual Basic was so popular in the 90s. That sort of development kind of got forgotten when the web overtook everything. And it's why Alan Kay made Smalltalk a visual environment, and not just raw code in an editor, because human beings are faster designing UIs with visual tools.

toddmorey · on Jan 11, 2018

18 years. I’m damn fast realizing design in code but will always be faster in visual media. I imagine your designs aren’t that complex? Simple pages do code up quick.

typeformer · on Jan 11, 2018

I could outrun you using Webflow...

pavlov · on Jan 11, 2018

I've worked on React Studio which tries to do exactly this:

https://reactstudio.com

One thing I've learned is that it's extremely hard to get designers to think about structure and hierarchies within a user interface. I don't see an easy solution.

Maybe machine learning that suggests reasonable structural transformations during the design process is the way? I know that Airbnb's design tool team (Jon Gold etc.) are exploring something along those lines.

albertgoeswoof · on Jan 10, 2018

While cool I am not sure if this is genuinely useful. I can now take my design and convert it to HTML- now what?

I still have to integrate my api, handle responsiveness, add JavaScript and other animations/actions etc. So I’ll probably end up rewriting most of it anyway.

What I actually get out of this is not much better than emebedding the original image in an img tag, in fact it might be worse, because it creates technical debt that I now have to maintain.

juliushuijnk · on Jan 10, 2018

I think the potential is in converting images (sketches) into ux source files that you can edit immediately and then convert those to HTML, PDF, DOCS, etc. I can see that the HTML would more likely used for a clickable prototype than the real thing, but it could still be a neat way to save some time. I'm working on a (text-command) ux tool and have been thinking about such a machine learned import feature, but lack those skills and time at the moment to build it.

felipeccastro · on Jan 10, 2018

I agree, and honestly, I think trying to solve this via deep learning is not the optimal approach. Too many details to handle besides a static layout, which are the hardest part.

I believe it would be a lot more effective if designers used tools that generated clean code directly, instead of Photoshop. If it's a SPA, then something with the actual, integrated React/Vue components for consistency and reuse.

While I'm not familiar with any such tool that's actually good enough for professional use (most WYSIWYG designers for the web I tried were terrible), I wonder why we haven't seen much success in this area yet, as we have with desktop app design since VB6.

ReDeiPirati · on Jan 10, 2018

Wow, really cool article!

I am not a huge fan of front-end dev, so I am really excited about the great opportunities behind this work/research. It will be another major step in the democratization of UI and less work at maintaining the web views.

My biggest concern is the model capacity. My question is: in your opinion, will it be able to handle very long HTML page? not to mention when it will be added a lot of CSS or JS. Even with more layers, will the model be able to generate corrected syntax(for really long pages)? will it need some additional modules(not only layers) to handle this? maybe a different problem formulation?

emilwallner · on Jan 10, 2018

The model uses 48 tokens at a time to make the next prediction. As long as it can keep track of where it is on the design image, it’s not a problem. In the final version, it has roughly the same accuracy (~97%) on short and long websites.

The tougher part is integrating JS or adding hover effects in CSS. In theory, this can be done with an attention layer, but I haven’t seen any papers on it.

gojomo · on Jan 10, 2018

Another step closer to my dream: a cross-compiler that takes as input a Powerpoint presentation, and outputs the enclosed-described mobile app, cloud-hosted website/backend, and Delaware C Corporation.

snaky · on Jan 11, 2018

Or just ICO whitepaper.

awb · on Jan 10, 2018

I'm not sure about the usefulness of coding UI prototypes.

When I owned a web design firm, if the UI wasn't polished, clients had a hard time wrapping their head around the design or what the final product might look like. In every case I remember we had to make it pixel perfect in the design phase and then again in the development phase.

It was a huge pain and an area ripe for innovation but I think the problem like other posters have mentioned is a lack of quality responsive design tools that are better than simple HTML/CSS design in the browser.

sekou · on Jan 10, 2018

This is a cool idea. I think this technology could potentially assist front-end developers and the designers who work with them even though there's still a fair amount of craft involved. Browser compatibility has gotten better on the desktop in recent years but the mobile appearance of websites is as diverse as ever. It might be interesting to explore "mobile-first" or "progressive enhancement" applications of this technology.

tinymollusk · on Jan 10, 2018

Did you see any of the output source? I wonder if it's clean enough to be understood by humans, or if it's one of those situations where it's been optimized to work but not be understood.

Then again, if you used some additional inputs like "readable code", it theoretically could optimize both the resulting application and the source code.

sekou · on Jan 10, 2018

Yeah I believe it's about 2/3rds down the page under the heading "Links to generated websites." It's a far cry from the old Dreamweaver days. I could see running into some complexity with CSS. Things like which units to use (em, vh/vw, %, px, rem, etc), building a readable selector hierarchy, or using newer CSS technologies like CSS grid or animations.

emilwallner · on Jan 10, 2018

Output source: https://emilwallner.github.io/bootstrap/pred_1/

wrangler99 · on Jan 10, 2018

In the short-term, this approach will struggle to compete against WYSIWYG editors. But as soon as they can match them in output, they’ll improve exponentially faster. WYSIWYG editors has a ton of code to maintain, while a model is simple to improve.

tonybeltramelli · on Jan 11, 2018

It's awesome to see how people have picked up and built on top of pix2code (original author here). Very exciting time for front-end development in general!

huac · on Jan 10, 2018

If the goal is to recreate the layout of a site then the context text isn't really important. You could represent each letter with a filler character e.g. . Then, you only need as many tokens as you have words with distinct numbers of characters. This approach (similar to how we use lorem ipsum for prototyping) would dramatically reduce the complexity of the model.

rburhum · on Jan 10, 2018

Why wouldn't you be able to train it to recognize expected behavior from a mockup? You could easily create a nomenclature of symbols to define standard behavior (⇑ could mean draggable upload for example). Even if you can get it to take you 80%, that would be huge.

sebringj · on Jan 11, 2018

When an AI does create a web page successfully in terms of replacing what a human can do completely, it will probably make the code in such a way that it is unmaintainable because why make it maintainable? We would not have to maintain it anyway. It would be like spaghetti code yet would look amazing on the presentational side. No one would give a damn anymore about coding styles or elegant frameworks like react, you could just bark at it and it would morph around. Sites/apps would just be thoughts you could construct on a whim that typically changed frequently or autonomously to produce the maximum desired result. Meanwhile we're sucking down nutrient packs floating in stasis in our travel pods spending the majority of our time in virtual consciousness.

raisspen · on Jan 11, 2018

That is just a further step in the training that could be undertaken. Many of the advances in machine vision come piecemeal for instance. One team advances one idea that does some thing(s) very well, but has some drawback(s) which another team comes up with a solution for at a later date. Maintainability of code could be something the AI is eventually trained to take into account.

serpix · on Jan 11, 2018

maintainability is irrelevant if the whole site can be regenerated in seconds with the new changes required.

I'm okay with this though. Creating websites is tedious and is most analogous to cranking a handle as it is a learnable skill. After a while it is clear it is just banging characters and seeing what changed on the screen this time.

Bear in mind I'm only talking about implementing the layout, colors, texts and responsiveness. Any dynamic functionality such as forms, buttons, dialogs, integrations and talking to servers is still potentially very complex and difficult taking.

romaniv · on Jan 10, 2018

>Currently, the largest barrier to automating front-end development is computing power.

Actually, it's poorly though-out web technologies that don't allow for reasonable WYSIWYG editing.

ethbro · on Jan 10, 2018

Stack Overflow answer iterator + genetic build algorithm = 90% of software development

y4mi · on Jan 10, 2018

You'd be right, because most software development creates more code dept than features.

trendia · on Jan 10, 2018

I think you mean "debt" -- I first read your comment as "code department" and was trying to figure out what that was.

troysk · on Jan 10, 2018

I have worked mostly on frontend development and have to say that the quality of pixel perfect code being generated will make many a designers happy!

Super awesome!

andegre · on Jan 10, 2018

I know alot of you smart ones think this isn't that great, but I think this is incredible. I'd love to see someone host this someplace so I can just upload an image and see what it spits out.

I'm not smart enough to take the code and get it running...

emilwallner · on Jan 10, 2018

It's not rocket science, I think you can figure it out. To get started:

git clone https://github.com/emilwallner/Screenshot-to-code-in-Keras

pip install -U floyd-cli

(login to floydhub.com, a super simple platform for cloud GPUs)

cd Screenshot-to-code-in-Keras/floydhub

floyd init picturetocode

floyd run --gpu --env tensorflow-1.4 --data emilwallner/datasets/imagetocode/2:data --mode jupyter

Then, navigate to the folder floydhub/Bootstrap/test_model_accuracy.ipynb in your Jupyter Notebook and click on Cell > Run all

Now you can see the prediction and the correct markup for all the evaluation images. Ping me on twitter: @emilwallner, if you get stuck.

andegre · on Jan 10, 2018

Thank you sir, I'll try that out!

huula · on Jan 10, 2018

Cool work! I'm very interested in this topic. Just wondering, how good does it generalize your training data other than just remembering strict input-output mapping?

emilwallner · on Jan 10, 2018

The bootstrap version generalizes with 97% accuracy on a new image. Because the vocabulary is limited, you can train the model overnight. To make the model generalize with all the HTML/CSS markup you need significantly more compute.

rossdavidh · on Jan 10, 2018

This is, approximately speaking, useless. Which is not to say it shouldn't have been done; you have to make a lot of useless prototypes with any new technology before you can actually make something useful. So long as one considers it in this light, it's cool. Just don't get too excited about not having to do this work yourself (if you need it done) for at least the next several years (perhaps decades).

erAck · on Jan 10, 2018

I don't think it will be decades. AI and DL are on a very fast pace the last 2-3 years and progress increases exponentially.

bbayer · on Jan 11, 2018

This is great example of deep learning that is applied to real world problem. I am just curious if that can be done more easier and more robust by using simple image processing algorithms? Box detection and OCR can work well and may produce better results with different types of mockups. Sometimes I feel like we are making problems even more complicated trying to solve them by using popular concepts.

sova · on Jan 10, 2018

"Nothing made sense until I understood the input and output data. The input, X, is one screenshot and the previous markup tags. The output, Y, is the next markup tag. When I got this, it became easier to understand everything between them. It also became easier to experiment with different architectures."

Thanks a lot for a great write-up. Truly a good browse for anybody learning Neural Nets.

_1qd4 · on Jan 10, 2018

You didn't turn anything into "code", you converted an image to markup; you've just "described" the image. This doesn't work for any sort of web app. A flat image from a designer does not convey enough information to build a proper interface. What of responsiveness? Or hooking it up to an API? Or accessibility? Or animations?

anfilt · on Jan 10, 2018

I would agree. However, most websites start out as such a design image. Getting the markup for the design, and adding onto the generated html/css template could speed things up.

-edit Also who is to say that it will not be able to do more in the future. More features could be added.

_1qd4 · on Jan 10, 2018

Markup is more than how it looks like. Some times a <p> tag should actually be a <span> or a <div>. Sometimes a button should be an <a> tag, sometimes it should be a <button>. I wouldn't use the <article> tag if the content wasn't standalone (a blog post, newspaper article, etc.).

You need to understand what has been designed and the goals of the project to know which tags to use. I need to actually read the content to know if I should use <article> or <section>. There's not much time saved if I have to go back and readjust everything, I may as well have written it the way I wanted it in the first place.

It won't be able to do more in the future, computers can't understand things.

tinymollusk · on Jan 10, 2018

Do humans understand things, or does our inner voice merely tell us we do? Without more concrete answers to some very vague/often infuriating questions, it's a very bold and unsupported position to suggest that computers cannot understand something.

To me, it's still up in the air whether we understand things, or if we're just applying a lot of simple, low level, information processing networks. Recognizing objects in vision, for example, turns out to be a serious of clever organizational and optimization tricks around simple mathematics.

_1qd4 · on Jan 10, 2018

That's a fair point! Part of understanding something is being able to explain it in different terms. We tend to create metaphors to display our understanding of things. Merely rearranging the words in a sentence does not mean you understand it, my grade school teacher would dock me marks for that.

So in order for a computer to understand something, it has to also understand something else, otherwise it can only regurgitate back what you've given it. If I tell a computer "there is snow covering a hill" and it responds "a hill is covered in snow" I am not sure it really understand what I said. But if it responds "snow blankets the hill"... Ohh! Now it understands!

This begs the question, what was the first thing we understood? I don't know! Maybe we're born with a certain level of understanding? Maybe our biology imparts a baseline understanding-of-things? If that's the case, then computers can never understand something; they have nothing to build off of, only what we tell it to regurgitate.

tinymollusk · on Jan 11, 2018

So, if I understand you correctly, being able to generate context-aware comparisons from similar knowledge is understanding? (I promise I'm not trying to trap you -- I think this is interesting and would like to explore more.)

What do you think about the classic example from computational linguistics[0]? Is that approaching understanding?

That seems like a neat trick, not actual comprehension, but I don't know why.

[0] King + Women = Queen, see https://cacm.acm.org/news/192212-king-man-woman-queen-the-ma...

anfilt · on Jan 10, 2018

Assuming we can't structure a computer with your hypothetical required base-line. Then it may be impossible, but if our understanding is a physical process that means we should be able construct an other physical entity with your required base-line. I don't see any reason to think we have some unreplicable physical process. Perhaps complicated, but that's anything but unreplicable.

woah · on Jan 10, 2018

What's the benefit to all these different tag types that have identical appearance and behavior? I remember the selling point was that they would be "semantic" and somehow used by software to add richness (I guess this would mostly be for search engines?). Is this actually happening? If the web had gone differently and styling was done completely by browsers, maybe these tag types would be more useful, but that's not the way things are.

_1qd4 · on Jan 10, 2018

As it concerns my role, accessibility is the main benefit. They do not have identical behaviour.

A <p> tag will be read out by a screen reader as "paragraph", but a div tag will not read out anything but the text inside. I only use a <p> tag if I want my accessible users to know that this is a paragraph (of a larger body of text).

I suppose you could make similar arguments for the other semantic tags. You use them when the content explicitly matches the intent of the tag so that other software can pick up and read it with a bit of... ahem understanding.

sedachv · on Jan 10, 2018

Really cool. I wonder what a differentiable HTML renderer (akin to https://www.researchgate.net/publication/270158331_OpenDR_An...) would look like, since it could be used in a similar manner.

peterchon · on Jan 10, 2018

I'm curious about how it handles design changes. Will it just rewrite from scratch? or adjust the code it has already written?

emilwallner · on Jan 10, 2018

This approach rewrites everything from scratch.

However, it's an interesting technical problem to adjust the code. Does anyone have a rough idea how to implement it?

iluvmylife · on Jan 10, 2018

This is great to see!

It's popular to apply AI to healthcare, transportation and legal work. However, these fields are heavily regulated, require domain expertise and data is hard to get. The nature of front-end dev and other digital skills make them more ripe for automation. I'm surprised there is so little progress in this area.

emilwallner · on Jan 10, 2018

I agree, researchers tend to focus on more academic problems and industry goes where the money is, leaving some areas without innovation. I’ve come across few papers that cover frond-end development.

toisanji · on Jan 10, 2018

I worked on a similar project but was generating graphics code with both java and ruby: http://www.jtoy.net/projects/sketchnet/

emilwallner · on Jan 10, 2018

This is rad, thanks for sharing it.

ydmitry · on Jan 10, 2018

It's just the loud title. Generated templates are consist from ready components in examples. It's quite simple to do with minimum knowledge about code and usually useless, because people need more from interface

vadimberman · on Jan 14, 2018

So... it recognizes text (OCR), detects style (bold, italic, etc.), maybe font face, and outputs the result? No pictures, nothing else, of course.

Props for the marketing though.

blurbleblurble · on Jan 10, 2018

From reading the comments, it looks like a lot of people are bored of mundane web design. I'm right there with you. Give me something interesting to do!

fredley · on Jan 10, 2018

If this can get one step further and generate a layout from a hand-drawn sketch, it would really be a gamechanger!

gregoire · on Jan 10, 2018

Check out https://airbnb.design/sketching-interfaces/

tonybeltramelli · on Jan 11, 2018

check this out: https://www.youtube.com/watch?v=DapKYJ6niCM

ben_jones · on Jan 11, 2018

Can you make the page "pop" more?

tkyjonathan · on Jan 10, 2018

Would be nice if this was in some adobe product. Then you can go from illustrating straight to web site.

petrey · on Jan 10, 2018

This is phenomenal.

itissid · on Jan 10, 2018

didn't Dropbox also recently blog about this?

foobaw · on Jan 10, 2018

I love this! Hope no one turns this into some SaaS startup though :(

anfilt · on Jan 10, 2018

You know business people of course someone probably will try. That recurring revenue stream is what they drool over.

jlebrech · on Jan 11, 2018

this would go well with a template language that maps objects to markup rather than having to put template code within the markup.

memebox3v · on Jan 10, 2018

[flagged]

amelius · on Jan 10, 2018

With a human or with a GPU?

memebox3v · on Jan 10, 2018

dh-g · on Jan 11, 2018

HTML is not code, its markup.

Letmesleep69 · on Jan 11, 2018

It is code, it is not a programming language.

tanilama · on Jan 10, 2018

But, in today's workflow, it is trivial for designer to generate the code or even animation with their mockup tools, right? This is useful in a sense if you only have an image of the design, like to quickly copy a competitor's work, but not really that groundbreaking for company's internal design workflow, the hard part of which is to figure out what is good design, not given a design layout then translate into markup