
Compared to 17% when pre-trained on Python. Looks like the debate on whether or not HTML is a programming language is settled!


Was the HTML corpus it trained on only HTML and text, or was there JS code in the HTML?


There sure was. Here is a snippet from the very first example in the html.json file:

        <input type="text" id="searchQuery" name="searchQuery" placeholder="Enter name...">
        <button onclick="searchEmployees()">Search</button>
    </div>
    
    <script>
        function searchEmployees() {
            var input, filter, table, tr, td, i, txtValue;
            ...etc...

There's no code in the repo, but the paper says this about how the inputs were generated:

For all the selected languages except HTML, we adopt an in-depth evolution, by asking GPT-3.5 to rewrite the seed instruction (Python) into a more complicated version relevant to the target language (Python, JavaScript, TypeScript, C, C++, Java, or Go). However, for HTML, we adopt in-breadth evolution to produce a brand-new HTML-related instruction...
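
For anyone who wants to answer the "was there JS in the HTML?" question for the whole dataset rather than from one example, here is a minimal sketch. It assumes you have already pulled the HTML strings out of html.json yourself (the file's schema isn't shown in this thread, so loading is left out), and the names `ScriptDetector`, `contains_js`, and `js_fraction` are made up for illustration. It counts a sample as containing JavaScript if it has a `<script>` tag or an inline `on*` attribute such as `onclick`, which is a heuristic, not a full analysis.

```python
from html.parser import HTMLParser


class ScriptDetector(HTMLParser):
    """Flags <script> elements and inline on* event-handler attributes."""

    def __init__(self):
        super().__init__()
        self.has_script_tag = False
        self.has_inline_handler = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "script":
            self.has_script_tag = True
        # Heuristic: any attribute starting with "on" (onclick, onload, ...)
        # is treated as an inline JavaScript handler.
        if any(name.startswith("on") for name, _ in attrs):
            self.has_inline_handler = True


def contains_js(html_text):
    """Return True if the HTML string appears to contain any JavaScript."""
    detector = ScriptDetector()
    detector.feed(html_text)
    detector.close()
    return detector.has_script_tag or detector.has_inline_handler


def js_fraction(samples):
    """Fraction of HTML strings in `samples` that contain JavaScript."""
    if not samples:
        return 0.0
    return sum(contains_js(s) for s in samples) / len(samples)
```

Calling `js_fraction(list_of_html_strings)` gives a rough share of samples that mix JavaScript into the markup, which is the number that matters for deciding whether the "HTML" result is really an HTML-plus-JS result.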



