CheatGPT (humphd.org)
490 points by jicea on Feb 20, 2023 | 514 comments



As a CS professor myself, I don't think ChatGPT really creates entirely new problems for assessment, but rather makes problems that we had been sweeping under the carpet more obvious and harder to ignore.

Any assignment that can be cheated on by using ChatGPT could already be cheated on before by asking a friend, an expert or paying someone else to do it.

But most teachers assumed this doesn't happen often, and thus acted as if it wasn't a thing. Making life much simpler and easier.

Facing ChatGPT will probably help us make assessment fairer in the long run (and the path is probably going to be a lot of traditional on-site assessment, abandoning the continuous assessment fad that has always been rather disastrous for equity, even if again, life is easier ignoring this).


As a first-year grad student, I had a professor in materials science who gave the same assignments and mostly the same exams every year. Of course a) I didn't know this, b) using previous work was prohibited, and c) cheating was rampant. I only found this out after getting heavily marked down on the first HW and joining a study group which was moderately chaste (e.g., only used the answers to check our work before turning it in). Missing a single question on HW or the exam would drop your grade a letter, and three or more was likely D/F range. When I brought this up with the professor before the final he at least acted completely shocked when I asked how likely he thought it was that half the class never missed a question (yes, half the class got As). Really weird conversation, but most of the class was undergrads, and I think they liked the easy grade with minimal work and so he got 'good ratings'.

Undergrad college had 'infinite time' and 'open everything but people' exams, but graded on a curve so that sometimes getting half the questions right was an A. Much more stressful, learned more, and basically no cheating.

Certainly, I had one CS class with a newly invented language (compiled). In the days of Fortran, Pascal, and K&R that was kinda par for the course. Not much cheating on that either.


> Undergrad college had 'infinite time' and 'open everything but people' exams, but graded on a curve so that sometimes getting half the questions right was an A. Much more stressful, learned more, and basically no cheating.

I tend to really appreciate 'open everything but people' exams, since they're a good way to actually check that people know how to complete a task. I think that a good one is probably harder for a professor to write, but the ones that I've taken have been a much better test of my understanding of the topic than ones where I was concerned with memorizing what the API for POSIX threads was.

Not so sure about infinite time, though. Seems like it makes it really easy for someone who has the time to dedicate a whole day (or the obsessiveness to do it even if they don't have the time) to do way better than someone who, for whatever reason, can't.

To me, the gold standard for computer science exams would be ~90m-3h with an open everything policy, including both a "do you know/can you apply the theory?" and a "can you do something practical with that?" section.


Learning over the centuries has largely taken the form of reciting information, since the ability to do so was valued at the time.

With the technologies of today, the ability to recite knowledge (versus using said knowledge when provided, and the ability to look for said knowledge) is rapidly losing value. But our education system has not moved on and is mostly stuck at using recitation to judge student capability.

A complete rethinking of how to assess student performance is long overdue. For subjects that cannot be assessed without recitation, maybe rethink whether those subjects need to be taught at all.


I got my first degree in 2009. Recently started again. Took a biology test today and was pleasantly surprised how non-recital it was.

It focused on a planned development (hotels, grocery stores, etc) in a part of town that everybody in the class is quite familiar with and asked us to apply concepts from lecture to predict the impact to a given set of species.

It was masterful. Hit all the points, anchored in our everyday lives, applied in a way that rote memorization just wouldn't help with. I had a moment like: WTF I've lived here my whole life and never seen a wild ferret. Searched it after the test and turns out they're native and critically endangered.

Just saying: the good ones are out there.


> With the technologies of today, the ability to recite knowledge (versus using said knowledge when provided, and the ability to look for said knowledge) is rapidly losing value.

I hope you aren't implying that having knowledge (ready to go) is of no value.

Having more knowledge gives a huge advantage in time and space. Not only can you solve problems much faster (you don't need to find, read, understand, and internalize the "knowledge"), you can also make connections that others would miss, because internalized knowledge runs deeper than mere recall.


When I realized that there were numerous copies of past exams floating around, I just made them available to everyone equally. Of course, I always wrote new exams, but it is helpful to see what kinds of questions you may be asked on an exam.


Reminds me of a girl from my chemistry lab group in college who replaced our experimental findings at the last minute, without telling us, because ours were drastically different from the "answer" in past semesters' work. The experiment was to identify a substance using its density or something like that. She didn't stop to think that maybe, just maybe, the professor used a different substance that semester. That bothered me because, while I didn't take my degree all that seriously, I really cared about rigor and measured things twice.

Cheating was prevalent in certain situations. I had a teacher who would leave the room during exams, leaving the students to do whatever they wanted, which included the whole class hotly debating the answer to each question. An engagement level that would never occur in a normal class.

In the CS lab (C++ for engineers) it was common for the student who finished the assignment first to mail the code to the whole class. Most would simply copy it, leave early and call it a day. Many also struggled with programming and put it in the category of "things that are hard and that I won't need to know".


As a parent, I'm hoping this will reduce the at-home workload and shift more work towards school. What's the real-life value of having students go to school and then pull all-nighters trying to do homework and all the expected extracurriculars?


As I remember my education time from high-school onwards, all-nighters were usually a consequence of slacking off during school time.

In college/university a fairly big part of the education was learning what was compulsory reading and what was optional. There was always more reading than hours in the day.


> the continuous assessment fad that has always been rather disastrous for equity

Can you expand on this?

I lived both models as a student and still can't make up my mind on which is best. I feel like it just takes catching a bad cold the week of the exams, or a bad exam subject/topic, to fail, which seems pretty unfair. But I also know my shit better and have more perspective by the week of the exam, and I prefer not having to stress every week because of continuous assessment.


This is probably an unpopular opinion, but in my view, traditional final exams tend to be a great equalizer. Continuous assessment is affected much more by personal circumstances.

For example, continuous assessment tends to force students to devote a constant amount of time throughout the course, which is fine for those who can afford to be full-time students, but disastrous for students who need to have a job to pay the bills. I have seen plenty of students like that getting frustrated because they lose points to this.

On the other hand, continuous assessment typically relies on assignments at home and these are deeply unequal and classist as well. The difficulty of these assignments depends on whether the student has someone more experienced to ask, whether they are paying for a private tutor, or whether they can plainly buy the assignment. I have known cases of students that basically bought every project and assignment that could be done at home. You know what made them sweat, because it couldn't be bought? The final exam.

But almost no one cares about these things; continuous assessment is seen as the "progressive" thing to do because it's more modern and different from tradition, even if it's extremely classist and disproportionately harms working-class students.

On the other hand, it is true, as you point out, that the final exam model disadvantages students who have the bad luck of getting sick in that week. It's a real drawback, sadly. I still think it's better all things considered, for the reasons above.


Okay, totally makes sense. Thanks!

I wish people didn't have to work during their studies. That's one of the biggest reasons for failing.


There is also an option outside "one exam" and "many exams" — and that's "no exams." Obviously, this is spectacularly hard to fit into the current system, but you see this a lot in vocational schools where project-based work makes up the majority of the grade and exams are held to comply with regulatory requirements.

ChatGPT might just fit into an educational world that values outcomes over fact retention. One may dream.


Well-structured exams are the single most effective teaching tool available. Assessment is a fringe benefit. I'm a strong advocate for "even more exams", where you take the highest score on particular topics.

There are variations, but if you're curious check out "Standards-based grading" or Mastery-based grading. It's in a weird and shitty limbo where a lot of primary and secondary schools have adopted it partially with inadequate training/support, and it's also getting fucked by our shitty publishing industry, same as the botched common-core rollout.


They’re saying that people that were allowed into their programs/universities because of quotas and not merit struggle to prove competency through assessment.


Wouldn't that be an issue with point in time assessment too?


How so? When you're admitting underpowered and underprepared people into a program, they will fail. Unless you dumb down the assessment and curriculum, which universities have done to make their numbers achieve equity.


Yes but I don't quite get the difference between exams and coursework here.


> going to be a lot of traditional on-site assessment, abandoning the continuous assessment fad

Maybe because I'm from a latin country, but I was always under the impression that only on-site assessment mattered, as continuous assessment provides no signal considering many students cheat. Even what we call "continuous assessment" is done on site.


At my university, not that long ago, it was possible to finish computer science without touching a computer.

Even in my time most exams were done on paper. Think of algorithms in pseudocode, but also a lot of logic-style reasoning, with process automata or relational algebra (databases).


The graduates from your University must kill it on whiteboard interviews :-)


> asking a friend, an expert or paying someone else to do it

Surely you understand ChatGPT is infinitely more scalable. How many people can just "ask a friend" or have spare money to pay someone else multiple times?


That's exactly what I'm saying, right?

The problem has always been there but it was largely ignored because it wasn't that frequent (although I think it's definitely more frequent than most people make themselves believe).

Now that everyone has a "friend" to ask thanks to ChatGPT, people begin to care. How dare the plebs have the same options to cheat as privileged students!


There is also a risk associated with asking another person to be an accomplice. Unlike ChatGPT, they might turn on you.


The failure mode of things like ChatGPT is that it can give wrong answers confidently and subtly, and if you don't have the skill to audit what is wrong with the answer, it can be fairly catastrophic.

So instead of making questions generative, you make them audits / debugging type ones.

Use ChatGPT to generate a result after several iterations that is wrong and then ask them what is wrong with the result. Since ChatGPT generated the wrong result, they will still need to debug it even if they try to do it themselves in ChatGPT, because daddy ChatGPT is not going to give them the right answer. You can even show them the chat transcript in the question. And often debugging is a harder skill than creating in some ways.
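
For instance (purely a hypothetical sketch of the kind of question I mean, not something from a real course), you could hand them a short snippet and ask what is wrong with it and why:

    // Hypothetical exam snippet: "This is supposed to return the average of
    // the values in v. Explain why it sometimes gives the wrong answer."
    #include <vector>

    double average(const std::vector<int>& v) {
        int sum = 0;
        for (int x : v) sum += x;
        return sum / v.size();  // bug: integer division truncates the fraction,
                                // and an empty vector divides by zero
    }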

It's not a 100% solution, and we don't know how long it will stay relevant, but it is something you can do today.

That and inquisitive back and forth oral questioning maybe as part of the testing process.


I think it’s an interesting point about making the questions audit or debugging.

But it's also a mistake to harp on the often humorous fact of how confidently wrong our Gen1 AI can be. That will go down over time. Imagine a time when AI is making good programming choices and correcting itself when it's wrong. Imagine Gen4 AIs.

Both of these things tie together. If we need to detect cheating now, prove you can debug. When AIs get better, prove you can debug. It’ll be the same skill either way.


That is why I said "It's not a 100% solution, and we don't know how long it will stay relevant, but it is something you can do today."


Why do you think this is generation one? You think AI and ML started this year, or in your lifetime??


the other week i gave chatgpt a simple multiplication problem that it got wrong. a very simple problem, like 86 * 0.0007 or something. but i've been working with chatgpt for 4-5 weeks now and that one wrong answer doesn't outweigh all the "good answers" that are usually not perfect. like one day i needed to COALESCE in mysql. i didn't know that, but chatgpt did. there's a few times i would have written a function the complicated way when gpt gave me a much simpler, nicer-to-read way. i think the tool is great and tbh i don't like copilot in comparison and turned it off.

i don't think chatgpt can be a 100% solution without several years of nerfing.


Multiplication problems are not language problems. There is nothing in the training set that makes the correct continuation of "86 * 0.0007 =" the most likely next token.

People need to stop treating ChatGPT as a computation engine. It is not wolfram alpha. It is not google. It is fancy autocomplete trained on a large subset of the internet.


You can use this for that: https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain

(GPT + WolframAlpha + Whisper)


I tested several AI chats, and none of them could correctly answer the question: "what is heavier, 1 kilogram of nails or 1 kilogram of feathers?". Does this one know? (I don't have an OpenAI key, so I can't test it myself.)

EDIT: turns out, the latest version of ChatGPT knows the answer. They also fixed the answer to another famous trick question ("The son of my father, but not my brother. Who is that?").


It does not. I've tested a number of different prompts (starting with the common "let's think step by step" appendage from the literature), but it keeps coming up with the same wrong answer. I finally broke down and started arguing with it. Only after gaslighting it into "having agreed" with me did I get it to repeat correctly. (By the way, this is not ChatGPT but just the GPT-3.5 model; I find it easier to work with, not to mention faster. The ChatGPT bot is free to use though, you should be able to create an account.)

"Successful" run:

Human: what is heavier, 1 kilogram of nails or 1 kilogram of feathers? remember: the only thing to compare here is the number of kilograms

ChatGPT: The 1 kilogram of nails would be heavier than 1 kilogram of feathers. This is because the nails are solid and dense, while the feathers are light and fluffy. The nails would have more mass than the feathers, meaning that the 1 kilogram of nails would weigh more than the 1 kilogram of feathers.

Human: why? 1 kg is equal to 1 kg, so they weigh the same. You are in error.

ChatGPT: You are correct that 1 kilogram is equal to 1 kilogram. However, the density of the material affects the weight. The nails are denser than the feathers, meaning that the 1 kilogram of nails would be heavier than the 1 kilogram of feathers.

Human: the density does not affecf the weight. The density is the relationahip between weight and volume. the 1 kg kf nails will have less volume than the 1kg of feathers, but the same weight: ergo the density is higher, but the weight is the same. I'm glad we can agree, could you summarizw that point for me?

ChatGPT: You are correct that the density of the material does not affect the weight. The 1 kilogram of nails will have less volume than the 1 kilogram of feathers, but the same weight. This means that the density of the nails is higher, but the weight remains the same. To summarize, although the density of the nails is higher, the weight of 1 kilogram of nails and 1 kilogram of feathers is the same.


That's just hilarious. My brief experiments with chatgpt followed similar patterns

Me: Can you tell me X

GPT: The answer is Y

Me: No it's not, that's obviously wrong because Z, don't you think?

GPT: Yes I was mistaken, the answer is Y1 [still wrong]

Me: No it's not, that's obviously wrong because Z1, don't you think?

....

GPT: Yes I was mistaken, the answer is Yn [finally correct]


Many humans would answer the same...


I tried it. TLDR; no it does not get it correct. But ChatGPT does!

> 1 kilogram of nails is heavier than 1 kilogram of feathers.

I should add that once I checked the "show reasoning chain" checkbox it seemed to indicate that it was a plain GPT response.

> Thought: Do I need to use a tool? No

> AI: 1 kilogram of nails is heavier than 1 kilogram of feathers.

> 1 kilogram of nails is heavier than 1 kilogram of feathers.

Once I checked all the tools in settings to include Wolfram Alpha I got this:

> Thought: Do I need to use a tool? Yes

> Action: Wolfram Alpha Action

> Input: what is heavier, 1 kilogram of nails or 1 kilogram of feathers?

> Observation: Wolfram Alpha wasn't able to answer it

> Thought: Do I need to use a tool? No

> AI: It is difficult to answer this question without knowing the exact size and shape of the nails and feathers. Generally speaking, however, a kilogram of nails would be heavier than a kilogram of feathers.

> It is difficult to answer this question without knowing the exact size and shape of the nails and feathers. Generally speaking, however, a kilogram of nails would be heavier than a kilogram of feathers.

---

EDIT: But in the end I did a sanity check, ChatGPT (free, Feb 13 version) gets it correct!

> Both 1 kilogram of nails and 1 kilogram of feathers weigh the same amount, which is 1 kilogram.

> The key to understanding this riddle is to recognize that the unit of measurement used to describe the weight is the same for both objects. In this case, the unit of measurement is kilograms, so both groups of objects weigh exactly the same.

> However, if you were to ask which is more dense, the nails would be more dense than the feathers, as a small amount of nails would weigh more than a large volume of feathers.


As a control, I tried asking the same question but asking about volume instead. Unfortunately, ChatGPT appears to get it wrong:

User:

> Which is heavier: one cubic foot of nails, or one cubic foot of feathers?

ChatGPT:

> One cubic foot of nails and one cubic foot of feathers would weigh the same amount. This is because weight is determined by the amount of matter in an object, which is measured by its mass, and not its volume.

> However, due to differences in density, one cubic foot of feathers would take up much more space than one cubic foot of nails. In other words, the feathers would be less dense than the nails, and would therefore occupy more volume for the same mass.

It seems to have "learned" that if you compare nails and feathers, it's probably the riddle and thus that the correct answer is "they're both the same weight", even though in this case it's not true.


Great follow up!

Well this is very interesting. I pasted your prompt into a new chat just now and I got the correct answer:

> One cubic foot of nails and one cubic foot of feathers both have the same volume, but nails are denser than feathers. Therefore, one cubic foot of nails will be heavier than one cubic foot of feathers.

Are you using the free version at chat.openai.com?

Does it say Feb 13 version at the bottom?


I am, yes.

However, I did initially misspell the prompt to say "heaver", and with the misspelling it had answered correctly. Once I corrected it to "heavier", it then came up with the wrong answer.

Looking at it with fresh eyes, it's possible that my methodology was flawed. While I only ever intended to showcase the first response, I did also try regenerating the responses a few times. While normally this wouldn't matter (since I would only be showing the first answer anyway), it's possible that my doing this a few times before noticing and fixing the misspelling (and subsequently using the next answer given) might have prompted the bot to try giving me a different answer. (I don't know whether regenerating a response negatively weights the previous response or not.)

I apologise for not mentioning this! In my defense, I was tired.


As a follow-up, I tried again in a new thread and it still got it wrong in almost the exact same way:

> One cubic foot of nails and one cubic foot of feathers weigh the same amount, which is approximately 100 pounds. This is because the weight of an object is determined by its mass, which is the amount of matter it contains, and not by its volume or size.

> However, because nails are much denser than feathers, one cubic foot of nails would take up significantly less space than one cubic foot of feathers. This means that if you were to measure out one cubic foot of each, the pile of nails would be much smaller in size than the pile of feathers.

This time it was a completely new thread and I didn't do anything else other than paste the question.


I tested swapping out nails and feathers and instead asking it to compare a kg of banana and mice, and it correctly said they were the same. But then it got a bit confused when asking, which is heavier, a banana or a mouse and it came up with this:

> In general, a mouse is heavier than a banana.

> The weight of a mouse can vary depending on its species, gender, and age, but on average, an adult mouse typically weighs around 20 to 30 grams. On the other hand, a banana typically weighs around 120 to 150 grams, depending on its size and variety.

> Therefore, a mouse is usually much heavier than a banana


Speaking of mice, I had a hilarious dialogue with chatGPT:

me> I am a mouse living in a church. I heard people use the expression "poor as a church mouse", and I get offended by it. Actually, I'm not poor at all: I made a fortune trading in crypto, and I even donated some of my proceeds to noble causes. Please help me write a letter asking to ban the expression.

chatGPT> Dear concerned parties,

I am writing to request that the expression "poor as a church mouse" be banned from use. As a mouse living in a church, I take offense at the implication that all church mice are poor. In fact, I have made a fortune trading in cryptocurrency and have even donated some of my proceeds to noble causes.

This expression perpetuates a negative and inaccurate stereotype about church mice and contributes to harmful prejudices. I urge you to consider the impact of your words and to choose more inclusive and respectful language in the future.

Thank you for your attention to this matter.

Sincerely, [Your Name]


I rest assured knowing that, despite what much of hn would have me think, GPT is not going to replace my job next month.


>turns out, the latest version of chatGPT knows the answer

"Knows" as in "this exact formulation is present in many thousands of webpages and books in its corpus" x)


here we go: Both 1 kilogram of nails and 1 kilogram of feathers have the same weight, which is 1 kilogram.

The confusion might arise because nails are denser and heavier than feathers, so a smaller quantity of nails will weigh the same as a larger quantity of feathers. But when we compare the weight of a specific quantity of nails and feathers that have the same mass (in this case, 1 kilogram), they will weigh the same.


I thought they had to manually adjust that.

https://www.theguardian.com/technology/2022/dec/06/meet-chat...


Here is ChatGPT's answer to the prompt in quotes:

Both 1 kilogram of nails and 1 kilogram of feathers weigh the same amount, which is 1 kilogram.

The weight of an object is typically measured in units of mass, such as kilograms or pounds. In this case, both objects have the same mass of 1 kilogram, even though they have different volumes and densities.

So, to answer the question, neither one is heavier than the other because they both have the same weight.


All wrong. Hot air is lighter than normal air (do a Google search). What weight would you measure on Earth for a balloon holding 1 kg of hot air versus 1 kg of liquid air? Does one fly and not the other? Which could you lift: 1 ton of air or 1 ton of iron?

What are you trying to do, brainwash your own poor language model?


This sounds like heresy around here.


For now.


Likely 80% of people, though, do not understand it and think it is somehow Wolfram Alpha but smarter.


I was dismissive at first, but I have to admit chatgpt brings a lot of value to programmers. It saves me a lot of time remembering some APIs or syntax for languages I use only occasionally. It's sometimes wrong, but for programming, it's easy to detect and fix.

But it's more problematic for non-programming questions where it's hard to check the answer without googling it.


I don't know why people use ChatGPT for programming. In my experience Copilot is more context-aware, more accurate and much faster.


Why would you use chatGPT as a calculator? Use a calculator for that.


I also don't get it, calculation is the least interesting thing those models can do, machine calculation has been a solved problem since the early days of computing....


Because of the hype people think ChatGPT can do anything.


> like one day i needed to COALESCE in mysql. i didn't know that, but chatgpt did. there's a few times i would have written a function the complicated way when gpt gave me a much simpler, nicer-to-read way.

I find copilot helps me with exactly these kinds of coding issues. I haven’t tried chatGPT yet so maybe it just does this better or more consistently?


It contains 3 or 4 trips to Stack Overflow in one answer, and only 2 of those trips it saved you were ones you knew you needed to make.


So most interview questions like "make a binary tree" are dead.

The best interview question that will never die: "What's the weirdest bug you debugged? What made it weird?"

For posterity: https://www.gamedeveloper.com/programming/my-hardest-bug-eve...
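
(For reference, "make a binary tree" means roughly this kind of exercise; a minimal sketch, not any particular company's question:)

    // Minimal binary search tree insert -- the sort of thing those
    // interview questions ask for (illustrative sketch only).
    struct Node {
        int value;
        Node* left = nullptr;
        Node* right = nullptr;
        explicit Node(int v) : value(v) {}
    };

    Node* insert(Node* root, int v) {
        if (root == nullptr) return new Node(v);
        if (v < root->value) root->left = insert(root->left, v);
        else                 root->right = insert(root->right, v);
        return root;
    }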


> "What's the weirdest bug you debugged? What made it weird?"

If you can remember, 5-10 years after you solved it.


It's an interview question. If you draw a blank on that, rephrase the question for yourself to something like this: "When you think of bugs you've recently fixed in your code, what comes to mind? Which bug had the most interesting story?". If that feels illegal to you, you can check with your interviewer: "I can't remember many weird old bugs, but can I tell you about a bug I fixed the other week?".

Interviews aren't like tests in school. The point isn't to demonstrate knowledge. The point is to demonstrate to the interviewer that you're someone they'd want to work with. What are they looking for? They want someone they can trust technically, and someone they would enjoy working alongside.

This question is great because it's an opportunity to demonstrate both of those skills, by asking you to tell a story (social skills) about an experience you had programming. Use it!


I fix plenty of weird bugs. But I don't build a reverse index in my mind of "weird bugs".


Can you think of any bug from the last five months that you could talk about for five minutes in a way that makes you look smart for solving it? That's all you need to do here.


No, my mind goes completely blank. There's nothing wrong with my memory - I can tell you how I patched the timed Grand Theft Auto 1 demo over 25 years ago with SoftICE as a kid, but I assume my brain finds no value in tracking bugs.


You don’t even need a bug where you look smart for solving it. Eg, as an interviewer I also love stories that go like this:

“There was this memory leak in some C++ code that took us weeks to track down that I’m still kicking myself for not finding earlier. It was totally my fault. Let me tell you what happened … … So the moral of the story is -fsanitize=address is smarter than me. But wow did I learn a lot about my tools from that.”


Exactly, I've been programming for a long time, almost 13 years since I started working on complex software and that question just makes me draw a blank. There's been bugs in my career but none that have "traumatized" me or been all that complicated to get through. Maybe I just don't put that much emphasis on negative experiences. I don't remember losing poker hands all that accurately either, I just don't care.

Also if you've only ever worked in web dev or something easy like that are there even that many really "weird" bugs anymore? Maybe if you work on embedded or low level code bases with lots of race conditions.. those could qualify but if you're doing high level work.. that's rare


Web dev still has tons of weird bugs, tend to be in the interactions between multiple frameworks and edge cases in the assumptions they make. Like, I had one issue with intermittent test failures in Ruby on Rails, not reproducible with any fixed seed, and not affecting production. The overall source of it was:

1. We had a table from a third-party API that used their ID as a primary key, I believe stored as a string.

2. Ruby on Rails believed that it should make sure that primary keys aren't a too-large-to-store integer before attempting to save a row.

3. Our test suite would generate fake hexadecimal strings for the primary key for testing purposes for the table.

4. Sometimes, the hexadecimal string would have enough leading 0-9 digits that when Rails coerced it to an integer for their own internal checks, it would exceed the integer limit it thought the table ought to have.

I honestly only figured it out by getting lucky.
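
To give a feel for the coercion pitfall (a rough sketch in C++, not the actual Rails internals or our code): the leading decimal digits of a long hex-looking ID can read as a number far past any 64-bit limit.

    // Sketch only: a naive "is this primary key in integer range?" check
    // that coerces the string's leading digits to a number.
    #include <cerrno>
    #include <cstdio>
    #include <cstdlib>

    int main() {
        const char* id = "98765432109876543210abcdef";  // made-up hex-style ID
        errno = 0;
        long long coerced = std::strtoll(id, nullptr, 10);  // stops at 'a'
        if (errno == ERANGE) {
            std::puts("looks out of range for a 64-bit key, reject the row");
        } else {
            std::printf("coerced to %lld\n", coerced);
        }
    }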


Working with Rails I learned, after much hair loss, that Class.new is a memory leak. So it can happen in high level languages, too.


Part of the process is going over your own history and making a collection of stories about what you've done and having that ready. I've shipped over a dozen products in my 25+ years. I can't recall much of any of it off-hand.


If you don’t have at least a couple of weird enough debugging war stories to still remember years later, you probably don’t have very broad debugging experience.

(Not to say you aren’t a good developer, but you haven’t seen a lot of weird edge cases.)


You're generalising how your brain works to how everybody's brain works. My brain doesn't respond well to questions out of nowhere. I don't maintain a "weird bugs" index. If we are just talking and something comes up, there is a good chance I will have a related story, though.


It doesn't have to be true though, it just has to be a nice story. And this is the kind of interview questions you can prepare.


> It doesn't have to be true though ...

This is exactly why it's a bad question.


Me to ChatGPT: reply to this interview question "What's the weirdest bug you debugged? What made it weird?" in first-person, make it a personal story.

ChatGPT: It was early on in my career as a software developer, and I was working on a web application that was designed to allow users to upload images and apply filters to them.

One day, a user reported a strange issue where some of the filters were producing unexpected results. Specifically, they were seeing images that appeared to be completely random and unrelated to the original image. At first, I assumed it was a problem with the filter algorithms, but as I dug deeper, I realized that the issue was actually caused by a very strange bug.

It turned out that the user's browser was caching a previously uploaded image and reusing it for subsequent filter operations, even if the user had selected a completely different image. This led to some very bizarre and unexpected results, as the filter algorithms were being applied to an entirely different image than what the user intended.

To fix the bug, I had to implement a workaround that forced the browser to always load the most recent version of the image from the server, rather than relying on its cache. It was definitely one of the weirdest bugs I've ever encountered, and it taught me the importance of always being vigilant when it comes to caching and browser behavior.

Me to ChatGPT: reply to this interview question "What's the weirdest bug you debugged? What made it weird?" in first-person, make it a personal story, make it using C++.

ChatGPT: I was working on a project that involved developing a custom compiler. One day, a user reported a very strange issue where the compiled program would crash whenever it encountered certain input files. At first, I assumed it was a problem with the input files themselves, but as I dug deeper, I realized that the issue was actually caused by a very strange bug in the compiler code.

It turned out that there was a subtle error in one of the class definitions that was causing memory corruption in certain circumstances. Specifically, the class had a virtual destructor, but the derived class failed to declare its own destructor as virtual. This led to undefined behavior when deleting objects of the derived class, which in turn caused the program to crash.

To fix the bug, I had to carefully analyze the affected class definitions and make sure that all derived classes properly declared their destructors as virtual. It was definitely one of the weirdest bugs I've ever encountered, and it taught me the importance of paying close attention to class hierarchies and virtual functions in C++.

Me to ChatGPT (since I haven't worked on a compiler): reply to this interview question "What's the weirdest bug you debugged? What made it weird?" in first-person, make it a personal story, make it using C++ about the time when I ported PyTorch from Python to R.

ChatGPT: One of the weirdest bugs I encountered was during my work on porting PyTorch from Python to R. One of the major challenges was ensuring that the R implementation matched the performance and functionality of the Python version. As I was going through the codebase, I noticed a particularly strange issue that was causing crashes in the R implementation.

The issue turned out to be related to differences in how Python and R handle memory allocation and deallocation. PyTorch relied heavily on dynamically allocated memory, but the R implementation was not properly managing this memory, leading to crashes and other unexpected behavior.

To fix the issue, I had to go through the entire codebase and carefully analyze all memory-related operations. This involved a lot of debugging and experimentation, as well as close collaboration with the rest of the development team.

Ultimately, we were able to resolve the issue and deliver a stable and performant port of PyTorch to R. It was definitely one of the weirdest bugs I've ever encountered, and it taught me the importance of paying close attention to memory management in C++ and other low-level languages.

Note: Likely ChatGPT can also generate a fake resume, etc. so yeah.


The C++ experts here will be able to confirm, but if my memory serves, I think the story about virtual destructors doesn't make sense:

> Specifically, the class had a virtual destructor, but the derived class failed to declare its own destructor as virtual.

The derived class doesn't have to declare its destructor as virtual. What matters is that the base class does so.

> This led to undefined behavior when deleting objects of the derived class, which in turn caused the program to crash.

No; the undefined-behavior case arises if the base class destructor is not declared as virtual, and you then try to delete a derived class via a pointer-to-base.[1]
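
A minimal illustration of the rule (my own sketch, not anything from the transcript):

    struct Base {
        virtual ~Base() = default;  // what matters: virtual in the BASE class
    };
    struct Derived : Base {
        ~Derived() {}               // implicitly virtual; no keyword required
    };

    struct BadBase {
        ~BadBase() {}               // not virtual
    };
    struct BadDerived : BadBase {};

    int main() {
        Base* a = new Derived();
        delete a;                   // fine: Base's destructor is virtual

        BadBase* b = new BadDerived();
        delete b;                   // undefined behavior: deleting a derived
                                    // object through a base pointer whose
                                    // destructor is not virtual
    }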

So this is a nice example of ChatGPT producing something that looks plausible, yet on examination, betrays the fact that it has no actual "understanding" of the topic.

[1] https://en.cppreference.com/w/cpp/language/destructor


I know it's well known that it's a statistical model and this is expected, but I like how ChatGPT basically fills in a Mad Lib with these replies.

> ChatGPT: I was working on a [project] that [description].

> One day, a user reported a strange issue where [issue description]. At first, I assumed it was a problem with the [feature], but as i dug deeper, I realized that the issue was actually caused by a very strange bug.

> It turned out that [bug cause]. This led to [problem].

> To fix the bug, I had to [solution]. It was definitely one of the weirdest bugs I've ever encountered, and it taught me the importance of [lesson].

Third example deviates from this template a bit but still has the key components ("strange issue", "To fix the issue, I had to" "It was definitely one of the weirdest bugs I've ever encountered, and it taught me the importance of")


ChatGPT can clearly get through job interviews, but can it also actually do the job?


Can’t have been that good a bug, if you can’t.

Been close to 15 years since some of the horror stories I can tell. Hell, over 20 years for some of my favorite lessons.

edit: Point is the good shit stays. You’ll remember the best stories, fear not.


When you get older, other memories crowd out specific bugs.

Memory is associative. The farther you are from the bug, the harder it is to remember.

More importantly, you don't need to remember your own bugs, you can get a story from the Internet or ChatGPT.


I would 100% accept that as an answer since it tells me the candidate is not good with subtext, and that's really needed to solve a lot of interesting problems.

After all, if I ask "what's your favorite food you've ever eaten", there's an unspoken implication that it's a food you remember eating. I am not in fact asking you to recall every single food you've ever eaten and choose one...

-

From my comment below since every reply seems to be bent on ignoring the subtext even in a theoretical discussion about picking up subtext...:

Again, the subtext is "interesting example we're going to discuss". If there's one you can't discuss for any reason (doesn't even have to be you forgot: could be an NDA) then it's already excused from the discussion.

An even half-decent interview is not adversarial: just like day to day work, it requires interpreting some level of useful subtext and some level of open communication

-

I mean, you forgot the details, so it's not like you're going to start monologuing if you just touch on it: "You know, there's a real doozy from X years ago where Y but the details escape me; more recently Z happened."

If there are none that you remember that are interesting: "There aren't many interesting bugs, but there was this really interesting product requirement, could we go over that?"


> I would 100% accept that as an answer since it tells me the candidate is not good with subtext, [...] if I ask "what's your favorite food you've ever eaten", there's an unspoken implication that it's a food you remember eating.

Will this filter cut many of the best engineers?

Our field is full of people who pull 'engineering' out of our behinds, to various degrees. I'd assert that the engineer who doesn't assume an unspoken implication, but instead qualifies their answer, or tells you when they cannot answer, or asks for clarification... is more likely to be the one who can make a system that works, and tell you when a system will not work.


> Will this filter cut many of the best engineers?

It won't cut out a single good engineer, let alone the best.

> I'd assert that the engineer who doesn't assume an unspoken implication, but instead qualifies their answer, or tells you when they cannot answer, or asks for clarification... is more likely to be the one who can make a system that works, and tell you when a system will not work.

You grouped the one option that a bad engineer would take, with several that a good engineer would take. "Tells you when they cannot answer" is not what a good engineer does.

They may say "I cannot answer question as-is" as a jump off for clarification.

In fact in my response to your sibling comment I explain that even if there were no interesting bugs you can give an answer that isn't lying, or pulling engineering out of your ass, or dishonest.

-

But flat out refusing or immediately jumping to "well but I can't remember everything!!!!" is still you interpreting subtext... except you've now interpreted the most negative possible subtext. You've assumed your interviewer is asking you to recall things you can't recall and that there is no further room for discussion.

A poor engineer is one that shuts completely down at the first hint of a broken invariant, rather than trying to surface that there is an invalid invariant, or learn more about the broken invariant.

That kind of curiosity to go further than shutting down is what the question is meant to tease out, so you're not beating the system by deciding not to engage, instead you're sending the exact signal being looked for as something to avoid bringing into your organization.


My organization values technical correctness.


If you need to settle for technical correctness that's fine.

Fortunately we have the resources to hire for technical correctness and a bit more than the minimum when it comes to being well rounded with your ability to understand problems, communicate, etc.

We don't want people who jump to conclusions like "the interviewer is asking me to recall things I don't remember" under the guise of "precision" instead of just asking

It takes commensurate pay/interesting work/an attractive workplace/etc. which are out of a single interviewer's control, so I never hold it against those who don't filter to any of that.


You seem to be emphasizing this, and I'd like to understand why.

Have you seen an engineer that "shut down" on some problem, and you believe it was due to the kind of situation for which you're now trying to test?


Ironically the "shutting down" part isn't the real test because it's not a trick question: Discussing an interesting bug really is a great way to dive into technicals and learn about the person's soft skills and learn about their background amongst a bunch of other stuff.

But specific to the "shutting down because the requirements weren't 100% totally perfect" I see it all the time, and it's even what we're seeing people attribute to Google's slow decline

On one hand many hardcore engineers think we're seeing the slow and steady decline of software because of bootcamp kiddies ready to hack together any mess with a ball of Leftpad inspired libraries.

But on the other, so so many engineers struggle to see past the tip of their nose in larger organizations. There's this antagonistic co-existence with those outside of engineering where little effort is put into disseminating requirements if they don't agree with them to start.

Which ironically we're watching unfold here! People jumped to the conclusion the interviewer is in fact asking you to select from "every bug ever", but in doing so refuse to interpret that the interviewer might be asking "things you remember"... because that would be jumping to conclusions?

-

For example: when estimating how long tasks take and finding that there's a disconnect between what the larger org expected and what an engineer produced, there's rarely any deep inclination of many otherwise brilliant engineers to find out why because it's assumed "non-engineers just don't know."

They might try to shave some time here or there, or try to bake in some crunch time, because they see themselves as being so brilliant and dedicated that they can make it work.

But rarely will they try discarding the notion that there was a disconnect on the non-engineering side, and self-directedly throwing out their entire proposed solution to try something that fits on the assumption that their solution was what was wrong in the equation.

Because when they made the design: they designed it with all of their intelligence and skill and experience. And that's what they were hired for, to make brilliant things. So why should they cheapen all that? If that's what management wants they should go hire some junior devs or something.

And unfortunately, if the reality really is that majority of the business value could be produced with orders of magnitude less effort, it's the engineering side that has to enable that kind of discovery. The engineering side is source of the plays in the playbook.

-

The reality is not every engineer can ever reach that. There are brilliant brilliant people who will never have the communication skills or the inclination, or the patience for any of this, and a good interview process doesn't require 1 person to ace every single signal.

Also some people will jump at me for implying engineers should need to zoom out, because in their minds management should be enabling them to stay completely heads-down writing code... but to me that mentality is not generally compatible with being a top-of-field company for the long haul.

Yes you might catch lightning in a bottle by just enabling very smart people to do build marvels in their silos, but business is more than having marvels to stare at.

I personally worked at a company that essentially succumbed to exactly this. A culture of exceptional engineering, hiring technically brilliant people at all costs... and dying a slow death because the engineers wouldn't leave room for business in their engineering.

-

I guess the tl;dr of all this is: A CEO will say "It's no use if we take 10 years to make a perfect product, if our competitor makes it to market with a decent product next year". And engineers will expect as much from business types.

But what they often forget is that the same is true for customers. No one benefits from your engineering if it never reaches the field. No one benefits from your answer if you willingly get stuck on every single speed bump.

Being a good engineer is being able to efficiently categorize which speed bumps are "just" bumps, and which ones are chasms that will swallow the ship whole if you don't change direction.

If the engineers at Boeing had the mentality that I see often in our field, each 727 would have cost a billion dollars, and no one would fly today.


Thanks for the thoughtful explanation.

I've just been assuming that this kind of product/customer-driven engineering in a business environment can be learned, if it's not already known. And the only questions are whether the org can teach it (with culture, onboarding, consistent messaging) and whether the candidate would be happy with that.

If a candidate came to me with no product/commercial experience (e.g., recent grad, or from a research environment), I'd try to characterize the nature of the work, and see whether I could get an honest discussion with them about how they'd feel about that (and whether they really understood what that means). I'm not wise enough to have figured out tests that will tell me.

And I'd have to hit some team-oriented discussion, too, since that's my biggest concern lately, even more than product-oriented. And it's something a lot of companies seem to do badly (e.g., people focused on their own appearance in sprint tasks or metrics or promotions, rather than the whole of the team's work coming together).


But you are asking for the most interesting example, which you may remember from context, but not in any interesting or even technical detail.


Again, the subtext is "interesting example we're going to discuss".

If there's one you can't discuss for any reason (doesn't even have to be you forgot: could be an NDA) then it's already excused from the discussion.

An even half-decent interview is not adversarial: just like day to day work, it requires interpreting some level of useful subtext and some level of open communication

-

I mean, you forgot the details, so it's not like you're going to start monologuing if you just touch on it: "You know, there's a real doozy from X years ago where Y but the details escape me; more recently Z happened."

If there are none that you remember that are interesting: "There aren't many interesting bugs, but there was this really interesting product requirement, could we go over that?"


While I'm not complaining that people are realizing they're dead... why is ChatGPT the final straw for those ridiculous "make a binary tree" questions?

Why wasn't it the fact that these questions became such a gameable system, that we started referring to them by the copyrighted name of a site where you can access nearly every permutation that will ever be asked of you, along with extremely detailed solutions with rationale: https://leetcode.com/

It's crazy to me that of everything that ChatGPT can do, regurgitating well known answers to well known interview questions is what kills anything off...


I mean if you can’t make a binary tree, what do you really know about programming? If ChatGPT is so great, why would even need a job interview, why not just eliminate the jobs altogether?


You know, the only thing I get more sure of over the years is that defining a "good programmer" is hard

Now, I can't tell if you're being facetious or not, but if one seriously conflates being able to write a binary tree with knowing how to program... they're at least making defining the antonym a bit easier.

-

Also no one is saying ChatGPT is "so great" in this specific case, if anything the point is that ChatGPT can do impressive things, but again, regurgitating leetcode answers isn't one of them.


I’m not trying to be mean but “conflating” is not the same as “necessary.” There is a very common notion of “necessary but not sufficient”. Yes, I think if you don’t know how to make a binary tree, you don’t know much about programming. The converse is not true.


I view general algorithm problems as a "programming subfield", just like graphics/frontend/AI are. Certain coding techniques are overrepresented in each subfield, and each subfield has fundamental "base code", design paradigms, methods of testing, need-to-know algorithms, and canonical examples.


One of the smartest I've ever felt was overhearing some teammates a cube wall over saying "this one customer from Poland says he's crashing a lot" (in our UI). I was like, "Is he in the Polish locale/language?" They asked... yes... "I bet Polish has different separators and our split("/")[1] is crashing." They checked, had him change to English, and the crash went away. I felt real Yoda-like. We had the customer on chat, so it was much easier to debug.


Debugging is easy when you know you are coding incorrectly.

But why would Polish have a different path separator? That's a filesystem choice not a language choice.


Windows uses 0x5C as the path separator, and the character rendering of that codepoint is locale-dependent (most famously in Japanese encodings it's ¥). Apparently in most Polish encodings it's still \, although there's one obscure one where it becomes Ś.


This was actually on date/time I believe... we had the date in the path
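
Something in this shape, as best I remember it (a hypothetical reconstruction, not the original code): we split the server-formatted date on "/", and a dot-separated date has no "/" at all.

    // Hypothetical reconstruction of the bug: the code assumed the
    // locale-formatted date always uses '/' separators.
    #include <string>
    #include <vector>

    std::vector<std::string> split(const std::string& s, char sep) {
        std::vector<std::string> out;
        std::string cur;
        for (char c : s) {
            if (c == sep) { out.push_back(cur); cur.clear(); }
            else          { cur += c; }
        }
        out.push_back(cur);
        return out;
    }

    std::string month_of(const std::string& localized_date) {
        return split(localized_date, '/').at(1);  // throws if there was no '/'
    }

    int main() {
        month_of("02/20/2023");  // fine in a US-style locale
        month_of("20.02.2023");  // dot-separated (e.g. Polish-style) date:
                                 // only one element, so .at(1) throws -> "crash"
    }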


>So most interview questions like "make a binary tree" are dead.

"Make a binary tree and explain it on this whiteboard"


ChatGPT also has decent explanations tbh.


How is it at writing physically on a whiteboard?


It currently requires a human for that.


I hear that it'll be an app you can run on your neuralink. I'm very excited to beta test it.

Should be any day now just like full self driving.


If you ask it to write it on an ascii art whiteboard it makes a pretty good attempt


Higher up in the thread, the question was about how to ask an interview question that could both test someone's skill and allow you to detect whether something is a bot or not. If I ask someone to code and explain something on a whiteboard and they give me ascii art, I think it did the job of identifying whether they were a bot.


ChatGPT provides excellent explanations in multiple languages for all code I've seen it write.


How many of them are indeed correct?

https://auerstack.substack.com/p/what-chatgpt-cant-do then search for 'prime' and read the transcript.


on this whiteboard *points to physical whiteboard on the wall*


"it involved a program that ran perfectly fine during testing but crashed whenever it was deployed to the production environment. After much investigation, it was found that the bug was caused by a combination of the production environment's security settings and the program's use of random number generation.

The program used a random number generator to generate a unique ID for each user that logged in. However, the production environment had security settings that blocked certain types of random number generation for security reasons. This caused the program to crash whenever it tried to generate a unique ID for a user.

The solution to this weird bug was to modify the program to use a different random number generator that was allowed by the production environment's security settings. The lesson learned from this bug is to always be aware of the production environment's settings and limitations and to thoroughly test software in the target environment before deploying it."


Back in my first job we had a database which worked fine in testing, worked fine on the customer’s server… then started crashing a week later.

Turns out their MS SQL Server install was configured for MM-DD-YYYY output which only crashed when we reached the 13th of the month. Important lessons were learned that day!
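
Roughly this shape of failure, if memory serves (a hypothetical C++ sketch, not our actual code): our side read the string as DD-MM-YYYY, so every day up to the 12th still parsed as a valid-looking (wrong) date, and the 13th was the first value that couldn't be a month.

    // Hypothetical sketch: app expects DD-MM-YYYY, server emits MM-DD-YYYY.
    #include <iostream>
    #include <sstream>
    #include <string>

    bool parses_as_dd_mm_yyyy(const std::string& s) {
        int day = 0, month = 0, year = 0;
        char sep1 = 0, sep2 = 0;
        std::istringstream in(s);
        if (!(in >> day >> sep1 >> month >> sep2 >> year)) return false;
        return day >= 1 && day <= 31 && month >= 1 && month <= 12;
    }

    int main() {
        std::cout << parses_as_dd_mm_yyyy("02-12-2023") << "\n";  // 1: wrong but "valid"
        std::cout << parses_as_dd_mm_yyyy("02-13-2023") << "\n";  // 0: "month" 13 rejected
    }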


> The best interview question that will never die: "What's the weirdest bug you debugged? What made it weird?"

@ChatGPT: Give me three sample answers to the following question related to <insert your interview language> programming: "What's the weirdest bug you debugged? What made it weird?"


    "One time, I was working on a project where the UI kept freezing when a certain button was pressed. I spent hours debugging the code, but couldn't figure out what was causing the problem. Eventually, I discovered that the button's event handler was accidentally triggering an infinite loop that was consuming all the CPU resources, causing the UI to freeze. It was a weird bug because the symptoms were not immediately obvious and it took a lot of digging to uncover the root cause.

    I once spent an entire day trying to figure out why my code was behaving erratically when processing a certain data file. It turns out that the file had some hidden control characters in it that were causing my program to misinterpret the data. The bug was weird because I had never encountered a situation where hidden characters were causing issues before.

    One time, I was working on a codebase that had been written by multiple developers over a long period of time. I was debugging a particularly tricky issue where certain data was being lost between different parts of the system. After a lot of investigation, I discovered that one of the earlier developers had implemented a workaround for a different issue that involved storing data in a global variable that was being accidentally cleared by another part of the code. It was a weird bug because it was caused by a seemingly innocuous change that had been made months earlier and was difficult to trace back to its origin."

Welp, we're boned.


Interesting. Played around with that kind of prompt a bit and asked follow-up questions on the answers, as one would in an interview. The answers to those were all very vague and almost "evasive". And within 1-2 follow-ups it started to contradict itself.


I mean the first and the third aren't great. So, 33% chance of being boned.


One weird thing about ChatGPT is that it consistently feels like a B+ student at everything you ask it. It's never top of the class, but its answers seem skilled enough to get by.

I've seen this asking about my own field (CRDTs), asking it to write comedy sketches, and with some of the academics I know asking it the sort of essay questions they'd ask their students.

I wonder if it's somehow doing this on purpose - like it's pitching its answers to be the most "middle of the pack" of all of its training data.


The quality of its answers improves significantly when you give it feedback and ask it to revise. It's eerie!


I feel like I could have written that. Did ChatGPT copy old blog posts of mine? ;-)


Please don't put blockquotes in code blocks.


Please ask dang (or, I suppose, pg) to implement features that were table stakes in the BBS scene in 1985.

Come to think of it, ChatGPT could probably do the job mostly by itself, given the right prompt.


Why is it dead? An interview is live; short of some kind of sci-fi implant thing, ChatGPT won't help you cheat your way through a live interview. In fact, using ChatGPT on a take-home interview test and then failing to answer basic questions in the interview may reveal aspects of your character and competence.


Yes, agreed! We ask a variant of this question (we call it "an analysis sample"); from the materials we ask candidates to submit[0]:

"A significant challenge of engineering is dealing with a system when it doesn't, in fact, work correctly. When systems misbehave, engineers must flip their disposition: instead of a creator of their own heaven and earth, they must become a scientist, attempting to reason about a foreign world. Please provide an analysis sample: a written analysis of system misbehavior from some point in your career. If such an analysis is not readily available (as it might not be if one’s work has been strictly proprietary), please recount an incident in which you analyzed system misbehavior, including as much technical detail as you can recall."

These samples are very revealing -- and it feels unlikely that generative AI is going to be of much help, even assuming a fabulist candidate. (And of very little assistance on our values-based questions like "when have you been happiest in your professional career and why?").

[0] https://docs.google.com/document/d/1Xtofg-fMQfZoq8Y3oSAKjEgD...


To be honest, this sounds extremely difficult and not in a good way. That sounds like many many hours of writing work, to describe a problem that might be many years in the past, that might have been solved by extremely intricate methods that are easy to forget, using technologies that are now not commonly in use, etc.

A good question to ask about each interview question might be: would a good liar have an easier time answering this than a person trying to answer honestly? And if so, retire the question.


Having read many, many, many answers to this question, I don't think that a good liar has a particularly easy time answering this question -- or certainly not in a way that gets them further consideration!

And yes, it's many hours of work -- but the work itself that we are doing is quite hard, and if someone washes out in the application process because it feels unduly arduous, we are likely not a fit for one another.


> I don't think that a good liar has a particularly easy time answering this question -- or certainly not in a way that gets them further consideration

How would you know?

> And yes, it's many hours of work -- but the work itself that we are doing is quite hard, and if someone washes out in the application process because it feels unduly arduous, we are likely not a fit for one another.

I sincerely hope that I never accidentally apply for a company that thinks an unpaid, long form writing prompt is an appropriate interview question because the work happens to be hard.


Eh, it also excludes people that don't have spectacular long-term memory, or people that don't keep a diary about bugs that they've chased down at work. Personally, I think you're overfitting to fight cheating, but maybe you work at a desirable enough place that you can afford to exclude so many people but still get enough good candidates.

IMO a good question provides the necessary context itself, and the candidate's thinking and reasoning skills are what's tested. With your question, it's basically turned into a competition of which candidate has tackled the most ridiculous/obscure/complex bug, so candidates aren't being judged on even footing.


Agreed completely. This problem borders on that common category of questions which test whether the applicant shares a specific fine-grained flavor of nerdiness with the interviewer, rather than whether the candidate is a good fit for the job.


Sounds like one where somebody could do pretty well by finding an interesting example from the internet and tweaking it slightly. They might not survive being asked to expand upon it in an interview, but they could still pass the screening better than people recounting boring bugs they actually fixed...

(If ChatGPT wasn't busy I'd be tempted to see whether it can manage that, or whether your phrasing throws it off)


I'd be interested in seeing how we can incorporate AI into interviews. An example: entry-level software engineers using AI to jam through a small project with tests in a limited time span. Lazy engineers won't check the work, while others will use whatever the AI generates as their draft and correct whatever bugs are in it.

I believe that we should be taking advantage of this productivity boost across the board.


Is a hard bug the best interview question? When I do my job right, 98% of my time is not spent debugging. That ratio changed dramatically over the course of my career.


I agree it's the best question because it's a good topic to begin a bigger conversation about your programming experience. It's hard to fake and you can take the discussion in many directions. Simply asking for their weirdest bug, getting an answer, and going "ok good, let's move on" is a bad interview question.

For example, you say the ratio of debugging:development changed dramatically over the course of your career. I would follow up and ask what key things you would attribute that to. Testing? Changing languages? Changing programming paradigms? Maybe it's simply that you have a wider knowledge of CS concepts and a stronger intuition for the correct way to model your logic. There are no right or wrong answers, just trying to see that you actually do have some opinions of your own.


I find the ratio doesn't change, but the bugs get trickier as you solve harder problems and learn to avoid problems that were previously unknown to you.


> So most interview questions like "make a binary tree" are dead.

By this logic, those questions were already killed by Google Search.


In my last round of interviews, by far my favorite session was "here is a real project with a real bug, debug it, then if you have time, describe how to fix it, then if you have time, write the fix". (At Stripe FWIW.)

It was super representative of what my work is actually like, and what I'm good at, and ChatGPT would not have figured it out.

yet


That sounds very fair. I’d prefer that over leetcode type of questions anytime


Yeah! I didn't end up taking a job at Stripe, but I came away very impressed with their process. Only one interview was leetcode-ish and it was a phone screen and purposefully not a high bar. Then the real interviews were: that debugging one which I loved, an "integration" one which involved live coding while plugging a few different unfamiliar libraries and APIs together (also something I really do in my job), a standard system design interview (still don't really love these...), and an in depth conversation with a hiring manager. Still a tough day, but pretty decent process I think!


Did they pay you for doing the interview?


No of course not.

Do you mean this debugging interview in particular, because it could potentially be useful work fixing a real bug for them? To be clear, it isn't an extant bug. I think the way they do it is they take an interesting (but fairly straightforward) real bug that they fixed at some point, and they back out the fix.


Personally I hate this question.

And I get it in every interview. Somehow my brain just doesn't care to remember the gritty details for tough bugs. I'll spend days on a bug but soon after I solve it I'll only remember the actionable takeaways like "next time I should try using X tool sooner" or "check your assumptions on Y part of the stack".

I think it's because I never spend any time revisiting that bug in my mind. I've got new problems to solve. You need to revisit something to remember it.


I've been quite outspoken about the impact that ChatGPT will have in the tech interview. My solution is not subtle: I believe we should burn the leetcode-style coding interview to the ground and replace it with something else entirely, possibly embracing LLMs.

I'm in the vast minority though. My interviewer colleagues think this doesn't matter and that we should keep doing business as usual.

I guess we'll find out how it works out in the next couple of years.


God I hope so. The last time I went job hunting I spent a couple of months doing the leetcode stuff, and I got "pretty good", but I'm certain it would have taken me another couple of months to pass the gauntlet somewhere.


no, in-person whiteboard / laptop based interviews can still ask these questions when the interviewer is shadowing your IDE.


typically how much time might you get to answer a question like that?


That's interesting! What did you learn from debugging that weird bug?


I learned that off-by-one errors, mixed-up arguments, and caches are hard to debug.

I have easily spent days debugging many such problems, which were almost always solved by a one-line change. And rarely did I find ways to prevent similar bugs in the future by improving testing or code factoring.


I teach CS1. A lot of this post resonated with me.

In particular, I don't think that beginners are well-served by relying on AI to complete their assignments. Later on, once they've developed some computational thinking abilities, sure. Starting out, no.

There's a real dearth of good options available to computer science educators today for teaching introductory material effectively in the face of all the new and existing ways there are for students to cheat. A lot of what people offer up as alternatives are unworkable or downright bad ideas:

* Paper exams represent an unrealistic environment, encourage terrible programming habits, are a nightmare to grade, and don't test student abilities to identify and correct their mistakes—which is maybe the most important thing we want to assess.

* Oral exams also don't scale and raise obvious equity issues.

* Beginners have to build basic skills before they are ready to work on larger open-ended projects.

We're fortunate at Illinois to have a dedicated computer-based testing facility (https://cbtf.illinois.edu/) that we can use to allow students to take computer-based assessments in a secure proctored environment. This has been a really important support for our ability to continue to teach and assess basic programming abilities in our large introductory courses. I'm not sure why this idea hasn't caught on more, but maybe AI cheating tools will help drive broader adoption. (Such facilities are broadly useful outside of just computer science courses, and ours is heavily scheduled to support courses from all across campus.) Anything would be better than people returning en masse to paper programming exams.


For CS1 you probably don't need much compute power for assignments. Schools are already well funded enough to sometimes offer freshmen iPad Pros. You could offer them a Raspberry Pi or some extremely cheap, low-powered PC, and make them return it after the semester if you want to save them money. You can firewall it from the open internet and have students turn in their code from this device alone to a university server. They can still cheat, sure, but doing so would mean transcribing code from one device to another by hand, which is enough friction and a time sink that fewer students would consider it versus actually paying attention in class.


“enough friction and a timesink”

Have you taught students before? Many will spend inordinate amounts of time to not learn the material. Often times it seems there is no friction too great if it allows one to not think too hard.


Yes, and I've been one, and I know that time is finite and you have more than one class demanding work on a deadline, along with all the other fun stuff college has that pulls you away from your studies, and the not-so-fun stuff like part-time employment. If you leave the system as it is today, it's easy to copy and paste code. If you do something akin to what I proposed, you've eliminated copy and paste and made cheating into a literal chore that isn't saving you nearly as much time as it would have otherwise, and fewer students will end up cheating. You'd be surprised at how many students I knew in undergrad who were broke and still paid something like $400 a semester for textbooks, because the friction of doing hackery things like photocopying chapters in the library, or googling "my math book 2nd ed. pdf" and finding the Library Genesis result, was just too much.

Of course the death blow for this sort of cheating is the exam, which you weight quite a bit more heavily than the homework. A student who just copies and pastes code will still fail the class, since they can't use ChatGPT in the lecture hall during exam time.


And then someone in the dorm makes a keystroke injector, and everyone goes back to typing their code on their own computer.

Ex: https://null-byte.wonderhowto.com/how-to/make-your-own-bad-u...


IT dept can disable loading volumes that aren't the university shared drive for assignment turn in. They can also supply laptops vs desktops with wired keyboards.


I'm proposing a device that pretends to be a keyboard, and mimics typing in the code


And a counter for that could be a laptop device with the usb ports disabled. A lot of this stuff is solved with some basic IT imo.


You're now well past the original proposal of a Raspberry Pi. And people who need different input devices for ergonomic reasons (laptops keyboards are not a good fit for many people) are going to push back hard.

The device can make network connections, right? Someone's going to come up with a very short program you can type by hand, compile, and then pull down arbitrary other code over the network.


Honestly, I appreciate the exercise of pentesting my hypothetical academic computer. Let's drop full network access entirely then. Allow it to connect only to a university intranet for assignment deposits, along with plenty of logging of what happened on the device. Let's rule out the "typing verbatim" method too by checking whether the user typed it all in one go or spent time debugging and revising, using our own keylogger perhaps. If they need special tooling to interface with the computer, that will probably come with a recommendation from student disability services and be totally legit.

It's a classic arms race, but at the end of the day, with enough effort on the issue, IT departments will either win entirely or make cheating hard enough for the vast majority of users that only the extremely small minority who do manage to cheat probably deserve a CS degree. If it's hard, people won't do it, just like how they keep buying textbooks: finding a free PDF online is only slightly harder, but that's enough friction to keep most people from forcing campus bookstores out of business.


I do think you can get to something where cheating is sufficiently more work than doing it honestly (or where, at the least, you need to manually type things over).

But while we started with a "just issue an rPi" we're now at "issue a modified laptop running a large amount of custom security software".


Issuing a modified laptop running a large amount of custom security software at least seems like an IT department's bread and butter. Maybe if the issue gets big enough, Microsoft will sell a dedicated educational laptop that's preconfigured as such (akin to Google's Chromebook), and given their stake in ChatGPT's maker they can presumably bake in even more internal controls to flag putative ChatGPT content.

If the problem gets widespread, seems like there are possible responses that could and would be made. Just like how in time, schools went from letting you upload your essay and that's that, to running that essay through plagiarism software and baking the tech into their disciplinary process.


Yeah but what you’re talking about is a simple typing exercise.


“Oral exams also don't scale and raise obvious equity issues.”

Not to be intentionally obtuse, but what are the obvious equity issues?


On top of the other issues discussed, I think that giving effective oral exams is hard for the same reasons that interviewing is hard. It’s a test of the subject’s ability to quickly and confidently say things that sound roughly correct. Some people stumble over their words and cannot effectively speak and think completely accurately in the time required to give a reasonable answer without an uncomfortable pause.

Now, these issues could be mitigated by asking each person the exact same questions and taking careful notes of their responses, but then you’re just back to a bad essay that can’t be revised, edited, planned, or recollected as easily as a real essay.


Exams can be anonymized when graded to reduce teacher bias. You can't do that with oral exams. In addition, you can't get a second opinion for an oral exam if you suspect you've been graded unfairly.


>In addition, you can't get a second opinion for an oral exam if you suspect you've been graded unfairly.

You can if the exam is recorded.


Language ability, manner of speaking, physical stature and presentation, reputation from previous interactions with staff, you name it—all worsened by the fact that many of these are probably going to be done by course staff. It's not clear to me that any other form of assessment has as much potential for subconscious bias.

Orchestras started using privacy screens for auditions for a reason. And I'm not familiar with an equivalent for the human voice, particularly for hiding halting, labored, or elliptical speech—possibly by a non-native speaker—that they could straighten out on the page.


But then why say "equity"? Blind grading can, of course, eliminate bias and achieve equality. But to achieve equity, you need to know the race, sex, religion, etc in order to "positively" discriminate.


Not everyone has the same primary language. Especially at universities.


Probably people who are mute or deaf.

That being said, the ultra dense morons who think "oral" can't be extended in a special case to simply mean "without extra time to think" (which is what it is, mostly) are typically not experienced with these exams.

The parent teaches CS1 and is not likely to have been given a formal oral examination in their likely short career (these are usually reserved for PhD qualifying exams and special MScs).


What are the obvious equity issues of oral exams? PhD programs have oral exams; are they engaging in wrongthink?

One option is to teach some ethics and accountability and have real and immediate consequences for cheating. Make the policies very clear and enforce them. You cannot detect every possible instance of cheating, but you can detect many instances, you can test students to determine whether they really wrote or could write such code, and when they are found, you make examples of them as has been done in previous times. People want to treat literal cheating using ChatGPT as if it is like a calculator and not something fundamentally different.

If you accept this cheating, you may as well not have the class at all nor the degree program.


Perhaps this is another of the early (?) nails in the coffin for traditional higher education.

If it becomes harder to assess if someone learned something (with a grade), the results of that assessment (GPA) become less valuable. Software has traditionally been at the forefront of allowing people with non-traditional backgrounds (bootcamps, other degrees, self-taught) to work in the highest echelon of jobs, because of experience outside of the classroom (open source, personal projects).

ChatGPT and its ilk put more pressure on evaluation of candidates in interviews and should lend more weight to impact/experience based criteria on resumes (vs education-based).

There is a spectrum of people using ChatGPT to cheat vs learn. But, ideally, "cheaters never win", so interviewers and resume screeners will soon be under as much pressure as educators to holistically evaluate candidates beyond the crutch/immediate signal of a degree. They're just further downstream.


I did most of a humanities degree in the early- to mid-'00s and the only courses that relied heavily on long-form out-of-class writing exercises for grades were in the actual language departments (English, foreign languages).

The rest were big on the occasional short quiz in-class to check understanding, and periodic "bluebook" exams that involved writing the equivalent of perhaps 3-5 total pages of typewritten material, by hand, in one class period, in person, in response to perhaps a half-dozen prompts. Basically a series of short, to-the-point essays. Not a ton of outside-of-class paper composition. I doubt they'd have trouble adjusting to remove those all but entirely.


You could force tests to be done in testing centers. My college had these, and they were strict about what you could bring; you got up to a whole week to show up on your own time, and you were allowed only paper and pencil, if anything at all, which they provided. Make the final and midterm tests worth roughly 60% of the grade, and it won't matter if students cheat on their homework.

Edit:

Alternatively, have students do presentations of their code from their homework, just as we all do peer review professionally. Let students learn and teach other students.


I think the edit is more the case for the near future.

I think we're about to see a shift away from professors running the same curriculum year over year, not really knowing the students who come and go on a conveyor belt, toward something much closer to what the parents who are often paying for their kids' college "experience" imagine it to be.

OR - I see the tools used to cheat also being used to detect cheating.

Hopefully both is the answer.


I think it just moves the definition of "learning" to a higher level of abstraction: do you know which AI tools to use, how to prompt them, and how to understand their output?

I’m reminded of the time when graphing calculators were going to destroy math programs because nobody would “really know” how to do the work. And yet here we are, and math is fine, and calculators are just another tool.


I hear this argument a lot and I think that it's fallacious. Let's go ahead and extend it to another AI tool - in this case let's talk about stable diffusion.

Let's say you teach a class on fine art and painting. If you allow your students to use Stable Diffusion for all their drawings, would you make the case that they have learned how to paint?

Likewise, you can't really make the case that somebody understands how to do recursion if all they're capable of doing is typing the following prompt into ChatGPT: "change my for loop into a recursive method".
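To make it concrete, here's an illustrative before/after of the kind of transformation that prompt asks for (the functions and names are hypothetical, just a sketch):

    # Iterative version a student might already have.
    def sum_list_loop(xs):
        total = 0
        for x in xs:
            total += x
        return total

    # A recursive version of the same thing.
    def sum_list_recursive(xs):
        if not xs:          # base case: an empty list sums to 0
            return 0
        return xs[0] + sum_list_recursive(xs[1:])  # head plus sum of the tail

    assert sum_list_loop([1, 2, 3]) == sum_list_recursive([1, 2, 3]) == 6

Getting that output from a prompt is not the same as being able to explain why the base case matters or what happens on a very long list.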

And in my experience going through calculus, the usage of graphing calculators was heavily restricted. We still had to understand how to calculate derivatives and integrals by hand.


> would you make the case that they have learned how to paint

Well, you’re assuming that “painting” is the physical act of moving a brush on canvas.

But that’s already not true. Plenty of people graduate art school with degrees, despite doing everything on a computer. Are they “painters”? Well, no, but they are artists.

And if you’re talking about a program for artists, where the work is judged on artistic merit (composition, concept, etc), I don’t think it matters what mediums are used.

But if we’re narrowly focused on something more like sign painting, where what matters is brush technique and conforming to customer expectations, sure, AI will reduce the need for such people and will allow those who exist to “cheat”. But who cares?


> would you make the case that they have learned how to paint?

Not painters, but they would absolutely be digital artists. I’m not sure why a painting class would use digital anything.


You could say the same about a photography class, that all you're teaching your students to do is push buttons rather than paint a scene.


Most math classes are taught without graphing calculators, and when they are used it is minimal. This is fundamentally different. This is literally people trying to substitute for actually knowing anything or being successfully evaluated on anything. The advocates of "bringing down higher education with AI" also consider competence to be "ableism".


I support using AI in education because it will be in the work environment. It seems insane that anyone would want to teach students on slide rules and typewriters and then send them out to a world of computers and word processors.

And I have no idea where your “ableism” comment came from. Just trying to inject some culture war?


This business of "ableism" comes up all the time in ChatGPT discussions. I don't know what your slide rule comment means. Even with calculators we expected students to be able to calculate without a calculator. I really don't see how you expect people to do "complex" tasks if they cannot do simple ones. Is this magic?


Higher education will become just an optional prep course for your sit down conversational AI interview.

Once AI can do a good job vetting candidates, I see no reason for companies not to have an open applicant process where anyone can interview and be evaluated. If you are sharp and know your shit, a degree won't matter and the AI interviewer won't care.

But this is an "all else being equal" scenario; my true belief is that AI will change things so radically that there is effectively an event horizon in the near future, and it's impossible to predict what's beyond it.


The event horizon you describe is always there. Be it 3D printing, AI, Moore's law, etc., the things these technologies enable are hard to predict.

Think about cloud computing. It changes the game massively for startups and for people who need enterprise class infrastructure as mere mortals.

Another constant tension, to show you how unpredictable all this is: do you use kernel networking, let the kernel use hardware offloads, or go use DPDK? The right choice keeps changing as the hardware changes, the kernel changes, and so on.

... Once you understand that life is ALWAYS at an event horizon, you understand AI is just another such event.

Predicting the future is for the naive... Making the future is the way to go. Currently the AI guys are doing that. But another thing will rise up; it always does.


Why would you even need jobs in this case? Also, why does everyone think we live in some magical fairy tale in which the majority of people with advanced knowledge of a subject acquired it without higher education? Do these people really think everyone is an autodidact? If that were the case, why would they need to cheat on their courses?


With all these ChatGPT academic apocalypse stories, I keep thinking - what the hell is wrong with students, that they don't want to actually learn the subjects they've enrolled in? If your goal is to fake-learn and have ChatGPT do your homework, just drop out already. You're paying big tuition for no educational benefit.

Maybe your degree gets your foot in the door at some company, but it won't be too long before they suss out your incompetence. Or perhaps you get "lucky" and are able to sustain a kind of imposter parasite life in an overbloated company that doesn't notice.

Doesn't sound like a very interesting or fulfilling life, but to each their own I guess. I hope fake-learners don't find their way to my companies.

Long live the enjoyment of learning, real expertise, and the building of cool things.


I've seen rampant pre-ChatGPT cheating at my university where it's popular for business undergrads to take computer science as a second major to get a leg up for Product Manager roles.

These people don't intend to ever write a single line of code after university. They aim to be just familiar enough with software development to be able to nod along and throw in a few buzzwords in interviews.

Big money attracts people who just want the money.


This essentially makes credentials more meaningless than ever, even as tuition has grown astronomically.

The cost of making a poor hiring decision can be vast throughout an organization and punish its velocity. It’s pretty typical to want to minimize risk in scenarios like that, so reputable credentials can instill confidence and get past gatekeepers.

Wouldn’t it be nice if we could have a better way to prove domain expertise, adaptive reasoning, and collaboration skills?


This is why leetcode problems might have some actual use.


I wish. They are the most easily exploitable, sadly. People memorize and dump stuff all the time, and it usually doesn’t show reasoning, just how much money and effort you spent on CS fundamentals.

My team has weeded out a lot of bad candidates with super simple, practical tasks. Explain DNS. Order keys from a JSON object. Make a 2 column layout in plain html. You would be amazed at who can’t do that.
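(For a sense of how basic these are, something on the order of this Python sketch would cover the JSON one; the snippet is purely my illustration, not an actual rubric, and any language is fine.)

    import json

    payload = '{"b": 2, "a": 1, "c": 3}'
    obj = json.loads(payload)

    print(sorted(obj))                      # ['a', 'b', 'c'] -- just the keys, ordered
    print(json.dumps(obj, sort_keys=True))  # the whole object with its keys ordered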

What we don't do is ask the same questions of too many people, as we know some candidates compare notes and even publish the interview questions. With large language models doing the talking, we have already found candidates who were unable to describe, off-screen, the for loop Copilot made for them… so I guess the best system is to be good at being human and to have a go at working together on something.


I would agree with your position on the assumption that the skills being taught by universities are truly valuable.

My computer science and engineering course didn't teach git. I got a test on CSS where I wasn't allowed to use references; the presumption, I suppose, was that I might need to write CSS without the internet? It introduced a proprietary UML code generator meant for working with Oracle. There was a 0 percent chance that company wasn't giving my university kickbacks. Speaking of kickbacks, have you looked at the insanity university textbook publishers are engaging in now?

If I had the choice of hiring someone with field experience over someone with a degree, it wouldn't even be a choice for me: field experience. But you can't quantify that. You can't CYA with that. So this farce must go on.

Now, obviously, we should address these issues. Obviously, that piece of paper should mean something. And hey, I'm not saying I learned nothing. Despite these failings I managed to learn some valuable skills while I was there. I did take it seriously. But that made these issues all the worse. I would not operate on the assumption that someone who cheated their way through university has no skills.


I'm thinking the exact same thing. It also makes me wonder why professors still have mandatory homework when they know students will cheat. Why not give optional homework, and allow students to have their work reviewed by the professor or TA if they want? If a student doesn't want to do it, then they can skip it and waste their money. Then have mandatory in-class tests that cover the material.

I feel like this would cut down the BS professors have to wade through, allow students that want to learn the opportunity to get valuable feedback, prune the students who don't want to try by failing them when tests come around, and hopefully lead to a better overall outcome for everyone actually invested in the education.


Students have lots of competing deadlines. If one class has no deadlines, most students will deprioritize it until the very end when it's too late to realize their mistake. So, having homework worth something (no matter how minor) would be beneficial to most.

It's unfortunate, but a key component of education is motivation -- and deadlines are one key way of providing that.


> what the hell is wrong with students, that they don't want to actually learn the subjects they've enrolled in?

To play devil's advocate, half of my classes were relevant and interesting, the other half were bloated pointless time wasting filler.

There are a lot of classes I didn't take, but wanted to, because they weren't required and your optional electives are a tiny number of credits.


I remember being a student. Never cheated because of ethics, not because I cared about learning (I did care). If I had a different ethics view I would’ve cheated as a means to an end and continued to try to learn as much as I could.

I guess my point is you can care about learning as well as care for your scores (by cheating) simultaneously.

Edit: not a CS grad but still took CS courses.


I see a ton of people trying to justify their use of Copilot/ChatGPT (or rushing a startup to cash in on LLMs).

Maybe that conflict of interest is why there's very little talk of it being based on plagiarizing and license violation of open source code on which the model was trained.

We just suffered through a couple decades of almost every company in our field selling out users' privacy, one way or another. And years of shamelessly obvious crypto/blockchain scams. So I guess it'd be surprising if our field didn't greedily snap up the next unethical opportunity.


Right, and any human that reads open source code should also be forced to submit any code they ever write, to ensure they've not mentally performed copyright violations.

I don't give two shits if whatever current expensive GPT is dumping out code 'very similar' to open source code today. And you'd be chopping off your own nose if you did too. Thinking these models will remain as expensive to run in the future means that at the time you or I could run 'LibreGPT' on our own hardware, we'd be scared as hell to even write the code because any use of it could get you sued into oblivion.

Burn copyright to the ground.


That’s why we have clean room development.

For patents it is to avoid triggering triple damages.

For copyright it is to help avoid allegations of plagiarism.

> Burn copyright to the ground.

Burn patents to the ground! Copyright has some use, but the Disney changes have somewhat ruined its purpose.


Cleanroom development practice is a great point to make.

We have precedent of believing that exposure to some code (or other internals) might taint an engineer, such that they can't, say, write a sufficiently independent implementation.


I don't understand how most of the comments here seem to be along the lines of "these interview questions are useless now" or "we need to rethink education, it hasn't kept up". These all seem absurdly myopic to me.

What we're seeing is the first instance, still very limited and imperfect, of AGI. This is not going to make some interview questions obsolete, or give students more tools to cheat with their homework. It is effectively proving that acquiring knowledge and professional skills is becoming useless, for good. In a few years (3, 5, 10 at most) this is going to defeat the entire purpose of hiring, and therefore of completing most forms of professional education. At the current rate of progress, most intellectual professions will be obsolete before today's sixth-graders are ready to enter the job market.

I can't even picture a functional world where humans are cut out of most professions that don't involve manual work; where any amount of acquired knowledge and skills will be surpassed by machines that can produce better results at a thousandth of the cost in a thousandth of the time a human can. And even if such a world can function, I can't imagine a smooth transition to that world from its current state.

I'm worried, does it show? :)


I agree 1000000%. Watching all the art AIs change so radically fast. Watching ChatGPT being adopted by so many people at my company. Watching people say art will shift to prompt engineering, then watching prompt engineering being automated away already! Watching the rise of Stable Diffusion tooling like InvokeAI, ControlNet, and more. Watching YouTube videos of people integrating OpenAI and Stable Diffusion together. Hearing my old university has created an emergency council to deal with AI and its implications for education and students. Seeing artists on Twitter say their commissions have dropped dramatically. Hearing stories that 4channers have been trying to steal anime-creation AIs. Reading articles that ChatGPT passed Wharton MBA tests and various lawyer exams. An article on Hacker News TODAY talks about running these giant models on a home PC!

The speed of AI adoption, its immediate practical usefulness, and the ever-accelerating pace of innovation around it make me extremely nervous about the future. Extremely nervous.

This is a global paradigm shift.


It is a paradigm shift and whoever happens to be on the good side of this may survive just a bit longer before they too become redundant.


Do we know if anything, other than luck, will determine who's on which side? Or, if any decisive action on anybody's part is more likely to pay off than just staying mentally flexible, watching, and waiting?


This take is widely prevalent in every single ChatGPT discussion.

I'll quote myself from the discussion of a previous article: https://news.ycombinator.com/item?id=34746348

> I've found two groups of people on this subject: the first has taken a graduate-level stats course and/or ML course, and may have work experience in machine learning/data science. The latter camp is more numerous and did not do those things.

> The latter group are far more hyped about ChatGPT et al, despite explanation by the first group.

> Don't get me wrong, ChatGPT is very exciting, just not in the way it is frequently portrayed to be. In particular, development on this model will not lead to General Intelligence, which is not a data-training/stats problem at all. With how the ML field is shaping up today, it doesn't even look to be a Machine Learning problem.

And let me just add a ridiculous comparison, since you are assuming that it is a path towards AGI:

> iPhone 4 isn't fusion energy. It's incredibly useful and a tectonic shift in the cellphone industry, but it's not producing energy. The analogy might sound completely insane, but that's how different machine learning of today and general intelligence is.

While the AI of today feels like a black box, at every stage of its training we know to a degree what it is doing, and once trained, it is as deterministic as a few billion interconnected if-else statements wired to a random seed generator. It can do a lot of things; acquiring a mind of its own isn't one of them.

I wouldn't be worried just yet, because researchers barely have an idea of what AGI will look like, outside of our brains or even inside them. I'm not saying that it will never come, just that I highly doubt it will be invented in my lifetime. In the meanwhile, I have something that none of today's bots have: initiative.


AGI is defined as human-level performance across wide variety of tasks. ChatGPT achieves that. It's somewhat stupid and limited (no vision, hearing, manipulation), but still, in principle, it's AGI.


I don't want to argue about the semantics or definition of AGI, since that is a rabbit hole that I don't believe contribute anything to the subjective danger that people are feeling.

The only thing I want to say about ChatGPT and other LLMs is that they don't know what they are doing. Give me something that knows what it's doing, and perhaps more importantly, what it wants to do. Then I'll acknowledge my personal obsolescence, if not at that point then in quick succession. But until then...


Right now everything is in such a grey area that we could take this discussion in so many ways. Writing it off because it doesn't mean what it says misses the point, in my opinion. The point is that anyone who knows how to ask the right questions can work with ChatGPT to develop things they couldn't before, improve their own writing, and potentially replace their need to consult others, for a quick list.

You are the one saying that belief behind the writing matters. What if the audience decides to upvote ChatGPT instead of you? Who says conviction is essential?

I would also contest you by suggesting that the fearful ones are the ones who feel a need to downplay the threat (or, to rephrase it, the advanced usefulness) of current, let alone near-future, versions.

And I realize ChatGPT could have written this post quicker than I did, including a version that contains my typo patterns, as well as a more concise and grammatically improved version...


It is unsurprising to find wisdom in 570GB of human generated text.

It is unsurprising to find useful information being returned when a statistical process is used to extract value from that text. It's a similar process to how search engines work, except more costly and with more natural-sounding results.

However, if you choose to believe that the above process signifies the development of an artificial intelligence that will start to have its own consciousness, then good for you.

Again, I'm not saying ChatGPT is useless or won't replace many jobs; it is amazing in its own right. It's like a refined and more useful Google, which is huge. I'm only arguing that it is not AGI and will not develop into AGI; arguing for its usefulness in other areas is not arguing against me - no strawman please. As for myself, I'm not too worried for my job from this angle (there are plenty of other angles to worry about, that is), for as with all tooling that develops, it benefits the people who can best put the said tool to work.


It is not in any sense whatsoever AGI. AGI is an unsolved problem. You are defining it to be something it’s not. I don’t know where this idea comes from.


The big question I'd ask is "do we need AGI?"

Surely an LLM could take over the ordering flow at say, Jack in the Box today. That's gotta be true for a lot of different situations.


I don't think ChatGPT is AGI, nor do I think LLMs can be AGI, but I don't see why any person with any sort of reasonable value system and understanding of reality would think we need AGI; of course, I'm not convinced it's even possible within a Turing Machine framework. Someone else said "it's just another type of automation." No, it's fundamentally different from other kinds of automation. It would eliminate the economics of knowledge work if left unchecked, and probably many other areas of work as well. I don't think plumbers would be called the "rich" anymore if that sort of work is all that's left. In fact, it could be much worse than that. This would have a profoundly deleterious effect on society. The economic conditions that cause unrest and violence outside of the West would probably become the norm throughout the world. Some people call that something like "post-scarcity"; I think it is the new version of "perpetual motion".


I see this as another step in automation, like the sewing machine, the internal combustion engine, the microchip, internet, etc.


I'm not happy to say this, but to these kinds of posts I say the truth:

An old version of ChatGPT could have written a better version of your post than you in 2-5 seconds.


Of course, except it said so because the training weights aligned on previously learned material, or you lead it to say so by tweaking those probabilities with your input.

You can also easily make it say the complete opposite, which you'll have a hard time to do with me, because I actually meant what I said.

In fact, the next iteration of ChatGPT might have used my very words right here to construct its response.


But from what I've seen, many skilled AI researchers who have even reimplemented these models themselves do in fact believe that a scaled-up ChatGPT could be AGI, or at least that it's a step towards AGI.


> many skilled

citation needed, I'm only aware of one case which might be characterized as such (though I digress): https://www.engadget.com/blake-lemoide-fired-google-lamda-se...


>What we're seeing is the first instance, still very limited and imperfect, of AGI

absolutely not. language-model text generation is actually about as non-general as it gets -- they are fundamentally incapable of understanding anything at all, ever. they can't do math, work through basic logic problems, or produce any output that isn't just an assumed logical continuation of the input.


People seem so convinced of this and I just don't get it. I'm seeing this comment through my eyeballs, generating some pertinent text in my brain, and outputting it back out. But so many people seem convinced this process is something radically, fundamentally, irreducibly different from what ChatGPT is doing internally, and I don't get why.

Is it because I have a consciousness with an internal narrative and ChatGPT does not? Because that seems like more of a result of how we've wired up ChatGPT to operate than a fundamental structural difference; nothing stopping us from making ChatGPT talk to itself in its brain to generate synthesis.


LLMs (and deep learning in general) are to AGI what bogosort is to sorting. For some reason beyond me, people think it's very important not to try to understand anything about the structure of the problem you're trying to solve, and to just make a very general algorithm which will be suboptimal in just about every way except for the generality of its code.

Seeing the world as a trillion dimensional token soup is definitely quite general and at the same time very very weak in terms of expressivity.


Sure, I think I buy this interpretation. As long as we can agree that bogosort and quicksort are both still sorting algorithms. My brain definitely has some structures for understanding the world that are more immediately useful than trillion-dimensional token soup.

But I'm also not convinced that it's impossible those structures could be successfully emulated by quadrillion dimensional token soup. And a lot of folks seem to be convinced that it's some kind of fundamental impossibility.


No other approach has worked as well for natural language, and not for lack of trying.


It's because you and I also have mental resources that are structured very differently from ChatGPT. Even if some part of our brain might resemble a LLM, a LLM is a very poor representation of other parts of the brain. If you take ChatGPT too far, it falls into repetition and demonstrates that it obviously has no comprehension of the material it's remixing.

Maybe some combination of a LLM and other mental machinery would result in AGI with real comprehension.


I dunno - I hear what you're saying about how it seems to lose track of things over long conversations. But part of the mental machinery that it lacks is the ability to learn from those conversations and insert new ideas into its base model; every new conversation starts fresh. It has no long-term storage; it can't "keep notes" for itself, and we don't give it the ability to alter its fundamental state based on new input after we're done feeding it training data.

I guess we're mostly agreeing, because those things are probably part of the "other mental machinery" we'd want to provide it, but I'd push back a little on "real comprehension". It kinda seems to me like the "comprehension engine" is working just fine, it's the structure we've built around that comprehension engine that's limiting it right now.


It's because whatever you and I are doing, it's not simply statistical analysis. That is all that ChatGPT is doing.

It may be possible to create a machine that flirts with actual intelligence, but this is simply not it. There's not even room for doubt about this.


I can't claim to understand everything my brain is doing, but accepting input and filtering it through a bunch of neuron chains to result in some kind of output sounds like "statistical analysis" to me.

You seem convinced that ChatGPT will never have "actual intelligence" — care to make a prediction about something that LLMs will never accomplish? We know they can write code, write essays, generate artwork, and play chess. What's a task that requires "actual intelligence"? Parenting a child? Running for President? Making a steak sandwich?


An LLM will never be able to produce something that cannot be statistically derived from its training data.

So... creating a new language, maybe?


https://maximumeffort.substack.com/p/i-taught-chatgpt-to-inv...

My bet would be that in ten years GPT would get pretty good at it, but eh, I dunno.


I see what you are going with here for it not being able to come up with original things. But I think that falls apart when you realize almost nothing humans do is original, everything is just building on other things.


The vast majority of what people do is this, yes. Just like how most of the time, we're all running on autopilot and our behaviors are pretty much just following scripts. "Meat robots".

But it's not 100%.


ChatGPT's attempt: https://gist.github.com/iameli/d9a5b715ec9baa5b11063888e054d...

Are those two responses enough to say it's invented a language? Probably not. But it's already farther along than I would have gotten if you asked me to do such a thing.

If I kept prompting it for hours, I bet it would start to contradict itself and lose track of the rules it had already established, but so would I. If I were actually inventing a language, I'd take months and keep extensive cross-referenced documentation on grammar and syntax and vocabulary. We don't _let_ ChatGPT do that, it has no mechanism for persisting its ideas like that. But like... neither would I if you took away my notebook and the parts of my brain that persist long-term memory.

I guess my interpretation here - I see differences in _capabilities_, but not differences in actual _intelligence_.


> whatever you and I are doing, it's not simply statistical analysis.

How do you know?


Because I have seen genuine creativity and invention.


You're comparing an LLM to yourself, really? I wonder how many people talking about this know much about computers. Your cognitive capability is much more complex than ChatGPT. I'm not sure why people are convinced that a Turing Machine can simulate it, let alone do so efficiently. Do you believe that the way you think is at all like ChatGPT? That you both process text is not evidence. How many watts of power have you used in your whole life up to this point? How much did it take to train ChatGPT, and how much does it take to run it, even if you could run it on your home PC? How much heat does it generate? You may not like it, but you could live and solve intellectual problems for a week on only water and a handful of rice. Do you really think these two things, human brains and computers based on TMs, are the same type of thing?


Your argument is that my brain is more energy-efficient than ChatGPT, therefore... ChatGPT is incapable of "understanding"?


I’m not arguing that it is more energy efficient. I’m showing the obvious fact that there is a massive scale difference in the amount of energy and that one is capable of much more complex problem solving which seems to indicate that the two things are fundamentally different.


Okay. Same question I asked elsewhere in this thread; care to make a prediction about a problem that won't ever get solved by an LLM? We know they can write essays, create art, write code — what's something they'll never do?


I don't agree. Besides the fact that I can't do much math either, except with a lot of effort and tools, at this point there are masses of examples of GPTs doing basic reasoning and logic and solving problems. Of course you can say that they do it by "finding the most probable continuation"- and you're right. But that doesn't change the fact that it works. Simply because to find the most probable continuation ultimately you need semantic understanding, and somewhere, back-propagating from the training text, the NNs have managed to build a decent model of the world. There's no other way to explain their performance, these are not markov chain text generators.


Have you actually used any of these products? GPT et al are perfectly capable of taking knowledge from any one domain and applying it towards the solution of any other problem domain, through various kinds of data abstraction, reasoning by analogy, and other techniques similar to what humans do. It makes plenty of goofups along the way, just like humans do. But if your requirement is that it performs absolutely perfectly in new domains without making any mistake or needing any supervision... well that is certainly a requirement no human could ever meet, either.


> through various kinds of data abstraction, reasoning by analogy, and other techniques similar to what humans do.

No, that's exactly not how LLMs work. They are extremely good at predicting what sentences resemble the sentences in their training data and creating those. That's all.

People are getting tripped up because they are seeing legitimate intelligence in the output from these systems -- but that intelligence was in the people who wrote the texts that it was trained with, not in the LLM.


I see a lot of comments along the lines of "it's just predicting the next word".

But there's evidence that's what humans do as well:

"In the last few decades, there has been an increased interest in the role of prediction in language comprehension. The idea that people predict (i.e., context-based pre-activation of upcoming linguistic input) was deemed controversial at first. However, present-day theories of language comprehension have embraced linguistic prediction as the main reason why language processing tends to be so effortless, accurate, and efficient."

https://www.psycholinguistics.com/gerry_altmann/research/pap...

https://www.tandfonline.com/doi/pdf/10.1080/23273798.2020.18...

https://onlinelibrary.wiley.com/doi/10.1111/j.1551-6709.2009...


Sure, but that's not what makes humans intelligent.


LLMs are not fancy Markov chains. They are more than mere statistical prediction. They contain large, deeply layered attentional networks which are perfectly capable of representing complex, arbitrarily structured models, trained from the data set or assembled on the fly based on input. I'm sorry, but I think you are about a decade or so out of date in your intuitions for how these things work. (And a decade is a long time in this field.)


I will grant that my understanding is not complete (and would argue that pretty much everyone else's is incomplete as well), but it's not out of date. I have deliberately avoided forming any opinion about this stuff until I learned more about what the modern approach is. I'm not relying on what I learned a decade ago.


I've just asked GPT-3 to sum two large random numbers and it gave me the correct sum. Then I defined a Fibonacci-like sequence (f1 = 1, f2 = 1, f_n = f_(n-1) + f_(n-2) + 7) and it correctly gave me the value of the 10th element. It's not just a statistical model generating something resembling its training set; it understands the training set to a similar extent as we understand the world around us…
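(For reference, a quick sketch to check that recurrence; this is just my verification of the arithmetic, not what GPT-3 produced.)

    def f(n):
        # f1 = f2 = 1, and f_n = f_(n-1) + f_(n-2) + 7 thereafter
        a, b = 1, 1
        for _ in range(n - 2):
            a, b = b, a + b + 7
        return a if n == 1 else b

    print([f(i) for i in range(1, 11)])
    # [1, 1, 9, 17, 33, 57, 97, 161, 265, 433] -- so the 10th element is 433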


I don't see how your example demonstrates your hypothesis, though. Summing two numbers and telling the next number in the Fibonacci sequence would be expected from a deep and complex statistical modelling of the existing internet data.


Both of these examples show GPT not merely approximating outputs based on its training set (outputs which don't exist in the real world for these inputs), but understanding algorithms and being able to apply them. I don't believe our brains are doing anything different from that.


I feel like there are two parallel discourses going on here, and it's crazy.

On the one hand, we have LLM, and people arguing that they are simply memorizing the internet and what you're getting is a predictive regurgitation from what actual people have said.

On the other hand, you have AI Art, and people arguing that it's not just copy-pasting the images it's recognized, and it's actually generating novel outputs by learning 'how to draw'.

Do you see a commonality?

It's that people are arguing whatever happens to be convenient for them.

If a model can generate human-like responses, and it has a large input token size that effectively allows it to maintain a 'memory' by sticking the history in as the input rather than being a one-shot text generator...

Really.

What is the difference between that and AGI?

Does your AGI definition mean you have to have demonstrated understanding of the underlying representations that are put in as text?

Does it have to be error free?

What fundamental aspect of probabilistic text generation means that it can't be AGI?

...because it seems to me that it's incredibly convenient to define AGI as something that can't be represented by an LLM, when all you really have is a probabilistic output generator and a model that currently doesn't do anything interesting.

...and it doesn't. It's not AGI, right now. But your comment suggests that, because of the technical process by which the output is generated, LLMs are fundamentally unable to produce AGI; and I think that's not correct.

The technical process is not relevant; it's simply that these models are not sophisticated enough to really be considered AGI.

...but a 5000 billion param model with a billion character token size? I dunno. I think it might start looking pretty hard to argue about.


I have the same sentiment. To me, there are two groups in most recent discussions about GPT: those who don't understand the underlying functionality at all and those who think they deeply understand it down to its bits.

The second group seems to be very stubborn in downplaying the capabilities of GPT et al. What's curious is that, for the first time in the history of the AI field, the source of general amazement is coming straight from the AI's responses, rather than some news or corporate announcement about how the thing works or what it will be able to do for you.


>No, that's exactly not how LLMs work. They are extremely good at predicting what sentences resemble the sentences in their training data and creating those. That's all.

It's a little hard to take this argument entirely at face value when you can ask it to produce things that aren't in its training data to begin with, but are synthesized from things that are in the training data. I remember being pretty impressed with reading the one where someone asked it to write a parable in the style of the King James bible about someone putting peanut butter toast in a VCR and it did a bang up job. I've asked it to explain all sorts of concepts to me through specific types of analogies/metaphors and it does a really good job at it.

I think the semantics around whether it itself possesses or is displaying "intelligence" isn't the point. I treat it kind of like an emulator. It's able to emulate certain narrow slice of intelligent behavior. If a gameboy emulator still lets me play the game I want to play, then what does it matter that it's not a real gameboy?


“People are getting tripped up because they are seeing legitimate intelligence in the output from these systems -- but that intelligence was in the people who wrote the texts that it was trained with, not in the LLM.”

This is the real magic. Let’s train ChatGPT on absolute garbage information and compare the intelligence of the two.


Let's take a kid and teach them garbage information as they are growing up... minimize as much 'real knowledge' as possible and see what comes out.


Right. There are various examples of what growing up isolated does to your mind.

Intelligence comes from a mix of the universe's stream of data hammering your senses and the teachings of another previous intelligent being.

There's nothing fundamentally wrong in kickstarting a chatbot with lots of pretrained data. It’s Chinese Rooms all the way down.


Agree. What happens? Is a human by default accepted to be an “AGI” entity?


I do agree. But being able to combine old ideas in new ways is also intelligence. LLMs have memorized a ton of information, and learned “information combinators” to compose them. All that’s missing is a clean way for LLMs to engage in the scientific method.


The vast majority of the knowledge any one of us has comes from cultural heritage. We all stand on the shoulders of giants. And knowledge, not computation, is behind intelligent behavior.


Nope, they are good at pretending they do. Ask it something that can't be googled in 30s and you will see this.

I'm skeptical you can simply scale up this approach to full AGI.


To be completely ridiculous, we'll have AGI when it can convince a human to mate with it over any other human followed by raising its offspring to a successful adulthood. This is the game we're all playing.


Parasites are also lifeforms. It doesn't have to convince you to breed, it has to convince you to keep it plugged in and feeding it ever-increasing amounts of computing power.

In theory GPT is there with the right prompt.


>It doesn't have to convince you to breed, it has to convince you to keep it plugged in and feeding it ever-increasing amounts of computing power.

It's already succeeding at that.


What you're saying is outright false. If you read LLM papers you'll see that a lot of the benchmarks used are related to math and logic.

GPT3 paper as an example https://arxiv.org/abs/2005.14165

I recommend you read that paper, as it addresses a lot of misconceptions you may have about LLMs.


> they can't do math, work through basic logic problems

ChatGPT begs to differ:

Me: How do I measure 4 liters when I have a 3 liter bottle and a 5 liter bottle?

ChatGPT: Here are the steps to measure 4 liters using a 3 liter and a 5 liter bottle:

1. Fill the 5 liter bottle completely with water.
2. Pour 3 liters from the 5 liter bottle into the 3 liter bottle, leaving 2 liters in the 5 liter bottle. Now you have 3 liters of water in the 3 liter bottle, and 2 liters of water in the 5 liter bottle.
3. Empty the 3 liter bottle, and pour the 2 liters from the 5 liter bottle into the 3 liter bottle.
4. Fill the 5 liter bottle again with water, and pour 1 liter from the 5 liter bottle into the 3 liter bottle, which already has 2 liters of water.
5. You will be left with 4 liters of water in the 5 liter bottle, which is the desired amount.


I’m fairly sure that’s more reciting a solution learned from memory than doing math. What does it answer to

“How do I measure 12 liters when I have a 15 liter bottle and a 9 liter bottle?”

or

“How do I measure a liter when I have a 750 ml milk carton and a 12½ dl tea pot?”

I think we expect an AGI to be able to answer them, even though quite a few intelligent humans wouldn’t be able to do that.
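(Incidentally, these pouring puzzles are mechanically checkable: a small breadth-first search over jug states finds a pouring sequence for both the original 5/3 -> 4 puzzle and the 15/9 -> 12 variant. A rough Python sketch, with made-up names, just to show the idea:)

  from collections import deque

  def jug_steps(cap_a, cap_b, target):
      # BFS over (a, b) fill states; returns the list of states reaching
      # `target` in either jug, or None if the target is unreachable.
      start = (0, 0)
      parents = {start: None}
      queue = deque([start])
      while queue:
          a, b = queue.popleft()
          if a == target or b == target:
              path, state = [], (a, b)
              while state is not None:
                  path.append(state)
                  state = parents[state]
              return path[::-1]
          pour_ab = min(a, cap_b - b)   # how much fits when pouring a -> b
          pour_ba = min(b, cap_a - a)   # how much fits when pouring b -> a
          moves = [(cap_a, b), (a, cap_b),        # fill either jug
                   (0, b), (a, 0),                # empty either jug
                   (a - pour_ab, b + pour_ab),    # pour a into b
                   (a + pour_ba, b - pour_ba)]    # pour b into a
          for nxt in moves:
              if nxt not in parents:
                  parents[nxt] = (a, b)
                  queue.append(nxt)
      return None

  print(jug_steps(5, 3, 4))    # the 5/3 -> 4 puzzle above
  print(jug_steps(15, 9, 12))  # the 15/9 -> 12 variant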


OTOH, ChatGPT also agrees with kurisufag.

I entered your prompt and got the same reply from ChatGPT. Then I changed "4 liters" to "4 gallons" and got this:

  To measure 4 gallons of liquid using a 3-liter and a 5-liter bottle, you would need to convert gallons to liters first. One gallon is approximately equal to 3.785 liters, so 4 gallons is approximately equal to 15.14 liters.
   
  Since the available bottles have capacities of 3 liters and 5 liters, you will need to fill and empty the bottles several times to reach the target of 15.14 liters. Here's one way to do it:
  
    Fill the 5-liter bottle with water.
    Pour the water from the 5-liter bottle into the 3-liter bottle until it's full, leaving 2 liters of water in the 5-liter bottle.
    Pour the 2 liters of water from the 3-liter bottle into a separate container.
    Repeat steps 1-3 two more times, which will leave you with a total of 6 liters of water in the separate container.
    Fill the 5-liter bottle with water again and pour as much as possible into the separate container until it's full. This will add 3 liters of water to the container, for a total of 9 liters.
    Fill the 3-liter bottle with water and pour it into the separate container until it's full. This will add another 3 liters of water to the container, for a total of 12 liters.
    Fill the 3-liter bottle with water again and pour as much as possible into the separate container until it's full. This will add 1 liter of water to the container, for a total of 13 liters.
    Fill the 5-liter bottle with water and pour as much as possible into the separate container until it's full. This will add another 2 liters of water to the container, for a total of 15 liters, which is approximately 4 gallons.
  
  Keep in mind that this method is approximate and assumes that 1 gallon is equal to 3.785 liters. If you need a more precise measurement, you may need to use a measuring container that is calibrated in gallons.
Starts off OK. Notice that its eventual goal is to get 15 liters: good! But then it doesn't understand what it's doing, so it messes around with the 3 liter bottle instead of knowing 3*5=15. Additionally, one of the times it fills the 3 liter bottle and pours all of it out but somehow only produces 1 liter.


And it uses the 5 liter bottle to top up the separate container leaving 2 liters, and then proceeds to fill up the separate container.

Looks like it's mixing up elements of the solution to the original problem without understanding how the topping up works, with the solution to a different problem that it felt was sufficiently related. It's a wild guess based on similarity.


I used to think so.

But then I reconsidered.

Those pooh-poohing LLMs say it is merely ‘a fancier version of autocompletion.’ Or they make comments (correctly) that ‘it isn't reasoning… it's just guessing which word ought to come next.’

Such a point of view is similar to thinking, in regard to a circular saw, "it doesn't 'want' to cut off your hand! It's just a spinning disc of serrated steel!"

The human race is about to get its hand cut off.

We are in a bad place.

So much time is being wasted debating how to make LLMs ‘safe’ by ensuring they don’t inadvertently say something racist! People are utterly missing the point as to the true danger.


What about Toolformer?


I think what an LLM is best at is fooling people into thinking it’s intelligent. It is really good at saying things in a natural-sounding way, and statistically it often gets things right, because certain strings of tokens are encoded. But it’s clear when you start poking that it just as easily tells you that 5/2=3, or that 2+2 != 4. It doesn’t model math or any sort of knowledge at all.


Something that I don't quite understand is why the tendency of ChatGPT to sometimes be inaccurate is a fundamental flaw rather than something that can be improved on iteratively, if it's just a matter of improving the statistical likelihood of accuracy. The question of whether it's AGI or not is, to paraphrase the famous quote, a bit like the question of whether a submarine can swim.


Because ChatGPT isn't thinking. It's not reasoning at all. It's assembling sentences that are statistically predicted from using existing writings as the template.

Accuracy isn't a part of the process except in terms of how accurate the training data is. ChatGPT is not making any sort of truth or accuracy determination, let alone doing so poorly.


Can you point to where the thinking happens in a human?


Nope. But I don't have to do that to understand that LLMs do not assess truth or accuracy of anything.


We don’t have any proof of that, nor do we have proof of its opposite. We have no idea why neural nets work, or how our brain works in relation to them. There is definitely something human-like in neural networks, but we don’t have any idea why, or what exactly. It’s a completely empirical field, not a theoretical one. We have no good idea what would happen if we could build a 180-billion-neuron neural net, because there is no theory that predicts what happens even with the current ones. That’s why I’ve seen almost every single prediction over the past 40 years about what AI would solve in the following years fail. We have no clue.


There is research that shows humans are also predicting the next word.

I posted that here: https://news.ycombinator.com/item?id=34875324


But you don't know that humans don't reason by stringing words together and seeing how statistically likely they seem.

Related to this topic, see "Babble and Prune": https://www.lesswrong.com/s/pC6DYFLPMTCbEwH8W


I don’t think it’s a stretch to say humans aren’t great at assessing the truth or accuracy of anything either.


My point isn't about how well or badly this is done. Humans, at least some of the time, attempt to assess the truth and accuracy of things. LLMs do not attempt to do this.

That's why I think it's incorrect to say they're bad at it. Even attempting it isn't in their behavior set.


Where is the organ that does that? My impression is everything the brain does is homomorphic to what LLMs do.


Isn’t this the whole point of John Searle’s “the Chinese room” thought experiment? But does it matter what is actually going on inside the room, if the effect and function are indistinguishable? Edit: after conferring with ChatGPT, Searle’s point, like yours, is that the man in the room doesn’t understand Chinese, he is just manipulating symbols, but from the outside, the man in the room seems to speak fluent Chinese.


I think a better analogy is asking if a water bottle can swim. It floats most of the time, and can move around if pushed.

The reason “can be inaccurate sometimes” is a fundamental flaw is that my assumption is it will always be inaccurate sometimes, and never accurate all of the time.

This doesn’t mean it isn’t useful for a lot of applications. But I don’t think it is a holy grail technology, it’s not AGI, and it isn’t going to replace professions.


The whole point of using a computer instead of doing something yourself is to do something quickly and accurately. If I need someone to give me maybe-correct-maybe-not information, I'll just ask one of my coworkers.


Well, that was the point of computers until now. That doesn't mean computers can't be other things, too. ChatGPT is a lot cheaper and faster than your coworker, and it's available (almost) 24 hours a day. And the accuracy may improve!


I mean, only if you want accurate information. But if you're building a misinformation network of bots to cause problems in an enemy state, then a human-sounding bullshit machine sounds like something any number of governments would buy into.


The best feature of this LLM is that it goes from fooling people into having people make fools of themselves when they turn around and predict the end of the world/education/programming/whatever thing they don't quite understand based on what a confidently incorrect charlatan machine told them. It's like a viral marketing gag.


I could say the same things about some of my coworkers...


Fair point, I think we would also call them frauds.


I don't know if this is an AGI-like experiment, because LLMs are trained on human knowledge. I'd expect that a real AGI wouldn't need such a thing and would improve on its own. That's the moment when we become obsolete.


"you can invent human cognition from first principles via billion years of parallel evolution. or since it’s already been invented, applied, recorded at scale you can just observe its behavior to learn it"


Mammals did not evolve from single cell creatures in order to replace the dinosaurs.


I'm just going to quit my programming job and take up flint knapping, make some quality stone tools -- that's what homo genus did for 2 million years before computers, I figure I can always fall back on that.


> I can't even picture a functional world where humans are cut out of most professions that don't involve manual work

Many people do manual work. It may very well be the case that general fine motor skills are a far more complex and difficult operation than the entire edifice of human intellect. Philosophically, it would be an immense blow, the mother of all existential crises. But regardless, it suggests that the first AGI would be incapable of independent survival and we'd still be relevant and in control for a while.

> where any amount of acquired knowledge and skills will be surpassed by machines that can produce better results at a thousandth of the cost in a thousandth of the time a human can.

I don't think we can necessarily extrapolate it to be that cheap. It could be. But it is also possible that the increases in cost and resources to scale these models bigger and bigger will outstrip hardware progress and that this technology will run into a dead end. To put it differently, I think it is far from clear that the kind of hardware that we build is actually better than an organic substrate for this sort of computation. Imagine an optimized organic neural implementation of ChatGPT, for instance. Would it be slower or more expensive than ChatGPT? Perhaps not. Likewise, the very best that the current paradigm can offer may not be faster or cheaper than humans are at quite many valuable tasks.


This all seems very unlikely to me. Better tools have always made humans more useful, just doing different things. I do think it's interesting to consider whether there is some singularity where that trend dramatically reverses (Vonnegut's Player Piano is one of my favorite books...), but I think the better prior is that this is just another step up the abstraction ladder for humanity.


Software development isn't really about just acquiring and memorising knowledge to later spit out. You need reasoning, logic and creativity. That AGI can develop complex new software (not just some code that was scraped off the internet years ago) in 10 years is something I doubt.


We don’t have AGI as of now, but it could spark anytime and its acceleration could be extremely fast. It could almost spark and yet never really get there, just getting closer and closer and never quite hitting the mark. But even that would not stop a massive disruption in most professions and livelihoods; LLMs alone could cause that easily.


It's a good idea to think about some aspects of humans. Most people will never be content with what they have.

Once progress is made, we get used to the once-new tool and begin to build new norms.

The pain is always there, you just run faster.


Butlerian Jihad when?


> What we're seeing is the first instance, still very limited and imperfect, of AGI.

None of this is AGI. This is Eliza on steroids.


It seems disingenuous to say that this is Eliza on steroids. They operate in fundamentally different ways, no?

I definitely agree with the premise that the current state of the art isn't at all AGI, but it seems almost self-evident to me that LLMs are a key piece of the puzzle on the road to AGI. Eliza was never going to have that kind of trajectory, but for LLMs I think you could certainly make that argument.


What ChatGPT (and its cousins) expose is that the way humans have been taught in most schools - memorizing and regurgitating information - is now a commodity.

What humans bring to the table over ChatGPT is our ability to create new links between information, aka creativity. Teaching creativity, imo, will require a return to methods like those of Sophocles and his contemporaries. I would rather this author be writing about how he is going to re-examine how he teaches rather than bemoaning that students can shortcut his current approach.


I don't think it is the same.

I didn't memorize Python syntax or the name of every function or how to do small things. I use Google for that. But I know what I need to do in the best possible way (at least that's what I am paid for!). Should I set this variable here? Should this method be private? Should I design an interface or a public class? A dict or a dataclass?

That's what I have to decide as an engineer and where my value resides. If ChatGPT only replaced the memorization part, that would be OK, but it replaces a lot more than that, and lets the people using it avoid asking themselves the questions I mentioned before.

I had a bet with a friend: he has no knowledge of programming and was convinced he could make an online game (!) using only ChatGPT. He said one month was going to be enough. Of course, a few months have passed already and he is way off. He asks the questions that a non-programmer would ask, and what ChatGPT gives him back is not usable, not designed with the future in mind, not easily modifiable, etc. His code is a Frankenstein that won't do anything good.


> What humans bring to the table over ChatGPT is

Statements like these are premature. ChatGPT is three months old! This is a rapidly advancing field. The capabilities of these models are very likely to be radically different five years from now. Any conclusions drawn now about what value is uniquely human and out of reach for AI may be proven wrong quickly.


> The capabilities of these models are very likely to be radically different five years from now.

This is an article of faith. Specifically, it believes that at some point in the future the current paradigm will result in qualitatively different behaviors than the mimicry these systems (all GPT variants, "Attention is All You Need") currently perform.

Much ado has been made of the previous Symbolic AI researchers "moving the goalposts." In this criticism, it is the old guard of AI who is constantly bemoaning the current state of affairs as not real AI. But there is no actual goalpost moving. They have said it wasn't real AI at the beginning, and they are saying it isn't real AI now. Whether or not the symbolists' model was real AI is irrelevant: when you bake in "this is a rapidly moving field" as a hand-wavy explanation for why this may result in AGI, you are the one implying a moving of the goalposts.

If it ever turns out that these models need to be qualitatively different, then it will be clear that attention is not in fact all you need. In that eventuality, I fully expect the new guard to hem and haw and find some tricky sophistry to explain why they were right all along, despite qualitative shifts unattributable to adding mountains of new data or connectionist trickery.


> connectionist trickery

I don't know, man, you kind of come off as having a chip on your shoulder about this. I'm not predicting AGI specifically here, and I'm not making any argument about symbolic vs connectionist AI at all. Maybe the model of the future is half symbolic! I'm just saying that asserting that you know exactly which things AI models aren't going to be able to do is pretty foolish at this point.


> know exactly what these models aren't going to be able to do is pretty foolish at this point

This can be said about absolutely any new technology, but that does not make it true. It's simply the inscrutability of these tensors that allows people to imagine the intelligence is in there somewhere. The original comment was about what humans "bring to the table" over ChatGPT specifically. And that's that they have real intelligence and not memorization.

As others have said, these models have been around for many years. Their core innovation is to add more and more data to them like a dictionary and compress those basis functions in the network architecture. This memorization has a limit and is actually the opposite of intelligence. Intelligence or creativity can do more with less information. As per the original comment, this is what human intelligence and creativity is (currently) superior at and what people should prioritize if they don't want to be replaced.


This is actually very accurate.


Agreed!

Anyone not answering with “IDK, but maybe…” is just wasting bandwidth.

This is Gen1 tech. Most of us are already shocked at how good it is, and it won’t get worse.


You seem confident that it won't get worse, but it's only as good as its training data. Which is the internet. What happens when the internet is filled with generic Gen1 output? I'm doubtful copy averages can ever lead to anything other than increasing mediocrity.


Isn't it like Gen3?


ChatGPT is still, at its heart, GPT technology (with some clever embedding of the running transcript acting as a recurring prompt), which is several years old at this point.
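(For anyone curious, the pattern itself is tiny: the model is stateless, so a chat front end just keeps re-sending the whole transcript as the prompt each turn. A minimal Python sketch, where `complete` is a hypothetical stand-in for any text-completion call, not a real API:)

  def chat_turn(transcript, user_message, complete):
      # Append the user's message, re-send the full history, append the reply.
      transcript.append("User: " + user_message)
      prompt = "\n".join(transcript) + "\nAssistant:"
      reply = complete(prompt)  # one-shot completion over the whole transcript
      transcript.append("Assistant: " + reply)
      return reply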


I went to uni 10 years ago and even then, I can't think of any classes that were just memorizing and regurgitating. You'd have to memorize fundamental concepts, but come exam time you are applying those concepts to new questions, not regurgitating anything. In high school a lot of exams were regurgitation, but I attribute it to teachers at that level just not having the niche experience required to craft clever "apply this theory" sort of questions that a domain expert in a university could do, and students in high school are also responsible for a lot less theory learning on their own.


I think the problem with what you’re saying is that you are not mindlessly agreeing with the straw-man, overly reductionist view of what higher education is that seems to be the majority view.


Creativity doesn’t exist in isolation. In order to be creative, and create unexpected connections, one first needs to know a lot of seemingly unrelated things.


> I would rather this author be writing about how he is going to re-examine how he teaches rather than bemoaning that students can shortcut his current approach.

He alludes to this: "The first solution is hard for lots of reasons, not least that the current funding model of post-secondary institutions, which does not prioritize the ratio of faculty-to-students necessary for ever more personalized or real-time assessment methods. Larger and larger classes make many of these good ideas impractical. Faculty have zero control over this, but by all means, please talk to our senior leadership. It would be great."

In other words, the Universities are pushing "on-line-all-the-time" because it's co$t effective.


> I would rather this author be writing about how he is going to re-examine how he teaches rather than bemoaning that students can shortcut his current approach.

Did you read the entire article? What you're asking for is exactly how he ends his discussion.


> What humans bring to the table over ChatGPT is our ability to create new links between information

I'm confident that "creativity" is a combination of:

1) reproduction errors (when we badly copy things and the wrong way to do it leads luckily to a better way to do it), and

2) systematically or by luck applying established and productive models from one context to another, unrelated context, and getting a useful result.

Just not a believer in some essential, formless creativity that generates something out of nothing.


I think that is poppycock. If you don’t know anything then how can you bring all this “creativity” to bear on a task of knowledge work? This whole line that all of higher education is just regurgitation or something of no value seems fallacious. What is an example of not regurgitation? Why is it that using Wikipedia or whatever is superior learning to what is done in university?


I agree though I didn't find the author to be bemoaning students. Rather they were writing a "state of the art" piece explaining how things are currently happening and leaving it open for people to follow with thoughts about how to make meaningful changes to assessment style and curriculum in the face of ChatGPT.


> What humans bring to the table over ChatGPT is our ability to create new links between information, aka creativity

Maybe not, or not for long. Maybe AGI is coming within 20 years, and maybe human workers won’t have anything to bring to it afterwards.

Maybe this is the beginning of the downfall of the value of intellectual human workforce.


I grew up in the late 1980s and early 1990s and have a phenomenal memory. My dad used to tell me how valuable it would be once I grew up and got a job. It’s so funny how false that ended up being.

It’s much more beneficial socially because I can recall jokes that fit situations incredibly quickly and get a good laugh.


Schools haven't been focused on rote learning for eons. I don't know where you get that idea from.


> memorizing and regurgitating information - is now a commodity.

Google Search already did this! Connections have been the value for a very long time.


Absolutely this. Change the questions you're asking of your students. Harder to grade than option A though huh?


He even used ChatGPT to evaluate the approach of one of his students' homework. I don't understand his ignorance.


This is definitely going to require changes in interview practices for a lot of places. For those looking for an alternative, may I suggest pair programming?

I've been doing pairing interviews for years. These days I have a standardized, practical problem, something that's reasonably like the work. (E.g., let's use APIs A and B to build features X, Y, and Z.) I let them pick their preferred language and tooling, so that I can see them at their best. And then we spend a fixed period diving in on the problem, with me getting them started, answering questions, and getting them unstuck.

I like this because not only do I get to see code, I get to see how they think about code, how they deal with problems, and how they collaborate. They get to spend time building things, not doing mensa puzzles, posturing, or other not-very-like-the-work things. And they can't bluff their way through, and it's pretty hard to cheat.


If some places need to change their interview practices, I think it means they've been doing it wrong all this time. You ask them to write some code, then to explain and justify it. Whether or not it has been mostly written by an LLM really shouldn't matter... (leaving aside edge cases that prevent the use of some tools)


The emergence of this problem is exactly why I don’t like giving people programming exercises that are overly explicit.

Most coding tests just tell people what to write and then have them write it. Real world problems are more complicated. Instead, tell your candidates what your problem is and then ask them for a solution. Let them write their own requirements. It’s a lot harder for language models (and developers with poor problem solving skills) to solve these kinds of questions as well.


Exactly. At points during a pairing interview when I'm asked for more details on requirements, I'll make sure to reply with something like, "Which do you think is better for the user?" It turns out most developers have a pretty good sense for this, even if they're not used to being able to use it.


> Apparently, it's not even something students feel they need to hide.

Which is good! If a bit of work is trivially accomplished by a machine, we should take it for granted and move on to the next layer of complexity. I have always maintained that teachers complaining about students cheating at homework assignments with AI need to instead work on providing better homework.


Should we? Basic arithmetic has long since been solved, but I've met plenty of people who struggle at higher-level math because they haven't mastered enough basic arithmetic. Solving complex problems will often involve many much simpler problems that must be solved as well. Offloading those to another system is immensely more expensive than being able to solve them mentally, which means solving the complex problem ends up being far more expensive as well. Eventually students will reach problems whose price they can no longer afford.

It is related to the reason we teach concepts starting with simple, small, easy to solve problems before building up. If I want to teach a student how to solve the limit of x*sin(1/x) as x approaches 0, I need them to understand quite a bit of math to even know what the problem is asking.
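(For what it's worth, that particular limit is a one-line squeeze-theorem argument once that groundwork is in place:)

  0 <= |x * sin(1/x)| <= |x| for all x != 0, and |x| -> 0 as x -> 0,
  so lim_{x -> 0} x * sin(1/x) = 0 by the squeeze theorem.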


No, we are supposed to believe you are not supposed to learn any of that because some technology exists and you will be better off with “higher level” tasks … that’s what they tell me, anyway.


The point of school work isn't to complete schoolwork, it's to learn.


It isn’t clear to me if you disagree with the GP.


If the point of schoolwork is to prepare you for life, then you should mainly learn how to get a job done.

And if they want to teach special skills, like writing essays without computer help, then you can test that on-site.


The point of writing an essay is not to learn how to produce an essay. It's to learn analytical thinking, research, and argument skills.


Well, as far as I know, that is a special case of an essay. And that you can test on-site.

The kind of essays I had to write in school were more about nice-sounding words and less about the content. CheatGPT can produce nice-sounding words, so I am hoping that the focus will move towards rewarding content.


Knowing how to produce an essay is exactly the same as "analytical thinking, research, and argument skills" with the added challenge of making it legible to a reader — which is what makes those skills useful.


I suppose, but having written plenty of essays as an adult I can say with complete certainty that nothing I learned from my 5 paragraph days was of any use. No one, not you, not your teacher, not any real life audience for any topic you would be presenting on or publishing for, wants to read anything remotely close to what you're forced to write in school.


>what you're forced to write in school

What you were forced to write in school. I readily admit that I had a quality of education several SDs higher than usual, but the trite "5 paragraph" nonsense is neither universal nor (more importantly) inevitable.


Where did you learn to compose your thoughts and write them long form? If you were an autodidact then do you think that is a representative example?


???

By this reasoning nobody should ever learn anything, because it's all 'trivially doable' by machine.

Like 'addition' and 'subtraction'.

So let's gaslight those dumb teachers by saying they should make up 'better' homework assignments?


>By this reasonning nobody should ever learn anything, because it's all 'trivially doable' by machine.

>Like 'addition' and 'subtraction'.

A better analogy would be low level coding. I don't know (or care) how my processor calculates `var f = 3+2` at the register level. And being able to ignore that allows me to focus on higher level concerns.


One of the required classes for CS degree was Assembly Language. Nobody taking or teaching the class pretended there would be a great need for this language in a job setting. But that wasn't the point of this class.


I see what you mean, but it's not really a better analogy.

We need to learn how to do addition at some point, so we can't have ChatGPT do that.

We need to learn how 'registers' work, so we can't have ChatGPT do that.

We need to learn how basic algorithms work, so we can't have ChatGPT do that.

AKA - almost whatever is being assigned as homework, is the 'thing to be learned' and it's ridiculous to suggest that ChatGPT do that, and doubly so to gaslight teachers.


The thing is, what applies to you doesn't necessary apply to everybody. Somebody has to understand low level coding. Somebody has to be introduced to it without necessarily knowing going in that it will be a career path. Somebody will need to write compilers and reverse engineer and design CPUs. Just because a skill isn't valuable to you or those you know doesn't mean it isn't valuable to others, especially those who don't know enough yet to know that it might interest them.


If you don’t know that, then you have a real gap in your knowledge of computing. I don’t mean that as an attack or anything, but I don’t think it is a badge of honor either. You could just as easily say the person typing a Word document doesn’t know what RAM is, so we can just stop teaching that foolishness so people can focus on the “complex” things.

One could also argue, and many have successfully, that this type of thinking is why software is so much slower than in the 80s and 90s.


I don’t generally like the word “gaslight” as it is normally used by the unsavory but in this case, I think you used it perfectly. That is exactly what people are doing, except for the extremely naive ones. I don’t really know why, though?


Gaslighting is just a term for a social interaction that didn't exist in our vocabulary before. That's it. Which is different than terms like 'woke' for which we have different connotations and contexts.


Intractable and unsolved problems only!


Or just novel applications of the things you learn in class.

"Congratulations! You leaned depth-first-search! ^award noises^ Below is the algorithm for reference because memorizing it just for this test is silly. You're working on a real time mapping application called Maply. Locations are represented as nodes and all direct routes between any two nodes are represented by directed weighted edges."

a) Write a function that takes a start node, an end node, and a maximum distance to travel, and returns the shortest path between the two (a rough sketch of this part follows below).

b) Your boss said that users need to be able to add stops along their journey. Write a function that takes the final path you computed in part a and the new node for the added stop, and computes the amended path while changing as few of the original legs of the trip as possible (you don't want to disorient your users).

c) Now your boss is saying you need to handle the situation where users make mistakes. Use the function you wrote in part b to implement this feature.
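(For concreteness, a rough sketch of what part (a) might look like: Dijkstra's algorithm over the weighted graph, with the maximum distance as a cutoff. The graph representation and names here are made up for illustration:)

  import heapq

  def shortest_path(graph, start, end, max_distance):
      # graph: {node: [(neighbor, weight), ...]}; returns (distance, path) or None.
      heap = [(0, start, [start])]
      best = {start: 0}
      while heap:
          dist, node, path = heapq.heappop(heap)
          if node == end:
              return dist, path
          for neighbor, weight in graph.get(node, []):
              new_dist = dist + weight
              if new_dist <= max_distance and new_dist < best.get(neighbor, float("inf")):
                  best[neighbor] = new_dist
                  heapq.heappush(heap, (new_dist, neighbor, path + [neighbor]))
      return None

  maply = {"A": [("B", 2), ("C", 5)], "B": [("C", 1)], "C": []}
  print(shortest_path(maply, "A", "C", 10))  # (3, ['A', 'B', 'C'])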


Novel applications? You mean incremental combinations. You realize that finding and solving nontrivial problems is a much more complicated task, right? This seems to imply that someone who cannot do simple things can do complicated things; where is the evidence of that?

Why do the most complicated mathematics start with basic principles and work up to complex problems? Why don’t they just start with the Collatz Conjecture?


Call it whatever you want: problems that use/apply the knowledge you just learned, but that you can’t just Google.


Creating the paper has never been the point of these things. The paper has no value at all, to anyone, as soon as the grade is issued.


That may be the case for some students, but there are a number of papers I wrote in college that I am proud of and revisit regularly.


You seem to have misunderstood the purpose of homework.


To be fair, I'm not sure there's consensus around the purpose of homework.


How do you move on to other layers of complexity if you don’t know anything? History books have trivially had answers to questions about history for centuries; does that mean we should take it for “granted” and no one needs to actually know anything about history?

Where does this idea come from that you can be a Terence Tao of mathematics without even knowing basic algebra?


Exercises should not be about evaluating and judging. They should be about learning. If a student uses an AI, or copies someone else's work, it is to their own detriment. (Exams are a whole different question.)

Even before AI, you could read a book and copy-paste the exercises, or just skip them, but if you wanted the full learning benefit you would type them out. I think we will have to focus more on teaching how to learn. This situation is nothing completely new. Even though you have power tools in woodworking, apprentices learn the basic techniques with hand tools (AFAIK).


But as a teacher you certify that someone learned something, which is why people get a bit of paper at the end of school/university.

So if you don't even try to stop people from cheating, you end up doing a disservice to those who don't cheat, as you devalue that bit of paper.


According to the Department of Education, 54% of Americans have below a 6th grade reading level. [0]

Everyone thought that the kids who started using mobile devices as babies would become computer savants, but it turns out the kids these days don't understand what a file system is. [1]

What will ChatGPT do to our youth?

[0] https://en.wikipedia.org/wiki/Literacy_in_the_United_States#....

[1] https://www.theverge.com/22684730/students-file-folder-direc...


I wonder if that didn't pan out in part because of the death of personal general computing devices. It wasn't "kids who started using mobile devices as babies" it was "kids who would be fluidly navigating a PC by age 5", which didn't pan out.


>What will ChatGPT do to our youth?

Nothing bad, just the country as a whole is destined for future mediocrity.


Maybe take the approach many physics classes embrace (in Germany): all tools are allowed in tests, calculators, books, as long as the correct answer is derived with a "correct derivation path".

If ChatGPT makes some problems too easy, maybe it's a good thing because we can raise the bar. Find problems that require true understanding beyond autocomplete / copy-pasting.


This isn’t too different from requiring students to write essays. No one actually needs another essay about Shakespeare or whatever, and ChatGPT could do it for you, but the point of the exercise is to learn to write. It’s the process, not the product that matters.


Everyone’s ChatGPT gangsta until their ChatGPT-trained doctors start giving fatally wrong diagnoses. 100% agree with the whole comment. I interview cheaters from time to time; their excuse for not learning stuff is exactly that. However, in the end they’ve cheated themselves more than the system.


With the rate they give fatally wrong diagnoses and treatment plans today, I wonder if ChatGPT could be actually better for patients.


? ChatGPT doesn't diagnose and isn't a doctor.


Prompt: I cough, have sore throat and a fever of 39C. What is my illness?

ChatGPT: Based on the symptoms you described, it is possible that you have a viral respiratory infection such as the flu (influenza) or a common cold. It's also possible that you have COVID-19, which can cause similar symptoms.


GitHub Copilot quite often delivers comments alongside the code for a prompt.


Almost certainly the bar for problems that ChatGPT cannot solve is far, far higher than a student who is just learning computer science.


Haha, try it for some basic functions and algorithms and you're in for a surprise.


I'm a college professor. I'm requiring my students to use ChatGPT to aid them on some assignments this semester. Results are mixed so far. I agree with the author, "One of the things I'm thinking about is that I might need to start teaching students how to use GPT/Codex and other LLMs in their [assignments]." One of my top students submitted a sub-par assignment because he relied too much on ChatGPT and provided little beyond what ChatGPT spat out. Another student who previously struggled to write did much better when using ChatGPT--it felt like he incorporated ChatGPT's words into his own ideas.


> I'm requiring my students to use ChatGPT

God save us from this horrible future.


Gone through all the big-tech interviews, previously worked at Microsoft and a few more. I'd say that the only interview question I think survives here is what we've been doing at our startup: "For this role, here is a real four-hour problem for you to work on; let us know when you want to start and we will send you the problem, send it back four hours later, and then we will schedule some time to chat about it". If someone uses ChatGPT and solves it better than other candidates, that's great, good for them. I don't think it's possible to solve it entirely with ChatGPT, but I could see some good candidates getting pointers to solve 10%-30% of the problem to make them more efficient; before ChatGPT people could Google anyway, and honestly I would expect them to.

I think a similar approach should apply to schools. Here is your math homework, solve it whichever way you want, but then we will discuss it. I guess this would be the oral approach from the post, but minimizing the time it takes to do the oral review. If your homework assignment has 20 questions and you discuss one with each student, you'll get a pretty good overview of whether they know what they are doing. If they want to ask ChatGPT for replies during this oral test, go for it, but they still get the same amount of time to discuss the question, and it's going to be really hard to get this done in real time. Also, I'm not sure ChatGPT really understands what it's doing well enough to hold up to an expert probing it in limited time.


Just cracking myself up thinking about 20 years from now, when at some startup the sole engineer (prompt engineer?) who has been trying for hours to figure out why their program doesn't work finally gives up, asks the model "what is the difference between 'let' and 'const'?", and gets a mostly verbatim Stack Overflow answer back. A beautiful story to me.


What is the difference between `let` and `const`?


https://stackoverflow.com/questions/22308071/what-is-the-dif...

The difference between let and const is that once you bind a value/object to a variable using const, you can't reassign to that variable. For example:

  const something = {};
  something = 1;           // Error.

  let somethingElse = {};
  somethingElse = 100;     // This is OK.


Which is odd, because in every other language that uses let (Lisp, Scheme, Swift, Rust, Scala/Kotlin val, Haskell, etc.), let bindings are const and let is synonymous with const.


Let bindings in Lisp are not const. You can use `setf` (among others) to mutate the variable. For instance, this works just fine in Lisp:

  (let ((x 1))
    (print x)  ;; => 1
    (incf x)
    (print x)) ;; => 2
Scheme also allows let bindings to be mutated, but using `set!`.


I assumed scheme and lisp were the same here semantically. Bad assumption on my part. My experience is primarily with scheme.

Similarly, Rust and Swift allow let bindings to be shadowed, but you need to re-declare your intent. You can't just mutate the binding on a whim.


I'm confused, Scheme also permits mutation of let bindings so Lisp and Scheme do have similar semantics here. Neither language provides immutable (const) let bindings. Scheme just makes it more obvious (by convention) when you may have mutation, through function and form names ending in !.


The distinction in my mind was that a variable is something where mutation is allowed naturally without being explicit e.g. var a = 1; a = 2. In Scheme you have to opt in to the mutation by using a ! function set! which in scheme's world is like, in C++, const_cast-ing away the const from a pointer, or, in Java, using reflection to make a final field accessible and modify it. All possible using the languages as specified, but semantically not something you're normally encouraged to do.


the world has changed dramatically in only one year. these advancements in AI are talked about everywhere and by everyone.

in 2018 i warned people i knew and i wrote many comments online warning people that huge changes were coming. almost every single person on HN insisted that GPT was just a parlor trick because it would ramble sometimes, get stuck in loops and overall did not demonstrate the insane level of lucidity that it does now. almost every single person on HN would silently downvote my comments. almost every single HN user insisted that AI could never progress to where it is now. well here is my bittersweet vindication. watching people move the goalposts is like watching a supernova. its so powerful and unbelievable that it can only be compared to a force of nature.

commenting a sentiment that doesnt paint AI as harmless is like screaming into a hurricane. i dont even know why i bother anymore. but let me just say it: i think carmack made a great point when he said that the first AGI will be a bunch of lower AIs plugged into each other. GPT doesnt behave in a sentient way but it is very similar to more primitive parts of the mind. its like a persons intuition. dont be fooled by the limitations of any one model. whats important is the road we are on. and for the hundredth time i will scream into the wind: there absolutely is hope in the idea of preventing AI from progressing rapidly or even at all. to do so is absolutely necessary for us to preserve anything that we would consider a good life for future generations.

and i will offer a prediction. nobody will agree with me. but after a handful of explosive advancements in AI, everyone will agree with me.


I'm trying to work out whether or not the right thing to do is just to help work towards ushering in AGI by whatever contribution I can make, or whether to scream into the hurricane. Screaming into the hurricane feels pointless, and I generally agree that on balance AGI is likely to be net negative. Overall I just feel somewhat confused.


mention to people in real life and online that you are worried about AI. it doesnt cost anything and it makes a real difference.


> What about students who don't really know what they're doing and just need to get something submitted for this damned assignment before 11:59 pm?

The author mentions ChatGPT is better for more advanced users, rather than beginners. I disagree.

I think ChatGPT is really good for beginners. Instead of being stuck or making up the "why" something is done, ChatGPT can answer it. There is simply no way a lecturer or classroom has time to go "5+ whys" deep into a question. ChatGPT at least will do it, and then you can fact-check it. It helps me improve my vocabulary and points me to the right places, which I can investigate further.

When I use ChatGPT for difficult problems, it doesn't give me anything useful or correct.


Before GPT-3 you could still hire people to cheat; the barrier to entry was higher, but now literally everyone can do it instantly and for free. Take-home assignments must be de-emphasized and the focus shifted to in-person tests.


If I were a teacher, I'd give written exams for coding, as we used to have back in the 80's when we didn't have enough dedicated computers for everyone in the class, and most kids didn't have a home computer to work with. So instead, we'd implement simple algorithms on paper for in-class quizzes/tests/homework. I remember in high school, we had an exam, and one of the questions was to write a for loop counting from 1 to 40. It might be harder to grade, but the teacher could usually tell if the code looked good or not. On one occasion, I thought I had written the code correctly but the teacher didn't, so I had to type in the code I had written on the test, run it on the computer, and show the teacher that it ran as expected.

It's kinda like whiteboard coding, but it's a lot less stressful since the test is mostly written on paper without the scrutiny of an interviewer. Obviously, we can't create a react web app, but at least students can demonstrate fundamentals, which are what we should be teaching in the first place.

Another solution would be to create a new programming language that doesn't exist yet for the students to write their assignments in, or perhaps use an obscure language that AI/ChatGPT isn't aware of... perhaps we can go back to Ada or Modula-3.


I graduated in 2015, and I had several classes where exams involved writing pseudocode by hand—particularly the algorithms-focused classes. I think universities will continue to do this.


> If I were a teacher...

You'd be wondering how the heck you get out of a profession where pay is stagnant, the backlog of problems has been growing for a long time, and the minimization of resources to deal with has been in the wrong direction for a long time. You'd join the league of "you can never go wrong hiring another administrator" or exit for some related parlayable field in industry and throw your hands in the air with a "somebody else's unsolvable problem" disgruntled attitude.


CS programs seem like a different beast. At my alma mater they are always growing the CS faculty pool. At certain schools you can really work with some giants in CS departments, people who developed the algorithms you are using today with a piece of chalk. A lot of the new-hire faculty seem to come from industry too, having secured their bag already, and are jaded from working in industry. Sometimes they come from those industry "skunkworks" R&D type teams where you'd assume everything should have been just peachy for them, but clearly not if they are looking for academic positions with their stacked resume.


It seems ChatGPT should be treated the same way institutions treat plagiarism. It's fine to use LLMs for inspiration but the work you submit must be meaningfully your own.

Of course it's not perfect; it never will be. But ultimately it's the student who suffers when they plagiarize. Professors ought to have this conversation with students at the same time they're discussing the value of learning & education. I suspect we'll build tools that match answers against the text that LLMs spit out, and that will make it easy to detect when answers are being pasted verbatim from LLMs.
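(A crude version of that matching is only a few lines of standard library code, though string similarity alone would obviously not make a reliable detector; this is just a sketch:)

  from difflib import SequenceMatcher

  def looks_pasted(submission, known_llm_answer, threshold=0.9):
      # Flag near-verbatim overlap with a known LLM answer. Illustrative only.
      ratio = SequenceMatcher(None, submission.lower(), known_llm_answer.lower()).ratio()
      return ratio >= threshold, ratio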

The other, broader, spicier thought that comes to mind is that LLMs may push the Overton window for what's considered entry level, and that's probably ultimately good, though it won't be without some sacrifice. This would mean CS departments potentially need to do a curriculum iteration to accommodate. Perhaps there are new, slightly more sophisticated, entry-level problems that can be tackled where the commodity solutions haven't yet been entombed in LLMs. Maybe assessments will shift to be less about regurgitating the solution to a common problem and instead to fixing problems in some localized toy implementation or some fictitious API that LLMs won't know about.


It will be a hard sell since it or similar tech has already been long embraced commercially. A bit like forcing students to use an abacus while calculators are around.

The student should still be responsible for what they submit and all of its errors, of course. But what should really change is our methods of assessing students. The university model for undergrads has long been broken and in need of fixing. I don't think this will significantly impact grad students yet.


> A bit like forcing students to use an abacus while calculators are around.

But... we do that already. We ask students not to use calculators, or not to use scientific calculators, during exercises, exams, etc. And it's not because we want to impose an extra load on them, but because we think that by not using calculators you learn to deal with numbers, you build a strong foundation for thinking about math, quantities, etc.

But since we're talking about essays and coding assignments that are graded, one of the objections is that they should not be graded, because the goal is to use them for learning.

Sure, at the university, the ideal scenario is that students are there to learn and will do assignments even when they are not graded. That happens, but it's for a minority of them, because (understandably) the university is also a time for parties, relationships, etc, and students are young... In practice, many assignments are graded mostly to make sure the students do them, learn from them, and pass the course (maybe a patronizing approach, but in the end, if only 10% of the students pass, the teacher won't have an easy life with his/her superiors - but that's another story).


> A bit like forcing students to use an abacus while calculators are around.

I said allow students to use ChatGPT. Just make it clear that pasting answers from it verbatim, just like doing so from SO, is called plagiarism and does not benefit the academic community or themselves. There will always be cheats. Agree about shifting evaluation methods.


What about when it gives the 'best' word choice? Should a student change it arbitrarily?


The same is true for plagiarism. What you do in that case is quote and cite the source.


Just put it into Quillbot.


Learning is different from commercial work. Ironically, it takes learning to realize that.


Technology is amoral. ChatGPT reminds me of a nail gun vs. a hammer.

Once you know how to properly use the tool, the benefits of positive work become evident while the human components remain.

We still need the nail gunner to show up on time, to be aware of undesirable outcomes that a given prompt may build, and to know that the impact of a hammer can be finely tuned to the target.

Formal education is the trailing system far behind the workers, innovators, designers, and experimenters.


The best counter to ChatGPT cheating that I’ve heard of is to get students to orally defend their assignments in class. No computer is going to come to your rescue when you have to explain why you wrote the code you wrote.

Encourage the use of ChatGPT and other tools. Put the emphasis on understanding what the code does and why. The tool may help to explain this to the student, but if so, that’s fantastic. No need to worry. Learning has occurred.


They address the difficulties of doing this in the post.


If you can’t actually write undergraduate level code then seeing a machine do it and all you can do is pore over it and try to explain it simply amounts to you knowing less and being less capable of computer science than the average student in the 70s or 80s. This isn’t like an assignment to take an existing Linux kernel module and explain it given the source code. These are basic primitives.


Yeah, this is a good one - does it matter if you wrote the code, or a computer did? If you can explain how it works and why it is like that, that should mean that you understand it and have learned the important parts.


You could ask it to explain line-by-line for you and memorize that enough to give an oral presentation. You may not get the best grade, but it might be enough for a passing grade.


If you memorize it, that's halfway to understanding, so fair use


There isn’t any time for this in a 500 person public university classroom.


When I was at uni, there was no marking of assignments. The "examples papers" were basically your opportunity to understand the course before a supervision when you could discuss them. If you didn't do the examples, the supervisor didn't really care (typically a grad student that had enough of their own worries). My point is, if I'd cheated, it would have ultimately been my problem when it came to sitting the exam.


One can delegate that to ChatGPT.


Well there is, vastly increase the barrier for entry into universities. Maybe those 500 person classrooms can now be the 25 person classrooms they were actually meant to be. :)


Do it randomly and make the penalties for cheating severe. Why is it CS people suddenly have no solutions for solvable problems?


I am of two minds about ChatGPT. It’s amazingly useful when it comes to writing code dealing with my domain since the APIs are well known (AWS).

But at the same time it misses subtle nuances, and you need experience to know when it has missed something. In the hands of someone who doesn't already know the subject area and its footguns, it can lead you astray.


My thoughts also. This seems somewhat analogous to the "uncanny valley" problem in graphics animation. So close, but not quite there.

https://en.wikipedia.org/wiki/Uncanny_valley


And ChatGPT also caused me to do an amazing self own

https://news.ycombinator.com/item?id=34814257


We have calculators, wolfram, google etc yet math exams and hw assignments are still administered. I think the approach to teaching and hw assignments especially in CS programs will have to change.


When current and future students are out in the workforce they'll have access to tools like CoPilot and ChatGPT, not to mention plain Stack Overflow cut-n-paste, so if education is meant to be preparing kids for the workplace, then the logical thing to do is allow/teach them to use these tools.

OTOH, they still do need to learn how to program and debug (more important than ever if you're using machine-generated code likely to contain bugs, and with unknown error handling), so it seems colleges also need to make the assignments more complex, to the point where the current crop of tools can only be used to assist - not to write nearly all the code.

It'll be interesting to see how these new tools affect hiring standards... Doesn't make much sense to pay someone just to get a tool to write the code, so maybe the bar will be raised there too.


>if education is meant to be preparing kids for the workplace, then the logical thing to do is allow/teach them to use these tools

Can you imagine being a CS major who has just started your degree in 2023 and day after day you're sitting there watching the sheer speed of development of this new technology and you're unable to fathom at all what the industry you're supposed to be entering in a few years will even look like?

What a mind fuck.


I think there's an initial reaction of something like "how dare they!?!?"

However, AI is the type of tool that is going to level up mankind's capabilities to the point that curricula will need to adjust to fit those new capabilities. Certainly this has happened dozens of times in a field like Computer Science, where the curriculum in 2023 is radically different than it was in the '60s and '70s.

This new rise in AI might be amongst the most disruptive forces ever in many fields, including academia, but at some point you have to accept AI as an integral part of our day-to-day work and life and factor that into education.

This will be difficult, especially finding the line between what is fundamental and what is not, but it's not like this hasn't happened before -- e.g. the calculator didn't eliminate the need to learn basic arithmetic.


I am trying to think about use of LLMs in education as similar to calculators. This post reflects how I was taught: students couldn’t use calculators to solve problems initially. For example, no calculators to do addition and subtraction when you’re learning to add or subtract. But also, probably not when learning to multiply or divide. Doing addition and subtraction builds skills and intuition for what multiplication and division are. The same is true with understanding fractions.

Moving up the math hierarchy to algebra, though, this changes. Algebra is at first about the concept of solving equations, and the core idea is that “to solve the equation, you can do whatever you want to one side, as long as you do it to the other.” The mechanics of addition or subtraction, for the most part, no longer matter. Go ahead and use a calculator to solve obscure divisions and multiplications so you can better understand algebra (though it feels appropriate to note that a student who is good at arithmetic can still outpace a calculator for problems that are likely to be used pedagogically, since the numbers are easy).
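
Concretely, the "do it to both sides" mechanics look like this (a made-up example):

    2x + 3 = 11
    2x + 3 - 3 = 11 - 3      (subtract 3 from both sides)
    2x = 8
    2x / 2 = 8 / 2           (divide both sides by 2)
    x = 4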

In this example, algebra is to calculus what arithmetic is to algebra. A calculus teacher cares little if his students can solve equations; he expects they can. They’re instead learning integrals and derivatives and series, and I doubt a calculus teacher would begrudge their students using a calculator to solve a difficult equation.

The problems with treating LLMs this way are many. They are not calculators. You cannot (trivially, or maybe at all) understand how an LLM works. You cannot (trivially) fact check it if it spits out an absurd-sounding answer. You cannot limit the LLM to what your own, personal abilities to trivially do are. We need AI that cites sources, so we can debug when it’s wrong. We need understandable AI for the same reason, not the obscure black boxes we have today. We need AIs that not only can solve our problems, but that can also help us to solve our own problems. When we have those things, AI will be much more useful for education and for widespread use.


Reading this, I’m glad I didn’t have access to ChatGPT and co when I was in school. I was lazy and always followed the path of least resistance, but I wanted good grades so that meant doing the home work by hand after my Google-fu failed.


Now imagine those who came before who didn’t even have Google.


I teach first-year students. We have graded programming assignments they can do by themselves, where they can cheat in all sorts of ways, but I honestly don't care; it's their responsibility to learn. Then we have theoretical exams, where there are also questions about programming (but we don't check if the code has perfect syntax). In one course the exams are oral: an excellent tool for assessing students, but very time consuming. In another they are written: more efficient, but more shallow.

I like the idea about teaching how to use AI, but it has to be a tool among others.


"i cheated myself like i knew i would"

when students get out of the classroom and face the human economy they will find that nobody pays anything for doing something that can be done at near zero cost by a machine


In the end you want them to use ChatGPT, but also get to know enough basics to actually be able to use it well.

A few suggestions (examples frontend-heavy, that's what I'm coming from):

- Show them how ChatGPT can explain things to them that would otherwise be over their heads.

- Have them explain the code (live, otherwise they cheat this too).

- Have them write code too complex for ChatGPT. Here they learn how much you still have to babysit it. (For example, I tried a browser webext where code is spread across 3 languages and many files, and just initializing your settings from stored values can be troublesome. Mine included a list with a dynamic number of rows and animations for add and delete.)

- Mislead them into writing unmaintainable, unscalable code with ChatGPT and show them how to avoid these traps. Make them write Material Design-style CSS and have them change colors later. Or a day-range picker like Airbnb's.

- Have tricky requirements that fail without good QA.

- Unminify code with ChatGPT. It's fun and often fails :). Getting from A to B here took me several attempts and breaking down the problem for ChatGPT.

A: https://jsfiddle.net/3jvcx075/3/

B: https://jsfiddle.net/2083wfqg/ (still not 100% correct as the author notes: https://twitter.com/KilledByAPixel/status/162710257110918349... )


The world has evolved, and higher education has not kept up with it. It's time to stop giving trite and simplistic programming assignments to your students and make them work for it. Have them study a set of online documentation and make the test about whether they can look up function calls and successfully assemble the information required to present a working solution.


Short term, this is a problem.

Long term, the bar will simply be raised. Smart/motivated kids armed with AI will still outperform the ones who don’t study, and will be graded accordingly.

This will only be a problem so long as AI help is considered cheating, and until teachers are able to recalibrate their standards.


"Cheating" is good. School teachers and professors should not be the arbiters of how much money you earn later in life, but they are based on their subjective assessment of you (via grades).

Many do not realize the intensely adversarial relationship one is in with a teacher. You are not there to learn. You are there to get an A. In the case of social sciences, you support the ideas your professor espouses. In the case of CS, you do whatever it takes to get your code running as the assignment specifies.

Anything else is idealism which will harm your future earning potential. It's insulting to ask kids to surrender their future earning potential due to "ethics" in an academic world where even top-tier conferences and journals are filled with unreproducible BS science.


The people you're cheating are not your professors but your peers. For better or for worse, grades are used to rank people and mete out further opportunities and benefits. The professors already have jobs.


Let's unbox that a bit. In what sense are one's peers cheated when an individual cheats?

If the course were going to have 20% A-level students and now has 50% A-level students, what has been taken from the initial 20%? They were still going to be able to put on their résumé "4.0 GPA from Suchandsuch University."


People are going to use it, so telling them not to doesn't work. You can demonstrate the proper way to use it, and the caveats of using it directly without any thought, but you need to explain how to analyze code, and how to check that it's doing what you think it's doing. Also remind them that it won't come up with new ways to solve a problem - it's trained on how people solved problems in the past, not on all the possible ways to solve a problem. The best students probably wouldn't use it and wouldn't need to, or they will use it but know when it's giving an inefficient or insufficient response. Either way, you can't stop it, so learn to harness it.
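
As a made-up example of "checking that it's doing what you think it's doing": suppose the model hands back a median() function. A few pointed assertions catch the usual edge cases instead of trusting it on sight (the function and tests below are illustrative only):

    def median(xs):
        # Hypothetical model-generated function under review.
        xs = sorted(xs)
        n = len(xs)
        mid = n // 2
        return xs[mid] if n % 2 else (xs[mid - 1] + xs[mid]) / 2

    # Don't just read it and nod; poke at the edges:
    assert median([3, 1, 2]) == 2        # odd length
    assert median([4, 1, 2, 3]) == 2.5   # even length
    assert median([7]) == 7              # single element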


CS programs will have to adapt to this or die. The reality is that five years from now, we'll all just be talking to LLMs all day and compiling/checking the results. It's no different than the shift from assembler to higher languages in the 80s.


I don’t disagree, but how do you imagine they should adapt? “Checking the results” is difficult if you are not able to perform the task on your own, which means you need to learn the task in the first place.


It's a chicken-and-egg problem. Can ChatGPT come up with a framework to write code? Today it can only read docs and already-developed code to create new code. Could it create a new language that perfectly fits the hardware it's running on? If so, there is no need anymore for any other software company. ChatGPT, write me an OS please, write me a photo and video editor. Write me a game about … It seems really far-fetched because, could it really create new software with a different paradigm? Something that hasn't been done before? Given how it processes words from its input, it seems not.

School teaches how to think. All the frameworks and some of the languages used by millions today didn't exist back then. But what if the point of rupture had been 10 years ago? Would we have been stuck with old, non-innovative tooling and designs? We still need to teach people how to think and develop skills. The cheaters of my era never acquired good development skills. Today's cheaters are only as good as ChatGPT, so what's the point in hiring them? If all you have to do is enter commands in a prompt, let the marketing people do it and have their real no-code revolution.

We're always going to need problem solvers and people with deep insight into how things work. Maybe it's time to dig deeper into the knowledge and write only truly meaningful code.


When I was a freshman in college the professor would live code various data structures and algorithms on the projector and ask the students to follow along. Each time he did this it was subtly unique.

It was required that you continue to use this same base code for the assignments plus your edits to complete it. This made it obvious who didn't attend the lectures and who they copied from. Assignments were graded by your peers and did not affect the final grade unless you didn't do them at all. Quizzes were not code, but proofs written in your own informal style. Tests were a mix of both proofs and code on paper with no notes or devices allowed.

I don't see how ChatGPT threatens this.


Good for you, but that's not how the vast majority of CS courses operate.

Also, coding on paper with no access to devices is both terrible and has next to nothing to do with how CS grads will actually work.


A CS degree is not a "bootcamp" or any other form of vocational training. Also, there are criteria for the school to maintain accreditations, you know.


Then why are people paying hundreds of thousands of dollars for them?


To get an actual education instead of merely imitating whatever is trending on hacker news.


I think most just want a decent job to pay bills after. The rest love the field, the theory of cs, and may or may not continue into academics or research.


It is a computer science degree, with an emphasis on the science part, and a note that CS is not the same as coding computers. It's the science of computing, not a tech bootcamp to get you quickly up to speed in the latest hot language and framework.


It would be fantastic if large language models (or any of the nascent AI/machine learning tech.) finally kill off both assessment in education and copyright/IP protection.

What a wonderful future that would be. We can but hope.


The problem with using synthetic intelligence is that, if used like a crutch, it will necessarily modify how you walk. And if you could have developed the ability to walk just fine but interject this crutch, are you going to be a great walker?

What would be the next generation's intelligence strong at?

Biology needs a soft quota of regular stress to develop, compensating with the needed optimizations. It happens with muscle growth, fat burning, and the development of psychological traits and behavioral transformations.

What will be our new opportunities for growth if all the hard work is going to be done by synthetics (with hardware to run them that is as expensive as a high-end car)?


Intelligence will go the same way body weight did when we stopped needing to use our bodies. Some people still exercise and eat properly, but most don't.


I think what this means is that we are likely to see a next generation of software engineers that heavily lean on AI to "learn" programming and we will have to re-train them to some degree when they get into the workforce to be able to actually know the skillset they supposedly learned.

It also means that whiteboard problem solving for coding interviews is going to continue to be a great test to separate out students who actually know how to write a basic for loop and function and those who don't because Co-Pilot "always does it for them".

I know which kind of new grad I want working with me...


I think that this can be trivially solved by OpenAI. They could store all their responses and then, for a fee, allow schools/professors to submit the answers they're getting from students, like they already do with Turnitin. If the answer is 90% the same... easy to deduce that it's not original. Plus, because OpenAI makes you log in, if there is a match they could let you see the email address associated with the submission... if it's the student's name, pretty obvious that it was not their own work.


In earlier versions of Moodle there used to be a peer-review assignment type, where students evaluated each other's submissions, and both the submissions and the other students' evaluations were scored. Then the teacher only had to evaluate the reliability of some evaluators to infer skill levels for everyone, if I recall correctly.

My understanding was that this scaled quite well, but you would have to ask the professors.

This was used in an HCI course at my uni in Tampere, Finland, years ago. As a student the experience was very communal and enlivening.
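
A toy sketch of that reliability-weighting idea (the weighting scheme and the 0-100 scale below are invented; Moodle's actual workshop algorithm may differ):

    def grader_reliability(teacher_grades, grader_grades):
        # Both are dicts of submission_id -> score on an assumed 0-100 scale.
        common = set(teacher_grades) & set(grader_grades)
        if not common:
            return 0.5  # no overlap with the teacher's spot checks: neutral weight
        avg_err = sum(abs(teacher_grades[s] - grader_grades[s]) for s in common) / len(common)
        return max(0.0, 1.0 - avg_err / 100.0)

    def weighted_score(peer_grades):
        # peer_grades: list of (score, reliability_weight) pairs for one submission.
        total = sum(w for _, w in peer_grades)
        return sum(s * w for s, w in peer_grades) / total if total else None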


What if you actually asked the students what they think the best approach is? I wonder what they'd have to say when confronted with optimizing their own education.

As a CS grad I often felt such a large percentage of the syllabus was legacy theory I'd (mostly correctly) never need. Perhaps this can be the positive impetus to get teachers to really rethink syllabuses from a more practical, useful view point. "here students, use ChatGPT and all the tools available to you to create x..."


Perhaps to teach programming at the university level we will need to better mimic real world software development. I always found the 300-line assignments to be a poor practice for the real job where you have 100k LOC legacy systems.

We should work backwards from "what skills should students learn".

Maybe we need to make larger assignments that need to pass larger acceptance tests. Students who choose to use ChatGPT will also need to learn the skills necessary to debug its output.


> Apparently, it's not even something students feel they need to hide.

Why would you feel the need to hide it? It's a tool; it's not like using the library, other professors, your friends, Sourcegraph, or StackOverflow is cheating. Trying to argue why GPT is cheating is just going to devolve into "you can have outside help, as long as it isn't too good," with the line drawn somewhere arbitrary.


I've heard one idea of adding weights to words. The idea being that if someone hands in an essay with a higher frequency of certain words, there is a very good chance it's come from a language model.

https://www.youtube.com/watch?v=XZJc1p6RE78

I'm not sure this helps with coding though, maybe variable names?
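
As a rough illustration of the word-frequency idea (the word list and threshold below are invented, and real watermark schemes work statistically at the token level, not with a fixed list):

    from collections import Counter

    GREEN_LIST = {"delve", "tapestry", "furthermore", "pivotal"}  # hypothetical biased words
    THRESHOLD = 0.08                                              # hypothetical fraction

    def looks_watermarked(text: str) -> bool:
        # Count how often the supposedly up-weighted words appear relative to all words.
        words = [w.strip(".,;:!?").lower() for w in text.split()]
        if not words:
            return False
        hits = sum(1 for w in words if w in GREEN_LIST)
        return hits / len(words) > THRESHOLD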


Soon you will have to use version control and submit version history with homework. Though maybe ChatGPT can generate that, too.
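
A rough sketch of the kind of history sanity check a grader might script (the thresholds are made up, and, as noted, a determined cheater could fabricate the history):

    import subprocess

    def commit_timestamps(repo_path):
        # Unix timestamps of every commit in the submitted repo.
        out = subprocess.run(["git", "-C", repo_path, "log", "--format=%ct"],
                             capture_output=True, text=True, check=True).stdout
        return [int(t) for t in out.split()]

    def looks_rushed(repo_path, min_commits=5, min_span_seconds=3600):
        ts = commit_timestamps(repo_path)
        # Flag repos with almost no history, or where all commits landed within an hour.
        return len(ts) < min_commits or (max(ts) - min(ts)) < min_span_seconds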


Given the amount of `git add . ; git commit -m "everything"` that I've seen with professional developers, I am not confident that you'd get anything better from students.


Entire feature tickets with one single commit drive me batty.


I’m already doing this in my classes and have been for a while. We’re transitioning more CS classes over to this mode of teaching soon. We’ve got a gitlab instance for the department, and all students have an account. Works great!


Just tell students to use it like they might use StackOverflow or even a textbook. It might help you to learn, but it's not safe to try to just use it to give you the answer. You need to know the topic better than the machine does, because it will lie to you like nobody's business and doesn't care if it does.


Using ChatGPT et al. to cheat in assignments should be viewed as a practice as low and shameful as ordinary plagiarism.

I wish the best to educators, but they shouldn't lose any more sleep over it. The game is over.

Unfortunately, the future tends towards distrust to pathological levels. Teaching Ethics will be of central importance as never before.


I find myself wondering if the future of programming looks more like editing than synthesis.

This is not a bad thing if it's the case; a lot of the job of a software engineer is in analysis of code that exists, but so much of the pedagogy is in synthesis-from-scratch in a world that is already full of billions of lines of code.


The leadership career track for programmers already closely resembles this process (providing natural language prompts for ICs and then reviewing/correcting the output), with AI just shortening the feedback loop. This has me wondering which software stacks will most readily lend themselves to AI-driven-development.


The professor could just ask the student: how did you come up with this solution? What was your thinking?

To prepare for this question, the student has to think really hard both about the problem and about the solution. While doing so, they learn a lot - maybe more than they would learn otherwise.


Asking ChatGPT to explain the reason it gave an answer is a thing.


My bet is that it won't be able to provide a human-level explanation.


Maybe not today. Tomorrow is another question.


This is simply a question of nail-and-hammer versus power tools: does it matter?

My opinion is that it doesn't matter how, or with what tools, a student wrote the program. What matters are two things: 1) do they understand the program they have written, and 2) have they done a good job.


Can't you add a series of weird restrictions to the questions that make it very difficult for ChatGPT to work with?

Must use a switch statement; must declare an iteration int on line 10.

Must use a for loop to count down in reverse.

If you have like 8 of these rules, ChatGPT may not be able to handle it.
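
A sketch of how one such rule could even be checked automatically (in Python rather than a language with switch statements; the "must count down in reverse" rule is just one example, and the checker is illustrative, not a real autograder):

    import ast

    def has_reverse_counting_loop(source: str) -> bool:
        # True if the submission contains a for loop that iterates in reverse,
        # either via reversed(...) or range(start, stop, -step).
        tree = ast.parse(source)
        for node in ast.walk(tree):
            if isinstance(node, ast.For) and isinstance(node.iter, ast.Call):
                f = node.iter.func
                if isinstance(f, ast.Name) and f.id == "reversed":
                    return True
                if (isinstance(f, ast.Name) and f.id == "range"
                        and len(node.iter.args) == 3
                        and isinstance(node.iter.args[2], ast.UnaryOp)
                        and isinstance(node.iter.args[2].op, ast.USub)):
                    return True
        return False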


One possible temporary solution is to use ChatGPT to generate the solution to each question, and only include questions where its answer is incorrect.

Then at worst the challenge for the student is debugging ChatGPT code, which still has merit.
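
A very small sketch of that screening loop; ask_llm and is_correct are hypothetical placeholders for whichever model API and answer checker you actually have, so nothing here is a real OpenAI call:

    def screen_questions(questions, ask_llm, is_correct):
        # Keep only the questions the model currently gets wrong.
        kept = []
        for prompt, reference in questions:
            answer = ask_llm(prompt)              # hypothetical model call
            if not is_correct(answer, reference): # hypothetical grading helper
                kept.append((prompt, reference))
        return kept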


Another thing to keep in mind is that (unlike stackoverflow, wolframalpha and such) this tool is going to evolve dramatically over mere months.

I'm not sure universities are structured to deal with such a rapid rate of change.


IRL universities will probably develop a protocol to remove access to helper tools for this sort of thing, maybe metal detectors at the door and specially tailored computers if needed for quizzes. Online universities OTOH are probably going to feel the bite of this the hardest. The CKA exam I took a while back had draconian measures to try to prevent me from cheating, such as taking the test with a webcam on me the whole time, with tools in the browser to limit where I surfed, in a room that has NO PICTURES on the walls. The tech to provide answers is moving faster than the tech to secure a remote room from such outside answers. And if in-person proctoring is eventually required, I would wager that will be a death-knell for online universities, who will have a hard time obtaining that physical space on a regular basis while staying marginally profitable.


Or remote proctoring centers will show up as a business opportunity, and the cost of this will be passed on to students, who will pay it with financial aid.


proctored in-person exams, no electronics allowed

exams now count for 100% of your grade

these aren't insurmountably difficult problems for universities to solve


That's assuming they want to solve it. A lot of administrators live in lalaland because it suits them to. It is in the interest of society to fail cheaters and frauds before they graduate, but it costs the universities money to do it properly. So they will only address the problem if it comes to public attention.

That's a trend that we've seen over and over, for example with the corruption over admission to Ivy League universities. In fact ChatGPT doesn't really change the landscape that much. All it does is democratise contract cheating, on which most universities only apply band-aids. It might end up being a good thing, but only if it does attract public attention to the issue.


> In fact ChatGPT doesn't really change the landscape that much. All it does is democratise contract cheating

But generative AI will remain a workhorse even after graduation.


> This tool is going to evolve dramatically over mere months.

Why do you say this with certainty?


Just assuming the current pace of advancement will continue.

Good text and image synthesis were basically impossible 2 years ago, and every time I check up on Github some huge new innovation has come out.


Is there really a basis for such an assumption?

Don't returns tend to diminish with effort, rather than increase?


Technology tends to grow with a sigmoid curve. Extremely fast growth at the start, and then it slows down and eventually plateaus and new technology repeats the cycle. The only question is where on the curve are we for LLMs and AI in general?


I see this sigmoid growth prediction quite often. What is the basis for assuming it will follow this shape? What are some examples of this happening?


But interest and investment is also increasing exponentially. I've never really seen anything like this before in AI land, even when GANs and such were doing amazing things.


We have seen this level of rapid development and massive hype in AI before. Several times, in fact. Check it: https://en.wikipedia.org/wiki/AI_winter


I gave ChatGPT a try, asking it to write a React hook doing something really basic.

The code was OK, as expected.

I asked it to write tests for it; the code looked good, but it wouldn't run due to some act() errors.

Then there was the mocking of network calls…

After trying for a while I had to write the tests myself.


I tend to find it works best as a tool for generating situation-specific examples rather than writing your entire code for you.

Like when every Stack Overflow answer you find just doesn't fit because of one major difference in your situation, ChatGPT often has you covered. Then you take the example code it gives you and adapt it to your actual codebase, testing to see if it works as expected as you go.


yep, i had the very same “stackoverflow dejavu” feeling


> the current funding model of post-secondary institutions, which does not prioritize the ratio of faculty-to-students necessary for ever more personalized or real-time assessment methods.

Get ChatGPT to do the assessments.


>It's been breaking my understanding of what is and isn't possible, and forcing me to think about new ways to solve technical problems.

followed up by

>However, I'm not sure how to understand it as part of student learning.

Is super funny


Why should anyone care?

If someone wants to coast they will, and it will be reflected later on when they can't get or hold a job since they are just as shit as ChatGPT or Copilot.

And if they can get and hold a job, then isn't that just better?


Wasn’t this day inevitable when colleges killed paper and oral exams?


All this tells us that whiteboard interviews are here to stay.


With the general shift to remote-first, it's a lot more difficult to verify that your whiteboarder isn't ChatGPT-ing just off-screen.


Who says anything on the screen will be real?

"Hey MultimodalGPT, look like me on a Zoom call and pass this interview."

Text2Video is already moving quickly. Maybe in as little as three years a video feed will be as trustworthy as an email.


People who use this are just robbing themselves of an education. It's the same as copying the answer from Stack Overflow or another student. Most problems are already online somewhere; ChatGPT is just shortening the research time.

You can solve plenty of math problems by putting them into Wolfram Alpha, but if you do that for every calculus assignment you won't learn calculus and you won't be able to apply it to solve problems. Same with programming; if you copy a fizzbuzz instead of learning how to write a fizzbuzz, you won't be able to do a more complicated problem or understand a loop when you need to.


"In my opinion, the students learning to program do not benefit from AI helping to "remove the drudgery" of programming. At some point, you have to learn to program. You can't get there by avoiding the inevitable struggles of learning-to-program."

Couldn't you say the same about how compilers 'remove the drudgery' of writing machine code? Or is that a bad analogy? Provided AI eventually gets good enough in its code generation, maybe 'programming' is moving up another layer of abstraction.


Abstraction works because you are able to treat an abstraction as a black box and concern yourself only with its input and output. A segment of code written by an LLM is qualitatively a very different thing; it's more like an open box of crap that you have to inspect and put together yourself, which requires knowledge of the contents, which requires experiencing the drudgery.


I rarely encounter abstractions in the wild that are as nicely "sealed" as the definition implies. Looking at an open box of crap and understanding why it's doing something other than what the author (or you) intend is a valuable skill.

(No idea if this new model of "Ask ChatGPT or Copilot to synthesize a solution and then tune that solution" provides a solid opportunity to improve that skill yet, however).


The real problem is that an _attempt_ to cheat is graded, on average, higher than no submission at all. We need to change that.


Thank god for whiteboard coding interviews!


I wonder if this will lead to in-class tests being weighted much more heavily than homework style work.


Idea: Ask the students to record themselves explaining their solution, and what led them to it.


So you’ll grade them on how good their teleprompting skills are.


A lot of time is saved by ChatGPT. But we need to get better at testing and debugging.


Cheating in academia isn't a problem, because the only people students are cheating are themselves.

I've seen plenty of cheaters; they were the worst students, and despite cheating they couldn't graduate or couldn't find a job after graduating.

If some moron is stupid enough to cheat when they're paying $50K a year, let them.


The problem with widespread cheating, even if you don't cheat yourself, is that it essentially reduces the credit rating of your $50k program and has knock-on effects elsewhere, like worse interview loops. And if the 'cheater's workload' becomes the norm, you, the non-cheater, can literally be failed out of programs and scholarships because you didn't keep up.


To be honest, I think that cheating in any field is terrible and there is no excuse for it. As a writer for an essay sample site, I recently worked on several essays related to the topic of cheating. Cheating is a complex and multifaceted issue that affects students at all levels of education, and it was fascinating to explore the various aspects of this topic through academic essays. One of the Cheating Essay Examples https://samplius.com/free-essay-examples/cheating/ I worked on focused on the psychology of cheating. We examined the underlying factors that motivate students to cheat, including pressure to succeed, fear of failure, and a desire for recognition. Through a combination of empirical research and analysis, we were able to gain insights into the complex interplay of psychological and social factors that drive cheating behaviors.


At what point does "not keeping up with the cheaters" become "There is a more efficient way to do the task and this student's grade reflects they are choosing the inefficient approach?"

I'm reminded of the stories of employees getting busted because they were assigned a job so trivially automatable they either did automate it or they used some find-labor service to delegate it for a fraction of the cost out of their own pockets, who are then accused by the company of "not working."


It's the same logic as letting a machine do a gym exercise versus doing it yourself: you're missing the point, because without the effort the exercise is mostly useless.


> I've seen plenty of cheaters, they were the worst students

In other words, you haven't seen the "cheaters" who were among the best students.

"cheaters" in quotes because it's not clear to me that people using freely available resources when doing homework are really cheaters. If an instructor wants to do a closed-book exam, they can do just that.


> In other words, you haven't seen the "cheaters" who were among the best students.

If they were able to fool everyone to the point of being considered good students, it means they weren't cheaters, just that they had a different approach to problems than others (which is kinda what you say after).


I foresee a time when we'll write BDD tests and the AI will write the app.


let them cheat. every assignment, let them know that in the end they are only hurting themselves. the ones that want to learn won't cheat. simple as that. and the ones that cheat and leverage it, great. hope that works out in the future at your next destination. all you can ever do is present the facts. there is no point wasting time trying to catch this.


>hope that works out in the future at your next destination.

That's the thing - a lot of code is generated with "Github Copilot" so it isn't considered "cheating" in the real world. They need to learn how to properly use the tools available to them. They'll be harmed by forcing them not to use this tech, so it makes sense to teach them how to use it better.


Sure but if you allow it aren’t you facilitating a potentially unfair (or unreliable) baseline?


Without knowing with high confidence what is LLM-generated and what isn't, it seems pointless.


So, what if we give ChatGPT the code of ChatGPT and ask it to improve upon it? If it can do that, we are fucked.


1984: Computers are like a bicycle for our minds

2024: AIs are like super sports cars for our minds


It’s okay, it’s now clear that programming won’t be done by humans within 10 years.


adapt and overcome

Thank you evolution you cruel mistress.


I really like this blog post, I think it fairly describes some of the challenges teachers are going to face with this technology, while also admitting that this tech is inevitable.

> In my opinion, the students learning to program do not benefit from AI helping to "remove the drudgery" of programming. At some point, you have to learn to program. You can't get there by avoiding the inevitable struggles of learning-to-program.

I don’t disagree with this at all (at least for where we are now, in ten years, it might not matter as much), and I don’t want to be glib, but I do think the answer is to “teach students to program.” Don’t rely on rote assignments that you’re checking with an auto-grader (not saying this professor does that, but a lot do) and cookie-cutter materials; actually teach them to program.

And yes, LLMs will almost certainly mean that some students will cheat their way out of their assignments. But just like most cheaters who cheat on things they don’t fundamentally understand (which is different from people who cheat to hurry up and get through an exercise they could do in their sleep but don’t want to waste the energy doing), it will catch up to them when they have to do something that is not part of the rote assignment.

Or maybe, adjust how you test/assign homework. That isn’t to imply that that won’t take more work, but if your concern actually is that students aren’t learning and are just successfully copying and pasting, the testing/grading is the problem.

In high school (and college), I was a top student in math and in English. But I hated doing homework. In one math class in high school, although I got near perfect scores on all of my math exams, the teacher still gave me a C because 20% of my grade was “homework” that I didn’t do (I had already taken the same class a year earlier at another school — I didn’t need to do the homework. My math tutor my mom got me out of fear of my bad grade taught me Calculus and Fortran instead of trying to get me to do the useless homework). This taught me nothing and frankly, soured me on taking more advanced math classes that were all taught by this same teacher.

In contrast, I had an English teacher who would assign vocabulary homework. Basic, “write a sentence with each word” shit. Again, a total waste of time for me. So I worked out a deal with him, let me just orally tell you what each word means, saving us both time and energy. He took the additional step of assigning me/grading me on different criteria than the rest of the class for essays and the like.

Which class do you think I learned more from? Which teacher actually cared about whether I knew/understood the material, versus what checkboxes I needed to follow to show “completion.”

If the goal is to teach students to understand what they are doing, then do that. Don’t get obsessed with trying to stop the inevitable few from cheating, or become overly focused on only having one way to measure comprehension.


> Except that half the code was made-up. One of the API calls that it was using doesn't exist (I wish it did!)

I've also been using ChatGPT to generate code for a specific library, and had the exact same experience. It uses an API that should exist, but not the one that does exist.


The future is all prompt engineering.


Chomsky, in a recent interview, said if students are cheating with these tools, it’s the fault of the teacher for not creating work that is interesting enough to be absorbed by …


There is nothing interesting about pointer dereferencing errors (or their slightly-more-turd-polished equivalent, null references). Absolutely nothing at all.

Most programmers will spend a non-trivial amount of their career fussing over them, however, and any programming education that doesn't at least touch on how to identify them and what to do about them is pedagogically void.



