Hacker News | _jx7j's comments

While I generally agree, I think there's a point in these two statements that can easily be misinterpreted.

>It is likely that the Data Scientist role is in a long term decline...

Also

> Data science is in decline and vaguely defined

Reading this, you might think that "Data Science" jobs are decreasing. But I don't think that's true.

Let's just say that it's 2017 and I hire a team of 3 people with the job title of Data Scientist. One ends up focusing on the data side, one on modeling+analysis, and one on building the infrastructure. In 2023, I decide to change the job titles so one of them is now a Data Engineer, one is now a Data Scientist, and one is now an ML Engineer to match what is happening in the job market.

It's still 3 jobs with 3 people doing the same thing. So the number of jobs isn't decreasing, but the titles are more specific. Overall, the number of "Data Science" jobs is still going up.

Somebody will say "But that's exactly what the author said." But I think people who are new(ish) to this field might read it as "Data Science Jobs are decreasing." So I'm making this comment.

> skills such as data mining and visualisation are also out of favour.

Honestly, I just don't believe this. It's possible that as job descriptions are filled with different buzzwords, people just leave these out. For visualization it's also possible that there is a bigger focus on keywords of an established BI tool (e.g. PowerBI) instead of ad-hoc charts in matplotlib or ggplot. But some degree of data mining and visualization is useful, even to Data Engineers.


>With 10+ years in DS, I've always felt that best DS were always basically software engineers that knew math and were more interested in prototyping cool machine learning product than maintaining production infrastructure. Unfortunately this always accounted for a small fraction of DS I interacted with.

I've been a DS for 10+ years, and I feel the exact opposite. The worst "Data Scientists" I've worked with are all ex-Software Engineers who seem to assume that business problems are really computation problems. So they find convenient ways to ignore the human aspects (e.g. trying to figure out why the data is a mess) and gravitate toward using more complex algorithms and breaking the problem down into an achievable programming pipeline that runs in production, but the results are of low value. But it looks awesome on a resume.

Are you right or am I right about SWEs turned DS? I have no idea. But one quality that IMHO is important is the interest in actually looking at data and asking questions, which is much rarer than most people realize.


> Are you right or am I right about SWEs turned DS?

It doesn't sound like your worst and their best are the same kind of person. I don't see necessarily conflicting views.


There might be a lesson from history on where an "AI" startup needs to focus if they want to succeed.

Starting around 2012, there was huge hype around ML. Lots of startups selling "ML." If you look today, the majority of the startups that just sold ML algorithms or ML in a box are pretty much gone, or were acquired for amounts where only the founders made enough money to pay off the car loan on a Toyota Camry. At least one of the most highly regarded unicorns of 2014 is now in the "Wait, that company still exists?" category.

The companies that survived or got acquired for a decent amount were the companies that used ML to actually solve a tangible problem. They were not selling ML, but a solution to some problem that happened to use ML - even if their marketing focused on the ML part.

I wouldn't be surprised to see the same with AI companies.

A big company (e.g. Google) can quickly release a similar LLM product because that's an area where the big company was already doing work, and a lot of the knowledge needed to build it is published or known. But if a startup is targeting a focused problem in a specific industry, a big company can't just wake up one day and get the information needed to solve that problem in a short amount of time.


One of the ways I think about this type of problem is by asking "You want to use computation to extract a signal from this data. What's that signal worth to you in business ROI dollars?"

If Domain Expertise + Feature Engineering + ML model can get you 90% of the way there and it runs on a tiny cloud instance that takes 30 minutes to train, is a DL-based approach that pushes you to 91% worth it from an ROI standpoint if it takes a 4xGPU cluster 2 days to train, not to mention inference costs? Especially if you need to explain what the model is doing?
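
As a toy back-of-the-envelope version of that question (every number below is invented purely for illustration, not from any real project):

    # Compare two hypothetical approaches: value of the recovered signal minus compute cost.
    def net_value(accuracy, signal_value, training_cost, yearly_inference_cost):
        return accuracy * signal_value - training_cost - yearly_inference_cost

    signal_value = 1_000_000  # made-up yearly value of a perfect signal

    classic_ml = net_value(0.90, signal_value, training_cost=50, yearly_inference_cost=1_000)
    deep_learning = net_value(0.91, signal_value, training_cost=5_000, yearly_inference_cost=50_000)

    print(f"ML net value: ${classic_ml:,.0f}")     # $898,950
    print(f"DL net value: ${deep_learning:,.0f}")  # $855,000
    # The extra point of accuracy is worth $10,000 here,
    # far less than the extra compute it costs.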

This is exactly the situation I'm in right now at my job. I'm on the "Get useful stuff to production so we can save money" side of things, and we have R&D teams who try to approach the same problems using DL and all the latest methods. At least for the use cases our team focuses on, they haven't been able to do more than set $$$ on fire via GPUs. For us, Domain Knowledge + Good Data Engineering is the secret.

I think ML is going to be around for a long time because it works, even though DL is dominating the news right now. Just because a neurologist can also diagnose and treat common medical conditions (e.g. pneumonia), that doesn't mean we need every doctor to be a neurologist.


this is awesome! thanks


>This couldnt be further from the truth.

I think one thing to keep in mind is that there are specific use cases where the cost of using DL isn't worth the improvement in accuracy (if there is one) from a business ROI perspective.

I know somebody who works in the insurance industry on a text classification use case. The business impact is significant, as it's used as part of the claims process. The team he's on has tried a lot of different things, but feature engineering + domain expertise + a particular tree-based ML model has provided the best performance for the lowest overall cost. They are very open to trying new things, but a DL approach simply hasn't been worth it.
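
I obviously don't know what that team actually built, but a minimal sketch of that general shape of pipeline (the features, data, and gradient-boosted trees below are my own assumptions, not theirs) looks something like this:

    # Illustrative only: hand-engineered text features + a tree-based classifier.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline, make_union
    from sklearn.preprocessing import FunctionTransformer

    def handmade_features(texts):
        # The kind of domain-inspired features a claims team might add: length,
        # digit count, presence of a keyword. Purely made up for this sketch.
        return np.array([
            [len(t), sum(c.isdigit() for c in t), int("injury" in t.lower())]
            for t in texts
        ])

    model = make_pipeline(
        make_union(
            TfidfVectorizer(max_features=2000),
            FunctionTransformer(handmade_features),
        ),
        GradientBoostingClassifier(),
    )

    claims = [
        "Rear-ended at a stop light, driver reports neck injury",
        "Slip and fall in the lobby, minor injury to wrist",
        "Hail damage to the roof of the insured property",
        "Kitchen fire caused smoke damage throughout the unit",
    ]
    labels = ["bodily_injury", "bodily_injury", "property", "property"]

    model.fit(claims, labels)
    print(model.predict(["Back injury reported after a parking lot collision"]))

The point isn't the specific model; it's that this whole thing trains in seconds on a laptop, which is the kind of cost profile a DL approach has to beat on ROI, not just on accuracy.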


>But (especially in Enterprise B2B) they're not coders, and they're not product managers, and they are not quite salesperson "enough" to qualify for evolving to any of those career positions.

I was in PreSales for a long time and IME this isn't true (excluding SWEs). Different SEs have different personalities and different inclinations. Some of these personality traits mean they have to stop being an SE and do something else to maximize their career satisfaction.

A lot of SEs stay as SEs and simply grow as technical experts or industry experts and then move into a "Principal" SE role. I think these tend to be the type of people who like to go deep into something, and whose strengths are in going deep. The ones who get tired of the sales part can go into developer relations, solution development, etc., where they are part of building solutions but are removed from the heavy sales parts of it. You can certainly get paid enough at the Principal level that money is removed as a reason for changing career paths.

There's then the type of SE who is strong technically and is good at understanding what a customer is really asking for and translating that into a solution to a specific use case. These are the ones who can be good product managers. I put myself in this category and the product SWEs I worked with usually liked talking to me because I could frame a customer ask in a concrete technical way, which was something a lot of Product Managers struggled with. I had more than one instance where a Product Manager said if I wanted to switch to Technical Product Management they could find a path to make that happen.

There's then the type of SE who is okay technically, but who really cares about customers succeeding in the long term. They can do well in customer success roles, or even technical support and management roles in either of those areas.

There's then the type of SE who really enjoys the "educating" part of sales. They can go into the Training/Education side of things.

Finally, there's the type of SE who has a great personality, but they are poor technically and not very detail oriented when it comes to anything technical. I've seen these become sales people or move into account management.

I've worked with SEs who have moved into Sales roles, Product Management, Training, Customer Success, PreSales Management, Developer Relations, etc.

Regarding SWE, I do agree with you that transitioning from SE to SWE is harder. I've worked with SWEs who became SEs but couldn't transition back. Once you've climbed out of the trenches and seen the big picture it's hard to say I'm going to go back in the trenches, especially at a large B2B company where there are 1000s of SWEs. It's easier if you were a specialist (e.g. in Data Science) and the SWE job is a specialist role.

>How do you evolve SEs in your org?

One thing I've seen is that many SE managers are sales people at heart, which makes them terrible at evolving SEs. They only reward and recognize the SEs who have the traits that overlap with sales people. So they can't even recognize how to evolve their SEs. Even if the SE manager gets it, the Directors/VPs, etc are all ex-sales people and they don't get the value of focusing on these other things.

IMO the way to evolve SEs is to connect them with somebody from a different org (one that can use the SE's strengths) and have them work together. The SE will learn how people do a different job, and that other person gets a direct line into what SEs are seeing in the field. It's generally a net positive for the company.


>Probability Through Problems

First time I'm hearing about this one, thanks for the recommendation. Unlike Calculus or even a typical one-semester Statistics course, probability is one of those topics where you need to work through a lot of problems to really grok anything. The only way is to see a lot of solved problems and think about why each answer is right.

Even highly recommended books (e.g. by Blitzstein) don't have enough solved problems, so it's nice to know there's a problem-focused book out there.


In the last 5 years I've moved from Data Science Manager to Principal (IC role, but basically the external facing technical lead of the team) and now Senior. When I add up all the positives and negatives at work and at home, I think I'm the most content I've been in a long time.

One piece of advice I can give is to make a list of concrete things that are making you think about becoming an IC. And by concrete I don't mean just a word like "Stress." List out the things that are causing that (e.g. Unpredictable rushed deadlines). If it's useful, create a mental Venn diagram of what overlaps and pull out some larger themes. From the list, pick the top 2 - 3 and ask yourself if you are really going to get that as an IC.

Example: For me, one thing that was causing me an immense amount of stress was a highly unpredictable schedule - last minute meetings, fires, people with no ability to prioritize, etc. At the top of my list was being able to control my schedule to a reasonable degree. IME that's really hard to do as a Manager or even a Lead/Principal, so it made sense to me to move down to Senior and find a job that allowed for that.

That being said, there are plenty of IC jobs where you sit in pointless meetings all day and get called into meetings at the last minute. It's not like moving down is a guarantee of anything, so it's important to really identify whether moving down is going to get you what you want. HN has plenty of anecdotes of crappy IC jobs.

As far as pay, I took a small haircut from Manager -> Principal and then a slightly bigger haircut from Principal -> Senior. My spouse works so we still have more than enough to pay bills, meet retirement goals, and have something left over for other stuff. I'm not paid even close to what FAANG people are making, but I suspect lots of people I work with on the business side of things would love to be in my salary band.


>But it also (1) helps keep bad content off of the platform, so users aren't exposed to it, (2) lowers the number of human reviewers who come into contact with it, which is improves their jobs, and (3) frees up budget for whatever improvements need to be made to this whole workflow.

Reading that this type of solution was created, and that the person who worked on it was laid off, makes me very sad as a Data Scientist.

I enjoy working as a Data Scientist, but I struggle a lot with the field. Lots of jobs are mostly about grabbing eyeballs or selling something. Some jobs are just total bullshit. Even the ones where you're doing something concrete (e.g. keeping a machine running), some days you still wonder if it really matters in the long run.

But with some of these social media safety topics, it can feel like a job has some meaning beyond just shuffling numbers around on a spreadsheet.

So it's disappointing to hear that people with the skills to create something like that are fired.


It's possible to have a team that saves millions of dollars a year and still have it not be worth keeping.

For the sake of argument, let's say a statistics team has 5 people.

Cost of an employee at FB, including insurance, office space, 401K match, salary, bonuses = 250K/year per person (probably very conservative).

Cost of Data and Software Infrastructure to support them (including people to respond to Infrastructure support tickets), let's just be very conservative = 100K/year.

Cost of People Management overhead to support them. Includes salary of at least one manager, not to mention the time of a program manager, project manager, product manager, or whomever else. Let's just say 500K/year.

Total = 1.85 Million/Year.

Let's say this team of 5 people comes up with models that save the company $4M a year. I once had a VP tell me that to justify a Data Scientist on the team, they needed to produce savings of 10X what it cost the company to have that person on staff. I know this logic and math seem very weak and hazy. Mapping costs is a strange thing. But this is how some decision makers think, and this is how people get cut.
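
Putting the back-of-the-envelope math above into a few lines (these are the hypothetical figures from this comment, not real numbers for any company):

    # Hypothetical team-cost arithmetic.
    headcount = 5
    cost_per_employee = 250_000    # salary, bonus, insurance, office space, 401K match
    infra_cost = 100_000           # data + software infrastructure for the whole team
    management_overhead = 500_000  # manager salary plus PM/TPM/product time

    total_cost = headcount * cost_per_employee + infra_cost + management_overhead
    yearly_savings = 4_000_000

    print(total_cost)                    # 1850000, i.e. $1.85M/year
    print(yearly_savings / total_cost)   # ~2.16x, well short of a "10X your cost" bar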


The team was all mathematicians. We did the math. I helped one of our data scientists put a model into production that saved $15M a year from that model alone, and we had a dozen people like that. We were working on signal loss models that had potential to save billions. I genuinely do not understand the logic of cutting this team to save costs.


Eric, my best wishes to you. I've also enjoyed reading your writing, back in the days when you were allowed to write about your work.

Having now had some experiences similar to yours, I don't believe there has to be strict logic behind the managerial decisions that lead to big changes. That's not how they are made, and it happens more often, and with more impact, than we typically register in our own environment, since we are busy doing our specific tasks. I know that can sound cynical, but I think it correctly reflects reality.

In one specific case from my previous job, I know from people who were present when the decision was made that a decision about whether hundreds of people would keep working on many running projects was made after one senior manager left, and the few remaining people, who were the only ones deciding, literally had a short exchange: "OK, who wants to take over these? I won't, do you?" "No." "No." "Me neither." "OK, then let's dismantle all of it." And so it went. And it's not that the work wasn't profitable for the company; that was clearly documented. The decision of each of those involved was then explainable with "it didn't match our vision of where we want to concentrate the company's effort." It is sometimes as simple as that. The "high managers" so often score additional points whenever they decide that the company will do fewer things.

Steve Jobs was, of course, famous for abandoning various projects at Apple on his comeback, and it demonstrably produced results. But I also see companies losing proficiency in some fields overnight based on impulsively made managerial decisions, and performing even worse later. I don't have any grand narrative to push based on these experiences, except to state my belief that sometimes the "reasons" are extremely simple and very, very mundane - to the point of causing huge disappointment to those who heard so many decisions presented as strictly the result of precise measurements and deliberations, who knew they did their best and were aware that "nothing was wrong."

It does leave one questioning, from this newly obtained perspective, whether they were right to invest as much energy as they did, and whether they made the right decisions during those times.


>I genuinely do not understand the logic of cutting this team to save costs.

I've been in a situation where a company was under pressure, was trying to make a big pivot, and there were multiple rounds of layoffs.

At one point I could only make sense of it by picturing a somewhat blind lumberjack getting an order that says "There's a forest that needs 15% of its trees cut. Go cut." Good trees get cut, bad trees get cut. Thankfully we are not trees, and if we get cut we can move on. We don't die just because we got chopped down.


Unfortunately, top-down mandates are imperfect and should be avoided as much as possible. Net profit matters to an operator who cares about today's profitability, but not at all to someone whose paradigm is "thinking in bets" and future payoffs. And the street has been rewarding people who ignore today's profits in favor of the narrative about tomorrow's growth.

From afar, it looks like Meta's leadership is a bunch of future thinkers who got told to cut today's costs, and it's not a well-practiced muscle for them.


Perhaps in the future the company won't be adding any new models or requiring optimization of any new surfaces, so they don't expect to be spending enough on new initiatives to justify optimizing them. And all the existing initiatives have already been optimized efficiently (though that does seem unlikely when I type it out).


> We were working on signal loss models that had potential to save billions

What are signal loss models in this context?


"Signal loss" is the overarching term for all the factors that lead to the company being less able to make good inferences about users. Not just the obvious consideration of "how do we serve an ad that is relevant to the user?" but for any data-driven decision that affects a user's experience.

The biggest recent cause of signal loss was Apple changing the rules for apps on their phone, but there are plenty of other causes.

The idea of a signal loss model is to identify ways to work around signal loss and still do a good job of making a decision with the data you have, when some of the data you were relying upon disappears suddenly.


Perhaps an example would be - we no longer have location data for users, but we do have time of activity, so we can presume that during daylight hours in the USA most of our activity is coming from there, things like that.

But with more inputs and such.
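
To make that toy example a bit more concrete, here's a minimal sketch of the idea (the regions, priors, and activity profiles are all made-up numbers for illustration, not anything the company actually uses):

    # Toy sketch: infer a missing signal (user region) from a remaining one
    # (hour of activity) via Bayes' rule. All numbers are invented.

    # P(region): prior share of users per region (made up).
    prior = {"US": 0.5, "EU": 0.3, "APAC": 0.2}

    def activity_likelihood(utc_hour, region):
        # P(active at this UTC hour | region): crude "awake during local daytime" profile.
        local_offset = {"US": -6, "EU": 1, "APAC": 8}[region]
        local_hour = (utc_hour + local_offset) % 24
        return 0.08 if 8 <= local_hour <= 22 else 0.01

    def region_posterior(utc_hour):
        unnormalized = {r: prior[r] * activity_likelihood(utc_hour, r) for r in prior}
        total = sum(unnormalized.values())
        return {r: v / total for r, v in unnormalized.items()}

    # A user active at 18:00 UTC: midday in the US, evening in the EU, 2am in APAC.
    print(region_posterior(18))  # roughly {'US': 0.61, 'EU': 0.36, 'APAC': 0.03}

A real system would obviously combine many such signals and validate against held-out data, but the basic move is the same: replace the signal you lost with a model over the signals you still have.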


These employees are making a lot more than $250k just in base salary. Cost is probably closer to $1M each, all in. "A few million" in net cost savings isn't much for a team that probably costs $5M a year.

It would definitely be better to find another internal home (assuming the team is portable without its mother team that got cut), but sometimes these decisions are made quickly without a lot of granularity. They aren't necessarily going to find one sub-team that saves only ~1x their cost in net profit and figure out how to transplant them to another org.

He seems to have taken away the important lesson - if you're not primary you're in danger.


How in the world are you getting from $250k salary to $1M total cost? Stuff like office space and equipment/services, health insurance, HR overhead are constants per person, they don't scale up with salary. Are you assuming that some big bonus or grant package is necessary?


Yes, their total comp is $500k+. They are taking up a portion of the management time of someone whose total comp is approaching (or over) $1M.

Software Engineer: https://www.levels.fyi/companies/facebook/salaries/software-...

Software Engineering Manager: https://www.levels.fyi/companies/facebook/salaries/software-...


The trick with discussing any numbers like this is that there are variables none of us can know without more intimate knowledge of the firm. For example, my spouse works for an SV firm. His team is 100% WFH, 100% of the time. They have no permanently allocated office space in any of the company's buildings anywhere in the world.

However, they're paying out bonuses twice a year, annual (PB)RSUs, (specifically for us) almost $30k/yr in employer contributions to health insurance and our HSA combined, a music streaming subscription, and so on.

The benefits, the bonuses, the extras, they all add up and are all very company specific. I’m not saying you’re wrong by any stretch. But with the number of extra benefits, healthcare, and everything else that’s different from employer to employer, we are all just guessing.


Facebook employees make a lot more than 250k. Someone with Eric Lippert's level of experience probably makes well over 600-700k in total compensation - just see levels.fyi!


A big portion of the comp especially at higher levels is in stock grants... and Meta stock just dropped 75% in value this year.

These grants are valued at the market price at time of hire (or refresh).

So maybe pre-2022 the comp was 700k...


It was claimed to be millions of net savings per deployed model.


Doesn't this paragraph indicate that they were making millions of dollars in savings OVER their cost of operating?

"The PPL team in particular was at the point where we were regularly putting models into production that on net reduced costs by millions of dollars a year over the cost of the work"


If the company as a whole has an ROI of 10×, then a team with a 4× ROI is dragging down the company's overall return, even though it's still profitable in absolute terms.

