This is not the point. The point is you misquoted the article without understanding the full context, and were corrected. The parents weren't judgmental of an online life, they were just unaware. In fact, in the documentary linked in one of the replies, it seemed that they were glad their son had good friends who really cared about him.
Jeez. I understood the full context, I just wasn't even talking about that, and neither was the grandparent comment, I think. The "online life: wholesome or not?" debate has crept into this comment chain by accident.
I'm in a similar position as you in terms of job, salary and balance.
But this is my 5th job out of college and the longest I've been in a single job. When I hit my four-year mark, I started to think that the grass is greener on the other side and looked for opportunities here and there.
A few years in, I still haven't found the right place that would make me jump ship. I have high standards and I can spot red flags based on past experiences in jobs in multiple countries.
I also found that as I got older and accrued more responsibilities outside work, my job became a much smaller focal point of my life.
I'd rather be employed in an okay place, paid a competitive enough salary (75th percentile), with the opportunity to learn new things in the job and out, even if I don't love the field, than try a new job and risk being put in a toxic work environment and losing the balance I have now.
If you are not growing as an engineer, going to a new job might not help you and could be detrimental. It's much better to learn new things while you have the time to do so. Find an area you want to learn more, try new projects or courses, and have fun at your own pace.
Regarding your question, first, I'd like to understand what problem you want to solve, and whether this approach will be useful for other users of tea-tasting.
No problem! I have most of the code in very small functions that I'd be willing to contribute.
At my company we have very time-sensitive A/B tests that we have to run with very few data points (at most 20 conversions per week, after 1000 or so failures).
We found that Bayesian A/B testing was excellent for our needs, as it could be run with fewer data points than regular A/B testing for the sort of conversion changes we aim for. It gives a probability of group B converting better than A, and we can run checks to see if we should stop the test.
Regular A/B tests would take too long, and the significance of the test wouldn't make much sense because after a few weeks we would be comparing apples to oranges.
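For anyone curious, the core calculation is simple. Here is a minimal sketch with a Beta-Bernoulli model (the counts are made up for illustration, not our actual data):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical counts: conversions and trials per group
conv_a, n_a = 12, 1000
conv_b, n_b = 20, 1000

# With a Beta(1, 1) prior, the posterior for each conversion rate is
# Beta(conversions + 1, failures + 1)
samples_a = rng.beta(conv_a + 1, n_a - conv_a + 1, size=100_000)
samples_b = rng.beta(conv_b + 1, n_b - conv_b + 1, size=100_000)

# Monte Carlo estimate of P(B converts better than A)
p_b_better = (samples_b > samples_a).mean()
print(f"P(rate_B > rate_A) ~ {p_b_better:.3f}")
```

A stopping rule can then be as simple as ending the test once this probability crosses a threshold like 0.95 (or drops below 0.05).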
Thank you for the explanation. If I understand correctly, you use this approach to increase sensitivity (compared to NHST) using the same data.
Most probably, in your case, higher sensitivity (or power) comes at the cost of higher type I error rate. And this might be fine. Sometimes making more changes and faster is more important than false positives. In this case, you can just use a higher p-value threshold in the NHST framework.
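To illustrate that trade-off with a rough sketch (hypothetical conversion rates, one-sided two-proportion z-test, normal approximation): raising the significance threshold buys power at the cost of more false positives.

```python
from scipy.stats import norm

def power_two_prop(p1: float, p2: float, n: int, alpha: float) -> float:
    """Approximate power of a one-sided two-proportion z-test."""
    se = ((p1 * (1 - p1) + p2 * (1 - p2)) / n) ** 0.5
    z_alpha = norm.ppf(1 - alpha)
    return float(1 - norm.cdf(z_alpha - (p2 - p1) / se))

# Hypothetical low-volume setting: 1% vs 2% conversion, 1000 users per arm
for alpha in (0.05, 0.20):
    print(f"alpha = {alpha}: power ~ {power_two_prop(0.01, 0.02, 1000, alpha):.2f}")
```

With these made-up numbers, moving alpha from 0.05 to 0.20 noticeably increases the chance of detecting a real effect, which is exactly the "higher p-value threshold" option.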
You might argue that the discrete type I error does not concern you. And that the potential loss in metric value is what matters. This might be true in your setting. But in real life scenarios, in most cases, there are additional costs that are not taken into account in the proposed solution: increased complexity, more time spent on development, implementation, and maintenance.
While the approach might fit your setting, I don't believe most other users of tea-tasting would benefit from it. For the moment, I must decline your kind contribution.
But you can still use tea-tasting and perform the calculations described in the whitepaper. See the guide on how to define a custom metric with a statistical test of your choice: https://tea-tasting.e10v.me/custom-metrics/
Most models are derived from machine learning principles that are a mix of classic probability theory, frequentist and Bayesian statistics, and lots of computer science fundamentals. But there have been advancements in Bayesian inference and Bayesian deep learning; you should check out frameworks like Pyro (built on top of PyTorch).
Edit: corrected my sentence, but see 0xdde reply for better info.
I could be wrong, but my sense is that ML has leaned Bayesian for a very long time. For example, even Bishop's widely used book from 2006 [1] is Bayesian. Not sure how Bayesian his new deep learning book is.
I stand corrected! It was my impression that many methods used in ML such as Support Vector Machines, Decision Trees, Random Forests, Boosting, Bagging and so on have very deep roots in Frequentist Methods, although current CS implementations lean heavily on optimizations such as Gradient Descent.
Taking a cursory look at Bishop's book, I see that I am wrong, as there are deep roots in Bayesian inference as well.
On another note, I find it very interesting that there's not a bigger emphasis on using the correct distributions in ML models, as the methods are much more concerned with optimizing objective functions.
I miss the college days where professors would argue endlessly on Bayesian vs Frequentist.
The article is succinct and well written, and even explains why my Bayesian professors had different approaches to research and analysis. I never knew about the third camp, pragmatic Bayes, but it is definitely in line with one professor's research, which was very thorough about probability fit and the many iterations to get the prior and joint PDF just right.
Andrew Gelman has a very cool talk "Andrew Gelman - Bayes, statistics, and reproducibility (Rutgers, Foundations of Probability)", which I highly recommend for many Data Scientists
A few things I wish I knew when I took statistics courses at university some 25 or so years ago:
- Statistical significance testing and hypothesis testing are two completely different approaches, with different philosophies behind them, developed by different groups of people. They kinda do the same thing, but not quite, and textbooks tend to completely blur this distinction.
- The above approaches were developed in the early 1900s in the context of farms and breweries where 3 things were true - 1) data was extremely limited, often there were only 5 or 6 data points available, 2) there were no electronic computers, so computation was limited to pen and paper and slide rules, and 3) the cost in terms of time and money of running experiments (e.g., planting a crop differently and waiting for harvest) were enormous.
- The majority of classical statistics was focused on two simple questions: 1) what can I reliably say about a population based on a sample taken from it, and 2) what can I reliably say about the differences between two populations based on the samples taken from each? That's it. An enormous mathematical apparatus was built around answering those two questions in the context of the limitations in point #2.
The data-poor and computation-poor context of old school statistics definitely biased the methods towards the "recipe" approach scientists are supposed to follow mechanically, where each recipe is some predefined sequence of steps, justified based on an analytical approximations to a sampling distribution (given lots of assumptions).
In modern computation-rich days, we can get away from the recipes by using resampling methods (e.g. permutation tests and bootstrap), so we don't need the analytical approximation formulas anymore.
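For example, a permutation test replaces the analytical formula with brute-force shuffling (toy data here, assuming exchangeability under the null):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two small toy samples, the kind the classical recipes were designed for
a = np.array([5.1, 4.9, 5.6, 5.8, 5.0, 5.3])
b = np.array([5.9, 6.1, 5.7, 6.3, 6.0, 5.8])
observed = b.mean() - a.mean()

# Shuffle the group labels many times and recompute the statistic
pooled = np.concatenate([a, b])
n_perm = 10_000
count = 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    diff = perm[len(a):].mean() - perm[:len(a)].mean()
    if diff >= observed:
        count += 1

p_value = count / n_perm  # one-sided permutation p-value
print(f"permutation p-value ~ {p_value:.4f}")
```

No t-distribution, no normality assumption: the null distribution is built directly from the data by resampling.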
I think there is still room for small-sample methods though... it's not like the biological and social sciences are dealing with very large samples.
My understanding is that frequentist statistics was developed in response to the Bayesian methodology which was prevalent in the 1800s and which was starting to be perceived as having important flaws. The idea that the invention of Bayesian statistics made frequentist statistics obsolete doesn't quite agree with the historical facts.
I see, so academics are frequentists (attackers) or objective Bayes (naive), and the people Doing Science are pragmatic (correct).
The article gave me the same vibe, nice, short set of labels for me to apply as a heuristic.
I never really understood this particular war, I'm a simpleton, A in Stats 101, that's it. I guess I need to bone up on Wikipedia to understand what's going on here more.
Bayes lets you use your priors, which can be very helpful.
I got all riled up when I saw you wrote "correct", I can't really explain why... but I just feel that we need to keep an open mind. These approaches to data are choices at the end of the day... Was Einstein a Bayesian? (spoiler: no)
Using your priors is another way of saying you know something about the problem. It is exceedingly difficult to objectively analyze a dataset without interjecting any bias. There are too many decision points where something needs to be done to massage the data into shape. A prior is just an explicit encoding of some of that knowledge.
> A prior is just an explicit encoding of some of that knowledge.
A classic example is analyzing data on mind reading or ghost detection. Your experiment shows you that your ghost detector has detected a haunting with p < .001. What is the probability the house is haunted?
Well, something could count as evidence that ghosts or ESP exist, but the evidence better be really strong.
A person getting 50.1% accuracy on an ESP experiment with a p-value less than some threshold doesn't cut it. But that doesn't mean the prior is insurmountable.
The closing down of loopholes in Bell inequality tests is a good example of a pretty aggressive prior being overridden by increasingly compelling evidence.
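Plugging made-up numbers into Bayes' theorem shows why a strong prior dominates a single p < .001 result. Assume a base rate of one haunted house in a million, read the p-value as roughly a 0.1% false-positive rate, and generously set sensitivity to 1:

```python
# Made-up skeptical prior: one house in a million is haunted
prior = 1e-6

# Detector: 0.1% false-positive rate, assumed to always fire on a real haunting
false_positive_rate = 0.001
sensitivity = 1.0

# Bayes' theorem: P(haunted | detection)
evidence = sensitivity * prior + false_positive_rate * (1 - prior)
posterior = sensitivity * prior / evidence
print(f"P(haunted | detection) ~ {posterior:.4%}")
```

Even with a "significant" detection, the posterior stays around 0.1%: it takes orders of magnitude stronger evidence, or repeated independent detections, to overcome the prior.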
The fact that you are designing an experiment and not trusting it is bonkers. The experiment concludes that the house is haunted and you've already agreed that it would be so before the experiment.
You're absolutely right; I'm trying to walk a delicate tightrope that doesn't end up with me giving my unfiltered "you're wrong, so let's end the conversation" response.
Me 6 months ago would have written: "this comment is unhelpful and boring, but honestly, that's slightly unfair to you, as it just made me realize how little help the article is, and it set the tone. is this even a real argument with sides?"
For people who want to improve on this aspect of themselves, like I did for years:
- show, don't tell (ex. here, I made the oddities more explicit, enough that people could reply to me spelling out what I shouldn't.)
- Don't assert anything that wasn't said directly, ex. don't remark on the commenter, or subjective qualities you assess in the comment.
Frequentist and Bayesian approaches are both correct if the research and methodology are scientifically rigorous. Both can be wrong if the research is whack or sloppy.
I've used both in some papers and report two results (why not?). The golden rule in my mind is to fully describe your process and assumptions, then let the reader decide.
I understand the war between bayesians and frequentists. Frequentist methods have been misused for over a century now to justify all sorts of pseudoscience and hoaxes (as well as created a fair share of honest mistakes), so it is understandable that people would come forward and claim there must be a better way.
What I don’t understand is the war between naive Bayes and pragmatic Bayes. If it is real, it seems like an extension of philosophers vs. engineers. Scientists should see value in both. Naive Bayes is important to the philosophy of science, without which a lot of junk science would go unscrutinized for far too long, and engineers should be able to see the value of philosophers saving them work by debunking wrong science before they start to implement theories which simply will not work in practice.
> - subjective Bayes is the strawman that frequentist academics like to attack
I don’t get what all the hate for subjective Bayesianism is about. It seems the most philosophically defensible approach, in that all it assumes is our own subjective judgements of likelihood, the idea that we can quantify them (however inexactly), and the idea (to avoid Dutch books) that we want to be consistent (most people do).
Whereas, objective Bayes is basically subjective Bayes from the viewpoint of an idealised perfectly rational agent - and “perfectly rational” seems philosophically a lot more expensive than anything subjective Bayes relies on.
Funnily enough, I also heard recently about fiducial statistics as a third camp: there's an intriguing episode (581) of the Super Data Science podcast with the EiC of Harvard Business Review.
I’m always puzzled by this because while I come from a country where the frequentist approach generally dominates, the fight with Bayesian basically doesn’t exist. That’s just a bunch of mathematical theories and tools. Just use what’s useful.
I’m still convinced that Americans tend to dislike the frequentist view because it requires a stronger background in mathematics.
I don’t think mathematical ability has much to do with it.
I think it’s useful to break down the anti-Bayesians into statisticians and non-statistician scientists.
The former are mathematically savvy enough to understand Bayes but object on philosophical grounds; the latter don't care about the philosophy so much as they feel that an attack on frequentism is an attack on their previous research, and they take it personally.
This is a reasonable heuristic. I studied in a program that (for both philosophical and practical reasons) questioned whether the Bayesian formalism should be applied as widely as it is. (Which for many people is, basically everywhere.)
There are some cases, that do arise in practice, where you can’t impose a prior, and/or where the “Dutch book” arguments to justify Bayesian decisions don’t apply.
I think the distaste Americans have for frequentists has much more to do with the history of science. The eugenics movement had a massive influence on science in America, and they used frequentist methods to justify (or rather validate) their scientific racism. Authors like Gould brought this up in the 1980s, particularly in relation to factor analysis and intelligence testing, and were kind of proven right when Herrnstein and Murray published The Bell Curve in 1994.
The p-hacking exposures of the 1990s only cemented the notion that it is very easy to get away with junk science by using frequentist methods to unjustly validate your claims.
That said, frequentist statistics is still the default in the social sciences, which ironically is where the damage was the worst.
I’m not actually in any statistician circles (although I did work at a statistical startup that used Kalman Filters in Reykjavík 10 years ago; and I did dropout from learning statistics in University of Iceland).
But what I gathered after moving to Seattle is that Bayesian statistics is a lot more trendy (accepted, even) here west of the ocean. Frequentism is very much the default, especially in hypothesis testing, so you are not wrong. However, I'm seeing a lot more Bayesian advocacy over here than I did back in Iceland. So I'm not sure my parent is wrong either, that Americans tend to dislike frequentist methods, at least more than Europeans do.
I’m sure there are creative ways to misuse Bayesian statistics, although I think it is harder to hide your intentions as you do that. With frequentist approaches your intentions become obscured in the whole mess of computations, and at the end of it you get to claim a simple “objective” truth because the p-value shows < 0.05. In Bayesian statistics the data you put into it is front and center: the chances of my theory being true given this data are greater than 95% (or was it the chances of getting this data given my theory?). In reality most hoaxes and junk science were because of bad data which didn’t get scrutinized until much too late (this is what Gould did).
But I think the crux of the matter is that bad science has been demonstrated with frequentist methods and is now a part of our history. So people must either find a way to fix the frequentist approaches or throw them out for something different. Bayesian statistics is that something different.
> "The chances of my theory being true given this data is greater than 95% (or was it chances of getting this data given my theory?)"
The first statement assumes that parameters (i.e. a state of nature) are random variables. That's the Bayesian approach. The second statement assumes that parameters are fixed but unknown values. That's the frequentist approach.
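The two statements really do produce different numbers. A toy comparison for a coin that lands heads 60 times out of 100 (uniform prior on the Bayesian side):

```python
from scipy.stats import beta, binomtest

heads, n = 60, 100

# Frequentist: probability of data at least this extreme given theta = 0.5
p_value = binomtest(heads, n, p=0.5, alternative="greater").pvalue

# Bayesian: with a uniform Beta(1, 1) prior, the posterior is
# Beta(heads + 1, tails + 1); compute P(theta > 0.5 | data)
posterior_prob = 1 - beta.cdf(0.5, heads + 1, n - heads + 1)

print(f"P(data | theta = 0.5)  (p-value)   ~ {p_value:.3f}")
print(f"P(theta > 0.5 | data)  (posterior) ~ {posterior_prob:.3f}")
```

Same data, two different questions: the p-value conditions on the parameter, the posterior conditions on the data.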
I'd suggest you read "The Book of Why"[1]. It is mostly about Judea Pearl's next creation, causality, but he also covers the Bayesian approach, the history of statistics, his motivation behind Bayesian statistics, and some success stories.
Reading the book will be much better than applying "Hanlon's Razor"[2] just because you see no other explanation.
This statement is correct in a very basic, fundamental sense, but it disregards research practice. Let's say you're a mathematician who studies analysis or algebra. Sure, technically there is no fundamental reason for constructive logic and classical logic to "compete"; you can simply choose whichever one is useful for the problem you're solving. In fact, {constructive + LEM + choice axioms} is equivalent to classical math, so why not just study constructive math, since it's a higher level of abstraction, and add those axioms "later" when you have a particular application?
In reality, on a human level, it doesn't work like that because, when you have disagreements on the very foundations of your field, although both camps can agree that their results do follow, the fact that their results (and thus terminology) are incompatible makes it too difficult to research both at the same time. This basically means, practically speaking, you need to be familiar with both, but definitely specialize in one. Which creates hubs of different sorts of math/stats/cs departments etc.
If you're, for example, working on constructive analysis, you'll have to spend a tremendous amount of energy on understanding contemporary techniques like localization just to work around a basic logical axiom, which is likely irrelevant to a lot of applications. Really, this is like trying to understand the mathematical properties of binary arithmetic (Z/2Z) while day-to-day studying group theory in general. Sure, Z/2Z is a group, but really you're interested in a single, tiny, finite abelian group, and now you need to do a whole bunch of work on non-abelian groups, infinite groups, non-cyclic groups etc. just to ignore all those facts.
I would follow, but neither Bayesian nor frequentist probability is rocket science.
I’m not following your example about binary and group theory either. Nobody looks at the properties of binary and stops there. If you are interested in number theory, group theory will be a useful part of your toolbox for sure.
It's because practitioners of one camp say that the other camp is wrong and question each other's methodologies. And in academia, questioning someone's methodology is akin to saying they are dumb.
To understand both camps, I summarize like this:
Frequentist statistics has very sound theory but is misapplied through many heuristics, rules of thumb, and prepared tables. It's very easy to use any method and hack the p-value away to get statistically significant results.
Bayesian statistics has an interesting premise and inference methods, but until recent advancements in computing power, it was near impossible to run the simulations needed to validate the complex distributions used, the goodness of fit, and so on. And even today, some Bayesian statisticians don't question their priors or iterate on their research.
I recommend using both methods whenever convenient and fitting to the problem at hand.
I can attest that the frequentist view is still very much the mainstream here too and fills almost every college curriculum across the United States. You may get one or two Bayesian classes if you're a stats major, but generally it's hypothesis testing, point estimates, etc.
Regardless, the idea that frequentist stats requires a stronger background in mathematics is just flat-out silly; I'm not even sure what you mean by that.
I also thought it was silly, but maybe they mean that frequentist methods still have analytical solutions in some settings where Bayesian methods must resort to Monte Carlo methods?
> I’m still convinced that Americans tend to dislike the frequentist view because it requires a stronger background in mathematics.
The opposite is true. Bayesian approaches require more mathematics. The Bayesian approach is perhaps more similar to PDE where problems are so difficult that the only way we can currently solve them is with numerical methods.
Hey Peter, I'm a foreigner residing abroad with an American spouse of 10 years, and we have kids holding American passports. I'm currently working as a contractor for a company in the US, but I'm thinking of moving to the US and applying for the IR1 visa for family reasons. I am the primary earner of the household.
What are my limitations in this case? Can I keep the contract I'm currently in while the visa process chugs along? What about proof of income?
A friend of mine did the "fiancee visa" in Europe at the local embassy there. It took something like 3-5 months to get the visa approved for travel to the US. Once there, he could start working immediately.
But this was a long time ago; I have no idea what the processing times are today. Are they listed online?
And above all, since his spouse did not work in the US, they needed a US-based citizen with "not bad" finances to sponsor him.
If you're an independent contractor based in a different country, it'll be fun times figuring out the taxes for the year that you move to the US. Both your old country and the US will want a cut.
For you to apply for your green card while in the U.S., you need to be in the U.S. in a work visa status, not in a visitor status. Alternatively, you can get your green card outside the U.S. through a U.S. Consulate (known as an "immigrant visa"), but that process would take longer, 6-18 months (versus 6 months or less if filed in the U.S.). And your employment as a contractor (as well as your and your wife's assets) can be used to meet the financial support requirements.
That is a much lesser punishment than what the US is doing in this thread. A fine that is a small but not insignificant percentage of annual revenue is a measured response when you want to punish bad behavior but still allow businesses to operate within your jurisdiction.
Restricting business operation altogether is a response a country gives when they see the other party as extremely adversarial, which is a few orders of magnitude above what EU fines are to Big Tech.
I know someone who works on this at Meta. His resume is computer science heavy, with a master's in Machine Learning. On the previous-experience side, before getting into Meta, he had about a decade working as a Software Engineer with Machine Learning systems in multiple languages, such as Go, C++ and Python.
To get the job, he applied for a Software Engineer, Machine Learning position and went through the multiple-step interview process. When he got the job, he did a few weeks of training and interviewed with teams. One of the teams in charge of optimizing ML code at Meta picked him up, and now he works there.
Because of Meta's scale, optimizing code to save a few ms or watts has a huge impact on the bottom line.
In sum:
- Get a formal education in the area
- Get work experience somewhere
- Apply for a big tech Software Engineer (ML) job
- Hope they hire you and have a spot in one of the teams in charge of optimizing stuff
This is helpful, thank you. There's always some luck.
I have a PhD in CS, and lots of experience in optimization and some in throughput/speedups (in an amdahl sense) for planning problems. My biggest challenge is really getting something meaty with high constraints or large compute requirements. By the time I get a pipeline set up it's good enough and we move on. So it's tough to build up that skillset to get in the door where the big problems are.
I won't be defending Nintendo, because their joy-stick drift issue, their unwillingness to fix that design flaw, and their constant efforts to stop video game preservation really piss me off.
But emulators are hit or miss, especially if you aren't tech-savvy enough to fix specific problems, and their appeal depends on people's preferences. Users need to:
- install and configure an emulator, which isn't foolproof
- go through the hoops of downloading ROMs from sometimes sketchy sources
- configure each game to work well
- troubleshoot on their own any issues they might have with some games
- give up playing their games online (though in Nintendo's case the online experience is truly lacking)
- have no moral issues with pirating games that are currently selling in both physical and digital formats and are widely available
Nintendo's Switch is due for a very necessary upgrade, but the console is still widely available, very convenient with minimal setup and troubleshooting, and very plug-and-play. With the added benefit that you get to support developers so they continue making games.
> I won't be defending Nintendo, because their joy-stick drift issue, their unwillingness to fix that design flaw, and their constant efforts to stop video game preservation really piss me off.
I think this criticism is a little misguided.
1. Joy-con drifting doesn't happen much with newer Joy-cons. Nintendo also repairs all Joy-cons for free now - even ones that aren't drifting. Break a button? Damage the rubber top of the Joy-con? They fix those free too, and even pay all shipping costs. We might as well bring up the Xbox Red Ring incident, or the PS5 having melting USB ports.
2. "Their unwillingness to fix their design flaw" - as already stated, it already has been mostly fixed through subtle changes. Sure, there's no big announcement of a specific revision that has no issues, but that would be begging for a class-action lawsuit (the reason why companies can never admit guilt publicly - or they've already lost). Also, if they were to announce that "revision X has no issues," and then it developed issues eventually like all non-Hall sticks do, another lawsuit.
3. "constant efforts to stop video game preservation" - You've surely never seen Sony or Microsoft's efforts then. They are more subtle and skilled, but don't think for a second they don't have the same goals. I actually think Microsoft is the most insidious character; for making consoles that cannot be set up without internet, combined with "backwards compatibility" for "preservation" that also does not work without internet.
> I actually think Microsoft is the most insidious character;
Towards game preservation? Microsoft is basically a patron saint. For starters, the concept of an "Xbox exclusive" barely exists. Most console titles Microsoft publishes release day-and-date on PC, where DirectX is entirely reverse-engineered and doesn't rely on Windows. The Xbox itself is the only console among the companies you mentioned where you can install an emulator without hacking the OS or red-teaming the OEM.
> combined with "backwards compatibility" for "preservation" that also does not work without internet.
That's because the Xbox does not contain redundant backwards-compatible hardware. It provides high-level emulation for older titles, which doesn't work unless you can download the mods. It's a bit like complaining that FPS Boost doesn't work without internet to download the update from.
Yeah, I definitely like emulators because I don't have to have a library of consoles beneath the TV and some features such as rendering at higher resolutions and savestates + save management in general are great. But they aren't a seamless experience (especially for newer hardware) and I wouldn't expect the majority of people to prefer it compared to the original hardware.
A freight broker that wanted to use technology to automate the work of finding trucks, managing the communication between truck, shipper, and facilities, and auditing the job of moving the load.
The job is very high revenue (thousands of dollars per cargo, depending on the distance) but low profit (5%-10%) these days.