The RideShareGuy is the most prominent blog for drivers and their annual survey ...

CogitoCogito · on Aug 20, 2020

Assuming that poll is representative of the drivers' will at large, I guess then hopefully Uber and Lyft can satisfy these drivers by following the regulations required for them to be classified as independent contractors.

fidelramos · on Aug 21, 2020

There is another way: laws should allow people and companies to determine their own terms for collaboration, instead of meddling with their freedom of association.

CogitoCogito · on Aug 23, 2020

Nothing is keeping Uber and Lyft from determining their own terms. However, the terms they choose affect the taxes they pay and the benefits they provide. If they want certain terms, then they need to accept the regulations that come with them.

So I guess I'm not really sure what your point is. Is your point that companies shouldn't be required to follow labor/tax law? What's the difference between this and saying that you don't want to pay payroll taxes on employees? Or that you want to pay a lower rate? Or that you don't want to pay import taxes or that you don't want to follow required laws in what you may put in the products you sell? What exactly makes Uber and Lyft so special that they should get to choose what rules they are to follow?

shanemhansen · on Aug 20, 2020

There's lots of ways to lie with surveys, I wonder what the exact wording was and what the responses would be to variants such as:

1. Do you want to be CEO of your own company or a peon employee?

2. Do you want uber to set your hours?

3. Do you want medical, vacation days, or unemployment insurance?

For some people these are all basically the same questions. For most folks, I don't think they are.

renewiltord · on Aug 20, 2020

Very curious about this trend I've seen on HN where instead of going to the primary source, commenters will take 2x the time to hypothesize about what the primary source could contain.

In this particular case it was so easy that it really makes me wonder.

I can think of a couple of reasons:

* Uncertainty with where in the primary source the content is

* Concern that you're being Gish Galloped with citations

* Lack of skill at skimming

Anyway, the source is particularly weird with the way it links to things and stuff but it wasn't that hard to find. I went through it and then decided to screen record the interaction afterwards. It's my second time going through but not that different from the first.

Using headlines, pull quotes, and pictures as the things I'm aiming for, I got there in 30 s. I think your comment would have taken longer for me to have written than 30 s.

https://gfycat.com/rashwholeindianjackal

mediaman · on Aug 20, 2020

There's a psychological reward to having other people read what you write, and possibly getting upvotes.

This reward exceeds the reward from learning the actual information.

Many platforms are plagued by people more interested in being heard than learning facts.

haberman · on Aug 20, 2020

> When we polled drivers about their preference between remaining an independent contractor or becoming an employee, an overwhelming number of drivers wanted to remain independent.

> "What type of employment relationship would you like to have with rideshare companies"

> (A) I don't know the difference: 3.3%

> (B) I'd like to be an employee: 20.8%

> (C) I'd like to be an independent contractor: 75.9%

https://therideshareguy.com/california-sues-uber-and-lyft-fo...

kelnos · on Aug 20, 2020

Is 734 out of a population of 80,000 a representative sample? And even if it is, are the 80,000 people on their mailing list a representative sample of all drivers?

Not saying they aren't, but I... just don't know.

notafraudster · on Aug 20, 2020

"Representativeness" has to do with the process of sampling, not the sample size -- a sample of one is in fact a representative sample of the population in the sense of being unbiased for the quantity of interest.

Here's a simple anecdote: suppose you want to measure how often a coin comes up heads. The true answer is 50% heads. If your "sample" is a single coin flip, the answer will always be 100% or 0% (both wrong answers!). Maybe you do this experiment and get 100% and I do it and get 0%. But since the magnitude of the error will be the same on either side, on average across many repetitions of the experiment (this is called a "sampling distribution") we'll get the right answer.

What adding additional sample size does is reduce the variance of the estimated statistic -- that is to say reduce the degree to which the estimate of the parameter moves around across samples. If I flip the coin 100 times and you flip the coin 100 times, we're both likely to get very close answers to one another.

The bigger concern here is not sample size, it's whether the sampling was random (it was not) and whether the sample frame -- the population from which they were sampling -- matches the population of interest (it does not, as you suggest in your post, so your instincts here are good!).

There is very little reason to believe the people who chose to reply to the email are as-if random with respect to the question being asked. Rather, I would expect diehards of RideShareGuy (who likely converge on RSG's approximate editorial position on this issue) are more likely to reply. There is also likely to be confounding based on age, hours worked per week, geographical location in the country, etc.

There is also very little reason to believe RideShareGuy's mailing list represents rideshare drivers as a whole; again, selection based on age, tech savvy, English competency, SES, geographic location, etc. all likely to be confounders.

If this were a classical random sample of a valid sample frame, the parameter of interest would have a classical margin of error +- 3.6%, which is small compared to the overall story being told. This speaks to your concern. A simple rule of thumb is that classical MOEs are +- 1/sqrt(n) where n is the sample size. This comment is too long so I won't get into the derivation here.

I actually think this question presents a lot of interesting problems for a survey statistician. In particular, I would guess there is extreme subgroup heterogeneity -- that is to say there are classes of people who overwhelmingly want to be contractors and classes of people who overwhelmingly want to be employees. My guess would be that the population-wide parameter is of little interest compared to identifying those groups. If we discovered that, say, every person above 40 hours a week wanted to be an employee and every person below wanted to be a contractor, it'd be an error to present a weighted average of the groups versus exploring policy solutions that reflect that heterogeneity.

kelnos · on Sept 3, 2020

Thanks for the detailed dive into this! I found it really interesting.

> The bigger concern here is not sample size, it's whether the sampling was random

This confuses me a little. Aren't both of those things pretty important? To go back to your first sentence, a sample size of one could be perfectly random, but it will be pretty useless at telling us anything interesting about the population. Obviously that's a silly case, but what about 10 people? 100? I would venture to guess that a random sample of 100 out of 80,000 people would be unlikely to tell you anything useful about those 80,000 people, at least not without a margin of error much higher than the effect being measured. So that suggests to me that sample size is pretty important too.

> If we discovered that, say, every person above 40 hours a week wanted to be an employee and every person below wanted to be a contractor, it'd be an error to present a weighted average of the groups versus exploring policy solutions that reflect that heterogeneity.

That's a really good point. It's possible, and likely, that any single chosen classification is going to make a lot of people unhappy. Better might be to develop new rules and classes that fit the situation and different drivers' needs better.

notafraudster · on Sept 16, 2020

Sorry I missed your reply here, but on the off-change you get this: a random sample of 10 people is unbiased for the entire population. A random sample of 1 person is unbiased for the entire population. The limitation of very low sample size is that it is going to produce high variance; on average it will be correct, but across multiple surveys our estimates will move around a lot. We can define a margin of error around our estimate to formally account for this. If what we're interested in is a binary yes/no answer, using the classical calculations and without trying to add more complexity for you, the margin of error is +/- 1/sqrt(n), so a sample of 100 people gets you in a qualitative ballpark of whether a property is very common, common, rare, or very rare. If you want to tell whether 69% and 71% reflect statistically different underlying propensities, you will need a much bigger sample. But if you just need to know "does there exist a decent swath of people who say yes to this?", a very small sample will do on average.

The full population size is generally irrelevant statistically (it applies to a "finite population correction"). Almost all the statistics you see presented publicly ignore FPCs by assuming the population is infinitely larger than the sample. Even when people take census reads of a full population they typically assume the population is a realized sample of a larger meta-population.

In general if you are seeing publicly presented poll data, sample size is not the thing that should be sending up red flags. Differential non-response and selection bias; sample frame matching or not matching the population; underlying noise in the conceptual measure; design effects caused by weighting; and many other components are all part of "TSE" (total survey error), which dwarfs the impact of sampling variation as a concern.

Now, mind you, subgroup analyses often do cut samples too finely... e.g., an N=500 sample subset to Men 65+ and people trying to make inferences but not properly reflecting either the sampling considerations or reduced subgroup sample size).

Let it also be noted from those classical calculations that larger samples are diminishing returns: N = 1000 -> 3.1% -> N = 5000 -> 1.4%. Five times the cost of surveying if not more, only a halving of the margin of error, and limited extra power to answer most real questions of interest.

LegitShady · on Aug 20, 2020

The secret is that your status has nothing to do with what you want but the terms of employment. It doesn't matter if 100% of them want to be contractors the judge just ruled it doesn't count as contractors by law.

If Uber wants contractors they need to make sure the terms of their employment follow the law, including classification.