
I recently tested MTurk for my startup. We set up about 500 HITs to collect the website URL and email address of various businesses. We set the price at $0.05 (Amazon takes an additional $0.01). Jobs quickly got started and within 24 hours we had all of our data collected.

I'm not sure I would do it again though. A lot of the businesses we were targeting don't have a web presence, so "No URL/No Email" was a valid answer. However, when I went through the list and saw 150 "No URL/No Email" answers, I couldn't tell whether each one was true or whether the Turker had realized they could just copy/paste it and make a quick buck. Amazon does report the amount of time spent on each task, so I rejected any that took less than 10 seconds, as I felt they hadn't given it a good enough try. Above that, I just accepted the answer, realizing that it might be false.
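A rough sketch of the 10-second cutoff described above, in Python. The record fields (`business`, `answer`, `work_time_seconds`) and the sample rows are made up for illustration and are not the actual MTurk result format.

```python
MIN_SECONDS = 10  # cutoff mentioned in the comment above


def split_by_work_time(assignments, min_seconds=MIN_SECONDS):
    """Split assignments into (accepted, rejected) by time spent on the task."""
    accepted, rejected = [], []
    for assignment in assignments:
        if assignment["work_time_seconds"] < min_seconds:
            rejected.append(assignment)
        else:
            accepted.append(assignment)
    return accepted, rejected


# Hypothetical sample data, not real results.
sample = [
    {"business": "Acme Bakery", "answer": "No URL/No Email", "work_time_seconds": 6},
    {"business": "Bolt Hardware", "answer": "No URL/No Email", "work_time_seconds": 48},
    {"business": "Cora's Cafe", "answer": "http://example.com", "work_time_seconds": 10},
]

accepted, rejected = split_by_work_time(sample)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Note that a time cutoff only catches the fastest copy/paste answers. A slow "No URL/No Email" still passes, which is the weakness pointed out above.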

In the end I think I spent more time going through results and correcting them than it actually saved me. I'm excited to use MTurk again in the future, but only for appropriate projects.




I think you probably needed some redundancy: have each business checked at least twice.
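One way the redundancy idea could look in Python, as a minimal sketch: collect two or more answers per business, accept an answer only when enough workers agree, and flag the rest for manual review. The function name, the `min_agreement` parameter, and the sample data are all hypothetical.

```python
from collections import Counter


def resolve(answers, min_agreement=2):
    """Return the agreed answer, or None if no answer reaches min_agreement."""
    if not answers:
        return None
    # Normalize lightly so trivial differences in case/whitespace still match.
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer if votes >= min_agreement else None


# Hypothetical responses: each business was sent to two workers.
responses = {
    "Acme Bakery": ["No URL/No Email", "no url/no email"],
    "Bolt Hardware": ["No URL/No Email", "http://example.com"],
}

resolved = {business: resolve(answers) for business, answers in responses.items()}
needs_review = [business for business, result in resolved.items() if result is None]
print(needs_review)
```

Doubling up roughly doubles the cost per business, so it may make sense to do it only for the answers you trust least, such as "No URL/No Email".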


are there any tools that help you do QA on MT data entry in a Bayesian way?

seems like the situation is ripe for such a tool

---------

start with a subset of questions you have a predetermined answer to, and only keep feeding questions if the person responding has met a certain quality threshold on those questions

every so often feed them a QA question

every so often send the same question to someone else to check it for redundancy

seems like there is a lot you could do to adjust these based on Bayesian confidence intervals, and exactly how mission critical you need certain data to be
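to make that concrete, here is a minimal sketch of one possible approach, not an existing tool: treat each worker's accuracy on the known-answer questions as a Beta-Bernoulli model, and only keep feeding them questions while a lower bound on their accuracy stays above a threshold. The class name, the uniform prior, the 5% quantile, and the 0.8 threshold are all assumptions picked for illustration.

```python
import random


class WorkerQuality:
    """Beta-Bernoulli estimate of a worker's accuracy on known-answer questions."""

    def __init__(self, prior_correct=1.0, prior_wrong=1.0):
        # Beta(1, 1) is a uniform prior: no opinion before any question is graded.
        self.correct = prior_correct
        self.wrong = prior_wrong

    def record(self, was_correct):
        if was_correct:
            self.correct += 1
        else:
            self.wrong += 1

    def mean(self):
        """Posterior mean of the worker's accuracy."""
        return self.correct / (self.correct + self.wrong)

    def lower_bound(self, quantile=0.05, draws=20000, seed=0):
        """Monte Carlo estimate of a posterior quantile of the accuracy."""
        rng = random.Random(seed)
        samples = sorted(
            rng.betavariate(self.correct, self.wrong) for _ in range(draws)
        )
        return samples[int(quantile * draws)]


def keep_feeding(worker, min_accuracy=0.8):
    """Keep sending work only if we are fairly sure accuracy beats the threshold."""
    return worker.lower_bound() >= min_accuracy


# Hypothetical workers: 9 of 10 correct versus 48 of 50 correct.
newcomer = WorkerQuality()
for outcome in [True] * 9 + [False]:
    newcomer.record(outcome)

veteran = WorkerQuality()
for outcome in [True] * 48 + [False] * 2:
    veteran.record(outcome)

print(keep_feeding(newcomer), keep_feeding(veteran))
```

the point of using a lower bound instead of the raw hit rate is that 9/10 and 48/50 look similar as percentages, but there is much less evidence behind the first one. how strict the threshold is would depend on how mission critical the data is.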

maybe something like that already exists, idk

------------- (edit: is this what the Scale API does?)


Yup. My old company (www.crowdflower.com) has a platform set up to do this.


Please continue, this is a highly valuable comment.


You are definitely correct. This was just our first kick at the can. It was only $30 or so to test out the service. We learned a few things that we'll apply to future jobs (if we do it again), such as telling the Turker what to do if they don't find a website/email for the business.


Sorry to hear about that experience. You should consider using Scale API (www.scaleapi.com). This is a perfect use case for our data collection API: https://docs.scaleapi.com/#create-data-collection-task


Why does the "Scale" logo on this page not link to your home page? From that API documentation page I have no context about what Scale API is or what it does. The "Introduction" page isn't even that helpful if I've never heard of you before.

All that said, Scale API seems like a nice alternative to Mechanical Turk for some kinds of tasks.


It's a fair point. Lots of documentation pages or company blogs don't link to the company's main website, and it's super annoying.


We're working on redesigning and cleaning up all of our pages! Sorry about that


Very interesting. I will keep this in mind since we do have another 10,000 or so jobs that we'll need completed in the next while.

The pricing page isn't totally helpful. I'm not sure if I would fall under `Comparison $0.10` or `Data Collection $0.25/minute`... or if many of those prices apply to a single job.

Either way, I'll keep your product in mind.


Tbh I found scaleapi to be quite expensive, much more than implementing your own QA process on Turk. Also if scaleapi goes down you have to do the work all over again. Scaleapi is nothing new, there have been others before but they were gone as quickly as they came, and suddenly all the API code stopped working.

It was a bad enough experience that I'd never base my mission critical product on a 3rd party startup. Amazon could go bad too, but with a small startup it's almost expected.


Cool! We're working on a redesign of the pricing page—stay tuned :)



