I recently tested MTurk for my startup. We set up about 500 HITs to collect webs...

fenwick67 · on Feb 1, 2017

I think you probably needed some redundancy: have each business checked at least twice.
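A rough sketch of that redundancy check, assuming each business is sent to two or more workers and an answer only counts when enough of them agree. The function name `consensus` and the `min_agree` parameter are made up for illustration; this is not part of any MTurk API.

```python
from collections import Counter


def consensus(answers, min_agree=2):
    """Return the answer given by at least `min_agree` workers, else None.

    `answers` is a list of strings submitted by different workers for the
    same business. Answers are compared case-insensitively with surrounding
    whitespace removed.
    """
    if not answers:
        return None
    normalized = [a.strip().lower() for a in answers]
    value, count = Counter(normalized).most_common(1)[0]
    return value if count >= min_agree else None


# Two workers agree (ignoring case and stray spaces): accept the answer.
agreed = consensus(["Example.com", "example.com "])
# Two workers disagree: no consensus, so the item would need another pass.
disputed = consensus(["a.example", "b.example"])
```

Items that come back as `None` could be sent to a third worker rather than thrown away.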

pitt1980 · on Feb 1, 2017

are there any tools that help you do QA on MT data entry in a Bayesian way?

seems like the situation is ripe for such a tool

---------

start with a subset of questions you have a predetermined answer to, only keep feeding questions if the person responding has met a certain quality threshold on those questions

every so often feed them a QA question

every so often send the same question to someone else to check it for redundancy

seems like there is a lot you could do to adjust these based on Bayesian confidence intervals, and exactly how mission critical you need certain data to be

maybe something like that already exists, idk
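a minimal sketch of the idea above, assuming a uniform Beta(1, 1) prior on each worker's accuracy and a tally of how many known-answer ("gold") questions they got right or wrong. every name and threshold here (`prob_accuracy_above`, `next_action`, 0.8, 0.9, `min_gold`) is an assumption for illustration, not how any existing platform does it:

```python
import math


def prob_accuracy_above(correct, wrong, threshold,
                        prior_a=1.0, prior_b=1.0, steps=10_000):
    """Posterior probability that a worker's true accuracy exceeds `threshold`.

    With a Beta(prior_a, prior_b) prior and `correct`/`wrong` gold answers,
    the posterior is Beta(prior_a + correct, prior_b + wrong). The tail mass
    above `threshold` is computed by midpoint-rule numerical integration.
    """
    a = prior_a + correct
    b = prior_b + wrong
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    width = (1.0 - threshold) / steps
    total = 0.0
    for i in range(steps):
        p = threshold + (i + 0.5) * width  # midpoint, strictly inside (0, 1)
        total += math.exp(
            log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p)
        )
    return total * width


def next_action(correct, wrong, threshold=0.8, confidence=0.9, min_gold=5):
    """Decide what to do with a worker based on their gold-question record."""
    if correct + wrong < min_gold:
        return "send_gold"   # not enough evidence yet: feed a QA question
    p_good = prob_accuracy_above(correct, wrong, threshold)
    if p_good >= confidence:
        return "trust"       # keep feeding real questions
    if p_good <= 1.0 - confidence:
        return "drop"        # very likely below the quality threshold
    return "recheck"         # uncertain: send their answers to someone else
```

for more mission-critical data you would raise `threshold` or `confidence`, which pushes more workers into the "recheck" bucket and so costs more in redundant HITs.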

------------- (edit: is this what the Scale API does?)

doleson · on Feb 2, 2017

Yup. My old company (www.crowdflower.com) has a platform set up to do this.

brilliantcode · on Feb 1, 2017

Please continue, this is a highly valuable comment.

giarc · on Feb 1, 2017

You are definitely correct. This was just our first kick at the can. It was only $30 or so to test out the service. We learned a few things that we'll apply to future jobs (if we do it again), such as telling the Turker what to do if they don't find a website/email for the business.

ayw · on Feb 1, 2017

Sorry to hear about that experience. You should consider using Scale API (www.scaleapi.com). This is a perfect use case for our data collection API: https://docs.scaleapi.com/#create-data-collection-task

Arcsech · on Feb 1, 2017

Why does the "Scale" logo on this page not link to your home page? From that API documentation page I have no context about what Scale API is or what it does. The "Introduction" page isn't even that helpful if I've never heard of you before.

All that said, Scale API seems like a nice alternative to Mechanical Turk for some kinds of tasks.

freehunter · on Feb 1, 2017

It's a fair point. Lots of documentation pages or company blogs don't link to the company's main website, and it's super annoying.

ayw · on Feb 1, 2017

We're working on redesigning and cleaning up all of our pages! Sorry about that.

giarc · on Feb 1, 2017

Very interesting. I will keep this in mind since we do have another 10,000 or so jobs that we'll need completed in the next while.

The pricing page isn't totally helpful. I'm not sure if I would fall under `Comparison $0.10` or `Data Collection $0.25/minute`... or if many of those prices apply to a single job.

Either way, I'll keep your product in mind.

brilliantcode · on Feb 1, 2017

Tbh I found scaleapi to be quite expensive, much more than implementing your own QA process on Turk. Also if scaleapi goes down you have to do the work all over again. Scaleapi is nothing new, there have been others before but they were gone as quickly as they came, and suddenly all the API code stopped working.

It was a bad enough experience that I'd never base my mission-critical product on a 3rd party startup. Amazon could go bad too, but with a small startup it's almost expected.

ayw · on Feb 1, 2017

Cool! We're working on a redesign of the pricing page—stay tuned :)