
I've often thought about setting up a scraper for sites like this and republishing the deleted content.



You would be sued, probably successfully, into oblivion. The terms of service [1] for Glassdoor are pretty clear that you agree to not:

Introduce software or automated agents to Glassdoor, or access Glassdoor so as to produce multiple accounts, generate automated messages, or to scrape, strip or mine data from Glassdoor without our express written permission;

Copy, modify or create derivative works of Glassdoor or any Content (excluding Your Content) without our express written permission;

Copy or use the information, Content (excluding Your Content), or data on Glassdoor in connection with a competitive service, as determined by Glassdoor;

Interfere with, disrupt, or create an undue burden on Glassdoor or the networks or services connected to Glassdoor;

Now, since reviews are public information and not hidden behind an authentication wall (i.e. no account signup required), there is a compelling argument that you can scrape the website without having technically agreed to those terms and conditions. In particular, the robots.txt [2] for Glassdoor does not disallow automated crawling of the /Reviews/ endpoint. But this would still likely result in a protracted lawsuit, even if you ultimately win.

________________________________________

1. https://www.glassdoor.com/about/terms.htm

2. https://www.glassdoor.com/robots.txt
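For anyone who wants to check the robots.txt point themselves, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The rules in `SAMPLE_RULES`, the `example-bot` user agent, and the `example.com` URLs are all made up for illustration; they are not Glassdoor's actual robots.txt, which you would need to fetch and read yourself. Note that this only tells you what the file permits for crawlers. It says nothing about the terms of service or your legal exposure.

```python
from urllib.robotparser import RobotFileParser


def is_path_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only; NOT any real site's robots.txt.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    # /Reviews/ is not listed under Disallow in the sample, so it is permitted.
    print(is_path_allowed(SAMPLE_RULES, "example-bot", "https://example.com/Reviews/acme"))
    # /private/ is explicitly disallowed in the sample.
    print(is_path_allowed(SAMPLE_RULES, "example-bot", "https://example.com/private/x"))
```

To test a real site, you would paste the contents of its robots.txt into the first argument instead of the sample string.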


Didn't LinkedIn just lose a court case over very similar anti-scraping terms? Offering something publicly on the internet, then attempting to block certain uses through your ToS, seems like Glassdoor giving something away and then trying to take it back...


Hence my second paragraph, yes. You’d definitely be sued, and it would definitely be expensive. You might not lose.


If you are sued by someone and win in the US, do you not get your legal expenses covered by the entity suing you?

Also, you can probably just set up the scraper and webpage in another country or something.


> If you are sued by someone and win in the US, do you not get your legal expenses covered by the entity suing you?

Often yes, but you actually have to win the case first. The idea is that you run out of money and give up before you get there.


Recovering legal fees is not automatic just because you're innocent.


You still have to pay the legal fees up front, then hopefully get reimbursed later.


One could still go to archive.org. There are quite a lot of snapshots there. Indeed, they are not interfering with Glassdoor per se.
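If you want to look up snapshots programmatically, the Wayback Machine has an availability endpoint that returns the closest archived copy of a URL as JSON. A rough sketch, assuming that endpoint and its `archived_snapshots` / `closest` response layout; the helper names and the example page URL are made up for illustration, and the code only builds the query and reads a response rather than making the request:

```python
from typing import Optional
from urllib.parse import urlencode

# Wayback Machine availability endpoint (the response layout is assumed below).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build the query URL asking for the snapshot closest to timestamp (YYYYMMDD)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Extract the snapshot URL from a decoded JSON response, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


if __name__ == "__main__":
    # Hypothetical page URL, used only to show the shape of the query.
    print(availability_query("example.com/Reviews", "20200101"))
```

You would fetch the printed URL with any HTTP client, decode the JSON, and pass the result to `closest_snapshot`.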



