Hacker News: themcgruff's comments

Most, if not all, production MySQL installations have this cache disabled anyway. (Based on my experience at Engine Yard for 3 years and at other places over the last 5 years.)


I did not realize that. Interesting. Thanks!


Why?


Also, the MySQL query cache is not very granular: the cached results for a table are cleared on every write to that table. In practice we've seen MySQL perform worse with it enabled, because it's constantly being flushed. The overhead of maintaining it doesn't pay off.

Of course it's a trade-off. If you have substantially more reads than writes, then it might be OK for your needs.
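
To make the trade-off concrete, here is a toy model of a cache that drops every cached result for a table whenever that table is written. It is only a sketch of the behaviour described above, not MySQL's actual implementation; the class, the function names and the query string are all made up for illustration.

```python
class TableLevelQueryCache:
    """Toy model: cached results are discarded whenever their table is written."""

    def __init__(self):
        self._results = {}  # (table, query) -> cached rows
        self.hits = 0
        self.misses = 0

    def read(self, table, query, run_query):
        key = (table, query)
        if key in self._results:
            self.hits += 1
            return self._results[key]
        self.misses += 1
        rows = run_query()
        self._results[key] = rows
        return rows

    def write(self, table):
        # Coarse invalidation: every cached result for this table goes away.
        for key in [k for k in self._results if k[0] == table]:
            del self._results[key]


def hit_rate(reads_per_write, total_writes=100):
    """Fraction of reads served from cache for a fixed read/write mix."""
    cache = TableLevelQueryCache()
    for _ in range(total_writes):
        for _ in range(reads_per_write):
            cache.read("users", "SELECT COUNT(*) FROM users", lambda: [(42,)])
        cache.write("users")
    return cache.hits / (cache.hits + cache.misses)
```

With one read per write the cache never hits, so all the bookkeeping is pure overhead; with ten reads per write, nine out of ten reads are served from cache. The model ignores the locking cost mentioned elsewhere in this thread.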


Mutex issues and other problems with high insert / update rates. Google "mysql query cache issues" and "mysql query cache disable" for lots of posts about the various costs / benefits.


There are two of us at 37signals building projects on top of the HttpLua / ngx_lua module. The OpenResty (http://www.openresty.org) project is absolutely worth a look if you are willing to live on the edge and you want incredibly fast performance. I can't say enough good things about the work agentzh and chaoslawful have been up to lately -- just check out their GitHub profiles.


I really like Lua and have been looking at getting my hands dirty with this. Sounds like I have my weekend planned then!


Yep, awesome project. I have used this to make a mini-Heroku. The only problem is that it is HTTP 1.0 and not 1.1.


If you're not living on the edge, you're taking up too much space.


Your comment reminds me that we left out a biggie: We include a "stop sending me these messages" link in almost every email we send. The link actually works too.


I think ceejayoz was saying some users are lazy or don't remember signing up for your service, so they'll click the "Report as Spam" button as a quick way to "unsubscribe".


Or some users assume that by clicking the "unsubscribe" link you are actually confirming the email is read and might get even more spam.


That's probably because that's how it used to work. Remember the days before the CAN-SPAM laws?


What makes you believe it changed?


For some mailing lists you find yourself on, that's actually the case; removing yourself is next to impossible...


I know this wasn't clear from the post... We include a "fake" image (with a unique hash for that specific message) in all HTML email so we know if it's "really been delivered" (i.e. it was actually opened).


And you get a 99.3% return rate on your image? That seems highly unlikely given that many (most?) mail clients do not display images by default.


Copied from Noah's comment on the post (I asked him first):

"We do track open rates for emails that are already HTML formatted and making remote requests for images, but you’ll never get 100% accuracy with that metric because many people use plain text emails or don’t load images. Our experience is that the best you’ll ever see is between 60-70% “open” rate because of this. Some of our applications in some contexts only send plain text emails as well, so we don’t track open rate there at all.

Why is remote server acceptance rate important anyway? Some thoughts:

1) First, because hard bounces really do happen a lot, and at our scale, a 1% difference in hard bounce rate means 160k messages per week that aren’t making it to users, which means a poor experience for many and many support requests coming in to us. Based on all the information we’ve been able to find, we’re pretty sure a 0.7% hard bounce rate at our scale is pretty good.

2) Second, because it is a relative metric of overall deliverability. Our experience has been that when we do get on a blacklist, servers start hard bouncing our mail until we get off of it. As we’ve improved our SpamAssassin type scores over the last few years, we’ve seen an improvement in hard bounce rate. We also see a strong correlation between hard bounce rate and the number of email delivery related support requests we get. While it’s not the perfect measure, it’s the best measure we have available that we can reliably monitor.

Again, there’s no perfect way that I know of to reliably tell whether an email is getting to a user, since read status isn’t particularly accurate. We use whatever we can (hard bounce rate, open rate, number of support requests relating to email) to get as close to that as we can."


I would be interested in knowing what their open rate is. As was mentioned in another comment, most email subscribers don't turn on images, which makes the open rate you get from tracking very low.


FWIW we haven't done any "paid" whitelisting. That's too mafia-esque for us.

SES requires each sending address to be verified upfront. For us that's an issue.


Why is it an issue to verify sending addresses? Your setup now requires that you control the domain as well.


You don't have to pay to get whitelisted with an ISP. You have to have good IP reputation then apply.


Many ISPs are turning to Return Path to manage their whitelists, and that is indeed a daunting financial outlay.


Which big ones use that as their whitelist?


Cox, Comcast, Roadrunner, etc. Most of the large ones besides Gmail.


Thanks - that's why I use SendGrid & Mailchimp. Don't have time to care about these things.


Don't know why you're being downvoted; any business <$1M annual revenue that has to send loads of e-mail shouldn't be wasting time rolling their own solutions


We do more than that, but only send a few hundred emails per day outside of our internal Google Apps accounts (using SendGrid) and then blast a newsletter once a week via Mailchimp.


It's not worth outsourcing. We just add a tiny GIF that has a unique hash. When that file is requested, we record the hash and mark the identified message as read.
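
For anyone who wants to try this, a minimal sketch of the hashed-image idea might look like the following. This is not the 37signals code: the signing key, the `example.com` URL layout and every name here are assumptions, and serving the actual image bytes is left out.

```python
import hashlib
import hmac

SECRET = b"replace-me"  # hypothetical signing key, not from the thread


def open_token(message_id):
    """Derive a hard-to-guess token that is unique to one message."""
    digest = hmac.new(SECRET, message_id.encode(), hashlib.sha256).hexdigest()
    return digest[:32]


class OpenTracker:
    def __init__(self):
        self._by_token = {}  # token -> message id
        self.opened = set()  # message ids whose image was requested

    def pixel_tag(self, message_id):
        """Return the <img> tag to embed in the HTML body of one message."""
        token = open_token(message_id)
        self._by_token[token] = message_id
        # example.com and the /open/ path are placeholders.
        return (f'<img src="https://example.com/open/{token}.gif" '
                'width="1" height="1" alt="">')

    def handle_request(self, path):
        """Call when the image URL is fetched; marks the message as read."""
        token = path.rsplit("/", 1)[-1]
        if token.endswith(".gif"):
            token = token[:-4]
        message_id = self._by_token.get(token)
        if message_id is not None:
            self.opened.add(message_id)
        return message_id
```

As noted elsewhere in this thread, this only counts recipients whose mail client loads remote images, so it undercounts real opens.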


Seems like most e-mail clients these days don't load remote images automatically - do you find that significantly affects this technique?


The hashed image is the same method the big boys would use to record open rate.


Ok, but surely they have the same problem then?


Thanks - is there a library I can use? Wouldn't want to reinvent the wheel


I'll talk with Noah about open sourcing it. Right now it would take a little bit of effort, but it's probably worth it for the good of the community.


That should read we "check a repo". We don't actually do a git checkout.


At 37signals:

We send 40M - 100M emails from our applications each month. We run 3 outbound relays on top of Postfix. We sign about 95% of our outbound mail with Domain Keys using dkim-filter. We also use SPF records for all of our sending IPs / domains. We monitor our sending IP reputation via Nagios checks and manual review. We also have checks set up for the most common blacklists. Every email that goes out from our applications has an "opt out" link. (This is really important!)

We used to send all of our mail via one IP address. That meant any given spam report could really hurt our deliverability. Today we send from about 10+ IP addresses, divided amongst the apps. If the reputation of one IP goes down because of a false report, we just decrease the volume on that IP until the reputation goes back up.

For our "company" mail we use Google Apps. For our "marketing" mail we use Mailchimp.
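
One way to picture the "decrease the volume on that IP" step is as a weighted split of outbound volume. The sketch below is purely illustrative: the 0-100 reputation score, the threshold and the throttle factor are assumptions, not the values or the mechanism 37signals uses.

```python
def allocate_volume(total_messages, reputation,
                    healthy_score=80, throttle_factor=0.25):
    """Split outbound volume across sending IPs, throttling low-reputation ones.

    `reputation` maps IP -> score on a hypothetical 0-100 scale.
    """
    weights = {
        ip: (1.0 if score >= healthy_score else throttle_factor)
        for ip, score in reputation.items()
    }
    total_weight = sum(weights.values())
    return {
        ip: round(total_messages * weight / total_weight)
        for ip, weight in weights.items()
    }
```

With three IPs where one has dropped below the threshold, that IP gets a quarter of a healthy IP's share until its score recovers.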


Interesting. Could you elaborate on the Nagios reputation monitoring? I'm wondering what sources you use for this. Thanks.


We use this script: https://raw.github.com/37signals/37s_cookbooks/edefbd17eeb8f... to check spamhaus, abuseat, sorbs, etc.
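
The linked script isn't reproduced here, but the general shape of a DNS blacklist check is simple: reverse the octets of the sending IP, append the blacklist's zone, and see whether the name resolves. A rough Python sketch follows; the zone names and the handling of return codes are assumptions for illustration.

```python
import ipaddress
import socket

# Assumed zone names for the lists mentioned above; check each list's docs.
DNSBL_ZONES = ["zen.spamhaus.org", "cbl.abuseat.org", "dnsbl.sorbs.net"]


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def listed_zones(ip, zones=DNSBL_ZONES):
    """Return the zones that appear to list `ip` (performs live DNS lookups)."""
    hits = []
    for zone in zones:
        try:
            answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
        except socket.gaierror:
            continue  # no record or lookup failure: treated as not listed
        # Listings come back as 127.0.0.x style addresses; some lists use
        # 127.255.255.x to signal a query error rather than a listing.
        if answer.startswith("127.") and not answer.startswith("127.255.255."):
            hits.append(zone)
    return hits
```

A Nagios plugin built on this would exit with status 2 (CRITICAL) whenever `listed_zones` returns a non-empty list.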


You meant dkim-milter, right?

That's great info btw, that's not far from what I've currently got but mailman instead of Mailchimp and I've only got the one IP so when someone hits the spam button it's affecting all types of mail until it recovers.


Yes. Sorry, I used the package name (dkim-filter). It's a milter though.


The file was named cat.jpg and that was logged, which was what we saw. We do not look at user’s files.


Sorry, but that is just a huge blunder. I can see from your comment that you think it's no big deal, but I read that item and immediately blacklisted 37Signals as a vendor that looks at customer files.

Your explanation makes it worse, not better; you shouldn't even be looking at filenames.


While it was stupid of them to publish this, you do realize that engineers working on cloud storage services have access to user data, don't you? However restricted it is, there are always people who have to debug this last mile and look at things, including actual user data, if something is not working on the live site.


Yes, I realize that. In my opinion, there's no qualitative difference between employees of AWS peeking into my data or employees that I hired peeking into my data. It's about trust in the end.

Anyone who has set up company email knows this. A lot of people think that having an in-house team manage a dedicated, on-premises mail server is somehow "better" or "more secure" than hiring Google or Microsoft or Ma&Pa Exchange Hosting to do it. Those people either: (1) have a reason to trust their employees that they don't have when it comes to Google/Microsoft/Ma&Pa, or (2) are living in a fantasy of their own delusions.


How do you know that other vendors don't look at your data? Really, what assurance do you have, other than that they don't casually mention doing so?


You should have emailed the user to say that their picture was #100'000'000 and asked whether they would give you permission to look at the picture and feature it in the blog post. That would have been the ethical way to do it.


Terribly bad judgment to post that. Like apparently numerous others, that bit caught my eye and made me pause and reflect on the downside of SaaS.

Even looking at the filename seems pretty suspect, as an aside. What if the filename was BankruptcyPreparation.docx, or TerminationOfBobDobbs.pdf, etc? The metadata about a file should be confidential as well.


As DHH commented on the post: The file name was only shared because it was funny and not identifiable with anything, and just seen passing through the logs. We do not look at people’s files and we do not share any personal information of any kind.

Had it been BankruptcyPreparation.docx, or any other sensitive file name I would never have shared it. Going forward I won't be sharing dog.gif either.


You're going to get flak for this because you've revealed a very small PR hole in an otherwise great company with great products.

I'll continue to use your products. Even if you were to post BankruptcyPreparation.docx. Here's why: You're not a bunch of fucking idiots.

If 37signals was a mish-mash of security issues and red flags, ya, I might be a bit pissed to see BankruptcyPreparation.docx mentioned.

Some of us would love to hear that 37signals fucked up large and did something imperfect. Not because of narcissism or blind hate, but just because it makes their success digestible, makes them human.


Passwords. Plaintext.

You're welcome.


I won't comment on whether or not it was wise of them to post that information, except to say that plenty of other services have posted much more revealing data without backlash of any kind.

What I will question, however, is the assertion that looking at user filenames is suspect. That's easily fair game, and to claim otherwise is as ridiculous as claiming your dentist has no right to see your dental records, or your bank shouldn't know how much money is in your account.

If you're that protective of your data, then it's up to you to make wiser decisions. For starters, don't name your files SuperSecretPrivateInfo.doc and then give them to other people to store. Take a look at their extremely readable privacy policy. Send them an email with questions. If you care so much, take action and stop blaming other people for your own laziness.


"to claim otherwise is as ridiculous as claiming your dentist has no right to see your dental records, or your bank shouldn't know how much money is in your account."

The issue is not that they looked at things, the issue is that they chose to tell the world about it.


Are you serious? They merely stated that one of their MILLIONS of customers uploaded a picture of a cat. There was zero identifying information there. How can you call that an issue?

Really: What is it here that makes you so upset? Would you be concerned if your dentist told you, "I filled a cavity last year"?


I am not upset at all; I merely point out that, IMO, they should not have disclosed any of their clients' data, no matter how small.

IMO, there is customer data on their servers that they should not disclose without the consent of their customer. If so, the moment you allow a service provider to expose some information without such consent, you are accepting the fact that there is a border (no matter how vaguely defined) between 'OK to disclose' and 'not OK to disclose' data, and that it is up to the service provider to decide where that border lies.

Because of that, I think a provider should not disclose any information about their clients, no matter how tiny, even if the information cannot be traced to any particular user, unless their terms of service clearly state what they will disclose (or sell to third parties)

(And yes, I _do_ read terms of service)


The debate is not over them posting the filename, it is over the fact that it is exposed to them.


Uploaded data is always exposed, unless the user encrypts it before sending and doesn't give the decryption key to the company, which of course is impractical for most people. If you think that doesn't happen, then you are naive - so yes, if you talk about your sex life on your Gmail account, there's a chance some Google employee will see it.

We can go back and forth on this all day long, but these are the facts: (1) your online data is not safe, unless you encrypt it yourself and (2) in this instance, no user identifiable information was given.

At least 37signals never said that they don't have access to those files, like other companies would make you believe:

http://www.businessinsider.com/dropbox-updates-security-term...


> Uploaded data is always exposed, unless the user encrypts it before sending and doesn't give the decryption key to the company.

Fucking BINGO. I'm always shocked how many people on HN don't understand this, considering the high percentage of techies and programmers.


Plenty of other services have shown indiscretion about their clients' data. That doesn't validate this case, especially considering that many of us look to 37signals as essentially the poster boy of leading behaviors.

We expect more from them.

I am not trying to be argumentative, but I want to respond to a point you made, as I think it is critically important for many HNers running or aspiring to run SaaS solutions:

"If you're that protective of your data, then it's up to you to make wiser decisions. For starters, don't name your files SuperSecretPrivateInfo.doc and then give them to other people to store."

For real? I guarantee that 37signals would not sanction such a ridiculous statement. Most SaaS companies wouldn't touch such claims with a 40' pole.

The industry lives and breathes on the feeling that the data is confidential. We're currently looking at some hosted helpdesk ticket solutions, and I can tell you that if there was even the slightest hint that the vendors casually browsed our data we would rethink the whole adventure.


Cat.jpg from an unknown user of Basecamp. This tells us nothing about anyone. Get over it.


It tells us that they looked at customer data, and that's a really, really big deal to people who are doing serious business that involves private information that is: (1) regulated by government, and/or (2) has significant commercial value.

You can wave your arms and talk yourself blue in the face about your security protocols, but in the end it all comes down to trust. This kind of slip-up erodes that trust.


> It tells us that they looked at customer data

Have you ever supported a product that has external users? Eventually you have to see their data in some way, shape, or form. Whether it's a username, email address, IP address, user-agent string, or filename, there are times when troubleshooting, verifying functionality, or validating report data where you will have to look at some subset of actual customer data somewhere. It is simply unrealistic to think otherwise.

How would you go about providing customer support or auditing without looking at the customer data required to complete such tasks?

(edited to add quote)


And announcing the content of the one millionth file upload is serving the customer how?


>Have you ever supported a product that has external users?

And if an apartment dweller had a plumbing leak, the landlord would enter their apartment and fix the leak. They would access on a need basis. They wouldn't do casual sneak and peeks and then post analysis on the entry door that seven residents have bongs in their living rooms.

Seriously, though, mechanisms to deal with exceptional situations, such as customer support, have nothing to do with "looking because it made for a fun blog post".

I only engage in this conversation because this is important for many HNers -- spin all the justifications you want, or blame users (a good attitude that guarantees business failure), however this was a serious blunder that other businesses should look to avoid.

Even if you do casually trawl the data of your users, for the love of all things unholy don't talk about it.


Actually it only tells us that they looked at their own logs.

You could maybe classify that as looking at customer meta-data if you wanted to.

Though if you think that is bad, what do you think of when you call a bank and the person at the other end can see all of your private banking data? Or even when you call some other company and some random call centre employee has your whole transaction history in front of them. Shouldn't that worry you more?


All we know for sure is that (a) they counted the number of files, and (b) they read a filename. I'd be willing to bet they don't even know who the user is - in a properly normalized database, all they'd likely see is a user ID unless they explicitly have an interface that correlates user ID and file data with personally identifiable user information.

Perhaps saying, "it was a cat" was a poor idea if they only read the filename, but a lot of people are making a lot of assumptions here. If you're dealing with regulated or high-value data, you probably shouldn't be using 37signals without doing any kind of audit of their operating practices anyway.


Erodes what trust exactly?

You're assuming 37signals made an agreement ("we won't be able to see your filenames in our logs") that they intentionally never made, and then you're claiming they violated said non-existent agreement.

If you're a company with super-sensitive data then it's on YOU to choose the appropriate place to store that data. There are services made explicitly for that sort of thing. It's absolutely ridiculous and irresponsible to require a certain level of privacy features, sign up for a service that doesn't have those privacy features, and then act as if they violated your trust.


Actually, I agree with you. 37Signals is characteristically up-front in their security policy about not being an appropriate solution for sensitive data. That's why I don't use them for my own client work. I respect their decision not to take on that responsibility.

Having said that, their security policy does suggest that customers expect some level of privacy. It's just bad form to publicly announce that you've been poking around the content of customer data as part of a fun blog post about metrics.

Nobody is building classified weapons on Basecamp. If you say that Basecamp customers shouldn't expect any measure of privacy, I'll respectfully acknowledge that you might be correct about that.


> That's why I don't use them for my own client work. I respect their decision not to take on that responsibility

I respect that viewpoint immensely, and I wish more people would think like that.

I think the main issue here is that the words "privacy" and "security" are so vague as to be meaningless. Consequently, people have mapped their own personal definitions on to them. For example, different people may care about different subsets of the following:

- Do you use SSL when I transfer sensitive data to your servers?
- How secure is your database? What do you use to protect stored data from hackers?
- Do you share user data with third-parties? If so, do you anonymize it?
- Do you prevent your own employees from accessing my data?
- Do you have an SLA that guarantees I will be able to access my data the vast majority of the time?
- Do you have measures in place to ensure my data is never accidentally deleted?

Etc. The list could go on and on. If the customer is worried about any of the above (and sadly, most aren't), then she's in luck, because privacy policies and company email addresses are usually a click away on the web. I doubt I know anyone familiar with all the privacy and security processes employed by banks, doctors, financial aid offices, etc. Which is why it's confusing to see so many technically-savvy people up in arms about the name of a jpg file.


Right. Because no ISP sysadmin will ever keep some sendmail logs open and thus see that secretive_user@other_isp.com is sending mail to another_secretive_user@this_isp.com


>> I guarantee that 37signals would not sanction such a ridiculous statement. Most SaaS companies wouldn't touch such claims with a 40' pole. The industry lives and breathes on the feeling that the data is confidential. We're currently looking at some hosted helpdesk ticket solutions, and I can tell you that if there was even the slightest hint that the vendors casually browsed our data we would rethink the whole adventure.

This encapsulates exactly why I think your position is naive. You're confusing reality with what-companies-say-to-make-money. It's as if you're completely unaware of the concept of marketing. You seem to be in a position to make purchasing decisions for your company, so let me explain: marketing is a tool used to make money. Again: MARKETING IS A TOOL USED TO MAKE MONEY.

37signals knows that people want to feel like their data is confidential. So they plaster pictures of locks and words like "safe", "secure", "24-hour surveillance", and "biometric locks" on their signup page. This is called marketing. It creates a feeling -- and nothing more -- so that people like you will click "Buy". It's like the airline commercials that depict flights as being comfortable and quiet. It's like oil company commercials that talk about how great it is to be green. It's like McDonald's commercials that show thin and healthy people eating Big Macs. It's smoke and mirrors, and you're falling for it.

The reality is that you have a choice: belong to the 99% who simply want to buy into a "feeling", or belong to the 1% who read the privacy policy and ask questions.

But if you're going to choose to be in the 99%, then do us all a favor and stop complaining about it.


With all due respect, both of your responses have been completely obnoxious. You seem to be taking some unmerited grizzled vet position that might sell to children, but here it reads like a junior developer talking tough.

See, we actually sell software as a service. Data security for our clients isn't marketing, it is the absolute lifeblood of the company (just as it is a critical principle for this industry). 37signals knows that it was a foolish oversight to casually comment on content trawling, which is a good sign. Your ridiculous arguments in their favor do no one any good.


If you take issue with someone's argument, it's customary to point out the specific flaws and explain why you disagree. Simply slinging ad hominems and calling the argument ridiculous doesn't cut it. An example of how to respond:

>> See, we actually sell software as a service.

Great, I sell software as a service too. This is an irrelevant fact that doesn't make you or me any more or less correct.

>> Data security for our clients isn't marketing, it is the absolute lifeblood of the company

"Data security" is a vague term that has as many different definitions as there are people to talk about it. Your mistake is in assuming that your definition (in which seeing a file name in a log is unethical behavior and/or a security breach) is the one and only correct definition. Many many web developers disagree with your definition.

Secondly, security is marketing. You yourself have said, and I quote: "Even if you do casually trawl the data of your users, for the love of all things unholy don't talk about it." So not only do you recognize the importance of data security itself, but you recognize that even the appearance of data security (or lack thereof) can affect a company's bottom-line. From there, it should be easy to understand how companies use the appearance of security as a marketing technique.

>> 37signals knows that it was a foolish oversight to casually comment on content trawling

It seems to me that they've repeatedly defended their decision on both their blog and in this thread, and haven't removed the reference from their post. You're on your own, here.


>considering that many of us look to 37signals as essentially the poster boy of leading behaviors

Hopefully you guys learn from your mistake.


This is just an initial spike. The conversation started here: https://github.com/chaoslawful/lua-nginx-module/issues/35.

