1. It doesn't count word frequencies, but sub-string frequencies. Moreover, if a sub-string appears more than once per title, it is counted more than once. I draw this conclusion from submitting "a,b,c". And from their paper [1]:
our algorithm strips out dashes and catches any
occurrence of the query in the title, for example,
'blow' catches 'blowing', 'blowjobs'
This explains the results of these queries: "ada,erlang", "tea,beer". As an alternative, they could have used a stemmer [2].
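To make the difference concrete, here is a minimal sketch in Python (the titles are made up, not from the dataset) contrasting naive substring counting, which is apparently what they do, with whole-word matching:

```python
import re

titles = ["Blowing in the Wind", "Blow by Blow", "Low Tide"]

def substring_hits(query, titles):
    # Counts every occurrence of the query inside each title,
    # so "Blow by Blow" contributes two hits for "blow",
    # and "Blowing" also counts as a hit.
    return sum(t.lower().count(query) for t in titles)

def word_hits(query, titles):
    # Counts only whole-word matches: "Blowing" no longer matches.
    pattern = r"\b%s\b" % re.escape(query)
    return sum(len(re.findall(pattern, t.lower())) for t in titles)

print(substring_hits("blow", titles))  # 3 (Blowing, Blow, Blow)
print(word_hits("blow", titles))       # 2 (only the exact word)
```

A stemmer would sit between the two: it would still catch "blowing" via the stem "blow", but not accidental substrings like "blow" inside an unrelated word.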
2. The "slow,fast" and "love,hardcore" queries illustrate an interesting shift, perhaps towards women or mainstream viewers.
In my first 2 weeks of working at an adult company (as a dev, yes, it's sad), one of my tasks was to watch/scan 200+ videos and describe them. It's true, you run out of inspiration fast.
Also, a hint: the "love" in the titles is probably explained by "love(s) to <insert profanity>".
I don't think I ever used hardcore in a title.
I used to work with a guy who once worked as a dev at an adult company. He said it was the most soul-sucking experience of his career, and that the owners knew absolutely nothing about technology and treated their tech staff terribly.
Based on the fact that you had to spend your first two weeks doing data entry, it sounds like his experience wasn't unique.
Next: provide the porn industry with a simple markov chain script to generate probabilistic porn movie titles, and save them all those incredibly tiresome brainstorm sessions they must have to create new titles :)
This reminds me of the first time I implemented a markov chain text generator. We were at a LAN party so we looked on the public network shares to find a corpus of text files to use as input.
The first thing we found was a copy of the bible, and the second thing we found was someone's collection of porn stories.
The start of the output was "He slipped his tongue into the lord..."
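For the curious, a bigram markov chain generator really is only a few lines of Python. A minimal sketch (the corpus lines are placeholders, not real titles):

```python
import random
from collections import defaultdict

# Placeholder corpus; a real generator would be trained on actual titles.
corpus = [
    "hot young teacher loves to dance",
    "young teacher in trouble",
    "hot girl next door loves to party",
]

# Bigram transition table: each word maps to the list of words
# observed immediately after it (duplicates keep the probabilities).
chain = defaultdict(list)
for line in corpus:
    words = line.split()
    for a, b in zip(words, words[1:]):
        chain[a].append(b)

def generate(start, max_words=8, seed=None):
    # Walk the chain from a start word, picking a random successor
    # each step, until we hit a dead end or the length limit.
    rng = random.Random(seed)
    out = [start]
    while len(out) < max_words and chain[out[-1]]:
        out.append(rng.choice(chain[out[-1]]))
    return " ".join(out)

print(generate("hot"))
```

With a corpus of a few thousand titles, even this naive bigram model produces convincingly title-shaped output (and, as the anecdote above shows, amusing results if you mix corpora).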
I think the interesting part about all this is how it changes over time. I have the impression that the whole area of sex is, and always has been, a weird reflection of society at large. I would be curious, for example, about a much longer-term graph comparing the frequency of the two words 'hardcore' and 'love'.
Very interesting to see the dataset being made available. Whenever I want to do this kind of analysis, I always stumble at 'how to get the data?'. In their paper, it is mentioned that "We created a dedicated computer program to carry out the navigation and data collection tasks required to gather the metadata for all available videos...". I would love to see this program. More broadly, can anyone point me to good resources (pref python) for learning to crawl/scrape this type of information?
Hi, I didn't release the code of the crawler: first, because it was not well-crafted enough to be released (quick and dirty linear programming), and second, because any change to the site you crawl calls for reworking your code.
I used python, sometimes with BeautifulSoup, sometimes with lxml; both are very good for this. I would say BeautifulSoup is easier, and lxml cleaner.
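To give a feel for the parsing side, here is a toy sketch using only the standard library's html.parser (BeautifulSoup's soup.select("a.title") would do the same in one line, but this version has no dependencies). The HTML snippet and the "title" class are hypothetical stand-ins for a downloaded listing page:

```python
from html.parser import HTMLParser

class TitleExtractor(HTMLParser):
    """Collects the text and href of <a class="title"> links,
    roughly what BeautifulSoup's select("a.title") would return."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "title":
            self.in_title = True
            self.links.append(attrs.get("href"))

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.titles.append(data)

# Stand-in for a page fetched with urllib/requests in a real crawler.
html = """
<ul class="videos">
  <li><a href="/v/1" class="title">First Title</a></li>
  <li><a href="/v/2" class="title">Second Title</a></li>
</ul>
"""

parser = TitleExtractor()
parser.feed(html)
print(parser.titles)  # ['First Title', 'Second Title']
print(parser.links)   # ['/v/1', '/v/2']
```

The real work in a crawler is everything around this: fetching pages politely (rate limiting, retries) and following pagination, which is exactly the part that breaks whenever the site changes.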