He doesn't seem to use the swear factor anywhere. The actual statistical comparison (Table 3.1) is simply the mean SoftWipe score of repos with swears (5.87) vs. the mean SoftWipe score of repos with 4+ stars (5.41). The increase is due to 2-3 clusters of swear repos with SoftWipe scores of ~7.5 and ~20k lines of code. It seems like he deduplicated the repos by URL, not by content, and GitHub could have biased the results returned by its search, so I wonder if it is simply sampling bias.
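To make the URL-vs-content point concrete, here is a rough sketch of what content-based deduplication could look like. This is not from the thesis: the function names, the example URLs, and the assumption that each repo is already available as a mapping of file path to bytes are all mine.

```python
import hashlib


def content_fingerprint(files):
    """Hash a repo's contents (path -> bytes), ignoring where it is hosted."""
    digest = hashlib.sha256()
    # Sort paths so the fingerprint does not depend on dict ordering.
    for path in sorted(files):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashlib.sha256(files[path]).digest())
    return digest.hexdigest()


def dedupe_by_content(repos):
    """Keep the first URL seen for each distinct content fingerprint."""
    seen = {}
    for url, files in repos.items():
        seen.setdefault(content_fingerprint(files), url)
    return sorted(seen.values())


# Hypothetical data: two URLs with identical contents, one distinct repo.
repos = {
    "https://example.com/a/project": {"main.c": b"int main(){}"},
    "https://example.com/b/project-fork": {"main.c": b"int main(){}"},
    "https://example.com/c/other": {"main.c": b"int main(){return 1;}"},
}
print(dedupe_by_content(repos))
```

Deduplicating by URL would keep all three of these, so a fork or copy-pasted project counts more than once in the mean. Exact hashing only catches byte-identical copies; near-duplicates would need something fuzzier.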
paper: https://cme.h-its.org/exelixis/pubs/JanThesis.pdf