This dataset is used in a study of the semantic content of source
code produced in a collaborative environment. The
semantic content is described as the `dictionary' of the key terms
contained within a source artifact. We posit that the semantic content
of a Java class will increase as more developers add content to the
same class. This has a direct effect on its complexity,
maintainability and understandability.
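
As an illustration of the `dictionary' idea, the following minimal sketch extracts a set of key terms from the text of a Java class. It is not the extraction procedure used in the study, which is not specified here. The function name `extract_terms`, the splitting of identifiers on camelCase and underscore boundaries, and the short stop list of Java keywords are all assumptions made for the example.

```python
import re

# Assumed stop list: a small sample of Java keywords (lowercased).
# A real analysis would likely use the full keyword list.
JAVA_KEYWORDS = {
    "public", "private", "protected", "class", "interface", "static",
    "final", "void", "int", "long", "double", "boolean", "return",
    "new", "this", "if", "else", "for", "while", "import", "package",
}

# Matches one word at a time: an acronym run ("HTTP" in "HTTPServer")
# or a word with an optional leading capital ("Server", "get").
# Digits and underscores are not matched, so they act as separators.
TERM_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def extract_terms(java_source: str) -> set[str]:
    """Return the set of lowercased terms found in the source text.

    Note: comments and string literals are scanned as well, since no
    Java parsing is attempted in this sketch.
    """
    terms = {t.lower() for t in TERM_RE.findall(java_source)}
    return terms - JAVA_KEYWORDS


# Hypothetical first revision of a class.
v1 = """
public class Account {
    private int balance;
    public int getBalance() { return balance; }
}
"""

# Hypothetical later revision, after another developer adds a method.
v2 = v1.replace(
    "}\n}",
    "}\n    public void applyInterestRate(double rate) {"
    " balance += balance * rate; }\n}",
)

print(sorted(extract_terms(v1)))  # ['account', 'balance', 'get']
print(len(extract_terms(v1)), len(extract_terms(v2)))
```

Under these assumptions, the dictionary of the later revision contains the terms of the earlier one plus those introduced by the new method, which is the kind of growth the hypothesis above refers to.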