Description

The abstracts, one per file, were furnished by the NSF (National Science Foundation). A sample abstract is shown in the next section. The bag-of-word data was produced by automatically processing the abstracts with a text analyzer called NSFAbst, built using VisualText. While most fields of the output are very accurate, the authors were not extracted from the Investigator: field with 100% accuracy, due to wide variability in that field. The word list came from a separate process, and may not include all the words of interest in the abstracts.