This research investigates schemes for describing documents that can achieve much higher recall and precision than existing techniques. The schemes make optimal use of term frequency and are significantly more effective than the binary independence model. The work is significant because the explosively growing number of documents available to scientists, engineers, and researchers in all fields can now be handled effectively only by computer systems. Current indexing systems operate rapidly, but they must be improved so that, for any given query, they retrieve the most relevant documents while retrieving very few irrelevant ones.
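For reference, the two evaluation measures named above can be sketched for a single query as follows. This is a minimal illustration using the standard set-based definitions, not the evaluation procedure of this research; the function name `precision_recall` and the document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    Both arguments are iterables of document identifiers (hypothetical here).
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data only: four documents retrieved, three relevant overall.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the four retrieved documents are relevant and one relevant document is missed, so a system that "retrieves very few irrelevant documents" would raise the first number while one that finds "the most relevant documents" would raise the second.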