The goal of this research is to investigate the relationship between the occurrence of significant topics in a document and the structure of that document. Its unique contribution lies in the combination of methods used to produce a list of significant topics: statistical and rule-based techniques that identify term variants as a function of their distribution in focus areas of documents. Applications that can employ these methods include information retrieval, passage retrieval, relevance feedback, information extraction, and summarization. The results can be used directly in ongoing research projects on automatic summarization of documents that use both statistical and information extraction techniques, i.e., that combine information retrieval (IR) and natural language processing (NLP). To the extent that these techniques are based on linguistically motivated patterns rather than on domain-dependent vocabularies, the patterns should apply to general text. The approach will be applied to several domains to test its generality and applicability across document types, which will also permit measuring the cost of porting across genres. Formative and summative evaluation procedures will be developed and performed at each step of the analysis. This research is undertaken in the context of the Digital Library Research Program at Columbia University, in conjunction with the Center for Research on Information Access. The resulting techniques, grounded in the novel combination and cross-fertilization of IR and NLP methods, are expected to improve information access based on significant topics across domains and genres.