For doing medical diagnosis, one may wish to design a learning algorithm which attempts to determine from test data which combinations of symptoms indicate that a person will develop a certain disease. The test data may consist of a collection of patient profiles, with the outcomes of a large number of tests on each patient, and a bit indicating whether the patient subsequently developed the disease. One may further assume that whether a patient develops the disease depends only on the presence or absence of a few symptoms, and all other symptoms are irrelevant. In that case, a good learning algorithms should output a hypothesis that is also a function of a small number of symptoms. A hypothesis can be used to predict whether future patients will develop the disease. If the hypothesis depends on only a small number of clinical tests, it saves time and money. In recent work, the investigator has shown a polynomial time PAC (Probably Approximately Correct) algorithm which learns k-term disjunctive normal form formulae which depend on at most r variables (where r is unknown and may be a function of the total number of variables) by creating a disjunctive normal form (DNF) hypothesis which contains at most poly(r) variables. An initial goal of this project is (a) to demonstrate a similar result for the class of decision lists. Additional goals include: (b) the exploration of different paradigms where extracting relevant information is important; (c) the discovery of efficient algorithms for the current problems; (d) the development of a theoretical framework for measuring relevance under which universally applicable techniques for reducing irrelevant information can be found. In addition to the research described above, the PI will teach a graduate seminar on computational learning theory with the expectation of attracting graduate students to the area.