Parallel computers are difficult to program. Users often have to specify the parallel execution explicitly: the distribution of data across local memory modules (as in High Performance Fortran), the communication of data between processors, and processor synchronization. This lack of programming support limits the applicability and further development of parallel systems. The objectives of this research are (1) to find solutions (optimal or feasible) to fundamental problems in programming parallel systems, such as how to distribute or replicate input data across the local memories of the system to minimize inter-processor communication, how to partition and schedule computations to minimize overall running time, how many processors to use, and how to choose the right message size for a message vector; (2) to develop new techniques for data alignment, scheduling, partitioning, and estimating the number of processors needed for optimal execution of algorithms with nested loops; and (3) to implement these techniques and add them to existing software tools such as PVM and MPI.

Education: The education plans are as follows:
(1) The undergraduate computer architecture lab at Santa Clara University has not been changed for six years and is obsolete. The plan is to update the lab and syllabi and to redesign the student projects to include recent advances in computer design, technology, and debugging software, so that students are more motivated and better prepared when they graduate.
(2) Develop new parallel processing lab projects. One project will have students run parallel programs in the PVM environment so that they can see and think about these fundamental problems in parallel programming.
(3) Mentor women and underrepresented undergraduate and graduate students.
(4) Conduct research projects with several senior undergraduate students, especially women engineering students, to motivate them to pursue graduate studies in computer engineering at SCU.