This Small Business Innovation Research Phase I project investigates the technical and commercial feasibility of a realtime automatic accompaniment (RAA) technology. The proposed innovation would allow a computer to improvise in realtime with a human musician. The intellectual merit of this proposal lies in researching a novel computational framework for modeling musical interaction. Realtime improvisation is one of the most demanding cognitive tasks that humans undertake. This proposal seeks to model the sequential structure of melodies and the realtime interaction between two improvising musicians. Although music is a creative act, often full of small surprises, it is nevertheless highly patterned. The prediction methods described in this proposal attempt to uncover this structure through context-dependent models. Realtime musical interaction in the context of tonal music is a relatively new area of research, and the cognitive and musical strategies by which musicians coordinate are complex. The research plan will explore whether this intuitive negotiation can be modeled using game theory, with musically appropriate notions of payoff. The researchers will evaluate both methods using cross-entropy, an objective measure of how well the predictive distribution matches the actual sequence data.
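Cross-entropy, in this setting, is the average negative log-probability that the predictive model assigns to each event in an observed sequence; lower values mean the model's predictions match the data better. A minimal sketch of such an evaluation, assuming a simple Laplace-smoothed bigram model over MIDI pitch numbers (purely illustrative; the models in the proposal, such as multiple viewpoint models, are considerably richer):

```python
import math
from collections import defaultdict

def train_bigram(sequences):
    """Count bigram transitions over training melodies (lists of MIDI pitches)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict(counts, prev, symbol, vocab_size, alpha=1.0):
    """Laplace-smoothed probability of `symbol` given the previous symbol."""
    total = sum(counts[prev].values())
    return (counts[prev][symbol] + alpha) / (total + alpha * vocab_size)

def cross_entropy(counts, seq, vocab_size):
    """Average negative log2 probability per event: lower = better fit."""
    logps = [math.log2(predict(counts, p, n, vocab_size))
             for p, n in zip(seq, seq[1:])]
    return -sum(logps) / len(logps)

# Toy example: train on two short motifs, evaluate on one of them.
train = [[60, 62, 64, 62, 60], [60, 62, 64, 65, 64]]
model = train_bigram(train)
vocab = len({p for s in train for p in s})
ce = cross_entropy(model, [60, 62, 64, 62, 60], vocab)  # bits per event
```

Because cross-entropy is measured in bits per event, it gives an objective, model-agnostic yardstick: the two prediction methods under study can be compared directly on the same held-out melodies.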
The objective of this proposal is to develop an RAA technology that facilitates technology-based music creation among people who are not musically trained and, in so doing, leads to a new class of self-expression products that represent a multi-billion dollar commercial opportunity. Further, the approach may have clinical applications in speech therapy, both for stroke patients suffering from speech loss and for children with developmental disorders.
The research undertaken has demonstrated that realtime automatic accompaniment is a viable technology. In particular, we have shown how predictive modeling, based on statistical machine learning, can be used to generate musical accompaniment in realtime to a sung melody. Predictive modeling is becoming increasingly important in many fields, such as finance, health and public safety. Because music is a realtime activity with hard time constraints and a well-understood structure, it is an ideal test bed for developing ideas about human-machine interaction. It is possible that the approaches we have used here will inspire researchers in other fields to consider predictive models, such as multiple viewpoint models, in their own work.

A commercial outcome of this project was the Songify Toy, which was distributed to retailers such as Target and Walmart by JAKKS Pacific, a leading toy manufacturer. We have demonstrated that user-generated musical content can form the basis for a vibrant music economy. At the time the final outcomes were reported for the project, Smule Inc., which acquired Khush, had a revenue run-rate of $20 million based on music creation technologies.

In addition to the commercialization of the speech-to-music technology in the form of a toy, Khush and the PI, Prerna Gupta, received substantial press related to our music creation technologies. In doing so, Prerna Gupta, a woman, became a visible role model for girls and women considering STEM careers. Prerna was interviewed by the Girl Scouts for a video series designed to encourage girls to pursue STEM careers, in which she discussed her journey of entrepreneurship and the importance of perseverance in the face of failure. She was named one of the Most Influential Women in Technology by Fast Company in 2011, and she has written several articles in the New York Times, TechCrunch and VentureBeat on topics of entrepreneurship, education, women in technology and flexible work policies.
Since Khush was acquired, Prerna has become a Resident Mentor at 500 Startups, a leading early-stage venture capital firm in Silicon Valley that has invested in several women-led businesses. Prerna's work has also been featured in Cosmopolitan Magazine, a popular publication targeting teenage girls.