An interdisciplinary team of computer scientists, statisticians, and ornithologists will develop novel computer science methods and apply them to the challenge of understanding the annual migration of birds across North America, which is one of the most complex and dynamic natural phenomena on the planet. While direct observation of migrating birds is limited to a handful of birds wearing tracking devices, other sources of data provide partial information about migration that, when appropriately combined, will provide insight into migration at a scale previously unimaginable. These sources include a continent-wide network of volunteer bird watchers, night flight calls captured by a network of acoustic monitoring stations, continent-scale weather patterns gathered by a network of weather stations, and clouds of migrating birds detected at night by WSR-88D weather radar stations. To analyze these data, the team will develop two innovative machine learning techniques: Collective Graphical Models (CGMs) and Semi-Parametric Latent Process Models (SLPMs). The resulting model will be able to identify the complex conditions governing the dynamics of migration behavior including the choice of migratory pathways, the factors that influence when birds migrate, and the speed and duration of each night's movements. CGMs greatly extend the scope of phenomena that can be captured with graphical models. Under suitable conditions, a CGM is able to recover a model of the behavior of individuals using only collective observations.
For BirdCast, a CGM will construct a model of individual bird dynamics from the collective observations provided by birders, acoustic and weather stations, and weather radar. Once the model is constructed, it will be applied to live data feeds (bird sightings, acoustic detections, radar detections, and weather forecasts) to predict bird migration in real time. SLPMs are an extension of latent process models, such as the CGM for bird migration, in which the dynamics of a process are represented by latent variables that are observed only indirectly. In an SLPM, the conditional probability distribution of each variable is modeled using flexible, non-parametric methods from machine learning, such as boosted regression trees. Introducing flexible methods such as CGMs and SLPMs into latent variable models raises difficult challenges for model fitting and validation. Preventing over-fitting will require the creation of novel information regularization and latent model cross-validation methods to enforce latent variable semantics.
The proposed work will allow, for the first time, real-time predictions of bird migrations: when birds migrate, where they migrate, and how far they fly. Accurate models of migration have broad application in basic research: they allow researchers to understand behavioral aspects of migration, how migration timing and pathways respond to variation in climatic conditions, and whether linkages exist between annual variation in migration timing and subsequent inter-annual changes in population size.
BirdCast will expand opportunities for the public to participate in gathering and analyzing data. The existing data set contains more than 60 million observations and is growing exponentially. Last year, volunteers contributed more than 1.3 million hours observing birds. Student engagement in the research is significant as well.