Data Coding, Analysis, Archiving, and Sharing for Open Collaboration:
From OpenSHAPA to Open Data Sharing
Researchers now have access to richer and more detailed behavioral data than ever before. For example, when studying how children learn to walk, researchers can collect eye-tracking data from miniature head-mounted cameras recording the infant's eye movements and field of view, making it possible to see exactly where the child looks while navigating through the environment. Simultaneously, researchers can collect high-speed motion-tracking data detailing the trajectories of the child's limb movements and video data about the child's path relative to caregivers and obstacles, interactions with people, objects, and surfaces, and affective responses while walking, falling, and interacting. Despite the widespread availability of video and other recording technologies, behavioral researchers typically settle for analyzing only one variable in one stream of data, rather than seeking relations among multiple variables across multiple data streams. Powerful data analysis tools and sophisticated data management practices are needed to integrate different kinds of data and relate them to each other -- tools and practices that few researchers have. In addition, researchers usually work in isolation, seldom sharing data that might illuminate others' research. Without richer analyses and data sharing, theoretical progress in developmental psychology and other fields of behavioral science is hampered.
The purpose of this workshop is to delve into the conceptual, technical, and management issues that, when resolved, will allow researchers to perform richer analyses across large, shared data sets. The workshop will focus in part on the future development of an emerging open-source software tool, OpenSHAPA, and will explore how OpenSHAPA might be extended to encompass new data exploration and visualization tools and to promote data management and data sharing. Twenty-two researchers will participate in the workshop, representing the fields of cognitive, perceptual, social, language, and motor development, human-computer interaction, visual analytics, computer science, eResearch, cognitive science, and human factors. Collectively, the invited researchers have experience with different aspects of the problem of exploring rich behavioral data, such as visualizing massive data sets, developing innovative data analyses, integrating multiple data streams, serving as custodians of shared data sets, and creating eResearch communities and data management tools.
The outcomes from the workshop will help to improve the quality of behavioral science. First, findings from the workshop will have an immediate impact on further development of the OpenSHAPA tool, where development is shared across a burgeoning community of users. Possible directions include changing the architecture to prepare for expanded data management and data sharing capabilities, building links to existing software, creating libraries of scripts for users to manage data in standardized ways, creating web-based user guides and best practices, expanding user forums, and providing efficient technical support. Research community members can freely adopt OpenSHAPA, expand their current use of it, or build bridges between it and other open-source tools, and in doing so will bring new users into the community of current users and developers. Second, the richer data analysis that results should support richer theoretical insights. Better data management practices will support more reliable and replicable research, and will better preserve data for future use within and across laboratories. A community of open data sharing practices will lead to greater transparency and efficiency in research and teaching by allowing researchers to inspect each other's data sets and analyses, thereby reducing puzzling failures to replicate, generating new hypotheses, and exposing students to original footage of tasks and findings.
Schedule
| Day 1 (Thursday, September 15) |
|---|
| 8:00-8:30 Breakfast |
| 8:30-10:00 Introduction (Karen Adolph, Penelope Sanderson, Clinton Freeman, Jesse Lingeman) |
| 10:00-10:30 Morning break |
| 10:30-12:00 Data Coding, Management And Sharing (Linda Smith, John Stamper, Brian MacWhinney) |
| 12:00-1:30 Lunch |
| 1:30-3:00 Data Mining, Visualization, Analysis, Plug-Ins (Daniel Messinger, Chen Yu, Katy Borner) |
| 3:00-3:30 Afternoon break |
| 3:30-5:00 Data Sharing: Professional & Technical Issues (Marc Bornstein, Bennett Bertenthal, Micah Altman, Pamela Davis-Kean) |
| 7:00 Workshop dinner at Willow (4301 N. Fairfax Drive, Arlington, VA 22203) |

| Day 2 (Friday, September 16) |
|---|
| 8:00-8:30 Breakfast |
| 8:30-10:00 Managing Multiple Data Streams (Martha Alibali, Mike Goldstein, Cole Galloway) |
| 10:00-10:30 Morning break |
| 10:30-12:00 Data Annotation, Exploration, Visualization (Robert Hoffman, Wayne Gray, William Wong) |
| 12:00-12:15 Identification of 3-5 key issues for discussion after lunch |
| 12:15-1:30 Lunch |
| 1:30-3:00 Problem-oriented group discussions |
| 3:00-3:30 Afternoon break |
| 3:30-4:00 Short summary presentations from group discussions |
| 4:00-5:00 Summary (Richard Aslin, Francis Quek, Jeff Lockman) |
| 7:00 Workshop dinner at Rock Bottom (4238 Wilson Boulevard, Ste. 1256, Arlington, VA 22203) |

Booklet
Can be downloaded here