Wrangler Supercomputer Speeds through Big Data

Data-intensive supercomputer brings new users to high performance computing for science

Published on March 10, 2016 by Jorge Salazar


Human Origins in Fossil Data


Paleoanthropologist Denné Reed of UT Austin connects fossil data of human origins

New discoveries might lie buried deep in the data of human fossils. That's according to Denné Reed, an associate professor in the Department of Anthropology at The University of Texas at Austin (UT Austin). Reed is the principal investigator of PaleoCore, an informatics initiative funded by the National Science Foundation (NSF).

The PaleoCore project aims to get researchers of human origins worldwide on the same page with their fossil data. Reed said PaleoCore is doing this by implementing data standards, creating a place to store all human fossil data, and developing new tools to collect the data. He hopes the result will be deeper insights into our origins through better integration and sharing among research projects in paleoanthropology and paleontology.


Denné Reed, Department of Anthropology at the University of Texas at Austin.

"We've tried to take advantage of some of the geo-processing and database capabilities that are available through Wrangler to create large archives," Reed said. The data set Reed wants to archive on Wrangler is the entire fossil record of human origins. PaleoCore will also include geospatial data such as satellite imagery. "For many of the countries that we're working in, this is their cultural heritage. We need to be able to ensure that not only are the data rapidly available, accessible, searchable, and everything else, but that they're safely archived," Reed said.

PaleoCore also wants to take advantage of the Wrangler data-intensive supercomputer's ability to rapidly interlace data and make connections between different databases. "Wrangler is something we hope will be really promising for that," Reed said. Traditional SQL-based databases are just one way to store information, he added.

"Novel technologies are being developed now here at UT Austin in computer science, ways to take conventional SQL data stores, but represent them in alternative ways as semantic web triple stores if that's what the request demands; being able to convert data from one format to another on the fly to meet various different demands," he said.
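As a rough illustration of the idea Reed describes, the sketch below turns rows from a conventional SQL table into subject-predicate-object triples, the basic unit of a semantic web triple store. It is not PaleoCore's actual method. The `fossil` table, its columns, the sample values, and the `example.org` namespaces are all hypothetical, chosen only to make the example runnable.

```python
import sqlite3

# Hypothetical namespaces, used only for this illustration.
SUBJECT_BASE = "http://example.org/fossil/"
VOCAB_BASE = "http://example.org/vocab/"


def rows_to_triples(conn, table, key):
    """Convert each row of `table` into (subject, predicate, object) triples.

    The `key` column identifies the subject; every other non-NULL column
    becomes one triple. `table` must be a trusted name, since it is
    interpolated directly into the query.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = SUBJECT_BASE + str(record[key])
        for column, value in record.items():
            if column == key or value is None:
                continue  # skip the key itself and missing values
            triples.append((subject, VOCAB_BASE + column, value))
    return triples


def build_demo():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fossil (id TEXT PRIMARY KEY, taxon TEXT, locality TEXT)"
    )
    conn.execute("INSERT INTO fossil VALUES ('F1', 'Bovidae', 'Dikika')")
    conn.execute("INSERT INTO fossil VALUES ('F2', NULL, 'Mille-Logya')")
    return conn


triples = rows_to_triples(build_demo(), "fossil", "id")
for triple in triples:
    print(triple)
```

A production system would more likely rely on an established mapping standard and RDF library than on hand-built tuples, but the row-to-triple shape is the same.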

"The linked open data possibilities are immense," Reed said. Linked open datasets are interrelated on the Web, amenable to queries and to showing the relationships among data. For PaleoCore, that means tying together fragments of information collected by individual projects at widely separated points in time and across vast geographical regions.

"If we're going to have a comprehensive understanding of human origins and paleontology in general, we have to be able to synthesize and pull together all of these disparate bits of information in a cohesive and coherent way," Reed said.

Data collection has come a long way from the days of just cataloging finds with paper and pen.


PaleoCore project

The PaleoCore project ties together fragments of information across vast distances in space and time to reveal new insights about human origins. PaleoCore sprouted from Dr. Reed's research in Dikika and in Mille-Logya, Africa (pictured here). Credit: Denné Reed.

Skull artifacts

New data-handling tools and techniques such as Structure from Motion combined with supercomputers are helping students share in the experience of learning from artifacts. Credit: Wikipedia.


"When we work in the field in Ethiopia and find a fossil, we record specific information about it as it's found in real time on mobile devices," Reed said. "In this case we're using iOS devices like our iPhones and iPads that automatically record the GPS location of the fossil; as well as who collected it; the date and time; what kind of fossil we think it is; and its stratigraphic position. All of that is captured at the moment we pick up the fossil."
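The fields Reed lists map naturally onto a simple record structure. The sketch below is one possible shape for such a record, not PaleoCore's actual schema. The class name, field names, the `capture` helper, and the sample values in the usage lines are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class FossilRecord:
    """One find, captured at the moment of collection (hypothetical schema)."""

    latitude: float               # GPS latitude in decimal degrees
    longitude: float              # GPS longitude in decimal degrees
    collector: str                # who collected the fossil
    collected_at: datetime        # date and time of collection
    tentative_id: str             # what kind of fossil it is thought to be
    stratigraphic_position: str   # where it sits in the rock sequence

    def __post_init__(self):
        # Reject coordinates that cannot be valid GPS readings.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude must be between -90 and 90")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude must be between -180 and 180")


def capture(latitude: float, longitude: float, collector: str,
            tentative_id: str, stratigraphic_position: str,
            now: Optional[datetime] = None) -> FossilRecord:
    """Build a record, stamping it with the current UTC time by default."""
    return FossilRecord(
        latitude=latitude,
        longitude=longitude,
        collector=collector,
        collected_at=now or datetime.now(timezone.utc),
        tentative_id=tentative_id,
        stratigraphic_position=stratigraphic_position,
    )


# Made-up sample values, for illustration only.
record = capture(11.1, 40.6, "A. Collector", "bovid tooth", "in situ",
                 now=datetime(2016, 1, 15, 9, 30, tzinfo=timezone.utc))
print(asdict(record))
```

Making the record immutable (`frozen=True`) reflects the idea that the field observation is fixed at the moment of collection; later reinterpretation would be stored separately.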

PaleoCore also plans to create virtual reality (VR) simulations of human fossil data, enabled in part by Wrangler's ability to manipulate the large data sets behind VR and 3D models. "Structure From Motion is another technology that's really changing the way we do paleontology and archeology," Reed said. For example, multiple photographs of a fossil or artifact taken with mobile devices can be combined to construct a VR simulation, an automatically geo-referenced, rich source of information for students.

"All of a sudden, students can see for themselves exactly where this fossil came from, what this artifact looks like, be able to manipulate it, and even if they can't get there themselves, at least in part share in the experience," Reed said.

