New software will transform biological data analysis

An NIH-funded team at UW Tacoma, led by Dr. Ka Yee Yeung, has launched BioDepot-workflow-builder to move bioinformatics data analysis into the cloud.

A team of students and faculty from UW Tacoma is breaking new ground in software for biological data analysis.

Dr. Ka Yee Yeung, Professor, UW Tacoma School of Engineering & Technology, received the 2019 UW Tacoma Distinguished Research Award.

Dr. Ka Yee Yeung is a professor in the School of Engineering & Technology (SET) and an adjunct professor in the Department of Microbiology at UW in Seattle. She has been leading a project to create Bwb, or BioDepot-workflow-builder, a modular set of bioinformatics data analysis tools and methods that can be deployed in the cloud.

“The advent of cloud computing and big data analytics techniques have revolutionized not just computer science but disciplines across the STEM spectrum,” said Yeung. “Bioinformatics is the term that describes new ways of collecting and analyzing the enormous volumes of complex biological data that are generated every day in labs around the world.”

Confronting the growing deluge of information, Yeung and her colleagues set out to develop a so-called “workflow builder” that would take advantage of another revolution in computing, containerization.

Containerization is a way of bundling all the pieces of a software application together so that it can run in the cloud or on any other computer, regardless of the server or operating system environment. Running in containers in the cloud lets Bwb meet biomedical scientists where they work, in the lab, rather than requiring them to become immersed in arcane computing methods.

“One of our big objectives for Bwb is reproducibility,” said Yeung. “When I run an analysis with my computer, I should come up with the same results as when you run the same analysis on your computer, even though your computing environment is different.”

This reproducibility is achieved through Bwb’s modular, container-based design. Bwb focuses on building workflows: stringing together tools for processing and analyzing data. One tool’s output is the next tool’s input, and so on down through the workflow. Those tools might be provided by Bwb, taken off the shelf and plugged in, or custom-built by the biomedical scientist.
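For readers who think in code, the idea of chaining tools can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Bwb's actual interface: the `Step` class, the `run_workflow` and `describe` helpers, and the `example/...` image tags are all hypothetical names invented for this sketch, and the three toy steps stand in for real analysis tools.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Step:
    """One tool in a workflow (hypothetical structure, not Bwb's own)."""
    name: str
    image: str  # illustrative pinned container image tag (assumption)
    run: Callable[[Any], Any]


def run_workflow(steps: List[Step], data: Any) -> Any:
    """Feed each step's output to the next step as its input."""
    for step in steps:
        data = step.run(data)
    return data


def describe(steps: List[Step]) -> List[str]:
    """List every step with its pinned image so dependencies are explicit."""
    return [f"{s.name} ({s.image})" for s in steps]


# Toy stand-ins for real analysis tools.
workflow = [
    Step("trim", "example/trim:1.0", lambda reads: [r.strip() for r in reads]),
    Step("filter", "example/filter:2.1", lambda reads: [r for r in reads if len(r) >= 4]),
    Step("count", "example/count:0.3", lambda reads: len(reads)),
]

result = run_workflow(workflow, [" ACGT ", "AC", "GGCATT\n"])
```

Because every step names a fixed image version, two people running the same list of steps would, under these assumptions, be running the same software at each stage, which is the property the article calls reproducibility.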

Another requirement of bioinformatics, and an important contribution made by Bwb, is that the dependencies established by stringing together analytical modules are explicitly controlled and described. “When you are trying to diagnose cancer,” said Yeung, “you want the same result or prediction regardless of what platform or combination of tools you are using.”

A screenshot of BioDepot-workflow-builder shows a sample workflow using two popular RNA-sequencing modules, Kallisto and Sleuth. In the workspace, the icons represent steps in the workflow, and the lines connecting them trace the flow of data.

“It is important to note that Bwb does not control accuracy,” said Yeung. “We have developed a software platform: accuracy is the responsibility of the user; to collect and understand the data, and to select meaningful analytical methodologies and understand how the data is manipulated as it travels through the workflow.”

Yeung’s colleagues on the project include Ling-Hong Hung, a research scientist in SET, Wes Lloyd, an assistant professor in SET, and Jiaming Hu, Trevor Meiss, Danny Kristiyanto and Alyssa Ingersoll, all graduates of the UW Tacoma computer science and systems program. Dr. Eric Sobie of the Icahn School of Medicine at Mount Sinai, in New York City, provided biological data from his lab that the team used as a real-world test-bed while building Bwb, and Yuguang Xiong, an assistant professor at Mount Sinai, coordinated beta-testing of the platform.

The project is funded by research grants from the U.S. National Institutes of Health. The team is working with CoMotion, UW’s technology transfer office, to secure a patent for Bwb technology. The software developed with the NIH grant will remain available for use around the world as open source software, and the team will seek to develop the commercial prospects of the software through further add-ons and a user-support model.

The team has described the BioDepot-workflow-builder system in an article published in the prestigious journal “Cell Systems,” called “Building Containerized Workflows Using the BioDepot-Workflow-Builder.”

Written by: John Burkhardt / September 26, 2019

Media contact: John Burkhardt, UW Tacoma Communications, 253-692-4536