Tomviz

Tomviz: A Platform for Reproducible Materials Tomography

SciPy 2017, Austin, TX

13 July, 2017

Marcus D. Hanwell

@mhanwell

PhD: Thiol Coated Gold Nanoparticles

Tomviz: 3D Materials Tomography

  • Open source project, tomviz.org
  • DOE SBIR—Kitware-Cornell collaboration
  • Focus on 3D tomography for materials
    • Data from TEM microscope
    • Possible to obtain atomic resolution in 3D
  • Go from image stack collected on microscope
    • Align the images
    • Reconstruct the 3D volume
    • Uses Python, NumPy, SciPy for many functions
    • Record workflow in XML state file
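
The "align the images" step can be illustrated with a generic technique: estimating the translation between two images from the peak of their FFT cross-correlation. This is a minimal NumPy sketch, not the alignment code Tomviz ships; the function names `estimate_shift` and `align_stack` are hypothetical, and real tilt-series alignment typically also handles sub-pixel shifts and the tilt axis.

```python
import numpy as np


def estimate_shift(reference, image):
    """Estimate the integer (row, col) shift that aligns `image` to `reference`.

    Uses circular cross-correlation computed with FFTs; the peak location
    gives the translation, wrapped into the range [-N/2, N/2).
    """
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(image))
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts.
    return tuple(int(p) if p < n // 2 else int(p) - n
                 for p, n in zip(peak, correlation.shape))


def align_stack(stack):
    """Align every image in a (n_images, rows, cols) stack to the first one."""
    aligned = np.empty_like(stack)
    aligned[0] = stack[0]
    for i in range(1, stack.shape[0]):
        shift = estimate_shift(stack[0], stack[i])
        # Circular shift; a real implementation would pad or crop instead.
        aligned[i] = np.roll(stack[i], shift, axis=(0, 1))
    return aligned
```

Aligning every image to the first one keeps the sketch short; aligning each image to its neighbor is a common alternative when the projections change a lot across the tilt range.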

The Tomviz Application

Goals

  • Convert MATLAB tools developed at Cornell
    • Required experts to use their code
    • Scalability issues, separate visualization tools
    • Required expensive MATLAB license
  • Develop a user-friendly, open source tool
    • Desktop application aimed at experimentalists
    • Offering an environment that can be extended
    • Primarily developed in C++
    • Python/Python wrapped C++ code

Goals

  • Provide a single software application for workflows
  • Make materials tomography reproducible
    • Research into new reconstruction techniques
  • Permissive BSD license allows reuse everywhere
    • Including commercial reuse/proprietary
  • Provide cross-platform self-contained installer
  • Highly interactive data visualization tool
  • Use hardware acceleration, background threads

Tomviz

Tomviz laptop

  • Powerful—CPU, GPU, memory
  • Open source—BSD, GitHub
  • Open languages—C++, Python
  • Open formats—TIFF, MRC, EMD (HDF5)
  • Scientific—SciPy, NumPy, ITK, VTK
  • Cross-platform—Windows, macOS, Linux
  • Shareable—Self-contained packaging
  • Reproducible—State files, pipelines

Data Collection: Experiment

Tomography experiment

Tomographic Reconstruction

Tomography reconstruction

Core Problem

  • Tomography involves a complex set of steps
  • Collection, alignment, reconstruction, viz, ...
    • Choices at any step can profoundly affect results
    • Changing early steps—rerun everything!
  • Develop an automated software platform
  • Make it easy to add new algorithms, etc
  • Could these steps be published with results?
    • Review of all steps, not just first/last

Tomographic Workflow

Tomography workflow

Software Stack

Tomviz stack

Building Blocks

  • Data sources
    • Data read from files
    • Derived data, e.g. reconstructions of tilt series
  • Operators
    • Operations on the data sources
    • Alignment, math, reconstruction, segmentation
  • Modules
    • Visualization, contouring, outlines, volumes

Python Operator

Tomviz Python operator

Make It Easy to Add Algorithms

  • Develop natural Python code using NumPy
  • Input 3D array from previous pipeline step
  • API to update user interface on progress
  • Provide an interactive editor for operators
  • Translation to/from application painless using views
    • Fortran ordering of image data has been painful
    • Looking at using geometric transform
  • Set output arrays, including tables, messages
  • Background thread, seamless to operator developer
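
To make the idea concrete, here is a minimal sketch of what "natural Python code using NumPy" for an operator might look like: a function that takes the 3D array from the previous step, reports progress, and returns a new array. The function name `subtract_background` and the `report_progress` callback are hypothetical stand-ins for illustration, not the Tomviz operator API.

```python
import numpy as np


def subtract_background(volume, report_progress=None):
    """Subtract each slice's median from that slice and clip negatives to zero.

    `volume` is a 3D NumPy array; `report_progress` is a hypothetical
    callback receiving (completed_steps, total_steps).
    """
    result = np.empty_like(volume, dtype=np.float64)
    total = volume.shape[0]
    for i in range(total):
        slice_ = volume[i].astype(np.float64)
        result[i] = np.clip(slice_ - np.median(slice_), 0.0, None)
        if report_progress is not None:
            # The application could use this to update a progress bar.
            report_progress(i + 1, total)
    return result
```

Because the function only touches NumPy arrays and a callback, it can be developed and tested outside the application and then pasted into an operator editor.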

Reproducible Data Pipeline

  • The pipeline is central to the application
  • Document the path from raw data to final images
  • XML format developed for reproducibility
    • Entire pipeline saved to the XML file
    • Relative file paths to enable sharing
    • Custom Python code embedded in state file
  • Access to common file formats
  • Operators run in a background thread
    • Remain interactive as operations are applied
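
The state-file ideas above (whole pipeline in one XML file, relative data paths, embedded Python) can be sketched with the standard library. The element and attribute names below (`pipeline`, `datasource`, `operator`, `file`, `label`) are a hypothetical schema chosen for illustration and are not the actual Tomviz state format.

```python
import os
import xml.etree.ElementTree as ET


def save_state(state_path, data_path, operator_scripts):
    """Write a toy pipeline state file (hypothetical schema, not Tomviz's)."""
    state_dir = os.path.dirname(os.path.abspath(state_path))
    root = ET.Element("pipeline")
    source = ET.SubElement(root, "datasource")
    # Store the data location relative to the state file so both can move together.
    source.set("file", os.path.relpath(os.path.abspath(data_path), state_dir))
    for label, script in operator_scripts:
        op = ET.SubElement(source, "operator", {"label": label})
        op.text = script  # custom Python source is embedded as element text
    ET.ElementTree(root).write(state_path, encoding="utf-8", xml_declaration=True)


def load_state(state_path):
    """Return (absolute data path, [(label, script), ...]) from a state file."""
    state_dir = os.path.dirname(os.path.abspath(state_path))
    source = ET.parse(state_path).getroot().find("datasource")
    data_path = os.path.normpath(os.path.join(state_dir, source.get("file")))
    operators = [(op.get("label"), op.text) for op in source.findall("operator")]
    return data_path, operators
```

Resolving the relative path against the state file's own directory is what lets a collaborator copy the folder elsewhere and reload the same pipeline.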

Fuel Cell Catalysts (Carbon Supports)

  • Research at Cornell (Elliot Padgett)
  • Hand segmented for nanoparticles inside/outside
  • Move from manual, painstaking task to routine

Open Data Supporting Tomography

  • Distribute small sample data with application
    • CC-BY licensed tilt-series and reconstructions
  • Article published in Nature Scientific Data
    • CC-BY openly licensed data sets with full data
    • Levin, B. D. A. et al. Nanomaterial datasets to advance tomography in scanning transmission electron microscopy, Sci. Data 3:160041
    • Open paper for materials tomography
  • Going beyond showing the final processed image

Innovations in Tomviz

  • Load in raw data, align, reconstruct, visualize
  • Relocatable state files—share full pipeline
  • Background execution of operators (Python & C++)
  • Python-native pipeline using ITK, VTK, SciPy, etc
  • Advanced volume rendering, flying edges
  • Early adopter of OpenGL 2, Qt 5, C++11, Python 3
  • Focused, intuitive interface for tomography
  • Self-contained installer for Windows, macOS, Linux
  • Automated generation of installers for all platforms
  • Export images, movies, interactive HTML5

Advanced Volume Rendering

Tomviz volume rendering

Optimized for Tomography

  • Destructive pipeline minimizes memory use
    • Rerun entire pipeline when anything changes
    • Executes in a background thread—interactivity
  • Optimized contouring for sparse data
    • Early termination offering interactive contours
  • Combined histogram-opacity-color map
  • Single application for all steps of tomography
    • Alignment, preprocessing, reconstruction, postprocessing, segmentation, viz, data analysis
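
One way to read "destructive pipeline" is that operators overwrite a single working copy rather than caching every intermediate result, so any change means rerunning from the source data. The sketch below illustrates that trade-off with a background worker thread; the `Pipeline` class and its methods are hypothetical and much simpler than the application's C++ implementation.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class Pipeline:
    """Toy pipeline: operators overwrite one working copy of the data."""

    def __init__(self, source_data):
        self._source = source_data          # pristine data as read from file
        self._operators = []                # callables: data -> data
        self._executor = ThreadPoolExecutor(max_workers=1)

    def add_operator(self, operator):
        """Append an operator and rerun the whole pipeline in the background."""
        self._operators.append(operator)
        return self.execute()

    def execute(self):
        # Snapshot the operator list so later edits do not affect this run.
        operators = list(self._operators)
        return self._executor.submit(self._run, operators)

    def _run(self, operators):
        data = copy.deepcopy(self._source)  # the single working copy
        for operator in operators:
            data = operator(data)           # intermediate results are not kept
        return data
```

Each call returns a `Future`, so the caller stays responsive and collects the result with `.result()` when it is ready. Memory use stays near one copy of the data, at the cost of recomputing earlier steps.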

Segmentation in Materials

  • Move towards quantitative analysis
  • Leverage existing expertise from the ITK project
  • Provide turn-key solution for common data
  • Make it easy to extend to unique situations
  • Extension of the data pipeline for labeled images
  • New visualization capabilities for label maps
  • So far using ITK's wrapped Python API exclusively

Development Methodology

  • Main project hosted at tomviz.org
    • Links to resources, downloads, movies
  • Development takes place on GitHub
    • Use pull requests, code review, issues, etc
    • Signed releases (verified badge)
    • DOI generation via GitHub-Zenodo integration
  • Automated software quality dashboards
    • Test the latest merged code
    • Build and upload binary installers

Development Methodology

  • Files to help you get started
  • Superbuild used for builds
    • Builds all dependencies and Tomviz
    • Help new developers get up and running
  • Mixture of C++ and Python
    • Using recent standards—C++11 and Python 3

Continuous Integration

  • Using great tools available to open projects
  • For every pull request we run three CIs
    • Travis runs clang-format, Python tests
    • CircleCI runs our Linux build
    • AppVeyor runs our Windows builds
  • Awesome interfaces to document CI process
    • Needed to adapt build to segment off pieces
    • Helped us improve CI build times
  • Need to add more tests, tackle OpenGL in CI

The Superbuild: Packaging

  • Provide a binary installer for all platforms
    • Build and package Python, SciPy, NumPy, Qt, TBB, FFmpeg, HDF5, VTK, ParaView, ITK
    • Python wrapped C++ interfaces to libraries
    • Offer a downloadable application with full stack
  • Use of CMake "superbuild" to coordinate build
    • Dependencies built first for all three platforms
    • Binary package created using CPack
    • Automatically uploaded when master changes

Creating Installers

Tomviz installer

Challenges

  • Complex build system requirements for stack
    • SciPy needs Intel Fortran to gel with MSVC
    • Building Python to work on older OSes
  • How much do we package, what do we package
  • Including domain scientists in development
  • How best to achieve a reproducible pipeline
  • NumPy views, zero-copy, memory management
  • Multithreading, CUDA, TBB, HPC/cloud processing
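
The views and zero-copy point can be shown in a few lines of NumPy. This sketch assumes image data laid out in Fortran (column-major) order with x varying fastest, and shows that transposing gives a C-contiguous (z, y, x) view of the same buffer rather than a copy; the variable names are illustrative only.

```python
import numpy as np

# A small volume stored in Fortran (column-major) order: x varies fastest.
nx, ny, nz = 4, 3, 2
volume_xyz = np.asfortranarray(
    np.arange(nx * ny * nz, dtype=np.float32).reshape(nx, ny, nz))

# Transposing reverses the axes and yields a C-contiguous view: no data is copied.
volume_zyx = volume_xyz.T

# Writing through the view modifies the original buffer.
volume_zyx[1, 2, 3] = -1.0

# By contrast, forcing C order on the (x, y, z) array has to copy the data.
copied = np.ascontiguousarray(volume_xyz)
```

The catch is that the view's axes are reversed, so code written for (x, y, z) indexing has to be adapted or the data copied, which is where the memory management trade-off comes from.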

Where Next?

  • Tomviz received DOE Phase IIB funding!
    • Partnering with the University of Michigan
  • Expand Tomviz to other types of tomography
  • Add support for multidimensional volume data
  • Data acquisition, real-time reconstruction
  • Richer state file using JSON for enhanced sharing
    • Optionally wrapped in HDF5 with all data
  • Advanced pipeline with significant advances
    • Run operators in a separate process
    • Optionally run operators in cloud/HPC

Closing Thoughts

We have a fantastic team of developers and collaborators: David Muller, Robert Hovden, Peter Ercius, Yi Jiang, Elliot Padgett, Barnaby Levin, Colin Ophus, Shawn Waldon, Chris Harris, Cory Quammen, Robert Maynard, Utkarsh Ayachit, Sebastien Jourdain, Matt McCormick, Alvaro Sanchez, TJ Corona, Berk Geveci, Martin Turner, Dula Parkinson, ...

Also funding from the Department of Energy, Office of Science under contract DE-SC0011385

TEM Microscope at NCEM (LBNL)

TEM microscope

Tomviz Data Acquisition

Nanotube

Tomviz Hackathon

Tomviz hackathon

Advanced Volume Rendering

Nanotube