Skip to content

TokSearch

TokSearch is a flexible, high-performance Python framework for retrieving and processing fusion experimental data at scale. It lets you define pipelines that pull diagnostic signals, filter and transform data, and run distributed analyses across thousands of shots—all using simple, readable Python code.

What It's For

  • Batch analysis across many plasma shots
  • Preprocessing and feature extraction for machine learning
  • Building reproducible, scalable workflows
  • Flexible backend execution (serial, multiprocessing, Ray, Spark)

TokSearch supports legacy data formats like MDSplus and PTDATA and integrates seamlessly with the Fusion Data Platform (FDP) cache infrastructure. It also works with standard Python libraries like NumPy, xarray, and pandas, making it easy to slot into existing workflows.

Documentation

📚 Full documentation and usage examples are available at:
👉 https://ga-fdp.github.io/toksearch/

Source Code


Looking for an example? Want to run it on your own cluster? Check out the docs for tutorials and walkthroughs.