Darkspark is a GUI for your neural network. It allows you to explore a visual, interactive version of your PyTorch code.

I tried all the other options I could find (Netron, Google’s Model Explorer, TensorBoard, torchview, TorchLens, Apple’s Mycelium). These are all great projects (I really wanted to use one of them!), but none had all of the features I needed:

Opinionated layout. A layout engine should automatically expose the underlying logic of the model. For example, a U-Net should look like a "U"; see stable-diffusion-v1.5 at https://darkspark.dev/models/?model=stable-diffusion-v1-5

Interactive. Ops need to be collapsible and expandable. Complex models like stable-diffusion won't even load without this.

‘Just Works’ with any arbitrary code. I don’t want to export to ONNX, I don’t want to upload something, and I don’t want to manually specify which part is the model and which are the inputs. I just want to wrap my existing code in something simple.*

Microscope. Sometimes I also want to explore the activations and attention patterns. Like OpenAI’s microscope, but for your own models. E.g. this CLIP-like model is highly interpretable. https://darkspark.dev/models/?model=vit_base_patch16_siglip_....

Hosted gallery. Most of what I want is usually a variant of an existing model. It’s often more convenient to just reference a URL than to trace your own code. All the models from timm, and many from the Hugging Face transformers and diffusers libraries, are available at https://darkspark.dev

Hoping to get feedback on the hosted models before releasing the pip package for local use. Thank you!

* darkspark uses __torch_function__. This lets us capture all the ops and tensors inside the darkspark.Tracer context without breaking on dynamic control flow that can’t be captured by, e.g., an ONNX export or a torch.export ExportedProgram. We also get access to all the tensors, activation patterns, etc., without using hooks.
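
For anyone unfamiliar with the mechanism, here is a minimal sketch of intercepting ops through __torch_function__. It is not darkspark's implementation: OpRecorder and model are hypothetical names, and it assumes PyTorch's TorchFunctionMode (from torch.overrides) as one way to hook in. The point it illustrates is that the data-dependent `if` simply runs as ordinary Python while every op on either side of it still gets recorded.

```python
import torch
from torch.overrides import TorchFunctionMode


class OpRecorder(TorchFunctionMode):
    """Hypothetical recorder (an assumption, not darkspark.Tracer)."""

    def __init__(self):
        super().__init__()
        self.ops = []  # list of (op name, output shape or None)

    def __torch_function__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # Run the real op; the mode is suspended inside this handler,
        # so this call does not recurse back into the recorder.
        out = func(*args, **kwargs)
        name = getattr(func, "__name__", repr(func))
        shape = tuple(out.shape) if isinstance(out, torch.Tensor) else None
        self.ops.append((name, shape))
        return out


def model(x):
    h = torch.relu(x)
    # Data-dependent branch: evaluated eagerly as plain Python.
    if h.sum() > 1.0:
        h = h * 2
    return h


recorder = OpRecorder()
x = torch.ones(2, 3)
with recorder:
    y = model(x)

for name, shape in recorder.ops:
    print(name, shape)
```

Because the handler sees the actual output tensors, activations can be collected in the same place without registering module hooks.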

reddlee 13 hours ago

This HN thread goes over some similar projects: https://news.ycombinator.com/item?id=40357681. I would have loved to use them if they worked on large, complex graphs without clear inputs/outputs and with dynamic control flow built in (e.g. a transformers pipeline like stable-diffusion with ControlNet).