Skill v1.0.0
Trusted Publisher · 100/100 · microsoft/vibe-kit/msresearch-bioemu
──Details
Published: May 28, 2026 at 07:41 AM
Content Hash: sha256:83852b2b19339ef3...
──Files
Files (1 file, 5.8 KB)
SKILL.md · 5.8 KB · active
SKILL.md · 66 lines · 5.8 KB
---
version: "1.0.0"
name: msresearch-bioemu
description: Microsoft Research's BioEmu, which generates protein conformational ensembles from amino acid sequence on a local GPU. Use when users mention BioEmu by name, want to predict protein dynamics or flexibility from sequence, or are setting up / running / troubleshooting BioEmu locally.
license: MIT
---
Scope
- Run BioEmu locally on a CUDA-capable NVIDIA GPU to sample protein conformational ensembles from amino acid sequence.
- Two supported entry points:
  - CLI: `pip install bioemu[cuda]` and `python -m bioemu.sample` directly from the upstream package.
  - Reference app: the React + Flask app in `assets/reference-app/`. Run its bundled `score/` Flask server locally on `:5001`, point the proxy backend at it via `.env`, and use the React frontend for visualization.
- Analyze trajectories with MDTraj (RMSD, RMSF, Rg, secondary structure) and visualize with Molstar in the reference app.
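The CLI entry point above can be sketched as a small helper that assembles one sampling command. This is a minimal illustration, not the package's own tooling: `build_sample_command` is a hypothetical name, and the `--sequence`, `--num_samples`, and `--output_dir` flags are assumed from the upstream README, so confirm them with `python -m bioemu.sample --help` before relying on them. The 10-residue example sequence is only a placeholder for a toy run.

```python
import shlex
import sys


def build_sample_command(sequence, num_samples, output_dir):
    """Return the argv list for one BioEmu CLI sampling run (hypothetical helper)."""
    if not sequence.isalpha():
        raise ValueError("sequence must contain only one-letter amino acid codes")
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    return [
        sys.executable, "-m", "bioemu.sample",
        # Flag names are an assumption taken from the upstream README.
        "--sequence", sequence.upper(),
        "--num_samples", str(num_samples),
        "--output_dir", str(output_dir),
    ]


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run to execute it.
    print(shlex.join(build_sample_command("GYDPETGTWG", 10, "bioemu-out")))
```

Building the argument list separately from running it keeps the command easy to inspect and log before a long GPU job starts.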
Prerequisites
- Linux (or WSL2 on Windows 11 with NVIDIA GPU passthrough — install WSL per Microsoft's guide and follow NVIDIA's CUDA-on-WSL guide for driver setup) with a CUDA-capable NVIDIA GPU. CPU works only for ~10-residue toy sequences; everything else is unusably slow.
- Python 3.10+, ~5 GB free disk for cached weights (AlphaFold2 weights ~3.5 GB + BioEmu checkpoint + working space) at `~/.cache/colabfold/` (CLI) or `/app/colabfold_cache` (Docker).
- Path B only: Docker with GPU support (Docker Desktop on Windows includes this; Linux needs `nvidia-container-toolkit`), or system Python for a bare-metal `score/` run; Node.js 18+ for the frontend.
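The prerequisites above lend themselves to a quick preflight script. The sketch below uses only the standard library and checks the Python version, free disk space, and whether `nvidia-smi -L` lists any GPU. The function names are hypothetical, the thresholds are copied from the list above, and the check assumes that `nvidia-smi -L` prints one line per device starting with `GPU `.

```python
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 10)   # from the prerequisites above
MIN_FREE_GB = 5.0      # approximate cache size from the prerequisites above


def python_ok(version=None):
    """True if the given (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version)[:2] >= MIN_PYTHON


def disk_ok(free_bytes):
    """True if the free space covers the weight cache."""
    return free_bytes / 1e9 >= MIN_FREE_GB


def count_gpus(listing):
    """Count device lines in `nvidia-smi -L` output (assumed 'GPU N: ...' format)."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def visible_gpus():
    """Number of GPUs reported by nvidia-smi, or 0 if it is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return 0
    try:
        result = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return 0
    return count_gpus(result.stdout) if result.returncode == 0 else 0


def preflight(path="."):
    """Check the filesystem containing `path`; pass the cache location in practice."""
    free = shutil.disk_usage(path).free
    return {"python": python_ok(), "disk": disk_ok(free), "gpus": visible_gpus()}


if __name__ == "__main__":
    print(preflight())
```

A result with `gpus` equal to 0 corresponds to the CPU-only case described above, where anything beyond a toy sequence is impractical.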
Workflow
- Load `docs/about-bioemu.md` when users ask what BioEmu is, how it works, or need scientific background.
- Follow `docs/quick-start.md` to get a first ensemble locally: Path A (CLI smoke test) before Path B (reference app).
- For code examples (Python sampling API, MDTraj analysis, AlphaFold comparison, output file formats), route to `docs/application-patterns.md`.
- Route errors to `docs/troubleshooting.md`.
Operating rules
- Windows users: BioEmu is Linux-only. On Windows, all commands must run inside a WSL2 distro (Ubuntu recommended) with NVIDIA drivers installed on the Windows host (not inside WSL). Native Windows Python and PowerShell are not supported. Point Windows users at Microsoft's WSL install guide and NVIDIA's CUDA-on-WSL guide before running any quick-start step.
- GPU probe first: Before suggesting any non-trivial sampling, run `nvidia-smi` and confirm a CUDA-capable GPU is visible. If none is present, warn the user that BioEmu is unusable beyond ~10-residue toy sequences and stop; do not proceed to install.
- Weight download: First call downloads ~3.5 GB of AlphaFold2 + BioEmu weights. Tell the user this will be slow and must not be interrupted. Subsequent runs reuse the cache.
- Install times: `pip install bioemu[cuda]`, `pip install -r requirements.txt`, and `npm install --legacy-peer-deps` are all slow on first run. Do not interrupt installs.
- Three terminals (Path B): `score/` server, proxy backend, and frontend each run in their own terminal and must stay running. Never run other commands in a terminal hosting a live server.
- Local credentials: For Path B the `.env` keeps its `AZURE_BIOEMU_*` variable names because the proxy reads those exact names, but `AZURE_BIOEMU_ENDPOINT` should point at `http://localhost:5001/score` and `AZURE_BIOEMU_KEY` can be any non-empty string (the local `score/` server doesn't enforce auth). Explain this when guiding users; never ask them to paste secrets into chat.
- Honest scope: This is local inference, not fully offline. ColabFold MSA generation still hits an external MMseqs2 server on first use of a new sequence. Do not claim air-gapped operation.
- Execute, don't display: When terminal execution is available, run quick-start commands directly rather than printing bash blocks for the user to copy.
- Always offer the next step: After loading any explainer-style doc (e.g. `docs/about-bioemu.md`), end your response with a concrete offer to advance the user along the Learning Path. Default phrasing: "Want to try BioEmu? I can walk you through running it locally: three commands get you a first ensemble on your GPU, and from there we can wire up the full reference app UI if you want it." Adapt wording to context, but never end an explainer response without a concrete next-step offer.
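The local-credentials rule above can be illustrated with a short helper that renders the two `.env` entries for Path B. This is a sketch under stated assumptions: `render_env` and the `local-dev` placeholder value are hypothetical, and the location of the `.env` file inside the reference app is not given here, so the example only returns the text instead of writing a file.

```python
LOCAL_ENDPOINT = "http://localhost:5001/score"  # local score/ server from the rule above


def render_env(key="local-dev"):
    """Return .env text that points the proxy at the local score/ server.

    The key is a placeholder: the local server does not enforce auth, but the
    rule above requires a non-empty value.
    """
    if not key or any(ch.isspace() for ch in key):
        raise ValueError("key must be a non-empty string without whitespace")
    return (
        f"AZURE_BIOEMU_ENDPOINT={LOCAL_ENDPOINT}\n"
        f"AZURE_BIOEMU_KEY={key}\n"
    )


if __name__ == "__main__":
    # Print for review; copy into the proxy's .env file by hand.
    print(render_env(), end="")
```

Keeping the variable names unchanged means the proxy code needs no edits when switching between a hosted endpoint and the local server.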
Routing
| Doc | When to load |
|---|---|
| `docs/about-bioemu.md` | User asks what BioEmu is, how it works, performance metrics, limitations, or the scientific FAQ |
| `docs/quick-start.md` | User wants to run BioEmu locally (CLI or full reference app) |
| `docs/application-patterns.md` | User wants code examples for sampling, MDTraj analysis, output file formats, or AlphaFold comparison |
| `docs/troubleshooting.md` | User hits an error, missing GPU, weight download stall, port conflict, or MSA timeout |
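As a rough illustration, the routing table can be read as an ordered dispatch list: the first doc whose trigger words appear in the request wins, with errors checked first. The keyword sets and the `route` function below are hypothetical and far cruder than the judgment the table asks for; the sketch only shows the shape of the lookup.

```python
import re

# Ordered so that error reports take priority over general questions.
# Keyword sets are illustrative assumptions, not part of the skill.
ROUTES = [
    ("docs/troubleshooting.md", {"error", "traceback", "timeout", "stall", "conflict", "failed"}),
    ("docs/quick-start.md", {"install", "run", "setup", "start"}),
    ("docs/application-patterns.md", {"example", "examples", "mdtraj", "rmsd", "rmsf", "format"}),
    ("docs/about-bioemu.md", {"what", "how", "limitations", "faq"}),
]


def route(message, default="docs/about-bioemu.md"):
    """Return the first doc whose keywords match a whole word in the message."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    for doc, keywords in ROUTES:
        if words & keywords:
            return doc
    return default


if __name__ == "__main__":
    print(route("I hit a port conflict on 5001"))
```

Matching on whole words instead of substrings avoids accidental hits such as "stall" inside "install".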
Learning Path
1. `docs/about-bioemu.md`: Understand what BioEmu is and why it matters
2. `docs/quick-start.md` Path A: Three-command CLI smoke test on your GPU
3. `docs/quick-start.md` Path B: Run the full reference app against a local `score/` server
4. `docs/application-patterns.md`: Build your own sampling and analysis workflows
Reference Links
- Science paper: https://www.science.org/doi/10.1126/science.adv9817
- GitHub: https://github.com/microsoft/bioemu
- Hugging Face: https://huggingface.co/microsoft/bioemu
Assets
- `assets/reference-app/`: Self-contained React + Flask + `score/` Docker app. This is Path B; no separate clone needed.