Contributing Similarity Algorithms
Matheel supports two contribution paths:
- A custom algorithm module for local experiments.
- A built-in Matheel method or preset that ships with the package for reuse.
Use the custom path first when prototyping. Move into package code only after the method has a stable contract, deterministic tests, and clear dependencies.
Start From A Template
Generate a starter module:
matheel init-custom-algorithm my_algorithm.py
Then run it locally:
python examples/sample_data.py --output sample_pairs.zip --overwrite
matheel compare sample_pairs.zip --algorithm-path my_algorithm.py
The generated module exposes score_pair(code_a, code_b, dataset_context=None, row=None, **kwargs). It may also define prepare_dataset(dataset, prepared_texts=None, **kwargs) for reusable context.
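For orientation, a hand-written module that satisfies this contract might look like the sketch below. The difflib-based scoring is only illustrative (it is not the generated template's actual body), and it assumes the value returned by prepare_dataset is what score_pair later receives as dataset_context.

import difflib


def prepare_dataset(dataset, prepared_texts=None, **kwargs):
    # Optional hook: compute anything expensive once and reuse it per pair.
    # Assumption: the returned mapping is what score_pair later receives as
    # dataset_context.
    return {"prepared_texts": prepared_texts}


def score_pair(code_a, code_b, dataset_context=None, row=None, **kwargs):
    # Finite score where larger means more similar; row and dataset_context
    # are accepted but not required for basic scoring.
    return difflib.SequenceMatcher(None, code_a, code_b).ratio()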
Custom Algorithm Checklist
- Return a finite numeric score where larger means more similar.
- Accept row and dataset_context when useful, but do not require them for basic scoring.
- Keep options JSON-serializable so CLI, suite, and reproducibility outputs can store them.
- Avoid network calls and mutable global state during scoring.
- Add tests for direct pair scoring and dataset evaluation when the method is intended for benchmarks (a pair-scoring sketch follows this checklist).
- Keep private code, credentials, and real datasets out of the repository.
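A minimal offline test for the pair-scoring item, assuming the sketch above is saved as my_algorithm.py next to the test:

from math import isfinite

from my_algorithm import score_pair


def test_identical_code_scores_at_least_as_high_as_different_code():
    snippet = "def add(a, b):\n    return a + b\n"
    other = "def mul(a, b):\n    return a * b\n"
    same = score_pair(snippet, snippet)
    different = score_pair(snippet, other)
    # Scores must be finite numbers, and identical code should not score
    # lower than clearly different code.
    assert isfinite(same) and isfinite(different)
    assert same >= different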
Built-In Method Checklist
When promoting a method into Matheel:
- Add the implementation to the smallest relevant module.
- Expose public helpers only when they are useful outside the implementation.
- Add deterministic unit tests with tiny synthetic code snippets.
- Document parameters, dependency requirements, and score range.
- Add the method to comparison or Gradio controls when it is user-facing.
- Add an offline leaderboard preset when the method should be benchmarked by default.
Offline Leaderboard Presets
Leaderboard presets make a method available in manifests, Python, and the Gradio Ready Leaderboard tab.
from matheel import register_leaderboard_algorithm_preset

register_leaderboard_algorithm_preset(
    "My Method",
    {
        "description": "Short benchmark-facing description.",
        "similarity_options": {
            "feature_weights": {"levenshtein": 1.0},
            "code_metric": "codebleu",
            "code_metric_weight": 0.0,
        },
    },
    overwrite=True,
)
Use a preset in a leaderboard manifest:
{
  "datasets": [{"name": "pairs", "task": "pair", "path": "./normalized_pairs"}],
  "algorithms": ["My Method"]
}
Built-in presets should be added with tests that verify available_leaderboard_algorithm_presets(), manifest normalization, and at least one tiny leaderboard run.
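A sketch of the preset-listing check, assuming available_leaderboard_algorithm_presets is importable from matheel alongside register_leaderboard_algorithm_preset and returns the preset names (or a mapping keyed by name); the manifest-normalization and tiny-run tests are not shown here:

from matheel import (
    available_leaderboard_algorithm_presets,
    register_leaderboard_algorithm_preset,
)


def test_new_preset_is_listed():
    register_leaderboard_algorithm_preset(
        "My Method",
        {
            "description": "Test preset.",
            "similarity_options": {"feature_weights": {"levenshtein": 1.0}},
        },
        overwrite=True,
    )
    # Membership works whether the helper returns a list of names or a
    # mapping keyed by preset name.
    assert "My Method" in available_leaderboard_algorithm_presets()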
Pull Request Expectations
- Include docs and examples for new user-facing behavior.
- Include tests that run offline.
- Keep optional heavy dependencies import-late or behind extras (a lazy-import sketch follows the command list below).
- Run:
uv run python -m pytest -q
uv run python -m ruff check .
uv run python -m mkdocs build --strict
git diff --check
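A common lazy-import pattern for the optional-dependency item; the codebleu module name mirrors the preset example above, and the extra name and error message are illustrative:

def _require_codebleu():
    # Import the optional dependency only when the metric is actually used,
    # so importing matheel still works when the extra is not installed.
    try:
        import codebleu  # illustrative optional dependency
    except ImportError as exc:
        raise ImportError(
            "codebleu is not installed; install the optional extra to use this metric."
        ) from exc
    return codebleu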