Overview

EXACT-OM predicts correspondences between source and target ontology entities. It combines lexical label matching, ontology structure, auxiliary attributes, optional LLM arbitration, and a global candidate selector. A key runtime property is that every scored pair can carry an inspectable explanation: each final score is broken down into lexical, structural, and LLM contributions, with the structural evidence further split into hierarchy, similarity, difference, and attribute channels.

Global mode

Use this when you want EXACT-OM to generate candidates, score them, apply threshold and cardinality filters, and write a mapping TSV.

Local mode

Use this when you already have a candidate TSV. Pass -c and the system keeps the full ranked target list for each source.

Audit and review

Enable summary and JSON outputs to inspect scores, channel importances, selected triples, rationales, and run statistics.

Install

Install from the repository root. Use Python 3.10 and make sure Java is available on the system path before running ontology-backed workflows.

poetry install
poetry run exact --help
  • Required: Python 3.10, Poetry, Java/JDK or JRE.
  • Recommended: CUDA-capable GPU for large biomedical runs.
  • Visualizer frontend: Node/npm only when rebuilding explanations_visualizer.

The package defines these entry points: exact, bioml-eval, exact-llm-debug, exact-user-study, and exact-study-viz.

Inputs

Every alignment run needs a source ontology, a target ontology, an output directory, and a YAML config.

  • Source ontology (required, .owl): source entity labels, annotations, hierarchy, and graph evidence.
  • Target ontology (required, .owl): target entity labels, annotations, hierarchy, and graph evidence.
  • Training reference (optional, TSV with SrcEntity, TgtEntity, Score): supervised calibration for the global candidate selector.
  • Full reference (optional, TSV with SrcEntity, TgtEntity, Score): evaluation and reference-aware analysis.
  • Candidate file (optional, TSV with SrcEntity, TgtEntity, TgtCandidates): local ranking mode; TgtCandidates stores the target candidate list for the source.

The helper data/get_data.py downloads Bio-ML data when the data directory is empty and can also build conference-dataset folders. The repository keeps large dataset folders ignored, so expect to provide benchmark data locally.
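To work with a candidate file programmatically, a minimal sketch using only the standard library is shown below. The helper name `read_candidates` is hypothetical, and the sketch assumes TgtCandidates is serialized as a JSON array of target IRIs; adjust the decoder if your files use a different encoding.

```python
import csv
import json

def read_candidates(path):
    """Parse a candidate TSV into {source IRI: [target IRI, ...]}.

    Assumes the TgtCandidates column holds a JSON-encoded list of
    target entity IRIs; other encodings need a different decoder.
    """
    candidates = {}
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            candidates[row["SrcEntity"]] = json.loads(row["TgtCandidates"])
    return candidates
```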

Run Alignment

Global alignment

Omit -c to let EXACT-OM build the candidate set and write a filtered global alignment.

poetry run exact \
  -s data/ncit-doid/ncit.owl \
  -t data/ncit-doid/doid.owl \
  -o exp/runs/ncit_doid/global \
  -y exact/default_config.yaml \
  -r data/ncit-doid/train.tsv \
  -f data/ncit-doid/test.tsv \
  -l -e -m 60G -d 0

Local candidate ranking

Pass -c when you want to score and rank an existing candidate set. All candidates remain in the ranking file.

poetry run exact \
  -s data/ncit-doid/ncit.owl \
  -t data/ncit-doid/doid.owl \
  -o exp/runs/ncit_doid/local \
  -y exact/default_config.yaml \
  -f data/ncit-doid/test.tsv \
  -c data/ncit-doid/test.cands.tsv \
  -l -e -m 60G -d 0

CLI options

  • -s, --source_ontology_file: path to the source OWL ontology.
  • -t, --target_ontology_file: path to the target OWL ontology.
  • -o, --output_dir: run output directory; created when missing.
  • -y, --config_file: YAML runtime configuration; defaults to built-in settings when omitted.
  • -r, --training_reference_file: optional training mappings for selector calibration.
  • -f, --full_reference_file: optional reference mappings for evaluation and analysis.
  • -c, --candidates_file: candidate restriction file; enables local ranking mode.
  • -e, --run_eval: run evaluation after writing the alignment.
  • -l, --save_logs: write exact.log in the run directory.
  • -m, --jvm_heap_size: JVM heap size; a bare number is interpreted as GB.
  • -d, --device: CUDA device id; omit for CPU.

YAML runner

For repeatable jobs, use a run-config YAML with tools/run_exact_job.py.

poetry run python tools/run_exact_job.py \
  --run-config exp/runs/ncit_doid/run.yaml \
  --dry-run

The same helper can submit through Slurm with --sbatch-script deploy/sbatch/exact_single_run.sh.

Configuration

The default config is exact/default_config.yaml. Copy it into the run folder and edit only the blocks that matter for the run. Small overrides are merged with defaults, so you do not need to repeat every parameter.
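The override behavior can be pictured as a recursive dictionary overlay. This is an illustrative sketch (the hypothetical `merge_config` below is not the package's own loader): nested blocks are merged key by key, and any scalar in the override replaces the default.

```python
def merge_config(defaults, overrides):
    """Recursively overlay a partial override dict onto defaults.

    Nested dicts are merged key by key; any other override value
    replaces the corresponding default wholesale.
    """
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged
```

For example, overriding only `alignment_params.threshold` leaves the default `cardinality` in place.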

Alignment decisions

alignment_params:
  threshold: 0.7
  cardinality: 1
  target_cardinality: 1
  save_json: true

threshold filters global alignments. In local mode it labels rationales as positive or negative while preserving the ranking.
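A simplified picture of the global filter, assuming greedy one-to-one selection in descending score order (the hypothetical `filter_global` below is a sketch; the actual selector logic may differ):

```python
def filter_global(pairs, threshold=0.7, cardinality=1, target_cardinality=1):
    """Greedily keep (src, tgt, score) pairs above threshold,
    honoring per-source and per-target cardinality caps,
    highest scores first."""
    kept, src_used, tgt_used = [], {}, {}
    for src, tgt, score in sorted(pairs, key=lambda p: p[2], reverse=True):
        if score < threshold:
            continue  # below alignment_params.threshold
        if src_used.get(src, 0) >= cardinality:
            continue  # source already has its quota of mappings
        if tgt_used.get(tgt, 0) >= target_cardinality:
            continue  # target already has its quota of mappings
        kept.append((src, tgt, score))
        src_used[src] = src_used.get(src, 0) + 1
        tgt_used[tgt] = tgt_used.get(tgt, 0) + 1
    return kept
```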

Candidate generation

candidates_params:
  retrieval_strategy: hybrid
  top_k: 20
  lexical_encoder_name: sentence-transformers/all-MiniLM-L6-v2

Global mode uses these settings to build source-local target candidates before scoring.
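The lexical half of hybrid retrieval can be illustrated with character-trigram overlap; the sketch below (hypothetical `lexical_top_k`) shows only that half, while the real retriever also blends dense embedding similarity from the configured encoder.

```python
def char_trigrams(text):
    """Lowercased character trigram set of a label."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}

def lexical_top_k(source_label, target_labels, top_k=20):
    """Rank target labels by character-trigram Jaccard similarity.

    Illustrative stand-in for the lexical side of hybrid candidate
    retrieval; not the package's actual retriever.
    """
    src = char_trigrams(source_label)
    scored = []
    for label in target_labels:
        tgt = char_trigrams(label)
        union = src | tgt
        scored.append((label, len(src & tgt) / len(union) if union else 0.0))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```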

Disable LLM use

model:
  params:
    use_llm: false
    generate_llm_rationales: false

This keeps lexical and structural scoring active while avoiding hosted or local LLM calls.

Use OpenRouter

export OPENROUTER_API_KEY=...

llm_routing:
  decision_profile: openrouter_gpt4o_mini
  rationale_profile: openrouter_gpt4o_mini

Hosted decision scoring is probe-gated. If chat logprobs are unavailable, the runtime falls back to the configured local decision profile.

The most common first-pass tuning knobs are candidates_params.top_k, alignment_params.threshold, dataset_params.n_hops, dataset_params.hierarchy_max_depth, and model.params.tau_LLM.

System Pipeline

Figure: EXACT-OM pipeline diagram. Default pipeline: candidate retrieval, pair-adaptive evidence channels, optional LLM arbitration, the global selector, and auditable outputs.
  1. Exact lexical prefilter. Normalized labels and synonyms are matched first. Exact matches can be removed from downstream scoring and reinserted later.
  2. Candidate generation. In global mode, a hybrid retriever combines dense label embeddings with lexical token and character similarity.
  3. Pair-adaptive scoring. Each source-target pair receives lexical, hierarchy, similarity, difference, and attribute scores with quality estimates.
  4. Adaptive fusion. Strong and reliable channels receive more weight. Empty channels become neutral and do not dilute the result.
  5. LLM arbitration. Ambiguous or internally disagreeing pairs can receive a pair brief and a binary LLM decision probability.
  6. Global selection. For generated candidates, the optional CandidateSetSelector compares each source's candidate set jointly and can abstain with NO_MATCH.
  7. Audit export. Outputs include mapping files, flattened metrics, run stats, plots, and optional full explanation JSON.
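Step 4 can be pictured as an importance-weighted average in which empty channels drop out rather than pulling the score down. The hypothetical `fuse` below is a hedged sketch, not the exact fusion rule:

```python
def fuse(channels):
    """Combine per-channel (score, importance) pairs.

    channels maps channel name -> (score, importance); a channel
    with no evidence is passed as None and is skipped, so it stays
    neutral instead of diluting the fused score.
    """
    weighted, total_weight = 0.0, 0.0
    for value in channels.values():
        if value is None:  # empty channel: neutral, no dilution
            continue
        score, importance = value
        weighted += importance * score
        total_weight += importance
    # with no evidence at all, fall back to an uninformative 0.5
    return weighted / total_weight if total_weight else 0.5
```

For example, a strong lexical channel and a weaker hierarchy channel fuse to their importance-weighted mean even when the difference channel carries no evidence.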

Evidence channels

  • Lexical: best label or synonym similarity. Inspect s_label, I_label, selected labels.
  • Hierarchy: aligned parents or configured hierarchy families. Inspect s_hier, I_hier, selected hierarchy triples.
  • Similarity: supported non-hierarchical object-property triples. Inspect s_sim, I_sim, selected similarity triples.
  • Difference: informative triples on one side without support on the other. Inspect s_diff, I_diff, unsupported evidence.
  • Attribute: definitions, synonyms, xrefs, and projected literals that support the pair. Inspect s_attr, I_attr, selected attributes.
  • LLM: decision probability on ambiguous or disagreeing evidence. Inspect p_llm, I_llm, generated rationale.

Outputs

A successful run writes a structured output tree under the directory passed with -o.

run-dir/
  exact.log
  times.txt
  dataset/
    dataset.csv
    dataset.meta.json
    feature_metrics.csv
    plots/
  model/
    alignment/
      src2tgt.maps_global.tsv
      src2tgt.maps_local.tsv
      default/
        summary_metrics.csv
        run_stats.json
        run_stats.csv
        full_explanations.json
        llm_calibration.json
    checkpoints/
    cache/
    plots/
  • src2tgt.maps_global.tsv (global mode): saved mappings with SrcEntity, TgtEntity, and Score.
  • src2tgt.maps_local.tsv (local mode): per-source ranked target candidates.
  • summary_metrics.csv (alignment_params.save_csv: true): flattened numeric scores, weights, importances, selector fields, and labels.
  • full_explanations.json (alignment_params.save_json: true): full candidate-level explanation records with channel evidence and rationales.
  • run_stats.json (when the summary CSV is enabled): run-level aggregates, LLM usage, review-band fraction, and score distributions.
  • times.txt (always, after run stages complete): stage-level runtime measurements.
Figure: example explanation table with score, channel contribution, and evidence fields.
Explanation records are intended for reviewer-facing inspection, not only aggregate metrics.

Evaluation

Use -e during an alignment run, or evaluate an existing mapping file with bioml-eval.

poetry run bioml-eval \
  --alignment_file exp/runs/ncit_doid/global/model/alignment/src2tgt.maps_global.tsv \
  --output_dir exp/runs/ncit_doid/global \
  --full_reference_file data/ncit-doid/test.tsv \
  --source_ontology_file data/ncit-doid/ncit.owl \
  --target_ontology_file data/ncit-doid/doid.owl \
  --save_logs -m 32G

Global evaluation reports precision, recall, and F1 against the reference alignment. Local candidate evaluation uses the supplied candidate file and reports ranking metrics such as MRR and Hits@K.
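The ranking metrics can be computed from per-source candidate lists and a reference mapping; the hypothetical `ranking_metrics` below is a minimal sketch, assuming each source has exactly one reference target:

```python
def ranking_metrics(ranked_candidates, gold, ks=(1, 5, 10)):
    """Compute MRR and Hits@K over per-source ranked candidate lists.

    ranked_candidates maps source -> ordered target list;
    gold maps source -> the single reference target.
    """
    reciprocal_ranks = []
    hits = {k: 0 for k in ks}
    for src, targets in ranked_candidates.items():
        try:
            rank = targets.index(gold[src]) + 1  # 1-based rank
        except ValueError:
            reciprocal_ranks.append(0.0)  # gold target not retrieved
            continue
        reciprocal_ranks.append(1.0 / rank)
        for k in ks:
            if rank <= k:
                hits[k] += 1
    n = len(ranked_candidates)
    return sum(reciprocal_ranks) / n, {k: hits[k] / n for k in ks}
```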

Additional analysis helpers include tools/analyze_alignment_run.py, tools/aggregate_results.py, tools/run_candidate_recall_experiment.py, and tools/run_cardinality_threshold_tests.py.

User Study Analysis

The exact-user-study command builds reusable artifacts from an existing local ranking run. The run must contain model/alignment/src2tgt.maps_local.tsv and model/alignment/default/full_explanations.json.

poetry run exact-user-study \
  --run-dir exp/runs/omim_ordo/local \
  --top-k 5 \
  --per-rank 4 \
  --shortlist-per-rank 8 \
  --generate-rationales \
  --jvm-heap-size 32G

Artifacts are written to <run-dir>/analysis/user_study unless --output-dir is set.

  • pair_metrics.csv and source_panels.csv: candidate and source-panel metrics.
  • study_shortlist.csv and study_selection_review.csv: balanced selection workflow files.
  • study_selected_records_with_rationales.json: final selected cases for the visualizer.
  • study_mapping.json: compact payload served by the study visualizer.
  • failure_taxonomy.csv and user_study_analysis.ipynb: failure-analysis outputs.

Study Visualizer

The visualizer serves a fixed study run through FastAPI and a static React/Cytoscape frontend. It is designed for read-only inspection and LimeSurvey iframe embedding.

cd explanations_visualizer
npm install
npm run build

cd ..
poetry run python -m study_visualizer_runtime.cli \
  --run-dir exp/runs/omim_ordo/local \
  --analysis-dir exp/runs/omim_ordo/local/analysis/user_study \
  --port 8000

Open a specific source panel with:

http://localhost:8000/?source=<exact_source_iri>

Render bundle deployment

For a lightweight hosted visualizer, export a bundle and deploy the Render assets in deploy/render.

poetry run python tools/prepare_study_visualizer_bundle.py \
  --run-dir exp/runs/omim_ordo/local \
  --bundle-dir deploy/render/study_bundles/omim-ordo \
  --overwrite

The bundle contains the config, study mapping, selected records, ontology cache, and manifest needed by the runtime service.

Python API

Use the API wrappers when integrating EXACT-OM into a script or notebook.

from exact import AlignmentRunner

runner = AlignmentRunner(
    source_ontology_file="data/ncit-doid/ncit.owl",
    target_ontology_file="data/ncit-doid/doid.owl",
    output_dir="exp/runs/ncit_doid/api",
    training_reference_file="data/ncit-doid/train.tsv",
    full_reference_file="data/ncit-doid/test.tsv",
    config_file="exact/default_config.yaml",
    save_logs=True,
    jvm_heap_size="60G",
    run_eval=True,
    device=0,
)
runner.run()

Evaluation is available through exact.EvalutionRunner. The class name is intentionally spelled as it appears in the package API.

Operations

Caching

use_file_cache: true reuses dataset and model caches. Inference checkpoints are enabled by default and resume from compatible runs.

Memory

Use -m 60G or a larger heap for large OWL files. Java is needed for ontology loading, reasoning, and visualizer ontology expansion.

Devices

Pass -d 0 for GPU 0. If CUDA is unavailable, the alignment action logs a warning and uses CPU.

Slurm

Use the scripts in deploy/sbatch with the YAML runner for reproducible cluster jobs.

Troubleshooting

  • JVM fails to initialize. Likely cause: Java is missing or JAVA_HOME does not point to a usable runtime. Action: install a JDK/JRE and rerun with an explicit heap size such as -m 32G.
  • Hosted LLM falls back locally. Likely cause: missing OpenRouter key, or the hosted decision profile lacks usable chat logprobs. Action: set OPENROUTER_API_KEY, choose a compatible profile, or disable LLM use.
  • No full_explanations.json. Likely cause: JSON export is disabled. Action: set alignment_params.save_json: true before the run.
  • User-study analysis cannot start. Likely cause: the input run is not a local ranking run or lacks full explanations. Action: run alignment with -c and enable save_json.
  • Very slow preprocessing. Likely cause: large ontology reasoning, high candidate top_k, or wide structural evidence pools. Action: increase the heap, reuse the cache, lower top_k, or reduce structural caps for exploratory runs.