API Reference¶

Top-level imports¶

shacl_bridges — N-to-m semantic mapping via SHACL shapes with SPARQL CONSTRUCT rules.

Typical usage:

from shacl_bridges.io.yaml_reader import load_mapping
from shacl_bridges.core.graph import select_root_class, check_connectivity
from shacl_bridges.core.shacl import generate_shacl
from shacl_bridges.core.diff import run_bridge_from_files, save_result

mapping = load_mapping("bridge.yaml")
root = select_root_class(mapping.source_pattern.triples, mapping.root_class())

issues = check_connectivity(mapping.source_pattern.triples, root)
if issues:
    raise ValueError(f"Disconnected nodes for root {root!r}: {issues}")

shacl_ttl = generate_shacl(mapping, root)
with open("bridge_shape.ttl", "w") as f:
    f.write(shacl_ttl)

result = run_bridge_from_files("data.ttl", "bridge_shape.ttl")
save_result(result, "expanded.ttl", "diff.ttl")

Or use the CLI:

shacl-bridges validate  bridge.yaml
shacl-bridges diagram   bridge.yaml -o diagram.mmd
shacl-bridges generate  bridge.yaml -o bridge_shape.ttl
shacl-bridges run       bridge.yaml data.ttl

`load_mapping(path)` ¶

Load a `BridgeMapping` from a YAML file.

Parameters:

Name	Type	Description	Default
`path`	`str \| Path`	Path to the `bridge.yaml` file.	required

Returns:

Type	Description
`BridgeMapping`	Parsed and structurally validated `BridgeMapping`.

Raises:

Type	Description
`ValueError`	If required keys are missing or triples are malformed.
`FileNotFoundError`	If `path` does not exist.

`select_root_class(triples, explicit_root=None)` ¶

Return the CURIE of the class that should anchor the SHACL shape.

Selection order:

1. `explicit_root`, if provided (it comes from `source_pattern.root` in the YAML).
2. Otherwise, the node with the highest closeness centrality in the undirected view of the source-pattern graph; ties are broken by out-degree in the directed view.

Parameters:

Name	Type	Description	Default
`triples`	`list[Triple]`	Source-pattern triples (`source_pattern.triples`).	required
`explicit_root`	`str \| None`	CURIE string supplied by the user, or None.	`None`

Returns:

Type	Description
`str`	CURIE string of the selected root class.

Raises:

Type	Description
`ValueError`	If triples is empty.

`check_connectivity(triples, root)` ¶

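Return a list of nodes that are not connected to `root` in the source-pattern graph. Connectivity is checked on the undirected view of the graph, so edge direction is ignored.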

A non-empty result means the WHERE clause would contain disconnected sub-patterns, causing the SPARQL CONSTRUCT to over-match.

Parameters:

Name	Type	Description	Default
`triples`	`list[Triple]`	Source-pattern triples.	required
`root`	`str`	CURIE of the chosen root class.	required

Returns:

Type	Description
`list[str]`	Sorted list of unreachable node CURIEs (empty if fully connected).

`generate_shacl(mapping, root_class, shape_name='shapes:BridgeShape')` ¶

Generate a complete SHACL Turtle document for the given mapping.

Parameters:

Name	Type	Description	Default
`mapping`	`BridgeMapping`	Loaded `BridgeMapping` (from `shacl_bridges.io.yaml_reader`).	required
`root_class`	`str`	CURIE of the class that the shape targets (`sh:targetClass`).	required
`shape_name`	`str`	Local name for the generated `sh:NodeShape`.	`'shapes:BridgeShape'`

Returns:

Type	Description
`str`	Full Turtle string ready to be written to a `.ttl` file.

`run_bridge(data_graph, shacl_graph, inference='rdfs')` ¶

Apply a SHACL bridge shape to data_graph and return the result.

Parameters:

Name	Type	Description	Default
`data_graph`	`Graph`	The instance data to transform.	required
`shacl_graph`	`Graph`	The generated SHACL shape (containing the SPARQLRule).	required
`inference`	`str`	Reasoner to apply before validation. `"rdfs"` is the default and sufficient for most harmonization needs. Pass `"none"` to disable inference entirely.	`'rdfs'`

Returns:

Type	Description
`BridgeResult`	A `BridgeResult` with expanded graph, diff, and report.

`run_bridge_from_files(data_path, shacl_path, inference='rdfs')` ¶

Convenience wrapper: load graphs from file paths, then call `run_bridge`.

Parameters:

Name	Type	Description	Default
`data_path`	`str \| Path`	Path to the instance data Turtle file.	required
`shacl_path`	`str \| Path`	Path to the SHACL shape Turtle file.	required
`inference`	`str`	Reasoner to apply.	`'rdfs'`

Returns:

Type	Description
`BridgeResult`	A `BridgeResult`.

`save_result(result, expanded_path, diff_path)` ¶

Serialize expanded and diff graphs to Turtle files.

Parameters:

Name	Type	Description	Default
`result`	`BridgeResult`	Output of `run_bridge`.	required
`expanded_path`	`str \| Path`	Destination path for the expanded graph.	required
`diff_path`	`str \| Path`	Destination path for the diff graph.	required

`harmonize_to_turtle(source, destination=None, fmt=None)` ¶

Load source in any RDF serialization and re-serialize as Turtle.

This normalizes syntax differences between tools (Protégé RDF/XML, robot OWL/XML, hand-written Turtle, etc.) before the bridge pipeline runs. No inference is applied.

Parameters:

Name	Type	Description	Default
`source`	`str \| Path`	Input RDF file (any serialization).	required
`destination`	`str \| Path \| None`	Output `.ttl` path. When None the file is written alongside source with a `.ttl` suffix replacing the original extension.	`None`
`fmt`	`str \| None`	Force an input format string instead of auto-detecting.	`None`

Returns:

Type	Description
`Graph`	The loaded `rdflib.Graph` (the in-memory representation after round-tripping through rdflib's parser/serializer).

shacl_bridges.io.yaml_reader¶

`shacl_bridges.io.yaml_reader` ¶

YAML-based bridge mapping loader.

A mapping is defined in a single YAML file (conventionally named bridge.yaml) with five top-level sections:

metadata — title, version, creator, license, default justification
prefixes — namespace declarations (prefix → IRI)
source_pattern — S-P-O triples defining the source design pattern; optional root override
target_pattern — S-P-O triples defining the target design pattern
class_map — explicit alignment between source and target classes

See docs/yaml_format.md for the full schema and annotated example.
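The five sections can be pictured as the Python dict that `yaml.safe_load` would return for a small mapping. The sketch below is illustrative only: the prefixes, class names, and the three-element list form of each triple are assumptions (the authoritative schema is in docs/yaml_format.md), and `missing_sections` is a hypothetical helper, not part of shacl_bridges. It reflects only what `load_mapping` shows: `metadata` is optional and the other four sections are required.

```python
# Hypothetical parsed content of a bridge.yaml file (all names are examples).
bridge = {
    "metadata": {"title": "Agent bridge", "version": "0.1.0"},  # optional
    "prefixes": {
        "ex": "http://example.org/",
        "tgt": "http://example.org/target/",
    },
    "source_pattern": {
        "root": "ex:Process",  # optional root override
        # Assumed triple form: [subject, predicate, object] CURIE strings.
        "triples": [["ex:Process", "ex:hasAgent", "ex:Agent"]],
    },
    "target_pattern": {
        "triples": [["tgt:Activity", "tgt:performedBy", "tgt:Person"]],
    },
    "class_map": [
        {"source": "ex:Process", "target": "tgt:Activity"},
        {"source": "ex:Agent", "target": "tgt:Person"},
    ],
}

# Sections that load_mapping treats as required.
REQUIRED = ("prefixes", "source_pattern", "target_pattern", "class_map")


def missing_sections(data: dict) -> list[str]:
    """Return the required top-level sections absent from *data*."""
    return [key for key in REQUIRED if key not in data]


print(missing_sections(bridge))
print(missing_sections({"prefixes": {}}))
```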

`Triple = tuple[str, str, str]` `module-attribute` ¶

A (subject, predicate, object) triple where all three are CURIE strings.

`BridgeMapping` `dataclass` ¶

All information that defines one bridge mapping.

Load with `load_mapping`. Validate with `shacl_bridges.validate.validate_mapping`.

Source code in shacl_bridges/io/yaml_reader.py

@dataclass
class BridgeMapping:
    """All information that defines one bridge mapping.

    Load with :func:`load_mapping`. Validate with
    :func:`~shacl_bridges.validate.validate_mapping`.
    """

    prefixes: dict[str, str]
    """Namespace declarations: ``{prefix: IRI}``."""

    source_pattern: SourcePattern
    """The source design pattern with its triples and optional root override."""

    target_pattern: TargetPattern
    """The target design pattern triples."""

    class_map: list[ClassMapEntry]
    """Alignment between source and target classes."""

    metadata: Metadata = field(default_factory=Metadata)
    """Title, version, creator, license, default justification."""

    # ------------------------------------------------------------------
    # Convenience accessors
    # ------------------------------------------------------------------

    def prefix_map(self) -> dict[str, str]:
        """Return ``{prefix: namespace}`` dict for SHACL/SPARQL generation."""
        return dict(self.prefixes)

    def root_class(self) -> str | None:
        """Return the explicitly declared root class CURIE, or *None*."""
        return self.source_pattern.root

    def class_alignment(self) -> dict[str, str]:
        """Return ``{source_curie: target_curie}`` for **regular** (non-derived) entries.

        Entries with a ``derived_iri`` represent *new* instances minted at query
        time and are intentionally excluded here — they are accessed via
        :meth:`derived_class_map` and handled separately by the SPARQL builder.

        When the same source class appears in both a regular and a derived entry
        the regular (non-derived) entry wins and sets the primary ``?this``
        target type.
        """
        result: dict[str, str] = {}
        for e in self.class_map:
            if e.derived_iri is None and e.source not in result:
                result[e.source] = e.target
        return result

    def derived_class_map(self) -> list[ClassMapEntry]:
        """Return only the entries that carry a ``derived_iri`` (instance-split targets)."""
        return [e for e in self.class_map if e.derived_iri is not None]

    def source_classes(self) -> set[str]:
        """Return the set of source CURIEs declared in the class map."""
        return {e.source for e in self.class_map}

    def target_classes(self) -> set[str]:
        """Return the set of target CURIEs declared in the class map (regular + derived)."""
        return {e.target for e in self.class_map}

`class_map` `instance-attribute` ¶

Alignment between source and target classes.

`metadata = field(default_factory=Metadata)` `class-attribute` `instance-attribute` ¶

Title, version, creator, license, default justification.

`prefixes` `instance-attribute` ¶

Namespace declarations: {prefix: IRI}.

`source_pattern` `instance-attribute` ¶

The source design pattern with its triples and optional root override.

`target_pattern` `instance-attribute` ¶

The target design pattern triples.

`class_alignment()` ¶

Return {source_curie: target_curie} for regular (non-derived) entries.

Entries with a `derived_iri` represent new instances minted at query time and are intentionally excluded here. They are accessed via `derived_class_map()` and handled separately by the SPARQL builder.

When the same source class appears in both a regular and a derived entry the regular (non-derived) entry wins and sets the primary ?this target type.

Source code in shacl_bridges/io/yaml_reader.py

def class_alignment(self) -> dict[str, str]:
    """Return ``{source_curie: target_curie}`` for **regular** (non-derived) entries.

    Entries with a ``derived_iri`` represent *new* instances minted at query
    time and are intentionally excluded here — they are accessed via
    :meth:`derived_class_map` and handled separately by the SPARQL builder.

    When the same source class appears in both a regular and a derived entry
    the regular (non-derived) entry wins and sets the primary ``?this``
    target type.
    """
    result: dict[str, str] = {}
    for e in self.class_map:
        if e.derived_iri is None and e.source not in result:
            result[e.source] = e.target
    return result
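As a rough illustration of the precedence rule, the sketch below reproduces the same loop on a stand-in dataclass. `Entry` and the example CURIEs are hypothetical; only the loop body is taken from the listing above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical stand-in for ClassMapEntry (only the fields used here)."""
    source: str
    target: str
    derived_iri: Optional[str] = None


def class_alignment(entries: list[Entry]) -> dict[str, str]:
    """Keep the first regular (non-derived) entry per source class."""
    result: dict[str, str] = {}
    for e in entries:
        if e.derived_iri is None and e.source not in result:
            result[e.source] = e.target
    return result


entries = [
    Entry("ex:AgentWithRole", "tgt:Agent"),
    # Derived entry for the same source class: excluded from the alignment.
    Entry("ex:AgentWithRole", "tgt:AgentRole", derived_iri="suffix:_role"),
    Entry("ex:Process", "tgt:Activity"),
]
alignment = class_alignment(entries)
print(alignment)
```

Note that a second regular entry for an already-seen source class would be silently ignored, since only the first one is kept.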

`derived_class_map()` ¶

Return only the entries that carry a derived_iri (instance-split targets).

Source code in shacl_bridges/io/yaml_reader.py

def derived_class_map(self) -> list[ClassMapEntry]:
    """Return only the entries that carry a ``derived_iri`` (instance-split targets)."""
    return [e for e in self.class_map if e.derived_iri is not None]

`prefix_map()` ¶

Return {prefix: namespace} dict for SHACL/SPARQL generation.

Source code in shacl_bridges/io/yaml_reader.py

def prefix_map(self) -> dict[str, str]:
    """Return ``{prefix: namespace}`` dict for SHACL/SPARQL generation."""
    return dict(self.prefixes)

`root_class()` ¶

Return the explicitly declared root class CURIE, or None.

Source code in shacl_bridges/io/yaml_reader.py

def root_class(self) -> str | None:
    """Return the explicitly declared root class CURIE, or *None*."""
    return self.source_pattern.root

`source_classes()` ¶

Return the set of source CURIEs declared in the class map.

Source code in shacl_bridges/io/yaml_reader.py

def source_classes(self) -> set[str]:
    """Return the set of source CURIEs declared in the class map."""
    return {e.source for e in self.class_map}

`target_classes()` ¶

Return the set of target CURIEs declared in the class map (regular + derived).

Source code in shacl_bridges/io/yaml_reader.py

def target_classes(self) -> set[str]:
    """Return the set of target CURIEs declared in the class map (regular + derived)."""
    return {e.target for e in self.class_map}

`ClassMapEntry` `dataclass` ¶

A single source-class → target-class alignment.

For a standard 1-to-1 mapping leave derived_iri as None.

For instance-split mappings — where one source instance must become two target instances (e.g. a conflated "Agent+Role" class splitting into a separate Agent and AgentRole) — add a second entry for the same source class with derived_iri set. The tool will mint a new IRI for the derived instance at query time.

Supported derived_iri forms:

"suffix:<string>": append <string> to the source instance IRI. Example: suffix:_role turns ex:agent1 into ex:agent1_role.

Source code in shacl_bridges/io/yaml_reader.py

@dataclass
class ClassMapEntry:
    """A single source-class → target-class alignment.

    For a standard 1-to-1 mapping leave *derived_iri* as *None*.

    For **instance-split** mappings — where one source instance must become two
    target instances (e.g. a conflated "Agent+Role" class splitting into a
    separate ``Agent`` and ``AgentRole``) — add a second entry for the same
    *source* class with ``derived_iri`` set.  The tool will mint a new IRI for
    the derived instance at query time.

    Supported ``derived_iri`` forms:

    * ``"suffix:<string>"`` — append *<string>* to the source instance IRI.
      Example: ``suffix:_role`` turns ``ex:agent1`` into ``ex:agent1_role``.
    """

    source: str
    """CURIE of the source class (must appear in ``source_pattern.triples``)."""

    target: str
    """CURIE of the target class (must appear in ``target_pattern.triples``)."""

    justification: str | None = None
    """SSSOM-style justification CURIE, e.g. ``semapv:ManualMappingCuration``."""

    comment: str | None = None
    """Human-readable explanation of why this mapping is valid."""

    derived_iri: str | None = None
    """IRI minting rule for instance-split targets (see class docstring).
    When *None* the target instance reuses the source instance IRI (standard case)."""

`comment = None` `class-attribute` `instance-attribute` ¶

Human-readable explanation of why this mapping is valid.

`derived_iri = None` `class-attribute` `instance-attribute` ¶

IRI minting rule for instance-split targets (see class docstring). When None the target instance reuses the source instance IRI (standard case).

`justification = None` `class-attribute` `instance-attribute` ¶

SSSOM-style justification CURIE, e.g. semapv:ManualMappingCuration.

`source` `instance-attribute` ¶

CURIE of the source class (must appear in source_pattern.triples).

`target` `instance-attribute` ¶

CURIE of the target class (must appear in target_pattern.triples).

`Metadata` `dataclass` ¶

Human-readable metadata about the bridge mapping.

Source code in shacl_bridges/io/yaml_reader.py

@dataclass
class Metadata:
    """Human-readable metadata about the bridge mapping."""

    title: str = ""
    version: str = "0.1.0"
    creator: str = ""
    license: str = ""
    mapping_justification: str = "semapv:ManualMappingCuration"
    """Default justification applied to all class-map entries that don't override it."""

`mapping_justification = 'semapv:ManualMappingCuration'` `class-attribute` `instance-attribute` ¶

Default justification applied to all class-map entries that don't override it.

`SourcePattern` `dataclass` ¶

The source design pattern: a list of S-P-O triples and an optional root override.

Source code in shacl_bridges/io/yaml_reader.py

@dataclass
class SourcePattern:
    """The source design pattern: a list of S-P-O triples and an optional root override."""

    triples: list[Triple]
    """All triples (core structural + peripheral validation) of the source pattern."""

    root: str | None = None
    """CURIE of the class that should anchor ``sh:targetClass`` and ``?this``.
    When *None* the root is computed automatically via closeness centrality."""

`root = None` `class-attribute` `instance-attribute` ¶

CURIE of the class that should anchor sh:targetClass and ?this. When None the root is computed automatically via closeness centrality.

`triples` `instance-attribute` ¶

All triples (core structural + peripheral validation) of the source pattern.

`TargetPattern` `dataclass` ¶

The target design pattern: the triples that the bridge CONSTRUCT will produce.

Source code in shacl_bridges/io/yaml_reader.py

@dataclass
class TargetPattern:
    """The target design pattern: the triples that the bridge CONSTRUCT will produce."""

    triples: list[Triple]

`load_mapping(path)` ¶

Load a `BridgeMapping` from a YAML file.

Parameters:

Name	Type	Description	Default
`path`	`str \| Path`	Path to the `bridge.yaml` file.	required

Returns:

Type	Description
`BridgeMapping`	Parsed and structurally validated `BridgeMapping`.

Raises:

Type	Description
`ValueError`	If required keys are missing or triples are malformed.
`FileNotFoundError`	If `path` does not exist.

Source code in shacl_bridges/io/yaml_reader.py

def load_mapping(path: str | Path) -> BridgeMapping:
    """Load a :class:`BridgeMapping` from a YAML file.

    Args:
        path: Path to the ``bridge.yaml`` file.

    Returns:
        Parsed and structurally validated :class:`BridgeMapping`.

    Raises:
        ValueError: If required keys are missing or triples are malformed.
        FileNotFoundError: If *path* does not exist.
    """
    path = Path(path)
    with path.open(encoding="utf-8") as f:
        data = yaml.safe_load(f)

    if not isinstance(data, dict):
        raise ValueError(
            f"{path}: YAML root must be a mapping, got {type(data).__name__}"
        )

    # metadata (optional)
    m = data.get("metadata", {}) or {}
    metadata = Metadata(
        title=str(m.get("title", "")),
        version=str(m.get("version", "0.1.0")),
        creator=str(m.get("creator", "")),
        license=str(m.get("license", "")),
        mapping_justification=str(
            m.get("mapping_justification", "semapv:ManualMappingCuration")
        ),
    )

    # prefixes (required)
    raw_prefixes = _require(data, "prefixes")
    if not isinstance(raw_prefixes, dict):
        raise ValueError("'prefixes' must be a mapping of prefix → namespace IRI")
    prefixes = {str(k): str(v) for k, v in raw_prefixes.items()}

    # source_pattern (required)
    sp_raw = _require(data, "source_pattern")
    source_triples_raw = _require(sp_raw, "triples", "source_pattern")
    source_pattern = SourcePattern(
        root=sp_raw.get("root"),
        triples=[_parse_triple(t) for t in source_triples_raw],
    )

    # target_pattern (required)
    tp_raw = _require(data, "target_pattern")
    target_triples_raw = _require(tp_raw, "triples", "target_pattern")
    target_pattern = TargetPattern(
        triples=[_parse_triple(t) for t in target_triples_raw],
    )

    # class_map (required)
    cm_raw = _require(data, "class_map")
    if not isinstance(cm_raw, list):
        raise ValueError("'class_map' must be a list of {source, target, ...} entries")
    class_map: list[ClassMapEntry] = []
    for i, entry in enumerate(cm_raw):
        if not isinstance(entry, dict):
            raise ValueError(
                f"class_map[{i}] must be a mapping, got {type(entry).__name__}"
            )
        derived_iri_raw = entry.get("derived_iri")
        if derived_iri_raw is not None:
            derived_iri_raw = str(derived_iri_raw)
            if not derived_iri_raw.startswith("suffix:"):
                raise ValueError(
                    f"class_map[{i}].derived_iri: unsupported form {derived_iri_raw!r}."
                    " Currently supported: 'suffix:<string>'"
                )
        class_map.append(ClassMapEntry(
            source=str(_require(entry, "source", f"class_map[{i}]")),
            target=str(_require(entry, "target", f"class_map[{i}]")),
            justification=entry.get("justification"),
            comment=entry.get("comment"),
            derived_iri=derived_iri_raw,
        ))

    return BridgeMapping(
        metadata=metadata,
        prefixes=prefixes,
        source_pattern=source_pattern,
        target_pattern=target_pattern,
        class_map=class_map,
    )

shacl_bridges.io.rdf_utils¶

`shacl_bridges.io.rdf_utils` ¶

Utilities for loading and normalizing RDF graphs.

The primary purpose here is syntax harmonization: converting any RDF serialization (RDF/XML, OWL/XML, Turtle, N-Triples, JSON-LD, etc.) to a canonical Turtle representation before the bridge pipeline runs. This avoids blank-node ID collisions and namespace prefix inconsistencies that arise when mixing serialization styles from tools like Protégé, OWLTools, or robot.

No semantic inference is performed here. That belongs in core/diff.py via pyshacl.

`harmonize_many(sources, output_dir=None, fmt=None)` ¶

Harmonize multiple RDF files to Turtle in one call.

Parameters:

Name	Type	Description	Default
`sources`	`list[str \| Path]`	List of input file paths.	required
`output_dir`	`str \| Path \| None`	Directory for output files. When None each file is written next to its source.	`None`
`fmt`	`str \| None`	Force an input format for all files.	`None`

Returns:

Type	Description
`dict[Path, Graph]`	Mapping from output path to loaded `rdflib.Graph`.

Source code in shacl_bridges/io/rdf_utils.py

def harmonize_many(
    sources: list[str | Path],
    output_dir: str | Path | None = None,
    fmt: str | None = None,
) -> dict[Path, Graph]:
    """Harmonize multiple RDF files to Turtle in one call.

    Args:
        sources: List of input file paths.
        output_dir: Directory for output files. When *None* each file is written
                    next to its source.
        fmt: Force an input format for all files.

    Returns:
        Mapping from output path to loaded :class:`rdflib.Graph`.
    """
    results: dict[Path, Graph] = {}
    for src in sources:
        src = Path(src)
        if output_dir is not None:
            dest = Path(output_dir) / src.with_suffix(".ttl").name
        else:
            dest = None
        g = harmonize_to_turtle(src, destination=dest, fmt=fmt)
        out_path = dest if dest is not None else src.with_suffix(".ttl")
        results[out_path] = g
    return results
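The output-path rule in the listing above can be checked in isolation. The sketch below is a hypothetical helper (`output_path`) that reproduces only the path arithmetic with `pathlib`; it does not read or write any files.

```python
from pathlib import PurePosixPath
from typing import Optional


def output_path(src: str, output_dir: Optional[str] = None) -> PurePosixPath:
    """Compute where the Turtle copy of *src* would be written."""
    source = PurePosixPath(src)
    if output_dir is not None:
        # Only the file name is kept; the source directory is dropped.
        return PurePosixPath(output_dir) / source.with_suffix(".ttl").name
    return source.with_suffix(".ttl")


print(output_path("onto/a.owl"))
print(output_path("onto/a.owl", "out"))
print(output_path("other/a.rdf", "out"))
```

One consequence worth knowing: with `output_dir` set, two sources that share a stem (here `onto/a.owl` and `other/a.rdf`) resolve to the same output path, so the later file would overwrite the earlier one and replace its entry in the returned dict.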

`harmonize_to_turtle(source, destination=None, fmt=None)` ¶

Load source in any RDF serialization and re-serialize as Turtle.

This normalizes syntax differences between tools (Protégé RDF/XML, robot OWL/XML, hand-written Turtle, etc.) before the bridge pipeline runs. No inference is applied.

Parameters:

Name	Type	Description	Default
`source`	`str \| Path`	Input RDF file (any serialization).	required
`destination`	`str \| Path \| None`	Output `.ttl` path. When None the file is written alongside source with a `.ttl` suffix replacing the original extension.	`None`
`fmt`	`str \| None`	Force an input format string instead of auto-detecting.	`None`

Returns:

Type	Description
`Graph`	The loaded `rdflib.Graph` (the in-memory representation after round-tripping through rdflib's parser/serializer).

Source code in shacl_bridges/io/rdf_utils.py

def harmonize_to_turtle(
    source: str | Path,
    destination: str | Path | None = None,
    fmt: str | None = None,
) -> Graph:
    """Load *source* in any RDF serialization and re-serialize as Turtle.

    This normalizes syntax differences between tools (Protégé RDF/XML,
    robot OWL/XML, hand-written Turtle, etc.) before the bridge pipeline runs.
    No inference is applied.

    Args:
        source: Input RDF file (any serialization).
        destination: Output ``.ttl`` path. When *None* the file is written
                     alongside *source* with a ``.ttl`` suffix replacing the
                     original extension.
        fmt: Force an input format string instead of auto-detecting.

    Returns:
        The loaded :class:`rdflib.Graph` (the in-memory representation after
        round-tripping through rdflib's parser/serializer).
    """
    source = Path(source)
    g = load_graph(source, fmt=fmt)

    if destination is None:
        destination = source.with_suffix(".ttl")
    destination = Path(destination)

    g.serialize(destination=str(destination), format="turtle")
    return g

`load_graph(source, fmt=None)` ¶

Load an RDF graph from source, auto-detecting serialization if fmt is None.

Parameters:

Name	Type	Description	Default
`source`	`str \| Path`	File path or URL.	required
`fmt`	`str \| None`	Explicit rdflib format string (e.g. `"xml"`, `"turtle"`). When None the format is guessed from the file extension.	`None`

Returns:

Type	Description
`Graph`	A parsed `rdflib.Graph`.

Source code in shacl_bridges/io/rdf_utils.py

def load_graph(source: str | Path, fmt: str | None = None) -> Graph:
    """Load an RDF graph from *source*, auto-detecting serialization if *fmt* is None.

    Args:
        source: File path or URL.
        fmt: Explicit rdflib format string (e.g. ``"xml"``, ``"turtle"``).
             When *None* the format is guessed from the file extension.

    Returns:
        A parsed :class:`rdflib.Graph`.
    """
    path = Path(source)
    resolved_fmt = fmt or _guess_format(path)
    g = Graph()
    g.parse(str(path), format=resolved_fmt)
    return g
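The private helper `_guess_format` is not shown on this page. As a rough idea of what extension-based guessing could look like, the sketch below maps file suffixes to rdflib parser names. The table, the fallback to `"turtle"`, and the function name `guess_format` are all assumptions made for illustration; the real helper may differ.

```python
from pathlib import PurePosixPath

# Assumed extension table; values are rdflib parser format names.
EXTENSION_FORMATS = {
    ".ttl": "turtle",
    ".rdf": "xml",
    ".owl": "xml",  # assumes the .owl file is serialized as RDF/XML
    ".xml": "xml",
    ".nt": "nt",
    ".n3": "n3",
    ".jsonld": "json-ld",
}


def guess_format(path: str, default: str = "turtle") -> str:
    """Guess an rdflib format string from the file extension (case-insensitive)."""
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_FORMATS.get(suffix, default)


print(guess_format("onto.OWL"))
print(guess_format("data.jsonld"))
```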

shacl_bridges.core.graph¶

`shacl_bridges.core.graph` ¶

Graph analysis utilities used to select the root (target) class for SHACL generation.

The root class becomes sh:targetClass in the generated NodeShape and anchors the ?this variable in the SPARQL CONSTRUCT WHERE clause. Choosing the wrong root produces a disconnected WHERE pattern that over-matches — the most common source of incorrect bridge output.

Two mechanisms are provided:

1. Explicit override via source_pattern.root in the bridge YAML.
2. Automatic selection using closeness centrality on the source-pattern graph, with out-degree as a tiebreaker.

`build_validation_graph(triples)` ¶

Build a directed graph from a list of S-P-O triples.

Each triple (subject, predicate, object) becomes a directed edge subject → object labelled with the predicate.

Parameters:

Name	Type	Description	Default
`triples`	`list[Triple]`	List of `(subject_curie, predicate_curie, object_curie)` tuples.	required

Returns:

Type	Description
`DiGraph`	Directed graph with `predicate` edge attribute.

Source code in shacl_bridges/core/graph.py

def build_validation_graph(triples: list[Triple]) -> nx.DiGraph:
    """Build a directed graph from a list of S-P-O triples.

    Each triple ``(subject, predicate, object)`` becomes a directed edge
    ``subject → object`` labelled with the predicate.

    Args:
        triples: List of ``(subject_curie, predicate_curie, object_curie)`` tuples.

    Returns:
        Directed graph with ``predicate`` edge attribute.
    """
    G = nx.DiGraph()
    for s, p, o in triples:
        G.add_edge(s, o, predicate=p)
    return G

`check_connectivity(triples, root)` ¶

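Return a list of nodes that are not connected to `root` in the source-pattern graph. Connectivity is checked on the undirected view of the graph, so edge direction is ignored.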

A non-empty result means the WHERE clause would contain disconnected sub-patterns, causing the SPARQL CONSTRUCT to over-match.

Parameters:

Name	Type	Description	Default
`triples`	`list[Triple]`	Source-pattern triples.	required
`root`	`str`	CURIE of the chosen root class.	required

Returns:

Type	Description
`list[str]`	Sorted list of unreachable node CURIEs (empty if fully connected).

Source code in shacl_bridges/core/graph.py

def check_connectivity(
    triples: list[Triple],
    root: str,
) -> list[str]:
    """Return a list of nodes NOT reachable from *root* in the source-pattern graph.

    A non-empty result means the WHERE clause would contain disconnected
    sub-patterns, causing the SPARQL CONSTRUCT to over-match.

    Args:
        triples: Source-pattern triples.
        root: CURIE of the chosen root class.

    Returns:
        Sorted list of unreachable node CURIEs (empty if fully connected).
    """
    G = build_validation_graph(triples)
    reachable = nx.node_connected_component(G.to_undirected(), root)
    all_nodes = set(G.nodes())
    return sorted(all_nodes - reachable)
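Because the check runs on the undirected view, a node that only points *at* the root still counts as connected. The dependency-free sketch below shows the same idea with a breadth-first search; `unreachable_nodes` and the example CURIEs are hypothetical, and unlike the networkx call it does not raise when the root is absent from the triples.

```python
from collections import defaultdict, deque


def unreachable_nodes(triples, root):
    """Return sorted nodes not connected to *root*, ignoring edge direction."""
    neighbours = defaultdict(set)
    for s, _p, o in triples:
        neighbours[s].add(o)
        neighbours[o].add(s)

    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return sorted(set(neighbours) - seen)


triples = [
    ("ex:Process", "ex:hasAgent", "ex:Agent"),
    ("ex:Sample", "ex:inputOf", "ex:Process"),  # points at the root
    ("ex:Lab", "ex:locatedIn", "ex:City"),      # separate component
]
print(unreachable_nodes(triples, "ex:Process"))
```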

`longest_path_length(G)` ¶

Return the number of edges on the longest path in a DAG.

Parameters:

Name	Type	Description	Default
`G`	`DiGraph`	A directed acyclic graph.	required

Returns:

Type	Description
`int`	Number of edges (0 for a single-node graph with no edges).

Raises:

Type	Description
`ValueError`	If G is not a DAG.

Source code in shacl_bridges/core/graph.py

def longest_path_length(G: nx.DiGraph) -> int:
    """Return the number of *edges* on the longest path in a DAG.

    Args:
        G: A directed acyclic graph.

    Returns:
        Number of edges (0 for a single-node graph with no edges).

    Raises:
        ValueError: If *G* is not a DAG.
    """
    if not nx.is_directed_acyclic_graph(G):
        raise ValueError("Graph is not a DAG; longest path is undefined.")
    path = nx.dag_longest_path(G)
    return max(0, len(path) - 1)
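For readers without networkx at hand, the same quantity can be computed with the standard library. This is a sketch under the assumption that the graph is given as a list of `(u, v)` edges; `longest_path_edges` is a hypothetical name. It relies on `graphlib.TopologicalSorter`, whose `CycleError` is a subclass of `ValueError`, so cycles surface as the same exception type as in the function above.

```python
from graphlib import TopologicalSorter


def longest_path_edges(edges):
    """Return the edge count of the longest path; raise ValueError on cycles."""
    predecessors = {}
    for u, v in edges:
        predecessors.setdefault(u, set())
        predecessors.setdefault(v, set()).add(u)

    # static_order() yields every node after all of its predecessors and
    # raises graphlib.CycleError (a ValueError subclass) if a cycle exists.
    order = list(TopologicalSorter(predecessors).static_order())

    depth = {}
    for node in order:
        depth[node] = max((depth[p] + 1 for p in predecessors[node]), default=0)
    return max(depth.values(), default=0)


print(longest_path_edges([("a", "b"), ("b", "c"), ("a", "c")]))
```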

`select_root_class(triples, explicit_root=None)` ¶

Return the CURIE of the class that should anchor the SHACL shape.

Selection order:

1. `explicit_root`, if provided (it comes from `source_pattern.root` in the YAML).
2. Otherwise, the node with the highest closeness centrality in the undirected view of the source-pattern graph; ties are broken by out-degree in the directed view.

Parameters:

Name	Type	Description	Default
`triples`	`list[Triple]`	Source-pattern triples (`source_pattern.triples`).	required
`explicit_root`	`str \| None`	CURIE string supplied by the user, or None.	`None`

Returns:

Type	Description
`str`	CURIE string of the selected root class.

Raises:

Type	Description
`ValueError`	If triples is empty.

Source code in shacl_bridges/core/graph.py

def select_root_class(
    triples: list[Triple],
    explicit_root: str | None = None,
) -> str:
    """Return the CURIE of the class that should anchor the SHACL shape.

    Selection order:
    1. *explicit_root* if provided (comes from ``source_pattern.root`` in the YAML).
    2. The node with the highest closeness centrality in the undirected view of
       the source-pattern graph; ties broken by out-degree in the directed view.

    Args:
        triples: Source-pattern triples (``source_pattern.triples``).
        explicit_root: CURIE string supplied by the user, or *None*.

    Returns:
        CURIE string of the selected root class.

    Raises:
        ValueError: If *triples* is empty.
    """
    if explicit_root:
        return explicit_root

    G_directed = build_validation_graph(triples)

    if G_directed.number_of_nodes() == 0:
        raise ValueError("source_pattern has no triples; cannot select a root class.")

    # Undirected copy of the pattern graph. nx.Graph stores a single edge per
    # node pair, so the second add_edge call overwrites the first and every
    # edge ends up with weight 2; edge direction does not affect the ranking.
    G_undirected = nx.Graph()
    for u, v in G_directed.edges():
        G_undirected.add_edge(u, v, weight=1)
        G_undirected.add_edge(v, u, weight=2)

    centrality = nx.closeness_centrality(G_undirected, distance="weight")
    max_val = max(centrality.values())
    candidates = [n for n, c in centrality.items() if c == max_val]

    if len(candidates) == 1:
        return candidates[0]

    # Tiebreak: highest out-degree in the directed graph
    return max(candidates, key=lambda n: G_directed.out_degree(n))
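The selection order can be traced by hand on a small pattern. The sketch below is a simplified, dependency-free version: it assumes a connected pattern graph and unit edge weights (a uniform weight gives the same ranking), and `closeness` and `pick_root` are hypothetical names rather than library functions.

```python
from collections import defaultdict, deque


def closeness(neighbours, start):
    """Closeness in a connected undirected graph: (n - 1) / sum of distances."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def pick_root(triples, explicit_root=None):
    """Explicit root first, then closeness centrality, then out-degree."""
    if explicit_root:
        return explicit_root
    if not triples:
        raise ValueError("no triples; cannot select a root class")

    neighbours, successors = defaultdict(set), defaultdict(set)
    for s, _p, o in triples:
        neighbours[s].add(o)
        neighbours[o].add(s)
        successors[s].add(o)  # directed view, used only for the tiebreak

    scores = {n: closeness(neighbours, n) for n in list(neighbours)}
    best = max(scores.values())
    candidates = [n for n, c in scores.items() if c == best]
    return max(candidates, key=lambda n: len(successors[n]))


chain = [("ex:A", "ex:p", "ex:B"), ("ex:B", "ex:q", "ex:C")]
print(pick_root(chain))                     # middle node is closest to all others
print(pick_root([("ex:A", "ex:p", "ex:B")]))  # tie, broken by out-degree
```

In the chain, `ex:B` scores 2/2 while the end nodes score 2/3, so it is chosen outright. In the single-triple case both nodes tie, and `ex:A` wins because it has the only outgoing edge.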

shacl_bridges.core.sparql¶

`shacl_bridges.core.sparql` ¶

SPARQL CONSTRUCT query generation.

Generates the CONSTRUCT { ... } WHERE { ... } block that is embedded inside a sh:SPARQLRule. The WHERE clause is always anchored to ?this (the SHACL convention for the focused node), which guarantees that only subgraphs that are fully connected to the root class are matched.

Variable naming

?this — the focused node (bound to the root class by SHACL)
?var_<suffix> — auto-generated variables for all other nodes, where suffix is a letter sequence (a, b, c, … z, aa, ab, …)
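The suffix sequence described above (a, b, …, z, aa, ab, …) can be produced by enumerating letter tuples of increasing length. The generator below is a sketch of one way to do it; `suffixes` is a hypothetical name and the library's own implementation is not shown on this page.

```python
from itertools import count, islice, product
from string import ascii_lowercase


def suffixes():
    """Yield a, b, ..., z, aa, ab, ... indefinitely."""
    for length in count(1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


# First 28 variable names: ?var_a ... ?var_z, ?var_aa, ?var_ab
names = [f"?var_{s}" for s in islice(suffixes(), 28)]
print(names[0], names[25], names[26], names[27])
```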

`build_sparql_construct(class_alignment, source_triples, target_triples, root_class, prefix_map, derived_entries=None)` ¶

Generate a SPARQL CONSTRUCT query from the bridge mapping.

The WHERE clause:

- Binds `?this` to the root_class (`?this rdf:type <root_class>`).
- Includes only core source triples: those where both the subject and object are source classes present in class_alignment. Peripheral/upper-level triples (e.g. `ex:Process isSome ex:ChemicalInvestigation`) exist only at the TBox level and are omitted from the SPARQL pattern.
- Every core triple produces type assertions for the non-root nodes.
- For each derived_entry a `BIND(IRI(CONCAT(STR(?this), "…")) AS ?derived_X)` line is appended to mint a fresh IRI for the split-off instance.

The CONSTRUCT clause:

- Asserts new `rdf:type` triples for each source → target class mapping.
- Asserts new `rdf:type` triples for each derived entry (instance split).
- Asserts the target-pattern relation triples, with variables resolved via the reverse of class_alignment and the derived variable map.
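
The core-triple filter for the WHERE clause can be sketched in isolation. This is a simplified illustration with a hypothetical helper, `core_where_lines`, and a hand-written variable map; the example CURIEs other than `ex:Process` and `ex:ChemicalInvestigation` are made up.

```python
Triple = tuple[str, str, str]


def core_where_lines(
    class_alignment: dict[str, str],
    source_triples: list[Triple],
    root_class: str,
    var_map: dict[str, str],
) -> list[str]:
    """Build WHERE lines from the triples whose subject and object are both mapped."""
    lines = [f"  ?this rdf:type {root_class} ."]
    for s, p, o in source_triples:
        if s in class_alignment and o in class_alignment:
            lines.append(f"  {var_map[s]} {p} {var_map[o]} .")
            if s != root_class:
                lines.append(f"  {var_map[s]} rdf:type {s} .")
            lines.append(f"  {var_map[o]} rdf:type {o} .")
    # Drop duplicates while keeping the first occurrence of each line.
    return list(dict.fromkeys(lines))


alignment = {"ex:Process": "tgt:Activity", "ex:Sample": "tgt:Material"}
pattern = [
    ("ex:Process", "ex:hasInput", "ex:Sample"),
    ("ex:Process", "ex:isSome", "ex:ChemicalInvestigation"),  # peripheral
]
where = core_where_lines(
    alignment, pattern, "ex:Process",
    {"ex:Process": "?this", "ex:Sample": "?var_b"},
)
print("\n".join(where))
```

The second triple is dropped because `ex:ChemicalInvestigation` has no entry in the alignment, so the resulting pattern only constrains the process and its sample.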

Parameters:

Name	Type	Description	Default
`class_alignment`	`dict[str, str]`	`{source_curie: target_curie}` from the regular (non-derived) class-map entries.	required
`source_triples`	`list[Triple]`	All triples from `source_pattern.triples`.	required
`target_triples`	`list[Triple]`	All triples from `target_pattern.triples`.	required
`root_class`	`str`	CURIE of the root class (the `sh:targetClass`).	required
`prefix_map`	`dict[str, str]`	`{prefix: namespace}` dict (used for context; not embedded here).	required
`derived_entries`	`list[ClassMapEntry] \| None`	Entries with a `derived_iri` field — each describes one instance to be split off from the source and assigned a minted IRI.	`None`

Returns:

Type	Description
`str`	SPARQL CONSTRUCT string without `PREFIX` declarations; those are emitted separately as `sh:prefixes` blocks in the SHACL shape.
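
For a derived entry, the minted-IRI line can be sketched as below. The helper `derived_bind` is hypothetical, and the sketch assumes that `derived_iri` values have the form `suffix:<text>`; the example target `tgt:Result` and the suffix `_result` are made up.

```python
def derived_bind(target_curie: str, derived_iri: str) -> str:
    """Return the BIND line that mints an IRI from ?this plus a suffix."""
    local_name = target_curie.split(":")[-1]
    # Assumption: derived_iri looks like "suffix:_result".
    suffix = derived_iri.removeprefix("suffix:")
    return f'  BIND(IRI(CONCAT(STR(?this), "{suffix}")) AS ?derived_{local_name})'


bind_line = derived_bind("tgt:Result", "suffix:_result")
print(bind_line)
```

With a focus node `<http://example.org/p1>`, such a line would bind `?derived_Result` to `<http://example.org/p1_result>`, so each matched root instance gets its own split-off node.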

Source code in shacl_bridges/core/sparql.py

def build_sparql_construct(
    class_alignment: dict[str, str],
    source_triples: list[Triple],
    target_triples: list[Triple],
    root_class: str,
    prefix_map: dict[str, str],
    derived_entries: list[ClassMapEntry] | None = None,
) -> str:
    """Generate a SPARQL CONSTRUCT query from the bridge mapping.

    The WHERE clause:
    - Binds ``?this`` to the *root_class* (``?this rdf:type <root_class>``)
    - Includes only *core* source triples — those where both the subject and object
      are source classes present in *class_alignment*. Peripheral/upper-level triples
      (e.g. ``ex:Process isSome ex:ChemicalInvestigation``) exist only at the TBox
      level and are omitted from the SPARQL pattern.
    - Every core triple produces type assertions for the non-root nodes.
    - For each *derived_entry* a ``BIND(IRI(CONCAT(STR(?this), "…")) AS ?derived_X)``
      line is appended to mint a fresh IRI for the split-off instance.

    The CONSTRUCT clause:
    - Asserts new ``rdf:type`` triples for each source → target class mapping
    - Asserts new ``rdf:type`` triples for each derived entry (instance split)
    - Asserts the target-pattern relation triples, with variables resolved via the
      reverse of *class_alignment* and the derived variable map

    Args:
        class_alignment: ``{source_curie: target_curie}`` from the **regular**
            (non-derived) class-map entries.
        source_triples: All triples from ``source_pattern.triples``.
        target_triples: All triples from ``target_pattern.triples``.
        root_class: CURIE of the root class (the ``sh:targetClass``).
        prefix_map: ``{prefix: namespace}`` dict (used for context; not embedded here).
        derived_entries: Entries with a ``derived_iri`` field — each describes one
            instance to be split off from the source and assigned a minted IRI.

    Returns:
        SPARQL CONSTRUCT string (without ``PREFIX`` declarations — those are
        emitted separately as ``sh:prefixes`` blocks in the SHACL shape).
    """
    derived_entries = derived_entries or []
    source_classes = set(class_alignment.keys())

    # Collect all source-side entities in a stable order so variable
    # assignment is deterministic. Class-map sources come first so they
    # always get the lowest-suffix variables.
    all_source_entities: list[str] = []
    seen: set[str] = set()

    for src in class_alignment:
        if src not in seen:
            all_source_entities.append(src)
            seen.add(src)
    for s, _p, o in source_triples:
        for v in (s, o):
            if v not in seen:
                all_source_entities.append(v)
                seen.add(v)

    var_map = _generate_variable_names(all_source_entities)
    var_map[root_class] = "?this"  # root class always binds to ?this

    # ------------------------------------------------------------------
    # Derived-entry variable map
    # Maps each derived target CURIE to a fresh SPARQL variable name.
    # Variable name: ?derived_<LocalName> where LocalName is the part
    # after the last ":" in the CURIE.
    # ------------------------------------------------------------------
    derived_var_map: dict[str, str] = {}
    for entry in derived_entries:
        local = entry.target.split(":")[-1]
        derived_var_map[entry.target] = f"?derived_{local}"

    # ------------------------------------------------------------------
    # WHERE clause
    # ------------------------------------------------------------------
    # Only include source triples where BOTH subject and object are source
    # classes in the class_alignment. Upper-level / taxonomic triples are
    # excluded — they apply only at the TBox level, not in instance data.
    where_lines: list[str] = [f"  ?this rdf:type {root_class} ."]

    for s, p, o in source_triples:
        if s in source_classes and o in source_classes:
            s_var = var_map.get(s, f"?{s}")
            o_var = var_map.get(o, f"?{o}")
            where_lines.append(f"  {s_var} {p} {o_var} .")
            if s != root_class:
                where_lines.append(f"  {s_var} rdf:type {s} .")
            where_lines.append(f"  {o_var} rdf:type {o} .")

    where_lines = list(dict.fromkeys(where_lines))  # deduplicate preserving order

    # BIND lines for derived (instance-split) entries come after pattern triples.
    for entry in derived_entries:
        suffix = entry.derived_iri[len("suffix:"):]   # strip "suffix:" prefix
        var_name = derived_var_map[entry.target]
        where_lines.append(
            f'  BIND(IRI(CONCAT(STR(?this), "{suffix}")) AS {var_name})'
        )

    # ------------------------------------------------------------------
    # CONSTRUCT clause
    # ------------------------------------------------------------------
    construct_lines: list[str] = []
    seen_construct: set[str] = set()

    # 1. rdf:type assertions: each source instance is also asserted as its target type.
    # Blank-node targets (``_:label``) are skipped — blank nodes have no fixed rdf:type.
    for src, tgt in class_alignment.items():
        if tgt.startswith("_:"):
            continue
        src_var = var_map.get(src, f"?{src}")
        line = f"  {src_var} rdf:type {tgt} ."
        if line not in seen_construct:
            construct_lines.append(line)
            seen_construct.add(line)

    # 2. rdf:type assertions for derived (instance-split) targets.
    for entry in derived_entries:
        var_name = derived_var_map[entry.target]
        line = f"  {var_name} rdf:type {entry.target} ."
        if line not in seen_construct:
            construct_lines.append(line)
            seen_construct.add(line)

    # 3. Target-pattern relation triples.
    # Resolve target classes back to their source variables via the reverse map,
    # falling back to derived_var_map for split-off nodes.
    # Blank-node labels (``_:label``) pass through verbatim — SPARQL CONSTRUCT
    # creates a fresh blank node for each solution row.
    rev_alignment = {tgt: src for src, tgt in class_alignment.items()}

    def _resolve_target_node(node: str) -> str:
        """Return the SPARQL term for a target-pattern node."""
        if node.startswith("_:"):
            return node  # blank node label — kept as-is in CONSTRUCT
        if node in derived_var_map:
            return derived_var_map[node]  # derived / minted instance
        src = rev_alignment.get(node)
        return var_map.get(src) if src else f"<{node}>"

    for s, p, o in target_triples:
        s_var = _resolve_target_node(s)
        o_var = _resolve_target_node(o)
        line = f"  {s_var} {p} {o_var} ."
        if line not in seen_construct:
            construct_lines.append(line)
            seen_construct.add(line)

    construct_block = "\n".join(construct_lines)
    where_block = "\n".join(where_lines)

    return f"CONSTRUCT {{\n{construct_block}\n}}\nWHERE {{\n{where_block}\n}}"

shacl_bridges.core.shacl¶

`shacl_bridges.core.shacl` ¶

SHACL shape generation.

Produces a complete Turtle-serialized SHACL document containing:

1. A `sh:NodeShape` targeting the root class, with nested `sh:property` constraints that mirror the source design pattern.
2. A `sh:SPARQLRule` embedding the SPARQL CONSTRUCT query from `shacl_bridges.core.sparql`.

The nested property validation ensures that pyshacl only fires the SPARQL rule against nodes that genuinely conform to the source pattern — preventing the rule from matching isolated instances that happen to share a class name.
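
How the CONSTRUCT query ends up inside the rule can be sketched as plain string assembly. This is a reduced illustration: `embed_rule` is a hypothetical helper, and it leaves out the `sh:prefixes` and `sh:message` parts of the rule.

```python
def embed_rule(construct_query: str) -> str:
    """Wrap a CONSTRUCT query in a Turtle sh:rule fragment."""
    # Indent non-empty lines so the query sits inside the bracketed rule.
    indented = "\n".join(
        "        " + line if line.strip() else line
        for line in construct_query.splitlines()
    )
    return (
        "    sh:rule [\n"
        "        a sh:SPARQLRule ;\n"
        f'        sh:construct """\n{indented}\n        """ ;\n'
        "    ] ;\n"
    )


query = (
    "CONSTRUCT {\n"
    "  ?this rdf:type tgt:Activity .\n"
    "}\n"
    "WHERE {\n"
    "  ?this rdf:type ex:Process .\n"
    "}"
)
rule = embed_rule(query)
print(rule)
```

The query is carried as a Turtle long string (triple quotes), so line breaks survive as written.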

`generate_shacl(mapping, root_class, shape_name='shapes:BridgeShape')` ¶

Generate a complete SHACL Turtle document for the given mapping.

Parameters:

Name	Type	Description	Default
`mapping`	`BridgeMapping`	Loaded :class:`~shacl_bridges.io.yaml_reader.BridgeMapping`.	required
`root_class`	`str`	CURIE of the class that the shape targets (`sh:targetClass`).	required
`shape_name`	`str`	Local name for the generated `sh:NodeShape`.	`'shapes:BridgeShape'`

Returns:

Type	Description
`str`	Full Turtle string ready to be written to a `.ttl` file.

Source code in shacl_bridges/core/shacl.py

def generate_shacl(
    mapping: BridgeMapping,
    root_class: str,
    shape_name: str = "shapes:BridgeShape",
) -> str:
    """Generate a complete SHACL Turtle document for the given mapping.

    Args:
        mapping: Loaded :class:`~shacl_bridges.io.yaml_reader.BridgeMapping`.
        root_class: CURIE of the class that the shape targets (``sh:targetClass``).
        shape_name: Local name for the generated ``sh:NodeShape``.

    Returns:
        Full Turtle string ready to be written to a ``.ttl`` file.
    """
    prefix_map = mapping.prefix_map()

    # ------------------------------------------------------------------
    # Turtle @prefix declarations
    # ------------------------------------------------------------------
    prefix_lines = [
        "@prefix sh:    <http://www.w3.org/ns/shacl#> .",
        "@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .",
        "@prefix shapes: <urn:shacl-bridges:shapes#> .",
    ]
    for pfx, ns in prefix_map.items():
        prefix_lines.append(f"@prefix {pfx}: <{ns}> .")
    prefix_block = "\n".join(prefix_lines)

    # ------------------------------------------------------------------
    # Nested sh:property validation (full source pattern, including peripheral)
    # ------------------------------------------------------------------
    G = build_validation_graph(mapping.source_pattern.triples)
    nested = _nested_properties(G, root_class)

    # ------------------------------------------------------------------
    # SPARQL CONSTRUCT query
    # ------------------------------------------------------------------
    construct_query = build_sparql_construct(
        mapping.class_alignment(),
        mapping.source_pattern.triples,
        mapping.target_pattern.triples,
        root_class,
        prefix_map,
        derived_entries=mapping.derived_class_map(),
    )
    # Indent the query body for embedding in Turtle triple-quote string
    indented_query = "\n".join(
        "        " + line if line.strip() else line
        for line in construct_query.splitlines()
    )

    sparql_rule = (
        "    sh:rule [\n"
        "        a sh:SPARQLRule ;\n"
        + _prefix_block(prefix_map, indent=2)
        + "        sh:message \"Bridge rule: transforms source design pattern to target.\" ;\n"
        f"        sh:construct \"\"\"\n{indented_query}\n        \"\"\" ;\n"
        "    ] ;\n"
    )

    # ------------------------------------------------------------------
    # Assemble
    # ------------------------------------------------------------------
    shape = (
        f"{shape_name}\n"
        "    a sh:NodeShape ;\n"
        f"    sh:targetClass {root_class} ;\n"
        + nested
        + sparql_rule
        + ".\n"
    )

    return f"{prefix_block}\n\n{shape}"

shacl_bridges.core.diff¶

`shacl_bridges.core.diff` ¶

Graph validation and diff computation.

Runs pyshacl in two passes:

1. Base pass: validates the data graph against itself (no external shape), with inference enabled (RDFS by default). This captures any triples that inference alone would add and establishes the baseline.
2. Bridge pass: runs the generated SHACL shape against the data graph. The SPARQL CONSTRUCT rule fires and adds new triples.

The diff between pass-1 and pass-2 (via rdflib's isomorphic graph diff) gives exactly the triples introduced by the bridge — nothing more.
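
The idea behind the diff can be sketched with plain Python sets. This is a simplification that assumes no blank nodes: triples are modelled as string tuples and the two passes are stand-in sets, whereas the module itself compares rdflib graphs.

```python
Triple = tuple[str, str, str]


def bridge_diff(base: set[Triple], expanded: set[Triple]) -> set[Triple]:
    """Triples present after the bridge pass but not in the inferred baseline."""
    return expanded - base


data = {("ex:p1", "rdf:type", "ex:Process")}
inferred = {("ex:p1", "rdf:type", "rdfs:Resource")}  # stand-in for RDFS output
bridged = {("ex:p1", "rdf:type", "tgt:Activity")}    # stand-in for rule output

base_pass = data | inferred
bridge_pass = base_pass | bridged
diff = bridge_diff(base_pass, bridge_pass)
print(diff)
```

A plain set difference is not enough once blank nodes are involved, because their labels are not stable between graphs; that is the case an isomorphism-aware diff handles.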

`BridgeResult` `dataclass` ¶

Outcome of running the bridge pipeline on a data graph.

Source code in shacl_bridges/core/diff.py

@dataclass
class BridgeResult:
    """Outcome of running the bridge pipeline on a data graph."""

    expanded_graph: Graph
    """The full data graph after SHACL rule application (base + bridged triples)."""

    diff_graph: Graph
    """Only the triples introduced by the bridge (expanded minus inferred base)."""

    conforms: bool
    """Whether the data graph conforms to the validation constraints."""

    report_text: str
    """Human-readable SHACL validation report."""

    report_graph: Graph
    """Machine-readable SHACL report graph."""

`conforms` `instance-attribute` ¶

Whether the data graph conforms to the validation constraints.

`diff_graph` `instance-attribute` ¶

Only the triples introduced by the bridge (expanded minus inferred base).

`expanded_graph` `instance-attribute` ¶

The full data graph after SHACL rule application (base + bridged triples).

`report_graph` `instance-attribute` ¶

Machine-readable SHACL report graph.

`report_text` `instance-attribute` ¶

Human-readable SHACL validation report.

`run_bridge(data_graph, shacl_graph, inference='rdfs')` ¶

Apply a SHACL bridge shape to data_graph and return the result.

Parameters:

Name	Type	Description	Default
`data_graph`	`Graph`	The instance data to transform.	required
`shacl_graph`	`Graph`	The generated SHACL shape (containing the SPARQLRule).	required
`inference`	`str`	Reasoner to apply before validation. `"rdfs"` is the default and sufficient for most harmonization needs. Pass `"none"` to disable inference entirely.	`'rdfs'`

Returns:

Type	Description
`BridgeResult`	A `BridgeResult` with expanded graph, diff, and report.

Source code in shacl_bridges/core/diff.py

def run_bridge(
    data_graph: Graph,
    shacl_graph: Graph,
    inference: str = "rdfs",
) -> BridgeResult:
    """Apply a SHACL bridge shape to *data_graph* and return the result.

    Args:
        data_graph: The instance data to transform.
        shacl_graph: The generated SHACL shape (containing the SPARQLRule).
        inference: Reasoner to apply before validation. ``"rdfs"`` is the default
                   and sufficient for most harmonization needs.  Pass ``"none"``
                   to disable inference entirely.

    Returns:
        A :class:`BridgeResult` with expanded graph, diff, and report.
    """
    # Pass 1 — baseline with inference only, no external shape
    val_base = Validator(
        data_graph,
        options={"advanced": True, "inference": inference},
    )
    _, _, _ = val_base.run()
    inferred_base = val_base.target_graph

    # Pass 2 — full bridge shape
    val_bridge = Validator(
        data_graph,
        shacl_graph=shacl_graph,
        options={"advanced": True, "inference": inference},
    )
    conforms, report_g, report_text = val_bridge.run()
    expanded = val_bridge.target_graph

    # Diff
    iso_base = to_isomorphic(inferred_base)
    iso_expanded = to_isomorphic(expanded)
    _both, _only_base, only_expanded = graph_diff(iso_base, iso_expanded)

    return BridgeResult(
        expanded_graph=expanded,
        diff_graph=only_expanded,
        conforms=bool(conforms),
        report_text=report_text,
        report_graph=report_g,
    )

`run_bridge_from_files(data_path, shacl_path, inference='rdfs')` ¶

Convenience wrapper: load graphs from file paths, then call `run_bridge`.

Parameters:

Name	Type	Description	Default
`data_path`	`str \| Path`	Path to the instance data Turtle file.	required
`shacl_path`	`str \| Path`	Path to the SHACL shape Turtle file.	required
`inference`	`str`	Reasoner to apply.	`'rdfs'`

Returns:

Type	Description
`BridgeResult`	A `BridgeResult`.

Source code in shacl_bridges/core/diff.py

def run_bridge_from_files(
    data_path: str | Path,
    shacl_path: str | Path,
    inference: str = "rdfs",
) -> BridgeResult:
    """Convenience wrapper: load graphs from file paths, then call :func:`run_bridge`.

    Args:
        data_path: Path to the instance data Turtle file.
        shacl_path: Path to the SHACL shape Turtle file.
        inference: Reasoner to apply.

    Returns:
        A :class:`BridgeResult`.
    """
    data_graph = Graph()
    data_graph.parse(str(data_path), format="turtle")

    shacl_graph = Graph()
    shacl_graph.parse(str(shacl_path), format="turtle")

    return run_bridge(data_graph, shacl_graph, inference=inference)

`save_result(result, expanded_path, diff_path)` ¶

Serialize expanded and diff graphs to Turtle files.

Parameters:

Name	Type	Description	Default
`result`	`BridgeResult`	Output of :func:`run_bridge`.	required
`expanded_path`	`str \| Path`	Destination path for the expanded graph.	required
`diff_path`	`str \| Path`	Destination path for the diff graph.	required

Source code in shacl_bridges/core/diff.py

def save_result(
    result: BridgeResult,
    expanded_path: str | Path,
    diff_path: str | Path,
) -> None:
    """Serialize expanded and diff graphs to Turtle files.

    Args:
        result: Output of :func:`run_bridge`.
        expanded_path: Destination path for the expanded graph.
        diff_path: Destination path for the diff graph.
    """
    result.expanded_graph.serialize(destination=str(expanded_path), format="turtle")
    result.diff_graph.serialize(destination=str(diff_path), format="turtle")

shacl_bridges.validate¶

`shacl_bridges.validate` ¶

Bridge mapping validator.

Runs structural and semantic checks on a loaded `BridgeMapping` and returns a list of `ValidationIssue` objects. An empty list means the mapping passed all checks.

CLI usage::

shacl-bridges validate my_bridge.yaml

Python usage::

from shacl_bridges.io.yaml_reader import load_mapping
from shacl_bridges.validate import validate_mapping, Severity

mapping = load_mapping("bridge.yaml")
issues = validate_mapping(mapping)
errors = [i for i in issues if i.severity == Severity.ERROR]

`ValidationIssue` `dataclass` ¶

A single validation finding.

Source code in shacl_bridges/validate.py

@dataclass
class ValidationIssue:
    """A single validation finding."""

    severity: Severity
    message: str
    hint: str = field(default="")

    def __str__(self) -> str:
        icon = "✗" if self.severity == Severity.ERROR else "⚠"
        line = f"{icon}  {self.message}"
        if self.hint:
            line += f"\n   hint: {self.hint}"
        return line

`validate_mapping(mapping)` ¶

Run all validation checks on mapping.

Checks performed:

1. Prefix completeness: every CURIE used references a declared prefix (or a well-known built-in).
2. Root exists: `source_pattern.root` (if set) appears in at least one source triple.
3. Source connectivity: every node in `source_pattern` is reachable from the chosen root. Disconnected nodes would cause silent over-matching.
4. Class-map sources ⊆ source nodes: every `class_map[].source` appears in `source_pattern.triples`.
5. Class-map targets ⊆ target nodes: every `class_map[].target` appears in `target_pattern.triples`.
6. Target connectivity: the target pattern forms a connected graph. Disconnected target nodes produce isolated triples in the bridge output.
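
The prefix-completeness check can be sketched on its own. `undeclared_prefixes` is a hypothetical helper, and the contents of `BUILTIN_PREFIXES` below are an assumption made for this example; the library's built-in set is not listed here.

```python
# Assumed set of well-known prefixes, for illustration only.
BUILTIN_PREFIXES = {"rdf", "rdfs", "owl", "xsd", "sh"}


def undeclared_prefixes(curies: list[str], declared: dict[str, str]) -> list[str]:
    """Return each undeclared prefix once, in order of first appearance."""
    known = set(declared) | BUILTIN_PREFIXES
    missing: list[str] = []
    for curie in curies:
        # Full IRIs, URNs and blank-node labels are not CURIEs; skip them.
        if ":" not in curie or curie.startswith(("http", "urn", "_")):
            continue
        prefix = curie.split(":")[0]
        if prefix not in known and prefix not in missing:
            missing.append(prefix)
    return missing


terms = [
    "ex:Process", "tgt:Activity", "rdf:type",
    "_:b0", "http://example.org/x", "tgt:Material",
]
missing_prefixes = undeclared_prefixes(terms, {"ex": "http://example.org/"})
print(missing_prefixes)  # ['tgt']
```

Each missing prefix is reported once, however many CURIEs use it.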

Parameters:

Name	Type	Description	Default
`mapping`	`BridgeMapping`	A loaded :class:`~shacl_bridges.io.yaml_reader.BridgeMapping`.	required

Returns:

Type	Description
`list[ValidationIssue]`	List of `ValidationIssue` objects. An empty list means the mapping passed all checks.

Source code in shacl_bridges/validate.py

def validate_mapping(mapping: BridgeMapping) -> list[ValidationIssue]:
    """Run all validation checks on *mapping*.

    Checks performed:

    1. **Prefix completeness** — every CURIE used references a declared prefix
       (or a well-known built-in).
    2. **Root exists** — ``source_pattern.root`` (if set) appears in at least
       one source triple.
    3. **Source connectivity** — every node in ``source_pattern`` is reachable
       from the chosen root. Disconnected nodes would cause silent over-matching.
    4. **Class-map sources ⊆ source nodes** — every ``class_map[].source`` appears
       in ``source_pattern.triples``.
    5. **Class-map targets ⊆ target nodes** — every ``class_map[].target`` appears
       in ``target_pattern.triples``.
    6. **Target connectivity** — the target pattern forms a connected graph.
       Disconnected target nodes produce isolated triples in the bridge output.

    Args:
        mapping: A loaded :class:`~shacl_bridges.io.yaml_reader.BridgeMapping`.

    Returns:
        List of :class:`ValidationIssue` objects. An empty list means the
        mapping passed all checks.
    """
    issues: list[ValidationIssue] = []

    # ------------------------------------------------------------------
    # 1. Prefix completeness
    # ------------------------------------------------------------------
    declared = set(mapping.prefixes.keys()) | _BUILTIN_PREFIXES
    seen_bad_prefixes: set[str] = set()
    for curie in _all_curies(mapping):
        if ":" in curie and not curie.startswith(("http", "urn", "_")):
            prefix = curie.split(":")[0]
            if prefix not in declared and prefix not in seen_bad_prefixes:
                seen_bad_prefixes.add(prefix)
                issues.append(ValidationIssue(
                    Severity.ERROR,
                    f"Prefix '{prefix}' (used in '{curie}') is not declared in prefixes",
                    f"Add '{prefix}: <namespace_IRI>' to the prefixes block",
                ))

    # ------------------------------------------------------------------
    # 2. Root exists in source_pattern
    # ------------------------------------------------------------------
    root = mapping.source_pattern.root
    src_nodes = _source_nodes(mapping)
    if root and root not in src_nodes:
        issues.append(ValidationIssue(
            Severity.ERROR,
            f"source_pattern.root '{root}' does not appear in any source_pattern triple",
            "Check for typos or add a triple that involves this class",
        ))

    # ------------------------------------------------------------------
    # 3. Source graph connectivity from root
    # ------------------------------------------------------------------
    if mapping.source_pattern.triples:
        try:
            effective_root = select_root_class(mapping.source_pattern.triples, root)
            # If the explicit root was flagged as absent (check 2), fall back to
            # the automatically selected root for the connectivity check.
            if effective_root not in src_nodes:
                effective_root = select_root_class(mapping.source_pattern.triples, None)
            disconnected = check_connectivity(mapping.source_pattern.triples, effective_root)
            for node in disconnected:
                issues.append(ValidationIssue(
                    Severity.WARNING,
                    f"'{node}' is not reachable from root '{effective_root}' in source_pattern",
                    (
                        "This node will be omitted from the SPARQL WHERE clause, "
                        "causing silent over-matching. Set a different source_pattern.root "
                        "or connect this node to the rest of the pattern."
                    ),
                ))
        except ValueError as exc:
            issues.append(ValidationIssue(Severity.ERROR, str(exc)))

    # ------------------------------------------------------------------
    # 4. Class-map sources ⊆ source_pattern nodes
    # ------------------------------------------------------------------
    for entry in mapping.class_map:
        if entry.source not in src_nodes:
            issues.append(ValidationIssue(
                Severity.ERROR,
                f"class_map source '{entry.source}' does not appear in source_pattern.triples",
                (
                    "Add a source_pattern triple that involves this class, "
                    "or remove the class_map entry"
                ),
            ))

    # ------------------------------------------------------------------
    # 5. Class-map targets ⊆ target_pattern nodes
    # ------------------------------------------------------------------
    tgt_nodes = _target_nodes(mapping)
    for entry in mapping.class_map:
        if entry.target not in tgt_nodes:
            issues.append(ValidationIssue(
                Severity.ERROR,
                f"class_map target '{entry.target}' does not appear in target_pattern.triples",
                (
                    "Add a target_pattern triple that involves this class, "
                    "or remove the class_map entry"
                ),
            ))

    # ------------------------------------------------------------------
    # 6. Target pattern connectivity
    # ------------------------------------------------------------------
    if len(mapping.target_pattern.triples) > 1:
        G_tgt = nx.DiGraph()
        for s, _p, o in mapping.target_pattern.triples:
            G_tgt.add_edge(s, o)
        if not nx.is_weakly_connected(G_tgt):
            issues.append(ValidationIssue(
                Severity.WARNING,
                "target_pattern.triples do not form a connected graph",
                (
                    "Disconnected target nodes may produce isolated triples "
                    "in the bridge output that are hard to trace"
                ),
            ))

    return issues

shacl_bridges.visualize.mermaid¶

`shacl_bridges.visualize.mermaid` ¶

Mermaid flowchart generation.

Produces a Mermaid flowchart TD diagram that shows:

- Core source nodes (in class_map): rectangle `[Label]`
- Peripheral source nodes (validation-only, not in class_map): stadium shape `([Label])`
- Target nodes: rounded rectangle `(Label)`
- ShapeValidation subgraph:
    - CoreShapeInformation inner subgraph: core structural triples (thick `==>` arrows)
    - Peripheral/upper-level triples outside the inner subgraph (thin `--->` arrows)
- TransformedGraph subgraph: target pattern triples (`-->` arrows)
- Bridge connections: dotted `-.....->` arrows from each source class to its target class

This diagram is generated automatically from the YAML mapping and stays in sync with the source/target patterns without manual maintenance.
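
The node shapes and the length of the dotted bridge arrow can be sketched as two small helpers. Both `node_declaration` and `bridge_arrow` are hypothetical names, and the node ids are illustrative. The arrow length assumes the number of dots is the sum of the longest source path, the longest target path, and 3.

```python
SHAPES = {
    "core": "[{}]",         # rectangle
    "peripheral": "([{}])",  # stadium
    "target": "({})",       # rounded rectangle
}


def node_declaration(node_id: str, label: str, kind: str) -> str:
    """Return one Mermaid node line using the bracket style for its kind."""
    return f"    {node_id}" + SHAPES[kind].format(label)


def bridge_arrow(src_len: int, tgt_len: int) -> str:
    """Dotted link whose dot count grows with both pattern depths."""
    return "-" + "." * (src_len + tgt_len + 3) + "->"


declaration = node_declaration("src_Sample", "Sample", "peripheral")
arrow = bridge_arrow(1, 1)
print(declaration)
print(f"    src_Process {arrow}|SHACL_bridge| tgt_Activity")
```

In Mermaid, extra dots lengthen a dotted link, so deeper patterns get longer bridge arrows, which helps keep the two subgraphs apart in the rendered layout.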

`generate_mermaid(mapping)` ¶

Generate a Mermaid flowchart diagram for mapping.

Parameters:

Name	Type	Description	Default
`mapping`	`BridgeMapping`	Loaded :class:`~shacl_bridges.io.yaml_reader.BridgeMapping`.	required

Returns:

Type	Description
`str`	Mermaid diagram string (suitable for embedding in Markdown or saving to a `.mmd` file).

Source code in shacl_bridges/visualize/mermaid.py

def generate_mermaid(mapping: BridgeMapping) -> str:
    """Generate a Mermaid flowchart diagram for *mapping*.

    Args:
        mapping: Loaded :class:`~shacl_bridges.io.yaml_reader.BridgeMapping`.

    Returns:
        Mermaid diagram string (suitable for embedding in Markdown or saving
        to a ``.mmd`` file).
    """
    source_triples = mapping.source_pattern.triples
    target_triples = mapping.target_pattern.triples
    class_alignment = mapping.class_alignment()

    # Core source classes = those present in the class_map (will be bridged)
    source_classes: set[str] = set(class_alignment.keys())

    # All nodes that appear anywhere in the source pattern
    all_source_nodes: set[str] = set()
    for s, _p, o in source_triples:
        all_source_nodes.add(s)
        all_source_nodes.add(o)

    # All nodes that appear anywhere in the target pattern
    all_target_nodes: set[str] = set()
    for s, _p, o in target_triples:
        all_target_nodes.add(s)
        all_target_nodes.add(o)

    # Peripheral = source nodes that are NOT bridged (validation-only)
    peripheral: set[str] = all_source_nodes - source_classes

    # Path lengths for dotted bridge arrow sizing
    try:
        G_src = build_validation_graph(source_triples)
        src_len = longest_path_length(G_src)
    except ValueError:
        src_len = 1
    try:
        G_tgt = build_validation_graph(target_triples)
        tgt_len = longest_path_length(G_tgt)
    except ValueError:
        tgt_len = 1

    dot_count = src_len + tgt_len + 3
    dotted = "-" + "." * dot_count + "->"

    lines: list[str] = ["flowchart TD"]

    # ------------------------------------------------------------------
    # Node declarations
    # ------------------------------------------------------------------
    for node in sorted(source_classes):
        lines.append(f"    {node}[{_local_name(node)}]")
    for node in sorted(peripheral):
        lines.append(f"    {node}([{_local_name(node)}])")
    for node in sorted(all_target_nodes):
        lines.append(f"    {node}({_local_name(node)})")
    lines.append("")

    # ------------------------------------------------------------------
    # ShapeValidation subgraph
    # ------------------------------------------------------------------
    lines.append("    subgraph ShapeValidation")
    lines.append("        subgraph CoreShapeInformation")

    extended: list[str] = []
    for s, p, o in source_triples:
        if s in source_classes and o in source_classes:
            lines.append(f"        {s} ==>|{_local_name(p)}| {o}")
        else:
            extended.append(f"    {s} --->|{_local_name(p)}| {o}")

    lines.append("        end")
    lines.extend(extended)
    lines.append("    end")
    lines.append("")

    # ------------------------------------------------------------------
    # TransformedGraph subgraph
    # ------------------------------------------------------------------
    lines.append("    subgraph TransformedGraph")
    for s, p, o in target_triples:
        lines.append(f"    {s} -->|{_local_name(p)}| {o}")
    lines.append("    end")
    lines.append("")

    # ------------------------------------------------------------------
    # Bridge connections (dotted arrows from source to target class)
    # ------------------------------------------------------------------
    for src, tgt in sorted(class_alignment.items()):
        lines.append(f"    {src} {dotted}|SHACL_bridge| {tgt}")

    return "\n".join(lines)

`generate_mermaid_markdown(mapping)` ¶

Wrap the Mermaid diagram in a fenced code block for Markdown embedding.

Source code in shacl_bridges/visualize/mermaid.py

def generate_mermaid_markdown(mapping: BridgeMapping) -> str:
    """Wrap the Mermaid diagram in a fenced code block for Markdown embedding."""
    diagram = generate_mermaid(mapping)
    return f"```mermaid\n{diagram}\n```"
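
A short sketch of how the wrapped string could be embedded in a Markdown file. So that it runs without a mapping file, `diagram` is a hand-written stand-in for the output of `generate_mermaid(mapping)`, and `wrap_mermaid` is a hypothetical helper that repeats the one-line wrapping shown above. The diagram text, the file name, and the page title are assumptions, not output of the library.

```python
import tempfile
from pathlib import Path


def wrap_mermaid(diagram: str) -> str:
    """Hypothetical helper: same wrapping as generate_mermaid_markdown."""
    fence = "`" * 3  # built at runtime so this example nests cleanly in Markdown
    return f"{fence}mermaid\n{diagram}\n{fence}"


# Stand-in for generate_mermaid(mapping); the real diagram will differ.
diagram = "flowchart LR\n    A[Source] -.->|SHACL_bridge| B(Target)"
markdown = wrap_mermaid(diagram)

# Write the fenced diagram under a heading, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "diagram.md"
    out.write_text("# Bridge diagram\n\n" + markdown + "\n", encoding="utf-8")
    written = out.read_text(encoding="utf-8")

print(written)
```

With the package installed, the same pattern would presumably apply to the string returned by `generate_mermaid_markdown(mapping)` directly.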
