Best Practices for EDG Data Migration

Introduction

This document describes best practices for migrating data between EDG servers. Most commonly, this supports promoting asset collections from a Development server to a Testing/Staging server, and finally to a Production server.

In many deployments:

  • Ongoing stewardship and editing of operational data occurs in Prod

  • New ontologies, taxonomies, ADS scripts, and EDG customizations are developed in Dev

  • Some organizations use Test / Staging to validate changes prior to Prod deployment

These same migration approaches can also be used to reset Dev or Test by reloading asset collections from Prod (e.g., to start a new development cycle).

Resetting from Prod can be important when UUIDs appear in URIs, because creating the “same” thing (e.g., a Product class) independently on multiple servers can result in different UUID-based URIs.

Several migration approaches are described below. The best choice depends on:

  • Whether the target server already has a version of the collection(s)

  • Whether EDG Change History must be migrated

  • Whether you need “replace” vs “merge” behavior

  • Whether you want repeatable automation vs a UI-driven process

All approaches described use only out-of-the-box EDG features.

Note

These approaches assume execution by a System Administrator with access to the required asset collections on the server.

Approach 1: Send Projects

The Send Projects approach is best when you want to migrate complete EDG projects and optionally include Change History graphs.

Send Projects using EDG UI

The Send Projects to Another Server feature is available under:

Server Administration → Send Projects to Another Server

Key steps:

  • Configure the target server URL (for example https://testserver.company.com/edg/)

  • Provide target credentials

  • Select the asset collections to send (expand the Repositories folder)

Best practices:

  • Do not select the entire Repositories folder, for both data-safety and performance reasons

  • EDG does not automatically include collections referenced via includes (select the full set explicitly)

  • When sending asset collections, select Also send database triples

  • Prefer replace (clear destination) over merge when re-sending a collection:

    • Select: Clear the destination project of triples before sending triples

Change History migration:

  • A collection’s Change History graph (team graph) may be sent alongside the collection

  • Team graph identifiers appear as the collection identifier plus .tch, followed by the underlying file type (e.g., ontology_1.tch.xdb)

Note

The Send Projects feature does not automatically send included collections. Ensure you select the complete dependency set, including required team graphs.

Send Projects using Python script

The Send Projects feature can be called programmatically via the sendProjects service.

The typical pattern:

  • Maintain a JSON file listing the projects / graphs to send

  • Invoke a Python script that posts to the source server’s /tbl/sendProjects endpoint

Example invocation:

  • python sendProjects.py --url_source ... --url_target ... --sendTriples true --clearGraph true
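Spelled out in full, an invocation might look like the following. All hostnames and credentials below are placeholders; note the warning that follows regarding the /edg/ context path:

```shell
python sendProjects.py \
  --url_source https://devserver.company.com/edg \
  --url_target https://testserver.company.com/edg \
  --source_username admin --source_password '********' \
  --target_username admin --target_password '********' \
  --sendTriples true \
  --clearGraph true
```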

Warning

EDG deployed on Tomcat commonly uses URLs that include /edg/ while EDG Studio does not. Take care to use the correct base URL structure when testing locally.

Example parameters.json

{
  "file-/Repositories/ontology_1.tch.xdb": "true",
  "file-/Repositories/ontology_1.xdb": "true",
  "file-/Repositories/taxonomy_1.tch.xdb": "true",
  "file-/Repositories/taxonomy_1.xdb": "true"
}
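The keys follow the pattern file-/Repositories/<collection id>[.tch].xdb, pairing each collection with its .tch team graph (Change History). A parameters.json like the one above can be generated from a plain list of collection IDs; this sketch assumes the /Repositories/ folder and .xdb file type shown in the example, which may differ in your installation:

```python
import json

def build_parameters(collection_ids, include_team_graphs=True):
    """Build the sendProjects selection map for a list of collection IDs."""
    selection = {}
    for cid in collection_ids:
        # The collection's own graph
        selection[f"file-/Repositories/{cid}.xdb"] = "true"
        if include_team_graphs:
            # The .tch team graph holds the collection's Change History
            selection[f"file-/Repositories/{cid}.tch.xdb"] = "true"
    return selection

if __name__ == "__main__":
    with open("parameters.json", "w") as f:
        json.dump(build_parameters(["ontology_1", "taxonomy_1"]), f, indent=2)
```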

Example Python script

import requests
import json
import argparse
from requests.auth import HTTPBasicAuth

parser = argparse.ArgumentParser(description='Send files/asset collections from one EDG server to another.')

parser.add_argument('--url_source', required=True, help='The source URL, e.g. http://localhost:8083')
parser.add_argument('--url_target', required=True, help='The target URL, e.g. http://localhost:8080')
parser.add_argument('--source_username', required=True, help='Username for source authentication')
parser.add_argument('--source_password', required=True, help='Password for source authentication')
parser.add_argument('--target_username', required=True, help='Username for target authentication')
parser.add_argument('--target_password', required=True, help='Password for target authentication')
parser.add_argument('--sendTriples', required=True, help='Send triples (when sending a collection)')
parser.add_argument('--clearGraph', required=True, help='Clear triples when sending to an existing asset collection')
args = parser.parse_args()

with open('parameters.json', 'r') as file:
    parameters = json.load(file)

# The userName/password form fields tell the source server how to
# authenticate against the *target* when it forwards the projects
parameters['userName'] = args.target_username
parameters['password'] = args.target_password
parameters['serverURL'] = f"{args.url_target}/tbl"
parameters['sendTriples'] = args.sendTriples
parameters['clearGraph'] = args.clearGraph

# The HTTP request itself is authenticated against the *source* server
url = f"{args.url_source}/tbl/sendProjects"
response = requests.post(url, data=parameters, auth=HTTPBasicAuth(args.source_username, args.source_password))

print("Status Code:", response.status_code)
print("Response Body:", response.text)

Download script here: sendProjects_approach1.py

Approach 2: Export/Import Zip for New Collection Sets

This approach is best when:

  • You want to migrate new collections to a target environment

  • You want to include included collections in the same package

  • You do not need to migrate Change History graphs

Export/Import Zip using EDG UI

Use the Export tab of an asset collection:

  • Export <collection type> with Includes as a File

  • Choose Zip File without system graphs

This produces a ZIP containing each included collection as Turtle (TTL).
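Before importing, it can be worth sanity-checking that the exported package actually contains every expected collection. A minimal sketch using only the standard library (the file name geo_with_includes.zip is hypothetical):

```python
import zipfile

def list_zip_entries(zip_path):
    """Return the file names contained in an exported collection ZIP."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(info.filename for info in zf.infolist())

# Example: verify that every expected TTL file made it into the package
# expected = {"geo.ttl", "geography_ontology.ttl"}
# missing = expected - set(list_zip_entries("geo_with_includes.zip"))
```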

Best practices:

  • Use the option without system graphs

  • Be aware: Change History is not exported via this feature

Import is performed via:

  • New+ → Import Asset Collection from Trig or Zip File

Notes:

  • Collections that already exist on the target server are typically ignored (only “new” collections are created)

  • For initial migrations, a common pattern is to create an artificial umbrella collection that includes all collections to be migrated, export that, import to target, then delete the umbrella collection

Important

If Change History migration is required, use Approach 1 (Send Projects) or Approach 3 (Export RDF / Import RDF).

Export/Import Asset Collection Zips using Python script

Collections can be bulk exported and imported using EDG ZIP APIs:

  • Export via /datasetZip from the source

  • Import via the EDG upload service endpoint on the target

Example input file (one collection per line)

geo
kennedy_family

Example Python script

import argparse
import requests
from requests.auth import HTTPBasicAuth
import os

def export_edg_zip(base_value, base_uri, username, password, includeSystemTriples, excludeSystemGraphs):
    filename = f"{base_value}.zip"
    endpoint = "/datasetZip"
    params = {
        "_base": f"urn:x-evn-master:{base_value}",
        "includeSystemTriples": includeSystemTriples,
        "excludeSystemGraphs": excludeSystemGraphs,
    }

    full_url = f"{base_uri}{endpoint}"

    try:
        response = requests.get(full_url, params=params, auth=HTTPBasicAuth(username, password), timeout=30)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        print(f"Error: Failed to make the EDG API request: {e}")
        return

    with open(filename, "wb") as file:
        file.write(response.content)
    print(f'Response saved to "{filename}"')

def import_zip_file(file_name, base_url, username, password):
    url = f"{base_url}/swp"
    file_path = f"{file_name}.zip"

    if not os.path.exists(file_path):
        print(f"Error: File '{file_path}' not found.")
        return

    with open(file_path, 'rb') as file:
        files = {'filePath': (f"{file_name}.zip", file, 'application/zip; charset=utf-8')}
        data = {
            "_fileUpload": "true",
            "_viewClass": "http://topbraid.org/teamwork#ImportDatasetFileService",
            "trig": "false"
        }

        response = requests.post(
            url, files=files, data=data,
            auth=HTTPBasicAuth(username, password),
            timeout=30
        )

        if response.status_code == 200:
            print(f"Success: File '{file_name}.zip' uploaded successfully.")
        else:
            print(f"Error: Upload failed with status code {response.status_code}.")
            print("Target server response:", response.text)

    os.remove(file_path)
    print(f"Deleted the file '{file_name}.zip' after import.")

def process_collections_list(collections_file, url_source, url_target,
                            source_username, source_password, target_username, target_password,
                            includeSystemTriples, excludeSystemGraphs):
    with open(collections_file, 'r') as file:
        collections = [line.strip() for line in file.readlines() if line.strip()]

    for base_value in collections:
        print(f"Processing collection: {base_value}")
        export_edg_zip(base_value, url_source, source_username, source_password, includeSystemTriples, excludeSystemGraphs)
        import_zip_file(base_value, url_target, target_username, target_password)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Call EDG ZIP export and import APIs with basic authentication.")
    parser.add_argument("--collections_file", required=True)
    parser.add_argument("--url_source", required=True)
    parser.add_argument("--url_target", required=True)
    parser.add_argument("--source_username", required=True)
    parser.add_argument("--source_password", required=True)
    parser.add_argument("--target_username", required=True)
    parser.add_argument("--target_password", required=True)
    parser.add_argument("--includeSystemTriples", choices=['true', 'false'], default='false')
    parser.add_argument("--excludeSystemGraphs", choices=['true', 'false'], default='true')
    args = parser.parse_args()

    process_collections_list(
        args.collections_file,
        args.url_source,
        args.url_target,
        args.source_username,
        args.source_password,
        args.target_username,
        args.target_password,
        args.includeSystemTriples == 'true',
        args.excludeSystemGraphs == 'true'
    )

Download script here: sendProjects_approach2.py

Warning

If multiple collections include the same large dependency collection, ZIP export may download that dependency repeatedly.

Approach 3: Export RDF / Import RDF for Incremental Updates

This approach is best when:

  • The target already has corresponding collections (same graph IDs)

  • You want incremental updates or controlled replacement

  • You may need to preserve or create Change History depending on import options

Export RDF / Import RDF via Swagger

The Export RDF and Import RDF APIs are visible in the EDG Swagger interface (typically accessible from an asset collection under the Reports tab).

Typical usage pattern:

  • Export RDF from a source collection

  • Ensure the target has an asset collection with the same graph ID

  • Import RDF into the target collection

Note

To capture both additions and deletions during import, prefer using import options such as Replace previous contents when appropriate.

Export RDF / Import RDF using EDG APIs (Python)

The following pattern is used:

  • Determine each asset collection type

  • Create matching collections on the target (if needed)

  • Export RDF from the source

  • Import RDF into the target

Example input file

geography_ontology
geo

Example Python script

import argparse
import requests
from requests.auth import HTTPBasicAuth

def asset_collection_type(edg_url, graphId, username, password):
    url = f"{edg_url}/tbl/service/{graphId}/tbs/assetCollectionType"
    response = requests.post(url, headers={'accept': 'application/json'}, auth=HTTPBasicAuth(username, password))
    response.raise_for_status()
    return response.text  # the collection's type label, used below as typeLabel

def export_rdf(edg_url, graphID, username, password, exclude_values_rules, fmt, include_inferences, keep_edg_triples):
    url = f"{edg_url}/tbl/service/{graphID}/tbs/exportRDFFile"
    params = {
        "excludeValuesRules": str(exclude_values_rules).lower(),
        "format": fmt,
        "includeInferences": str(include_inferences).lower(),
        "keepEDGTriples": str(keep_edg_triples).lower(),
    }
    response = requests.get(url, auth=HTTPBasicAuth(username, password), params=params)
    response.raise_for_status()
    return response.content

def create_asset_collection(edg_url, graphID, type_label, username, password):
    url = f"{edg_url}/tbl/service/_/tbs/createAssetCollection"
    params = {'defaultNamespace': 'www.temporary.org', 'id': graphID, 'name': graphID, 'typeLabel': type_label}
    response = requests.post(url, auth=HTTPBasicAuth(username, password), params=params)
    response.raise_for_status()
    return True

def import_rdf(edg_url, graphID, username, password, rdf_content, fmt='turtle'):
    url = f"{edg_url}/tbl/service/{graphID}/tbs/importRDFFile"
    files = {
        "file": (f"{graphID}.ttl", rdf_content, "text/turtle"),
        "fileName": (None, f"{graphID}.ttl"),
        "format": (None, fmt),
    }
    response = requests.post(url, files=files, auth=HTTPBasicAuth(username, password), headers={"accept": "application/json"})
    response.raise_for_status()
    return True

def process_list(collections_file, url_source, url_target,
                 source_username, source_password, target_username, target_password,
                 exclude_values_rules, fmt, include_inferences, keep_edg_triples):

    with open(collections_file, 'r') as file:
        collections = [line.strip() for line in file.readlines() if line.strip()]

    for graph_id in collections:
        rdf = export_rdf(url_source, graph_id, source_username, source_password, exclude_values_rules, fmt, include_inferences, keep_edg_triples)
        type_label = asset_collection_type(url_source, graph_id, source_username, source_password)
        create_asset_collection(url_target, graph_id, type_label, target_username, target_password)
        import_rdf(url_target, graph_id, target_username, target_password, rdf, fmt=fmt)
        print(f"Migrated: {graph_id}")

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Export RDF from EDG and import into another EDG instance.")
    parser.add_argument("--collections_file", required=True)
    parser.add_argument("--url_source", required=True)
    parser.add_argument("--url_target", required=True)
    parser.add_argument("--source_username", required=True)
    parser.add_argument("--source_password", required=True)
    parser.add_argument("--target_username", required=True)
    parser.add_argument("--target_password", required=True)
    parser.add_argument("--excludeValuesRules", type=bool, default=True)
    parser.add_argument("--format", type=str, default="turtle")
    parser.add_argument("--includeInferences", type=bool, default=True)
    parser.add_argument("--keepEDGTriples", type=bool, default=True)
    args = parser.parse_args()

    process_list(
        args.collections_file,
        args.url_source,
        args.url_target,
        args.source_username,
        args.source_password,
        args.target_username,
        args.target_password,
        args.excludeValuesRules,
        args.format,
        args.includeInferences,
        args.keepEDGTriples
    )

Download script here: sendProjects_approach3.py

Warning

Included asset collections are not automatically migrated by this approach. Ensure the input list includes all required dependencies.

Approach 4: Git Integration (EDG 8.3+)

EDG 8.3 and later include improved Git integration that supports migration and promotion workflows.

A common pattern is to use:

  • A single Git repository with multiple branches (e.g., Dev/Test/Prod)

  • Push/pull from EDG to Git rather than server-to-server transfers

Git Configuration in EDG

Configure Git repositories under:

Product Configuration → Git Integration

Best practices:

  • Configure a separate repository entry per target (e.g., Dev and Prod)

  • Add a Git repository password (token-based authentication)

  • Configure which EDG users may access each repository

Linking an Asset Collection to Git in EDG

After Git is configured:

  • Open the asset collection to migrate

  • Use the cloud icon to Link to File on Git

  • Choose a new file name (e.g., geo.ttl) or select an existing file

  • Use push to write the collection to Git

  • In the target EDG, create a corresponding collection and link it to the same file

  • Use pull to populate the collection from Git

Important

Only one asset collection can be connected to a given Git file at a time.

The connection can be removed via the Git integration instance (Modify → Delete); the same form also shows metadata such as the last push/pull execution.

Determining Changes Between Collections

Collections can be compared in EDG or externally. Common validation use cases include:

  • Verifying changes before migration

  • Confirming a promotion package

  • Reviewing additions and deletions between versions

The main approaches are:

  • EDG Comparison Report

  • EDG Workflow and Workflow Reports

  • Git diff in GitHub

Comparison Report

The Comparison Report is available under the Reports tab of an asset collection.

Typical usage:

  • Open the source collection

  • Navigate to Reports → Comparison Report

  • Select the target collection from the dropdown

  • Review additions and deletions detected by EDG

This approach is fast and requires no export.

EDG Workflow and Workflow Reports

A workflow can be used to create a controlled, reviewable change set:

  • Export the updated collection as Turtle (TTL)

  • In the original collection, create a new workflow (e.g., Basic Workflow)

  • Use Make Changes → Import within the workflow

  • To capture additions and deletions, select Replace previous contents during import

Review locations:

  • Workflow Reports panel: structured list of detected changes

  • Workflow Preview panel: git-style diff view

This approach supports approve/reject governance before committing changes.

Git diff in GitHub

If you push versions to Git:

  • Push the original collection to Git

  • Apply updates (or import a new version), then push again

  • Use GitHub diff between commits to review changes

This is especially useful for teams already using Git-based review processes.
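If you prefer the command line to the GitHub UI, the same review can be done locally. The file name geo.ttl follows the linking example above; the commit identifiers are placeholders:

```shell
# Fetch the branch EDG pushes to, then compare the two versions of the file
git fetch origin
git log --oneline -- geo.ttl                # identify the two commits to compare
git diff <older-commit> <newer-commit> -- geo.ttl
```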

See also

EDG environment promotion patterns and governance controls are covered in Best Practices for Dev, Test, and Prod EDG Servers.