Best Practices for EDG Data Migration
Introduction
This document describes best practices for migrating data between EDG servers. Most commonly, this supports promoting asset collections from a Development server to a Testing/Staging server, and finally to a Production server.
In many deployments:
Ongoing stewardship and editing of operational data occurs in Prod
New ontologies, taxonomies, ADS scripts, and EDG customizations are developed in Dev
Some organizations use Test / Staging to validate changes prior to Prod deployment
These same migration approaches can also be used to reset Dev or Test by reloading asset collections from Prod (e.g., to start a new development cycle).
This can be important when UUIDs appear in URIs, because creating the “same” thing
(e.g., a Product class) on multiple servers can result in different UUID-based URIs.
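To illustrate the problem (a generic sketch, not EDG-specific; the namespace and class name are made up), minting a UUID-based URI independently on two servers produces two different identifiers for the same logical class:

```python
import uuid

# Each server mints its own UUID, so the "same" Product class
# ends up with a different URI on each server (names are illustrative)
uri_on_dev = f"http://example.org/ontology#Product_{uuid.uuid4()}"
uri_on_prod = f"http://example.org/ontology#Product_{uuid.uuid4()}"

print(uri_on_dev == uri_on_prod)  # prints False: the URIs diverge
```

Migrating the collection (rather than re-creating it) keeps the URIs identical across environments.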
Several migration approaches are described below. The best choice depends on:
Whether the target server already has a version of the collection(s)
Whether EDG Change History must be migrated
Whether you need “replace” vs “merge” behavior
Whether you want repeatable automation vs a UI-driven process
All approaches described use only out-of-the-box EDG features.
Note
These approaches assume execution by a System Administrator with access to the required asset collections on the server.
Approach 1: Send Projects
The Send Projects approach is best when you want to migrate complete EDG projects and optionally include Change History graphs.
Send Projects using EDG UI
The Send Projects to Another Server feature is available under:
Server Administration → Send Projects to Another Server
Key steps:
Configure the target server URL (for example https://testserver.company.com/edg/)
Provide target credentials
Select the asset collections to send (expand the Repositories folder)
Best practices:
Do not select the entire Repositories folder (data safety + performance)
EDG does not automatically include collections referenced via includes (select the full set explicitly)
When sending asset collections, select Also send database triples
Prefer replace (clear destination) over merge when re-sending a collection:
Select: Clear the destination project of triples before sending triples
Change History migration:
A collection’s Change History graph (team graph) may be sent alongside the collection
Team graph identifiers use a .tch suffix plus the underlying file type (e.g., ontology_1.tch.xdb)
Note
The Send Projects feature does not automatically send included collections. Ensure you select the complete dependency set, including required team graphs.
Send Projects using Python script
The Send Projects feature can be called programmatically via the sendProjects service.
The typical pattern:
Maintain a JSON file listing the projects / graphs to send
Invoke a Python script that posts to the source server’s /tbl/sendProjects endpoint
Example invocation:
python sendProjects.py --url_source ... --url_target ... --source_username ... --source_password ... --target_username ... --target_password ... --sendTriples true --clearGraph true
Warning
EDG deployed on Tomcat commonly uses URLs that include /edg/ while EDG Studio
does not. Take care to use the correct base URL structure when testing locally.
Example parameters.json
{
    "file-/Repositories/ontology_1.tch.xdb": "true",
    "file-/Repositories/ontology_1.xdb": "true",
    "file-/Repositories/taxonomy_1.tch.xdb": "true",
    "file-/Repositories/taxonomy_1.xdb": "true"
}
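Maintaining this JSON file by hand is error-prone when many collections are involved. A small helper can generate it from a list of collection base names (a sketch that assumes the file-/Repositories/<name>.xdb naming shown above, with a .tch.xdb entry per collection for its Change History team graph):

```python
import json

def build_parameters(collection_names):
    # Emit one entry per data graph and one per team (Change History) graph,
    # following the file-/Repositories/<name>.xdb naming convention above
    params = {}
    for name in collection_names:
        params[f"file-/Repositories/{name}.tch.xdb"] = "true"
        params[f"file-/Repositories/{name}.xdb"] = "true"
    return params

with open('parameters.json', 'w') as fh:
    json.dump(build_parameters(["ontology_1", "taxonomy_1"]), fh, indent=4)
```

Omit the .tch.xdb entries if you do not want to send Change History graphs.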
Example Python script
import requests
import json
import argparse
from requests.auth import HTTPBasicAuth
parser = argparse.ArgumentParser(description='Send files/asset collections from one EDG server to another.')
parser.add_argument('--url_source', required=True, help='The source URL, e.g. localhost:8083')
parser.add_argument('--url_target', required=True, help='The target URL, e.g. localhost:8080')
parser.add_argument('--source_username', required=True, help='Username for source authentication')
parser.add_argument('--source_password', required=True, help='Password for source authentication')
parser.add_argument('--target_username', required=True, help='Username for target authentication')
parser.add_argument('--target_password', required=True, help='Password for target authentication')
parser.add_argument('--sendTriples', required=True, help='Send triples (when sending a collection)')
parser.add_argument('--clearGraph', required=True, help='Clear triples when sending to an existing asset collection')
args = parser.parse_args()
with open('parameters.json', 'r') as file:
    parameters = json.load(file)
# The POST itself authenticates against the source server; the target
# credentials are passed as parameters so the source can log in to the target
parameters['userName'] = args.target_username
parameters['password'] = args.target_password
parameters['serverURL'] = f"{args.url_target}/tbl"
parameters['sendTriples'] = args.sendTriples
parameters['clearGraph'] = args.clearGraph
url = f"{args.url_source}/tbl/sendProjects"
response = requests.post(url, data=parameters, auth=HTTPBasicAuth(args.source_username, args.source_password))
print("Status Code:", response.status_code)
print("Response Body:", response.text)
Download script here: sendProjects_approach1.py
Approach 2: Export/Import Zip for New Collection Sets
This approach is best when:
You want to migrate new collections to a target environment
You want to include included collections in the same package
You do not need to migrate Change History graphs
Export/Import Zip using EDG UI
Use the Export tab of an asset collection:
Export <collection type> with Includes as a File
Choose Zip File without system graphs
This produces a ZIP containing each included collection as Turtle (TTL).
Best practices:
Use the option without system graphs
Be aware: Change History is not exported via this feature
Import is performed via:
New+ → Import Asset Collection from Trig or Zip File
Notes:
Collections that already exist on the target server are typically ignored (only “new” collections are created)
For initial migrations, a common pattern is to create an artificial umbrella collection that includes all collections to be migrated, export that, import to target, then delete the umbrella collection
Important
If Change History migration is required, use Approach 1 (Send Projects) or Approach 3 (Export RDF / Import RDF).
Export/Import Asset Collection Zips using Python script
Collections can be bulk exported and imported using EDG ZIP APIs:
Export via /datasetZip from the source
Import via the EDG upload service endpoint on the target
Example input file (one collection per line)
geo
kennedy_family
Example Python script
import argparse
import requests
from requests.auth import HTTPBasicAuth
import os
def export_edg_zip(base_value, base_uri, username, password, includeSystemTriples, excludeSystemGraphs):
    filename = f"{base_value}.zip"
    endpoint = "/datasetZip"
    params = {
        "_base": f"urn:x-evn-master:{base_value}",
        "includeSystemTriples": str(includeSystemTriples).lower(),
        "excludeSystemGraphs": str(excludeSystemGraphs).lower(),
    }
    full_url = f"{base_uri}{endpoint}"
    try:
        response = requests.get(full_url, params=params, auth=HTTPBasicAuth(username, password), timeout=30)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        print(f"Error: Failed to make the EDG API request: {e}")
        return
    with open(filename, "wb") as file:
        file.write(response.content)
    print(f'Response saved to "{filename}"')

def import_zip_file(file_name, base_url, username, password):
    url = f"{base_url}/swp"
    file_path = f"{file_name}.zip"
    if not os.path.exists(file_path):
        print(f"Error: File '{file_path}' not found.")
        return
    with open(file_path, 'rb') as file:
        files = {'filePath': (f"{file_name}.zip", file, 'application/zip; charset=utf-8')}
        data = {
            "_fileUpload": "true",
            "_viewClass": "http://topbraid.org/teamwork#ImportDatasetFileService",
            "trig": "false"
        }
        response = requests.post(
            url, files=files, data=data,
            auth=HTTPBasicAuth(username, password),
            timeout=30
        )
    if response.status_code == 200:
        print(f"Success: File '{file_name}.zip' uploaded successfully.")
    else:
        print(f"Error: Upload failed with status code {response.status_code}.")
        print("Target server response:", response.text)
    os.remove(file_path)
    print(f"Deleted the file '{file_name}.zip' after import.")

def process_collections_list(collections_file, url_source, url_target,
                             source_username, source_password, target_username, target_password,
                             includeSystemTriples, excludeSystemGraphs):
    with open(collections_file, 'r') as file:
        collections = [line.strip() for line in file.readlines() if line.strip()]
    for base_value in collections:
        print(f"Processing collection: {base_value}")
        export_edg_zip(base_value, url_source, source_username, source_password, includeSystemTriples, excludeSystemGraphs)
        import_zip_file(base_value, url_target, target_username, target_password)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Call EDG ZIP export and import APIs with basic authentication.")
    parser.add_argument("--collections_file", required=True)
    parser.add_argument("--url_source", required=True)
    parser.add_argument("--url_target", required=True)
    parser.add_argument("--source_username", required=True)
    parser.add_argument("--source_password", required=True)
    parser.add_argument("--target_username", required=True)
    parser.add_argument("--target_password", required=True)
    parser.add_argument("--includeSystemTriples", choices=['true', 'false'], default='false')
    parser.add_argument("--excludeSystemGraphs", choices=['true', 'false'], default='true')
    args = parser.parse_args()
    process_collections_list(
        args.collections_file,
        args.url_source,
        args.url_target,
        args.source_username,
        args.source_password,
        args.target_username,
        args.target_password,
        args.includeSystemTriples == 'true',
        args.excludeSystemGraphs == 'true'
    )
Download script here: sendProjects_approach2.py
Warning
If multiple collections include the same large dependency collection, ZIP export may download that dependency repeatedly.
Approach 3: Export RDF / Import RDF for Incremental Updates
This approach is best when:
The target already has corresponding collections (same graph IDs)
You want incremental updates or controlled replacement
You may need to preserve or create Change History depending on import options
Export RDF / Import RDF via Swagger
The Export RDF and Import RDF APIs are visible in the EDG Swagger interface (typically accessible from an asset collection under the Reports tab).
Typical usage pattern:
Export RDF from a source collection
Ensure the target has an asset collection with the same graph ID
Import RDF into the target collection
Note
To capture both additions and deletions during import, prefer using import options such as Replace previous contents when appropriate.
Export RDF / Import RDF using EDG APIs (Python)
The following pattern is used:
Determine each asset collection type
Create matching collections on the target (if needed)
Export RDF from the source
Import RDF into the target
Example input file
geography_ontology
geo
Example Python script
import argparse
import requests
from requests.auth import HTTPBasicAuth
def asset_collection_type(edg_url, graphId, username, password):
    url = f"{edg_url}/tbl/service/{graphId}/tbs/assetCollectionType"
    response = requests.post(url, headers={'accept': 'application/json'}, auth=HTTPBasicAuth(username, password))
    response.raise_for_status()
    return response.content

def export_rdf(edg_url, graphID, username, password, exclude_values_rules, fmt, include_inferences, keep_edg_triples):
    url = f"{edg_url}/tbl/service/{graphID}/tbs/exportRDFFile"
    params = {
        "excludeValuesRules": str(exclude_values_rules).lower(),
        "format": fmt,
        "includeInferences": str(include_inferences).lower(),
        "keepEDGTriples": str(keep_edg_triples).lower(),
    }
    response = requests.get(url, auth=HTTPBasicAuth(username, password), params=params)
    response.raise_for_status()
    return response.content

def create_asset_collection(edg_url, graphID, type_label, username, password):
    url = f"{edg_url}/tbl/service/_/tbs/createAssetCollection"
    params = {'defaultNamespace': 'www.temporary.org', 'id': graphID, 'name': graphID, 'typeLabel': type_label}
    response = requests.post(url, auth=HTTPBasicAuth(username, password), params=params)
    response.raise_for_status()
    return True

def import_rdf(edg_url, graphID, username, password, rdf_content, fmt='turtle'):
    url = f"{edg_url}/tbl/service/{graphID}/tbs/importRDFFile"
    files = {
        "file": (f"{graphID}.ttl", rdf_content, "text/turtle"),
        "fileName": (None, f"{graphID}.ttl"),
        "format": (None, fmt),
    }
    response = requests.post(url, files=files, auth=HTTPBasicAuth(username, password), headers={"accept": "application/json"})
    response.raise_for_status()
    return True

def process_list(collections_file, url_source, url_target,
                 source_username, source_password, target_username, target_password,
                 exclude_values_rules, fmt, include_inferences, keep_edg_triples):
    with open(collections_file, 'r') as file:
        collections = [line.strip() for line in file.readlines() if line.strip()]
    for graph_id in collections:
        rdf = export_rdf(url_source, graph_id, source_username, source_password, exclude_values_rules, fmt, include_inferences, keep_edg_triples)
        type_label = asset_collection_type(url_source, graph_id, source_username, source_password)
        create_asset_collection(url_target, graph_id, type_label, target_username, target_password)
        import_rdf(url_target, graph_id, target_username, target_password, rdf, fmt=fmt)
        print(f"Migrated: {graph_id}")

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Export RDF from EDG and import into another EDG instance.")
    parser.add_argument("--collections_file", required=True)
    parser.add_argument("--url_source", required=True)
    parser.add_argument("--url_target", required=True)
    parser.add_argument("--source_username", required=True)
    parser.add_argument("--source_password", required=True)
    parser.add_argument("--target_username", required=True)
    parser.add_argument("--target_password", required=True)
    # argparse's type=bool treats any non-empty string (including "false") as True,
    # so boolean options are accepted as explicit 'true'/'false' strings instead
    parser.add_argument("--excludeValuesRules", choices=['true', 'false'], default='true')
    parser.add_argument("--format", type=str, default="turtle")
    parser.add_argument("--includeInferences", choices=['true', 'false'], default='true')
    parser.add_argument("--keepEDGTriples", choices=['true', 'false'], default='true')
    args = parser.parse_args()
    process_list(
        args.collections_file,
        args.url_source,
        args.url_target,
        args.source_username,
        args.source_password,
        args.target_username,
        args.target_password,
        args.excludeValuesRules == 'true',
        args.format,
        args.includeInferences == 'true',
        args.keepEDGTriples == 'true'
    )
Download script here: sendProjects_approach3.py
Warning
Included asset collections are not automatically migrated by this approach. Ensure the input list includes all required dependencies.
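One way to keep the dependency set complete is to maintain a separate file listing shared dependency collections and merge it into each migration list before running the script. A minimal sketch (the helper and file names are hypothetical, and `core_ontology` is an invented collection name):

```python
def merge_collection_lists(*paths):
    # Combine several input files into one ordered, de-duplicated list,
    # so shared dependencies are migrated exactly once
    seen, merged = set(), []
    for path in paths:
        with open(path) as fh:
            for line in fh:
                name = line.strip()
                if name and name not in seen:
                    seen.add(name)
                    merged.append(name)
    return merged

# Example: combine a per-project list with a shared dependency list
with open('project_collections.txt', 'w') as fh:
    fh.write("geography_ontology\ngeo\n")
with open('shared_dependencies.txt', 'w') as fh:
    fh.write("geo\ncore_ontology\n")

print(merge_collection_lists('project_collections.txt', 'shared_dependencies.txt'))
# → ['geography_ontology', 'geo', 'core_ontology']
```

The merged result can then be written out and passed as the --collections_file input.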
Approach 4: Git Integration (EDG 8.3+)
EDG 8.3 and later include improved Git integration that supports migration and promotion workflows.
A common pattern is to use:
A single Git repository with multiple branches (e.g., Dev/Test/Prod)
Push/pull from EDG to Git rather than server-to-server transfers
Git Configuration in EDG
Configure Git repositories under:
Product Configuration → Git Integration
Best practices:
Configure a separate repository entry per target (e.g., Dev and Prod)
Add a Git repository password (token-based authentication)
Configure which EDG users may access each repository
Linking an Asset Collection to Git in EDG
After Git is configured:
Open the asset collection to migrate
Use the cloud icon to Link to File on Git
Choose a new file name (e.g., geo.ttl) or select an existing file
Use push to write the collection to Git
In the target EDG, create a corresponding collection and link it to the same file
Use pull to populate the collection from Git
Important
Only one asset collection can be connected to a given Git file at a time.
You can remove the connection via the Git integration instance (Modify → Delete); the form also shows metadata such as the last push/pull execution.
Determining Changes Between Collections
Collections can be compared in EDG or externally. Common validation use cases include:
Verifying changes before migration
Confirming a promotion package
Reviewing additions and deletions between versions
The main approaches are:
EDG Comparison Report
EDG Workflow and Workflow Reports
Git diff in GitHub
Comparison Report
The Comparison Report is available under the Reports tab of an asset collection.
Typical usage:
Open the source collection
Navigate to Reports → Comparison Report
Select the target collection from the dropdown
Review additions and deletions detected by EDG
This approach is fast and requires no export.
EDG Workflow and Workflow Reports
A workflow can be used to create a controlled, reviewable change set:
Export the updated collection as Turtle (TTL)
In the original collection, create a new workflow (e.g., Basic Workflow)
Use Make Changes → Import within the workflow
To capture additions and deletions, select Replace previous contents during import
Review locations:
Workflow Reports panel: structured list of detected changes
Workflow Preview panel: git-style diff view
This approach supports approve/reject governance before committing changes.
Git diff in GitHub
If you push versions to Git:
Push the original collection to Git
Apply updates (or import a new version), then push again
Use GitHub diff between commits to review changes
This is especially useful for teams already using Git-based review processes.
See also
EDG environment promotion patterns and governance controls are covered in Best Practices for Dev, Test, and Prod EDG Servers.