Publishing data
Publishing on an Endpoint requires the writer (or admin) role — see Requesting access and the role tiers.
Once you have a role, the navigation bar shows a + New menu. The menu
groups every creation flow the Endpoint offers — six in total, split into
catalog containers and data resources.
At a glance
| Kind | What it is |
|---|---|
| Organization | A top-level group that owns datasets and services. |
| Dataset | A logical container of related resources, owned by an organization. |
| Service | A network-accessible service (REST API, web app, etc.) owned by an organization. |
| URL resource | A link to a file or service (CSV, JSON, NetCDF, stream, …). |
| S3 resource | An object in S3-compatible storage. |
| Kafka topic | A streaming data flow registered as a system dataset. |
Catalog containers (Organization, Dataset, Service) are the things users search and browse. Data resources (URL, S3, Kafka) live inside a dataset and point at the actual data.
A typical publishing flow
- Create an Organization (once per group/lab/project).
- Create a Dataset under that Organization — give it a title, description and tags.
- Add one or more resources to the dataset: a URL to a file, an object in S3, or a Kafka topic — whichever describes where the data lives.
You can also register Services (an API, a dashboard, a webhook…) under the
special services organization — these appear in the catalog so other users
can discover and call them.
Field cheat-sheet
Required fields in bold. Slugs (name, service_name, resource_name,
dataset_name) accept lowercase letters, digits, _ and -.
Organization
name, title, description.
Dataset
name, title, owner_org, notes, tags, groups,
license_id, version, extras, resources, private.
Service
service_name, service_title, owner_org (must be services),
service_url, service_type (one of API, UI, Trigger or
free-text), notes, health_check_url, documentation_url, extras.
URL resource
resource_name, resource_title, owner_org, resource_url,
file_type (one of stream, CSV, TXT, JSON, NetCDF or custom),
processing (type-specific: CSV delimiter and header line, JSON data_key,
NetCDF group, …), notes, mapping, extras.
S3 resource
resource_name, resource_title, owner_org, resource_s3
(s3://bucket/path or http(s)://…), notes (may be empty), extras.
Kafka topic
dataset_name, dataset_title, owner_org, kafka_topic,
kafka_host, kafka_port (1–65535), dataset_description,
mapping, processing, extras.
Managing what you publish
On the Search page, your own items expose Publish and Delete actions. Use them to push a dataset upstream (when the Endpoint is connected to a staging Pre-CKAN or to the federation) or to retire it.
When the data is large
For large object datasets, the writer can also use the S3 Management tool to create buckets, upload objects (drag-and-drop or file picker), generate presigned URLs for time-limited sharing, view metadata, and delete objects. S3 Management is restricted to writers and admins.
When you have many things to publish
If you have many datasets or files to register, do it from code. The
ndp-ep Python library performs every
operation in this page from a script — see
Automating with Python.