Metadata Management Overview

To get started with the Hydra platform, first create a data topic by registering its metadata. Metadata describes the topic (stream) as a whole, e.g. topic name, id, and owner. Producers register their metadata via an endpoint; the platform stores the metadata, uses it to create the Kafka topic, and registers a corresponding Avro schema for the topic.

To update metadata fields or the schema, submit a new payload that includes all required fields.

Topic Metadata Configuration

Topics are created once and only once, the first time metadata is registered for a given topic. The platform defaults to 3 partitions and a replication factor of 3. If these defaults do not meet your use case, please reach out to the data platform team for further instructions.

Field	Purpose	Defined By	Required?
id	Stream GUID	System Generated	N/A
createdDate	Date/time topic was created	System Generated	N/A
subject	Stream name	Data Producer	Mandatory
streamType	Notification, CurrentState, or History	Data Producer	Mandatory
derived	Whether the topic is derived from other data; false means this topic is from the source of truth	Data Producer	Mandatory
dataClassification	Public, InternalUseOnly, ConfidentialPII or ConfidentialFinancial	Data Producer	Mandatory
contact	Preferred method for contacting the data owner, e.g. Slack or email	Data Producer	Mandatory
schema	Entity's payload schema definition	Data Producer	Mandatory
additionalDocumentation	Location of additional data documentation	Data Producer	Optional
notes	Additional notes	Data Producer	Optional
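
The table above can be turned into a client-side pre-flight check. The following is a minimal Python sketch, not part of Hydra: check_metadata is a hypothetical helper, and the field names and allowed values are taken from the table. The platform may apply additional validation that this sketch does not cover.

```python
# Mandatory producer-defined fields and allowed values, copied from the table above.
MANDATORY_FIELDS = (
    "subject", "streamType", "derived", "dataClassification", "contact", "schema",
)
STREAM_TYPES = {"Notification", "CurrentState", "History"}
DATA_CLASSIFICATIONS = {
    "Public", "InternalUseOnly", "ConfidentialPII", "ConfidentialFinancial",
}


def check_metadata(payload):
    """Return a list of problems found in a metadata payload (empty if none).

    Hypothetical client-side helper; it only mirrors the table in this document.
    """
    problems = [
        f"missing mandatory field: {name}"
        for name in MANDATORY_FIELDS
        if name not in payload
    ]
    if "streamType" in payload and payload["streamType"] not in STREAM_TYPES:
        problems.append(f"unknown streamType: {payload['streamType']}")
    if ("dataClassification" in payload
            and payload["dataClassification"] not in DATA_CLASSIFICATIONS):
        problems.append(
            f"unknown dataClassification: {payload['dataClassification']}"
        )
    if "derived" in payload and not isinstance(payload["derived"], bool):
        problems.append("derived must be a boolean")
    return problems
```

Running the check before submitting a payload gives faster feedback than waiting for the endpoint to reject it.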

Topic Schema

The schema is included in the payload as a regular JSON object and is subject to Avro's schema evolution rules. An invalid evolution is rejected with a 400 Bad Request response.

Some basic ground rules for schema evolutions are as follows:

Field types cannot change after creation. Changing a type is a breaking change that requires a new version of the schema.
A new field must be optional: declare it with Avro's union type and give it a default value.
To remove a field, first make it optional and nullable using the union type.
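
The rule for adding a field can be sketched in Python as follows. This is an illustration only: add_optional_field is a hypothetical helper, not a Hydra API. It relies on the Avro rule that a union field's default must match the first type in the union, so "null" is listed first and the default is null.

```python
import copy


def add_optional_field(schema, name, avro_type):
    """Return a copy of `schema` with a new nullable field that defaults to null.

    Hypothetical helper: `schema` is an Avro record schema held as a Python dict.
    """
    if any(field["name"] == name for field in schema["fields"]):
        raise ValueError(f"field already exists: {name}")
    evolved = copy.deepcopy(schema)  # leave the caller's schema untouched
    evolved["fields"].append(
        # "null" comes first so that the null default is valid for the union.
        {"name": name, "type": ["null", avro_type], "default": None}
    )
    return evolved
```

For example, add_optional_field(schema, "testField2", "int") appends a field of type ["null", "int"] with a null default, while existing fields keep their original types.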

Note on Required Fields:

There are some fields that Hydra requires within a schema to ensure better traceability of messages. Examples of these fields include:

eventName - a name for the event that occurred

eventTime - a timestamp for when the event occurred

Please refer to our documentation on Avro schemas for more information.

Endpoint

Submit metadata payloads with a POST request to http://{{hydra-host-url}}:{{port}}/topics, and include a Content-Type header of application/json.

Example curl request:

curl -X POST http://{{hydra-host-url}}:{{port}}/topics -H "Content-Type: application/json" -d '{...}'
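
The same request can be built from Python with only the standard library. This is a sketch under stated assumptions: build_topic_request is a hypothetical helper, and the base URL passed to it stands in for http://{{hydra-host-url}}:{{port}}. The function only constructs the request, so it can be inspected before anything is sent.

```python
import json
import urllib.request


def build_topic_request(base_url, payload):
    """Build (but do not send) a POST request for the /topics endpoint.

    Hypothetical helper: `base_url` is the Hydra host and port,
    `payload` is the metadata as a Python dict.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/topics",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the result to urllib.request.urlopen. A non-2xx response raises urllib.error.HTTPError, whose code attribute holds the HTTP status.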

Payload Example

Here is an example of a metadata payload:

{
  "subject": "exp.dataplatform.TestSubject",
  "streamType": "Notification",
  "derived": false,
  "dataClassification": "Public",
  "contact": "bob@myemail.com",
  "additionalDocumentation": "This is a test stream of data",
  "notes": "additional notes",
  "schema": {
    "namespace": "exp.dataplatform",
    "name": "TestSubject",
    "type": "record",
    "version": 1,
    /* hydra.key can be a single field or a comma-separated list of fields */
    "hydra.key": "testField",
    "fields": [
      {
        "name": "testField",
        "type": "string"
      },
      {
        "name": "testField2",
        "type": ["null", "int"],
        "default": null
      },
      /* example of some fields required for ENTITY data streams. Strongly recommended to include for ALL streams. */
      {
        "name": "eventName",
        "type": "string"
      },
      {
        "name": "eventType",
        "type": "string"
      }
      /* end examples */
    ]
  }
}
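
As a client-side sanity check before submitting, the sketch below lists any hydra.key entries that are not declared in the schema's fields. It assumes that key entries are meant to name declared fields, which the example suggests but this document does not state. missing_key_fields is a hypothetical helper, not part of Hydra.

```python
def missing_key_fields(schema):
    """Return hydra.key entries that do not match any declared field name.

    Hypothetical helper: `schema` is the "schema" object of a metadata payload,
    held as a Python dict.
    """
    # hydra.key is a single field name or a comma-separated list of names.
    key_fields = [
        part.strip()
        for part in schema.get("hydra.key", "").split(",")
        if part.strip()
    ]
    declared = {field["name"] for field in schema.get("fields", [])}
    return [name for name in key_fields if name not in declared]
```

An empty result means every key entry refers to a field that exists in the schema.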