Dataset Records

Dataset records are the individual entries that make up a dataset. Each record stores a piece of text, along with source attribution and custom metadata, and together the records form the knowledge base that search operations draw on.

Creating Dataset Records

Creating a dataset is the first step in organizing information, but it only becomes useful once it is populated with records. To create a record, send a POST request with a JSON body to the following endpoint:

    POST /api/v1/dataset/{datasetId}/record/create
    Content-Type: application/json

    {
      "text": "The content of the record goes here.",
      "source": "Optional source information.",
      "meta": {
        "key": "value"
      }
    }
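
As a rough illustration, the create request could be assembled in Python with the standard library. The host in BASE_URL and the helper name build_create_record_request are placeholders invented for this sketch. Authentication headers are left out because this section does not describe them, and treating meta as optional is an assumption.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the real base URL of your API.
BASE_URL = "https://api.example.com"


def build_create_record_request(dataset_id, text, source=None, meta=None):
    """Build (but do not send) a POST request that creates a dataset record."""
    payload = {"text": text}
    # Only include the optional fields when the caller supplied them.
    if source is not None:
        payload["source"] = source
    if meta is not None:
        payload["meta"] = meta
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/dataset/{dataset_id}/record/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_record_request(
    "dts_abc123xyz",
    text="The content of the record goes here.",
    source="Optional source information.",
    meta={"key": "value"},
)
```

Passing the resulting object to urllib.request.urlopen would send it, once the authentication your account requires has been added.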

Deleting a Dataset Record

You can delete a dataset record by sending a POST request to the following endpoint:

POST /api/v1/dataset/{datasetId}/record/{recordId}/delete

Warning: Deleting a dataset record is a permanent action and cannot be undone.
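
Because deletion cannot be undone, a client may want a guard in front of the call. The sketch below is one possible shape, not part of the API: the confirm flag, the helper name, and the BASE_URL host are all assumptions made for illustration, and authentication is omitted.

```python
import urllib.request

# Assumption: placeholder host; substitute the real base URL of your API.
BASE_URL = "https://api.example.com"


def build_delete_record_request(dataset_id, record_id, confirm=False):
    """Build (but do not send) a POST request that permanently deletes a record.

    The confirm flag is a client-side safety check invented for this sketch.
    """
    if not confirm:
        raise ValueError("Deletion is permanent; pass confirm=True to proceed.")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/dataset/{dataset_id}/record/{record_id}/delete",
        method="POST",
    )


delete_request = build_delete_record_request(
    "dts_abc123xyz", "rec_def456ghi", confirm=True
)
```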

Fetching a Specific Dataset Record

Fetching an individual record returns its complete content, source information, metadata, and timestamps. This is useful for verifying record content, debugging search behavior, auditing data accuracy, or displaying record information in administrative interfaces.

When you fetch a record, you receive the full record object: the original text content, any source attribution, custom metadata fields, and system-generated creation and update timestamps. This shows exactly what information is stored and available to search operations.

To retrieve a specific record by its ID, send a GET request:

GET /api/v1/dataset/{datasetId}/record/{recordId}/fetch

Replace {datasetId} with your dataset identifier (e.g., dts_abc123xyz) and {recordId} with the specific record identifier (e.g., rec_def456ghi).
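
A minimal Python sketch of this call might look like the following. The BASE_URL host and the fetch_record helper are hypothetical, authentication is omitted because this section does not describe it, and the opener parameter exists only so the function can be exercised without a network connection.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the real base URL of your API.
BASE_URL = "https://api.example.com"


def fetch_record(dataset_id, record_id, opener=urllib.request.urlopen):
    """Send a GET request for one record and return the decoded JSON object."""
    url = f"{BASE_URL}/api/v1/dataset/{dataset_id}/record/{record_id}/fetch"
    request = urllib.request.Request(url, method="GET")
    # The opener returns a file-like response that json.load can read directly.
    with opener(request) as response:
        return json.load(response)
```

Calling fetch_record("dts_abc123xyz", "rec_def456ghi") with the default opener would perform the real request.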

Response Structure

The response includes the complete record data:

    {
      "id": "rec_def456ghi",
      "text": "Our standard shipping takes 5-7 business days and costs $9.99. Express shipping is available for $24.99 with 2-3 day delivery.",
      "source": "shipping-policy.pdf, page 3",
      "meta": {
        "category": "shipping",
        "lastReviewed": "2024-01-15",
        "verified": true
      },
      "createdAt": "2024-01-10T14:30:00.000Z",
      "updatedAt": "2024-01-15T10:45:00.000Z"
    }

Field Explanations

  • id: Unique identifier for this record
  • text: The actual content that will be searched and retrieved
  • source: Optional attribution indicating where this information came from
  • meta: Custom metadata fields for organization and filtering
  • createdAt: Timestamp when the record was initially created
  • updatedAt: Timestamp of the most recent modification
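
To show how these fields might be consumed, here is a small sketch that maps the response onto a typed Python object. The DatasetRecord class is hypothetical, and the timestamp format string is inferred from the sample response above rather than from a documented guarantee.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional

# Inferred from the sample response, e.g. "2024-01-10T14:30:00.000Z".
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


@dataclass
class DatasetRecord:
    id: str
    text: str
    source: Optional[str]
    meta: Dict[str, Any]
    created_at: datetime
    updated_at: datetime

    @classmethod
    def from_json(cls, payload: Dict[str, Any]) -> "DatasetRecord":
        """Convert a decoded response object into a DatasetRecord."""
        return cls(
            id=payload["id"],
            text=payload["text"],
            source=payload.get("source"),  # source is optional
            meta=payload.get("meta") or {},
            created_at=datetime.strptime(payload["createdAt"], TIMESTAMP_FORMAT),
            updated_at=datetime.strptime(payload["updatedAt"], TIMESTAMP_FORMAT),
        )


sample = json.loads("""
{
  "id": "rec_def456ghi",
  "text": "Our standard shipping takes 5-7 business days.",
  "source": "shipping-policy.pdf, page 3",
  "meta": {"category": "shipping", "verified": true},
  "createdAt": "2024-01-10T14:30:00.000Z",
  "updatedAt": "2024-01-15T10:45:00.000Z"
}
""")
record = DatasetRecord.from_json(sample)
```

Parsing the timestamps into timezone-aware datetime values makes it straightforward to compare createdAt and updatedAt, for example to find records that have not been reviewed recently.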

Common Use Cases

Content Verification: Review the actual text content to ensure accuracy and completeness of information stored in your knowledge base.

Search Debugging: When search results seem incorrect, fetch the actual records being returned to understand what information the AI is working with.

Data Auditing: Verify source attribution and metadata to ensure proper documentation of information provenance.

UI Display: Show detailed record information in administrative dashboards or content management interfaces.

Quality Assurance: Review records systematically to maintain high-quality knowledge base content.

Authorization

You can only fetch records from datasets that belong to your account. Attempting to access records from other users' datasets will result in an authorization error.

Performance Note

Fetching individual records is a lightweight operation suitable for frequent access. For bulk operations or comprehensive dataset reviews, consider using the list endpoint with appropriate filters instead.

Updating a Dataset Record

Modifying existing records allows you to keep your knowledge base current, correct inaccuracies, refine content for better search results, and update metadata as your organizational needs evolve. Record updates automatically trigger re-indexing, ensuring that the new content is immediately searchable and will be reflected in future query results.

When you update a record, you can modify its text content, change source attribution, or update custom metadata fields. The update operation preserves the record's unique identifier while applying your changes and updating the modification timestamp. This maintains referential integrity while allowing content evolution.

Incremental updates let you maintain knowledge base quality without disrupting service, whether you are fixing typos, expanding explanations, updating product information, or refining categorization metadata.

To update an existing record, send a POST request with the fields you want to modify:

    POST /api/v1/dataset/{datasetId}/record/{recordId}/update
    Content-Type: application/json

    {
      "text": "Updated product information: Our premium support package includes 24/7 live chat, priority email response within 2 hours, and dedicated account management.",
      "source": "support-packages-2024.pdf, page 2",
      "meta": {
        "category": "support",
        "tier": "premium",
        "lastReviewed": "2024-01-20",
        "reviewedBy": "support-team"
      }
    }

Replace {datasetId} with your dataset identifier and {recordId} with the identifier of the record you want to update. You only need to include the fields you want to change; omitted fields retain their current values.
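
The partial-update behavior can be mirrored on the client by sending only the fields that were explicitly provided. The following is a sketch under assumptions: BASE_URL, the helper name, and the _UNSET sentinel are invented for illustration, and authentication is omitted.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the real base URL of your API.
BASE_URL = "https://api.example.com"

# Sentinel that distinguishes "not provided" from an explicit None.
_UNSET = object()


def build_update_record_request(
    dataset_id, record_id, text=_UNSET, source=_UNSET, meta=_UNSET
):
    """Build (but do not send) a POST request containing only the changed fields."""
    changes = {
        name: value
        for name, value in (("text", text), ("source", source), ("meta", meta))
        if value is not _UNSET
    }
    if not changes:
        raise ValueError("Provide at least one of text, source, or meta.")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/dataset/{dataset_id}/record/{record_id}/update",
        data=json.dumps(changes).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Change only the text; source and meta are left out of the request body.
update_request = build_update_record_request(
    "dts_abc123xyz",
    "rec_def456ghi",
    text="Express shipping is available for $24.99 with 2-3 day delivery.",
)
```

This section does not say whether a partial meta object is merged into the stored metadata or replaces it, so sending the complete meta object when changing metadata is the cautious choice.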

Updatable Fields

  • text: The primary content that will be searched and retrieved
  • source: Attribution indicating where this information originated
  • meta: Custom metadata object for organization and filtering

Response

Upon successful update, the API returns the record ID:

{ "id": "rec_def456ghi" }

Automatic Re-indexing

When you update a record's text content, the system automatically:

  1. Regenerates embeddings: Creates new vector representations of the updated text for semantic search
  2. Updates search indexes: Ensures the new content is immediately searchable
  3. Maintains record identity: Preserves the record ID and relationships
  4. Updates timestamps: Records when the modification occurred

This automatic re-indexing means your changes take effect immediately without requiring manual reprocessing or service restarts.

Common Update Scenarios

Content Corrections: Fix typos, grammatical errors, or factual inaccuracies discovered through use or review.

Information Updates: Refresh content when underlying facts change, such as policy updates, pricing changes, or product specifications.

Search Optimization: Refine text to improve search relevance by adding keywords, clarifying terminology, or restructuring content.

Metadata Enhancement: Add or update categorization metadata to improve filtering and organization without changing the core content.

Source Attribution: Update source information when content is verified against newer documentation or different authoritative sources.

Best Practices

  • Preserve context: When updating text, maintain enough context for the record to be understood independently
  • Update sources: Keep source attribution current to maintain content provenance
  • Use metadata effectively: Leverage metadata updates for versioning, review tracking, and quality management
  • Test search impact: After significant content updates, verify that search results still return relevant information
  • Batch similar updates: If updating multiple related records, consider doing so in sequence to maintain consistency

Authorization

You can only update records in datasets that belong to your account. Attempting to modify records in other users' datasets will result in an authorization error.

Exporting Records

You can export dataset records for backup or migration in formats such as JSON, JSONL, or CSV. The Accept request header selects the format.

Here is how to export records in JSON format:

    GET /api/v1/dataset/{datasetId}/record/export
    Accept: application/json

To export in JSONL format:

    GET /api/v1/dataset/{datasetId}/record/export
    Accept: application/jsonl

To export in CSV format:

    GET /api/v1/dataset/{datasetId}/record/export
    Accept: text/csv
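
Since the three export requests differ only in their Accept header, a client could select the format from a small lookup table. The sketch below assumes a placeholder BASE_URL and hypothetical helper names, and omits authentication. The parse_jsonl helper relies only on the general JSONL convention of one JSON value per line, since this section does not show the exported record layout.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the real base URL of your API.
BASE_URL = "https://api.example.com"

# Accept header values taken from the export examples in this section.
EXPORT_ACCEPT_HEADERS = {
    "json": "application/json",
    "jsonl": "application/jsonl",
    "csv": "text/csv",
}


def build_export_request(dataset_id, export_format="json"):
    """Build (but do not send) a GET request that exports a dataset's records."""
    try:
        accept = EXPORT_ACCEPT_HEADERS[export_format]
    except KeyError:
        raise ValueError(f"Unsupported export format: {export_format!r}") from None
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/dataset/{dataset_id}/record/export",
        headers={"Accept": accept},
        method="GET",
    )


def parse_jsonl(body):
    """Parse a JSONL body (one JSON value per line) into a list, skipping blank lines."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]
```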

Listing Records

Listing the records in a dataset retrieves the individual entries that make up that dataset. To list them, send a GET request:

GET /api/v1/dataset/{datasetId}/record/list
