© 2023–2025 Rivalz Technologies Ltd.

Dataset

Large language models are generally trained on publicly available data, and each training run requires a significant amount of computational power. As a result, a model's knowledge usually does not include private domain knowledge, and its public knowledge lags behind current information. A common solution is Retrieval-Augmented Generation (RAG): the user's question is matched against the most relevant external data, and the retrieved content is then reorganized and inserted into the model prompt as context.
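The retrieve-then-insert flow described above can be sketched in a few lines. This is a minimal illustration, not Glik's implementation: the function names (`tokenize`, `score`, `retrieve`, `build_prompt`) are hypothetical, and simple word overlap stands in for whatever matching a real retrieval pipeline uses (typically embeddings or keyword indexes).

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question: str, passage: str) -> int:
    """Count the tokens shared by the question and the passage.

    Assumption: word overlap is a stand-in for a real relevance measure.
    """
    shared = Counter(tokenize(question)) & Counter(tokenize(passage))
    return sum(shared.values())


def retrieve(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k passages with a non-zero score, best first."""
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    return [p for p in ranked[:top_k] if score(question, p) > 0]


def build_prompt(question: str, passages: list[str], top_k: int = 2) -> str:
    """Insert the retrieved passages into the prompt as context."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    docs = [
        "Refunds are processed within 14 days of the return request.",
        "The office is closed on public holidays.",
        "Premium plans include priority support.",
    ]
    print(build_prompt("How long do refunds take?", docs))
```

The resulting string is what would be sent to the model: the private content reaches the model through the prompt, so no retraining is needed.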

To learn more, see the extended reading on Retrieval-Augmented Generation (RAG).

Glik's knowledge base feature visualizes each step in the RAG pipeline and provides a simple user interface that helps application builders manage personal or team knowledge bases and quickly integrate them into AI applications. You only need to prepare text content, such as:

  • Long text content (TXT, Markdown, DOCX, HTML, JSONL, or even PDF files)

  • Structured data (CSV, Excel, etc.)

Additionally, we are gradually adding support for synchronizing data from various data sources into datasets, including:

  • Web Scraping

  • Notion

  • Google Drive (Coming soon)

  • OneDrive (Coming soon)

Scenario: If your company wants to build an AI customer service assistant on top of its existing knowledge base and product documentation, you can upload the documents to a dataset in Glik and build a chatbot. In the past, this might have taken weeks and been difficult to maintain.

Dataset and Documents

In Glik, a dataset is a collection of documents. A dataset can be integrated into an application as retrieval context. Documents can be uploaded by developers or members of the operations team, or synchronized from other data sources (each document usually corresponds to one file in the data source).
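The relationship between a dataset and its documents can be pictured as a simple data structure. The sketch below is illustrative only: the class names, fields, and the `source` labels are assumptions made for this example and do not reflect Glik's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """One document, usually corresponding to one file in its source."""
    name: str
    text: str
    source: str = "upload"  # hypothetical label: "upload", "notion", ...


@dataclass
class Dataset:
    """A named collection of documents."""
    name: str
    documents: list[Document] = field(default_factory=list)

    def add(self, document: Document) -> None:
        """Attach a document, whether uploaded or synchronized."""
        self.documents.append(document)

    def as_retrieval_context(self) -> list[str]:
        """Expose the document texts for an application to retrieve from."""
        return [doc.text for doc in self.documents]


if __name__ == "__main__":
    support = Dataset("customer-support")
    support.add(Document("faq.md", "Refunds are processed within 14 days."))
    support.add(Document("Pricing", "Premium plans include priority support.",
                         source="notion"))
    print(len(support.documents), support.as_retrieval_context())
```

An application would attach one or more such datasets and draw its retrieval context from them.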

Last updated 3 months ago
