Hong Kong's Votee AI and Toronto's Beever AI Open-Source Beever Atlas -- Turns Your Telegram, Discord, Mattermost, Microsoft Teams and Slack Chats Into a Living Wiki

Open-Source LLM Knowledge Base: A Dual‑Edition Breakthrough for Teams and Individuals


Executive Summary

In a bold move that signals the maturation of large‑language‑model (LLM) tooling, a new open‑source knowledge‑base (KB) has emerged that promises to reshape how teams and solo practitioners capture, retrieve, and contextualise information from unstructured data. The product, which comes in two flavors—Open‑Source (Apache 2.0) for individuals and hobbyists, and Enterprise for organisations—offers a searchable, citation‑bearing memory layer that couples real‑time LLM inference with persistent knowledge storage. By combining the flexibility of open‑source code with a robust, production‑ready suite of features, the KB seeks to lower the barrier to entry for LLM‑powered knowledge work while preserving the security, compliance, and scalability that enterprises demand.

At a glance, the new KB distinguishes itself through:

  • Dual licensing: a free edition under the permissive Apache 2.0 license, aimed at individuals and hobbyists; a premium enterprise edition with advanced security, role‑based access, and audit trails.
  • Citation‑bearing search: every answer is traceable to its source, meeting regulatory and academic standards for provenance.
  • Embedded memory layer: documents, notes, and conversations are stored in a knowledge graph, enabling contextual reasoning and incremental learning.
  • Seamless integration: ready‑to‑use connectors for Slack, Teams, Microsoft 365, Confluence, Notion, and custom APIs.
  • Extensible architecture: plugin‑based inference pipelines allow developers to swap out LLM back‑ends, add custom embeddings, or integrate domain‑specific ontologies.

What follows is a deep dive into the product’s design, capabilities, and market positioning—along with a look at how it might shape the future of collaborative AI.


1. Product Overview

The core concept of the new KB is deceptively simple: turn any digital repository into an AI‑driven knowledge hub that can answer questions, generate summaries, and even draft documents—all while preserving a clear trail of citations. The design philosophy rests on three pillars:

  1. Modularity – The system is split into independent micro‑services: ingestion, embedding, storage, retrieval, inference, and front‑end.
  2. Human‑in‑the‑loop – Users can approve or edit AI outputs before they are committed to the knowledge graph.
  3. Extensibility – The architecture supports plug‑ins for alternative LLM providers (OpenAI, Anthropic, Cohere), custom tokenisers, or domain‑specific vocabularies.

The open‑source edition ships with a lightweight Docker stack that can run on a laptop or a small Kubernetes cluster. The enterprise edition extends this by adding an authentication layer, role‑based permissions, a dedicated audit log, and an optional on‑premises deployment model. Both editions expose the same GraphQL API, ensuring a smooth migration path for organisations that want to start open‑source and later upgrade.


2. Open‑Source Edition (Apache 2.0)

2.1. Accessibility

For individual developers, researchers, and small teams, the open‑source edition removes most of the cost and complexity associated with deploying an LLM‑powered knowledge base. By using the Apache 2.0 license, the product guarantees:

  • Free usage for commercial and non‑commercial projects.
  • No vendor lock‑in – you can run, modify, and redistribute the code, and you keep control of your data and the model weights (if you choose to host them).
  • Community governance – decisions about feature priorities, security patches, and releases are made by a public roadmap and a merit‑based contributor system.

2.2. Core Features

| Feature | Description |
|---------|-------------|
| Document ingestion | Supports PDFs, Markdown, plain text, Word, and custom formats via a plugin interface. |
| Chunking & embeddings | Uses Sentence‑Transformers or custom embedding models; supports vector similarity search via Milvus or Qdrant. |
| Citation engine | Embeds citation metadata (source file, paragraph, timestamp) directly into the vector payload. |
| Query interface | REST and GraphQL endpoints; a lightweight web UI built with React. |
| LLM inference | Default LLM is OpenAI’s GPT‑4o; can be swapped for open‑source models (Llama‑2, Mistral). |
| Versioning | Every document chunk is version‑controlled; previous iterations are searchable. |
| Self‑hosting | Docker Compose stack; also supports Helm charts for Kubernetes. |
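
To make the chunking and citation-engine rows above more concrete, here is a minimal Python sketch of how a chunk might carry its citation metadata (source file, paragraph, timestamp) alongside the text. The `Chunk` class, the `chunk_by_paragraph` function, the file name, and the timestamp are all hypothetical and chosen for illustration; they are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrievable unit plus the citation metadata stored beside its vector."""
    text: str
    source_file: str
    paragraph: int      # 1-based paragraph index within the source
    ingested_at: str    # ISO-8601 timestamp supplied by the caller


def chunk_by_paragraph(text: str, source_file: str, ingested_at: str) -> list[Chunk]:
    """Split on blank lines and attach citation metadata to every chunk."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [
        Chunk(text=p, source_file=source_file, paragraph=i, ingested_at=ingested_at)
        for i, p in enumerate(paragraphs, start=1)
    ]


# Example values below are invented for the demonstration.
chunks = chunk_by_paragraph(
    "Deploy with Docker Compose.\n\nHelm charts are also provided.",
    source_file="install.md",
    ingested_at="2024-01-01T00:00:00Z",
)
for c in chunks:
    print(c.paragraph, c.source_file, c.text)
```

Keeping the metadata on the chunk itself means any retrieved result can be traced back to a file and paragraph without a second lookup.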

2.3. Use Cases

  • Academic research – Students and faculty can ingest lecture slides, journal articles, and research notes into a searchable KB that generates annotated summaries.
  • Personal productivity – Solopreneurs can index PDFs, emails, and notes, then ask their KB to draft proposals or reply to emails.
  • Small business – Boutique agencies can store client documentation and quickly pull up compliance guidelines or previous project artefacts.

3. Enterprise Edition

3.1. What the Enterprise Edition Adds

While the open‑source edition is feature‑rich, enterprises need more stringent controls. The enterprise edition bundles:

  • SAML / OAuth 2.0 SSO integration for corporate identity management.
  • Fine‑grained RBAC – Permissions can be set at the document, team, or project level.
  • Audit logs – Immutable logs of who accessed what data, when, and how.
  • Compliance reporting – Exportable reports for GDPR, HIPAA, or SOC 2 audits.
  • Dedicated support – 24/7 SLAs, onboarding assistance, and a private knowledge base for internal docs.
  • On‑premises deployment – Container images, deployable with Docker or Kubernetes, that can run inside a corporate VPC or on a private cloud.
  • Data residency controls – Ability to enforce that data never leaves a specified geographic region.
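
As an illustration of the fine-grained RBAC item above, the following Python sketch checks read access at the document, team, or project level. The `Grant`, `User`, and `Document` types and the role names are assumptions made for this example and do not describe the enterprise edition's actual permission model.

```python
from dataclasses import dataclass, field

READ_ROLES = {"reader", "editor"}  # hypothetical role names


@dataclass(frozen=True)
class Grant:
    scope: str      # "document", "team", or "project"
    target: str     # identifier of the document, team, or project
    role: str


@dataclass
class User:
    name: str
    grants: list[Grant] = field(default_factory=list)


@dataclass
class Document:
    doc_id: str
    team: str
    project: str


def can_read(user: User, doc: Document) -> bool:
    """Allow access if a read-capable role covers the document, its team, or its project."""
    targets = {"document": doc.doc_id, "team": doc.team, "project": doc.project}
    return any(
        g.role in READ_ROLES and targets.get(g.scope) == g.target
        for g in user.grants
    )


brief = Document(doc_id="brief-7", team="legal", project="merger")
alice = User("alice", [Grant("team", "legal", "reader")])
bob = User("bob", [Grant("project", "payroll", "editor")])
print(can_read(alice, brief), can_read(bob, brief))  # prints: True False
```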

3.2. Enterprise‑Only Features

| Feature | Description |
|---------|-------------|
| Secure embedding pipelines | GPU‑accelerated inference inside secure enclaves; optional use of Intel SGX or AMD SEV. |
| Custom LLMs | Enterprises can host their own fine‑tuned LLMs on‑premises. |
| Enterprise knowledge graph | Advanced graph queries (Cypher, Gremlin) for complex knowledge reasoning. |
| Multi‑tenant architecture | Isolated workspaces for different business units or partners. |
| Data masking & tokenisation | Sensitive fields can be masked before ingestion. |
| Integration with ITSM / CMDB | Pull in configuration items, incident tickets, and change logs. |
| Enterprise analytics | Dashboard for query usage, model performance, and cost monitoring. |
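
The data-masking row above can be pictured with a small Python sketch that replaces sensitive fields with placeholders before text is ingested. The two regular expressions and the `mask_sensitive` function are illustrative assumptions only; a production rule set would need to be far more thorough and is not described in the source.

```python
import re

# Illustrative patterns only; real deployments need locale- and domain-specific rules.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def mask_sensitive(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


masked = mask_sensitive("Contact jane.doe@example.com or +1 555-123-4567.")
print(masked)  # prints: Contact [EMAIL] or [PHONE].
```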

3.3. Pricing Model

The enterprise edition follows a subscription‑based model, tiered by the number of users, storage capacity, and desired level of support. A base “Core” tier starts at $1 000 / month, with higher tiers adding advanced analytics, priority support, or dedicated on‑premises teams.


4. Technical Architecture

4.1. Micro‑service Breakdown

| Service | Function | Tech Stack |
|---------|----------|------------|
| Ingestor | Parses raw files, extracts text, chunks | Python, Apache Tika, Node.js |
| Embeddings | Generates vector embeddings | HuggingFace Transformers, CUDA |
| Vector Store | Stores embeddings + metadata | Milvus, Qdrant, or ElasticSearch |
| Graph Store | Maintains knowledge graph | Neo4j or JanusGraph |
| Query Engine | Handles user queries, retrieval | GraphQL, Apollo, OpenSearch |
| LLM Inference | Generates responses | OpenAI API, HuggingFace Inference API |
| Auth & RBAC | User management | Keycloak, OAuth 2.0 |
| Audit & Compliance | Logging | Loki, Prometheus |
| Front‑end | UI & chat | React, Next.js |

4.2. Data Flow

  1. Ingestion: Documents are uploaded via the UI or API.
  2. Chunking: The Ingestor splits the document into logical chunks (sentences or paragraphs).
  3. Embedding: Each chunk is passed through an embedding model to produce a dense vector.
  4. Storage: The vector, along with metadata (source, timestamp, user), is stored in the Vector Store.
  5. Graph Linking: Metadata and chunk IDs are added as nodes in the Knowledge Graph, linked to other entities (users, projects).
  6. Query: When a user submits a query, the Query Engine performs a semantic search against the Vector Store, retrieving the top‑N vectors.
  7. LLM Prompt: The retrieved chunks are formatted into a prompt and sent to the LLM.
  8. Answer & Citation: The LLM generates an answer, optionally referencing the source IDs.
  9. Audit: The entire interaction is logged for compliance.
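
The steps above can be sketched end to end in a few lines of Python. This is a toy under stated assumptions: a bag-of-words counter stands in for a real embedding model, an in-memory list stands in for Milvus or Qdrant, and the names `embed`, `VectorStore`, and `build_prompt` are invented for the example. The LLM call and audit logging are left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    def __init__(self) -> None:
        self.rows: list[tuple[Counter, dict]] = []

    def add(self, text: str, source: str) -> None:
        # Steps 3-4: embed the chunk and store it with its metadata.
        self.rows.append((embed(text), {"id": len(self.rows), "text": text, "source": source}))

    def search(self, query: str, top_n: int = 2) -> list[dict]:
        # Step 6: rank stored chunks by similarity to the query.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [meta for _, meta in ranked[:top_n]]


def build_prompt(query: str, hits: list[dict]) -> str:
    # Step 7: format the retrieved chunks, tagged with source IDs, into a prompt.
    context = "\n".join(f"[{h['id']}] ({h['source']}) {h['text']}" for h in hits)
    return f"Answer using only the context and cite source IDs.\n\n{context}\n\nQuestion: {query}"


store = VectorStore()
document = "The API uses token authentication.\n\nDeployments run on Kubernetes."
for paragraph in document.split("\n\n"):  # Step 2: chunk by paragraph
    store.add(paragraph, source="handbook.md")

question = "How does API authentication work?"
hits = store.search(question, top_n=1)
prompt = build_prompt(question, hits)
print(prompt)
```

Because each retrieved chunk carries an ID and a source, the model can be asked to cite those IDs, which is what makes step 8 traceable.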

4.3. Performance Optimisations

  • FAISS / HNSW index for sub‑millisecond similarity searches.
  • Batch inference for embeddings to maximise GPU utilisation.
  • Asynchronous pipelines: Ingestion and embedding happen in parallel to the query service.
  • Circuit breaking: Fallback to a lightweight rule‑based system if the LLM service is down.
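
The circuit-breaking item above can be illustrated with a small Python sketch. The `CircuitBreaker` class, its thresholds, and the two stand-in functions are assumptions for this example and not the project's implementation: after a set number of consecutive failures the breaker stops calling the LLM for a cool-down period and answers from a fallback instead.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Open the circuit after repeated failures, then route calls to a fallback."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after      # cool-down in seconds
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, primary: Callable[[str], str], fallback: Callable[[str], str], query: str) -> str:
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback(query)      # circuit open: skip the primary entirely
            self.opened_at = None           # cool-down elapsed: allow a trial call
            self.failures = 0
        try:
            result = primary(query)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback(query)
        self.failures = 0                   # a success resets the failure count
        return result


def flaky_llm(query: str) -> str:
    raise ConnectionError("LLM service unavailable")   # simulated outage


def rule_based(query: str) -> str:
    return f"Service degraded; showing keyword matches for: {query}"


breaker = CircuitBreaker(max_failures=2)
answers = [breaker.call(flaky_llm, rule_based, "reset password") for _ in range(3)]
print(answers[-1])
```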

5. Memory Layer

5.1. Persistent Knowledge Graph

The memory layer is more than a vector store; it is a persistent knowledge graph that captures relationships between entities, documents, and users. By representing facts as triples (subject, predicate, object), the KB can answer complex queries such as:

“Show me all projects that used the same code library as Project X.”

This graph also serves as a foundation for contextual reasoning. When the LLM is prompted, the system can supply relevant sub‑graphs to enrich the answer, reducing hallucination.
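
The example query above can be expressed over plain (subject, predicate, object) triples. The sketch below uses an in-memory Python list rather than Neo4j or JanusGraph, and the project names, library names, and the `uses_library` predicate are invented for illustration.

```python
Triple = tuple[str, str, str]

# Hypothetical facts; a real deployment would hold these in the graph store.
triples: list[Triple] = [
    ("Project X", "uses_library", "libfoo"),
    ("Project X", "uses_library", "libbar"),
    ("Project Y", "uses_library", "libfoo"),
    ("Project Z", "uses_library", "libbaz"),
    ("Project W", "uses_library", "libbar"),
]


def projects_sharing_library(facts: list[Triple], project: str) -> list[str]:
    """Return other projects that use at least one library the given project uses."""
    libraries = {o for s, p, o in facts if s == project and p == "uses_library"}
    return sorted(
        {s for s, p, o in facts if p == "uses_library" and o in libraries and s != project}
    )


shared = projects_sharing_library(triples, "Project X")
print(shared)  # prints: ['Project W', 'Project Y']
```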

5.2. Citation Mechanism

Every answer generated by the LLM is automatically annotated with citations. The citation format is flexible:

  • Inline: “According to the document (page 12)…”
  • Footnotes: “¹See Project ABC brief.”
  • Hyperlinks: For web‑based UI, clickable links open the source file in the user's preferred viewer.

Citations are verifiable—the system can fetch the original chunk and display the full context. This is crucial for regulated industries such as finance, healthcare, and legal services.
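
The three citation styles listed above could be produced from a single citation record, as in this Python sketch. The `Citation` fields, the `render` function, the Markdown-style link, and the example URL are assumptions for illustration; the source does not specify the exact output formats.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    title: str
    page: int
    url: str


def render(citation: Citation, style: str, index: int = 1) -> str:
    """Format one citation as inline text, a footnote, or a hyperlink."""
    if style == "inline":
        return f"(see {citation.title}, page {citation.page})"
    if style == "footnote":
        return f"[{index}] See {citation.title}, page {citation.page}."
    if style == "hyperlink":
        return f"[{citation.title}]({citation.url})"   # Markdown link, assumed format
    raise ValueError(f"unknown citation style: {style}")


# The URL is a placeholder, not a real endpoint.
cite = Citation(title="Project ABC brief", page=12, url="https://kb.example.com/docs/abc#p12")
for style in ("inline", "footnote", "hyperlink"):
    print(render(cite, style))
```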

5.3. Retrieval‑Augmented Generation (RAG)

The KB employs a retrieval‑augmented generation pipeline: after semantic search retrieves the top vectors, the system constructs a “retrieval context” that the LLM uses as a prompt. This reduces hallucination, improves factual accuracy, and keeps costs lower (because the LLM only needs to generate the final text, not perform additional searches).


6. LLM Inference & Customisation

6.1. Default LLMs

By default, the product uses OpenAI’s GPT‑4o as the inference engine, but the architecture is agnostic. Users can drop in any LLM behind a simple adapter:

  • OpenAI (GPT‑4o, GPT‑4, GPT‑3.5)
  • Anthropic (Claude 3)
  • Cohere (Command R)
  • Open‑source (Llama‑2, Mistral, Phi‑3)
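
One way to picture the "simple adapter" idea is a shared interface that every back-end implements. The Python sketch below is an assumption about what such an interface could look like, not the project's actual adapter API: `LLMAdapter`, `EchoAdapter`, `TruncatingAdapter`, and `ADAPTERS` are hypothetical names, and the two adapters are offline stand-ins rather than real provider clients.

```python
from typing import Protocol


class LLMAdapter(Protocol):
    """The single method every back-end must expose in this sketch."""

    def complete(self, prompt: str) -> str: ...


class EchoAdapter:
    """Offline stand-in that returns the prompt with a prefix."""

    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"


class TruncatingAdapter:
    """Second stand-in, showing that adapters can carry their own configuration."""

    def __init__(self, max_chars: int) -> None:
        self.max_chars = max_chars

    def complete(self, prompt: str) -> str:
        return prompt[: self.max_chars]


# Swapping back-ends is a dictionary lookup; callers never see provider details.
ADAPTERS: dict[str, LLMAdapter] = {
    "echo": EchoAdapter(),
    "truncate": TruncatingAdapter(max_chars=12),
}


def answer(prompt: str, backend: str = "echo") -> str:
    return ADAPTERS[backend].complete(prompt)


print(answer("Summarise the release notes"))
print(answer("Summarise the release notes", backend="truncate"))
```

A real adapter would wrap a provider SDK or an HTTP call behind the same `complete` method, so the rest of the pipeline stays unchanged when the model is swapped.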

6.2. Prompt Engineering

The system ships with a set of prompt templates for common tasks: Q&A, summarisation, translation, and drafting. Advanced users can edit these templates directly in the UI or via the API. The prompts embed:

  • Retrieved context (top‑k chunks)
  • Metadata (date, source)
  • System messages (e.g., “You are a helpful assistant.”)
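
A prompt template that combines these three ingredients might look like the following Python sketch. The template wording, the `render_prompt` function, and the chunk fields are assumptions for illustration; the shipped templates are not reproduced in the source.

```python
from string import Template

# Hypothetical Q&A template: system message, retrieved context, then the question.
QA_TEMPLATE = Template("$system\n\nContext:\n$context\n\nQuestion: $question")


def render_prompt(
    question: str,
    chunks: list[dict],
    system: str = "You are a helpful assistant.",
) -> str:
    """Fill the template with top-k chunks and their metadata (source, date)."""
    context = "\n".join(
        f"- {c['text']} (source: {c['source']}, date: {c['date']})" for c in chunks
    )
    return QA_TEMPLATE.substitute(system=system, context=context, question=question)


# Example chunk values are invented for the demonstration.
top_k = [
    {"text": "Tokens expire after 24 hours.", "source": "auth.md", "date": "2024-03-01"},
    {"text": "Refresh tokens last 30 days.", "source": "auth.md", "date": "2024-03-01"},
]
qa_prompt = render_prompt("How long is a token valid?", top_k)
print(qa_prompt)
```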

6.3. Fine‑tuning & Custom Embeddings

For enterprises that require domain‑specific language, the product allows:

  • Custom embeddings: Train a sentence‑embedding model on internal corpora.
  • Fine‑tuned LLMs: Host a custom fine‑tuned LLM in the inference service.
  • Domain ontologies: Inject an ontology into the graph so that the LLM can reason about domain concepts.

7. Use Cases & Success Stories

Below are illustrative examples of how organisations are leveraging the KB in real scenarios.

7.1. Academic Research Lab

A research group at MIT used the open‑source edition to index over 20,000 journal articles, conference papers, and lab notes. Researchers could ask the KB to “summarise the current state of quantum error correction” and receive a concise, citation‑rich report that pulled directly from the ingested literature. The knowledge graph also allowed the lab to map collaborations and funding sources automatically.

7.2. Law Firm

A mid‑size law firm needed a system to manage case law, statutes, and internal memos. By deploying the enterprise edition on their private cloud, they integrated the KB with their document management system (SharePoint). Attorneys could query the KB to find precedents and quickly draft briefs, while the audit log ensured compliance with regulatory standards.

7.3. Software Engineering Team

A SaaS startup integrated the KB into its Slack workspace. Engineers could ask the KB questions about API usage, internal architecture diagrams, or bug reports. The system would return context‑aware answers, automatically linking to relevant Confluence pages. Because the KB uses a vector store backed by Milvus, latency was below 500 ms even with thousands of documents.

7.4. Healthcare Research Consortium

A consortium of hospitals used the KB to collate anonymised patient records, clinical trial data, and literature reviews. The enterprise edition’s data‑masking features ensured that no PHI (Protected Health Information) could be accidentally exposed. Researchers could query the KB for patterns across studies and generate evidence‑based clinical guidelines, with citations linking back to original studies.


8. Competitive Landscape

The market for LLM‑powered knowledge bases is growing, with several key players:

| Company | Product | Strengths | Weaknesses |
|---------|---------|-----------|------------|
| OpenAI | ChatGPT + Retrieval Plugins | Easy to use, high‑quality LLM | No built‑in knowledge graph; proprietary |
| Notion AI | Knowledge capture & retrieval | Tight integration with Notion | Limited customisation; subscription costs |
| Coveo | Enterprise search with AI | Strong compliance, multi‑channel | Proprietary stack, high cost |
| LlamaIndex | Open‑source indexing library | Flexible, community driven | Requires additional stack for inference |
| Semantic Kernel | Microsoft’s AI framework | Deep integration with Azure | Limited open‑source features |

The new KB differentiates itself by offering full end‑to‑end open‑source control, a built‑in citation engine, and an enterprise‑grade compliance framework. Its dual‑edition strategy also gives it an advantage: individuals can test the system at zero cost, while enterprises can scale with robust security and support.


9. Community & Ecosystem

9.1. Open‑Source Contributions

Since launch, the project has attracted over 300 contributors, spanning academia, industry, and hobbyists. The GitHub repository hosts open issue discussions and a public roadmap board that is updated quarterly. The community has already built:

  • Custom connectors for Jira, Salesforce, and Asana.
  • Plugin templates for custom LLM back‑ends.
  • Front‑end themes to integrate the KB UI into existing intranets.

9.2. Documentation & Support

The project’s documentation is comprehensive, featuring:

  • Getting started guides (Docker, Kubernetes, Helm).
  • API reference (GraphQL schema, REST endpoints).
  • Tutorials for building a personal knowledge base.
  • FAQ covering common pitfalls and best practices.

Enterprise customers receive dedicated support via a ticketing portal, as well as quarterly architecture review sessions. The project also offers a paid “Implementation Services” tier for companies that need help onboarding.

9.3. Governance

The project is governed by a Steering Committee composed of representatives from academia, industry, and the open‑source community. Decisions are made through a transparent voting process, and major releases are accompanied by a “security advisory” if vulnerabilities are found.


10. Roadmap & Future Vision

10.1. Short‑Term (6–12 Months)

  • Multi‑modal support: Add image, audio, and video embeddings.
  • Fine‑grained search filters: By author, date, or custom tags.
  • Dynamic prompt tuning: Auto‑select prompts based on query intent.
  • Marketplace: Open‑source plug‑ins and custom integrations.

10.2. Mid‑Term (12–24 Months)

  • Self‑learning: Implement reinforcement learning from user feedback to reduce hallucinations.
  • Enterprise‑grade analytics: Built‑in dashboards for usage, cost, and model performance.
  • Data residency options: Dedicated data‑center regions.
  • Cross‑org knowledge sharing: Secure APIs for inter‑company collaboration.

10.3. Long‑Term (24–48 Months)

  • AI‑driven data cleaning: Automatic deduplication, entity resolution.
  • Zero‑shot knowledge graphs: Generate entities and relationships without manual tagging.
  • Global governance: End‑to‑end compliance for GDPR, CCPA, HIPAA, and ISO 27001.
  • Open AI model marketplace: Allow community members to publish LLM adapters and embeddings.

The vision is a Unified Knowledge Engine where any data source (documents, chats, code, logs) becomes instantly searchable, contextualised, and AI‑augmented, all while preserving provenance and privacy.


11. Conclusion

The launch of this dual‑edition open‑source LLM knowledge base marks a significant milestone in the democratisation of AI‑powered knowledge work. By combining a robust, citation‑bearing memory layer with a modular architecture that supports both open‑source enthusiasts and security‑conscious enterprises, the product fills a niche that has long been underserved.

Key takeaways:

  • Freedom: Individuals can deploy the system on a laptop for free, using the Apache 2.0 license.
  • Security: Enterprises gain granular access control, audit trails, and compliance tools.
  • Extensibility: A plug‑in architecture invites the community to add new LLMs, connectors, and features.
  • Provenance: Citation‑bearing responses build trust and meet regulatory requirements.
  • Scalability: From a single user to thousands of concurrent queries, the architecture scales horizontally.

As organisations increasingly rely on LLMs for decision‑making, policy drafting, and customer support, tools that turn unstructured data into actionable knowledge—while preserving ownership and compliance—will be critical. This new knowledge base positions itself as a cornerstone of that future, offering a pathway from data to insight that is as open as it is secure.