DigitalOcean Data & Learning

Fresh data. Agents that remember. Systems that get smarter over time. Fully-managed data infrastructure, integrated with every layer of the DigitalOcean stack, with no egress fees between them.

Integrated Capabilities. One Unified Platform. Scale without Complexity.

AI-Ready Data That You Control

With infrastructure powered by VAST Data, which supports 40% of the world's GPUs, your data is queryable at low latency the moment it arrives. No ETL and no egress fees. The platform is built on open source and queryable with open-source tooling, so you can run anywhere, under terms you control.

Native Integration, No Lock-in

We believe in an open platform. Our Managed Weaviate service uses standard APIs and clients, meaning you can use the tools you already know without being forced into proprietary SDKs or abstractions.

Efficient Unit Economics

Break free from the "cloud tax." By co-locating data and intelligence, we eliminate egress fees between your data layer and inference engines.

Reduced Operational Complexity

We provide a single control panel for your entire AI stack, from raw data storage to agent deployment, so you don't have to manage multiple vendor relationships, separate billing, and complex authentication layers.

AI-ready systems of record

Run your systems of record on managed databases that are scalable, low-latency, highly available, secure, and easy to manage. With our new Advanced Edition, databases can extend the boundaries of scale, failover, and manageability.

Open memory. Fresh data. Systems that get smarter.

Managed Databases

Worry-free database hosting. We offer automated, high-availability setups for MySQL, PostgreSQL, and Redis, removing the need for manual server administration.

Features:

High scalability

Easily scale your database to match business growth. Add CPUs, RAM, and nodes to handle heavier workloads and boost performance. Plus, with automated storage scaling, you’ll never run out of space—no manual intervention required.

Fast, reliable performance

Managed Databases run on enterprise-class hardware for fast performance. Run your clusters on Droplets with shared vCPUs, or choose Droplets with 100% dedicated vCPUs for mission-critical workloads.

End-to-end security

Databases run in your account's private network, and only whitelisted requests via the public internet can reach your database. Data is also encrypted in transit and at rest.
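As a rough illustration of what "encrypted in transit" means for a client, the sketch below builds a PostgreSQL connection URI that insists on TLS through the standard libpq `sslmode=require` parameter. The helper name `postgres_uri` and the host, port, user, and database values are placeholders for illustration; copy the real connection details from your own cluster.

```python
from urllib.parse import quote


def postgres_uri(user: str, password: str, host: str, port: int,
                 database: str, sslmode: str = "require") -> str:
    """Build a libpq-style connection URI with TLS required.

    Credentials and the database name are percent-encoded so characters
    such as '@' or '/' in a password cannot break the URI.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?sslmode={sslmode}"
    )


if __name__ == "__main__":
    # Placeholder values only; none of these come from a real cluster.
    print(postgres_uri("app", "p@ss/word", "db.example.com", 5432, "appdb"))
```

Any PostgreSQL driver that accepts a connection URI can take the resulting string as-is.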

Streamlined setup & maintenance

Launch a database cluster with just a few clicks and then access it via our simplified UI or API. Easily migrate your database from another location with minimal downtime.

Knowledge Bases

A fully managed Retrieval-Augmented Generation (RAG) service that automates the entire pipeline—ingestion, chunking, embedding, retrieval and reranking.

Features:

Diverse sources

Connect to data in S3, Dropbox, and your local file system.

Chunking Strategies

Smart defaults out of the box, deep control when you need it. Pick semantic chunking for topical shifts, hierarchical for precise retrieval with broader grounding, section-based for structured docs, or fixed-length for sheer speed, then evaluate, adjust, and re-index until retrieval lands.
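To make the simplest of these strategies concrete, here is a minimal sketch of fixed-length chunking in plain Python. It is illustrative only: the function name `fixed_length_chunks`, the character-based sizing, and the `overlap` parameter are assumptions for this example, not a description of how Knowledge Bases chunks documents internally.

```python
def fixed_length_chunks(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of at most `size` characters.

    Consecutive windows share `overlap` characters, so text near a boundary
    appears in both neighbouring chunks (hypothetical parameters).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "abcdefghij" * 5  # 50 characters
    for chunk in fixed_length_chunks(sample, size=20, overlap=5):
        print(len(chunk), chunk)
```

Fixed-length splitting is fast because it never inspects the content, which is also why it can cut through a sentence or section; the other strategies trade some speed for boundaries that follow the document's meaning or structure.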

Hybrid Search & Advanced Reranking

Enhance retrieval accuracy with hybrid search, which combines keyword and semantic results. Optionally add a reranking step to your retrieval pipeline: enable bge-reranker-v2-m3 to re-score results with cross-encoder precision for higher precision on the top results, at $0.010 per 1M tokens.
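This page does not say how keyword and semantic results are merged. One widely used technique for that kind of merge is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the service's actual method. The function name and the document IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is a commonly used default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. from a keyword search
    semantic_hits = ["doc_c", "doc_a", "doc_d"]  # e.g. from a vector search
    print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

Documents that rank well in both lists rise to the top. A reranker would then re-score only that short fused list, which is where a cross-encoder's extra cost per result is easiest to justify.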

New Embedding Models

Additional open-source models (e5-large-v2, bge-m3) are now available, offering high-precision English retrieval and versatile support for long-form, multilingual documents.

Model Context Protocol (MCP) Support

Turn your Knowledge Bases into a plug-and-play retrieval tool for any MCP-compatible agent framework.

Managed Weaviate (Private Preview)

Skip the complexity of self-hosting and the high cost of vendor lock-in with Managed Weaviate, an open-source compatible solution that lets you focus on building, not operations. Launch production-ready vector infrastructure for your AI apps with 1-click provisioning. Sign up for early access.

Features:

Simple, predictable pricing

$20/month to start on the Small plan, $120/month for Medium, and $1,600/month for Large, with built-in high availability (HA) and no per-query or per-vector metering.

Built on Weaviate 1.37.1

Full compatibility with the upstream Python, JavaScript/TypeScript, Go, and Java clients, plus REST, gRPC, and GraphQL APIs.
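Because the standard APIs are exposed, a plain HTTP client is enough to talk to a cluster. The sketch below prepares, without sending, a GraphQL query against Weaviate's `/v1/graphql` REST endpoint using only the Python standard library. The cluster URL, the API key, the `Article` collection, and the assumption that the managed service accepts a Bearer API key are all placeholders for illustration; check your cluster's connection details for the real values.

```python
import json
import urllib.request


def build_graphql_request(base_url: str, api_key: str, query: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to Weaviate's /v1/graphql endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/graphql",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder endpoint, key, and collection name: substitute your own.
    request = build_graphql_request(
        "https://my-cluster.example.com",
        "YOUR_API_KEY",
        "{ Get { Article(limit: 3) { title } } }",
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

In practice most applications would use one of the upstream client libraries instead; the raw request is shown only to underline that nothing proprietary sits between your code and the API.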

RQ8 compression by default

Rotational Quantization 8-bit is enabled at cluster level, giving roughly 4x less RAM per vector than uncompressed storage while preserving recall.
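The "roughly 4x" figure follows from simple arithmetic if uncompressed vectors are stored as 32-bit floats (4 bytes per dimension) and the quantized form uses 8 bits (1 byte) per dimension. The sketch below works that out for a hypothetical collection; the collection size and embedding width are made-up inputs, and the estimate covers raw vector data only, not index or metadata overhead.

```python
def vector_ram_bytes(num_vectors: int, dimensions: int, bytes_per_dimension: int) -> int:
    """Raw vector payload only; ignores index and metadata overhead."""
    return num_vectors * dimensions * bytes_per_dimension


if __name__ == "__main__":
    num_vectors = 1_000_000   # hypothetical collection size
    dimensions = 1024         # hypothetical embedding width
    uncompressed = vector_ram_bytes(num_vectors, dimensions, 4)  # float32
    quantized = vector_ram_bytes(num_vectors, dimensions, 1)     # 8 bits per dimension
    print(f"uncompressed: {uncompressed / 1e9:.2f} GB")
    print(f"8-bit:        {quantized / 1e9:.2f} GB")
    print(f"ratio:        {uncompressed / quantized:.0f}x")
```

For these inputs the raw vectors shrink from about 4.1 GB to about 1.0 GB, which is the kind of saving that can decide which plan size a workload fits into.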

Native DigitalOcean ecosystem integration

OpenAI-compatible pairing with DigitalOcean Serverless Inference for embeddings.

Frequently asked questions

Do I need to manage a vector database with Knowledge Bases?

No. Knowledge Bases includes fully managed vector storage and retrieval infrastructure. No need to create a separate vector store.

Can I use my own models with Knowledge Bases?

No. Knowledge Bases currently supports 6 embedding models.

Do Knowledge Bases work with AI agents?

Yes. MCP integration allows any compatible agent framework to connect directly.

Is reranking required for Knowledge Bases?

No. It is optional but recommended for higher-quality retrieval results.

What are the key capabilities of Managed Weaviate with DigitalOcean?

Our offering includes full Weaviate compatibility (GraphQL, REST, HNSW configuration), while DigitalOcean manages backups, patching, and availability. Available now in Private Preview.

How do I access Knowledge Bases?

Directly in the DigitalOcean console under Data Services, via the API, via the CLI, or through the DigitalOcean AI SDK.

How do I access Weaviate capabilities on DigitalOcean?

Weaviate features are in Private Preview. Sign up for early access; once enabled, these features can be accessed in the Vector Databases section of the DigitalOcean control panel.