Introducing
EdgeRunner is your MOS-specific, air-gapped, on-device AI assistant for improved productivity, decision making, and operational efficiency. Powered by EdgeRunner's own Small Language Models (SLMs), it is the ultimate compression function for knowledge at the edge. EdgeRunner runs completely on-device, ensuring the highest level of privacy and data protection. It provides a ChatGPT-like experience without requiring internet access, running 100% locally and privately, powered by EdgeRunner-Medium or EdgeRunner-Light depending on your hardware requirements.
A containerized, web-hosted version of EdgeRunner is also available for users who wish to deploy our platform on their network. This version is ideal for use cases where network connectivity is always available. In addition, our team can rapidly build MOS-specific adapters for both our disconnected and web-hosted versions. To learn more about this process, please reach out via our contact form.
EXAMPLE USE CASES
Pocket Field Service Representative (FSR)
Pocket Logistics Assistance Representative (LAR)
Pocket Subject Matter Expert (SME)
KEY BENEFITS
Operate Anywhere
EdgeRunner runs directly on laptops, mobile devices, vehicles, and edge hardware - with or without connectivity - enabling AI in disconnected, remote, and contested environments.
Data Security
Sensitive or proprietary data remains local. EdgeRunner eliminates the need to send information to external cloud services, reducing risk of exposure, interception, or compliance breaches.
Mission Adaptability
Models are adapted to specific operational domains, workflows, and data sources, delivering relevant outputs aligned to real-world tasks.
Near Zero Latency
Local execution removes network delays, enabling rapid responses and real-time decision support when timing matters most.
Hardware Agnostic
EdgeRunner integrates with current hardware and secure environments, minimizing deployment friction and reducing reliance on new infrastructure.
Bring Your Own Models
Deploy and run third-party or government-developed AI models within the EdgeRunner platform, enabling a unified, secure runtime environment without dependence on external cloud infrastructure.
KEY FEATURES
Powered by the EdgeRunner Command function calling model - enabling task execution within agentic workflows.
Our function calling model can automate tasks such as opening Slack, managing emails, browsing, invoking other models, handling Excel, swapping LoRAs, and more.
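As an illustrative sketch of how function calling drives such automation (the tool names, JSON shape, and dispatch logic below are hypothetical examples, not EdgeRunner Command's actual interface), the model emits a structured call that local code then executes:

```python
# Minimal function-calling loop: the model emits a JSON tool call instead of
# free text, and a local dispatcher runs the matching function. All tool
# names and signatures here are hypothetical stand-ins.
import json

def open_app(name: str) -> str:
    # Stand-in for launching a local application such as Slack.
    return f"launched {name}"

def summarize_sheet(path: str) -> str:
    # Stand-in for an Excel-handling tool.
    return f"summary of {path}"

TOOLS = {"open_app": open_app, "summarize_sheet": summarize_sheet}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and execute it locally."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

result = dispatch('{"name": "open_app", "arguments": {"name": "Slack"}}')
print(result)  # launched Slack
```

Because the call is structured rather than free-form, the runtime can validate arguments before acting, which matters in air-gapped agentic workflows.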


Air-Gapped Offline Functionality
Operates independently of the internet, safeguarding sensitive data from cyber threats and maintaining operational security.

Hardware & OS Agnostic
Leverages GPU/NPU acceleration when available and can fall back to CPU-only execution - requires only 8GB of VRAM. Works on any OS.

Natural Flow of Communication
Listens to and transcribes meetings and interactions.
Answers questions and offers insights.
Multimodal: understands information contained in images.
Provides document-agnostic summarization and transcription.
Translates prompts and outputs into 25+ languages.
TTS and STT capable for maximum flexibility.
Upload documents and files for on-device Retrieval-Augmented Generation (RAG).

Localized RAG (Retrieval-Augmented Generation)
Provides real-time, context-aware recommendations to enhance decision-making and operational efficiency.
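The retrieval step of a local RAG pipeline can be sketched as follows; this toy bag-of-words ranker stands in for the embedding model and vector index a production on-device system would use, but the flow - rank local documents, then prepend the best match to the prompt - is the same, with no network calls:

```python
# Toy on-device retrieval: rank local documents against a query with
# bag-of-words cosine similarity, then build a grounded prompt.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    qv = vectorize(query)
    ranked = sorted(docs, key=lambda d: cosine(qv, vectorize(d)), reverse=True)
    return ranked[:k]

docs = [
    "Hydraulic pump maintenance requires draining the reservoir first.",
    "Radio encryption keys rotate every 24 hours.",
]
query = "how do I maintain the hydraulic pump"
context = retrieve(query, docs)[0]
prompt = f"Context: {context}\nQuestion: {query}"
print(prompt)
```

Keeping both the index and the generation step local is what lets responses cite user-provided data without any information leaving the device.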

Local Data Integration
Delivers precise outputs with citations derived from local data sets provided by the user.
Domain-Specific Expertise
Customizable to meet the specific needs of various roles.
Customizable Personas
Can be tailored to assist with specific missions and tasks.
Custom Adapters (LoRA)
Utilizes Low-Rank Adaptation (LoRA) of LLMs for task-specific enhancements, tailoring AI capabilities to distinct roles.
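The LoRA idea can be illustrated in a few lines: an adapter stores only two small low-rank matrices, and the effective weight is the frozen base weight plus their (scaled) product, so swapping task adapters means swapping two small matrices rather than reloading the model. The matrices below are toy values, not real model weights:

```python
# LoRA in miniature: effective weight W' = W + scale * (B @ A), where A and B
# are small low-rank matrices. Pure-Python matmul on toy 2x2 values.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def apply_lora(W, A, B, scale=1.0):
    """Combine the frozen base weight W with a rank-r adapter (A, B)."""
    delta = matmul(B, A)  # B: (out, r), A: (r, in) -> delta: (out, in)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2x2)
A = [[0.5, 0.5]]              # rank-1 adapter, A: 1x2
B = [[1.0], [0.0]]            # B: 2x1
print(apply_lora(W, A, B))    # [[1.5, 0.5], [0.0, 1.0]]
```

Because the base weights never change, many role-specific adapters can share one on-device model, which is what makes rapid, task-specific customization feasible on constrained hardware.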
Intelligent routing of requests: EdgeRunner will intelligently switch between models, dynamically routing requests to the best task-specific model for the use case.
Power efficiency: Saves RAM and power, increasing efficiency and performance.
Streamlined accessibility: Makes Generative AI simple and boring, enabling widespread adoption.
Ever-evolving standard: Becomes the enterprise standard for leveraging multiple models at once, continuously adopting the latest models and future-proofing intelligence.
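Request routing can be sketched as follows; the keyword heuristics and model names below are illustrative stand-ins for a learned router, but they show the shape of the decision - inspect the request, then hand it to the best task-specific model:

```python
# Toy request router: pick a task-specific model per request. The keyword
# rules and model names are hypothetical placeholders for a trained router.
def route(request: str) -> str:
    text = request.lower()
    if any(w in text for w in ("translate", "language")):
        return "translation-slm"
    if any(w in text for w in ("summarize", "summary")):
        return "summarization-slm"
    return "general-chat-slm"  # fallback for everything else

print(route("Summarize this maintenance log"))  # summarization-slm
print(route("What is the torque spec?"))        # general-chat-slm
```

Only the selected model needs to be resident for a given request, which is where the RAM and power savings come from.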
MODELS
Swarm intelligence
Multiple tiny, task-specific models working in unison provide better results than large general purpose models.
Higher performance
Large general-purpose models (e.g. Llama 3) are deeply degraded by the 2-bit quantization and distillation needed to fit on-chip.
Task-specific
To solve real challenges, enterprises need task-specific models running locally rather than general purpose models.
Offline computing
Due to data gravity, AI will increasingly move to the edge, which requires smaller models that can run without connection to the internet.
Data transparency
Enterprises and governments will require open models with open datasets due to concerns around IP, explainability, and bias.
INTRODUCING
EdgeRunner-Light
Hyper efficient military LLM for on-device deployment with laptops, tablets, and smartphones

Runs on as little as 8GB of VRAM

INTRODUCING
EdgeRunner-Medium
Efficient military LLM for on-device deployment with workstations and laptops

Runs on as little as 16GB of VRAM

Advanced Function Calling for Air-Gapped Workflow Execution: Excels at interpreting, executing, and chaining function calls.
Dual-Mode Functionality: Serves as both a tool router for request analysis and routing and as a standalone chat agent.
Proven Helpfulness: Achieves strong scores on popular benchmarks, including Arena-Hard and MT-Bench.
