Introducing
EdgeRunner is your MOS-specific, air-gapped, on-device AI assistant for improved productivity, decision making, and operational efficiency. Powered by EdgeRunner's own Small Language Models (SLMs), it is the ultimate compression function for knowledge at the edge. EdgeRunner runs completely on-device, ensuring the highest level of privacy and data protection. It provides a ChatGPT-like experience without requiring internet access, running 100% locally and privately, powered by EdgeRunner-Medium or EdgeRunner-Light depending on your hardware requirements.
A containerized, web-hosted version of EdgeRunner is also available for users who wish to deploy our platform on their network. This version is ideal for use cases where network connectivity is always available. In addition, our team can rapidly build MOS-specific adapters for both our disconnected and web-hosted versions. To learn more about this process, please reach out via our contact form.
EXAMPLE USE CASES
Pocket Field Service Representative (FSR)
Pocket Logistics Assistance Representative (LAR)
Pocket Subject Matter Expert (SME)
KEY BENEFITS
Operate Anywhere
EdgeRunner runs directly on laptops, mobile devices, vehicles, and edge hardware - with or without connectivity - enabling AI in disconnected, remote, and contested environments.
Data Security
Sensitive or proprietary data remains local. EdgeRunner eliminates the need to send information to external cloud services, reducing risk of exposure, interception, or compliance breaches.
Mission Adaptability
Models are adapted to specific operational domains, workflows, and data sources, delivering relevant outputs aligned to real-world tasks.
Near Zero Latency
Local execution removes network delays, enabling rapid responses and real-time decision support when timing matters most.
Hardware Agnostic
EdgeRunner integrates with current hardware and secure environments, minimizing deployment friction and reducing reliance on new infrastructure.
Bring Your Own Models
Deploy and run third-party or government-developed AI models within the EdgeRunner platform, enabling a unified, secure runtime environment without dependence on external cloud infrastructure.
KEY FEATURES
Powered by the EdgeRunner Command function calling model - enabling task execution within agentic workflows.
Our function calling model can automate tasks such as opening Slack, managing emails, browsing, invoking other models, handling Excel, swapping LoRAs, and more.
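As an illustrative sketch of how function calling drives such automation (the tool names, JSON shape, and dispatch logic below are hypothetical examples, not EdgeRunner Command's actual interface), the model emits a structured call that local code then executes:

```python
# Minimal function-calling loop: the model emits a JSON tool call instead of
# free text, and a local dispatcher runs the matching function. All tool
# names and signatures here are hypothetical stand-ins.
import json

def open_app(name: str) -> str:
    # Stand-in for launching a local application such as Slack.
    return f"launched {name}"

def summarize_sheet(path: str) -> str:
    # Stand-in for an Excel-handling tool.
    return f"summary of {path}"

TOOLS = {"open_app": open_app, "summarize_sheet": summarize_sheet}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and execute it locally."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

result = dispatch('{"name": "open_app", "arguments": {"name": "Slack"}}')
print(result)  # launched Slack
```

Because the call is structured rather than free-form, the runtime can validate arguments before acting, which matters in air-gapped agentic workflows.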


Air-Gapped Offline Functionality
Operates independently of the internet, safeguarding sensitive data from cyber threats and maintaining operational security.

Hardware & OS Agnostic
Leverages GPU/NPU acceleration when available and can fall back to CPU-only execution - requires only 8GB of VRAM. Works on any OS.

Natural Flow of Communication
Listens to and transcribes meetings and interactions.
Answers questions and offers insights.
Multimodal: understands information contained in images.
Provides document-agnostic summarization and transcription.
Translates prompts and outputs into 25+ languages.
TTS and STT capable for maximum flexibility.
Upload documents and files for on-device Retrieval-Augmented Generation (RAG).

Localized RAG (Retrieval-Augmented Generation)
Provides real-time, context-aware recommendations to enhance decision-making and operational efficiency.
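The retrieval step of a local RAG pipeline can be sketched as follows; this toy bag-of-words ranker stands in for the embedding model and vector index a production on-device system would use, but the flow - rank local documents, then prepend the best match to the prompt - is the same, with no network calls:

```python
# Toy on-device retrieval: rank local documents against a query with
# bag-of-words cosine similarity, then build a grounded prompt.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    qv = vectorize(query)
    ranked = sorted(docs, key=lambda d: cosine(qv, vectorize(d)), reverse=True)
    return ranked[:k]

docs = [
    "Hydraulic pump maintenance requires draining the reservoir first.",
    "Radio encryption keys rotate every 24 hours.",
]
query = "how do I maintain the hydraulic pump"
context = retrieve(query, docs)[0]
prompt = f"Context: {context}\nQuestion: {query}"
print(prompt)
```

Keeping both the index and the generation step local is what lets responses cite user-provided data without any information leaving the device.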

Local Data Integration
Delivers precise outputs with citations derived from local data sets provided by the user.
Domain-Specific Expertise
Customizable to meet the specific needs of various roles.
Customizable Personas
Can be tailored to assist with specific missions and tasks.
Custom Adapters (LoRA)
Utilizes Low-Rank Adaptation (LoRA) of LLMs for task-specific enhancements, tailoring AI capabilities to distinct roles.
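The LoRA idea can be illustrated in a few lines: an adapter stores only two small low-rank matrices, and the effective weight is the frozen base weight plus their (scaled) product, so swapping task adapters means swapping two small matrices rather than reloading the model. The matrices below are toy values, not real model weights:

```python
# LoRA in miniature: effective weight W' = W + scale * (B @ A), where A and B
# are small low-rank matrices. Pure-Python matmul on toy 2x2 values.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def apply_lora(W, A, B, scale=1.0):
    """Combine the frozen base weight W with a rank-r adapter (A, B)."""
    delta = matmul(B, A)  # B: (out, r), A: (r, in) -> delta: (out, in)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2x2)
A = [[0.5, 0.5]]              # rank-1 adapter, A: 1x2
B = [[1.0], [0.0]]            # B: 2x1
print(apply_lora(W, A, B))    # [[1.5, 0.5], [0.0, 1.0]]
```

Because the base weights never change, many role-specific adapters can share one on-device model, which is what makes rapid, task-specific customization feasible on constrained hardware.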
Intelligent routing of requests: EdgeRunner will intelligently switch between models, dynamically routing requests to the best task-specific model for the use case.
Power efficiency: Saves RAM and power, increasing efficiency and performance.
Streamlined accessibility: Makes Generative AI simple and boring, enabling widespread adoption.
Ever-evolving standard: Becomes the enterprise standard for leveraging multiple models at once, continuously adopting the latest models and future-proofing intelligence.
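Request routing can be sketched as follows; the keyword heuristics and model names below are illustrative stand-ins for a learned router, but they show the shape of the decision - inspect the request, then hand it to the best task-specific model:

```python
# Toy request router: pick a task-specific model per request. The keyword
# rules and model names are hypothetical placeholders for a trained router.
def route(request: str) -> str:
    text = request.lower()
    if any(w in text for w in ("translate", "language")):
        return "translation-slm"
    if any(w in text for w in ("summarize", "summary")):
        return "summarization-slm"
    return "general-chat-slm"  # fallback for everything else

print(route("Summarize this maintenance log"))  # summarization-slm
print(route("What is the torque spec?"))        # general-chat-slm
```

Only the selected model needs to be resident for a given request, which is where the RAM and power savings come from.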
MODELS
Swarm intelligence
Multiple tiny, task-specific models working in unison provide better results than large general purpose models.
Higher performance
Large general-purpose models (e.g. Llama 3) are deeply degraded by the 2-bit quantization and distillation needed to fit on-chip.
Task-specific
To solve real challenges, enterprises need task-specific models running locally rather than general purpose models.
Offline computing
Due to data gravity, AI will increasingly move to the edge, which requires smaller models that can run without connection to the internet.
Data transparency
Enterprises and governments will require open models with open datasets due to concerns around IP, explainability, and bias.
INTRODUCING
EdgeRunner-Light
Hyper efficient military LLM for on-device deployment with laptops, tablets, and smartphones

Runs on as little as 8GB of VRAM

INTRODUCING
EdgeRunner-Medium
Efficient military LLM for on-device deployment with workstations and laptops

Runs on as little as 16GB of VRAM

Advanced Function Calling for Air-Gapped Workflow Execution: Excels at interpreting, executing, and chaining function calls.
Dual-Mode Functionality: Serves as both a tool router for request analysis and routing and as a standalone chat agent.
Proven Helpfulness: Achieves strong scores on popular benchmarks, including Arena-Hard and MT-Bench.
