Overview

AI apps integrate AI and ML capabilities with data-rich backend systems, embedding them deeply in organizational processes so that intelligent services and data flow bidirectionally between system components. By combining AI/ML models and services with conventional application logic and data, AI apps deliver intelligent, adaptive solutions across the application stack: data-driven dynamic resource allocation, real-time natural language processing for automated customer support, and contextual recommendations for personalized user experiences.

[Figure: AI Apps Architecture]


What is an AI app?

AI apps integrate AI and ML capabilities with conventional software systems. These applications typically involve two main areas of AI functionality. The first is training, which involves developing and fine-tuning AI models or training ML models on application-specific datasets. The second is inference, which encompasses the deployment and execution of AI/ML models to make predictions or decisions based on context-specific input data from the application.

From an architectural perspective, AI apps treat AI/ML components as external services integrated with conventional application logic and data. AI apps require robust APIs and data management for effective interaction between AI and conventional components. They achieve scalability by independently scaling training pipelines, inference services, and application logic, optimizing resource use and performance across diverse conditions.
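In practice, that boundary is often just an API call. The sketch below is illustrative Java using only the standard library; the endpoint URL, payload shape, and service name are assumptions, not a specific product's API.

    // Minimal sketch of treating an inference service as an external API.
    // The URL and JSON shape are illustrative assumptions, not a real service.
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class InferenceClient {

      private static final HttpClient http = HttpClient.newHttpClient();

      // Sends application data to a (hypothetical) model-serving endpoint and
      // returns the raw JSON prediction. Keeping the model behind a plain API
      // boundary lets it scale independently of the application logic.
      public static String predict(String featuresJson) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://inference.example.com/v1/predict")) // assumed URL
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(featuresJson))
            .build();
        HttpResponse<String> response =
            http.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
      }
    }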

Key properties of AI apps

AI applications rely on seamlessly integrating AI/ML services with conventional application components, distributed processing capabilities, and adaptive resource management to operate effectively across varied computational environments. They scale by making inference, conventional application logic, and integration each independently elastic.

Scalable inference
Deploys and executes inference models across diverse environments, from edge to cloud, enabling responsive predictions adaptable to various hardware and network conditions.

Dynamic resource management
Adjusts resource usage based on AI task complexity, available computing power, and data volume, optimizing performance across diverse environments from edge devices to cloud servers.

Data integration and preprocessing
Integrates and preprocesses diverse data types from multiple sources, transforming raw data into formats suitable for AI/ML model training, fine-tuning, and inference tasks.

Adaptive model selection
Selects models based on task complexity, responsiveness needs, and cost, using simpler models for straightforward tasks and complex ones for intricate problems, optimizing performance and resource use (a minimal routing sketch follows this list).

Continuous learning pipeline
Implements continuous learning pipelines for ML training, AI/LLM fine-tuning, and RAG, enabling AI components to evolve and improve performance as new data arrives over time.

Automated AI/MLOps
Integrates automated MLOps practices with version control, testing, and deployment pipelines, ensuring consistent performance and reliability of AI components.
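To make adaptive model selection concrete, here is a minimal, hypothetical routing sketch in Java. The model names, tiers, and thresholds are illustrative assumptions, not recommendations.

    // Hypothetical sketch of adaptive model selection: route each task to the
    // cheapest model that meets its complexity and latency needs.
    public class ModelRouter {

      enum Complexity { SIMPLE, MODERATE, COMPLEX }

      // Pick a model identifier from task complexity and a latency budget.
      static String selectModel(Complexity complexity, long latencyBudgetMillis) {
        if (complexity == Complexity.SIMPLE) {
          return "small-classifier-v2";   // fast and cheap; fine for routine tasks
        }
        if (complexity == Complexity.MODERATE || latencyBudgetMillis < 500) {
          return "distilled-llm-8b";      // mid-tier: balances cost, speed, quality
        }
        return "frontier-llm-large";      // reserved for intricate problems
      }

      public static void main(String[] args) {
        System.out.println(selectModel(Complexity.SIMPLE, 100));   // small-classifier-v2
        System.out.println(selectModel(Complexity.COMPLEX, 2000)); // frontier-llm-large
      }
    }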

Akka components

  1. The client sends a request to the endpoint component (API).
  2. The endpoint component forwards the command to the App Data Entity, which processes it and emits events.
  3. The events flow to a streaming component, which reliably forwards them to the Vector DB Service.
  4. The Vector DB Service stores and retrieves data for AI processing.
  5. The RAG endpoint component retrieves context from the Vector DB for the AI / LLM.
  6. The AI / LLM uses the retrieved context to generate an informed response.
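Steps 2 through 4 could look roughly like the following, assuming the Java Akka SDK: an Event Sourced Entity persists events, and a Consumer streams them onward. The event payload, the VectorDbService wrapper, and its upsertEmbedding method are hypothetical stand-ins for the Vector DB Service above; this is a sketch, not a complete project (each class would normally live in its own file).

    import akka.Done;
    import akka.javasdk.annotations.ComponentId;
    import akka.javasdk.annotations.Consume;
    import akka.javasdk.consumer.Consumer;
    import akka.javasdk.eventsourcedentity.EventSourcedEntity;

    // Step 2: the entity handles commands and persists events.
    @ComponentId("app-data")
    public class AppDataEntity
        extends EventSourcedEntity<AppDataEntity.State, AppDataEntity.Event> {

      public record State(String text) {}

      public sealed interface Event {
        record TextUpdated(String text) implements Event {}
      }

      public Effect<Done> update(String text) {
        return effects()
            .persist(new Event.TextUpdated(text))
            .thenReply(state -> Done.getInstance());
      }

      @Override
      public State applyEvent(Event event) {
        return switch (event) {
          case Event.TextUpdated updated -> new State(updated.text());
        };
      }
    }

    // Steps 3-4: a consumer streams the entity's events to the vector store.
    @ComponentId("vector-db-indexer")
    @Consume.FromEventSourcedEntity(AppDataEntity.class)
    class VectorDbIndexer extends Consumer {

      private final VectorDbService vectorDb; // hypothetical vector DB wrapper

      VectorDbIndexer(VectorDbService vectorDb) {
        this.vectorDb = vectorDb;
      }

      public Effect onTextUpdated(AppDataEntity.Event.TextUpdated event) {
        vectorDb.upsertEmbedding(event.text()); // hypothetical: embed and store
        return effects().done();
      }
    }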

How Akka enables AI apps

Akka provides a robust framework for developing and operating distributed applications that seamlessly integrate conventional processing systems with AI and ML services, enabling scalable and resilient AI-powered applications. 

Realtime and background inference
Akka's distributed messaging enables low-latency data streaming for real-time inference, while its HTTP and gRPC support allows for background processing of computationally intensive tasks.
Elastic resource allocation
Akka provides dynamic resource management for AI applications, adapting in real time to changing workloads, scaling resources automatically, and optimizing performance and efficiency for AI tasks.
Data-driven RAG inference
Akka uses event sourcing and CQRS patterns for RAG processing, separating data write operations from read operations (see the sketch after this list).
Agents and agentic workflows
Akka supports AI agent-based systems, maintaining agent state for stateful workflows in distributed environments.
Model serving and versioning
Akka HTTP and elastic Cluster management provide robust mechanisms for serving multiple versions of AI models, facilitating A/B testing and gradual model updates.
Model deployment and execution
Akka Cluster and Cluster Sharding facilitate the deployment and distributed execution of AI models across a scalable and resilient infrastructure, enabling efficient inference processing at scale.
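As an illustration of that read path, the hypothetical sketch below retrieves top-k context from a vector store and grounds an LLM prompt in it. Both service interfaces are assumptions standing in for whatever vector database and model clients the application actually uses.

    // Hypothetical sketch of the RAG read path: retrieve context from a
    // vector store, then prompt an LLM with it.
    import java.util.List;

    public class RagQuery {

      interface VectorDb {                  // assumed retrieval interface
        List<String> similaritySearch(String query, int topK);
      }

      interface LlmClient {                 // assumed model interface
        String complete(String prompt);
      }

      private final VectorDb vectorDb;
      private final LlmClient llm;

      public RagQuery(VectorDb vectorDb, LlmClient llm) {
        this.vectorDb = vectorDb;
        this.llm = llm;
      }

      // Retrieve the most relevant chunks, then ground the LLM's answer in
      // them -- the read side of the CQRS split described above.
      public String answer(String question) {
        List<String> context = vectorDb.similaritySearch(question, 5);
        String prompt = "Answer using only this context:\n"
            + String.join("\n---\n", context)
            + "\n\nQuestion: " + question;
        return llm.complete(prompt);
      }
    }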

Related case studies


Tubi delivers hyper-personalized experiences to over 30,000 titles
