# RolloutEngine

> Model inference engine for agent rollouts

The `RolloutEngine` provides an abstraction for model inference across different backends (OpenAI, Fireworks, vLLM, etc.).

## RolloutEngine

Base class for all rollout engines.

```python
from rllm.engine.rollout import RolloutEngine
```

### Methods

#### get\_model\_response

Generate a model response for the given messages.

```python
output = await engine.get_model_response(
    messages=[{"role": "user", "content": "Hello"}],
    application_id="task_0",
    temperature=0.7,
    max_tokens=2048
)
```

- `messages`: List of chat messages in OpenAI format.
- `application_id`: Unique identifier for tracking requests.
- Additional sampling parameters (`temperature`, `top_p`, `max_tokens`, etc.), passed as keyword arguments.
- Returns the model output containing text, tokens, and metadata.

#### wake\_up

Initialize or warm up the engine (implementation-specific).

```python
await engine.wake_up()
```

#### sleep

Shut down or clean up the engine (implementation-specific).

```python
await engine.sleep()
```

***

## ModelOutput

Dataclass containing model generation output.

```python
from rllm.engine.rollout import ModelOutput
```

### Fields

- Complete generated text (may include reasoning).
- `content`: Content portion of the response (excluding reasoning).
- `reasoning`: Reasoning or thought process (if the model supports it).
- List of tool calls made by the model.
- `prompt_ids`: Token IDs for the input prompt.
- Token IDs for the completion.
- Multimodal inputs (e.g., images).
- Log probabilities for completion tokens.
- Log probabilities for prompt tokens (aligned to `prompt_ids`).
- Length of the prompt in tokens.
- `completion_length`: Length of the completion in tokens.
- `finish_reason`: Reason generation stopped ("stop", "length", etc.).
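As a rough illustration of how these fields might be consumed, the sketch below checks whether a completion was cut off by the token limit and builds a one-line summary. It does not import rllm: `StubOutput` is a hypothetical stand-in that mirrors only the field names used in the examples on this page (`content`, `reasoning`, `completion_length`, `finish_reason`), and the field types and the `was_truncated` and `summarize` helpers are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubOutput:
    """Hypothetical stand-in for ModelOutput (not the rllm class)."""
    content: Optional[str] = None
    reasoning: Optional[str] = None
    completion_length: int = 0
    finish_reason: Optional[str] = None


def was_truncated(output) -> bool:
    """True when generation stopped because it hit the token limit."""
    return output.finish_reason == "length"


def summarize(output) -> str:
    """One-line summary built from the fields shown on this page."""
    status = "truncated" if was_truncated(output) else "complete"
    has_reasoning = "yes" if output.reasoning else "no"
    return f"{status}, {output.completion_length} tokens, reasoning: {has_reasoning}"


# Two illustrative outputs: one that finished normally, one cut off at the limit.
done = StubOutput(content="Paris", reasoning="Recall capitals.",
                  completion_length=12, finish_reason="stop")
cut = StubOutput(content="The answer is", completion_length=512,
                 finish_reason="length")

print(summarize(done))
print(summarize(cut))
```

Because the helpers only read attributes, they should work the same way on any object that exposes these four fields. A truncated result is usually a hint to retry with a larger `max_tokens`.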
### Methods

```python
# Serialize
output_dict = model_output.to_dict()

# Deserialize
model_output = ModelOutput.from_dict(output_dict)
```

***

## OpenAIEngine

Rollout engine using OpenAI-compatible APIs.

```python
from rllm.engine.rollout import OpenAIEngine

engine = OpenAIEngine(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
    model="Qwen/Qwen3-4B"
)
```

### Constructor

- `base_url`: Base URL for the API endpoint.
- `api_key`: API key for authentication.
- `model`: Model identifier.

***

## FireworksEngine

Rollout engine using Fireworks AI API.

```python
from rllm.engine.rollout import FireworksEngine

engine = FireworksEngine(
    api_key="your_fireworks_key",
    model="accounts/fireworks/models/deepseek-r1"
)
```

***

## Example: Basic Usage

```python
import asyncio
from rllm.engine.rollout import OpenAIEngine

engine = OpenAIEngine(
    base_url="http://localhost:4000/v1",
    api_key="EMPTY",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
)

async def generate():
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]

    output = await engine.get_model_response(
        messages,
        temperature=0.7,
        max_tokens=512
    )

    print(f"Content: {output.content}")
    print(f"Reasoning: {output.reasoning}")
    print(f"Tokens: {output.completion_length}")
    print(f"Finish reason: {output.finish_reason}")

asyncio.run(generate())
```

***

## Example: Batch Generation

```python
import asyncio
from rllm.engine.rollout import OpenAIEngine

engine = OpenAIEngine(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
    model="Qwen/Qwen3-4B"
)

async def generate_batch():
    tasks = [
        [{"role": "user", "content": "What is 2+2?"}],
        [{"role": "user", "content": "What is the speed of light?"}],
        [{"role": "user", "content": "Who wrote Romeo and Juliet?"}]
    ]

    # Generate concurrently
    results = await asyncio.gather(*[
        engine.get_model_response(messages, application_id=f"task_{i}")
        for i, messages in enumerate(tasks)
    ])

    for i, output in enumerate(results):
        print(f"\nTask {i}:")
        print(f"Response: {output.content}")

asyncio.run(generate_batch())
```
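The batch example above starts every request at once, which may overwhelm a small inference server when the task list grows. One possible variation, sketched here, caps the number of in-flight requests with `asyncio.Semaphore`. So that it runs offline, the sketch uses `StubEngine`, a hypothetical stand-in that only imitates the `get_model_response(messages, application_id=..., **kwargs)` call shape shown on this page. `generate_bounded` and `max_concurrency` are likewise illustrative names, not part of rllm.

```python
import asyncio
from types import SimpleNamespace


class StubEngine:
    """Hypothetical stand-in for a rollout engine (not the rllm class)."""

    def __init__(self):
        self.in_flight = 0  # requests currently being served
        self.peak = 0       # highest number of simultaneous requests seen

    async def get_model_response(self, messages, application_id=None, **kwargs):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # simulate network and inference latency
        self.in_flight -= 1
        return SimpleNamespace(content=f"echo: {messages[-1]['content']}")


async def generate_bounded(engine, tasks, max_concurrency=2, **sampling):
    """Run all tasks while keeping at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(i, messages):
        async with semaphore:  # wait here until a slot is free
            return await engine.get_model_response(
                messages, application_id=f"task_{i}", **sampling
            )

    # gather preserves input order, so results[i] belongs to tasks[i]
    return await asyncio.gather(*(one(i, m) for i, m in enumerate(tasks)))


tasks = [[{"role": "user", "content": f"Question {i}"}] for i in range(5)]
engine = StubEngine()
results = asyncio.run(generate_bounded(engine, tasks, max_concurrency=2))

for i, output in enumerate(results):
    print(f"Task {i}: {output.content}")
print(f"Peak concurrent requests: {engine.peak}")
```

Swapping `StubEngine` for a real engine instance should leave `generate_bounded` unchanged, provided the engine accepts the same call shape. A suitable `max_concurrency` depends on the backend's capacity and rate limits.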