Chat AI
Introduction
Open WebUI is an AI-powered chatbot interface that provides secure, on-premise access to advanced language models for chatting, coding, document processing, and image generation. The same models are accessible via a documented REST API and can be connected to other applications such as Visual Studio. This guide provides instructions on how to log in, use models, generate images, and create an API key. Our AI services operate entirely within the secure e-INFRA CZ infrastructure—your data never leaves this environment.
Accessing Open-WebUI
Open-WebUI is accessible at https://chat.ai.e-infra.cz. To use the platform, you need a valid Metacentrum account; see How to get Access.
Logging In
- Open your web browser and navigate to https://chat.ai.e-infra.cz.
- Click on the Login button.
- Select the option to log in with e-INFRA CZ.
- Once logged in, you will be redirected to the Open-WebUI dashboard.
Using AI Models
Open-WebUI provides access to various AI models for text generation. To use them:
- After logging in, navigate to the chat interface.
- Select a model from the available options in the dropdown menu.
- Type your query or request in the input field.
- Press Enter or click Submit to receive a response from the selected model.
- Do not hesitate to scroll through the model list! There are more models available. 👇⬇️
Currently Available Models (as of 04/29/2025)
We categorize our models into two groups:
- Guaranteed Models – These are stable and expected to remain available long-term. Any replacements or updates will be announced here and via WebUI banners.
- Experimental Models – These are subject to change as we optimize resources or test new capabilities.
For exact model names, query the model list as shown below. When accessing the API externally, you must use the exact model name, such as llama3.3:latest. Model names are case-sensitive and may include version tags such as :latest or specific quantization formats.
Guaranteed Models
Model | Description |
---|---|
LLaMA 3.3 | A 70B language model from Meta. Its efficiency and versatility make it well-suited for a wide range of natural language processing tasks. |
DeepSeek R1 | A 32B Qwen-distilled model from DeepSeek (China), designed with a focus on reasoning. It excels at complex tasks such as mathematics and code generation. |
Qwen 2.5 Coder | A 32B Q8 variant specialized in code understanding and generation, ideal for developers needing programming assistance. |
Gemma 3 | A 27B FP16 language model from Google, part of the lightweight Gemma family built on Gemini technology. |
Command-A | A 111B model from Cohere. Though slower in response time, it is particularly strong in programming tasks and manifest generation. |
Experimental Models
Model | Description |
---|---|
Aya Expanse | A 32B Q8 multilingual model from Cohere, trained to perform well across 23 languages, including Czech. |
Phi-4 | A 14B Q8 model from Microsoft, trained on a mix of synthetic datasets, filtered public web content, academic books, and Q&A datasets. It represents the current state of the art in open-access models. |
LLaMA 4 | Meta’s Scout 17B-16E variant. While intended for general NLP tasks, it currently underperforms compared to LLaMA 3.3. |
Mistral-Small 3.1 | A 24B language model from Mistral AI, featuring built-in vision capabilities. |
Embedding Models
Open WebUI currently does not support API access to embedding models (even those compatible with the OpenAI API). As a result, no embedding models are available through the Open WebUI API.
However, users may request IP-based access to the vLLM system, which includes support for embedding models.
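For users granted vLLM access, the server speaks the OpenAI-compatible embeddings protocol. The sketch below shows the request shape only; the base URL and the embedding model name are placeholders, as the actual values are assigned by the administrators once IP-based access has been granted.

```python
import json
import urllib.request

# Placeholder: the real vLLM endpoint and embedding model name are
# provided by the administrators after IP-based access is granted.
VLLM_BASE_URL = "http://vllm.example.internal/v1"

def build_embedding_request(model, texts, base_url=VLLM_BASE_URL):
    """Build an OpenAI-style /v1/embeddings request for a vLLM server."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def embed(model, texts, base_url=VLLM_BASE_URL):
    """Send the request and return one embedding vector per input text."""
    req = build_embedding_request(model, texts, base_url)
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return [item["embedding"] for item in data["data"]]
```

The request/response layout follows the OpenAI embeddings API, which vLLM's server implements; only the URL and model name differ per deployment.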
Generating Images
Open-WebUI also allows users to generate images using AI.
- Select a text model for prompt generation (such as LLaMA 3.3).
- Click the Image icon (Generate an Image).
- Enter a text prompt, e.g., "Four horsemen of the apocalypse", and click Send or press Enter.
- The image will be generated and displayed.
Creating an API Key
To use Open-WebUI’s API, you need to generate an API key.
- Go to the Settings section of the Open-WebUI interface.
- Navigate to Account (Účet).
- Click API keys (display).
- Ignore the JWT token; under API key, either generate a new key or display the existing one.
- Copy the generated API key and store it securely.
- Use this key in API requests to authenticate and access Open-WebUI services.
- The API endpoint is: https://chat.ai.e-infra.cz/api/.
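With a key in hand, the API can be called like any OpenAI-compatible service. Below is a minimal Python sketch using only the standard library; it assumes the chat completions endpoint lives at /api/chat/completions, and the model name must be one of the exact identifiers returned by the models endpoint.

```python
import json
import urllib.request

API_URL = "https://chat.ai.e-infra.cz/api/chat/completions"

def build_chat_request(api_key, model, prompt):
    """Build an OpenAI-style chat completion request for the Open WebUI API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def ask(api_key, model, prompt):
    """Send one prompt and return the model's reply as a string."""
    req = build_chat_request(api_key, model, prompt)
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

A call such as ask("YOUR_API_KEY", "llama3.3:latest", "Hello") then returns the model's answer; the response parsing follows the standard OpenAI chat completion schema.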
Models for API Interface
For the API interface, query the exact model names using, e.g., the curl and jq commands, replacing TOKEN with your real token:
curl -H "Authorization: Bearer TOKEN" https://chat.ai.e-infra.cz/api/models | jq .data[].id
You will see output similar to:
"llama3.3:latest"
"llama3.3:70b-instruct-fp16"
"deepseek-r1:32b-qwen-distill-fp16"
"qwen2.5-coder:32b-instruct-q8_0"
"aya-expanse:latest"
Then use, e.g., llama3.3:latest as the model name in the API.
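The same lookup can be done from Python instead of curl/jq. The sketch below parses the data[].id fields shown in the output above; the function names are illustrative, not part of any official client.

```python
import json
import urllib.request

def extract_model_ids(payload):
    """Pull the exact model identifiers out of an /api/models response."""
    return [entry["id"] for entry in payload["data"]]

def list_model_ids(api_key, base_url="https://chat.ai.e-infra.cz/api"):
    """Fetch the model list and return the exact identifiers."""
    req = urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_model_ids(json.load(resp))
```

Any identifier returned here (e.g., llama3.3:latest) can be used verbatim as the model field in API requests.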
Knowledge Function (Alpha)
Open WebUI includes an experimental Knowledge Function, which is essentially a RAG (Retrieval-Augmented Generation) system. This allows users to upload custom texts and query a model that generates answers based on that content.
Currently, the Knowledge Function only supports global configuration, meaning that a single embedding model is used for all stored texts. We are still evaluating which embedding models are best suited for this purpose. A key limitation is that changing the embedding model requires all previously uploaded texts to be reprocessed, as RAG relies on consistent embeddings to function correctly. For this reason, the feature is not recommended for production use and is intended primarily for testing and preview.
Another challenge lies in finding suitable embedding models that support the Czech language and large input contexts. Most available models are limited to a 512-token input, which is suboptimal for longer texts, as it requires splitting the content into small fragments—often too small to provide high-quality answers.
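To illustrate why a 512-token input limit forces fragmentation, here is a minimal word-based splitter. It is only a sketch: real RAG pipelines count model tokens rather than words, and the chunk and overlap sizes below are illustrative, not recommended values.

```python
def chunk_words(text, max_words=300, overlap=50):
    """Split text into overlapping word-based chunks.

    Overlap keeps some shared context between neighbouring chunks, which
    softens (but does not remove) the quality loss from fragmentation.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks
```

A 1,000-word document with these settings yields four chunks, each sharing 50 words with its neighbour; the shorter the allowed chunk, the more fragments a long text produces and the less context each fragment carries.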
While we could integrate external embedding models such as OpenAI's text-embedding-3-small, this approach would compromise data privacy, as it involves sending data to a third-party service.
Data Privacy
In Open WebUI, model labels are now standardized for clarity. All local models are correctly labelled as internal, ensuring consistent and intuitive categorization.
All models accessible to users run on our infrastructure, and inference-related data does not leave our systems. However, we can provide access to truly external models—such as GPT-4o—upon special agreement. In such cases, data is understandably transmitted to a third party.
The Ollama inference system has all request/response logging completely disabled. In contrast, the vLLM system logs requests and responses only while the associated Pod is running.
Open WebUI itself logs at the INFO level by default, meaning request/response data is not logged. Occasionally, when troubleshooting is required, the log level may be temporarily raised to DEBUG. In such cases, request/response data is stored until the Pod is restarted—typically when a new version is deployed.
Only system administrators have access to these logs, and they are not transmitted anywhere else.
Saved conversations are stored in a PostgreSQL database hosted within our infrastructure. When a conversation is deleted, it should also be removed from the database—though we have not independently verified whether WebUI fully implements this behavior. The database is backed up to an S3 storage on CESNET with a 30-day retention period, meaning older backups are deleted after 30 days. Theoretically, a deleted conversation will be completely erased within that window. Again, access to this system is restricted to administrators only.
All administrators have signed an NDA.
A Practical Guide to Using AI Chat
As a scientist, you’re likely no stranger to seeking out information and guidance to help you with your work. When interacting with AI chatbots, effective communication is key to getting the results you need. In this guide, we’ll walk you through the best practices for crafting high-quality prompts and engaging in productive conversations with AI chatbots.
Why Proper Prompting Matters
A poorly written prompt can lead to a subpar response. AI chatbots are only as good as the input they receive. By providing clear and concise prompts, you can unlock more accurate and relevant results.
Communicating Effectively with AI
Interacting with chatbots is similar to communicating with humans. The quality of the response depends on the clarity and specificity of your prompt. Don’t worry about complexity; focus on describing your task and asking questions.
Basic Principles of Communication
- Define Your Goal: Clearly state what you want to achieve. The more specific you are, the better the response will be. Avoid vague or general prompts, as they can lead to disappointing results.
- Provide Context: Offer relevant details that can help the AI understand the task and provide a more accurate response. Context helps the AI grasp the purpose and significance of the task.
- Specify the Output: Indicate how you want the response to be formatted, including length, style, and tone. This ensures you receive a response that meets your needs.
- Choose a Suitable Model: Try several of the available LLMs and compare their results.
Let AI Help You Craft Prompts
- Start with a Basic Idea: Instead of struggling to come up with a prompt, ask the AI chatbot to help you create one. For example: “Create a high-quality prompt for a chatbot that will help me write engaging articles on productivity topics for my colleagues.”
- Specify the Purpose: Add context about how you plan to use the prompt, such as: “I’ll use this prompt to write articles on productivity topics for my colleagues.”
- Refine the Result: The AI will provide a complete prompt that you can use as-is or refine to better suit your needs.
Engaging in Productive Conversations with AI
- Start with a Strong Prompt: Begin with the best prompt you can craft, using the techniques outlined above.
- Evaluate the Response: Review the response and identify areas for improvement. AI chatbots may not always provide exactly what you want on the first try.
- Refine Your Request: Continue the conversation by providing specific feedback, such as: “Make it more concise” or “Write it in a more formal tone.”
- Repeat the Process: Iterate until you achieve the desired result. Ask questions, provide feedback, and refine your prompt to get the best possible response.
Improving Prompts with Roles and Details
- Assign a Role to AI: Provide context by assigning a role to the AI, such as: “Act as an expert copywriter with 15 years of experience.” This helps the AI provide more relevant and sophisticated responses.
- Describe the Problem in Detail: Clearly explain the problem or task you want the AI to help with. Provide as much context as possible to ensure the AI understands your needs.
- Specify the Output Format: Indicate how you want the response to be formatted, such as a list, table, or paragraph.
- Define the Tone and Style: Specify the tone and style you want the AI to use, such as: “Write it in a friendly, approachable tone” or “Use a formal, technical tone.”
Using Examples for Better Results
- Provide a Sample: Offer a sample or example of what you’re looking for. This helps the AI understand your needs and provide a more accurate response.
- Use Positive Instructions: Instead of telling the AI what not to do, focus on what you want it to do. For example: “Use simple language” instead of “Avoid technical jargon.”
- Avoid Conflicting Instructions: Be consistent in your prompts and avoid contradictory instructions.
- Let AI Ask Questions: Encourage the AI to ask questions if it needs clarification or more information. This can help ensure you receive a more accurate and relevant response.