* feat(gguf): improve model compatibility heuristic for Apple Silicon unified memory
* fix: resolve review
---------
Co-authored-by: Louis <louis@jan.ai>
Introduces support for the `fit` parameter and its associated configurations (`fit_target`, `fit_ctx`), allowing arguments to be adjusted automatically to the available device memory. This change spans the extension settings, guest-js types, and the Rust argument builder.
**Key changes:**
* **Settings & Types:** Added `fit`, `fit_target`, and `fit_ctx` to `settings.json` and synchronized these fields across the TypeScript definitions and the Rust `LlamacppConfig` struct.
* **Logic Updates:**
  * Implemented `add_fit_settings` in the `ArgumentBuilder` to handle the `--fit`, `--fit-target`, and `--fit-ctx` flags.
  * Modified `add_gpu_layers` to use `-1` as the default for loading all layers, while treating `100` as a manual override.
  * Updated several argument methods (batch size, context size, etc.) to append flags only when the values differ from the defaults, reducing command-line clutter.
  * Added a check to exclude the `fit` settings when using the `ik` backend fork.
* **Testing:** Significantly expanded the Rust test suite. Replaced basic assertions with dedicated helper functions (`assert_arg_pair`, `assert_has_flag`, `assert_no_flag`) and added comprehensive test cases for various configurations, including GPU layers, embedding mode, and backend-specific behavior.
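The flag handling described above can be sketched roughly as follows; `FitConfig` and the exact flag semantics are assumptions for illustration, not the actual `args.rs` implementation:

```rust
// Hypothetical sketch of the fit-flag handling; the real ArgumentBuilder
// differs in structure and field names.
#[derive(Default)]
struct FitConfig {
    fit: Option<String>,     // e.g. "auto" | "on" | "off" (assumed values)
    fit_target: Option<u32>, // target memory margin (assumed units)
    fit_ctx: Option<u32>,    // minimum context size to preserve
}

fn add_fit_settings(args: &mut Vec<String>, cfg: &FitConfig, is_ik_backend: bool) {
    // The ik backend fork does not understand the fit flags, so skip them.
    if is_ik_backend {
        return;
    }
    if let Some(fit) = &cfg.fit {
        args.push("--fit".into());
        args.push(fit.clone());
    }
    if let Some(target) = cfg.fit_target {
        args.push("--fit-target".into());
        args.push(target.to_string());
    }
    if let Some(ctx) = cfg.fit_ctx {
        args.push("--fit-ctx".into());
        args.push(ctx.to_string());
    }
}
```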
* refactor: migrate llamacpp backend logic to rust plugin
Moves the core logic for managing llama.cpp backends—including version detection, compatibility checking, migration, prioritization, and updates—from the TypeScript extension to the Rust Tauri plugin.
Changes:
- **tauri-plugin-llamacpp**:
- Added `src/backend.rs` containing the logic for backend management.
- Exposed new commands: `map_old_backend_to_new`, `list_supported_backends`, `determine_supported_backends`, `prioritize_backends`, `check_backend_for_updates`, `remove_old_backend_versions`, etc.
- Added unit tests for backend logic in Rust.
- Updated permissions and guest-js bindings to include new commands.
- **llamacpp-extension**:
- Refactored `src/backend.ts` and `src/index.ts` to delegate logic to the Rust plugin.
- Removed obsolete TypeScript implementation of backend logic and corresponding tests.
- Simplified configuration and update workflows by using the centralized Rust API.
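A minimal sketch of the kind of translation `map_old_backend_to_new` performs; the concrete pairs match examples given later in this log, but the real command in `src/backend.rs` covers more platforms and variants:

```rust
// Illustrative sketch only: legacy CPU-feature tags collapse into the
// common_cpus variant, and CUDA tags keep their major version.
fn map_old_backend_to_new(old: &str) -> String {
    match old {
        "win-avx2-x64" | "win-avx512-x64" | "win-noavx-x64" => {
            "win-common_cpus-x64".to_string()
        }
        s if s.contains("cu11") => "win-cuda-11-common_cpus-x64".to_string(),
        s if s.contains("cu12") => "win-cuda-12-common_cpus-x64".to_string(),
        other => other.to_string(), // already in the new naming scheme
    }
}
```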
* tests: fix parse backend version tests
* fix: correct backend directory path
* refactor: move llama.cpp config handling to Rust
- Removed duplicated TypeScript type definitions for LlamacppConfig, ModelPlan, DownloadItem, ModelConfig, etc.
- Added a new `src/guest-js/types.ts` that exports the consolidated types and a helper `normalizeLlamacppConfig` for converting raw config objects.
- Implemented a dedicated Rust module `args.rs` that builds all command‑line arguments for llama.cpp from a `LlamacppConfig` struct, handling embedding, flash‑attention, GPU/CPU flags, and other options.
- Updated `commands.rs` to construct arguments via `ArgumentBuilder`, validate paths, and log the generated args.
- Added more explicit error handling for invalid configuration arguments and updated the error enum to include `InvalidArgument`.
- Exported the new `cleanupLlamaProcesses` command and updated the guest‑JS API accordingly.
- Adjusted the TypeScript `loadLlamaModel` helper to use the new config normalization and argument shape.
- Improved logging and documentation for clarity.
* fix: ignore empty mmproj path arguments
Prevent adding the `--mmproj` flag when the provided path string is empty.
An empty `mmproj_path` previously caused an empty argument to be passed to the model loader, potentially leading to errors or undefined behavior. By filtering out empty strings before pushing the flag, the command line construction is now robust against malformed input.
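The guard can be sketched as follows (illustrative only; the actual builder lives in `args.rs` and the function name here is hypothetical):

```rust
// Only emit --mmproj when a non-empty path is supplied, so no dangling
// flag or empty argument reaches the model loader.
fn add_mmproj(args: &mut Vec<String>, mmproj_path: &str) {
    if !mmproj_path.is_empty() {
        args.push("--mmproj".into());
        args.push(mmproj_path.into());
    }
}
```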
* refactor: use String::new() for empty API key
Use `String::new()` instead of `"".to_string()` when no API key is supplied.
This eliminates an unnecessary heap allocation and clarifies that the intent is to create an empty string without creating a temporary literal.
* fix: set backend path environment variables for llama.cpp
Ensure that the backend executable’s directory is added to the appropriate
environment variable (`PATH`, `LD_LIBRARY_PATH`, or `DYLD_LIBRARY_PATH`)
before invoking `llama_load` and `get_devices`.
This change fixes load failures on Windows, Linux, and macOS where the
dynamic loader cannot locate the required libraries without the proper
search paths, and cleans up unused imports.
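A minimal sketch of the per-platform variable choice, assuming a string OS tag rather than the `cfg`-based dispatch the plugin presumably uses:

```rust
// Pick the dynamic-loader search-path variable for the current platform.
// Illustrative only; real code would branch on #[cfg(target_os = "...")].
fn loader_path_var(os: &str) -> &'static str {
    match os {
        "windows" => "PATH",
        "macos" => "DYLD_LIBRARY_PATH",
        _ => "LD_LIBRARY_PATH", // Linux and other Unix-likes
    }
}
```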
* refactor: centralize library path setup in Rust utilities
Move the library‑path configuration logic out of the TypeScript code into the
Rust `setup_library_path` helper. The TypeScript files no longer set the
`PATH`, `LD_LIBRARY_PATH`, or `DYLD_LIBRARY_PATH` environment variables
directly; instead they defer to the Rust side, which now accepts a
`Path` and performs platform‑specific normalization (including UNC‑prefix
trimming on Windows). This removes duplicated code, keeps environment
configuration consistent across the plugin, and simplifies maintenance.
The import order in `device.rs` was corrected and small formatting fixes
were applied. No functional changes to the public API occur.
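The prepend-and-normalize behavior can be sketched as below; `prepend_search_path` is a hypothetical name, and the real helper works with `std::env` and `Path` rather than plain strings:

```rust
// Prepend a directory to a PATH-like variable, trimming Windows'
// extended-length (UNC) prefix so the loader accepts the path.
fn prepend_search_path(existing: &str, new_dir: &str, sep: char) -> String {
    let dir = new_dir.strip_prefix(r"\\?\").unwrap_or(new_dir);
    if existing.is_empty() {
        dir.to_string()
    } else {
        format!("{dir}{sep}{existing}")
    }
}
```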
* feat: add CUDA path detection and warnings for llama.cpp
Add utilities to detect CUDA installations on Windows and Linux, automatically
inject CUDA paths into the process environment, and warn when the llama.cpp
binary requires CUDA but the runtime is not found. The library‑path setup has
been refactored to prepend new paths and normalize UNC prefixes for Windows.
This ensures the backend can load CUDA libraries correctly and provides
diagnostic information when CUDA is missing.
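The warning condition reduces to a simple predicate; this sketch assumes the backend name carries a `cuda` tag, consistent with the naming scheme used elsewhere in this log:

```rust
// Warn only when a CUDA-tagged backend is selected but no CUDA runtime
// was detected on the system. Sketch; the real check inspects paths too.
fn should_warn_missing_cuda(backend_name: &str, cuda_runtime_found: bool) -> bool {
    backend_name.contains("cuda") && !cuda_runtime_found
}
```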
* refactor: correctly map and store effective backend type
This update unifies backend type handling across the llamacpp extension.
Previously, the stored backend preference, the version string, and the
auto‑update logic used inconsistent identifiers (raw backend names versus
their effective mapped forms). The patch:
* Maps legacy backend names to their new “effective” type before any
comparison or storage.
* Stores the full `version/effectiveType` string instead of just the
type, ensuring the configuration and localStorage stay in sync.
* Updates all logging and warning messages to reference the effective
backend type.
* Simplifies the update check logic by comparing the effective type and
version together, preventing unnecessary migrations.
These changes eliminate bugs that occurred when the backend type
changed after an update and make the internal state more coherent.
* refactor: improve CUDA detection and migrate legacy libs
Enhance `_isCudaInstalled` to accept the backend directory and CUDA version, checking both the new and legacy installation paths. If a library is found in the old location, move it to the new `build/bin` directory and create any missing folders. Update `mapOldBackendToNew` formatting and remove duplicated comments. Minor consistency and readability fixes were also applied throughout the backend module.
* refactor: broaden llama backend archive regex
This update expands the regular expression used to parse llama‑cpp extension archives.
The new pattern now supports:
- Optional prefixes and the `-main` segment
- Version strings that include a hash suffix
- An optional `-cudart-llama` part
- A wide range of backend detail strings
These changes ensure `installBackend` can correctly handle the latest naming conventions (e.g., `k_llama-main-b4314-09c61e1-bin-win-cuda-12.8-x64-avx2.zip`) while preserving backward compatibility with older formats.
Added `mapOldBackendToNew` to translate legacy backend strings (e.g., `win-avx2-x64`, `win-avx512-cuda-cu12.0-x64`) into the new unified names (`win-common_cpus-x64`, `win-cuda-12-common_cpus-x64`). Updated backend selection, installation, and download logic to use the mapper, ensuring consistent naming across the extension and tests.
Updated tests to verify the mapping, the new download items, and correct extraction paths. Minor formatting updates were made to the Tauri command file for clearer logging. This change enables smoother migration of stored user preferences and reduces duplicate asset handling.
Co-authored-by: Akarshan Biswas <akarshan@menlo.ai>
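A std-only sketch of extracting the build tag and optional hash from an archive name like the example above; the extension uses a single regular expression, so this split-based version is only illustrative:

```rust
// Pull the `bNNNN` build segment and an optional trailing hex hash out of
// an archive file name, e.g. "k_llama-main-b4314-09c61e1-bin-win-...zip".
fn parse_archive_version(name: &str) -> Option<(String, Option<String>)> {
    let stem = name.strip_suffix(".zip")?;
    let parts: Vec<&str> = stem.split('-').collect();
    // Locate the build segment: a 'b' followed only by digits.
    let idx = parts.iter().position(|p| {
        p.len() > 1 && p.starts_with('b') && p[1..].chars().all(|c| c.is_ascii_digit())
    })?;
    let version = parts[idx].to_string();
    // A short hex hash may directly follow the build segment.
    let hash = parts
        .get(idx + 1)
        .filter(|h| h.len() >= 7 && h.chars().all(|c| c.is_ascii_hexdigit()))
        .map(|h| h.to_string());
    Some((version, hash))
}
```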
* feat: add configurable timeout for llamacpp connections
This change introduces a user-configurable read/write timeout (in seconds) for llamacpp connections, replacing the hard-coded 600s value. The timeout is now settable via the extension settings and used in both HTTP requests and server readiness checks. This provides flexibility for different deployment scenarios, allowing users to adjust connection duration based on their specific use cases while maintaining the default 10-minute timeout behavior.
* fix: correct timeout conversion factor and clarify settings description
The previous timeout conversion used `timeout * 100` instead of `timeout * 1000`, which incorrectly shortened the timeout to 1/10 of the intended value (e.g., 10 minutes became 1 minute). This change corrects the conversion factor to milliseconds. Additionally, the settings description was updated to explicitly state that this timeout applies to both connection and load operations, improving user understanding of its scope.
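The fix in sketch form, with the buggy factor noted for contrast:

```rust
// Seconds must be scaled by 1000 to yield milliseconds; the old code used
// `timeout * 100`, turning a 600 s (10 min) timeout into 60 s (1 min).
fn timeout_ms(timeout_secs: u64) -> u64 {
    timeout_secs * 1000
}
```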
* style: replace loose equality with strict equality in key comparison
This change updates the comparison operator from loose equality (`==`) to strict equality (`===`) when checking for the 'timeout' key. While the key is always a string in this context (making the behavior identical), using strict equality prevents potential type conversion issues and adheres to JavaScript best practices for reliable comparisons.
* refactor: Simplify Tauri plugin calls and enhance 'Flash Attention' setting
This commit introduces significant improvements to the llama.cpp extension, focusing on the 'Flash Attention' setting and refactoring Tauri plugin interactions for better code clarity and maintenance.
The backend interaction is streamlined by removing the unnecessary `libraryPath` argument from the Tauri plugin commands for loading models and listing devices.
* **Simplified API Calls:** The `loadLlamaModel`, `unloadLlamaModel`, and `get_devices` functions in both the extension and the Tauri plugin now manage the library path internally based on the backend executable's location.
* **Decoupled Logic:** The extension (`src/index.ts`) now uses the new, simplified Tauri plugin functions, which enhances modularity and reduces boilerplate code in the extension.
* **Type Consistency:** Added `UnloadResult` interface to `guest-js/index.ts` for consistency.
* **Updated UI Control:** The 'Flash Attention' setting in `settings.json` is changed from a boolean checkbox to a string-based dropdown, offering **'auto'**, **'on'**, and **'off'** options.
* **Improved Logic:** The extension logic in `src/index.ts` is updated to correctly handle the new string-based `flash_attn` configuration. It now passes the string value (`'auto'`, `'on'`, or `'off'`) directly as a command-line argument to the llama.cpp backend, simplifying the version-checking logic previously required for older llama.cpp versions. The old, complex logic tied to specific backend versions is removed.
This refactoring cleans up the extension's codebase and moves environment and path setup concerns into the Tauri plugin where they are most relevant.
* feat: Simplify backend architecture
This commit introduces a functional flag for embedding models and refactors the backend detection logic for cleaner implementation.
Key changes:
- Embedding Support: The loadLlamaModel API and SessionInfo now include an isEmbedding: boolean flag. This allows the core process to differentiate and correctly initialize models intended for embedding tasks.
- Backend Naming Simplification (Refactor): Consolidated the CPU-specific backend tags (e.g., win-noavx-x64, win-avx2-x64) into generic *-common_cpus-x64 variants (e.g., win-common_cpus-x64). This streamlines supported backend detection.
- File Structure Update: Changed the download path for CUDA runtime libraries (cudart) to place them inside the specific backend's directory (/build/bin/) rather than a shared lib folder, improving asset isolation.
* fix: compare
* fix: mmap settings and adjust flash attention
* fix: correct flash_attn and main_gpu flag checks in llamacpp extension
Previously the condition for `flash_attn` was always truthy, causing
unnecessary or incorrect `--flash-attn` arguments to be added. The
`main_gpu` check also used a loose inequality which could match values
that were not intended. The updated logic uses strict comparison and
correctly handles the empty string case, ensuring the command line
arguments are generated only when appropriate.
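The corrected guard can be sketched as follows; the accepted values mirror the 'auto'/'on'/'off' dropdown introduced earlier in this log, and the function name is hypothetical:

```rust
// Emit --flash-attn only for an explicit, meaningful value. The previous
// condition was effectively always truthy, so the flag was added even for
// an empty setting.
fn add_flash_attn(args: &mut Vec<String>, flash_attn: &str) {
    if matches!(flash_attn, "auto" | "on" | "off") {
        args.push("--flash-attn".into());
        args.push(flash_attn.into());
    }
}
```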
This commit introduces a new field, `is_embedding`, to the `SessionInfo` structure to clearly mark sessions running dedicated embedding models.
Key changes:
- Adds `is_embedding` to the `SessionInfo` interface in `AIEngine.ts` and the Rust backend.
- Updates the `loadLlamaModel` command signatures to pass this new flag.
- Modifies the llama.cpp extension's **auto-unload logic** to explicitly **filter out** and **not unload** any currently loaded embedding models when a new text generation model is loaded. This is a critical performance fix to prevent the embedding model (e.g., used for RAG) from being repeatedly reloaded.
Also includes minor code style cleanup/reformatting in `jan-provider-web/provider.ts` for improved readability.
* feat: Adjust RAM/VRAM calculation for unified memory systems
This commit refactors the logic for calculating **total RAM** and **total VRAM** in `is_model_supported` and `plan_model_load` commands, specifically targeting systems with **unified memory** (like modern macOS devices where the GPU list may be empty).
The changes are as follows:
* **Total RAM Calculation:** If no GPUs are detected (`sys_info.gpus.is_empty()` is true), **total RAM** is now set to `0`. This avoids confusing total system memory with dedicated GPU memory when planning model placement.
* **Total VRAM Calculation:** If no GPUs are detected, **total VRAM** is still calculated as the system's **total memory (RAM)**, as this shared memory acts as VRAM on unified memory architectures.
This adjustment improves the accuracy of memory availability checks and model planning on unified memory systems.
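The rule can be captured in a small worked sketch (function and parameter names are assumptions, not the actual command signatures):

```rust
// Returns (usable_ram, usable_vram). With no discrete GPUs reported,
// all system memory is treated as VRAM (unified memory, e.g. Apple
// Silicon) and the separate RAM budget is set to zero.
fn plan_memory(total_system_mem: u64, gpu_vram: &[u64]) -> (u64, u64) {
    if gpu_vram.is_empty() {
        (0, total_system_mem)
    } else {
        (total_system_mem, gpu_vram.iter().sum())
    }
}
```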
* fix: total usable memory in case there is no system vram reported
* chore: temporarily change to self-hosted runner mac
* ci: revert back to github hosted runner macos
---------
Co-authored-by: Louis <louis@jan.ai>
Co-authored-by: Minh141120 <minh.itptit@gmail.com>
The KV cache size calculation in `estimate_kv_cache_internal` now includes a fallback mechanism for models that do not explicitly define `key_length` and `value_length` in the GGUF metadata.
If these attention keys are missing, the head dimension (and thus the key/value length) is computed as `embedding_length / total_heads`. This improves robustness and compatibility with GGUF models that lack the proper keys in their metadata.
Also adds logging of the full model metadata for easier debugging of the estimation process.
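The fallback amounts to a one-line computation; for example, a model with `embedding_length = 4096` and 32 heads gets a head dimension of 128 when `key_length` is missing:

```rust
// Use the metadata value when present, otherwise derive the head
// dimension from embedding_length / total_heads. Sketch of the fallback
// described above; the real estimator handles more metadata keys.
fn head_dim(key_length: Option<u64>, embedding_length: u64, total_heads: u64) -> u64 {
    key_length.unwrap_or(embedding_length / total_heads)
}
```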