HF TGI - 搜索

约 436,000 个结果

在新选项卡中打开链接

时间不限

github.com
https://github.com › huggingface › text-generation-inference
huggingface/text-generation-inference - GitHub
TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and more. TGI implements many features, such as: Guidance/JSON. Specify output format to speed up inference and make sure the output is valid according to some specs..
huggingface.co
https://huggingface.co › docs › text-generation-inference
Text Generation Inference - Hugging Face
Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5.
zhihu.com
https://zhuanlan.zhihu.com
vllm vs TGI 部署 llama v2 7B 踩坑笔记 - 知乎 - 知乎专栏
可以通过 text-generation-launcher --help 查看到可配置参数，相对 vllm 来说，TGI 在服务部署上的参数配置更丰富一些，其中比较重要的有： model-id：模型 path 或者 hf.co 的 model_id。 revision：模型版本，比如 hf.co 里仓库的 branch名称。 quantize：TGI 支持使用 GPTQ 来部署模 …
hugging-face.cn
https://hugging-face.cn › docs › text-generation-inference
文本生成推理 - Hugging Face 机器学习平台
文本生成推理 (tgi) 是一种用于部署和提供大型语言模型 (llm) 的工具包。 TGI 为最流行的开源 LLM 提供高性能文本生成，包括 Llama、Falcon、StarCoder、BLOOM、GPT-NeoX 和 T5。
github.com
https://github.com › huggingface › text-generation-inference › releases
Releases · huggingface/text-generation-inference - GitHub
New transformers backend supporting flashattention at roughly same performance as pure TGI for all non officially supported models directly in TGI. Congrats @Cyrilvallez. New models unlocked: Cohere2, olmo, olmo2, helium. Full Changelog: v3.0.1...v3.0.2. Patch release to handle a few older models and corner cases. Full Changelog: v3.0.0...v3.0.1.
zhihu.com
https://zhuanlan.zhihu.com
Text Generation Inference源码解读（一）：架构设计与业务逻辑
Text Generation Inference（TGI）是 HuggingFace 推出的大模型推理部署框架，支持主流大模型和主流大模型量化方案，相对其他大模型推理框架框架TGI的特色是联用 Rust 和 Python 达到服务效率和业务灵活性的平衡。
huggingface.co
https://huggingface.co › ... › basic_tutorials › consuming_tgi
Consuming Text Generation Inference - Hugging Face
There are many ways to consume Text Generation Inference (TGI) server in your applications. After launching the server, you can use the Messages API /v1/chat/completions route and make a POST request to get results from the server. You can also pass "stream": true to the call if you want TGI to return a stream of tokens.
csdn.net
https://blog.csdn.net › article › details
vLLM vs TGI 部署大模型以及注意点 - CSDN博客
2024年4月5日 · VLLM 是一种高效的深度学习推理库，通过PagedAttention算法有效管理大语言模型的注意力内存，其特点包括24倍的吞吐提升和3.5倍的TGI性能，无需修改模型结构，专门设计用于加速大规模语言模型（LLM）的推理过程。
readthedocs.io
https://qwen.readthedocs.io › zh-cn › latest › deployment › tgi.html
TGI - Qwen - Read the Docs
Hugging Face 的 Text Generation Inference (TGI) 是一个专为部署大规模语言模型 (Large Language Models, LLMs) 而设计的生产级框架。 TGI提供了流畅的部署体验，并稳定支持如下特性：推测解码 (Speculative Decoding) ：提升生成速度。张量并行 (Tensor Parallelism) ：高效多卡部署。流式生成 (Token Streaming) ：支持持续性生成文本。灵活的硬件支持：与 AMD ， Gaudi 和 AWS Inferentia 无缝衔接。通过 TGI docker 镜像使用 TGI 轻而易举。本文将主要介 …
csdn.net
https://blog.csdn.net › article › details
Text Generation Inference（TGI） - CSDN博客
2024年4月11日 · Text Generation Inference（TGI）是一个由Hugging Face开发的用于部署和提供大型语言模型（LLMs）的框架。它是一个生产级别的工具包，专门设计用于在本地机器上以服务的形式运行大型语言模型。

分页
- 1
- 2
- 3
- 4
- 下一页

huggingface/text-generation-inference - GitHub

Text Generation Inference - Hugging Face

vllm vs TGI 部署 llama v2 7B 踩坑笔记 - 知乎 - 知乎专栏

文本生成推理 - Hugging Face 机器学习平台

Releases · huggingface/text-generation-inference - GitHub

Text Generation Inference源码解读（一）：架构设计与业务逻辑

Consuming Text Generation Inference - Hugging Face

vLLM vs TGI 部署大模型以及注意点 - CSDN博客

TGI - Qwen - Read the Docs

Text Generation Inference（TGI） - CSDN博客