
Unlock the Power of AI: Run Language Models Locally with LangChain and Hugging Face Models



Let's explore what's possible with Hugging Face's large language models.

This article walks through how to run Hugging Face's large language models locally with LangChain, and how to use the same models through the Hugging Face Hub API. It also digs into encoder-decoder and decoder-only models (text2text-generation and text-generation models, respectively).

The video and related links are included below. Let's explore the remarkable world of large language models together.

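As a preview of what the video covers, here is a minimal sketch of both approaches, assuming a 0.0.x-era langchain release with transformers and torch installed; google/flan-t5-small (the model used in the video) stands in for any checkpoint:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
from langchain.llms import HuggingFacePipeline, HuggingFaceHub

# Approach 1: run the model locally. flan-t5 is an encoder-decoder model, so it
# uses the "text2text-generation" pipeline; decoder-only (GPT-style) models
# would use "text-generation" instead.
model_id = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer, max_length=128)
local_llm = HuggingFacePipeline(pipeline=pipe)
print(local_llm("What is the capital of France?"))

# Approach 2: call the same model through the Hugging Face Hub API
# (requires the HUGGINGFACEHUB_API_TOKEN environment variable to be set).
hub_llm = HuggingFaceHub(repo_id=model_id, model_kwargs={"temperature": 0.5, "max_length": 128})
print(hub_llm("What is the capital of France?"))
```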



Watch the video here: LangChain: Run Language Models Locally – Hugging Face Models


48 Comments

  1. I've found it near impossible to find information on memory requirements for using any model. If I want to load a model out of the box locally (for example the flan-t5 model in your video), how can I determine this from the model's parameter count, assuming no quantization or full fine-tuning, just inference? Also, what actually gets loaded into memory as soon as you load the model?
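
    A rough rule of thumb rather than an exact answer: what gets loaded is essentially the model's weight tensors, so the memory floor for inference is parameter count times bytes per parameter (4 for fp32, 2 for fp16/bf16, 1 for int8), plus working memory for activations and, on decoder models, the KV cache; an fp32 load without low_cpu_mem_usage=True can also transiently need close to double while the checkpoint is copied in. A sketch of the arithmetic (the parameter counts are approximate):

    ```python
    def weights_gib(num_params: float, bytes_per_param: int = 4) -> float:
        """Lower bound: weight tensors only; activations and KV cache come on top."""
        return num_params * bytes_per_param / 1024**3

    print(weights_gib(80e6))    # flan-t5-small, ~80M params, fp32: ~0.3 GiB
    print(weights_gib(3e9, 2))  # a 3B-parameter model in fp16: ~5.6 GiB
    ```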

  2. I have spent the last three days trying to learn all this from the LangChain documentation. You made everything so much simpler and clearer to understand. Thank you so much for your work! Unfortunately, I have failed multiple times to run StableLM 3B locally in Google Colab because it crashes the session (RAM shortage). I've watched your other video about 8-bit quantization and have tried it, yet it still crashes the session. I've found useful articles about instantiating large models on Hugging Face, but I can't quite understand what I'm reading. Any ideas on what I should try?
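
    Beyond 8-bit quantization, a few loading options can help on a free Colab, where the default fp32 load of a 3B model (~12 GB in weights alone, transiently more while the checkpoint is copied) can exceed RAM by itself. A sketch combining half precision, streamed loading, and disk offload; the checkpoint name is only an example:

    ```python
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    model_id = "stabilityai/stablelm-tuned-alpha-3b"  # example checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # 2 bytes per parameter instead of 4
        low_cpu_mem_usage=True,     # avoid a second full copy in RAM while loading
        device_map="auto",          # let accelerate split layers across GPU/CPU
        offload_folder="offload",   # spill layers that don't fit to disk
    )
    ```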

  3. This is great. I liked how you explained each term and each line of code. However, it would be nice if you could point me to some details on how I could run this in VS Code.

    Should I simply copy and paste each line into VS Code? I don't think that would work. We would need to pass the paths of the models, and maybe other things I don't know about.

    Please reply; this is important to me.
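
    For what it's worth, the code from the video runs fine as an ordinary script, and no model paths need to be passed manually: from_pretrained downloads checkpoints into the Hugging Face cache (~/.cache/huggingface) on first use. A minimal sketch, assuming pip install langchain transformers torch in the environment:

    ```python
    # local_llm.py -- run from a VS Code terminal with: python local_llm.py
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
    from langchain.llms import HuggingFacePipeline

    model_id = "google/flan-t5-small"  # downloaded and cached automatically
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer, max_length=128)
    llm = HuggingFacePipeline(pipeline=pipe)

    if __name__ == "__main__":
        print(llm("Translate English to German: Good morning"))
    ```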

  4. I've tried the first approach, and after over four minutes waiting for a response, the API reported "out of time". I tried virtual environments and a Docker Python image, installing the proper ROCm for the AMD card, but no results 🙁 I suppose it is the AMD card and its incompatibilities with PyTorch.

  5. I got the following error while giving it a try on Kaggle:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    Cell In[2], line 8
          6 model_id = 'google/flan-t5-small'
          7 tokenizer = AutoTokenizer.from_pretrained(model_id)
    ----> 8 model = AutoModelForSeq2SeqLM.from_pretrained(model_id, load_in_8bit=True, device_map='auto')
         10 pipeline = pipeline(
         11     "text2text-generation",
         12     model=model,
         13     tokenizer=tokenizer,
         14     max_length=128
         15 )
         17 local_llm = HuggingFacePipeline(pipeline=pipeline)

    File /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:471, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
        469 elif type(config) in cls._model_mapping.keys():
        470     model_class = _get_model_class(config, cls._model_mapping)
    --> 471     return model_class.from_pretrained(
        472         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
        473     )
        474 raise ValueError(
        475     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
        476     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
        477 )

    File /opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py:2846, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
       2844 # Dispatch model with hooks on all devices if necessary
       2845 if device_map is not None:
    -> 2846     dispatch_model(model, device_map=device_map, offload_dir=offload_folder, offload_index=offload_index)
       2848 if output_loading_info:
       2849     if loading_info is None:

    TypeError: dispatch_model() got an unexpected keyword argument 'offload_index'
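
    This TypeError is usually a version mismatch: the installed accelerate predates the offload_index argument that newer transformers releases pass to dispatch_model(). A likely fix (an assumption, not verified against every Kaggle image) is upgrading accelerate, plus bitsandbytes since load_in_8bit=True depends on it, then restarting the kernel:

    ```python
    # In a Kaggle/Colab cell; restart the kernel afterwards.
    !pip install -U accelerate bitsandbytes
    ```

    Separately, the snippet assigns the result of pipeline(...) to a variable named pipeline, shadowing the transformers function; renaming the variable (e.g. pipe) avoids a second error if the cell is re-run.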

  6. Which open-source stacks could be used to build an AI that runs text-to-speech and speech-to-text, call-center style, at an institution? We are considering using 52 x 8GB RX 570 graphics cards, currently sitting idle as an Ethereum rig, for this. Which open-source builds do you think would be appropriate? We are mainly targeting inbound support calls, or survey calls.

  7. I want to query my own library of PDFs without sending anything to OpenAI et al. Will you have a video for that soon? (Please!)
    There are lots of examples of loading your own content that focus on "prompt stuffing", which presumably does not scale well, whereas I have thousands of PDFs to load, so I really need a different solution. Your insights would be greatly appreciated, thank you!
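
    The usual LangChain pattern for this is retrieval rather than prompt stuffing: embed the PDFs once into a local vector store, then pull only the top-matching chunks into the prompt, which scales to thousands of documents. A minimal sketch, assuming a 0.0.x-era langchain plus pypdf, sentence-transformers, and faiss-cpu, with local_llm built as in the video and "docs/paper.pdf" a stand-in path:

    ```python
    from langchain.document_loaders import PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import FAISS
    from langchain.chains import RetrievalQA

    # Everything below runs locally -- nothing is sent to OpenAI et al.
    chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(
        PyPDFLoader("docs/paper.pdf").load()  # loop over your PDF folder in practice
    )
    db = FAISS.from_documents(
        chunks, HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    )

    # Only the top-matching chunks are placed in the local model's prompt.
    qa = RetrievalQA.from_chain_type(llm=local_llm, chain_type="stuff", retriever=db.as_retriever())
    print(qa.run("What does the paper conclude?"))
    ```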

  8. The problem here is that you use the word "locally", which can be read as "offline". If I can run something locally, I would want to be able to run it offline as well. Your solution here requires an online connection to that other service; effectively, you've moved one online dependency to a different one. I'm only looking for offline local chat, like oobabooga.
