LangChain: Run Language Models Locally – Hugging Face Models
Let's explore the wonderful world of Hugging Face's large language models.
This video looks in detail at how to run Hugging Face's large language models locally with LangChain, and at how to use the same models through the Hugging Face Hub API. It also digs into encoder-decoder and decoder-only models (text2text-generation and text-generation models).
Interesting videos and related links are included as well. Come explore the astonishing world of large language models with us.
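As a quick preview, here is a minimal sketch of the two approaches covered in the video: the hosted Hugging Face Hub API and a fully local transformers pipeline. The model IDs and parameters are illustrative, and the imports assume the older LangChain API in use at the time of recording.

# Minimal sketch of both approaches (illustrative; model IDs and parameters are examples).
from langchain import LLMChain, PromptTemplate
from langchain.llms import HuggingFaceHub, HuggingFacePipeline
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# 1) Hosted: call the model through the Hugging Face Hub Inference API.
#    Requires the HUGGINGFACEHUB_API_TOKEN environment variable.
hub_llm = HuggingFaceHub(repo_id="google/flan-t5-large",
                         model_kwargs={"temperature": 0.1, "max_length": 128})

# 2) Local: download the weights once and run them in-process via a transformers pipeline.
model_id = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer, max_length=128)
local_llm = HuggingFacePipeline(pipeline=pipe)

# Either LLM plugs into the same LangChain chain.
prompt = PromptTemplate(template="Question: {question}\nAnswer:", input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=local_llm)
print(llm_chain.run("What is the capital of France?"))

Swapping local_llm for hub_llm in the chain is the only change needed to move between the two modes.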
Want to connect?
💼Consulting: https://calendly.com/engineerprompt/consulting-call
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Join Patreon: Patreon.com/PromptEngineering
▶ Subscribe: https://www.youtube.com/@engineerprompt?sub_confirmation=1
Brother, can we have a call? I need your help to make a tool.
And how can I use big models from Hugging Face? I can't load them into memory because many of them are bigger than 15 GB, and some of them are 130 GB+. Any thoughts?
Gave you the thousandth like
This is the exact info I was searching for
Thank you man
Hi, please help me: how can I create a custom model from many PDFs in the Persian language? Thank you.
Thanks buddy ❤
Why do all these tutorials use Jupyter notebooks? I get so lost in that stuff… just show me the damn code.
I want to run models locally…
On my PC; that is my definition of locally: no API, with the model binary on my PC.
If an LLM (e.g. THUDM/chatglm2-6b) does not have either of the tags 'text2text-generation' or 'text-generation', how can we use it in LangChain?
I've found it near impossible to find info on memory requirements for using any model. If I want to load a model out of the box locally (for example, the flan-t5 model in your video), how can I determine this from the parameter count, assuming no quantization, no fine-tuning, and inference only? Also, what actually gets loaded into memory as soon as you load the model?
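A rough sizing sketch, offered as a rule of thumb rather than an exact answer: loading the weights takes roughly parameter count times bytes per parameter, with activations and the KV cache adding overhead on top during inference.

# Back-of-the-envelope estimate of the memory needed just for the weights.
# Assumed bytes per parameter: fp32 ~4, fp16/bf16 ~2, int8 ~1, 4-bit ~0.5.
def estimate_weights_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1024**3

print(estimate_weights_gb(80e6, 4))   # flan-t5-small (~80M params) in fp32: ~0.3 GB
print(estimate_weights_gb(3e9, 2))    # a 3B-parameter model in fp16: ~5.6 GB

What gets loaded is essentially the full set of weight tensors plus small buffers; with device_map='auto', some shards may be placed on CPU or disk instead of the GPU.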
Zoom in; I can't see it clearly.
how can I use a summarization model here ?
What's the difference between Text Generation vs Text-2-Text Generation?
Excellent video!
Can you create a video on downloading an LLM from Hugging Face and running the models without an API key, offline?
Thank you so much for the well-structured video and accompanying Google Colab! Other YouTubers often assume the viewer is experienced, but you are patient enough to explain the basic terms and ideas.
How can I get answers from local PDF files using these Hugging Face models?
Well explained bro. Thanks
I have a question: can we use the pipeline and tokenizer with an already-downloaded model instead of downloading it from Hugging Face?
Can we call the model through the API and fine-tune it for our purpose?
Pardon me for being ignorant, but this doesn't look very local to me; it looks like you're running it in some kind of Google application, which is the opposite of local.
ERROR: Could not find a version that satisfies the requirement InstrcutorEmbedding (from versions: none)
This code gives me an unfinished answer every time, as if there is a token limit on the output:
Area 51 is a United States military base in Nevada. Area 51 is known for its secretive
Should I change something in the code?
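One hedged adjustment, assuming the pipeline setup from the video is being used: the output is capped by the pipeline's max_length (128 in the example), so raising it, or switching to max_new_tokens, usually lets the answer finish.

from langchain.llms import HuggingFacePipeline
from transformers import pipeline

# Sketch: allow longer completions so answers are not cut off mid-sentence.
# max_new_tokens counts only the generated tokens; raise it within your memory budget.
pipe = pipeline(
    "text2text-generation",
    model=model,          # the model and tokenizer loaded earlier in the notebook
    tokenizer=tokenizer,
    max_new_tokens=256,
)
local_llm = HuggingFacePipeline(pipeline=pipe)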
Running the model on the Google Colab GPU takes too much time, which leads to a connection timeout. Is it because of the free APIs?
I have spent the last 3 days trying to learn all this through the LangChain documentation. You made everything so much simpler and clearer to understand. Thank you so much for your work! Unfortunately, I have failed multiple times to run StableLM 3B locally in Google Colab because it crashes the session (RAM shortage). I've watched your other video about 8-bit quantization and have tried it, yet it still crashes the session. I've found useful articles about instantiating large models in Hugging Face, but I can't quite understand what I'm reading. Any ideas on what I should try?
I was looking for a video that shows "how to use Hugging Face models locally" for a long time and finally found it. Thanks so much, bro!
This is great. I liked how you explained each term and each line of code. However, it would be nice if you could point me to some details on how I would be able to run this in VS Code.
Should I simply copy-paste each line into VS Code? I don't think that will work; we will need to pass the path to the models, and maybe other stuff that I don't know.
Please reply. This is important for me
I've tried the first approach, and after over 4 minutes of waiting for a response, the API reported "out of time". I tried virtual environments and a Docker Python image, installing the proper ROCm for the AMD card, but no results 🙁 I suppose it is the AMD card and its incompatibilities with PyTorch.
Has anyone tried to run this as a chatbot? It doesn't work for me; it only works with the Hub.
I got the following error while giving it a try on Kaggle:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 8
      6 model_id = 'google/flan-t5-small'
      7 tokenizer = AutoTokenizer.from_pretrained(model_id)
----> 8 model = AutoModelForSeq2SeqLM.from_pretrained(model_id, load_in_8bit=True, device_map='auto')
     10 pipeline = pipeline(
     11     "text2text-generation",
     12     model=model,
     13     tokenizer=tokenizer,
     14     max_length=128
     15 )
     17 local_llm = HuggingFacePipeline(pipeline=pipeline)

File /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:471, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    469 elif type(config) in cls._model_mapping.keys():
    470     model_class = _get_model_class(config, cls._model_mapping)
--> 471     return model_class.from_pretrained(
    472         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    473     )
    474 raise ValueError(
    475     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    476     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    477 )

File /opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py:2846, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2844 # Dispatch model with hooks on all devices if necessary
   2845 if device_map is not None:
-> 2846     dispatch_model(model, device_map=device_map, offload_dir=offload_folder, offload_index=offload_index)
   2848 if output_loading_info:
   2849     if loading_info is None:

TypeError: dispatch_model() got an unexpected keyword argument 'offload_index'
Great video. Sometimes it takes forever to get a response.
If you liked this video, you will love using LangChain to talk to your own data. Watch this: https://youtu.be/TLf90ipMzfE
Which open-source stacks would you use to build an AI that runs text-to-speech and speech-to-text in a call-center setting at an institution? We are considering using 52 x 8 GB RX 570 graphics cards, currently sitting idle as an Ethereum rig, for this. Which open-source builds do you think would be appropriate? We are mainly targeting inbound support calls, or survey calls.
When I run this step: print(llm_chain.run(question)), Google Colab can't execute the line. How long is it taking for you?
I want to query my own library of PDFs, without sending anything to OpenAI et al. Will you have a video for that soon? (please!)
There are lots of examples of loading your own content that focus on 'prompt stuffing', which presumably does not scale well, whereas I have thousands of PDFs to 'load', so I really need a different solution. Your insights would be greatly appreciated, thank you!
Hey, thanks for the detailed information. Can you also make a video on how to use this approach with custom data?
This isn't local but thanks for the info
What would be the best "base" LLM for the Portuguese language for using LangChain and creating a question-answering bot plugged into local documents? Thanks!
FYI your video thumbnail has a typo, spelling locally as locallay.
What is your Linkedin?
My OpenAI API key has expired. Does that mean I can't use LangChain to build apps?
What tokenizer and AutoModel should I use for Vicuna? Also, how do I point it to a model already downloaded in a directory?
Great and informative video again! One bit to add: if you are developing a chatbot and doing vector search, encoder-decoder LLMs perform better, while for generating human-like responses, decoder-only LLMs are more suitable.
Thank you for providing and sharing a simple workflow, using self and cloud-hosted options. This is pure gold.
This isn't working for me. I even created the API key. It just runs the first question; after that it stops working.
I found the issue: just google "Cannot run large models using API token".
The problem here is that you use the word "locally", which can be connected to the word "offline". If I can run something locally, I would want to be able to run it offline as well. Your solution here requires an online connection to that other service; effectively you've moved one online moment to a different one. I'm only looking for offline local chat, like Oobabooga.
How do you create and train your own model based on the rules of a business, and then use it as explained in the video? Excellent content, thank you!
For the local version of these models, it seems you're still using the Hugging Face ID. Could you please explain how to download them, and what exactly we need to download, in order to run these locally without invoking external APIs?
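One possible pattern, sketched as an assumption rather than the video's exact method: download the repository once with huggingface_hub, then point transformers at the local folder, after which no external API is contacted for inference.

from huggingface_hub import snapshot_download
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Download the model files once (a connection is needed only for this step).
local_dir = snapshot_download(repo_id="google/flan-t5-small")

# From here on, everything loads from disk; no API token or network call is required.
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(local_dir)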