We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
目前我想用databricks 開源的Dolly模型做出個可以針對我給的數據集內的問題給專業知識的回覆的機器人 數據庫裡大概都是這樣的教你步驟去解決問題 我想要用langchain去實現 不過我實際使用後發現回覆的不是我要的答案 這是我的code
from langchain.embeddings import HuggingFaceEmbeddings from PyPDF2 import PdfReader from langchain.text_splitter import CharacterTextSplitter from langchain.vectorstores import ElasticVectorSearch, Pinecone, Weaviate, FAISS from langchain.chains.question_answering import load_qa_chain from langchain.llms import HuggingFacePipeline from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline from langchain.prompts import PromptTemplate import torch hf_embed = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2") from google.colab import drive drive.mount('/content/gdrive', force_remount=True) root_dir = "/content/gdrive/My Drive/" reader = PdfReader('/content/gdrive/My Drive/data/operation Manual.pdf') raw_text = '' for i, page in enumerate(reader.pages): text = page.extract_text() if text: raw_text += text text_splitter = CharacterTextSplitter( separator = "\n", chunk_size = 1000, chunk_overlap = 200, length_function = len, ) texts = text_splitter.split_text(raw_text) docsearch = FAISS.from_texts(texts, hf_embed) model_name = "databricks/dolly-v2-3b" instruct_pipeline = pipeline(model=model_name, torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto", return_full_text=True, max_new_tokens=256, top_p=0.95, top_k=50) hf_pipe = HuggingFacePipeline(pipeline=instruct_pipeline) prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. {context} Question: {question} """ PROMPT = PromptTemplate( template=prompt_template, input_variables=["context", "question"] ) query = "I forgot my login password." docs = docsearch.similarity_search(query) chain = load_qa_chain(llm = hf_pipe, chain_type="stuff", prompt=PROMPT) chain({"input_documents": docs, "question": query}, return_only_outputs=True)
不只回覆的答案不是我數據集中的解答這個問題,產生回覆的速度也需要花費大約2.5個小時 不知道我在哪個步驟使用錯誤了 求大老幫忙解答一下 謝謝!
The text was updated successfully, but these errors were encountered:
No branches or pull requests
目前我想用databricks 開源的Dolly模型做出個可以針對我給的數據集內的問題給專業知識的回覆的機器人
數據庫裡大概都是這樣的教你步驟去解決問題
我想要用langchain去實現 不過我實際使用後發現回覆的不是我要的答案
這是我的code
不只回覆的答案不是我數據集中的解答這個問題,產生回覆的速度也需要花費大約2.5個小時
不知道我在哪個步驟使用錯誤了
求大老幫忙解答一下 謝謝!
The text was updated successfully, but these errors were encountered: