
azure-openai-in-a-day-workshop's Introduction

azure-openai-in-a-day-workshop

In this technical workshop, you will get a comprehensive introduction to Azure OpenAI Service and Azure OpenAI Studio. You will learn how to create and refine prompts for various scenarios using hands-on exercises. You will also discover how to leverage Azure OpenAI Service to access and analyze your company data. Moreover, you will explore existing solution accelerators and best practices for prototyping and deploying use cases end-to-end. The workshop will end with a Q&A session and a wrap-up.

Workshop agenda

πŸŒ… Morning (9:00 – 12:00)

Focus: Introduction and first steps

  • πŸ“£ Intro (90 min)
    • Intro to the workshop (15 min)
    • Intro to Azure OpenAI Service (30 min)
    • Intro to Azure OpenAI Studio (45 min)
  • πŸ§‘πŸΌβ€πŸ’» Prompt engineering exercises using the Studio (90 min)

πŸŒ† Afternoon (1:00 – 4:30)

Focus: Solutions

  • Recap (15 min)
  • πŸ“£ Using Azure OpenAI Service to access company data (60 min)
    • How to bring your own data
    • Fine-tuning and embeddings
    • Solution accelerators
  • Q&A session (30 min)
  • πŸ§‘πŸΌβ€πŸ’» Hands-on lab on two exemplary use cases (90 min)
πŸ“£ Presentation, πŸ§‘πŸΌβ€πŸ’» Hands-on lab

Preparation

This is only required for the hands-on lab. If you are only attending the presentation, you can skip this section.

Azure OpenAI Service subscription and deployments

Grant the participant access to the Azure OpenAI Service subscription and create the required deployments.

Ideally, grant the participants access to the Azure OpenAI Service resource by assigning the Cognitive Services OpenAI User role. If a participant has the Cognitive Services OpenAI Contributor role, they can create the following deployments themselves.

Otherwise, create the 'text-davinci-003' and 'text-embedding-ada-002' deployments yourself (and grant the participants access to the deployments).
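As a rough sketch, the role assignment and the two deployments can also be created with the Azure CLI. All resource names below are placeholders, and the exact flag names may differ between CLI versions:

# Grant a participant the Cognitive Services OpenAI User role on the resource (placeholders)
az role assignment create --assignee "<participant-upn>" \
  --role "Cognitive Services OpenAI User" \
  --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<aoai-resource>"

# Create the two deployments (model versions and sku flags may need adjusting for your CLI version)
az cognitiveservices account deployment create -g <rg> -n <aoai-resource> \
  --deployment-name text-davinci-003 --model-name text-davinci-003 --model-version "1" \
  --model-format OpenAI --sku-name Standard --sku-capacity 1
az cognitiveservices account deployment create -g <rg> -n <aoai-resource> \
  --deployment-name text-embedding-ada-002 --model-name text-embedding-ada-002 --model-version "2" \
  --model-format OpenAI --sku-name Standard --sku-capacity 1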

There are two ways to authenticate (see Jupyter notebooks):

  1. (Recommended) Use the Azure CLI to authenticate to Azure and Azure OpenAI Service
  2. Use an access key/token (not needed if you are using the Azure CLI)

Get the Azure OpenAI Service endpoint (and key) from the Azure portal.
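A minimal sketch of both options, using the pre-1.0 openai Python SDK as in the workshop notebooks. The environment variable names OPENAI_API_BASE and OPENAI_API_KEY are the ones the notebooks read from .env; adjust them if yours differ:

import os
import openai
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential

load_dotenv()
openai.api_base = os.environ["OPENAI_API_BASE"]  # e.g. https://<your-resource>.openai.azure.com/
openai.api_version = "2022-12-01"

# Option 1 (recommended): Azure AD authentication via the Azure CLI (run `az login` first)
default_credential = DefaultAzureCredential()
token = default_credential.get_token("https://cognitiveservices.azure.com/.default")
openai.api_type = "azure_ad"
openai.api_key = token.token

# Option 2: use the access key from the Azure portal instead
# openai.api_type = "azure"
# openai.api_key = os.environ["OPENAI_API_KEY"]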

Workspace environment

Choose one of the following options to set up your environment: Codespaces, Devcontainer or bring your own environment (Anaconda). Building the environment can take a few minutes, so please start early.

1️⃣ Codespaces

🌟 Highly recommended: the best option if you already have a GitHub account. You can develop in local VS Code or in a browser window.

  • Go to the GitHub repository and click the Code button to create a Codespace
  • Create and edit the .env file in the base folder with your Azure OpenAI Service endpoint and key before starting any notebooks (see the example below)
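A minimal .env example; the variable names below are the ones the workshop notebooks read, and the values are placeholders for your own resource's endpoint and key:

OPENAI_API_BASE=https://<your-resource-name>.openai.azure.com/
OPENAI_API_KEY=<your-azure-openai-key>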

2️⃣ Devcontainer

Usually a good option if VSCode and Docker Desktop are already installed.

  • Install Docker
  • Install Visual Studio Code
  • Install Remote - Containers extension
  • Clone this repository
  • Open the repository in Visual Studio Code
  • Click on the green button in the bottom left corner of the window
  • Select Reopen in Container
  • Wait for the container to be built and started
  • Create and edit the .env file in the base folder with your Azure OpenAI Service endpoint and key before starting any notebooks (see the example in the Codespaces section above)

3️⃣ Bring your own environment

Use this option if you already have a Python environment with Jupyter Notebook and the Azure CLI installed.

Make sure you have the requirements installed in your Python environment using pip install -r requirements.txt.


Content of the repository

Exercises

Solutions

Do not cheat! πŸ˜…

Q&A Quick Start

If you want to quickly create a Q&A webapp using your own data, please follow the quickstart guide notebook.

If you want to use LangChain to build an interactive chat experience on your own data, follow the quickstart chat on private data using LangChain.

If you want to use LlamaIndex πŸ¦™ (GPT Index), follow the quickstart guide notebook with llama-index.
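For orientation, here is a minimal, self-contained sketch of the embed-and-complete pattern the quickstart notebooks build on, assuming the 'text-davinci-003' and 'text-embedding-ada-002' deployments from the preparation section, key-based authentication via .env, and the pre-1.0 openai SDK. The sample documents and the single-document retrieval are illustrative only:

import os
import numpy as np
import openai
from dotenv import load_dotenv
from openai.embeddings_utils import cosine_similarity

load_dotenv()
openai.api_type = "azure"
openai.api_base = os.environ["OPENAI_API_BASE"]
openai.api_key = os.environ["OPENAI_API_KEY"]
openai.api_version = "2022-12-01"

# Toy corpus; in the notebooks this comes from your own data
documents = [
    "Azure OpenAI Service gives REST API access to OpenAI models inside your Azure subscription.",
    "Deployments such as text-davinci-003 are created per model in the Azure portal or via the CLI.",
]

def get_embedding(text: str) -> list:
    # Embed a single text with the text-embedding-ada-002 deployment
    return openai.Embedding.create(input=text.replace("\n", " "),
                                   engine="text-embedding-ada-002")["data"][0]["embedding"]

doc_embeddings = [get_embedding(d) for d in documents]

question = "Where do I create a deployment?"
question_embedding = get_embedding(question)

# Retrieve the most similar document and use it as context for the completion
scores = [cosine_similarity(question_embedding, e) for e in doc_embeddings]
context = documents[int(np.argmax(scores))]

prompt = (f"Answer the question using only the context below.\n"
          f"Context: {context}\nQuestion: {question}\nAnswer:")
response = openai.Completion.create(engine="text-davinci-003", prompt=prompt,
                                    temperature=0, max_tokens=200)
print(response["choices"][0]["text"].strip())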

azure-openai-in-a-day-workshop's People

Contributors

anderl80, csiebler, microsoft-github-operations[bot], microsoftopensource, philipp-hinderberger


azure-openai-in-a-day-workshop's Issues

API Connection Error

I followed the instructions as follows to run quickstart.ipynb:

In the terminal, I logged in to my Azure account by running az login --use-device-code.

After authenticating, I go back to VS Code and run the following code block within quickstart.ipynb, which runs fine:
import os
import tiktoken
import openai
import numpy as np
import pandas as pd
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from openai.embeddings_utils import cosine_similarity
from tenacity import retry, wait_random_exponential, stop_after_attempt

# Load environment variables
load_dotenv()

# Option 1 - Use Azure AD authentication with az cli (use az login in terminal)
default_credential = DefaultAzureCredential()
token = default_credential.get_token("https://cognitiveservices.azure.com/.default")
openai.api_type = "azure_ad"
openai.api_base = os.environ.get("OPENAI_API_BASE")
openai.api_key = token.token
openai.api_version = "2023-06-13"

# Option 2 - Using Access Key
# openai.api_type = "azure"
# openai.api_base = os.environ.get("OPENAI_API_BASE")
# openai.api_key = os.environ.get("OPENAI_API_KEY")
# openai.api_version = "2022-12-01"

# Define embedding model and encoding
EMBEDDING_MODEL = 'text-embedding-ada-002'
COMPLETION_MODEL = 'gpt-4'
encoding = tiktoken.get_encoding('cl100k_base')

Now when I run the following code block I get the error below:
response = openai.Completion.create(engine="gpt-4",
                                    prompt="Knock knock.",
                                    temperature=0)
print(response.choices[0].text)

Error:

MissingSchema Traceback (most recent call last)
File ~/miniconda3/lib/python3.10/site-packages/openai/api_requestor.py:596, in APIRequestor.request_raw(self, method, url, params, supplied_headers, files, stream, request_id, request_timeout)
595 try:
--> 596 result = _thread_context.session.request(
597 method,
598 abs_url,
599 headers=headers,
600 data=data,
601 files=files,
602 stream=stream,
603 timeout=request_timeout if request_timeout else TIMEOUT_SECS,
604 proxies=_thread_context.session.proxies,
605 )
606 except requests.exceptions.Timeout as e:

File ~/miniconda3/lib/python3.10/site-packages/requests/sessions.py:573, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
561 req = Request(
562 method=method.upper(),
563 url=url,
(...)
571 hooks=hooks,
572 )
--> 573 prep = self.prepare_request(req)
575 proxies = proxies or {}

File ~/miniconda3/lib/python3.10/site-packages/requests/sessions.py:484, in Session.prepare_request(self, request)
483 p = PreparedRequest()
--> 484 p.prepare(
485 method=request.method.upper(),
486 url=request.url,
487 files=request.files,
488 data=request.data,
489 json=request.json,
490 headers=merge_setting(
491 request.headers, self.headers, dict_class=CaseInsensitiveDict
492 ),
493 params=merge_setting(request.params, self.params),
494 auth=merge_setting(auth, self.auth),
495 cookies=merged_cookies,
496 hooks=merge_hooks(request.hooks, self.hooks),
497 )
498 return p

File ~/miniconda3/lib/python3.10/site-packages/requests/models.py:368, in PreparedRequest.prepare(self, method, url, headers, files, data, params, auth, cookies, hooks, json)
367 self.prepare_method(method)
--> 368 self.prepare_url(url, params)
369 self.prepare_headers(headers)

File ~/miniconda3/lib/python3.10/site-packages/requests/models.py:439, in PreparedRequest.prepare_url(self, url, params)
438 if not scheme:
--> 439 raise MissingSchema(
440 f"Invalid URL {url!r}: No scheme supplied. "
441 f"Perhaps you meant https://{url}?"
442 )
444 if not host:

MissingSchema: Invalid URL 'None/openai/deployments/gpt-4/completions?api-version=2023-06-13': No scheme supplied. Perhaps you meant https://none/openai/deployments/gpt-4/completions?api-version=2023-06-13?

The above exception was the direct cause of the following exception:

APIConnectionError Traceback (most recent call last)
Cell In[22], line 1
----> 1 response = openai.Completion.create(engine="gpt-4",
2 prompt="Knock knock.",
3 temperature=0)
4 print(response.choices[0].text)

File ~/miniconda3/lib/python3.10/site-packages/openai/api_resources/completion.py:25, in Completion.create(cls, *args, **kwargs)
23 while True:
24 try:
---> 25 return super().create(*args, **kwargs)
26 except TryAgain as e:
27 if timeout is not None and time.time() > start + timeout:

File ~/miniconda3/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py:153, in EngineAPIResource.create(cls, api_key, api_base, api_type, request_id, api_version, organization, **params)
127 @classmethod
128 def create(
129 cls,
(...)
136 **params,
137 ):
138 (
139 deployment_id,
140 engine,
(...)
150 api_key, api_base, api_type, api_version, organization, **params
151 )
--> 153 response, _, api_key = requestor.request(
154 "post",
155 url,
156 params=params,
157 headers=headers,
158 stream=stream,
159 request_id=request_id,
...
617 request_id=result.headers.get("X-Request-Id"),
618 )
619 # Don't read the whole stream for debug logging unless necessary.

APIConnectionError: Error communicating with OpenAI: Invalid URL 'None/openai/deployments/gpt-4/completions?api-version=2023-06-13': No scheme supplied. Perhaps you meant https://none/openai/deployments/gpt-4/completions?api-version=2023-06-13?
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...

The API deployment for this resource does not exist.

When I run a prompt, I get: "The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again." But my deployment already exists. Has anyone encountered this?

getting error

qna-quickstart-with-gpt-index/qna-quickstart-with-llama-index.ipynb

I got the following error:

TypeError Traceback (most recent call last)
Input In [3], in <cell line: 31>()
28 prompt_helper = PromptHelper(max_input_size=max_input_size, num_output=num_output, max_chunk_overlap=max_chunk_overlap, chunk_size_limit=chunk_size_limit)
30 # Create index
---> 31 index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor, embed_model=embedding_llm, prompt_helper=prompt_helper)
32 index.save_to_disk("index.json")

File ~/.local/lib/python3.9/site-packages/llama_index/indices/vector_store/vector_indices.py:73, in GPTSimpleVectorIndex.init(self, nodes, index_struct, service_context, vector_store, **kwargs)
64 def init(
65 self,
66 nodes: Optional[Sequence[Node]] = None,
(...)
70 **kwargs: Any,
71 ) -> None:
72 """Init params."""
---> 73 super().init(
74 nodes=nodes,
75 index_struct=index_struct,
76 service_context=service_context,
77 vector_store=vector_store,
78 **kwargs,
79 )

File ~/.local/lib/python3.9/site-packages/llama_index/indices/vector_store/base.py:54, in GPTVectorStoreIndex.init(self, nodes, index_struct, service_context, vector_store, use_async, **kwargs)
51 self._vector_store = vector_store or SimpleVectorStore()
53 self._use_async = use_async
---> 54 super().init(
55 nodes=nodes,
56 index_struct=index_struct,
57 service_context=service_context,
58 **kwargs,
59 )

TypeError: init() got an unexpected keyword argument 'llm_predictor'

movie_classification_unsupervised_incl_recommendations_solution: Error 'type' object is not subscriptable

In movie_classification_unsupervised_incl_recommendations_solution.ipynb at cell No. 4:
@Retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(10))
def get_embedding(text) -> list[float]:
    text = text.replace("\n", " ")
    return openai.Embedding.create(input=text, engine=EMBEDDING_MODEL)["data"][0]["embedding"]

I get the following error:


TypeError Traceback (most recent call last)
Cell In[4], line 2
1 @Retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(10))
----> 2 def get_embedding(text) -> list[float]:
3 text = text.replace("\n", " ")
4 return openai.Embedding.create(input=text, engine=EMBEDDING_MODEL)["data"][0]["embedding"]

TypeError: 'type' object is not subscriptable

How to leverage Azure OpenAI embeddings instead of OpenAI embeddings in the LangChain framework

There is a difference in the way we generate embeddings in Azure OpenAI vs. OpenAI: we are not able to create bulk embeddings in a single request through Azure OpenAI. I have created Azure OpenAI embeddings in the following way:
from langchain.text_splitter import CharacterTextSplitter
#from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.create_documents(contents)
print(texts)
texts_new = [doc.page_content for doc in texts]

# Print the extracted texts
list_docs = []
for text in texts_new:
    print(text)
    list_docs.append(text)

import time
Embeddings_list = []
for item in list_docs:
    print(item)
    while True:
        try:
            response = openai.Embedding.create(input=item, engine="text-embedding-ada-002")
            embeddings = response['data'][0]['embedding']
            Embeddings_list.append(embeddings)
            break
        except Exception as e:
            print(e)
            time.sleep(15)

How can I use this Embeddings_list to ingest into a vector store and plug it into the Q&A process below? There is no guided documentation for integrating embeddings this way:

# Create a vectorstore from documents
db = Chroma.from_documents(texts, embeddings)

# Create retriever interface
retriever = db.as_retriever()

# Create QA chain
qa = RetrievalQA.from_chain_type(llm=OpenAI(openai_api_key=openai_api_key), chain_type='stuff', retriever=retriever)

Any recordings of the lecture part?

This looks like a really great workshop. We're doing a corporate Hackathon, and I'm wondering if you have recordings of the presentation parts?
