diff --git a/python/FormRecognizer/how-to-find-endpoint-and-key.png b/python/FormRecognizer/how-to-find-endpoint-and-key.png deleted file mode 100644 index 50cb17fd..00000000 Binary files a/python/FormRecognizer/how-to-find-endpoint-and-key.png and /dev/null differ diff --git a/python/FormRecognizer/images/how-to-find-endpoint-and-key.png b/python/FormRecognizer/images/how-to-find-endpoint-and-key.png index 50cb17fd..feda9212 100644 Binary files a/python/FormRecognizer/images/how-to-find-endpoint-and-key.png and b/python/FormRecognizer/images/how-to-find-endpoint-and-key.png differ diff --git a/python/FormRecognizer/quickstart-receipt-analyze-preview.ipynb b/python/FormRecognizer/quickstart-receipt-analyze-preview.ipynb new file mode 100644 index 00000000..f1906227 --- /dev/null +++ b/python/FormRecognizer/quickstart-receipt-analyze-preview.ipynb @@ -0,0 +1,262 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This notebook gives you an end-to-end example of how to get started with the Python SDK (preview) to analyze a receipt with Azure Form Recognizer." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## Prerequisites\n", + "- Azure subscription - [Create one for free](https://azure.microsoft.com/en-us/free/cognitive-services/)\n", + "- [Python 3.x](https://www.python.org/) - Your Python installation should include [pip](https://pip.pypa.io/en/stable/). You can check if you have pip installed by running `pip --version` on the command line. Get pip by installing the latest version of Python.\n", + "- Once you have your Azure subscription, [create a Form Recognizer resource](https://ms.portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer) in the Azure portal to get your **key** and **endpoint**. After it deploys, click **Go to resource**. You will need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. Later in the quickstart, you will paste your key and endpoint into the code below. You can use the free pricing tier (`F0`) to try the service, and upgrade later to a paid tier (`S0`) for production."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setting up - Install the client library\n", + "After installing Python, you can install the preview version of the Form Recognizer client library with:\n", + "`pip install azure-ai-formrecognizer --pre`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#!pip install azure-ai-formrecognizer --pre" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get the key and endpoint\n", + "Refer to the screenshot below to find the key and endpoint of your Form Recognizer resource.\n", + "![How to find endpoint and key](./images/how-to-find-endpoint-and-key.png)" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "FORMRECOGNIZER_ENDPOINT = \"{enter your endpoint}\"\n", + "FORMRECOGNIZER_KEY = \"{enter your key}\"\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Authenticate a DocumentAnalysisClient for document analysis\n", + "[DocumentAnalysisClient](https://docs.microsoft.com/en-us/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer.documentanalysisclient?view=azure-python) is used to query the service to analyze document fields and content, such as key-value pairs and tables, with prebuilt or custom-trained models." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "from azure.core.exceptions import ResourceNotFoundError\n", + "from azure.ai.formrecognizer import DocumentAnalysisClient\n", + "from azure.core.credentials import AzureKeyCredential\n", + "\n", + "# Instantiate the client with the given endpoint and credential\n", + "document_analysis_client = DocumentAnalysisClient(endpoint=FORMRECOGNIZER_ENDPOINT, credential=AzureKeyCredential(FORMRECOGNIZER_KEY))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Start an analyze request for your local files with `begin_analyze_document`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Read the sample image file into memory and begin analyzing the file\n", + "IMAGE_FILE = 'sample-hotel-receipt.png'\n", + "with open(IMAGE_FILE, 'rb') as f:\n", + " poller = document_analysis_client.begin_analyze_document(\n", + " \"prebuilt-receipt\", document=f\n", + " )\n", + "\n", + "# Get the analyze result\n", + "receipts = poller.result()\n", + "print('Status: {}, Document(s): {}'.format(poller.status(), receipts.documents))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## You can also analyze files from the web using `begin_analyze_document_from_url`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "IMAGE_URL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/contoso-receipt.png'\n", + "\n", + "# Send request to Form Recognizer service to process data\n", + "poller = document_analysis_client.begin_analyze_document_from_url(\n", + " \"prebuilt-receipt\", document_url=IMAGE_URL\n", + ")\n", + "\n", + "# Get the analyze result\n", + "receipts = poller.result()\n", + "print('Status: {}, Document(s): {}'.format(poller.status(), receipts.documents))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Extract 
information from analyzed result" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "--------Recognizing receipt #1--------\n", + "Receipt Type: Hotel has confidence: 0.988\n", + "ArrivalDate: 2021-03-27 has confidence 0.986\n", + "Currency: USD has confidence 0.995\n", + "DepartureDate: 2021-03-28 has confidence 0.986\n", + "Receipt Items:\n", + "...Item #1\n", + "......Category: Room has confidence 0.988\n", + "......Date: 2021-03-27 has confidence 0.99\n", + "......Description: Room Charge has confidence 0.989\n", + "......TotalPrice: 88.0 has confidence 0.99\n", + "...Item #2\n", + "......Category: Tax has confidence 0.986\n", + "......Date: 2021-03-27 has confidence 0.99\n", + "......Description: County Tax 6% has confidence 0.987\n", + "......TotalPrice: 5.28 has confidence 0.99\n", + "...Item #3\n", + "......Category: Tax has confidence 0.986\n", + "......Date: 2021-03-27 has confidence 0.99\n", + "......Description: State Tax 6% has confidence 0.985\n", + "......TotalPrice: 5.0 has confidence 0.99\n", + "...Item #4\n", + "......Category: Other has confidence 0.988\n", + "......Date: 2021-03-27 has confidence 0.99\n", + "......Description: Daily Parking has confidence 0.989\n", + "......TotalPrice: 8.0 has confidence 0.989\n", + "...Item #5\n", + "......Category: Tax has confidence 0.988\n", + "......Date: 2021-03-27 has confidence 0.989\n", + "......Description: Parking Tax has confidence 0.989\n", + "......TotalPrice: 0.38 has confidence 0.99\n", + "...Item #6\n", + "......Category: Credit has confidence 0.988\n", + "......Date: 2021-03-28 has confidence 0.99\n", + "......Description: American Express has confidence 0.989\n", + "......TotalPrice: 104.92 has confidence 0.99\n", + "Locale: en-US has confidence 0.992\n", + "MerchantAddress: 5600 148th Ave NE, Redmond, WA 98052 has confidence 0.988\n", + "MerchantAliases: [DocumentField(value_type=string, value='Contoso', content=Contoso, bounding_regions=[BoundingRegion(page_number=1, bounding_box=[Point(x=265.0, y=75.0), Point(x=343.0, y=76.0), Point(x=343.0, y=96.0), Point(x=266.0, y=96.0)])], spans=[DocumentSpan(offset=0, length=7)], confidence=0.813)] has confidence None\n", + "MerchantName: Contoso Inn has confidence 0.986\n", + "MerchantPhoneNumber: +19876544321 has confidence 0.99\n", + "ReceiptType: Hotel has confidence 0.988\n", + "Total: 104.92 has confidence 0.994\n" + ] + } + ], + "source": [ + "for idx, receipt in enumerate(receipts.documents):\n", + " print(\"--------Recognizing receipt #{}--------\".format(idx + 1))\n", + " receipt_type = receipt.fields.get(\"ReceiptType\")\n", + " if receipt_type:\n", + " print(\n", + " \"Receipt Type: {} has confidence: {}\".format(\n", + " receipt_type.value, receipt_type.confidence\n", + " )\n", + " )\n", + " merchant_name = receipt.fields.get(\"MerchantName\")\n", + " for name, field in receipt.fields.items():\n", + " if name == \"Items\":\n", + " print(\"Receipt Items:\")\n", + " for id_item, items in enumerate(field.value):\n", + " print(\"...Item #{}\".format(id_item + 1))\n", + " for item_name, item in items.value.items():\n", + " print(\"......{}: {} has confidence {}\".format(item_name, item.value, item.confidence))\n", + " else:\n", + " print(\"{}: {} has confidence {}\".format(name, field.value, field.confidence))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Next steps\n", + "- Learn [Receipt 
concept](https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/concept-receipts)\n", + "- Explore the [different offerings](https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/overview) in Form Recognizer\n", + "- Try Form Recognizer with [Form Recognizer Studio](https://docs.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/quickstarts/try-v3-form-recognizer-studio)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "interpreter": { + "hash": "be34584a914411b92112902d7e0134e77b56151a1dd3ca41a681ece390d32dda" + }, + "kernelspec": { + "display_name": "Python 3.7.11 64-bit ('fr-demo': conda)", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.11" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/python/FormRecognizer/quickstart-receipt-analyze.ipynb b/python/FormRecognizer/quickstart-receipt-analyze.ipynb index 2c6c6486..101ef467 100644 --- a/python/FormRecognizer/quickstart-receipt-analyze.ipynb +++ b/python/FormRecognizer/quickstart-receipt-analyze.ipynb @@ -2,172 +2,161 @@ "cells": [ { "cell_type": "markdown", + "metadata": {}, "source": [ "This notebook gives you an end-to-end example on how to get started using Python SDK to analyze a receipt with Azure Form Recognizer." - ], - "metadata": {} + ] }, { + "attachments": {}, "cell_type": "markdown", + "metadata": {}, "source": [ "\n", "## Prerequistes\n", "- Azure subscription - [Create one for free](https://azure.microsoft.com/en-us/free/cognitive-services/)\n", "- [Python 3.x](https://www.python.org/) - Your Python installation should include [pip](https://pip.pypa.io/en/stable/). You can check if you have pip installed by running `pip --version` on the command line. Get pip by installing the latest version of Python.\n", "- Once you have your Azure subscription, [create a Form Recognizer resource](https://ms.portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer) in the Azure portal to get your **key** and **endpoint**. After it deploys, click **Go to resource** - You will need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. Later in the quickstart, you will paste your key and endpoint into the code below. You can use the free pricing tier (`F0`) to try the service, and upgrade later to a paid tier (`S0`) for production." 
- ], - "metadata": {}, - "attachments": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Setting up - Install the client library\n", "After installing Python, you can install the latest version of Form Recognier client library with:\n", "`pip3 install --upgrade azure-ai-formrecognizer`" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": 19, + "metadata": {}, + "outputs": [], "source": [ "# !pip3 install --upgrade azure-ai-formrecognizer" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "## Get the key and endpoint\r\n", - "Refer to the screenshot on how to get the key and endpoint of your Form Recognizer resource.\r\n", + "## Get the key and endpoint\n", + "Refer to the screenshot on how to get the key and endpoint of your Form Recognizer resource.\n", "![How to find endpoint and key](./images/how-to-find-endpoint-and-key.png)" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": 6, + "metadata": {}, + "outputs": [], "source": [ - "FORMRECOGNIZER_ENDPOINT = \"{enter your endpoint}\"\r\n", + "FORMRECOGNIZER_ENDPOINT = \"{enter your endpoint}\"\n", "FORMRECOGNIZER_KEY = \"{enter your key}\"" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Authenticate a FormRecognizerClient for document analysis\n", "[FormRecognizerClient](https://docs.microsoft.com/en-us/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer.formrecognizerclient?view=azure-python) is used to query the service to recongize document fields and conent like key-value pairs, tables with prebuilt or custom trained models." - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": 7, + "metadata": {}, + "outputs": [], "source": [ - "from azure.core.credentials import AzureKeyCredential\r\n", - "from azure.ai.formrecognizer import FormRecognizerClient, FormField\r\n", - "\r\n", - "# Initiate client with given endpoint and credential\r\n", + "from azure.core.credentials import AzureKeyCredential\n", + "from azure.ai.formrecognizer import FormRecognizerClient, FormField\n", + "\n", + "# Initiate client with given endpoint and credential\n", "client = FormRecognizerClient(FORMRECOGNIZER_ENDPOINT, AzureKeyCredential(FORMRECOGNIZER_KEY))" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Start an analyze request for your local files with `begin_recognize_receipts`" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": 16, - "source": [ - "# Read the sample image file into memory\r\n", - "IMAGE_FILE = 'sample-receipt.png'\r\n", - "with open(IMAGE_FILE, 'rb') as f:\r\n", - " data = f.read()\r\n", - "\r\n", - "# Send request to Form Recognizer service to process data\r\n", - "task = client.begin_recognize_receipts(data)\r\n", - "\r\n", - "# Get the analyze result\r\n", - "analyzed_result = task.result()\r\n", - "print('Status: {}, Document(s): {}'.format(task.status(), len(analyzed_result)))" - ], + "metadata": {}, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ "Status: succeeded, Document(s): 1\n" ] } ], - "metadata": {} + "source": [ + "# Read the sample image file into memory\n", + "IMAGE_FILE = 'sample-retail-receipt.png'\n", + "with open(IMAGE_FILE, 'rb') as f:\n", + " data = f.read()\n", + "\n", + "# Send request to Form Recognizer service to process data\n", + "task = client.begin_recognize_receipts(data)\n", + "\n", + "# 
Get the analyze result\n", + "analyzed_result = task.result()\n", + "print('Status: {}, Document(s): {}'.format(task.status(), len(analyzed_result)))" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## You can also analyze files from the web using `begin_recognize_receipts_from_url`" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": 17, - "source": [ - "IMAGE_URL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/contoso-receipt.png'\r\n", - "\r\n", - "# Send request to Form Recognizer service to process data\r\n", - "task = client.begin_recognize_receipts_from_url(IMAGE_URL)\r\n", - "\r\n", - "# Get the analyze result\r\n", - "analyzed_result = task.result()\r\n", - "print('Status: {}, Document(s): {}'.format(task.status(), len(analyzed_result)))" - ], + "metadata": {}, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ "Status: succeeded, Document(s): 1\n" ] } ], - "metadata": {} + "source": [ + "IMAGE_URL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/contoso-receipt.png'\n", + "\n", + "# Send request to Form Recognizer service to process data\n", + "task = client.begin_recognize_receipts_from_url(IMAGE_URL)\n", + "\n", + "# Get the analyze result\n", + "analyzed_result = task.result()\n", + "print('Status: {}, Document(s): {}'.format(task.status(), len(analyzed_result)))" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Extract information from analyzed result" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": 18, - "source": [ - "for receipt in analyzed_result:\r\n", - " for name, field in receipt.fields.items():\r\n", - " if name == \"Items\":\r\n", - " print(\"Receipt Items:\")\r\n", - " for idx, items in enumerate(field.value):\r\n", - " print(\"...Item #{}\".format(idx + 1))\r\n", - " for item_name, item in items.value.items():\r\n", - " print(\"......{}: {} has confidence {}\".format(item_name, item.value, item.confidence))\r\n", - " else:\r\n", - " print(\"{}: {} has confidence {}\".format(name, field.value, field.confidence))" - ], + "metadata": {}, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ "Receipt Items:\n", "...Item #1\n", @@ -190,24 +179,35 @@ ] } ], - "metadata": {} + "source": [ + "for receipt in analyzed_result:\n", + " for name, field in receipt.fields.items():\n", + " if name == \"Items\":\n", + " print(\"Receipt Items:\")\n", + " for idx, items in enumerate(field.value):\n", + " print(\"...Item #{}\".format(idx + 1))\n", + " for item_name, item in items.value.items():\n", + " print(\"......{}: {} has confidence {}\".format(item_name, item.value, item.confidence))\n", + " else:\n", + " print(\"{}: {} has confidence {}\".format(name, field.value, field.confidence))" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Next steps\n", "- Learn [Receipt concept](https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/concept-receipts)\n", "- Explore the [different offerings](https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/overview) in Form Recognizer\n", "- Try Form Recognizer with [sample tool](https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/get-started-with-form-recognizer)" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": null, - "source": [], + 
"metadata": {}, "outputs": [], - "metadata": {} + "source": [] } ], "metadata": { @@ -234,4 +234,4 @@ }, "nbformat": 4, "nbformat_minor": 2 -} \ No newline at end of file +} diff --git a/python/FormRecognizer/sample-hotel-receipt.png b/python/FormRecognizer/sample-hotel-receipt.png new file mode 100644 index 00000000..ae63aee8 Binary files /dev/null and b/python/FormRecognizer/sample-hotel-receipt.png differ diff --git a/python/FormRecognizer/sample-receipt.png b/python/FormRecognizer/sample-retail-receipt.png similarity index 100% rename from python/FormRecognizer/sample-receipt.png rename to python/FormRecognizer/sample-retail-receipt.png