{"id":963,"date":"2020-05-04T15:18:41","date_gmt":"2020-05-04T22:18:41","guid":{"rendered":"http:\/\/blog.nillsf.com\/?p=963"},"modified":"2020-05-04T15:24:03","modified_gmt":"2020-05-04T22:24:03","slug":"using-form-recognizer-to-recognize-custom-forms","status":"publish","type":"post","link":"https:\/\/blog.nillsf.com\/index.php\/2020\/05\/04\/using-form-recognizer-to-recognize-custom-forms\/","title":{"rendered":"Using Form Recognizer to recognize custom forms"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">One of my learning goals for this half year was to learn more about AI. Although I&#8217;m focusing my learning mainly on low-level AI (Python, ML, Data Science), I was pretty happy to get involved in a customer project using the Microsoft Form Recognizer. This blog post will cover the Form Recognizer and its functionality. <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Let&#8217;s start at the beginning: Form Recognizer is one of Microsoft&#8217;s cognitive services that allows you to extract structured text from forms. The service itself has three building blocks:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Layout API<\/strong>: An API to extract text and table structure from a document using OCR.<\/li><li><strong>Prebuilt receipt model<\/strong>: An API that allows you to extract data from (USA) sales receipts.<\/li><li><strong>Custom Models<\/strong>: A service that allows you to build custom models to extract data from your custom forms. Custom models use the Layout API to extract text and structure, and can either be trained without labels (using no human input) or using custom labels (allowing you to provide the model with your labels and positions).<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">At the time of writing, the service is in preview. This means you can use it at a discounted rate right now, but you don&#8217;t get an SLA for it yet. 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this post, we&#8217;ll walk through the custom model capabilities in the Form Recognizer API and how to build a custom model with them. To start out, let&#8217;s walk through the workflow for the custom model API.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Custom model workflow<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Building a custom Form Recognizer model takes five steps:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Build a training and testing dataset<\/li><li>Upload training dataset<\/li><li>Label training dataset<\/li><li>Train model<\/li><li>Test model<\/li><\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">All of these steps can be done directly against the API of Form Recognizer. To make it easier, Microsoft has developed <a href=\"https:\/\/github.com\/microsoft\/OCR-Form-Tools\/blob\/master\/README.md\">an open source tool<\/a> to help with steps 3 to 5. After uploading your training dataset to Azure blob storage, you can use the tool to label your training dataset and train your model. <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You could also do this directly from a programming environment. The Azure documentation has a sample on <a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/cognitive-services\/form-recognizer\/quickstarts\/python-labeled-data\">how to do this from Python<\/a>. <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And with that, let&#8217;s have a look at how we can build a custom model using the Form Recognizer. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Prerequisites<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">First off, we need a storage account with training data in it. I will use the <a href=\"https:\/\/github.com\/Azure-Samples\/cognitive-services-REST-api-samples\/blob\/master\/curl\/form-recognizer\/sample_data.zip\">default training data<\/a> from Microsoft. 
I created a new storage account for this, and uploaded the data to it.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"851\" height=\"821\" src=\"\/wp-content\/uploads\/2020\/05\/image.png\" alt=\"\" class=\"wp-image-964\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image.png 851w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-300x289.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-768x741.png 768w\" sizes=\"auto, (max-width: 851px) 100vw, 851px\" \/><figcaption>Creating a new storage account.<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"672\" height=\"451\" src=\"\/wp-content\/uploads\/2020\/05\/image-1.png\" alt=\"\" class=\"wp-image-965\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-1.png 672w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-1-300x201.png 300w\" sizes=\"auto, (max-width: 672px) 100vw, 672px\" \/><figcaption>Uploading the 5 training invoices<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In our storage account, we&#8217;ll also need to allow CORS. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"438\" src=\"\/wp-content\/uploads\/2020\/05\/image-5-1024x438.png\" alt=\"\" class=\"wp-image-969\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-5-1024x438.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-5-300x128.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-5-768x328.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-5.png 1231w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Setting up CORS<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">And finally for storage, we&#8217;ll also need a SAS token for the container hosting our files. For the purposes of this demo, we&#8217;ll create a full account SAS:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"698\" src=\"\/wp-content\/uploads\/2020\/05\/image-6-1024x698.png\" alt=\"\" class=\"wp-image-970\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-6-1024x698.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-6-300x205.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-6-768x524.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-6.png 1226w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Generating a SAS token<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Copy and paste the SAS token to a temporary file; we&#8217;ll use it later on.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Next, we&#8217;ll create the <a href=\"https:\/\/ms.portal.azure.com\/#create\/Microsoft.CognitiveServicesFormRecognizer\">Form Recognizer resource<\/a> itself. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"660\" height=\"569\" src=\"\/wp-content\/uploads\/2020\/05\/image-2.png\" alt=\"\" class=\"wp-image-966\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-2.png 660w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-2-300x259.png 300w\" sizes=\"auto, (max-width: 660px) 100vw, 660px\" \/><figcaption>Creating the form recognizer resource itself<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once this is created &#8211; which took less than a minute &#8211; we&#8217;ll need to grab the endpoint and the key to access it. <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"557\" src=\"\/wp-content\/uploads\/2020\/05\/image-3-1024x557.png\" alt=\"\" class=\"wp-image-967\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-3-1024x557.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-3-300x163.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-3-768x418.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-3.png 1105w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Getting the endpoint and the key.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">With that deployed, we can go ahead and deploy the labeling tool itself.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Deploying the labeling tool<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">We&#8217;ll deploy and run the labeling tool locally. 
In my case, I&#8217;ll run this on my <a href=\"https:\/\/blog.nillsf.com\/index.php\/2020\/02\/17\/setting-up-wsl2-windows-terminal-and-oh-my-zsh\/\">Ubuntu system running on WSL on Windows<\/a>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>docker pull mcr.microsoft.com\/azure-cognitive-services\/custom-form\/labeltool\ndocker run -it -p 3000:80 mcr.microsoft.com\/azure-cognitive-services\/custom-form\/labeltool eula=accept<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"542\" src=\"\/wp-content\/uploads\/2020\/05\/image-4-1024x542.png\" alt=\"\" class=\"wp-image-968\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-4-1024x542.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-4-300x159.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-4-768x407.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-4.png 1235w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">We can now connect to the labeling tool at <code>localhost:3000<\/code>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"954\" height=\"634\" src=\"\/wp-content\/uploads\/2020\/05\/image-7.png\" alt=\"\" class=\"wp-image-971\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-7.png 954w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-7-300x199.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-7-768x510.png 768w\" sizes=\"auto, (max-width: 954px) 100vw, 954px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">First thing, we&#8217;ll need to pair up our storage account via the SAS URL. To do this, hit the connection icon on the left panel, and fill in the storage account details. Make sure to include the container name in the SAS URL. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"954\" height=\"675\" src=\"\/wp-content\/uploads\/2020\/05\/image-8.png\" alt=\"\" class=\"wp-image-972\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-8.png 954w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-8-300x212.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-8-768x543.png 768w\" sizes=\"auto, (max-width: 954px) 100vw, 954px\" \/><figcaption>Create the storage account connection<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Next, we can create a project to start labeling. <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"954\" height=\"675\" src=\"\/wp-content\/uploads\/2020\/05\/image-9.png\" alt=\"\" class=\"wp-image-973\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-9.png 954w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-9-300x212.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-9-768x543.png 768w\" sizes=\"auto, (max-width: 954px) 100vw, 954px\" \/><figcaption>Create a new project<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Provide all the necessary details to connect your blob storage account and connect it to the forms endpoint we created earlier. 
Finally, hit save project.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"954\" height=\"951\" src=\"\/wp-content\/uploads\/2020\/05\/image-10.png\" alt=\"\" class=\"wp-image-974\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-10.png 954w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-10-300x300.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-10-150x150.png 150w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-10-768x766.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-10-60x60.png 60w\" sizes=\"auto, (max-width: 954px) 100vw, 954px\" \/><figcaption>Provide all the details to create the project.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Immediately after saving the project, you&#8217;ll see the form analyzer tool starts loading the forms you submitted and runs OCR on the first one of them. Click on the green button to run OCR on all files, and then we can start manually labeling the forms in the next step.  <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"954\" height=\"951\" src=\"\/wp-content\/uploads\/2020\/05\/image-11.png\" alt=\"\" class=\"wp-image-975\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-11.png 954w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-11-300x300.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-11-150x150.png 150w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-11-768x766.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-11-60x60.png 60w\" sizes=\"auto, (max-width: 954px) 100vw, 954px\" \/><figcaption>Running OCR on the first file. Make sure to run OCR on all files, to avoid waiting in the next step. 
<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Labeling the forms<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Now we can go ahead and label our forms. The first thing we&#8217;ll do here is create a set of tags for the information contained in the form:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1008\" height=\"438\" src=\"\/wp-content\/uploads\/2020\/05\/image-12.png\" alt=\"\" class=\"wp-image-976\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-12.png 1008w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-12-300x130.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-12-768x334.png 768w\" sizes=\"auto, (max-width: 1008px) 100vw, 1008px\" \/><figcaption>Creating a set of tags.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Next, we can link items on the screen to a tag. To do this, click the element (or multiple elements, in the case of the address) and then click the tag itself.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1003\" height=\"421\" src=\"\/wp-content\/uploads\/2020\/05\/image-13.png\" alt=\"\" class=\"wp-image-977\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-13.png 1003w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-13-300x126.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-13-768x322.png 768w\" sizes=\"auto, (max-width: 1003px) 100vw, 1003px\" \/><figcaption>To link an element on the page to a tag, click the element first and then click the tag.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Do this for all tags on all the training invoices. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"657\" src=\"\/wp-content\/uploads\/2020\/05\/image-14-1024x657.png\" alt=\"\" class=\"wp-image-978\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-14-1024x657.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-14-300x193.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-14-768x493.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-14.png 1234w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Label everything in each file.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once this is done, you can go ahead and train your custom model. To do this, click the neural network icon and hit the <em>Train<\/em> button.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"382\" src=\"\/wp-content\/uploads\/2020\/05\/image-15-1024x382.png\" alt=\"\" class=\"wp-image-979\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-15-1024x382.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-15-300x112.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-15-768x286.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-15.png 1232w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>After training, you can see that our model has a 95% average accuracy.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">And now, we can start analyzing new invoices. 
To do this in the tool, click the lightbulb icon and upload a file:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"293\" src=\"\/wp-content\/uploads\/2020\/05\/image-16-1024x293.png\" alt=\"\" class=\"wp-image-980\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-16-1024x293.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-16-300x86.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-16-768x220.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-16.png 1230w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Running the prediction on an actual invoice.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">This will generate a dataset containing the returned tags:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"\/wp-content\/uploads\/2020\/05\/image-17.png\" alt=\"\" class=\"wp-image-981\" width=\"396\" height=\"695\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-17.png 396w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-17-171x300.png 171w\" sizes=\"auto, (max-width: 396px) 100vw, 396px\" \/><figcaption>The returned tags from our invoice.<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Calling Form Recognizer from Python<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Form Recognizer is an API that can be called from a multitude of tools. To show the raw response, I also wanted to test the experience in Python. 
To run Python, I&#8217;ll use <a href=\"https:\/\/notebooks.azure.com\">Jupyter notebooks on Azure<\/a>, which are available for free.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There, I created a new project, uploaded an invoice, and created a new Jupyter notebook:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"965\" height=\"425\" src=\"\/wp-content\/uploads\/2020\/05\/image-18.png\" alt=\"\" class=\"wp-image-982\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-18.png 965w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-18-300x132.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-18-768x338.png 768w\" sizes=\"auto, (max-width: 965px) 100vw, 965px\" \/><figcaption>Uploading the invoice and creating a new Python notebook.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The Python code required to call the API and get a result comes from the <a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/cognitive-services\/form-recognizer\/quickstarts\/python-labeled-data\">Microsoft quickstart<\/a>. That code has two parts: the first uploads your invoice to the API, and the second polls until the results are ready. 
Let&#8217;s start with the first part of the code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>########### Python Form Recognizer Async Analyze #############\nimport json\nimport time\nfrom requests import get, post\n\n# Endpoint URL and key of the Form Recognizer resource (from the Azure portal)\nendpoint = r\"https:\/\/nf-form.cognitiveservices.azure.com\/\"\napim_key = \"baXXX\"\n# ID of the trained custom model\nmodel_id = \"e195db40-baf7-4573-8224-fc6d1277e719\"\n# Strip the trailing slash so the URL does not contain a double slash\npost_url = endpoint.rstrip(\"\/\") + \"\/formrecognizer\/v2.0-preview\/custom\/models\/%s\/analyze\" % model_id\nsource = r\"Invoice_7.pdf\"\nparams = {\n    \"includeTextDetails\": True\n}\n\nheaders = {\n    # Request headers\n    'Content-Type': 'application\/pdf',\n    'Ocp-Apim-Subscription-Key': apim_key,\n}\nwith open(source, \"rb\") as f:\n    data_bytes = f.read()\n\ntry:\n    resp = post(url = post_url, data = data_bytes, headers = headers, params = params)\n    # 202 Accepted means the analysis was queued\n    if resp.status_code != 202:\n        print(\"POST analyze failed:\\n%s\" % json.dumps(resp.json()))\n        quit()\n    print(\"POST analyze succeeded:\\n%s\" % resp.headers)\n    # URL to poll for the analysis result\n    get_url = resp.headers[\"operation-location\"]\nexcept Exception as e:\n    print(\"POST analyze failed:\\n%s\" % str(e))\n    quit()<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">One thing to note here is that you need your <code>model_id<\/code>. To get your model ID, go back to the web tool and go to the training view. 
That contains your model ID.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"984\" height=\"536\" src=\"\/wp-content\/uploads\/2020\/05\/image-19.png\" alt=\"\" class=\"wp-image-983\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-19.png 984w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-19-300x163.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-19-768x418.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/image-19-750x410.png 750w\" sizes=\"auto, (max-width: 984px) 100vw, 984px\" \/><figcaption>Getting your model ID.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once you&#8217;ve executed this step, you can execute the second step. This will poll the API endpoint and get the results when they are ready.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>n_tries = 15\nn_try = 0\nwait_sec = 5\nmax_wait_sec = 60\nwhile n_try &lt; n_tries:\n    try:\n        resp = get(url = get_url, headers = {\"Ocp-Apim-Subscription-Key\": apim_key})\n        resp_json = resp.json()\n        if resp.status_code != 200:\n            print(\"GET analyze results failed:\\n%s\" % json.dumps(resp_json))\n            break\n        status = resp_json[\"status\"]\n        if status == \"succeeded\":\n            print(\"Analysis succeeded:\\n%s\" % json.dumps(resp_json))\n            break\n        if status == \"failed\":\n            print(\"Analysis failed:\\n%s\" % json.dumps(resp_json))\n            break\n        # Analysis still running. 
Wait and retry.\n        time.sleep(wait_sec)\n        n_try += 1\n        # Double the wait between polls, capped at max_wait_sec\n        wait_sec = min(2*wait_sec, max_wait_sec)\n    except Exception as e:\n        msg = \"GET analyze results failed:\\n%s\" % str(e)\n        print(msg)\n        break\nelse:\n    # Runs only if the loop ended without a break, i.e. all tries were used up\n    print(\"Analyze operation did not complete within the allocated time.\")<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This returns a large JSON object. The full object is included for reference at the bottom of the post; here is a snippet: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\"documentResults\": [\n    {\n        \"docType\": \"custom:form\",\n        \"pageRange\": [\n            1,\n            1\n        ],\n        \"fields\": {\n           <strong> \"Customer Address\": <\/strong> {\n                \"type\": \"string\",\n                 <strong>\"valueString\": \"The Phone Company 5506 Main St Redmond, WA 73493\", <\/strong>\n                \"text\": \"The Phone Company 5506 Main St Redmond, WA 73493\",\n                \"page\": 1,\n                \"boundingBox\": [\n                    5.195,\n                    1.51,\n                    6.57,\n                    1.51,\n                    6.57,\n                    2.0300000000000002,\n                    5.195,\n                    2.0300000000000002\n                ],\n                \"confidence\": 1.0,\n                \"elements\": [\n                    \"#\/analyzeResult\/readResults\/0\/lines\/2\/words\/2\",\n                    \"#\/analyzeResult\/readResults\/0\/lines\/2\/words\/3\",\n                    \"#\/analyzeResult\/readResults\/0\/lines\/2\/words\/4\",\n                    \"#\/analyzeResult\/readResults\/0\/lines\/4\/words\/0\",\n                    \"#\/analyzeResult\/readResults\/0\/lines\/4\/words\/1\",\n                    \"#\/analyzeResult\/readResults\/0\/lines\/4\/words\/2\",\n                    \"#\/analyzeResult\/readResults\/0\/lines\/6\/words\/0\",\n                    
\"#\/analyzeResult\/readResults\/0\/lines\/6\/words\/1\",\n                    \"#\/analyzeResult\/readResults\/0\/lines\/6\/words\/2\"\n                ]\n            },<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">As you can see here, this returns the elements of our custom form. This includes for instance the Customer Address we tagged, followed by all other elements. The full JSON document contains a full analysis of the document. It identifies all text boxes and all OCR results. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">This was a quick overview of the form recognizer&#8217;s ability to recognize custom forms. We looked into tagging custom forms, and training a custom model to recognize data in these forms. We then used both the web tool and Python to process form information.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Full JSON object returned by Form Recognizer<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n    \"status\": \"succeeded\",\n    \"createdDateTime\": \"2020-05-04T20:25:55Z\",\n    \"lastUpdatedDateTime\": \"2020-05-04T20:26:06Z\",\n    \"analyzeResult\": {\n        \"version\": \"2.0.0\",\n        \"readResults\": [{\n            \"page\": 1,\n            \"language\": \"en\",\n            \"angle\": 0,\n            \"width\": 8.5,\n            \"height\": 11,\n            \"unit\": \"inch\",\n            \"lines\": [{\n                \"boundingBox\": [0.5492, 1.1349, 2.6403, 1.1349, 2.6403, 1.4069, 0.5492, 1.4069],\n                \"text\": \"Margie's Travel\",\n                \"words\": [{\n                    \"boundingBox\": [0.5492, 1.1349, 1.7043, 1.1349, 1.7043, 1.4069, 0.5492, 1.4069],\n                    \"text\": \"Margie's\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [1.7903, 1.1349, 2.6403, 1.1349, 2.6403, 1.3534, 1.7903, 1.3534],\n                    \"text\": \"Travel\",\n                    
\"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [0.7984, 1.515, 1.3826, 1.515, 1.3826, 1.6161, 0.7984, 1.6161],\n                \"text\": \"Address:\",\n                \"words\": [{\n                    \"boundingBox\": [0.7984, 1.515, 1.3826, 1.515, 1.3826, 1.6161, 0.7984, 1.6161],\n                    \"text\": \"Address:\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [4.4033, 1.5114, 6.5682, 1.5114, 6.5682, 1.6425, 4.4033, 1.6425],\n                \"text\": \"Invoice For: The Phone Company\",\n                \"words\": [{\n                    \"boundingBox\": [4.4033, 1.5143, 4.8234, 1.5143, 4.8234, 1.6155, 4.4033, 1.6155],\n                    \"text\": \"Invoice\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [4.8793, 1.5143, 5.1013, 1.5143, 5.1013, 1.6154, 4.8793, 1.6154],\n                    \"text\": \"For:\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [5.1974, 1.513, 5.4354, 1.513, 5.4354, 1.6151, 5.1974, 1.6151],\n                    \"text\": \"The\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [5.489, 1.513, 5.8966, 1.513, 5.8966, 1.6151, 5.489, 1.6151],\n                    \"text\": \"Phone\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [5.9466, 1.5114, 6.5682, 1.5114, 6.5682, 1.6425, 5.9466, 1.6425],\n                    \"text\": \"Company\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [0.8107, 1.7037, 2.0158, 1.7037, 2.0158, 1.8076, 0.8107, 1.8076],\n                \"text\": \"134 El Camino Real\",\n                \"words\": [{\n                    \"boundingBox\": [0.8107, 1.705, 1.0195, 1.705, 1.0195, 1.8076, 0.8107, 1.8076],\n                    
\"text\": \"134\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [1.0755, 1.7054, 1.1779, 1.7054, 1.1779, 1.806, 1.0755, 1.806],\n                    \"text\": \"El\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [1.2329, 1.7037, 1.6975, 1.7037, 1.6975, 1.8075, 1.2329, 1.8075],\n                    \"text\": \"Camino\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [1.752, 1.7054, 2.0158, 1.7054, 2.0158, 1.8075, 1.752, 1.8075],\n                    \"text\": \"Real\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [5.1995, 1.7133, 6.0298, 1.7133, 6.0298, 1.8172, 5.1995, 1.8172],\n                \"text\": \"5506 Main St\",\n                \"words\": [{\n                    \"boundingBox\": [5.1995, 1.7145, 5.4962, 1.7145, 5.4962, 1.817, 5.1995, 1.817],\n                    \"text\": \"5506\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [5.5494, 1.7149, 5.8453, 1.7149, 5.8453, 1.817, 5.5494, 1.817],\n                    \"text\": \"Main\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [5.8982, 1.7133, 6.0298, 1.7133, 6.0298, 1.8172, 5.8982, 1.8172],\n                    \"text\": \"St\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [0.8062, 1.8967, 2.0399, 1.8967, 2.0399, 1.9993, 0.8062, 1.9993],\n                \"text\": \"New York NY 46233\",\n                \"words\": [{\n                    \"boundingBox\": [0.8062, 1.8971, 1.0712, 1.8971, 1.0712, 1.9992, 0.8062, 1.9992],\n                    \"text\": \"New\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [1.1112, 1.8971, 1.3946, 1.8971, 1.3946, 1.9992, 1.1112, 
1.9992],\n                    \"text\": \"York\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [1.442, 1.8971, 1.6226, 1.8971, 1.6226, 1.9976, 1.442, 1.9976],\n                    \"text\": \"NY\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [1.6633, 1.8967, 2.0399, 1.8967, 2.0399, 1.9993, 1.6633, 1.9993],\n                    \"text\": \"46233\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [5.2018, 1.9045, 6.554, 1.9045, 6.554, 2.0275, 5.2018, 2.0275],\n                \"text\": \"Redmond, WA 73493\",\n                \"words\": [{\n                    \"boundingBox\": [5.2018, 1.9049, 5.8581, 1.9049, 5.8581, 2.0275, 5.2018, 2.0275],\n                    \"text\": \"Redmond,\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [5.9069, 1.9049, 6.1364, 1.9049, 6.1364, 2.0055, 5.9069, 2.0055],\n                    \"text\": \"WA\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [6.1812, 1.9045, 6.554, 1.9045, 6.554, 2.0072, 6.1812, 2.0072],\n                    \"text\": \"73493\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [0.5439, 2.8733, 1.5729, 2.8733, 1.5729, 2.9754, 0.5439, 2.9754],\n                \"text\": \"Invoice Number\",\n                \"words\": [{\n                    \"boundingBox\": [0.5439, 2.8733, 1.0098, 2.8733, 1.0098, 2.9754, 0.5439, 2.9754],\n                    \"text\": \"Invoice\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [1.0611, 2.8743, 1.5729, 2.8743, 1.5729, 2.9754, 1.0611, 2.9754],\n                    \"text\": \"Number\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": 
[1.9491, 2.8733, 2.7527, 2.8733, 2.7527, 2.9754, 1.9491, 2.9754],\n                \"text\": \"Invoice Date\",\n                \"words\": [{\n                    \"boundingBox\": [1.9491, 2.8733, 2.415, 2.8733, 2.415, 2.9754, 1.9491, 2.9754],\n                    \"text\": \"Invoice\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [2.4673, 2.8743, 2.7527, 2.8743, 2.7527, 2.9754, 2.4673, 2.9754],\n                    \"text\": \"Date\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [3.3495, 2.8733, 4.4547, 2.8733, 4.4547, 2.9754, 3.3495, 2.9754],\n                \"text\": \"Invoice Due Date\",\n                \"words\": [{\n                    \"boundingBox\": [3.3495, 2.8733, 3.8155, 2.8733, 3.8155, 2.9754, 3.3495, 2.9754],\n                    \"text\": \"Invoice\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [3.8677, 2.8743, 4.1149, 2.8743, 4.1149, 2.9754, 3.8677, 2.9754],\n                    \"text\": \"Due\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [4.1678, 2.8743, 4.4547, 2.8743, 4.4547, 2.9754, 4.1678, 2.9754],\n                    \"text\": \"Date\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [4.7468, 2.8717, 5.289, 2.8717, 5.289, 3.0035, 4.7468, 3.0035],\n                \"text\": \"Charges\",\n                \"words\": [{\n                    \"boundingBox\": [4.7468, 2.8717, 5.289, 2.8717, 5.289, 3.0035, 4.7468, 3.0035],\n                    \"text\": \"Charges\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [6.141, 2.873, 6.5875, 2.873, 6.5875, 2.9736, 6.141, 2.9736],\n                \"text\": \"VAT ID\",\n                \"words\": [{\n                    \"boundingBox\": [6.141, 2.873, 
6.4147, 2.873, 6.4147, 2.9736, 6.141, 2.9736],\n                    \"text\": \"VAT\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [6.4655, 2.873, 6.5875, 2.873, 6.5875, 2.9736, 6.4655, 2.9736],\n                    \"text\": \"ID\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [0.535, 3.4097, 1.1504, 3.4097, 1.1504, 3.5136, 0.535, 3.5136],\n                \"text\": \"AC-32322\",\n                \"words\": [{\n                    \"boundingBox\": [0.535, 3.4097, 1.1504, 3.4097, 1.1504, 3.5136, 0.535, 3.5136],\n                    \"text\": \"AC-32322\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [1.9461, 3.411, 2.8569, 3.411, 2.8569, 3.5136, 1.9461, 3.5136],\n                \"text\": \"03 March 2018\",\n                \"words\": [{\n                    \"boundingBox\": [1.9461, 3.411, 2.0879, 3.411, 2.0879, 3.5136, 1.9461, 3.5136],\n                    \"text\": \"03\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [2.1428, 3.4114, 2.5074, 3.4114, 2.5074, 3.5135, 2.1428, 3.5135],\n                    \"text\": \"March\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [2.5593, 3.411, 2.8569, 3.411, 2.8569, 3.5135, 2.5593, 3.5135],\n                    \"text\": \"2018\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [3.3465, 3.411, 4.119, 3.411, 4.119, 3.5135, 3.3465, 3.5135],\n                \"text\": \"06 Nov 2019\",\n                \"words\": [{\n                    \"boundingBox\": [3.3465, 3.411, 3.4882, 3.411, 3.4882, 3.5135, 3.3465, 3.5135],\n                    \"text\": \"06\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [3.5435, 3.4114, 3.7773, 
3.4114, 3.7773, 3.5135, 3.5435, 3.5135],\n                    \"text\": \"Nov\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [3.8214, 3.411, 4.119, 3.411, 4.119, 3.5135, 3.8214, 3.5135],\n                    \"text\": \"2019\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [5.2909, 3.4114, 6.0483, 3.4114, 6.0483, 3.5381, 5.2909, 3.5381],\n                \"text\": \"$110,153.22\",\n                \"words\": [{\n                    \"boundingBox\": [5.2909, 3.4114, 6.0483, 3.4114, 6.0483, 3.5381, 5.2909, 3.5381],\n                    \"text\": \"$110,153.22\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [6.2288, 3.4114, 6.3995, 3.4114, 6.3995, 3.5119, 6.2288, 3.5119],\n                \"text\": \"RT\",\n                \"words\": [{\n                    \"boundingBox\": [6.2288, 3.4114, 6.3995, 3.4114, 6.3995, 3.5119, 6.2288, 3.5119],\n                    \"text\": \"RT\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [6.2429, 9.667, 6.5489, 9.667, 6.5489, 9.7966, 6.2429, 9.7966],\n                \"text\": \"Page\",\n                \"words\": [{\n                    \"boundingBox\": [6.2429, 9.667, 6.5489, 9.667, 6.5489, 9.7966, 6.2429, 9.7966],\n                    \"text\": \"Page\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [6.8409, 9.6656, 7.0593, 9.6656, 7.0593, 9.7681, 6.8409, 9.7681],\n                \"text\": \"1 of\",\n                \"words\": [{\n                    \"boundingBox\": [6.8409, 9.6681, 6.8837, 9.6681, 6.8837, 9.7663, 6.8409, 9.7663],\n                    \"text\": \"1\",\n                    \"confidence\": 1\n                }, {\n                    \"boundingBox\": [6.9512, 9.6656, 7.0593, 9.6656, 7.0593, 9.7681, 
6.9512, 9.7681],\n                    \"text\": \"of\",\n                    \"confidence\": 1\n                }]\n            }, {\n                \"boundingBox\": [7.4076, 9.6681, 7.4503, 9.6681, 7.4503, 9.7663, 7.4076, 9.7663],\n                \"text\": \"1\",\n                \"words\": [{\n                    \"boundingBox\": [7.4076, 9.6681, 7.4503, 9.6681, 7.4503, 9.7663, 7.4076, 9.7663],\n                    \"text\": \"1\",\n                    \"confidence\": 1\n                }]\n            }]\n        }],\n        \"pageResults\": [{\n            \"page\": 1,\n            \"tables\": [{\n                \"rows\": 2,\n                \"columns\": 6,\n                \"cells\": [{\n                    \"rowIndex\": 0,\n                    \"columnIndex\": 0,\n                    \"text\": \"Invoice Number\",\n                    \"boundingBox\": [0.5075, 2.8088, 1.9061, 2.8088, 1.9061, 3.3219, 0.5075, 3.3219],\n                    \"elements\": [\"#\/readResults\/0\/lines\/7\/words\/0\", \"#\/readResults\/0\/lines\/7\/words\/1\"]\n                }, {\n                    \"rowIndex\": 0,\n                    \"columnIndex\": 1,\n                    \"text\": \"Invoice Date\",\n                    \"boundingBox\": [1.9061, 2.8088, 3.3074, 2.8088, 3.3074, 3.3219, 1.9061, 3.3219],\n                    \"elements\": [\"#\/readResults\/0\/lines\/8\/words\/0\", \"#\/readResults\/0\/lines\/8\/words\/1\"]\n                }, {\n                    \"rowIndex\": 0,\n                    \"columnIndex\": 2,\n                    \"text\": \"Invoice Due Date\",\n                    \"boundingBox\": [3.3074, 2.8088, 4.7074, 2.8088, 4.7074, 3.3219, 3.3074, 3.3219],\n                    \"elements\": [\"#\/readResults\/0\/lines\/9\/words\/0\", \"#\/readResults\/0\/lines\/9\/words\/1\", \"#\/readResults\/0\/lines\/9\/words\/2\"]\n                }, {\n                    \"rowIndex\": 0,\n                    \"columnIndex\": 3,\n                    \"text\": 
\"Charges\",\n                    \"boundingBox\": [4.7074, 2.8088, 5.386, 2.8088, 5.386, 3.3219, 4.7074, 3.3219],\n                    \"elements\": [\"#\/readResults\/0\/lines\/10\/words\/0\"]\n                }, {\n                    \"rowIndex\": 0,\n                    \"columnIndex\": 5,\n                    \"text\": \"VAT ID\",\n                    \"boundingBox\": [6.1051, 2.8088, 7.5038, 2.8088, 7.5038, 3.3219, 6.1051, 3.3219],\n                    \"elements\": [\"#\/readResults\/0\/lines\/11\/words\/0\", \"#\/readResults\/0\/lines\/11\/words\/1\"]\n                }, {\n                    \"rowIndex\": 1,\n                    \"columnIndex\": 0,\n                    \"text\": \"AC-32322\",\n                    \"boundingBox\": [0.5075, 3.3219, 1.9061, 3.3219, 1.9061, 3.859, 0.5075, 3.859],\n                    \"elements\": [\"#\/readResults\/0\/lines\/12\/words\/0\"]\n                }, {\n                    \"rowIndex\": 1,\n                    \"columnIndex\": 1,\n                    \"text\": \"03 March 2018\",\n                    \"boundingBox\": [1.9061, 3.3219, 3.3074, 3.3219, 3.3074, 3.859, 1.9061, 3.859],\n                    \"elements\": [\"#\/readResults\/0\/lines\/13\/words\/0\", \"#\/readResults\/0\/lines\/13\/words\/1\", \"#\/readResults\/0\/lines\/13\/words\/2\"]\n                }, {\n                    \"rowIndex\": 1,\n                    \"columnIndex\": 2,\n                    \"text\": \"06 Nov 2019\",\n                    \"boundingBox\": [3.3074, 3.3219, 4.7074, 3.3219, 4.7074, 3.859, 3.3074, 3.859],\n                    \"elements\": [\"#\/readResults\/0\/lines\/14\/words\/0\", \"#\/readResults\/0\/lines\/14\/words\/1\", \"#\/readResults\/0\/lines\/14\/words\/2\"]\n                }, {\n                    \"rowIndex\": 1,\n                    \"columnIndex\": 3,\n                    \"columnSpan\": 2,\n                    \"text\": \"$110,153.22\",\n                    \"boundingBox\": [4.7074, 3.3219, 6.1051, 3.3219, 
6.1051, 3.859, 4.7074, 3.859],\n                    \"elements\": [\"#\/readResults\/0\/lines\/15\/words\/0\"]\n                }, {\n                    \"rowIndex\": 1,\n                    \"columnIndex\": 5,\n                    \"text\": \"RT\",\n                    \"boundingBox\": [6.1051, 3.3219, 7.5038, 3.3219, 7.5038, 3.859, 6.1051, 3.859],\n                    \"elements\": [\"#\/readResults\/0\/lines\/16\/words\/0\"]\n                }]\n            }]\n        }],\n        \"documentResults\": [{\n            \"docType\": \"custom:form\",\n            \"pageRange\": [1, 1],\n            \"fields\": {\n                \"Customer Address\": {\n                    \"type\": \"string\",\n                    \"valueString\": \"The Phone Company 5506 Main St Redmond, WA 73493\",\n                    \"text\": \"The Phone Company 5506 Main St Redmond, WA 73493\",\n                    \"page\": 1,\n                    \"boundingBox\": [5.195, 1.51, 6.57, 1.51, 6.57, 2.0300000000000002, 5.195, 2.0300000000000002],\n                    \"confidence\": 1.0,\n                    \"elements\": [\"#\/analyzeResult\/readResults\/0\/lines\/2\/words\/2\", \"#\/analyzeResult\/readResults\/0\/lines\/2\/words\/3\", \"#\/analyzeResult\/readResults\/0\/lines\/2\/words\/4\", \"#\/analyzeResult\/readResults\/0\/lines\/4\/words\/0\", \"#\/analyzeResult\/readResults\/0\/lines\/4\/words\/1\", \"#\/analyzeResult\/readResults\/0\/lines\/4\/words\/2\", \"#\/analyzeResult\/readResults\/0\/lines\/6\/words\/0\", \"#\/analyzeResult\/readResults\/0\/lines\/6\/words\/1\", \"#\/analyzeResult\/readResults\/0\/lines\/6\/words\/2\"]\n                },\n                \"Invoice Date\": {\n                    \"type\": \"date\",\n                    \"text\": \"03 March 2018\",\n                    \"page\": 1,\n                    \"boundingBox\": [1.945, 3.41, 2.855, 3.41, 2.855, 3.515, 1.945, 3.515],\n                    \"confidence\": 0.88,\n                    \"elements\": 
[\"#\/analyzeResult\/readResults\/0\/lines\/13\/words\/0\", \"#\/analyzeResult\/readResults\/0\/lines\/13\/words\/1\", \"#\/analyzeResult\/readResults\/0\/lines\/13\/words\/2\"]\n                },\n                \"Invoice Number\": {\n                    \"type\": \"string\",\n                    \"valueString\": \"AC-32322\",\n                    \"text\": \"AC-32322\",\n                    \"page\": 1,\n                    \"boundingBox\": [0.535, 3.41, 1.1500000000000001, 3.41, 1.1500000000000001, 3.515, 0.535, 3.515],\n                    \"confidence\": 0.99,\n                    \"elements\": [\"#\/analyzeResult\/readResults\/0\/lines\/12\/words\/0\"]\n                },\n                \"Invoice Due Date\": {\n                    \"type\": \"date\",\n                    \"text\": \"06 Nov 2019\",\n                    \"page\": 1,\n                    \"boundingBox\": [3.345, 3.41, 4.12, 3.41, 4.12, 3.515, 3.345, 3.515],\n                    \"confidence\": 0.99,\n                    \"elements\": [\"#\/analyzeResult\/readResults\/0\/lines\/14\/words\/0\", \"#\/analyzeResult\/readResults\/0\/lines\/14\/words\/1\", \"#\/analyzeResult\/readResults\/0\/lines\/14\/words\/2\"]\n                },\n                \"VAT ID\": {\n                    \"type\": \"string\",\n                    \"valueString\": \"RT\",\n                    \"text\": \"RT\",\n                    \"page\": 1,\n                    \"boundingBox\": [6.23, 3.41, 6.4, 3.41, 6.4, 3.5100000000000002, 6.23, 3.5100000000000002],\n                    \"confidence\": 1.0,\n                    \"elements\": [\"#\/analyzeResult\/readResults\/0\/lines\/16\/words\/0\"]\n                },\n                \"Charges\": {\n                    \"type\": \"string\",\n                    \"valueString\": \"$110,153.22\",\n                    \"text\": \"$110,153.22\",\n                    \"page\": 1,\n                    \"boundingBox\": [5.29, 3.41, 6.05, 3.41, 6.05, 3.54, 5.29, 3.54],\n              
      \"confidence\": 1.0,\n                    \"elements\": [\"#\/analyzeResult\/readResults\/0\/lines\/15\/words\/0\"]\n                },\n                \"Company Address\": {\n                    \"type\": \"string\",\n                    \"valueString\": \"134 El Camino Real New York NY 46233\",\n                    \"text\": \"134 El Camino Real New York NY 46233\",\n                    \"page\": 1,\n                    \"boundingBox\": [0.805, 1.705, 2.04, 1.705, 2.04, 2.0, 0.805, 2.0],\n                    \"confidence\": 1.0,\n                    \"elements\": [\"#\/analyzeResult\/readResults\/0\/lines\/3\/words\/0\", \"#\/analyzeResult\/readResults\/0\/lines\/3\/words\/1\", \"#\/analyzeResult\/readResults\/0\/lines\/3\/words\/2\", \"#\/analyzeResult\/readResults\/0\/lines\/3\/words\/3\", \"#\/analyzeResult\/readResults\/0\/lines\/5\/words\/0\", \"#\/analyzeResult\/readResults\/0\/lines\/5\/words\/1\", \"#\/analyzeResult\/readResults\/0\/lines\/5\/words\/2\", \"#\/analyzeResult\/readResults\/0\/lines\/5\/words\/3\"]\n                },\n                \"Company Name\": {\n                    \"type\": \"string\",\n                    \"valueString\": \"Margie's Travel\",\n                    \"text\": \"Margie's Travel\",\n                    \"page\": 1,\n                    \"boundingBox\": [0.55, 1.135, 2.64, 1.135, 2.64, 1.405, 0.55, 1.405],\n                    \"confidence\": 0.94,\n                    \"elements\": [\"#\/analyzeResult\/readResults\/0\/lines\/0\/words\/0\", \"#\/analyzeResult\/readResults\/0\/lines\/0\/words\/1\"]\n                }\n            }\n        }],\n        \"errors\": []\n    }\n}<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of my learning goals for this half year was to learn more about AI. 
Although I&#8217;m focusing my learning mainly on low-level AI (Python, ML, Data Science), I was pretty happy to get involved in a customer project using the Microsoft Form Recognizer. This blog post will cover the Form Recognizer and its functionality. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":987,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[2,47,31],"tags":[8,44,113,114],"class_list":["post-963","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-azure","category-data-science","category-software-development","tag-azure","tag-data-science","tag-form-recognizer","tag-python"],"jetpack_featured_media_url":"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/05\/2020-05-04-15_09_33-PowerPoint-Slide-Show-Presentation1-1-e1588631374103.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/posts\/963","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/comments?post=963"}],"version-history":[{"count":4,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/posts\/963\/revisions"}],"predecessor-version":[{"id":989,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/posts\/963\/revisions\/989"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/media\/987"}],"wp:attachment":[{"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/media?parent=963"}],"wp:term":[{"taxonomy":"
category","embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/categories?post=963"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/tags?post=963"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}