Form Recognizer API with Python for PDF throwing "BadArgument" error #323

@scoopmans

The issue

The Quickstart for the Form Recognizer REST API with Python no longer works. The POST request fails with the following error:
POST analyze failed: {"error":{"code":"BadArgument","message":"Bad or unrecognizable request JSON or binary file."}}

I'm not sure whether this is related to the update to Form Recognizer REST API v3.0, but the v2.1 layout call below no longer works for PDFs. The same code worked a few weeks ago.

My code

from requests import post

endpoint = r"https://westeurope.api.cognitive.microsoft.com"
key = "<my_personal_key>"
source = r"<path to pdf>"

def get_url(filename, endpoint, apim_key):

    # v2.1
    post_url = endpoint + "/formrecognizer/v2.1/layout/analyze"

    headers = {
        # Request headers
        'Content-Type': 'application/pdf',
        'Ocp-Apim-Subscription-Key': apim_key,
    }

    with open(filename, "rb") as f:
        data_bytes = f.read()

    try:
        resp = post(post_url, data=data_bytes, headers=headers)
        if resp.status_code != 202:
            print("POST analyze failed:\n%s" % resp.text)
            return None
        print("POST analyze succeeded: %s" % resp.headers["operation-location"])
        # Return the polling URL directly instead of shadowing the function name.
        return resp.headers["operation-location"]
    except Exception as e:
        print("POST request failed:\n%s" % str(e))
        return None

resp_url = get_url(source, endpoint, key)
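Since the error mentions an unrecognizable binary file, one thing worth ruling out is the file itself. The sketch below is a quick local check, not part of the Quickstart: `looks_like_pdf` is a hypothetical helper name, and it only tests for the `%PDF-` header marker near the start of the bytes, so passing it does not prove the service will accept the file.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the PDF header marker appears near the start of the data."""
    # PDF files begin with a "%PDF-" header. Search the first 1024 bytes
    # rather than only offset 0, to tolerate a few leading junk bytes.
    return b"%PDF-" in data[:1024]


if __name__ == "__main__":
    import sys

    # Usage (assumed): python check_pdf.py <path to pdf>
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as f:
            print("Looks like a PDF:", looks_like_pdf(f.read()))
```

If this prints `False` for the file in `source`, the problem is likely on the client side (wrong path, or a file that is not really a PDF) rather than in the API.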

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [x] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Follow the Quickstart: Extract text and layout information using the Form Recognizer REST API with Python

Any log messages given by the failure

POST analyze failed:
{"error":{"code":"BadArgument","message":"Bad or unrecognizable request JSON or binary file."}}
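For easier logging, the error body can be reduced to its code and message. This is a minimal sketch that assumes the body has the `{"error": {"code": ..., "message": ...}}` shape shown above; `describe_error` is a hypothetical helper name, and anything that does not match that shape is returned unchanged.

```python
import json


def describe_error(body: str) -> str:
    """Summarize an error body as 'code: message', or return it unchanged."""
    try:
        err = json.loads(body)["error"]
        return f'{err["code"]}: {err["message"]}'
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not the assumed {"error": {...}} shape.
        return body
```

In the code above, this could replace `resp.text` in the "POST analyze failed" print.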

Expected/desired behavior

I expect the POST to return status 202 along with an operation-location URL.

OS and Version?

Working on Windows 10, using Visual Studio Code

Versions

requests == 2.28.1
Python == 3.9.13
