Issues with FastAPI Uvicorn workers

I have a sample FastAPI application with a single endpoint that returns its result after a delay of 5 seconds.
The code is below.

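from fastapi import FastAPI
import uvicorn, os, time

app = FastAPI()

@app.get("/delayed-response")
async def read_root():
    time.sleep(5)  # assumed: blocking 5 s delay; the original delay call is missing from the post
    return {"message": f"This is a delayed response!- {os.getpid()}"}

if __name__ == "__main__":
    # "main:app" assumes this file is main.py; workers > 1 needs an import string
    uvicorn.run(
        "main:app",
        port=9090,
        workers=10,  # Just having 10 workers to understand the concept.
    )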

I also have a script that sends parallel requests to the endpoint http://localhost:9090/delayed-response
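The script itself is not shown in the post. For reference, a minimal sketch of what such a client could look like, using only the standard library. The function names `fetch` and `send_parallel` are hypothetical, and one thread per request is an assumption about how the requests are sent in parallel.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:9090/delayed-response"  # endpoint from the question


def fetch(url):
    """Send one GET request; return (seconds taken, decoded JSON body)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return time.perf_counter() - start, body


def send_parallel(n, url=URL, fetch_fn=fetch):
    """Fire n requests at once, one thread each; results keep request order."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(fetch_fn, [url] * n))
```

With the server running, `send_parallel(20)` returns 20 `(elapsed, body)` pairs, and `body["message"]` carries the PID of the worker that answered.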

On starting the application, I saw that all 10 workers, with process IDs 9 to 18, started successfully.

When I send 20 parallel requests to this endpoint, I observe that only the first few requests are handled in parallel by the workers; after that (perhaps from a certain point on), a single worker handles all the remaining requests.
I attached a few screenshots of the responses (images not shown).
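Since each response message ends with the worker's PID, the distribution can also be checked without screenshots. A small sketch that tallies responses per worker; the helper names are hypothetical, and the sample messages are made up for illustration, not taken from the actual output.

```python
import re
from collections import Counter


def pid_from_message(message):
    """Extract the trailing process id from a message like '...response!- 12'."""
    match = re.search(r"(\d+)\s*$", message)
    return int(match.group(1)) if match else None


def requests_per_worker(messages):
    """Count how many responses each worker PID produced."""
    return Counter(pid_from_message(m) for m in messages)


# Hypothetical sample messages, not the actual output from the screenshots.
sample = [
    "This is a delayed response!- 9",
    "This is a delayed response!- 10",
    "This is a delayed response!- 9",
    "This is a delayed response!- 9",
]
print(requests_per_worker(sample))
```

A heavily skewed count, with one PID far ahead of the others, would match the behavior described above.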

Can anyone explain this behavior?

  1. When a sudden burst of requests is sent to an endpoint, why is the first set of requests (equal in number to the workers) handled in parallel, while afterwards only one worker handles the requests?
  2. Is this a behavior of FastAPI or of Uvicorn?

Note: Let us consider this application to be a synchronous application. I understand that there are a few ways to make it work concurrently, but at this point I want to understand this behavior.
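To make "synchronous" concrete, here is a minimal sketch of the difference inside a single worker's event loop, assuming the handler is an `async def` function. A blocking call such as `time.sleep` stalls every other request on that worker, while `await asyncio.sleep` lets them overlap. The handler names are hypothetical, and the delay is shortened from 5 seconds so the sketch runs quickly.

```python
import asyncio
import time

DELAY = 0.05  # shortened stand-in for the 5-second delay


async def blocking_handler():
    time.sleep(DELAY)  # blocks the event loop; no other coroutine can run


async def non_blocking_handler():
    await asyncio.sleep(DELAY)  # yields to the event loop while waiting


async def timed(handler, n):
    """Run n handler calls concurrently on one event loop; return seconds taken."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start


blocking_total = asyncio.run(timed(blocking_handler, 4))
non_blocking_total = asyncio.run(timed(non_blocking_handler, 4))
print(f"blocking: {blocking_total:.2f}s, non-blocking: {non_blocking_total:.2f}s")
```

The blocking variant takes roughly four times the delay, because the calls run one after another; the non-blocking variant takes roughly one delay.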

My expectation: when 10 workers are configured and 30 requests are sent in parallel to an endpoint with a 5-second delay, the 30th request should get its response at around the 15th to 17th second.
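The arithmetic behind that expectation can be sketched as follows, assuming each worker handles exactly one blocking request at a time and ignoring network and scheduling overhead. The function names are hypothetical. The second function models the observed pattern, where every worker takes one request and a single worker then handles all the rest.

```python
def expected_finish_times(num_requests, num_workers, delay):
    """Finish time of each request if requests are spread evenly across
    workers and each worker handles one blocking request at a time."""
    return [delay * (i // num_workers + 1) for i in range(num_requests)]


def skewed_finish_time(num_requests, num_workers, delay):
    """Finish time of the last request if every worker takes one request
    and a single worker then handles all remaining requests in sequence."""
    if num_requests == 0:
        return 0
    remaining = max(num_requests - num_workers, 0)
    return delay * (1 + remaining)


even = expected_finish_times(30, 10, 5)
print(even[-1])                       # 15
print(skewed_finish_time(30, 10, 5))  # 105
```

Under these assumptions, an even spread finishes the 30th request at 15 seconds, while the skewed pattern would push it to 105 seconds.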