How do I create a Python 3 asynchronous web request with asyncio?

A couple of important things:

  • Because of the GIL, the CPython interpreter executes Python bytecode on one thread at a time, and asyncio runs your coroutines on a single thread; so technically you aren't really running things in parallel.
  • But the catch is that most I/O operations spend their time waiting (on the network or the disk) while your CPU sits idle. That's where libraries like asyncio come to your rescue.
  • They try to keep CPU idle time to a minimum by running other tasks in your queue while I/O operations are awaiting their results.
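
As a rough illustration of the points above, the sketch below runs three simulated I/O waits concurrently on one thread. asyncio.sleep stands in for a network call, and the names fake_io and run_concurrently are made up for this example.

```python
import asyncio
import time


async def fake_io(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a network request: the coroutine is
    # suspended here and the event loop is free to run other tasks.
    await asyncio.sleep(delay)
    return name


async def run_concurrently() -> tuple:
    start = time.perf_counter()
    # gather schedules all three coroutines on the same thread and
    # returns their results in the order they were passed in.
    results = await asyncio.gather(
        fake_io("a", 0.2),
        fake_io("b", 0.2),
        fake_io("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently())
    print(results, f"{elapsed:.2f}s")
```

The total time should be close to that of a single wait rather than the sum of all three, even though nothing ran in parallel.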

In your case, update_posts() doesn't really need to be an async method, because it is only used to figure out which posts are to be downloaded and written.

And since we are already discussing downloading and writing, notice that you can make them run as independent tasks to ensure minimal downtime.

Here is how I might approach this:

import asyncio
from asyncio import Queue
import aiohttp
import os


async def generate_download_post_tasks(path, queue: Queue):
    # get_myid() and maxid come from the code in the question.
    myid = get_myid(path)
    if myid < maxid:  # Database can be updated
        for post_id in range(myid + 1, maxid):
            queue.put_nowait((post_id, path))


async def download_post_tasks(download_queue: Queue, write_queue: Queue):
    # One session is shared by all requests so connections can be reused.
    async with aiohttp.ClientSession() as session:
        while True:
            download_request_id, path = await download_queue.get()
            # apiurl comes from the code in the question.
            async with session.get(f"{apiurl}item/{download_request_id}.json?print=pretty") as resp:
                data = await resp.json()
                content = await resp.text()
                print(f"Done downloading {download_request_id}")
                # Only stories are handed on to the writer.
                if data["type"] == "story":
                    write_queue.put_nowait((download_request_id, content, path))


async def write_post_tasks(write_queue: Queue):
    while True:
        post_id, post_content, path = await write_queue.get()
        print(f"Begin writing to {post_id}.json")
        # open() and write() are blocking calls; for small files this is usually acceptable.
        with open(os.path.join(path, f"{post_id}.json"), "w") as file:
            file.write(post_content)
        print(f"Done writing to {post_id}.json")


async def async_main():
    # posts_dir comes from the code in the question.
    os.makedirs(posts_dir, exist_ok=True)
    path = os.path.join(os.getcwd(), posts_dir)

    tasks = set()
    download_queue = Queue()
    write_queue = Queue()
    tasks.add(asyncio.create_task(generate_download_post_tasks(path=path, queue=download_queue)))
    tasks.add(asyncio.create_task(download_post_tasks(download_queue=download_queue, write_queue=write_queue)))
    tasks.add(asyncio.create_task(write_post_tasks(write_queue=write_queue)))

    wait_time = 100  # seconds

    try:
        # The worker loops never return, so stop everything after wait_time seconds.
        await asyncio.wait_for(asyncio.gather(*tasks), wait_time)
    except asyncio.TimeoutError:
        print("End!!")
        

if __name__ == '__main__':
    asyncio.run(async_main())
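
One thing to be aware of in the code above: the two worker loops never exit, so the program only stops when the timeout fires. A possible alternative, sketched below with simulated downloads so that it runs on its own, is to call task_done() after each item, wait on Queue.join(), and then cancel the workers. The names fake_download, downloader, writer, and run_pipeline are hypothetical and not part of the original code.

```python
import asyncio


async def fake_download(post_id: int) -> str:
    # Stand-in for the aiohttp request; yields control to the event loop.
    await asyncio.sleep(0.01)
    return f"content of {post_id}"


async def downloader(download_queue: asyncio.Queue, write_queue: asyncio.Queue) -> None:
    while True:
        post_id = await download_queue.get()
        try:
            content = await fake_download(post_id)
            write_queue.put_nowait((post_id, content))
        finally:
            # Mark the item as processed so download_queue.join() can return.
            download_queue.task_done()


async def writer(write_queue: asyncio.Queue, written: dict) -> None:
    while True:
        post_id, content = await write_queue.get()
        try:
            # Stand-in for writing the file to disk.
            written[post_id] = content
        finally:
            write_queue.task_done()


async def run_pipeline(post_ids: list) -> dict:
    download_queue: asyncio.Queue = asyncio.Queue()
    write_queue: asyncio.Queue = asyncio.Queue()
    written: dict = {}

    for post_id in post_ids:
        download_queue.put_nowait(post_id)

    workers = [
        asyncio.create_task(downloader(download_queue, write_queue)),
        asyncio.create_task(writer(write_queue, written)),
    ]

    # Wait until every queued download has been processed, then until
    # every resulting write has been processed.
    await download_queue.join()
    await write_queue.join()

    # The workers are now idle in their while-True loops; cancel them.
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return written


if __name__ == "__main__":
    print(asyncio.run(run_pipeline([1, 2, 3])))
```

With this shape the program ends as soon as the work is done, and a timeout is only needed as a safety net.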
