Guide to the Python SDK for Azure Blob Storage.

Whenever I write code that interacts with Blob Storage, I end up searching my previous code or the internet for the same snippets. I've decided to compile a few of the most common ones below so that I (and you) can copy/paste them easily whenever they are needed.

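I'm writing all of them in a self-contained manner so that one doesn't have to go digging around for the imports. The snippets assume that variables such as the connection string, the container name, and the file paths are already defined.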

Copy files from Blob to local

Option 1 makes use of a BlobServiceClient instance, while Option 2 makes use of a ContainerClient. If we are going to do many operations on the same container, Option 2 will save us some typing and reduce the risk of addressing the wrong container. The ContainerClient also has a get_blob_client method, similar to the one on BlobServiceClient, in case we need to do blob-specific operations.

Option 1

from azure.storage.blob import BlobServiceClient

blob_service_client = BlobServiceClient.from_connection_string(connect_str)

blob_client = blob_service_client.get_blob_client(
    container=container_name,
    blob=local_file_name,
)

with open(download_file_path, "wb") as download_file:
    download_file.write(blob_client.download_blob().readall())

Option 2

from azure.storage.blob import ContainerClient

container_client = ContainerClient.from_connection_string(
    connection_str,
    container_name=container_name,
)

with open(download_file_path, "wb") as download_file:
    # download_blob returns a downloader object; readall() yields the bytes.
    download_file.write(container_client.download_blob(blob).readall())
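
For large blobs, readall() holds the whole blob in memory before it is written to disk. A minimal sketch of a streaming alternative, assuming the readinto method of the downloader object returned by download_blob (available in azure-storage-blob v12 as far as I know). download_to_path is a hypothetical helper name, not part of the SDK. It takes an already constructed ContainerClient and goes through get_blob_client, as mentioned earlier:

```python
from pathlib import Path


def download_to_path(container_client, blob_name, dest_path):
    """Stream one blob into a local file and return the number of bytes written.

    container_client is expected to be an azure.storage.blob.ContainerClient.
    Hypothetical helper, not part of the SDK.
    """
    dest = Path(dest_path)
    # Create the target folder if needed; blob names often contain "/".
    dest.parent.mkdir(parents=True, exist_ok=True)

    blob_client = container_client.get_blob_client(blob_name)
    with open(dest, "wb") as download_file:
        # readinto() writes the blob into the open stream instead of
        # building one large bytes object first.
        return blob_client.download_blob().readinto(download_file)
```

Passing the client in as an argument keeps the helper free of connection details, so the same ContainerClient can be reused for many downloads.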

Uploading blobs

Again we have two options, mediated by either a BlobServiceClient or a ContainerClient.

Option 1

from azure.storage.blob import BlobServiceClient

blob_service_client = BlobServiceClient.from_connection_string(connect_str)

blob_client = blob_service_client.get_blob_client(
    container=container_name,
    blob=blob_path_and_name,
)

with open(upload_file_path, "rb") as data:
    blob_client.upload_blob(data)

Option 2

from azure.storage.blob import ContainerClient

container_client = ContainerClient.from_connection_string(
    connection_str,
    container_name=container_name,
)

with open(upload_file_path, "rb") as data:
    container_client.upload_blob(
        blob_path_and_name,
        data,
        overwrite=False,  # Default: raises ResourceExistsError if the blob exists. Use True to replace it.
    )

Bonus Points

Uploading data without creating a file

Sometimes you don't have a file but you still want to create a blob from some data. Wouldn't it be inconvenient if you had to create a temp file just for this? Worry no more, BytesIO has got you covered.

import io
import json
from azure.storage.blob import ContainerClient

container_client = ContainerClient.from_connection_string(
    connection_str,
    container_name=container_name,
)

some_random_json = {"error": "what error?"}

data = io.BytesIO(json.dumps(some_random_json).encode())

container_client.upload_blob(
    blob_path_and_name,
    data,
)
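
The same trick works for any data you can serialize to bytes. A minimal sketch for CSV rows, using only the standard library for the serialization. rows_to_csv_stream and upload_rows_as_csv are hypothetical helper names, and the client is assumed to be a ContainerClient created as in the snippet above:

```python
import csv
import io


def rows_to_csv_stream(rows):
    """Serialize an iterable of rows into an in-memory, file-like bytes buffer."""
    text_buffer = io.StringIO(newline="")
    csv.writer(text_buffer).writerows(rows)
    # A new BytesIO starts at position 0, so it reads like a freshly opened file.
    return io.BytesIO(text_buffer.getvalue().encode("utf-8"))


def upload_rows_as_csv(container_client, blob_path_and_name, rows):
    """Upload rows as a CSV blob without touching the local disk."""
    data = rows_to_csv_stream(rows)
    return container_client.upload_blob(blob_path_and_name, data, overwrite=True)
```

As far as I know, upload_blob also accepts plain bytes or str, so for small payloads the BytesIO wrapper is optional.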

Changing ContentSettings

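Sometimes we need to explicitly set the MIME type or the MD5 hash of the blob. In those cases you will need to specify them using a ContentSettings object. Note that fname_to_mime and getmd5 in the snippet stand for your own helper functions; they are not part of the SDK.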

from azure.storage.blob import ContainerClient, ContentSettings

container_client = ContainerClient.from_connection_string(
    connection_str,
    container_name=container_name,
)

with open(path_to_file, "rb") as data:
    container_client.upload_blob(
        fname,
        data,
        content_settings=ContentSettings(
            content_type=fname_to_mime(fname),
            content_md5=getmd5(path_to_file),
        ),
        overwrite=True,
    )
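
A minimal sketch of what the two helpers could look like, using only the standard library. It assumes content_md5 expects the raw 16-byte digest as a bytearray (which is how the SDK documents the parameter type, as far as I can tell) rather than a hex string. The default fallback type and the chunk size are my own choices:

```python
import hashlib
import mimetypes


def fname_to_mime(fname, default="application/octet-stream"):
    """Guess the MIME type from the file extension, with a generic fallback."""
    mime_type, _encoding = mimetypes.guess_type(fname)
    return mime_type or default


def getmd5(path_to_file, chunk_size=1024 * 1024):
    """Return the MD5 digest of a file as a bytearray, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path_to_file, "rb") as f:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return bytearray(md5.digest())
```

Guessing from the extension is only a heuristic, so pass the content type explicitly when you know it.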
