Guide to the Python SDK for Azure Blob Storage.
Whenever I write code that interacts with Blob storage, I end up searching through previous code or the internet for the same handful of snippets. I've decided to compile a few of the most common ones below so that I (and you) can copy/paste them easily whenever they are needed.
I'm writing each of them in a self-contained manner so that one doesn't have to go digging around for the imports.
Copy files from Blob to local
Option 1 makes use of a BlobServiceClient instance, while the second variant makes use of a ContainerClient. If we are going to do many operations on the same container, using Option 2 will save us some typing and reduce the risk of addressing the wrong container. The ContainerClient also has a get_blob_client method, similar to the one from BlobServiceClient, in case we need to do blob-specific operations.
Option 1
from azure.storage.blob import BlobServiceClient
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
blob_client = blob_service_client.get_blob_client(
    container=container_name,
    blob=local_file_name,
)
with open(download_file_path, "wb") as download_file:
    download_file.write(blob_client.download_blob().readall())
Option 2
from azure.storage.blob import ContainerClient
container_client = ContainerClient.from_connection_string(
    connection_str,
    container_name=container_name,
)
with open(download_file_path, "wb") as download_file:
    # download_blob returns a downloader object; readall() gives us the bytes
    download_file.write(container_client.download_blob(blob).readall())
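Both options above read the whole blob into memory before writing it out. For larger blobs, a streaming variant might be preferable. The following is a minimal sketch, assuming the object returned by download_blob exposes readinto (as the v12 SDK's downloader does, to my knowledge). The helper name download_blob_to_path is my own invention and not part of the SDK.

```python
from typing import Any


def download_blob_to_path(container_client: Any, blob_name: str, path: str) -> int:
    """Stream a blob into a local file and return the number of bytes written.

    Hypothetical helper: `container_client` is expected to behave like
    azure.storage.blob.ContainerClient.
    """
    with open(path, "wb") as download_file:
        downloader = container_client.download_blob(blob_name)
        # readinto() writes the blob content into the open file object
        # instead of returning it as one big bytes value.
        return downloader.readinto(download_file)
```

Usage would look like download_blob_to_path(container_client, blob, download_file_path), with container_client created as in Option 2.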
Uploading blobs
Again we have two options, which are mediated by either a BlobServiceClient or a ContainerClient.
Option 1
from azure.storage.blob import BlobServiceClient
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
blob_client = blob_service_client.get_blob_client(
    container=container_name,
    blob=blob_path_and_name,
)
with open(upload_file_path, "rb") as data:
    blob_client.upload_blob(data)
Option 2
from azure.storage.blob import ContainerClient
container_client = ContainerClient.from_connection_string(
    connection_str,
    container_name=container_name,
)
with open(upload_file_path, "rb") as data:
    container_client.upload_blob(
        blob_path_and_name,
        data,
        overwrite=False,  # or True. Do we want to overwrite?
    )
Bonus Points
Uploading data without creating a file
Sometimes you don't have a file but you still want to create a blob out of some data. Wouldn't it be inconvenient if you had to create a temp file just for this? Worry no more, BytesIO has got you covered.
import io
import json
from azure.storage.blob import ContainerClient
container_client = ContainerClient.from_connection_string(
    connection_str,
    container_name=container_name,
)
some_random_json = {"error": "what error?"}
# Wrap the encoded JSON in an in-memory, file-like object
data = io.BytesIO(json.dumps(some_random_json).encode())
container_client.upload_blob(
    blob_path_and_name,
    data,
)
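If you do this often, the BytesIO trick can be wrapped in a small function. This is only a sketch: upload_json is a name I made up, and the client argument is assumed to behave like the ContainerClient used above (the only call it relies on is upload_blob with the overwrite keyword).

```python
import io
import json
from typing import Any


def upload_json(
    container_client: Any,
    blob_name: str,
    payload: Any,
    overwrite: bool = False,
) -> None:
    """Serialize `payload` to JSON and upload it without touching the disk.

    Hypothetical helper: `container_client` is expected to behave like
    azure.storage.blob.ContainerClient.
    """
    # Encode explicitly as UTF-8 and wrap the bytes in a file-like object
    data = io.BytesIO(json.dumps(payload).encode("utf-8"))
    container_client.upload_blob(blob_name, data, overwrite=overwrite)
```

A call such as upload_json(container_client, blob_path_and_name, some_random_json) would then replace the last few lines of the snippet above.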
Changing ContentSettings
Sometimes we need to explicitly set the MIME type or the MD5 hash for the blob. In those cases you will need to specify them using a ContentSettings object.
from azure.storage.blob import ContainerClient, ContentSettings
container_client = ContainerClient.from_connection_string(
    connection_str,
    container_name=container_name,
)
# fname_to_mime and getmd5 are user-defined helpers, not part of the SDK
with open(path_to_file, "rb") as data:
    container_client.upload_blob(
        fname,
        data,
        content_settings=ContentSettings(
            content_type=fname_to_mime(fname),
            content_md5=getmd5(path_to_file),
        ),
        overwrite=True,
    )
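The snippet above calls fname_to_mime and getmd5 without defining them. Here is one possible way to write them using only the standard library. Treat it as a sketch under two assumptions of mine: that guessing the MIME type from the file extension is good enough, and that content_md5 should be the raw 16-byte digest (as a bytearray) rather than the hex string.

```python
import hashlib
import mimetypes


def fname_to_mime(fname: str) -> str:
    """Guess a MIME type from the file name, with a generic fallback."""
    mime_type, _encoding = mimetypes.guess_type(fname)
    return mime_type or "application/octet-stream"


def getmd5(path: str, chunk_size: int = 1024 * 1024) -> bytearray:
    """Return the raw MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read chunk by chunk so large files do not have to fit in memory
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    # Raw digest bytes, not the hex representation
    return bytearray(digest.digest())
```

With these two functions in scope, the ContentSettings snippet runs as written, given a valid connection string and container.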