Part 2 – Building A Multi-Subscription Azure Savings & Score Card Report – Visualisation and Email

Have you read Part 1 of this multi part post on creating a Multi-Subscription Azure Savings & Score Card Report? If not you can bring yourself up to speed here

But to recap. Cost, it’s something that a lot of organisations struggle with when operating in the cloud.
I have authored many posts on mechanisms you can take to reduce your costs, and they broadly apply across all the hyperscalers, but as they say, sunlight is the best disinfectant.

The sunlight I am referring to is cost visibility. Something many organisations lack to implement.

The challenge in many an organisations are that engineering teams operate in the dark about their spending impact. Finance may see ballooning costs that can be hard to correlate to business outcomes.

This isn’t a technical problem. We have two parties, one that is building, pulling architectural levers and spending and then there is another party thats paying bills that are almost always not in sync with each other.

So in part 2 of this multi-part post, we are going to take our data that we calculated in part 1 and then visualise and then automate the whole end-2-end process so that at the start of every month we automatically generate and send cost reports

This post can be broken in to a few key steps

  1. Visulation With Python Libraries
  2. Conversion to PDF
  3. Automation With Power Automate

Visualisation With Python Libraries
We will be using code to turn our CSV in to a visulation. This is Python that will be ran by Azure Automation. What this code will do is the following

  • Generated detailed Azure Cost Report based on the CSV file
  • Converts these HTML reports in to PDF
  • Report Shows
    • Cost Overview (Current Month, Monthly Average, Annual Projection, Potential Annual Savings)
    • Efficiency Rating (Current Effeciency, Potential Monthly Savings, Azure Advisor Score
    • Monthly Cost Trend (Last 5 months)
    • Subscription Comparison
      • Cost Of This Subscription vs Average In Organisation
      • Saving Potential Of This Subscription vs Average In Organisation
    • Service Cost Distribution
      • Pie Chart of Top 5 Used Azure Services
    • Savings Opportunity
      • Graph Showing Current vs. Optimal Cost
      • Chart Showing Cost Saving Opportunities
    • Service Specific Recommendations
    • Next Steps

Short of going through the code line by line I am going to insert a copy of a cost report.

To generate these we first need to install our Python libraries and this can be done via PIP. Create a file called requirements.txt with the following content

pandas>=1.3.0
matplotlib>=3.4.0
numpy>=1.20.0
seaborn>=0.11.0
fpdf>=1.7.2

Then execute this file

pip install -r requirements.txt

The pyhton code to generate these reports is as follows.

#!/usr/bin/env python3
"""
Azure Subscription Cost HTML Report Generator
--------------------------------------------
This script generates one detailed HTML report per subscription
from an Azure cost report CSV file, then converts them to PDF.
"""

import pandas as pd
import numpy as np
import os
import argparse
import re
import json
from datetime import datetime
import jinja2
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import weasyprint

# Set up template environment
template_loader = jinja2.FileSystemLoader(searchpath="./")
template_env = jinja2.Environment(loader=template_loader)

def extract_service_costs(services_str):
    """Parse the Top 5 Services string from the CSV"""
    if pd.isna(services_str) or services_str == 'No data':
        return {}
    
    service_costs = {}
    services = services_str.split('), ')
    for service in services:
        service = service.replace('(', '').replace(')', '')
        if '$' in service:
            parts = service.split('$')
            service_name = parts[0].strip()
            cost_str = parts[1].strip().replace(',', '')
            try:
                service_costs[service_name] = float(cost_str)
            except ValueError:
                continue
    
    return service_costs

def format_currency(value):
    """Format a value as currency"""
    if value >= 1e6:
        return f'${value*1e-6:.1f}M'
    elif value >= 1e3:
        return f'${value*1e-3:.0f}K'
    else:
        return f'${value:.0f}'

def get_month_data(df, subscription_row):
    """Extract month names and cost data"""
    # Get month columns
    months = [col for col in df.columns if 'Cost' in col]
    
    month_labels = []
    for col in months:
        parts = col.split(' ')
        if len(parts) == 2:  # Format: "Month Cost"
            month_labels.append(parts[0])
        elif len(parts) == 3:  # Format: "Month YYYY Cost"
            month_labels.append(f"{parts[0]} {parts[1]}")
        else:
            month_labels.append(col.replace(' Cost', ''))
    
    # Extract monthly costs
    monthly_costs = [subscription_row[month] for month in months]
    
    return months, month_labels, monthly_costs

def create_monthly_trend_chart(month_labels, monthly_costs):
    """Create a monthly cost trend chart using Plotly"""
    fig = go.Figure()
    
    fig.add_trace(go.Bar(
        x=month_labels,
        y=monthly_costs,
        text=[f'${cost:,.0f}' for cost in monthly_costs],
        textposition='outside',
        marker_color='#1f77b4'
    ))
    
    fig.update_layout(
        title="Monthly Cost Trend",
        xaxis_title="Month",
        yaxis_title="Cost (AUD)",
        height=400,
        margin=dict(l=40, r=40, t=60, b=40),
        plot_bgcolor='rgba(0,0,0,0)'
    )
    
    return fig.to_json()

def create_services_pie_chart(subscription_row):
    """Create a pie chart for service breakdown"""
    # Parse top services
    top_services = extract_service_costs(subscription_row['Top 5 Services (Current Month)'])
    
    if not top_services:
        # Return empty chart if no data
        fig = go.Figure()
        fig.update_layout(
            title="No Service Data Available",
            height=400
        )
        return fig.to_json()
    
    # Sort services by cost
    services = sorted(top_services.items(), key=lambda x: x[1], reverse=True)
    service_names = [s[0] for s in services]
    service_costs = [s[1] for s in services]
    
    # Create labels with costs
    labels = [f"{name} (${cost:,.0f})" for name, cost in zip(service_names, service_costs)]
    
    # Create pie chart
    fig = go.Figure(data=[go.Pie(
        labels=labels,
        values=service_costs,
        hole=.4,
        textinfo='percent',
        textposition='inside'
    )])
    
    fig.update_layout(
        title="Service Cost Breakdown",
        height=400,
        margin=dict(l=20, r=20, t=60, b=20),
        legend=dict(
            orientation="h",
            yanchor="bottom",
            y=-0.2,
            xanchor="center",
            x=0.5
        )
    )
    
    return fig.to_json()

def create_savings_chart(monthly_costs, est_savings):
    """Create a chart showing current vs optimized costs"""
    months, month_labels = list(range(len(monthly_costs))), [f"Month {i+1}" for i in range(len(monthly_costs))]
    optimal_costs = [cost - est_savings if cost > est_savings else 0 for cost in monthly_costs]
    
    fig = go.Figure()
    
    # Add current costs line
    fig.add_trace(go.Scatter(
        x=month_labels,
        y=monthly_costs,
        mode='lines+markers',
        name='Current Cost',
        line=dict(color='#1f77b4', width=3),
        marker=dict(size=8)
    ))
    
    # Add optimal costs line
    fig.add_trace(go.Scatter(
        x=month_labels,
        y=optimal_costs,
        mode='lines+markers',
        name='Optimal Cost',
        line=dict(color='#2ca02c', width=3, dash='dash'),
        marker=dict(size=8)
    ))
    
    # Add filled area for potential savings
    fig.add_trace(go.Scatter(
        x=month_labels,
        y=monthly_costs,
        mode='none',
        name='Potential Savings',
        fill='tonexty',
        fillcolor='rgba(255, 127, 14, 0.3)'
    ))
    
    fig.update_layout(
        title="Current vs. Optimal Costs",
        xaxis_title="Month",
        yaxis_title="Cost (AUD)",
        height=400,
        margin=dict(l=40, r=40, t=60, b=40),
        plot_bgcolor='rgba(0,0,0,0)',
        legend=dict(
            orientation="h",
            yanchor="bottom",
            y=1.02,
            xanchor="right",
            x=1
        )
    )
    
    return fig.to_json()

def create_savings_breakdown_chart(est_savings):
    """Create a chart showing breakdown of potential savings"""
    # Sample savings categories (in a real implementation, this would come from actual recommendations)
    savings_categories = {
        'Resource Sizing': 0.35,
        'Reserved Instances': 0.25,
        'Unused Resources': 0.20,
        'Storage Optimization': 0.15,
        'Other': 0.05
    }
    
    # Convert to absolute values based on estimated savings
    for category in savings_categories:
        savings_categories[category] *= est_savings
    
    categories = list(savings_categories.keys())
    values = list(savings_categories.values())
    
    fig = go.Figure(go.Bar(
        x=values,
        y=categories,
        orientation='h',
        text=[f'${v:,.0f}' for v in values],
        textposition='auto',
        marker_color='#1f77b4'
    ))
    
    fig.update_layout(
        title="Estimated Savings Breakdown",
        xaxis_title="Monthly Savings (AUD)",
        height=300,
        margin=dict(l=40, r=40, t=60, b=40),
        plot_bgcolor='rgba(0,0,0,0)'
    )
    
    return fig.to_json()

def generate_recommendations(advisor_score, top_services):
    """Generate recommendations based on advisor score and services"""
    if advisor_score == 'High':
        priority = "Critical"
        key_recs = [
            "Immediate action needed to address cost inefficiencies",
            "Schedule a cost optimization workshop within 2 weeks",
            "Review all VMs for right-sizing opportunities",
            "Implement auto-shutdown for non-production VMs",
            "Review storage accounts for unused/unattached disks",
            "Consider Azure Reservations for stable workloads"
        ]
    elif advisor_score == 'Medium':
        priority = "Important"
        key_recs = [
            "Review cost optimization within the next month",
            "Analyze VM usage patterns and right-size as needed",
            "Implement resource tagging for better cost allocation",
            "Move infrequently accessed storage to cooler tiers",
            "Review and optimize network resources",
            "Evaluate reserved instances for consistent workloads"
        ]
    else:  # Low
        priority = "Recommended"
        key_recs = [
            "Continue regular cost monitoring practices",
            "Schedule quarterly cost optimization reviews",
            "Implement governance policies to maintain efficiency",
            "Document current cost optimization strategies",
            "Consider advanced cost allocation mechanisms",
            "Explore new Azure features that may reduce costs"
        ]
    
    # Generate service-specific recommendations
    service_recs = []
    
    for service, cost in sorted(top_services.items(), key=lambda x: x[1], reverse=True):
        service_lower = service.lower()
        
        if any(x in service_lower for x in ['virtual machine', 'vm']):
            service_recs.append({
                "service": service,
                "cost": cost,
                "recommendations": [
                    "Right-size VMs based on CPU and memory utilization metrics",
                    "Implement auto-shutdown schedules for development/test machines",
                    "Consider Azure Reserved VM Instances for consistent workloads"
                ]
            })
        elif any(x in service_lower for x in ['storage', 'blob']):
            service_recs.append({
                "service": service,
                "cost": cost,
                "recommendations": [
                    "Implement lifecycle management policies to move data to cooler tiers",
                    "Delete unused snapshots and unattached disks",
                    "Optimize premium storage usage for only critical workloads"
                ]
            })
        elif any(x in service_lower for x in ['sql', 'database']):
            service_recs.append({
                "service": service,
                "cost": cost,
                "recommendations": [
                    "Right-size SQL databases based on performance metrics",
                    "Consider elastic pools for multiple smaller databases",
                    "Implement auto-pause for development/test databases"
                ]
            })
        elif any(x in service_lower for x in ['app service', 'function']):
            service_recs.append({
                "service": service,
                "cost": cost,
                "recommendations": [
                    "Consolidate App Service Plans where possible",
                    "Scale down dev/test environments during non-business hours",
                    "Consider consumption plan for Functions with intermittent usage"
                ]
            })
        else:
            service_recs.append({
                "service": service,
                "cost": cost,
                "recommendations": [
                    "Review service usage patterns for optimization opportunities",
                    "Ensure proper resource governance and tagging",
                    "Implement cost alerting for unexpected increases"
                ]
            })
    
    return {
        "priority": priority,
        "key_recommendations": key_recs,
        "service_recommendations": service_recs
    }

def generate_html_report(subscription_row, df):
    """Generate an HTML report for a subscription"""
    # Extract necessary data
    months, month_labels, monthly_costs = get_month_data(df, subscription_row)
    est_savings = subscription_row['Est. Monthly Savings']
    advisor_score = subscription_row['Advisor Score']
    subscription_name = subscription_row['Subscription Name']
    subscription_id = subscription_row['Subscription ID']
    
    # Parse top services
    top_services = extract_service_costs(subscription_row['Top 5 Services (Current Month)'])
    
    # Calculate key metrics
    current_month_cost = monthly_costs[-1] if monthly_costs else 0
    max_month_cost = max(monthly_costs) if monthly_costs else 0
    min_month_cost = min(monthly_costs) if monthly_costs else 0
    avg_month_cost = np.mean(monthly_costs) if monthly_costs else 0
    total_annual_est = avg_month_cost * 12
    annual_savings_est = est_savings * 12
    
    # Calculate efficiency metrics
    efficiency_ratio = (current_month_cost - est_savings) / current_month_cost * 100 if current_month_cost > 0 else 100
    savings_ratio = 100 - efficiency_ratio
    
    # Determine efficiency rating
    if savings_ratio > 25:
        efficiency_rating = "Poor"
        efficiency_color = "#d9534f"  # Red
    elif savings_ratio > 10:
        efficiency_rating = "Average"
        efficiency_color = "#f0ad4e"  # Yellow
    elif savings_ratio > 0:
        efficiency_rating = "Good"
        efficiency_color = "#5cb85c"  # Green
    else:
        efficiency_rating = "Excellent"
        efficiency_color = "#428bca"  # Blue
    
    # Create charts
    monthly_trend_chart = create_monthly_trend_chart(month_labels, monthly_costs)
    services_pie_chart = create_services_pie_chart(subscription_row)
    savings_chart = create_savings_chart(monthly_costs, est_savings)
    savings_breakdown_chart = create_savings_breakdown_chart(est_savings)
    
    # Generate recommendations
    recommendations = generate_recommendations(advisor_score, top_services)
    
    # Generate comparison data
    # Get average cost and savings across subscriptions
    avg_sub_cost = df[months[-1]].mean() if months else 0
    avg_sub_savings = df['Est. Monthly Savings'].mean()
    
    # Check if current subscription cost is above average
    cost_vs_avg = current_month_cost - avg_sub_cost
    cost_vs_avg_pct = (cost_vs_avg / avg_sub_cost * 100) if avg_sub_cost > 0 else 0
    cost_vs_avg_class = "text-danger" if cost_vs_avg > 0 else "text-success"
    
    # Check if current subscription savings is above average
    savings_vs_avg = est_savings - avg_sub_savings
    savings_vs_avg_pct = (savings_vs_avg / avg_sub_savings * 100) if avg_sub_savings > 0 else 0
    savings_vs_avg_class = "text-danger" if savings_vs_avg > 0 else "text-success"
    
    # Create template data
    template_data = {
        "subscription_name": subscription_name,
        "subscription_id": subscription_id,
        "current_date": datetime.now().strftime("%Y-%m-%d"),
        "advisor_score": advisor_score,
        
        # Charts (as JSON for plotly.js)
        "monthly_trend_chart": monthly_trend_chart,
        "services_pie_chart": services_pie_chart,
        "savings_chart": savings_chart,
        "savings_breakdown_chart": savings_breakdown_chart,
        
        # Key metrics
        "current_month_cost": current_month_cost,
        "current_month_cost_formatted": format_currency(current_month_cost),
        "max_month_cost": max_month_cost,
        "max_month_cost_formatted": format_currency(max_month_cost),
        "min_month_cost": min_month_cost,
        "min_month_cost_formatted": format_currency(min_month_cost),
        "avg_month_cost": avg_month_cost,
        "avg_month_cost_formatted": format_currency(avg_month_cost),
        "est_monthly_savings": est_savings,
        "est_monthly_savings_formatted": format_currency(est_savings),
        "annual_cost_est": total_annual_est,
        "annual_cost_est_formatted": format_currency(total_annual_est),
        "annual_savings_est": annual_savings_est,
        "annual_savings_est_formatted": format_currency(annual_savings_est),
        
        # Efficiency metrics
        "efficiency_ratio": efficiency_ratio,
        "savings_ratio": savings_ratio,
        "efficiency_rating": efficiency_rating,
        "efficiency_color": efficiency_color,
        
        # Comparison data
        "avg_sub_cost": avg_sub_cost,
        "avg_sub_cost_formatted": format_currency(avg_sub_cost),
        "cost_vs_avg": cost_vs_avg,
        "cost_vs_avg_formatted": format_currency(abs(cost_vs_avg)),
        "cost_vs_avg_pct": cost_vs_avg_pct,
        "cost_vs_avg_class": cost_vs_avg_class,
        "avg_sub_savings": avg_sub_savings,
        "avg_sub_savings_formatted": format_currency(avg_sub_savings),
        "savings_vs_avg": savings_vs_avg,
        "savings_vs_avg_formatted": format_currency(abs(savings_vs_avg)),
        "savings_vs_avg_pct": savings_vs_avg_pct,
        "savings_vs_avg_class": savings_vs_avg_class,
        
        # Recommendations
        "recommendations": recommendations
    }
    
    # Render template
    template = template_env.get_template("azure_cost_report_template.html")
    return template.render(**template_data)

def create_html_template():
    """Create the HTML template file if it doesn't exist"""
    template_path = "azure_cost_report_template.html"
    
    if os.path.exists(template_path):
        print(f"Template already exists at {template_path}")
        return template_path
    
    print(f"Creating HTML template at {template_path}")
    
    template_content = """<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Azure Cost Report - {{ subscription_name }}</title>
    <script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
    <style>
        @page {
            size: A4;
            margin: 1cm;
        }
        body {
            font-family: 'Segoe UI', Arial, sans-serif;
            line-height: 1.6;
            color: #333;
            max-width: 1170px;
            margin: 0 auto;
            background-color: #f8f9fa;
            padding: 0;
        }
        .report-container {
            background-color: white;
            box-shadow: 0 4px 6px rgba(0,0,0,0.1);
            margin: 0;
            padding: 0;
            page-break-after: always;
        }
        .header {
            background-color: #0078d4;
            color: white;
            padding: 20px;
            position: relative;
            display: flex;
            align-items: center;
        }
        .logo-container {
            flex: 0 0 auto;
            margin-right: 20px;
            display: flex;
            align-items: center;
            justify-content: center;
        }
        .company-logo {
            height: 60px;
            width: auto;
            background-color: white;
            border-radius: 5px;
            padding: 5px;
        }
        .header-text {
            flex: 1 1 auto;
        }
        .header h1 {
            margin: 0;
            font-size: 24px;
        }
        .header p {
            margin: 5px 0 0;
            opacity: 0.9;
            font-size: 14px;
        }
        .date-stamp {
            position: absolute;
            top: 20px;
            right: 20px;
            font-size: 14px;
            opacity: 0.9;
        }
        .content {
            padding: 20px;
        }
        .section {
            margin-bottom: 30px;
        }
        .section-title {
            color: #0078d4;
            border-bottom: 1px solid #ddd;
            padding-bottom: 8px;
            font-size: 18px;
            font-weight: bold;
        }
        .card {
            background-color: white;
            border-radius: 5px;
            box-shadow: 0 2px 4px rgba(0,0,0,0.05);
            padding: 15px;
            margin-bottom: 20px;
        }
        .chart-container {
            width: 100%;
            height: 400px;
        }
        .metrics-grid {
            display: grid;
            grid-template-columns: repeat(2, 1fr);
            grid-gap: 15px;
        }
        .metric-card {
            background-color: #f8f9fa;
            border-radius: 5px;
            padding: 15px;
            text-align: center;
        }
        .metric-value {
            font-size: 24px;
            font-weight: bold;
            color: #0078d4;
            margin: 10px 0;
        }
        .metric-label {
            font-size: 14px;
            color: #666;
        }
        .recommendations-list {
            list-style-type: none;
            padding: 0;
        }
        .recommendations-list li {
            padding: 8px 0;
            border-bottom: 1px solid #f0f0f0;
        }
        .recommendations-list li:last-child {
            border-bottom: none;
        }
        .service-rec {
            background-color: #f8f9fa;
            border-left: 4px solid #0078d4;
            padding: 15px;
            margin-bottom: 15px;
        }
        .service-rec h4 {
            margin-top: 0;
            display: flex;
            justify-content: space-between;
        }
        .service-cost {
            color: #0078d4;
            font-weight: bold;
        }
        .footer {
            text-align: center;
            padding: 15px;
            font-size: 12px;
            color: #666;
            border-top: 1px solid #ddd;
        }
        .efficiency-badge {
            display: inline-block;
            padding: 5px 10px;
            border-radius: 20px;
            color: white;
            font-weight: bold;
        }
        .comparison-block {
            padding: 15px;
            border-radius: 5px;
            margin-top: 10px;
        }
        .text-danger {
            color: #d9534f;
        }
        .text-success {
            color: #5cb85c;
        }
        .two-col {
            display: grid;
            grid-template-columns: 1fr 1fr;
            grid-gap: 20px;
        }
        @media print {
            body {
                background-color: white;
            }
            .report-container {
                box-shadow: none;
            }
        }
    </style>
</head>
<body>

<!-- PAGE 1: Summary & Trend Analysis -->
<div class="report-container">
    <div class="header">
        <div class="logo-container">
            <img src="https://logo.stocklight.com/ASX/VEA.png" alt="VEA Logo" class="company-logo">
        </div>
        <div class="header-text">
            <h1>Azure Cost Report - {{ subscription_name }}</h1>
            <p>Subscription ID: {{ subscription_id }}</p>
        </div>
        <div class="date-stamp">Generated: {{ current_date }}</div>
    </div>
    
    <div class="content">
        <div class="section">
            <h2 class="section-title">Executive Summary</h2>
            <div class="two-col">
                <div class="card">
                    <h3>Cost Overview</h3>
                    <div class="metrics-grid">
                        <div class="metric-card">
                            <div class="metric-label">Current Month</div>
                            <div class="metric-value">{{ current_month_cost_formatted }}</div>
                        </div>
                        <div class="metric-card">
                            <div class="metric-label">Monthly Average</div>
                            <div class="metric-value">{{ avg_month_cost_formatted }}</div>
                        </div>
                        <div class="metric-card">
                            <div class="metric-label">Annual Projection</div>
                            <div class="metric-value">{{ annual_cost_est_formatted }}</div>
                        </div>
                        <div class="metric-card">
                            <div class="metric-label">Potential Annual Savings</div>
                            <div class="metric-value">{{ annual_savings_est_formatted }}</div>
                        </div>
                    </div>
                </div>
                <div class="card">
                    <h3>Efficiency Rating</h3>
                    <div style="text-align: center; padding: 20px 0;">
                        <div class="efficiency-badge" style="background-color: {{ efficiency_color }}; font-size: 18px; padding: 10px 20px;">
                            {{ efficiency_rating }}
                        </div>
                        <p style="margin-top: 20px;">
                            Current efficiency: {{ efficiency_ratio|round(1) }}%<br>
                            Potential monthly savings: {{ est_monthly_savings_formatted }} ({{ savings_ratio|round(1) }}% of current cost)
                        </p>
                        <p>Advisor Score: <strong>{{ advisor_score }}</strong></p>
                    </div>
                </div>
            </div>
        </div>

        <div class="section">
            <h2 class="section-title">Monthly Cost Trend</h2>
            <div class="card">
                <div id="monthlyTrendChart" class="chart-container"></div>
            </div>
        </div>
        
        <div class="section">
            <h2 class="section-title">Subscription Comparison</h2>
            <div class="card">
                <div class="two-col">
                    <div>
                        <h3>Cost vs. Average</h3>
                        <p>
                            Your cost: <strong>{{ current_month_cost_formatted }}</strong><br>
                            Average subscription cost: <strong>{{ avg_sub_cost_formatted }}</strong>
                        </p>
                        <div class="comparison-block" style="background-color: #f8f9fa;">
                            <span class="{{ cost_vs_avg_class }}">
                                {% if cost_vs_avg > 0 %}
                                    Your subscription costs <strong>{{ cost_vs_avg_formatted }}</strong> more than average ({{ cost_vs_avg_pct|round(1) }}% higher)
                                {% else %}
                                    Your subscription costs <strong>{{ cost_vs_avg_formatted }}</strong> less than average ({{ cost_vs_avg_pct|round(1)|abs }}% lower)
                                {% endif %}
                            </span>
                        </div>
                    </div>
                    <div>
                        <h3>Savings Potential vs. Average</h3>
                        <p>
                            Your potential savings: <strong>{{ est_monthly_savings_formatted }}</strong><br>
                            Average potential savings: <strong>{{ avg_sub_savings_formatted }}</strong>
                        </p>
                        <div class="comparison-block" style="background-color: #f8f9fa;">
                            <span class="{{ savings_vs_avg_class }}">
                                {% if savings_vs_avg > 0 %}
                                    Your subscription has <strong>{{ savings_vs_avg_formatted }}</strong> more savings potential ({{ savings_vs_avg_pct|round(1) }}% higher)
                                {% else %}
                                    Your subscription has <strong>{{ savings_vs_avg_formatted }}</strong> less savings potential ({{ savings_vs_avg_pct|round(1)|abs }}% lower)
                                {% endif %}
                            </span>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
    
    <div class="footer">
        Page 1 of 3 | Azure Cost Optimization Report
    </div>
</div>

<!-- PAGE 2: Service Breakdown & Savings Analysis -->
<div class="report-container">
    <div class="header">
        <div class="logo-container">
            <img src="https://logo.stocklight.com/ASX/VEA.png" alt="VEA Logo" class="company-logo">
        </div>
        <div class="header-text">
            <h1>Service Breakdown - {{ subscription_name }}</h1>
            <p>Subscription ID: {{ subscription_id }}</p>
        </div>
        <div class="date-stamp">Generated: {{ current_date }}</div>
    </div>
    
    <div class="content">
        <div class="section">
            <h2 class="section-title">Service Cost Distribution</h2>
            <div class="card">
                <div id="servicesChart" class="chart-container"></div>
            </div>
        </div>
        
        <div class="section">
            <h2 class="section-title">Savings Opportunity</h2>
            <div class="two-col">
                <div class="card">
                    <h3>Current vs. Optimal Costs</h3>
                    <div id="savingsChart" class="chart-container"></div>
                </div>
                <div class="card">
                    <h3>Savings Breakdown</h3>
                    <div id="savingsBreakdownChart" class="chart-container"></div>
                </div>
            </div>
        </div>
        
        <div class="section">
            <h2 class="section-title">Key Metrics</h2>
            <div class="two-col">
                <div class="card">
                    <h3>Cost Metrics</h3>
                    <table style="width: 100%; border-collapse: collapse;">
                        <tr>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0;">Current Month Cost</td>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0; text-align: right; font-weight: bold;">{{ current_month_cost_formatted }}</td>
                        </tr>
                        <tr>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0;">Average Monthly Cost</td>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0; text-align: right; font-weight: bold;">{{ avg_month_cost_formatted }}</td>
                        </tr>
                        <tr>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0;">Highest Monthly Cost</td>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0; text-align: right; font-weight: bold;">{{ max_month_cost_formatted }}</td>
                        </tr>
                        <tr>
                            <td style="padding: 8px;">Lowest Monthly Cost</td>
                            <td style="padding: 8px; text-align: right; font-weight: bold;">{{ min_month_cost_formatted }}</td>
                        </tr>
                    </table>
                </div>
                <div class="card">
                    <h3>Savings Metrics</h3>
                    <table style="width: 100%; border-collapse: collapse;">
                        <tr>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0;">Monthly Savings Potential</td>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0; text-align: right; font-weight: bold;">{{ est_monthly_savings_formatted }}</td>
                        </tr>
                        <tr>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0;">Annual Savings Potential</td>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0; text-align: right; font-weight: bold;">{{ annual_savings_est_formatted }}</td>
                        </tr>
                        <tr>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0;">Advisor Score</td>
                            <td style="padding: 8px; border-bottom: 1px solid #f0f0f0; text-align: right; font-weight: bold;">{{ advisor_score }}</td>
                        </tr>
                        <tr>
                            <td style="padding: 8px;">Cost Efficiency</td>
                            <td style="padding: 8px; text-align: right; font-weight: bold;">{{ efficiency_ratio|round(1) }}%</td>
                        </tr>
                    </table>
                </div>
            </div>
        </div>
    </div>
    
    <div class="footer">
        Page 2 of 3 | Azure Cost Optimization Report
    </div>
</div>

<!-- PAGE 3: Recommendations -->
<div class="report-container">
    <div class="header">
        <div class="logo-container">
            <img src="https://logo.stocklight.com/ASX/VEA.png" alt="VEA Logo" class="company-logo">
        </div>
        <div class="header-text">
            <h1>Optimization Recommendations - {{ subscription_name }}</h1>
            <p>Subscription ID: {{ subscription_id }}</p>
        </div>
        <div class="date-stamp">Generated: {{ current_date }}</div>
    </div>
    
    <div class="content">
        <div class="section">
            <h2 class="section-title">Key Recommendations <span style="background-color: {{ efficiency_color }}; color: white; font-size: 12px; padding: 3px 10px; border-radius: 20px; margin-left: 10px;">{{ recommendations.priority }}</span></h2>
            <div class="card">
                <ul class="recommendations-list">
                    {% for rec in recommendations.key_recommendations %}
                    <li>{{ rec }}</li>
                    {% endfor %}
                </ul>
            </div>
        </div>
        
        <div class="section">
            <h2 class="section-title">Service-Specific Recommendations</h2>
            {% for service_rec in recommendations.service_recommendations %}
            <div class="service-rec">
                <h4>
                    {{ service_rec.service }} 
                    <span class="service-cost">${{ service_rec.cost|round(2) }}</span>
                </h4>
                <ul>
                    {% for rec in service_rec.recommendations %}
                    <li>{{ rec }}</li>
                    {% endfor %}
                </ul>
            </div>
            {% endfor %}
        </div>
        
        <div class="section">
            <h2 class="section-title">Next Steps</h2>
            <div class="card">
                <ol style="margin-left: 20px; padding-left: 0;">
                    <li>Export Azure Advisor recommendations to Excel for detailed analysis</li>
                    <li>Create a cost optimization plan targeting {{ est_monthly_savings_formatted }} monthly savings</li>
                    <li>Schedule resource review with technical teams</li>
                    <li>Implement automated cost reports and alerts</li>
                    <li>Develop a cloud financial operations playbook</li>
                </ol>
            </div>
        </div>
    </div>
    
    <div class="footer">
        Page 3 of 3 | Azure Cost Optimization Report
    </div>
</div>

<script>
    // Parse the chart JSON data
    const monthlyTrendChart = JSON.parse('{{ monthly_trend_chart|safe }}');
    const servicesChart = JSON.parse('{{ services_pie_chart|safe }}');
    const savingsChart = JSON.parse('{{ savings_chart|safe }}');
    const savingsBreakdownChart = JSON.parse('{{ savings_breakdown_chart|safe }}');
    
    // Render the charts
    Plotly.newPlot('monthlyTrendChart', monthlyTrendChart.data, monthlyTrendChart.layout);
    Plotly.newPlot('servicesChart', servicesChart.data, servicesChart.layout);
    Plotly.newPlot('savingsChart', savingsChart.data, savingsChart.layout);
    Plotly.newPlot('savingsBreakdownChart', savingsBreakdownChart.data, savingsBreakdownChart.layout);
</script>
</body>
</html>
"""
    
    with open(template_path, "w") as f:
        f.write(template_content)
    
    return template_path

def main(csv_file, output_folder, limit=None):
    """Main function to generate HTML reports and convert to PDF"""
    # Create the template file if needed
    template_path = create_html_template()
    
    # Create output folder if it doesn't exist
    os.makedirs(output_folder, exist_ok=True)
    
    # Read the CSV file
    print(f"Reading CSV file: {csv_file}")
    df = pd.read_csv(csv_file)
    
    # Print column names for debugging
    print("CSV columns:", df.columns.tolist())
    print(f"Number of subscriptions: {len(df)}")
    
    # Check if we have month columns
    month_cols = [col for col in df.columns if 'Cost' in col]
    print(f"Month columns: {month_cols}")
    
    if not month_cols:
        raise ValueError("No monthly cost columns found. Column names should include 'Cost'.")
    
    # Limit the number of subscriptions to process if specified
    if limit is not None:
        df = df.head(limit)
        print(f"Limited to processing {limit} subscriptions")
    
    # Process each subscription
    for idx, subscription_row in df.iterrows():
        subscription_name = subscription_row['Subscription Name']
        
        # Create a safe filename (remove invalid characters)
        safe_name = re.sub(r'[\\/*?:"<>|]', "_", subscription_name)
        
        # Generate timestamp
        timestamp = datetime.now().strftime("%Y%m%d_%H%M")
        
        # Set file paths
        html_filename = f'{safe_name}_cost_report_{timestamp}.html'
        html_path = os.path.join(output_folder, html_filename)
        
        pdf_filename = f'{safe_name}_cost_report_{timestamp}.pdf'
        pdf_path = os.path.join(output_folder, pdf_filename)
        
        print(f"Generating report for {subscription_name}...")
        
        # Generate HTML report
        html_content = generate_html_report(subscription_row, df)
        
        # Save HTML file
        with open(html_path, 'w') as f:
            f.write(html_content)
        
        print(f"  HTML report generated: {html_path}")
        
        # Convert HTML to PDF
        try:
            html = weasyprint.HTML(filename=html_path)
            html.write_pdf(pdf_path)
            print(f"  PDF report generated: {pdf_path}")
        except Exception as e:
            print(f"  Error generating PDF: {str(e)}")
            print(f"  Please manually convert the HTML report to PDF.")
    
    print(f"\nAll reports generated successfully in folder: {output_folder}")
    return output_folder

if __name__ == '__main__':
    # Set up command line argument parsing
    parser = argparse.ArgumentParser(description='Generate Azure Cost Report PDFs (one per subscription)')
    parser.add_argument('-i', '--input', required=True, help='Input CSV file with Azure cost data')
    parser.add_argument('-o', '--output', default='./subscription_reports', help='Output folder for reports')
    parser.add_argument('-l', '--limit', type=int, help='Limit to process only a specific number of subscriptions (for testing)')
    
    args = parser.parse_args()
    
    # Generate reports
    main(args.input, args.output, args.limit)

Sending Via Email
This post will not cover this step but you can add this on quite easily and I have for many of the organisations I deal with on an a daily basis. If your subscriptions dont have accurate billing contact details you can get creative. In the most recent implementation of this script I am looking at a Cost-Centre key value tag on the subscription and making a call to an ERP to get the billing owners email address. There are many ways to do this.

The one initial struggle I had with this was sending HTML based emails with images, ensure your images are CI-D based and not BASE64. A CID email image (Content-ID image) is an image attached to an email and referenced within its HTML content using a unique content ID, allowing it to be displayed inline within the email body instead of as a separate attachment. Email clients like Outlook and Gmail block base64-encoded images for security reasons. This means that while base64 embedded images work fine in HTML when viewed in a browser and in PDFs, they won’t display in emails.



My email conversion code is as follows and is in two parts. The first part is my SMTP settings, and for that I am using Google Apps and the second is my email sending code which converts my HTML Base64 inages in to CID and send the emails

SMTP Sending Code – save this email_config.json

{
    "recipients": ["first.lastname@yourdomain.com"],
    "smtp_server": "smtp.gmail.com",
    "smtp_port": 465,
    "smtp_username": "first.lastname@yourdomain.com",
    "smtp_password": "xxxxxxxxxxxxxx",
    "sender_email": "Azure Cost Reports <no-reply@yourdomain.com>"
}

Email Sending Code

#!/usr/bin/env python3
"""
Email Sender for Azure Cost Reports

This script handles sending HTML reports via email
with proper image embedding and PDF attachments.
"""

import os
import smtplib
import argparse
import json
import csv
import logging
import requests
import glob
import base64
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage
import re
from datetime import datetime
from logo_handler import get_absolute_logo_path

# Set up logging
logging.basicConfig(level=logging.INFO, 
                   format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

def load_email_config(config_path):
    """Load email configuration from a JSON file"""
    try:
        with open(config_path, 'r') as f:
            return json.load(f)
    except Exception as e:
        logger.error(f"Failed to load email config: {str(e)}")
        return None

def load_costcentre_mapping(mapping_path):
    """Load the mapping from CostCentre to email addresses"""
    costcentre_mapping = {}
    try:
        with open(mapping_path, 'r') as f:
            reader = csv.reader(f)
            header = next(reader)  # Skip header row
            
            costcentre_idx = header.index('CostCentre') if 'CostCentre' in header else 0
            email_idx = header.index('EmailAddress') if 'EmailAddress' in header else 1
            
            for row in reader:
                if len(row) > max(costcentre_idx, email_idx):
                    costcentre = row[costcentre_idx].strip()
                    email = row[email_idx].strip()
                    if costcentre and email:
                        costcentre_mapping[costcentre] = email
        
        return costcentre_mapping
    except Exception as e:
        logger.error(f"Failed to load CostCentre mapping: {str(e)}")
        return {}

def extract_images_from_html(html_path):
    """Extract all image paths referenced in the HTML file"""
    try:
        with open(html_path, 'r') as f:
            html_content = f.read()
        
        # Find all image references in the HTML
        img_pattern = r'<img src="([^"]+)"'
        matches = re.findall(img_pattern, html_content)
        
        # Convert relative paths to absolute paths
        base_dir = os.path.dirname(os.path.abspath(html_path))
        image_paths = []
        remote_images = []
        
        for match in matches:
            if match.startswith('http'):
                # Keep track of remote images for later handling
                remote_images.append(match)
                continue
            img_path = os.path.join(base_dir, match)
            if os.path.exists(img_path):
                image_paths.append(img_path)
            else:
                logger.warning(f"Image not found: {img_path}")
        
        return image_paths, remote_images
    
    except Exception as e:
        logger.error(f"Error extracting images from HTML: {str(e)}")
        return []

def send_email_with_report(html_path, pdf_path, recipient, config):
    """Send an email with the HTML report in body only - NO PDF ATTACHMENT - FIXED VERSION"""
    import traceback  # Import traceback for better error reporting
    try:
        # Extract subscription name from the filename
        filename = os.path.basename(html_path)
        subscription_name = filename.split('_cost_report_')[0]
        
        # Create email message
        msg = MIMEMultipart('related')
        msg['Subject'] = f"Azure Cost Report - {subscription_name}"
        # Use sender_email if available, otherwise use from_email
        msg['From'] = config.get('sender_email', config.get('from_email', 'no-reply@baldacchino.net'))
        msg['To'] = recipient
        
        # Load HTML content
        with open(html_path, 'r') as f:
            html_content = f.read()
        
        # Create alternative part for HTML content
        alt_part = MIMEMultipart('alternative')
        msg.attach(alt_part)
        
        # Add plain text version
        text_content = f"Please see the Azure Cost Report for {subscription_name}. This email contains HTML content with embedded images."
        alt_part.attach(MIMEText(text_content, 'plain'))
        
        # Extract base64 images from HTML
        def extract_base64_images(html):
            base64_pattern = r'src="data:image/png;base64,([^"]+)"'
            matches = re.findall(base64_pattern, html)
            return matches
        
        # Check if we have base64 embedded images
        base64_images = extract_base64_images(html_content)
        has_embedded_images = len(base64_images) > 0
        
        # Save the original HTML content with base64 images for PDF generation
        original_html_content = html_content
        
        if has_embedded_images:
            # With embedded images, use enhanced styling that works in Gmail
            logger.info(f"Found {len(base64_images)} embedded base64 images in {html_path}")
            
            # Add robust styles that work for email
            simple_styles = '''
            <style type="text/css">
                body {
                    font-family: Arial, sans-serif;
                    line-height: 1.6;
                    max-width: 800px;
                    margin: 0 auto;
                    padding: 20px;
                    color: #333;
                }
                
                /* Header styling to match HTML render */
                .header-container {
                    display: flex !important;
                    background-color: #0078D7 !important;
                    padding: 15px !important;
                    margin-bottom: 20px !important;
                }
                
                .logo-container {
                    width: 350px !important;
                    max-width: 40% !important;
                    background-color: white !important;
                    padding: 5px !important;
                    border-radius: 5px !important;
                    margin: 10px !important;
                }
                
                .header-info {
                    color: white !important;
                    padding-left: 20px !important;
                    flex: 1 !important;
                }
                
                h1, h2, h3 {
                    color: #0078D7;
                    margin-top: 30px;
                }
                
                table {
                    border-collapse: collapse;
                    width: 100%;
                    margin: 20px 0;
                }
                
                th, td {
                    border: 1px solid #ddd;
                    padding: 10px;
                    text-align: left;
                }
                
                th {
                    background-color: #f2f2f2;
                }
                
                /* Default styling for chart images */
                img {
                    width: 500px !important;
                    max-width: 100% !important;
                    height: auto !important;
                    display: block !important;
                    margin: 20px auto !important;
                    border-radius: 4px;
                    box-shadow: 0 2px 5px rgba(0,0,0,0.1);
                }
                
                /* Special styling for the logo */
                .logo-container img, img[src*="my_logo"], img[src*="mylogo.png"], img[alt*="logo"], img[src*="logo"] {
                    width: 350px !important;
                    max-width: 60% !important;
                    height: auto !important;
                    margin: 10px 0 !important;
                    box-shadow: none !important;
                    border-radius: 0 !important;
                    background-color: white !important;
                    padding: 15px !important;
                    border: 2px solid white !important;
                }
                
                #monthlyTrendChart, #monthlyChart, #trendChart,
                #servicesChart, #serviceChart,
                #savingsChart, #optimizationChart {
                    margin: 20px auto !important;
                    width: 500px !important;
                    max-width: 100% !important;
                    text-align: center !important;
                }
            </style>
            '''
            
            # Insert simple styles for the email version
            if '<head>' in html_content:
                html_content = html_content.replace('<head>', f'<head>{simple_styles}')
            else:
                html_content = f'<html><head>{simple_styles}</head>{html_content}</html>'
                
            # Make a copy of HTML for email - we NEED to convert to CID references
            # Base64 images directly in emails are often blocked by email clients
            email_html_content = html_content
            
            # Add a class to the logo image and improve header styling
            # This looks for images likely to be the logo based on common patterns
            email_html_content = re.sub(
                r'(<img[^>]*?)(src="[^"]*?(?:logo|VEA|vea)[^"]*?"[^>]*?>)',
                r'\1 class="logo-image" \2',
                email_html_content
            )
            
            # Look for a blue header section that likely contains the logo
            # and subscription details, and add appropriate wrapper classes
            header_pattern = r'(<div[^>]*?style="[^"]*?(?:background|bg)[^"]*?(?:blue|#0078D7)[^"]*?"[^>]*?>)(.*?)(<\/div>)'
            
            def format_header(match):
                header_content = match.group(2)
                
                # Check if there's an image and text in separate sections
                if '<img' in header_content:
                    # Split content into logo and subscription info parts
                    parts = re.split(r'(<img.*?>)', header_content, 1)
                    if len(parts) >= 3:
                        before, img, after = parts[0], parts[1], ''.join(parts[2:])
                        # Wrap logo in its container
                        return f'{match.group(1)}<div class="header-container"><div class="logo-container">{img}</div><div class="header-info">{after}</div></div>{match.group(3)}'
                
                # If we couldn't split it properly, just add the header class
                return f'{match.group(1)}<div class="header-container">{header_content}</div>{match.group(3)}'
            
            # Apply the header formatting if a blue header is found
            email_html_content = re.sub(header_pattern, format_header, email_html_content, flags=re.DOTALL)
                
            # Create a new version with CID references for emails - this is critical for emails
            logger.info("Converting base64 images to CID references for email compatibility")
            
            for idx, base64_data in enumerate(base64_images):
                img_id = f"image_{idx}"
                img_cid = f"<{img_id}>"
                
                try:
                    # Decode base64 image
                    img_data = base64.b64decode(base64_data)
                    
                    # Create image attachment with Content-ID
                    img = MIMEImage(img_data)
                    img.add_header('Content-ID', img_cid)
                    img.add_header('Content-Disposition', 'inline', filename=f"chart_{idx}.png")
                    msg.attach(img)
                    
                    # Replace base64 data with CID reference ONLY in the email HTML
                    # The replace operation needs to handle the full image tag with src attribute
                    pattern = f'src="data:image/png;base64,{base64_data}"'
                    replacement = f'src="cid:{img_id}"'
                    
                    count_before = email_html_content.count(pattern)
                    email_html_content = email_html_content.replace(pattern, replacement)
                    count_after = email_html_content.count(replacement)
                    
                    logger.info(f"Successfully converted image {idx} to CID reference (replaced {count_before} occurrences)")
                    print(f"Image {idx}: Replaced {count_before} base64 references with CID")
                except Exception as e:
                    logger.error(f"Failed to convert base64 image {idx}: {str(e)}")
                    print(f"ERROR converting base64 image {idx}: {str(e)}")
                    traceback.print_exc()  # Print the full exception for debugging
                    # If conversion fails, the base64 will remain, which is better than no image            # Use the modified version with CID references for emails
            html_content_for_email = email_html_content
            
            # Add the HTML part with CID references
            logger.info("Using HTML content with CID references for email body")
            html_part = MIMEText(html_content_for_email, 'html')
            alt_part.attach(html_part)
        
        # If we don't have embedded images, try to extract and embed them
        image_paths = []
        remote_images = []
        img_cids = {}
        
        # Add HTML version with CID references for embedded images
        # Use html_content_for_email if we processed base64 images, otherwise use original html_content
        html_to_send = html_content_for_email if 'html_content_for_email' in locals() else html_content
        logger.info("Using HTML content with CID references for email body")
                
        # Add the HTML part with CID references for images
        html_part = MIMEText(html_to_send, 'html')
        alt_part.attach(html_part)
        
        if not has_embedded_images:
            # Extract and embed images
            image_paths, remote_images = extract_images_from_html(html_path)
            logger.info(f"Found {len(image_paths)} local images and {len(remote_images)} remote images to embed")
        
        # First handle local image files
        for i, img_path in enumerate(image_paths):
            img_id = f"image_{i}"
            img_cid = f"<{img_id}>"
            
            try:
                with open(img_path, 'rb') as img_file:
                    img_data = img_file.read()
                    img = MIMEImage(img_data)
                    img.add_header('Content-ID', img_cid)
                    img.add_header('Content-Disposition', 'inline', filename=os.path.basename(img_path))
                    msg.attach(img)
                    
                    # Store the CID with the image path
                    img_cids[os.path.basename(img_path)] = img_id
            except Exception as e:
                logger.error(f"Failed to embed image {img_path}: {str(e)}")
        
        # Now handle remote images, especially the VEA logo
        import requests
        for i, img_url in enumerate(remote_images):
            img_id = f"remote_image_{i}"
            img_cid = f"<{img_id}>"
            
            try:
                # Special case for VEA logo
                if "logo.stocklight.com/ASX/VEA.png" in img_url or "v2.ai" in img_url:
                    # Get output folder path from the html_path
                    output_folder = os.path.dirname(os.path.abspath(html_path))
                    
                    # Get local logo path using logo_handler
                    logo_path = get_absolute_logo_path(output_folder)
                    
                    if os.path.exists(logo_path):
                        logger.info(f"Using locally cached logo at {logo_path}")
                        with open(logo_path, 'rb') as img_file:
                            img_data = img_file.read()
                    else:
                        # Fallback to direct download
                        logger.info(f"Fetching VEA logo from {img_url}")
                        response = requests.get(img_url, timeout=10)
                        if response.status_code == 200:
                            img_data = response.content
                        else:
                            logger.error(f"Failed to fetch VEA logo: HTTP status {response.status_code}")
                            continue
                    
                    # Create and attach the image
                    img = MIMEImage(img_data)
                    img.add_header('Content-ID', img_cid)
                    img.add_header('Content-Disposition', 'inline', filename="vea_logo.png")
                    msg.attach(img)
                    
                    # Replace the URL with the CID in the HTML content
                    html_content = html_content.replace(f'src="{img_url}"', f'src="cid:{img_id}"')
                    # Also replace any relative logo paths
                    html_content = html_content.replace('src="logos/vea_logo.png"', f'src="cid:{img_id}"')
                else:
                    logger.info(f"Fetching image from {img_url}")
                    response = requests.get(img_url, timeout=10)
                    if response.status_code == 200:
                        img_data = response.content
                        img = MIMEImage(img_data)
                        img.add_header('Content-ID', img_cid)
                        img.add_header('Content-Disposition', 'inline', filename=os.path.basename(img_url))
                        msg.attach(img)
                        
                        # Replace the URL with the CID in the HTML content
                        html_content = html_content.replace(f'src="{img_url}"', f'src="cid:{img_id}"')
                    else:
                        logger.error(f"Failed to fetch image: HTTP status {response.status_code}")
            except Exception as e:
                logger.error(f"Failed to embed remote image {img_url}: {str(e)}")
        
        # For non-base64 images, rewrite HTML to use CIDs for embedded images
        if not has_embedded_images:
            html_for_email = html_content
            for img_name, img_id in img_cids.items():
                html_for_email = html_for_email.replace(f'src="{img_name}"', f'src="cid:{img_id}"')
                html_for_email = html_for_email.replace(f'src="charts/{img_name}"', f'src="cid:{img_id}"')
                # Handle other relative paths
                html_for_email = re.sub(r'src="[^"]*?/([^"/]+)"', lambda m: f'src="cid:{img_cids.get(m.group(1), m.group(0)[5:-1])}"', html_for_email)
            
            # Update the HTML part with rewritten content for email
            # Remove any previous HTML parts
            for part in alt_part.get_payload():
                if part.get_content_type() == 'text/html':
                    alt_part.get_payload().remove(part)
                    
            # Add the final HTML part with properly processed images
            html_part = MIMEText(html_for_email, 'html')
            alt_part.attach(html_part)
            
        # No need to write anything back to the original file
        # The original HTML with base64 is preserved for PDF generation
        
        # PDF attachment is disabled as requested
        # Keep PDF generation in the original process for viewing locally, but don't attach to emails
        logger.info("PDF attachment is disabled - emails will only contain HTML with embedded images")
        
        # Connect to the SMTP server and send the email
        with smtplib.SMTP_SSL(config['smtp_server'], config['smtp_port']) as server:
            server.login(config['smtp_username'], config['smtp_password'])
            server.send_message(msg)
        
        logger.info(f"Email sent successfully to {recipient}")
        return True
        
    except Exception as e:
        logger.error(f"Failed to send email: {str(e)}")
        return False

def process_reports(reports_dir, config_path, mapping_path):
    """Process all reports in a directory and send emails"""
    # Load configurations
    email_config = load_email_config(config_path)
    if not email_config:
        logger.error("Email configuration not loaded. Exiting.")
        return
    
    costcentre_mapping = load_costcentre_mapping(mapping_path) if mapping_path else {}
    
    # Process all HTML files in the directory
    sent_count = 0
    html_files = [f for f in os.listdir(reports_dir) if f.endswith('.html')]
    print(f"Found {len(html_files)} HTML reports to process")
    
    # Define a fallback recipient from the config
    fallback_recipient = email_config.get('default_recipient', 
                                        email_config.get('to_email', 
                                                       email_config.get('recipients', ['shane.baldacchino@v2.ai'])[0]))
    print(f"Using fallback recipient: {fallback_recipient}")
    
    for filename in html_files:
        html_path = os.path.join(reports_dir, filename)
        pdf_path = None  # EXPLICITLY DISABLE PDF ATTACHMENTS
        
        # Extract subscription and cost centre from filename
        parts = filename.split('_')
        subscription_name = parts[0] if len(parts) > 0 else None
        print(f"Processing report for {subscription_name}")
        
        # Get cost centre from the HTML content
        costcentre = None
        try:
            with open(html_path, 'r') as f:
                html_content = f.read()
                costcentre_match = re.search(r'Cost Centre: <strong>([^<]+)</strong>', html_content)
                if costcentre_match:
                    costcentre = costcentre_match.group(1).strip()
                    print(f"Found cost centre: {costcentre}")
        except Exception as e:
            print(f"Error reading HTML content: {str(e)}")
            pass
        
        # Determine recipient
        recipient = fallback_recipient  # Always have a default
        
        if costcentre and costcentre in costcentre_mapping:
            recipient = costcentre_mapping[costcentre]
            print(f"Using mapped recipient for cost centre {costcentre}: {recipient}")
        else:
            print(f"Using default recipient: {recipient}")
        
        # Send the email
        print(f"Sending email to {recipient} for {filename}")
        try:
            success = send_email_with_report(html_path, pdf_path, recipient, email_config)
            if success:
                print(f"Successfully sent email for {filename} to {recipient}")
                sent_count += 1
            else:
                print(f"Failed to send email for {filename} to {recipient}")
        except Exception as e:
            print(f"Error sending email: {str(e)}")
            import traceback
            traceback.print_exc()
    
    print(f"Sent {sent_count} email reports out of {len(html_files)} total HTML files")

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description='Send Azure cost reports via email')
    parser.add_argument('-d', '--dir', dest='directory', required=True, help='Directory containing HTML reports')
    parser.add_argument('-c', '--config', required=True, help='Path to email configuration JSON file')
    parser.add_argument('-m', '--mapping', help='Path to CostCentre to email mapping CSV file')
    
    args = parser.parse_args()
    
    # Print the arguments for debugging
    print(f"Processing reports in: {args.directory}")
    print(f"Using email config: {args.config}")
    print(f"Using mapping file: {args.mapping}")
    
    process_reports(args.directory, args.config, args.mapping)


Executing
This code can be executed either via Python or you can chain the collection of data (see Part 1) in a bash script. The execution will take approximately 4 hours end-2-end and a big part of this time is in collecting the data, the report generation, emailing takes about 30 second per subscription.

In terms of executing the easiest path given how long this process takes is to create a scheduled WebJob on an App Service Plan running Python in Azure or you can even use something as humble as a Raspberry Pi or a server assuming you have Azure CLI and the respective python packages installed.

A bash script to run this is all you need

#!/bin/bash
#
# Azure Cost Report Generator
# --------------------------
# Run this script to generate Azure cost reports and send them via email
#

# Get the directory where the script is located
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
CSV_FILE="$SCRIPT_DIR/azure_cost_report.csv"
OUTPUT_DIR="$SCRIPT_DIR/subscription_reports_$(date +%Y%m%d)"
CONFIG_FILE="$SCRIPT_DIR/email_config.json"

# Check if the CSV file exists
if [ ! -f "$CSV_FILE" ]; then
    echo "Error: CSV file not found at $CSV_FILE"
    echo "Please make sure your Azure cost data CSV file is in the correct location."
    exit 1
fi

# Check if the email config file exists
if [ ! -f "$CONFIG_FILE" ]; then
    echo "Warning: Email config file not found at $CONFIG_FILE"
    echo "Reports will be generated but not emailed."
    EMAIL_ARG=""
else
    EMAIL_ARG="--email --config $CONFIG_FILE"
fi

# Create output directory if it doesn't exist
mkdir -p "$OUTPUT_DIR"

echo "Generating Azure cost reports..."
echo "  CSV File: $CSV_FILE"
echo "  Output Directory: $OUTPUT_DIR"
if [ -n "$EMAIL_ARG" ]; then
    echo "  Email: Enabled"
else
    echo "  Email: Disabled"
fi
echo ""

# Process all subscriptions in batches to manage memory
BATCH_SIZE=200  
BATCH_ARG="--batch-size $BATCH_SIZE"  # Use batch processing instead of limit
echo "Processing all subscriptions in batches of $BATCH_SIZE to manage memory..."

# Run the Python script with batch processing
python3 "$SCRIPT_DIR/azure_cost_html_reporter.py" -i "$CSV_FILE" -o "$OUTPUT_DIR" $BATCH_ARG

if [ $? -eq 0 ]; then
    # Use the improved image fix script that preserves subscription-specific data
    echo "Fixing charts to ensure they're email compatible while preserving subscription-specific data..."
    python3 "$SCRIPT_DIR/simple_image_fix_improved.py" -d "$OUTPUT_DIR"
    
    # Send emails using the email_sender.py script
    if [ -n "$EMAIL_ARG" ]; then
        echo "Sending emails using dedicated email_sender.py script..."
        python3 "$SCRIPT_DIR/email_sender.py" -d "$OUTPUT_DIR" -c "$CONFIG_FILE"
    fi
    
    echo ""
    echo "Reports generated successfully in $OUTPUT_DIR"
    echo "Each report now shows ACCURATE, subscription-specific chart data that works in email clients"
    if [ -n "$EMAIL_ARG" ]; then
        echo "Reports have been emailed according to your configuration."
    fi
else
    echo ""
    echo "Error: Report generation failed."
    exit 1
fi

Summary


Thanks
Shane Baldacchino

Leave a Comment