Module 08 — Files, APIs, and Data Ingestion

Graduate MSBA Module Overview

Analytics begins with data, and data lives somewhere: in files, in databases, and in web services and APIs. Before you can analyze anything, you need to get that data into Python. This module covers the primary data ingestion pathways you’ll use throughout your analytics career.

  • File I/O — Read and write CSV, JSON, and text formats
  • APIs — Connect to live data sources: financial markets, business intelligence systems, public datasets
  • Data validation — Verify incoming data before it reaches your analysis pipeline

Understanding how to ingest data programmatically is what separates analysts who work with the data they’re given from analysts who can acquire the data they need.
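To make the data validation bullet concrete, here is a minimal sketch of checking records before they enter an analysis pipeline. The field names, the validation rules, the `validate_record` helper, and the sample records are all assumptions made up for illustration; a real pipeline would define its own rules.

```python
# Hypothetical required fields, chosen only for illustration.
REQUIRED_FIELDS = ("name", "customer_id", "total_spent", "tier")


def validate_record(record):
    """Return a list of problems found in one record (empty list means valid)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in record:
            problems.append(f"missing field: {field}")
    spent = record.get("total_spent")
    # Assumed rule: a present total_spent must be a non-negative number.
    if spent is not None and (not isinstance(spent, (int, float)) or spent < 0):
        problems.append("total_spent must be a non-negative number")
    return problems


# Made-up sample data: one valid record and two deliberately broken ones.
incoming = [
    {"name": "Alice Johnson", "customer_id": 1001, "total_spent": 851.50, "tier": "Gold"},
    {"name": "Bob Martinez", "customer_id": 1002, "total_spent": -5, "tier": "Standard"},
    {"name": "Carol Chen", "customer_id": 1003, "tier": "Gold"},
]

clean = [r for r in incoming if not validate_record(r)]
rejected = [(r["name"], validate_record(r)) for r in incoming if validate_record(r)]

print(f"Accepted {len(clean)} of {len(incoming)} records")
for name, problems in rejected:
    print(f"  Rejected {name}: {'; '.join(problems)}")
```

Returning a list of problems, rather than a single True/False, lets the caller log why a record was rejected instead of silently dropping it.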


Course Connections

Data ingestion connects to all four foundational programming principles. You use iteration to process files record by record. You use inference to validate and filter incoming data. You use abstraction to build reusable ingestion functions. And you use polymorphism to write code that handles multiple data formats with consistent logic.
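One loose way to picture the abstraction and polymorphism points above is a single entry point that hides which format is being read. The sketch below is an assumption about how that might look, not the course's reference implementation; `load_records`, `LOADERS`, and the temporary sample files are hypothetical names and data.

```python
import csv
import json
import tempfile
from pathlib import Path


def load_csv(path):
    """Read a CSV file into a list of dicts (all values arrive as strings)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def load_json(path):
    """Read a JSON file that contains a list of objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Map file extensions to loader functions; supporting a new format means adding one entry.
LOADERS = {".csv": load_csv, ".json": load_json}


def load_records(path):
    """Pick a loader from the file extension and return a list of records."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"Unsupported file type: {suffix}")
    return LOADERS[suffix](path)


# Demo with throwaway files so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "customers.csv"
    json_path = Path(tmp) / "customers.json"
    csv_path.write_text("name,tier\nAlice Johnson,Gold\nBob Martinez,Standard\n", encoding="utf-8")
    json_path.write_text(json.dumps([{"name": "Carol Chen", "tier": "Gold"}]), encoding="utf-8")

    # The same loop works for both formats (iteration over records).
    for path in (csv_path, json_path):
        for record in load_records(path):
            print(f"{path.name}: {record['name']} | Tier: {record['tier']}")
```

The calling code never branches on the format; it only iterates over whatever `load_records` returns.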


Quick Code Example

import json

# Sample records to save; each customer is one dictionary
customer_data = [
    {'name': 'Alice Johnson', 'customer_id': 1001, 'total_spent': 851.50, 'tier': 'Gold'},
    {'name': 'Bob Martinez', 'customer_id': 1002, 'total_spent': 430.50, 'tier': 'Standard'},
    {'name': 'Carol Chen', 'customer_id': 1003, 'total_spent': 890.75, 'tier': 'Gold'}
]

# Write the list to disk as formatted JSON
with open('customers.json', 'w') as f:
    json.dump(customer_data, f, indent=2)

# Read the file back into Python objects
with open('customers.json', 'r') as f:
    loaded_data = json.load(f)

# Report each record, then summary totals
print('Customers loaded from file:')
for customer in loaded_data:
    print(f"  {customer['name']} | Tier: {customer['tier']} | Total: ${customer['total_spent']}")

print(f'\nTotal customers: {len(loaded_data)}')
print(f"Total revenue: ${sum(c['total_spent'] for c in loaded_data):.2f}")

Expected Output:

Customers loaded from file:
  Alice Johnson | Tier: Gold | Total: $851.5
  Bob Martinez | Tier: Standard | Total: $430.5
  Carol Chen | Tier: Gold | Total: $890.75

Total customers: 3
Total revenue: $2172.75
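For comparison, the same round trip can be sketched with the standard library's `csv` module. This block is an added illustration, and the `customers.csv` filename is an assumption. The main difference from JSON is that CSV stores everything as text, so numeric columns have to be converted back after loading.

```python
import csv

customer_data = [
    {'name': 'Alice Johnson', 'customer_id': 1001, 'total_spent': 851.50, 'tier': 'Gold'},
    {'name': 'Bob Martinez', 'customer_id': 1002, 'total_spent': 430.50, 'tier': 'Standard'},
    {'name': 'Carol Chen', 'customer_id': 1003, 'total_spent': 890.75, 'tier': 'Gold'}
]

fieldnames = ['name', 'customer_id', 'total_spent', 'tier']

# newline='' lets the csv module control line endings itself
with open('customers.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(customer_data)

with open('customers.csv', newline='', encoding='utf-8') as f:
    loaded_rows = list(csv.DictReader(f))

# Every value comes back as a string, so restore the numeric types explicitly
for row in loaded_rows:
    row['customer_id'] = int(row['customer_id'])
    row['total_spent'] = float(row['total_spent'])

print(f"Total customers: {len(loaded_rows)}")
print(f"Total revenue: ${sum(r['total_spent'] for r in loaded_rows):.2f}")
```

Skipping the conversion step would leave `total_spent` as text, and `sum()` would raise a `TypeError` instead of producing a total.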

Learning Progression

Platform | What Students Will Do
NotebookLM | Explore data ingestion through business storytelling that shows how data flows from sources into analytics pipelines
Google Colab | Read and write files, parse JSON and CSV data, and make basic API calls to retrieve live data
Zybooks | Complete structured exercises that reinforce file handling patterns and data ingestion workflows

Module Pages

  • Concept → — Deep narrative on files, APIs, and data integrity
  • Advanced → — Extended code with CSV, JSON, and simulated API patterns
  • Notebook → — Jupyter notebook lab description
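
The Advanced page's own code is not reproduced in this overview. As a rough idea of what a simulated API pattern can look like, the sketch below replaces a network request with a local function that returns a status code and a JSON body. The `fake_api_get` and `fetch_customers` helpers, the `/customers` endpoint, and the payload shape are all hypothetical.

```python
import json


def fake_api_get(endpoint):
    """Stand-in for an HTTP GET request: returns (status_code, body_text)."""
    # Hypothetical endpoint and payload, invented for illustration.
    if endpoint == "/customers":
        body = json.dumps({"data": [{"name": "Alice Johnson", "tier": "Gold"}], "count": 1})
        return 200, body
    return 404, json.dumps({"error": "not found"})


def fetch_customers():
    """Call the simulated API, check the status, and parse the JSON body."""
    status, body = fake_api_get("/customers")
    if status != 200:
        raise RuntimeError(f"Request failed with status {status}")
    payload = json.loads(body)
    return payload["data"]


for customer in fetch_customers():
    print(f"{customer['name']} | Tier: {customer['tier']}")
```

Because the response arrives as text, the parsing step is the same `json.loads` call that would be used on the body of a real response; only the transport is simulated.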