Chapter 14 - Excel Spreadsheets (Python)

Here's a concise walkthrough of the main ideas in Chapter 14, each with a small example.

Intro and setup

Goal: use openpyxl to read/write .xlsx files from Excel or LibreOffice Calc, automating repetitive spreadsheet tasks.
openpyxl works on files, not the live Excel app; you must close the workbook in Excel (or download/re-upload in Office 365) before/after running scripts.
Install with pip install openpyxl; examples use example3.xlsx, censuspopdata.xlsx, produceSales3.xlsx, etc., from the book resources.

Reading Excel files

Workbooks, sheets, cells

A spreadsheet file is a workbook (.xlsx).
Each workbook has sheets (worksheets); one is the active sheet.
Sheets have lettered columns (A, B, ...) and numbered rows (1, 2, ...).
A cell is identified by its address, like B3.

The example workbook example3.xlsx has sheets Sheet1, Sheet2, Sheet3, with Sheet1 containing a small table of dates, fruit, and quantities.

Opening a workbook

import openpyxl

wb = openpyxl.load_workbook('example3.xlsx')
print(type(wb))  # <class 'openpyxl.workbook.workbook.Workbook'>

Getting sheets from the workbook

wb.sheetnames              # ['Sheet1', 'Sheet2', 'Sheet3']
sheet = wb['Sheet3']       # Worksheet object
print(sheet.title)         # 'Sheet3'

active_sheet = wb.active   # last viewed sheet in Excel
print(active_sheet.title)

Sheets are Worksheet objects, accessible like dict keys by name, or via wb.active.

Getting cells from a sheet

import openpyxl

wb = openpyxl.load_workbook('example3.xlsx')
sheet = wb['Sheet1']

print(sheet['A1'])           # Cell object
print(sheet['A1'].value)     # datetime(2035, 4, 5, 13, 34, 2)
c = sheet['B1']
print(c.value)               # 'Apples'
print(f'Row {c.row}, Column {c.column} is {c.value}')
print(f'Cell {c.coordinate} is {c.value}')
print(sheet['C1'].value)     # 73

Dates are returned as datetime objects, not strings.

You can also use sheet.cell(row=..., column=...) with 1-based indices:

print(sheet.cell(row=1, column=2).value)  # 'Apples'

for i in range(1, 8, 2):  # every other row
    print(i, sheet.cell(row=i, column=2).value)

Size of the used area:

print(sheet.max_row)     # 7
print(sheet.max_column)  # 3

Converting between column letters and numbers

from openpyxl.utils import get_column_letter, column_index_from_string

print(get_column_letter(1))   # 'A'
print(get_column_letter(27))  # 'AA'
print(get_column_letter(900)) # 'AHP'

print(column_index_from_string('A'))   # 1
print(column_index_from_string('AA'))  # 27

You can also use get_column_letter(sheet.max_column) to get the last column's letter.

Getting ranges, rows, and columns

A rectangular slice:

sheet = wb['Sheet1']

for row_cells in sheet['A1':'C3']:
    for cell in row_cells:
        print(cell.coordinate, cell.value)
    print('--- END OF ROW ---')

You can also use sheet.rows and sheet.columns (convert to list first):

second_col = list(sheet.columns)[1]
for cell in second_col:
    print(cell.value)    # Apples, Cherries, ...

list(sheet.rows) is a list of row tuples; list(sheet.columns) is a list of column tuples.

Review checklist

To read a cell, the steps are:

import openpyxl
wb = openpyxl.load_workbook(...)
Get sheet via wb.active or wb[sheetname].
Get a Worksheet object.
Use indexing (sheet['B3']) or sheet.cell(row=, column=).
Get a Cell object.
Read cell.value.

Project: Gather Census Statistics

Task: from censuspopdata.xlsx, count for each county:

Number of census tracts.
Total population.

Spreadsheet Population by Census Tract has columns: A (tract number), B (state abbreviation), C (county name), D (population).

Step 1: Read spreadsheet data

# readCensusExcel.py
import openpyxl, pprint

print('Opening workbook...')
wb = openpyxl.load_workbook('censuspopdata.xlsx')
sheet = wb['Population by Census Tract']
county_data = {}

print('Reading rows...')
for row in range(2, sheet.max_row + 1):  # skip header
    state = sheet['B' + str(row)].value
    county = sheet['C' + str(row)].value
    pop = sheet['D' + str(row)].value
    # will fill county_data

Step 2: Populate nested dict structure

county_data structure:

{
  'AK': {
    'Anchorage': {'tracts': 55, 'pop': 291826},
    'Bethel':    {'tracts': 3,  'pop': 17013},
  },
}

Code:

for row in range(2, sheet.max_row + 1):
    state = sheet['B' + str(row)].value
    county = sheet['C' + str(row)].value
    pop = sheet['D' + str(row)].value

    county_data.setdefault(state, {})
    county_data[state].setdefault(county, {'tracts': 0, 'pop': 0})
    county_data[state][county]['tracts'] += 1
    county_data[state][county]['pop'] += int(pop)

setdefault ensures nested dictionaries exist before incrementing.

Step 3: Write results to a Python file

Use pprint.pformat to dump the dict as valid Python code.

print('Writing results...')
with open('census2010.py', 'w', encoding='utf-8') as result_file:
    result_file.write('allData = ' + pprint.pformat(county_data))
print('Done.')

Then you can import and use it:

import census2010

anchorage = census2010.allData['AK']['Anchorage']
print(anchorage['pop'], anchorage['tracts'])

Ideas for similar programs

Use this pattern (load sheet → loop rows → build structure) to:

Compare rows within one sheet.
Compare across multiple spreadsheets.
Validate data (blank rows, invalid cells).
Use spreadsheet data as input for other Python programs.

Writing Excel documents

Creating and saving workbooks

import openpyxl

wb = openpyxl.Workbook()       # new blank workbook
print(wb.sheetnames)           # ['Sheet']
sheet = wb.active
print(sheet.title)             # 'Sheet'

sheet.title = 'Spam Bacon Eggs Sheet'
wb.save('example_new.xlsx')

When editing existing files, always save to a different filename to avoid corrupting the original; Excel must not have the file open when you call save().

Creating and deleting sheets

wb = openpyxl.Workbook()
print(wb.sheetnames)               # ['Sheet']

wb.create_sheet()                  # appended, name 'Sheet1'
wb.create_sheet(index=0, title='First Sheet')
wb.create_sheet(index=2, title='Middle Sheet')
print(wb.sheetnames)               # ['First Sheet', 'Sheet', 'Middle Sheet', 'Sheet1']

del wb['Middle Sheet']
del wb['Sheet1']
print(wb.sheetnames)               # ['First Sheet', 'Sheet']

wb.save('multi_sheets.xlsx')

Writing values to cells

wb = openpyxl.Workbook()
sheet = wb['Sheet']

sheet['A1'] = 'Hello, world!'
print(sheet['A1'].value)   # 'Hello, world!'

wb.save('hello.xlsx')

You assign directly to sheet['A1'] or to sheet.cell(row=..., column=...).value.

Project: Update a Spreadsheet

produceSales3.xlsx has rows: Produce (A), Cost per Pound (B), Pounds Sold (C), TOTAL (D) as a formula like =ROUND(B2*C2, 2).

Task: fix incorrect prices for Garlic, Celery, Lemon in thousands of rows.

Step 1: Data structure for updated prices

# updateProduce.py
import openpyxl

wb = openpyxl.load_workbook('produceSales3.xlsx')
sheet = wb['Sheet']

PRICE_UPDATES = {
    'Garlic': 3.07,
    'Celery': 1.19,
    'Lemon':  1.27,
}

Step 2: Loop rows and update

for row_num in range(2, sheet.max_row + 1):   # skip header
    produce_name = sheet.cell(row=row_num, column=1).value
    if produce_name in PRICE_UPDATES:
        sheet.cell(row=row_num, column=2).value = PRICE_UPDATES[produce_name]

wb.save('updatedProduceSales3.xlsx')

Because D uses formulas, Excel recalculates totals automatically when B changes.

Ideas for similar programs

Copy data between spreadsheets.
Pull data from websites/files/clipboard into Excel.
"Clean" spreadsheets: standardize phone numbers, strip whitespace, etc., using regex and text processing.

Styling cells: fonts

Import Font and assign to cell.font.

from openpyxl.styles import Font
import openpyxl

wb = openpyxl.Workbook()
sheet = wb['Sheet']

italic_24_font = Font(size=24, italic=True)
sheet['A1'].font = italic_24_font
sheet['A1'] = 'Hello, world!'

wb.save('styles3.xlsx')

Useful Font keyword args:

name: 'Calibri', 'Times New Roman', ...
size: integer point size.
bold: True/False.
italic: True/False.

Example with two styles:

bold_font = Font(name='Times New Roman', bold=True)
sheet['A1'].font = bold_font
sheet['A1'] = 'Bold Times New Roman'

italic_font = Font(size=24, italic=True)
sheet['B3'].font = italic_font
sheet['B3'] = '24 pt Italic'
wb.save('styles3.xlsx')

Formulas

Formulas are just strings starting with =; openpyxl writes them, Excel calculates them.

Set a formula:

sheet['B9'] = '=SUM(B1:B8)'

Simple example:

import openpyxl

wb = openpyxl.Workbook()
sheet = wb['Sheet']
sheet['A1'] = 200
sheet['A2'] = 300
sheet['A3'] = '=SUM(A1:A2)'
wb.save('writeFormula3.xlsx')

openpyxl does not evaluate the formula; Excel does when you open/save the file.

If you open in Excel, save, then reload with data_only=True, you can read computed values instead of formula text:

# After opening writeFormula3.xlsx in Excel and saving it
wb = openpyxl.load_workbook('writeFormula3.xlsx', data_only=True)
sheet = wb['Sheet']
print(sheet['A3'].value)   # 500 (not '=SUM(A1:A2)')

Overall idea of the chapter

Chapter 14 shows how to automate Excel tasks with openpyxl: reading workbooks/sheets/cells, building data structures from spreadsheet rows (census project), writing and updating cells (produce project), styling with fonts, and writing formulas. The key pattern is: load workbook → loop rows → read or write cells → save.

Intro and setup​

Reading Excel files​

Workbooks, sheets, cells​

Opening a workbook​

Getting sheets from the workbook​

Getting cells from a sheet​

Converting between column letters and numbers​

Getting ranges, rows, and columns​

Review checklist​

Project: Gather Census Statistics​

Step 1: Read spreadsheet data​

Step 2: Populate nested dict structure​

Step 3: Write results to a Python file​

Ideas for similar programs​

Writing Excel documents​

Creating and saving workbooks​

Creating and deleting sheets​

Writing values to cells​

Project: Update a Spreadsheet​

Step 1: Data structure for updated prices​

Step 2: Loop rows and update​

Ideas for similar programs​

Styling cells: fonts​

Formulas​

Overall idea of the chapter​

Intro and setup

Reading Excel files

Workbooks, sheets, cells

Opening a workbook

Getting sheets from the workbook

Getting cells from a sheet

Converting between column letters and numbers

Getting ranges, rows, and columns

Review checklist

Project: Gather Census Statistics

Step 1: Read spreadsheet data

Step 2: Populate nested dict structure

Step 3: Write results to a Python file

Ideas for similar programs

Writing Excel documents

Creating and saving workbooks

Creating and deleting sheets

Writing values to cells

Project: Update a Spreadsheet

Step 1: Data structure for updated prices

Step 2: Loop rows and update

Ideas for similar programs

Styling cells: fonts

Formulas

Overall idea of the chapter