# Excel Reports

DBTK provides two Excel writers built on [openpyxl](https://openpyxl.readthedocs.io/):

| Class | Use when |
|---|---|
| `ExcelWriter` | Single or multi-sheet reports with column formatting, styles, and auto-sizing |
| `LinkedExcelWriter` | Reports with internal navigation links or external hyperlinks (extends `ExcelWriter`) |

Both writers keep the workbook open across `write_batch()` calls and save only on context exit, making them efficient for multi-sheet reports.

---

## Quick Start

```python
from dbtk.writers import ExcelWriter

# Single sheet — one shot
ExcelWriter(cursor, 'report.xlsx').write()

stmt = cursor.prepare_file('quarterly_sales.sql')
# Multi-sheet — context manager
with ExcelWriter(file='report.xlsx') as writer:
    for qtr in (1, 2, 3, 4):
        stmt.execute({'quarter': qtr})
        writer.write_batch(stmt, sheet_name=f'Q{qtr}')
# Workbook saved on exit
```

The first `write_batch()` to a sheet writes the header row; subsequent calls to the **same** sheet append rows without repeating it. Writing to a different `sheet_name` starts a new sheet.

### Custom Headers

Pass `headers` explicitly when the column names in the query aren't what you want in the report — for example, Oracle's 30-character column name limit forces aliases that make poor headers, or you simply want friendlier labels:

```python
# Oracle alias → readable header
cursor.execute("SELECT acad_plan_owner_org_id AS owner_org, ...")
ExcelWriter(cursor, 'report.xlsx',
            headers=['Owner Org', ...]).write()

# Any column names → display labels
cursor.execute("SELECT crse_numb, subj_code, cred_hrs FROM courses")
ExcelWriter(cursor, 'courses.xlsx',
            headers=['Course Number', 'Subject', 'Credits']).write()
```

---

## Automatic Behaviors

Out of the box, without any `formatting` configuration, `ExcelWriter` does the following automatically:

- **Bold header row** — column names are written in the first row with bold font
- **Auto-sized columns** — the first 15 data rows are sampled to estimate content width; columns are sized to fit, between 6 and 60 characters wide by default
- **Frozen header row** — the top row is frozen so it stays visible while scrolling (`freeze_panes = 'A2'`)
- **Date and datetime formatting** — `date` values get `YYYY-MM-DD` format; `datetime` values with a non-midnight time get `YYYY-MM-DD HH:MM:SS` format, automatically
- **None → empty cell** — `None` values are written as blank cells rather than the string `"None"`

These defaults are all overridable via the `formatting` parameter. Auto-sizing limits are controlled by `min_column_width` (default `6`) and `max_column_width` (default `60`); freezing is controlled by `freeze` (pass `None` to disable).

---

## Worksheet Formatting

Pass a `formatting` dict (or an `ExcelFormat` object) to `ExcelWriter` (or `LinkedExcelWriter`) to control styles, column widths, hidden columns, freeze panes, and header rotation. All keys are optional.

```python
fmt = {
    'styles':            {...},   # named style definitions
    'columns':           {...},   # pattern → column properties
    'rows':              {...},   # row key → row properties
    'freeze':            'D2',    # freeze panes cell reference
    'header_auto_rotate': 1.5,   # auto-rotate long headers
    'min_column_width':  4,       # minimum auto-sized column width
    'max_column_width':  40,      # maximum auto-sized column width
    'auto_filter':       True,    # enable dropdown filter on header row
    'tab_color':         '#4472C4',  # worksheet tab color (hex string)
}

with ExcelWriter(file='report.xlsx', formatting=fmt) as writer:
    writer.write_batch(cursor, sheet_name='Data')
```

### ExcelFormat and ColumnRule Dataclasses

For discoverable, IDE-friendly formatting configuration, `ExcelFormat` and `ColumnRule` are available as typed dataclasses:

```python
from dbtk.writers import ExcelWriter, ExcelFormat, ColumnRule

fmt = ExcelFormat(
    styles={
        'fmt_fees': {'bg_color': '#d5f1cc'},
        'fmt_warn': {'bg_color': '#ffcccc', 'font': {'bold': True}},
    },
    columns={
        '*_fee*': ColumnRule(style='fmt_fees'),
        'notes':  ColumnRule(width=40, comment='Free-text field'),
        'gpa':    ColumnRule(style={'number_format': '0.00'}),
    },
    rows={
        '*':    {'height': 15},
        'data': {'odd': {'style': 'fmt_stripe'}},
    },
    freeze='D2',
    min_column_width=4,
    auto_filter=True,
    tab_color='#e3f3fe'
)

with ExcelWriter(file='report.xlsx', formatting=fmt) as writer:
    writer.write_batch(cursor, sheet_name='Data')
```

`ExcelFormat` and `ColumnRule` are fully interchangeable with dicts — pass either form. `ColumnRule` fields mirror the column rule dict keys exactly.

### Named Styles

Define reusable styles in `formatting['styles']`. Each entry is a style name mapped to a properties dict:

```python
'styles': {
    'fmt_fees':    {'bg_color': '#d5f1cc'},                      # green fill
    'fmt_alerts':  {'bg_color': '#ffcccc', 'font': {'bold': True, 'color': 'FF0000'}},
    'fmt_pct':     {'number_format': '0.0%'},
    'fmt_wrap':    {'alignment': {'wrap_text': True, 'vertical': 'top'}},
}
```

**Supported properties:**

| Property | Type | Effect |
|---|---|---|
| `bg_color` | hex string (`'#d5f1cc'` or `'d5f1cc'`) | Solid background fill |
| `font` | dict of [`Font`](https://openpyxl.readthedocs.io/en/stable/api/openpyxl.styles.fonts.html) kwargs | `bold`, `italic`, `color`, `size`, `underline`, … |
| `number_format` | string | Excel number format code |
| `alignment` | dict of [`Alignment`](https://openpyxl.readthedocs.io/en/stable/api/openpyxl.styles.alignment.html) kwargs | `horizontal`, `vertical`, `wrap_text`, `text_rotation`, … |

#### Built-in Styles

The following styles are registered on every workbook and can be used directly by name without defining them in `styles`:

| Style name | Description |
|---|---|
| `date_style` | Number format `YYYY-MM-DD` |
| `datetime_style` | Number format `YYYY-MM-DD HH:MM:SS` |
| `hyperlink_style` | Blue underlined font (used automatically by `LinkedExcelWriter`) |
| `bold_style` | Bold font |
| `header_vert_style` | Bold + 90° rotation + centred (see [Auto-rotating Headers](#auto-rotating-headers)) |
| `currency_style` | Number format `#,##0.00` |
| `percent_style` | Number format `0.00%` |
| `comma_style` | Number format `#,##0` |

> **Overriding built-ins:** To replace a built-in style, define it in `formatting['styles']` using the same name. User-defined styles are registered before built-ins, so a `comma_style` entry in `styles` replaces the default. Any property not specified in your definition falls back to the openpyxl default (no font, no fill, etc.), so include all properties you want.

> **Note:** Named styles are embedded in the workbook file. If you rerun a script that writes to an existing `.xlsx` file, styles from the previous run are already baked in — changed style definitions won't take effect. Delete the file before rerunning to pick up style changes.

---

### Column Rules

`formatting['columns']` maps patterns to per-column properties. Three pattern forms are supported, all matched **case-insensitively**:

- **Literal** — exact column name: `'notes'`
- **Glob** — [`fnmatch`](https://docs.python.org/3/library/fnmatch.html) wildcards (`*`, `?`, `[seq]`): `'*_fee*'`, `'resv_*'`
- **Range** — `'start:end'` applies to all columns from `start` through `end` inclusive, based on their position in the result set. Open-ended forms: `':end'` (from first column) and `'start:'` (to last column). Raises `ValueError` if either endpoint isn't found. A pattern containing `:` is only treated as a range if it has no wildcard characters and doesn't itself match a column name.

```python
'columns': {
    '*_fee*':           {'style': 'fmt_fees', 'header_style': 'header_vert_style'},
    'sales_*':          {'hidden': 1},
    'sales_*_total':    {'hidden': 0, 'comment': 'Additional columns are hidden'},
    'notes':            {'width': 40},
    'unit_price:revenue':  {'style': 'fmt_numeric'},   # range
    ':subj_code':       {'hidden': 1},               # hide everything up to subj_code
}
```

**Column properties:**

| Key | Type | Effect |
|---|---|---|
| `style` | style name or inline dict | Applied to every **data** cell in this column |
| `conditional_style` | callable or list of callables | `lambda rec: style_name_or_None` — per-row conditional style(s); composed on top of all other styles when non-None |
| `header_style` | style name or inline dict | Applied to the **header** cell only; owns the cell entirely (include `font: {bold: True}` if needed) |
| `width` | float | Column width in Excel character units; overrides auto-sizing |
| `hidden` | 0 or 1 | Hide (`1`) or explicitly un-hide (`0`) the column |
| `comment` | string | Adds an Excel comment/note to the header cell |
| `filter` | 0 or 1 | Show a filter dropdown on this column's header; hides dropdowns on all other columns |
| `group_label` | string | Merged super-header label above this column range (range patterns only) |

**Precedence:** Rules are applied in definition order. For most properties (width, hidden, filter, etc.), later patterns override earlier ones. The `style` property is the exception — when multiple patterns match the same column and both provide `style`, the styles are **composed**: properties from the later rule take precedence per property (fill, font, number format), but non-conflicting properties from the earlier rule are preserved.

```python
# wide rule sets background; narrower rule adds number format without losing the background
'columns': {
    'g:slg':   {'style': 'hits_style'},               # green background for all batting cols
    'avg:slg': {'style': {'number_format': '0.000'}}, # composed on top — keeps green bg
}

# scalar properties (width, hidden) always override
'columns': {
    'sales_*':      {'hidden': 1},
    'sales_*_desc': {'hidden': 0, 'comment': 'Additional columns are hidden'},
}
```

**Per-row conditional styles** use a `conditional_style` callable in the column rule. The lambda receives the full record and returns a style name (or `None`). This overrides both the column's static `style` and any row-level style when non-None:

```python
'columns': {
    'max_capacity': {'conditional_style': lambda rec: 'fmt_warn' if rec.max_capacity < 10 else None},
}
```

**Group headers** use `group_label` on a range pattern to add a merged super-header row above the column names. Only range patterns are supported — using `group_label` on a wildcard logs a warning and is ignored. Columns not covered by any group are left blank in the group header row.

```python
'columns': {
    'q1_sales:q4_sales':       {'style': 'fmt_sales',    'group_label': 'Quarterly Sales'},
    'q1_revenue:q4_revenue':   {'style': 'fmt_revenue', 'group_label': 'Quarterly Revenue'},
}
```

Group headers are written bold and centered in row 1; column headers shift to row 2; data starts at row 3. The freeze pane default shifts from `'A2'` to `'A3'` automatically.

**Inline style dicts** work anywhere a style name is accepted. Equivalent inline dicts are deduplicated automatically:

```python
'columns': {
    'gpa': {'style': {'number_format': '0.00', 'alignment': {'horizontal': 'center'}}},
    'title': ColumnRule(style={'bg_color': '#60CCFF'}, width=40,
                        comment='This comment will appear in the header row', filter=1),
}
```

---

### Row Formatting

`formatting['rows']` is a dict with four named keys. All are optional:

| Key | Applies to |
|---|---|
| `'*'` | Every row (header, group header, and all data rows) |
| `'header'` | The column-name header row only |
| `'group_header'` | The group label row only (only relevant when `group_label` columns are used) |
| `'data'` | Data rows only |

Each key maps to a dict that may contain `height` and/or `style`. The `'data'` key additionally supports `odd`, `even`, and `style`:

```python
'rows': {
    '*':           {'height': 15},                        # all rows: default height
    'header':      {'height': 30},                        # override header height
    'group_header': {'height': 20},                       # override group label row height
    'data': {
        'height':   15,                                   # data row height
        'style':    'fmt_data',                           # static style for all data rows
        'odd':      {'style': 'fmt_stripe'},              # rows 1, 3, 5, … (overrides data style)
        'even':     {'style': 'fmt_alt'},                 # rows 2, 4, 6, …
        'conditional_style': lambda rec: 'fmt_alerts' if rec['status'] == 'OVERDUE' else None,
    },
}
```

**Height cascade:** `'*'` sets the default; `'header'`, `'group_header'`, and `'data'` override it for their respective rows. Setting only `'*'` is equivalent to setting the same height everywhere.

**Alternating row colors** use `odd` and `even` nested under `'data'`:

```python
'rows': {
    'data': {
        'odd':  {'style': 'fmt_stripe'},   # rows 1, 3, 5, …
        'even': {'style': 'fmt_alt'},      # rows 2, 4, 6, …
    },
}
```

You can define only one side if you only want every-other-row coloring:

```python
'rows': {'data': {'odd': {'style': 'fmt_stripe'}}}   # only odd rows get a background
```

**Conditional row styles** use `conditional_style` under `'data'`. It accepts a callable or a list of callables; each receives the full record and returns a style name or `None`. Multiple callables are composed in order, later ones taking precedence:

```python
'rows': {
    'data': {
        'conditional_style': lambda rec: 'fmt_alerts' if rec['status'] == 'OVERDUE' else None,
    },
}

# Multiple callables — both applied, last non-None wins per property
'rows': {
    'data': {
        'conditional_style': [
            lambda rec: 'fmt_stripe' if rec['row_num'] % 2 else None,
            lambda rec: 'fmt_alerts' if rec['overdue'] else None,
        ],
    },
}
```

**Style cascade (lowest → highest priority):**

1. Date/datetime base format (applied automatically by type)
2. Column `style`
3. `rows['*']['style']`
4. `rows['data']['style']`
5. `rows['data']['odd'|'even']['style']`
6. `rows['data']['conditional_style']` callable(s)
7. Column `conditional_style` callable(s)
8. `hyperlink_style` (applied by `LinkedExcelWriter` to linked cells)

Styles at higher priority levels are composed on top — they override individual properties (fill, font, number format) rather than replacing the whole style.

---

### Freeze Panes

Set the top-left unfrozen cell. Defaults to `'A2'` (freeze the header row). Pass `None` to disable:

```python
'freeze': 'D2',    # freeze columns A-C and the header row
'freeze': None,    # no freeze panes
```

---

### Auto-rotating Headers

Long column names over narrow data columns waste horizontal space. `header_auto_rotate` detects these columns and applies `header_vert_style` (bold, 90° rotation, centred) automatically.

Two conditions must **both** hold for a column to be rotated:

1. `header_len >= min_length` — prevents short headers like `'OPEN'` from rotating
2. `header_len > data_width × ratio` — the header must be meaningfully longer than the data

```python
# Scalar shorthand — uses defaults (min_length=8, ratio=1.5)
'header_auto_rotate': 1.5

# Dict form — full control
'header_auto_rotate': {
    'ratio':         2.5,    # header must be 2.5× the data sample width
    'min_length':    8,      # ignore headers shorter than 8 characters
    'height_factor': 6.5,    # pts per character for auto header row height
}
```

| Example header | Length | Sampled data | Rotated? |
|---|---|---|---|
| `OPEN` | 4 | `Y` / `N` | No — fails `min_length` |
| `STATUS` | 6 | `Y` / `N` | No — fails `min_length` |
| `GRADABLE_IND` | 12 | `Y` / `N` | Yes |
| `REG_AUTH_ACTIVE_CDE` | 19 | `Y` / `N` | Yes |
| `student_name` | 12 | `Smith, Jane` (11) | No — header not wider than data |

**Header row height** is computed automatically from the longest rotated header name using `height_factor` (default 6.5 pt/character) unless `rows['header']['height']` is set explicitly. `REG_AUTH_ACTIVE_CDE` (19 chars) produces a height of ~123 pt.

**Columns with an explicit `header_style`** are excluded from auto-rotation — your explicit choice takes precedence.

**Auto-rotated columns** use the sampled data width for column sizing (not the header length). Non-rotated columns use `max(header_length, data_width)` as before.

---

### Minimum Column Width

`min_column_width` sets the floor for auto-sized columns (default `6`). Lower it for indicator columns where the data is always short:

```python
'min_column_width': 3    # Y/N flag columns don't need width 6
```

Explicit `width` values in column rules are not affected.

---

### Maximum Column Width

`max_column_width` sets the ceiling for auto-sized columns (default `60`). Useful when a few long text columns would otherwise dominate the sheet:

```python
'max_column_width': 40   # no column wider than 40 units
```

Explicit `width` values in column rules are not affected.

---

### Auto-filter

`auto_filter: True` enables Excel's dropdown filter on the header row across all columns:

```python
'auto_filter': True
```

To show filter dropdowns on **specific columns only**, use the `filter` column rule instead. This implies auto-filter but hides the dropdown on all unmarked columns:

```python
'columns': {
    'subj_code': {'filter': 1},
    'term_code': {'filter': 1},
}
```

If both `auto_filter: True` and column-level `filter: 1` are set, the column-level rules win — only the marked columns show dropdowns.

---

### Formatted Excel Example

```python
fmt = {
    'styles': {        
        'fmt_sales':     {'bg_color': '#e3f3fe'},
        'fmt_volume':    {'bg_color': '#ffeed9'},
        'fmt_shrinkage': {'bg_color': '#d5f1cc', 'font': {'bold': 1}},
        'fmt_warning':   {'bg_color': '#fec76f'},
        'fmt_stripe':    {'bg_color': '#dde8ed'}
    },
    'columns': {
        'sales_*':      {'style': 'fmt_sales'},
        'volume*':      {'style': 'fmt_volume'},
        'shrink*':      {'style': 'fmt_shrinkage', 'hidden': 1},
        'shrink_pct':   {'hidden': 0,
                         'conditional_style': lambda x: 'fmt_warning' if x.shrink_pct > 8.0 else None},
    },
    'freeze':             'D3',
    'header_auto_rotate': {'min_length': 8, 'ratio': 2.5},
    'min_column_width':   3,
    'tab_color': '#e3f3fe'
}
stmt = cursor.prepare_file('quarterly_sales')
with ExcelWriter(file='quarterly_sales.xlsx', formatting=fmt) as writer:
    for qtr in fy_quarters:
        stmt.execute({'quarter': qtr})
        writer.write_batch(stmt, sheet_name=f'Q{qtr}')
```

See [examples/formatted_spreadsheet.py](../examples/README.md#formatted_spreadsheetpy) for a complete, runnable example.
---

## Per-sheet Formatting

The writer-level `formatting` applies to all `write_batch()` calls by default. Pass a `formatting` argument directly to `write_batch()` to override it for that sheet:

- `formatting=None` (default) — use the writer-level formatting
- `formatting={...}` — use this dict for this sheet only
- `formatting={}` — write this sheet with no formatting (bold headers, default column widths)

```python
base_fmt   = {'styles': {'fmt_fees': {'bg_color': '#d5f1cc'}}, 'auto_filter': True}
fees_fmt   = {'columns': {'*_fee*': {'style': 'fmt_fees'}}}
roster_fmt = {'columns': {'gpa': {'style': 'fmt_pct'}}}

with ExcelWriter(file='report.xlsx', formatting=base_fmt) as writer:
    writer.write_batch(fees_data,   sheet_name='Fees',   formatting=fees_fmt)
    writer.write_batch(roster_data, sheet_name='Roster', formatting=roster_fmt)
    writer.write_batch(raw_data,    sheet_name='Raw',    formatting={})   # no formatting
```

Named styles are registered to the workbook (not per-worksheet), so styles defined in any call's `formatting['styles']` are available to all subsequent calls. The first registration of a given style name wins — later calls with the same name are silently ignored.

---

## Hyperlinked Reports with LinkedExcelWriter

`LinkedExcelWriter` extends `ExcelWriter` to create navigable reports with internal worksheet links and external URL hyperlinks.

### Workflow

1. Define one or more `LinkSource` objects describing linkable entities
2. Register them with the writer
3. Write **source sheets first** — row locations and display text are cached as they're written
4. Write **detail sheets** — specify which columns become hyperlinks

```python
from dbtk.writers import LinkedExcelWriter, LinkSource

student_link = LinkSource(
    name="student",
    source_sheet="Students",
    key_column="student_id",
    text_template="{last_name}, {first_name}",
    url_template="https://sis.university.edu/student/{student_id}",
)

with LinkedExcelWriter(file='enrollment_report.xlsx') as writer:
    writer.register_link_source(student_link)

    # Source sheet — locations cached as written
    writer.write_batch(students_cursor, sheet_name='Students')

    # Detail sheet — student_id column becomes a hyperlink
    writer.write_batch(
        enrollments_cursor,
        sheet_name='Enrollments',
        links={'student_id': 'student'}           # column → link source name
    )
```

### LinkSource Parameters

| Parameter       | Required | Description                                                                       |
|-----------------|----------|-----------------------------------------------------------------------------------|
| `name`          | Yes      | Identifier used in `links=` dict                                                  |
| `source_sheet`  | Yes*     | Sheet whose rows are the link targets                                             |
| `key_column`    | Yes*     | Column that uniquely identifies each row                                          |
| `text_template` | No       | Python format string for link display text (uses `str.format_map`)                |
| `url_template`  | No       | Python format string for external URL                                             |
| `missing_text`  | No       | Fallback text when a key can't be resolved                                        |
| `external_only` | No       | If `True`, generate URLs from current row without caching; reusable across sheets |

*Not required when `external_only=True`.

### Link Types

```python
links={
    'student_name': 'student',           # external URL (falls back to internal)
    'student_name': 'student:internal',  # always links to source sheet row
    'student_name': 'student:external',  # always uses url_template
}
```

### External-only Links

When `external_only=True`, the `LinkSource` generates URLs directly from the current row's data without caching. The same source can be reused across multiple sheets:

```python
imdb_link = LinkSource(
    name="imdb",
    url_template="https://imdb.com/title/{tconst}",
    text_template="{primary_title} ({start_year})",
    external_only=True,
)

with LinkedExcelWriter(file='movies.xlsx') as writer:
    writer.register_link_source(imdb_link)
    writer.write_batch(all_movies, sheet_name='All',
                       links={'primary_title': 'imdb'})
    writer.write_batch(top_rated, sheet_name='Top Rated',
                       links={'primary_title': 'imdb'})
```

### Combining Formatting and Links

`LinkedExcelWriter` accepts the same `formatting` dict as `ExcelWriter`. Column styles and row styles apply to non-linked cells; linked cells always get `hyperlink_style`:

```python
with LinkedExcelWriter(file='report.xlsx', formatting=fmt) as writer:
    writer.register_link_source(student_link)
    writer.write_batch(students, sheet_name='Students')
    writer.write_batch(enrollments, sheet_name='Enrollments',
                       links={'student_id': 'student:internal'})
```

---

## Formatting Quick Reference

| Key | Type | Default | Description |
|---|---|---|---|
| `styles` | `dict[name, props]` | `{}` | Named style definitions |
| `columns` | `dict[pattern, props]` | `{}` | Wildcard column rules |
| `rows` | `dict` | `{}` | Row height/style; keys: `'*'` (all rows), `'header'`, `'group_header'`, `'data'`; `data` supports nested `odd`, `even`, `style`, `height` |
| `freeze` | `str \| None` | `'A2'` | Freeze panes cell reference |
| `header_auto_rotate` | `float \| dict` | off | Auto-rotate long headers; see [above](#auto-rotating-headers) |
| `min_column_width` | `float` | `6` | Floor for auto-sized column widths |
| `max_column_width` | `float` | `60` | Ceiling for auto-sized column widths |
| `auto_filter` | `bool` | `False` | Enable dropdown filter on header row |
| `tab_color` | `str` | — | Worksheet tab color as hex string (`'#FF0000'` or `'FF0000'`) |

**Column rule keys:**

| Key | Type | Description |
|---|---|---|
| `style` | style name or dict | Style applied to data cells |
| `conditional_style` | callable or list | Per-row conditional style(s); composed on top of all other styles |
| `header_style` | style name or dict | Style applied to the header cell only |
| `width` | float | Override auto-sized width |
| `hidden` | 0 or 1 | Hide or explicitly un-hide the column |
| `comment` | string | Excel comment/note on the header cell |
| `filter` | 0 or 1 | Show filter dropdown on this column; hides dropdowns on all others |
| `group_label` | string | Merged super-header label (range patterns only) |

**`header_auto_rotate` dict keys:**

| Key | Default | Description |
|---|---|---|
| `ratio` | `1.5` | `header_len > data_width × ratio` to trigger rotation |
| `min_length` | `8` | Minimum header character count to be considered for rotation |
| `height_factor` | `6.5` | Points per character for auto header row height calculation |


See [examples/linked_spreadsheet.py](../examples/README.md#linked_spreadsheetpy) for a complete, runnable example.