Configuration & Security
DBTK uses YAML configuration files to manage database connections and keep credentials secure with encryption. Running dbtk checkup will copy a well commented sample config file to ~/.config/dbtk_sample.yml.
DBTK also has several command line tools to assist with configuration and encryption.
Quick Start
Create a dbtk.yml file in your project folder or in ~/.config folder:
connections:
dev_db:
type: postgres
host: localhost
database: myapp_dev
user: developer
password: dev_password
prod_db:
type: postgres
host: db.example.com
database: myapp_prod
user: app_user
encrypted_password: gAAAAABh... # Use dbtk encrypt-password
Connect from code:
import dbtk
db = dbtk.connect('prod_db')
# or with context manager
with dbtk.connect('prod_db') as db:
cur = db.cursor()
Configuration File Locations
DBTK searches for config files in this order:
Explicitly set path:
dbtk.set_config_file('path/to/config.yml')Current directory:
./dbtk.ymlor./dbtk.yamlUser config:
~/.config/dbtk.ymlor~/.config/dbtk.yaml
If no config file is found, a sample is automatically created at ~/.config/dbtk_sample.yml.
import dbtk
# Use specific config file
dbtk.set_config_file('/path/to/production.yml')
db = dbtk.connect('database_name')
# Use default search path
db = dbtk.connect('database_name')
Configuration File Structure
The config file has two main sections: settings, connections, and optionally passwords and drivers.
Settings
settings:
default_batch_size: 1000
default_country: US
default_timezone: UTC
# Logging settings can be overwritten in a script level dbtk.setup_logging(level='DEBUG')
logging:
directory: ./logs
level: INFO
format: '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
timestamp_format: '%Y-%m-%d %H:%M:%S'
filename_format: '%Y%m%d_%H%M%S'
split_errors: true # If True, separate error log will be created (only if critical or errors are encountered
console: true
retention_days: 30
# File I/O settings
compressed_file_buffer_size: 1048576 # 1MB buffer for reading compressed files (.gz, .bz2, .xz)
data_dump_dir: /tmp # Default directory for BulkSurge dump() operations
# CSV writer settings
null_string_csv: '' # How to represent NULL values in CSV output
Database Connections
Each connection is defined under the connections: key with a type field indicating the database type.
Some database drivers use non-standard parameter names (db instead database). DBTK will automatically map standard names to driver specific names if needed.
connections:
my_database:
type: postgres # Database type: postgres, oracle, mysql, sqlserver, sqlite
host: localhost # Server hostname
port: 5432 # Port (optional, uses driver default)
database: mydb # Database name
user: myuser # Username
password: mypassword # Plaintext - encrypt it! See below
encrypted_password: gAAAAABh... #Use `dbtk encrypt-password mypassword`
Driver selection:
You can optionally specify a driver field to choose a specific database adapter. If omitted, DBTK automatically selects the best available driver by priority. If you specify driver, type is optional since DBTK can infer it from the driver name.
connections:
# Use psycopg (v3) instead of default psycopg2
pg_v3:
driver: psycopg # run `dbtk checkup` to show available drivers
host: localhost
database: mydb
user: myuser
encrypted_password: gAAAAABh...
# ODBC connection with DSN
odbc_db:
driver: pyodbc_postgres
dsn: MY_DSN
password: '${MY_PASSWORD}' # Pull from environment variable
Driver-Specific Examples
PostgreSQL:
connections:
postgres_db:
type: postgres
host: localhost
port: 5432
database: mydb
user: myuser
encrypted_password: gAAAAABh...
Oracle:
connections:
oracle_db:
type: oracle
host: oracle.company.com
port: 1521
database: prod.company.com # Service name
user: app_user
encrypted_password: gAAAAABh...
MySQL:
connections:
mysql_db:
type: mysql
host: localhost
port: 3306
database: mydb
user: myuser
password: secret
SQL Server:
connections:
sqlserver_db:
driver: pyodbc_sqlserver
type: sqlserver
host: localhost\SQLEXPRESS # Instance name supported
database: mydb
user: myuser
password: secret
cursor:
fast_executemany: true # Recommended for bulk operations
SQLite:
connections:
sqlite_db:
type: sqlite
database: /path/to/database.db
Cursor Settings
Set default cursor behavior for all cursors created from a connection:
connections:
my_database:
type: postgres
host: localhost
database: mydb
user: myuser
encrypted_password: gAAAAABh...
cursor:
batch_size: 4000 # Rows to process at once in bulk operations
debug: false # Print SQL queries and bind variables
return_cursor: true # execute() returns cursor for method chaining
fast_executemany: true # For pyodbc SQL Server bulk inserts
Password Encryption
DBTK uses Fernet symmetric encryption (from the cryptography library) for password storage. Before you can begin encrypting and decrypting passwords, you must generate and store an encryption key.
If system keyring is available, this is as easy as running dbtk store-key. See Command-Line Tools and Programatic Encryption sections for help.
Encryption Key Management
DBTK looks for encryption keys in this order:
Environment variable:
DBTK_ENCRYPTION_KEYSystem keyring: service
dbtk, keyencryption_key(requireskeyringlibrary)
Command-Line Tools
All encryption operations use the dbtk CLI with subcommands:
# Generate a new encryption key
$ dbtk generate-key # Generate a key to manually store in environmental variable
# Store key in system keyring (requires keyring library)
$ dbtk store-key [key] # Store provided key or generate new one
$ dbtk store-key --force # Overwrite existing key
# Encrypt a single password
$ dbtk encrypt-password mypassword
gAAAAABh...
# Encrypt all passwords in a config file
# Finds plaintext 'password:' entries and converts to 'encrypted_password:'
$ dbtk encrypt-config [dbtk.yml]
# Migrate config to a new encryption key
$ dbtk migrate-config old_config.yml new_config.yml --new-key "new_key_here"
# Check dependencies, drivers, and configuration
$ dbtk checkup
# Interactive configuration setup wizard
$ dbtk config-setup
If the dbtk command is not found (e.g. pip-installed without admin privileges so the
scripts directory is not in PATH), every command above can be run as python -m dbtk
instead:
python -m dbtk config-setup
python -m dbtk checkup
python -m dbtk encrypt-config
Using Encrypted Passwords
In your config file, use encrypted_password instead of password:
connections:
prod_db:
type: postgres
host: db.example.com
database: production
user: app_user
encrypted_password: gAAAAABh...
Programmatic Encryption
from dbtk import config
# Generate encryption key
key = config.generate_encryption_key()
# Encrypt/decrypt passwords
cfg = config.ConfigManager()
encrypted = cfg.encrypt_password('my_secret')
decrypted = cfg.decrypt_password(encrypted)
# Encrypt all passwords in a config file
config.encrypt_config_file('dbtk.yml')
# Store key in system keyring
config.store_key(key, force=True)
Environment Variables in Connection Config
Reference environment variables in any connection parameter using ${VAR_NAME} syntax. You can also provide a default value with ${VAR_NAME:default}:
connections:
# Required env var (fails if not set)
prod_db:
type: postgres
host: '${PROD_HOST}'
password: '${PROD_PASSWORD}'
# With defaults (uses default if env var not set)
dev_db:
type: postgres
host: '${DB_HOST:localhost}'
port: '${DB_PORT:5432}'
database: '${DB_NAME:myapp_dev}'
user: '${DB_USER:developer}'
password: '${DB_PASSWORD:dev_password}'
This is especially useful for Docker/CI environments where you want a config that works both locally (using defaults) and in production (using env vars).
Recommended Setup for Production
# 1. Generate encryption key
$ dbtk generate-key
# Output: kL7xP9... (your Fernet key)
# 2. Store key securely
$ export DBTK_ENCRYPTION_KEY=kL7xP9... # For containers/CI
# OR
$ dbtk store-key kL7xP9... # For workstations with keyring
# 3. Encrypt passwords in config
$ dbtk encrypt-config dbtk.yml
Key rotation:
Use dbtk migrate-config to migrate a config file to a new environment without having shared encryption keys.
The passwords will be decrypted with your current key and encrypted with the new key.
export DBTK_ENCRYPTION_KEY="current_key"
NEW_KEY=$(dbtk generate-key)
dbtk migrate-config dbtk.yml dbtk_new.yml --new-key "$NEW_KEY"
export DBTK_ENCRYPTION_KEY="$NEW_KEY"
mv dbtk_new.yml dbtk.yml
Standalone Passwords
Store non-database credentials (API keys, etc.) in the passwords section:
passwords:
openai_key:
description: "OpenAI API key for data processing"
encrypted_password: gAAAAABh...
Custom Driver Registration
Register custom database drivers in the config file. By registering a driver for any DB-API 2.0 compliant library, DBTK can support
additional databases. Plain reads, writes, and DataSurge inserts/updates/deletes work without any additional configuration.
The following features are database-specific and will not work without a matching DatabaseDialect subclass:
Table.upsert, DataSurge.upsert, Table.merge, DataSurge.merge, BulkSurge, and column_defs_from_db.
drivers:
firebird:
database_type: firebird
module: firebird.driver # Only needed if name doesn't match module
priority: 1
param_map: {'database': 'db', 'password': 'passwd'} # Map non-standard parameter names
required_params: [{'host', 'database', 'user'}, {'dsn'}]
optional_params: {'port', 'protocol'}
connection_method: kwargs # connection_string (postgres), dsn (oracle), odbc_string (odbc), kwargs (all others)
default_port: 3050
Or register programmatically:
from dbtk.database import register_user_drivers
register_user_drivers({
'my_driver': {
'database_type': 'postgres',
'priority': 10,
'param_map': {'database': 'dbname'},
'required_params': [{'host', 'database', 'user'}],
'optional_params': {'port', 'password'},
'connection_method': 'kwargs',
'default_port': 5432
}
})
Security Best Practices
Use
encrypted_passwordin production config filesNever commit encryption keys to version control
Use
DBTK_ENCRYPTION_KEYenvironment variable in containerized environmentsUse system keyring on workstations (
dbtk store-key)Set restrictive permissions on config files (
chmod 600)Use separate configs for dev/staging/production
See Also
Database Connections - Using the Database and Cursor classes
ETL Framework - Building production ETL pipelines
Advanced Features - Custom drivers and performance tuning