KingbaseES数据库空间监控全攻略:从单表到整个数据库的容量查询技巧

张开发
2026/4/5 22:41:10 15 分钟阅读

分享文章

KingbaseES数据库空间监控全攻略:从单表到整个数据库的容量查询技巧
KingbaseES数据库空间监控全攻略从单表到整个数据库的容量查询技巧数据库空间管理是运维工作中最容易被忽视却至关重要的环节。想象一下当你凌晨三点被报警电话惊醒发现生产数据库因空间耗尽而崩溃时那种手足无措的感觉。作为深耕数据库领域多年的技术专家我见过太多因空间监控不到位导致的灾难性事故。本文将分享一套完整的KingbaseES空间监控方法论从微观到宏观带您掌握空间管理的核心技巧。1. 空间监控基础理解KingbaseES存储架构KingbaseES采用典型的PostgreSQL风格存储架构但又有其独特之处。数据文件主要存储在base目录下每个数据库对应一个子目录而表空间则可能分布在不同的物理位置。理解这些基础概念是进行有效监控的前提。关键目录结构$KINGBASE_HOME/data ├── base # 默认表空间数据文件 ├── pg_tblspc # 额外表空间符号链接 ├── pg_stat_tmp # 统计信息临时文件 └── pg_wal # WAL日志文件表空间与数据库的关系可通过以下SQL查询SELECT spcname, spcowner, spclocation FROM sys_tablespace;注意sys_default是KingbaseES默认表空间通常对应base目录。当创建新表空间时会在pg_tblspc下创建符号链接指向自定义位置。2. 单表空间分析精准定位存储大户单表级别的空间监控是精细化管理的基础。以下是我在多个生产环境中验证过的高效查询方案。2.1 基础表大小查询获取表及其索引的详细大小SELECT table_name, sys_size_pretty(sys_table_size(table_name)) AS data_size, sys_size_pretty(sys_indexes_size(table_name)) AS index_size, sys_size_pretty(sys_total_relation_size(table_name)) AS total_size FROM ( SELECT ( || table_schema || . || table_name || ) AS table_name FROM information_schema.tables WHERE table_schema public -- 替换为目标模式名 ) AS tables ORDER BY total_size DESC;2.2 高级空间分析技巧对于大表我们还需要了解其内部存储分布SELECT relname AS table_name, pg_size_pretty(pg_relation_size(oid)) AS total_size, pg_size_pretty(pg_relation_size(oid, main)) AS main_size, pg_size_pretty(pg_relation_size(oid, fsm)) AS fsm_size, pg_size_pretty(pg_relation_size(oid, vm)) AS vm_size FROM pg_class WHERE relkind r AND relnamespace public::regnamespace ORDER BY pg_relation_size(oid) DESC;典型问题排查表现象可能原因解决方案表大小异常增长未清理的旧数据实施数据归档策略索引占比过高冗余或低效索引索引优化重建FSM文件过大频繁更新导致空间映射膨胀VACUUM FULL操作3. 模式级空间管理逻辑分组的容量控制模式(schema)作为表的逻辑容器其空间监控对多租户系统尤为重要。以下是经过实战检验的模式空间分析方案。3.1 单模式空间统计SELECT nspname AS schema_name, pg_size_pretty(sum(pg_total_relation_size(oid))) AS total_size, pg_size_pretty(sum(pg_relation_size(oid))) AS table_size, pg_size_pretty(sum(pg_indexes_size(oid))) AS index_size FROM pg_class JOIN pg_namespace ON relnamespace pg_namespace.oid WHERE nspname kingbase -- 替换为目标模式名 GROUP BY nspname;3.2 全库模式空间对比SELECT nspname AS schema_name, pg_size_pretty(sum(pg_total_relation_size(oid))) AS total_size, round(sum(pg_total_relation_size(oid)) * 100.0 / (SELECT sum(pg_total_relation_size(oid)) FROM pg_class WHERE relkind IN (r,i)), 2) AS percentage FROM pg_class JOIN pg_namespace ON relnamespace pg_namespace.oid WHERE nspname NOT LIKE pg_% AND nspname ! information_schema GROUP BY nspname ORDER BY sum(pg_total_relation_size(oid)) DESC;模式空间增长趋势监控脚本#!/bin/bash DATE$(date %Y%m%d) psql -U kingbase -d kingbase -c COPY ( SELECT nspname, sum(pg_total_relation_size(oid)) AS size_bytes FROM pg_class JOIN pg_namespace ON relnamespace pg_namespace.oid GROUP BY nspname ) TO /var/log/kingbase_space/schema_size_$DATE.csv WITH CSV HEADER; 4. 数据库级空间全景视图全局视角的空间监控是容量规划的基础。以下方法在我负责的多个PB级KingbaseES集群中得到验证。4.1 单数据库大小查询SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size, pg_database_size(datname) AS size_bytes FROM pg_database WHERE datname kingbase; -- 替换为目标数据库名4.2 全实例数据库对比SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size, pg_database_size(datname) AS size_bytes, round(pg_database_size(datname) * 100.0 / (SELECT sum(pg_database_size(datname)) FROM pg_database), 2) AS percentage FROM pg_database ORDER BY pg_database_size(datname) DESC;数据库空间组成分析SELECT datname, pg_size_pretty(pg_database_size(datname)) AS total, pg_size_pretty((SELECT sum(pg_total_relation_size(oid)) FROM pg_class WHERE relkind IN (r,i))) AS user_data, pg_size_pretty(pg_database_size(datname) - (SELECT sum(pg_total_relation_size(oid)) FROM pg_class WHERE relkind IN (r,i))) AS system_usage FROM pg_database WHERE datname current_database();5. 自动化监控方案实战手工查询难以满足生产环境需求下面分享我设计的全自动化监控体系。5.1 监控视图创建CREATE OR REPLACE VIEW space_monitor.schema_stats AS SELECT nspname AS schema_name, count(*) AS table_count, pg_size_pretty(sum(pg_total_relation_size(oid))) AS total_size, sum(pg_total_relation_size(oid)) AS size_bytes, round(sum(pg_total_relation_size(oid)) * 100.0 / (SELECT sum(pg_total_relation_size(oid)) FROM pg_class WHERE relkind IN (r,i)), 2) AS percentage FROM pg_class JOIN pg_namespace ON relnamespace pg_namespace.oid WHERE relkind IN (r,i) AND nspname NOT LIKE pg_% GROUP BY nspname;5.2 定时任务配置使用KingbaseES的pgAgent扩展设置每日空间快照-- 创建历史记录表 CREATE TABLE space_monitor.history ( capture_time TIMESTAMP PRIMARY KEY, db_size BIGINT, largest_schema TEXT, largest_table TEXT ); -- 设置定时任务 SELECT pgagent.add_job( daily_space_monitor, Daily database space collection, 0 2 * * *, -- 每天凌晨2点运行 $$INSERT INTO space_monitor.history SELECT now(), pg_database_size(current_database()), (SELECT schema_name FROM space_monitor.schema_stats ORDER BY size_bytes DESC LIMIT 1), (SELECT table_name FROM space_monitor.table_stats ORDER BY size_bytes DESC LIMIT 1) $$ );5.3 预警阈值设置CREATE OR REPLACE FUNCTION space_monitor.check_space() RETURNS TRIGGER AS $$ DECLARE total_size BIGINT; warning_threshold BIGINT : 100 * 1024 * 1024 * 1024; -- 100GB critical_threshold BIGINT : 150 * 1024 * 1024 * 1024; -- 150GB BEGIN SELECT pg_database_size(current_database()) INTO total_size; IF total_size critical_threshold THEN RAISE EXCEPTION CRITICAL: Database space exceeded %, pg_size_pretty(critical_threshold); ELSIF total_size warning_threshold THEN RAISE WARNING WARNING: Database space approaching %, pg_size_pretty(critical_threshold); END IF; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER space_check_trigger AFTER INSERT OR UPDATE ON space_monitor.history FOR EACH ROW EXECUTE FUNCTION space_monitor.check_space();6. 高级技巧与疑难问题处理在实际运维中我们经常会遇到一些特殊场景和棘手问题。以下是经过实战验证的解决方案。6.1 大对象(LOB)空间回收KingbaseES中的大对象可能造成空间浪费-- 查找大对象 SELECT oid, pg_size_pretty(pg_largeobject_size(oid)) AS size FROM pg_largeobject_metadata ORDER BY pg_largeobject_size(oid) DESC LIMIT 10; -- 清理过期大对象 SELECT lo_unlink(oid) FROM pg_largeobject_metadata WHERE create_time now() - interval 6 months;6.2 表空间平衡策略当默认表空间即将满时可按以下步骤迁移数据-- 创建新表空间 CREATE TABLESPACE fast_ssd LOCATION /mnt/ssd/kingbase; -- 迁移大表 ALTER TABLE large_table SET TABLESPACE fast_ssd; -- 检查迁移进度 SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size, CASE WHEN reltablespace 0 THEN sys_default ELSE (SELECT spcname FROM pg_tablespace WHERE oid reltablespace) END AS tablespace FROM pg_class WHERE relkind r ORDER BY pg_relation_size(oid) DESC;6.3 WAL日志空间管理WAL日志可能意外占用大量空间-- 检查WAL空间使用 SELECT name, setting, unit, pg_size_pretty(setting::bigint) AS pretty_size FROM pg_settings WHERE name IN (wal_segment_size, max_wal_size, min_wal_size); -- 手动触发WAL切换 SELECT pg_switch_wal(); -- 清理旧WAL(需在维护窗口执行) SELECT pg_walfile_name_offset(pg_current_wal_lsn()); VACUUM FULL VERBOSE ANALYZE;

更多文章