Is there a way to monitor the progress of the creation of an index in PostgreSQL. I am creating an index on a large table and I would like to see how fast this is occurring.
Is there a way to monitor this?
It is possible since PostgreSQL 12 release (released October 3, 2019).
SELECT
now()::TIME(0),
a.query,
p.phase,
round(p.blocks_done / p.blocks_total::numeric * 100, 2) AS "% done",
p.blocks_total,
p.blocks_done,
p.tuples_total,
p.tuples_done,
ai.schemaname,
ai.relname,
ai.indexrelname
FROM pg_stat_progress_create_index p
JOIN pg_stat_activity a ON p.pid = a.pid
LEFT JOIN pg_stat_all_indexes ai on ai.relid = p.relid AND ai.indexrelid = p.index_relid;
This can be used to check which index is rebuilding on REINDEX DATABASE command.
See the docs for pg_stat_progress_create_index view and depesz's blog post for details.
CREATE INDEX or REINDEX currently in progress as pg_stat_progress_create_index docs states.
case when p.blocks_total > 0
select * from pg_stat_progress_create_index;
According to Postgres Wiki's Index Maintenance page, you can find out the current state of all your indexes with:
SELECT
t.tablename,
indexname,
c.reltuples AS num_rows,
pg_size_pretty(pg_relation_size(quote_ident(t.tablename)::text)) AS table_size,
pg_size_pretty(pg_relation_size(quote_ident(indexrelname)::text)) AS index_size,
CASE WHEN indisunique THEN 'Y'
ELSE 'N'
END AS UNIQUE,
idx_scan AS number_of_scans,
idx_tup_read AS tuples_read,
idx_tup_fetch AS tuples_fetched
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename=c.relname
LEFT OUTER JOIN
( SELECT c.relname AS ctablename, ipg.relname AS indexname, x.indnatts AS number_of_columns, idx_scan, idx_tup_read, idx_tup_fetch, indexrelname, indisunique FROM pg_index x
JOIN pg_class c ON c.oid = x.indrelid
JOIN pg_class ipg ON ipg.oid = x.indexrelid
JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid )
AS foo
ON t.tablename = foo.ctablename
WHERE t.schemaname='public'
ORDER BY 1,2;
The column num_rows indicates how many rows are covered by your index and index_size will grow as the index is being built.
So, there is no good way to do it pre Postgres 12, but if you really need to know... first calculate the amount of space the index should take, based on data size * rows + overhead. You can then use something like pfiles or pgtruss to find the files that are being written inside $PGDATA; if your indexes are over 1GB, it will be a series of files like nnnnn.n, where the first set of n's is consistent, and the last n increments for each GB file. Once you know how many files are created, you can watch the growth and figure out how close you are to finishing. Rough estimate, but maybe it helps.
No, there isn't, even if you are building it in CONCURRENT mode. Although in the past I have kept on eye on the size of the file in the database directory, this isn't really useful since you can only guess how large it is going to be.
https://dev.to/bolajiwahab/progress-reporting-in-postgresql-1i0d
SELECT
clock_timestamp() - a.xact_start AS duration_so_far,
coalesce(a.wait_event_type ||'.'|| a.wait_event, 'false') AS waiting,
a.state,
p.phase,
CASE p.phase
WHEN 'initializing' THEN '1 of 12'
WHEN 'waiting for writers before build' THEN '2 of 12'
WHEN 'building index: scanning table' THEN '3 of 12'
WHEN 'building index: sorting live tuples' THEN '4 of 12'
WHEN 'building index: loading tuples in tree' THEN '5 of 12'
WHEN 'waiting for writers before validation' THEN '6 of 12'
WHEN 'index validation: scanning index' THEN '7 of 12'
WHEN 'index validation: sorting tuples' THEN '8 of 12'
WHEN 'index validation: scanning table' THEN '9 of 12'
WHEN 'waiting for old snapshots' THEN '10 of 12'
WHEN 'waiting for readers before marking dead' THEN '11 of 12'
WHEN 'waiting for readers before dropping' THEN '12 of 12'
END AS phase_progress,
format(
'%s (%s of %s)',
coalesce(round(100.0 * p.blocks_done / nullif(p.blocks_total, 0), 2)::text || '%', 'not applicable'),
p.blocks_done::text,
p.blocks_total::text
) AS scan_progress,
format(
'%s (%s of %s)',
coalesce(round(100.0 * p.tuples_done / nullif(p.tuples_total, 0), 2)::text || '%', 'not applicable'),
p.tuples_done::text,
p.tuples_total::text
) AS tuples_loading_progress,
format(
'%s (%s of %s)',
coalesce((100 * p.lockers_done / nullif(p.lockers_total, 0))::text || '%', 'not applicable'),
p.lockers_done::text,
p.lockers_total::text
) AS lockers_progress,
format(
'%s (%s of %s)',
coalesce((100 * p.partitions_done / nullif(p.partitions_total, 0))::text || '%', 'not applicable'),
p.partitions_done::text,
p.partitions_total::text
) AS partitions_progress,
p.current_locker_pid,
trim(trailing ';' from l.query) AS current_locker_query
FROM pg_stat_progress_create_index AS p
JOIN pg_stat_activity AS a ON a.pid = p.pid
LEFT JOIN pg_stat_activity AS l ON l.pid = p.current_locker_pid
ORDER BY clock_timestamp() - a.xact_start DESC;
https://gitlab.com/-/snippets/2138417
select
now(),
query_start as started_at,
now() - query_start as query_duration,
format('[%s] %s', a.pid, a.query) as pid_and_query,
index_relid::regclass as index_name,
relid::regclass as table_name,
(pg_size_pretty(pg_relation_size(relid))) as table_size,
nullif(wait_event_type, '') || ': ' || wait_event as wait_type_and_event,
phase,
format(
'%s (%s of %s)',
coalesce((round(100 * blocks_done::numeric / nullif(blocks_total, 0), 2))::text || '%', 'N/A'),
coalesce(blocks_done::text, '?'),
coalesce(blocks_total::text, '?')
) as blocks_progress,
format(
'%s (%s of %s)',
coalesce((round(100 * tuples_done::numeric / nullif(tuples_total, 0), 2))::text || '%', 'N/A'),
coalesce(tuples_done::text, '?'),
coalesce(tuples_total::text, '?')
) as tuples_progress,
current_locker_pid,
(select nullif(left(query, 150), '') || '...' from pg_stat_activity a where a.pid = current_locker_pid) as current_locker_query,
format(
'%s (%s of %s)',
coalesce((round(100 * lockers_done::numeric / nullif(lockers_total, 0), 2))::text || '%', 'N/A'),
coalesce(lockers_done::text, '?'),
coalesce(lockers_total::text, '?')
) as lockers_progress,
format(
'%s (%s of %s)',
coalesce((round(100 * partitions_done::numeric / nullif(partitions_total, 0), 2))::text || '%', 'N/A'),
coalesce(partitions_done::text, '?'),
coalesce(partitions_total::text, '?')
) as partitions_progress,
(
select
format(
'%s (%s of %s)',
coalesce((round(100 * n_dead_tup::numeric / nullif(reltuples::numeric, 0), 2))::text || '%', 'N/A'),
coalesce(n_dead_tup::text, '?'),
coalesce(reltuples::int8::text, '?')
)
from pg_stat_all_tables t, pg_class tc
where t.relid = p.relid and tc.oid = p.relid
) as table_dead_tuples
from pg_stat_progress_create_index p
left join pg_stat_activity a on a.pid = p.pid
order by p.index_relid
; -- in psql, use "\watch 5" instead of semicolon
Demo for
REINDEX TABLE CONCURRENTLY: