Code Monkey home page Code Monkey logo

pg_reorg's Introduction

pg_reorg -- UNOFFICIAL REPOSITORY

This is NOT the official pg_reorg repository. Official development is currently on pgFoundry: http://pgfoundry.org/projects/reorg

This repository (and the url https://github.com/reorg) has been set up to provide greater visibility and easier contribution to the pg_reorg project, an intention apparent from a recent mailing list discussion.

The current pg_reorg maintainers have been be invited to the organization with administrative privileges, keeping total control of the project.


In this repository there are changesets fixing pg_reorg 1.1.7 bugs and shortcomings (in the maint_1.1 branch) and new development ideas (in the master branch and several feature branches). Not having heard recently from pg_reorg authors, we don't think we have the authority to distribute them as a new pg_reorg release. For this reason, the fork pg_repack is being developed.


This repository has been converted from CVS using the command:

git cvsimport -v -d :pserver:[email protected]:/cvsroot/reorg -A pg_reorg.auths -C pg_reorg -k -r cvs pg_reorg

with a suitably populated pg_reorg.auths.

I assume new CSV commits will be added to remotes/cvs/master, but I'm not sure yet, so please consider this repository unstable until the development model has been organized.

pg_reorg's People

Contributors

dvarrazzo avatar itgacky avatar schmiddy avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

pg_reorg's Issues

pg_reorg fails to run on a newly-promoted streaming replication slave

Originally reported by Jeff Frost as pg_reorg Bug #1011203
Forwarded as PostgreSQL bug #7572

Original Jeff's report follows:

I've seen this issue on two systems. The postgresql server in question was a 9.1.3 streaming replica which was brought up as a master.

pg_reorg starts fine, gets the initial locks, creates the trigger, then does the SELECT INTO, creates all the indexes,thencan't complete because it always finds a long running transaction.

If you go look in pg_locks, you find this:

  locktype  | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |     mode      | granted
------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------+---------
 virtualxid |          |          |      |       | 1/1        |               |         |       |          | -1/0               |  5232 | ExclusiveLock | t

If you look for that in pg_stat_activity, you won't find it:

select * from pg_stat_activity where procpid = 5232;
 datid | datname | procpid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | waiting | current_query 
-------+---------+---------+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+---------+---------------
(0 rows)

because that PID is actually the WAL Writer!

ps -ef|grep 5232
postgres  5232 10545  0 May28 ?        00:00:01 postgres: wal writer process

Drop support for PostgreSQL 8.2

It's been unsupported for a while and clutters up the codebase with #defines

A commit 370e572 is in master, I assume that's enough as cleanup. Maybe there should be a guard in one of the entry point functions and/or the binary.

Make pg_reorg available on pgxn

The necessary metadata is available in the 'extension' branch, see commit 61aa796.

Would be nice to have it for 1.1.8, as it's closely related to extension packaging (ticket #2).

Improve cleanup/error handling

A few gripes with our current error handling:

  • pgut_atexit_push() is called relatively late (after step 1 is complete), which is probably OK with current git master, but is a bit dodgy for the concurrent-ddl-guard changes
  • Instead of calling ROLLBACK unconditionally, we should use pgut_rollback() which checks PQtransactionStatus(conn)
  • Instead of using printf() plus exit() for reporting fatal error messages, an elog or ereport ... ERROR seems more appropriate
  • Errors encountered in reorg_one_table(), e.g. for conflicted triggers, shouldn't be treated as fatal -- pg_reorg might be operating on a whole database, and should be able to continue processing other tables

storage parameter on TOAST table gets lost

hi.

when using pg_reorg/pg_repack (same result) i recognized a missing storage parameter on my TOAST table after reorganization.

the table had set toast.autovacuum_vacuum_scale_factor to 0.1 (via ALTER TABLE mytable SET (toast.autovacuum_vacuum_scale_factor=0.1); )
when running pg_reorg/pg_repack on such table, the storage parameter for the TOAST table was gone.

used command was: pg_reorg -n -e -U postgres -d playground -t mytable

server-version: PostgreSQL 9.1.6
pg_reorg version 1.1.7

reorg_drop() with nonexistent OID causes crash

reorg_drop() will cause a backend crash when passed a table OID which does not exist, due to get_quoted_relname() and get_quoted_nspname() dereferencing a NULL pointer.

josh@[local] [test] # select reorg.reorg_drop(9999);
The connection to the server was lost. Attempting reset: Failed.

This is particularly bad, because even non-superusers who have USAGE on the reorg schema may execute reorg_drop() with a bogus argument, and get a crash before the must_be_superuser() check kicks in. I see the 1.1 milestone is closed, but perhaps the fix, which should be forthcoming shortly, should be shoehorned in.

Concurrent DDL guard?

It is generally unsafe to perform DDL on your table while it is being reorg'ed, except for VACUUM or ANALYZE. Although the primary benefit of pg_reorg is that it does not hold a high-level lock on the target table while it is being reorg'ed, this also leaves us with no hard and fast way to prevent unsafe DDL. Perhaps we could add in some sanity checks of pg_class, pg_tablespace, etc. before and after the reorg finishes, and throw an error if any unexpected changes in the table attributes are detected.

Support multiple --table arguments

It sure would be nice if pg_reorg could support multiple --table arguments. I realize that clusterdb, vacuumdb, reindexdb, and even pg_restore do not support multiple --table arguments (pg_dump being the odd man out), but this seems more like an oversight than an intentional design, per:
http://archives.postgresql.org/pgsql-hackers/2010-08/msg00689.php

I might try to propose a TODO or patch for a few of those in-core utilities first, so that there is some precedent for us.

pg_reorg may choose a partial index as primary key

There is no check in reorg.primary_keys that the index chosen is partial; specifically a partial index may be chosen in place of a total one, possibly resulting in data corruption because concurrent update/delete may affect the wrong records.

Functional indexes should be checked too.

Test case (not sure if 100% reproducible), the index returned should be partidx_data_idx:

contrib_regression=# create table partidx(id integer not null, data text not null);
CREATE TABLE
contrib_regression=# create unique index partidx_id_pidx1 on partidx(id) where (id <= 1);
CREATE INDEX
contrib_regression=# create unique index partidx_data_idx on partidx(data);
CREATE INDEX
contrib_regression=# create unique index partidx_id_pidx2 on partidx(id) where (id <= 2);
CREATE INDEX
contrib_regression=# insert into partidx values (1, 'a'), (2, 'b'), (3, 'c');
INSERT 0 3
contrib_regression=# select indexrelid::regclass from repack.primary_keys where indrelid = 'partidx'::regclass;
    indexrelid    
------------------
 partidx_id_pidx1
(1 row)

Compile warnings about appendStringInfoVA and appendStringInfoVA_s

I'd like to get rid of this compile warning:

pgut/pgut.c: In function ‘appendStringInfoVA’:
pgut/pgut.c:1174: warning: function might be possible candidate for ‘printf’ format attribute
...
pgut/pgut-spi.c: In function ‘appendStringInfoVA_s’:
pgut/pgut-spi.c:27: warning: function might be possible candidate for ‘printf’ format attribute

As described here:
http://lists.pgfoundry.org/pipermail/reorg-general/2012-September/000077.html

Consistency check between program and library

pg_reorg can run against a database where an older version of the extension is loaded, with undefined results.

I've started a branch to check for discrepancies between the executable and the library. I've committed a preliminary patch to clean up the errors handling in reorg_one_database, allowing it to return any error and making the output of the program more consistent between the --all and --dbname modes. Following patches address the version check.

The current check is between the pg_reorg binary and the library, as reorg.version() is implemented in C. This doesn't address the discrepancy between the binary/library and the sql schema, that is probably more likely (just make+make install in a db where pg_reorg was already installed and there it is).

Note that I'm working at this branch in my local repos: in case these patches are rejected I don't like the reorg repos to be cluttered with my experiments.

Add DURATION to debug logging....

'ello.

Would it be possible to add in either a BEGIN and END timestamp to various operations? For instance:

DEBUG: sql_pop        : DELETE FROM reorg.log_19533 WHERE id <= $1
DEBUG: ---- setup ----
DEBUG: ---- copy tuples ----
DEBUG: ---- create indexes ----
DEBUG: [0]
DEBUG: target_oid   : 20924
DEBUG: create_index : CREATE INDEX index_20924 ON reorg.table_19533 USING btree ("colA", colB)

How long did it take for the copy tuples step take? It'd be useful to know when that happened so that we could line up that particular event with other activities in the system. As is, we just have to blindly assume. Not a huge problem, but it would be a very big "nice to have". A duration would be nice, too.

Document exit status codes in wiki

Even just something as simple as 0 = success, non-0 = failure would be great. I spent 5 minutes peeking through the source and was not able to determine the exact values of the error codes and when/why they occur.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.