Recovering from a corrupt PostgreSQL Database
If symptoms point to a corrupted database (e.g. the database pod was killed without a graceful shutdown), the postgres pod will usually not be able to restart. NOTE: in rare cases the pod itself starts successfully, but the DB isn't able to come up.
In that case, find the identifier for the pod that is 'Crash looping', and oc debug into it.
oc project moe-gwells-<dev/test/prod>
oc get pods
oc debug postgresql-<identifier>
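The crash-looping pod can also be picked out of the pod listing by its STATUS column rather than by eye. A minimal sketch, assuming the default `oc get pods` column order (NAME, READY, STATUS, RESTARTS, AGE); the function name `crashlooping_pods` is hypothetical and not part of the GWELLS tooling:

```shell
#!/bin/sh
# Print the names of pods whose STATUS column reads CrashLoopBackOff.
# Reads the output of `oc get pods --no-headers` on stdin.
# Assumption: default column order NAME READY STATUS RESTARTS AGE.
crashlooping_pods() {
    awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Usage, against the currently selected project:
#   oc get pods --no-headers | crashlooping_pods
```

The name it prints is the one to pass to oc debug.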
In that debug shell,
cd /var/lib/pgsql/data
mv userdata userdata_broken
IMPORTANT: In the rare case where the DB pod is up, be sure to scale it down to zero before moving the userdata/ folder. Otherwise, the fresh DB may be corrupted by the DB pod attempting logging, checkpoints, etc. In this case, you must also exit the debug pod before restarting the DB pod, as you may otherwise run out of resources (i.e. the running debug pod and the running DB pod count as two running pods).
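The scale-down and scale-up around the move could be done from the command line roughly as follows. This is a sketch under assumptions: the database is taken to be managed by a DeploymentConfig named `postgresql` (check the real name with `oc get dc`) and to normally run a single replica. The `run`, `scale_db_down` and `scale_db_up` helpers and the `DRY_RUN` and `DB_DC` variables are hypothetical names introduced here.

```shell
#!/bin/sh
# Assumption: the DB is managed by a DeploymentConfig called "postgresql".
# Override DB_DC if `oc get dc` shows a different name.
DB_DC="${DB_DC:-dc/postgresql}"

# With DRY_RUN=1 the command is printed instead of executed,
# so the sequence can be reviewed before touching the cluster.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop the running DB pod before userdata is moved aside.
scale_db_down() {
    run oc scale "$DB_DC" --replicas=0
}

# Bring the DB pod back after the debug pod has been exited.
# Assumption: one replica is the normal size for this deployment.
scale_db_up() {
    run oc scale "$DB_DC" --replicas=1
}
```

Running `scale_db_down` with `DRY_RUN=1` set first prints the command that would be issued, which is a cheap check that the right deployment is being targeted.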
Re-deploy the postgres pod via the console (note that you may need to exit the debug shell if you hit the 'resource limit exceeded' message, which stops the re-deploy).
The database will create a new userdata folder and an empty set of database files. Once the fresh database is up, the gwells pod should re-deploy and run db-replicate.sh. If not (i.e. the database crashed hard, and the gwells pod is unaware that it needs to re-deploy and run the post-hook which starts db-replicate.sh), then you can open the terminal of the gwells pod and run db-replicate.sh manually.
Monitor the log output of the db-replicate.sh script to ensure the data replication completes.
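If the console terminal is not convenient, the manual run and the log monitoring can be sketched with the CLI. The assumptions here are that db-replicate.sh can be invoked by that bare name inside the gwells pod (adjust the path if it cannot) and that the output of interest appears in the log of the pod you name. The helper names `run_replication` and `follow_pod_log` and the `OC` variable are hypothetical.

```shell
#!/bin/sh
# Client binary to use; overridable so the helpers can be exercised without a cluster.
OC="${OC:-oc}"

# Run the replication script inside the given gwells pod.
# $1: pod name as shown by `oc get pods`.
# Assumption: db-replicate.sh resolves by that name inside the pod.
run_replication() {
    "$OC" rsh "$1" db-replicate.sh
}

# Stream the log of the given pod until interrupted (Ctrl-C).
# $1: pod name as shown by `oc get pods`.
follow_pod_log() {
    "$OC" logs -f "$1"
}
```

When the script is started through `run_replication`, its output streams straight to the terminal, so that output is what to watch until the replication finishes.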