logo

PanLex: Configuring a database cluster on AWS

Introduction

Before we copy the database into a PostgreSQL cluster on an AWS instance, configure the cluster as follows.

Installing PostgreSQL has created a database cluster named “main”.

Configuration files

Introduction

Create the following files in the /home/ubuntu/pgsql/custom directory.

postgresql-template.conf

##### Parameters invariant with instance type #####

# See /home/ubuntu/pgsql/custom/postgresql-std.conf for embedded documentation and other options.

authentication_timeout = 15s
# Per J Berkus, 2009-10-13 lecture.

autovacuum = off
# Per J Berkus, 2010-12-02 message.
# When autovacuum was on, other autovacuum settings were:
# autovacuum_analyze_scale_factor = 0.05
# autovacuum_naptime = 10800
# autovacuum_vacuum_scale_factor = 0.2

checkpoint_segments = 32
# 16 per J Berkus, 2009-10-13 lecture.
# Increased because of occasional log warnings of checkpoint intervals about 15 seconds.

constraint_exclusion = off
# Per J Berkus, 2009-10-13 lecture.

cpu_index_tuple_cost = 0.001
# Per J Berkus, 2009-10-13 lecture.

cpu_operator_cost = 0.0005
# Per J Berkus, 2009-10-13 lecture.

data_directory = '/var/lib/postgresql/9.1/main'
# Per Ubuntu postgresql.conf

datestyle = 'iso, ymd'

escape_string_warning = 0
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
listen_addresses = '*'
log_destination = 'syslog'
log_error_verbosity = default
log_min_duration_statement = 10000
log_min_error_statement = warning
log_line_prefix = '%h '

max_connections = 60
max_prepared_transactions = 3
max_stack_depth = 2MB

random_page_cost = 1.1
# Per Christophe Pettus, <a href="http://thebuild.com/presentations/pg-aws.pdf">2012 lecture

search_path = 'public, hstore'

# statement_timeout = 60min
# Per J Berkus, 2009-10-13 lecture.
# Apparently caused function execution to abort.

synchronous_commit = off
# Per J Berkus, 2009-10-13 lecture.

syslog_ident = 'pg'

track_functions = pl
# Per J Berkus, 2009-10-13 lecture.

unix_socket_directory = '/var/run/postgresql'
# Per Ubuntu postgresql.conf

vacuum_cost_delay = 0
# VCD = 1000ms slowed vacuuming enormously.
# J Berkus, 2009-10-13 lecture, recommends VCD less than 100ms.

wal_buffers = 8MB
# Per J Berkus, 2009-10-13 lecture.

##### Parameters dependent on instance memory #####

effective_cache_size = %%ecs%%MB
# Normally 0.667 of RAM.
# J Berkus, 2009-10-13 lecture, recommends 0.75.

maintenance_work_mem = %%mwm%%MB
# Normally 0.0625 of RAM.
# Can be temporarily doubled to diagnose out-of-memory failures.

shared_buffers = %%sb%%MB
# Requires appropriate values of SHMMAX and SHMALL be set.
# Normally 0.25 to 0.5 of RAM.

work_mem = %%wm%%MB

pg_hba-ec2.conf

local all postgres peer
local all all md5
host all all 0.0.0.0/0 md5

postgresql-shm-template.conf

# Shared memory settings for PostgreSQL

# Note that if another program uses shared memory as well, you will have to
# coordinate the size settings between the two.

# Maximum size of shared memory segment in bytes
kernel.shmmax = %%max%%

# Maximum total size of shared memory in pages (normally 4096 bytes)
kernel.shmall = %%all%%

pgresize.pl

#!/usr/bin/env perl

# pgresize.pl
# Reconfigures the PostgreSQL configuration and the PostgreSQL and kernel memory
# parameters to conform to the current host’s memory.

# The user must execute this program as root in order for the program to perform correctly.

# This program may not be necessary, or may do harm, with versions of PostgreSQL after 9.2.

use warnings 'FATAL', 'all';
# Make every warning fatal.

use strict;
# Require strict checking of variable references, etc.

use utf8;
# Make Perl interpret the script as UTF-8.

binmode STDOUT, ':encoding(utf8)';
binmode STDERR, ':encoding(utf8)';
# make STDOUT and STDERR print in UTF-8.

##################

my $pgvn = '9.1';
# Identify the version of PostgreSQL running on the host.

##################

my $psz = `getconf PAGE_SIZE`;
# Identify the host’s memory page size in bytes.

my $memp = `getconf _PHYS_PAGES`;
# Identify the host’s memory in pages.

my $mem = (int (($psz * $memp) / 1048576));
# Identify it in megabytes.

my $parecs = (int ($mem * 3 / 4));
# Identify the appropriate effective cache size.

my $parmwm = (int ($mem / 16));
# Identify the appropriate maintenance work memory.

my $parsb = (int ($mem * (0.45 - (3500 / (10000 + $mem)))));
# Identify the appropriate shared buffer size as a fraction of memory that is at least 15%
# and asymptotically approaches 45% as memory increases toward infinity.

my $parwm = (int ($mem / 12));
# Identify an appropriate work-memory size, based on the assumption that there will normally
# be at most 3 simultaneous database connections.

(open my $tplcfg, '<:encoding(utf8)', 'postgresql-template.conf') || (die $!);
# Open the template PostgreSQL configuration file for reading.

my $cfg = (join '', <$tplcfg>);
# Identify its content as a concatenation of lines.

close $tplcfg;
# Close it.

$cfg =~ s/%%ecs%%/$parecs/;
$cfg =~ s/%%mwm%%/$parmwm/;
$cfg =~ s/%%sb%%/$parsb/;
$cfg =~ s/%%wm%%/$parwm/;
# Add the identified parameters to the configuration file’s content.

(open my $runcfg, '>:encoding(utf8)', "/etc/postgresql/$pgvn/main/postgresql.conf") || (die $!);
# Open the running PostgreSQL configuration file for writing.

print $runcfg $cfg;
# Output the configuration file’s content to it, replacing its existing content.

close $runcfg;
# Close it.

my $shmall = (int ($mem * 1048576 / 2 / $psz));
# Identify the appropriate value (in pages) for the kernel SHMALL parameter.

my $shmmax = ($shmall * $psz);
# Identify the appropriate value (in bytes) for the kernel SHMMAX parameter.

(open my $tplshmcfg, '<:encoding(utf8)', 'postgresql-shm-template.conf') || (die $!);
# Open the template PostgreSQL shared-memory configuration file for reading.

my $shmcfg = (join '', <$tplshmcfg>);
# Identify its content as a concatenation of lines.

close $tplshmcfg;
# Close it.

$shmcfg =~ s/%%max%%/$shmmax/;
$shmcfg =~ s/%%all%%/$shmall/;
# Add the identified parameters to the configuration file’s content.

(open my $runshmcfg, '>:encoding(utf8)', "/etc/sysctl.d/30-postgresql-shm.conf") || (die $!);
# Open the running PostgreSQL shared-memory configuration file for writing.

print $runshmcfg $shmcfg;
# Output the configuration file’s content to it, replacing its existing content.

close $runshmcfg;
# Close it.

print `sysctl -w kernel.shmmax=$shmmax`;
# Change the running kernel SHMMAX parameter.

print `sysctl -w kernel.shmall=$shmall`;
# Change the running kernel SHMALL parameter.

print `service postgresql restart`;
# Restart PostgreSQL.

Configuration procedure

  1. Make the /home/ubuntu/pgsql/custom directory the current directory.
  2. Make ubuntu the owner and postgres the group of the files in that directory.
  3. Make 0770 the permission mode of pgresize.pl.
  4. Make 0660 the permission mode of the other files in the directory.
  5. Save the standard PostgreSQL main configuration file with the command sudo mv /etc/postgresql/v.v/main/postgresql.conf postgresql-std.conf (where you replace “v.v” with the PostgreSQL version).
  6. Save the standard PostgreSQL authorization configuration file with the command sudo mv /etc/postgresql/v.v/main/pg_hba.conf pg_hba-std.conf (replacing “v.v” as above).
  7. Reconfigure PostgreSQL and the kernel’s shared-memory parameters to conform to this instance with the command sudo ./pgresize.pl.

This last command should produce a display of the kernel shared-memory parameters and confirm that PostgreSQL was successfully restarted. If not, check the terminal window and /var/log/syslog for error messages.

Valid XHTML 1.1!