PanLex: Testing AWS

Test preparation

To make the EC2-based database easily accessible for testing, modify the PanLem interface to use it instead of the same-server database. To accomplish this, make the following changes in the opt/www/main/cgi-bin/plxu.cgi file and save the new version as opt/www/main/cgi-bin/plxec2u.cgi (the password is redacted below):

66c66
< $act = '/u';
---
> $act = '/cgi-bin/plxec2u.cgi';
95c95
< &SRImp ($in{sr}, 'plx');
---
> &SRImp ($in{sr}, 'plx', 'db.panlex.net');
97,98c97,98
< # The pg_hba.conf and pg_ident.conf configurations in the PostgreSQL data directory permit
< # “apache” to connect as “apache” via a Unix-domain socket.
---
> # The pg_hba.conf configuration in the PostgreSQL data directory permits any login-capable
> # role to connect with a valid password.
4456a4457
> #    2: database host.
4467c4468,4469
<             "dbi:Pg:dbname=$_[1];port=5432", '', '', { (AutoCommit => 0), (pg_enable_utf8 => 1) }
---
>             "dbi:Pg:dbname=$_[1];host=$_[2];port=5432", 'apache', '?????????',
>             { (AutoCommit => 0), (pg_enable_utf8 => 1) }
4469c4471
<         # Specify & connect to the PostgreSQL 9.0.1 database as “apache”, with AutoCommit off
---
>         # Specify & connect to the PostgreSQL 9.1.5 database as “apache”, with AutoCommit off
5632c5634
<         . '<link rel="stylesheet" type="text/css" href="/plxu-styles.css" title="styles" />' . "\n"
---
>         . '<link rel="stylesheet" type="text/css" href="/plxec2u-styles.css" title="styles" />' . "\n"
5644c5646
<     $ret .= ('<p class="id">' . "[PanLem:$in{sr}w]<br>[$in{pgv}]</p>\n\n");
---
>     $ret .= ('<p class="id">' . "[PanLem-α:$in{sr}w]<br>[$in{pgv}]</p>\n\n");

Copy the CSS style file opt/www/main/html/panlex.org/plxu-styles.css to opt/www/main/html/panlex.org/plxec2u-styles.css and modify the latter as follows in order to make it obvious when one is using the test version of PanLex:

2c2
< 	body { margin: 24pt; background-color: #eeeedd; }
---
> 	body { margin: 24pt; background-color: #ffccff; }

Test conditions

Access the test version of PanLex with the URL http://panlex.org/cgi-bin/plxec2u.cgi. Perform each test twice in immediate succession, and report the measurement for the second (hot) run in each case.
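The cold/hot protocol can be scripted rather than timed by hand. The following is a minimal sketch in Python, assuming a hypothetical callable `run_query` that stands in for one submission through the interface. The function name `time_cold_and_hot` is also invented here, and the measurements reported below were not necessarily taken this way.

```python
import time
from typing import Callable, Tuple


def time_cold_and_hot(run_query: Callable[[], object]) -> Tuple[float, float]:
    """Run the same query twice in immediate succession.

    Returns (cold_seconds, hot_seconds); the hot figure is the one to report.
    """
    timings = []
    for _ in range(2):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings[0], timings[1]


if __name__ == "__main__":
    # Stand-in workload (hypothetical); replace it with a real request
    # to the test URL to reproduce the protocol.
    cold, hot = time_cold_and_hot(lambda: sum(range(100_000)))
    print(f"cold: {cold:.3f} s, hot (reported): {hot:.3f} s")
```

Timing both runs, even though only the second is reported, keeps the cold figure available for comparisons of cold and hot behavior.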

Test 0

In the PanLem interface’s translation-search feature, seek expressions in all language varieties with a partial exact match to “runn” (in state “exviz3”). Time from submission to display of the matching expressions:

Local server: 8 seconds
EC2 Micro instance with postgresql-ec2micro.conf: 30 seconds
EC2 Micro instance with postgresql-ec2micro-sb.conf: 27 seconds
EC2 Micro instance with postgresql-ec2micro-sbx.conf: 17 seconds
EC2 Micro instance with postgresql-ec2micro-sbx-rpc.conf: 17 seconds
EC2 Small instance with postgresql-ec2micro.conf: 27 seconds
EC2 Small instance with postgresql-ec2small.conf: 28 seconds
EC2 Small instance with postgresql-ec2small-sb.conf: 28 seconds
EC2 High-CPU Medium instance with postgresql-ec2small.conf: 18 seconds
EC2 Large instance with postgresql-ec2small.conf: 11 seconds
EC2 Large instance with postgresql-ec2large.conf: 11 seconds
EC2 High-CPU Extra Large instance with postgresql-ec2large.conf: 10 seconds
EC2 High-Memory Extra Large instance with postgresql-ec2large.conf: 9 seconds
EC2 High-Memory Extra Large instance with postgresql-ec2hmxlarge.conf: 9 seconds
EC2 High-Memory Double Extra Large instance with postgresql-ec2hmxlarge-rpc.conf: 8 seconds
EC2 High-Memory Double Extra Large instance with postgresql-ec2hmdxlarge-rpc.conf: 9 seconds

Test 1

In the PanLem interface’s language-variety information feature, get a tabulation of the counts of the characters in cmn-000. Time from submission to first line of the tabulation:

Local server: 7 seconds
EC2 Micro instance with postgresql-ec2micro.conf: 32 seconds
EC2 Micro instance with postgresql-ec2micro-rpc.conf: 38 seconds
EC2 Micro instance with postgresql-ec2micro-sb.conf: 29 seconds
EC2 Small instance with postgresql-ec2micro.conf: 27 seconds
EC2 Small instance with postgresql-ec2small.conf: 32 seconds
EC2 Small instance with postgresql-ec2small-sb.conf: 30 seconds
EC2 High-CPU Medium instance with postgresql-ec2small.conf: 26 seconds
EC2 Large instance with postgresql-ec2small.conf: 24 seconds
EC2 Large instance with postgresql-ec2large.conf: 29 seconds
EC2 High-CPU Extra Large instance with postgresql-ec2large.conf: 72 seconds
EC2 High-Memory Extra Large instance with postgresql-ec2large.conf: 26 seconds
EC2 High-Memory Extra Large instance with postgresql-ec2hmxlarge.conf: 30 seconds

Test 2

In the PanLem interface’s translation-search feature, get a list of the best indirect translations and their scores from “roof” in eng-000 into fra-000. Time from submission to first line of tabulations:

Local server: 3 seconds
EC2 Micro instance with postgresql-ec2micro.conf: 198 seconds
EC2 Micro instance with postgresql-ec2micro-rpc.conf: 189 seconds
EC2 Micro instance with postgresql-ec2micro-sb.conf: 205 seconds
EC2 Micro instance with postgresql-ec2micro-sb-rpc.conf: 600+ seconds
EC2 Small instance with postgresql-ec2micro.conf: 30 seconds
EC2 Small instance with postgresql-ec2small.conf: 33 seconds
EC2 Small instance with postgresql-ec2small-sb.conf: 40 seconds
EC2 High-CPU Medium instance with postgresql-ec2small.conf: 13 seconds
EC2 Large instance with postgresql-ec2small.conf: 11 seconds
EC2 Large instance with postgresql-ec2large.conf: 5 seconds
EC2 High-CPU Extra Large instance with postgresql-ec2large.conf: 5 seconds
EC2 High-Memory Extra Large instance with postgresql-ec2large.conf: 4 seconds
EC2 High-Memory Extra Large instance with postgresql-ec2hmxlarge.conf: 5 seconds
EC2 High-Memory Double Extra Large instance with postgresql-ec2hmxlarge-rpc.conf: 4 seconds
EC2 High-Memory Double Extra Large instance with postgresql-ec2hmdxlarge-rpc.conf: 4 seconds

Test 3

In the PanLem interface’s translation-search feature, get information about the cmn-000 language variety.

Local server: 3 seconds
EC2 High-Memory Double Extra Large instance with postgresql-ec2hmxlarge-rpc.conf: 8 seconds
EC2 High-Memory Double Extra Large instance with postgresql-ec2hmdxlarge-rpc.conf: 8 seconds

Test interpretation

Two of the above tests, Test 0 and Test 2, showed better performance on higher-powered instances, whether the added power came as memory or as CPU units. One test, Test 1, resisted any such improvement.

For the tests that responded to repowering, the Large instance reached about 2/3 of the local server’s speed, and the High-CPU Extra Large and High-Memory Extra Large instances didn’t produce substantial further improvement.
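The “about 2/3” figure can be checked against the hot timings listed for Test 0 and Test 2. The sketch below is one way to do the arithmetic in Python, assuming that speed is taken as the inverse of elapsed time and that the Large instance is represented by its postgresql-ec2large.conf results; the names `LOCAL`, `LARGE`, and `relative_speed` are introduced here for illustration.

```python
# Hot timings in seconds, copied from the Test 0 and Test 2 lists above.
LOCAL = {"Test 0": 8, "Test 2": 3}
LARGE = {"Test 0": 11, "Test 2": 5}  # Large instance, postgresql-ec2large.conf


def relative_speed(local_s: float, remote_s: float) -> float:
    """Speed of the remote host as a fraction of the local server's speed."""
    return local_s / remote_s


ratios = {test: relative_speed(LOCAL[test], LARGE[test]) for test in LOCAL}
mean_ratio = sum(ratios.values()) / len(ratios)

for test, ratio in ratios.items():
    print(f"{test}: {ratio:.2f}")
print(f"mean: {mean_ratio:.2f}")
```

This gives 0.73 for Test 0 and 0.60 for Test 2, averaging about 0.66.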

The Micro instance’s performance was erratic, and some results were much worse than those documented above. This is apparently to be expected, because the CPU power provided on Micro instances varies in ways that are not published. The documentation suggests that a CPU-intensive process may run faster than normal if preceded by a long idle period, but may be throttled to a small fraction of normal speed if immediately preceded by a substantial CPU-intensive load.

During this testing, the Micro instance did not reveal any inability to process queries submitted through the PanLem interface (e.g., any out-of-memory errors), and it handled simple navigation-type queries tolerably fast.

Increasing the Micro instance’s default 20-MB shared-buffer size may have helped on Test 0, but may have hurt on Test 2 (erratic behavior makes any conclusion tentative).

It isn’t clear whether decreasing the usual random_page_cost value helped or hurt.

Inspection of the code that Test 1 executes seems to reveal why Test 1 did not improve with repowering. Test 1 causes a massive export of data from the database server to the web server for intermediate processing, and the user never sees that data. This design originated when the two servers coexisted on the same physical host. During these tests on AWS EC2, however, the web server issued fetchrow_array calls to the database server and received the output over the public Internet. If we redesign this feature on the assumption that transfers of data between the servers are expensive, or redesign the server environment so that the two servers are again tightly coupled, it may be possible to make Test 1 sensitive to instance repowering, like the other tests.
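The difference between the two designs can be sketched with a small, self-contained example. It uses Python’s built-in sqlite3 module as a stand-in for PostgreSQL, and the table `ex` with its column `tt` is hypothetical; this is not the PanLem code, only an illustration of tabulating on the client versus tabulating inside the database.

```python
import sqlite3
from collections import Counter
from typing import Dict, Tuple

conn = sqlite3.connect(":memory:")
# Hypothetical table of expression texts for one language variety.
conn.execute("CREATE TABLE ex (tt TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO ex (tt) VALUES (?)",
    [("屋顶",), ("屋子",), ("顶",), ("子",), ("屋",)],
)


def count_on_client(db: sqlite3.Connection) -> Tuple[Dict[str, int], int]:
    """Ship every row to the client and count characters there."""
    counts: Counter = Counter()
    rows_transferred = 0
    for (text,) in db.execute("SELECT tt FROM ex"):
        rows_transferred += 1
        counts.update(text)  # adds one count per character
    return dict(counts), rows_transferred


def count_in_database(db: sqlite3.Connection) -> Tuple[Dict[str, int], int]:
    """Count characters inside the database and ship only the tabulation."""
    sql = """
        WITH RECURSIVE chars(ch, rest) AS (
            SELECT substr(tt, 1, 1), substr(tt, 2) FROM ex WHERE tt <> ''
            UNION ALL
            SELECT substr(rest, 1, 1), substr(rest, 2) FROM chars WHERE rest <> ''
        )
        SELECT ch, count(*) FROM chars GROUP BY ch
    """
    rows = db.execute(sql).fetchall()
    return dict(rows), len(rows)


client_counts, client_rows = count_on_client(conn)
db_counts, db_rows = count_in_database(conn)
print(client_counts == db_counts, client_rows, db_rows)
```

Both functions return the same tabulation, but the second transfers one row per distinct character instead of one row per expression. With an in-memory database the transfer is free; the saving matters only when the rows cross a network.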

Retest 1

This retest follows the redesign of the procedure executed in Test 1 (implemented in state file “lvviz2w.pl” of version 2.8 of PanLem). As in Test 1, in the language-variety information feature, get a tabulation of the counts of the characters in cmn-000. (The redesigned procedure completes the tabulation within the database and exports the result to the web server. This increases the time to the beginning of the output, but decreases the time to the end of the output.) Times from submission to the first line of the tabulation:

Local server: 25 seconds
EC2 Micro instance with postgresql-ec2micro-rpc.conf: 71 seconds
EC2 Large instance with postgresql-ec2large-rpc.conf: 24 seconds
EC2 High-CPU Extra Large instance with postgresql-ec2large-rpc.conf: 23 seconds
EC2 High-Memory Extra Large instance with postgresql-ec2large-rpc.conf: 20 seconds
EC2 High-Memory Extra Large instance with postgresql-ec2hmxlarge-rpc.conf: 20 seconds
EC2 High-Memory Double Extra Large instance with postgresql-ec2hmxlarge-rpc.conf: 19 seconds
EC2 High-Memory Double Extra Large instance with postgresql-ec2hmdxlarge-rpc.conf: 19 seconds

Conclusions

Moving the PanLex database to an AWS EC2 Large, High-Memory Extra Large, or High-Memory Double Extra Large instance appears to produce performance sometimes equaling, sometimes almost equaling, and sometimes beating that of the current local server.

The instance types named above appear to be worth considering as PanLex hosts. Their prices on a 3-year plan (excluding transaction costs) are:

Large: $856/year
High-Memory Extra Large: $1130/year
High-Memory Double Extra Large: $2261/year
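One rough way to set these prices against the timings reported above is to compare each type’s price ratio with its speed-up over the Large instance. The sketch below does that in Python. The abbreviated labels and the choice of the fastest reported configuration per instance type are simplifications made here, not part of the original analysis.

```python
from typing import Dict, Tuple

# Yearly prices in USD, copied from the list above.
PRICE = {"Large": 856, "HM XL": 1130, "HM 2XL": 2261}

# Fastest hot time in seconds per instance type, copied from the
# Test 0, Test 2, and Retest 1 lists above.
HOT = {
    "Large": {"Test 0": 11, "Test 2": 5, "Retest 1": 24},
    "HM XL": {"Test 0": 9, "Test 2": 4, "Retest 1": 20},
    "HM 2XL": {"Test 0": 8, "Test 2": 4, "Retest 1": 19},
}


def compare(baseline: str, other: str) -> Tuple[float, Dict[str, float]]:
    """Return (price ratio, speed-up per test) of `other` relative to `baseline`."""
    price_ratio = PRICE[other] / PRICE[baseline]
    speedups = {
        test: HOT[baseline][test] / HOT[other][test] for test in HOT[baseline]
    }
    return price_ratio, speedups


for name in ("HM XL", "HM 2XL"):
    price_ratio, speedups = compare("Large", name)
    best = max(speedups.values())
    print(f"{name}: price x{price_ratio:.2f}, best speed-up x{best:.2f}")
```

For both larger types the price ratio exceeds the best speed-up seen in any of the three tests.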

Intuitively, the added performance of the more powerful instance types does not appear sufficient to justify their added costs. It seems likely that a Large instance could execute most queries fast enough for use by PanLex’s development team. For painfully slow retrieval queries, the cheapest route to satisfactory speed would probably be to use trigger functions to precompute derivative tables that reduce the complexity of expensive queries, at the cost of enlarging the storage consumed by the database.
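The precomputation idea can be illustrated with triggers that keep a summary table current. The sketch below uses Python’s built-in sqlite3 module as a stand-in for PostgreSQL, whose trigger functions are written differently. The tables `expr` and `lv_count` and their columns are hypothetical and are not taken from the PanLex schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical base table: one row per expression in a language variety.
    CREATE TABLE expr (id INTEGER PRIMARY KEY, lv TEXT NOT NULL, txt TEXT NOT NULL);
    -- Derivative table: expression count per language variety.
    CREATE TABLE lv_count (lv TEXT PRIMARY KEY, n INTEGER NOT NULL);

    CREATE TRIGGER expr_after_insert AFTER INSERT ON expr BEGIN
        INSERT OR IGNORE INTO lv_count (lv, n) VALUES (NEW.lv, 0);
        UPDATE lv_count SET n = n + 1 WHERE lv = NEW.lv;
    END;

    CREATE TRIGGER expr_after_delete AFTER DELETE ON expr BEGIN
        UPDATE lv_count SET n = n - 1 WHERE lv = OLD.lv;
    END;
    """
)

conn.executemany(
    "INSERT INTO expr (lv, txt) VALUES (?, ?)",
    [("eng-000", "roof"), ("eng-000", "set"), ("fra-000", "toit")],
)
conn.execute("DELETE FROM expr WHERE txt = 'set'")


def precomputed_count(lv: str) -> int:
    """Cheap lookup in the trigger-maintained derivative table."""
    row = conn.execute("SELECT n FROM lv_count WHERE lv = ?", (lv,)).fetchone()
    return row[0] if row else 0


def recomputed_count(lv: str) -> int:
    """Aggregate over the base table, recomputed on every call."""
    return conn.execute(
        "SELECT count(*) FROM expr WHERE lv = ?", (lv,)
    ).fetchone()[0]


print(precomputed_count("eng-000"), recomputed_count("eng-000"))
```

The derivative table costs extra storage and a little work on every write, and in exchange the read becomes a single-row lookup.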

Some performance differences seemed to contradict predictions. In particular, it had been expected that, with enough memory, a daily dump of tables and indexes would keep them in memory and optimize (or at least speed) all queries by eliminating disk access. We might understand this prediction to imply that, after such dumps, there wouldn’t be any difference between cold and hot query performance. That implication, however, turned out to be false.

An example is a query for the 10 best indirect translations of the eng-000 expression “set” into fra-000. On the High-Memory Double Extra Large instance, after the dumps had been performed, the first (cold) execution of this query took 88 seconds, and the second (hot) one took 25 seconds. Thus, some performance-enhancing action takes place during the first execution beyond what the dumps do. A still-unmet challenge, then, is to create hot conditions at all times.

Since 25 seconds for such a query is still unsatisfactory, another challenge is to improve further on hot performance. This challenge exists not only on EC2, but also on the local server. Immediately after the large tables were dumped and the large indexes were cached, this same query was run 4 times in succession, and its execution times were even longer, but likewise became much shorter with repetition: 116, 64, 35, and 35 seconds, in that order. An “analyze” operation was performed on the database immediately thereafter, followed by 4 more executions of the query, which took 44, 55, 78, and 36 seconds, in that order. Then “cluster” and “analyze” were performed, and thereafter two query executions took 33 and 31 seconds, in that order.

This evaluation didn’t reveal any benefit from the decision to store the database on an EBS volume separate from the volume housing the operating system and applications. It could be simpler to store them on a single volume. However, situations might arise that militate against that, such as a need for multiple alternative operating systems or versions of PostgreSQL. For example, if we packaged our instance as an AMI to give others access to the database and they wanted to operate their copy on a different operating system, or if we wanted to minimize the interruption during a PostgreSQL upgrade by creating a new instance with the new version and moving or restoring the database to that instance after it was fully operational, the two-volume strategy might be optimal.