
PanLex: Quality control

PanLem-based approver post-editing

In version 2.8 of PanLem, a system for post-editing of approver data was implemented. It was invoked in state dosrcv2 (“file—receive”). There one format option was “text—edit”. Choosing this option led to states dosrcv5 and dosrcv6, which produced a copy of the approver data in text format annotated for editorial inspection and modification.

The system relied in part on a table, “dfp”, listing every definition that contained parentheses, together with its text after the parenthesis characters were removed. If the part of such a definition outside the parentheses was identical to a well-attested expression in the same variety, that expression was nominated as a replacement for any expression identical to the deparenthesized text of the definition. The definition of table “dfp” was:

                           Table "public.dfp"
 Column |  Type   | Modifiers | Storage  |          Description          
--------+---------+-----------+----------+-------------------------------
 lv     | integer | not null  | plain    | variety
 tt     | text    | not null  | extended | text
 td     | text    | not null  | extended | text with parentheses removed
Indexes:
    "dfp_pkey" PRIMARY KEY, btree (lv, tt)

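As an illustration only, the nomination rule described above can be sketched in Python. The function names, the sample data, and the representation of "well-attested" as a precomputed set of (variety, text) pairs are assumptions made for this sketch; the source does not say how attestation was measured against a threshold, and this is not PanLem's own code.

```python
import re


def strip_paren_chars(text):
    """Remove the parenthesis characters only, as in the td column of dfp."""
    return re.sub(r"[()]", "", text)


def unenclosed_part(text):
    """Drop parenthesized segments and collapse whitespace.

    Assumption: this is one plausible reading of "the part of the
    definition outside the parentheses"; nested parentheses are not handled.
    """
    outside = re.sub(r"\([^()]*\)", " ", text)
    return " ".join(outside.split())


def nominate(definitions, well_attested):
    """Return {(variety, deparenthesized text): nominated expression text}.

    definitions: iterable of (variety, definition text) pairs.
    well_attested: set of (variety, expression text) pairs (hypothetical input).
    """
    nominations = {}
    for lv, tt in definitions:
        # Only definitions containing a parenthesis character qualify.
        if "(" not in tt and ")" not in tt:
            continue
        candidate = unenclosed_part(tt)
        if (lv, candidate) in well_attested:
            nominations[(lv, strip_paren_chars(tt))] = candidate
    return nominations


# Hypothetical data: variety 1 has a well-attested expression "dog".
sample_definitions = [(1, "dog (animal)"), (1, "cat"), (2, "dog (animal)")]
sample_attested = {(1, "dog")}
print(nominate(sample_definitions, sample_attested))
```

With this sample data, an expression "dog animal" in variety 1 would be nominated for replacement by "dog"; nothing is nominated in variety 2, because "dog" is not in the attested set for that variety.
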
The system also relied on a table, “exq”, tabulating the sums of the qualities of the approvers of denotations of expressions. The definition of table “exq” was:

                                    Table "public.exq"
 Column |  Type   | Modifiers | Storage |                   Description                   
--------+---------+-----------+---------+-------------------------------------------------
 ex     | integer | not null  | plain   | expression
 q      | integer | not null  | plain   | sum of qualities of approvers of the expression
Indexes:
    "exq_pkey" PRIMARY KEY, btree (ex)

The “dfp” and “exq” tables were not updated automatically. Instead, a function, “edw”, was run to repopulate them. Its definition was:

create or replace function edw ()
returns void language plpgsql as $$
declare
begin
-- Create a table of approvers’ expressions: one row per
-- distinct (approver, expression) pair.
create temporary table apex on commit drop as
select distinct mn.ap, dn.ex from mn, dn
where dn.mn = mn.mn;
alter table apex add primary key (ap, ex);
-- Repopulate the table of evaluated expressions. The primary
-- key is dropped before the insertion and re-created after it.
truncate exq;
alter table exq drop constraint exq_pkey;
insert into exq
select apex.ex, sum (uq) as q from apex, ap
where ap.ap = apex.ap
group by apex.ex
order by apex.ex;
alter table exq add primary key (ex);
-- Repopulate the table of definitions with
-- parentheses. Only the parenthesis characters are
-- removed; the text they enclose is kept.
truncate dfp;
alter table dfp drop constraint dfp_pkey;
insert into dfp
select distinct lv, tt,
regexp_replace (df.tt, '[()]', '', 'g') as td
from df
where tt similar to '%[()]%'
order by lv, tt;
alter table dfp add primary key (lv, tt);
end;
$$;

Its comment was: “Act: Repopulate derivative tables exq and dfp required for the production of editorial text files in dosrcv6w.”
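
The “exq” half of “edw” can be read as a two-step aggregation: collect the distinct (approver, expression) pairs, then sum the approvers' qualities per expression. A minimal Python sketch of that reading follows. The function name and the sample data are hypothetical; the column names ap, ex, and uq are taken from the function above, and the in-memory data structures are assumptions rather than the PanLex schema.

```python
from collections import defaultdict


def sum_approver_qualities(denotations, approver_quality):
    """Return {expression: sum of qualities of its distinct approvers}.

    denotations: iterable of (ap, ex) pairs, possibly with repeats.
    approver_quality: dict mapping ap to its quality (uq).
    """
    # Mirror "select distinct mn.ap, dn.ex": count each approver once
    # per expression, however many denotations it contributed.
    distinct_pairs = set(denotations)
    totals = defaultdict(int)
    for ap, ex in distinct_pairs:
        # Mirror the inner join on ap: approvers with no quality row drop out.
        if ap in approver_quality:
            totals[ex] += approver_quality[ap]
    return dict(sorted(totals.items()))


# Hypothetical data: approver 1 attests expression 100 twice.
sample_denotations = [(1, 100), (1, 100), (2, 100), (2, 200)]
sample_quality = {1: 5, 2: 3}
print(sum_approver_qualities(sample_denotations, sample_quality))
```

In this sample, expression 100 receives 5 + 3 = 8 and expression 200 receives 3; the repeated pair (1, 100) is counted once.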

This system did not attain persistent editorial use. In February 02013 the system was removed from PanLem 2.9. The code could be retrieved from version 2.8 as a basis for a reimplementation.
