Below is a list of potential enhancements to the defunt utility,
including the code that supports it.

----------

Defunt can attempt to include an uncertified book.  That can be OK
interactively (provided its inclusion doesn't depend on portcullis
commands, for example to define packages), but it will cause
certification to fail.  It might be best to exclude uncertified books
somehow.

This would require (a) knowing which books are certified, and (b)
modifying the implementation of defunt to use that information to skip
functions that are associated with uncertified books.  Probably (b)
would be straightforward.

For (a), one possibility is to LD td-cands.acl2 to get all the
packages, and then read the certificate of every book mentioned in
td-cands.lisp to get and then check its :BOOK-LENGTH and
:BOOK-WRITE-DATE, perhaps skipping those that have an old-style
book-hash.  Every user of defunt would then be required to execute
such a script to write out a list of acceptable books, which would be
read in somehow the first time that defunt is used in a session.  If
defunt is called with no such list, then an error message would
complain and point to a :doc topic on how to set that up.

When defunt fails to complete, it could try to complete without
restricting to the acceptable books and then print a message pointing
to the functions it used -- that could still be useful information for
someone trying to admit a definition.

----------

Extend defunt to be incremental (kind of in the spirit of
terminatrix), where the database read in from td-cands.lisp can be
extended incrementally, perhaps with a table so that the database is
respected by undoing.
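
As a rough sketch of the table idea, and not a worked-out design: ACL2
table events are undone along with the rest of the world, so extra
candidates stored in a table would disappear on undoing.  The table
name defunt-extra-cands, the key my-evens, and the shape of the stored
value are all made up for illustration; they are not part of defunt.

```lisp
; Hypothetical table name and value shape; not part of defunt.
; Record an extra candidate, keyed by the name of an already-admitted
; function (here the made-up name my-evens).
(table defunt-extra-cands
       'my-evens
       '(:book "my-dir/my-book"))

; Later, fetch the current entries as an alist.  After undoing back
; past the table event above, the entry is no longer present.
(table-alist 'defunt-extra-cands (w state))
```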

----------

Make it easy for a user to set up defunt to be used based on a given
set of certified books, smaller than the "everything" suite that
currently generates td-cands.lisp.  (Indeed, as a reviewer of my 2018
ACL2 Workshop paper noted, it would be nice to be able to specify
extra books that aren't necessarily community books.  Also see the
related item below on "including more books beyond doc/top".)

----------

Consider subsuming individual clauses when building the database, not
just entire clause-sets.  (But be careful that this doesn't increase
the number of included books; that needs some thought, and might lead
to avoiding most such "optimizations".)

----------

Get a better database by tweaking measures generated by make-flag.
(Matt Kaufmann checked with Sol Swords, an author of make-flag, who
said it was fine with him.)

Consider the following example.

(mutual-recursion
(defun f1 (x) (if (consp x) (f2 (cdr x)) x))
(defun f2 (x) (if (consp x) (f1 (cdr x)) x)))

(make-flag f1-flag f1)

If now we do :pe f1-flag, we see that the measure is:

(CASE FLAG
 (F1 (ACL2-COUNT X))
 (OTHERWISE (ACL2-COUNT X)))

I'm guessing that it will be easy to notice that all branches here
have the same result, (ACL2-COUNT X).  As a sanity check, I took the
generated event and replaced the measure with (ACL2-COUNT X), and that
worked fine.
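
Here is a minimal sketch of the "all branches agree" check, working on
the untranslated CASE form displayed by :pe.  The function names are
hypothetical (they are not part of make-flag or defunt), and the
sketch assumes that the last clause is an OTHERWISE clause, so that
some branch always applies.

```lisp
(defun all-equal-to (x lst)
  ; True when every element of lst is equal to x.
  (cond ((endp lst) t)
        ((equal x (car lst)) (all-equal-to x (cdr lst)))
        (t nil)))

(defun case-bodies (clauses)
  ; Collect the body (second element) of each (key body) clause.
  (cond ((endp clauses) nil)
        (t (cons (cadr (car clauses))
                 (case-bodies (cdr clauses))))))

(defun collapse-uniform-case (term)
  ; If term is (CASE var (key body) ...) and all bodies are equal,
  ; return that common body; otherwise return term unchanged.
  ; Assumption: the final clause is an OTHERWISE clause.
  (cond ((and (consp term)
              (eq (car term) 'case)
              (consp (cdr term))
              (consp (cddr term)))
         (let ((bodies (case-bodies (cddr term))))
           (if (all-equal-to (car bodies) (cdr bodies))
               (car bodies)
             term)))
        (t term)))
```

For the f1-flag measure above, collapse-uniform-case should return
(ACL2-COUNT X); a CASE whose branches differ is returned unchanged.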

----------

Automate the updating of td-cands.lisp, which could be done in a few
seconds when doc/top.lisp is certified.  This would however add
complexity to the build process since td-cands.lisp would change after
the dependencies etc. were calculated -- and in general, it's frowned
on to continually generate files that are in the repository.

----------

For portability, add an option that disallows td entries for any
function that requires a book other than a system book.  (Hint:
Probably it will be helpful to search for "system-book".  Bigger hint:
Modify td-book-alist.)  This option could be useful if someone wants
to create a td and then, from it, a td-cands.lisp file, that is
portable -- i.e., does not depend on the local filesystem -- so that
it can be distributed to others.

----------

Consider minimizing the expense from the include-book forms generated
for a defunt call.  That is probably not worth the effort unless
efficiency is wanting.  Otherwise, it might be very useful (if a bit
awkward) to associate include-book times with books.  It might also be
useful to know which books include, even transitively, which other
books.  The system book kestrel/utilities/include-book-paths.lisp
defines include-book-dag, which might be helpful -- or perhaps more
likely, code that supports the definition of include-book-dag could be
re-used.

----------

A verbose flag for defunt might be nice, perhaps as an extra sort of
declaration, or perhaps via some other syntax.  That way one could get
less output (by omitting each "Defunt note") or more output (by
skipping the surrounding with-output).

With more output, one could get the event actually generated, to paste
into the book that one is developing.  Two reviewers of the associated
ACL2 Workshop 2018 paper noted that there could be problems when books
supporting a use of defunt are modified, making that use of defunt
fail.  This problem could be eliminated if a methodology is promoted
that encourages putting the generated encapsulate form in one's book,
instead of the call of defunt.

----------

It may be possible to strengthen :termination-theorem soundly, by
dropping literals that are false under a suitable functional
instantiation.  Consider the following (rather silly) example.

(defun f1 (x y)
  (if (consp x)
      (and (f1 (cdr x) y)
           (f1 (cddr x) (car x)))
    (cons x y)))

We have:

ACL2 !>:tthm f1
 (AND (O-P (ACL2-COUNT X))
      (OR (NOT (CONSP X))
          (O< (ACL2-COUNT (CDR X))
              (ACL2-COUNT X)))
      (OR (NOT (CONSP X))
          (NOT (F1 (CDR X) Y))
          (O< (ACL2-COUNT (CDDR X))
              (ACL2-COUNT X))))
ACL2 !>

The following succeeds.

(defun f2 (x y)
  (declare (xargs :hints (("Goal" :by (:termination-theorem f1
                                                            ; implicit: ((f1 f2))
                                                            )))))
  (if (consp x)
      (and (f2 (cdr x) y)
           (f2 (cddr x) y))
    (cons y x)))

But the following does not.

(defun f3 (x y)
  (declare (xargs :hints (("Goal" :by (:termination-theorem f1
                                                            ; implicit: ((f1 f3))
                                                            )))))
  (if (consp x)
      (f3 (cddr x) y)
    (cons y x)))

However, the following does.  (We can't quite use a :by hint, but the
:use hint makes things go trivially.)

(defun always-t-arity-2 (x y)
  (declare (ignore x y))
  t)

(defun f3 (x y)
  (declare (xargs :hints (("Goal" :use (:termination-theorem f1
                                                             ((f1 always-t-arity-2))
                                                             )))))
  (if (consp x)
      (f3 (cddr x) y)
    (cons y x)))

This kind of situation may be common.  Handling it automatically would
require an analysis of the parity of calls of the "old" function.

----------

Perhaps arrange, for increased efficiency, that the (acl2-count var)
entry is first in td-cands.lisp.  After all, that is probably the
common case.  Well, maybe not; if the measure is that simple then
maybe there is no need to use defunt in the first place.

----------

Consider supporting mutual-recursion.

----------

Perhaps several measures can be combined in some cases, as we search
for help from existing defun events.  Consider the following example:

(defunt f (x y)
  (if (consp x)
      (if (atom y)
          (f (cddr x) y)
        (f x (cddr y)))
    (list x y)))

Each of the following works, with measures (acl2-count x) and
(acl2-count y), resp.

ACL2 !>(defunt f1 (x y)
         (if (consp x)
             (if (atom y)
                 (f1 (cddr x) y)
               17)
           (list x y)))

*Defunt note*: Using termination theorem for EVENS.

 F1
ACL2 !>(defunt f2 (x y)
         (if (consp x)
             (if (atom y)
                 17
               (f2 x (cddr y)))
           (list x y)))

*Defunt note*: Using termination theorem for EVENS.

 F2
ACL2 !>

So for f, it would be reasonable to do the following, and indeed, that
works.  It works because for each call that is covered by one of the
measures, the measure does not go up for the other -- in this case,
because the unmeasured argument is unchanged.

(defun f (x y)
  (declare (xargs :measure (+ (acl2-count x) (acl2-count y))))
  (if (consp x)
      (if (atom y)
          (f (cddr x) y)
        (f x (cddr y)))
    (list x y)))

----------

The name :clause-size-limit is misleading, since the size limit is on
each clause-set, not each clause.  Actually, it makes more sense to
limit each clause, since each clause can be used independently -- some
clauses may be useful even if the entire clause-set is deemed to be
too large.
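
A per-clause limit could look roughly like the following sketch.  The
names sketch-cons-count and keep-small-clauses are hypothetical, and
counting conses is only a stand-in for whatever size measure defunt
actually uses.  Here a limit of nil is taken to mean "no limit"; that
is an assumption of the sketch.

```lisp
(defun sketch-cons-count (x)
  ; Number of conses in the tree x (a stand-in size measure).
  (cond ((atom x) 0)
        (t (+ 1
              (sketch-cons-count (car x))
              (sketch-cons-count (cdr x))))))

(defun keep-small-clauses (clauses limit)
  ; Keep each clause whose size is at most limit, judging every clause
  ; on its own rather than rejecting the whole clause-set.
  ; Assumption: limit is nil (no limit) or a natural number.
  (cond ((endp clauses) nil)
        ((or (null limit)
             (<= (sketch-cons-count (car clauses)) limit))
         (cons (car clauses)
               (keep-small-clauses (cdr clauses) limit)))
        (t (keep-small-clauses (cdr clauses) limit))))
```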

----------

Experiment with various limits on the number of functions and the
clause-size, by supplying nil or natural number values for keywords
:fns-limit and :clause-size-limit in the td-init call, as in file
write-td-cands.lsp.  Here are some results on 6/28/2018.

(write-td "td-cands")
Interrupted after about 12 minutes; produced a large td-cands.lisp
file, more than 500 MB.

ACL2 !>(td-init :fns-limit 20 :clause-size-limit 1000)
; (EV-REC *RETURN-LAST-ARG3* ...) took
; 2.49 seconds realtime, 2.50 seconds runtime
; (251,646,720 bytes allocated).
<td>
ACL2 !>(time$ (write-td "td-cands-20-1000"))
; (EV-REC *RETURN-LAST-ARG3* ...) took
; 3.59 seconds realtime, 3.58 seconds runtime
; (10,659,600 bytes allocated).
<state>
ACL2 !>(td-init :fns-limit 20 :clause-size-limit 500)
; (EV-REC *RETURN-LAST-ARG3* ...) took
; 2.36 seconds realtime, 2.36 seconds runtime
; (243,400,384 bytes allocated).
<td>
ACL2 !>(time$ (write-td "td-cands-20-500"))
; (EV-REC *RETURN-LAST-ARG3* ...) took
; 2.60 seconds realtime, 2.58 seconds runtime
; (8,843,392 bytes allocated).
<state>
ACL2 !>(td-init :fns-limit 10 :clause-size-limit 1000)
; (EV-REC *RETURN-LAST-ARG3* ...) took
; 1.89 seconds realtime, 1.89 seconds runtime
; (215,566,160 bytes allocated).
<td>
ACL2 !>(time$ (write-td "td-cands-10-1000"))
; (EV-REC *RETURN-LAST-ARG3* ...) took
; 1.82 seconds realtime, 1.81 seconds runtime
; (5,554,848 bytes allocated).
<state>
ACL2 !>

File sizes from ls -l, where td-cands.lisp was generated with
:fns-limit 10 and :clause-size-limit 500:

    617951 Jun 28 09:09 td-cands-10-1000.lisp
   1295670 Jun 28 09:07 td-cands-20-1000.lisp
    983911 Jun 28 09:08 td-cands-20-500.lisp
 535585448 Jun 28 09:06 td-cands-all.lisp
    593943 Jun 27 22:14 td-cands.lisp

----------

Still check some embeddings even when there are a lot of them.

Consider the following comment from td-candidate-event-0 in
defunt.lisp.

; We could maybe still check a few embeddings, or at least the obvious one if
; the formals have the same length.  But we don't, for now.

----------

Put more functions into guard-verified logic mode.  The file
injections.lisp might be a good place to start.

----------

Consider experimenting with the arbitrary number of 1000 in
too-many-injections (file injections.lisp).

----------

When building td-cands.lisp, consider merging entries that differ only
in the ruler-extenders, by unioning together their ruler-extenders.
Then when we create the final event, hints will still work because the
new termination clauses will be, if anything, weaker (easier to prove)
than expected, because the extra ruler-extenders supply extra tests
for each call.
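
The merge might be sketched as follows.  The entry shape used here, a
pair (ruler-extenders . rest) with ruler-extenders an explicit list of
symbols, is made up for illustration and is not the actual td-cands
format; the function names are likewise hypothetical.

```lisp
(defun union-syms (xs ys)
  ; Elements of xs that are not already in ys, followed by ys.
  (cond ((endp xs) ys)
        ((member (car xs) ys :test 'equal)
         (union-syms (cdr xs) ys))
        (t (cons (car xs) (union-syms (cdr xs) ys)))))

(defun merge-into (entry entries)
  ; entry is (ruler-extenders . rest).  If some member of entries has
  ; an equal rest, union the ruler-extenders into that member;
  ; otherwise add entry at the end.
  (cond ((endp entries) (list entry))
        ((equal (cdr entry) (cdr (car entries)))
         (cons (cons (union-syms (car entry) (car (car entries)))
                     (cdr entry))
               (cdr entries)))
        (t (cons (car entries)
                 (merge-into entry (cdr entries))))))

(defun merge-ruler-extenders (entries acc)
  ; Merge all entries into the accumulator acc (initially nil).
  (cond ((endp entries) acc)
        (t (merge-ruler-extenders (cdr entries)
                                  (merge-into (car entries) acc)))))
```

Entries that differ in anything other than the ruler-extenders are
left as separate entries.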

----------

The subsumption tests in defunt.lisp performed by
clause-set-subsumes-t-top are generated by "injections" that create
the new measure from the old.  It seems likely that each injection
could be an alist that seeds those top-level subsumption calls.
As of this writing, there doesn't seem to be a need for the potential
additional efficiency -- but it might be that the real gain is in the
generation of less garbage (by less consing up of alists).

----------

Possibility, perhaps: Improve existing termination schemes by
eliminating tests and setting a prover time limit.  Problem: This
would probably need to be done in the environment of the original
definition.  Still, at least we could focus on only a few hundred or
so defs.  As of this writing (6/27/18), (length *td-candidates*) = 185
and (loop for x in *td-candidates* sum (length (access td-candidate x
:roots-alist))) = 446, so that's 446 functions that could be
considered.

----------

Change the name td-cands.acl2 to portcullis.acl2 and make a book
portcullis.lisp, and make appropriate adjustments.  This will make it
easy to include a book that does little more than define packages,
which might be useful to others.

----------

Consider using factoring to drop useless literals like (not flg) where
flg doesn't occur elsewhere (or, just bind flg to t and substitute,
like remove-trivial-equivalences -- maybe even apply that function
directly).  Maybe just call subsumes on the clause with itself minus
the literal.  Example, from (trace$ termination-theorem-clauses) and
then (tthm 'var-fn-count-1):

(((O-P (ACL2-COUNT X)))
 ((NOT FLG)
  (ATOM X)
  (O< (ACL2-COUNT (CDR (CAR X)))
      (ACL2-COUNT X)))
 ((NOT FLG)
  (ATOM X)
  (O< (ACL2-COUNT (CAR X))
      (ACL2-COUNT X)))
 ((NOT FLG)
  (ATOM X)
  (O< (ACL2-COUNT (CDR X))
      (ACL2-COUNT X)))
 (FLG (ATOM X)
      (EQ 'QUOTE (CAR X))
      (O< (ACL2-COUNT (CDR X))
          (ACL2-COUNT X))))

This is really just a minor optimization, and it shouldn't hurt
anything other than the extra time considering a clause, since extra
old clauses are harmless at worst.

Be careful: if we do this then we may need to provide instances so
that the dropped literal is false.  Note for example:

ACL2 !>(subsumes nil
                 '((not FLG)
                   (ATOM X)
                   (O< (ACL2-COUNT (CDR X))
                       (ACL2-COUNT X)))
                 '((ATOM X)
                   (O< (ACL2-COUNT (CDR X))
                       (ACL2-COUNT X))) nil)
NIL
ACL2 !>
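
As a minimal sketch of the literal-dropping step only (it does not
address the instantiation issue just noted), the following drops each
literal (NOT v) where v is a symbol that occurs in no other literal of
the clause.  The function names are hypothetical, and the occurrence
check is a plain tree walk, so it conservatively treats a symbol
inside a quoted constant as an occurrence.

```lisp
(defun occurs-in (var x)
  ; True when var occurs anywhere in the cons tree x.
  (cond ((equal var x) t)
        ((atom x) nil)
        (t (or (occurs-in var (car x))
               (occurs-in var (cdr x))))))

(defun drop-isolated-negations (lits clause)
  ; lits is a tail of clause.  Drop each literal (NOT v) such that v
  ; is a symbol not occurring in the rest of clause.
  ; Assumption: such a v is a variable, not T or NIL.
  (cond ((endp lits) nil)
        ((let ((lit (car lits)))
           (and (consp lit)
                (eq (car lit) 'not)
                (consp (cdr lit))
                (symbolp (cadr lit))
                (not (occurs-in (cadr lit)
                                (remove lit clause :test 'equal)))))
         (drop-isolated-negations (cdr lits) clause))
        (t (cons (car lits)
                 (drop-isolated-negations (cdr lits) clause)))))
```

Calling (drop-isolated-negations cl cl) on one of the (NOT FLG)
clauses above should remove just the (NOT FLG) literal.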

----------

As suggested by Eric McCarthy, try to apply defunt to functions in
existing community books, especially books not already included in
system book doc/top.

----------

Consider including more books beyond doc/top for defunt, maybe even if
there are conflicts.

----------
