This day was organised in the context of HCRC's effort to refocus the work of its historical working groups. This is MLAP's version, following on from the Dialogue Group's two one-day meetings. In particular, the arrival of Johanna Moore and Bonnie Webber, who both bring significant experience in large system building, has encouraged us to reconsider the question of building a large-scale end-to-end dialogue system within HCRC.
Johanna Moore offered an inventory of potential benefits:
What would we need to do this?
A tour through canvassed benefits, raising questions.
Not only classroom teaching, but student projects as well.
Plug-and-play work needs components that are not just modular, but properly documented and maintained. TRAINS (Rochester) is a candidate example.
Overview of TRAINS: not a system, rather an umbrella project: fixed domain, data collection and analysis (cf. HCRC Map Task) but also implementation.
Implementation included an end-to-end prototype as well as separate components which were not integrated into the prototype.
From an educational perspective, a group of students who are committed to making it work together can do it. Cf. the first phase of TRAINS: one prototype per year over four years, one module per student, with only a modest ability to run end-to-end, because each student's effort was focussed on the theory underlying a component, not on the engineering needed for interoperation and coverage.
The system was focussed on language analysis and planning, not on language generation.
Second phase: much less student involvement; two paid RAs working full time reconstructed a real end-to-end demo system (again one per year over four years), with a simplified task, at roughly two person-years per system. The major success came in a case where there was a good personal and problem-area fit with the existing architecture and software-engineering focus (Eric Ringger).
Did it work? Student excitement and engagement were certainly there; how much came from the system and how much from the encounter with real data is unclear. The two are related, in that system building encourages careful engagement with the data.
There were six students involved at the peak, all computer scientists who knew how to program: a point we need to come back to.
Do it again:
One intrinsic problem with modular architectures is that it's hard to detect the primary locus of responsibility for a failure to perform.
So student involvement was building the system together, not plugging in one new module. The quantum of effort involved is quite large.
The point of the proposed changes above is to make the modules susceptible to small-quantum changes, which is quite different from a module built as a whole for a research project, e.g. a PhD.
One could of course have both monolithic single-owner opaque modules and teaching-oriented transparent modules.
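The locus-of-responsibility problem above is partly an instrumentation problem: a pipeline that records each module's output makes it easier to see at which stage behaviour diverges. A minimal sketch of the idea (the interface, class names, and dict-based message format here are hypothetical illustrations, not any existing system's API):

```python
from abc import ABC, abstractmethod


class DialogueModule(ABC):
    """One stage in a hypothetical pluggable pipeline."""
    name: str = "module"

    @abstractmethod
    def process(self, data: dict) -> dict:
        ...


class Tokeniser(DialogueModule):
    name = "tokeniser"

    def process(self, data):
        return {**data, "tokens": data["text"].split()}


class Counter(DialogueModule):
    name = "counter"

    def process(self, data):
        return {**data, "n_tokens": len(data["tokens"])}


def run_pipeline(modules, data):
    """Run modules in order, keeping a per-module trace so an
    unexpected output can be localised to one stage."""
    trace = []
    for m in modules:
        data = m.process(data)
        trace.append((m.name, dict(data)))  # snapshot of this stage's output
    return data, trace


result, trace = run_pipeline([Tokeniser(), Counter()],
                             {"text": "move the engine"})
```

Inspecting `trace` shows each stage's contribution in turn, so a failure to perform can at least be narrowed to the first stage whose output looks wrong.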
Joanna Bryson mentioned a multi-agent architecture being developed at Michigan.
Does the modular design constrain the science? Yes, if you can't change the modularity. No, if you are free to reconstruct things.
So it's emerging that one needs two different kinds of systems: a structurally stable collection of teaching-oriented modules, and a more flexible collection of research modules.
What's helpful about e.g. TRAINS for student projects? The modules/system, or the domain/data (including lexicon)?
Just the domain gives you a key ingredient, a social framework for students.
Is the experience of the Map Task providing a focus for a number of MSc projects and PhD theses relevant?
Cf. Festival, ILEX, CLE (SRI), LOLITA (Durham): you can reference as basis for new proposals, starting point for new work.
The external examples above are or were intended to encompass all the work done in an institution. Has it helped them? It has certainly worked in terms of branding and visibility. But it does leave you vulnerable, both to weaknesses present from the start and to inherent limitations (you can't do both research and teaching with one system).
We certainly don't want to be branded with the phrase "Edinburgh Informatics is building a dialogue system, you know, just like Rochester 10 years ago."
Is end-to-end important, or should we continue to focus on non-end-to-end systems which demonstrate good portability/reusability of results (ILEX, Map Task, GATE)?
But note that some of those end up being monolithic and idiosyncratic, and at least at the beginning you don't want to discourage this.
The question of the extent to which you see your system as embodying Our Theory is important.
Writing documentation can compete for resources with writing research papers or PhD theses.
A contentious discussion ensued on the relative value of these two activities.
Is a toolkit substantively different from an end-to-end framework?
A system which could be configured/ported to particular applications.
When you're holding a hammer, everything looks like a nail.
Cf. previous query about toolkit vs. end-to-end system.
If you have a catalogue of reusable components, then an end-to-end system using those components is an important proof of concept.
Is it true that refereed publication requires evaluation in the context of an end-to-end system? Not for all dialogue work, evidently.
To show off the value of what we've done in a practical context: cf. Verbmobil, OVIS.
This is not the same as a research vehicle; the emphases are in different places: stability, robustness, and coverage are non-negotiable.
Isn't a flashy demo system a waste of resource?
How do we bridge the gap between scientific observations and the exploitation of those observations in improving a particular system's performance?