Our basic approach is to cluster the units within each unit type (i.e. a particular phone) using questions about prosodic and phonetic context. Specifically, these questions concern information that the linguistic component can produce, e.g. is the unit phrase-final, or is the unit in a stressed syllable? Thus, for each phone in the database, a decision tree is constructed whose leaves are lists of database units that are best identified by the questions leading to that leaf.
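As an illustration of the per-phone tree described above, the following is a minimal sketch of looking up a cluster from a unit's context. The class names, the feature names (`phrase_final`, `stressed`), the tree shape, and the unit indices are all hypothetical; in practice the tree would be learned from the database rather than written by hand.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# A context is a set of yes/no answers supplied by the linguistic component.
Context = Dict[str, bool]


@dataclass
class Leaf:
    units: List[int]  # indices of the database units in this cluster


@dataclass
class Node:
    question: str  # name of a binary context feature (hypothetical)
    yes: "Tree"    # subtree followed when the answer is yes
    no: "Tree"     # subtree followed when the answer is no


Tree = Union[Leaf, Node]


def find_cluster(tree: Tree, context: Context) -> List[int]:
    """Follow the answers to each question until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.yes if context[tree.question] else tree.no
    return tree.units


# Hand-built toy tree for a single phone; the values are illustrative only.
tree_a = Node(
    "phrase_final",
    yes=Leaf([3, 17]),
    no=Node("stressed", yes=Leaf([1, 8, 22]), no=Leaf([5, 9])),
)

cluster = find_cluster(tree_a, {"phrase_final": False, "stressed": True})
print(cluster)
```

Each phone would have its own tree of this kind, so selecting a cluster amounts to choosing the tree for the phone and answering its questions in order.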
At synthesis time, for each target in the target specification, the appropriate decision tree is used to find the best cluster of candidate units. A search is then made for the best path through the candidate units, taking into account the distance of each candidate unit from its cluster center and the cost of joining adjacent units.
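The search over candidate units can be sketched as a Viterbi-style dynamic programme. The text here does not name the search algorithm, so this is one plausible reading and not necessarily the method actually used. The function `best_path`, the tuple layout of a unit, and the two cost functions are assumptions made for illustration; in particular, the join cost below is a simple pitch difference chosen to keep the example small.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical unit record: (name, distance to cluster center, pitch at the join).
Unit = Tuple[str, int, int]


def best_path(
    candidates: Sequence[Sequence[Unit]],
    target_cost: Callable[[int, Unit], int],
    join_cost: Callable[[Unit, Unit], int],
) -> Tuple[int, List[Unit]]:
    """Return the cheapest total cost and one unit per target achieving it."""
    # costs[j]: cost of the cheapest path ending in the j-th candidate so far.
    costs = [target_cost(0, u) for u in candidates[0]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor of candidate j

    for i in range(1, len(candidates)):
        prev_units, new_costs, pointers = candidates[i - 1], [], []
        for u in candidates[i]:
            # Pick the predecessor minimising accumulated cost plus join cost.
            k = min(
                range(len(prev_units)),
                key=lambda j: costs[j] + join_cost(prev_units[j], u),
            )
            new_costs.append(
                costs[k] + join_cost(prev_units[k], u) + target_cost(i, u)
            )
            pointers.append(k)
        back.append(pointers)
        costs = new_costs

    # Trace back from the cheapest final candidate.
    j = min(range(len(costs)), key=costs.__getitem__)
    total, path = costs[j], [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, [candidates[i][j] for i, j in enumerate(path)]


# Toy data: two targets, each with a cluster of two candidate units.
candidates = [
    [("a1", 20, 100), ("a2", 50, 120)],
    [("b1", 10, 130), ("b2", 30, 105)],
]
total, units = best_path(
    candidates,
    target_cost=lambda i, u: u[1],            # distance from cluster center
    join_cost=lambda p, u: abs(p[2] - u[2]),  # assumed: pitch mismatch at the join
)
print(total, [u[0] for u in units])  # 55 ['a1', 'b2']
```

In this toy case `b2` is selected even though `b1` lies closer to its cluster center, because `b2` joins more smoothly to `a1`. That trade-off between the two costs is what the path search is meant to resolve.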