Research into translation processes is often limited to small-scale studies with few participants and variables. However, the translation process varies considerably across translators, texts, tasks and languages, so larger-scale investigations are highly desirable. To that end, we have publicly released the CRITT database of translation process data, which currently contains data from 758 translation sessions, with information about texts and participants as well as eye-tracking and keylogging data. We describe the database and illustrate some of the analysis purposes for which it may be used. We focus on a large-scale analysis of the behaviour of 68 translators translating and post-editing six English texts into German, Spanish, Hindi and Chinese. The analysis uses mixed-effects regression modelling to explain differences in the production time of Alignment Units (AUs), i.e. sequences of source-target correspondences. It reveals a number of results and complex interactions, the most relevant of which are that from-scratch translation always takes longer than post-editing; that lower-frequency source text words result in slower production, especially for student translators; and that high translation ambiguity has a slow-down effect only in post-editing.
