Document similarity for error prediction