In information extraction, a token may occur multiple times in a document, and its occurrences are usually far apart. Traditional linear-chain CRF models annotate the multiple occurrences of the same token separately and thereby lose global information, because under the Markov assumption they cannot represent long-distance dependencies among labels. We present a CRF model with long-distance dependencies. This model can collectively annotate a token for all of its occurrences by combining its features ...
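
The idea of annotating all occurrences of a token together can be illustrated with a minimal Python sketch. It is not the model presented here: it simply pools per-position label scores across every occurrence of a token and assigns the single best label to all of them, whereas a CRF with long-distance dependencies would do this through joint inference. The function name `collective_labels` and the `local_scores` input (one dictionary of log-space label scores per position) are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def collective_labels(tokens, local_scores):
    """Assign one label to all occurrences of each token.

    tokens: list of strings, one per position.
    local_scores: list of dicts (label -> log-space score), one per position.
    Assumption: scores are additive across occurrences (simple pooling).
    """
    # Group positions by token so distant occurrences are treated together.
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        positions[tok].append(i)

    labels = [None] * len(tokens)
    for tok, idxs in positions.items():
        # Combine the evidence from every occurrence of this token.
        combined = defaultdict(float)
        for i in idxs:
            for label, score in local_scores[i].items():
                combined[label] += score
        # Pick the label with the highest pooled score and apply it everywhere.
        best = max(combined, key=combined.get)
        for i in idxs:
            labels[i] = best
    return labels


tokens = ["Smith", "met", "Smith"]
local_scores = [
    {"PER": 2.0, "O": 0.5},  # strong local evidence for PER
    {"PER": 0.1, "O": 1.5},
    {"PER": 0.4, "O": 0.6},  # weak evidence; alone this position would be O
]
print(collective_labels(tokens, local_scores))  # ['PER', 'O', 'PER']
```

Labelled independently, the second "Smith" would receive `O`. Pooling with the first occurrence gives it `PER`, which is the kind of global information a linear-chain model cannot exploit.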