Best way to handle multi-page contracts?

We process contracts that are anywhere from 5 to 200+ pages. For shorter ones, Qomplement works great. But for the really long ones (100+ pages), I’m finding that:

  1. Processing takes a while (expected)
  2. The extracted data sometimes misses clauses that appear deep in the document

Any tips for handling very long documents? Should I split them up?

For long contracts we split by section. Most contracts have clear section headers, so we wrote a quick script to split the PDF at those boundaries, process each section with the appropriate template, then merge the results. It adds complexity, but the accuracy is much better.
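
A rough sketch of that split, process, merge flow, in case it helps. It assumes the contract text has already been extracted from the PDF and that section headers look like "SECTION 3 ..." or "ARTICLE 3 ..." at the start of a line; both are assumptions, so adjust the pattern to your documents. The `extract` callback is a hypothetical stand-in for whatever you use to send one section to a template. It is not a Qomplement API.

```python
import re
from typing import Any, Callable

# Assumed header style: "SECTION 3 Termination" or "Article 12" at the start of a line.
# Real contracts vary, so treat this pattern as a placeholder to adapt.
HEADER_RE = re.compile(
    r"^(?:ARTICLE|SECTION)[ \t]+\d+\b.*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_sections(text: str) -> list[tuple[str, str]]:
    """Split text into (header, body) pairs at each matched header line.

    Any text before the first header is returned under the label "preamble".
    """
    matches = list(HEADER_RE.finditer(text))
    if not matches:
        return [("preamble", text.strip())]

    sections: list[tuple[str, str]] = []
    preamble = text[: matches[0].start()].strip()
    if preamble:
        sections.append(("preamble", preamble))

    for i, match in enumerate(matches):
        # A section body runs from the end of its header to the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append((match.group(0).strip(), text[match.end():end].strip()))
    return sections


def process_contract(
    text: str, extract: Callable[[str, str], Any]
) -> list[dict[str, Any]]:
    """Run the hypothetical `extract(header, body)` per section and merge in order."""
    return [
        {"section": header, "fields": extract(header, body)}
        for header, body in split_sections(text)
    ]
```

Keeping the merged output as an ordered list (rather than a dict keyed by header) avoids losing data when two sections happen to share the same header text.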

Similar approach here. For regulatory filings that are 100+ pages, we split into logical sections first. It also makes the results more usable downstream since each section maps to a specific compliance requirement.
