Projection of tf-idf vectors in two dimensions after Latent Semantic Analysis. Red points correspond to a group of insurance clauses and blue points correspond to all other clauses.
Commercial lease agreements are long and tedious to read, which makes them an easy place for landlords to bury terms that are unfavorable to prospective tenants. Using natural language processing (tf-idf vectors reduced with Latent Semantic Analysis) and clustering, I group similar clauses across different lease agreements. From the security deposit clause group, I build distributions of terms that matter in negotiation, including the ratio of security deposit to monthly rent and the number of days the landlord has to return the security deposit to the tenant. These distributions give tenants a baseline for judging the terms of a commercial lease agreement in California.
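To make the clause-grouping idea concrete, here is a minimal standard-library sketch, not the project's actual pipeline. It builds tf-idf vectors by hand and groups clauses whose cosine similarity exceeds a threshold. Several things are assumptions made for illustration: the four toy clauses are invented, the Latent Semantic Analysis step is skipped because it needs a linear algebra library, and the greedy threshold grouping (with the hypothetical `threshold=0.2`) stands in for whichever clustering algorithm the project uses.

```python
import math
import re
from collections import Counter

# Toy clauses invented for illustration; they are not from the lease dataset.
clauses = [
    "Tenant shall maintain commercial general liability insurance coverage.",
    "Tenant shall keep property insurance coverage in force during the term.",
    "Landlord shall return the security deposit within thirty days.",
    "The security deposit shall equal two months of rent.",
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalized tf-idf vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        # Smoothed idf: ln((1 + n) / (1 + df)) + 1.
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        }
        norm = math.sqrt(sum(weight * weight for weight in vec.values()))
        vectors.append({term: weight / norm for term, weight in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two vectors that are already L2-normalized."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def group_clauses(docs, threshold=0.2):
    """Greedy grouping: join the first group whose seed clause is similar enough."""
    vectors = tfidf_vectors(docs)
    groups = []  # each group is a list of clause indices; its first index is the seed
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


if __name__ == "__main__":
    for group in group_clauses(clauses):
        print([clauses[i] for i in group])
```

On these toy clauses the two insurance clauses land in one group and the two security deposit clauses in another. The threshold is sensitive to clause length and vocabulary, so it would need tuning on real lease text.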
Distribution of the number of days before a landlord must return the security deposit to the tenant.
Links
Medium post