Popis: |
Purpose: When bidding for a construction project, contractors must check the specifications for provisions that may cause disputes in order to manage project risk. However, because specifications are written mainly with reference to many national standards, determining which standard each section of a specification derives from, and whether its content is appropriate for the local site, is a labor-intensive task. To develop an automatic reference section identification model that helps complete the specification review within the short bidding period, the authors propose a framework that integrates rules and machine learning algorithms.
Design/methodology/approach: The study begins by collecting 7,795 sections from construction specifications and national standards of different countries. Similar section pairs are then retrieved from the collected sections using syntactic rules derived from construction domain knowledge. Finally, to improve the reliability and expandability of the section pairing, the authors built a deep structured semantic model that learns from human-labeled similarity information to increase the cosine similarity between documents dealing with the same topic.
Findings: The integrated model developed in this study achieved NDCG@1, NDCG@5, and NDCG@10 scores of 0.812, 0.898, and 0.923, respectively, confirming that it can adequately select candidate documents whose clauses require comparative analysis by practitioners.
Originality/value: The results contribute to more efficient and objective identification of potential disputes within specifications by automatically providing practitioners with the reference section most relevant to the section under analysis. |
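For readers unfamiliar with the NDCG@k metric reported in the findings, the following is a minimal sketch of how such a score can be computed for one ranked list of candidate reference sections. It is not the authors' evaluation code. The function names `dcg_at_k` and `ndcg_at_k` are hypothetical, the linear-gain form of DCG is an assumption (the abstract does not say which variant was used), and the ideal ranking is taken from the retrieved list only.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items (linear gain, assumed)."""
    # The item at 0-based position `rank` is discounted by log2(rank + 2),
    # so the first item has a discount of log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG@k of the given ranking divided by the DCG@k of the ideal ordering."""
    # Assumption: the ideal ordering is the same labels sorted descending.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant item in the list; score defined as 0 here
    return dcg_at_k(relevances, k) / ideal


# Human-labeled relevance of the candidates, in the order the model ranked them
# (illustrative values only, not data from the study).
ranked_labels = [0, 1, 0]
score_at_1 = ndcg_at_k(ranked_labels, 1)
score_at_2 = ndcg_at_k(ranked_labels, 2)
```

Averaging such per-query scores over all target sections would give a single NDCG@k figure comparable in form to those reported above.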