Supporting authors of SA documentation using NLP and Semantic technologies

Motivation

Software architecture (SA) serves as a primary vehicle for communication among stakeholders (Bachmann et al. 2010). Maintaining the clarity and completeness of the knowledge expressed by SA stakeholders is beneficial: it speeds up the analysis of SA, improves project management, and simplifies the introduction of new team members to the project (de Graaf et al. 2012). One of the main approaches to expressing SA knowledge is to document it in the form of natural language text (SA documentation) (Alexeeva et al. 2016; Ding et al. n.d.; Tang et al. 2005). Although many methods and formalization approaches have been proposed, maintaining SA documents remains effort- and time-consuming. One way to reduce the effort and time spent on reviewing SA documents is to employ existing natural language processing (NLP) techniques and ontologies to automate quality checks and to point authors of SA documents to gaps and weaknesses in the statements being made.

Research objective

In our research, we focus on large-scale industrial software engineering projects, where SA documentation is usually scattered across, and maintained in, various tools. We distinguish between two types of SA documents. 1) Passive SA documents, e.g. wiki pages describing quality attributes, user stories, functional and technical architectures, stakeholder roles, and the SA decisions being made. We consider this type of document to be an essential communication artifact and element of SA documentation: it abstracts away from implementation details and serves as a point of reference during architecture reviews and sprint planning, as guidance for team members less familiar with the specifics of the project's SA, and so on. 2) Active SA documents, in which SA knowledge is expressed during design sessions without much attention to its proper formalization. Examples include meeting minutes, issues in task management systems, drawings on the whiteboard, email, chat and forum conversations, and source code comments.

Both types of documents are essential and cannot exist without each other. For instance, once SA knowledge has been expressed in the form of active documents, a follow-up activity creates, updates, or restructures the passive SA documentation by formalizing the active document, aligning it with the company's standards and reference architectures, and so on.

Attempts to formalize and structure the process of SA documentation during design sessions struggle to gain adoption in practice: most of the time they disrupt the flow of the design session, have to be taught to newly introduced colleagues, and require additional attention. Taking this into account, we aim to support SA stakeholders only in the later phase, when actively expressed knowledge is transferred into its passive form.

Understandability is an important quality attribute of SA documentation to improve. A lack of understandability in a sentence shows in its structure, semantics, and context. Context should be shared between the writer and the reader of a text. However, in some cases readers experience uncertainty about sentences whose context was poorly expressed by the writer. In these cases, making software architects (the writers) aware of the affected sentences in time, so that they can fix them, could improve the understandability of the documents produced. Therefore, we aim to give software architects the ability to automatically detect uncertainty in SA documentation and, as a consequence, to increase its quality.
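To make the idea of flagging uncertain sentences concrete, the following is a minimal sketch in Python. It is not the detection method used in the project, which this page does not describe. It assumes a naive approach: a sentence is flagged when it contains a word from a small, hand-picked cue list. The cue list `UNCERTAINTY_CUES` and the helper names `split_sentences` and `detect_uncertainty` are hypothetical and introduced only for illustration.

```python
import re

# Hypothetical cue lexicon (an assumption for illustration only);
# a real system would rely on a curated or learned set of cues.
UNCERTAINTY_CUES = (
    "may", "might", "could", "possibly", "probably",
    "perhaps", "unclear", "tbd", "some", "etc",
)

# One case-insensitive pattern that matches any cue as a whole word.
_CUE_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, UNCERTAINTY_CUES)) + r")\b",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def detect_uncertainty(text):
    """Return (sentence, cues) pairs for sentences with at least one cue."""
    findings = []
    for sentence in split_sentences(text):
        cues = [m.group(0).lower() for m in _CUE_PATTERN.finditer(sentence)]
        if cues:
            findings.append((sentence, cues))
    return findings


if __name__ == "__main__":
    sample = (
        "The gateway forwards requests to the billing service. "
        "The cache might be replaced later. "
        "Some components could possibly use a message queue."
    )
    for sentence, cues in detect_uncertainty(sample):
        print(f"{cues}: {sentence}")
```

A keyword lookup like this ignores sentence structure and context, which the text above names as sources of poor understandability, so it should be read only as a starting point for illustrating what a flagged sentence looks like.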

To support the writers of the documents in resolving the detected uncertainties, we aim to provide context- and project-specific recommendations (Shumaiev et al. 2017). The latter are based on reasoning over the SA knowledge stored in a software architecture management system (proposed by us in Bhat et al. 2016) and on the types of uncertainties detected.

References

Alexeeva, Z., Perez-Palacin, D., and Mirandola, R. 2016. “Design Decision Documentation: A Literature Overview,” in Software Architecture: 10th European Conference, ECSA 2016, Copenhagen, Denmark, November 28--December 2, 2016, Proceedings 10, pp. 84–101.

Bachmann, F., Bass, L., Clements, P., Garlan, D., Ivers, J., Little, M., Merson, P., Nord, R., and Stafford, J. 2010. Documenting Software Architectures: Views and Beyond (2nd ed.), Addison-Wesley Professional.

Ding, W., Liang, P., Tang, A., van Vliet, H., and Shahin, M. (n.d.). “How Do Open Source Communities Document Software Architecture: An Exploratory Survey.”

de Graaf, K. A., Tang, A., Liang, P., and van Vliet, H. 2012. “Ontology-based Software Architecture Documentation,” in 2012 Joint Working IEEE/IFIP Conference on Software Architecture and European Conference on Software Architecture, August, pp. 121–130 (doi: 10.1109/WICSA-ECSA.212.20).

Tang, A., Babar, M. A., Gorton, I., and Han, J. 2005. “A Survey of the Use and Documentation of Architecture Design Rationale,” in 5th Working IEEE/IFIP Conference on Software Architecture (WICSA’05), pp. 89–98 (doi: 10.1109/WICSA.2005.7).

Bhat, M.; Shumaiev, K.; Biesdorf, A.; Hohenstein, U.; Hassel, M.; Matthes, F.: Meta-model Based Framework for Architectural Knowledge Management, SAGRA workshop at European Conference on Software Architecture 2016, Nov. 28 - Dec. 02, 2016, Copenhagen, Denmark

Shumaiev, K.; Bhat, M.: Automatic uncertainty detection in software architecture documentation, IEEE International Conference on Software Architecture 2017 Young Research Forum, 4th April, Gothenburg, Sweden

Chair of Software Engineering for Business Information Systems

Prof. Dr. Florian Matthes

Research Area:
Social Software Engineering

Contact
Klym Shumaiev
Dr. Manoj Mahabaleshwar

Team Member
Klym Shumaiev
Dr. Manoj Mahabaleshwar

Partners
Siemens Corporate Technology

Sponsors
Siemens Corporate Technology

Status
Completed

Project Start
2016

Project End
2018