During the past few years, I have been responsible for the room-list and the door-list in the planning of a large biotechnology facility. It suddenly occurred to me that these two things are actually one and can be jointly represented by a graph.
For obvious confidentiality reasons, and for the sake of clarity, I will not publish that project's data here. Although free access to structured architectural data, aka BIM, is very restricted, there are some exceptions. BIMobject®, a digital platform for the construction industry based in Malmö, Sweden, is one of them. They published the model of their own headquarters, designed by SHL Architects, as a demonstration of best practice.
Studio Malmö by SHL
Networked Rooms
The idea, which is frankly not new but also not common practice in the industry, is to represent rooms and doors as nodes and edges of a network. Room properties such as area and perimeter become node_attributes and door properties such as width and height become edge_weights.
Get structured data
Name                  Area
CTO Office            9.52
Legal Eagle Office    9.51
PA Office             9.51
CEO Office            21.62
HR Office             9.30
From Room             To Room    Width
CTO Office            Open       1015
Legal Eagle Office    Open       1015
PA Office             Open       1015
CEO Office            Open       1015
CFO Office            Open       1015
Preprocessing rooms
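A minimal sketch of what this step could look like, assuming the room schedule has been loaded into a pandas DataFrame with the `Name` and `Area` columns shown above. The cleaning steps (trimming names, casting areas to numbers, dropping duplicates) and the function name `preprocess_rooms` are assumptions about what a typical export needs, not the original code.

```python
import pandas as pd

# First rows of the room table shown above. Areas are stored as strings here
# to mimic a raw export (an assumption).
rooms = pd.DataFrame({
    "Name": ["CTO Office", "Legal Eagle Office", "PA Office", "CEO Office", "HR Office"],
    "Area": ["9.52", "9.51", "9.51", "21.62", "9.30"],
})

def preprocess_rooms(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: clean the room table and index it by room name."""
    out = df.copy()
    out["Name"] = out["Name"].str.strip()                      # remove stray whitespace
    out["Area"] = pd.to_numeric(out["Area"], errors="coerce")  # text -> float, bad values -> NaN
    out = out.dropna(subset=["Name", "Area"])                  # rooms without name or area are unusable
    out = out.drop_duplicates(subset="Name")                   # room names will serve as node ids
    return out.set_index("Name")

rooms_clean = preprocess_rooms(rooms)
print(rooms_clean)
```

Indexing by name matters later: the room name is what ties a row in this table to a node in the graph.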
Preprocessing doors
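A comparable sketch for the doors, assuming a DataFrame with the `From Room`, `To Room` and `Width` columns shown above. Dropping doors that lack a room on either side and doors that loop back to the same room is an assumption; `preprocess_doors` is a hypothetical helper.

```python
import pandas as pd

# First rows of the door table shown above.
doors = pd.DataFrame({
    "From Room": ["CTO Office", "Legal Eagle Office", "PA Office", "CEO Office", "CFO Office"],
    "To Room": ["Open", "Open", "Open", "Open", "Open"],
    "Width": [1015, 1015, 1015, 1015, 1015],
})

def preprocess_doors(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep only doors that can become edges between two rooms."""
    out = df.copy()
    for col in ("From Room", "To Room"):
        out[col] = out[col].str.strip()
    out["Width"] = pd.to_numeric(out["Width"], errors="coerce")
    # A door with a missing side (for example an exterior door) has no second node.
    out = out.dropna(subset=["From Room", "To Room", "Width"])
    # A door from a room to itself would create a self-loop.
    out = out[out["From Room"] != out["To Room"]]
    return out.reset_index(drop=True)

doors_clean = preprocess_doors(doors)
print(doors_clean)
```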
Graph construction
Now it is time to build the network. The graph is small and should be easy to construct, but there is a problem: ‘Open’ and ‘Sales Arena’ are connected by two doors. A double edge, or multi-edge, is allowed but hinders calculations and, for all intents and purposes, can be replaced by a weight.
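The double edge can be seen in a small sketch, assuming networkx is used for the graph. The first five door rows come from the table above; the widths of the two ‘Sales Arena’ doors are placeholders, since they are not listed in the excerpt.

```python
import networkx as nx

# (from_room, to_room, width)
door_rows = [
    ("CTO Office", "Open", 1015),
    ("Legal Eagle Office", "Open", 1015),
    ("PA Office", "Open", 1015),
    ("CEO Office", "Open", 1015),
    ("CFO Office", "Open", 1015),
    ("Open", "Sales Arena", 1015),  # placeholder width
    ("Open", "Sales Arena", 1015),  # placeholder width
]

# A MultiGraph keeps parallel edges, so every door stays a separate edge.
M = nx.MultiGraph()
for room_a, room_b, width in door_rows:
    M.add_edge(room_a, room_b, width=width)

print(M.number_of_nodes(), M.number_of_edges())
print(M.number_of_edges("Open", "Sales Arena"))  # parallel edges between the two rooms
```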
Embed Attributes
At the moment our MultiGraph carries no attributes. It needs to be converted into a simple weighted Graph with embedded node attributes.
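One way to do the conversion, sketched with networkx on a reduced example. Using the number of doors between two rooms as the edge weight is an assumption; summing door widths would be an equally plausible choice. The attribute name `area` is also illustrative.

```python
import networkx as nx

# Reduced MultiGraph: two rooms with one door each, plus the double edge.
M = nx.MultiGraph()
M.add_edges_from([
    ("CTO Office", "Open"),
    ("CEO Office", "Open"),
    ("Open", "Sales Arena"),
    ("Open", "Sales Arena"),
])

# Collapse parallel edges into one weighted edge (weight = number of doors).
G = nx.Graph()
G.add_nodes_from(M.nodes)
for u, v in M.edges():
    if G.has_edge(u, v):
        G[u][v]["weight"] += 1
    else:
        G.add_edge(u, v, weight=1)

# Embed room properties as node attributes. Rooms missing from the dict
# (here 'Open' and 'Sales Arena') simply get no 'area' attribute.
areas = {"CTO Office": 9.52, "CEO Office": 21.62}
nx.set_node_attributes(G, areas, name="area")

print(G.edges(data=True))
print(G.nodes(data=True))
```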
The Graph rendered as a Matrix
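A weighted graph can be written out as an adjacency matrix in which entry (i, j) holds the weight of the edge between room i and room j. A small sketch with networkx and NumPy, using a reduced graph and door counts as weights (an assumption):

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("CTO Office", "Open", 1),
    ("CEO Office", "Open", 1),
    ("Open", "Sales Arena", 2),  # two doors collapsed into weight 2
])

# Fix the row/column order explicitly so the matrix is reproducible.
nodes = sorted(G.nodes)
A = nx.to_numpy_array(G, nodelist=nodes, weight="weight")

print(nodes)
print(A)
```

Because the graph is undirected, the matrix is symmetric, and rooms without a connecting door get a zero.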
Gephi Export
Why is a graph useful in the first place? Because it is operational, not just a visualization. It can provide us with network-specific metrics that add to the dimensionality of the initial dataset. For a first evaluation, Gephi, an open-source graph-editing software, is a good option.
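Gephi reads the GEXF format, which networkx can write directly. A minimal sketch; the file name `rooms.gexf` and the temporary directory are illustrative choices.

```python
import os
import tempfile

import networkx as nx

G = nx.Graph()
G.add_edge("CTO Office", "Open", weight=1)
G.add_edge("Open", "Sales Arena", weight=2)
nx.set_node_attributes(G, {"CTO Office": 9.52}, name="area")

# Node and edge attributes are written along with the topology,
# so they show up in Gephi's data laboratory.
path = os.path.join(tempfile.gettempdir(), "rooms.gexf")
nx.write_gexf(G, path)
print(path)
```

The file can then be opened in Gephi, where statistics such as eigenvector centrality are computed and exported again as a table.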
Get data laboratory results
Label                 eigencentrality
CTO Office            0.209728
Open                  1.000000
Legal Eagle Office    0.209728
PA Office             0.209728
CEO Office            0.209728
This method effectively expands the dataset from two to fourteen dimensions. The enriched table has shape (25, 14), that is, 25 rows and 14 columns.
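Joining the Gephi metrics back onto the room table could look like the sketch below, assuming both tables are pandas DataFrames. Only the rows and the single metric column shown in the excerpts above are used, so the result is much smaller than the full table.

```python
import pandas as pd

rooms = pd.DataFrame({
    "Name": ["CTO Office", "Legal Eagle Office", "PA Office", "CEO Office", "HR Office"],
    "Area": [9.52, 9.51, 9.51, 21.62, 9.30],
})
metrics = pd.DataFrame({
    "Label": ["CTO Office", "Open", "Legal Eagle Office", "PA Office", "CEO Office"],
    "eigencentrality": [0.209728, 1.000000, 0.209728, 0.209728, 0.209728],
})

# Left join on the room name: every room keeps its row and gains the metric columns.
enriched = (
    rooms.merge(metrics, left_on="Name", right_on="Label", how="left")
         .drop(columns="Label")
)
print(enriched.shape)
print(enriched)
```

'HR Office' ends up with a missing value here only because its metrics row is not part of the excerpt.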
Clustering
With only 25 rows spread over 14 dimensions, our data are sparse to begin with, and clustering directly in that space is unlikely to produce anything useful. To escape the curse of dimensionality, an intermediate mapping to fewer dimensions is needed.
Step 1: Non-linear dimensionality reduction.
Step 2: K-means clustering in 3 dimensions.
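The two steps can be sketched with scikit-learn. The text does not name the reduction method, so t-SNE is used here purely as an assumption; the input matrix is random placeholder data standing in for the numeric columns of the enriched table, and the number of clusters (4) is likewise an arbitrary choice.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Placeholder feature matrix: 25 rooms x 13 numeric features (synthetic, not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 13))

# Put all features on a comparable scale before measuring distances.
X_scaled = StandardScaler().fit_transform(X)

# Step 1: non-linear reduction to 3 dimensions (t-SNE is an assumed choice).
# Perplexity must be smaller than the number of samples.
embedding = TSNE(
    n_components=3, perplexity=5, init="random", random_state=0
).fit_transform(X_scaled)

# Step 2: K-means on the 3-dimensional embedding.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embedding)

print(embedding.shape)
print(labels)
```

On real data, the cluster labels would be attached back to the room names to check whether the groups make architectural sense.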
The algorithm successfully identifies meaningful room clusters.