[AI]rport aims to generate an airport terminal using a machine-learning model trained on existing videos of airports and our own architectural intuition.

Using ideas from Mario Carpo’s The Second Digital Turn, we conclude that the ever-increasing abundance of big data has not only changed our ways of making but has also “introduced a new culture and economics poised to change our ways of thinking”. A machine’s ability to search through vast numbers of items in milliseconds without sorting them, coupled with its ability to learn labels and patterns, makes artificial intelligence capable of becoming a design tool that not only finds a solution but generates it as well.

One of the goals of a design tool like this is to increase the speed at which architecture can respond to events in our world. We believe the current COVID pandemic will greatly affect the future of design, especially that of airports. At the same time, as climate change becomes more urgent and ecological commitments grow, so does "flying shame". This shifts the role of the airport from serving everyday vacation travel to supporting more ecologically aware, need-based travel.

While this tool will be able to generate a terminal for any city’s needs, we chose Xiong'an as our site for its focus on new technology and sustainability efforts.

By collecting and captioning videos of existing airport interiors, we create a dataset of frames based on the timestamp of each caption. Our captions were developed around three categories - Space, People, and Abstract - encompassing qualities like spaciousness, geometry, modularity, and light. By constraining the categories, we can teach the machine to understand the physical qualities of a frame without having to label every element in the image.
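The timestamp step above can be sketched as a simple mapping from caption times to frame indices. The frame rate and the (timestamp, category, text) caption format here are illustrative assumptions, not the project's actual pipeline:

```python
# Sketch: pair each timed caption with the video frame it describes.
# The 25 fps rate and the example captions are assumptions for illustration.

FPS = 25  # assumed frame rate of the source video

def frame_index(timestamp_s: float, fps: int = FPS) -> int:
    """Return the frame index nearest to a caption timestamp (in seconds)."""
    return round(timestamp_s * fps)

# Hypothetical captions: (timestamp in seconds, category, text)
captions = [
    (3.2,   "Space",    "vaulted ceiling, modular roof grid"),
    (12.8,  "People",   "sparse crowd moving toward gates"),
    (47.52, "Abstract", "diffuse daylight, calm and open"),
]

# Each dataset sample links a frame index to its categorized caption.
dataset = [(frame_index(t), cat, text) for t, cat, text in captions]
```

In a real pipeline the frame itself would then be extracted from the video at that index; here only the index arithmetic is shown.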

The training data captures airports of varied scales and locations across the globe. The goal is to capture the desirable qualities of all types of airports and have the machine translate them into a small terminal based on what it understands these qualities to be. Encoding our intuition means capturing more than just what is visible in the images; it includes aspects of beauty and possible usage as well.

The output will be generated through a sentence input. These sentences will describe qualities we wish the final space to have as one “walks through it”.

After researching our site and aspects of airport design, we developed goals and keywords for qualities to include in our generated terminal. These focus on the same three categories as the training captions. It’s important to understand that our only control over the output is through the training data and the input sentence; we cannot accurately predict what the machine will generate based on its understanding of the training words.
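Because control comes only through the trained vocabulary, an input sentence can be checked against the keywords the model has actually seen. A minimal sketch, in which the three categories match the captioning scheme but the keyword lists themselves are hypothetical examples:

```python
# Sketch: group an input sentence's recognized words by caption category.
# The keyword sets below are invented examples, not the project's actual list.

TRAINED_KEYWORDS = {
    "Space":    {"spacious", "vaulted", "modular", "linear", "curved"},
    "People":   {"crowded", "sparse", "flowing", "waiting"},
    "Abstract": {"bright", "diffuse", "calm", "monumental"},
}

def matched_keywords(sentence: str) -> dict:
    """Return, per category, the trained keywords the sentence contains."""
    words = set(sentence.lower().replace(",", " ").split())
    return {cat: kws & words for cat, kws in TRAINED_KEYWORDS.items()}

matches = matched_keywords("a spacious, bright and calm hall with modular roof")
```

Words outside the trained vocabulary would simply have no learned meaning for the model, which is why the input sentences draw on the same keywords as the captions.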

For the generated output to be a 3-dimensional model rather than an image, we need to ensure it encodes positional and temporal consistency when generating sequences of images. The sequence can then be made into a video, which can be lifted into 3D by predicting depth information from the encoded positions.
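The lift from predicted depth to 3D geometry can be sketched with a pinhole camera model: each pixel plus its depth becomes a point in camera space. The focal length, principal point, and the tiny 2x2 "depth map" below are illustrative assumptions, not the project's actual setup:

```python
# Sketch: back-project per-pixel depth predictions into 3D points.
# Camera intrinsics and the depth values are invented for illustration.

FX = FY = 500.0     # assumed focal length in pixels
CX = CY = 1.0       # assumed principal point (center of the tiny image)

def backproject(u: int, v: int, depth: float) -> tuple:
    """Map pixel (u, v) with depth z (meters) to camera-space (x, y, z)."""
    x = (u - CX) * depth / FX
    y = (v - CY) * depth / FY
    return (x, y, depth)

# Hypothetical 2x2 depth map (meters) predicted for one generated frame.
depth_map = [[10.0, 10.0],
             [12.0, 12.0]]

# One 3D point per pixel; repeating this per frame yields a point cloud
# whose consistency depends on the encoded positions across the sequence.
points = [backproject(u, v, depth_map[v][u])
          for v in range(2) for u in range(2)]
```

Accumulating such points across a consistent image sequence is what turns the generated video into navigable 3D geometry.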

While only our terminal will be developed by the a.i., our airport is more than a typical traveler's aid - it is a new flight-technology hub. Along with a passenger terminal, it will contain training grounds for new airplanes and aerial-rescue training, an EHang (autonomous aerial vehicle) center and command station, a general aviation terminal for private planes, and a connection to the underground infrastructure being built in Xiong'an.

Our design proposes the airport as a system connected to Xiong’an’s city center. Besides EHang flights, the connection to the underground infrastructure will create access for freight carriers and autonomous passenger vehicles. For this tunnel, we propose a metro line running under the Baiyang Dian Wetland to connect Xiong’an and the airport.

This will not only be a design tool that helps architects find novel solutions based on big data; it will also be novel to GAN developers and users. Until now, generation has typically been done by scraping the whole internet, with no specific aim or purpose. Here, we use tools from deep learning to encode the intuition of an architect.

This is also a beginning of "open world" generation. "Open world" describes the complexity of a scene: rather than a single enclosed object with a bounding box, it approaches the complexity and variability we see in the "real" world. We tackle this variability of complex scenes end-to-end: by training on full scenes, we can type an input and get an output.

We tackle the problems of extra data and annotations by constraining the parameters of our captions to the categories mentioned above. Further novelty will come when generating the final output once the 2D training is complete: going from image sequence to video, and eventually to 3D.

Making use of a.i.’s ability to navigate large quantities of data and recognize patterns through captions, we can train it with our own “intuition”, tapping into new ways of making and thinking. This will help the field of architecture engage with big data not only as an informant but as a tool of design, making its response faster and therefore more relevant to the events of the world around us.

Faculty Advisor:
Matias del Campo