Automatic Cinema - Background

Russian Theory

Producing a movie can be a slow, inert process. This inertia can be a quality of filmmaking, but it is definitely also a limitation, since a production involves numerous steps. Film is a linguistically highly complex medium, and in the end the same story can be told differently through the cut. The raw material can be interpreted in numerous ways, but the cutting process is static. Automatic Cinema tries to make the cut dynamic by applying a style-based concept.

Theorists refer at this point to Kuleshov's experiments, in which the facial expression of the actor Ivan Mosjukin takes on a different meaning depending on the «montage». Speaking of Russians, the Russian formalists, such as Propp and Shklovsky, defined the terms «fabula» and «syuzhet», meaning the raw material on the one hand and its organization on the other. The concept of Automatic Cinema is heavily influenced by this division: storing content and applying a style.

Visual Approach

In every aspect, Automatic Cinema follows a visual strategy. Unlike other databases, where content is tagged with keywords, Automatic Cinema stores the relations between clips and keywords as graphs. Within this project, we call these graphs «Ontologies». Ontologies are multidimensional matrices of keywords, relations between keywords, and objects. The objects are video clips, images, texts or audio files.

Ontologies have a big advantage over a conventional keyword system. The association between a clip and a keyword is analogue: a clip can be positioned closer to or further away from a keyword, which makes the match more or less distinct. The same applies to keywords: relations between them are weighted by their distance. In an ontology, it is also possible to mark a relation as negative, positive, relational or causal. This is done by drawing lines in various colors, with or without arrows. Think of semantically turbocharged hyperlinks that say not only «connected», but also «yes», «no», «is related with» or «is a consequence of».
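One way to picture such an ontology in code is sketched below. This is only an illustration, not the data model Automatic Cinema actually uses: the names Ontology, Relation, RelationKind, place and match are hypothetical, and the mapping from distance to match strength, 1 / (1 + distance), is an assumption chosen for simplicity.

```python
from dataclasses import dataclass, field
from enum import Enum


class RelationKind(Enum):
    """The four relation types described in the text (names are hypothetical)."""
    POSITIVE = "yes"
    NEGATIVE = "no"
    RELATIONAL = "is related with"
    CAUSAL = "is a consequence of"


@dataclass(frozen=True)
class Relation:
    source: str          # keyword where the drawn line starts
    target: str          # keyword where the line (or arrow) ends
    kind: RelationKind
    distance: float      # relations are weighted by their distance


@dataclass
class Ontology:
    relations: list[Relation] = field(default_factory=list)
    # placements[clip][keyword] = distance between clip and keyword
    placements: dict[str, dict[str, float]] = field(default_factory=dict)

    def place(self, clip: str, keyword: str, distance: float) -> None:
        """Position a clip at a given distance from a keyword."""
        self.placements.setdefault(clip, {})[keyword] = distance

    def match(self, clip: str, keyword: str) -> float:
        """Match strength in (0, 1]; 0.0 if the clip is not placed near the keyword.

        The formula 1 / (1 + distance) is an assumption for illustration.
        """
        distance = self.placements.get(clip, {}).get(keyword)
        if distance is None:
            return 0.0
        return 1.0 / (1.0 + distance)


ontology = Ontology()
ontology.relations.append(Relation("rain", "flood", RelationKind.CAUSAL, 2.0))
ontology.place("storm.mov", "rain", 0.0)   # sits directly on the keyword
ontology.place("storm.mov", "flood", 1.0)  # a weaker association
```

The point of the sketch is that a clip does not simply have or lack a keyword; it carries a distance to each keyword, and that distance can later feed into a score.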

In the end, the narration engine follows the drawn paths, applying controllable random factors. The threshold for changing a keyword is adjustable as well. Since causal relations cannot be told the wrong way round, the narration engine follows arrows only in the direction in which they point. All other relations are bidirectional. The difference between positive and negative relations (green and red lines) is perceptible as a more or less continuous story. In short, the calculation of a narration based on ontologies is called «Geometrical Storytelling».
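The direction rule can be illustrated with a small sketch. Everything here is hypothetical: the edge list, the function names reachable and next_keyword, and in particular the way the random factor is modelled (with probability `randomness` a random neighbour is chosen, otherwise the first one). The real narration engine may work quite differently.

```python
import random

# (source, target, kind); for "causal" edges the arrow points source -> target.
EDGES = [
    ("rain", "flood", "causal"),
    ("rain", "umbrella", "positive"),
    ("rain", "desert", "negative"),
]


def reachable(edges, keyword):
    """Keywords that can be reached from `keyword` in one step."""
    result = []
    for source, target, kind in edges:
        if source == keyword:
            # Every relation may be followed from its source to its target.
            result.append((target, kind))
        elif target == keyword and kind != "causal":
            # Non-causal relations are bidirectional; causal arrows are not.
            result.append((source, kind))
    return result


def next_keyword(edges, keyword, randomness, rng):
    """Pick the next keyword; stay on the current one if there is no path."""
    options = reachable(edges, keyword)
    if not options:
        return keyword
    if rng.random() < randomness:
        return rng.choice(options)[0]
    return options[0][0]


rng = random.Random(0)
print(reachable(EDGES, "flood"))     # no way back along the causal arrow
print(reachable(EDGES, "umbrella"))  # positive relation works both ways
print(next_keyword(EDGES, "rain", 0.0, rng))
```

With `randomness` set to 0.0 the walk is fully deterministic; raising it lets the narration wander more freely among the permitted neighbours.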

Physics + Content + History = Future

The calculation of a narration answers the question of which clip best fits the one currently running. The better a clip fits, the higher its score. In the worst case, no clip has a sufficient score, and the algorithm may propose a short pause. Thresholds are also applied to scores and can be set in the style settings.

Scores are calculated from three factors. First, there is the historical aspect: how many times has a clip been used since the beginning of a show? Depending on the style settings, repetition yields a smaller or bigger demerit. Second, the interpolated physical parameters are compared with those of each clip. Physical parameters are values that can be computed, such as brightness, hue, saturation or duration. Last but not least, the «content» score is calculated. This is done by analyzing the ontology and measuring the distances between an element and the current position.
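The three factors and the score threshold might be combined roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual formula: all function names are hypothetical, each partial score is assumed to lie between 0 and 1, physical parameters are assumed to be normalized to that range, and the total is assumed to be a weighted mean.

```python
def history_score(times_used, repetition_penalty):
    """Demerit for repetition; a higher penalty punishes reuse more (assumed formula)."""
    return 1.0 / (1.0 + repetition_penalty * times_used)


def physical_score(target, clip):
    """Similarity of physical parameters, all assumed normalized to 0..1.

    `target` holds the interpolated values, `clip` the measured ones;
    both dicts are assumed to share the same keys.
    """
    diffs = [abs(target[name] - clip[name]) for name in target]
    return 1.0 - sum(diffs) / len(diffs)


def content_score(distance):
    """Closeness within the ontology; shorter distance gives a higher score."""
    return 1.0 / (1.0 + distance)


def total_score(history, physical, content, weights=(1.0, 1.0, 1.0)):
    """Weighted mean of the three partial scores (the weighting is an assumption)."""
    parts = (history, physical, content)
    return sum(w * p for w, p in zip(weights, parts)) / sum(weights)


def pick_clip(scores, threshold):
    """Return the best-scoring clip, or None to signal a short pause."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


scores = {
    "camel.mov": total_score(history_score(2, 0.5), 0.75, content_score(1.0)),
    "dune.mov": total_score(history_score(0, 0.5), 0.75, content_score(0.0)),
}
print(pick_clip(scores, threshold=0.6))
```

Returning None when no clip reaches the threshold corresponds to the pause described above; the caller decides what a pause looks like on screen.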

Geometrical Storytelling

As described above, one part of the score is calculated from the positioning within the ontology. This involves a multi-part process. Ontologies may contain multiple dimensions, so that keywords can be grouped into topics. Also, a clip can be associated with more than one keyword in one dimension. An image showing a camel and a lion may be assigned both keywords, «camel» and «lion».

Based on the style settings, the algorithm first decides whether switching to another dimension is appropriate. Second, the currently active keyword is evaluated. The longer a keyword goes without being switched, the bigger a value called «tension» gets. Switching the keyword always resets the tension to zero, like a thunderstorm discharging its electric tension.
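The build-up and discharge of tension can be sketched as a simple counter. The class name Tension, the unit of one step per clip, and the idea that a switch is triggered as soon as the tension reaches the adjustable threshold are assumptions made for this illustration.

```python
class Tension:
    """Grows while the keyword stays the same; drops to zero on a switch."""

    def __init__(self, threshold):
        self.threshold = threshold  # adjustable in the style settings
        self.value = 0

    def step(self):
        """Advance by one clip; return True if the keyword should switch now."""
        self.value += 1
        if self.value >= self.threshold:
            self.value = 0  # the thunderstorm discharges
            return True
        return False


tension = Tension(threshold=3)
switches = [tension.step() for _ in range(4)]
print(switches, tension.value)
```

A low threshold makes the narration jump between keywords quickly, while a high one lets it dwell on a keyword for longer.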

If the style setting for «path accuracy» has a high value, the selection of elements follows the lines more closely. If it has a low value, the distance between the last and the current element is given more weight. This renders a narration either more associative or more logical.
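One conceivable reading of this setting is a linear blend between two distance measures. The sketch below is an assumption, not the documented behaviour: the function name blended_content_score is hypothetical, «path accuracy» is assumed to range from 0 to 1, and both distances are turned into scores with 1 / (1 + distance).

```python
def blended_content_score(path_distance, spatial_distance, path_accuracy):
    """Blend two views of closeness according to «path accuracy» (assumed 0..1).

    path_distance:    distance measured along the drawn lines
    spatial_distance: direct distance between the last and the current element
    """
    path_term = 1.0 / (1.0 + path_distance)
    spatial_term = 1.0 / (1.0 + spatial_distance)
    return path_accuracy * path_term + (1.0 - path_accuracy) * spatial_term


# An element that lies right next to the last one but is one step away along the lines:
print(blended_content_score(1.0, 0.0, path_accuracy=1.0))  # lines decide
print(blended_content_score(1.0, 0.0, path_accuracy=0.0))  # proximity decides
```

With a high setting, only elements connected by lines score well; with a low setting, mere proximity in the ontology is enough.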