Contingent Determinacies marks this practice's initial engagement with reinforcement learning. It is directly compared here to a generative-art process based on GANs, namely Sculpture-GAN by Robbie Barrat, whose adapted code Obvious later used to produce Portrait of Edmond de Belamy. Apart from the difference in process, this juxtaposition also highlights the reasons why artists engaging with machine learning algorithms tend toward GANs rather than reinforcement learning. Both projects aim to generate 3D-printable sculptures. In the case of Sculpture-GAN, a three-dimensional GAN is fed a corpus of 10,000 3D-printable objects and sculptures in the hope that the model will extract the features necessary for a sculpture to be 3D-printable. As a result, the 3D-GAN almost always generates a 3D-printable sculpture, since it creates 3D objects that should be indistinguishable from the training corpus.
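For orientation, a minimal sketch of the kind of voxel-based 3D-GAN architecture described above could look as follows. This is an illustrative assumption written in PyTorch, not Robbie Barrat's actual Sculpture-GAN code; the latent size and the 32x32x32 voxel resolution are arbitrary choices.

```python
# Minimal sketch of a voxel-based 3D-GAN (illustrative only; not Sculpture-GAN's code).
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a latent vector to a 32x32x32 voxel occupancy grid."""
    def __init__(self, z_dim=200):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose3d(z_dim, 256, 4, 1, 0), nn.BatchNorm3d(256), nn.ReLU(),
            nn.ConvTranspose3d(256, 128, 4, 2, 1), nn.BatchNorm3d(128), nn.ReLU(),
            nn.ConvTranspose3d(128, 64, 4, 2, 1), nn.BatchNorm3d(64), nn.ReLU(),
            nn.ConvTranspose3d(64, 1, 4, 2, 1), nn.Sigmoid(),  # voxel probabilities
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1, 1))

class Discriminator(nn.Module):
    """Scores a voxel grid as 'from the corpus' vs 'generated'."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv3d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv3d(128, 256, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv3d(256, 1, 4, 1, 0),  # single real/fake logit
        )

    def forward(self, voxels):
        return self.net(voxels).view(-1)
```

A generator of this kind is only ever judged against the statistical profile of the corpus, which is why printability is hoped for rather than guaranteed.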
In the case of Contingent Determinacies, the work was implemented as a reinforcement learning algorithm. This entailed implementing an environment with a reward function and an agent with a policy function. A 3D-printable sculpture was defined as one with no disjunct parts, so that it can be printed, and one contained within the build volume of a given 3D printer. There were no further constraints regarding printability, since the use of support structures for overhanging parts was planned (overhanging parts are parts of a 3D model where the current layer has no material underneath, which leads to the melted plastic sagging, possibly all the way down to the print plate). The underlying aesthetic design of the sculpture was a 3D random walk. The agent could move along the three standard axes, without diagonal movement. It was placed at a random point inside the defined 3D space, it knew its current position relative to the starting point, it could decide whether to take a random step or a step in a chosen direction, and it could choose the step size. The reward function was defined as follows: taking a random step yielded a medium positive reward, taking a chosen step a medium negative reward; the absence of a gap yielded a small positive reward, while the presence of a gap yielded a sizeable negative reward; approaching the boundary of the 3D space without crossing it yielded a small negative reward, and crossing the boundary a large negative reward (the former acted as a proximity sensor for the boundary).
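A minimal sketch of how such an environment and reward function could be implemented is given below. The class, its method names, and the concrete reward magnitudes are assumptions: the text only fixes the sign and relative size (small, medium, large) of each reward, and the gap check reflects one possible reading of the gap rule.

```python
# Sketch of the random-walk environment and reward logic described above.
# Reward magnitudes are assumptions; the text only fixes their relative size.
import numpy as np

class SculptureEnv:
    def __init__(self, bounds=100, step_sizes=(1, 2, 4)):
        self.bounds = bounds                 # half-extent of the printable volume
        self.step_sizes = step_sizes
        self.pos = np.zeros(3, dtype=int)    # position relative to the starting point
        self.visited = {tuple(self.pos)}     # cubes placed so far

    def step(self, take_random_step, direction, step_size):
        # Axis-aligned movement only: direction is one of +/-x, +/-y, +/-z.
        axes = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
        if take_random_step:
            move = np.array(axes[np.random.randint(6)])
            reward = +0.5                    # medium positive: random step
        else:
            move = np.array(axes[direction])
            reward = -0.5                    # medium negative: chosen step
        new_pos = self.pos + move * step_size

        # Gap check (one possible reading): a step larger than one cube leaves a
        # gap unless the intermediate cells already belong to the sculpture.
        gap = any(tuple(self.pos + move * k) not in self.visited
                  for k in range(1, step_size))
        reward += -2.0 if gap else +0.1      # large negative vs small positive

        # Boundary proximity acts as a soft sensor before the hard penalty.
        if np.any(np.abs(new_pos) >= self.bounds):
            reward += -5.0                   # crossed the printer volume
            new_pos = np.clip(new_pos, -self.bounds, self.bounds)
        elif np.any(np.abs(new_pos) >= self.bounds - step_size):
            reward += -0.1                   # approaching the boundary

        self.pos = new_pos
        self.visited.add(tuple(self.pos))
        return self.pos.copy(), reward
```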
For aesthetic reasons, another rule was added: an increasing bounding volume of the sculpture yielded a reward that grew as x^(3/2), motivating the agent to create more expanded sculptures in a given number of steps and thus creating tension with the reward concerning the overlap of single cubes. This rule, however, motivated the agent to fill the whole 3D space with cubes right up to the proximity of the boundary. So a subsequent rule was added: a sculpture whose volume was 1/3 of its bounding volume received the highest reward, with the reward falling off as the ratio deviated in either direction. More rules emerged while sculpting the process.
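One possible reading of these two volume-related rules is sketched below; the scaling constants and the linear fall-off around the 1/3 target are assumptions.

```python
# Sketch of the volume rules: reward grows with the bounding volume as x^(3/2)
# and peaks when the sculpture fills about one third of that volume, falling
# off with deviation in either direction. Constants are assumptions.
def volume_reward(bounding_volume, filled_volume, k_expand=0.01, k_fill=1.0):
    expansion = k_expand * bounding_volume ** 1.5            # favours expanded sculptures
    fill_ratio = filled_volume / max(bounding_volume, 1)
    fill = k_fill * (1.0 - abs(fill_ratio - 1 / 3) / (1 / 3))  # peaks at 1/3 fill
    return expansion + fill
```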
There is already a substantial technical difference visible between the two processes. The description of the Sculpture-GAN process took one sentence, while the description of the RL process took a whole paragraph. First, there is a larger probability that an already programmed or easily adjustable GAN framework exists for a given task; RL frameworks, in contrast, must be programmed from scratch, especially in art, where projects tend to be idiosyncratic. Second, in a GAN, the result is determined by an external data set, unlike in RL, where the system's reward mechanism and policy function are inherent to it. Third, this ready-made availability of frameworks means that the artist using a GAN only needs to change the data set and tweak the model's parameters to achieve the desired aesthetic result. In RL, the artist needs to add or remove reward rules, assess their influence on all the rules already present, and reflect on the implications of each new rule for the overall aesthetics. Without the last rule, and with the agent's only goal being to maximise its cumulative reward, the system produced completely undesirable results. Fourth, while the GAN produced the expected result in most cases, there was no guarantee that the 3D model would be printable, since printability is only statistically implied by the training data. In RL, given a balanced reward mechanism, the agent will always produce printable results after enough training episodes, due to the agent's inherent cognition.
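To make the reward-sculpting workflow concrete, one way of organising such a system, assumed here for illustration rather than taken from the actual implementation, is to keep each rule as a named term whose contribution can be inspected separately whenever a rule is added or removed (the volume_reward function is reused from the sketch above).

```python
# Illustrative sketch of reward "sculpting": each rule is a named term, so a
# rule can be added, removed, or re-weighted and its contribution to the total
# reward inspected per step. Names and values are assumptions.
reward_rules = {
    "random_step": lambda s: +0.5 if s["took_random_step"] else -0.5,
    "gap":         lambda s: -2.0 if s["gap"] else +0.1,
    "boundary":    lambda s: -5.0 if s["crossed"] else (-0.1 if s["near"] else 0.0),
    "volume":      lambda s: volume_reward(s["bounding_volume"], s["filled_volume"]),
}

def total_reward(state, rules=reward_rules):
    contributions = {name: rule(state) for name, rule in rules.items()}
    return sum(contributions.values()), contributions   # total plus per-rule breakdown
```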
Apart from the technical differences, an even more significant, ana-material difference lurks in the concept. GANs are based on the model of mimesis: they generate a resemblance not to a single instance of the dataset or, more generally, to an original (a landscape, a portrait), but a statistical resemblance to the data set itself. The GAN operates as passive matter, on which the model of an external data set is pressed like a mould into wet clay. With incredible accuracy, the generator produces a copy of the statistical original. It presents it first to the discriminator, a digital version of Plato, which replies: good copy, bad copy, bad copy, bad copy, good copy; and, when sufficiently trained, to the artist, who replies: printable, unprintable, unprintable, unprintable, printable. As a machinic surrogate, it learns over time, through this feedback, to produce the results that the artist desires, hence its emergent identity, bound to the artist.
RL systems, by contrast, operate as cognitive agents, learning through reward mechanisms and through their actions in the environment; their mode of operation is conceptual/processual, and the result does not mime anything but is entirely the outcome of the process. Furthermore, since there is no original, the Plato mentioned above, uninvited into the algorithm, would be stuck repeating: false copy, false copy, false copy, a simulacrum. The artist was devising, or rather sculpting, the method of rewards that guided a non-conscious cogniser through its algorithmic space-time to devise, or rather cognise, a process that would maximise its reward and non-consciously produce the artist's desired result.