In audiovisual postproduction, Foley is the practice of recreating sounds by various means to replace or supply audio that is missing from the original recording.
Ingenuity has always been key to recreating these effects, but a new algorithm aims to make the process easier by generating the missing sound effects with AI.
Foley effects generated by artificial intelligence
Sanchita Ghose and John J. Prevost, members of the IEEE, an organization dedicated to research in technological innovation, recently published an article proposing an AI algorithm that uses deep learning techniques to determine the relationship between certain types of scene and their respective sounds, in order to generate audio suited to the silent clips it is given.
This type of tool can be built with a generative adversarial network (GAN), and that is precisely the approach behind FoleyGAN, the pair's proposal for generating Foley effects.
From each sequence of frames it is given, FoleyGAN generates sounds that match the visual information, with good audio quality and in sync with the image. The research focuses on building a generative adversarial network for this purpose while attending to each of those aspects.
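To make the adversarial idea concrete, here is a minimal sketch of the two competing objectives in a conditional GAN, written in plain Python. It is not FoleyGAN's architecture: the toy `generator` and `discriminator` functions, the weight layout, and the sample feature values are all hypothetical and chosen only for illustration. The generator turns visual features plus noise into a short audio vector, and the discriminator estimates whether a given audio vector is a real recording for those frames.

```python
import math
import random

EPS = 1e-12  # guards against log(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def generator(frame_features, noise, weights):
    """Toy generator (hypothetical): one tanh unit per audio sample,
    conditioned on the visual features and perturbed by noise."""
    return [
        math.tanh(sum(w * f for w, f in zip(row, frame_features)) + n)
        for row, n in zip(weights, noise)
    ]

def discriminator(frame_features, audio, weights):
    """Toy discriminator (hypothetical): probability that `audio`
    is a real recording for the given frame features."""
    inputs = frame_features + audio
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))

def discriminator_loss(p_real, p_fake):
    """Standard GAN discriminator loss: reward high scores on real
    audio and low scores on generated audio."""
    return -(math.log(max(p_real, EPS)) + math.log(max(1.0 - p_fake, EPS)))

def generator_loss(p_fake):
    """Non-saturating generator loss: reward fooling the discriminator."""
    return -math.log(max(p_fake, EPS))

# Illustrative values only; none of these come from the paper.
rng = random.Random(0)
frames = [0.2, -0.5, 0.9]   # stand-in for features extracted from video frames
real_audio = [0.1, 0.4]     # stand-in for a real Foley recording
g_weights = [[rng.uniform(-1, 1) for _ in frames] for _ in real_audio]
d_weights = [rng.uniform(-1, 1) for _ in range(len(frames) + len(real_audio))]
noise = [rng.gauss(0, 1) for _ in real_audio]

fake_audio = generator(frames, noise, g_weights)
p_real = discriminator(frames, real_audio, d_weights)
p_fake = discriminator(frames, fake_audio, d_weights)

print("discriminator loss:", discriminator_loss(p_real, p_fake))
print("generator loss:", generator_loss(p_fake))
```

In a real system, training would alternate between lowering the discriminator's loss and lowering the generator's loss, so that the generated audio gradually becomes harder to tell apart from real recordings of the same scene.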
The paper introducing FoleyGAN describes it at the outset as a system “capable of conditioning sequences of action of visual events that lead to the generation of visually aligned realistic soundtracks”.
The researchers say they trained FoleyGAN on a large sample of Foley data. Its first synthesized sounds were put to human evaluation and averaged 81% approval, a figure that suggests how plausible these synchronized sounds can be within a video clip.
The project's authors regard these results as positive: they maintain that, according to the reported statistics, their proposal outperforms projects based on other techniques and trained on different data samples.
Creating original Foley effects takes time and dedication. Downloading sounds from the Internet can be a quick fix, but it can detract from a project's originality. In the future, the AI presented here could add a third option to these two common alternatives.