Adobe unveils some features of Project VoCo, the Photoshop of audio

Photoshop is the photo editing program that revolutionized the world of graphics; with Project VoCo, Adobe aims to do the same for the world of audio.

Adobe, known for its video and digital graphics products, surprised everyone at the Adobe MAX event, held recently in San Diego, California, by presenting what can be considered - to all intents and purposes - the Photoshop of audio.

Project VoCo - developed in collaboration with Princeton University - is still in the experimental stage, but Zeyu Jin, one of the team's developers, demonstrated some of its main functions, such as the ability to add, change or delete words in an audio recording as if it were written text.

The purpose of Project VoCo, it seems, will not be limited to editing speeches or simply replacing some words as in a word processing program. The goal is far more ambitious: to create speeches in software, entirely from scratch, without the subject being present. All you need is the person's voice timbre to make them say whatever you want. It is an audio editing tool that will certainly be useful for correcting spoken errors in recordings and podcasts, but some doubts about other, less legitimate uses naturally arise…

How Project VoCo works

The “surprise” presentation of the new project was lighthearted and entertaining. Adobe showed how simple it is to edit speech, removing some words and replacing them with others. The results, although Project VoCo is still in the experimental stage, look very promising. All it takes is for a subject to record at least 20 minutes of audio samples; a series of algorithms, called CUTE, then analyzes the speech and divides the words into individual phonemes to build the speaker's speech pattern. The application captures the subject's voice timbre, and the text of the speech can then be edited at will, as in a word processing program.

Ethics on the edge

Project VoCo, at the moment, is only able to replace one word with another or correct pronunciation. It is what comes next that raises questions. Creating a speech from scratch using only software? Without the need to record the person in question live? It's true that the application could be of great help in correcting recordings and podcasts, but if the program allows you to edit a speech as easily as retouching an image in Photoshop, then you will have to be wary not only of what you see, but even more of what you hear.