Tag: AI

  • Making audio content with AI and In-The-Box

    Suno and Udio are gigantic platforms for making music with AI, with all the pros and cons of this type of tool. They also offer free access, which is fine for understanding what they are, but if you want something more you have to pay, obviously.

    What if I wanted to use artificial intelligence on my own computer, perhaps basing my results on truly free material, without stealing anything from anyone?

    Personally, I believe that an excellent tool for exploring AI on a personal PC is ComfyUI.

    In addition to letting you test all kinds of checkpoints for graphics and video through its powerful node interface, it also gives you an audio model that is free both in use and in origin, since it is built on open-source audio material: Stable Audio Open 1.0.

    Unlike other models, which are private and not accessible for artists and researchers to build upon, Stable Audio Open is a new open-weights text-to-audio model, with an open architecture and training process, trained on Creative Commons data.

    Stable Audio Open generates stereo audio at 44.1 kHz in FLAC format. It is an open-source model optimized for generating short audio samples, sound effects, and production elements from text prompts. Ideal for creating drum beats, instrument riffs, ambient sounds, foley recordings, and other audio samples, the model was trained on data from Freesound and the Free Music Archive, respecting creator rights.
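
    Incidentally, if you prefer plain Python to a node interface, the same model can be driven through Stability's stable-audio-tools library. A minimal sketch along the lines of the usage documented on the model's page; the prompt and settings are just examples, and a capable GPU helps a lot:

    ```python
    import torch
    import torchaudio
    from einops import rearrange
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Download the open-weights model from Hugging Face
    model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
    model = model.to(device)

    # Text prompt plus timing conditioning (length of the clip in seconds)
    conditioning = [{
        "prompt": "128 BPM tech house drum loop",
        "seconds_start": 0,
        "seconds_total": 30,
    }]

    # Run the diffusion sampler to get a batch of stereo audio
    output = generate_diffusion_cond(
        model,
        steps=100,
        cfg_scale=7,
        conditioning=conditioning,
        sample_size=model_config["sample_size"],
        sampler_type="dpmpp-3m-sde",
        sigma_min=0.3,
        sigma_max=500,
        device=device,
    )

    # Flatten the batch, peak-normalize, and save as 16-bit audio
    output = rearrange(output, "b d n -> d (b n)")
    output = output.to(torch.float32).div(output.abs().max()).clamp(-1, 1)
    output = output.mul(32767).to(torch.int16).cpu()
    torchaudio.save("output.wav", output, model_config["sample_rate"])
    ```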

    How do you do it? Let's see right away.

    As I said previously, we need a tool to manage all the necessary elements. Personally I have had a good experience with ComfyUI; if you are inexperienced, or if Python and its cousins scare you, you can install the desktop version.

    The latest version right now is 0.3.10, which you can download here.

    It’s really very simple, but you can find all the necessary installation information on the ComfyUI website.

    One of the features of ComfyUI is that it allows you to use AI without having to resort to excessively powerful PCs, which I personally find very sensible.

    However, once the installation is finished and before launching the program, take a moment to look at the directory tree: inside the main folder there are some important subfolders, where we will have to place the files necessary to make the whole thing work.

    Inside the ComfyUI folder, notice the one called checkpoints. All the files necessary to make our workflow work will go inside folders like these.

    At this point our installation is a blank slate, and since our goal is to create sounds with AI, let’s get what we need.

    1. Open the ComfyUI audio examples page, and literally follow the instructions. Rename the two required files as stated and put them in the right directories (a small script to double-check the placement follows, just after the workflow image).
    2. Download the workflow: it is simply a FLAC file, which you can save for example in the ComfyUI > user folder, and later drag into the ComfyUI interface to extract the embedded workflow.
    3. Now we can open ComfyUI by double clicking on the relevant icon created by the installation.
    4. Drag the previously downloaded .flac file onto the ComfyUI window, and you should see an interface similar to the image below. The nodes can be repositioned however you find most convenient.
    The audio workflow in ComfyUI.
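
    If something does not load, the usual culprit is a model file in the wrong folder. Here is a tiny sketch to check the layout; the filenames and directories below are hypothetical, so use the exact names stated on the audio examples page:

    ```python
    from pathlib import Path

    # Adjust to where you installed ComfyUI
    comfy = Path("ComfyUI")

    # Hypothetical filenames -- the audio examples page gives the real ones
    expected = [
        comfy / "models" / "checkpoints" / "stable_audio_open_1.0.safetensors",
        comfy / "models" / "text_encoders" / "t5_base.safetensors",
    ]

    for f in expected:
        print(("OK     " if f.exists() else "MISSING"), f)
    ```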

    That’s it, you don’t need anything else: you’re ready to type your prompt into the CLIP Text Encode node and click Queue.
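
    Clicking Queue is all it takes, but for completeness: ComfyUI also exposes a small HTTP API, so the same workflow can be queued from a script. A sketch assuming a default local install on port 8188 and a workflow saved from the interface in API format; the node id for CLIP Text Encode is hypothetical, so check your own export:

    ```python
    import json
    import urllib.request

    # Load a workflow exported from ComfyUI in API format
    with open("workflow_api.json") as f:
        workflow = json.load(f)

    # Overwrite the prompt text; "6" is a hypothetical id for the CLIP Text Encode node
    workflow["6"]["inputs"]["text"] = "gentle rain on a tin roof, distant thunder"

    # Queue the workflow on a locally running ComfyUI instance
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=json.dumps({"prompt": workflow}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())
    ```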

    Example of audio generated with the base workflow from the prompt in the image.

    I hope it wasn’t too difficult. The technical part is finished; if you have obtained an audio file in the Save audio node, the installation works.

    Creating meaningful prompts requires some experimentation, of course. Your results will be saved in ComfyUI’s Output folder.

    I strongly suggest studying the prompts page in the Stable Audio User Guide; it really explains how to proceed.
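
    To give a flavor of what the guide suggests (the examples below are my own, not taken from it), prompts tend to work best when they combine genre or sound type, instrumentation, tempo, and mood:

    ```
    128 BPM tech house drum loop, punchy kick, crisp hi-hats
    warm ambient synth pad, slow attack, spacious reverb, calm
    solo fingerpicked acoustic guitar riff, folk, intimate, close-miked
    ```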

    This is the starting point; from here you can start building your own path with AI. For example:

    BEWARE: it is a dangerous drug, and your hard drive will fill up quickly.

    You can find countless examples by doing a little search for “ComfyUI audio workflow”.

    Obviously this is only one of the ways to obtain our result; there are many others. It’s just probably the easiest one to get started with.

  • Best wishes from Pelican Village

    A song for us avatars: Livio Korobase & Renee Rebane in Digital Holiday.

    [Verse]
    In the land of codes and streams,
    Where we live our digital dreams,
    The snow falls bright in shades of blue,
    A pixel-perfect holiday view.
    Avatars in festive clothes,
    Building trees where no one knows,
    Silent nights in a neon glow,
    It’s Christmas in the pixel snow.

    [Chorus]
    Oh, it’s Christmas in the pixel snow,
    Where the virtual winds of winter blow.
    Lights that sparkle, hearts that gleam,
    A holiday in a coded dream.
    Oh, it’s Christmas in the pixel snow,
    Together, no matter where we go.
    Across the wires, through the screen,
    We celebrate as one big team.

    Udio, Studio One, OBS. Video recorded in Second Life. Livio Korobase & Renee Rebane in Digital Holiday

  • Singing Poet Society @ Hexagons: what happened?

    One of the most controversial things about generative AI in the artistic field is undoubtedly the fact that the gigantic databases on which generation is based are built from data available on the web, without asking the authors for any authorization. Some sites have specialized in audio generation, but they do not care about the origin of the generated content; instead they focus on web interfaces designed to make it easy to create “songs” that sound “believable”.

    This also applies to graphics and anywhere else AI works generatively, so much so that prompts can be capped with the wording “in the style of [famous name here]”, sometimes yielding somewhat “artistic” results. But who is the artist in this case? The person who wrote the prompt, or the one who actually created the material the AI based the piece on?

    In my opinion there is no real creative act in this; it is more a question of luck than anything else.

    The case of the Singing Poet Society project, instead, adds an element that changes the game. Tony has trained the AI (a process called machine learning) using his own material, an aspect that in my opinion constitutes the heart of the matter. The AI is used here simply as a tool for building a song; ultimately it is not that different from using sequencers or other generative tools in a DAW.

    Knowing that the one singing is an AI with Tony’s voice is nonetheless a bit shocking, but that’s what actually happens.

    I haven’t come to a personal opinion yet and I don’t know what I think, but removing other people’s material from the scene certainly cleans up the perspective.

    Anyway, here is the recording of the evening, so everyone can develop their own conviction.


    Tony Gerber aka Cypress Rosewood’s Singing Poet Society @ Hexagon 241207 (AI music project).
    Video by D-oo-b.

  • Singing Poet Society at The Hexagons, AI meets poetry and music

    Saturday, Dec 7, 1 PM SLT in Second Life, Roof of The Hexagons. Presentation of the project, performance, and Q&A session with Tony Gerber.

    There is always a lot of discussion about artificial intelligence and its intelligent use; it seems that the same adjective is used in an inconsistent way.

    I really like this Singing Poet Society project, because it is undoubtedly an example of how AI can be used creatively and in an original way.

    In an innovative blend of art and technology, Tony Gerber, a visionary artist and musician, has embraced artificial intelligence (AI) with enthusiasm and creative inspiration. His creations contain various blends of his own original music and AI collaboration.

    The Singing Poet Society YouTube channel serves as both a platform for artistic collaboration with AI and an educational tool aimed at demystifying AI’s role in creative endeavors.

    The channel proudly hosts an impressive collection of 110 videos, each transforming public domain poems from celebrated poets such as Robert Frost, Emily Dickinson, Edgar Allan Poe, and other literary luminaries into captivating song videos.

    Gerber has harnessed AI-driven graphic tools like Midjourney and integrated emerging AI music applications, including the beta version of Udio, alongside traditional video editing techniques to craft these engaging and thought-provoking pieces.

    “Singing Poet Society” is not merely an entertainment outlet but a source of inspiration and education.
    Beyond its YouTube presence, Gerber envisions the channel as a conduit for introducing AI into educational settings, particularly within schools and English classes. His goal is to illuminate AI’s potential as a tool for enhancing learning, by enabling students to explore and interpret the rich insights, life reflections, and human experiences encapsulated in classic poetry.

    Through this initiative, Gerber encourages students to engage creatively with poetry, fostering their own compositions and song videos, and offering fresh perspectives on public domain, time-honored works.

    This fusion of AI technology and poetic artistry promises to open new avenues for learning and creation, making the “Singing Poet Society” a pioneering venture in the realm of digital education and artistic expression.

    An example of Singing Poet Society channel content. Don’t miss the transcriptions of poetry.

    Tony Gerber has been a part of the Nashville art, music, and technology communities for 43 years. He has worked with technology as an artistic tool since the ’70s, and with Singing Poet Society he continues this project to inspire younger generations and re-inspire older ones.

    CONTACT:

    Tony Gerber,
    singingpoetsociety@gmail.com,
    615-414-1241
    http://singingpoetsociety.com
    Gerber’s music site: tonygerber.bandcamp.com

  • AI and music: Pinokio and Magenta Studio

    I think we’ve all tried a bit to use artificial intelligence to make music. At first you are amazed; then slowly you find the limits, and above all the costs.

    My personal view is that machine learning can be used to enable and enhance the creative potential of all people, and I’d like it to be like that for everyone.

    That said, there are many platforms on the Web, even complex ones, that offer the possibility of creating a song from a text prompt. The “trial” generation is free, but if you need more, you have to switch to a paid plan based on the amount of rendering you need.

    However, it is also possible to generate music with AI on your own computer, downloading several different models and thus avoiding the costs of the online platforms.

    I would like to talk here about two solutions that work locally, on your PC: Pinokio and Magenta Studio, two completely different approaches to AI-generated music.

    Pinokio

    Pinokio really is a viable solution: its scripts take care of downloading everything you need and configuring the working environment without disturbing your file system in any way. During installation you will be asked to choose a Pinokio Home, and everything you download will go inside this directory: no mess around the PC.

    The installation procedure and setup are very simple, and it’s available for Windows, Mac, and Linux.

    Pinokio is like a Steam or Play Store for AI.

    The available scripts obviously do not concern only music; there is a myriad of applications in all the relevant areas: text, images, videos, and so on.
    Warning: each application requires disk space, and the downloads are quite heavy. Make sure you have room on the disk where you created your Pinokio Home.

    I have installed several libraries on my PC, currently the ones you see in the image below. Well, that’s 140 GB of disk space, and unfortunately appetite comes with eating.

    The Discover page is gigantic and full of distributions.

    Anyway, interesting. Worth a try.

    Magenta Studio

    Magenta Studio follows a completely different path and is based on recurrent neural networks (RNNs). A recurrent neural network has looped, or recurrent, connections that allow the network to hold information across inputs; these connections can be thought of as a kind of memory. RNNs are particularly useful for learning sequential data like music. Magenta Studio currently consists of five tools: Continue, Drumify, Generate, Groove, and Interpolate.
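
    To make the “memory” idea concrete, here is a toy sketch of mine in PyTorch (an illustration of the technique, not Magenta’s actual code): the LSTM’s hidden state carries what has been heard so far, and the network predicts the next note from it.

    ```python
    import torch
    import torch.nn as nn

    NUM_PITCHES = 128  # MIDI pitch range

    class NextNoteRNN(nn.Module):
        """Predicts the next MIDI pitch from the notes seen so far."""
        def __init__(self, hidden_size=256):
            super().__init__()
            self.embed = nn.Embedding(NUM_PITCHES, hidden_size)
            self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, NUM_PITCHES)

        def forward(self, notes, state=None):
            x = self.embed(notes)             # (batch, time, hidden)
            out, state = self.rnn(x, state)   # state is the network's "memory"
            return self.head(out), state      # logits over the next pitch

    # Untrained, so the continuation is random -- but this is the mechanism
    # a tool like Continue relies on after training on a corpus of melodies.
    model = NextNoteRNN()
    seed = torch.tensor([[60, 62, 64, 65]])   # C D E F as MIDI note numbers
    logits, state = model(seed)
    probs = torch.softmax(logits[0, -1], dim=-1)
    print("next pitch:", torch.multinomial(probs, 1).item())
    ```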

    These tools are available as standalone programs, but version 2 has become an integrated plugin for Ableton Live, with the same functionality as version 1. They use cutting-edge machine learning techniques for music generation; really interesting.

    At the Magenta site you can also become familiar with the so-called DDSP-VST.

    Okay, talking about Neural Synthesis may sound like science fiction, but it’s actually simpler than it seems. At the end of the day, it’s just a matter of installing a VST3, nothing complex.
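
    For the curious: the core trick behind DDSP (“differentiable digital signal processing”) is that classic synthesizer building blocks are controlled by a neural network. My toy sketch below keeps only the signal part, a bank of harmonic oscillators; in the real DDSP-VST, the per-harmonic amplitudes would be predicted by a model from the pitch and loudness of your input.

    ```python
    import numpy as np

    sr = 16000                      # sample rate in Hz
    t = np.arange(2 * sr) / sr      # two seconds of time axis
    f0 = 220.0                      # fundamental frequency (A3)

    # In DDSP these amplitudes come from a neural network; here they are fixed
    harmonic_amps = [1.0, 0.5, 0.25, 0.125]

    # Sum a bank of sine oscillators at integer multiples of f0
    audio = sum(a * np.sin(2 * np.pi * (k + 1) * f0 * t)
                for k, a in enumerate(harmonic_amps))
    audio /= np.abs(audio).max()    # peak-normalize to [-1, 1]
    print(audio.shape)              # samples ready to write to a WAV file
    ```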

    If you like to experiment, I find the part dedicated to creating your own instruments very interesting, where the artificial intelligence can be trained on your own samples.

    If you use Linux or Mac, take a look at the Magenta MIDI Interface and say wow.

    In short, as they say: a lot of stuff to play with and many acronyms to learn.

    Quick presentation about Magenta