JS pitch shift with timbre control

I need a good pitch-shift solution for my project to change a voice. There are a lot of JS pitch-shift libraries around. I've tried them all, but none of them gives the desired result. The main problem is that there is no control over the timbre of the resulting voice, so I get Mickey Mouse or hell-zombie sounds rather than real voices. By contrast, the result here is outstanding if you test it with vega's voice: http://www.sonicapi.com/docs/live-task-demo?task=process-elastiqueTune#demo_form
Unfortunately, I'm a total zero at audio processing, and I'd like to know at least how it's done: what type of shifting algorithm is used here, and how can we achieve timbre/formant control over the process? Any hints are highly appreciated. Thanks ;)










javascript web-audio pitch-shifting

asked Nov 25 '18 at 13:22
Grigory Grogulenko

1 Answer

This question touches on a very broad subject. Here are a few pointers.



In general, pitch can be shifted by scaling the frequencies that make up the voice material. The easiest version of this is resampling in the time domain, where the recording is essentially played back at a different speed. This naturally changes the tempo as well, which is often undesirable.
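
To make the resampling idea concrete, here is a minimal sketch in plain JavaScript. The function name `resampleLinear` and its `ratio` parameter are my own and not taken from any library. It uses linear interpolation with no anti-aliasing filter, so treat it as an illustration rather than production code.

```javascript
// Resample a mono buffer by linear interpolation.
// ratio > 1 reads the input faster: higher pitch, shorter output.
// ratio < 1 reads it slower: lower pitch, longer output.
// Hypothetical helper for illustration; no anti-aliasing is applied.
function resampleLinear(samples, ratio) {
  if (!(ratio > 0)) {
    throw new RangeError('ratio must be a positive number');
  }
  const outLength = Math.floor(samples.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = i * ratio;               // fractional read position
    const i0 = Math.floor(pos);          // sample before the position
    const i1 = Math.min(i0 + 1, samples.length - 1); // sample after it
    const frac = pos - i0;
    out[i] = samples[i0] * (1 - frac) + samples[i1] * frac;
  }
  return out;
}
```

With `ratio = 2` the output is half as long and sounds an octave higher, which shows why pitch and tempo are tied together in this approach.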



To preserve the tempo, you need to "explode" the material into its components, in other words, move from the time domain to the frequency domain. This is what the Fourier transform is for. Applied to short, overlapping frames of the signal, it gives you an estimate of the set of frequencies (and their phases, if done properly in complex space) per frame.
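
As a rough illustration of what the transform gives you, the sketch below computes a plain O(n²) discrete Fourier transform of one frame and reports the strongest frequency. The names `dft` and `dominantFrequency` are made up for this example. A real implementation would use an FFT and a window function.

```javascript
// Naive discrete Fourier transform of one real-valued frame.
// Returns the real and imaginary parts of every frequency bin.
function dft(frame) {
  const n = frame.length;
  const re = new Float64Array(n);
  const im = new Float64Array(n);
  for (let k = 0; k < n; k++) {
    for (let t = 0; t < n; t++) {
      const angle = (-2 * Math.PI * k * t) / n;
      re[k] += frame[t] * Math.cos(angle);
      im[k] += frame[t] * Math.sin(angle);
    }
  }
  return { re, im };
}

// Frequency (in Hz) of the strongest bin, ignoring the DC bin.
function dominantFrequency(frame, sampleRate) {
  const { re, im } = dft(frame);
  const n = frame.length;
  let bestBin = 0;
  let bestMagnitude = -1;
  for (let k = 1; k <= n / 2; k++) {
    const magnitude = Math.hypot(re[k], im[k]);
    if (magnitude > bestMagnitude) {
      bestMagnitude = magnitude;
      bestBin = k;
    }
  }
  return (bestBin * sampleRate) / n;
}

// Usage: a 1000 Hz sine sampled at 8000 Hz, 64 samples long.
const demoFrame = Float64Array.from({ length: 64 }, (_, t) =>
  Math.sin((2 * Math.PI * 1000 * t) / 8000)
);
console.log(dominantFrequency(demoFrame, 8000)); // 1000
```

The phase of each bin is available as `Math.atan2(im[k], re[k])`, which is what more careful pitch shifters track from frame to frame.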



The perceived timbre of the voice depends on the relative amplitudes of the overtones, the frequencies that sit above the fundamental. The overtones are produced together with the fundamental at the vocal folds, and the speaker's vocal tract shapes their relative amplitudes through its resonances (the formants). You can control the timbre using different filters in the time domain, the spectral (frequency) domain, or the cepstral domain. This kind of signal processing could fill a library section full of books.
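
One way to picture timbre control in the spectral domain: estimate the smooth outline of the magnitude spectrum (the spectral envelope, which carries the formants) and re-impose the original outline on the pitch-shifted spectrum. The sketch below is only an assumption of how that could look. `smoothEnvelope` and `restoreEnvelope` are hypothetical names, and the moving average is a crude stand-in for the cepstral or LPC envelope estimates that are normally used.

```javascript
// Crude spectral envelope: moving average over the magnitude spectrum.
// halfWidth is the number of neighbouring bins taken on each side.
function smoothEnvelope(magnitudes, halfWidth) {
  const n = magnitudes.length;
  const envelope = new Float64Array(n);
  for (let k = 0; k < n; k++) {
    const from = Math.max(0, k - halfWidth);
    const to = Math.min(n - 1, k + halfWidth);
    let sum = 0;
    for (let j = from; j <= to; j++) sum += magnitudes[j];
    envelope[k] = sum / (to - from + 1);
  }
  return envelope;
}

// Rescale the shifted magnitudes so their envelope matches the
// envelope of the original spectrum again.
function restoreEnvelope(originalMag, shiftedMag, halfWidth) {
  if (originalMag.length !== shiftedMag.length) {
    throw new RangeError('spectra must have the same length');
  }
  const epsilon = 1e-12; // avoids division by zero in silent regions
  const originalEnv = smoothEnvelope(originalMag, halfWidth);
  const shiftedEnv = smoothEnvelope(shiftedMag, halfWidth);
  const corrected = new Float64Array(shiftedMag.length);
  for (let k = 0; k < corrected.length; k++) {
    corrected[k] = (shiftedMag[k] * originalEnv[k]) / (shiftedEnv[k] + epsilon);
  }
  return corrected;
}
```

Scaling the envelope on purpose, instead of restoring it, is what would give you a separate timbre knob next to the pitch knob.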



You can move back from the spectral (frequency) domain to the temporal (time) domain using the inverse Fourier transform.



To sum up, the naive approach to shifting the pitch is to transform the samples from the temporal to the spectral domain, rescale the frequencies (resample along the frequency axis), and then apply the inverse Fourier transform to get back to the time domain.
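
A minimal sketch of that naive pipeline for a single frame might look like the following. The names `transform` and `shiftFramePitch` are hypothetical, the transform is a slow O(n²) DFT, and the frame length is assumed to be even. Real shifters (phase vocoders, for example) process overlapping windowed frames and correct the phases between them, which this sketch does not attempt.

```javascript
// Complex DFT. sign = -1 gives the forward transform, sign = +1 the
// inverse transform (without the 1/n scaling).
function transform(re, im, sign) {
  const n = re.length;
  const outRe = new Float64Array(n);
  const outIm = new Float64Array(n);
  for (let k = 0; k < n; k++) {
    for (let t = 0; t < n; t++) {
      const angle = (sign * 2 * Math.PI * k * t) / n;
      const c = Math.cos(angle);
      const s = Math.sin(angle);
      outRe[k] += re[t] * c - im[t] * s;
      outIm[k] += re[t] * s + im[t] * c;
    }
  }
  return { re: outRe, im: outIm };
}

// Move every positive-frequency bin k to round(k * ratio), mirror it
// to the matching negative-frequency bin so the result stays real,
// then transform back. Bins that land outside the spectrum are dropped.
function shiftFramePitch(frame, ratio) {
  const n = frame.length;
  const spectrum = transform(Float64Array.from(frame), new Float64Array(n), -1);
  const half = Math.floor(n / 2);
  const shiftedRe = new Float64Array(n);
  const shiftedIm = new Float64Array(n);
  shiftedRe[0] = spectrum.re[0]; // keep the DC component in place
  for (let k = 1; k < half; k++) {
    const target = Math.round(k * ratio);
    if (target < 1 || target >= half) continue;
    shiftedRe[target] += spectrum.re[k];
    shiftedIm[target] += spectrum.im[k];
    shiftedRe[n - target] += spectrum.re[k]; // complex conjugate mirror
    shiftedIm[n - target] -= spectrum.im[k];
  }
  const result = transform(shiftedRe, shiftedIm, +1);
  return Float64Array.from(result.re, (value) => value / n);
}
```

The frame length stays the same, so the tempo is preserved, but the whole spectrum moves together, formants included. That is exactly the Mickey Mouse effect from the question, so some form of envelope correction is still needed for natural-sounding voices.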



Besides the Fourier transform, you could use wavelets. I hope this gets you started.






answered Nov 25 '18 at 14:06
Sami Hult