Concatenate specific number of files

I have a bunch of files named uv_set_XXXXXXXX, where the 8 Xs stand for the usual year, month and day format (YYYYMMDD). Imagine I have 325 files of this type. I would like to concatenate them in groups of 50 files, so in the end I have 7 files (6 files of 50 and 1 of 25).



I have been thinking of using cat, but I can't see an option to select a number of files from a list. I could do this with Python, but I'm wondering whether some Unix command-line utility does it more directly.



Thanks.

Tags: bash, command, cat

asked Nov 19 at 12:25 by David

  • do you want the file names to be ordered on date (YYMMDD?) before splitting by 50?
    – stack0114106
    Nov 19 at 13:21

  • @stack0114106: no, it's not necessary at this moment. Thank you.
    – David
    Nov 19 at 13:30



2 Answers



Accepted answer (score 3)

With GNU parallel you can use the following command:



parallel -n50 "cat {} > out{#}" ::: uv_set_*


This will merge the first 50 files into out1, the next 50 files into out2, and so on.
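
If GNU parallel is not available, the same batching can be done with a plain bash loop. This is a minimal sketch, not part of the original answer; it mirrors the out1, out2, ... naming above and relies on the shell's lexicographic glob order (which for YYYYMMDD names is also date order):

# bash: concatenate uv_set_* files in batches of 50 into out1, out2, ...
i=0
batch=()
for f in uv_set_*; do
    batch+=("$f")
    if (( ${#batch[@]} == 50 )); then   # a full batch: write it out
        cat "${batch[@]}" > "out$((++i))"
        batch=()
    fi
done
if (( ${#batch[@]} > 0 )); then         # flush the final partial batch
    cat "${batch[@]}" > "out$((++i))"
fi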






answered Nov 19 at 12:40 by Socowi


    Answer (score 1)



    I would just break down and do this in Awk.



    awk 'FNR==1 && (i++ % 50 == 0) {      # at the start of every 50th file...
        if (NR > 1) close(p)              # ...close the previous destination
        p = "dest_" ++j }                 # ...and switch to the next dest_N
    { print > p }' uv_set_????????


    This creates files dest_1 through dest_7, the first six with 50 files each and the last with the remaining 25.



    Closing the previous file is necessary because the system only allows Awk to have a limited number of open file handles (though the limit is typically higher than 7 so it's probably not important in your example).
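
    As a quick sanity check after running it (my suggestion, not part of the original answer), the concatenated output should contain exactly the same bytes as the inputs:

    cat uv_set_???????? | wc -c   # total bytes across the source files
    cat dest_* | wc -c            # should print the same number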





    Thinking out loud dept, just to prevent anyone else from wasting time on repeating this dead end.



    You could use xargs -L 50 cat to concatenate 50 files at a time, but there is no simple way to pass in a new redirection for standard output for each invocation. You could try to hack your way around that with something like



    # XXX Do not use: incomplete
    printf '%s\n' uv_set_???????? |
    xargs -L 50 sh -c 'cat "$@" > ... something' _


    but I can't come up with an elegant way to have a different something each time.
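
    One way to finish the idea, anticipating the suggestion in the comments below, is to derive each output's name from the first file of its batch ("$1"), so no per-invocation counter is needed. A sketch only; the starting_at_ prefix is an illustrative choice, not something from the original answer:

    printf '%s\n' uv_set_???????? |
    xargs -L 50 sh -c 'cat "$@" > "starting_at_$1"' _   # $1 is the first file of each batch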






    edited Nov 19 at 12:50; answered Nov 19 at 12:44 by tripleee



    • Maybe you could redirect to > starting_at_"$1".
      – Socowi
      Nov 19 at 12:50










    • Yeah, I was toying with exactly that idea, but it simply sucked too much.
      – tripleee
      Nov 19 at 12:50










