Cleaning up after killing a running docker container












My goal is to write a Docker image that runs a Python script which produces a lot of CSV files full of random numbers. Once the script has finished, the files are to be written to an external storage drive, after which the container quits. Assume that it writes so many of these CSV files that they cannot all be held in memory.

What I'm worried about are the cases where the container encounters an error and quits (or is made to quit by the user), leaving behind a bunch of garbage files that have to be cleaned up manually.

The first solution is to mount a fast drive (like an SSD) directly into the container and write to it. Once the script is done, it transfers the data from this SSD to the external storage drive. The downside is that if the container quits unexpectedly, it will leave garbage on the SSD.

The second solution is to create a volume on the SSD, start a container with this volume, and then do pretty much the same as in the first solution. In this case, if the container dies unexpectedly, what happens to the volume? Will it be removed automatically as well? Can it be configured to be removed automatically, thereby deleting any garbage that was created?

In case you're curious, the final goal is to use these containers with some sort of orchestration system.







linux docker






asked Sep 24 at 20:39
Mr. Fegur

  • Stack Overflow is a site for programming and development questions. This question appears to be off-topic because it is not about programming or development. See "What topics can I ask about here" in the Help Center. Perhaps Super User or Unix & Linux Stack Exchange would be a better place to ask.
    – jww
    Sep 25 at 12:58


















2 Answers
"What I'm worried about are the cases where the container encounters an error and quits (or is made to quit by the user), and then it creates a bunch of garbage files that have to be manually cleaned."

Note that you can make your ENTRYPOINT Python script perform the necessary cleanup automatically.

Here are some pointers and examples of that approach:

  • I gave one such example (implemented in Bash, namely with a trap) in this SO answer.

  • Another possible example (implemented in Python) is given in this blog article.
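
As a rough illustration of that cleanup idea in Python (this is not the code from the linked posts), the sketch below turns termination signals into an exception so that a `finally` block can delete the scratch files. The names `Terminated`, `install_signal_handlers`, `scratch_directory` and the `/scratch` mount point are all hypothetical. Keep in mind that `docker kill` sends SIGKILL by default, which no process can catch, so this only helps with `docker stop`, Ctrl-C and ordinary Python errors.

```python
import shutil
import signal
import tempfile
from contextlib import contextmanager
from pathlib import Path


class Terminated(Exception):
    """Raised by the signal handler so that pending `finally` blocks run."""


def install_signal_handlers():
    """Turn SIGTERM (sent by `docker stop`) and SIGINT (Ctrl-C) into exceptions."""
    def handler(signum, frame):
        raise Terminated(f"received signal {signum}")

    signal.signal(signal.SIGTERM, handler)
    signal.signal(signal.SIGINT, handler)


@contextmanager
def scratch_directory(root=None):
    """Create a scratch directory and delete it on exit, even after an error."""
    path = Path(tempfile.mkdtemp(prefix="csv-job-", dir=root))
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


def main():
    install_signal_handlers()
    # "/scratch" is a hypothetical mount point for the fast SSD.
    with scratch_directory("/scratch") as tmp:
        (tmp / "part-0001.csv").write_text("0.42,0.17\n")
        # ... generate the remaining files, then copy them to external storage ...
```

For the handlers to fire, the Python process has to be the one that receives the signal, which generally means using the exec form of ENTRYPOINT rather than wrapping the script in a shell.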


Note that, beyond the graceful termination of your containers, you may want to set up a restart policy, such as always or unless-stopped. See, for example, this Codeship blog article.

"First solution is to mount a fast drive (like an SSD) directly into the container and write to it. After it is done, it transfers the data from this SSD to the external storage drive. The bad thing about this one is that if the container quits unexpectedly, it will leave garbage on the SSD."

"Second solution was to create a volume using the SSD, start a container with this volume, and then pretty much do the same as the first solution. In this case, if the container dies unexpectedly, what happens to the volume? Will it auto quit as well?"

Although the two solutions you present aren't necessary to address the main question of this thread, I should mention that, in general, it is a best practice to use volumes in production rather than a mere bind mount. But of course either of these two approaches (a named volume with -v volume-name:/path, or a bind mount with -v /path:/path) is far better than not using the -v option at all because, as far as I recall, data written directly to the writable layer of the container is lost if the container is recreated from the image.







edited Sep 24 at 22:04
answered Sep 24 at 21:58
ErikMD

"What I'm worried about are the cases where the container encounters an error and quits (or is made to quit by the user), and then it creates a bunch of garbage files that have to be manually cleaned."

If you write your intermediate files into the container filesystem, rather than to a persistent volume, then Docker can do all the hard work for you. Simply run your container with the remove option (--rm). For example, if you ran:

    docker run --rm -v /path/to/external/storage:/final/result your_image

then your application can write anywhere other than /final/result, and when the container exits (successfully or with any error condition), the Docker daemon automatically deletes the container along with everything written to its filesystem. On successful completion of your task, write your content to /final/result so that it persists after the container exits. This path is completely made up, and you'll likely want to adjust it for your usage.
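
The "write elsewhere, publish on success" flow described above could look roughly like the following Python sketch. The function names, the file naming scheme and the row and column counts are assumptions made purely for illustration; /final/result is the made-up mount path from the command above.

```python
import csv
import random
import shutil
import tempfile
from pathlib import Path


def write_random_csv(path, rows=100, cols=5):
    """Write one CSV file of random floats, one row at a time."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for _ in range(rows):
            writer.writerow([random.random() for _ in range(cols)])


def generate_then_publish(final_dir, n_files=3, work_dir=None):
    """Generate CSVs in a scratch directory, then copy them to final_dir.

    Nothing is copied to final_dir unless every file was generated.
    Returns the names of the published files.
    """
    final_dir = Path(final_dir)
    # By default the scratch space lives in the container filesystem, so it
    # disappears with the container when it was started with --rm.
    work_dir = Path(work_dir or tempfile.mkdtemp(prefix="csv-work-"))

    staged = []
    for i in range(n_files):
        target = work_dir / f"random-{i:04d}.csv"
        write_random_csv(target)
        staged.append(target)

    # Only reached if all files were written without an error.
    final_dir.mkdir(parents=True, exist_ok=True)
    for path in staged:
        shutil.copy2(path, final_dir / path.name)
    return [path.name for path in staged]
```

Inside the container this would be called as `generate_then_publish("/final/result")`. If the script fails or is stopped while generating, the mounted storage has not been touched yet and the partial files go away with the container.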



Note that if you are running in a desktop environment (Mac/Windows) rather than on native Linux, there is an issue with the VM disk expanding with usage and not shrinking as files are deleted. This is the nature of VM filesystems that allocate on use, and it is outside of Docker's control. In that scenario, you'd likely want to run the entire setup with an external volume and configure your entrypoint to clean up any temporary files left over from the last run of your container.
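
For that external-volume variant, the entrypoint can empty the scratch location before it starts any new work. Here is a minimal sketch, assuming a hypothetical helper `purge_leftovers` and a scratch directory that is used by this job only (anything else stored there would be deleted too):

```python
import shutil
from pathlib import Path


def purge_leftovers(scratch_dir):
    """Delete everything inside scratch_dir but keep the directory itself.

    The directory is kept because it is typically a mount point.
    Returns the sorted names of the entries that were removed.
    """
    scratch_dir = Path(scratch_dir)
    scratch_dir.mkdir(parents=True, exist_ok=True)
    removed = []
    # Take a snapshot of the listing before deleting anything.
    for entry in list(scratch_dir.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)
```

Calling this as the first step of the entrypoint means a run that was killed outright (where no in-process handler had a chance to run) is cleaned up by the next run instead.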







answered Sep 24 at 22:45
BMitch

  • The reason I didn't go with this option is that I did not know where Docker does its file I/O when left to its own devices. For example, if I have two disks, an HDD and an SSD, and I want my Python script to do its file I/O on the SSD for speed, then I don't know how to get Docker to do that using this method. With volumes, I can specify this fast SSD and have my script use it.
    – Mr. Fegur
    Sep 24 at 22:49

  • @Mr.Fegur the container files will be stored under /var/lib/docker while the container is running.
    – BMitch
    Sep 25 at 9:22

















