Puppet: Recovering from failed run
Consider the following code:
file { '/etc/systemd/system/docker.service.d/http-proxy.conf':
  ensure  => 'present',
  owner   => 'root',
  group   => 'root',
  mode    => '644',
  content => '[Service]
Environment="HTTP_PROXY=http://10.0.2.2:3128"
Environment="HTTPS_PROXY=http://10.0.2.2:3128"
',
  notify  => Exec['daemon-reload'],
  require => Package['docker-ce'],
}

exec { 'daemon-reload':
  command     => 'systemctl daemon-reload',
  path        => '/sbin',
  refreshonly => true,
}

service { 'docker':
  ensure    => 'running',
  subscribe => File['/etc/systemd/system/docker.service.d/http-proxy.conf'],
  require   => Exec['daemon-reload'],
}
I would like to edit a systemd service. In this instance it is the environment for Docker, but it could be any other need.
Since a systemd unit file has been changed, systemctl daemon-reload must be run for the new configuration to be picked up.
Running puppet apply fails:
Notice: Compiled catalog for puppet-docker-test.<redacted> in environment production in 0.18 seconds
Notice: /Stage[main]/Main/File[/etc/systemd/system/docker.service.d/http-proxy.conf]/ensure: defined content as '{md5}dace796a9904d2c5e2c438e6faba2332'
Error: /Stage[main]/Main/Exec[daemon-reload]: Failed to call refresh: Could not find command 'systemctl'
Error: /Stage[main]/Main/Exec[daemon-reload]: Could not find command 'systemctl'
Notice: /Stage[main]/Main/Service[docker]: Dependency Exec[daemon-reload] has failures: false
Warning: /Stage[main]/Main/Service[docker]: Skipping because of failed dependencies
Notice: Applied catalog in 0.15 seconds
The cause is immediately obvious: systemctl lives in /bin, not /sbin, as configured. However, fixing this and then running puppet apply again will neither cause the service to be restarted nor systemctl daemon-reload to be run:
Notice: Compiled catalog for puppet-docker-test.<redacted> in environment production in 0.19 seconds
Notice: Applied catalog in 0.16 seconds
Apparently, this happens because there were no changes to the file resource (it was already applied on the failed run), so nothing refreshes the daemon-reload exec or, in turn, triggers the service to restart.
To force Puppet to reload and restart the service, I could change the contents of the file on disk, or change the content in the Puppet code, but it feels like I'm missing a better way of doing this.
How can I better recover from such a scenario? Or, how can I write Puppet code that doesn't have this issue?
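For reference, this is roughly what the corrected exec resource looks like. Listing several directories in path is a hedge, since the location of systemctl varies between distributions (it is typically /bin or /usr/bin):

```puppet
exec { 'daemon-reload':
  # systemctl usually lives in /bin or /usr/bin, not /sbin
  command     => 'systemctl daemon-reload',
  path        => ['/bin', '/usr/bin'],
  refreshonly => true,
}
```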
Tags: puppet, systemd
asked Nov 19 at 21:15 by Thiago Vinicius
1 Answer
Puppet does not provide a mechanism for resuming a failed run. Resuming would make little sense anyway: for a resumed run to produce a different result, the machine state must have changed since the failure, and a machine-state change made outside Puppet potentially invalidates the catalog that was being applied.
The agent does, by default, send run reports to the master, so in the event of a failed run, you should be able to determine from that what went wrong. Supposing that you don't want to scour reports to figure out how to recover from a failed run, however, you could consider compiling a recovery script.
For example, you know that any failure may have caused a daemon-reload to be missed, and it is harmless to perform one when it isn't required, so just put that in your script. You might also put in a restart of each service under management. Basically, you're looking for anything that has non-trivial refresh behavior (Execs and Services are the main ones that come to mind).
It occurs to me that if you're extra clever then it might be possible to put that in the form of one or more Puppet classes, and to determine on the master whether the last run for the target node failed, so as to apply the recovery.
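As a rough illustration of that idea, a hypothetical recovery class (the class and resource names here are made up) could unconditionally replay the refresh actions that a failed run may have skipped:

```puppet
# Hypothetical recovery class: apply it after a failed run, e.g. with
#   puppet apply -e 'include recovery'
# to replay refresh actions that the failed run may have skipped.
class recovery {
  # A daemon-reload is harmless when not strictly needed, so run it unconditionally.
  exec { 'recovery-daemon-reload':
    command => 'systemctl daemon-reload',
    path    => ['/bin', '/usr/bin'],
  }
  # Restart each service under management; extend this list as needed.
  exec { 'recovery-restart-docker':
    command => 'systemctl restart docker',
    path    => ['/bin', '/usr/bin'],
    require => Exec['recovery-daemon-reload'],
  }
}
```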
As for avoiding the problem in the first place, I can only suggest testing, testing, and more testing. To that end, if you don't have dedicated machines on which to test Puppet updates, then at least select a small number of normal machines that get updates first, so that any problems that arise are closely contained.
answered Nov 19 at 23:07 by John Bollinger