Presto worker process mysteriously killed and restarted sometimes
In our Presto cluster (0.212) with ~200 nodes (EC2 instances), a few Presto worker processes occasionally (roughly once per day) restart mysteriously, usually at around the same time as each other when it happens. The EC2 instances themselves are fine, and memory metrics show only about 70% of memory in use.
Does the Presto worker have any kind of suicide-and-restart logic (for example, restart after >= M consecutive errors in a row)? Can the Presto coordinator restart a worker under some circumstances? What else might kill a few worker processes at around the same time?
Here is one example of the server log that shows the restart.
2018-11-14T23:16:28.78011 2018-11-14T23:16:28.776Z INFO Thread-63 io.airlift.bootstrap.LifeCycleManager Life cycle stopping...
2018-11-14T23:16:29.17181 ThreadDump 4524
2018-11-14T23:16:29.17182 ForceSafepoint 414
2018-11-14T23:16:29.17182 Deoptimize 66
2018-11-14T23:16:29.17182 CollectForMetadataAllocation 11
2018-11-14T23:16:29.17182 CGC_Operation 272
2018-11-14T23:16:29.17182 G1IncCollectionPause 2900
2018-11-14T23:16:29.17183 EnableBiasedLocking 1
2018-11-14T23:16:29.17183 RevokeBias 6248
2018-11-14T23:16:29.17183 BulkRevokeBias 272
2018-11-14T23:16:29.17183 Exit 1
2018-11-14T23:16:29.17183 931 VM operations coalesced during safepoint
2018-11-14T23:16:29.17184 Maximum sync time 197 ms
2018-11-14T23:16:29.17184 Maximum vm operation time (except for Exit VM operation) 2599 ms
2018-11-14T23:16:29.52968 ./finish: line 37: kill: (3700) - No such process
2018-11-14T23:16:29.52969 ./finish: line 37: kill: (3702) - No such process
2018-11-14T23:16:31.53563 ./finish: line 40: kill: (3704) - No such process
2018-11-14T23:16:31.53564 ./finish: line 40: kill: (3706) - No such process
2018-11-14T23:16:32.25948 2018-11-14T23:16:32.257Z INFO main io.airlift.log.Logging Logging to stderr
2018-11-14T23:16:32.26034 2018-11-14T23:16:32.260Z INFO main Bootstrap Loading configuration
2018-11-14T23:16:32.33800 2018-11-14T23:16:32.337Z INFO main Bootstrap Initializing logging
......
2018-11-14T23:16:35.75427 2018-11-14T23:16:35.754Z INFO main io.airlift.bootstrap.LifeCycleManager Life cycle starting...
2018-11-14T23:16:35.75556 2018-11-14T23:16:35.755Z INFO main io.airlift.bootstrap.LifeCycleManager Life cycle startup complete. System ready.
If relevant, the "./finish: ..." lines in the log come from the /etc/service/presto/finish file shown below.
1 #!/bin/bash
2 set -e
3 exec 2>&1
4 exec 3>>/var/log/runit/runit.log
5
6 STATSD_PREFIX="runit.presto"
7 source /etc/statsd/functions
8
9 function error_handler() {
10 echo "$(date +"%Y-%m-%dT%H:%M:%S.%3NZ") Error occurred in run file at line: $1."
11 echo "$(date +"%Y-%m-%dT%H:%M:%S.%3NZ") Line exited with status: $2"
12 incr "finish.error"
13 }
14 trap 'error_handler $LINENO $?' ERR
15 echo "$(date +"%Y-%m-%dT%H:%M:%S.%3NZ") process=presto status=stopped exitcode=$1 waitcode=$2" >&3
16 # treat non-zero exit codes as a crash
17 # waitcode contains the signal if there's one (ex. 11 - SIGSEGV)
18 if [ $1 -ne 0 ]; then
19 incr "finish.crash"
20 fi
21
22
23 # ensure that we kill the entire process group.
24 # When sv force-restart runs, it will try to TERM the runit processes. If
25 # this doesn't work, it will kill (-9) the process. In case of haproxy,
26 # apache, gunicorn, etc., the master process will be killed (-9). Child processes
27 # (ie apache workers, gunicorn workers) will *not* be killed and will be
28 # around for minutes (if not hours). These child workers will keep
29 # listening on the socket, preventing the new master apache/gunicorn
30 # processes from binding to the socket. The new master process will keep
31 # crashing and be restarted by runit until the old child processes are
32 # gone.
33
34 # determine the process group id. it's the group id of the current (finish) process.
35 PGID=$(ps -o pgid= $$ | grep -o [0-9]*)
36 # kill all processes, except ourself and the PGID (which is the main process)
37 kill $(pgrep -g $PGID | egrep -v "$PGID|$$" ) || true
38 sleep 2
39 # kill -9 to be sure
40 kill -9 $(pgrep -g $PGID | egrep -v "$PGID|$$" ) || true
41
42 echo "$(date +"%Y-%m-%dT%H:%M:%S.%3NZ") process=presto status=finished" >&3
43 incr "finish.count"
44 timing "finish.duration"
prestodb
asked Nov 21 at 2:23, edited Nov 21 at 19:25 – danzhi

Comments:
What's the ./finish shell script? – Piotr Findeisen, Nov 21 at 9:57
Edited the post to show the ./finish file content. – danzhi, Nov 21 at 19:26
There is a system that force kills the JVM when an OutOfMemoryException is thrown. This is a hard kill, so you would not get the "Life cycle stopping..." message. That message is only printed from a JVM shutdown hook, so someone would have to send a kill signal to the process. – Dain Sundstrom, Nov 22 at 23:45
We found that our continuous pull deployment will restart the Presto server under some conditions. It is however not 100% confirmed due to missing deployment logs. – danzhi, Dec 10 at 17:16
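Following up on the shutdown-hook comment above: since "Life cycle stopping..." implies something delivered a signal to the JVM, one way to identify the sender is to audit kill syscalls on a few affected workers. A minimal sketch, assuming auditd is available on a 64-bit kernel; the rule key name "presto_kill" is illustrative:

# Log every SIGTERM/SIGKILL delivered via the kill syscall so the next
# restart records which process (pid, exe, uid) sent the signal.
auditctl -a always,exit -F arch=b64 -S kill -F a1=15 -k presto_kill
auditctl -a always,exit -F arch=b64 -S kill -F a1=9 -k presto_kill
# After the next unexplained restart, look up the sender:
ausearch -k presto_kill -i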
1 Answer
Our continuous pull deploy (Salt-based) was restarting the Presto server process under certain conditions (a dependency or config change). This was undesirable and unintentional, and the related listen_in sections have been removed.

answered Dec 14 at 19:48 – danzhi
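For context on the fix: in Salt, a listen/listen_in (or watch/watch_in) requisite attached to a service state makes Salt restart that service at the end of a highstate whenever the watched state reports changes, which is why a routine pull deploy can bounce workers on several nodes at roughly the same time. A rough way to locate such requisites, assuming the default /srv/salt state tree (the path and grep pattern are assumptions):

# List listen/watch requisites in the state tree that mention presto; these
# are what trigger a service restart when a watched package or config file
# changes during a deploy.
grep -rnE 'listen(_in)?|watch(_in)?' /srv/salt --include='*.sls' | grep -i presto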