Script to report disk usage
I am trying to speed up the script below, which reports disk usage. The timed find command towards the end is the slow part. The script is run on directories that hold 6-7 TB of data, and it takes 16-18 hours. I want it to finish in under 8 hours. Can someone please suggest ways to modify this script?
# disk_check.csh takes a directory name as a mandatory first argument and an optional -<num> or -verbose as a second argument.
# Ex1: disk_check <dir_name>          - Reports the disk usage per user and the total disk consumption
# Ex2: disk_check <dir_name> -verbose - As Ex1, and also lists all files in the given directory by size
# Ex3: disk_check <dir_name> -<num>   - Like Ex2, but reports only the top <num> files by size
if ($#argv == 0) then
    echo " Error : Dir path missing"
    echo " Syntax : disk_check <dir-name> <verbose>"
    echo " verbose gives a list of all files per individual sorted by size"
    exit 0
endif
set cwd = $argv[1]
if ($cwd =~ "-help") then
    echo " Error : Dir path missing"
    echo " Syntax : disk_check <dir-name> <-verbose>"
    echo " -verbose gives a list of all files per individual sorted by size"
    exit 0
endif
if ($#argv > 1) then
    set opt = $argv[2]
    #echo "opt : $opt"
endif
if ( -d $cwd ) then
    # NOTE: these field numbers match the labels below only if the last line of
    # df output starts with the size column, e.g. when df wraps a long device
    # name onto its own line.
    set ava = `df -h $cwd | tail -1 | awk '{print $1}'`
    set tot = `df -h $cwd | tail -1 | awk '{print $2}'`
    set ad = `df -h $cwd | tail -1 | awk '{print $3}'`
    set pcu = `df -h $cwd | tail -1 | awk '{print $4}'`
    echo ""
    echo "Summary for dir ${cwd}: $tot Used (${pcu})"
    echo "-----------------------------------------------------------------------------"
    echo " Total Volume $ava"
    echo " Available on disk $ad "
    echo " Percentage used $pcu"
    echo ""
    echo "Summary by User:"
    printf "%sUser%15sSize%10sCount\n" ""
    echo "---------------------------------------------"
    # This is the command that takes a long time:
    time find $cwd -type f -printf "%u %s\n" | awk '{user[$1]+=$2;count[$1]++}; END{ for( i in user) printf "%s%-13s%5s%-0.2f%s%5s%7s\n","", i, "", user[i]/1024**3,"GB", "", count[i]}'| sort -nk2 -r
    if ($#argv > 1) then
        if ($opt =~ "-verbose") then
            echo "\nDetail, Sorted by size"
            printf " User%15sFile%15sSize\n" ""
            echo "---------------------------------------------------"
            find $cwd -type f -not -path '*/.*' -printf "%-13u | %-50p | %-10s \n" | sort -nk5 -r
        endif
    endif
endif
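One thing visible in the script above is that a -verbose run walks the directory tree twice, once for the per-user summary and once for the detail list. The following is a minimal sketch, not the asker's method, of saving one scan and deriving both reports from it. It is written for POSIX sh rather than tcsh, and the function names, the script name disk_scan.sh, and the "user bytes path" record layout are all assumptions made for illustration. It also assumes GNU find (for -printf, which the original already uses) and mktemp, and it does not reproduce the original's hidden-file filter.

```shell
#!/bin/sh
# Sketch only: walk the tree once, then build both reports from the saved scan.
# Function names and the "user bytes path" record layout are illustrative.

# stdin: "user bytes [path]" records
# stdout: user, total in GB, and file count, largest total first
summarize_by_user() {
    awk '{ bytes[$1] += $2; count[$1]++ }
         END {
             for (u in bytes)
                 printf "%-13s %10.2f GB %7d\n", u, bytes[u] / (1024 * 1024 * 1024), count[u]
         }' | sort -k2,2nr
}

# stdin: the same records; $1: how many of the largest files to print
largest_files() {
    sort -k2,2nr | head -n "$1"
}

# Runs only when a directory is given, e.g.: sh disk_scan.sh /some/dir 20
# (disk_scan.sh is a hypothetical file name for this sketch).
if [ "$#" -ge 1 ] && [ -d "$1" ]; then
    scan=$(mktemp) || exit 1
    trap 'rm -f "$scan"' EXIT
    # -printf is a GNU find extension.
    find "$1" -type f -printf '%u %s %p\n' > "$scan"
    summarize_by_user < "$scan"
    if [ "$#" -ge 2 ]; then
        largest_files "$2" < "$scan"
    fi
fi
```

This only removes the second walk in the -verbose case; the first walk still has to stat every file, so it may not be enough on its own to reach the 8-hour target. File names containing newlines would break the one-record-per-line layout.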
time-limit-exceeded file-system linux unix tcsh
Have you considered turning on disk quotas? Then the filesystem keeps track of the usage per user, and you can run quota to get a report.
– 200_success, 17 mins ago
asked 43 mins ago by user186743 (new contributor)
edited 23 mins ago by 200_success