Script to report disk usage











I am trying to speed up the script below, which reports disk usage. The timed find command near the end is the slow line. The script is run on directories that hold 6-7 TB of data, and it takes 16-18 hours. I want it to run in under 8 hours. Can someone please suggest ways to modify this script?



# disk_check.csh takes a directory name as a mandatory argument and an optional -<num> or -verbose as a second argument.
# Ex1: disk_check <dir_name> - Reports the disk usage per user and the total disk consumption
# Ex2: disk_check <dir_name> -verbose - Along with the above, it also lists all files by size in the given directory
# Ex3: disk_check <dir_name> -<num> - Similar to Ex2, but it reports only the top <num> files by size in the given directory


if ($#argv == 0) then
echo " Error : Dir path missing"
echo " Syntax : disk_check <dir-name> <-verbose>"
echo " -verbose gives a list of all files per individual sorted by size"
exit 1
endif

set cwd = $argv[1]
if ("$cwd" =~ "-help") then
echo " Syntax : disk_check <dir-name> <-verbose>"
echo " -verbose gives a list of all files per individual sorted by size"
exit 0
endif

if ($#argv > 1) then
set opt = $argv[2]
#echo "opt : $opt"
endif

if ( -d $cwd ) then

set ava = `df -h $cwd | tail -1 | awk '{print $1}'`
set tot = `df -h $cwd | tail -1 | awk '{print $2}'`
set ad = `df -h $cwd | tail -1 | awk '{print $3}'`
set pcu = `df -h $cwd | tail -1 | awk '{print $4}'`

echo ""
echo "Summary for dir ${cwd}: $tot Used (${pcu})"
echo "-----------------------------------------------------------------------------"
echo " Total Volume $ava"
echo " Available on disk $ad "
echo " Percentage used $pcu"
echo ""
echo "Summary by User:"
printf "%sUser%15sSize%10sCount\n" ""
echo "---------------------------------------------"

# This is the command that takes a long time:
time find $cwd -type f -printf "%u %s\n" | awk '{user[$1]+=$2;count[$1]++}; END{ for( i in user) printf "%s%-13s%5s%-0.2f%s%5s%7s\n","", i, "", user[i]/1024**3,"GB", "", count[i]}'| sort -nk2 -r


if ($#argv > 1) then
if ($opt =~ "-verbose") then
echo "\nDetail, Sorted by size"
printf " User%15sFile%15sSize\n" ""
echo "---------------------------------------------------"
find $cwd -type f -not -path '*/.*' -printf "%-13u | %-50p | %-10s \n" | sort -nk5 -r
endif
endif
endif
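
For reference, here is a minimal sketch of what the aggregation stage of the timed pipeline computes, run on a few made-up records instead of real find output. It is written for a POSIX sh rather than tcsh (an assumption, so that it can define a function), and the helper name summarize_by_user, the users alice and bob, and the sizes are all hypothetical. The column layout is simplified and is not the exact format used in the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of disk_check.csh): reads "user size_in_bytes"
# lines on stdin, then prints total size in GB and file count per user,
# largest total first.
summarize_by_user() {
    awk '{ bytes[$1] += $2; count[$1]++ }
         END {
             for (u in bytes)
                 printf "%-13s %8.2fGB %7d\n", u, bytes[u] / (1024 * 1024 * 1024), count[u]
         }' | sort -k2,2nr
}

# Made-up records standing in for: find "$dir" -type f -printf "%u %s\n"
printf '%s\n' 'alice 1073741824' 'bob 536870912' 'alice 2147483648' | summarize_by_user
```

Feeding the function a small file of saved records makes it possible to check the awk and sort stages separately from the directory walk.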









  • Have you considered turning on disk quotas? Then the filesystem keeps track of the usage per user, and you can run quota to get a report.
    – 200_success
    17 mins ago















time-limit-exceeded file-system linux unix tcsh






edited 23 mins ago by 200_success
asked 43 mins ago by user186743














