--GREP--
list file names only that do not contain 'newstimes' (-L lists files without a match)
grep -L 'newstimes' *
list files with '6082159.php' searching recursively
grep -rl '6082159.php' /data/hnp/articles/
list files that contain x but not y
grep -l 'x' * | xargs grep -L 'y'
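For example, with hypothetical patterns and file names (placeholders only): list the .log files that mention 'error' but never 'timeout':
grep -l 'error' *.log | xargs grep -L 'timeout'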
--To sort on the fourth column of a tab-delimited file ($'\t' passes a literal tab as the -t delimiter)
sort -t$'\t' -k 4,4 <filename>
--You might also want -V (version sort), which orders numbers more naturally: 1 2 10 rather than the lexicographic 1 10 2. (For purely numeric keys, -n works as well.)
sort -t$'\t' -k 4,4 -V <filename>
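A quick way to see the difference, piping three lines straight through sort:
printf '1\n10\n2\n' | sort        # lexicographic: 1 10 2
printf '1\n10\n2\n' | sort -V     # natural: 1 2 10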
--'''less''' is more [http://helpdeskgeek.com/linux-tips/more-less-command-linux-unix/]
[Arrows]/[Page Up]/[Page Down]/[Home]/[End]: Navigation.
[Space bar]: Next page.
b: Previous page.
ng: Jump to line number n. Default is the start of the file.
nG: Jump to line number n. Default is the end of the file.
/pattern: Search for pattern. Regular expressions can be used.
n: Go to next match (after a successful search).
N: Go to previous match.
mletter: Mark the current position with letter.
'letter: Return to position letter. [' = single quote]
'^ or g: Go to start of file.
'$ or G: Go to end of file.
s: Save the current content to a file (works when the input came from a pipe, e.g. from grep).
=: File information.
F: Continually read the file and follow its end, useful for watching logs (see the example after this list). Use Ctrl+C to exit this mode.
h: Help.
q: Quit.
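For example, to open a log file already in follow mode (similar to tail -f; the path is just an example):
less +F /var/log/syslog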
--remove duplicate rows of a file without sorting
awk '!x[$0]++' abill.tsv > abill_sm.tsv
In the de-duplicating script, awk evaluates the expression !x[$0]++ for every input line.
Breaking this down:
$0 is the entire current line.
x[$0] is an associative array element keyed by the current line; referencing it creates it with the value 0 the first time that line is seen.
x[$0]++ post-increments that element, so its value goes up by one every time the same line appears again.
!x[$0]++ is therefore true when x[$0] is 0 (the line has not been seen yet) and false otherwise; the post-increment happens after the test.
When a pattern is true and no action is given, awk's default action is to print the line, so each line is printed only the first time it appears.
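A tiny worked example (inline input with duplicates; only the first occurrence of each line survives):
printf 'a\nb\na\nc\nb\n' | awk '!x[$0]++'
The output is a, b, c (one line each).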
--wget entire directory
wget -r --no-parent --reject "index.html*" http://mysite.com/configs/.vim/
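(-r recurses, --no-parent keeps wget from climbing above the given directory, and --reject "index.html*" skips the auto-generated directory-listing pages.)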
-- output first 7 rows of a file
head -7 file > small_file
head -n 7 file > small_file
--for a gzipped file, decompress on the fly instead of cat
zcat file.gz | head -7 > small_file
-- output random selection of 7 rows of a file
shuf -n 7 file > small_file
--delete files below a certain size
find -name "*.csv" -size -160k -delete
--run without the delete to see what it selects
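For example, the dry run (same expression, no -delete):
find . -name "*.csv" -size -160k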
--Find the maximum value in a column (-f2 here selects the 2nd space-delimited field, for example)
cat filename | cut -f2 -d " " | sort -nr | head -1
--For minimum
cat filename | cut -f2 -d " " | sort -nr | tail -1
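A quick check with inline data (hypothetical two-column, space-delimited input):
printf 'a 3\nb 10\nc 7\n' | cut -f2 -d " " | sort -nr | head -1    # maximum: 10
printf 'a 3\nb 10\nc 7\n' | cut -f2 -d " " | sort -nr | tail -1    # minimum: 3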
--Show nohup processes: lsof | grep nohup.out
--Kill nohup processes: kill -9 pid
--Check how much space you are taking up: du --max-depth=1 /home/rickmc | sort -n -r
--Check how much space is left: df -k
--Find the big files (awk prints ls -lh's 9th field, the file name, and 5th field, the size): find {location} -type f -size +{file size threshold in kb}k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
example: find /data -type f -size +10000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
--What processes do you have running: ps -ef | grep rickmc
---Find a file
$ find / -name 'program.c' 2>/dev/null
$ find / -name 'program.c' 2>errors.txt
where:
/            Start searching from the root directory (i.e. the / directory).
-name        The search text is a file name rather than any other attribute of the file.
'program.c'  The text to search for. Always enclose the file name in single quotes; this keeps the shell from expanding any wildcards in it before find sees them.
Note: 2>/dev/null is not related to the find tool as such; it is shell redirection.
2 refers to the error stream (stderr) in Linux, and /dev/null is the device where anything you send simply disappears.
So 2>/dev/null here means that any error messages produced while searching (e.g. "Permission denied" for unreadable directories) are sent to /dev/null, i.e. simply discarded.
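For example, a pattern with a wildcard (example directory and pattern; the single quotes stop the shell from expanding the * before find runs):
find /home -name '*.c' 2>/dev/null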