Unix Command Line Fu - Programming ≈ Fun

It’s 1AM… You have:

a broken production system (hey, you should have tested better; that never happens to me!)
a humongous log file
no idea whatsoever what’s wrong

You may or may not know it, but right now your best friends are the Unix command line and a guy named Doug McIlroy. If you know who he is, move along; there is nothing new here for you…

If you don’t, THE important thing is what he wrote on October 11, 1964, in a memo that became the inspiration for piping in Unix:

We should have some ways of coupling programs like garden hose – screw in another segment when it becomes necessary to massage data in another way…

I believe that massaging data is exactly the remedy you need. Looking at the log data from multiple angles is the fastest way to figure out what is going on, and the famous Unix one-liners are the way to do it. They say that back in 1973, when Ken Thompson implemented pipelines (the ones McIlroy had been talking about nine years earlier), a whole new world opened up…

…we had this orgy of one liners. Everybody had a one liner. Look at this, look at that…

The whole Unix philosophy of doing one thing and doing it well was articulated more clearly right then and there. The only problem is that the command line is complex and you don’t know where to start. I used to feel that way, but here is the kicker:

You only need to know a few tools and even fewer options. Even the basics will get you a long way.

So I urge you to embrace a couple of simple (to use) tools, and make some rather smart people from the past work for you. Climb on the shoulders of a giant…and enjoy the ride!

grep

Your best friend in the world of sifting through error logs. Let’s say you need the lines in a log that contain errors:

grep ERROR RequestLog.log

Options?

-l (files with matches)
-i (ignore case)
-r (recursive)
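Here is a minimal sketch of those three options in action. The scratch directory, file names and log lines are all made up for illustration; only the grep flags come from the list above (plus -c, which counts matching lines).

```shell
# Hypothetical sample logs in a scratch directory (names and contents are made up).
logdir=$(mktemp -d)
mkdir "$logdir/archive"
printf '%s\n' 'INFO started' 'ERROR db timeout' 'error disk full' > "$logdir/RequestLog.log"
printf '%s\n' 'INFO all good' > "$logdir/Other.log"
printf '%s\n' 'ERROR old failure' > "$logdir/archive/Old.log"

# -i: ignore case, so both "ERROR" and "error" match (-c counts matching lines).
matches=$(grep -ic error "$logdir/RequestLog.log")
echo "$matches"

# -r with -l: walk the directory tree and print only the names of files that match.
files=$(grep -rl ERROR "$logdir")
echo "$files"

rm -rf "$logdir"
```

The -l variant is handy at 1AM when you don’t even know which log file to open first.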

cut

Your lines are too long to fit on the screen, or they repeat data that you don’t need to look at. Don’t worry, cut is here to help you out:

cut -c 150-200 RequestLog.log | grep ERROR

This keeps only characters 150 through 200 of each line.

Options?

-c (character positions)
-d (delimiter)
-f (fields)

For example, you can take the first two fields delimited by a colon like this:

cut -d: -f 1-2 DatabaseErrorLog.log
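To see what the field options actually select, here is a small sketch on a single line. The log format (date, level, component, message separated by colons) is an assumption made up for this example, not the format of any real log.

```shell
# Hypothetical log line with colon-separated fields (the format is an assumption).
line='2024-01-15:ERROR:db:connection timed out'

# -d sets the delimiter, -f picks fields: 1-2 is a range, 2,4 is a list.
first_two=$(printf '%s\n' "$line" | cut -d: -f 1-2)
picked=$(printf '%s\n' "$line" | cut -d: -f 2,4)

# -c works on character positions instead: characters 1 through 10.
date_part=$(printf '%s\n' "$line" | cut -c 1-10)

echo "$first_two"   # 2024-01-15:ERROR
echo "$picked"      # ERROR:connection timed out
echo "$date_part"   # 2024-01-15
```

Note that cut keeps the delimiter between the fields it prints.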

sort

Sorts its input line by line.

Options?

-r (reverse)
-n (sort by numeric order)
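The difference between plain sort and sort -n is easiest to see on numbers of different lengths. A quick sketch with made-up values:

```shell
# Made-up numbers, one per line (for example response sizes).
sizes=$(printf '%s\n' 9 100 25)

# Default: compares lines as text, so "100" comes before "25" and "9".
text_order=$(printf '%s\n' "$sizes" | sort)

# -n: compares by numeric value.
numeric_order=$(printf '%s\n' "$sizes" | sort -n)

# -r combined with -n: numeric, largest first.
largest_first=$(printf '%s\n' "$sizes" | sort -rn)

echo "$text_order"      # 100, 25, 9 (one per line)
echo "$numeric_order"   # 9, 25, 100
echo "$largest_first"   # 100, 25, 9
```

Here the text order and the reverse numeric order happen to look the same, which is exactly the kind of coincidence that hides a missing -n.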

uniq

Collapses repeated lines so that each one appears once.

Options?

-c (count)

Caveat: uniq only compares adjacent lines. If the file is not sorted, two identical lines with a different line between them are treated as different.

This is actually beneficial sometimes, and it fits nicely with the “do one thing” philosophy. Nothing prevents you from piping sort into it, like in the example below:

cut -c 150-200 RequestLog.log | grep ERROR | sort | uniq -c

If you don’t need the count, you can use sort -u instead of sort | uniq:

cut -c 150-200 RequestLog.log | grep ERROR | sort -u

The next example counts how many times each unique error appears and then sorts from the biggest count to the smallest:

cut -c 150-200 RequestLog.log | grep ERROR | sort | uniq -c | sort -rn
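If you want to try the uniq caveat and the counting pipeline without a real log at hand, here is a rough sketch on a few made-up error lines. The messages are invented; the pipeline itself is the same sort | uniq -c | sort -rn chain as above.

```shell
# Made-up error messages, deliberately unsorted (purely illustrative).
errors=$(printf '%s\n' 'ERROR timeout' 'ERROR disk full' 'ERROR timeout' 'ERROR timeout')

# uniq alone merges only neighbouring duplicates,
# so "ERROR timeout" still appears twice here.
adjacent_only=$(printf '%s\n' "$errors" | uniq)
echo "$adjacent_only"

# Sorting first puts identical lines next to each other; then count them
# and rank by count, biggest first.
ranked=$(printf '%s\n' "$errors" | sort | uniq -c | sort -rn)
echo "$ranked"
```

The first line of the ranked output is the most frequent error, which is usually the one you want to look at first.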

cat & tac

If you need to combine multiple files (one after another), just use cat. If you also want the lines of each file in reverse order, use tac (I really appreciate the name here).
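A small sketch of both, using two made-up log files. One detail worth knowing: given several files, tac reverses each file separately, so to reverse the combined stream you pipe cat into tac. This assumes GNU coreutils; tac is not installed by default on every system (macOS, for example).

```shell
# Two hypothetical log files in a scratch directory (names and contents are made up).
work=$(mktemp -d)
printf '%s\n' 'line 1' 'line 2' > "$work/monday.log"
printf '%s\n' 'line 3' > "$work/tuesday.log"

# cat: one file after another, in the order given.
combined=$(cat "$work/monday.log" "$work/tuesday.log")

# tac with several files reverses each file on its own: line 2, line 1, line 3.
per_file=$(tac "$work/monday.log" "$work/tuesday.log")

# To reverse everything as one stream, combine first, then reverse.
reversed=$(cat "$work/monday.log" "$work/tuesday.log" | tac)

echo "$combined"
echo "$per_file"
echo "$reversed"

rm -rf "$work"
```

Reversing a log this way puts the newest entries on top, which is often where the trouble is.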

paste & join

Use these when you need to combine two files so that each output line gets some data from one and some from the other. If you just want to combine line by line, use paste. Join is a little bit smarter: it expects the same “key” column in both data sources (sorted on that key) and knows how to match them up.

If we have:

(a)

1 Reddit
2 Space

and

(b)

1 Redditor
2 Astronaut

Paste will give you:

paste a b

(paste)

1 Reddit   1 Redditor
2 Space   2 Astronaut

Join is a bit smarter:

join a b

(join)

1 Reddit Redditor
2 Space Astronaut
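The difference shows up more clearly when the two files don’t line up perfectly. In this sketch, file a gets an extra, made-up third line (3 Music) that has no partner in b; everything else matches the files above.

```shell
# Recreate the two files in a scratch directory; "3 Music" is an extra,
# hypothetical line added only to show the difference.
work=$(mktemp -d)
printf '%s\n' '1 Reddit' '2 Space' '3 Music' > "$work/a"
printf '%s\n' '1 Redditor' '2 Astronaut' > "$work/b"

# paste glues line N of a to line N of b (tab-separated), whatever they contain.
# The third output line has nothing from b.
pasted=$(paste "$work/a" "$work/b")

# join matches on the first field and, by default, drops lines without a partner.
# Both inputs must already be sorted on that field.
joined=$(join "$work/a" "$work/b")

echo "$pasted"
echo "$joined"

rm -rf "$work"
```

So paste is purely positional, while join behaves more like a tiny database lookup on the key column.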

tail & head

Use these if you only need the beginning (head) or the end (tail) of the result.

Options?

-n (Number of lines)

Also, tail -F can be used as a super cool real-time monitor for a log file (spitting out new content as it comes along).

tail -f does the same, except that tail -F follows the file by name, so it keeps working when the file is rotated…
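A quick sketch of -n on some made-up input, including the common trick of chaining head and tail to grab a slice from the middle. The follow mode is shown only as a comment because it runs until you stop it with Ctrl-C.

```shell
# Five made-up lines standing in for a log.
lines=$(printf '%s\n' one two three four five)

# First two and last two lines.
first_two=$(printf '%s\n' "$lines" | head -n 2)
last_two=$(printf '%s\n' "$lines" | tail -n 2)

# Lines 3 and 4: take the first four, then the last two of those.
middle=$(printf '%s\n' "$lines" | head -n 4 | tail -n 2)

echo "$first_two"   # one, two
echo "$last_two"    # four, five
echo "$middle"      # three, four

# Live monitoring (not run here, it never exits on its own):
#   tail -F RequestLog.log
```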

Conclusion

That mostly wraps up what I think is enough to start harnessing the power of the command line.

For those familiar with the console: I am sure everybody has their own tricks, and this barely scratches the surface. Be gentle in the comments; I could spend my entire life on it and still not know even 10%. I am very interested in your opinion on the subject.
For the newcomers: every trip has to start somewhere, so fire up your console and try a couple of things… You may quite like the results.
