A Crash Course in UNIX



About this document

This document attempts to provide a very basic crash-course in UNIX. It is aimed primarily at people who are already quite familiar with using computers, but have little or no experience of UNIX. Note that this document will probably be under permanent construction!

Where things typed/printed are involved, the following style/convention is used:

the-prompt$ something typed in
the output of whatever

Multi-key sequences are given in a fairly standard way, for example:

`ctrl-D'        hold down the "control" key and press/release the `D' key.
`ctrl-shift-D'  hold down the "control" and "shift" keys, then press/release the `D' key.
`esc F'         press/release the "escape" key, then press/release the `F' key.

Introduction

`UNIX' is used to describe an operating system that looks/behaves in a certain way. It's quite hard to pin down the exact meaning of `UNIX', as things have changed over the years. Typically it refers to systems that provide POSIX, SYSV and/or BSD interfaces to application programs (but that relates to programming on UNIX, which is not necessarily of interest here). Various other things characterise UNIX systems, but they are of even less interest here. "UNIX" is a registered trademark of The Open Group.

If you've had experience of something that only has a `filesystem' (no explicit drive-letters, etc.), or that mentioned things called `terminals', `shells' or a `command-line', then there's a good chance it was UNIX. Commonly found UNIX-like systems are: Linux (various distributions, e.g. Debian GNU/Linux), Solaris (from Sun Microsystems) and FreeBSD.

The average person's first experience of UNIX usually leaves them frustrated and liable to return to their fluffy window-based operating system. UNIX is slowly getting better at the "fluffy interface", to the point now where there are several user-friendly UNIX distributions available. The `core' of UNIX remains the same, however -- but that isn't up for discussion here either. The reasons for choosing UNIX over other operating-systems are generally its power and flexibility. But like many things, that added power and flexibility can create complexity (flying a helicopter is much harder than driving a car, but you can do more with a helicopter).

Before setting off, there are a couple of UNIX concepts that should be digested: the shell and the file-system.


The Shell

The "shell" is where a UNIX user will typically spend a lot of their time. The job of the shell is to provide an interface for using a UNIX system, typically by means of a "command-line", where the user can type commands, run programs, etc. One widely used shell is bash. As shells go, bash is quite powerful, and when used correctly can greatly enhance the productivity of a UNIX user. "bash" actually stands for the "Bourne Again SHell". See its Wikipedia entry for some history.

As standard, the bash shell produces this sort of prompt:

bash$ 

This isn't actually very nice. One of the first things I do on new UNIX accounts is fix this, usually by copying the shell initialisation files from another UNIX machine. At this point, it's probably worth mentioning some bash-related files (all of which are in your home-directory):

.bash_profile   read by bash when it starts as a login shell
.bashrc         read by bash when it starts as an interactive, non-login shell
.bash_logout    read by bash when a login shell exits
.bash_history   the list of commands previously typed in

Note that all of these file-names begin with a period (`.'). Files of this nature are generally known as "dot-files". Normally, running "ls" (list directory contents) will not show these files unless the "-a" flag is given. The same applies to the current directory (just ".") and the parent directory ("..").

When bash first starts up, it leaves you in your home-directory. This is your space on the UNIX machine -- other users have their home-directories elsewhere. Generally, this directory will be "/home/login/", where "login" is your UNIX user-name. For example:

bash$ ls
Mail  public_html
bash$ ls -a
.  ..  .bash_history  .bash_logout  .bash_profile  .bashrc  Mail  public_html
bash$ 
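
To see the effect of the "-a" flag without poking at a real home-directory, something like the following sketch can be used. The scratch directory and the two file-names (".hidden" and "visible") are made up for illustration.

```shell
# Work in a scratch directory so nothing in the real home-directory is touched.
demo=$(mktemp -d)
touch "$demo/.hidden" "$demo/visible"

plain=$(ls "$demo")      # dot-files are left out
all=$(ls -a "$demo")     # ".", ".." and ".hidden" are listed as well

echo "ls shows:"
echo "$plain"
echo "ls -a shows:"
echo "$all"

# Tidy up the scratch directory again.
rm -r "$demo"
```

Only "visible" should appear in the first listing; the second should also contain ".", ".." and ".hidden".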

Some of these names are files, others are directories. To get more verbose information out of "ls", use the "-l" flag. For example:

bash$ ls -l
total 8
drwx------  2 auser auser 4096 Dec  9 02:06 Mail
drwxr-xr-x  2 auser auser 4096 Dec  9 02:07 public_html
bash$ ls -al
total 32
drwxr-xr-x  4 auser auser 4096 Dec  9 02:07 .
drwxr-xr-x  6 root  root  4096 Dec  9 02:05 ..
-rw-------  1 auser auser  258 Dec  9 02:07 .bash_history
-rw-r--r--  1 auser auser  174 Dec  9 02:05 .bash_logout
-rw-r--r--  1 auser auser  704 Dec  9 02:05 .bash_profile
-rw-r--r--  1 auser auser 1290 Dec  9 02:05 .bashrc
drwx------  2 auser auser 4096 Dec  9 02:06 Mail
drwxr-xr-x  2 auser auser 4096 Dec  9 02:07 public_html
bash$ 

The first column in the output shows the "mode" and "permissions" of the file. A mode of "d" indicates a directory and "-" indicates an ordinary file (there are other types too). The second column shows the number of links to the file (which is usually not interesting, so we'll ignore it). The third and fourth columns show who owns the file/directory (user-name and group-name). Here, the user who owns this home-directory is "auser", and they have their own group "auser" too. The parent-directory is the only thing not owned by "auser" -- this is "/home/" on this system, and that is owned by "root" (the super-user, usually the system administrator(s)). The fifth column shows the size, which is generally only interesting for files (the size reported for directories is usually the disk-space required to hold the various entries in that directory). The date and time shown in the sixth, seventh and eighth columns are the "modification" time of the file/directory -- i.e. when whatever it is was last modified. Finally the name of the file/directory is shown.
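
As a rough illustration of those columns, the sketch below splits one line of the listing above into named pieces. The variable names are my own, and splitting on white-space like this breaks for file-names containing spaces, so treat it as a reading aid rather than a sensible way to process "ls" output.

```shell
# One line copied from the "ls -l" listing above.
line='drwx------  2 auser auser 4096 Dec  9 02:06 Mail'

# Split the line on white-space into the positional parameters $1..$9.
set -- $line
mode=$1; links=$2; owner=$3; group=$4; size=$5
modified="$6 $7 $8"
name=$9

# The first character of the mode column gives the type of the entry.
case $mode in
    d*) kind='directory' ;;
    -*) kind='ordinary file' ;;
    *)  kind='something else' ;;
esac

echo "$name: $kind, owned by $owner:$group, size $size, modified $modified"
```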

Anyway, back to the original point, which was to make the "bash$ " prompt a little more user-friendly. This is done by setting the "PS1" environment-variable (covered shortly). The sensible place to set this is in your "~/.bash_profile" file. E.g., edit (or create if necessary) this file and add the lines:

# make the prompt a little more friendly
PS1='[\h:\!]:\w:$?$ '

The first line here is a comment. Bash treats everything following a `#' character as a comment, up to the end of the line. The second line actually sets the "PS1" variable.
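
The "PS1" value above can be read one piece at a time. The sketch below builds the same string from its parts; the helper variable names are invented for illustration, and the descriptions in the comments are my reading of the bash manual.

```shell
host='\h'          # the host name, up to the first "."
histno='\!'        # the history number of the command about to be typed
cwd='\w'           # the current directory, with the home-directory shown as "~"
last_status='$?'   # exit status of the previous command, expanded when the prompt is drawn

# Single quotes keep "\" and "$" literal here, so bash only interprets
# them each time it displays the prompt.
PS1="[${host}:${histno}]:${cwd}:${last_status}"'$ '
```

The result is the same string as the single-line assignment shown earlier.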

Once your "~/.bash_profile" has been suitably edited, you need to make bash aware of the changes. One way is to log out then log back in again (at which point bash will read the modified file). Another way is to source the file:

bash$ . ~/.bash_profile
[wotex:25]:~:0$ 

And suddenly you have a much nicer prompt :-). The various components that make up the prompt are just a selection of what's available. These are described in detail in the bash manual page. Note the use of the strange "." command above: this is short-hand for "source", and which one you use is down to personal preference.
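
The reason for sourcing the file, rather than just running it, is that a file run as a program executes in a separate process, so any variables it sets are lost when it finishes. A small sketch of the difference, using a made-up temporary settings file and a made-up variable "DEMO_VALUE":

```shell
# A throw-away file standing in for something like ~/.bash_profile.
settings=$(mktemp)
echo 'DEMO_VALUE=from-file' > "$settings"

# Running the file starts a child shell; its variables vanish when it exits.
sh "$settings"
after_run=${DEMO_VALUE:-unset}

# Sourcing the file executes its lines in the current shell instead.
. "$settings"
after_source=${DEMO_VALUE:-unset}

echo "after running:  $after_run"
echo "after sourcing: $after_source"

rm -f "$settings"
```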

Reading manual pages can be a bit frustrating at first, but quite a lot of these are available as nicely formatted HTML pages on-line. For example, the GNU site has an HTML formatted version of the bash reference manual, which is a lot easier to read (initially) than the corresponding manpage (both are generated from the same "source", however). The "PS1" variable is described in Section 5.1: Bourne Shell Variables, along with other interesting environment variables.

The value of the "PS1" environment-variable (or, more generally, of any environment variable) can be shown using:

[wotex:25]:~:0$ echo $PS1
[\h:\!]:\w:$?$ 
[wotex:26]:~:0$ 

The "echo" command is built into the bash shell (although a stand-alone version usually exists, "/bin/echo"). All "echo" does is print out the parameters given to it on the command line. The use of "$PS1" (as opposed to "PS1") causes the shell to "expand" that environment-variable before running the command ("echo" in this case). Thus, what "echo" actually sees as a command-line argument is the expanded (substituted) value of the "PS1" environment variable.
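
The difference between a name and its expanded value can be seen with any variable. In this sketch "GREETING" is a made-up variable, not something bash sets for you:

```shell
GREETING='hello there'

# Without the "$", echo simply receives the word GREETING.
literal=$(echo GREETING)

# With the "$", the shell substitutes the value before echo ever runs.
expanded=$(echo "$GREETING")

echo "literal:  $literal"
echo "expanded: $expanded"
```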

One other particularly useful environment-variable is "PATH". This is set to a list of directories (separated using colons -- ":"), that determine where bash looks for executable programs. When you log in, the system sets up a default path, generally covering the usual UNIX utilities, etc. For instance:

[wotex:26]:~:0$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games
[wotex:27]:~:0$ 
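
To make the idea of a search path a little more concrete, here is a simplified sketch of the kind of look-up the shell performs: try each directory in "PATH" in turn and stop at the first one containing an executable of the right name. The function name "find_in_path" is invented, and the real shell does more than this (it remembers locations it has already found, for example).

```shell
find_in_path() {
    wanted=$1
    saved_ifs=$IFS
    IFS=:                          # split PATH at the colons
    for dir in $PATH; do
        IFS=$saved_ifs
        [ -n "$dir" ] || dir=.     # an empty entry means the current directory
        if [ -f "$dir/$wanted" ] && [ -x "$dir/$wanted" ]; then
            printf '%s\n' "$dir/$wanted"
            return 0               # first match wins
        fi
    done
    IFS=$saved_ifs
    return 1                       # not found in any directory
}

find_in_path ls
```

The built-in "command -v ls" should report the same sort of answer, and is what you would normally use.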

If you install your own programs (e.g. inside your home-directory somewhere) then you'll probably want to update your "PATH" environment-variable to include these. As a general rule I try and stick my own executables in "~/bin/" (i.e. the subdirectory "bin" inside my home-directory). To include these in the "PATH" I've got something like this in my "~/.bash_profile":

# add ~/bin/ to the path
PATH=$HOME/bin:$PATH

Note that instead of writing "~" for my home-directory, I've used "$HOME". This environment-variable is set to your home-directory when you log in, and is safer than "~" in cases like this. This is because "~" is shell shorthand (like "." meaning "source") that is only expanded in certain positions, so it probably shouldn't be used except on the command-line -- and then only when it's clear what's going to happen. When "~" is expanded by bash, it is expanded into the value of the "HOME" environment-variable.
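
One wrinkle with the "PATH" line above is that sourcing "~/.bash_profile" more than once adds the same directory each time. A guarded version might look like the sketch below; "prepend_to_path" is a name I've made up, not a bash built-in.

```shell
# Put a directory at the front of PATH, unless it is already in there.
prepend_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;         # otherwise add it at the front
    esac
}

prepend_to_path "$HOME/bin"
prepend_to_path "$HOME/bin"           # the second call changes nothing
export PATH
```

Wrapping "PATH" in extra colons before matching means the first and last entries are handled the same way as those in the middle.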

At this point, one might wonder how you pass something like "$HOME" (literally) as an argument to a program. The answer lies in how you can quote arguments. This also solves the problem of handling things such as directories and files with spaces in their name. For example, if you happen to have a directory called "My Stuff" (that might typically arise if you're accessing a Windows file-system from UNIX), then you can't list its contents with, for instance:

[wotex:27]:~:0$ ls
Mail  My Stuff  public_html
[wotex:28]:~:0$ ls My Stuff
ls: My: No such file or directory
ls: Stuff: No such file or directory
[wotex:29]:~:1$ 

This happens because the shell (bash) interprets "ls My Stuff" as the program "ls" with the arguments "My" and "Stuff". There are two ways to deal with this one. The first is arguably the simplest and that is to "escape" the space in the filename using a backslash. For example:

[wotex:29]:~:1$ ls My\ Stuff
[wotex:30]:~:0$ 

In this example no output is produced by "ls" because the directory is empty. Escaping individual characters in this way is necessary sometimes -- generally when you want to use a character that bash would otherwise eat up (e.g. single and double quotes, asterisk, question-mark, tilde, and space). Generally speaking the use of such characters (including space) in filenames is not recommended, since it makes using the command-line more awkward where these things are involved (you have to remember to escape them properly). For programs such as "ls" the effect of mis-quoted arguments is generally harmless (it'll just produce an error and exit, as shown). But for some other programs (typically those that involve writing to files) the effect of mis-quoted arguments can be unpleasant (e.g. overwriting existing files) -- hence it is better to avoid using filenames that lead to this sort of problem.
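
A quick way to see how the shell has carved up a command-line is to count the arguments a command receives. In this sketch "count_args" is a made-up helper that just prints how many arguments it was given:

```shell
count_args() {
    echo "$#"
}

unescaped=$(count_args My Stuff)      # the space separates two arguments
escaped=$(count_args My\ Stuff)       # the backslash makes the space part of one argument

echo "unescaped: $unescaped argument(s)"
echo "escaped:   $escaped argument(s)"
```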

Returning to the above example, the other way of getting "My Stuff" as a single argument is to quote it. There are two ways to do this in bash:

[wotex:30]:~:0$ ls "My Stuff"
[wotex:31]:~:0$ ls 'My Stuff'
[wotex:32]:~:0$ 

The effect of each of these is slightly different (although not for that example). Take a slightly more elaborate example of that:

[wotex:32]:~:0$ ls ~/My\ Stuff
[wotex:33]:~:0$ 

Here the space is escaped, so doesn't break this into two arguments; and the "~" is expanded by bash to my home-directory. But quoting this doesn't work:

[wotex:33]:~:0$ ls "~/My Stuff"
ls: ~/My Stuff: No such file or directory
[wotex:34]:~:1$ 

Because the argument is quoted (and this applies to single-quotes too), bash does not expand the "~", so "ls" gets "~/My Stuff" as a single argument. There is no directory with the name "~" here, so it fails. Shell variables, however, are expanded inside double-quoted arguments, but not inside single-quoted arguments; so:

[wotex:34]:~:1$ ls "$HOME/My Stuff"
[wotex:35]:~:0$ ls '$HOME/My Stuff'
ls: $HOME/My Stuff: No such file or directory
[wotex:36]:~:1$ 

The second "ls" fails for a similar reason to before -- there is no directory called "$HOME" here.
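
The different quoting behaviours can be put side by side without needing a "My Stuff" directory at all, simply by storing each form in a variable and looking at what ends up there. The variable names below are my own.

```shell
double="$HOME/My Stuff"       # double quotes: $HOME is expanded
single='$HOME/My Stuff'       # single quotes: nothing is expanded
tilde_quoted="~/My Stuff"     # quoted tilde: stays a literal "~"
tilde_bare=~/My\ Stuff        # bare tilde: expanded to the home-directory

echo "double:       $double"
echo "single:       $single"
echo "tilde_quoted: $tilde_quoted"
echo "tilde_bare:   $tilde_bare"
```

Only the first and last lines of output should show a real home-directory path.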

You can achieve a surprising amount with the command-line alone, much more than is described here. The bash reference manual (on-line version) is an indispensable guide, from which a lot can be learned.

One of the other heavily used features of bash is tab-completion. Typing in long path-names to programs and files is both boring and time-consuming. To this end, bash allows the "tab" key to be used to complete file and directory names. The idea is that you type in part of a command or file-name, then press the tab key. If there is only one possibility, bash completes it (adjusting the command-line). If there are no possibilities or more than one, it beeps at you. Pressing "tab" a second time determines which of these it is: if there is more than one possibility, they are all displayed. Otherwise it just beeps again. Newer versions of bash politely ask you whether you want to list all the possibilities when there are a large number.

Given that bash will complete on executables (found in directories specified by "PATH"), pressing "tab" twice at an otherwise empty command-prompt will produce a list of all the programs/etc. that you could run. My particular version of bash asks me in this case:

[wotex:40]:~:0$ tab tab
Display all 1815 possibilities? (y or n)
[wotex:40]:~:0$ 

Here I pressed "n"; "y" generates a list of 1815 names...


The UNIX File-System

The UNIX file-system (which begins at the root, "/") is generally a well-structured thing. If you run "ls" in the root-directory (either after changing-directory to there or with "ls /"), you'll see a set of directories that are common to most UNIX systems. These tend to be:

/bin    system binaries (programs)
/dev    devices (full of special files)
/etc    system-wide configuration files
/home   users' home-directories
/lib    system libraries (bits of programs)
/root   root's home-directory
/sbin   administrator-only programs (e.g. check disks, mount/un-mount file-systems, enable/disable network interfaces, etc.)
/usr    the "user" directory, that contains most of the installed system software (applications)
/var    where certain programs store their state (e.g. mail-spool, news-spool, information about installed software, etc.)

Within the "/usr" directory are usually a number of standard sub-directories:

/usr/X11R6     X Window System files
/usr/bin       user programs
/usr/doc       program documentation
/usr/games     where games tend to be installed
/usr/home      users' home-directories
/usr/include   C headers for program development
/usr/lib       program libraries
/usr/local     locally installed software
/usr/man       manual pages
/usr/sbin      super-user only programs (e.g. web-server)
/usr/share     miscellaneous program files (e.g. images, data, etc.)
/usr/src       source code (e.g. Linux kernel)

The "/usr/local" directory typically follows the structure of "/usr" (minus itself, of course). The purpose of keeping locally installed software separate in "/usr/local" is to avoid conflict/confusion with distribution installed software in "/usr". Most UNIX distributions include some form of package management for managing installed software (including upgrading, removing, etc.) and generally speaking it's best not to interfere with such installed software.

With the default "PATH" environment-variable, bash generally searches for executables in "/usr/local/bin", then "/usr/bin" and lastly "/bin". The average well-used UNIX system has a vast number of programs/utilities, and one of the best ways to learn is to go and poke around for yourself. If you want to know about a particular program, the best source of information is its manpage (accessed using the "man" command). Many programs also provide some form of built-in help, typically produced using the "-h" or "--help" command-line argument. Most GNU utilities respond to "--help" and "--version" (the latter to find out what version of a particular program is installed).

From the shell, the "cd" (change directory) command is used to move around within the directory structure. For example:

[wotex:40]:~:0$ cd /
[wotex:41]:/:0$ ls
bin    cdrw  floppy  lib         nfs   root-n  tftpboot  var
boot   dev   home    lost+found  proc  sbin    tmp       vmlinuz
cdrom  etc   initrd  mnt         root  sys     usr       vmlinuz.old
[wotex:42]:/:0$ cd bin
[wotex:43]:/bin:0$ cd /usr/local/bin
[wotex:44]:/usr/local/bin:0$ ls
bintohex   openupsd  s2c_list   s2c_scan     sermon2
expirenow  ppthing   s2c_raw    s2c_test     stat
fbounce    s2c_4cd   s2c_reset  s2c_tsensed  temp-update-to-4char.sh
[wotex:45]:/usr/local/bin:0$ cd bin
-bash: cd: bin: No such file or directory
[wotex:46]:/usr/local/bin:1$ cd ../lib
[wotex:47]:/usr/local/lib:0$ cd
[wotex:48]:~:0$ 

The last command here ("cd" without any arguments) returns you to your home-directory. Another useful command is "pwd" (print working directory). This simply prints out the name of the directory you're currently in. For example:

[wotex:48]:~:0$ pwd
/home/auser
[wotex:49]:~:0$ 
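
The difference between absolute paths (starting with "/") and relative ones (starting from wherever you currently are) can be tried out safely in a scratch directory. The sketch below mimics the "/usr/local/bin" and "/usr/local/lib" steps from the session above; the directory layout under the temporary directory is made up.

```shell
# Build a small made-up tree: <scratch>/local/bin and <scratch>/local/lib.
base=$(mktemp -d)
mkdir -p "$base/local/bin" "$base/local/lib"
start=$(pwd)

cd "$base/local/bin"             # absolute path: works from anywhere

# Relative path: there is no "bin" inside .../local/bin, so this fails.
if cd bin 2>/dev/null; then
    relative_ok=yes
else
    relative_ok=no
fi

cd ../lib                        # ".." is the parent, so this lands in .../local/lib
here=$(pwd)
echo "now in: $here"

# Go back to where we started and tidy up.
cd "$start"
rm -r "$base"
```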

Occasionally it is useful to be able to change to another directory, do some stuff, then change back to the original directory. For this, bash provides two built-in commands "pushd" and "popd". Bash maintains a "stack" of directories which these two commands manipulate. The directory stack is shown using the "dirs" built-in command. For example:

[wotex:49]:~:0$ dirs
~
[wotex:50]:~:0$ pushd /usr/local/bin
/usr/local/bin ~
[wotex:51]:/usr/local/bin:0$ cd ../lib
[wotex:52]:/usr/local/lib:0$ dirs
/usr/local/lib ~
[wotex:53]:/usr/local/lib:0$ popd
~
[wotex:54]:~:0$ 

Note that "pushd" and "popd" also report the (resulting) directory stack. To rotate the stack forwards and backwards, a "+n" or "-n" argument to "pushd" can be used. For example:

[wotex:54]:~:0$ pushd /usr/local/src
/usr/local/src ~
[wotex:55]:/usr/local/src:0$ pushd +1
~ /usr/local/src
[wotex:56]:~:0$ popd
/usr/local/src
[wotex:57]:/usr/local/src:0$ cd
[wotex:58]:~:0$ 

Again, the best way to learn about UNIX is usually just to poke around. If you're worried about trashing your system, don't be (too much!). Creating a new user account for messing around in is usually a good idea -- if you trash the account, just delete and re-create it. But, one piece of advice: only use "root" when you need to -- "root" (superuser) is one of the few users who can trash a system.


Links

There's a whole lot of information about UNIX out there, including most of the documentation online.


Last modified: 2015-07-09 22:33:34.866310644 +0100 by Fred Barnes.