Fred's Home Page

A Crash Course in UNIX


About this document

This document attempts to provide a very basic crash-course in UNIX. It is aimed primarily at people who are already quite familiar with using computers, but have little or no experience of UNIX. Note that this document will probably be under permanent construction!

Where things typed/printed are involved, the following style/convention is used:

the-prompt$ something typed in
the output of whatever

Multi-key sequences are given in a fairly standard way, for example:

`ctrl-D'          hold down the "control" key and press/release the `D' key.
`ctrl-shift-D'    hold down the "control" and "shift" keys, then press/release the `D' key.
`esc F'           press/release the "escape" key, then press/release the `F' key.

Introduction

`UNIX' is used to describe an operating system that looks/behaves in a certain way. It's quite hard to pin down the exact meaning of `UNIX', as things change over the years. Typically it refers to systems that provide POSIX, SYSV and/or BSD interfaces to application programs (but that relates to programming on UNIX, which is not necessarily of interest here). Various other things also characterise UNIX systems, but they are of less interest here. "UNIX" is a registered trademark of The Open Group.

If you've had experience of something that only has a `filesystem' (no explicit drive-letters, etc.), or that mentioned things called `terminals', `shells' or a `command-line', then there's a good chance it was UNIX. Commonly found UNIX-like systems are: Linux (various distributions, e.g. Debian GNU/Linux), Solaris (from Sun Microsystems) and FreeBSD.

The average person's first experience of UNIX usually leaves them frustrated and liable to return to their fluffy window-based operating system. UNIX is slowly getting better at the "fluffy interface", to the point now where there are several user-friendly UNIX distributions available. The `core' of UNIX remains the same, however -- but that isn't up for discussion here either. The reasons for choosing UNIX over other operating-systems are generally its power and flexibility. But like many things, that added power and flexibility can create complexity (flying a helicopter is much harder than driving a car, but you can do more with a helicopter).

Before setting off, there are a couple of UNIX concepts that should be digested:

  • file-system: UNIX has a (singular) file-system. It starts at the root, "/", from which everything else branches. The root file-system is usually some partition of your hard-disk, some temporary ram-disk, or other file-system (e.g. network, ROM, ...). Most important is that this file-system exists -- the system won't start without it. Other file-systems are mounted inside the root file-system (or inside each other if so desired). For example, I have one of my hard-disk partitions mounted on "/usr" (where most software lives); another mounted file-system is "/proc", which isn't a real file-system, but manufactured by the Linux kernel (providing access to kernel/system information).

    There are a few directories that warrant special attention:

    ./     this is the "current" directory, i.e. the directory you are "in" (this is explained a bit more below).
    ../    this is the "parent" directory, i.e. the directory above where you are now.
    ~/     this is your "home-directory" (the little squiggle is called a "tilde", `shift + #' on UK keyboards). Your home-directory is where you put your own files/programs/web-pages/etc. Your home-directory also contains configuration files for different programs (although you may not be explicitly aware of this).
  • process: at the lowest level, UNIX does not differentiate between the different types of programs/applications running -- they're all just processes. Processes in UNIX are organised into a tree-structure, of parent/child relationships. At the very top (with a process-ID of 1) is something called the init-task -- all other processes are descendants of this. The way UNIX creates new processes is by duplicating an existing process (termed a "fork"), then replacing that new process with the desired target process (loaded as a program on the file-system somewhere, and termed "exec" (for execute)). This mechanism isn't terribly important, but does have an effect on certain things (discussed as they arise).
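
The fork/exec mechanism can be poked at from the shell itself. The following is only a rough sketch, assuming a POSIX-style "sh" is installed; the names COUNTER, child_saw and exec_output are made up for illustration. A subshell, written "$( ... )" here, is created by forking the current shell, and the "exec" built-in replaces the current process with another program:

```shell
# A subshell is a forked copy of the current shell: it starts with the
# same variables, but changes made in the copy never reach the parent.
COUNTER=1
child_saw=$(COUNTER=$((COUNTER + 1)); echo "$COUNTER")
echo "child saw COUNTER=$child_saw, parent still has COUNTER=$COUNTER"

# "exec" replaces the running process with the named program, so the
# last "echo" in this child shell is never reached.
exec_output=$(sh -c 'echo before; exec echo replaced; echo never-printed')
echo "$exec_output"
```

The first "echo" reports that the child saw 2 while the parent still has 1; the second prints "before" and "replaced" only.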


The Shell

The "shell" is where a UNIX user will typically spend a lot of their time. The job of the shell is to provide an interface for using a UNIX system, typically by means of a "command-line", where the user can type commands, run programs, etc. One widely used shell is bash. As shells go, bash is quite powerful, and when used correctly can greatly enhance the productivity of a UNIX user. "bash" actually stands for the "Bourne Again SHell". See its Wikipedia entry for some history.

As standard, the bash shell produces this sort of prompt:

bash$ 

Which isn't actually very nice. One of the first things I do on new UNIX accounts is fix this, usually by copying the shell initialisation files from another UNIX machine. At this point, it's probably worth mentioning some bash-related files (all of which are in your home-directory):

  • ~/.bash_profile: this is your shell "initialisation" file. Typically this is where you will put your own customisation commands for bash. Bash reads and executes the commands in this file when it is started as a "login shell". This typically means when bash is started as the result of you logging into the system (either locally or remotely).

  • ~/.bashrc: this is another bash initialisation file, but is read and executed when bash is started as an interactive "non-login shell". This typically means when bash is started by opening a new terminal window, or by running "bash" from an existing shell (but not as a login shell). Shell-scripts run by a non-interactive bash do not normally read this file. My own ~/.bashrc is empty, and yours may be too (or might not even exist -- which is ok).

  • ~/.bash_logout: this is the opposite of ~/.bash_profile -- it runs when you log-out of bash (by typing "exit", "logout", or pressing `ctrl-D'). Most people don't have or need this file.

  • ~/.bash_history: this file stores a list of commands you have previously typed, so that the command-line "history" is remembered between separate login sessions.

Note that all of these file-names begin with a period (`.'). Files of this nature are generally known as "dot-files". Normally, running "ls" (list directory contents) will not show these files unless the "-a" flag is given. The hidden entries include the current directory (just ".") and the parent directory ("..").

When bash first starts up, it leaves you in your home-directory. This is your space on the UNIX machine -- other users have their home-directories elsewhere. Generally, this directory will be "/home/login/", where "login" is your UNIX user-name. For example:

bash$ ls
Mail  public_html
bash$ ls -a
.  ..  .bash_history  .bash_logout  .bash_profile  .bashrc  Mail  public_html
bash$ 

Some of these names are files, others are directories. To get more verbose information out of "ls", use the "-l" flag. For example:

bash$ ls -l
total 8
drwx------  2 auser auser 4096 Dec  9 02:06 Mail
drwxr-xr-x  2 auser auser 4096 Dec  9 02:07 public_html
bash$ ls -al
total 32
drwxr-xr-x  4 auser auser 4096 Dec  9 02:07 .
drwxr-xr-x  6 root  root  4096 Dec  9 02:05 ..
-rw-------  1 auser auser  258 Dec  9 02:07 .bash_history
-rw-r--r--  1 auser auser  174 Dec  9 02:05 .bash_logout
-rw-r--r--  1 auser auser  704 Dec  9 02:05 .bash_profile
-rw-r--r--  1 auser auser 1290 Dec  9 02:05 .bashrc
drwx------  2 auser auser 4096 Dec  9 02:06 Mail
drwxr-xr-x  2 auser auser 4096 Dec  9 02:07 public_html
bash$ 

The first column in the output shows the "mode" and "permissions" of the file. A mode of "d" indicates a directory, "-" (blank) indicates an ordinary file (and there are other types too). The second column shows the number of links to the file (which is usually not interesting, so we'll ignore it). The third and fourth columns show who owns the file/directory (user-name and group-name). Here, the user who owns this home-directory is "auser", and they have their own group "auser" too. The parent-directory is the only thing not owned by "auser" -- this is "/home/" on this system, and that is owned by "root" (the super-user, usually the system administrator(s)). The fifth column shows the size, which is generally only interesting for files (the size reported for directories is usually the disk-space required to hold the various entries in that directory). The date and time shown in the sixth, seventh and eighth columns are the "modification" time of the file/directory -- i.e. when whatever it is was last modified. Finally the name of the file/directory is shown.
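
To see these columns picked apart, here is a small sketch that builds a scratch directory and pulls individual fields out of the "ls" output. The names "adir" and "afile" are invented for the example, and it assumes the standard "mktemp", "cut" and "awk" utilities are available:

```shell
# Build a scratch directory holding one sub-directory and one 6-byte file
# ("hello" plus a newline).
scratch=$(mktemp -d)
mkdir "$scratch/adir"
echo hello > "$scratch/afile"

# The first character of the first column is the "mode".
dir_mode=$(ls -ld "$scratch/adir" | cut -c1)
file_mode=$(ls -ld "$scratch/afile" | cut -c1)

# The fifth column is the size in bytes. "-n" is like "-l" but prints
# numeric user/group IDs, which keeps the columns easy to split.
file_size=$(ls -ln "$scratch/afile" | awk '{ print $5 }')

echo "adir mode: $dir_mode, afile mode: $file_mode, afile size: $file_size"
rm -r "$scratch"
```

The "-d" flag makes "ls" describe the directory itself rather than list its contents.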

Anyway, back to the original point, which was to make the "bash$ " prompt a little more user-friendly. This is done by setting the "PS1" environment-variable (covered shortly). The sensible place to set this is in your "~/.bash_profile" file. E.g., edit (or create if necessary) this file and add the lines:

# make the prompt a little more friendly
PS1='[\h:\!]:\w:$?$ '

The first line here is a comment. Bash interprets everything following a `#' character to be a comment, up to the end of the line. The second line actually sets the "PS1" variable.

  • environment-variables: every process in UNIX carries with it something called the "environment". This is a list of `name=value' pairs, that define various settings and other interesting things. When one UNIX process spawns another process (via a "fork" and "exec"), the new process gets a copy of the current process's environment (a copy of its parent process's environment, from the newly-created child process's perspective). Bash actually maintains two environments: one is the environment that belongs to the "bash" process (and is copied when programs are run from that bash); the second is entirely internal to "bash". The contents of these environments can be examined using the "export" and "set" bash built-in commands respectively.

  • editing files: in order to make that change to your "~/.bash_profile" file, you need to edit it. There are a variety of editors available on UNIX systems, some more friendly than others. Vi and Emacs are amongst the most powerful of editors -- there's not a lot (in the sense of editing text) that you can't do in these. The "Vim" version of Vi is my personal choice, but any Vi can be a bit freaky if you've never used it before. GVim (graphical version of Vim for X11) is much more user-friendly than the terminal version. Two significantly simpler editors are "nano" and "pico", so you may want to start with one of these (and later progress to Vim/Emacs).
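
The difference between the two environments mentioned above (exported versus internal to the shell) can be seen with a short experiment. This is just a sketch; LOCAL_ONLY and SHARED are made-up variable names, and a plain "sh" stands in for any program started from the shell:

```shell
# LOCAL_ONLY lives only inside this shell; SHARED is exported, so it is
# placed in the environment that child processes receive a copy of.
LOCAL_ONLY="shell only"
export SHARED="inherited"

# The child "sh" prints each variable, or "unset" if it cannot see it.
child_view=$(sh -c 'echo "${LOCAL_ONLY:-unset} / ${SHARED:-unset}"')
echo "child sees: $child_view"
```

The child reports "unset / inherited": only the exported variable made it across the fork and exec.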

Once your "~/.bash_profile" has been suitably edited, you need to make bash aware of the changes. One way is to logout then log back in again (at which point bash will read the modified file). Another way is to source the file:

bash$ . ~/.bash_profile
[wotex:25]:~:0$ 

And suddenly you have a much nicer prompt :-). The various components that make up the prompt are just a selection of what's available. These are described in detail in the bash manual page. Note the use of the strange "." command in the above: this is short-hand for "source", and which one you use is down to personal preference.
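
Why source the file rather than simply run it? A rough sketch of the difference, using a temporary file as a stand-in for an initialisation file (GREETING is a made-up variable, and "mktemp" is assumed to be available):

```shell
# A stand-in for an initialisation file.
unset GREETING
settings=$(mktemp)
echo 'GREETING=hello' > "$settings"

# Running the file starts a separate child shell; its variables vanish
# when that child exits.
sh "$settings"
after_run=${GREETING:-unset}

# Sourcing it with "." executes the same commands in the current shell.
. "$settings"
after_source=${GREETING:-unset}

echo "after running: $after_run, after sourcing: $after_source"
rm -f "$settings"
```

Only the sourced version leaves GREETING set in the shell you are typing into, which is exactly what is wanted for something like "PS1".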

Reading manual pages can be a bit frustrating at first, but quite a lot of these are available as nicely formatted HTML pages on-line. For example, the GNU site has an HTML formatted version of the bash reference manual, that is a lot easier to read (initially) than the corresponding manpage (both are generated from the same "source" however). The "PS1" variable is described in Section 5.1: bourne shell variables, along with other interesting environment variables.

  • documentation, manual pages: the majority of standard UNIX commands and programs are well documented. This documentation exists mostly in the form of "manual pages" (or "manpage" for short). Manual pages are read using "man command", which displays the manpage for "command" in the terminal window. You can search the list of manual pages using "man -k word", where "word" is some keyword or command-name. The "man" program actually uses various other programs to display the final page in your terminal window. Of most relevance to the user is the program used to display the resulting text in the terminal window. The "PAGER" environment-variable is used for this (or some default if that is not set). Two common programs used to handle the output are "more" and "less". Both of these present the information in a manageable way (e.g. a page at a time); "less" is significantly more powerful (from the design/publishing phrase "less is more" ...). Another relevant environment-variable is "MANPATH", that contains a list of paths to manual-pages (e.g. so you can add your own). Again if this is not set, the "man" program uses a system-wide default. Information on the manual page system can be read using "man man".

The value of the "PS1" environment-variable (generally any environment variable) can be shown using:

[wotex:25]:~:0$ echo $PS1
[\h:\!]:\w:$?$ 
[wotex:26]:~:0$ 

The "echo" command is built into the bash shell (although a stand-alone version usually exists, "/bin/echo"). All "echo" does is print out the parameters given to it on the command line. The use of "$PS1" (as opposed to "PS1") causes the shell to "expand" that environment-variable before running the command ("echo" in this case). Thus, what "echo" actually sees as a command-line argument is the expanded (substituted) value of the "PS1" environment variable.

One other particularly useful environment-variable is "PATH". This is set to a list of directories (separated using colons -- ":"), that determine where bash looks for executable programs. When you login the system sets up a default path, generally to the usual UNIX utilities, etc. For instance:

[wotex:26]:~:0$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games
[wotex:27]:~:0$ 

If you install your own programs (e.g. inside your home-directory somewhere) then you'll probably want to update your "PATH" environment to include these. As a general rule I try and stick my own executables in "~/bin/" (i.e. the subdirectory "bin" inside my home-directory). To include these in the "PATH" I've got something like this in my "~/.bash_profile":

# add ~/bin/ to the path
PATH=$HOME/bin:$PATH

Note that instead of writing "~" for my home-directory, I've used "$HOME". This environment-variable is set to your home-directory when you login, and in some cases (like this one) is safer than using "~". This is because "~" is bash shorthand (like "." meaning "source"), and probably shouldn't be used except on the command-line -- and then only when it's clear what's going to happen. When "~" is expanded by bash, it is expanded into the value of the "HOME" environment-variable.
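
The effect of adding a directory to "PATH" can be tried out safely with a scratch directory standing in for "~/bin". This is only a sketch: "hello-demo" is an invented program name, and it assumes "mktemp" is available and that programs may be executed from the temporary directory:

```shell
# A scratch directory standing in for ~/bin, holding one tiny program.
mybin=$(mktemp -d)
printf '#!/bin/sh\necho "hello from my own program"\n' > "$mybin/hello-demo"
chmod +x "$mybin/hello-demo"

# Put the directory at the front of PATH, as in the ~/.bash_profile lines.
PATH=$mybin:$PATH

# The shell can now find the program by name alone.
found_at=$(command -v hello-demo)
program_output=$(hello-demo)
echo "found at: $found_at"
echo "it printed: $program_output"
rm -r "$mybin"
```

Because the new directory is placed first, a program there takes precedence over one of the same name later in the path.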

At this point, one might wonder how you pass something like "$HOME" (literally) as an argument to a program. The answer lies in how you can quote arguments. This also solves the problem of handling things such as directories and files with spaces in their name. For example, if you happen to have a directory called "My Stuff" (that might typically arise if you're accessing a Windows file-system from UNIX), then you can't list its contents with, for instance:

[wotex:27]:~:0$ ls
Mail  My Stuff  public_html
[wotex:28]:~:0$ ls My Stuff
ls: My: No such file or directory
ls: Stuff: No such file or directory
[wotex:29]:~:1$ 

This happens because the shell (bash) interprets "ls My Stuff" as the program "ls" with the arguments "My" and "Stuff". There are two ways to deal with this one. The first is arguably the simplest and that is to "escape" the space in the filename using a backslash. For example:

[wotex:29]:~:1$ ls My\ Stuff
[wotex:30]:~:0$ 

In this example no output is produced by "ls" because the directory is empty. Escaping individual characters in this way is necessary sometimes -- generally when you want to use a character that bash would otherwise eat up (e.g. single and double quotes, asterisk, question-mark, tilde, and space). Generally speaking the use of such characters (including space) in filenames is not recommended, since it makes using the command-line more awkward where these things are involved (you have to remember to escape them properly). For programs such as "ls" the effect of mis-quoted arguments is generally harmless (it'll just produce an error and exit, as shown). But for some other programs (typically those that involve writing to files) the effect of mis-quoted arguments can be unpleasant (e.g. overwriting existing files) -- hence better to avoid using filenames that lead to this sort of problem.

  • program exit-codes: it's probably worth mentioning something about this now. In the above example, the end of the prompt changes from ":0$ " to ":1$ " and back again. The value itself is generated by the "$?" bit of "PS1" (the prompt). This is a special shell variable that holds the exit-code of the last command. When any process in UNIX terminates (either because it finished normally or was killed), it returns an exit-code to its parent process. Bash captures this and places it in the "$?" variable, that you can use anywhere it makes sense to. Having it as part of the prompt allows you to see whether something succeeded (returns 0) or failed (returns non-zero) at a glance, without necessarily having to scan the output it generated.
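
The "$?" variable can also be captured directly. A minimal sketch (the variable names and the path "/no/such/directory" are made up, and the path is assumed not to exist):

```shell
# "true" always succeeds, so its exit-code is 0.
true
ok_status=$?

# "cmd || var=$?" records the exit-code of a command that fails.
missing_status=0
ls /no/such/directory 2>/dev/null || missing_status=$?

# A program chooses its own exit-code; this child shell picks 3.
custom_status=0
sh -c 'exit 3' || custom_status=$?

echo "true: $ok_status, missing directory: $missing_status, exit 3: $custom_status"
```

The exact non-zero value that "ls" returns for a missing directory varies between systems, so only "zero or not" should be relied upon there.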

Returning to the above example, the other way of getting "My Stuff" as a single argument is to quote it. There are two ways to do this in bash:

[wotex:30]:~:0$ ls "My Stuff"
[wotex:31]:~:0$ ls 'My Stuff'
[wotex:32]:~:0$ 

The effect of each of these is slightly different (although not for that example). Take a slightly more elaborate example of that:

[wotex:32]:~:0$ ls ~/My\ Stuff
[wotex:33]:~:0$ 

Here the space is escaped, so doesn't break this into two arguments; and the "~" is expanded by bash to my home-directory. But quoting this doesn't work:

[wotex:33]:~:0$ ls "~/My Stuff"
ls: ~/My Stuff: No such file or directory
[wotex:34]:~:1$ 

Because the argument is quoted (and this applies to single-quotes too), bash does not expand the "~", so "ls" gets "~/My Stuff" as a single argument. There is no directory with the name "~" here, so it fails. Shell variables, however, are expanded inside double-quoted arguments, but not inside single-quoted arguments; so:

[wotex:34]:~:1$ ls "$HOME/My Stuff"
[wotex:35]:~:0$ ls '$HOME/My Stuff'
ls: $HOME/My Stuff: No such file or directory
[wotex:36]:~:1$ 

The second "ls" fails for a similar reason as before -- there is no directory called "$HOME" here.
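
The quoting rules covered so far can be summarised in one small experiment. This is a sketch only; "count_args" is a made-up helper that reports how many arguments it was given, and "thing" is an invented variable:

```shell
# count_args is a made-up helper that reports how many arguments it got.
count_args() { echo "$#"; }

unquoted=$(count_args My Stuff)         # the space splits this into two
escaped=$(count_args My\ Stuff)         # escaped space: one argument
double_quoted=$(count_args "My Stuff")  # quoted: one argument
echo "unquoted=$unquoted escaped=$escaped double_quoted=$double_quoted"

# Variables are expanded inside double quotes but not inside single quotes.
thing=Stuff
in_double=$(echo "My $thing")
in_single=$(echo 'My $thing')
echo "double quotes give: $in_double"
echo "single quotes give: $in_single"

# A quoted tilde is passed through literally rather than expanded.
quoted_tilde=$(echo "~/My Stuff")
echo "quoted tilde gives: $quoted_tilde"
```

Using "echo" (or a helper like this) in place of the real command is a handy way to check what a program will actually receive before running something less forgiving than "ls".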

  • command-line history: bash does a good job of making the command-line a nicer place for the user. A lot of this is down to two GNU libraries, "readline" and "history". Specifically these provide interactive command-line editing (the ability to cursor left+right and change bits); and a command-line history. The history is accessed using the up and down cursor keys, allowing you to repeat earlier commands easily (either as they were or after some editing). A complete command-line history can be shown with the "history" built-in command. You'll notice that this also shows numbers alongside the commands -- these correspond to the number shown in the prompt that increments each time (produced using the special "\!" escape-sequence in the "PS1" prompt). To repeat some specific earlier command, you can either press the up arrow until you get to it, or enter "!nnn", that tells bash to repeat history entry "nnn".

    As shorthand, you can use "!!" to repeat the last command, "!-2" to repeat the last-but-one command, and so on. Bash also supports non-interactive command-line editing, which sounds a little strange, but can be very handy in certain situations.

You can achieve a surprising amount with the command-line alone, much more than is described here. The bash reference manual (on-line version) is an indispensable guide, from which a lot can be learned.

One of the other heavily used features of bash is tab-completion. Typing in long path-names to programs and files is both boring and time-consuming. To this end, bash allows the "tab" key to be used to complete file and directory names. The idea is that you type in part of a command or file-name, then press the tab key. If there is only one possibility, bash completes it (adjusting the command-line). If there are no possibilities or more than one, it beeps at you. Pressing "tab" a second time determines which of these it is: if there is more than one possibility, they are all displayed. Otherwise it just beeps again. Newer versions of bash politely ask you whether you want to list all the possibilities when there are a large number.

Given that bash will complete on executables (found in directories specified by "PATH"), pressing "tab" twice at an otherwise empty command-prompt will produce a list of all the programs/etc. that you could run. My particular version of bash asks me in this case:

[wotex:40]:~:0$ tab tab
Display all 1815 possibilities? (y or n)
[wotex:40]:~:0$ 

Here I pressed "n"; "y" generates a list of 1815 names...


The UNIX File-System

The UNIX file-system (that begins at the root, "/") is generally a well-structured thing. If you run "ls" in the root-directory (either after changing-directory to there or with "ls /"), you'll see a set of directories that are common to all UNIX systems. These tend to be:

/bin     system binaries (programs)
/dev     devices (full of special files)
/etc     system-wide configuration files
/home    users' home-directories
/lib     system libraries (bits of programs)
/root    root's home-directory
/sbin    administrator-only programs (e.g. check disks, mount/un-mount file-systems, enable/disable network interfaces, etc.)
/usr     the "user" directory, that contains most of the installed system software (applications)
/var     where certain programs store their state (e.g. mail-spool, news-spool, information about installed software, etc.)

Within the "/usr" directory are usually a number of standard sub-directories:

/usr/X11R6      X Window System files
/usr/bin        user programs
/usr/doc        program documentation
/usr/games      where games tend to be installed
/usr/home       users' home-directories
/usr/include    C headers for program development
/usr/lib        program libraries
/usr/local      locally installed software
/usr/man        manual pages
/usr/sbin       super-user only programs (e.g. web-server)
/usr/share      miscellaneous program files (e.g. images, data, etc.)
/usr/src        source code (e.g. Linux kernel)

The "/usr/local" directory typically follows the structure of "/usr" (minus itself, of course). The purpose of keeping locally installed software separate in "/usr/local" is to avoid conflict/confusion with distribution installed software in "/usr". Most UNIX distributions include some form of package management for managing installed software (including upgrading, removing, etc.) and generally speaking it's best not to interfere with such installed software.

The default "$PATH" environment-variable generally searches for executables in "/usr/local/bin", then "/usr/bin" and lastly "/bin". The average well-used UNIX system holds a vast number of programs/utilities. One of the best ways to learn is to go and poke around for yourself. If you want to know about a particular program, the best source of information is its manpage (accessed using the "man" command). Many programs also provide some form of built-in help, typically produced using the "-h" or "--help" command-line argument. Most GNU utilities respond to "--help" and "--version" (the latter to find out what version of a particular program is installed).

  • working directories: every UNIX process has associated with it something called the "working directory". For most programs, this is the directory it was launched from (via the shell). The working directory of a process is most apparent in the shell (e.g. "bash") -- initially the user's home-directory. When any process opens a file/directory/etc. using a relative path (one that does not begin with a "/"), the process's working-directory is prepended to it, giving an absolute path. In addition to this, the working-directory of a process is inherited when it forks (spawns a new, often short-lived, copy of itself). For instance, when "ls foo" is run from the command-line, the "ls" program will inherit the current working directory from the shell and attempt to open the directory "foo" (that is relative to the working directory, since it does not start with a "/").
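
Both points (relative paths and inheritance) can be checked with a short sketch. It assumes "mktemp" is available; the directory name "foo" follows the example above, and the variable names are invented:

```shell
start_dir=$(pwd)

# A scratch directory containing a sub-directory called "foo".
scratch=$(mktemp -d)
mkdir "$scratch/foo"
cd "$scratch"

# The relative name "foo" is resolved against the working directory.
if [ -d foo ]; then relative_found=yes; else relative_found=no; fi
echo "relative path found: $relative_found"

# A child process starts with the same working directory as this shell.
# "-P" prints the physical path, with any symbolic links resolved.
parent_cwd=$(pwd -P)
child_cwd=$(sh -c 'pwd -P')
echo "parent: $parent_cwd"
echo "child:  $child_cwd"

cd "$start_dir"
rm -r "$scratch"
```

The two paths printed are the same, since the child "sh" simply inherited the working directory of the shell that started it.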

From the shell, the "cd" (change directory) command is used to move around within the directory structure. For example:

[wotex:40]:~:0$ cd /
[wotex:41]:/:0$ ls
bin    cdrw  floppy  lib         nfs   root-n  tftpboot  var
boot   dev   home    lost+found  proc  sbin    tmp       vmlinuz
cdrom  etc   initrd  mnt         root  sys     usr       vmlinuz.old
[wotex:42]:/:0$ cd bin
[wotex:43]:/bin:0$ cd /usr/local/bin
[wotex:44]:/usr/local/bin:0$ ls
bintohex   openupsd  s2c_list   s2c_scan     sermon2
expirenow  ppthing   s2c_raw    s2c_test     stat
fbounce    s2c_4cd   s2c_reset  s2c_tsensed  temp-update-to-4char.sh
[wotex:45]:/usr/local/bin:0$ cd bin
-bash: cd: bin: No such file or directory
[wotex:46]:/usr/local/bin:1$ cd ../lib
[wotex:47]:/usr/local/lib:0$ cd
[wotex:48]:~:0$ 

The last command here ("cd" without any arguments) returns you to your home-directory. Another useful command is "pwd" (print working directory). This simply prints out the name of the directory you're currently in. For example:

[wotex:48]:~:0$ pwd
/home/auser
[wotex:49]:~:0$ 

Occasionally it is useful to be able to change to another directory, do some stuff, then change back to the original directory. For this, bash provides two built-in commands "pushd" and "popd". Bash maintains a "stack" of directories which these two commands manipulate. The directory stack is shown using the "dirs" built-in command. For example:

[wotex:49]:~:0$ dirs
~
[wotex:50]:~:0$ pushd /usr/local/bin
/usr/local/bin ~
[wotex:51]:/usr/local/bin:0$ cd ../lib
[wotex:52]:/usr/local/lib:0$ dirs
/usr/local/lib ~
[wotex:53]:/usr/local/lib:0$ popd
~
[wotex:54]:~:0$ 

Note that "pushd" and "popd" also report the (resulting) directory stack. To rotate the stack forwards and backwards, a "+n" or "-n" argument to "pushd" can be used. For example:

[wotex:54]:~:0$ pushd /usr/local/src
/usr/local/src ~
[wotex:55]:/usr/local/src:0$ pushd +1
~ /usr/local/src
[wotex:56]:~:0$ popd
/usr/local/src
[wotex:57]:/usr/local/src:0$ cd
[wotex:58]:~:0$ 

Again, the best way to learn about UNIX is usually just to poke around. If you're worried about trashing your system, don't be (too much!). Creating a new user account for messing around in is usually a good idea -- if you trash the account, just delete and re-create it. But, one piece of advice: only use "root" when you need to -- "root" (superuser) is one of the few users who can trash a system.


Links

There's a whole lot of information about UNIX out there, including most of the documentation online.

Last modified: 2009-07-24 23:46:46.000000000 +0100 by Fred Barnes