BreadCrumbs: Bash

Bash

From Luke Jackson

Revision as of 17:39, 29 September 2007; Ljackson (Talk | contribs)
(diff) ←Older revision | Current revision | Newer revision→ (diff)
Jump to: navigation, search

Contents

Shells

What is a shell? The short answer is that it is a command interpreter. It is what allows you to interact with the system from the console. That is, it is what locates and starts programs for you. It also has builtin commands and is, infact, a complete programming language. Most of the init scripts on a Linux system are written in bash, peruse the files in /etc/rc.d/init.d in order to see examples of bash scripts.

Bash is not the only shell available, there are many others: sh, ash, tcsh, csh, pdksh, ksh, sash, zsh, etc... Bash happens to be the default shell in most Linux distributions. BTW, the name, 'Bash', stands for 'Bourne Again SHell' -- a pun on the original UNIX shell, sh, written by Steve Bourne. Bash is an improved reimplementation of sh, largely POSIX compliant.

Paths and File Names

A path is composed of two types of parts, separators and elements. The separator on Unix(-like) systems is the forward slash, /, which suggests that the elements to the right are under those to the left. The elements are text strings that are the names of directories or files. File and directory names can be up to 256 characters long and contain just about any character, including control characters, spaces, and other special characters. While you can have special characters in your filename, I would dissuade you from doing so: if you need to separate name parts for readability, use an underscore instead... spaces and other special characters ( ?, *, or !, for instance ) will confuse bash commands and need to be escaped or quoted -- infact, stick to just letters, numbers, and the following three characters: '.', '_', & '-'. Also, there is no limit to the number of periods that you can put in a name -- you could name a file ..., for instance -- such can make file exchange with legacy operating systems difficult ( then again, they keep giving you files with spaces in the name, so it probably comes out even in the end... ).

If you want to check what file name and path length limits are on a system, look at the man page for getconf. This command can report what the limits are that are compiled into the OS.

File endings are not mandated by the OS. Unix-like operating systems figure out what a file is by looking at its content. Most files contain identifiers in their first few bytes. If that does not help, the shell looks further for clues like newlines and control characters. This is not to say that you don't see file endings on Unix, they can be a convenience to the user and some applications use them. Dots

You should know the meanings of these two things: '.' and '..'. The single dot by itself means 'this directory, here', e.g., and find . -type f -print would print all the regular files starting in the current directory. The double dot, '..', means 'this directory's parent directory', e.g., if you are in /home/rjh/, ls -l .. would list /home's contents.

Tilde expansion

The tilde ( the final 'e' is NOT long! ), '~', with no username immediately after it expands to the current user's home directory ( what echo $USER returns ), e.g., if rjh types ls ~, rjh's home directory will be listed, no matter where rjh is in the filesystem. Also, constructions such as ls ~/subdirectory/ are possible. On a related note, remember that typing cd with no argument will return you to your home directory ( like typing cd ~ ).

If you give a path with a tilde followed by a username, the shell will try to expand that by substituting the path to that user's home directory. So, ls ~bob expands to ls /home/bob, assuming that bob is a valid user on the machine.

Start Up

Short and to the point: when you start up bash as a login session ( from login or as bash -l from within another shell ), bash reads a series of startup files in order to configure its environment. The config files tell it how to colorize its output, where to look for executables, who its owner is, what its own name is, and other behaviors/data. These data are kept in the shell's environment variables.

The files read, in order, are:

  1. /etc/profile,
  2. ~/.bash_profile,
  3. /etc/bashrc, and
  4. ~/.bashrc.

The profile files are intended to store environment variables and the like. The bashrc files are intended to store aliases and bash functions.

Upon exit, ~/.bash_logout is executed in order to clean up after the shell.

Bash Scripting

Programs are written in programming languages that may be one of two fundamental types: compiled or interpreted. Compiled languages must be run through a compiler such as gcc or Borland. The compiler translates the source code, which is generated in a text editor, into machine code -- inscrutable strings of numbers -- which is the only thing that the computer understands. Interpreted languages are also written in a text editor, but nothing else needs to be done before they are runnable. A program called the interpreter reads the file when you run the script and translates the source code into machine code on the fly.

Interpreted languages run slower because of this. They have several advantages, however. The primary advantage is that there is no compile step. With large projects, that can save substantial amounts of time: just write your code and run it -- no waiting around for a possibly long compilation to complete. Further, most interpreted languages lend themselves to testing small fragments of code: write a few lines, run them, if they work, put them into the project... As such, they make ideal rapid prototyping languages, where the developer uses a slow-running, quick to write language to work out an idea, then translates it into a compilable language for performance.

Examples of compiled languages are C and C++. C is the language in which most of Unix and Linux are written. There are many scripting languages available with your distribution: perl, python, haskell, ruby, etc. Bash, pdksh, tcsh, and such are command interpreters. They are also programming languages. You can think of shell scripting as similar to DOS batch files, except written in an actual programming language...

Bash is the default shell on Linux. It is compatable with older Bourne shell scripts ( /bin/sh ). It is used widely for system administration. Usually, all the startup scripts are written in bash. With a text editor such as vim examine these files: /etc/rc.d/rc.sysinit and /etc/rc.d/rc. Also, take some time to root through the files in /etc/rc.d/init.d, reading anything you find interesting.

It is fruitless to try to teach a group of non-programmers how to write bash scripts in the course of one evening, so we will skip that. None the less, you need to be familiar with some basics of how scripting works because it is so essential to UNIX system administration.

How to run a script

Bash scripts are plain text files. There are several ways to run a script:

1. chmod the script, adding execute to the permissions. You need to make sure that the first line of the file is a 'shebang' line. That is a line which tells the operating system which interpreter to give the script to in order to run it. Its format is either:

#!/usr/bin/bash

or

#!/usr/bin/env bash

2. Pass the script as a argument to the interpreter, e.g., at the command line type: bash scriptname

  • A few things about the language itself:
    • Comments are preceded by a hash, # . Everything to the right of the hash is disregarded by the interpreter. Comments are there in order to let the programmer know what he was thinking whenever he wrote the script.
    • Variables are usually spelled with all capital letters, e.g., VAR . If you are assigning a value to one, you use just the variable name, e.g., VAR = somevalue ; if you are using the VAR's value, you precede it with a $ , e.g., echo $VAR .
    • Any command on a line by itself does not need a terminator -- a character to tell the interpreter where the end of the line is. You can string several commands together on one line if you separate them with a semicolon ( which is the terminator ). It is usually poor form, however, to put more than one command on a line.
  • How to find out more about bash:

there is a lot of information in the info page about bash. Type: info bash, in order to read the info page. The trouble about the info page, though, is that it was written by programmers for programmers... One book that I have is called 'Learning the Bash Shell', by Newham and Rosenblatt, published by O'Reilly ( http://www.oreilly.com ). This book is good for beginners, building up from a presumed minimal knowledge set to some fairly advanced programming.

Globbing

Globbing is the name for the shell's filename expansion capabilities. When the shell parses a command line, it looks in each token ( word, if you like ) for the special characters:

  •  ? - One of any character
  • * - Zero or more of any character
  • [ - Introduces a character class

If it finds any of these, it replaces token with a list of all filenames that match the pattern that the token represents. The ? character is a wild card that stands for 'one of any character' -- except the dot ( . ). It must match one character: 'abc?' does not match 'abc', but does match 'abcd'. The * character matches zero or more of any character. The pattern 'abc*' matches 'abc' as well as 'abcd' and 'abcdefgh...'. The [ character introduces a character class, a list of single characters that can be matched. The pattern 'abc[de]' matches 'abcd' and 'abce' but not 'abc' nor 'abcg'. If no match is found, the meta characters can be taken as literally what the are. If you do not want a meta character expanded, you need to escape it with a backslash, e.g., a file named 'abc*' would have to be typed 'abc\*' to avoid expansion. The best way to avoid problems, though, is to avoid special characters -- don't use anything in a filename that you have to escape.

Brace Expansion

On a similar subject, the shell also does brace expansion ( which is not globbing -- globbing is done to names from the filesystem ). If it sees a { character in a token, it checks to see if there is a pair of braces containing a properly formatted list of text strings ( at least one comma and no spaces ) then generates a list of tokens from the expansions. Note well that it does not do filename matching in order to determine whether to generate the token. Given the command

chown 701 public_html/chap{0{8,9},1{0,1,2}}

the last token will be expanded to the list: chap08, chap09, chap10, chap11, and chap12 ( the command will attempt to operate on the all the paths, possibly failing on any path that does not exist ). Note that you can nest braces.

Personal tools