Wednesday, September 29, 2010

Useful Linux/Unix One Liners

#### Useful Linux/Unix One Liners ####

1. Display Username and UID sorted by UID Using cut, sort and tr
The cut command extracts a specific part of a file. The following example cuts the username and UID from the /etc/passwd file and sorts the output numerically on the UID field, using ":" as the delimiter.
As part of formatting the output, you can separate the username and UID with any other character: the tr command converts ":" to a tab (or any character you choose).
$ cut -d ':' -f 1,3 /etc/passwd | sort -t ':' -k2n - | tr ':' '\t'

2. Find List of Unique Words in a file Using tr, sed, uniq
The following example lists the words that consist only of alphabetic characters. The tr command converts every non-alphabetic character to a newline, so each word ends up on its own line (with extra newlines in between). The sed command removes the empty lines, and finally the output is sorted and passed through uniq to remove duplicates.
$ tr -c a-zA-Z '\n' < Readme1.txt | sed '/^$/d' | sort | uniq -i -c
Note: uniq with -i ignores the cases
The Linux 'sed' command plays a vital role in text manipulation operations

3. Join Two Files (Where one file is not sorted ) Using sort and Join
The join command joins two files based on a common field. For join to work properly, both files should be sorted.
In the example below, the file m1.txt contains employee name and employee ID, and it is not sorted. The second file m2.txt has employee name and department name. To join these two files, sort the first file and feed the sorted output to join as one of its input streams.
$ sort m1.txt | join - m2.txt

4. Find out which process is using up your memory using ps,awk,sort
The following command lists all processes, sorted by used memory size.
$ ps aux | awk ' { if ($5 != 0) print $2,$5,$6,$11 } ' | sort -k2n
The above command lists the PID, used virtual memory size, used resident set size, and the process command.
Awk is an extremely useful language to manipulate structured data very quickly.

5. Find Out the Top 10 Largest Files or Directories Using du, sort and head
'du' shows summarized disk usage for each file and directory of a given location (/var/log/*). The output is then reverse-sorted numerically by size.
# du -sk /var/log/* | sort -r -n | head -10

6. Find out the Top 10 Most Used Commands
Bash keeps every command you execute in a hidden file called .bash_history under your home directory.
Use the following one-liner to identify which commands you run most often from your command line.
$ cat ~/.bash_history | tr "\|\;" "\n" | sed -e "s/^ //g" | cut -d " " -f 1 | sort | uniq -c | sort -n | tail -n 10

7. Display timestamp using HISTTIMEFORMAT
Typically when you type history from the command line, it displays the command number and the command. For auditing purposes, it can be helpful to display a timestamp alongside each command.
# export HISTTIMEFORMAT='%F %T'

8. Search the history using Control+R

9. Repeat the previous command quickly using 4 different methods
Sometimes you may end up repeating previous commands for various reasons. The following are 4 different ways to repeat the last executed command.
- Use the up arrow
- Type !!
- Type !-1
- Press Control+P

10. Eliminate consecutive repeated entries from the history using HISTCONTROL
export HISTCONTROL=ignoredups

11. Erase duplicates across the whole history using HISTCONTROL
export HISTCONTROL=erasedups

12. Disable the usage of history using HISTSIZE
export HISTSIZE=0

13. Ignore specific commands from the history using HISTIGNORE
Sometimes you may not want to clutter your history with basic commands such as pwd and ls.
$ export HISTIGNORE="pwd:ls:ls -ltr:"

#### Find, Exec, Xargs ####


## Find

Display the pathnames of all files in the current directory and all subdirectories.  The following commands are equivalent:
#find . -print
#find -print
#find .

This will find any filename that begins with foo and ends with bar:
#find . -name foo\*bar

Example using two search criteria:
#find / -type f -mtime -7 | xargs tar -rf weekly_incremental.tar

Note
This will find any regular files (i.e., not directories or other special files) with the criteria "-type f", and only those modified seven or fewer days ago ("-mtime -7").  Note the use of xargs, a handy utility that converts a stream of input (in this case the output of find) into command line arguments for the supplied command (in this case tar, used to create a backup archive).

Another use of xargs is illustrated below.  This command will efficiently remove all files named core from your system (provided you run the command as root of course):

#find / -name core | xargs /bin/rm -f
#find / -name core -exec /bin/rm -f '{}' \; # same thing
#find / -name core -delete                  # same if using Gnu find

(The -exec form runs the rm command once per file and is not as efficient as the first form, while -delete lets find remove the files itself.  The first form should be rewritten to use "-print0" with "xargs -0" to be safe with unusual filenames.)

The following find criterion locates files modified less than 10 minutes ago:
#find / -mmin -10

To locate a recently downloaded file:
#find / -cmin -10
-cmin n = File's status was last changed n minutes ago.
-mmin n = File's data was last modified n minutes ago.
-mtime n = File's data was last modified n*24 hours ago.

Find files with various permission bits set ("-perm permissions").  The following
will locate files that are writable by "others":
#find . -perm -o=w

#find . -mtime 0   # find files modified between now and 1 day ago
                  # (i.e., within the past 24 hours)
#find . -mtime -1  # find files modified less than 1 day ago
                  # (i.e., within the past 24 hours, as before)
#find . -mtime 1   # find files modified between 24 and 48 hours ago
#find . -mtime +1  # find files modified more than 48 hours ago

#find . -mmin +5 -mmin -10 # find files modified between
                          # 6 and 9 minutes ago

This says to search the whole system, skipping the directories /proc, /sys, /dev, and /windows-C-Drive (presumably a Windows partition on a dual-booted computer).  The Gnu -noleaf option tells find not to assume all remaining mounted filesystems are Unix file systems (you might have a mounted CD for instance).  The "-o" is the Boolean OR operator, and "!" is the Boolean NOT operator (it applies to the following criterion).

So these criteria say to locate files that are world writable ("-perm -2", same as "-o=w") and NOT symlinks ("! -type l") and NOT sockets ("! -type s") and NOT directories with the sticky (or text) bit set ("! \( -type d -perm -1000 \)").  (Symlinks, sockets and directories with the sticky bit set are often world-writable and generally not suspicious.)

#find / -noleaf -wholename '/proc' -prune \
     -o -wholename '/sys' -prune \
     -o -wholename '/dev' -prune \
     -o -wholename '/windows-C-Drive' -prune \
     -o -perm -2 ! -type l  ! -type s \
     ! \( -type d -perm -1000 \) -print

Using -exec Efficiently:
# find whatever... | xargs command
This has two limitations:
- Firstly, not all commands accept the list of files at the end of the command line.  A good example is cp:
#find . -name \*.txt | xargs cp /tmp  # This won't work! (GNU cp's -t option, which takes the target directory first, works around this)

- Secondly, filenames may contain spaces or newlines, which would confuse the command used with xargs.  (Again, Gnu tools have options for that: "find ... -print0 | xargs -0 ...".)

An alternate form of -exec ends with a plus-sign, not a semi-colon.  This form collects the filenames into groups or sets, and runs the command once per set. 
(This is exactly what xargs does, to prevent argument lists from becoming too long for the system to handle.) 
In this form the {} argument expands to the set of filenames.  For example:

find / -name core -exec /bin/rm -f '{}' +

#find /opt -name '*.txt' -type f -exec sh -c 'exec cp -f "$@" /tmp' find-copy {} \;
#find /path/to/files* -mtime +5 -exec rm {} \;

-exec allows you to pass in a command such as rm. The {} expands to the current filename, and the \; at the end is required to terminate the command.

Reverse DNS lookup of an IP address against a specific DNS server:
# dig -x IP @Dns_Server

Monday, September 27, 2010

Python-Reference

### Python (VVV) ###

Python is a dynamic, mature language... widely used for Unix/Linux scripting... Indentation is significant in the Python programming language... Python encourages neat programming... Python is cross-platform... It supports object-oriented programming...
High data-manipulation capability

- which python
- python
Eg: at the python prompt try expressions such as 1 + 1, 1 * 3, 2 ** 16 (exponent), 4 - 2, "Hello World", "Hello World\n", "Hello\tWorld\n\n\n", 'Hello\tworld\n\n'
1 tab = 8 spaces

Press Ctrl+D to come out of the interpreter
help()
- python is great at text parsing... it is dynamically typed...
- Python is a modular language ... modules are available... import the modules... Then the module's code will be available in your code
- Indentation of branches is very important

name = "gyani pillala"
print name
### SheBang header #!/usr/bin/python
message = "Hello World"
print "\n\n", message ,"\n\n"

#(string,int,float,lists,tuples,dictionary)
Product = "Linux"
print Product
type(Product)

After Price2 = Price, id(Price) and id(Price2) return the same value... Assignment copies the reference, not the object... The new name just acts like a symbolic link to the same object.
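A quick sketch of that reference behaviour (using an example list, since Price is not actually defined in these notes):

```python
# Assigning one name to another copies the reference, not the object.
Price = ["Linux", 395]                 # a mutable object
Price2 = Price                         # Price2 now refers to the SAME object

same_object = id(Price) == id(Price2)  # True: identical identities

Price2.append("CBT")                   # mutate through the alias...
# ...and the change is visible through the original name as well
```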

##STDIN
print "first" , "second"  o/p first second
print "first" + "second" o/p firstsecond
print "hello"*3 o/p hellohellohello
message = raw_input("What is your message?") o/p What is your message? Hello world
print message
###

Eg:
print "------------------- Life Expectancy ---------------------------"
name = raw_input("What is your name? ") // (after printing, gives you a new line)
print name, ",",   // (the trailing comma keeps output on the same line)
age = input("What is your age? ") // integer input

### String 1
Inherent ability to manipulate the string
Eg;
len(message)
print message[4]
print message[2:5] // the slice operator prints a portion of the string

>>> for letter in message:
...    print letter

>>> if message == message2:
...     print "They Match!"

>>> import string
>>> string.upper(message) // string is a module... upper is a function/method
>>> string.lower(message)
>>> string.capitalize(message)  // new string
>>> string.capwords(message)  // new string
>>> string.split(message) // ['new', 'string']
### String 3
import string

message = "new string"
message2 = "new strings2"
print message
print message, "contains", len(message), "characters"
print "The first Character in", message, "is:", message[0]
print "Example of slicing", message, message[0:4]
for letter in message:
    print letter
if message == message2:
    print "They match!"

Difference Between the Interpreter and Scripts --
The interactive interpreter echoes expression values... scripts do not... by default

## Lists
Part I

numlist = [1,2,3,4,5]
print numlist
numlist.reverse()
numlist2 = [6,7,9]
numlist.append(numlist2)
print numlist // [1,2,3,4,5,[6,7,9]]
print numlist[0]
print numlist[5][0] // Two dimensional list
numlist.pop() // pop the last element
numlist.pop(3) // pop the element at index 3
numlist.extend(numlist2)
print numlist  //[1,2,3,4,5,6,7]
numlist.insert(Indexnum,value) // numlist.insert(1,2)

Part II

range(3) // [0,1,2]
numlist3 = range(10)
range(1,3 ) // [1,2]
range(0,102,3) // Incremented by 3

stringlist = ["LinuxCBT","scripting","Edition"]
stringlist2 = ["Part2"] // example second list, so the calls below work
stringlist.reverse()
stringlist.append(stringlist2)
stringlist.pop()
stringlist.extend(stringlist2)
stringlist.insert(i,value)

Part III

logfile = " 200450050 10.10.10.2 192.168.222.233"
type(logfile) // result says --- str
import string
string.split(logfile)
logfile2 = string.split(logfile) // Contains List
type(logfile2) // result says --- list
logfile3 = string.join(logfile2)
print logfile3
type (logfile3) // type str

print range(10) # returns all values excluding the boundary (10)
print range(1,11) # returns 1-10
stringlist = ["l", "b", "c"]
stringlist2 = ["p","q"]
print stringlist
stringlist.append(stringlist2)
print stringlist
print stringlist[3][:]
stringlist.extend(stringlist2) // just one flat list

logfile2 = string.split(logfile)
logfile3 = string.join(logfile2)  // convert the list back to a string
print type(logfile3)

#Tuples // Immutable ... basically a list that cannot be changed...
product = ["d","f"]
product[1] = "change" // lists accept changes
product2 = ('linuxcbt','Scripting','redhat')
type(product2) // tuple  read-only  a tuple doesn't accept any changes to its values

# Dictionary

test = {'script':395,'redhat':595}
test['script'] // 395
test['redhat'] // 595
test['debian'] = 400
test // Will print all the contents
test.keys()
test.values()
del test['redhat'] // key to del
for k,v in test.iteritems(): // Nothing significant about k and v.. key & val
     print k,v // print all the values & keys
suiteprice = [333,444] // list
test['suite'] = suiteprice
test // print all the {'suite':[333,444],'scripting':667...

# Conditional I
Comparison operators...
<, <=, >, >=, ==, != (and <>, an old spelling of !=)
min,max = 8,9
>>>if min < max:
....   print min, " is less than", max

Note: Python delimits blocks by indentation... we don't have to close any braces or brackets...
if min > max:
    print min, " is greater than ", max
else:
    print "no luck"
if answer != timeleft:
    print "Sorry", name, "That is incorrect"
else:
    print " hai "

#Conditional II
import sys
print sys.argv  // prints the argument list; the first element is the script name itself
print len(sys.argv)
if len(sys.argv) < threshold:
          print " blah.... blah... blah..."
elif len(sys.argv) >=8 :
         print "hello"

# For Loops
for var in list:

Eg;
string1 = "linuxcbt"
for i in string1:
        print i
list = ["x","u"]
for i in list:
      print i

# While loop
eg;
count = 0
while count <= 10:
   print count
   count = count + 1

while 1:
   print ""
   count = count + 1

while answer != timeleft:
    print " blah blah"
    sys.exit()

#File Input/output I
# open accepts 2 arguments: filename and mode
# modes include: r,rb(read-binary),wb(write-binary),a(append),r+(read-write)
#variable object
handle = open("data1","r")
print handle // < open file 'data1', mode 'r' at 0xf5fd520 (memory location) >  // the file object lives in memory...
# readline reads into a string the first line up to \n
print handle.readline()
temp = handle.readline()
type(temp)  // string
temp = handle.read() // will read the entire file
temp = handle.read(50) // will read only the first 50 characters (not lines)
#read - reads entire file into a string, unless num of chars specified
#readlines - will read one line per list element
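The read/readline/readlines behaviour above can be checked with a small self-contained sketch (it builds its own throwaway file instead of assuming data1 exists):

```python
import os
import tempfile

# create a small throwaway file to read back
path = os.path.join(tempfile.mkdtemp(), "data1")
with open(path, "w") as f:
    f.write("line one\nline two\nline three\n")

handle = open(path, "r")
first = handle.readline()    # reads up to and including the first '\n'
rest = handle.read()         # reads the remainder of the file
handle.close()

handle = open(path, "r")
five = handle.read(5)        # read(n) reads n characters, not n lines
handle.close()

handle = open(path, "r")
lines = handle.readlines()   # one list element per line
handle.close()
```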

#File Input/output II
Note: python by default gives a new line character when we print some thing using "print"

E.g:
han1 = open("data1","r")
han2 = open("data2","w") // If the file exists it will be overwritten... If not, a new file will be created
tempread1 = han1.readlines()
for i in tempread1:
    han2.write(i)
// or: han2.writelines(tempread1)
han1.close()
han2.close() // closing the files is important... it flushes the buffers

han1 = open(f1,"r") // f1 = "data1" // f1 = raw_input()
han2 = open(f2,"r") // f2 = "data2" // f2 = raw_input()
han2 = open(f2,"a") // Appending is always a good habit.. if the file doesn't exist, it will be created...
han1 = open(f1,"r+") // Both reading and writing

# File Input/Output IV
han1 = open("data1","w")
productname = "linux"
productcost = 395
count = 1
## Integers cannot be written to a file directly... they have to be converted to strings...
## The % operator, when applied to strings, performs formatting...
## %s - strings, %d - integer digits, %f - floats
han1.write("%s %d %d\n\n" % (productname,productcost,count))
while count <= 100:
    han1.write("%s %.2f %d\n\n" % (productname,productcost,count))
    productcost = productcost + 1
    count = count + 1
han1.close()

## Exceptions I
f1 = raw_input("Please specify a filename for processing:")

try:
    han1 = open(f1,"r")
except:
    print "Problems opening file",f1
print "we've moved on"

while 1:
 f1 = raw_input("Please specify a filename for processing:")
 try:
  han1 = open(f1,"r")
  break  // to break the never ending loop
 except:
  print "Problems opening file", f1

import sys
import string
count = 0
while 1:
 if count == 3:
  answer = raw_input("")
  if answer == "yes":
   sys.exit()
  else:
   count = 0
 f1 = raw_input(" please specify a filename for processing:")
 try:
  string.lower(f1)
  han1 = open(f1,"r")
  break
 except:
  print "Problems opening file", f1
  count = count + 1

// It's always good to put values into variables rather than hard-coding them...

### Functions
Encapsulate code
Perform repetitive tasks

python
>>> help()
help> keywords

def name():
    Statements
Note: A function has to be defined before it is called
Eg:
def hworld():
    print "hello World"
hworld()

def lifeexpect(e,a): // function definition
    timeleft = e - a
    return timeleft  // function return values

timeleft = lifeexpect(expect,age) // function call

## Modules
python
>>> import sys
>>> dir (sys)
>>> print sys.ps1
>>> print sys.path
>>> import os
>>> dir (os)
>>> print sys.path ( the default locations where python searches for modules ) This is a list, which can be modified...
cd /usr/lib/python ; ls -l os.p*
os.py - python text file ... ASCII text contents... (ASCII English text)
os.pyc - byte-code compiled file for os.py ... When the module is first imported, the compiled byte code is written to os.pyc.
             The functions execute from the .pyc file since it is faster (data file). It is a very important file that is consulted
             by the interpreter
os.pyo - another form, optimized byte code ... (data file)

Note: Python is a high-level language... It compiles the program into byte code for subsequent execution... similar to Java
Note: Importing methodology...
eg: instead of "import sys", which makes every function inside the module sys available,
we can import specific functions from the module... which decreases the overhead...
Eg: from sys import func1, func2, func3, ..., funcn
Why import all the functions when you are interested in only particular ones?

#### SHUTIL

import shutil
s = shutil
s.copy2("data1","data2")
s.move("data1","temp/data2")
s.copytree(srcdir,dstdir) // copytree copies a dir recursively... copytree expects the destination dir not to exist...
s.copytree(srcdir,dstdir,1) // last argument (symlinks)... 1 - copy symbolic links as links... 0 - copy the files they point to...
s.rmtree(srcdir)

### Regular Expression ###
Regular Expression I
- For parsing strings: specific characters, groups of characters
import re
dir(re)
reg1 = re.compile('itsamatch')
reg1.match('itsamatch')

reg1 = re.compile('itsamatch',re.IGNORECASE)
match1 = reg1.match('itsamatch')
print match1.group()

searchstring = "This is a test"
reg1 = re.compile('^T')
match1 = reg1.match(searchstring)
print match1
print match1.group()
reg1 = re.compile('.*')   // . - wildcard matching any character, * - zero or more repetitions
metacharacters - ^, *, ., +, ?

reg1 = re.compile('[a-z]')
match1 = reg1.match(searchstring)  // searchstring = "this is a test"
print match1.group()  // o/p - t

reg1 = re.compile('[a-z]+')
match1 = reg1.match(searchstring)
print match1.group() // o.p : this

reg1 = re.compile('[a-z]+.*')  // . - any character (here the space), * - continues to the end of the string
match1 = reg1.match(searchstring)
print match1.group() // o.p: this is a test

## Python escape sequences
reg1 = re.compile('[a-z]+\s') // \s - matches a whitespace character

## script example ##
import re
#reg1 = re.compile('hello',re.IGNORECASE) // search for string hello
#reg1 = re.compile('\d+', re.IGNORECASE) # matches verbatim // searches for digits only
reg1 = re.compile('\d+\s+\w+', re.IGNORECASE) # matches verbatim // searches for digits + strings
searchstring = raw_input("Please give us a search string:")
match1 = reg1.match(searchstring)
if match1:
 print match1.group()
else:
 print "NO match"

## script Example ##
reg1 = re.compile('\d+\s+\w+',re.IGNORECASE) # matches verbatim // matching digits/space/words
c = 1
t = 3
print "Okay, you've got", t, "chances to make a match !"
while c <= t:
 searchstring = raw_input("Please give us a search string:")
 match1 = reg1.match(searchstring)
 if match1:
  print match1.group()
 else:
  print "No match"
 c = c + 1

### username @ domain ##
reg1 = re.compile('\w+@.*')  // \w = [a-zA-Z0-9_], so it does not match the '.' in e.g. gyani.pillala
reg1 = re.compile('.*@.*')
reg1 = re.compile('\s+\w+@.*') // Any number of strings,words before @
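A runnable sketch of the difference (the addresses are made-up examples; the patterns are written as raw strings):

```python
import re

addr = "gyani.pillala@example.com"   # local part contains a '.'

# \w+ is [a-zA-Z0-9_]+, so it stops at the '.' and never reaches the '@'
m1 = re.compile(r'\w+@.*').match(addr)

# .* matches any characters, so it happily swallows the dot as well
m2 = re.compile(r'.*@.*').match(addr)

# without the dot, \w+@ matches fine
m3 = re.compile(r'\w+@.*').match("pillala@example.com")
```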

## substituting digits ###
reg1 = re.compile('\d+', re.IGNORECASE) # matches verbatim
filename = "data3"
filename2 = "data4" // assumed name for the output file
f1 = open(filename, "r")
f2 = open(filename2, "w")
for searchstring in f1.readlines():
 print reg1.sub("2000", searchstring) // substituting...
 nvalue = reg1.sub("2000", searchstring, count=10) // substituting at most 10 times per string, when it has multiple digit runs...
 f2.writelines(nvalue)

## Syslog Integration 1 ##
illustrate logging via python
import logging

# definition and instantiation of logger object
logger = logging.getLogger() // object Instantiating

#definition of the handler
# handler .. responsible to deliver the message to the destination
han = logging.FileHandler('log1.log')
#han = logging.FileHandler('log1.log','a')
#definition of the formatted string
format = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
han.setFormatter(format)

#add handler to logger
logger.addHandler(han)

# Model: logger ---> handler ---> destination (File, Syslog, SMTP, TCP/UDP sockets, etc.)

#invocation of logger
logmessage = "testing logger in Python"
logger.error(logmessage)

### syslog logging ###
import logging
import logging.handlers

# definition and instantiation of logger object
logger = logging.getLogger() // object instantiation
# logger ----> han1   (handlers can lead to different destinations)
# logger ----> han2
#definition of the handler
# handler .. responsible to deliver the message to the destination
han = logging.FileHandler('log1.log')
han2 = logging.handlers.SysLogHandler() # handler for Syslog takes arguments as well

#han = logging.FileHandler('log1.log','a')
#definition of the formatted string
format = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
han.setFormatter(format)
han2.setFormatter(format)

#add handler to logger
logger.addHandler(han)
logger.addHandler(han2)

# Model: logger ---> handler ---> destination (File, Syslog, SMTP, TCP/UDP sockets, etc.)

#invocation of logger
logmessage = "testing logger in Python"
logger.error(logmessage)

## netstat -anu
    Syslog by default accepts UDP packets on port number 514

### CGI with Python ###
- Common Gateway Interface
- Perl was one of the first languages used for dynamic web development on the web server; others followed, including Python, ASP, and PHP (built entirely for the web)
- perl lacks certain security features..
cd /var/www/cgi-bin/ [ the location where the scripts have to be kept ]
The httpd service maps /cgi-bin/ to /var/www/cgi-bin/   # Subject to change for different directories
 Eg: http://192.168.1.2/cgi-bin/helloworld.py
The MIME type tells the browser which application the content belongs to, so the browser knows how to handle it.
Eg:
print "Content-Type: text/html"
print   # a blank line must follow the headers
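A minimal helloworld.py along these lines might look like the following sketch (the filename and markup are illustrative; note the blank line that separates the header from the body):

```python
#!/usr/bin/python
# minimal CGI response: header, blank line, then the HTML body

def build_response():
    # the empty line after the header tells the server where content starts
    header = "Content-Type: text/html"
    body = "<html><body><h1>Hello World</h1></body></html>"
    return header + "\n\n" + body

print(build_response())
```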



## Action1.py
import cgi

print "Content-type: text/html"
print   # blank line ends the headers

form = cgi.FieldStorage()

name = form["name"].value
title = form["title"].value
email = form["email"].value

print "Name:\t", name
print "Title:\t", title
print "E-Mail:\t", email

#### Globbing ####
Python offers quick approaches for searching files/directories; globbing is a nice quick way to scan a directory
Globbing typically means matching paths/files using different wild cards
import glob
dir(glob) // works with absolute and relative paths (absolute - starts from the top of the tree, relative - starts from the current directory)
>>> search1 = glob.glob('*.py') // within the current directory
>>> print search1
>>> len(search1) // no. of files
>>> search1 = glob.glob('./temp/*.py') // search in current sub-directory
>>> search1 = glob.glob('fileio?.py') // ? - any single character
>>> search1 = glob.glob('fileio[1-2].py')

## script
import glob

query1 = raw_input("Please enter a directory or file Query:")
search1 = glob.glob(query1)

if search1:
 for i in search1:
  print i
 print "Your search contains", len(search1), "results!"
else:
 print "Sorry, no matches"

### Python Journal ###

#################################################################
Python Tutorial
#################################################################

####### PYTHON ########

The Python scripting language was created by Guido van Rossum, and first released to
the public in 1991.
Search engines (including Google) use Python to perform their tasks

Typing the command python on the command line invokes the interpreter in interactive mode:
$ python
Python 2.5.2 (r252:60911, Sep 14 2008, 10:31:08)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
The interactive prompt is now waiting for you to type Python commands:

e.g.
Similar to shell and Perl scripting, you can also invoke the Python interpreter from within
your scripts:
$ cat test12.py
#!/usr/bin/python
# Running a Python script
print "This is another script test"

As with other scripts, you must make the script file executable before you can run it on
the system:
$ chmod +x test12.py
$ ./test12.py
This is another script test

The Python language provides many features found in higher-level languages, such as high-level data types, structures, and built-in error checking.

Python Variables and Data Structures
Python has four variable types to work with:
- scalars
- lists
- tuples
- dictionaries

### Scalars variables
Scalar variables are those that hold a single value

Examples of numeric and string scalar variable assignments are
x = 5
pi = 3.14159
animal = "dog"

Using variables in Python scripts is also a little different:
$ cat test13.py
test = "Jessica"
pi = 3.14159
radius = 5
area = pi * radius ** 2
print test, "says the area should be", area
$ chmod +x test13.py
$ ./test13.py
Jessica says the area should be 78.53975

Note: notice that Python doesn’t use any symbols around the variable names. Because of
that, when you want to display a variable value in the print statement, it must be outside
of the string text defined.

##### List Variables

Lists are numerically indexed lists of objects (similar to arrays in Perl). They can be made
up of any combination of things you choose, such as scalar data, dictionaries, tuples, and
even other lists. An example of a list assignment is

days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]

The beauty of lists in Python is that there are several great built-in functions that can
operate directly on the values in the list

Table 26-5: Python List Functions
Function     Description
append(x)     Add an item to the end of an existing list.
count(x)     Return the number of times the specified item appears in the list.
extend(L)     Extends a list by appending all of the items in the specified list.
index(x)     Return the index value of the specified item in the list.
insert(i, x)     Insert an item into the list at a specified location.
pop(i)         Remove an item from the specified index in the list. If no index is specified,
              removes the last item in the list.
remove(x)     Remove an item from an existing list.
reverse()     Reverses the items in the list.
sort()         Sorts the items in the list.

Eg:

Here’s a simple example of working with a list in Python:
$ cat test14.py
#!/usr/bin/python
# testing lists in Python
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
print "Before:", days
print "I have practice on", days[2]
days.sort()
print "After:", days
$ ./test14.py
Before: ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
I have practice on Wednesday
After: ['Friday', 'Monday', 'Thursday', 'Tuesday', 'Wednesday']

Note: The sort() function sorts the data inside the list variable. Also, notice that you can quickly
display the list contents just by using the print statement!

####### Tuples Variables

Tuples are almost identical to lists, except they are immutable. That means unlike a list,
which you can change and modify any way you like, you cannot change a tuple. To create
a tuple, just list the values separated by commas:
days = "Monday", "Tuesday", "Wednesday", "Thursday", "Friday"

Because tuples don't allow you to change any of the values assigned to them, there aren't
any functions for altering the data contained in the tuple.
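A small sketch makes the difference concrete: item assignment works on a list but raises a TypeError on a tuple:

```python
days_list = ["Monday", "Tuesday", "Wednesday"]
days_tuple = ("Monday", "Tuesday", "Wednesday")

days_list[1] = "Holiday"          # lists are mutable: this works

try:
    days_tuple[1] = "Holiday"     # tuples are immutable: this raises
    changed = True
except TypeError:
    changed = False               # the tuple is left untouched
```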

######## Dictionary Variables
Dictionaries are associative arrays, similar to hashes in Perl. They are a collection of
objects or values arranged in key/value pairs. An example of a dictionary assignment is
favorites = {"fruit" : "banana", "vegetable" : "carrot"}
The dictionary elements can be referenced directly using the format:
favorites['fruit']

$$$$ Secret $$$$
You can also use the special dict() function to assign a key/value tuple to a dictionary variable:
dict([('fruit', 'banana'), ('vegetable', 'carrot')])
This allows you to build separate tuples of key/value pairs and enter them into
the dictionary variable as necessary.

####### Indentation in Structured Commands ########
Just like Perl, Python supports the standard structured commands you’d expect to use in
your scripts, such as the if-then-else statement, while loop, and for loop.

Python uses indents as a formal part of the language, and lines end with a hard return.

The rules on indentation are simple. Be consistent in how you indent. Use the same number of spaces to indent a block of code. If you need a statement that spans more than one line, use the backslash (\) to continue the line. Here's an example of Python's indentation rule at work:

e.g.
$ cat test16.py
#!/usr/bin/python
# using structured commands
count = 1
factorial = 1
number = 5
while count <= number:
    factorial = factorial * count
    count = count + 1
print "The factorial of", number, "is", factorial
$ ./test16.py
The factorial of 5 is 120

Notice that the while loop doesn’t use braces to define the code block. Instead, Python
assumes the indented code is within the code block. Python’s indentation rules may take
a little getting used to if you’re coming from another programming language such as Perl,
but they will make your code more readable and thus easier to maintain.


######## Object-Oriented Programming #########

In Python, everything is an object. Each object has an identity, type, and a value, but also
inherent properties and methods. You already saw this when using the sort() method
for the list variable. Instead of using a sort() function and having to assign the output
to another variable, we use the sort() method by adding it to the end of the days list
variable:

days.sort()

This feature is a fundamental cornerstone in object-oriented programming. Almost every
object has some methods that it inherits that you can use.
You can also create your own objects with the class statement. User-defined classes can
have class variables and class methods, which govern all instances of the class. Each
instance of a class can, in turn, have its own instance variables that don’t apply to other
instances.
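A minimal sketch of such a class (the names here are illustrative, not from any library): a class variable shared by all instances, instance variables set per object, and a method every instance inherits:

```python
class Product:
    # class variable: shared by every instance of the class
    category = "training"

    def __init__(self, name, price):
        # instance variables: each object gets its own copies
        self.name = name
        self.price = price

    def discounted(self, percent):
        # a method available on every instance
        return self.price * (100 - percent) / 100.0

linuxcbt = Product("LinuxCBT", 395)
redhat = Product("RedHat", 595)
```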

######### Python Command-Line Arguments ###########

Python is a modularized language; it uses modules for everything, including
how it interacts with the command line.
The sys module is required to interact with the system. It contains the argv array, which
passes command-line arguments to the script. To use the sys module, you must import
it into your code:

import sys

This is yet another feature of object-oriented programming, the ability to import additional
features into your code.

Here’s an example of using the argv array to process command-line arguments:
$ cat test17.py
#!/usr/bin/python
# determine if a specified year is a leap year
import operator, string, sys
if (len(sys.argv) == 2):
    year = string.atoi(sys.argv[1], 10)
    by4 = year % 4
    by100 = year % 100
    by400 = year % 400
    if (operator.xor(by4, operator.xor(by100, by400))):
        print sys.argv[1] + " is not a leap year"
    else:
        print sys.argv[1] + " is a leap year"
else:
    print "Sorry, you did not provide a year."
$ ./test17.py 2009

2009 is not a leap year
$ ./test17.py 2008
2008 is a leap year
$

As you can see from this example, Python is a little more complicated than Perl when using
command-line arguments. The command-line argument is placed in the sys.argv array as
element 1 (not 0 as in Perl). You might also notice that unlike Perl, Python is very specific
about data types. The command-line arguments are all captured as string values.
Because the program needs to use the command-line argument as an integer value, it
must be converted. The string module provides the atoi function, which converts ASCII
strings to integer values. After assigning the new integer value to a variable, the calcula-
tions can begin.

However, in Python, special mathematical operators (such as the Boolean XOR) are also
available as functions, and here they are used as functions instead of operators. This requires
importing the operator module and writing the if statement with the xor() function.
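For comparison, in modern Python 3 neither the string nor the operator module is needed: the built-in int() converts the argument, and the standard calendar module already implements the leap-year rule. A minimal sketch:

```python
import calendar

def leap_message(arg):
    """Build the same style of message as the script above."""
    year = int(arg)               # int() replaces string.atoi()
    if calendar.isleap(year):     # full Gregorian rule (handles 1900, 2000, ...)
        return arg + " is a leap year"
    return arg + " is not a leap year"

print(leap_message("2008"))   # 2008 is a leap year
print(leap_message("2009"))   # 2009 is not a leap year
```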

#### Python Modules ####

All object-oriented programming languages include libraries that contain pre-built classes
that are useful to programmers. The Python programming language is no different.
In Python, class libraries are called modules. A module contains classes and functions
that can be called from within a Python script. The standard Python installation includes
a library of standard modules that are built into the interpreter. To reference classes and
functions from the module, you must define the module name within the script using the
import command, as was demonstrated in the previous section.

Here is an example of using the SMTP module to easily send a mail message from your
Python script:
$ cat test18.py
#!/usr/bin/python
# using the SMTP module to send mail
import smtplib, time
From = 'rich'
To = 'rich'
Subject = 'Test mail from Python'
Date = time.ctime(time.time())
Header = ('From: %s\nTo: %s\nDate: %s\nSubject: %s\n\n'
          % (From, To, Date, Subject))
Text = 'This is a test message from my Python script'
server = smtplib.SMTP('localhost')
result = server.sendmail(From, To, Header + Text)
server.quit()
if result:
    print 'problem sending message'
else:
    print 'message successfully sent'

This sample program imports two standard modules. The smtplib module provides SMTP
functions, and the time module provides functions for getting the time from the system.
The script uses the SMTP Python module, which interfaces with the mail program on the
local system to send the created message. If you do not have your mail system configured,
this script won’t work.
Besides the standard modules, there are a host of other modules available for just about any
type of programming function. Scanning the Web for the term “Python modules” produces
thousands of code modules freely available to incorporate into your own applications.



#### Conclusion #####

The Python scripting language brings an object-oriented programming approach to
scripting. It treats items as objects and provides many methods for manipulating
them within a script. Like Perl, Python uses modules to add functionality
beyond the core features.

## Date 18th Aug 2010

MultiLine Statements
total = item_one + \
    item_two + \
    item_three

Quotation in Python:
Python accepts single ('), double (") and triple (''' or """) quotes to denote string literals, as long as the same type of quote starts and ends the string.
The triple quotes can be used to span the string across multiple lines.
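For example (Python 3 syntax):

```python
word = 'spam'                       # single quotes
sentence = "It's easy"              # double quotes can embed an apostrophe
paragraph = """This string
spans multiple lines."""            # triple quotes keep the embedded newline

print(sentence)
print(paragraph)
```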

Waiting for the User
raw_input("\n\nPress the enter key to exit.") pauses the script until Enter is pressed; each "\n" escape prints a blank line before the prompt.

Multiple Statements on a Single Line
The semicolon ( ; ) allows multiple statements on a single line, provided that neither statement starts a new code block:
import sys; x = 'foo'; sys.stdout.write(x + '\n')

Variables
Variables are nothing but reserved memory locations to store values. This means that when you create a variable you reserve some space in memory.
Python variables do not have to be explicitly declared to reserve memory space. The declaration happens automatically when you assign a value to a variable.

Standard Datatypes
Numbers,String,List,Tuple,Dictionary

Python Strings
Python allows for either pairs of single or double quotes. Subsets of strings can be taken using the slice operator ( [ ] and [ : ] ) with indexes starting at 0 in the beginning of the string and working their way from -1 at the end
The plus ( + ) sign is the string concatenation operator, and the asterisk ( * ) is the repetition operator.

Eg:
str = 'Hello World!'
print str[0]       # Prints first character of the string
print str[2:5]     # Prints characters 3 through 5
print str[2:]      # Prints string starting from 3rd character
print str * 2      # Prints string two times
print str + "TEST" # Prints concatenated string

Python Lists:
List can be of different datatypes, accessed using the slice operator ( [ ] and [ : ] ) with indexes starting at 0 in the beginning of the list and working their way to end-1. (+) concatenation Operator (*) is the Repetition Operator

list = [ 'abcd', 786 , 2.23, 'john', 70.2 ]
tinylist = [123, 'john']

print list          # Prints complete list
print list[0]       # Prints first element of the list
print list[1:3]     # Prints 2nd and 3rd elements
print list[2:]      # Prints elements starting from 3rd element
print tinylist * 2  # Prints list two times
print list + tinylist # Prints concatenated list

Python Tuples:(read-only)
The main differences between lists and tuples are: Lists are enclosed in brackets ( [ ] ), and their elements and size can be changed, while tuples are enclosed in parentheses ( ( ) ) and cannot be updated. Tuples can be thought of as read-only lists.
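The difference is easy to demonstrate (Python 3 sketch): updating a list element works, while updating a tuple element raises a TypeError:

```python
coords = [10, 20]        # list: brackets, mutable
point = (10, 20)         # tuple: parentheses, read-only

coords[0] = 99           # fine
try:
    point[0] = 99        # not allowed for a tuple
except TypeError:
    print("tuples cannot be updated")
```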

Python Dictionary:
Dictionaries are enclosed by curly braces ( { } ) and values can be assigned using square braces ( [] ).
Eg:
dict = {}
dict['one'] = "This is one"
dict[2]     = "This is two"
tinydict = {'name': 'john','code':6734, 'dept': 'sales'}
print dict['one']       # Prints value for 'one' key
print dict[2]           # Prints value for 2 key
print tinydict          # Prints complete dictionary
print tinydict.keys()   # Prints all the keys
print tinydict.values() # Prints all the values

DataType Conversion:
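The usual built-in conversion functions look like this (Python 3 sketch):

```python
print(int("42"))          # string -> integer
print(float("3.5"))       # string -> float
print(str(42))            # number -> string
print(list("abc"))        # string -> list of characters
print(tuple([1, 2]))      # list -> tuple
print(dict([("a", 1)]))   # list of key/value pairs -> dictionary
```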

Operators:
 Arithmetic Operators
 Comparison Operators
 Logical (or Relational) Operators
 Assignment Operators
 Conditional (or ternary) Operators
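One quick example of each family (Python 3 sketch; the conditional operator here is the `x if cond else y` expression form):

```python
a, b = 10, 3
total = a + b                 # arithmetic
print(total, a % b, a ** b)   # 13 1 1000
print(a > b, a == b)          # comparison: True False
print(a > 5 and b > 5)        # logical: False
a += 1                        # assignment operator
size = "big" if a > 10 else "small"   # conditional (ternary)
print(size)                   # big
```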

#Python - IF...ELIF...ELSE Statement

if expression:
   statement(s)

Eg:
if var1:
   print "1 - Got a true expression value"
   print var1

#The else statement

if expression:
   statement(s)
else:
   statement(s)

#The elif statement

if expression1:
   statement(s)
elif expression2:
   statement(s)
elif expression3:
   statement(s)
else:
   statement(s)

#The Nested if...elif...else Construct

if expression1:
   statement(s)
   if expression2:
      statement(s)
   elif expression3:
      statement(s)
   else:
      statement(s)
elif expression4:
   statement(s)
else:
   statement(s)

#While Loops

while expression:
   statement(s)

Eg:
while True:
   statement(s)

count = 0
while (count < 9):
   print 'The count is:', count
   count = count + 1

#The For Loops
for iterating_var in sequence:
   statements(s)

Eg:
for letter in 'Python':     # First Example
   print 'Current Letter :', letter
o/p: P y t h o n
fruits = ['banana', 'apple',  'mango']
for fruit in fruits:        # Second Example
   print 'Current fruit :', fruit
o/p: banana apple mango

Examples:
Factorial function in C:
int factorial(int x) {     
    if (x == 0) {                     
        return 1;                  
    } else {
        return x * factorial(x-1);
    }
}

Factorial function in Python:
def factorial(x):
    if x == 0:
        return 1
    else:
        return x * factorial(x-1)
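The Python version can be exercised directly; a quick sanity check:

```python
def factorial(x):
    if x == 0:
        return 1
    else:
        return x * factorial(x - 1)

print(factorial(0))   # 1
print(factorial(5))   # 120
```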


###############################################
Linux Journal "Python Programming For Beginners"
###############################################

Despite what assembly code and C coders might tell us, high-level languages do have their place in every programmer's toolbox, and some of them are much more than a computer-science curiosity. Out of the many high-level languages we can choose from today, Python seems to be the most interesting for those who want to learn something new and do real work at the same time. Its no-nonsense implementation of object-oriented programming and its clean and easy-to-understand syntax make it a language that is fun to learn and use, which is not something we can say about most other languages.

In this tutorial, you will learn how to write applications that use command-line options, read and write to pipes, access environment variables, handle interrupts, read from and write to files, create temporary files and write to system logs. In other words, you will find recipes for writing real applications instead of the old boring Hello, World! stuff.
Getting Started

To begin, if you have not installed the Python interpreter on your system, now is the time. To make that step easier, install the latest Python distribution from a package compatible with your Linux distribution; rpm, deb and tgz packages are available on your Linux CD-ROM or on-line. If you follow standard installation procedures, you should not have any problems.

Next, read the excellent Python Tutorial written by Guido van Rossum, creator of the Python programming language. This tutorial is part of the official Python documentation, and you can find it in either the /usr/doc/python-docs-1.5.2 or /usr/local/doc/python-docs-1.5.2 catalog. It may be delivered in the raw LaTeX format, which must be processed first; if you don't know how to do this, go to http://www.python.org/doc/ to download it in an alternative format.

I also recommend that you have the Python Library Reference handy; you might want it when the explanations given here do not meet your needs. You can find it in the same places as the Python Tutorial.

Creating scripts can be done using your favorite text editor as long as it saves text in plain ASCII format and does not automatically insert line breaks when the line is longer than the width of the editor's window.

Always begin your scripts with either

#! /usr/local/bin/python

or

#! /usr/bin/python

If the access path to the python binary on your system is different, change that line, leaving the first two characters (#!) intact. Be sure this line is truly the first line in your script, not just the first non-blank line—it will save you a lot of frustration.

Use chmod to set the file permissions on your script to make it executable. If the script is for you alone, type chmod 0700 scriptfilename.py; if you want to share it with others in your group but not let them edit it, use 0750 as the chmod value; if you want to give access to everyone else, use the value 0755. For help with the chmod command, type man chmod.
Reading Command-Line Options and Arguments

Command-line options and arguments come in handy when we want to tell our scripts how to behave or pass some arguments (file names, directory names, user names, etc.) to them. All programs can read these options and arguments if they want, and your Python scripts are no different.

Implementing appropriate handlers boils down to reading the argv list and checking for the options and arguments you want your script to recognize. There are a few ways to do this. Listing 1 is a simple option handler that recognizes common -h, -help and --help options, and when they are found, it exits immediately after displaying the help message.

Listing 1. Help Option Handler

#! /usr/local/bin/python
import sys
if '-h' in sys.argv or '-help' in sys.argv or '--help' in sys.argv:
    print '''
help.py--does nothing useful (yet)
options: -h, -help, or --help-display this help
Copyright (c) Jacek Artymiak, 2000 '''
    sys.exit(0)
else:
    print 'I don\'t recognize this option'
    sys.exit(0)

Copy and save this script as help.py, make it executable with the chmod 0755 help.py command, and run it several times with different options, both recognized by the handler and not, e.g., ./help.py -h or ./help.py -o. If the option handler does recognize one of the options, you will see this message:

help.py—does nothing useful (yet)
options: -h, -help, or --help—display this help
Copyright (c) Jacek Artymiak, 2000

If you invoke help.py with an option it does not recognize, or without any options at all, it will display the “I don't recognize this option” message.

Note that we need to import the sys module before we can check the contents of the argv list and before we can call the exit function. The sys.exit statement is a safety feature which prevents further program execution when one of the help options is found inside the argv list. This ensures that users don't do something dangerous before reading the help messages (for which they wouldn't have a need otherwise).

The simple help option handler described above works quite well and you can duplicate and change it to recognize additional options, but that is not the most efficient way to recognize multiple options with or without arguments. The “proper” way to do it is to use the getopt module, which converts options and arguments into a nice list of tuples. Listing 2 shows how it works.

Listing 2. Option Handler, options.py

#! /usr/local/bin/python
import sys, getopt, string
def help_message():
    print '''options.py -- uses getopt to recognize options
Options: -h      -- displays this help message
       -a      -- expects an argument
       --file= -- expects an argument
       --view  -- doesn't necessarily expect an argument
       --version -- displays Python version'''
    sys.exit(0)
try:
    options, xarguments = getopt.getopt(sys.argv[1:],
        'ha', ['file=', 'view', 'version', 'python-version'])
except getopt.error:
    print 'Error: You tried to use an unknown option or the \
argument for an option that requires it was missing. Try \
`options.py -h\' for more information.'
    sys.exit(0)
for a in options[:]:
    if a[0] == '-h':
        help_message()
for a in options[:]:
    if a[0] == '-a' and a[1] != '':
        print a[0]+' = '+a[1]
        options.remove(a)
        break
    elif a[0] == '-a' and a[1] == '':
        print '-a expects an argument'
        sys.exit(0)
for a in options[:]:
    if a[0] == '--file' and a[1] != '':
        print a[0]+' = '+a[1]
        options.remove(a)
        break
    elif a[0] == '--file' and a[1] == '':
        print '--file expects an argument'
        sys.exit(0)
for a in options[:]:
    if a[0] == '--view' and a[1] != '':
        print a[0]+' = '+a[1]
        options.remove(a)
        break
    elif a[0] == '--view' and a[1] == '':
        print '--view doesn\'t necessarily expect an argument...'
        options.remove(a)
        sys.exit(0)
for a in options[:]:
    if a[0] == '--version':
        print 'options version 0.0.001'
        sys.exit(0)
for a in options[:]:
    if a[0] == '--python-version':
        print 'Python '+sys.version
        sys.exit(0)

Copy this script, save it as options.py and make it executable. As you can see, it uses two modules: sys and getopt which are imported right at the beginning. Then we define a simple function that displays the help message whenever something goes wrong.

The actual processing of command-line arguments begins with the try statement, where we are testing the list of command-line options and arguments (sys.argv) for errors defined as unknown options or missing arguments; if they are detected, the script will display an error message and exit immediately (see the except statement group). When no errors have been detected, our script splits the list of options and their arguments into tuples in the options list and begins parsing them by executing a series of loops, each searching for one option and its expected arguments.

The getopt.getopt function generates two lists in our sample script: options which contains options and their arguments; and xarguments which contains arguments not related to any of the options. We can safely ignore them in most cases.

To recognize short (one-letter such as -h) and long (those prefixed with --) options, getopt.getopt uses two separate arguments. The list of short options contains all of them listed in a single string, e.g., getopt.getopt(sys.argv, 'ahoinmdwq'). It is possible to specify, in that string, options that absolutely require an argument to follow them immediately (e.g., -vfilename) or after a space (e.g., -v filename). This is done by inserting a colon (:) after the option, like this: getopt.getopt(sys.argv, 'ahoiv:emwn'). However, this creates a silly problem that may cause some confusion and unnecessarily waste your time; if the user forgets to specify the argument for the option that requires it, the option that follows it becomes its argument. Consider this example:

script.py -v -h

If you put v: in the short option string argument of the getopt.getopt function, option -h will be treated as the argument of option -v. This is a nuisance and makes parsing of the list of tuples option, argument much more difficult. The solution to this problem is simple: don't use the colon, but check the second item of the tuple that contains the option (first item of the analyzed tuple) which requires an argument. If it's empty, report an error, like the -a option handler.

Long options prefixed with -- must be listed as a separate argument to the getopt.getopt, e.g., getopt.getopt(sys.argv, 'ah', ['view', 'file=']). They can be serviced by the same handler as short options.
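To make the structure of those tuples concrete, here is what getopt.getopt returns for a hypothetical argument list (option names invented for illustration; 'a:' here uses the colon form, so -a consumes the next word):

```python
import getopt

# simulate: script.py -a foo --file=data.txt leftover
args = ['-a', 'foo', '--file=data.txt', 'leftover']
options, xarguments = getopt.getopt(args, 'ha:', ['file=', 'view'])

print(options)      # [('-a', 'foo'), ('--file', 'data.txt')]
print(xarguments)   # ['leftover']
```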

What you do after locating options given by the user is up to you. Listing 2 can be used as a template for your scripts.

Handling Interrupts

A properly written application, especially one that opens files for writing or creates temporary files, ought to implement interrupt handlers to ensure that no files are left corrupted or undeleted when the user or the system decides to stop the execution of our script.

Signals, like those sent when you press CTRL-C during the execution of your script, are caught by handlers which may ignore them, allow default handlers to be executed or perform custom actions. Python implements some default handlers, but you can override them with your own code using the signal module.

Listing 3. Signal Traps

#! /usr/local/bin/python
import signal
import sys
def signal_handler(signal, frame):
        print 'You pressed Ctrl+C!'
        sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)
print 'Press Ctrl+C'
while 1:
        continue

The function that lets us trap signals is signal.signal. Its two arguments are the number of the signal you want to trap and the name of the signal handler. Listing 3 is a simple script that captures the SIGINT (signal numbers have their own symbolic equivalents) signal sent to it when you press CTRL-C.

The SIGINT signal is not the only one you can capture. If you want to capture additional signals, add more signal.signal calls to handle them, changing the signal number (the signal.SIGxxx constant) and the name of the handler (optional; you can use the same handler with more than one signal). To see what signals are available in Linux, type kill -l on the command line.

Listing 4. Ignoring Signal

#! /usr/local/bin/python
import signal
signal.signal(signal.SIGINT, signal.SIG_IGN)
print 'Your script can\'t be stopped with Ctrl+C'
while 1:
      continue

Signals can be ignored, which is useful if you want to prevent some of them from disturbing the execution of your script. Listing 4 shows the way to do so (be careful; this script can't be stopped with CTRL-C).

Another signal worth remembering is SIGALRM. Setting up a handler for that signal allows you to stop the execution of your script after the given number of seconds. This is done with signal.alarm as shown in Listing 5.

Listing 5. Using Alarm Signal

#! /usr/local/bin/python
import signal
import sys
def alarm_handler(signal, frame):
    print 'BOOM!'
    sys.exit(0)
signal.signal(signal.SIGALRM, alarm_handler)
signal.alarm(5) # generate SIGALRM after 5 secs
n = 0
while 1:
    print n
    n = n+1

Working with Files

Listing 6. Opening a Read File

#! /usr/local/bin/python
import sys
try:
        fi = open('sample_+_file', 'r')
except IOError:
        print 'Can\'t open file for reading.'
        sys.exit(0)

Many scripts need to work with files. Remember that before you can read or write to a file, it must exist and be open. Listing 6 is an example of a script that opens a file for reading. Writing to a file requires only a small change (see Listing 7).

Listing 7. Opening a Write File

#! /usr/local/bin/python
import os
import sys
try:
        fi = open('sample_+_file', 'w')
except IOError:
        print 'Can\'t open file for writing.'
        sys.exit(0)

As you can see, the first of the two scripts shown in Listings 6 and 7 fails to open a file for reading if the file doesn't exist. This is correct behavior. The second script tries to open a file for writing: if the file exists, it is truncated (i.e., its contents are deleted); if it doesn't exist, it is created. This may not always be the desired behavior. When you want to append some data at the end of a file, you ought to open it for writing, while preserving its original contents. To do that, change the second argument of the open function from 'w' to 'a'.
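A short sketch of the difference between 'w' and 'a' (hypothetical file name; Python 3 syntax):

```python
import os

fo = open('append_demo.txt', 'w')     # 'w' truncates (or creates) the file
fo.write('first line\n')
fo.close()

fo = open('append_demo.txt', 'a')     # 'a' keeps the existing contents
fo.write('second line\n')
fo.close()

fi = open('append_demo.txt', 'r')
data = fi.read()
fi.close()
os.remove('append_demo.txt')          # clean up the demo file
print(data)
```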

Once the file is open, we can read or write to it using these methods:

    * read(n): reads at most n bytes from a file (if you omit the number of bytes, the entire file will be read), e.g., fi.read(200), which reads up to 200 bytes.
    * readline: reads one line at a time, e.g., fi.readline().
    * readlines: reads all lines from a file, e.g., fi.readlines().
    * write(string): writes string to a file, e.g., fo.write('alabama').
    * writelines(list): writes a list of strings to a file, e.g., fo.writelines(['alaska\n', 'california\n', 'nevada\n'])

When you want to close a file, use its close method, e.g., fo.close().
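In modern Python the preferred idiom is the with statement, which closes the file automatically even if an exception occurs inside the block; a sketch (hypothetical file name):

```python
import os

with open('with_demo.txt', 'w') as fo:
    fo.write('alabama\n')
# leaving the with block closes the file; no explicit close() needed
print(fo.closed)   # True
os.remove('with_demo.txt')
```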

 Listing 8. Creating Temporary Files

#! /usr/local/bin/python
import os
import tempfile
import sys
try:
    t_path = tempfile.mktemp()
    ttt = os.open(t_path, os.O_CREAT | os.O_RDWR)
    # O_RDWR allows read & write
except IOError:
    print 'Error:  Couldn\'t create temp file.'
    sys.exit(0)
# always close and remove the temp file
# before you exit from script
os.close(ttt)
os.remove(t_path)

If you want your script to create temporary files, use the tempfile module. It simplifies the task of creating temporary files by automatically creating unique file names based on templates defined in variables tempdir and template. It doesn't create or delete temporary files itself, but you can accomplish this using a method similar to the one used in Listing 8.

Note that you need to use both the os.O_CREAT and os.O_RDWR flags to tell the os.open function to create a temporary file open for both reading and writing. Also, remember to close and remove all temporary files created before exiting a script. You will find more information on the functions, constants and variables used in that example in the os, posix and tempfile sections of the Python Library Reference Manual.
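A caveat for later Python versions: mktemp() followed by os.open() leaves a window in which another process could grab the same name. Modern code uses tempfile.mkstemp(), which creates and opens the file in one atomic step; a sketch:

```python
import os
import tempfile

fd, t_path = tempfile.mkstemp()      # atomically creates and opens the file
os.write(fd, b'scratch data')
os.close(fd)
existed = os.path.exists(t_path)
os.remove(t_path)                    # always clean up before exiting
print(existed)   # True
```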

Listing 9. Removing Temporary Files

#! /usr/local/bin/python
import os
import tempfile
import sys
import signal
t_path = ''
t_file = None
def cleanup(signal, frame):
    if t_path != '' and t_file != None:
        print 'Cleaning up temporary files ...'
        os.close(t_file)
        os.remove(t_path)
        print 'Done!'
        sys.exit(0)
    else:
        sys.exit(0)
signal.signal(signal.SIGHUP, cleanup)
signal.signal(signal.SIGINT, cleanup)
signal.signal(signal.SIGQUIT, cleanup)
signal.signal(signal.SIGTERM, cleanup)
try:
    t_path = tempfile.mktemp()
    t_file = os.open(t_path, os.O_CREAT | os.O_RDWR)
except IOError:
    print 'Error:  Couldn\'t create temp file.'
    sys.exit(0)
#
# always close and remove the temp file
# before you exit from script
cleanup(None, None)

It is a good idea to implement a single handler that will remove temporary files before exiting from the script, as in Listing 9.

Working with Pipes

Many command-line tools let us create pipes for processing data, and it is a good idea to consider implementing this functionality in your own scripts. Pipes allow us to read from the standard input and write to the standard output of our script, as well as read from the standard output and write to the standard input of other commands.

Everything we need to implement pipes in our scripts is stored in the os and sys modules. Let's teach our script to read data from its own standard input (represented by sys.stdin) and copy it, unchanged, to its own standard output (sys.stdout):

#! /usr/local/bin/python
import sys
sys.stdout.write(sys.stdin.read())

This works well, but doesn't allow us to modify the data appearing on the script's standard input. This can be achieved in several ways, depending on how much data you want to process at one time. Listing 10 reads one line at a time and inserts # at the beginning of each line.

Listing 10. Modifying Standard Input

#! /usr/local/bin/python
import sys
while 1:
    data = sys.stdin.readline()
    if data != '':
        # do some processing of the contents of
        # the data variable
        data = '#'+data
        # end of data processing module
        sys.stdout.write(data)
    else:
        sys.stdout.flush()
        break

If you use the read(n) method instead of readline, you can set the number of bytes to be read from the standard input. Listing 11 reads 256 bytes at a time.

Listing 11. Reading from Standard Input

#! /usr/local/bin/python
import sys
while 1:
    data = sys.stdin.read(256)
    if data != '':
        # do some processing of the contents of
        # the data variable
        data = '#'+data
        # end of data processing command group
        sys.stdout.write(data)
    else:
        sys.stdout.flush()
        break

A slightly different approach is needed when you want to read the whole file at one go. We use the sub function from the re module to perform a simple substitution. See Listing 12.

Listing 12. Reading the Entire File at Once

#! /usr/local/bin/python
import sys
import re
data = sys.stdin.readlines()
if data != '':
    # do some processing of the contents of the
    # data variable
    data = re.sub('[A-Z]', '=', str(data))
    # end of data processing module
    sys.stdout.write(data)
else:
    sys.stdout.flush()

That's about all the basic knowledge needed to work with the standard input and output of our script. However, Python can read the standard output of external pipes or write to their standard input. This time, we'll need to use the os module and its popen function.
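os.popen still exists, but in current Python the same pipe is usually built with the subprocess module; a sketch piping text through tr (assumes a Unix system with tr on the PATH):

```python
import subprocess

# equivalent to: printf -- '---\n' | tr '-' '+'
result = subprocess.run(['tr', '-', '+'], input='---\n',
                        capture_output=True, text=True)
print(result.stdout)   # +++
```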

Listing 13 writes one hundred lines of text, each containing the string ---, to the standard input of the pipe sed 's/-/+/g' > output. The data passed to the pipe is then processed by sed and ends up as one hundred lines of +++. You can read from a pipe, too. Listing 14 shows you how.

Listing 13. Writing to Standard Input

#! /usr/local/bin/python
import os
n = 100
try:
    po = os.popen('sed \'s/-/+/g\' > output', 'w')
except IOError:
    exit(0)
while n != 0:
    n = n-1
    po.write('---\n')

Listing 14. Reading from Pipe

#! /usr/local/bin/python
import os
try:
    po = os.popen('cat /usr/doc/FAQ/txt/Linux-FAQ | pr -2', 'r')
except IOError:
    exit(0)
print po.read()

Writing to the System Log

Listing 15. Writing to System Log

#! /usr/local/bin/python
import syslog
syslog.syslog('syslog junkie: the script has just got woken up')
# some code
for a in ['a', 'b', 'c']:
    b = 'syslog junkie: I found letter '+a
    syslog.syslog(b)
syslog.syslog('syslog junkie: the script goes to sleep now, bye, bye!')

If you develop applications that you want to keep an eye on and leave a trace of their activity in the system log in a way similar to many daemons running on a typical Linux system, you can do so with the syslog function located in the syslog module. To enable writing to system logs, import the syslog module and add calls to the syslog.syslog function at those points needing to be documented in the system log. See Listing 15 for an example.

Listing 16. Output Listing

Jan 20 00:35:28 localhost python: syslog junkie: the script has just got woken up
Jan 20 00:35:28 localhost python: syslog junkie: I found letter a
Jan 20 00:35:28 localhost python: syslog junkie: I found letter b
Jan 20 00:35:28 localhost python: syslog junkie: I found letter c
Jan 20 00:35:28 localhost python: syslog junkie: the script goes to sleep now, bye, bye!

To see the output from your script, open another X terminal window or switch to another console and type

tail -f /var/log/messages

to reveal what your script has just been doing. The output looks like Listing 16.

Remember that if you send the same message to the system log several times in a row, it will be buffered until a different one arrives in the system log buffer. It will appear there only once, and the next line in the system log will indicate how many times it was repeated. This bit of code,

#! /usr/local/bin/python
import syslog
# some code
for a in ['a', 'b', 'c']:
        syslog.syslog('Hello from Python!')

will generate the following results:

Jan 20 00:04:33 localhost python: Hello from Python!
Jan 20 00:04:49 localhost last message repeated 2 times

Don't treat the system log like a trash can where you can send any kind of garbage; write only the most important information to it.
Reading Environment Variables

Some scripts may need to access information stored in one or more environment variables. Their values at the time your script is executed are stored in the os.environ dictionary, available after you import the os module. Here is the script that prints out all the environment variables set at the time your script executed.

#! /usr/local/bin/python
import os
for a in os.environ.keys():
        print a, ' = ', os.environ[a]

If you are interested in checking for a particular value and using it in your own script, use this bit of code to get you started.

#! /usr/local/bin/python
import os
if os.environ['USER']:
        print 'Hello, '+os.environ['USER']

Listing 17. Modifying an Environment Variable

#! /usr/local/bin/python
import os
if os.environ['USER']:
    print 'USER was '+os.environ['USER']
    old_user = os.environ['USER']
    os.environ['USER'] = 'Jacek'
    print 'USER is now '+os.environ['USER']
    os.environ['USER'] = old_user
    print 'USER is '+os.environ['USER']+' back again.'

If you want to modify the value of a particular environment variable while your script is running, use Listing 17 as a guide.

What Next?

Now you know enough to write some well-behaved scripts that look and work like many other Linux commands. I encourage you to read the Python Library Reference and see what is possible using only the basic Python distribution. If the standard Python library is not enough for you, a visit to the official Python web site will reveal a wealth of possibilities and bags of useful code which you can use to learn and solve your programming problems.

## Python Commands (we can execute Linux commands from Python) ##

import os, commands
def dialog():
    status, output = commands.getstatusoutput('ls -l')
    print status, output
dialog()

##### Files and directories with Loops ##########

>>> import os.path
>>> print os.path.isfile("/etc/passwd")
True
>>> print os.path.exists("/etc/passwd")
True

#### Linux Journal ####

 Python is an extensible, high-level, interpreted, object-oriented programming language. Ready for use in the real world, it's also free.

To execute the standard hello program, enter the following at the command line:

$ python
Python 1.2 (Jun  3, 1995) [GCC 2.6.3]
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> print "hello, bruce"
hello, bruce
>>> [CONTROL]-D

The new version will identify who you are based on your user account in /etc/passwd.

1  #!/usr/local/bin/python
2
3  import posix
4  import string
5
6  uid = `posix.getuid()`
7  passwd = open('/etc/passwd')
8  for line in passwd.readlines():
9      rec = string.splitfields(line, ':')
10      if rec[2] == uid:
11          print `hello', rec[0],
12          print `mind if we call you bruce?'
13          break
14  else:
15      print "I can't find you in /etc/passwd"

A line-by-line explanation of the program is as follows:
    *1 --- Command interpreter to invoke
    *3-4 --- Import two standard Python modules, posix and string.
    *6 --- Get the user id using the posix module. The enclosing backticks (`) tell Python to assign this value as a string.
    *7 --- Open the /etc/passwd file in read mode.
    *8 --- Start a for loop, reading in all the lines of /etc/passwd. Compound statements, such as conditionals, have headers starting with a keyword (if, while, for, try) and end with a colon.
    *9 --- Each line in /etc/passwd is read and split into the list rec[] on colon (:) boundaries, using string.splitfields().
    *10 --- If rec[2] from /etc/passwd matches our call to posix.getuid(), we have identified the user. The first 3 fields of /etc/passwd are: rec[0] = name, rec[1] = password, and rec[2] = uid.
    *11-12 --- Print the user's account name to stdout. The trailing comma avoids the newline after the output.
    *13 --- Break the for loop.
    *14-15 --- Print message if we can't locate the user in /etc/passwd.

The observant reader will note that the control statements lack any form of BEGIN/END keywords or matching braces. This is because the indentation defines the way statements are grouped. Not only does this eliminate the need for braces, but it enforces a readable coding style.

Libraries

You might object that we did a lot of work in the program above just to demonstrate Python language features. A better method would be to use the pwd module from the standard Python library:
print 'hello', pwd.getpwuid(posix.getuid())[0]

This points out another nicety about Python that is critical for any new language's success: the robustness of its library.
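In modern Python the posix module's functions live in os, and with print as a function the same one-liner still works:

```python
import os
import pwd

# Look up the account name for the current user id.
print('hello', pwd.getpwuid(os.getuid())[0])
```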

Take the ftplib module for instance. If you wanted to write a Python script to automatically download the latest FAQ, you can simply use ftplib in the following example:

#!/usr/local/bin/python
from ftplib import FTP
ftp = FTP('ftp.python.org')     # connect to host
ftp.login()                     # login anonymous
ftp.cwd('pub/python/doc')       # change directory
ftp.retrlines('LIST')           # list python/doc
F = open('python.FAQ', 'w')     # file: python.FAQ
ftp.retrbinary('RETR FAQ', F.write, 1024)
ftp.quit()

Python has numerous features which make programming fun and restore your perspective of the design objectives. The language encourages you to explore its features by writing experimental functions during program development. Several notable Python features:

    *Automatic memory management. No malloc/free or new/delete is necessary—when objects become unreachable they are garbage-collected.
    *Support for manipulating lists, tuples, and arrays
    *Associative arrays, referred to as “Dictionaries” in Python
    *Modules to encourage reusability. Python comes with a large set of standard modules that may be used as the basis for learning to program in Python.
    *Exception handling
    *Classes
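Two of these features, dictionaries and exception handling, in one tiny sketch:

```python
def lookup(table, key):
    """Return table[key], or None if the key is missing."""
    try:
        return table[key]
    except KeyError:
        return None

colors = {'sky': 'blue', 'grass': 'green'}   # a dictionary (associative array)
print(lookup(colors, 'sky'))    # blue
print(lookup(colors, 'moon'))   # None
```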


### The Python HTMLgen Module ##

We need to download the package, install it on the box, and import it in Python:

bash$ export PYTHONPATH=/local/HTMLgen:$PYTHONPATH
bash$ python
 >>> import HTMLgen
 >>> doc = HTMLgen.SimpleDocument(title="Hello")
 >>> doc.append(HTMLgen.Heading(1, "Hello World"))
 >>> print doc

Finally, I print the doc object, which dumps HTML like the following to standard output:

<HTML>
<HEAD>
  <TITLE>Hello</TITLE>
</HEAD>
<BODY>
<H1>Hello World</H1>
</BODY>
</HTML>
HTMLgen is a very good tool for generating HTML tables and lists. The data in the table below comes from the Linux /proc/interrupts file, which details the IRQ interrupts for your Linux PC.

The Python script reads the contents of the /proc/interrupts file and copies the data into an HTML table.
# Code
import regsub, string, HTMLgen
                      # New HTML document.
doc = HTMLgen.SimpleDocument(title='Interrupts')
                      # New HTML table.
table = HTMLgen.Table(
    tabletitle='Interrupts',
    border=2, width=100, cell_align="right",
    heading=[ "Description", "IRQ", "Count" ])
table.body = []       # Empty list.
doc.append(table)     # Add table to document.
interrupts_file = open('/proc/interrupts')
for line in interrupts_file.readlines():
    data = regsub.split(string.strip(line), '[ :+]+')
    table.body.append(
        [ HTMLgen.Text(data[2]), data[0], data[1] ])
doc.write("interrupts.html")


When creating the table object, I set some optional attributes by supplying them as named arguments. The final heading argument sets the list of column headings that HTMLgen will use.
Once I've set up my table, I open the /proc/interrupts file and use the readlines method to read in its entire contents. I use a for loop to step through the lines returned and turn them into table rows. Inside the loop, the string and regular expressions functions are used to strip off leading spaces and split up each line into a list of three data values based on space and colon (:) separators:

data=regsub.split(string.strip(line),'[ :+]+')

Elements of the data list are processed to form a table row by reordering them into a new three-element list consisting of name, number and total calls:

[ HTMLgen.Text(data[2]), data[0], data[1] ]

The first list element, data[2], is the interrupt name. The interrupt name is a non-numeric field, so I've taken the precaution of escaping any characters that might be special to HTML by passing it through the HTMLgen Text filter. The resulting list is made into a row of the table by appending the list to the table's body:

table.body.append(
        [ HTMLgen.Text(data[2]), data[0], data[1] ])
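The regsub and string modules used above are Python 1.x relics; in current Python the same strip-and-split is done with str.strip and re.split. A sketch, using a made-up /proc/interrupts line:

```python
import re

line = '  0:    8024219   timer\n'
# Split on runs of spaces, colons and plus signs, as the article's pattern does.
data = re.split(r'[ :+]+', line.strip())
print(data)   # ['0', '8024219', 'timer']
```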

## Bar Charts from HTML Tables ##

#!/usr/bin/python
import string, os, HTMLgen, barchart
inpipe = os.popen("ps vax", "r")
colnames = string.split(inpipe.readline())
chart = barchart.StackedBarChart()
chart.title = "Text/Data Memory per Process"
chart.datalist = barchart.DataList()
chart.datalist.segment_names = colnames[5:7]
data = chart.datalist
for line in inpipe.readlines():
    cols    = string.split(line)
    barname = string.join(cols[10:], " ")
    tsize   = string.atoi(cols[5])
    dsize   = string.atoi(cols[6])
    data.load_tuple(( barname, tsize, dsize ))
data.sort(key=colnames[5], direction="decreasing")
doc = HTMLgen.SimpleDocument(title='Memory')
doc.append(chart)
doc.write("psv.html")

The original output from ps v looks something like the following:

PID TTY STAT TIME PAGEIN TSIZ DSIZ RSS LIM %MEM COMMA
 555 p1 S 0:01 232 237 1166 664 xx 2.1 -tcsh
1249 p2 S 0:00 424 514 2613 1676 xx 5.4 xv ps
 ...

I use the Python operating system module's popen function to return a file input pipe for the output stream from the command:

inpipe = os.popen("ps vax", "r")

I then read in the first line from the input pipe and split it into a list of column names.

colnames = string.split(inpipe.readline())

Now, I create the chart object, and the chart object's datalist object:

chart = barchart.StackedBarChart()
...
chart.datalist = barchart.DataList()

Datalists can have multiple data segments per bar, which results in a stacked bar chart.
I need to tell the datalist object how many data segments are present by setting the list of segment_names. I decided the bars on my chart will have two segments, one for TSIZ (program text memory size) and one for DSIZ (program data memory size). To accomplish this, I need to copy the two column names from colnames into segment_names. Because lists in Python are numbered from zero, the two colnames I'm interested in are columns 5 (TSIZ) and 6 (DSIZ). I can extract them from the colnames list with a single slicing statement:

chart.datalist.segment_names = colnames[5:7]
data = chart.datalist

The [5:7] notation is a slicing notation. In Python, you can slice single items and ranges of items out of strings, lists and other sequence data types. The notation [low:high] means slice out a new list from low to high minus 1.
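For example, slicing the two memory columns out of the ps header row:

```python
colnames = ['PID', 'TTY', 'STAT', 'TIME', 'PAGEIN',
            'TSIZ', 'DSIZ', 'RSS', 'LIM', '%MEM', 'COMMAND']
# [5:7] takes items 5 and 6; the high index 7 is excluded.
print(colnames[5:7])   # ['TSIZ', 'DSIZ']
```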

After initializing the chart, I use a for loop to read the remaining lines from the ps output pipe. I extract the columns I need by using string.split(line) to break the line into a list of columns. I extract the text of each command by taking all the words from column 10 onward and joining them into a new barname string:

barname = string.join(cols[10:], " " )

I use the string module's atoi function to convert the ASCII strings in the numeric fields to integers. The last statement in the loop assembles the data into a tuple:

( barname, tsize, dsize )

A tuple is a Python structure much like a list, except that a tuple is immutable—you cannot insert or delete elements from a tuple. Although the two are similar, their differences lead to quite different implementation efficiencies. Python has both a tuple and a list, because this allows the programmer to choose the one most appropriate to the situation. Many features of Python and its modules are designed to be high-level interfaces to services that are then implemented efficiently in compiled languages such as C. This allows Python to be used for computer graphics programming using OpenGL and for numerical programming using fast numerical libraries.
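The difference in one short sketch:

```python
row = ('tcsh', 237, 1166)    # a tuple: immutable
cols = ['tcsh', 237, 1166]   # a list: mutable

cols[1] = 240                # fine: lists support item assignment
try:
    row[1] = 240             # tuples do not
except TypeError as exc:
    print('tuple rejected assignment:', exc)
```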

Back to the example. The last statement in the loop inserts the tuple into the chart's datalist.

data.load_tuple(( barname, tsize, dsize ))

When the last line is processed, the loop terminates and I sort the data in decreasing order of TSIZ:

data.sort(key=colnames[5], direction="decreasing")

After that, I create the final document and save it to a file.

doc = HTMLgen.SimpleDocument(title='Memory')
doc.append(chart)
doc.write("psv.html")

#Python Update
The really significant new item in 1.3 was the addition of keyword arguments to functions, similar to Modula-3's. For example, if we have the function definition:

def curse(subject="seven large chickens",
          verb="redecorate",
          object="rumpus room"):
    print "May", subject, verb, "your", object

then the following calls are all legal:

curse()
curse('a spaniel', 'pour yogurt on', 'hamburger')
curse(object='garage')
curse('the silent majority', object='Honda')
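The same definition works in Python 3 once print becomes a function; returning the string instead makes the effect easy to check:

```python
def curse(subject="seven large chickens",
          verb="redecorate",
          object="rumpus room"):
    return " ".join(["May", subject, verb, "your", object])

print(curse())                 # May seven large chickens redecorate your rumpus room
print(curse(object="garage"))  # May seven large chickens redecorate your garage
```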

An experimental feature was included in 1.4 and caused quite a bit of controversy: private data belonging to an instance of a class is a little more private. An example will help to explain the effect of the change. Consider the following class:

class A:
    def __init__(self):
        self.__value=0
    def get(self): return self.__value
    def set(self, newval): self.__value=newval

Python doesn't support truly private data in classes, except by convention. The usual convention is that private variables have names starting with at least one underscore. However, users of a class can disregard this and access the "private" value anyway. For example:

>>> instance=A()
>>> dir(instance)  # List all the attributes of the instance
['__value']
>>> instance.get()
0
>>> instance.__value=5
>>> instance.get()
5
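What the 1.4 change introduced (and what Python still does today) is name mangling: inside the class body, a double-underscore attribute like __value is rewritten to _A__value, so assigning instance.__value from outside creates a new, unrelated attribute. A sketch of the post-1.4 behaviour:

```python
class A:
    def __init__(self):
        self.__value = 0          # stored as _A__value due to name mangling
    def get(self):
        return self.__value       # compiled to self._A__value inside the class

instance = A()
instance.__value = 5              # creates a NEW '__value' attribute, not the mangled one
print(instance.get())             # 0  (the internal _A__value is untouched)
print(instance._A__value)         # 0  (still reachable, just harder to hit by accident)
print(instance.__value)           # 5  (the separate attribute created above)
```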


Good-to-Know

What is Memcached?

Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
Memcached is simple yet powerful. Its simple design promotes quick deployment, ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.
Memcached users: Twitter, YouTube, Wikipedia
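The usual access pattern is cache-aside: check the cache, fall back to the database on a miss, then populate the cache. A sketch with a plain dict standing in for the memcached client (a real client library exposes the same get/set shape; the query function here is made up):

```python
cache = {}  # stands in for a memcached client in this sketch

def query_database(user_id):
    # Pretend this is an expensive SQL query.
    return {'id': user_id, 'name': 'user%d' % user_id}

def get_user(user_id):
    key = 'user:%d' % user_id
    value = cache.get(key)              # 1. try the cache first
    if value is None:                   # 2. cache miss: hit the database
        value = query_database(user_id)
        cache[key] = value              # 3. populate the cache for next time
    return value

print(get_user(7))   # first call queries the "database"
print(get_user(7))   # second call is served from the cache
```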



What is Nginx

Nginx is a free, open-source, high-performance HTTP server and reverse proxy, as well as an IMAP/POP3 proxy server. Igor Sysoev started development of Nginx in 2002, with the first public release in 2004. Nginx now hosts nearly 6.55% (13.5M) of all domains worldwide.
Nginx is known for its high performance, stability, rich feature set, simple configuration, and low resource consumption.
Nginx is one of a handful of servers written to address the C10K problem. Unlike traditional servers, Nginx doesn't rely on threads to handle requests. Instead it uses a much more scalable event-driven (asynchronous) architecture. This architecture uses small, but more importantly, predictable amounts of memory under load.
Even if you don't expect to handle thousands of simultaneous requests, you can still benefit from Nginx's high-performance and small memory footprint. Nginx scales in all directions: from the smallest VPS all the way up to clusters of servers.
Nginx powers several high-visibility sites, such as WordPress, Hulu, Github, Ohloh, SourceForge and TorrentReactor.

6 Ways to kill your Servers

Learning how to scale isn’t easy without any prior experience. Nowadays you have plenty of websites like highscalability.com to get some inspiration, but unfortunately there is no solution that fits all websites and needs. You still have to think on your own to find a concept that works for your requirements. So did I.
A few years ago, my bosses came to me and said “We’ve got a new project for you. It’s the relaunch of a website that already has 1 million users a month. You have to build the website and make sure we’ll be able to grow afterwards”. I was already an experienced coder, but not in these dimensions, so I had to start learning how to scale – the hard way.
The software behind the website was a PHP content management system, based on Smarty and MySQL. The first task was finding a proper hosting company who had the experience and would also manage the servers for us. After some research we found one, told them our requirements and ordered the suggested setup:
  • LoadBalancer (+Fallback)
  • 2 Webservers
  • Mysql Server (+Fallback)
  • development machine
They said that’s gonna be all we need – and we believed it. What we got was:
  • Loadbalancer (single core, 1GB RAM, Pound)
  • 2 Webservers (Dual core, 4GB RAM, Apache)
  • MySQL Server (Quad core, 8GB RAM)
  • Dev (single core, 1GB RAM)
The setup was very basic without any further optimization. To synchronize the files (php+media files) they installed DRBD in active-active configuration.
Eventually the relaunch came – of course we were all excited. Very early in the morning we switched the domains to the new IPs, started our monitoring scripts and stared at the screens. We had almost instantly traffic on the machines and everything seemed to work pretty fine. The pages loaded quickly, MySQL was serving lots of queries and we were all happy.
Then, suddenly, our telephones started to ring: “We can’t access our website, what’s going on?”. We looked at our monitoring software and indeed – the servers were frozen and the site offline! Of course, the first thing we did was call our hoster: “Hey, all our servers are dead. What’s going on?”. They promised to check the machines and call back immediately afterwards. The call came: “Well,… erm… your filesystem is completely fubar. What did you do? It’s totally screwed”. They stopped the loadbalancers and told me to have a look at one of the webservers. Looking at the index.php file, I was shocked. It contained weird fragments of C code, error messages and something that looked like log files. After some further investigation we found out that DRBD was the cause of this mess.

Lesson #1 learned

Put Smarty compile and template caches on an active-active DRBD cluster with high load and your servers will DIE!
While our hoster was fixing the webservers I rewrote some parts of the CMS to store the Smarty cache files on the servers local filesystems. Issue found & fixed. We went back online! Hurray!!!
Now it was early afternoon. The website usually reaches its peak in the late afternoon until early evening. At night the traffic goes back to almost none. We kept staring at the monitoring software and we were all sweating. The website was loading but the later it got, the higher the system load and the slower the responses. I increased the lifetime of the Smarty template caches and hoped it would do the trick – it didn’t! Very soon the servers started to give timeouts, white pages and error messages. The two machines couldn’t handle the load.
Our customer got a bit nervous at the same time, but he said: Ok, relaunches usually cause some issues. As long as you fix it quickly, it will be fine!
We needed a plan to reduce the load and discussed the issue with our hoster. One of their administrators came up with a good idea: “Guys, your servers are currently running on a pretty common Apache+mod_php setup. How about switching to an alternative webserver like Lighttpd? It’s a fairly small project, but even Wikipedia is using it”. We agreed.

Lesson #2 learned

Put an out-of-the-box webserver configuration on your machines, do not optimize it at all and your servers will DIE!
The administrator did his best and reconfigured both webservers as quickly as he could. He threw away the Apache configuration and switched to Lighttpd+FastCGI+Xcache. When we went back online later, we could hardly stand the pressure anymore. How long would the servers last this time?
The servers did surprisingly well. The load was MUCH lower than before and the average response time was good. After this huge relief we went home and got some sleep. It was already late and we came to the conclusion there was nothing left we could do.
The next days the website was doing rather well, but at peak times it was still close to crash. We spotted MySQL as the bottleneck and called our hoster again. They suggested a MySQL Master-Slave replication with a MySQL slave on each webserver.

Lesson #3 learned

Even a powerful database server has its limits and when you reach them – your servers will DIE!
In this case the database became so slow at some point that the incoming and queued network connections killed our webservers – again. Unfortunately this issue wasn’t easy to fix. The content management system was pretty simple in this regard and there was no built-in support for separating reading and writing SQL queries. It took a while to rewrite everything, but the result was astonishing and worth every minute of lost sleep.
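The rewrite boils down to routing queries by type: writes always go to the master, reads are spread across the slaves. A hypothetical sketch of such a router (plain strings stand in for real connection objects):

```python
import random

class RoutedDB:
    """Route reads to slaves and writes to the master (hypothetical sketch)."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves

    def pick_connection(self, sql):
        # Read-only statements can be served by any replica.
        if sql.lstrip().upper().startswith(('SELECT', 'SHOW')):
            return random.choice(self.slaves)
        # Everything else (INSERT/UPDATE/DELETE/...) must hit the master.
        return self.master

db = RoutedDB('master', ['slave1', 'slave2'])
print(db.pick_connection('SELECT * FROM users'))        # slave1 or slave2
print(db.pick_connection("UPDATE users SET name='x'"))  # master
```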
The MySQL replication really did the trick and the website was finally stable! YEAH! Over the next weeks and months the website became a success and the number of users started to increase constantly. It was only a matter of time until the traffic would exceed our resources again.

Lesson #4 learned

Stop planning in advance and your servers are likely to DIE.
Fortunately we kept thinking and planning. We optimized the code, reduced the number of needed SQL queries per pageload and suddenly stumbled upon MemCached. At first I added MemCached support in some of the core functions, as well as in the most heavy (slow) functions. When we deployed the changes we couldn’t believe the results – it felt a bit like finding the Holy Grail. We reduced the number of queries per second by at least 50%. Instead of buying another webserver we decided to make even more use of MemCached.

Lesson #5 learned

Forget about caching and you will either waste a lot of money on hardware or your servers will die!
It turned out that MemCached helped us reduce the load on the MySQL servers by 70-80%, which also resulted in a huge performance boost – on the webservers, too. The pages were loading much quicker!
Eventually our setup seemed to be perfect. Even at peak times we didn’t have to worry about crashes or slow-responding pages anymore. Did we make it? No! Out of the blue, one of the webservers started having hiccups: error messages, white pages and so on. The system load was fine and in most cases the server worked – but only in “most cases”.

Lesson #6 learned

Put a few hundred thousand small files in one folder, run out of Inodes and your server will die!
Yes, you read that correctly. We were so focused on MySQL, PHP and the webservers themselves that we didn’t pay enough attention to the filesystem. The Smarty cache files were stored on the local filesystem – all in one single directory. The solution here was putting Smarty on a dedicated ReiserFS partition. Furthermore, we enabled the Smarty “use_subdirs” option.
Over the past years we kept optimizing the pages. We put the Smarty caches into memcached, installed Varnish to reduce the I/O load for serving static files more quickly, switched to Nginx (Lighttpd randomly produced error 500 messages), installed more RAM, bought better hardware, more hardware… this list is endless.

Conclusion

Scaling a website is a never-ending process. As soon as you fix one bottleneck, you’re very likely to stumble into the next one. Never ever start thinking “that’s it, we’re done” and lean back. It will kill your servers and perhaps even your business. It’s a constant process of planning and learning. If you can’t get a job done on your own because you lack the experience and/or resources – find a competent and reliable partner to work with. Never stop talking with your team and partners about current issues and the ones that might arise in the (near) future. Think ahead and be proactive!