Friday, August 28, 2009
How to install Windows after Linux is installed
Here are simple steps to install Windows on a machine that already runs Linux.
(Please back up your important data before doing this.)
On the Linux shell, run fdisk -l
It shows you all disks and their partitions.
Free one of the partitions.
Restart the system and install Windows on that partition. (Make sure you know which partition Windows is being installed on.)
Now if you reboot, only Windows will come up (expected result, because the Windows installer overwrites the boot loader in the MBR).
Now insert a bootable Linux CD or DVD and restart.
Boot from the CD or DVD.
At the prompt boot:...
type "linux rescue" (without quotes) and press Enter. (This is for CentOS.)
Following the steps will lead to a shell prompt.
Now mount your boot partition if you have /boot as a separate partition; if the root partition itself contains /boot, mount the root partition.
ex: mkdir -p /mnt/temp
mount /dev/sda2 /mnt/temp
cd /mnt/temp
Now open menu.lst (probably boot/grub/menu.lst under the mount point, or grub/menu.lst if you mounted a separate boot partition).
Add the lines below:
title Windows XP
rootnoverify (hd0,0)
makeactive
chainloader +1
Depending on your partitions you need to modify the parameters.
title :- The name you want to see in the boot menu list.
rootnoverify :- The partition where Windows is installed. (root also works in many setups, but rootnoverify stops GRUB from trying to mount the Windows file system.) GRUB legacy names all disks "hd" (hd0, hd1, ...), even if you have a SATA hard disk. One more thing: fdisk -l lists partitions with an index starting from 1,
ex:- sda1, sda2, ...
but GRUB counts partitions from 0: (hd0,0), (hd0,1), (hd0,2), ...
So if, for example, Windows is on sda3, you need to give "rootnoverify (hd0,2)".
The other two parameters don't change.
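The numbering rule can be sketched in Perl, the language used in the later posts. GRUB legacy writes a partition as (hdDISK,PARTITION), with both numbers counted from 0. The helper name linux_to_grub is hypothetical, and the sketch assumes the Linux drive letters follow the same order in which GRUB sees the disks, which is not guaranteed on every machine.

```perl
use strict;
use warnings;

# Hypothetical helper: map a Linux partition name such as /dev/sda3
# to GRUB legacy notation such as (hd0,2).
# Assumption: drive letter order (a, b, c, ...) matches GRUB's disk order.
sub linux_to_grub {
    my ($device) = @_;
    my ($letter, $partition) = $device =~ m{^/dev/[sh]d([a-z])(\d+)$}
        or return;                            # not a name we understand
    my $disk = ord($letter) - ord('a');       # sda -> 0, sdb -> 1
    return sprintf '(hd%d,%d)', $disk, $partition - 1;   # GRUB counts from 0
}

print linux_to_grub('/dev/sda1'), "\n";   # (hd0,0)
print linux_to_grub('/dev/sda3'), "\n";   # (hd0,2)
```

Always check the result against what fdisk -l shows before editing menu.lst.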
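Now, at the shell, run the grub command and reinstall GRUB to the MBR. Here root is the partition that holds the GRUB files (sda2 in the mount example above, so (hd0,1)).
ex:
sh# grub
grub> root (hd0,1)
grub> setup (hd0)
grub> quit
sh# reboot
After the reboot, the boot menu will show both Linux and Windows.
Now choose which one you want to boot.
Enjoy.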
Thursday, August 20, 2009
Commands I use while programming
:set nonu -- remove line numbers
:sp file_name -- open file_name in a split window; Ctrl+w w switches between split windows
:$ -- go to the end of the file
:1 -- go to the start of the file (the number is the line number to jump to)
:%s/OLD/NEW/g -- replace the OLD string with the NEW string on every line (%), for all matches in each line (g)
:%s/^/NEW/g -- add the NEW string at the start of every line
:1,30s/rr/aa/gi -- 1,30 means lines 1 to 30, s stands for substitute, rr is the search pattern, aa is the replacement string, g is global, i is case insensitive
ctags * -- must be run at the command prompt first to build the tags file
Ctrl+] -- go to the function definition
Ctrl+o -- come back
cscope -R -- Ctrl+d to exit, Tab to move to the commands below
Enabling and running core dump:
Add the two lines below to your .bashrc file and restart the shell. (Writing to /proc/sys/kernel/core_pattern requires root.)
ulimit -c unlimited
echo core.%e.%p.%s.%t > /proc/sys/kernel/core_pattern
Now, if a core dump file is generated, run the command below with the proper file name.
gdb executable coredump.xxx
CMDS:-
route -- shows the kernel routing table for your machine
route add default gw
nm obj_file | grep function_name_to_find
diff file1 file2 -- lists the lines that differ
vi -t functionname -- opens the file at that function (ctags * must have been run first)
svn:
svn co svn+ssh://phaneendra@
svn add dir1 dir2 file1 ....
svn ci
svn import dir/file svn+ssh://phaneendra@
svn diff
svn ci file_name
Suppose you want to change the user name used for the code base, or your admin changed your user name, and you now want to check in code that you checked out under the old name.
You have two options:
1) Check out again with your new name, diff it against the code you modified (checked out with the old name), copy the changes into the new checkout, and then check in. (Of course, a very bad idea.)
2) The best way is to use the command below at trunk (the main directory of the checked-out code):
svn switch svn+ssh://old_user_name@ip/repos/trunk svn+ssh://new_user_name@ip/repos/trunk --relocate
It asks for the new user's password; provide it, and your work is done.
To open .chm files in Ubuntu:
# apt-get install gnochm
$ gnochm file.chm
Cross compilation:-
source env-setup
This environment script should be sourced before cross compiling. Alternatively, put the source line in your .bashrc file, open a new terminal, and start working.
Web :-
http://cscope.sourceforge.net/large_projects.html
http://www.network-theory.co.uk/ (for gcc and valgrind)
http://www.experts-exchange.com/Programming/System/Linux/
http://www.securitytube.net/Programming-Video-List.aspx
Learn Perl easy, part 5
What is a Subroutine?
We have been using a form of subroutines all along. Perl functions are basically built-in subroutines. You call (or "invoke") a function by typing its name and giving it one or more arguments.
Example: Length
my $seq = 'ATGCAAATGCCA';
my $seq_length = length $seq; ## OR
my $seq_length = length($seq);
# $seq_length now contains 12
Perl gives you the opportunity to define your own functions, called "subroutines". In the simplest sense, subroutines are named blocks of code that can be reused as many times as you wish.
Example: A very basic subroutine
sub Hello {
print "Hello World!!\n";
}
print "Sometimes I just want to shout ";
Hello(); #or &Hello;
Example: Some simple subroutines
sub hypotenuse {
my ($a,$b) = @_;
return sqrt($a**2 + $b**2);
}
sub E {
return 2.71828182845905;
}
#########
$y = 3;
$x = hypotenuse($y,4);
# $x now contains 5
$x = hypotenuse((3*$y),12);
# $x now contains 15
$value_e = E();
# $value_e now contains 2.71828182845905
This way of using subroutines makes them look suspiciously like functions. Note: Unlike a function, you must use parentheses when calling a subroutine in this manner, even if you are giving it no arguments.
The Magic Array - @_
Perhaps the most important concept to understand is that values are passed to the subroutine in the default array @_. This array springs magically into existence, and contains the list of values that you gave to subroutine (within the parentheses).
Example: The magic of @_
sub Add_two_numbers {
my ($number1) = shift; # get first argument from @_ and put it in $number1
my ($number2) = shift; # get second argument from @_ and put it in $number2
my $sum = $number1 + $number2;
return $sum;
}
sub Add_two_numbers_2 {
my ($number1,$number2) = @_;
my $sum = $number1 + $number2;
return $sum;
}
sub Add_two_numbers_arcane {
return ($_[0] + $_[1]);
}
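As a quick check that the three styles behave the same, here is a small runnable sketch. The subroutine names are made up for this illustration and are not the ones used above.

```perl
use strict;
use warnings;

sub add_with_shift {
    my $first  = shift;          # inside a subroutine, shift works on @_
    my $second = shift;
    return $first + $second;
}

sub add_with_list {
    my ($first, $second) = @_;   # copy both arguments at once
    return $first + $second;
}

sub add_with_index {
    return $_[0] + $_[1];        # read the elements of @_ directly
}

my @results = (add_with_shift(2, 3), add_with_list(2, 3), add_with_index(2, 3));
print "@results\n";   # 5 5 5
```

The list-assignment form is usually the easiest to read when a subroutine takes more than one argument.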
Some Subroutine Notes
* Use a name for your subroutine that makes sense to you. Avoid using names that Perl already uses (like "length" or "print"), unless you really like making yourself miserable.
* If you don't give a return statement, the subroutine will return the last value calculated.
* You may have multiple return statements. The first one that is executed will exit the subroutine
Example: A more complex subroutine with different returns
sub Number_Examiner {
my $number = shift;
unless ($number =~ /^\d+$/){
return "You sure this is a number?";
}
if ($number >= 100){
return "Big Number!";
}
elsif ($number > 50){
return "Bigger than 50!";
}
else {
return "Wee Little Number";
}
}
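To see which return statement fires for which input, the subroutine can be called with a few sample values. This sketch uses a compact rewrite of Number_Examiner with the same tests in the same order; the sample inputs are made up.

```perl
use strict;
use warnings;

sub Number_Examiner {
    my $number = shift;
    return "You sure this is a number?" unless $number =~ /^\d+$/;
    return "Big Number!"     if $number >= 100;
    return "Bigger than 50!" if $number > 50;
    return "Wee Little Number";
}

for my $input (150, 75, 7, 'abc') {
    print "$input: ", Number_Examiner($input), "\n";
}
# 150: Big Number!
# 75: Bigger than 50!
# 7: Wee Little Number
# abc: You sure this is a number?
```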
* You can return either a single value or a list of values. You can, if you wish, return nothing. Remember to use your subroutine in a way that reflects the number of values you expect to get back.
Example: Know what you expect
my ($value1,$value2,$value3) = ReturnThreeValues();
# if you are expecting three values back, make space for them.
my (@values) = ReturnThreeValues(); # another way to do it
my ($value1,$value2) = ReturnThreeValues();
# the last value is lost, gone, vanished, DOA... You may have
# wanted to do this.
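The lines above never define ReturnThreeValues, so here is a runnable sketch with a hypothetical stand-in. The returned values are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for ReturnThreeValues; the values are made up.
sub ReturnThreeValues {
    return ('red', 'green', 'blue');
}

my ($first, $second, $third) = ReturnThreeValues();   # one variable per value
my @all                      = ReturnThreeValues();   # whole list in an array
my ($only_first)             = ReturnThreeValues();   # extra values are dropped

print "$first $second $third\n";   # red green blue
print scalar(@all), "\n";          # 3
print "$only_first\n";             # red
```

The parentheses around $only_first matter: they ask for a list, so the first value is the one that is kept.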
"my" Variables
Variables that you use in a subroutine should be made private to that subroutine with the my operator. This avoids accidentally overwriting similarly-named variables in the main program. If you already included use strict at the top of your program, perl will check that all variables are introduced with my.
Why Use My?
my $var = "Boo!";
Scary();
print "$var\n";
sub Scary{
print "$var\n";
$var = "Eeek!";
}
# The results:
Boo!
Eeek!
Variables made private with my only exist within a block (curly braces). The subroutine body is a block, so the my variables only exist within the body of the subroutine.
You can make scalars, arrays and hashes private. If you apply my() to a list, it makes each member of the list private.
{ # start a block
my $scalar; # $scalar is private
my @array; # now @array is private
my %hash; # %hash is private
# same thing, but in one swell foop
my ($scalar,@array,%hash);
}
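A small sketch of what "private to a block" means in practice. The variable names are made up for this illustration.

```perl
use strict;
use warnings;

my $name = 'outer';
my @seen;

{                           # start a block
    my $name = 'inner';     # a new variable, private to this block
    push @seen, $name;
}                           # the inner $name is gone here

push @seen, $name;          # the outer $name was never touched
print "@seen\n";            # inner outer
```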
Lecture 7:
1. Using a Module
2. Getting Module Documentation
3. Installing Modules
4. More About Importing
5. Where are Modules Installed?
6. The Anatomy of a Module
7. Exporting Variables & Functions from Modules
8. Using Object-Oriented Modules
Using a Module
A module is a package of useful subroutines and variables that someone has put together. Modules extend the ability of Perl.
Example 1: The File::Basename Module
The File::Basename module is a standard module that is distributed with Perl. When you load the File::Basename module, you get two new functions, basename and dirname.
basename takes a long UNIX path name and returns the file name at the end. dirname takes a long UNIX path name and returns the directory part.
#!/usr/bin/perl
# file: basename.pl
use strict;
use File::Basename;
my $path = '/bush_home/bush1/lstein/C1829.fa';
my $base = basename($path);
my $dir = dirname($path);
print "The base is $base and the directory is $dir.\n";
The output of this program is:
The base is C1829.fa and the directory is /bush_home/bush1/lstein.
The use function loads up the module named File::Basename and imports the two functions. If you didn't use use, then the program would print an error:
Undefined subroutine &main::basename called at basename.pl line 8.
Example 2: The Env Module
The Env module is a standard module that provides access to the environment variables. When you load it, it imports a set of scalar variables corresponding to your environment.
#!/usr/bin/perl
# file env.pl
use strict;
use Env;
print "My home is $HOME\n";
print "My path is $PATH\n";
print "My username is $USER\n";
When this runs, the output is:
My home is /bush_home/bush1/lstein
My path is /net/bin:/usr/bin:/bin:/usr/local/bin:/usr/X11R6/bin:/bush_home/bush1/lstein/bin:.
My username is lstein
Controlling What Gets Imported
Each module will automatically import a different set of variables and subroutines when you use it. You can control what gets imported by providing use with a list of what to import.
By default the Env module will import all the environment variables. You can make it import only some:
#!/usr/bin/perl
# file env2.pl
use strict;
use Env '$HOME','$PATH';
print "My home is $HOME\n";
print "My path is $PATH\n";
print "My username is $USER\n";
Because $USER was not imported, the program fails to compile:
Global symbol "$USER" requires explicit package name at env2.pl line 9.
Execution of env2.pl aborted due to compilation errors.
You can import scalars, hashes, arrays and functions by giving a list of strings containing the variable or function names. This line imports a scalar named $PATH, an array named @PATH, and a function named printenv.
#!/usr/bin/perl
use Env '$PATH','@PATH','printenv';
print join "\n",@PATH;
Output:
/net/bin
/usr/bin
/bin
/usr/local/bin
/usr/X11R6/bin
/bush_home/bush1/lstein/bin
.
You will often see the qw() operator used to reduce typing:
use TestModule qw($PATH $HOME @PATH printenv);
Finding out What Modules are Installed
Here are some tricks for finding out what Modules are installed.
Preinstalled Modules
To find out what modules come with perl, look in Appendix A of Perl 5 Pocket Reference. From the command line, use the perldoc command from the UNIX shell. All the Perl documentation is available with this command:
% perldoc perlmodlib
PERLMODLIB(1) User Contributed Perl Documentation PERLMODLIB(1)
NAME
perlmodlib - constructing new Perl modules and finding
existing ones
DESCRIPTION
THE PERL MODULE LIBRARY
Many modules are included the Perl distribution. These
are described below, and all end in .pm. You may discover
...
Standard Modules
Standard, bundled modules are all expected to behave in a
well-defined manner with respect to namespace pollution
because they use the Exporter module. See their own docu-
mentation for details.
AnyDBM_File Provide framework for multiple DBMs
AutoLoader Load subroutines only on demand
AutoSplit Split a package for autoloading
B The Perl Compiler
...
To learn more about a module, run perldoc with the module's name:
% perldoc File::Basename
NAME
fileparse - split a pathname into pieces
basename - extract just the filename from a path
dirname - extract just the directory from a path
SYNOPSIS
use File::Basename;
($name,$path,$suffix) = fileparse($fullname,@suffixlist)
fileparse_set_fstype($os_string);
$basename = basename($fullname,@suffixlist);
$dirname = dirname($fullname);
...
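The synopsis lists fileparse alongside basename and dirname. Here is a short sketch of how the suffix list behaves, reusing the path from the earlier example and assuming a UNIX-style path. The suffix pattern is just one possible choice.

```perl
use strict;
use warnings;
use File::Basename;

my $path = '/bush_home/bush1/lstein/C1829.fa';

# fileparse returns (filename, directory, suffix); the suffix is only
# split off when it matches one of the patterns given after the path.
my ($name, $dir, $suffix) = fileparse($path, qr/\.[^.]*/);

print "name=$name dir=$dir suffix=$suffix\n";
# name=C1829 dir=/bush_home/bush1/lstein/ suffix=.fa

# basename accepts the same kind of suffix list and strips a matching suffix.
print basename($path, '.fa'), "\n";   # C1829
```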
Optional Modules that You May Have Installed
perldoc perllocal will list the names of locally installed modules.
% perldoc perllocal
Thu Apr 27 16:01:31 2000: "Module" the DBI manpage
o "installed into: /usr/lib/perl5/site_perl"
o "LINKTYPE: dynamic"
o "VERSION: 1.13"
o "EXE_FILES: dbish dbiproxy"
Thu Apr 27 16:01:41 2000: "Module" the Data::ShowTable
manpage
o "installed into: /usr/lib/perl5/site_perl"
o "LINKTYPE: dynamic"
o "VERSION: 3.3"
o "EXE_FILES: showtable"
Tue May 16 18:26:27 2000: "Module" the Image::Magick man-
page
...
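Another way to check for one particular module is to try loading it from a script. This is a minimal sketch: module_installed is a hypothetical helper, and No::Such::Module is a made-up name that is assumed not to exist on your system. Note that a successful check also loads the module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded.
sub module_installed {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # File::Basename -> File/Basename.pm
    return eval { require $file; 1 } ? 1 : 0;  # eval traps the error if loading fails
}

for my $module ('File::Basename', 'No::Such::Module') {
    printf "%-20s %s\n", $module, module_installed($module) ? 'installed' : 'missing';
}
```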
Installing Modules
You can find thousands of Perl Modules on CPAN, the Comprehensive Perl Archive Network:
http://www.cpan.org
Installing Modules Manually
Search for the module on CPAN using the keyword search. When you find it, download the .tar.gz module. Then install it like this:
% tar zxvf bioperl-0.7.1.tar.gz
bioperl-0.7.1/
bioperl-0.7.1/Bio/
bioperl-0.7.1/Bio/DB/
bioperl-0.7.1/Bio/DB/Ace.pm
bioperl-0.7.1/Bio/DB/GDB.pm
bioperl-0.7.1/Bio/DB/GenBank.pm
bioperl-0.7.1/Bio/DB/GenPept.pm
bioperl-0.7.1/Bio/DB/NCBIHelper.pm
bioperl-0.7.1/Bio/DB/RandomAccessI.pm
bioperl-0.7.1/Bio/DB/SeqI.pm
bioperl-0.7.1/Bio/DB/SwissProt.pm
bioperl-0.7.1/Bio/DB/UpdateableSeqI.pm
bioperl-0.7.1/Bio/DB/WebDBSeqI.pm
bioperl-0.7.1/Bio/AlignIO.pm
% cd bioperl-0.7.1
% perl Makefile.PL
Generated sub tests. go make show_tests to see available subtests
...
Writing Makefile for Bio
% make
cp Bio/Tools/Genscan.pm blib/lib/Bio/Tools/Genscan.pm
cp Bio/Root/Err.pm blib/lib/Bio/Root/Err.pm
cp Bio/Annotation/Reference.pm blib/lib/Bio/Annotation/Reference.pm
cp bioback.pod blib/lib/bioback.pod
cp Bio/AlignIO/fasta.pm blib/lib/Bio/AlignIO/fasta.pm
cp Bio/Location/NarrowestCoordPolicy.pm blib/lib/Bio/Location/NarrowestCoordPolicy.pm
cp Bio/AlignIO/clustalw.pm blib/lib/Bio/AlignIO/clustalw.pm
cp Bio/Tools/Blast/Run/postclient.pl blib/lib/Bio/Tools/Blast/Run/postclient.pl
cp Bio/LiveSeq/Intron.pm blib/lib/Bio/LiveSeq/Intron.pm
...
Manifying blib/man3/Bio::LiveSeq::Exon.3
Manifying blib/man3/Bio::Location::CoordinatePolicyI.3
Manifying blib/man3/Bio::SeqFeature::Similarity.3
% make test
PERL_DL_NONLAZY=1 /net/bin/perl -Iblib/arch -Iblib/lib -I/net/lib/perl5/5.6.1/i686-linux -I/net/lib/perl5/5.6.1 -e 'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;' t/*.t
t/AAChange..........ok
t/AAReverseMutate...ok
t/AlignIO...........ok
t/Allele............ok
...
t/WWW...............ok
All tests successful, 95 subtests skipped.
Files=60, Tests=1011, 35 wallclock secs (25.47 cusr + 1.60 csys = 27.07 CPU)
% make install
Installing /net/lib/perl5/site_perl/5.6.1/bioback.pod
Installing /net/lib/perl5/site_perl/5.6.1/biostart.pod
Installing /net/lib/perl5/site_perl/5.6.1/biodesign.pod
Installing /net/lib/perl5/site_perl/5.6.1/bptutorial.pl
...
If you have an older version of the tar program, you may need to replace the first step with this:
% gunzip -c bioperl-0.7.1.tar.gz | tar xvf -
Installing Modules Using the CPAN Shell
Perl has a CPAN module installer built into it. You run it like this:
% perl -MCPAN -e shell
cpan shell -- CPAN exploration and modules installation (v1.59_54)
ReadLine support enabled
cpan>
From this shell, there are commands for searching for modules, downloading them, and installing them.
[The first time you run the CPAN shell, it will ask you a lot of configuration questions. Generally, you can just hit return to accept the defaults. The only trick comes when it asks you to select CPAN mirrors to download from. Choose any ones that are in your general area on the Internet and it will work fine.]
Here is an example of searching for the Text::Wrap module and installing it:
cpan> i /Wrap/
Going to read /bush_home/bush1/lstein/.cpan/sources/authors/01mailrc.txt.gz
CPAN: Compress::Zlib loaded ok
Going to read /bush_home/bush1/lstein/.cpan/sources/modules/02packages.details.txt.gz
Database was generated on Tue, 16 Oct 2001 22:32:59 GMT
CPAN: HTTP::Date loaded ok
Going to read /bush_home/bush1/lstein/.cpan/sources/modules/03modlist.data.gz
Distribution B/BI/BINKLEY/CGI-PrintWrapper-0.8.tar.gz
Distribution C/CH/CHARDIN/MailQuoteWrap0.01.tgz
Distribution C/CJ/CJM/Text-Wrapper-1.000.tar.gz
...
Module Text::NWrap (G/GA/GABOR/Text-Format0.52+NWrap0.11.tar.gz)
Module Text::Quickwrap (Contact Author Ivan Panchenko )
Module Text::Wrap (M/MU/MUIR/modules/Text-Tabs+Wrap-2001.0929.tar.gz)
Module Text::Wrap::Hyphenate (Contact Author Mark-Jason Dominus )
Module Text::WrapProp (J/JB/JBRIGGS/Text-WrapProp-0.03.tar.gz)
Module Text::Wrapper (C/CJ/CJM/Text-Wrapper-1.000.tar.gz)
Module XML::XSLT::Wrapper (M/MU/MULL/XML-XSLT-Wrapper-0.32.tar.gz)
41 items found
cpan> install Text::Wrap
Running install for module Text::Wrap
Running make for M/MU/MUIR/modules/Text-Tabs+Wrap-2001.0929.tar.gz
CPAN: LWP::UserAgent loaded ok
Fetching with LWP:
ftp://archive.progeny.com/CPAN/authors/id/M/MU/MUIR/modules/Text-Tabs+Wrap-2001.0929.tar.gz
CPAN: MD5 loaded ok
Fetching with LWP:
ftp://archive.progeny.com/CPAN/authors/id/M/MU/MUIR/modules/CHECKSUMS
Checksum for /bush_home/bush1/lstein/.cpan/sources/authors/id/M/MU/MUIR/modules/Text-Tabs+Wrap-2001.0929.tar.gz ok
Scanning cache /bush_home/bush1/lstein/.cpan/build for sizes
Text-Tabs+Wrap-2001.0929/
Text-Tabs+Wrap-2001.0929/MANIFEST
Text-Tabs+Wrap-2001.0929/CHANGELOG
Text-Tabs+Wrap-2001.0929/Makefile.PL
Text-Tabs+Wrap-2001.0929/t/
Text-Tabs+Wrap-2001.0929/t/fill.t
Text-Tabs+Wrap-2001.0929/t/tabs.t
Text-Tabs+Wrap-2001.0929/t/wrap.t
Text-Tabs+Wrap-2001.0929/README
Text-Tabs+Wrap-2001.0929/lib/
Text-Tabs+Wrap-2001.0929/lib/Text/
Text-Tabs+Wrap-2001.0929/lib/Text/Wrap.pm
Text-Tabs+Wrap-2001.0929/lib/Text/Tabs.pm
CPAN.pm: Going to build M/MU/MUIR/modules/Text-Tabs+Wrap-2001.0929.tar.gz
Checking if your kit is complete...
Looks good
Writing Makefile for Text
cp lib/Text/Wrap.pm blib/lib/Text/Wrap.pm
cp lib/Text/Tabs.pm blib/lib/Text/Tabs.pm
Manifying blib/man3/Text::Wrap.3
Manifying blib/man3/Text::Tabs.3
/usr/bin/make -- OK
Running make test
PERL_DL_NONLAZY=1 /net/bin/perl -Iblib/arch -Iblib/lib -I/net/lib/perl5/5.6.1/i686-linux -I/net/lib/perl5/5.6.1 -e 'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;' t/*.t
t/fill..............ok
t/tabs..............ok
t/wrap..............ok
All tests successful.
Files=3, Tests=37, 0 wallclock secs ( 0.20 cusr + 0.00 csys = 0.20 CPU)
/usr/bin/make test -- OK
Running make install
Installing /net/lib/perl5/5.6.1/Text/Wrap.pm
Installing /net/man/man3/Text::Wrap.3
Installing /net/man/man3/Text::Tabs.3
Writing /net/lib/perl5/5.6.1/i686-linux/auto/Text/.packlist
Appending installation info to /net/lib/perl5/5.6.1/i686-linux/perllocal.pod
/usr/bin/make install UNINST=1 -- OK
cpan> quit
Lockfile removed.
More About Importing
Recall that each module has a default list of functions and variables to import. Some modules import many functions by default, others import none. Most import some.
Modules that have a lot of functions and variables to import frequently put them into groups. Groups can be specified using the ":group" syntax.
For example, the CGI::Pretty module has a group called ":standard", which imports a bunch of standard functions for creating HTML pages.
#!/usr/bin/perl
# file: html.pl
use strict;
use CGI::Pretty qw(:standard);
print h1('This is a level one header');
print p('This is a paragraph.');
print p('Here is some',i('italicized'),'text.');
% html.pl
This is a level one header
This is a paragraph.
Here is some italicized text.
The module's documentation will tell you what function groups are defined. To import the default functions, plus optional ones, use the group ":DEFAULT".
use CGI::Pretty qw(:DEFAULT :standard start_html);
Where are Modules Installed?
Module files end with the extension .pm. If the module name is a simple one, like Env, then Perl will look for a file named Env.pm. If the module name is separated by :: sections, Perl will treat the :: characters like directory separators. So it will look for the module File::Basename in the file File/Basename.pm.
Perl searches for module files in a set of directories specified by the Perl library path. This is set when Perl is first installed. You can find out what directories Perl will search for modules in by issuing perl -V from the command line:
% perl -V
Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
Platform:
osname=linux, osvers=2.4.2-2smp, archname=i686-linux
...
Compiled at Oct 11 2001 11:08:37
@INC:
/usr/lib/perl5/5.6.1/i686-linux
/usr/lib/perl5/5.6.1
/usr/lib/perl5/site_perl/5.6.1/i686-linux
/usr/lib/perl5/site_perl/5.6.1
/usr/lib/perl5/site_perl
.
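The lookup described above can be imitated from a script by walking @INC, the array that holds the library path. This is only a sketch of the idea, not how Perl implements it internally; find_module_file is a hypothetical helper.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first file under @INC that matches
# the module name, or nothing if no directory contains it.
sub find_module_file {
    my ($module) = @_;
    my @parts = split /::/, $module;         # File::Basename -> (File, Basename)
    $parts[-1] .= '.pm';                     # last part becomes the file name
    for my $dir (grep { !ref $_ } @INC) {    # skip code references in @INC
        my $candidate = File::Spec->catfile($dir, @parts);
        return $candidate if -f $candidate;
    }
    return;
}

my $found = find_module_file('File::Basename');
print defined($found) ? "$found\n" : "not found\n";
```

The printed path depends on where Perl is installed on your machine.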
You can modify this path to search in other locations by placing the use lib command somewhere at the top of your script:
#!/usr/bin/perl
use lib '/home/lstein/lib';
use MyModule;
...
This tells Perl to look in /home/lstein/lib for the module MyModule before it looks in the usual places. Now you can install module files in this directory and Perl will find them.
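To confirm that use lib changed the search path, print @INC afterwards. The directory is the one from the example above and does not need to exist for this sketch to run.

```perl
use strict;
use warnings;
use lib '/home/lstein/lib';    # path taken from the example above

# use lib adds its directories near the front of @INC,
# ahead of the standard locations.
print "$_\n" for @INC;
```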
The Anatomy of a Module File
Here is a very simple module file named "MySequence.pm":
package MySequence;
#file: MySequence.pm
use strict;
our $EcoRI = 'gaattc';
sub reversec {
my $sequence = shift;
$sequence = reverse $sequence;
$sequence =~ tr/gatcGATC/ctagCTAG/;
return $sequence;
}
sub seqlen {
my $sequence = shift;
$sequence =~ s/[^gatcnGATCN]//g;
return length $sequence;
}
1;
A module begins with the keyword package and ends with "1;". package gives the module a name, and the 1; is a true value that tells Perl that the module compiled completely without crashing.
The our keyword declares a variable to be global to the module. It is similar to my, but the variable can be shared with other programs and modules ("my" variables cannot be shared outside the current file, subroutine or block). This will let us use the variable in other programs that depend on this module.
To install this module, just put it in the Perl module path somewhere, or in the current directory.
Using the MySequence.pm Module
Using this module is very simple:
#!/usr/bin/perl
#file: sequence.pl
use strict;
use MySequence;
my $sequence = 'gattccggatttccaaagggttcccaatttggg';
my $complement = MySequence::reversec($sequence);
print "original = $sequence\n";
print "complement = $complement\n";
% sequence.pl
original = gattccggatttccaaagggttcccaatttggg
complement = cccaaattgggaaccctttggaaatccggaatc
Unless you explicitly export variables or functions, the calling function must explicitly qualify each MySequence function by using the notation:
MySequence::function_name
For a non-exported variable, the notation looks like this:
$MySequence::EcoRI
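To try the qualified-name notation without creating a separate file, the package can be declared inline. This is only a sketch: the real module lives in MySequence.pm, and the input string here is made up.

```perl
use strict;
use warnings;

package MySequence;

our $EcoRI = 'gaattc';    # EcoRI recognition site

sub seqlen {
    my $sequence = shift;
    $sequence =~ s/[^gatcnGATCN]//g;    # drop anything that is not a base
    return length $sequence;
}

package main;

# Nothing is exported, so every name is written with its package prefix.
my $length = MySequence::seqlen("gatt ccgg\nnnaa");
print "length = $length\n";               # length = 12
print "site   = $MySequence::EcoRI\n";    # site   = gaattc
```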
Exporting Variables and Functions from Modules
To make your module export variables and/or functions like a "real" module, use the Exporter module.
package MySequence;
#file: MySequence.pm
use strict;
use base 'Exporter';
our @EXPORT = qw(reversec seqlen);
our @EXPORT_OK = qw($EcoRI);
our $EcoRI = 'gaattc';
sub reversec {
my $sequence = shift;
$sequence = reverse $sequence;
$sequence =~ tr/gatcGATC/ctagCTAG/;
return $sequence;
}
sub seqlen {
my $sequence = shift;
$sequence =~ s/[^gatcnGATCN]//g;
return length $sequence;
}
1;
The use base 'Exporter' line tells Perl that this module is a type of "Exporter" module. As we will see later, this is a way for modules to inherit properties from other modules. The Exporter module (standard in Perl) knows how to export variables and functions.
The our @EXPORT = qw(reversec seqlen) line tells Perl to export the functions reversec and seqlen automatically. The our @EXPORT_OK = qw($EcoRI) tells Perl that it is OK for the user to import the $EcoRI variable, but not to export it automatically.
The qw() notation is telling Perl to create a list separated by spaces. These lines are equivalent to the slightly uglier:
our @EXPORT = ('reversec','seqlen');
Using the Better MySequence.pm Module
Now the module exports its reversec and seqlen functions automatically:
#!/usr/bin/perl
#file: sequence2.pl
use strict;
use MySequence;
my $sequence = 'gattccggatttccaaagggttcccaatttggg';
my $complement = reversec($sequence);
print "original = $sequence\n";
print "complement = $complement\n";
The calling program can also get at the value of the $EcoRI variable, but it has to ask for it explicitly:
#!/usr/bin/perl
#file: sequence3.pl
use strict;
use MySequence qw(:DEFAULT $EcoRI);
my $sequence = 'gattccggatttccaaagggttcccaatttggg';
my $complement = reversec($sequence);
print "original = $sequence\n";
print "complement = $complement\n";
if ($complement =~ /$EcoRI/) {
print "Contains an EcoRI site.\n";
} else {
print "Doesn't contain an EcoRI site.\n";
}
Note that we must now import the :DEFAULT group in order to get the default reversec and seqlen functions.
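The export mechanism can also be tried in a single file. In this sketch the package is wrapped in a BEGIN block so that its export lists are filled in before the import runs, and an explicit import call stands in for the use line. The short input sequence is made up.

```perl
use strict;
use warnings;

BEGIN {
    package MySequence;
    use base 'Exporter';
    our @EXPORT    = qw(reversec);    # exported automatically
    our @EXPORT_OK = qw($EcoRI);      # exported only on request
    our $EcoRI     = 'gaattc';

    sub reversec {
        my $sequence = shift;
        $sequence = reverse $sequence;
        $sequence =~ tr/gatcGATC/ctagCTAG/;
        return $sequence;
    }
}

# Stands in for: use MySequence qw(:DEFAULT $EcoRI);
BEGIN { MySequence->import(qw(:DEFAULT $EcoRI)) }

my $complement = reversec('gattcc');
print "complement = $complement\n";   # complement = ggaatc
print "EcoRI site = $EcoRI\n";        # EcoRI site = gaattc
```

With a real MySequence.pm file on the module path, the single use line does the same job as the two BEGIN blocks.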
Object-Oriented Modules
Some modules are object-oriented. Instead of importing a series of subroutines that are called directly, these modules define a series of object types that you can create and use. We will talk about object-oriented syntax in greater detail in the Perl References and Objects lecture. Here we will just show an example:
The Math::Complex Module
The Math::Complex module is a standard module that implements complex numbers. You work with it by creating one or more Math::Complex objects. You can then manipulate these objects mathematically by adding them, subtracting them, multiplying, and so on. Here is a brief example:
#!/usr/bin/perl
# file: complex.pl
use strict;
use Math::Complex;
my $a = Math::Complex->make(5,6);
my $b = Math::Complex->make(10,20);
my $c = $a * $b;
print "$a * $b = $c\n";
We load the Math::Complex module with use, but now instead of calling imported subroutines, we create two objects named $a and $b. Both are created by calling Math::Complex->make() with two arguments. The first argument is the real part of the complex number, and the second is the imaginary part. The return value from make() is the complex number object. We multiply the two numbers together and store the result in $c. Finally, we print out all three values. The script's output is:
51% perl complex.pl
5+6i * 10+20i = -70+160i
Object Syntax
The call to make() uses Perl's object-oriented syntax. Read it as meaning "invoke the make() subroutine that is located inside the Math::Complex package." The call is similar, but not quite equivalent, to this:
Math::Complex::make(10,20)
The difference is that the object-oriented syntax tells Perl to pass the name of the module as an implicit first argument to make(). Therefore, Math::Complex->make(10,20) is almost exactly equivalent to this:
Math::Complex::make('Math::Complex',10,20)
If you are using object-oriented modules, you will never have to worry about this extra argument. If you are writing object-oriented modules, the necessity for the extra argument will make sense to you.
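The two call styles can be compared directly. This is a small sketch for illustration only; in real code you would always use the arrow form.

```perl
use strict;
use warnings;
use Math::Complex;

my $arrow = Math::Complex->make(10, 20);                    # object-oriented call
my $plain = Math::Complex::make('Math::Complex', 10, 20);   # class name passed by hand

print "$arrow\n";   # 10+20i
print "$plain\n";   # 10+20i
print "Both calls built the same number\n" if "$arrow" eq "$plain";
```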
Learn Perl easy, part 4
You can create your own filehandles using the open function, read and/or write to them, and then clean up using close.
open
open opens a file for reading and/or writing, and associates a filehandle with it. You can choose any name for the filehandle, but the convention is to make it all caps. In the examples, we use FILEHANDLE.
open a file for reading               open FILEHANDLE,"cosmids.fasta"     (alternative form: open FILEHANDLE,"<cosmids.fasta")
open a file for writing               open FILEHANDLE,">cosmids.fasta"
open a file for appending             open FILEHANDLE,">>cosmids.fasta"
open a file for reading and writing   open FILEHANDLE,"+<cosmids.fasta"
It's common for open to fail. Maybe the file doesn't exist, or you don't have permissions to read or create it. Always check open's return value, which is TRUE if the operation succeeded, FALSE otherwise:
$result = open COSMIDS,"cosmids.fasta";
die "Can't open cosmids file: $!\n" unless $result;
When an error occurs, the $! variable holds a descriptive string containing a description of the error, such as "file not found".
There is a compact idiom for accomplishing this in one step:
open COSMIDS,"cosmids.fasta" or die "Can't open cosmids file: $!\n";
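The same check works with the three-argument form of open and a filehandle stored in a my variable, a style you will often see in newer code. This sketch is not part of the original lecture; it writes a small scratch file first so that it runs anywhere, and the file contents are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file so the sketch runs anywhere.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} ">cosmid1\nGATTACA\n";
close $tmp_fh or die "Can't close $filename: $!\n";

# Three-argument open: the mode ('<') is separate from the file name,
# and the filehandle lives in an ordinary "my" variable.
open my $in, '<', $filename or die "Can't open $filename: $!\n";
my @lines = <$in>;
close $in or warn "Errors while closing filehandle: $!";

print "Read ", scalar(@lines), " lines\n";   # Read 2 lines
```

Keeping the mode separate means a file name that happens to start with ">" cannot change how the file is opened.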
Using a Filehandle
Once you've created a filehandle, you can read from it or write to it, just as if it were STDIN or STDOUT. This code reads from file "text.in" and copies lines to "text.out":
open IN,"text.in" or die "Can't open input file: $!\n";
open OUT,">text.out" or die "Can't open output file: $!\n";
while ($line = <IN>) {
print OUT $line;
}
Closing a Filehandle
When you are done with a filehandle, you should close it. This will also happen automatically when your program ends, or if you reuse the same filehandle name.
close IN or warn "Errors while closing filehandle: $!";
Some errors, like filesystem full, only occur when you close the filehandle, so you should check for errors in the same way you do when you open a filehandle.
The Magic of <>
The bare <> function when used without any explicit filehandle is magical. It reads from each of the files on the command line as if they were one single large file. If no file is given on the command line, then <> reads from standard input.
This sounds weird, but it is extremely useful.
A Practical Example of <>
Count the number of lines and bytes in a series of files. If no file is specified, count from standard input (like wc does).
Code:
#!/usr/local/bin/perl
# file: wc.pl
($bytes,$lines) = (0,0);
while (<>) {
$bytes += length($_);
$lines++;
}
print "LINES: $lines\n";
print "BYTES: $bytes\n";
Output:
(~/grant) 79% wc.pl progress.txt
LINES: 102
BYTES: 5688
(~/grant) 80% wc.pl progress.txt resources.txt specific_aims
LINES: 481
BYTES: 24733
Globals and Functions that Affect I/O
Several built-in globals affect input and output:
$/ The input record separator. The value of this global is used by <> to determine where the end of a line is. Normally "\n".
$\ The output record separator. Whatever this is set to will appear at the end of everything printed by print. Normally empty.
$, The output field separator. Appears between all items printed with the print function. Normally empty.
$" The output list separator. Interpolated between all items of an array when an array is interpolated into a double-quoted string. Normally a space.
$. The line count. When reading from <>, this will be set to the line number of the "virtual file".
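A short sketch of the three output-related globals. The separator values are arbitrary, and local is used so that each change disappears at the end of its block.

```perl
use strict;
use warnings;

my @bases   = qw(A T G C);
my $printed = '';

{
    # Print into a string instead of the terminal so the result can be inspected.
    open my $out, '>', \$printed or die "Can't open in-memory file: $!\n";
    local $, = '-';      # placed between the items given to print
    local $\ = "!\n";    # added after each print
    print {$out} @bases;
    close $out;
}

my $interpolated;
{
    local $" = ':';      # placed between array elements inside "..."
    $interpolated = "@bases";
}

print $printed;             # A-T-G-C!
print "$interpolated\n";    # A:T:G:C
```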
Example use of Input Record Separator
Say you have a text file containing records in the following interesting format:
>gi|5340860|gb|AI793144.1|AI793144 on36f02.y5 NCI_CGAP_Lu5 Homo sapiens cDNA clone
CAAACAGCCCCCGATAACGCTACGTGAGCTGGGCCCTGGGCCTGAGGCAGAAAACGGACGGAAGAAAAGG
TCTGGCCGGAGATGGGTCTCACTCTGTCACCCAGACTGGAGTGCAGTGAGTGGTGCGATCATAGCTTACT
GCAGCCTGAAACTCCTGGGCTCAAGTGATCTTCTCGCCTCAGCCTCCTGAGTAGCTGGAGCTACAGGAAT
GAGCATAGATGAACAATGTTGCATCACGCTTGACATCACCGGNGCTTCTTTCCAGTGTGGATTTGCTCAT
GTAAAATGAGGTGTGAGCTCTGCCTGAAAGCTTTTCCATATGCATCACATTTGCAGGGCTTTTCTCCAGT
GTGGGTTCTTTGGTGTCTCAAAAGATGTGAGCTGTTACTGAAAGCTTTCCCACACACATCACACTCATAG
GGCTTCTCTCTACCGTGGATTCGCTGGTGTCCAACAAGAGCTGAACTGTATCTGAAGGCCTTTCCACGCT
TGTCACATTCATATAGTTTCTTTCCACTGTGGATTNTCTGGTGACAGAAGAGGCCCAAGCACTAGCTAAA
GCTNTTCCCTCACTCACTACACTGCTATGGCTTCTCTTCAGTATGAACTCTGATGTTGTCTCAGATATGA
ACTCAGAGAGGATNTCCCACAATCATTACACTGGTATGGTTCCTTTTCGTGTGAGTTCTCTGGTGTCNAA
ATACATCTGAGCTGTGATGAAAGAACTTNCCACACTCACTACATTGGGAAGG
>gi|4306680|gb|AI451833.1|AI451833 mx13e08.y1 Soares mouse NML Mus musculus cDNA clone
TGAATGTATGCAGTGCGGAAAGACATTCACTTCTGGCCACTGTGCCAGAAGACATTTAGGGACTCACAGT
GGAGCCTGGCCTTACAAATGTGAAGTGTGTGGGAAAGCTTATCCCTACGTCTATTCCCTTCGAAACCACA
AAAAAAGTCACAACGAAGAAAAACTTTATGAATGTAAACAATGTGGGAAAGCCTTTAAATACATTTCTTC
CTTACGCAACCACGAGACTACTCACACTGGAGAGAAGCCCTATGAATGTAAGGAATGTGGGAAAGCCTTT
AGTTGTTCCAGTTACATTCAAAATCACATGAGAACACACAAAAGGCAGTCCTATGAATGTAAGGAGTGTG
GTAAGGTGTTCTCATATTCCAAAAGTCTTCGGAGACACATGACTACACATAGTTAATTAGAGAGGGATAG
TTNTAAGTATAATTTAAATATATAAAAGAGCTCTACACATTCTAGCTCCTCATTAAGAAACAAAAAATTT
CACACTGGAAAACGAGCCTATGAATGCAGTATGTGTGCCAAAGTCTCAGTACATGCCACAGT
>gi|3400733|gb|AI074089.1|AI074089 oq97c08.x1 NCI_CGAP_Co12 Homo sapiens cDNA clone
GAATCTTCTGGGTCCTCTTTATTAAGAGCCCTCTGCCTTCCCAGGGGAGGGAAGCAAATCCTTCAGGGCC
CCCAGAGTTCCTGCACCCCATATCATGGGTGAGTCCTACCAGCCACAGAGCCACCCGTCACCGTGGAGAG
GCTTAAGCTGCACTCAGAGCTCCCCCCGGGCATGCCGAATGTAGTGTTGATGCAGCCCTGCTTCCTGAGC
AAAGTCCTGACCGCACTCTGTGCAGGCGAAGGTGCCAGGAGGGGCACGGACCTCATGCATCTGGCGGTGC
CGCCTCAGAGAAACAGCCTGCCCAAAGGTCTTGCCACAGTCAGGACAAGGGAAGGTGGGCTGGGCAGTAG
TGGTTGCAACCGGCAGGGTGGGCTTGGCGGCTGGACCGTGGCTGCGCTGGTGGGTGATTAGGGCTTTGGA
...
If you use standard <>, you will get a line at a time, and have to figure out where one record ends and a new one starts. However, if you set the input record separator to ">", then each time you read a "line", you will read all the way to the next ">" symbol. Throw away the first record (which is empty), keep the others.
#!/usr/local/bin/perl
# file: get_fasta_records.pl
$/ = '>';
<>; # throw away the first record (will be empty)
while (<>) {
chomp;
# split up lines of the record. The first line
# is the sequence ID. The second and subsequent lines
# are the sequence
my ($id,@sequence) = split "\n";
my $sequence = join '',@sequence; # reassemble the sequence
}
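The same idea can be packaged as a reusable sub. This is only a sketch: the name read_fasta and the in-memory test data are assumptions made for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTA records from an open filehandle and
# return a list of [id line, sequence] pairs.
sub read_fasta {
    my ($fh) = @_;
    local $/ = '>';                    # a "line" now ends at the next '>'
    my @records;
    while (my $record = <$fh>) {
        chomp $record;                 # chomp removes the trailing '>' (the current $/)
        next unless length $record;    # the first read is empty
        my ($id, @lines) = split /\n/, $record;
        push @records, [ $id, join('', @lines) ];
    }
    return @records;
}

my $text = ">seq1 test\nACGT\nTTGA\n>seq2\nGGCC\n";
open my $fh, '<', \$text or die "Can't open in-memory handle: $!";
for my $rec (read_fasta($fh)) {
    print "$rec->[0] => $rec->[1]\n";
}
```

Using local $/ inside the sub restores the normal line-at-a-time behaviour for the rest of the program once the sub returns.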
Special Uses of the Input Record Separator
The input record separator has two special cases.
Paragraph Mode
If the input record separator ($/) is set to the empty string ("") it goes into paragraph mode. Each <> will read up to the next blank line. Multiple blank lines will be skipped over. This is good for reading text separated into paragraphs.
Slurp Mode
If the input record separator is set to the undefined value (undef) then it goes into slurp mode. The <> operator will read its entire input into a single scalar.
Here's how to read the entire file cosmids.fasta into a scalar variable:
open IN,"cosmids.fasta" or die "Can't open cosmids.fasta: $!\n";
$/ = undef;
$data = <IN>;
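A small sketch of both special cases, wrapped in subs so that $/ is changed only temporarily. The names slurp and paragraphs are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole content of a filehandle as one string.
sub slurp {
    my ($fh) = @_;
    local $/;                 # undef inside this sub: slurp mode
    return scalar <$fh>;
}

# Hypothetical helper: return the blank-line separated paragraphs.
sub paragraphs {
    my ($fh) = @_;
    local $/ = '';            # paragraph mode
    my @paras = <$fh>;
    chomp @paras;             # in paragraph mode chomp strips all trailing newlines
    return @paras;
}

my $doc = "first line\nsecond line\n\n\nnext paragraph\n";
open my $in, '<', \$doc or die "Can't open in-memory handle: $!";
my @found = paragraphs($in);
print scalar(@found), " paragraphs\n";
```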
Regular Expressions
A regular expression is a string template against which you can match a piece of text. They are something like shell wildcard expressions, but much more powerful.
Examples of Regular Expressions
This bit of code loops through each line of a file, finds all lines containing an EcoRI site, and bumps up a counter:
Code:
#!/usr/bin/perl -w
#file: EcoRI1.pl
use strict;
my $filename = "example.fasta";
open (FASTA, $filename) or die "Can't open $filename: $!\n";
my $sites;
while (my $line = <FASTA>) {
chomp $line;
if ($line =~ /GAATTC/){
print "Found an EcoRI site!\n";
$sites++;
}
}
if ($sites){
print "$sites EcoRI sites total\n";
}else{
print "No EcoRI sites were found\n";
}
#note: if $sites is declared inside while loop you would not be able to
#print it outside the loop
Output:
~]$ ./EcoRI1.pl
Found an EcoRI site!
Found an EcoRI site!
.
.
.
Found an EcoRI site!
Found an EcoRI site!
34 EcoRI sites total
This Works Too!
Code:
#file:EcoRI2.pl
while (<FASTA>) {
chomp;
if ($_ =~ /GAATTC/){
print "Found an EcoRI site!\n";
$sites++;
}
}
Output:
~]$ ./EcoRI1.pl
Found an EcoRI site!
Found an EcoRI site!
.
.
.
Found an EcoRI site!
Found an EcoRI site!
34 EcoRI sites total
This Also Works
Code:
#file:EcoRI.pl
while (<FASTA>) {
chomp;
if (/GAATTC/){
print "Found an EcoRI site!\n";
$sites++;
}
}
By default, a regular expression examines $_ and returns a TRUE if it matches, FALSE otherwise.
Output:
~]$ ./EcoRI1.pl
Found an EcoRI site!
Found an EcoRI site!
.
.
.
Found an EcoRI site!
Found an EcoRI site!
34 EcoRI sites total
This does the same thing, but counts one type of methylation site (Pu-C-X-G) instead:
Code:
#file:methy.pl
while (<FASTA>) {
chomp;
if (/[GA]C.?G/){ #What Happens If Your File Is Not All In CAPS
#print "Found a Methylation Site!\n";
$sites++;
}
}
if ($sites){
print "$sites Methylation Sites total\n";
}else{
print "No Methylation Sites were found\n";
}
Output:
~]$ ./methy.pl
723 Methylation Sites total
Regular Expression Variable
A regular expression is normally delimited by two slashes ("/"). Everything between the slashes is a pattern to match. Patterns can be made up of the following Atoms:
1. Ordinary characters: a-z, A-Z, 0-9 and some punctuation. These match themselves.
2. The "." character, which matches everything except the newline.
3. A bracket list of characters, such as [AaGgCcTtNn], [A-F0-9], or [^A-Z] (the last means anything BUT A-Z).
4. Certain predefined character sets:
\d The digits [0-9]
\w A word character [A-Za-z_0-9]
\s White space [ \t\n\r]
\D A non-digit
\W A non-word
\S Non-whitespace
5. Anchors:
^ Matches the beginning of the string
$ Matches the end of the string
\b Matches a word boundary (between a \w and a \W)
Examples:
* /g..t/ matches "gaat", "goat", and "gotta get a goat" (twice)
* /g[gatc][gatc]t/ matches "gaat", "gttt", "gatt", and "gotta get an agatt" (once)
* /\d\d\d-\d\d\d\d/ matches 376-8380, and 5128-8181, but not 055-98-2818.
* /^\d\d\d-\d\d\d\d/ matches 376-8380 and 376-83801, but not 5128-8181.
* /^\d\d\d-\d\d\d\d$/ only matches telephone numbers.
* /\bcat/ matches "cat", "catsup" and "more catsup please" but not "scat".
* /\bcat\b/ matches only text containing the word "cat".
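A few of these examples can be checked directly. The sketch below assumes a helper called count_matches (an invented name) that counts non-overlapping matches.

```perl
use strict;
use warnings;

# Hypothetical helper: count non-overlapping matches of a pattern in a string.
sub count_matches {
    my ($pattern, $text) = @_;
    my $count = () = $text =~ /$pattern/g;   # list assignment counts the matches
    return $count;
}

print count_matches(qr/g..t/, 'gotta get a goat'), "\n";    # "gott" and "goat"
print count_matches(qr/\bcat/, 'more catsup please'), "\n";
print count_matches(qr/\bcat/, 'scat'), "\n";

# The $ anchor is what rejects the extra trailing digit.
for my $re (qr/^\d\d\d-\d\d\d\d/, qr/^\d\d\d-\d\d\d\d$/) {
    print "376-83801 ", ('376-83801' =~ $re ? 'matches' : 'does not match'), " $re\n";
}
```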
Quantifiers
By default, an atom matches once. This can be modified by following the atom with a quantifier:
? atom matches zero or exactly once
* atom matches zero or more times
+ atom matches one or more times
{3} atom matches exactly three times
{2,4} atom matches between two and four times, inclusive
{4,} atom matches at least four times
Examples:
* /goa?t/ matches "goat" and "got". Also any text that contains these words.
* /g.+t/ matches "goat", "goot", and "grant", among others.
* /g.*t/ matches "gt", "goat", "goot", and "grant", among others.
* /^\d{3}-\d{4}$/ matches US telephone numbers (no extra text allowed).
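To see the quantified phone pattern at work, here is a minimal sketch; is_phone is a made-up name for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string is exactly ddd-dddd.
sub is_phone {
    my ($text) = @_;
    return $text =~ /^\d{3}-\d{4}$/ ? 1 : 0;
}

for my $candidate ('376-8380', '5128-8181', '376-83801') {
    printf "%-10s %s\n", $candidate, is_phone($candidate) ? 'ok' : 'rejected';
}
```

Note that $ also matches just before a trailing newline, so a line that has not been chomped still passes.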
Alternatives and Grouping
A set of alternative patterns can be specified with the | symbol:
/wolf|sheep/; # matches "wolf" or "sheep"
/big bad (wolf|sheep)/; # matches "big bad wolf" or "big bad sheep"
You can combine parentheses and quantifiers to quantify entire subpatterns:
/Who's afraid of the big (bad )?wolf\?/;
# matches "Who's afraid of the big bad wolf?" and
# "Who's afraid of the big wolf?"
This also shows how to literally match the special characters -- put a backslash (\) in front of them.
Specifying the String to Match
Regular expressions will attempt to match $_ by default. To specify another string variable, use the =~ (binding) operator:
$h = "Who's afraid of Virginia Woolf?";
print "I'm afraid!\n" if $h =~ /Woo?lf/;
There's also an equivalent "not match" operator !~, which reverses the sense of the match:
$h = "Who's afraid of Virginia Woolf?";
print "I'm not afraid!\n" if $h !~ /Woo?lf/;
Using a Different Delimiter
If you want to match slashes in the pattern, you can backslash them:
$file = '/usr/local/blast/cosmids.fasta';
print "local file" if $file =~ /^\/usr\/local/;
This is ugly, so you can specify any match delimiter with the m (match) operator:
$file = '/usr/local/blast/cosmids.fasta';
print "local file" if $file =~ m!^/usr/local!;
The punctuation character that follows the m becomes the delimiter. In fact // is just an abbreviation for m//. Almost any punctuation character will work:
* m!^/usr/local!
* m#^/usr/local#
* m@^/usr/local@
* m,^/usr/local,
* m{^/usr/local}
* m[^/usr/local]
The last two examples show that you can use left-right bracket pairs as well.
Matching with a Variable Pattern
You can use a scalar variable for all or part of a regular expression. For example:
$pattern = '/usr/local';
print "matches" if $file =~ /^$pattern/;
See the o flag for important information about using variables inside patterns.
Subpatterns
You can extract and manipulate subpatterns in regular expressions.
To designate a subpattern, surround its part of the pattern with parentheses (same as with the grouping operator). This example has just one subpattern, (.+) :
/Who's afraid of the big bad w(.+)f/
Matching Subpatterns
Once a subpattern matches, you can refer to it later within the same regular expression. The first subpattern becomes \1, the second \2, the third \3, and so on.
while (<>) {
chomp;
print "I'm scared!\n" if /Who's afraid of the big bad w(.)\1f/
}
This loop will print "I'm scared!" for the following matching lines:
* Who's afraid of the big bad woof
* Who's afraid of the big bad weef
* Who's afraid of the big bad waaf
but not
* Who's afraid of the big bad wolf
* Who's afraid of the big bad wife
In a similar vein, /\b(\w+)s love \1 food\b/ will match "dogs love dog food", but not "dogs love monkey food".
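Both backreference patterns can be tried out with a short sketch. The sub names is_scary and loves_own_food are invented here.

```perl
use strict;
use warnings;

# Hypothetical helper: \1 must repeat whatever single character (.) captured.
sub is_scary {
    my ($line) = @_;
    return $line =~ /Who's afraid of the big bad w(.)\1f/ ? 1 : 0;
}

# Hypothetical helper: the animal named before "s love" must come back as \1.
sub loves_own_food {
    my ($line) = @_;
    return $line =~ /\b(\w+)s love \1 food\b/ ? 1 : 0;
}

for my $line ("Who's afraid of the big bad woof", "Who's afraid of the big bad wolf") {
    print "$line: ", (is_scary($line) ? "I'm scared!" : "not scared"), "\n";
}
```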
Using Subpatterns Outside the Regular Expression Match
Outside the regular expression match statement, the matched subpatterns (if any) can be found in the variables $1, $2, $3, and so forth.
Example. Extract 50 base pairs upstream and 25 base pairs downstream of the TATTAT consensus transcription start site:
while (<>) {
chomp;
next unless /(.{50})TATTAT(.{25})/;
my $upstream = $1;
my $downstream = $2;
}
Extracting Subpatterns Using Arrays
If you assign a regular expression match to an array, it will return a list of all the subpatterns that matched. Alternative implementation of previous example:
while (<>) {
chomp;
my ($upstream,$downstream) = /(.{50})TATTAT(.{25})/;
}
If the regular expression doesn't match at all, then it returns an empty list. Since an empty list is FALSE, you can use it in a logical test:
while (<>) {
chomp;
next unless my($upstream,$downstream) = /(.{50})TATTAT(.{25})/;
print "upstream = $upstream\n";
print "downstream = $downstream\n";
}
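The list-context form is easy to wrap in a sub. In this sketch the name flanks and the adjustable flank lengths are assumptions added for illustration; short flanks keep the test data readable.

```perl
use strict;
use warnings;

# Hypothetical helper: return the $up bases before and the $down bases
# after the first TATTAT site, or an empty list if there is no such site.
sub flanks {
    my ($seq, $up, $down) = @_;
    my ($upstream, $downstream) = $seq =~ /(.{$up})TATTAT(.{$down})/
        or return;     # list assignment yields 0 when nothing matched
    return ($upstream, $downstream);
}

my @found = flanks('GGGCCTATTATAAGG', 3, 2);
print "upstream = $found[0], downstream = $found[1]\n" if @found;
```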
Grouping without Making Subpatterns
Because parentheses are used both for grouping (a|ab|c) and for matching subpatterns, you may capture subpatterns that you don't want to. To avoid this, group with (?:pattern):
/big bad (?:wolf|sheep)/;
# matches "big bad wolf" or "big bad sheep",
# but doesn't extract a subpattern.
Subpatterns and Greediness
By default, regular expressions are "greedy". They try to match as much as they can. For example:
$h = 'The fox ate my box of doughnuts';
$h =~ /(f.+x)/;
$subpattern = $1;
Because of the greediness of the match, $subpattern will contain "fox ate my box" rather than just "fox".
To match the minimum number of times, put a ? after the quantifier, like this:
$h = 'The fox ate my box of doughnuts';
$h =~ /(f.+?x)/;
$subpattern = $1;
Now $subpattern will contain "fox". This is called lazy matching.
Lazy matching works with any quantifier, such as +?, *? and {2,50}?.
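The difference is easy to verify side by side. A minimal sketch, with first_capture as an invented helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: return the first captured group, or undef if no match.
sub first_capture {
    my ($text, $regex) = @_;
    return $text =~ $regex ? $1 : undef;
}

my $h = 'The fox ate my box of doughnuts';
print "greedy: ", first_capture($h, qr/(f.+x)/),  "\n";
print "lazy:   ", first_capture($h, qr/(f.+?x)/), "\n";
```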
String Substitution
String substitution allows you to replace a pattern or character range with another one using the s/// and tr/// functions.
The s/// Function
s/// has two parts: the regular expression and the string to replace it with: s/expression/replacement/.
$h = "Who's afraid of the big bad wolf?";
$i = "He had a wife.";
$h =~ s/w.+f/goat/; # yields "Who's afraid of the big bad goat?"
$i =~ s/w.+f/goat/; # yields "He had a goate."
If you extract pattern matches, you can use them in the replacement part of the substitution:
$h = "Who's afraid of the big bad wolf?";
$h =~ s/(\w+) (\w+) wolf/$2 $1 wolf/;
# yields "Who's afraid of the bad big wolf?"
Default Substitution Variable
If you don't bind a variable with =~, then s/// operates on $_ just as the match does.
Using a Variable in the Substitution Part
Yes you can:
$h = "Who's afraid of the big bad wolf?";
$animal = 'hyena';
$h =~ s/(\w+) (\w+) wolf/$2 $1 $animal/;
# yields "Who's afraid of the bad big hyena?"
Using Different Delimiters
The s/// function can use alternative delimiters, including parentheses and bracket pairs. For example:
$h = "Who's afraid of the big bad wolf?";
$h =~ s!(\w+) (\w+) wolf!$2 $1 wolf!; # using ! as delimiter
$h =~ s{(\w+) (\w+) wolf}{$2 $1 wolf}; # using {} as delimiter
Translating Character Ranges
The tr/// function allows you to translate one set of characters into another. Specify the source set in the first part of the function, and the destination set in the second part:
$h = "Who's afraid of the big bad wolf?";
$h =~ tr/ao/AO/; # yields "WhO's AfrAid Of the big bAd wOlf?";
Like s///, the tr/// function operates on $_ if not otherwise specified.
tr/// returns the number of characters transformed, which is sometimes handy for counting the number of a particular character without actually changing the string.
This example counts N's in a series of DNA sequences:
Code:
while (<>) {
chomp; # assume one sequence per line
my $count = tr/Nn/Nn/;
print "Sequence $_ contains $count Ns\n";
}
Output:
(~) 50% count_Ns.pl sequence_list.txt
Sequence 1 contains 0 Ns
Sequence 2 contains 3 Ns
Sequence 3 contains 1 Ns
Sequence 4 contains 0 Ns
...
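Two small subs show tr/// used for counting and for translating. Both names (count_n and reverse_complement) are made up for this sketch; the reverse complement is an extra illustration, not something from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: count N/n characters without changing the string.
# With an empty replacement list, tr only counts.
sub count_n {
    my ($seq) = @_;
    return ($seq =~ tr/Nn//);
}

# Hypothetical helper: reverse the sequence, then swap each base for its complement.
sub reverse_complement {
    my ($seq) = @_;
    my $rc = reverse $seq;            # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

print count_n('ACGNNTn'), "\n";
print reverse_complement('AACG'), "\n";
```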
Regular Expression Options
Regular expression matches and substitutions have a whole set of options which you can toggle on by appending one or more of the i, m, s, g, e, o or x modifiers to the end of the operation. See Programming Perl Page 153 for more information. An example:
$string = 'Big Bad WOLF!';
print "There's a wolf in the closet!" if $string =~ /wolf/i;
# i is used for a case insensitive match
i Case insensitive match.
g Global match (see below).
e Evaluate right side of s/// as an expression.
o Only compile variable patterns once (see below).
m Treat string as multiple lines. ^ and $ will match at start and end of internal lines, as well as at beginning and end of whole string. Use \A and \Z to match beginning and end of whole string when this is turned on.
s Treat string as a single line. "." will match any character at all, including newline.
x Allow extra whitespace and comments in pattern.
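A short sketch of the e, g and x modifiers in use. The sub names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: double every number in a string.
sub double_numbers {
    my ($text) = @_;
    $text =~ s/(\d+)/$1 * 2/ge;   # e: evaluate the replacement as code; g: every match
    return $text;
}

# Hypothetical helper: the phone pattern again, spread out with /x.
sub looks_like_phone {
    my ($text) = @_;
    return $text =~ /
        ^ \d{3}     # three digits at the start
        -           # a literal dash
        \d{4}       # four digits
        $           # end of string
    /x ? 1 : 0;
}

print double_numbers('a1 b22'), "\n";
print looks_like_phone('376-8380') ? "phone\n" : "not a phone\n";
```

With /x, whitespace and # comments inside the pattern are ignored, so a comment must not contain the pattern delimiter.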
Global Matches
Adding the g modifier to the pattern causes the match to be global. Called in a scalar context (such as an if or while statement), it will match as many times as it can.
This will match all codons in a DNA sequence, printing them out on separate lines:
Code:
$sequence = 'GTTGCCTGAAATGGCGGAACCTTGAA';
while ( $sequence =~ /(.{3})/g ) {
print $1,"\n";
}
Output:
GTT
GCC
TGA
AAT
GGC
GGA
ACC
TTG
If you perform a global match in a list context (e.g. assign its result to an array), then you get a list of all the subpatterns that matched from left to right. This code fragment gets arrays of codons in three reading frames:
@frame1 = $sequence =~ /(.{3})/g;
@frame2 = substr($sequence,1) =~ /(.{3})/g;
@frame3 = substr($sequence,2) =~ /(.{3})/g;
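The three assignments can be folded into one sub. This is a sketch only; codons_in_frame is an invented name and it must be called in list context.

```perl
use strict;
use warnings;

# Hypothetical helper: return the complete codons of one reading frame.
# $frame is the number of leading bases to skip (0, 1 or 2).
sub codons_in_frame {
    my ($sequence, $frame) = @_;
    return substr($sequence, $frame) =~ /(.{3})/g;
}

my $sequence = 'GTTGCCTGAAATGGCGGAACCTTGAA';
for my $frame (0 .. 2) {
    my @codons = codons_in_frame($sequence, $frame);
    print "frame ", $frame + 1, ": @codons\n";
}
```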
The position of the most recent match can be determined by using the pos function.
Code:
#file:pos.pl
my $seq = "XXGGATCCXX";
if ( $seq =~ /(GGATCC)/gi ){
my $pos = pos($seq);
print "Our Sequence: $seq\n";
print '$pos = ', "1st position after the match: $pos\n";
print '$pos - length($1) = 1st position of the match: ',($pos-length($1)),"\n";
print '($pos - length($1))-1 = 1st position before the match: ',($pos-length($1)-1),"\n";
}
Output:
~]$ ./pos.pl
Our Sequence: XXGGATCCXX
$pos = 1st position after the match: 8
$pos - length($1) = 1st position of the match: 2
($pos - length($1))-1 = 1st position before the match: 1
Variable Interpolation and the "o" Modifier
If you use a variable inside a pattern template, as in /$pattern/, be aware that there is a small performance penalty each time Perl encounters a pattern it hasn't seen before. If $pattern doesn't change over the life of the program, then use the o ("once") modifier to tell Perl that the variable won't change. The program will run faster:
$codon = '.{3}';
@frame1 = $sequence =~ /($codon)/og;
Testing Your Regular Expressions
To be sure that you are getting what you think you want, you can use the following "magic" Perl automatic match variables: $&, $`, and $'
Code:
#file:matchTest.pl
if ("Hello there, neighbor" =~ /\s(\w+),/){
print "That actually matched '$&'.\n";
print "That was ($`) ($&) ($').\n";
}
Output:
That actually matched ' there,'.
That was (Hello) ( there,) ( neighbor).
Learn perl easy part3
An Array Is a List of Values
For example a list with the number 3.14 as the first element, the string 'abA' as the second element, and the number 65065 as the third element.
"Literal Representation"
We write the list as above as
(3.14, 'abA', 65065)
If $pi = 3.14 and $s = 'abA' we can also write
($pi, $s, 65065)
We can also do integer ranges:
(-1..5)
shorthand for
(-1, 0, 1, 2, 3, 4, 5)
Counting down not allowed!
Array Variables and Assignment
my $x = 65065;
my @x = ($pi, 'abA', $x);
my @y = (-1..5);
my @z = ($x, $pi, @x, @y);
my ($first, @rest) = @z;
Getting at Array Elements
$z[0] # 65065
$z[0] = 2;
$z[0] # 2
$z[$#z]; # 5
Skip "slices" for now.
Push, Pop, Shift, Unshift
Add 9 to the end of @z:
push @z, 9;
Take the 9 off the end of @z, and then take the 5 off the end:
my $end1 = pop @z;
my $end2 = pop @z;
Add 9 to the beginning of @z:
unshift @z, 9;
Take the 9 off the beginning of @z, and then take the 2 off the beginning:
my $b1 = shift @z;
my $b2 = shift @z;
Reverse
my @zr = reverse @z;
Sorting
Alphabetically:
my @zs = sort @z;
Numerically:
my @q = sort { $a <=> $b } (-1, 3, -20);
Split and Join
my @q = split /\d+/, 'abd1234deff0exx';
# ('abd', 'deff', 'exx');
Swallowing Whole Files in a Single Gulp
my @i = <>;
chomp @i;
Array and Scalar Context
The notion of array and scalar context is unique to Perl. Usually you can remain unaware of it, but it comes up in reverse, and can be used to get the size of an array.
print reverse 'ab'; # prints ab!!! (reverse in array context)
$ba = reverse 'ab'; # $ba contains 'ba' (reverse in scalar context)
print scalar reverse 'ab'; # prints ba
print scalar @z; # print the size of @z
A Hash Is a Lookup Table
A hash is a lookup table. We use a key to find an associated value.
my %translate;
$translate{'atg'} = 'M';
$translate{'taa'} = '*';
$translate{'ctt'} = 'K'; # oops
$translate{'ctt'} = 'L'; # fixed
print $translate{'atg'};
Getting All Keys
keys %translate
Removing Key, Value Pairs
delete $translate{'taa'};
keys %translate;
Initializing From a List
%translate = ( 'atg' => 'M',
'taa' => '*',
'ctt' => 'L',
'cct' => 'P', );
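To show the lookup table doing some work, here is a minimal sketch that translates a short DNA string using only the four codons above. The sub name translate_dna and the 'X' placeholder for unknown codons are assumptions made for this example.

```perl
use strict;
use warnings;

my %translate = ( 'atg' => 'M',
                  'taa' => '*',
                  'ctt' => 'L',
                  'cct' => 'P', );

# Hypothetical helper: translate codon by codon; codons missing
# from the table become 'X' (an arbitrary choice for this sketch).
sub translate_dna {
    my ($dna) = @_;
    my $protein = '';
    for my $codon (lc($dna) =~ /(.{3})/g) {
        $protein .= exists $translate{$codon} ? $translate{$codon} : 'X';
    }
    return $protein;
}

print translate_dna('atgcttccttaa'), "\n";
```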
Basic Loops
Loops let you execute the same statements over and over again.
while Loops
A while loop has a condition at the top. The code within the body will execute until the condition becomes false.
Example: Count the number of times "potato" appears in a list
#!/usr/local/bin/perl
(~) 51% spud_counter.pl potato potato tomato potato boysenberry
Found a potato!
Found a potato!
tomato is not a potato
Found a potato!
boysenberry is not a potato
Potato count: 3
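A script along the following lines would produce output like the run above. It is a reconstruction under assumptions, not the original listing; count_potatoes is an invented name.

```perl
use strict;
use warnings;

# Hypothetical helper: walk through the words with a while loop,
# report each one, and return how many were "potato".
sub count_potatoes {
    my @words = @_;
    my $count = 0;
    while (@words) {                  # true as long as words remain
        my $word = shift @words;
        if ($word eq 'potato') {
            print "Found a potato!\n";
            $count++;
        } else {
            print "$word is not a potato\n";
        }
    }
    return $count;
}

my $total = count_potatoes(@ARGV);
print "Potato count: $total\n";
```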
Another Example: Count Upward from 1 to 5
#!/usr/local/bin/perl
(~) 51% count_up.pl
count: 1
count: 2
count: 3
count: 4
count: 5
Yet Another Example: Count Down from 5 to 1
#!/usr/local/bin/perl
(~) 51% count_down.pl
count: 5
count: 4
count: 3
count: 2
count: 1
The continue Block
while loops can have an optional continue block containing code that is executed at the end of each loop, just before jumping back to the test at the top:
#!/usr/local/bin/perl
continue blocks will make more sense after we consider loop control variables.
The until Loop
Sometimes you want to loop until some condition becomes true, rather than until some condition becomes false. The until loop is easier to read than the equivalent while (!TEST).
my $counter = 5;
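As a sketch of the until loop, here is a countdown from 5 to 1. The sub name count_down is an assumption; the loop runs until its test becomes true.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the numbers from $counter down to 1.
sub count_down {
    my ($counter) = @_;
    my @seen;
    until ($counter < 1) {            # same as: while (!($counter < 1))
        push @seen, $counter;
        $counter--;
    }
    return @seen;
}

print "count: $_\n" for count_down(5);
```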
foreach Loops
foreach will process each element of an array or list:
The last example is interesting. It shows that if you don't explicitly give foreach a loop variable, the special scalar variable $_ is used.
Changing Values with the foreach Loop
If you modify the loop variable in a foreach loop, the underlying array value will change!
@h = (1..5); # make an array containing numbers between 1 and 5
1 potato
2 potato
3 potato
4 potato
5 potato
This works with the automatic $_ variable too:
@h = ('CCCTTT','AAAACCCC','GAGAGAGA');
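The aliasing behaviour can be sketched as follows. Both sub names are invented, and complementing the bases is just one plausible thing to do with the DNA strings, not necessarily what the original listing did.

```perl
use strict;
use warnings;

# Hypothetical helper: the loop variable is an alias, so the
# assignment edits the array element itself.
sub add_potato {
    my @h = @_;
    foreach my $number (@h) {
        $number .= ' potato';
    }
    return @h;
}

# Hypothetical helper: the same effect through the automatic $_ variable.
sub complement_all {
    my @seqs = @_;
    foreach (@seqs) {
        tr/ACGT/TGCA/;                # tr operates on $_, an alias for each element
    }
    return @seqs;
}

print "$_\n" for add_potato(1 .. 5);
print "$_\n" for complement_all('CCCTTT', 'AAAACCCC', 'GAGAGAGA');
```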
Advanced Loops
The for Loop
Consider the standard while loop:
initialization code
This can be generalized into the concise for loop:
When the loop is first entered, the code at initialization is executed. Each time through the loop, the test at test is executed and the loop stops if it returns false. After the execution of each loop, the code at update is performed.
Compare the process of counting from 1 to 5:
# with a while loop
Notice how we use my to make $count local to the for loop.
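The comparison can be sketched like this. The two sub names are made up; both return the numbers 1 to 5 so the results can be compared.

```perl
use strict;
use warnings;

# Hypothetical helper: count with a while loop.
sub count_with_while {
    my @out;
    my $count = 1;                    # initialization
    while ($count <= 5) {             # test
        push @out, $count;
        $count++;                     # update
    }
    return @out;
}

# Hypothetical helper: the same loop written as a for loop.
sub count_with_for {
    my @out;
    for (my $count = 1; $count <= 5; $count++) {
        push @out, $count;
    }
    return @out;
}

print "count: $_\n" for count_with_for();
```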
Fancy for() Loops
Any of the three for components are optional. You can even leave them all off to get an infinite loop:
for (;;) {
Any of the components can be a list. This is usually used to initialize several variables at once:
# read until the "end" line or 10 lines, whichever
Loop Control
The next, last, and redo statements allow you to change the flow of control in the loop mid-stream, as it were. You can use these three statements in while loops, until loops, and for and foreach loops, but not in the do-until and do-while variants.
next
The next statement causes the rest of the loop to be skipped and control to pass back to the conditional test at the top. If there's a continue block, it is executed before control returns to the top of the loop.
$done = 0;
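A minimal sketch of next together with a continue block. The sub name odd_numbers_upto is an assumption; the point is that the counter is still incremented when next skips the rest of the body.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the odd numbers from 1 to $limit.
sub odd_numbers_upto {
    my ($limit) = @_;
    my @odd;
    my $n = 1;
    while ($n <= $limit) {
        next if $n % 2 == 0;          # skip the rest of the body for even numbers
        push @odd, $n;
    }
    continue {
        $n++;                         # runs after every pass, including after next
    }
    return @odd;
}

print join(' ', odd_numbers_upto(7)), "\n";
```

Without the continue block, the next statement would skip the increment and the loop would never end.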
last
The last statement causes the loop to terminate prematurely, even if the loop conditional is still true:
while ( $line =
redo
The redo statement is rarely used. It causes flow of control to jump to the top of the loop, like next. However, the continue block, if any, is not executed. In a for loop, the update expression is not executed.
for (my $i=0; $i<10; $i++) {
Nested Loops
If you have two or more nested loops, next, last and redo always apply to the innermost loop. You can change this by explicitly labeling the loop block and referring to the label in the loop control statement:
XLOOP:
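Labelled loop control might look like the sketch below. The search problem and the name first_pair_with_product are invented for illustration; only the XLOOP label comes from the text.

```perl
use strict;
use warnings;

# Hypothetical helper: find the first pair ($x, $y) with $y <= $x <= $max
# whose product equals $target; returns an empty list if there is none.
sub first_pair_with_product {
    my ($target, $max) = @_;
    my @found;
    XLOOP:
    for my $x (1 .. $max) {
        for my $y (1 .. $max) {
            next XLOOP if $y > $x;    # jump to the next $x in the outer loop
            if ($x * $y == $target) {
                @found = ($x, $y);
                last XLOOP;           # leave both loops at once
            }
        }
    }
    return @found;
}

my @pair = first_pair_with_product(12, 6);
print "found: @pair\n" if @pair;
```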
Basic I/O
I/O means input/output, and is necessary to get computer programs to talk to the rest of the world.
The STDIN, STDOUT and STDERR Filehandles
Every Perl script starts out with three connections to the outside world:
STDIN Standard input, used to read input. Initially connected to the keyboard, but can be changed from the shell using redirection (<) or pipe (|).
STDOUT Standard output, used to write data out. Initially connected to the terminal, but can be redirected to a file or other program from the shell using redirection or pipes.
STDERR Standard error, used for diagnostic messages. Initially connected to the terminal, etc.
In addition to these three filehandles, you can create your own.
Reading Data from STDIN
To read a line of data into your program use the angle bracket function:
$line = <STDIN>;
print "Type your name: ";
$name = <STDIN>;
chomp $name;
The read/chomp sequence is often abbreviated as:
chomp($name = <STDIN>);
The Input Loop
At the "end of file" (or when the user presses ^D to end input), the angle-bracket operator returns undef. This leads to the typical input loop:
while ( $line = <STDIN> ) {
chomp $line;
# process $line here
}
The while loop will read one line of text after another. At the end of input, the angle-bracket operator returns undef and the while loop terminates. Remember that even blank lines are TRUE, because they consist of a single newline character.
The Default Input Variable
If you don't assign the result of the angle-bracket operator to a scalar variable, it will default to the special scalar variable $_. This scalar is the default for a number of other functions, including chomp and the regular expression match.
This example prepends the line number to its input.
#!/usr/local/bin/perl
(~) 50% add_line_numbers.pl
0: Gabor Marth gmarth@watson.wustl.edu
1: Genome Sequencing Center
2: Washington University School of Medicine
3: 4444 Forest Park Blvd.
4: St. Louis, MO 63108
5: 314 286-1839
6: 314 286-1810 (fax)
7: Dates: Oct 17-23
8:
9: Sean Eddy eddy@genetics.wustl.edu
10: Assistant professor
11: Department of Genetics
12: Washington University School of Medicine
13: 660 S. Euclid Ave.
14: St. Louis, Mo. 63110
15: 314 362-7666
16: 314 362-7855 (fax)
17: Dates: Oct 20-22
18:
19: Warren Gish gish@sapiens.wustl.edu
...
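Output like the above could come from a loop such as this one. It is a sketch, not the original listing: the sub name number_lines is invented, and the counter starts at 0 only because the sample output does.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out, prefixing each with a
# running number. Returns the number of lines copied.
sub number_lines {
    my ($in, $out) = @_;
    my $line_number = 0;
    local $_;                         # keep the caller's $_ intact
    while (<$in>) {                   # each line lands in $_
        print {$out} $line_number++, ": $_";
    }
    return $line_number;
}

my $text = "first line\nsecond line\n";
open my $in, '<', \$text or die "Can't open in-memory handle: $!";
number_lines($in, \*STDOUT);
```

In a real script the input handle would typically be STDIN or the files named on the command line.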
Assigning to an Array
Normally you assign the angle-bracket function to a scalar variable, getting a line of input. What if you assign to an array? You get all the lines from the input file or terminal, one per array element!!!
It is convenient to pass this array to chomp, which will remove the newline from each member of the array.
@lines = <STDIN>;
chomp(@lines);
Or you can do both things in one elegant operation:
chomp(@lines = <STDIN>);
Output
The print function writes data to output. In its full form, it takes a filehandle as its first argument, followed by a list of scalars to print:
print FILEHANDLE $data1,$data2,$data3,...
Notice there is no comma between FILEHANDLE and the data arguments. If FILEHANDLE is omitted it defaults to STDOUT (this can be changed). So these are equivalent:
print STDOUT "Hello world\n";
print "Hello world\n";
To print to standard error:
print STDERR "Does not compute.\n";
Fundamental Techniques in Handling People
1. Don't criticize, condemn or complain.
2. Give honest and sincere appreciation.
3. Arouse in the other person an eager want.
Six ways to make people like you
1. Become genuinely interested in other people.
2. Smile.
3. Remember that a person's name is to that person the sweetest and most important sound in any language.
4. Be a good listener. Encourage others to talk about themselves.
5. Talk in terms of the other person's interests.
6. Make the other person feel important - and do it sincerely.
Win people to your way of thinking
1. The only way to get the best of an argument is to avoid it.
2. Show respect for the other person's opinions. Never say, "You're wrong."
3. If you are wrong, admit it quickly and emphatically.
4. Begin in a friendly way.
5. Get the other person saying "yes, yes" immediately.
6. Let the other person do a great deal of the talking.
7. Let the other person feel that the idea is his or hers.
8. Try honestly to see things from the other person's point of view.
9. Be sympathetic with the other person's ideas and desires.
10. Appeal to the nobler motives.
11. Dramatize your ideas.
12. Throw down a challenge.
Be a Leader: How to Change People Without Giving Offense or Arousing Resentment
A leader's job often includes changing your people's attitudes and behavior. Some suggestions to accomplish this:
1. Begin with praise and honest appreciation.
2. Call attention to people's mistakes indirectly.
3. Talk about your own mistakes before criticizing the other person.
4. Ask questions instead of giving direct orders.
5. Let the other person save face.
6. Praise the slightest improvement and praise every improvement. Be "hearty in your approbation and lavish in your praise."
7. Give the other person a fine reputation to live up to.
8. Use encouragement. Make the fault seem easy to correct.
9. Make the other person happy about doing the thing you suggest.
Tuesday, August 18, 2009
Creating shared and static libraries
How to create a shared and a static library with gcc.
The code for the library you want to build
This is the code that goes into the library. It is a single function that takes two doubles, calculates their average, and returns it.
calc_avg.c
//#include
double avg(double a, double b)
{
return (a+b) / 2;
}
The header file
Of course, we need a header file.
calc_avg.h
/* It is a very basic example, so there are no other prototypes or declarations. */
double avg(double, double);
Build the static library
A static library is basically a set of object files that were copied into a single file. This single file is the static library. The static library is created with the archiver (ar).
First, calc_avg.c is turned into an object file:
gcc -c calc_avg.c -o calc_avg.o
Then, the archiver (ar) is invoked to produce a static library (named libavg.a) out of the object file calc_avg.o.
ar rcs libavg.a calc_avg.o
Note: the library must start with the three letters lib and have the suffix .a.
Creating the shared library
As with static libraries, an object file is created. The -fPIC option tells gcc to create position independent code, which is necessary for shared libraries. Note also that the object file created for the static library will be overwritten. That's not bad, however, because we have a static library that already contains the needed object file.
gcc -c -fPIC calc_avg.c -o calc_avg.o
On some targets gcc may complain:
cc1: warning: -fPIC ignored for target (all code is position independent)
It looks like -fPIC is not necessary on x86, but according to the manuals it is needed, so I use it too.
Now the shared library is created using the command below:
gcc -shared -Wl,-soname,libavg.so.1 -o libavg.so.1.0.1 calc_avg.o
Note: the library name must start with the three letters lib and contain the suffix .so (here followed by the version number 1.0.1).
The program using the library
This is the program that uses the calc_avg library. We will link it once against the static library and once against the shared library.
main.c
#include <stdio.h>
#include "calc_avg.h"
int main(int argc, char* argv[]) {
double value1, value2, result;
value1 = 3;
value2 = 12;
result = avg(value1, value2);
printf("The avg of %3.2f and %3.2f is %3.2f\n", value1, value2, result);
return 0;
}
Linking against static library
gcc -static main.c -L. -lavg -o statically_linked
Note: neither the first three letters (lib) nor the suffix (.a) must be specified.
Linking against shared library
gcc main.c -o dynamically_linked -L. -lavg
Note: the first three letters (the lib) and the suffix (.so) must not be specified.
Executing the dynamically linked program
export LD_LIBRARY_PATH=.
./dynamically_linked
Thursday, August 6, 2009
encrypt and decrypt your file
You can encrypt a file with a four-digit key and retrieve it whenever you want with the same key.
If you are a programmer you can break this key within a minute (as I said, it gives only minimal security).
Let's see the code first; how to use it is shown after the listing.
#include <stdio.h>
#include <stdio_ext.h> /* __fpurge (glibc) */
#include <string.h>
#include <fcntl.h>
#include <unistd.h> /* read, write, lseek, close */
#include <stdlib.h> /* malloc, free, exit */
#include <termios.h>
#define MAX_CHAR 10
void create_write_to_file(char *file_name, char *out_file);
void manuply_pin(char *pin1,char *pin);
char take_input(char *file_name,int *fd);
void read_pin(char *pin1);
typedef struct string_l
{
char ten_chars[MAX_CHAR];
struct string_l *pnext;
}string_link_t;
string_link_t *phead;
int main()
{
char pin1[5]={0}, /* 4 digits plus the terminating '\0' that strlen() needs */
pin[10]={0},
i=0,
len=10,
enc_or_dec=-1,
flag = 1;
char file_name[31],
out_file[33],
temp_string[11]={0};
int fd;
string_link_t *ptemp, *pmove;
read_pin(pin1);
manuply_pin(pin1,pin);
enc_or_dec=take_input(file_name,&fd);
lseek(fd,0,SEEK_SET);
flag = 1 ;
len = 10;
while( len == 10 && flag == 1)
{
len = read(fd,temp_string,MAX_CHAR);
if(len <= 0)
{
break;
}
else if(len < 10)
{
flag = 0;
}
ptemp = (string_link_t *)malloc(sizeof(string_link_t));
if(!ptemp)
{
printf("CRITICAL ERROR CAN'T PROCEED \n");
exit(1);
}
ptemp->pnext = NULL;
if(!phead)
{
phead = ptemp;
pmove = ptemp;
}
else
{
pmove->pnext = ptemp;
pmove = ptemp;
}
memcpy(ptemp->ten_chars,temp_string,10);
for(i=0; i<10; i++)
{
if(i < len)
{
if(enc_or_dec == '1')
{
ptemp->ten_chars[i] = ptemp->ten_chars[i] + pin[i];
}
else if(enc_or_dec == '2')
{
ptemp->ten_chars[i] = ptemp->ten_chars[i] - pin[i];
}
}
else
{
if(enc_or_dec == '1')
{
ptemp->ten_chars[i] = ' ' + pin[i];
}
else if(enc_or_dec == '2')
{
ptemp->ten_chars[i] = ' ' - pin[i];
}
flag = 0;
}
}
}
create_write_to_file(file_name, out_file);
close(fd);
return 0;
}
char take_input(char *file_name,int *fd)
{
char enc_or_dec=-1;
do
{
printf("enter 1 for encryption 2 for decryption\n");
__fpurge(stdin);
scanf("%c",&enc_or_dec);
}
while(!(enc_or_dec == '1'|| enc_or_dec == '2'));
printf("enter file name(Max 30 char): \n");
scanf("%30s",file_name);
*fd=open(file_name,O_RDONLY);
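if(*fd < 0)
{
/* assumed error handling: the original text of this check was lost */
printf("can't open %s\n",file_name);
exit(1);
}
return enc_or_dec;
}
void read_password(char *pin1);
void read_pin(char *pin1)
{
int len=0, i=0;
enter_pin:
/* prompt and call order are assumed: this part of the original was lost */
printf("enter 4 digit pin no:");
fflush(stdout);
memset(pin1,0,4); /* forget any earlier attempt */
read_password(pin1);
printf("\n");
len=(int)strlen(pin1);
i=0;
while(i < len)
{
if(pin1[i] < '0' || pin1[i] > '9')
{
printf("pin contains illegal characters\n try again \n");
goto enter_pin;
}
i++;
}
}
void read_password(char *pin1)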
{
struct termios cuset,newset;
char ch;
char ii=0;
/*Disable echo and canonical mode of processing***/
tcgetattr(0,&cuset);
newset = cuset;
//newset.c_lflag &= ~ICANON;
newset.c_lflag &= ~ECHO;
tcsetattr(0,TCSANOW,&newset);
//setbuf(stdin,NULL);
//setbuf(stdout,NULL);
__fpurge(stdin);
while( ( ch=getchar() ) != '\n' )
{
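if(ii<4)
{
pin1[ii] = ch;
ii++;
}
}
/* restore the original terminal settings (echo back on) */
tcsetattr(0,TCSANOW,&cuset);
}
void create_write_to_file(char *file_name, char *out_file)
{
int fd1=-1;
char flag = 1;
string_link_t *ptemp, *pfree;
/* the prompt, the retry loop and the flags after O_WRONLY are assumed:
that part of the original was lost */
do
{
printf("output file name\n");
scanf("%32s",out_file);
fd1 = open(out_file,O_WRONLY|O_CREAT|O_TRUNC,0644);
if(fd1 < 0)
{
printf("can't create %s\n try again \n",out_file);
}
else
{
flag = 0;
}
}
while(flag == 1);
printf("output file is %s\n",out_file);
ptemp = phead;
while(ptemp)
{
write(fd1,ptemp->ten_chars,MAX_CHAR);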
pfree = ptemp;
ptemp = ptemp->pnext;
free(pfree);
}
close(fd1);
}
void manuply_pin(char *pin1,char *pin)
{
pin[0] = pin1[0] + pin1[1] + pin1[2] + pin1[3];
pin[1] = pin1[0] + pin1[1] + pin1[2] - pin1[3];
pin[2] = pin1[0] + pin1[1] - pin1[2] + pin1[3];
pin[3] = pin1[0] + pin1[1] - pin1[2] - pin1[3];
pin[4] = -pin1[0] + pin1[1] + pin1[2] - pin1[3];
pin[5] = -pin1[0] - pin1[1] + pin1[2] + pin1[3];
pin[6] = -pin1[0] - pin1[1] - pin1[2] + pin1[3];
pin[7] = -pin1[0] - pin1[1] - pin1[2] + pin1[3];
pin[8] = pin1[0] - pin1[1] - pin1[2] - pin1[3];
pin[9] = -pin1[0] - pin1[1] + pin1[2] - pin1[3];
}
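To make it clearer what the program does to each byte, here is a small in-memory sketch of the same idea: build the 10-byte key from the PIN, then add it to the data to encrypt or subtract it to decrypt. The names derive_key, apply_key and SIGN are hypothetical helpers of mine, not part of the program above. The sketch uses unsigned char so that overflow wraps modulo 256 in a well-defined way (on common compilers this should give the same bytes as the char arithmetic above), and it leaves out the space padding of the last 10-byte chunk.

```c
#include <assert.h>
#include <stddef.h>

#define KEY_LEN 10

/* Signs applied to the four PIN digits for each key byte,
   copied row by row from manuply_pin(). */
static const int SIGN[KEY_LEN][4] = {
    {+1, +1, +1, +1}, {+1, +1, +1, -1}, {+1, +1, -1, +1}, {+1, +1, -1, -1},
    {-1, +1, +1, -1}, {-1, -1, +1, +1}, {-1, -1, -1, +1}, {-1, -1, -1, +1},
    {+1, -1, -1, -1}, {-1, -1, +1, -1}
};

/* Hypothetical helper: build the 10-byte key from a 4-digit PIN. */
void derive_key(const char *pin, unsigned char key[KEY_LEN])
{
    assert(pin != NULL);
    for (int i = 0; i < KEY_LEN; i++) {
        int sum = 0;
        for (int j = 0; j < 4; j++)
            sum += SIGN[i][j] * (unsigned char)pin[j];
        key[i] = (unsigned char)sum;          /* reduced modulo 256 */
    }
}

/* Hypothetical helper: direction > 0 adds the key (encrypt), anything
   else subtracts it (decrypt). Byte i uses key[i % 10], which is what
   processing the file in 10-byte chunks amounts to. */
void apply_key(unsigned char *buf, size_t n,
               const unsigned char key[KEY_LEN], int direction)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int k = key[i % KEY_LEN];
        buf[i] = (unsigned char)(direction > 0 ? buf[i] + k : buf[i] - k);
    }
}
```

Calling derive_key once and then apply_key with +1 followed by apply_key with -1 on the same buffer should give back the original bytes, which is all the decryption step of the program relies on.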
Save this file under some name, say enc_dec.c
$ gcc -o enc_dec enc_dec.c (on Linux)
You will get the enc_dec executable.
Now suppose you have a file secret.txt.
Now run:
./enc_dec
enter 4 digit pin no:****
enter 1 for encryption 2 for decryption
1
enter file name(Max 30 char):
secret.txt
output file name
secret.enq
output file is secret.enq
You can now encrypt secret.enq again with a different key.
To reverse it:
./enc_dec
enter 4 digit pin no:**** (enter same pin)
enter 1 for encryption 2 for decryption
2
enter file name(Max 30 char):
secret.enq
output file name
secret.txt
output file is secret.txt
If you applied several keys one after another, undo them following the stack rule:
the key used first is removed last, and the key used last is removed first.
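The stack rule can be sketched in memory as two loops over a list of keys: encryption walks the list forwards, decryption walks it backwards. This is only an illustration under my own assumptions. The names apply_key, encrypt_layers and decrypt_layers are hypothetical, and the keys are passed in as ready-made 10-byte arrays rather than derived from PINs.

```c
#include <assert.h>
#include <stddef.h>

#define KEY_LEN 10

/* Hypothetical helper: direction > 0 adds the key, anything else subtracts it.
   Arithmetic is done modulo 256 on unsigned bytes. */
void apply_key(unsigned char *buf, size_t n,
               const unsigned char key[KEY_LEN], int direction)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int k = key[i % KEY_LEN];
        buf[i] = (unsigned char)(direction > 0 ? buf[i] + k : buf[i] - k);
    }
}

/* Encrypt with keys[0], keys[1], ... in that order (push onto the stack). */
void encrypt_layers(unsigned char *buf, size_t n,
                    const unsigned char keys[][KEY_LEN], size_t count)
{
    for (size_t k = 0; k < count; k++)
        apply_key(buf, n, keys[k], +1);
}

/* Decrypt by walking the same list backwards: last key first (pop). */
void decrypt_layers(unsigned char *buf, size_t n,
                    const unsigned char keys[][KEY_LEN], size_t count)
{
    for (size_t k = count; k-- > 0; )
        apply_key(buf, n, keys[k], -1);
}
```

Calling encrypt_layers and then decrypt_layers with the same key list should restore the buffer. In this particular scheme every byte only has key bytes added or subtracted, so as far as I can tell removing the keys in another order would give the same result; reverse order is simply the habit that stays correct for schemes where the order does matter.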