Category: perl

Syntactic differences between Perl and Python programs

If you know Perl, learning Python is not that hard. But it is easy to confuse Perl syntax with Python syntax and get into trouble while writing Python code. So here I am listing the differences that I know of, and I will keep adding more as I learn them.

Variable names

In Perl a variable name carries a sigil, such as $my_variable or @my_variable, but in Python we just write the name and nothing else is needed.

Regex Substitution

In Perl we write $in =~ s/to/too/g; In Python it is text = re.sub(r'to', 'too', text). (The name in is a reserved word in Python, so the variable is called text here.) Note the difference? In Perl the substitution happens in the same variable; in Python re.sub returns a new string and we must explicitly say which variable stores the result. To replicate the Perl code we have saved it back into the same variable.
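A minimal sketch of the Python side of this substitution. The variable name text and the sample sentence are made up for illustration.

```python
import re

# Sample input; the sentence is only an illustration.
text = "we go to school to learn"

# re.sub returns a new string and leaves the original untouched,
# so the result is assigned back to mimic Perl's in-place s///g.
text = re.sub(r"to", "too", text)

print(text)
```

Like the Perl version, this replaces every occurrence of the pattern, including ones inside longer words.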

Modules

In Perl we import a module with use Module::Name; in Python we write import modulename, for example import re. To install modules, Perl has cpan and Python has pip.

Hash/Dictionary usage

In Perl, to assign a value to a key we write $hash{$key} = $value; in Python we write hash[key] = value.
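A short sketch of dictionary assignment in Python. The dictionary name ages and its contents are sample data, not taken from any real program.

```python
# A dictionary literal; keys and values are made-up sample data.
ages = {"alice": 30}

# Assign a value to a key (Perl: $hash{$key} = $value).
key, value = "bob", 25
ages[key] = value

# Assigning to an existing key overwrites its old value.
ages["alice"] = 31

print(ages)
```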

Code Indentation

Perhaps this is the biggest difference between Python and many other major programming languages. In Perl we use curly braces to enclose the block of code that belongs to a function, loop, or if-else; in Python we indent the block instead. Also, no semicolon is needed to mark the end of a statement; a newline is enough. This makes the code very readable.
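As a small illustration, here is a function with a loop and an if-else where indentation alone marks each block. The function name classify is hypothetical.

```python
def classify(numbers):
    """Label each number as even or odd."""
    labels = []
    for n in numbers:          # the loop body is one indent level deeper
        if n % 2 == 0:         # the if body is deeper again
            labels.append("even")
        else:
            labels.append("odd")
    return labels              # back at function level: the loop has ended

print(classify([1, 2, 3]))
```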

A simple Perl script to make a file's content unique

This simple Perl script does the job of sort -u (of course, not on Windows).
 
Script:
 

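#!/usr/bin/perl
use strict;
use warnings;

# Print UTF-8 text without "Wide character" warnings
binmode(STDOUT, ":utf8");

# Read the whole file into an array of lines
open(my $in, "<:utf8", $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
my @in = <$in>;
close($in);

# Declare the hash
my %hash = ();

# Store each line as a hash key; duplicate lines collapse into one key
foreach my $line (@in) {
    chomp($line);
    $hash{$line} = 1;
}

# Finally, print the keys in sorted order
foreach my $key (sort keys %hash) {
    print "$key\n";
}

How to run: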
 
Let's assume that our script is saved as my_unique.pl

$ perl my_unique.pl <filename>

Replace <filename> with the name of your file, without the angle brackets.
 
Explanation:
In a Perl hash, assigning to a key that already exists simply overwrites it, so only a single instance of each line in the file ends up in the hash. Remember that each line is stored as a key, not as a value; that is what removes the duplicates.
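For comparison, the same idea can be sketched in Python, where a set plays the role of the hash keys. The function names unique_sorted_lines and unique_file are hypothetical, and this is only one possible way to write it.

```python
def unique_sorted_lines(lines):
    """Return the distinct lines in sorted order."""
    # A set keeps a single copy of each value, like keys in a Perl hash.
    seen = {line.rstrip("\n") for line in lines}
    return sorted(seen)


def unique_file(path):
    """Hypothetical helper: read a UTF-8 file and return its unique lines."""
    with open(path, encoding="utf-8") as handle:
        return unique_sorted_lines(handle)


# In-memory demonstration with made-up lines.
for line in unique_sorted_lines(["b\n", "a\n", "b\n"]):
    print(line)
```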
 
Note: Always use strict and use warnings when writing Perl code; they catch many common mistakes early.

Linux: How to count duplicate lines in a file from the terminal

Linux has many commands that are useful for processing and analyzing files. In this post I will explain a simple command pipeline that prints the number of times each line is repeated in a file.

So here is the command:

terminal$ sort yourfilename.txt | uniq -c

Here yourfilename.txt is just an example name; any file will do.
Suppose the contents of yourfilename.txt are:

line1
line1
line2
line3
line1
line3

Output:

3 line1
1 line2
2 line3

Explanation:

The sort command is self-explanatory here; its output is piped to uniq. The uniq command only merges adjacent duplicate lines, so its input must be sorted first (easy to forget, so keep it in mind). uniq -c prefixes each distinct line with the number of times it occurs.
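The same counting can be sketched in Python with collections.Counter. The function name count_lines is hypothetical, and the sample list mirrors the example file above.

```python
from collections import Counter


def count_lines(lines):
    """Return (count, line) pairs ordered by line, like sort | uniq -c."""
    counts = Counter(line.rstrip("\n") for line in lines)
    # Counter does not need sorted input; sorting here only fixes the output order.
    return [(counts[line], line) for line in sorted(counts)]


sample = ["line1", "line1", "line2", "line3", "line1", "line3"]
for count, line in count_lines(sample):
    print(count, line)
```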