Wednesday, December 15, 2010

Why should I use mod_proxy_ajp rather than a classic mod_proxy ?

mod_proxy_ajp is an Apache module which can be used to forward a client HTTP request to an internal Tomcat application server using the AJP protocol.

To respond to the question "Why should I use mod_proxy_ajp rather than a classic mod_proxy ?", here is a small recap:

  • You gain a lot of flexibility (most Apache modules/features can be used, especially name-based virtual hosting)
  • Practical for those who need to serve Java applications alongside PHP / Perl … (only one Apache server is needed)
  • Certificate management is easier in the Apache configuration (this argument is rather subjective)
  • Serving static HTTP resources is not Tomcat's main objective (it is not optimized for that)
  • Load balancing/cluster management is easier with an apache frontend
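As an illustration, a minimal virtual-host sketch (the host name, context path and port 8009 are assumptions; adjust them to your Tomcat AJP connector):

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so

<VirtualHost *:80>
    ServerName www.example.com
    ProxyPass        /app ajp://localhost:8009/app
    ProxyPassReverse /app ajp://localhost:8009/app
</VirtualHost>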

mod_proxy Vs mod_jk

mod_proxy
Pros:

  • No need for a separate module compilation and maintenance: mod_proxy, mod_proxy_http, mod_proxy_ajp and mod_proxy_balancer come as part of the standard Apache 2.2+ distribution
  • Ability to use the HTTP, HTTPS or AJP protocols, even within the same balancer (see the sketch after the cons list below).
Cons:
  • mod_proxy_ajp does not support AJP packet sizes larger than 8K.
  • Basic load balancer
  • Does not support Domain model clustering
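
Here is the balancer sketch mentioned above, mixing an AJP member with an HTTP member (the node host names are hypothetical; mod_proxy, mod_proxy_ajp, mod_proxy_http and mod_proxy_balancer must be loaded):

<Proxy balancer://tomcatcluster>
    BalancerMember ajp://node1.example.com:8009
    BalancerMember http://node2.example.com:8080
</Proxy>
ProxyPass /app balancer://tomcatcluster/app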


mod_jk

Pros:
  • Advanced load balancer
  • Advanced node failure detection
  • Support for large AJP packet sizes
Cons:
  • Need to build and maintain a separate module

Monday, December 6, 2010

How to install src.rpm

src.rpms are SOURCE rpms.

If you just want to install a pre-compiled binary, then you must not use a src.rpm; use the regular i386.rpm, or whatever.rpm, instead.

If you want to use the SOURCE rpm to do a compilation yourself, then

rpm -ivh archive.src.rpm

Then you have to cd /usr/src/redhat

The source code archive is in /usr/src/redhat/SOURCES, in the form of a tar.gz archive. If you wish you can unpack that and manually do the installation to wherever you want.

If you just want to initiate the "automated" process, then you would

cd /usr/src/redhat/SPECS

and do

rpmbuild -bb package.spec

This will then initiate an automagic unpacking of the archive, compilation, and building of an rpm all ready for you to install, with the usual

rpm -ivh package.rpm

The built rpm is put into the appropriate /usr/src/redhat/RPMS/ subdirectory, as far as I recall.
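
Putting the steps together, a hedged end-to-end example with a hypothetical package named foo (the architecture subdirectory will vary):

rpm -ivh foo-1.0-1.src.rpm
cd /usr/src/redhat/SPECS
rpmbuild -bb foo.spec
rpm -ivh /usr/src/redhat/RPMS/i386/foo-1.0-1.i386.rpm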

man rpmbuild

for all the gory details.

Wednesday, December 1, 2010

Manipulating the Query String in Apache Rewrite

The query string is the part of the URL that follows the question mark (?). It is often used to pass parameters to CGI scripts or other dynamic pages. It is typically available in the QUERY_STRING environment variable.

The typical URL-manipulation directives such as Redirect, Alias, and RewriteRule cannot directly access the query string. But mod_rewrite can be used to add, remove, or modify the query string. The trick is to use a RewriteCond to match against the %{QUERY_STRING} variable and, if necessary, the [QSA] flag to append to an existing query string.

Some examples follow. These examples all assume that they are placed in the main server configuration file. If they are placed in a <Directory> section or .htaccess file, the RewriteRule will need to be modified accordingly. Also, these examples can all be transformed from internal aliases into external redirects by adding the [R] flag to the RewriteRule.

Be cautious when dealing with complex query strings, since the order of the variables is often arbitrary.

Access control by Query String

Deny access to http://example.com/page?var=val if var=val contains the string foo.

RewriteCond %{QUERY_STRING} foo
RewriteRule ^/page - [F]

Removing the Query String

Delete the query string entirely.

RewriteRule ^/page /page?

Adding to the Query String

Keep the existing query string using the Query String Append flag, but add var=val to the end.

RewriteRule ^/page /page?var=val [QSA]

Rewriting For Certain Query Strings

Rewrite URLs like http://example.com/page1?var=val to http://example.com/page2?var=val but don't rewrite if val isn't present.

RewriteCond %{QUERY_STRING} val
RewriteRule ^/page1 /page2

Note that you don't need to use the Query String Append flag if you won't modify the query string in the RewriteRule; it is left as-is in the URL by default.

Modifying the Query String

Change any single instance of val in the query string to other_val when accessing /path. Note that %1 and %2 are back-references to the matched part of the regular expression in the previous RewriteCond.

RewriteCond %{QUERY_STRING} ^(.*)val(.*)$
RewriteRule /path /path?%1other_val%2

Making the Query String Part of the Path

Take a URL of the form http://example.com/path?var=val and transform it into http://example.com/path/var/val. Note that this particular example will work only for a single var=val pair containing only letters, numbers, and the underscore character.

RewriteCond %{QUERY_STRING} ^(\w+)=(\w+)$
RewriteRule ^/path /path/%1/%2?

Making the Path Part of the Query String

Essentially the reverse of the above recipe. This example, however, will work for any valid three-level URL: http://example.com/path/var/val will be transformed into http://example.com/path?var=val.

RewriteRule ^/path/([^/]+)/([^/]+) /path?$1=$2
See also RewritePathInfo for more examples of this technique.

Tuesday, November 23, 2010

How to password protect the single user mode in Linux

One very compromising situation for a Linux box with slack physical security (easy access to the machine for anyone) is a malicious user booting into an unprotected single user mode and changing your root password. This can be prevented by making your Linux machine ask for the root password even when the system is booted into single user mode. The tip below lets you achieve this goal.
How to implement this Tip?

1. From your Linux machine access a terminal window and open /etc/inittab file for edit.

2. In this file add the below given line just before the id:X:initdefault: entry
su:S:wait:/sbin/sulogin

3. Save the /etc/inittab file.

4. Now from next time onwards you will be prompted to provide the root password before accessing the single user mode.
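
If you prefer to script the edit, here is a minimal sketch assuming GNU sed and a standard id:X:initdefault: entry:

# insert the sulogin line just before the default runlevel entry
sed -i '/^id:.*:initdefault:/i su:S:wait:/sbin/sulogin' /etc/inittab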

How to free Linux Kernel page cache and/or inode and dentry caches

Kernels 2.6.16 and newer provide a mechanism to have the kernel drop the page cache and/or inode and dentry caches on command, which can help free up a lot of memory.

Writing to this will cause the kernel to drop clean caches, dentries and inodes from memory, causing that memory to become free.

To free pagecache:
echo 1 > /proc/sys/vm/drop_caches

To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches

To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches

As this is a non-destructive operation, and dirty objects are not free-able, the user should run "sync" first in order to make sure all cached objects are freed.
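
Putting the note into practice, run sync and then drop everything in one go:

# flush dirty pages to disk first, then drop pagecache, dentries and inodes
sync; echo 3 > /proc/sys/vm/drop_caches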

Linux bond or team multiple network interfaces into single

Today I finally implemented NIC bonding (binding both NICs so that they work as a single device). My idea is to improve performance by pumping out more data from both NICs without using any other method.

Linux allows binding multiple network interfaces into a single channel/NIC using special kernel module called bonding. "The Linux bonding driver provides a method for aggregating multiple network interfaces into a single logical "bonded" interface. The behavior of the bonded interfaces depends upon the mode; generally speaking, modes provide either hot standby or load balancing services. Additionally, link integrity monitoring may be performed."

Note:-What is bonding?

Bonding allows you to aggregate multiple ports into a single group, effectively combining the bandwidth into a single connection. Bonding also allows you to create multi-gigabit pipes to transport traffic through the highest traffic areas of your network. For example, you can aggregate three 1-megabit ports into a single 3-megabit trunk port. That is equivalent to having one interface with a speed of three megabits.

Setting up bonding is easy with RHEL v5.0 and above.

Step #1:

Create a bond0 configuration file

Red Hat Linux stores network configuration in /etc/sysconfig/network-scripts/ directory. First, you need to create bond0 config file:

Code:

# vi /etc/sysconfig/network-scripts/ifcfg-bond0

Append following lines to it:

DEVICE=bond0

IPADDR=192.168.1.59

NETWORK=192.168.1.0

NETMASK=255.255.255.0

USERCTL=no

BOOTPROTO=none

ONBOOT=yes

Note: Replace the above IP address with your actual IP address. Save the file and exit to the shell prompt.

Step #2:

Modify eth0 and eth1 config files:

Open both configuration files using the vi text editor and make sure the file reads as follows for the eth0 interface:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0

Modify/append directive as follows:

DEVICE=eth0

USERCTL=no

ONBOOT=yes

MASTER=bond0

SLAVE=yes

BOOTPROTO=none

Open eth1 configuration file using vi text editor:

# vi /etc/sysconfig/network-scripts/ifcfg-eth1

Make sure file read as follows for eth1 interface:

DEVICE=eth1

USERCTL=no

ONBOOT=yes

MASTER=bond0

SLAVE=yes

BOOTPROTO=none

Save file and exit to shell prompt

Step # 3:

Load bond driver/module

Make sure bonding module is loaded when the channel-bonding interface (bond0) is brought up. You need to modify kernel modules configuration file:

# vi /etc/modprobe.conf

Append following two lines:

alias bond0 bonding

options bond0 mode=balance-alb miimon=100

Note: Save the file and exit to the shell prompt. You can learn more about all the bonding options at the end of this document.

Step # 4:

Test configuration

First, load the bonding module:

# modprobe bonding

Restart networking service in order to bring up bond0 interface:

# service network restart

Verify everything is working:

# less /proc/net/bonding/bond0

Output:

Bonding Mode: load balancing (round-robin)

MII Status: up

MII Polling Interval (ms): 0

Up Delay (ms): 0

Down Delay (ms): 0

Slave Interface: eth0

MII Status: up

Link Failure Count: 0

Permanent HW addr: 00:0c:29:XX:XX:X1

Slave Interface: eth1

MII Status: up

Link Failure Count: 0

Permanent HW addr: 00:0c:29:XX:XX:X2

List all interfaces:

# ifconfig

Output:

bond0 Link encap:Ethernet HWaddr 00:0C:29:XX:XX:XX

inet addr:192.168.1.59 Bcast:192.168.1.255 Mask:255.255.255.0

inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link

UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1

RX packets:2804 errors:0 dropped:0 overruns:0 frame:0

TX packets:1879 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:0

RX bytes:250825 (244.9 KiB) TX bytes:244683 (238.9 KiB)

eth0 Link encap:Ethernet HWaddr 00:0C:29:XX:XX:XX

inet addr:192.168.1.59 Bcast:192.168.1.255 Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fec6:be59/64 Scope:Link

UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1

RX packets:2809 errors:0 dropped:0 overruns:0 frame:0

TX packets:1390 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:251161 (245.2 KiB) TX bytes:180289 (176.0 KiB)

Interrupt:11 Base address:0x1400

eth1 Link encap:Ethernet HWaddr 00:0C:29:XX:XX:XX

inet addr:192.168.1.59 Bcast:192.168.1.255 Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fec6:be59/64 Scope:Link

UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1

RX packets:4 errors:0 dropped:0 overruns:0 frame:0

TX packets:502 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:258 (258.0 b) TX bytes:66516 (64.9 KiB)

Interrupt:10 Base address:0x1480

Note:-If the administration tools of your distribution do not support master/slave

notation in configuration of network interfaces, you will need to configure

the bonding device with the following commands manually:

# /sbin/ifconfig bond0 192.168.1.59 up

# /sbin/ifenslave bond0 eth0

# /sbin/ifenslave bond0 eth1

Que: What are the other MODE options in the modprobe.conf file?

Ans: You can set up your bond interface according to your needs. By changing one parameter (mode=X) you can have the following bonding types:

mode=0 (balance-rr)

Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.

mode=1 (active-backup)

Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. This mode provides fault tolerance. The primary option affects the behavior of this mode.

mode=2 (balance-xor)

XOR policy: Transmit based on [(source MAC address XOR'd with destination MAC address) modulo slave count]. This selects the same slave for each destination MAC address. This mode provides load balancing and fault tolerance.

mode=3 (broadcast)

Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.

mode=4 (802.3ad)

IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.

mode=5 (balance-tlb)

Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.

mode=6 (balance-alb)

Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPV4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.

PS

1) Displaying top 10 CPU-consuming processes (column 3 of ps aux is %CPU):

ps aux | head -1; ps aux | sort -nrk 3 | head -10

2) Displaying top 10 memory-consuming processes (column 4 of ps aux is %MEM):

ps aux | head -1; ps aux | sort -nrk 4 | head -10

3) Displaying process in order of being penalized:

ps -eakl | head -1; ps -eakl | sort -rn

4) Displaying process in order of priority:

ps -eakl | sort -n | head

5) Displaying process in order of nice value

ps -eakl | sort -n

6) Displaying the process in order of time

ps vx | head -1;ps vx | grep -v PID | sort -rn | head -10

7) Displaying the process in order of real memory use

ps vx | head -1; ps vx | grep -v PID | sort -rn | head -10

8) Displaying the process in order of I/O

ps vx | head -1; ps vx | grep -v PID | sort -rn | head -10

9) Displaying WLM classes

ps -a -o pid,user,class,pcpu,pmem,args

10) Determining process ID of wait processes:

ps vg | head -1; ps vg | grep -w wait

11) Wait process bound to CPU

ps -mo THREAD -p

How to List perl modules installed on my system?

List installed perl modules
To display the list enter the following command:
$ instmodsh
Output:

Available commands are:
l - List all installed modules
m - Select a module
q - Quit the program
cmd?

At the cmd? prompt type l to list all installed modules:
cmd? l
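
If you prefer a non-interactive listing, the same information can be pulled with a one-liner that uses the ExtUtils::Installed module (which instmodsh is built on); a minimal sketch:

$ perl -MExtUtils::Installed -e 'print "$_\n" for ExtUtils::Installed->new->modules'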

Monday, November 8, 2010

Sed Grouping and BackReference - PART-2

Example :

echo "[asd] [qwe] [zxc]"
[asd] [qwe] [zxc]

echo "[asd] [qwe] [zxc]" | sed -e "s/\[\(\(.\)*\)\]/\<\1\>/g"


echo "[asd] [qwe] [zxc]" | sed -e "s/\[\([^[]*\)\]/\<\1\>/g"


part in red color makes the difference. :P

Sed Grouping and BackReference - PART-1

Grouping can be used in sed just like in a normal regular expression. A group is opened with “\(” and closed with “\)”. Grouping can be used in combination with back-referencing.

Back-reference is the re-use of a part of a Regular Expression selected by grouping. Back-references in sed can be used in both a Regular Expression and in the replacement part of the substitute command.

Example 1: Get only the first path in each line

$ sed 's/\(\/[^:]*\).*/\1/g' path.txt
/usr/kbos/bin
/usr/local/sbin
/opt/omni/lbin

In the above example, \(\/[^:]*\) matches the path that appears before the first ":". \1 in the replacement refers to that first matched group.

Example 2: Multigrouping

In the file path.txt change the order of field in the last line of the file.

$ sed '$s@\([^:]*\):\([^:]*\):\([^:]*\)@\3:\2:\1@g' path.txt
/usr/kbos/bin:/usr/local/bin:/usr/jbin:/usr/bin:/usr/sas/bin
/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/omni/bin:
/root/bin:/opt/omni/sbin:/opt/omni/lbin

In the above command, $ restricts the substitution to the last line. The output shows that the order of the path values in the last line has been reversed.

Example 3: Get the list of usernames in /etc/passwd file

This sed example displays only the first field from the /etc/passwd file.

$ sed 's/\([^:]*\).*/\1/' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown

Example 4: Parenthesize first character of each word

This sed example prints the first character of every word in parentheses.

$ echo "Welcome To The Geek Stuff" | sed 's/\(\b[A-Z]\)/\(\1\)/g'
(W)elcome (T)o (T)he (G)eek (S)tuff

Example 5: Commify a simple number.

Let us create a file called numbers which has a list of numbers. The sed command example below commifies numbers up to the thousands (it inserts a comma before the last three digits).

$ cat  numbers
1234
12121
3434
123

$ sed 's/\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\1\2,\3/g' numbers
1,234
12,121
3,434
123

Friday, September 24, 2010

Basic I/O Monitoring on Linux

The technique we discuss here is basic; it gives a good overview and is very easy to use. So let's get focused. We will use the iostat utility. If you need more details about it, you know where to find them: the man pages.

So we will use the following form of the command:

iostat -x [-d] <interval>
  • -x displays extended statistics. You definitely want it.
  • -d is optional. It removes CPU utilization to avoid cluttering the output. If you leave it out, you will get the following couple of lines in addition:
    avg-cpu: %user %nice %sys %iowait %idle
    6.79 0.00 3.79 16.97 72.46
  • <interval> is the number of seconds iostat waits between each report. Without a specified interval, iostat displays statistics since the system was up and then exits, which is not useful in our case. Specifying the number of seconds causes iostat to print periodic reports in which IO statistics are averaged over the time period since the previous report. I.e., specifying 5 makes iostat dump 5 seconds of average IO characteristics every 5 seconds until it is stopped.

If you have many devices and you want to watch for only some of them, you can also specify device names on command line:

iostat -x -d sda 5

Now let’s get to the most interesting part — what those cryptic extended statistics are. (For readability, I formatted the report below so that the last two lines are in fact a continuation of the first two.)

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s
sda 0.00 12.57 10.18 9.78 134.13 178.84 67.07

wkB/s avgrq-sz avgqu-sz await svctm %util
89.42 15.68 0.28 14.16 8.88 17.72
  • r/s and w/s— respectively, the number of read and write requests issued by processes to the OS for a device.
  • rsec/s and wsec/s — sectors read/written (each sector 512 bytes).
  • rkB/s and wkB/s — kilobytes read/written.
  • avgrq-sz — average sectors per request (for both reads and writes). Do the math — (rsec + wsec) / (r + w) = (134.13+178.84)/(10.18+9.78) = 15.6798597
    If you want it in kilobytes, divide by 2.
    If you want it separately for reads and writes — do your own math using rkB/s and wkB/s.
  • avgqu-sz — average queue length for this device.
  • await — average response time (ms) of IO requests to a device. The name is a bit confusing as this is the total response time, including the wait time in the request queue (let's call it qutim) and the service time the device spent working on the requests (see the next column — svctim).

    So the formula is await = qutim + svctim.

  • svctim — average time (ms) a device was servicing requests. This is a component of total response time of IO requests.
  • %util — this is a pretty confusing value. The man page defines it as, Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%. A bit difficult to digest. Perhaps it’s better to think of it as the percentage of time the device was servicing requests as opposed to being idle. To understand it better, here is the formula:

    utilization = ( (read requests + write requests) * service time in ms / 1000 ms ) * 100%
    or
    %util = ( r + w ) * svctim / 10 = ( 10.18 + 9.78 ) * 8.88 / 10 = 17.72448
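
As a quick worked example from the same report, the average time a request spends waiting in the queue is simply the difference between those two columns:

    qutim = await - svctim = 14.16 - 8.88 = 5.28 ms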

Traditionally, it’s common to assume that the closer to 100% utilization a device is, the more saturated it is. This might be true when the system device corresponds to a single physical disk. However, with devices representing a LUN of a modern storage box, the story might be completely different.

Rather than looking at device utilization, there is another way to estimate how loaded a device is. Look at the non-existent column I mentioned above — qutim — the average time a request spends in the queue. If it is insignificant compared to svctim, the IO device is not saturated. When it becomes comparable to svctim and goes above it, requests are queued longer and a major part of the response time is actually time spent waiting in the queue.

The figure in the await column should be as close to that in the svctim column as possible. If await goes much above svctim, watch out! The IO device is probably overloaded.

There is much to say about IO monitoring and interpreting results. Perhaps this is only the first of a series of posts about IO statistics. At Pythian we often come across different environments with specific characteristics and various requirements from our clients. So stay tuned — more to come.

Interpreting iostat Output

In this post I am going to explore how extended iostat statistics can be useful to a system administrator beyond a binary “Disk is bottleneck / Disk is not bottleneck.” Before we can get to any of that however, we must make sure we have a basic background knowledge of the Disk IO Subsystem.

Linux Disk IO Subsystem:

I am not a kernel hacker, so this overview might be flawed in parts but hopefully it is accurate enough to give the background needed for analyzing the output of iostat.

Layer                              Unit                                                   Typical Unit Size
User Space System Calls            read(), write()                                        -
Virtual File System Switch (VFS)   Block                                                  4096 Bytes
Disk Caches                        Page                                                   -
Filesystem (for example ext3)      Blocks                                                 4096 Bytes (can be set at FS creation)
Generic Block Layer                Page Frames / Block IO Operations (bio)                -
I/O Scheduler Layer                bios per block device (which this layer may combine)   -
Block Device Driver                Segment                                                512 Bytes
Hard Disk                          Sector                                                 512 Bytes

There are two basic system calls, read() and write(), that a user process can make to read data from a file system. In the kernel these are handled by the Linux Virtual Filesystem Switch (VFS). VFS is an abstraction to all file systems so they look the same to the user space and it also handles the interface between the file system and the block device layer. The caching layer provides caching of disk reads and writes in terms of memory pages. The generic block layer breaks down IO operations that might involve many different non-contiguous blocks into multiple IO operations. The I/O scheduling layer takes these IO operations and schedules them based on order on disk, priority, and/or direction. Lastly, the device driver handles interfacing with the hardware for the actual operations in terms of disk sectors which are usually 512 bytes.

A Little Bit on Page Caching:

The page cache caches pages of data that do or will reside on disk. Therefore, before writing data to disk the kernel puts it in memory, and before reading data from disk it checks whether the data is already in memory (with the exception of Direct IO). Writing pages out to disk actually gets deferred; this is done to increase performance, so writes can be grouped together more efficiently. When a page of disk data gets changed and needs to be written out to disk it is called "dirty". Since it is dangerous to keep dirty pages in memory for too long in case of a system crash, the kernel's pdflush threads scan for dirty pages and then flush them out to disk. Linux will actually try to use as much memory as it can for caching files, which is why the top command usually shows so much used memory. When you want to see how much memory is free for processes you can run the free command and look at the '-/+ buffers/cache' line.

iostat output:

So with this background, let's look at some of the output of iostat and tie it together with our background knowledge. iostat can break the statistics down at both the partition level and the device level; however, in this post I am going to focus on the device level.

The Overview Statistics: “Is it Saturated or Not?”

From iostat there are two summary statistics which are Input/Output CPU wait time (iowait) and device utilization which are both expressed in terms of percentages.

iowait is from the CPU's perspective, and it is the percentage of time that the CPU spent waiting for an IO device to be ready. Another way to look at iowait is as the amount of time that the CPU could have been doing something but couldn't because all the processes were waiting on the disk or the network devices.

Device utilization is covered thoroughly by Alex Gorbahev in Basic I/O Monitoring on Linux. He summarizes it as “The percentage of time the device spent servicing requests as opposed to being idle.”

iostat and caching:

It is important to note that iostat shows requests to the device (or partition) and not read and write requests from user space. So in the table above, iostat is reading below the disk cache layer. Therefore, iostat says nothing about your cache hit ratio for block devices, and it is possible that disk IO problems might be resolvable by memory upgrades. From my research there is no easy way to pull a cache hit/miss ratio for block devices out of Linux, which is a bit disappointing. One suggestion from serverfault is to install a kernel with debugging symbols and use SystemTap to trace the VFS events and tie them together with the block layer events. I intend to explore this, but I would prefer to see a way to get this data from /proc or /sys.

iostat Output for Random and Sequential Reads:

One of the main things to do when examining disk IO is to determine if the disk access patterns are sequential or random. This information can aid in our disk choices. When operations are random the seek time of the disk becomes more important. This is because physically the drive head has to jump around. Seek time is the measurement of the speed at which the heads can do this. For small random reads solid state disks can be a huge advantage.

So, using fio, I have created two simple tests to run: the first is sequential reading, and the second is random reading (the job definitions are sketched below). During these tests I ran iostat -x 3 throughout.
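
The two jobs looked roughly like this; the file name and size are hypothetical placeholders rather than the exact job definitions used:

fio --name=seqread  --rw=read     --bs=4k --size=1G --filename=/tmp/fio.test
fio --name=randread --rw=randread --bs=4k --size=1G --filename=/tmp/fio.test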

Snapshot of Random Read Test:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda 0.00 0.00 172.67 0.00 1381.33 0.00 8.00 0.99 5.76 5.76 99.47

Snapshot of Sequential Read Test:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda 13.00 0.00 367.00 0.00 151893.33 0.00 413.88 2.46 6.71 2.72 100.00
What is more important to me here is not just what these numbers are, but what they mean in the context of random vs sequential reading and in the context of the IO subsystem.

The first two columns, rrqm/s and wrqm/s, are read and write requests merged per second. In the overview of the Linux block IO subsystem above, I mentioned that the scheduler can combine operations. This can be done when multiple operations are physically adjacent to each other on the device. So in sequential operation it would make sense to often see a large number of merges. In the snapshot of the random reads, we see no merges. However, the merging layer feels a little bit like "magic", and I don't believe it is the best indicator of whether the patterns are random or sequential.

The next 5 columns are read and write requests to the device (r/s, w/s), followed by the number of sectors read and written from the device (rsec/s, wsec/s), and then the size of each request (avgrq-sz). In the random test there are 172 reads that result in 1,381 sectors being read in. In the sequential test there are 367 read requests to 151,893 sectors being read. So in the random test we get about 8 sectors per request and in the sequential test we get about 413 sectors per read. If you look closely, this happens to be the same number as avgrq-sz, which does this math for us (sectors read / read operations). It is worth noting that this is how it is calculated, since the average request size does not differentiate between reads and writes. From these tests, a low sector-to-request ratio or small request sizes seem to indicate a random IO profile. I believe this to be a better indicator than the number of merges as to whether the disk patterns are random or sequential.

The final 4 columns are the average queue length of requests to the device (avgqu-sz), how long requests took to be serviced including their time in the queue (await), how long requests took to be serviced by the device after they left the queue (svctm), and lastly the utilization percentage which I already mentioned in the overview statistics section. In the above example random requests take longer for the disk to service as expected because of the seek time. However, the queue itself ends up being shorter which I am unable to explain. Utilization, in more detail, is the service time in ms * total IO operations / 1000 ms. This gives the percentage of how busy the single disk was during the given time slice. I believe for a given utilization level a higher number of operations is probably indicative of a sequential pattern.

I have run various variations on the above. They include a mixture of reads and writes for both random and sequential data as well as sequential and random writes. For the writes I got similar results as far as the ratios were concerned and queue and services time were higher.

In the end, it seems average request size is the key to showing whether the disk usage patterns are random or not, since it is measured after merging. Taking this into the context of the layers above, it might not mirror what an application is doing. This is because a read or write operation coming from user space might operate on a fragmented file, in which case the generic block layer will break it up and it appears as random disk activity.

Conclusion:

As far as I am concerned this is only a start in interpreting IO statistics. I think these tests need to be repeated, perhaps with different tools to generate the disk IO, as my interpretations might just be totally off. Also, a pretty big limitation of what I did is that my work was all on a single disk and these numbers might have different results under various RAID configurations. I feel the inability to measure the cache hit ratio of reads on a block device is a significant shortcoming that I would love to see addressed since from a system administrators perspective the solution to certain IO problems might be to throw more memory at the problem.

Lastly, I want to make a point about these sorts of low-level statistics in general. Everything needs to be monitored from an application perspective as well. These statistics can be misleading and are most useful when they can be correlated with the data that actually matters to the users of the applications, for example, response time from the user perspective. They also need to be monitored over time, because you want to be able to see changes for capacity planning as well as to give context to past performance when problems arise.

Further Reading:
http://www.igvita.com/2009/06/23/measuring-optimizing-io-performance/
Understanding The Linux Kernel, Chapter 14
http://www.ufsdump.org/papers/io-tuning.pdf
http://bhavin.directi.com/iostat-and-disk-utilization-monitoring-nirvana/
http://www.kernel.org/doc/Documentation/iostats.txt

Tuesday, September 21, 2010

Compound Interest Calculation

Formula for calculating compound interest:

A = P\left(1 + \frac{r}{n}\right)^{nt}

Where,

  • P = principal amount (initial investment)
  • r = annual nominal interest rate (as a decimal)
  • n = number of times the interest is compounded per year
  • t = number of years
  • A = amount after time t


Example usage:

An amount of $1500.00 is deposited in a bank paying an annual interest rate of 4.3%, compounded quarterly. Find the balance after 6 years.

A. Using the formula above, with P = 1500, r = 4.3/100 = 0.043, n = 4, and t = 6:

A=1500\left(1 + \frac{0.043}{4}\right)^{4 \times 6} =1938.84

So, the balance after 6 years is approximately $1,938.84.
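
A quick way to check the arithmetic on the command line, using bc's math library (a sketch; the expression simply evaluates P(1 + r/n)^(nt) with the numbers above):

echo "scale=10; 1500 * e(4*6*l(1 + 0.043/4))" | bc -l

This prints approximately 1938.84.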

http://en.wikipedia.org/wiki/Compound_interest

Monday, September 20, 2010

LDAP Authentication In Linux

This howto will show you how to store your users in LDAP and authenticate some of the services against it. I will not show how to install particular packages, as it is distribution/system dependent. I will focus on the "pure" configuration of all components needed to have LDAP authentication/storage of users. The howto assumes that you are migrating from a regular passwd/shadow authentication, but it is also suitable for people who are doing it from scratch.

Requirements

OpenLDAP
pam_ldap
nss_ldap
PADL migrationtools

Introduction

The thing we want to achieve is to have our users stored in LDAP, authenticated against LDAP ( direct or pam ) and have some tool to manage this in a human understandable way.

This way we can use all software which has LDAP support, or fall back to the PAM LDAP module, which will act as a PAM->LDAP gateway.

Configuring OpenLDAP

OpenLDAP consists of the slapd and slurpd daemons. This howto covers one LDAP server without replication, so we will focus only on slapd. I also assume you have installed and initialized your OpenLDAP installation (this depends on your system/distribution). If so, let's go to the configuration part.

On my system (Gentoo), OpenLDAP's configuration is stored in /etc/openldap; we are interested in the /etc/openldap/slapd.conf file. But first we have to generate a password for the LDAP administrator, to put into the config file:

# slappasswd -h {md5}

The config looks like this:

# vim /etc/openldap/slapd.conf

include         /etc/openldap/schema/core.schema

include /etc/openldap/schema/cosine.schema

include /etc/openldap/schema/inetorgperson.schema

include /etc/openldap/schema/nis.schema

allow bind_v2

pidfile /var/run/openldap/slapd.pid

argsfile /var/run/openldap/slapd.args

modulepath /usr/lib/openldap/openldap

access to attrs=userPassword

by dn="uid=root,ou=People,dc=hackadmin,dc=com" write

by dn="cn=Manager,dc=hackadmin,dc=com" write

by anonymous auth

by self write

by * none

access to dn.base="" by * read

access to *

by dn="cn=Manager,dc=hackadmin,dc=com" write

by * read

database bdb

suffix "dc=hackadmin,dc=com"

rootdn "cn=Manager,dc=hackadmin,dc=com"
rootpw {MD5}Tk1sMytv5ipjr+Vhcf03JQ==

directory /var/lib/openldap-data

index objectClass eq

Remember to change suffix and paths to your needs.

These are basic options with some basic ACLs needed so that users can change their own passwords. If you want more functionality, please read the OpenLDAP manual. Now that we have a proper config for slapd, we can start the daemon:

# /etc/init.d/ldap start

# chkconfig ldap on

Now we can test if openldap is running and working properly. We do not have any data yet in the directory, but we can try to bind as cn=Manager,dc=domain,dc=com. When you are asked for password, you should use the one you generated (of course the plain text version of it :) :

# ldapsearch -D "cn=Manager,dc=hackadmin,dc=com" -W

Migrate/Add data to the directory

Now that we have a running LDAP server, we have to fill it with data, either by creating or by migrating entries. I will show you how to migrate existing entries from the regular /etc/passwd, /etc/shadow and /etc/group files.

The first step is to configure migrationtools to your needs. The configuration file on Gentoo is located in /usr/share/migrationtools/migrate_common.ph.

Generally you need to change only these:

$DEFAULT_BASE = "dc=hackadmin,dc=com";

$EXTENDED_SCHEMA = 1;

Now you are ready to migrate the data (actually it works even without the export command):

export ETC_SHADOW=/etc/shadow

# ./migrate_base.pl > /tmp/base.ldif
# ./migrate_group.pl /etc/group /tmp/group.ldif
# ./migrate_hosts.pl /etc/hosts /tmp/hosts.ldif
# ./migrate_passwd.pl /etc/passwd /tmp/passwd.ldif

Now we have the data in a format understood by the LDAP server. Please open one of the files with a text editor to get used to the syntax. After that we can add the data from the ldifs.

# ldapadd -D "cn=Manager,dc=domain,dc=com" -W -f /tmp/base.ldif

# ldapadd -D "cn=Manager,dc=domain,dc=com" -W -f /tmp/group.ldif

# ldapadd -D "cn=Manager,dc=domain,dc=com" -W -f /tmp/passwd.ldif

# ldapadd -D "cn=Manager,dc=domain,dc=com" -W -f /tmp/hosts.ldif

You can try searching for some data:

# ldapsearch uid=foouser

Client configuration

By client I mean the machine which connects to the LDAP server to get users and authorize; it can also be the machine the LDAP server runs on. In both cases we have to edit three files: /etc/ldap.conf, /etc/nsswitch.conf and /etc/pam.d/system-auth.

Let's start with ldap.conf, the LDAP client configuration:

BASE    dc=hackadmin, dc=com

scope sub

suffix "dc=hackadmin,dc=com"

## when you want to change user's password by root

rootbinddn cn=Manager,dc=hackadmin,dc=com

## these are needed when your ldap dies

timelimit 5

bind_timelimit 5

uri ldap://ldap.hackadmin.com/

pam_password exop

ldap_version 3

pam_filter objectclass=posixAccount

pam_login_attribute uid

pam_member_attribute memberuid

nss_base_passwd ou=Computers,dc=hackadmin,dc=com

nss_base_passwd ou=People,dc=hackadmin,dc=com

nss_base_shadow ou=People,dc=hackadmin,dc=com

nss_base_group ou=Group,dc=hackadmin,dc=com

nss_base_hosts ou=Hosts,dc=hackadmin,dc=com

Now it is time for nsswitch.conf and pam

Add these to nsswitch.conf:

passwd: files ldap

shadow: files ldap

group: files ldap

And change system-auth (or whatever you have, like login, sshd etc.) to:

auth       required     pam_env.so

auth sufficient pam_unix.so likeauth nullok

auth sufficient pam_ldap.so use_first_pass

auth required pam_deny.so

account sufficient pam_unix.so

account sufficient pam_ldap.so

account required pam_ldap.so

password required pam_cracklib.so difok=2 minlen=8 dcredit=2 ocredit=2 retry=3

password sufficient pam_unix.so nullok md5 shadow use_authtok

password sufficient pam_ldap.so use_first_pass

password required pam_deny.so

session required pam_limits.so

session required pam_unix.so

session optional pam_ldap.so

Time to test it. The best tool for this is good old getent. Pick a user from your system and issue:

# getent passwd | grep foouser

You should get the result twice; if so, nss_ldap works fine. The PAM part can be tested by deleting a user from /etc/passwd and trying to log in through ssh.

Apache mod_auth_ldap

To have LDAP authorization in Apache, you have to load the mod_auth_ldap module:

LoadModule mm_auth_ldap_module modules/mod_auth_ldap.so

Now it is enough to make a .htaccess like this:

AuthName "Restricted"
AuthType Basic
AuthLDAPURL ldap://ldap.hackadmin.com:389/ou=People,dc=hackadmin,dc=com?uid
AuthLDAPBindDN "cn=Manager,dc=hackadmin,dc=com"
AuthLDAPBindPassword "your_secret_secret_password_to_ldap_admin"
require valid-user

Note that this method can also be used for WebDAV Subversion authorization.

Administration tools for ldap

There are a few tools I recommend for administrating an OpenLDAP server:

phpldapadmin - web based tool
ldapvi - vim browsing
PADL migrationtools - migrationtools
IDEALX sambaldap tools - samba ldap tools


http://www.hackadmin.com/2010/03/05/ldap-authentication-in-linux/

Creating home directories on Linux hosts with pam_mkhomedir

I have been converting a number of hosts to LDAP authentication. I'm currently creating user home directories on each server, which has a number of pros and cons. One of the cons is that a newly provisioned user won't have a home directory and will be assigned "/" as their home directory when they log in. This is less than ideal, since most users will need a place to modify files and customize their environment. To simplify my life, I have been playing around with autodir and pam_mkhomedir. Both solutions provide an automated way to create user home directories, and are pretty easy to set up.

To configure pam_mkhomedir, you can add the following line to the session management section of /etc/pam.d/system-auth:

session     optional      pam_mkhomedir.so

After the module is enabled, users should see a “Creating directory” line when they login to a server for the first time:

$ ssh test@foo
test@foo’s password:
Creating directory ‘/home/test’.

In addition to creating the home directory specified in the passwd file (or in the homeDirectory attribute if you are using LDAP), the mkhomedir module will also populate the user’s home directory with the files in /etc/skel:

$ ls -la /etc/skel

total 40
drwxr-xr-x. 4 root root 4096 2009-07-07 13:56 .
drwxr-xr-x. 113 root root 12288 2009-07-16 11:08 ..
-rw-r--r--. 1 root root 18 2009-04-08 06:46 .bash_logout
-rw-r--r--. 1 root root 176 2009-04-08 06:46 .bash_profile
-rw-r--r--. 1 root root 124 2009-04-08 06:46 .bashrc
drwxr-xr-x. 2 root root 4096 2009-03-17 20:54 .gnome2
drwxr-xr-x. 4 root root 4096 2009-07-07 13:44 .mozilla
-rw-r--r--. 1 root root 658 2009-03-02 12:18 .zshrc

Adding to the base set of files that are placed in each user’s home directory is as simple as copying one or more files into /etc/skel, or modifying the existing files. I will touch on the autodir module in a follow up post.

Tuesday, September 14, 2010

Using cut – Shellscript string manipulation

This post is designed to be a refresher, reference or quick intro into how to manipulate strings with the cut command in bash. Sometimes it's useful to take the output of a command and reformat it. I sometimes do this for aesthetic purposes or to format it for use as input into another command.


Cut has options to cut by bytes (-b), characters (-c) or fields (-f). I normally cut by character or field but byte can come in handy some times.

The options to cut by are below.

N N’th byte, character or field, counted from 1

N- from N’th byte, character or field, to end of line

N-M from N’th to M’th (included) byte, character or field

-M from first to M’th (included) byte, character or field

The options pretty much explain themselves but I have included some simple examples below:

Cutting by characters (command on top, output below)

echo "123456789" | cut -c -5

12345

echo "123456789" | cut -c 5-

56789

echo "123456789" | cut -c 3-7

34567

echo "123456789" | cut -c 5

5

Sometimes output from a command is delimited so a cut by characters will not work. Take the example below:

echo -e "1\t2\t3\t4\t5" |cut -c 5-7

3 4

To echo a tab you have to use the -e switch to enable echo to process backslashed characters. If the desired output is 3\t4 then this would work great as long as the strings were always 1 character, but if a character were added anywhere before field 3, the output would be completely changed, as follows:

echo -e "1a\t2b\t3c\t4d\t5e" | cut -c 5-7

b 3

This is resolved by cutting by fields.

Cutting by fields

The syntax to cut by fields is the same as characters or bytes. The two examples below display different output but are both displaying the same fields (Fields 3 Through to the end of line.)

echo -e "1\t2\t3\t4\t5" | cut -f 3-

3 4 5

echo -e "1a\t2a\t3a\t4a\t5a" | cut -f 3-

3a 4a 5a

The default delimiter is a tab, if the output is delimited another way a custom delimiter can be specified with the -d option. It can be just about any printable character, just make sure that the character is escaped (back slashed) if needed. In the example below I cut the string up using the pipe as the delimiter.

echo "1|2|3|4|5" | cut -f 3- -d \|

3|4|5

One great feature of cut is that the delimiter used for the input can be changed for the output of cut. In the example below I change the format of the string from dash-delimited to comma-delimited.

echo -e "1a-2a-3a-4a-5a" | cut -f 3- -d – --output-delimiter=,

3a,4a,5a

Formatting with Cut Example

Sometimes certain Linux applications such as uptime do not have options to format the output. Cut can be used to pull out the information that is desired.

Normal up-time Command:

owen@the-linux-blog:~$ uptime

19:18:40 up 1 day, 22:15, 4 users, load average: 0.45, 0.10, 0.03

Time with up-time displayed:

owen@the-linux-blog:~$ uptime |cut -d , -f 1,2 | cut -c 2-

19:19:36 up 1 day, 22:22

For the above example I pipe the output of uptime to cut and tell it I want to split it with a comma , delimiter. I then choose fields 1 and 2. The output from that cut is piped into another cut that removes the spaces in front of the output.

Load averages extracted from uptime:

owen@the-linux-blog:~$ uptime |cut -d , -f 4- | cut -c 3-

load average: 0.42, 0.10, 0.03

This is about the same as the previous example except the fields changed. Instead of fields 1 and 2 I told it to display fields 4 through the end. The output from that is piped to another cut which removes the three spaces that were after the comma in "4 users, " by starting at the 3rd character.

The great thing about cutting by fields is that no matter if the field length changes, the data stays the same. Take the example below. I now have 17 users logged in, which would have broken the output if I had used -c (since there is an extra character due to a double-digit number of users being logged in).

owen@the-linux-blog:~$ uptime

19:25:11 up 1 day, 22:28, 17 users, load average: 0.00, 0.06, 0.04

owen@the-linux-blog:~$ uptime |cut -d , -f 4- | cut -c 3-

load average: 0.00, 0.06, 0.04

That just about covers everything for the cut command. Now you know about it you can use cut to chop up all types of strings. It is one of the many great tools available for string manipulation in bash. If you can remember what cut does it will make your shell scripting easier, you don’t need to memorize the syntax because all of the information on how to use cut is available here, in the man pages and all over the web.

Extracting sub string in Bash

Extracting a substring in Bash is very easy. Let's say you have a phone number 016-4567890 and you just want to extract 4567890 out of it; you can do it as below.


num="016-4567890";
echo ${num:4:7}
4567890

O/P is 4567890

4 is the starting offset (counted from 0) and 7 is the length of the substring.

Wednesday, August 18, 2010

crontab UNLEASHED

Your cron job looks like as follows:

1 2 3 4 5 /path/to/command arg1 arg2

Where,

  • 1: Minute (0-59)
  • 2: Hours (0-23)
  • 3: Day (0-31)
  • 4: Month (1-12 [12 == December])
  • 5: Day of the week (0-7 [7 or 0 == Sunday])
  • /path/to/command - Script or command name to schedule

Same above five fields structure can be easily remembered with following diagram:

* * * * * command to be executed
- - - - -
| | | | |
| | | | ----- Day of week (0 - 7) (Sunday=0 or 7)
| | | ------- Month (1 - 12)
| | --------- Day of month (1 - 31)
| ----------- Hour (0 - 23)
------------- Minute (0 - 59)

Use of operators

An operator allows you to specify multiple values in a field. There are three operators:

  1. The asterisk (*) : This operator specifies all possible values for a field. For example, an asterisk in the hour time field would be equivalent to every hour or an asterisk in the month field would be equivalent to every month.
  2. The comma (,) : This operator specifies a list of values, for example: "1,5,10,15,20, 25".
  3. The dash (-) : This operator specifies a range of values, for example: "5-15" days , which is equivalent to typing "5,6,7,8,9,....,13,14,15" using the comma operator.
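
Putting the operators together, a hypothetical entry (the script path is just an example) that runs at minute 0 and 30, from 09:00 through 17:59, Monday through Friday:

0,30 9-17 * * 1-5 /path/to/check_load.sh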

Use special strings to save time:

Instead of the first five fields, you can use any one of eight special strings. This will not just save you time but will also improve readability.

Special string    Meaning
@reboot           Run once, at startup.
@yearly           Run once a year, "0 0 1 1 *".
@annually         (same as @yearly)
@monthly          Run once a month, "0 0 1 * *".
@weekly           Run once a week, "0 0 * * 0".
@daily            Run once a day, "0 0 * * *".
@midnight         (same as @daily)
@hourly           Run once an hour, "0 * * * *".

Run ntpdate every hour:

@hourly /path/to/ntpdate

Make a backup everyday:

@daily /path/to/backup/script.sh

Understanding /etc/crontab file and /etc/cron.d/* directories

/etc/crontab is the system crontab file. It is usually only used by the root user or by daemons to configure system-wide jobs. All individual users must use the crontab command to install and edit their jobs as described above. /var/spool/cron/ or /var/cron/tabs/ is the directory for personal user crontab files; it should be backed up along with the users' home directories.

Typical /etc/crontab file entries:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

Additionally, cron reads the files in the /etc/cron.d/ directory. Usually system daemons such as sa-update or sysstat place their cron jobs here. As the root user or a superuser you can use the following directories to configure cron jobs. You can directly drop your scripts here; the run-parts command runs the scripts or programs in a directory via /etc/crontab.

Directory           Description
/etc/cron.d/        Put all scripts here and call them from the /etc/crontab file.
/etc/cron.daily/    Run all scripts once a day
/etc/cron.hourly/   Run all scripts once an hour
/etc/cron.monthly/  Run all scripts once a month
/etc/cron.weekly/   Run all scripts once a week

Thursday, July 29, 2010

Loop device information

You can see what is being used by a loop device with losetup:
# losetup /dev/loop0
/dev/loop0: [fd06]:234921356 (/linux/isos/backtrack.iso)
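
For context, an image usually gets associated with a loop device when you loop-mount it; a minimal sketch using the ISO from the output above (the mount point is illustrative):

# mkdir -p /mnt/iso
# mount -o loop /linux/isos/backtrack.iso /mnt/iso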

To detach an image from loop device
losetup -d /dev/loop0 ## to detach the image associated with loop0

It is possible to increase the number of available loop devices. Free
all loop devices, and add a line with the following to
/etc/modprobe.conf:
options loop max_loop=64

(maximum is 256)

Then, do: rmmod loop && modprobe loop

If you get an error that the module couldn't be removed, you still have
loop devices in use.

Newer kernels (2.6.21 or 2.6.22) use a dynamic allocation of loop
devices, so you will only have to create the filesystem representation
of the devices:
for ((i=8;i<64;i++)); do
[ -e /dev/loop$i ] || mknod -m 0600 /dev/loop$i b 7 $i
done

Thursday, July 22, 2010

10 Steps to Configure tftpboot Server in UNIX / Linux (For installing Linux from Network using PXE)

In this article, let us discuss about how to setup tftpboot, including installation of necessary packages, and tftpboot configurations.


TFTP boot service is primarily used to perform OS installation on a remote machine to which you don't have physical access. In order to perform the OS installation successfully, there should be a way to reboot the remote server — either using wakeonlan, or by having someone manually reboot it, or in some other way.


In those scenarios, you can setup the tftpboot services accordingly and the OS installation can be done remotely (you need to have the autoyast configuration file to automate the OS installation steps).



Step by step procedure is presented in this article for the SLES10-SP3 in 64bit architecture. However, these steps are pretty much similar to any other Linux distributions.


Required Packages


The following packages need to be installed for the tftpboot setup.




  • dhcp services packages: dhcp-3.0.7-7.5.20.x86_64.rpm and dhcp-server-3.0.7-7.5.20.x86_64.rpm

  • tftpboot package: tftp-0.48-1.6.x86_64.rpm

  • pxeboot package: syslinux-3.11-20.14.26.x86_64.rpm


Package Installation


Install the packages for the dhcp server services:


$ rpm -ivh dhcp-3.0.7-7.5.20.x86_64.rpm
Preparing... ########################################### [100%]
1:dhcp ########################################### [100%]

$ rpm -ivh dhcp-server-3.0.7-7.5.20.x86_64.rpm
Preparing... ########################################### [100%]
1:dhcp ########################################### [100%]

$ rpm -ivh tftp-0.48-1.6.x86_64.rpm

$ rpm -ivh syslinux-3.11-20.14.26.x86_64.rpm

After installing the syslinux package, the pxelinux.0 file will be created under the /usr/share/syslinux/ directory. This is required to load the install kernel and initrd images on the client machine.


Verify that the packages are successfully installed.



$ rpm -qa | grep dhcp
$ rpm -qa | grep tftp

Download the appropriate tftpserver from the repository of your respective Linux distribution.


Steps to setup tftpboot


Step 1: Create /tftpboot directory


Create the tftpboot directory under root directory ( / ) as shown below.


# mkdir /tftpboot/

Step 2: Copy the pxelinux image


PXE Linux image will be available once you installed the syslinux package. Copy this to /tftpboot path as shown below.


# cp /usr/share/syslinux/pxelinux.0 /tftpboot


Step 3: Create the mount point for ISO and mount the ISO image


Let us assume that we are going to install the SLES10 SP3 Linux distribution on a remote server. If you have the SUSE10-SP3 DVD insert it in the drive or mount the ISO image which you have. Here, the iso image has been mounted as follows:


# mkdir /tftpboot/sles10_sp3

# mount -o loop SLES-10-SP3-DVD-x86_64.iso /tftpboot/sles10_sp3

Refer to our earlier article on How to mount and view ISO files.


Step 4: Copy the vmlinuz and initrd images into /tftpboot


Copy the install kernel (linux) and initrd to the tftpboot directory as shown below.


# cd /tftpboot/sles10_sp3/boot/x86_64/loader

# cp initrd linux /tftpboot/

Step 5: Create pxelinux.cfg Directory



Create the directory pxelinux.cfg under /tftpboot and define the pxe boot definitions for the client.


# mkdir /tftpboot/pxelinux.cfg

# cat >/tftpboot/pxelinux.cfg/default
default linux
label linux
kernel linux
append initrd=initrd showopts instmode=nfs install=nfs://192.168.1.101/tftpboot/sles10_sp3/

The following options are used for,



  • kernel – specifies where to find the Linux install kernel on the TFTP server.

  • install – specifies boot arguments to pass to the install kernel.


As per the entries above, the NFS install mode is used for serving the install RPMs and configuration files. So, set up NFS on this machine with the /tftpboot directory in the exported list (a minimal /etc/exports sketch is shown below). You can add the "autoyast" option with an autoyast configuration file to automate the OS installation steps; otherwise you need to run through the installation steps manually.
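
A minimal /etc/exports sketch for that export; the allowed network is an assumption, so adjust it to your subnet, and re-export afterwards:

# cat /etc/exports
/tftpboot 192.168.1.0/255.255.255.0(ro,no_root_squash,sync)

# exportfs -r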


Step 6: Change the owner and permission for /tftpboot directory



Assign nobody:nobody to /tftpboot directory.


# chown nobody:nobody /tftpboot

# chmod 777 /tftpboot

Step 7: Modify /etc/dhcpd.conf


Modify the /etc/dhcpd.conf as shown below.


# cat /etc/dhcpd.conf

ddns-update-style none;
default-lease-time 14400;
filename "pxelinux.0";

# IP address of the dhcp server nothing but this machine.
next-server 192.168.1.101;
subnet 192.168.1.0 netmask 255.255.255.0 {
# ip distribution range between 192.168.1.1 to 192.168.1.100
range 192.168.1.1 192.168.1.100;
default-lease-time 10;
max-lease-time 10;
}

Specify the interface in /etc/sysconfig/dhcpd on which to listen for DHCP requests coming from clients.


# cat /etc/sysconfig/dhcpd | grep DHCPD_INTERFACE
DHCPD_INTERFACE="eth1"

Here, this machine has the ip address of 192.168.1.101 on the eth1 device. So, specify eth1 for the DHCPD_INTERFACE as shown above.


On a related note, refer to our earlier article about 7 examples to configure network interface using ifconfig.



Step 8: Modify /etc/xinetd.d/tftp


Modify the /etc/xinetd.d/tftp file to reflect the following. By default the value of the disable parameter is "yes"; make sure you change it to "no", and change the server_args entry to -s /tftpboot.


# cat /etc/xinetd.d/tftp
service tftp
{
        socket_type     = dgram
        protocol        = udp
        wait            = yes
        user            = root
        server          = /usr/sbin/in.tftpd
        server_args     = -s /tftpboot
        disable         = no
}

Step 9: No changes in /etc/xinetd.conf


There is no need to modify the /etc/xinetd.conf file. Use the default values specified in the xinetd.conf file.


Step 10: Restart xinetd, dhcpd and nfs services


Restart these services as shown below.


# /etc/init.d/xinetd restart

# /etc/init.d/dhcpd restart

# /etc/init.d/nfsserver restart

After restarting the NFS services, you can view the exported directory list (/tftpboot) with the following command:



# showmount -e

Finally, the tftpboot setup is ready, and the client machine can now be booted after changing the first boot device to "network" in the BIOS settings.


If you encounter any tftp errors, you can troubleshoot them by retrieving a file through the tftpd service.


Retrieve a file from the tftp server using the tftp client to make sure the tftp service is working properly. Let us assume that a sample.txt file is present under the /tftpboot directory.


 $ tftp -v 192.168.1.101 -c get sample.txt

Monday, July 19, 2010

Using the DBI Framework




Here are the basic steps for using DBI. For more information on DBI, see Programming the Perl DBI by Alligator Descartes and Tim Bunce (O'Reilly).




Step 1: Load the necessary Perl module



Nothing special here; you just need to:



use DBI;




Step 2: Connect to the database and receive a connection handle


The Perl code to establish a DBI connection to a MySQL database and
return a database handle looks like this:



# connect to the database named $database using the given
# username and password; return a database handle
$database = "sysadm";
$dbh = DBI->connect("DBI:mysql:$database",$username,$pw);
die "Unable to connect: $DBI::errstr\n" unless (defined $dbh);


DBI will load the low-level DBD driver for us (DBD::mysql) prior to actually connecting to the server. We then test if the connect( ) succeeded before continuing. DBI provides RaiseError and PrintError options for connect( ), should we want DBI to perform this test or automatically complain about errors when they happen. For example, if we used:



$dbh = DBI->connect("DBI:mysql:$database",
                    $username,$pw,{RaiseError => 1});


then DBI would call die for us if the
connect( ) failed.




Step 3: Send SQL commands to the server


With our Perl module loaded and a connection to the database server
in place, it's showtime! Let's send some SQL commands to
the server. We'll use some of the SQL tutorial queries from
Appendix D, "The Fifteen-Minute SQL Tutorial" for examples. These queries will use the
Perl q convention for quoting (i.e.,
something is written as
q{something}), just so we don't have to
worry about single or double quotes in the actual queries themselves.
Here's the first of the two DBI methods for sending commands:



$results = $dbh->do(q{UPDATE hosts
                      SET bldg = 'Main'
                      WHERE name = 'bendir'});
die "Unable to perform update:$DBI::errstr\n" unless (defined $results);



$results will receive either the number of rows
updated or undef if an error occurs. Though it
is useful to know how many rows were affected, that's not going
to cut it for statements like SELECT where we
need to see the actual data. This is where the second method comes
in.


To use the second method you first prepare a SQL
statement for use and then you ask the server to
execute it. Here's an example:



$sth = $dbh->prepare(q{SELECT * from hosts}) or
    die "Unable to prep our query:".$dbh->errstr."\n";
$rc = $sth->execute or
    die "Unable to execute our query:".$dbh->errstr."\n";



prepare( ) returns a new
creature we haven't seen before: the statement handle. Just
like a database handle refers to an open database connection, a
statement handle refers to a particular SQL statement we've
prepare( )d. Once we have this statement handle,
we use execute to actually send the query to our
server. Later on, we'll be using the same statement handle to
retrieve the results of our query.


You might wonder why we bother to prepare( ) a statement instead of just executing it directly. prepare( )ing a statement gives the DBD driver (or more likely the database client library it calls) a chance to parse the SQL query. Once a statement has been prepare( )d, we can execute it repeatedly via our statement handle without parsing it over and over. Often this is a major efficiency win. In fact, the default do( ) DBI method does a prepare( ) and then execute( ) behind the scenes for each statement it is asked to execute.


Like the do call we saw earlier,
execute( ) returns the number of rows affected.
If the query affects zero rows, the string 0E0 is
returned to allow a Boolean test to succeed. -1 is
returned if the number of rows affected is unknown by the driver.
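
As a small illustrative sketch (not from the original text; the host name here is hypothetical), that convention lets us tell a statement that matched no rows apart from a real failure:


$sth = $dbh->prepare(q{UPDATE hosts SET bldg = 'Main' WHERE name = 'nosuchhost'}) or
    die "Unable to prep our query:".$dbh->errstr."\n";
$rows = $sth->execute or
    die "Unable to execute our query:".$dbh->errstr."\n";
# "0E0" is a true string, so the "or die" above only fires on real errors,
# but it is numerically zero, so we can still detect that nothing was changed
print "statement succeeded but affected no rows\n" if ($rows == 0);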



Before we
move on to ODBC, it is worth mentioning one more twist supported by
most DBD modules on the prepare( ) theme:
placeholders. Placeholders, also called positional markers, allow you
to prepare( ) an SQL statement that has holes in
it to be filled at execute( ) time. This allows
you to construct queries on the fly without paying most of the parse
time penalty. The question mark character is used as the placeholder
for a single scalar value. Here's some Perl code to demonstrate
the use of placeholders:



@machines = qw(bendir shimmer sander);
$sth = $dbh->prepare(q{SELECT name, ipaddr FROM hosts WHERE name = ?});
foreach $name (@machines){
    $sth->execute($name);
    do-something-with-the-results
}


Each time we go through the foreach loop, the
SELECT query is executed with a different
WHERE clause. Multiple placeholders are
straightforward:



$sth = $dbh->prepare(
    q{SELECT name, ipaddr FROM hosts
      WHERE (name = ? AND bldg = ? AND dept = ?)});
$sth->execute($name,$bldg,$dept);


Now that we know how to retrieve the number of rows affected by
non-SELECT SQL queries, let's look into
retrieving the results of our SELECT requests.




Step 4: Retrieve SELECT results


The mechanism here is similar to our brief discussion of cursors
during the SQL tutorial in Appendix D, "The Fifteen-Minute SQL Tutorial". When we send
a SELECT statement to the server using
execute( ), we're using a mechanism that
allows us to retrieve the results one line at a time.


In DBI, we call one of the methods in Table 7-1 to
return data from the result set.



Table 7.1. DBI Methods for Returning Data


  • fetchrow_arrayref( ) – returns a reference to an anonymous array whose values are the columns of the next row in the result set; returns undef when there are no more rows.

  • fetchrow_array( ) – returns an array whose values are the columns of the next row in the result set; returns an empty list when there are no more rows.

  • fetchrow_hashref( ) – returns a reference to an anonymous hash whose keys are the column names and whose values are the columns of the next row in the result set; returns undef when there are no more rows.

  • fetchall_arrayref( ) – returns a reference to an array-of-arrays data structure; returns a reference to an empty array when there are no more rows.



Let's see these methods in context. For each of these examples,
assume the following was executed just prior:



$sth = $dbh->prepare(q{SELECT name,ipaddr,dept from hosts}) or
    die "Unable to prepare our query: ".$dbh->errstr."\n";
$sth->execute or die "Unable to execute our query: ".$dbh->errstr."\n";


Here's fetchrow_arrayref( ) in action:



while ($aref = $sth->fetchrow_arrayref){
    print "name: "   . $aref->[0] . "\n";
    print "ipaddr: " . $aref->[1] . "\n";
    print "dept: "   . $aref->[2] . "\n";
}


The DBI documentation mentions that fetchrow_hashref( ) is less efficient than fetchrow_arrayref( ) because of the extra processing it entails, but it can yield more readable code. Here's an example:



while ($href = $sth->fetchrow_hashref){
    print "name: "   . $href->{name}   . "\n";
    print "ipaddr: " . $href->{ipaddr} . "\n";
    print "dept: "   . $href->{dept}   . "\n";
}


Finally, let's take a look at the "convenience"
method, fetchall_arrayref( ). This method sucks
the entire result set into one data structure, returning a reference
to an array of references. Be careful to limit the size of your
queries when using this method because it does pull the entire result
set into memory. If you have a 100GB result set, this may prove to be
a bit problematic.


Each reference returned looks exactly like something we would receive
from fetchrow_arrayref( ). See Figure 7-2.





Figure 7.2. The data structure returned by fetchrow_arrayref


Here's some code that will print out the entire query result set:



$aref_aref = $sth->fetchall_arrayref;
foreach $rowref (@$aref_aref){
    print "name: "   . $rowref->[0] . "\n";
    print "ipaddr: " . $rowref->[1] . "\n";
    print "dept: "   . $rowref->[2] . "\n";
    print '-'x30,"\n";
}


This code sample is specific to our particular data set because it assumes a certain number of columns in a certain order. For instance, we assume the machine name is returned as the first column in the query ($rowref->[0]).

We can use some magic attributes (often called metadata) of statement handles to rewrite our result retrieval code to make it more generic. Specifically, if we look at $sth->{NUM_OF_FIELDS} after a query, it will tell us the number of fields (columns) in our result set. $sth->{NAME} contains a reference to an array with the names of each column. Here's a more generic way to write the last example:



$aref_aref = $sth->fetchall_arrayref;
foreach $rowref (@$aref_aref){
    for ($i=0; $i < $sth->{NUM_OF_FIELDS}; $i++){
        print $sth->{NAME}->[$i].": ".$rowref->[$i]."\n";
    }
    print '-'x30,"\n";
}


Be sure to see the DBI documentation for more metadata attributes.




Step 5: Close the connection to the server


In DBI this is simply:



# tells server you will not need more data from statement handle
# (optional, since we're just about to disconnect)
$sth->finish;
# disconnects handle from database
$dbh->disconnect;





7.2.1. DBI Leftovers


There are two remaining DBI topics worth mentioning before we move on
to ODBC. The first is a set of methods I call "shortcut"
methods. The methods in Table 7-2 combine steps 3
and 4 from above.



Table 7.2. DBI Shortcut Methods


  • selectrow_arrayref($stmnt) – combines prepare($stmnt), execute( ), and fetchrow_arrayref( ) into a single method.

  • selectcol_arrayref($stmnt) – combines prepare($stmnt), execute( ), and (@{fetchrow_arrayref( )})[0] (i.e., returns the first column for each row).

  • selectrow_array($stmnt) – combines prepare($stmnt), execute( ), and fetchrow_array( ) into a single method.
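
For instance, here is a minimal sketch (reusing the hosts table and the $dbh handle from the earlier examples, with a hard-coded host name for illustration) that fetches a single row with one call:


# grab one row without explicit prepare( )/execute( )/fetch calls
($name, $ipaddr, $dept) = $dbh->selectrow_array(
    q{SELECT name, ipaddr, dept FROM hosts WHERE name = 'bendir'});
print "$name lives at $ipaddr (dept: $dept)\n" if defined $name;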



The second topic worth mentioning is DBI's ability to bind variables to query results. The methods bind_col() and bind_columns( ) are used to tell DBI to automatically place the results of a query into a specific variable or list of variables. This usually saves a step or two when coding. Here's an example using bind_columns( ) that makes its use clear:



$sth = $dbh->prepare(q{SELECT name,ipaddr,dept from hosts}) or
    die "Unable to prep our query:".$dbh->errstr."\n";
$rc = $sth->execute or
    die "Unable to execute our query:".$dbh->errstr."\n";

# these variables will receive the 1st, 2nd, and 3rd columns
# from our SELECT
$rc = $sth->bind_columns(\$name,\$ipaddr,\$dept);

while ($sth->fetchrow_arrayref){
    # $name, $ipaddr, and $dept are automagically filled in from
    # the fetched query results row
    do-something-with-the-results
}
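
For completeness, here is a similar sketch (again assuming the hosts table from the earlier examples) that uses bind_col( ) to bind a single column by its position in the result set:


$sth = $dbh->prepare(q{SELECT name FROM hosts}) or
    die "Unable to prep our query:".$dbh->errstr."\n";
$sth->execute or die "Unable to execute our query:".$dbh->errstr."\n";

# bind the first (and only) column of each fetched row to $name
$sth->bind_col(1, \$name);
while ($sth->fetchrow_arrayref){
    print "host: $name\n";   # $name is refreshed on every fetch
}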