
Red Hat – Clustering and Storage Management – Course Objectives – part 2

Post 1 – http://www.guldmyr.com/blog/red-hat-clustering-and-storage-management-course-objectives/ – where I checked out udev, multipathing, iSCSI, LVM and XFS.

This post is about using luci/ricci to get a Red Hat cluster working, though not on a RHEL machine because sadly I do not have one available for practice purposes, so CentOS 6.4 it is. Using OpenStack for virtualization.

Topology: Four hosts on all three networks, -a, -b and internal. Three cluster nodes and one management node.

Get the basic cluster going:

  • image four identical nodes
  • ssh-key is distributed
  • /etc/hosts file has all hosts, IPs and networks
    • network interfaces are configured
    • set a gateway in /etc/sysconfig/network
  • firewall
    • all traffic allowed from -a and -b networks
    • at a minimum allow traffic from the network that the hostname corresponds to that you enter in luci
  • dns (PEERDNS=no is good with several dhcp interfaces)
  • timesync with ntpd
  • luci installed on mgmt-node # luci is the web gui (see the command sketch after this list)
  • ricci installed on all cluster nodes # this is the agent service that luci talks to, and it in turn talks with cman/corosync
    • password set for user ricci on cluster nodes
  • create cluster in luci
    • multicast perhaps doesn't work so well in OpenStack?
    • on cluster nodes this runs "yum -y install cman rgmanager lvm2-cluster sg3_utils gfs2-utils" if shared storage is selected, probably less if not.
  • fencing is really important; doing it in OpenStack would require a bit of work though, not as easy as with KVM/xvm where you can just send a destroy-domain message.
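A rough sketch of the luci/ricci commands behind the list above (nothing cluster-specific here, just the install/start steps):

# on the management node
yum install luci
chkconfig luci on
service luci start        # the web GUI ends up on https://mgmt-node:8084

# on every cluster node
yum install ricci
passwd ricci              # luci authenticates against this password
chkconfig ricci on
service ricci start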

Tests:

  • Update and distribute cluster.conf
  • Have a service run on a node in the cluster (it doesn't need shared storage for this).
  • Commands:
    • clustat
    • cman_tool
    • rg_test test /etc/cluster/cluster.conf start service name-of-service
    • ccs_config_validate

 

Share an iSCSI target between all nodes:

  • Using management node to share the iSCSI LUN.
  • tgtd, multipath
  • clvmd running on all nodes
  • lvmconf – make sure locking is set correctly (locking_type 3 for clvmd)
  • create the vg with clustering enabled (see the command sketch after this list)
  • partprobe; multipath -r # do this often
  • vgs/lvs and make sure all nodes see the clustered lv
  • minimum GFS2 filesystem size is around 128M – you didn't use all of the vg, right? =)
    • for a testing/small cluster, lowering the journal size is goodness
  • mount!
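A command sketch for the clustered LVM + GFS2 part; the VG/LV names, the device path and the cluster name "mycluster" are just examples:

lvmconf --enable-cluster                   # sets locking_type = 3 in /etc/lvm/lvm.conf
service clvmd start                        # on all nodes (and chkconfig clvmd on)
pvcreate /dev/mapper/mpathb
vgcreate -cy clustervg /dev/mapper/mpathb  # -cy = clustered volume group
lvcreate -L 512M -n gfslv clustervg
mkfs.gfs2 -p lock_dlm -t mycluster:gfslv -j 3 -J 16 /dev/clustervg/gfslv   # 3 journals (one per node), small 16MB journals
mount /dev/clustervg/gfslv /mnt            # on each node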

 

Red Hat – Clustering and Storage Management – Course Objectives

Attending “Red Hat Enterprise Clustering and Storage Management” in August. Quite a few of these technologies I haven’t touched upon before so probably best to go through them before the course.

Initially I wonder how many of these are Red Hat specific, or how many of these I can accomplish by using the free clones such as CentOS or Scientific Linux. We’ll see :) At least a lot of Red Hat’s guides will include their Storage Server.

I used the course content summary as a template for this post; my notes are added under each topic below.

For future questions and trolls: this is not a how-to for lazy people who just want to copy and paste. There are plenty of other sites for that. This is just the basics and it might have some pointers so that I know which are the basic steps and names/commands for each task. That way I hope it’s possible to figure out how to use the commands and such by RTFM.

 

 

Course content summary:

Clusters and storage

Get an overview of storage and cluster technologies.

ISCSI configuration

Set up and manage iSCSI.

Step 1: Set up a server that can present iSCSI LUNs – a target.

  1. CentOS 6.4 – minimal. Set up basic stuff like networking, user account, yum update, ntp/time sync then make a clone of the VM.
  2. Install some useful software like: yum install ntp parted man
  3. Add a new disk to the VM

Step 2: Make nodes for the cluster.

  1. yum install iscsi-initiator-utils

Step 3: Set up an iSCSI target on the iSCSI server.

http://www.server-world.info/en/note?os=CentOS_6&p=iscsi

  1. yum install scsi-target-utils
  2. allow port 3260
  3. edit /etc/tgt/targets.conf
  4. if you comment out the IP range and authentication it's a free-for-all (a minimal example follows below)
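A minimal target definition in /etc/tgt/targets.conf could look something like this (the IQN and backing device are made up; adjust to your disk):

<target iqn.2013-08.com.example:storage.lun1>
    backing-store /dev/sdb                 # the extra disk added to the VM
    initiator-address 192.168.0.0/24       # comment out to allow everyone
</target>

Restart tgtd and check with "tgtadm --lld iscsi --mode target --op show" that the LUN is exported.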

http://www.server-world.info/en/note?os=CentOS_6&p=iscsi&f=2

Step 4: Log in to the target from at least two nodes by running 'iscsiadm' commands.
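A sketch of those commands (target IP and IQN as in the made-up example above):

iscsiadm -m discovery -t sendtargets -p 192.168.0.2                          # discover targets on the server
iscsiadm -m node -T iqn.2013-08.com.example:storage.lun1 -p 192.168.0.2 -l   # log in
iscsiadm -m session                                                          # verify; the LUN shows up as a new /dev/sd* device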

Next step would be to put an appropriate file system on the LUN.

UDEV

Learn basic manipulation and creation of udev rules.

http://www.reactivated.net/writing_udev_rules.html is an old link but just change the commands to “udevadm” instead of “udev*” and at least the sections I read worked the same.

udevadm info -a -n /dev/sdb

Above command helps you find properties which you can build rules from. Only use properties from one parent.

I have a USB key that I can pass through to my VM in VirtualBox, without any modifications it pops up as /dev/sdc.

By looking in the output of the above command I can create /etc/udev/rules.d/10-usb.rules that contains:

SUBSYSTEMS=="usb", ATTRS{serial}=="001CC0EC3450BB40E71401C9", NAME="my_usb_disk"

After “removing” the USB disk from the VM and adding it again the disk (and also all partitions!) will be called /dev/my_usb_disk. This is bad.

By using SYMLINK+="my_usb_disk" instead of NAME="my_usb_disk" all the /dev/sdc* devices are kept and /dev/my_usb_disk points to /dev/sdc5. On the next boot it pointed to sdc6 (and before that sg3 and sdc7..). This is also bad.

To make one specific partition with a specific size be symlinked to /dev/my_usb_disk I could set this rule:

SUBSYSTEM=="block", ATTR{partition}=="5", ATTR{size}=="1933312", SYMLINK+="my_usb_disk"

You could do:

KERNEL=="sd*", SUBSYSTEM=="block", ATTR{partition}=="5", ATTR{size}=="1933312", SYMLINK+="my_usb_disk%n"

Which will create /dev/my_usb_disk5 !

This would perhaps be acceptable, but if you ever want to re-partition the disk then you’d have to change the udev rules accordingly.

If you want to create symlinks for each partition (based on it being a usb, a disk and have the USB with specified serial number):

SUBSYSTEMS=="usb", KERNEL=="sd*", ATTRS{serial}=="001CC0EC3450BB40E71401C9", SYMLINK+="my_usb_disk%n"

These things can be useful if you have several USB disks but you always want the disk to be called /dev/my_usb_disk and not sometimes /dev/sdb and sometimes /dev/sdc.

For testing one can use “udevadm test /sys/class/block/sdc”

Multipathing

Combine multiple paths to SAN devices into one fault-tolerant virtual device.

Ah, this one I've been in touch with before with fibre channel; it also works with iSCSI.
multipath is the command – be wary of the devices/multipaths sections versus the default settings in multipath.conf.
multipathd can be used when there actually are multiple paths to a LUN (the target is perhaps available on two IP addresses/networks), but it can also be used just to set a user_friendly name (an alias) for a disk, based on its wwid.

Some good commands:

service multipathd status
yum provides */multipath.conf # device-mapper-multipath is the package. 
multipath -ll

Copy the default multipath.conf into /etc, reload, and run multipath -ll to see what it does.
After that the fun begins!
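As a sketch, the alias part of /etc/multipath.conf could look like this (the wwid is made up; take yours from multipath -ll):

defaults {
    user_friendly_names yes
}

multipaths {
    multipath {
        wwid  36001405a1b2c3d4e5f60708090a0b0c   # made-up wwid
        alias mylun
    }
}

After a multipath -r the device then shows up as /dev/mapper/mylun.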

 

Red Hat high-availability overview

Learn the architecture and component technologies in the Red Hat® High Availability Add-On.

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/High_Availability_Add-On_Overview/index.html

Quorum

Understand quorum and quorum calculations.

Fencing

Understand Fencing and fencing configuration.

Resources and resource groups

Understand rgmanager and the configuration of resources and resource groups.

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/High_Availability_Add-On_Overview/ch.gfscs.cluster-overview-rgmanager.html

Advanced resource management

Understand resource dependencies and complex resources.

Two-node cluster issues

Understand the use and limitations of 2-node clusters.

http://en.wikipedia.org/wiki/Split-brain_(computing)

LVM management

Review LVM commands and Clustered LVM (clvm).

Create Normal LVM and make a snapshot:

Tutonics has a good “ubuntu” guide for LVMs, but at least the snapshot part works the same.

  1. yum install lvm2
  2. parted /dev/vda # create two primary large physical partitions. With a CentOS64 VM in openstack I had to reboot after this step.
  3. pvcreate /dev/vda3 /dev/vda4
  4. vgcreate VG1 /dev/vda3 /dev/vda4
  5. lvcreate -L 1G VG1 # create a smaller logical volume (to give room for snapshot volume)
  6. mkfs.ext4 /dev/VG1/lvol0
  7. mount /dev/VG1/lvol0 /mnt
  8. date >> /mnt/datehere
  9. lvcreate -L 1G -s -n snap_lvol0 /dev/VG1/lvol0
  10. date >> /mnt/datehere
  11. mkdir /snapmount
  12. mount /dev/VG1/snap_lvol0 /snapmount # mount the snapshot :)
  13. diff /snapmount/datehere /mnt/datehere

Revert a Logical Volume to the state of the snapshot:

  1. umount /mnt /snapmount
  2. lvconvert --merge /dev/VG1/snap_lvol0 # this also removes the snapshot under /dev/VG1/
  3. mount /dev/VG1/lvol0 /mnt
  4. cat /mnt/datehere

XFS

Explore the features of the XFS® file system and the tools required for creating, maintaining, and troubleshooting it.

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/xfsmain.html

yum provides */mkfs.xfs

yum install quota

XFS Quotas:

mount with the uquota option for user quotas, or with uqnoenforce for soft (non-enforced) quotas.
use xfs_quota -x to set quotas
inside xfs_quota, 'help limit' shows the syntax of the limit command
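A sketch of the mount + report part (the device is an example; the mount point matches the /home used below):

mount -o uquota /dev/sdb1 /home          # enforce user quotas (uqnoenforce would only account, not enforce)
xfs_quota -x -c 'report -h' /home        # show current usage and limits per user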

To illustrate the quotas: set a limit for user “user”:

xfs_quota -x -c "limit bsoft=100m bhard=110m user" /home

Then create two 50M files. While writing the 3rd file the cp command will halt when it is at the hard limit:

[user@rhce3 home]$ cp 50M 50M_2
cp: writing `50M_2': Disk quota exceeded
[user@rhce3 home]$ ls -l
total 112636
-rw-rw-r-- 1 user user 52428800 Aug 15 09:29 50M
-rw-rw-r-- 1 user user 52428800 Aug 15 09:29 50M_1
-rw-rw-r-- 1 user user 10477568 Aug 15 09:29 50M_2

Red Hat Storage

Work with Gluster to create and maintain a scale-out storage solution.

http://chauhan-rhce.blogspot.fi/2013/04/gluster-file-system-configuration-steps.html
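I have not labbed this yet, but as a rough sketch of the basic Gluster workflow (hostnames, volume name and brick paths are made up):

gluster peer probe server2                             # from server1: add the other node to the trusted pool
gluster volume create vol0 replica 2 server1:/bricks/brick1 server2:/bricks/brick1
gluster volume start vol0
mount -t glusterfs server1:/vol0 /mnt                  # mount from a client using the FUSE client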

Updates to the Red Hat Enterprise Clustering and Storage Management course

Comprehensive review

Set up high-availability services and storage.

cfengine – what’s that about?

http://cfengine.com/what-is-cfengine

It's an old piece of software that is used to make sure that (for example) the same config files are used on all machines. There are several other configuration management systems, for example Puppet; Wikipedia has a nice overview of them.

Let's use the Lustre machines we set up in a previous post.

On cfengine.com there are many examples too.

Inside a policy you have a promise.

Install

Installing on an RPM-based distribution is easy; CFEngine has its own repository where the community edition is available.

http://cfengine.com/cfengine-linux-distros

Get the gpg-key, import it, set up the repository-file and install “cfengine-community”.

Check if “cfengine3” is set to start on boot.
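Roughly, the install steps look like this (URLs/paths are placeholders; follow the instructions on cfengine.com):

rpm --import http://cfengine.com/pub/gpg.key          # placeholder URL for their GPG key
vi /etc/yum.repos.d/cfengine-community.repo           # point baseurl at their community repository
yum install cfengine-community
chkconfig --list cfengine3                            # check whether it starts on boot
chkconfig cfengine3 on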

Test

A small example of how to write a promise.

  • "cf-promises -f <file>" can be used to test that a promise is valid (syntax and more is OK)
  • "cf-agent -f <file>" runs the promise, so if we use the example in the link above it echoes a Hello World (see the sketch below)
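For reference, a minimal standalone promise file (my own sketch, saved as for example /tmp/hello.cf) could look like this:

body common control
{
  bundlesequence => { "hello" };
}

bundle agent hello
{
  reports:
    cfengine_3::
      "Hello World";
}

Then "cf-promises -f /tmp/hello.cf" to validate it and "cf-agent -f /tmp/hello.cf" to run it.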

 

Client/Server

Client pulls policies from the server.

policy-server: mds – 192.168.0.2
client1: client1 – 192.168.0.4
client2: oss1 – 192.168.0.3

on the policy-server hit: "/var/cfengine/bin/cf-agent --bootstrap --policy-server 192.168.0.2"

open port 5308 on the policy-server.

After you see “-> Bootstrap to 192.168.0.2 completed successfully” you can run the same cf-agent command on the client. This points it to use 192.168.0.2 as the policy-server.

No need to open port on the clients.

On the policy-server add this to /var/cfengine/masterfiles/cftest1.cf:

bundle agent test
{
 files:
  "/tmp/cf_test_file"
   comment => "Promise that a plain file exists with stated permissions",
    perms => mog("644", "root", "sys"),
   create => "true";
}

Then in /var/cfengine/masterfiles/promises.cf you can't follow the guide verbatim; promises.cf needs to look like this (it is really important to have ", " as a separator between the bundles; notice the space after the ","):

   body common control 
     {
     bundlesequence => { "main", "test" };
             inputs => { 
                       "cfengine_stdlib.cf", 
                       "cftest1.cf",
                       };
            version => "Community Promises.cf 1.0.0";
     }

After that you can run “cf-agent -Kv” on the client, and it will do what is promised in the cftest1.cf file!

Try to change ownership/permissions on the file; after a while it will have been changed back :)

In /var/cfengine/promise_summary.log you’ll see if it couldn’t keep a promise and if it corrected the mistake.

Distribute it

And to get oss1 the same file: just run the good old "/var/cfengine/bin/cf-agent --bootstrap --policy-server 192.168.0.2" on it and eventually that file will pop up in its /tmp too. Nice!

Some useful stuff.

I’ll probably try out some more useful things in the near future.

Streamline resolv.conf settings, IP routes, or config files for software (for example to make sure /etc/dcache/dcache.conf is the same on all pool servers), or why not a kind of user database, like /etc/passwd? Check out the solutions on cfengine.com!

Setup a 3 Node Lustre Filesystem

Introduction

Lustre is a filesystem often used by clusters because many computers can mount the filesystem simultaneously.

This is a small log/instruction for how to setup Lustre in 3 virtualized machines (one metadata server, one object storage server and one client).

Basic components:

VMWare Workstation
3 x CentOS 6.3 VMs.
Latest Lustre from Whamcloud

To use Lustre your kernel needs to support it. There's a special kernel for the servers and one for the client. Some packages are needed on both.

Besides lustre you’ll need an updated version of e2fsprogs as well (because the version that comes from RHEL6.3 does not support large partitions).

Starting with the MDS. When the basic OS setup is done we will make a copy of that to use for the OSS and the client.

Setup basic services.

Install an MDS

This will run the MDT – the metadata target.

2GB RAM, 10GB disk, bridged networking, 500MB for /boot, the rest for / (watch out, the installer may create a really large swap). Minimal install. Set up OS networking (static IP for the servers, start on boot, open port 988 in the firewall, and possibly some outgoing ports if you decide to restrict those too), run yum update and set up ntp. Download the latest lustre-client, lustre-server and e2fsprogs packages to /root/lustre-client, /root/lustre-server and /root/e2fsprogs respectively (x86_64). Lustre also does not support SELinux, so disable that (it works fine in enforcing until it's time to create the MDS/MDT, and fine in permissive until it's time to mount).
Put all hostnames into /etc/hosts.
Power off and make two full clones.
Set the hostname on each.

Install an OSS

This will contain the OST (object storage target). This is where the data will be stored.

Networking may not work after cloning (the device name may have changed to eth1 or eth2).
You may want to change this afterwards to get the interface called eth0 again; there is a blog post about doing that.
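One common way to do that on CentOS 6 (a sketch):

rm /etc/udev/rules.d/70-persistent-net.rules    # udev regenerates this on next boot with the clone's MAC
vi /etc/sysconfig/network-scripts/ifcfg-eth0    # remove or update the HWADDR line to match the new MAC
reboot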

Install a client

This will access and use the filesystem.

This is a clone of the OSS, made before installing any lustre services or kernels.

Install Lustre

Before you do this it may be wise to take a snapshot of each server. In case you screw the VM up you can then go back :)

Starting with the MDS.

Installing e2fsprogs, kernel and lustre-modules.

Skipping debuginfo and devel packages, installing all the rest.

yum localinstall \
kernel-2.6.32-220.4.2.el6_lustre.x86_64.rpm kernel-firmware-2.6.32-220.4.2.el6_lustre.x86_64.rpm \
kernel-headers-2.6.32-220.4.2.el6_lustre.x86_64.rpm \
lustre-2.2.0-2.6.32_220.4.2.el6_lustre.x86_64.x86_64.rpm \
lustre-ldiskfs-3.3.0-2.6.32_220.4.2.el6_lustre.x86_64.x86_64.rpm \
lustre-modules-2.2.0-2.6.32_220.4.2.el6_lustre.x86_64.x86_64.rpm

The above was not the order in which they were actually installed; yum changed the order so that, for example, kernel-headers came last.

yum localinstall e2fsprogs-1.42.3.wc3-7.el6.x86_64.rpm \
e2fsprogs-debuginfo-1.42.3.wc3-7.el6.x86_64.rpm \
e2fsprogs-devel-1.42.3.wc3-7.el6.x86_64.rpm \
e2fsprogs-libs-1.42.3.wc3-7.el6.x86_64.rpm \
e2fsprogs-static-1.42.3.wc3-7.el6.x86_64.rpm \
libcom_err-1.42.3.wc3-7.el6.x86_64.rpm \
libcom_err-devel-1.42.3.wc3-7.el6.x86_64.rpm \
libss-1.42.3.wc3-7.el6.x86_64.rpm \
libss-devel-1.42.3.wc3-7.el6.x86_64.rpm

After boot, confirm that you have lustre kernel installed by typing:

uname -av

and

mkfs.lustre --help

to see if you have that and

rpm -qa 'e2fs*'

to see if that was installed properly too.

By the way, you probably want to run this to exclude automatic yum kernel updates:

echo "exclude=kernel*" >> /etc/yum.conf

After installing and rebooting into the new kernel it's time to modprobe lustre, start creating the MDT and OST, and then mount things!
But hold on to your horses, first we need to install the client :)

 

And then the Client

Install the e2fsprogs*

We cannot just install the lustre-client packages, because we run a different kernel than the ones that whamcloud have compiled the lustre-client against.

We can either back-pedal and install an older kernel, or we can build (from source / SRPMS) a lustre-client that works on a kernel of our choosing. The latter option seems like a better way, because we can then upgrade the kernel if we want to.

 

Build custom linux-client rpms

Because of a bug it appears that some ext4 source packages are needed, while in fact they are not; you need to add some parameters to ./configure. This will be the topic of a future post.

The above rpmbuild should create rpms for the running kernel. If you want to create rpms for a non-running kernel you are supposed to be able to do that too.

Configure Lustre

Whamcloud have good instructions. Don’t be afraid to check out their wiki or use google.

/var/log/messages is the place to look for more detailed errors.

On the MDS

Because we do not have InfiniBand you want to change the lnet module parameters to include tcp(eth0). These changes are not reflected until reboot (or quite possibly something else); just editing a file under /etc/modprobe.d/, called for example lustre.conf, is not enough on its own.
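For reference, the file I mean is something like this (adjust the interface name to yours):

# /etc/modprobe.d/lustre.conf
options lnet networks=tcp(eth0)

Then reboot (or unload and reload the lnet/lustre modules) for it to take effect.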

Added a 5GB disk to the mds.

fdisk -cu /dev/sdb; n, p, 1, (first-last)

modprobe lustre lnet

mkfs.lustre --mdt --mgs

mount
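Spelled out, the MDS part could look like this; the filesystem name "lustre01" and the device /dev/sdb1 are my choices, adjust to taste:

modprobe lustre
mkfs.lustre --fsname=lustre01 --mgs --mdt /dev/sdb1
mkdir -p /mnt/MDS
mount -t lustre /dev/sdb1 /mnt/MDS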

On the OSS

Also add the parameters into modprobe.

mkfs.lustre --ost

mount
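And on the OSS, something like this (the MGS/MDS NID below assumes the MDS is 192.168.0.2 on tcp):

mkfs.lustre --fsname=lustre01 --ost --mgsnode=192.168.0.2@tcp /dev/sdb1
mkdir -p /mnt/OSS
mount -t lustre /dev/sdb1 /mnt/OSS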

On the client

Add things into modprobe.

mount!

Write something.

Then hit: lfs df -h

To see usage!
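On the client it boils down to something like this (same made-up filesystem name and MDS address as above):

modprobe lustre
mount -t lustre 192.168.0.2@tcp:/lustre01 /mnt
dd if=/dev/zero of=/mnt/testfile bs=1M count=10   # write something
lfs df -h                                         # see usage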

 

Get it all working on boot

You want to start the MDS, then the OSS and last the client.
But while it’s running you can restart any node and eventually it will start working again.

Fstab on the client:
ip@tcp:/fsname /mnt lustre defaults,_netdev 0 0

Fstab on the MDS (and similarly on the OSS, with its own device and mount point):
/dev/sdb1 /mnt/MDS lustre defaults,_netdev 0 0
