Category Archives: Storage – now using object storage!

Continuing this series of blog posts about the awesome web site where you can see if there was, in fact, an NHL game last night :)

Some background: First I had a python script that scraped the website of and later changed that to just grab the data from the JSON REST API of – much nicer. But it was still outputting the result to stdout as a set and a dictionary, and the application would then import this file to get the schedule. This was quite hacky and ugly :) But hey, it worked.

As of this commit it now uses Google’s Cloud Object Storage:

  • a special URL (one has to be an admin to be able to access it)
  • there’s a cronjob which calls this URL once a day (22:00 in some time zone)
  • when this URL is called a python script runs which:
    • checks what year it is and composes the URL to the API so that we only grab this season’s games (to be a bit nicer to the API)
    • does some sanity checking – that the fetched data is not empty
    • extracts the dates and teams as before and writes two variables,
      • one list which has the dates when there’s a game
      • one dictionary which has the dates and all the games on each date
        • the latter alone would probably be enough ;)
    • finally always overwrites the schedule
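
The steps in the list above could be sketched roughly like this. It is not the code from the commit: the record layout (`date`, `away`, `home`), the season label format and the July rollover month are all assumptions made up for the illustration, and the real API may look quite different.

```python
from collections import defaultdict
from datetime import date


def season_for(today: date, rollover_month: int = 7) -> str:
    """Return a season label such as '20142015'.

    Assumption: a season spans two calendar years and the new season
    counts as current from `rollover_month` onwards.
    """
    start = today.year if today.month >= rollover_month else today.year - 1
    return f"{start}{start + 1}"


def build_schedule(games):
    """Turn (hypothetical) game records into the two variables described above."""
    if not games:  # sanity check: never overwrite the schedule with nothing
        raise ValueError("fetched schedule is empty")
    by_date = defaultdict(list)
    for game in games:
        by_date[game["date"]].append([game["away"], game["home"]])
    dates = sorted(by_date)  # the list of dates that have a game
    return dates, dict(by_date)  # plus the dictionary of games per date


if __name__ == "__main__":
    sample = [
        {"date": "2014-10-08", "away": "MTL", "home": "TOR"},
        {"date": "2014-10-08", "away": "PHI", "home": "BOS"},
        {"date": "2014-10-09", "away": "CBJ", "home": "BUF"},
    ]
    print(season_for(date(2014, 10, 8)))
    print(build_schedule(sample))
```

The list of dates is just the sorted keys of the dictionary, which is why the dictionary alone would be enough.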


Updating it only when there are changes would be cool, as then I could notify myself (and possibly others) when the schedule has changed. But that would mean the JSON dict has to be ordered, which dicts aren’t by default, so I’d have to change some stuff. The GCSFileStat has a checksum-like piece of file metadata called the ETag. But it would probably be best to compute a checksum of the generated JSON myself and add that as extra metadata on the object, as the ETag is probably implemented differently between providers. – fixed – working again!
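
A minimal sketch of the checksum idea, using only the standard library. The function names are made up for the example, and storing the result as object metadata is left out since that part depends on the storage client. Serialising with `sort_keys=True` sidesteps the dict ordering problem, at least for the keys.

```python
import hashlib
import json


def schedule_checksum(schedule: dict) -> str:
    """Checksum of the schedule that does not depend on dict key order."""
    canonical = json.dumps(schedule, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_update(new_schedule: dict, stored_checksum) -> bool:
    """True when the freshly generated schedule differs from the stored one."""
    return schedule_checksum(new_schedule) != stored_checksum
```

Note that `sort_keys` only orders dictionary keys. The order of the games inside each date's list still matters, so those lists would have to be sorted before hashing if the API does not return them in a stable order.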

check_irods – nagios plugin to check the functionality of an iRODS server

Part of my $dayjob as a sysadmin is to monitor all things.

Today I felt like checking if the users on our servers could use the local iRODS storage and thus check_irods was born!

It checks if it can:

  1. put
  2. list
  3. get
  4. remove

a temporary file.
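
The four steps can be sketched like this. It is not the actual check_irods plugin, just an illustration in Python that shells out to the standard icommands (`iput`, `ils`, `iget`, `irm`). The file naming, the messages and the injectable `run` parameter are assumptions made for the example.

```python
import os
import subprocess
import tempfile

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def check_irods(run=subprocess.call):
    """Put, list, get and remove a temporary file; return (exit_code, message)."""
    with tempfile.TemporaryDirectory() as workdir:
        # Unique file name, much like mktemp would give us.
        fd, local = tempfile.mkstemp(prefix="check_irods.", dir=workdir)
        with os.fdopen(fd, "w") as handle:
            handle.write("check_irods\n")
        name = os.path.basename(local)
        fetched = local + ".fetched"
        steps = [
            ("put", ["iput", local]),
            ("list", ["ils", name]),
            ("get", ["iget", name, fetched]),
            ("remove", ["irm", "-f", name]),
        ]
        for label, command in steps:
            if run(command) != 0:
                if label in ("list", "get"):
                    run(["irm", "-f", name])  # best-effort cleanup of the upload
                return CRITICAL, f"CRITICAL - iRODS {label} failed"
    return OK, "OK - put, list, get and remove all succeeded"

# Typical plugin usage: print the message and exit with the code, e.g.
#   code, message = check_irods(); print(message); sys.exit(code)
```

Passing in `run` makes it possible to try the logic without an iRODS server by handing it a fake function that returns 0 or 1.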


  •  iRODS 3.2 with OS trusted authentication
  • mktemp


Nagios Health Check of a DDN SFA12K

Part of my $dayjob as a sysadmin is to monitor all things.

I’ll be publishing my home-made nagios checks on github in the near future.

Here is the first one. It uses the Web API of DDN’s SFA12K (might work on the 10k too, haven’t tried), which is a storage platform.

The URL to the check is located here:

Unfortunately it seems that the Python Egg (the library / API bindings) is still not available online so one has to ask DDN Support to get that.

It’s not perfect: there’s much room for improvement (refactoring, moving the username/password out of a variable) and it makes many assumptions.
But making it work for you shouldn’t be too hard. If you have any questions, comment here or on github :)

High amount of Load_Cycle_Count on Green Western Digital disks

You are monitoring the SMART values of your disks right? They’re usually a real good indicator of the health of the drive.

Thought I’d check out the SMART value of the disks in my desktop today (while checking if I had notifications from smartd on).

Lo and behold, the Load_Cycle_Count (LCC) was really high, much higher than Power_Cycle_Count on the 3TB WD disk I have. It turns out this is quite an old problem so there are a few posts about this on the Internets.
The Interwebs says the max in the specs is 300k load cycles. smartctl -a says I’m already at 218602 after 9302 power on hours (387 days but I power off the computer at night).
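
To put those numbers in perspective, here is a small back-of-the-envelope calculation. It assumes the 300k figure really is the rated limit and that the disk keeps parking at the same average rate, which is a simplification.

```python
LOAD_CYCLES = 218602     # Load_Cycle_Count reported by smartctl -a
POWER_ON_HOURS = 9302    # Power_On_Hours reported by smartctl -a
RATED_CYCLES = 300000    # the 300k figure quoted above (assumed to be the limit)

cycles_per_hour = LOAD_CYCLES / POWER_ON_HOURS
remaining_cycles = RATED_CYCLES - LOAD_CYCLES
remaining_hours = remaining_cycles / cycles_per_hour

print(f"{cycles_per_hour:.1f} load cycles per power-on hour")
print(f"{remaining_cycles} cycles left, "
      f"about {remaining_hours:.0f} power-on hours at this rate")
```

That is roughly 23.5 head parks per hour, which leaves 81398 cycles or somewhere around 3,460 more power-on hours before hitting the rated number.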


Model Family:     Western Digital Caviar Green (AF, SATA 6Gb/s)
Device Model:     WDC WD30EZRX-00DC0B0

For Windows users there’s wdidle3.exe, a DOS program that one can put on a bootable floppy (…) and boot a computer from to change the idle timer on a disk.

Fortunately I run Linux (Ubuntu 14.10 since yesterday) and there’s a tool called idle3ctl – one can grab it from here:

I got the latest source code and compiled it myself because there had been some updates to it since the last release (2012 vs 2011 ..).
“idle3ctl -g” shows that the disk was set to park itself after 8s. I disabled that with idle3ctl, powered the computer off and on, and now the tool says it’s disabled.

Hopefully this should increase the lifetime of my disk.

Lustre 2.5 + CentOS6 Test in OpenStack

Reason: testing Lustre 2.5 from a clean CentOS 6.5 install in OpenStack.

Three VMs: two servers (one MDS, one OSS) and one client. CentOS 6.5 on all. An open internal ethernet network for the lustre traffic (don’t forget firewalls). Yum updated to latest kernel. Two volumes presented to the lustreserver and lustreoss for MDT + OST, both are at /dev/vdc. Hostnames set. /etc/hosts updated with three IPs: lustreserver, lustreoss and lustreclient.

With 2.6.32-431.17.1.el6.x86_64 there are some issues at the moment with building the server components. One needs to use the latest branch for 2.5 so the instructions are

Server side

MDT/OST: Install e2fsprogs and reboot after yum update (to run the latest kernel).

yum localinstall all files from:

Next is to rebuild lustre kernels to work with the kernel you are running and the one you have installed for next boot:

RPMS are here:

For rebuilding these are also needed:

yum -y install kernel-devel* kernel-debug* rpm-build make libselinux-devel gcc


  • git clone -b b2_5 git://
  • autogen
  • install kernel.src from redhat (puts tar.gz in /root/rpmbuild/SOURCES/)
  • if rpmbuilding as user build, then copy files from /root/rpmbuild into /home/build/rpmbuild..
  • rebuilding kernel requires quite a bit of hard disk space, as I only had 10G for / then I made symlinks under $HOME to the $HOME/kernel and $HOME/lustre-release

yum -y install expect, then install the new kernel with the lustre patches, plus lustre and the lustre modules.

Not important?: WARNING: /lib/modules/2.6.32-431.17.1.el6.x86_64/weak-updates/kernel/fs/lustre/fsfilt_ldiskfs.ko needs unknown symbol ldiskfs_free_blocks

/sbin/new-kernel-pkg --package kernel --mkinitrd --dracut --depmod --install

chkconfig lustre on

edit /etc/modprobe.d/lustre.conf and add the lnet parameters

modprobe lnet
lctl network up
# lctl list_nids

creating MDT: mkfs.lustre --mdt --mgs --index=0 --fsname=wrk /dev/vdc1
mounting MDT: mkdir /mnt/MDT; mount.lustre /dev/vdc1 /mnt/MDT

creating OST: mkfs.lustre --ost --index=0 --fsname=wrk --mgsnode=lustreserver /dev/vdc1
mounting OST: mkdir /mnt/OST1; mount -t lustre /dev/vdc1 /mnt/OST1

Client Side

rpmbuild --rebuild --without servers

cd /root/rpmbuild/RPMS/x86_64
rpm -Uvh lustre-client*

add modprobe.d/lustre.conf
modprobe lnet
lctl network up
lctl list_nids

mount.lustre lustreserver@tcp:/wrk /wrk

lfs df!

Brocade Certified Professional Data Center Track – Check!

After ~49 posts on this blog on the topic Brocade the first larger block is finally complete: the Brocade Certified Professional Data Center Track (BCPDC)!

What’s that? So Brocade has several (4) tracks which consist of certifications/accreditations; some are shared between the tracks and some are only in one track.
Currently, after completing 3 out of 4 you get the title Brocade Distinguished Architect! Woop!

It took me ~3.5 years (counting since the first blog post about BCFA (certified fabric administrator)) to complete all the prerequisites for BCPDC, but naturally I didn’t do it as fast as I could. I was patient, and many of the certificates I got by signing up for Brocade’s beta tests of their certs.

Not that many certificates left to take actually before I can complete another track.
Most of the remaining ones are labeled accreditations, which are unproctored tests one does at home.

  • For Brocade Certified Professional Converged Networking (BCPCN) I have 3 accreditations left (Fabric Specialist, FCoE Specialist and Ethernet Fabric Support Specialist) and 1 certification: Ethernet Fabric Professional 2013. For the certification I have signed up for the free one I mentioned in an earlier blog post.
  • For Brocade Certified Professional FICON (BCPF) there’s one accreditation (Accredited FICON Specialist) and one certification (Certified Architect for FICON 2013) remaining.
  • For Brocade Certified Professional Internetworking (BCPI) there are 3 certifications remaining: Certified Layer 4-7 Engineer 2010, Certified Network Professional 2012 and Certified Layer 4-7 Professional 2013.

BANAS – Brocade Certification – Studying

I’m going to focus on the things below when studying for BANAS. They are based on the current objectives listed on Brocade’s page.


Brocade Accredited Network Advisor Specialist Exam Topics

  • The Brocade Accredited Network Advisor Specialist exam has these objectives:

Product Features

  • Demonstrate knowledge of Brocade Network Advisor product features

Installation and Configuration

  • Describe the installation and configuration of Brocade Network Advisor

  • Perform SAN Discovery

    • What are seed switches?
  • Perform IP Discovery

    • BNA 170-WBT is a course that’s currently free by Brocade – it’s about IP Discovery in BNA!
    • Once discovered, devices are stored in the Management application database. The first IP discovered for a device becomes the primary address of the device.
    • Simple/Profile based discovery: single: hostname/IP. Profile: range.
    • Requirements
      • Users must have Discover Setup-IP and “All IP Products AOR” privileges
        • For rediscovery only “All IP Products AOR” is needed?
      • ICMP or telnet must be enabled on devices
      • SNMPv1+v2 or v3 read-write
      • IP range of devices must be known
      • All devices must have SNMP MIB support
    • Access by: “Discover -> IP Products”.
    • One can add default username/password. One can add several and it tries the default and then the rest..
    • It uses OIDs to select products to include/exclude.
      • Cisco/Juniper are available by default.
    • Seed address: the IP the BNA server will use to contact the switches?


  • Describe considerations when migrating to Brocade Network Advisor from other tools
    • Check out the Installation Guide for BNA.


  • Demonstrate knowledge of troubleshooting Brocade Network Advisor

Vyatta: a router/vpn/firewall in a VM

Brocade has a beta exam up for BCVRE – Certified vRouter Engineer – which is on the Vyatta software from the company with the same name that Brocade bought last year.

There is the free open source core. Download from here: (no you don’t have to register).  The evaluation/subscriber version has the API and web gui available, I’ll probably check those out closer to the exam date.

I grabbed VC6.6 – Virtualization ISO. Use it in a VM and assign 5GB disk (install only requires 1G, or you could just run it from the iso, but then it doesn’t keep state between reboots) and 1GB RAM. Two NICs: one NAT and one private. But to get more acquainted with it you’ll likely have to do a bit more configuration on the hypervisor side, such as turning off dhcpd in your virtual networks.

To install it to disk: hit “install system” at the CLI after it’s booted.

More documentation: – there are descriptions of how to get, for example, ssh management working ( set service ssh ).

The server is basically Debian with a more recent kernel (6.6 has 3.3) and a shell to make it more switch-like. It actually uses the bash completion to make it look like this. Check out /etc/bash_completion.d/vyatta-*

To remove a setting use “delete” (comparable to no in other CLIs). There is a web interface, but this is only for subscribers. Core version allows SNMP though if you want to use that :)

What to do with vyatta? A bunch of tutorials are here:

  • NAT
  • VPN (for example connect private cloud <-> Amazon VPN)
  • Firewall
  • Routing (OSPF, BGP, etc)

But no SDN stuff (separate data and the control plane). It looks like it’s not possible to modify the flow table of a switch via Vyatta. This looks like a software router/VPN/firewall with some extras added to it.

Red Hat – Clustering and Storage Management – Course Objectives

Attending “Red Hat Enterprise Clustering and Storage Management” in August. Quite a few of these technologies I haven’t touched upon before so probably best to go through them before the course.

Initially I wonder how many of these are Red Hat specific, or how many of these I can accomplish by using the free clones such as CentOS or Scientific Linux. We’ll see :) At least a lot of Red Hat’s guides will include their Storage Server.

I used the course content summary as a template for this post; my notes are written within it, below.

For future questions and trolls: this is not a how-to for lazy people who just want to copy and paste. There are plenty of other sites for that. This is just the basics and it might have some pointers so that I know which are the basic steps and names/commands for each task. That way I hope it’s possible to figure out how to use the commands and such by RTFM.



Course content summary :

Clusters and storage

Get an overview of storage and cluster technologies.

ISCSI configuration

Set up and manage iSCSI.

Step 1: Set up a server that can present iSCSI LUNs. A target.

  1. CentOS 6.4 – minimal. Set up basic stuff like networking, user account, yum update, ntp/time sync then make a clone of the VM.
  2. Install some useful software like: yum install ntp parted man
  3. Add a new disk to the VM

Step 2: Make nodes for the cluster.

  1. yum install iscsi-initiator-utils

Step 3: Set up an iSCSI target on the iSCSI server.

  1. yum install scsi-target-utils
  2. allow port 3260
  3. edit /etc/tgt/targets.conf
  4. if you comment out the ip range and authentication it’s a free-for-all

Step 4: Login to the target from at least two nodes by running ‘iscsiadm’ commands.

Next step would be to put an appropriate file system on the LUN.


Learn basic manipulation and creation of udev rules. is an old link but just change the commands to “udevadm” instead of “udev*” and at least the sections I read worked the same.

udevadm info -a -n /dev/sdb

Above command helps you find properties which you can build rules from. Only use properties from one parent.

I have a USB key that I can pass through to my VM in VirtualBox, without any modifications it pops up as /dev/sdc.

By looking in the output of the above command I can create /etc/udev/rules.d/10-usb.rules that contains:

SUBSYSTEMS=="usb", ATTRS{serial}=="001CC0EC3450BB40E71401C9", NAME="my_usb_disk"

After “removing” the USB disk from the VM and adding it again the disk (and also all partitions!) will be called /dev/my_usb_disk. This is bad.

By using SYMLINK+="my_usb_disk" instead of NAME="my_usb_disk" all the /dev/sdc devices are kept and /dev/my_usb_disk points to /dev/sdc5. And on next boot it pointed to sdc6 (and before that sg3 and sdc7..). This is also bad.

To make one specific partition with a specific size be symlinked to /dev/my_usb_disk I could set this rule:

SUBSYSTEM=="block", ATTR{partition}=="5", ATTR{size}=="1933312", SYMLINK+="my_usb_disk"

You could do:

KERNEL=="sd*", SUBSYSTEM=="block", ATTR{partition}=="5", ATTR{size}=="1933312", SYMLINK+="my_usb_disk%n"

Which will create /dev/my_usb_disk5 !

This would perhaps be acceptable, but if you ever want to re-partition the disk then you’d have to change the udev rules accordingly.

If you want to create symlinks for each partition (based on it being a usb, a disk and have the USB with specified serial number):

SUBSYSTEMS=="usb", KERNEL=="sd*", ATTRS{serial}=="001CC0EC3450BB40E71401C9", SYMLINK+="my_usb_disk%n"

These things can be useful if you have several USB disks but you always want the disk to be called /dev/my_usb_disk and not sometimes /dev/sdb and sometimes /dev/sdc.

For testing one can use “udevadm test /sys/class/block/sdc”


Combine multiple paths to SAN devices into one fault-tolerant virtual device.

Ah, this one I’ve been in touch with before with fibrechannel, it also works with iSCSI.
Multipath is the command and be wary of devices/multipaths vs default settings.
Multipathd can be used in case there are actually multiple paths to a LUN (the target is perhaps available on two IP addresses/networks) but it can also be used to set a user_friendly name to a disk, based on its wwid.

Some good commands:

service multipathd status
yum provides */multipath.conf # device-mapper-multipath is the package. 
multipath -ll

Copy in default multipath.conf to /etc; reload and hit multipath -ll to see what it does.
After that the Fun begins!


Red Hat high-availability overview

Learn the architecture and component technologies in the Red Hat® High Availability Add-On.


Understand quorum and quorum calculations.
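
As far as I understand it, the calculation is a plain majority rule: the cluster is quorate when more than half of the expected votes are present. A minimal sketch of that rule, assuming one vote per node and no quorum disk (the function names are just for illustration):

```python
def quorum(expected_votes: int) -> int:
    """Votes needed for a majority: more than half of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """True when enough votes are present for the cluster to keep running."""
    return votes_present >= quorum(expected_votes)


if __name__ == "__main__":
    for nodes in (2, 3, 4, 5):
        print(nodes, "nodes -> quorum", quorum(nodes))
```

With this rule a two-node cluster needs both votes, so losing one node loses quorum. That is one reason two-node clusters get a section of their own further down.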


Understand Fencing and fencing configuration.

Resources and resource groups

Understand rgmanager and the configuration of resources and resource groups.

Advanced resource management

Understand resource dependencies and complex resources.

Two-node cluster issues

Understand the use and limitations of 2-node clusters.

LVM management

Review LVM commands and Clustered LVM (clvm).

Create Normal LVM and make a snapshot:

Tutonics has a good “ubuntu” guide for LVMs, but at least the snapshot part works the same.

  1. yum install lvm2
  2. parted /dev/vda # create two primary large physical partitions. With a CentOS64 VM in openstack I had to reboot after this step.
  3. pvcreate /dev/vda3; pvcreate /dev/vda4
  4. vgcreate VG1 /dev/vda3 /dev/vda4
  5. lvcreate -L 1G VG1 # create a smaller logical volume (to give room for snapshot volume)
  6. mkfs.ext4 /dev/VG1/
  7. mount /dev/VG1/lvol0 /mnt
  8. date >> /mnt/datehere
  9. lvcreate -L 1G -s -n snap_lvol0 /dev/VG1/lvol0
  10. date >> /mnt/datehere
  11. mkdir /snapmount
  12. mount /dev/VG1/snap_lvol0 /snapmount # mount the snapshot :)
  13. diff /snapmount/datehere /mnt/datehere

Revert a Logical Volume to the state of the snapshot:

  1. umount /mnt /snapmount
  2. lvconvert --merge /dev/VG1/snap_lvol0 # this also removes the snapshot under /dev/VG1/
  3. mount /dev/VG1/lvol0 /mnt
  4. cat /mnt/datehere


Explore the Features of the XFS® file system and tools required for creating, maintaining, and troubleshooting.

yum provides */mkfs.xfs

yum install quota

XFS Quotas:

mount with uquota for user quotas, mount with uqnoenforce for soft quotas.
use xfs_quota -x to set quotas
help limit

To illustrate the quotas: set a limit for user “user”:

xfs_quota -x -c "limit bsoft=100m bhard=110m user"

Then create two 50M files. While writing the 3rd file the cp command will halt when it is at the hard limit:

[user@rhce3 home]$ cp 50M 50M_2
cp: writing `50M_2': Disk quota exceeded
[user@rhce3 home]$ ls -l
total 112636
-rw-rw-r-- 1 user user 52428800 Aug 15 09:29 50M
-rw-rw-r-- 1 user user 52428800 Aug 15 09:29 50M_1
-rw-rw-r-- 1 user user 10477568 Aug 15 09:29 50M_2
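
The size of the truncated third file can be checked against the limit with a bit of arithmetic. This assumes that the "m" suffix in the limit means mebibytes (1024 * 1024 bytes).

```python
MIB = 1024 * 1024

hard_limit = 110 * MIB      # bhard=110m, assuming "m" means MiB
file_size = 52428800        # size of each complete 50M file (from ls -l)
written_third = 10477568    # size of the truncated third file (from ls -l)

room_for_third = hard_limit - 2 * file_size
print(room_for_third)                  # bytes left under the hard limit
print(room_for_third - written_third)  # bytes not accounted for by file data
```

Two full files leave 10485760 bytes (10 MiB) under the hard limit, and the third file stopped 8192 bytes short of that. Presumably those last blocks are charged to the user for something other than file data, but that is a guess.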

Red Hat Storage

Work with Gluster to create and maintain a scale-out storage solution.

Updates to the Red Hat Enterprise Clustering and Storage Management course

Comprehensive review

Set up high-availability services and storage.

Factory reset of a Brocade SAN switch

Ever wondered which is the easiest way?

Using the “configDefault –all” does not clear everything, for example it doesn’t clear: system name, zoning, etc.

Setting the switch to AG mode (Access Gateway) will clear more things, as it basically dumbs down the switch. It does not remove the licenses, IP or password.

ag --modeenable
ag --modedisable

The ‘ag --modedisable’ command (puts the switch back in normal switch mode) sets the default zone access to No Access, so if you want to merge this switch into a fabric you’ll most likely need to change that and disable/enable the E_Ports.

Quite often there are some good tips on Brocade’s community forum.

Command View P6000 EVA Simulator 10.0

Due to somewhat popular demand here’s another post detailing the steps for somewhat successfully installing HP P6000 Command View Simulator on Windows 7 x64. It can be a bitch.

The older post is from 2011 with CV 9.4; this one also has PA (Performance Advisor) bundled.

  • Download:
  • Two files: EVA Simulator 10.0 (Z7550-00252_EvasimInstaller_100fr_v1.exe) and a readme
  • There is an e-mail listed in the readme!
    • But if you want to, you can put in a comment below saying how sexy I am :p
  • The readme is quite long but most of it is about how to use the PA (Performance Advisor). Appendix B is a required read: it describes how to add the Groups so you can log on to CV.
    • A previous blog post by yours truly also goes through how to add a user group :)

For lazy hounds:

  1. (optional) Disable UAC in Windows and make yourself admin.
  2. Put an account in the Windows Group called “HP Storage Admins”.
  3. Launch the downloaded file (it extracts a setup.exe and .msi file)
  4. Launch setup.exe – it’s located in the same directory where you launched the Z7550-00252_EvasimInstaller_100fr_v1.exe
  5. Next, next, next, next, yes, yes, Wait, yes, Installed!
  6. Try out the “Start HP P6000 EVA Simulator” new icon on your desktop, does it work? Profit!

“XF application has stopped working” – some friendly error I got and the CV simulator did not start.. Most likely a permission issue. Peeking through one of the command prompts, it repeats access denied.

It’s amazing that the CV simulator still relies on .bat scripts. Guess it’s for backwards compatibility with XP and Vista? Only one file necessary for all those Windows OS variants.

With default Windows security, the Simulator runs into a problem when it tries to write to files under c:\program files (x86)\ . There are probably many ways to remedy that, one might be step 1 above. This worked:

  1. Go to C:\Program Files (x86)\Hewlett-Packard\HP P6000 EVA Simulator\evasim
  2. Right-click on ‘start_bundle.bat’ and run it as an administrator. This should start the simulator.
  3. Open up a command prompt with Admin Privileges, cd your way into evasim directory and type: “start startcv.bat”
  4. That should launch the Command View process and also IE pointing to CV.
  5. If not, point your web-browser to: https://localhost:2374/SPoG/ or https://localhost:2374/
  6. Log in with the user/password you added into the “HP Storage Admins” group earlier.

Some tips:

In one of the “DOS” windows, there might be more clues as to what’s going on.

Open a command prompt with admin privileges by typing “cmd” in the search bar then right-clicking and starting as administrator.

Inside the Simulator DOS prompt you can hit enter and if you see some commands (save, stop, exit, start) then that’s the simulator window.

If you want your changes to be kept, type “save” in the simulator window before quitting.

Some thoughts:

It feels a bit ruggish. I bet this whole mess could be improved quite easily with some decent scripts. Here’s one I’d like to see:

if $os == Win7:
    if $write_read_permissions_in_program_files != "allowed":
        print_in_big_letter("You need more axx! Do $THIS")


Studying for BCNE – Brocade Certified Network Engineer

In early April of 2013 Brocade had a great offer – ask for it and you’ll get a voucher to an exam – for free!

I took them up on their offer and scored a voucher for the BCNE – Brocade Certified Network Engineer.

After that I noticed that Brocade also has a limited offer for BCNE, you can take them up on it if you already have a CCNA. By doing that you also get a free voucher to the BCNE exam..

I chose to try it without the recommended course. A bit risky but a long time ago I took the CCNA and passed. For me this exam was probably more about remembering and looking at improvements to all the things in CCNA back in 2005. This post is about my study technique or perhaps more of a record of how I did things. To find places for improvement.

Do you have any study tips you would like to share?

Some really useful links:

  • BCNE in a Nutshell guide – It’s also available on their saba/education page. But it’s out of date in there.
  • Brocade IP Primer – this is a great refresher on most Ethernet things if you’ve been out of touch.
  • Go through the manuals – but read the material in the newer released manuals.
  • IP Quick Reference – CLI Quick and quite comprehensive overview not only of commands but also of technologies. has the list of pages and manuals and guides, but to get the newest documents you have to look elsewhere.
One place to get them is on each Product’s page on, at the bottom there is a place to get some manuals.

First thing I did before diving into the materials was to take the BCNE Knowledge Assessment test. Get some sort of idea of what kind of topic the exam is about.

Then I read the nutshell guide and marked the things I needed to learn more about (basically all). Last time I took an exam with Brocade I only read the nutshell in the beginning of my study time; this time I’m re-reading it every now and then to see if I catch something that is not clear and that I want to focus extra on. I’m also keeping a focus on the objectives of the exam: reading the objectives and trying to answer them with as much detail as I can. The objectives are general so there’s quite a lot of room for freedom there. As a bonus, if you can’t describe something in the objectives well, you just found something you do not know well enough.

After going through the nutshell guide and checking up on a few acronyms and technologies I hadn’t heard about I read through the IP Primer and did the same things there: Mark the things that I thought would be of interest and what I would need to dig deeper into.

Then I went through the NetIron and FastIron configuration guides. Not only did I have a peek at all the pages that were listed as relevant, but I also read chapters that were not listed. Either because I found them interesting or perhaps because the subject in those chapters is touched upon in Nutshell. To me that just means the more you know about the subject the better.

Rehash objectives/previous notes and dig deeper. Perhaps first time you read it you glanced over some part. By digging deeper I mean finding the chapters in all the manuals that touch on this subject and reading them, making more notes. Could also be surfing the Internets or Wikipedia for basic overview of how a technology operates. Eventually all of this crystallizes into a view that describes things in your own words.

To me there are parts of IT exams that you just can’t know even if you’ve been working with it for a long time. For example license options or feature differences between all the products. To learn things like these (also other types of questions I thought would come on the exam) I made flashcards in a spreadsheet and printed it on normal A4 so that the question is on one side and the answer is on the back. This was no easy feat.

After going through all these documents you should be able to figure out yourself which areas are being focused on – which you should be making sure that you know.

Some good articles/blog posts:

P.s. I passed :)

Setup a 3 Node Lustre Filesystem


Lustre is a filesystem often used by clusters because many computers can mount the filesystem simultaneously.

This is a small log/instruction for how to setup Lustre in 3 virtualized machines (one metadata server, one object storage server and one client).

Basic components:

VMWare Workstation
3 x CentOS 6.3 VMs.
Latest Lustre from Whamcloud

To use Lustre your kernel needs to support it. There’s a special one for server and one for the client. Some packages are needed on both.

Besides lustre you’ll need an updated version of e2fsprogs as well (because the version that comes from RHEL6.3 does not support large partitions).

Starting with the MDS. When the basic OS setup is done I will make a copy of that to use for the OSS and Client.

Setup basic services.

Install an MDS

This will run the MDT – the metadata target.

2GB RAM, 10GB disk, bridged networking, 500MB for /boot, rest for / (watch out, it may create a really large swap). Minimal install. Set up OS networking (static ip for servers, start on boot, open port 988 in firewall, possibly some for outgoing if you decide to restrain that too), run yum update and set up ntp. Download latest lustre and e2fsprogs to /root/lustre-client, lustre-server and e2fsprogs appropriately (x86_64). Lustre also does not support selinux, so disable that (works fine with it in enforcing until time to create mds/mdt, also fine with permissive until it’s time to mount).
Put all hostnames into /etc/hosts.
Poweroff and make two full clones.
Set hostname.

Install an OSS

This will contain the OST (object storage target). This is where the data will be stored.

Networking may not work (maybe device name changed to eth1 or eth2).
You may want to change this afterwards to get the interface back to be called (eth0). A blog post about doing that.

Install a client

This will access and use the filesystem.

Clone of the OSS before installing any lustre services or kernels.

Install Lustre

Before you do this it may be wise to take a snapshot of each server. In case you screw the VM up you can then go back :)

Starting with the MDS.

Installing e2fsprogs, kernel and lustre-modules.

Skipping debuginfo and devel packages, installing all the rest.

yum localinstall \
kernel-2.6.32-220.4.2.el6_lustre.x86_64.rpm kernel-firmware-2.6.32-220.4.2.el6_lustre.x86_64.rpm \
kernel-headers-2.6.32-220.4.2.el6_lustre.x86_64.rpm \
lustre-2.2.0-2.6.32_220.4.2.el6_lustre.x86_64.x86_64.rpm \
lustre-ldiskfs-3.3.0-2.6.32_220.4.2.el6_lustre.x86_64.x86_64.rpm

The above was not the order they were installed. Yum changed the order so that for example kernel-headers was last.

yum localinstall e2fsprogs-1.42.3.wc3-7.el6.x86_64.rpm \
e2fsprogs-debuginfo-1.42.3.wc3-7.el6.x86_64.rpm \
e2fsprogs-devel-1.42.3.wc3-7.el6.x86_64.rpm \
e2fsprogs-libs-1.42.3.wc3-7.el6.x86_64.rpm \
e2fsprogs-static-1.42.3.wc3-7.el6.x86_64.rpm \
libcom_err-1.42.3.wc3-7.el6.x86_64.rpm \
libcom_err-devel-1.42.3.wc3-7.el6.x86_64.rpm \
libss-1.42.3.wc3-7.el6.x86_64.rpm

After boot, confirm that you have lustre kernel installed by typing:

uname -av


mkfs.lustre --help

to see if you have that and

rpm -qa 'e2fs*'

to see if that was installed properly too.

By the way, you probably want to run this to exclude automatic yum kernel updates:

echo "exclude=kernel*" >> /etc/yum.conf

After install and reboot into new kernel it’s time to modprobe lustre, start creating MDT, OST and then mount things!
But hold on to your horses, first we need to install the client :)


And then the Client

Install the e2fsprogs*

We cannot just install the lustre-client packages, because we run a different kernel than the ones that whamcloud have compiled the lustre-client against.

We can either back-pedal and install an older kernel, or we can build (from source / SRPMS) a lustre-client that works on a kernel of our choosing. The latter option seems like a better way, because we can then upgrade the kernel if we want to.


Build custom lustre-client rpms

Because of a bug it appears that some ext4 source packages are needed – while they are not. You need to add some parameters to ./configure. This will be the topic of a future post.

The above rpmbuild should create rpms for the running kernel. If you want to create rpms for a non-running kernel you are supposed to be able to run.

Configure Lustre

Whamcloud have good instructions. Don’t be afraid to check out their wiki or use google.

/var/log/messages is the place to look for more detailed errors.

On the MDS

Because we do not have infiniband you want to change the parameters slightly for lnet to include tcp(eth0). These changes are not reflected until reboot (quite possibly something else) – but just editing a file under /etc/modprobe.d/ called for example lustre.conf is not enough.

Added a 5GB disk to the mds.

fdisk -cu /dev/sdb; n, p, 1, (first-last)

modprobe lustre lnet

mkfs.lustre --mdt --mgs


On the OSS

Also add the parameters into modprobe.

mkfs.lustre --ost


On the client

Add things into modprobe.


Write something.

Then hit: lfs df -h

To see usage!


Get it all working on boot

You want to start the MDS, then the OSS and last the client.
But while it’s running you can restart any node and eventually it will start working again.

Fstab on the client:
ip@tcp:/fsname /mnt lustre defaults,_netdev 0 0

Fstab on the OSS and MDS:
/dev/sdb1 /mnt/MDS lustre defaults,_netdev 0 0
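If you have more than a couple of nodes you may want to generate these lines instead of typing them. A rough sketch, with helper names that are my own; "ip" and "fsname" are the same placeholders as in the lines above:

```python
def client_device(mgs_nid: str, fsname: str) -> str:
    """A Lustre client mounts <MGS NID>:/<filesystem name>."""
    return f"{mgs_nid}:/{fsname}"


def fstab_line(device: str, mountpoint: str, options=("defaults", "_netdev")) -> str:
    """Format one fstab entry; _netdev delays the mount until the network is up."""
    return f"{device} {mountpoint} lustre {','.join(options)} 0 0"


print(fstab_line(client_device("ip@tcp", "fsname"), "/mnt"))  # client
print(fstab_line("/dev/sdb1", "/mnt/MDS"))                    # MDS (or OSS)
```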

Thoughts after Brocade’s Analyst and Technology Day 2012

Thursday today, the day after the Day. It was a real long day, and to my surprise it said ‘press’ on my pass – so I had to try to ask some questions :)

Some things picked up:

* New VDX 8770 product released – a modular Ethernet switch. Room for 384 10GbE ports. 100GbE ready and also ready for SDN protocols like VXLAN (VMware) and NVGRE (Windows Server 2012). The VDX 8770 chassis is called “Mercury” internally at Brocade. I found it very similar to the DCX chassis, except that the supervisor modules are half-height.

* Today Brocade opened up registrations for the BCEFP certification – Brocade Certified Ethernet Fabric Professional (which includes the VDX 8770). It looks advanced and you probably want to take the previous exam – BCEFE – first.

* SDN – software-defined networking – was the main focus of the day. Fibre Channel was barely mentioned at all.
Ken Cheng’s (one of the VPs of Brocade) definition of SDN:

“A set of technologies which are focused on achieving three objectives: network virtualization (vxlan), programmatic control (openflow) and cloud orchestration (openstack).”

It was quite obvious that Brocade’s VCS is the technique/medium through which they intend to enable these new technologies. SDN is still quite immature (even though Internet2 are already using it in their production network) – so be prepared to wait if you want ready solutions.

* VCS seems quite similar to QLogic’s/Juniper’s QFabric. They had a hands-on lab where we could connect four smaller VDX switches and a VDX 8770 (4-slot version). The switches only had a unique ID set on them and there were end-devices (web servers, web cams and a tablet) on different IP subnets on each switch. All I needed to do to connect switches (and devices) was to connect two switches via a fibre pair. Quite easy. Almost too easy to be true. This ease is something I really enjoy in Fibre Channel too. The technology has quite a few features, self-forming trunks being one of them (with frames being striped over all members of a trunk). It also gets rid of spanning tree (so no more unused links).

* Quite soon we should see Brocade’s OEMs release embedded VDX switches for their blade chassis. No news yet about which but lately IBM have been quick to release new Brocade products. As a side note: Brocade from start only sold their gear through OEMs, this is no longer always the case and they are trying to communicate more directly with customers.

* Cost per bit was really important to push down for internet exchanges.

* It’s a lot easier to write a blog post on my wordpress blog via Chrome (on android) than via the native browser. Using my asus transformer tf101 as a note taking device for the day worked out great. Success!

Hello Silicon Valley!

Checked in at the hotel, mighty fancy one, it has a pool :)

Weather is great down here, much warmer than in San Fran. Hotel is basically right next door to where I’m heading tomorrow and it looks like there’s quite a lot of people coming!

Cisco is right next door too, gotta go snap some pictures while I’m here, may not get any time off tomorrow.

Lots of lunch places next door too, I was afraid it was going to be in the middle of an industrial area.

And a bonus picture has been taken, but I appear to be having some issues inserting it into the post..


Brocade Analyst and Technology Day 2012

I’ve been invited by Brocade to their Brocade Analyst and Technology Day 2012!

It’s in San José, California and it’s going to be a blast to get over to the US again, it’s been a while.

In anticipation of possible future blog posts I just want to let you know that I’m not getting paid for this – Brocade are paying for flights/hotel though.

More details about the day can be found on facebook and there’s an agenda here.

It’s also possible to register and view the event via the Internets. Will it be broadcast on this URL perhaps? Strange page, just empty with a sky background.

With talks from soon-to-be ex-CEO (new CEO being announced?) and veep of Data Center Networking Group Jason Nolet about new innovations for fabrics it looks interesting indeed.


Time to suit up :)

Brocade Accredited Data Center Specialist – BADCS


Time to study for another one :) Working my way towards the “Data Center Track”. To complete it it would be enough for me to complete 5 accreditations.

This one has a pretty cool name – BADCS!

I haven’t tried one of these Accredited exams before, but as far as I can tell:

  • Cheap: only $20 USD
  • The exam is web based, no need to find a test center, you can do it exactly when you want to.
  • Accreditations do not expire
  • You don’t _have_ to take the course in the prerequisites before taking the exam, but it is recommended :)

Also, for this Accreditation the pre-requisite is the FC-101 course on brocade’s SABA page – and it’s free!

– The BADCS exam consists of 38 questions and lasts 60 minutes
– To pass this exam you must get a score of 71% or better 

So that’s about 27 correct out of 38 questions.
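Quick check of that arithmetic, as a small sketch (the function name is mine, and I am assuming the 71% is applied to the raw number of correct answers):

```python
def questions_needed(total_questions: int, pass_percent: int) -> int:
    """Smallest number of correct answers that reaches pass_percent."""
    # Ceiling division on integers, so no floating point rounding surprises.
    return -(-total_questions * pass_percent // 100)


needed = questions_needed(38, 71)
print(needed, "of 38 is", round(100 * needed / 38, 1), "percent")
```

26 correct would only be about 68%, so 27 really is the minimum.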

The objectives are on this page.

The only part I was initially not entirely sure about is the “Given a scenario, describe when portlog dumps are required”. The objectives indicate that knowledge of Fibre Channel theory is necessary, so the FC-101 course seems like a very good thing to study. I doubt many people remember specific FC mechanisms/theory if they don’t work with these occasionally. Like the well-known addresses – who remembers the address of the name-server or controller? =)
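For the record, here are the well-known addresses I would try to memorise, written as a little Python lookup table. The addresses are the standard Fibre Channel ones; the table is not complete and the helper name is just for illustration:

```python
# Fibre Channel well-known addresses (24-bit), a partial table.
WELL_KNOWN_ADDRESSES = {
    0xFFFFFA: "Management server",
    0xFFFFFB: "Time server",
    0xFFFFFC: "Name (directory) server",
    0xFFFFFD: "Fabric controller",
    0xFFFFFE: "Fabric login server (F_Port)",
    0xFFFFFF: "Broadcast",
}


def describe(fcid: int) -> str:
    """Look up a 24-bit address in the table above."""
    return WELL_KNOWN_ADDRESSES.get(fcid, "not a well-known address in this table")


for address, service in sorted(WELL_KNOWN_ADDRESSES.items()):
    print(f"{address:06X}  {service}")
```

So the name server lives at FFFFFC and the fabric controller at FFFFFD.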

My general tip for the BADCS: Learn the material of the FC-101 course. Really. Learn. It.

You may be tricked into thinking that Brocade’s accreditations are easy because you can do them from home.

Access Gateway – NPV – TR

Say what??

Access Gateway – Brocade

NPV (N_Port Virtualization, not NPIV) – Cisco

Transparent Mode – QLogic

These are all names for the same basic idea / functionality, but as there’s no standard the vendors have made up their own names for it.

A switch in Access Gateway (AG) mode does not consume a Domain ID, lets you do port mapping, and needs NPIV enabled on the port of the switch that it connects to. AG requires a switch / fabric to connect to, as it doesn’t run the normal Fibre Channel services.

It is very useful in case you are going to mix vendors in your fabric. Meaning you can populate the core with Brocade switches and then connect other vendors’ switches in the above modes to the Brocade switches.

On some QLogic switches you can also set a port into TR-mode, see this post on HP’s EBC forum about how to do it. It is not exactly the same as AG or NPV, because you still need to do zoning on the QLogic switch.

There is also the IPM by QLogic for IBM. It looks like a module that you cannot switch between ‘fabric’ and ‘IPM’ mode, which is what you can do on a Cisco or a Brocade switch.


To create a new user group in Windows 7

This post is created upon request by a reader.

May or may not be needed for the P6000 Simulator. It is however required when you install the real HP P6000 Command View.

First you need to get into ‘Computer Management’, do this by right-clicking on ‘My Computer’. Then click your way into ‘Local Users and Groups’, and then into the ‘Groups’ section. In there, right-click somewhere and create a new group called ‘HP Storage Admins’ (or HP Storage Users for read-only). While creating it you can add a user (the one you log in with probably), you can also add it later by right-clicking the group.

Storage FC HBA Transfer Size Tuning

HP just published an advisory describing how to tune some parameters for Emulex, Qlogic and Brocade Fibre Channel HBAs: c02518189. It sounds like these are new, but these changes have been around for at least 6 months in all three vendors’ HBAs.


“Emulex driver version 2.42.002 or later, along with OneCommand Manager version or later,”

Use HBAnywhere to change these.

Examples to tune the server or port level transfer size:

  • 128 kbytes, set the LimTransferSize = 2 and ExtTransferSize = 0 (default)
  • 512 kbytes, set the LimTransferSize = 0 (default) and ExtTransferSize = 0 (default)
  • 1 Mbytes, set the LimTransferSize = 0 (default) and ExtTransferSize = 1
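The three Emulex combinations above as a lookup, in case you want to sanity check what a server is set to. A sketch only: the function name is mine, and any combination not listed in the advisory examples is rejected rather than guessed at:

```python
# (LimTransferSize, ExtTransferSize) -> transfer size in kbytes,
# taken from the three examples above.
TRANSFER_SIZE_KB = {
    (2, 0): 128,
    (0, 0): 512,   # both at their defaults
    (0, 1): 1024,  # 1 Mbyte
}


def transfer_size_kb(lim_transfer_size: int, ext_transfer_size: int) -> int:
    """Return the transfer size in kbytes for a documented combination."""
    try:
        return TRANSFER_SIZE_KB[(lim_transfer_size, ext_transfer_size)]
    except KeyError:
        raise ValueError("combination not covered by the examples") from None


print(transfer_size_kb(0, 0))
```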


This is part of the Qlogic SANSurfer utility.

  • c:\>qlfc -tsize /fc
  • c:\>qlfc -tsize /fc /set 128
  • c:\>qlfc -tsize /fc /set default


On Brocade HBAs, use bcu:

  • bcu drvconf --key bfa_max_xfer_len --val 64
  • bcu drvconf --key bfa_max_xfer_len --val 128

OpenIndiana + PostgreSQL + dCache

This is a test of installing OpenIndiana and setting up a working dCache test VM.

dCache is a storage element of the Grid (scientific computing).

OI == OpenIndiana. Kind of like opensolaris with an Illumos kernel, not the sun/oracle kernel.

With as a base for how to set up ip settings etc in OI.

oi-dev-151a-text-x86.iso installed

pkg install package/pkg

pkg update

java -version

mkdir /var/postgres

useradd postgres

groupadd postgres

chown postgres:postgres /var/postgres

chmod 755 /var/postgres

The pkg update makes it into 151a2

If you do not create the user, group and directory above, the install of service/postgres will fail and create a new BE (boot environment).

pkg install pkg:/database/postgres-84
pkg install pkg:/service/database/postgres-84

vi /etc/passwd

change postgres to 90:90 and homedir to /export/home/postgres

mkdir /export/home/postgres

chown postgres.postgres /export/home/postgres
root@oi:~# vi /export/home/postgres/.profile

you probably also want to add these to the root user’s path

svcadm enable postgresql_84:default_32bit

root@oi:/var/log# svcs -a|grep postg
disabled       17:29:37 svc:/application/database/postgresql_84:default_64bit
online         17:31:35 svc:/application/database/postgresql_84:default_32bit

su - postgres



I initially did this in an ESXi VM in VMware Workstation, but that kept freezing so I went over to a ‘real vm’ instead. The VM is more responsive.

dCache stuff

wget it from

pkgadd -d dcache-server-1.9.12-16.pkg

follow for the instructions of which postgresql-scripts and users and stuff to create

It’s however not enough :

root@oi:~# /opt/d-cache/bin/dcache start
/opt/d-cache/bin/dcache[127]: local: not found [No such file or directory]
/opt/d-cache/bin/dcache[128]: local: not found [No such file or directory]
/opt/d-cache/bin/dcache[129]: local: not found [No such file or directory]
/opt/d-cache/bin/dcache[130]: local: not found [No such file or directory]
/opt/d-cache/bin/dcache[131]: local: not found [No such file or directory]
/opt/d-cache/bin/dcache[132]: local: not found [No such file or directory]
/opt/d-cache/bin/dcache[317]: .[162]: local: not found [No such file or directory]

so, edit /opt/d-cache/bin/dcache and remove the if in the beginning that will make it use /usr/xpg4/bin/sh – so that it uses /bin/bash instead.

Like this:

if [ "$1" = "%" ]; then
    shift
elif [ "`uname`" = "SunOS" ]; then
    if [ -x /bin/bash ]; then
        exec /bin/bash $0 % "$@"
    else
        echo "Cannot find POSIX compliant shell. This script will"
        echo "probably break, but we attempt to execute it anyway."
    fi
fi

after I changed this, I noticed in the console that it said:

rpcbind: non-local attempt to set


anyway, then start dCache

root@oi:/opt/d-cache/bin# /opt/d-cache/bin/dcache start
Starting dCacheDomain done

in /var/log/dCacheDomain.log you’ll find why it’s not working:

touch /etc/exports

and it appears to be stable, except for some errors about (NFSv3-oi), however, we disregard those for now, we just want to get it running!

vi /opt/d-cache/etc/dcache.conf
mkdir /pool1


vi /opt/d-cache/etc/layouts/single.conf

uncomment the pool1 section, set a maxDiskSize=2G to specify max disk space allowed.
Specifics are in the installation part of the book.

Then point your webbrowser to – see any blue buttons?! yay, it’s up!

Next step is to try it out, this might prove a little bit more difficult (to find dcap/root/srm client for opensolaris/oi).

PostgreSQL problem

so maybe next time you restart the vm it gives some errors and puts the postgresql-server in maintenance mode. Look in /var/adm/messages for some tips, it should point you to

svcs -xv svc:/application/database/postgresql_84:default_32bit


which will tell you more about what’s going on and how to fix it

svcadm restart
svcadm clear

Use dCache with webdav

We’ll start with trying to use Webdav (doesn’t require anything fancy on the client side, except maybe a browser plugin for uploading).

go to the layout file and uncomment the webdav part, add


The script /opt/d-cache/bin/ sadly assumes that you need bash or a special version of bash somehow.
So running

bash /opt/d-cache/bin/ mkdir /data

works, but

/opt/d-cache/bin/ mkdir /data

does not.

See for the rest.

If you keep the webdav in the same domain you’ll need to restart the whole dcache.

In Windows 7 you can then mount a new network folder and click “Connect to a web site that you can use to store your documents and pictures” and in there type:

Now you get another folder in your computer where you can create folders. These will also show up if you surf to , sadly however, you cannot write files. says it’s because pool is full. But it’s 2048MiB and all free?

suggests minimum pool size might be 4G, changed pool maxdiskspace to 8G.

tada, now the copy starts, or the file creation starts, but I cannot actually write anything to it. So if I create a .txt file, I can give it a name and save it, unless I try to write anything inside it!

some errors to accompany this:

 (WebDAV-oi) [door:WebDAV-oi@dCacheDomain:13295xxx] Your resource factory returned a resource with a different name to that requested!!! Requested: null returned: world-writable - resource factory: class org.dcache.webdav.DcacheResourceFactory
 (WebDAV-oi) [door:WebDAV-oi@dCacheDomain:13295xxx] resource is being locked with a null user. This won't really be locked at all...
 (WebDAV-oi) [door:WebDAV-oi@dCacheDomain:13295xxx] resource is being locked with a null user. This won't really be locked at all...
 (WebDAV-oi) [door:WebDAV-oi@dCacheDomain:13295xxx] Your resource factory returned a resource with a different name to that requested!!! Requested: null returned: world-writable - resource factory: class org.dcache.webdav.DcacheResourceFactory
 (pool1) [00002CBCC971ABC14BDC9E496A0AEAA31FC3] A task was added to queue 'store', however the queue is not configured to execute any tasks.

trying dccp

[root] # cd /etc/yum.repos.d/
[root] # wget
[root] # yum install dcap

dccp -d 63 -H /bin/bash dcap://

creates another empty file, while it adds an entry to the 'store' queue and then not so much happens. Stuck on this:

Sending control message: 2 0 client open "dcap://" w -mode=0755 -truncate sl1 40619 -timeout=-1 -onerror=default  -alloc-size=938672  -uid=0 (len=153)


uncomment the nfsv3 and add nfsv41
then on a system you should be able to run ‘apt-get install nfs-common; modprobe nfs; mkdir /nfsv4; mount -t nfs4 /nfsv4’. But for me this stops working with a “cp: closing `./bash': Input/output error”. Possibly because I could not specify -o minorversion=1 on this Ubuntu install (3.0.0-16).


NFSv41 with dCacheToGo

Download dCache2Go from here:

To convert it into VMware format:

VBoxManage clonehd source.vdi target.vmdk --format VMDK

Then create new vm and set the new vmdk file as the disk.

When this VM is up (and the dCache server of course), hit:

mount -t nfs4 -o minorversion=1 /mnt


cd /mnt/data/world-writable
mkdir another
cd another
cp /bin/bash .
cp bash /tmp/bash
diff /tmp/bash /bin/bash
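diff printing nothing means the copy that went through dCache is byte-identical to the original. If you would rather script that check, here is a minimal sketch that compares checksums instead. It uses a temp directory as a stand-in for the NFS mount, and all the names are made up for the example:

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(first, second):
    """True when both files have identical bytes (compared by SHA-256)."""
    return sha256_of(first) == sha256_of(second)


# Stand-in for the cp/diff round trip above, using a temp dir instead of /mnt.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "original.bin"
    original.write_bytes(b"pretend this is /bin/bash\n" * 1000)

    copied = Path(workdir) / "copied.bin"
    shutil.copyfile(original, copied)
    round_trip_ok = same_content(original, copied)

    # Simulate a broken copy to show that the check notices it.
    copied.write_bytes(b"truncated")
    damage_detected = not same_content(original, copied)

print(round_trip_ok, damage_detected)
```

Against the real mount you would point same_content at /bin/bash and the copy read back from /mnt.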

SCORE! We have a working dCache setup in a VM running openindiana!

Brocade Certified Fabric Designer – BCFD – Exam

Just took the BCFD (Brocade Certified Fabric Designer) exam two days ago.

Some tips:

Bring some water and food.

Good exam, but I am really tired of exams and certifications for now!

Also, isn’t it easy to confuse Brocade Certified Fabric Designer with somebody who makes clothes?

Check out my other posts on the BCFD subject:

Brocade Certification – BCFD – Objectives

Data Collection

  • Given a scenario, design a solution that meets the customer’s requirements
  • Given a scenario, demonstrate knowledge of resiliency, redundancy, HA, and locality
  • Given a scenario, describe the various documents required in the design assessment

Practice by making up many scenarios and then deciding which is the best way to design it.

Management and Monitoring Tools

  • Given a scenario, describe how to satisfy a specific monitoring requirement
  • Demonstrate knowledge of Brocade management tools

What to monitor
How to monitor these

Hardware/Software Products and Features

  • Demonstrate knowledge of interoperability of B-Series/M-Series products
  • Given a scenario, describe Brocade hardware products and their purpose
  • Given a scenario, demonstrate knowledge of Brocade software features and purpose

Features: VF, FCR, TI, QoS, FW, IRL, Trunking, Port Fencing, D_Port

Distance Solutions

  • Given availability, performance and distance requirements, design an appropriate long distance solution using Fibre Channel
  • Given a specific set of requirements, demonstrate ability to design a SAN extension solution using FCIP

FastWrite, Tape Pipelining, SACK
Max distance for LWL and ELWL:
Max performance of FCIP:

Performance Tuning Optimization

  • Given a performance scenario, determine an appropriate solution
  • Describe strategies for maximizing throughput in a Data Center Fabric

ICL, nohops, trunking.
How to increase performance in FCIP and FCR:

Migration and Integration

  • Given an existing fabric, identify migration strategies to upgrade the fabric with new technology
  • Given a set of existing fabrics and network devices, determine a consolidation plan that minimizes disruption
  • Describe the requirements to integrate a Brocade DCX Backbone into an existing M-Series fabric

FCR, Integrated Routing, E_port on a switch in the M_series.


  • Identify requirements for restricting which switches/devices may join a fabric
  • Identify security features to restrict administrative access to a switch