Tag Archives: openstack

Studying for the Certified OpenStack Administrator

The plan: study a bit and then attempt the COA exam. If I don’t pass, then attend SUSE’s course during the OpenStack Summit.

And what to study? I’ve been doing OpenStack admin work for the last year or two, so I have already used most services, except Swift. But some things were only done once, when each environment was set up. Also, at $dayjob our code does a lot for us.

One such thing I noticed while looking through https://github.com/AJNOURI/COA/wiki/02.-Compute:-Nova was setting the default project quota. I wondered whether that’s a CLI/web UI/API call or a service config setting. A config file would be weird, unless it’s in Keystone. Turns out default quotas live in each service’s config file. It’s also possible to set a default quota with, for example, the nova command.
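Roughly like this, as a sketch (the quota_* option names are the Newton-era ones from nova.conf’s [DEFAULT] section, and “default” is the quota class projects fall back to):

# /etc/nova/nova.conf
quota_instances = 20
quota_cores = 40

# or via the API, without editing config files:
nova quota-class-update --instances 20 --cores 40 default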

Another perhaps useful thing I did was to go through the release notes for the services. $dayjob runs Newton, so I started with the release after that and tried to grok it and look for the biggest changes. The introduction of placement was one of them, and I got an introduction to it while playing with devstack and the “failed to create resource provider devstack” error. After looking through the logs I saw a “409 Conflict” HTTP error: placement was complaining that the resource provider already existed. So somehow during setup it was created, but in the wrong way? I deleted it and restarted nova; it got created automatically, and after that nova started acting a lot better :)
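For reference, the cleanup was roughly this (a sketch, assuming the osc-placement client plugin is installed; the UUID is whatever the list command shows for the conflicting provider):

openstack resource provider list
openstack resource provider delete <uuid-of-the-devstack-provider>
# then restart nova-compute and it re-creates the provider itself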

haproxy lab setup!

Been seeing haproxy more and more lately, as it seems even the stuff I work with is moving towards the web :)

So it’s as good a time as any to play around with it!

First setup is the tag “single-node” in https://github.com/martbhell/haproxy-lab – this means it just configures one Apache httpd and one haproxy. In httpd it creates multiple vhosts, with content served from different directories, and then haproxy points to each of these vhosts as a backend.

To illustrate the load balancing, the playbook also installs PHP and shows the path of the file that’s being served.
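The generated config boils down to something like this sketch (names and ports are made up here, not taken from the playbook):

frontend http-in
    bind *:80
    default_backend webservers

backend webservers
    balance roundrobin
    server vhost1 127.0.0.1:8001 check
    server vhost2 127.0.0.1:8002 check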

I used Ansible for this and only tested it with CentOS 7 in an OpenStack cloud. The playbook also sets up some “dns” in /etc/hosts.

There are also “ops_playbooks” for disabling/enabling backends and setting weights.
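I haven’t spelled out here how the ops_playbooks do it, but by hand the same thing can be done over the haproxy admin socket – a sketch, assuming “stats socket /var/run/haproxy.sock level admin” is configured and the backend/server names from the sketch above:

echo "disable server webservers/vhost1" | socat stdio /var/run/haproxy.sock
echo "enable server webservers/vhost1" | socat stdio /var/run/haproxy.sock
echo "set weight webservers/vhost1 50" | socat stdio /var/run/haproxy.sock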

I wonder what a good next step is. Maybe multiple hosts / Docker containers? Maybe SSL termination + Let’s Encrypt? Maybe some performance benchmarking/tuning?
I like the help for the configuration file – it begins with some detail about what an HTTP request looks like :)

Lustre 2.5 + CentOS6 Test in OpenStack

Reason: testing Lustre 2.5 from a clean CentOS 6.5 install in an OpenStack cloud.

Three VMs: two servers (one MDS, one OSS) and one client, CentOS 6.5 on all. An open internal Ethernet network for the Lustre traffic (don’t forget firewalls). Yum updated to the latest kernel. Two volumes presented to lustreserver and lustreoss for the MDT and OST; both show up as /dev/vdc. Hostnames set. /etc/hosts updated with the three IPs: lustreserver, lustreoss and lustreclient.

With 2.6.32-431.17.1.el6.x86_64 there are currently some issues building the server components. One needs to use the latest b2_5 branch, so the instructions are at https://wiki.hpdd.intel.com/pages/viewpage.action?pageId=8126821

Server side

MDT/OST: Install e2fsprogs and reboot after yum update (to run the latest kernel).

yum localinstall all files from: http://downloads.whamcloud.com/public/e2fsprogs/1.42.9.wc1/el6/RPMS/x86_64/

Next is to rebuild lustre kernels to work with the kernel you are running and the one you have installed for next boot: https://wiki.hpdd.intel.com/display/PUB/Rebuilding+the+Lustre-client+rpms+for+a+new+kernel

SRPMs are here: http://downloads.whamcloud.com/public/lustre/latest-feature-release/el6/server/SRPMS/

For rebuilding these are also needed:

yum -y install kernel-devel* kernel-debug* rpm-build make libselinux-devel gcc

basically:

  • git clone -b b2_5 git://git.whamcloud.com/fs/lustre-release.git
  • autogen
  • install kernel.src from redhat (puts tar.gz in /root/rpmbuild/SOURCES/)
  • if rpmbuilding as user build, then copy files from /root/rpmbuild into /home/build/rpmbuild..
  • rebuilding the kernel requires quite a bit of hard disk space; as I only had 10G for /, I made symlinks under $HOME to $HOME/kernel and $HOME/lustre-release (see the sketch below the list)
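Roughly, the list above translates to something like this (a sketch; the --with-linux path assumes the kernel SRPM was built under $HOME/kernel, as in the wiki instructions):

git clone -b b2_5 git://git.whamcloud.com/fs/lustre-release.git
cd lustre-release
sh autogen.sh
./configure --with-linux=$HOME/kernel/rpmbuild/BUILD/kernel-2.6.32-431.17.1.el6/linux-2.6.32-431.17.1.el6.x86_64
make rpms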

yum -y install expect, then install the new kernel with the Lustre patches, plus the lustre and lustre-modules packages.

Not important?: WARNING: /lib/modules/2.6.32-431.17.1.el6.x86_64/weak-updates/kernel/fs/lustre/fsfilt_ldiskfs.ko needs unknown symbol ldiskfs_free_blocks

/sbin/new-kernel-pkg --package kernel --mkinitrd --dracut --depmod --install 2.6.32-431.17.1.el6_lustre

chkconfig lustre on

edit /etc/modprobe.d/lustre.conf and add the lnet parameters
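For example, if the internal Lustre network is on eth1 (an assumption; use whatever your second interface is called), /etc/modprobe.d/lustre.conf can contain just:

options lnet networks=tcp0(eth1)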

modprobe lnet
lctl network up
# lctl list_nids

creating MDT: mkfs.lustre --mdt --mgs --index=0 --fsname=wrk /dev/vdc1
mounting MDT: mkdir /mnt/MDT; mount.lustre /dev/vdc1 /mnt/MDT

creating OST: mkfs.lustre --ost --index=0 --fsname=wrk --mgsnode=lustreserver /dev/vdc1
mounting OST: mkdir /mnt/OST1; mount -t lustre /dev/vdc1 /mnt/OST1

Client Side

rpmbuild --rebuild --without servers lustre-*.src.rpm # point it at the src.rpm from the SRPMS link above

cd /root/rpmbuild/RPMS/x86_64
rpm -Uvh lustre-client*

add modprobe.d/lustre.conf
modprobe lnet
lctl network up
lctl list_nids

mkdir /wrk; mount.lustre lustreserver@tcp:/wrk /wrk

lfs df!

Red Hat – Clustering and Storage Management – Course Objectives – part 2

Post 1 – http://www.guldmyr.com/blog/red-hat-clustering-and-storage-management-course-objectives/ – where I checked out udev, multipathing, iSCSI, LVM and XFS.

This post is about using luci/ricci to get a Red Hat cluster working – but not on a RHEL machine, because sadly I do not have one available for practice purposes. So CentOS 6.4 it is, using OpenStack for virtualization.

Topology: four hosts, each on all three networks (-a, -b and internal). Three cluster nodes and one management node.

Get the basic cluster going:

  • image four identical nodes
  • ssh-key is distributed
  • /etc/hosts file has all hosts, IPs and networks
    • network interfaces are configured
    • set a gateway in /etc/sysconfig/network
  • firewall
    • all traffic allowed from -a and -b networks
    • at a minimum, allow traffic from the network corresponding to the hostname you enter in luci
  • dns (PEERDNS=no is good with several dhcp interfaces)
  • timesync with ntpd
  • luci installed on mgmt-node # luci is the web GUI
  • ricci installed on all cluster nodes # this is the agent that talks with corosync (see the command sketch after this list)
    • password set for user ricci on cluster nodes
  • create cluster in luci
    • multicast perhaps doesn’t work so well in openstack ?
    • on cluster nodes this runs “yum -y install cman rgmanager lvm2-cluster sg3_utils gfs2-utils” if shared storage is selected, probably less if not.
  • fencing is really important; how to do it in OpenStack would require a bit of work though. It’s not as easy as with kvm/xvm, where you can send a destroy domain message.
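The manual ricci/luci bootstrap is roughly this (a sketch; luci’s web GUI listens on port 8084 by default):

# on each cluster node:
yum -y install ricci
passwd ricci
chkconfig ricci on; service ricci start

# on the management node:
yum -y install luci
chkconfig luci on; service luci start
# then browse to https://mgmt-node:8084 and create the cluster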

Tests:

  • Update and distribute cluster.conf
  • Have a service run on a node in the cluster (it doesn’t need shared storage for this).
  • Commands:
    • clustat
    • cman_tool
    • rg_test test /etc/cluster/cluster.conf start service name-of-service
    • ccs_config_validate

 

Share an iSCSI target between all nodes:

  • Using management node to share the iSCSI LUN.
  • tgtd, multipath
  • clvmd running on all nodes
  • lvmconf – make sure locking is set correctly (see the sketch after this list)
  • create vg with clustering
  • partprobe; multipath -r # do this often
  • vgs/lvs and make sure all nodes see the clustered LV
  • minimum GFS2 filesystem size is around 128M – you didn’t use the whole VG, right? =)
    • for testing/small cluster lowering the journal size is goodness
  • mount!
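The storage part of that list, spelled out as a sketch (VG/LV/cluster names are made up; -J 32 is the smaller journal mentioned above, -j 3 gives one journal per node):

lvmconf --enable-cluster # sets locking_type = 3 in /etc/lvm/lvm.conf
service clvmd start
vgcreate -cy clustervg /dev/mapper/mpathb
lvcreate -L 500M -n gfslv clustervg
mkfs.gfs2 -p lock_dlm -t mycluster:gfslv -j 3 -J 32 /dev/clustervg/gfslv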

 

Red Hat – Clustering and Storage Management – Course Objectives

Attending “Red Hat Enterprise Clustering and Storage Management” in August. Quite a few of these technologies I haven’t touched upon before so probably best to go through them before the course.

Initially I wonder how many of these are Red Hat specific, or how many of these I can accomplish by using the free clones such as CentOS or Scientific Linux. We’ll see :) At least a lot of Red Hat’s guides will include their Storage Server.

I used the course content summary as a template for this post; my notes are added within it, below.

For future questions and trolls: this is not a how-to for lazy people who just want to copy and paste. There are plenty of other sites for that. This is just the basics, with some pointers, so that I know the basic steps and names/commands for each task. That way I hope it’s possible to figure out how to use the commands by RTFM.

 

 

Course content summary :

Clusters and storage

Get an overview of storage and cluster technologies.

ISCSI configuration

Set up and manage iSCSI.

Step 1: Set up a server that can present iSCSI LUNs – a target.

  1. CentOS 6.4 – minimal. Set up basic stuff like networking, a user account, yum update and ntp/time sync, then make a clone of the VM.
  2. Install some useful software like: yum install ntp parted man
  3. Add a new disk to the VM

Step 2: Make nodes for the cluster.

  1. yum install iscsi-initiator-utils

Step 3: Set up an iSCSI target on the iSCSI server.

http://www.server-world.info/en/note?os=CentOS_6&p=iscsi

  1. yum install scsi-target-utils
  2. allow port 3260
  3. edit /etc/tgt/targets.conf
  4. if you comment out the IP range and authentication, it’s a free-for-all (see the example stanza below)
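A minimal targets.conf stanza could look like this (the IQN, backing device and subnet are made-up examples):

<target iqn.2013-08.com.example:target1>
    backing-store /dev/vdb
    initiator-address 192.168.0.0/24
</target>

Then service tgtd start, and check the result with tgt-admin -s.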

http://www.server-world.info/en/note?os=CentOS_6&p=iscsi&f=2

Step 4: Log in to the target from at least two nodes by running ‘iscsiadm’ commands.
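For example, reusing the made-up IQN from the stanza above and an assumed target IP:

iscsiadm -m discovery -t sendtargets -p 192.168.0.10
iscsiadm -m node -T iqn.2013-08.com.example:target1 -p 192.168.0.10 --login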

Next step would be to put an appropriate file system on the LUN.

UDEV

Learn basic manipulation and creation of udev rules.

http://www.reactivated.net/writing_udev_rules.html is an old link, but just change the commands to “udevadm” instead of “udev*” and at least the sections I read worked the same.

udevadm info -a -n /dev/sdb

The above command helps you find properties from which you can build rules. Only use properties from one parent device.

I have a USB key that I can pass through to my VM in VirtualBox, without any modifications it pops up as /dev/sdc.

By looking in the output of the above command I can create /etc/udev/rules.d/10-usb.rules that contains:

SUBSYSTEMS=="usb", ATTRS{serial}=="001CC0EC3450BB40E71401C9", NAME="my_usb_disk"

After “removing” the USB disk from the VM and adding it again the disk (and also all partitions!) will be called /dev/my_usb_disk. This is bad.

By using SYMLINK+="my_usb_disk" instead of NAME="my_usb_disk" all the /dev/sdc devices are kept and /dev/my_usb_disk points to /dev/sdc5. On the next boot it pointed to sdc6 (and before that sg3 and sdc7..). This is also bad.

To make one specific partition with a specific size be symlinked to /dev/my_usb_disk I could set this rule:

SUBSYSTEM=="block", ATTR{partition}=="5", ATTR{size}=="1933312", SYMLINK+="my_usb_disk"

You could do:

KERNEL=="sd*", SUBSYSTEM=="block", ATTR{partition}=="5", ATTR{size}=="1933312", SYMLINK+="my_usb_disk%n"

Which will create /dev/my_usb_disk5 !

This would perhaps be acceptable, but if you ever want to re-partition the disk then you’d have to change the udev rules accordingly.

If you want to create symlinks for each partition (based on it being a usb, a disk and have the USB with specified serial number):

SUBSYSTEMS=="usb", KERNEL=="sd*", ATTRS{serial}=="001CC0EC3450BB40E71401C9", SYMLINK+="my_usb_disk%n"

These things can be useful if you have several USB disks but you always want the disk to be called /dev/my_usb_disk and not sometimes /dev/sdb and sometimes /dev/sdc.

For testing one can use “udevadm test /sys/class/block/sdc”

Multipathing

Combine multiple paths to SAN devices into one fault-tolerant virtual device.

Ah, this one I’ve been in touch with before with Fibre Channel; it also works with iSCSI.
multipath is the command; be wary of devices/multipaths vs default settings in the config.
multipathd can be used when there actually are multiple paths to a LUN (the target is perhaps available on two IP addresses/networks), but it can also be used to set a user_friendly name for a disk, based on its WWID.

Some good commands:

service multipathd status
yum provides */multipath.conf # device-mapper-multipath is the package. 
multipath -ll

Copy the default multipath.conf to /etc, reload, and run multipath -ll to see what it does.
After that the fun begins!
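A tiny multipath.conf sketch for the user_friendly/alias part (the wwid here is made up; take the real one from multipath -ll):

defaults {
    user_friendly_names yes
}
multipaths {
    multipath {
        wwid  1IET_00010001
        alias clusterdisk
    }
}

Then multipath -r, and the device shows up as /dev/mapper/clusterdisk.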

 

Red Hat high-availability overview

Learn the architecture and component technologies in the Red Hat® High Availability Add-On.

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/High_Availability_Add-On_Overview/index.html

Quorum

Understand quorum and quorum calculations.

Fencing

Understand Fencing and fencing configuration.

Resources and resource groups

Understand rgmanager and the configuration of resources and resource groups.

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/High_Availability_Add-On_Overview/ch.gfscs.cluster-overview-rgmanager.html

Advanced resource management

Understand resource dependencies and complex resources.

Two-node cluster issues

Understand the use and limitations of 2-node clusters.

http://en.wikipedia.org/wiki/Split-brain_(computing)

LVM management

Review LVM commands and Clustered LVM (clvm).

Create Normal LVM and make a snapshot:

Tutonics has a good “ubuntu” guide for LVMs, but at least the snapshot part works the same.

  1. yum install lvm2
  2. parted /dev/vda # create two large primary partitions. With a CentOS 6.4 VM in OpenStack I had to reboot after this step.
  3. pvcreate /dev/vda3 /dev/vda4
  4. vgcreate VG1 /dev/vda3 /dev/vda4
  5. lvcreate -L 1G VG1 # create a smaller logical volume (to give room for snapshot volume)
  6. mkfs.ext4 /dev/VG1/lvol0
  7. mount /dev/VG1/lvol0 /mnt
  8. date >> /mnt/datehere
  9. lvcreate -L 1G -s -n snap_lvol0 /dev/VG1/lvol0
  10. date >> /mnt/datehere
  11. mkdir /snapmount
  12. mount /dev/VG1/snap_lvol0 /snapmount # mount the snapshot :)
  13. diff /snapmount/datehere /mnt/datehere

Revert a Logical Volume to the state of the snapshot:

  1. umount /mnt /snapmount
  2. lvconvert --merge /dev/VG1/snap_lvol0 # this also removes the snapshot under /dev/VG1/
  3. mount /dev/VG1/lvol0 /mnt
  4. cat /mnt/datehere

XFS

Explore the features of the XFS® file system and the tools required for creating, maintaining and troubleshooting it.

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/xfsmain.html

yum provides */mkfs.xfs

yum install quota

XFS Quotas:

mount with uquota for user quotas, or with uqnoenforce for soft (non-enforced) quotas.
Use xfs_quota -x (expert mode) to set quotas; ‘help limit’ inside xfs_quota shows the syntax.
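For example (assuming the filesystem lives on /dev/vdb1 and is mounted on /home – made-up names):

mount -o uquota /dev/vdb1 /home
xfs_quota -x -c "report -h" /home # show current usage and limits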

To illustrate the quotas: set a limit for user “user”:

xfs_quota -x -c "limit bsoft=100m bhard=110m user" /home

Then create two 50M files. While writing the third file, the cp command halts when it hits the hard limit:

[user@rhce3 home]$ cp 50M 50M_2
cp: writing `50M_2': Disk quota exceeded
[user@rhce3 home]$ ls -l
total 112636
-rw-rw-r-- 1 user user 52428800 Aug 15 09:29 50M
-rw-rw-r-- 1 user user 52428800 Aug 15 09:29 50M_1
-rw-rw-r-- 1 user user 10477568 Aug 15 09:29 50M_2

Red Hat Storage

Work with Gluster to create and maintain a scale-out storage solution.

http://chauhan-rhce.blogspot.fi/2013/04/gluster-file-system-configuration-steps.html

Updates to the Red Hat Enterprise Clustering and Storage Management course

Comprehensive review

Set up high-availability services and storage.

Thoughts after Brocade’s Analyst and Technology Day 2012

Thursday today, the day after the Day. It was a really long day, and to my surprise it said ‘press’ on my pass – so I had to try to ask some questions :)

Some things picked up:

* New VDX 8770 product released – a modular Ethernet switch with room for 384 10GbE ports. It is 100GbE-ready and also ready for SDN protocols like VXLAN (VMware) and NVGRE (Windows 2012). The VDX 8770 chassis is called “Mercury” internally at Brocade. I found it very similar to the DCX chassis, except that the supervisor modules are half-height.

* Today Brocade opened up registration for the BCEFP certification – Brocade Certified Ethernet Fabric Professional (which includes the VDX 8770). It looks advanced, and you probably want to take the previous exam – BCEFE – first.

* SDN – software-defined networking – was the main focus of the day. Fibre Channel was barely mentioned at all.
Ken Cheng’s (one of Brocade’s VPs) definition of SDN:

“A set of technologies which are focused on achieving three objectives: network virtualization (vxlan), programmatic control (openflow) and cloud orchestration (openstack).”

It was quite obvious that VCS is the vehicle through which Brocade intends to enable these new technologies. SDN is still quite immature (even though Internet2 is already using it in their production network) – so be prepared to wait if you want ready-made solutions.

* VCS seems quite similar to Juniper’s QFabric. They had a hands-on lab where we could connect four smaller VDX switches and a VDX 8770 (4-slot version). The switches only had a unique ID set on them, and there were end-devices (web servers, web cams and a tablet) on different IP subnets on each switch. All I needed to do to connect switches (and devices) was to connect two switches via a fibre pair. Quite easy. Almost too easy to be true. This is something I really enjoy that’s part of Fibre Channel. The technology has quite a few features, self-forming trunks being one of them (with frames striped over all members of a trunk). It also gets rid of spanning tree (so no more unused links).

* Quite soon we should see Brocade’s OEMs release embedded VDX switches for their blade chassis. No news yet about which ones, but lately IBM has been quick to release new Brocade products. As a side note: Brocade initially sold their gear only through OEMs; this is no longer always the case, and they are trying to communicate more directly with customers.

* Pushing down the cost per bit was really important for the internet exchanges.

* It’s a lot easier to write a blog post on my wordpress blog via Chrome (on android) than via the native browser. Using my asus transformer tf101 as a note taking device for the day worked out great. Success!

openstack testing day

Only one day late!

I actually started installing this on the 8th, but I forgot to install it to the HDD, so the ‘yum update’ failed and broke the machine with I/O errors :)

Installing it in VMware Workstation (Fedora 64-bit type, 2 cores, 4G RAM, 20G disk).

http://fedoraproject.org/wiki/Test_Day:2012-03-08_OpenStack_Test_Day

Basic Setup

1

http://fedoraproject.org/wiki/QA:Testcase_install_OpenStack_packages – No problem.

2

http://fedoraproject.org/wiki/QA:Testcase_setup_OpenStack_Nova –

Says that if you are doing this in a VM you need to “configure nova to use qemu without KVM and hardware virtualization”. This is not true, as VMware Workstation 8 has virtualization pass-through.
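For reference, the setting the wiki means is a one-liner in /etc/nova/nova.conf (the Essex-era option name, if I remember right):

libvirt_type = qemu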

[root@localhost mart]# vgcreate nova-volumes $(sudo losetup --show -f /var/lib/nova/nova-volumes.img)
  No physical volume label read from /dev/loop0
  Writing physical volume data to disk "/dev/loop0"
  Physical volume "/dev/loop0" successfully created
  Volume group "nova-volumes" successfully created
openstack-nova-db-setup

Gives this error, which already is reported:

Verified connectivity to MySQL.
Creating 'nova' database.
Asking openstack-nova to sync the databse.
2012-03-09 07:28:26 WARNING nova.utils [-] /usr/lib/python2.7/site-packages/nova/db/sqlalchemy/migrate_repo/versions/075_convert_bw_usage_to_store_network_id.py:49: SADeprecationWarning: useexisting is deprecated.  Use extend_existing.
  useexisting=True)

2012-03-09 07:28:28 WARNING nova.utils [-] /usr/lib/python2.7/site-packages/nova/db/sqlalchemy/migrate_repo/versions/081_drop_instance_id_bw_cache.py:40: SADeprecationWarning: useexisting is deprecated.  Use extend_existing.
  useexisting=True)

Complete!

3

[root@localhost nova]# ADMIN_PASSWORD=$OS_PASSWORD openstack-keystone-sample-data
The default service password has been detected.  Please consider
setting an actual password in environment variable SERVICE_PASSWORD

But after that it generates users.

4

No problems. Should ‘glance index’ return anything at this stage?

5

No problems.

6 Add SSH keypair

No problems, just do exactly what the instructions say (don’t try to be smart and put them in .sh files, for example :P).

7 Register Guest Images

At this point the wiki went down :/

[root@localhost ~]# glance add name=f16 is_public=true disk_format=qcow2 container_format=ovf copy_from=http://berrange.fedorapeople.org/images/2012-02-29/f16-x86_64-openstack-sda.qcow2
Failed to add image. Got error:
Unexpected response: 500
Note: Your image metadata may still be in the registry, but the image's status will likely be 'killed'.

Yes, this is where it falls short. The manpage for glance doesn’t even have ‘copy_from’. Maybe the image could be downloaded manually instead? ‘glance index’ doesn’t work either.

 

[root@localhost ~]# glance index
Failed to show index. Got error:
Internal Server error: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 336, in handle_one_response
    result = self.application(self.environ, start_response)
  File "/usr/lib/python2.7/site-packages/webob/dec.py", line 147, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/usr/lib/python2.7/site-packages/webob/dec.py", line 210, in call_func
    return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/glance/common/wsgi.py", line 279, in __
    response = req.get_response(self.application)
  File "/usr/lib/python2.7/site-packages/webob/request.py", line 1086, in get_re
    application, catch_exc_info=False)
  File "/usr/lib/python2.7/site-packages/webob/request.py", line 1055, in call_a
    app_iter = application(self.environ, start_response)
  File "/usr/lib/python2.7/site-packages/keystone/middleware/auth_token.py", lin
    valid = self._validate_claims(claims)
  File "/usr/lib/python2.7/site-packages/keystone/middleware/auth_token.py", lin
    return self._validate_claims(claims, False)
  File "/usr/lib/python2.7/site-packages/keystone/middleware/auth_token.py", lin
    self.admin_password)
  File "/usr/lib/python2.7/site-packages/keystone/middleware/auth_token.py", lin
    return json.loads(data)["access"]["token"]["id"]
  File "/usr/lib64/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

[root@localhost ~]# cd images/
[root@localhost images]# ls
aki-tty  ami-tty  ari-tty
[root@localhost images]# http://berrange.fedorapeople.org/images/2012-02-29/f16-x86_64-openstack-sda.qcow2^C
[root@localhost images]# glance add name=aki-tty is_public=true container_format=aki disk_format=aki < aki-tty/image
=================================================[100%] 7.79M/s, ETA  0h  0m  0s
Failed to add image. Got error:
You are not authorized to complete this action.
Details: 401 Unauthorized

This server could not verify that you are authorized to access the document you requested. Either you supplied the wrong credentials (e.g., bad password), or your browser does not understand how to supply the credentials required.

Note: Your image metadata may still be in the registry, but the image's status will likely be 'killed'.
[root@localhost images]#

Stuck!