
HP P6000 – EVA – Thin Provisioning and Dynamic LUN Migration – XCS 10.000.000

http://h30499.www3.hp.com/t5/Storage-Area-Networks-SAN/XCS-10-000-00/m-p/4789795

2011-07-12: Updated with new link to new HP ITRC forum.

Also it looks like VAAI is not implemented in this firmware.

Some more news!

With Google Translate:
“With the new EVA, the XCS software was also updated and is now more stable and efficient. It is now available for the EVA4000/6000/8000 and EVA4100/6100/8100 as XCS version 6240, and for the EVA 4400/6400/8400 and the P6000 EVA as XCS version 10.000.00, with new features such as “Thin Provisioning” and “Dynamic LUN migration” for the EVA x400 systems.
In addition, there is the new EVA Command View v.9.4, which can be used for all generations of EVA systems.”

The newsletter was in German – it mentions Thin Provisioning and Dynamic LUN Migration (a blog post about Tiering). I find it especially interesting that the x400 will also get this 10.000.00 firmware – which kind of makes sense as it’s already on 09.534.000 (one more number?).
I take this to mean that the architecture inside the P6000 controllers is the same as in the EVA x400 series (PowerPC etc.).

Sounds like a great move, as long as the new firmware is as stable as they claim.
The EVA x400 was not stable for most of the time until the 09534000 firmware was released, unless you were lucky or did not have that many disk shelves.

It will probably be called XCS 10000000 , not 10.000.00 as written in the newsletter above.

HP’s Brocade firmwares compatible with other switches?

After a question in my SAN switch firmware upgrade article I made a comparison of two downloads of 6.3.1b (one via IBM and one from HP) – the only differences were a file called ancillary and one called EULA.pdf. I used examdiff to find the differences.

All the sub-directories were the same, only the above two files were added in the HP one.
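I did the comparison by hand with ExamDiff, but the same kind of check could be scripted. Below is a minimal sketch in Python, assuming both downloads have been unpacked into two directories; the function name and the directory names in the usage note are made up for illustration and have nothing to do with the firmware itself.

```python
import filecmp
import os


def diff_trees(left, right, prefix=""):
    """Recursively compare two unpacked firmware trees.

    Returns three sorted lists of relative paths:
    entries only in `left`, entries only in `right`,
    and files present in both whose contents differ.
    """
    cmp = filecmp.dircmp(left, right)
    only_left = [os.path.join(prefix, name) for name in cmp.left_only]
    only_right = [os.path.join(prefix, name) for name in cmp.right_only]

    # dircmp only looks at file metadata by default, so re-check the
    # common files byte by byte with shallow=False.
    _, mismatch, errors = filecmp.cmpfiles(
        left, right, cmp.common_files, shallow=False
    )
    differing = [os.path.join(prefix, name) for name in mismatch + errors]

    # Descend into sub-directories that exist on both sides.
    for sub in cmp.common_dirs:
        sub_left, sub_right, sub_diff = diff_trees(
            os.path.join(left, sub),
            os.path.join(right, sub),
            os.path.join(prefix, sub),
        )
        only_left += sub_left
        only_right += sub_right
        differing += sub_diff

    return sorted(only_left), sorted(only_right), sorted(differing)
```

Calling something like `diff_trees("v6.3.1b_ibm", "v6.3.1b_hp")` (hypothetical directory names) would, for the two downloads described here, be expected to list only the two extra HP files in the second result and nothing in the other two.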
I believe quite strongly that you can use the HP firmwares to upgrade Brocade switches that are branded by other vendors.

At least IBM and normal Brocade ones.

As they are using the very same Brocade firmware that Brocade themselves use, it might be hard for the vendors to change the switch that much.

It would be interesting to investigate if other vendors add something to make theirs incompatible, but I have no way of acquiring such a firmware.

The EULA looks like a normal HP standard end user license agreement form. The HP ancillary.txt file contains this:

“This ancillary.txt file provides information as to how to obtain the open source or other third party licenses in this distribution. To obtain such licenses, run the following CLI command at the prompt, “opensource”.
This ancillary.txt file also provides the instructions for customers who require a copy of the
machine-readable GPL Source Code by written request.  Upon your written request, HP will provide to You, for a fee covering the cost of distribution, a complete machine-readable copy of the GPL Source Code. Your written request for GPL Source Code can be sent via email to FC_Infrastructure_OpenSourceRequest@hp.com. In the request, include product name, version number, your name, and your shipping address. “

HEPIX Spring 2011 – Day 5

What day it is can be told by all the suitcases around the room.

Version Control

An overview of the version control used at CERN. Quite cool, they’re not using Git yet but they are moving away from CVS, which is not updated anymore, to SVN (Subversion). Apparently hard to migrate.

They use DNS load balancing

  • Browse code / logging, revisions, branches: WEBSVN – on the fly tar creation.
  • TRAC – web SVN browsing tool plus: ticketing system, wiki, plug-ins.
  • SVNPlot – generate SVN stats. No need to check out source code (svnstats does a ‘co’).

Mercurial was also suggested alongside Git (which was created by Linus Torvalds).

Cern – VM – FS

Cern-VM-FS (CVMFS) looked very promising. At the moment it is not intended for images but more for distributing applications. It uses a Squid proxy server and looked really excellent. It gives you a mount point like /cvmfs/ and under there you have the software.

http://twitter.com/cvmfs

Requirements needed to set it up:

  • Rpms: cvmfs, -init-scripts, -keys, -auto-setup (for tier-3 sites does some system configs), fuse, fuse-libs, autofs
  • squid cache – you need to have one. Ideally two or more for resilience. Configured (at least) to accept traffic from your site to one or more cvmfs repository servers. You could use existing frontier-squids.

 

National Grid Service Cloud

A British cloud.

Good for teaching with a VM – if a machine is messed up it can be reinstalled.

Scalability – ‘cloudbursting‘ – users make use of their local systems/clusters – until they are full – and then if they need to they can do extra work in the cloud. Scalability/cloudbursting is the key feature that users are looking for.

Easy way to test an application on a number of operating systems/platforms.

Two cases were not suitable. Intensive – with a lot of number crunching.

Good: you don’t have to worry about physical assembly or housing. They do have to install the servers and networking etc. Usually this is done by somebody else. Images are key to making this easier.

Bad: Eucalyptus stability – not so good. Bottlenecks: networking is important. More is required of the whole physical server when it’s running VMs.

To put a 5GB vm on a machine you would need 10GB. 5 for the image and 5 for the actual machine.
Some were intending to develop the images locally on this cloud and then move it on to Amazon.

Previous Days:
Day 4
Day 3
Day 2
Day 1

HEPIX Spring 2011 – Day 4

Dinner on the 3rd night was amazing. It was at the hotel Weisse Schwan in Arheilgen outside Darmstadt and it was a nice reception hall with big round tables, waiters with lots of wine and great buffet food. A+

Cloudy day!

Or – Infrastructure as a Service – IaaS

A few had the standpoint that the HEP community is not ready for cloud, not secure enough and we have something that’s working. But maybe a mix period would work. At least for now it’s quite awesome for non i/o intensive applications.

There were talks about virtual images and how to (securely) transfer them between sites. Several options about this, stratuslab cloud distribution of images and cloudscheduler.

One great use case for running computing nodes in the cloud is at the moment for when the cluster is maxed out – then you can kick up some more vms in the cloud to help speed up the run. Or when running the jobs it keeps the VM running as long as jobs that require that kind of VMs are in the queue. Or for testing – quite easy to set up several VMs with different operating systems/platforms and then run testing on them. See cloudscheduler.org

Infrastructure as a Code – IaaC – see Opscode and Chef. A pretty interesting looking  configuration management system.

Terms:
fairshare
json

Oracle

Maybe the most interesting presentations of the day, followed by maybe the most interesting debate, came at the end: the ones from Oracle Linux and Oracle Open Source.

Before the presentation they had a nice slide stating that they don’t make any promises based on the presentation. That presentation is not available but the other one is – the one about Oracle and Open Source.

Oracle Linux (OL) looks pretty good, it’s free to download but if you want any updates you need to pay them. They have an upgrade thing so if you’re on RHEL6 you can apparently update easily (changes some yum repos). A lot of advertisement – but it was a presentation about the distribution. It’s based on RHEL, they take the updates from RHEL, then add their own magic to it. They have a boot setup so if you want to you can boot OL in Red Hat Compatibility mode. Apparently Oracle wants to put Red Hat out of business (after which they were asked: “Where will you get the kernel then?”). x86-64 only.

On the horizon:  

  • btrfs (fs that supports error detection, CoW, snapshots, SSD optimization, small files are put in metadata)
  • vswitch (full network switch, set up virtual network in the OS, ACL, VLAN, QoS, flow monitoring with OpenFlow)
  • Zcache (keep more pages of the fs page cache longer in main memory, more cache using LZO compression and thus fewer I/O operations – a lot faster to compress/uncompress than to access disk)
  • storage connect
  • linux containers (resource management, jails on bsd, zones on solaris, own apps/libs/root, runs on top of the kernel, not a virtualization).

From the discussion:


Pidgin – some wanted Video. Pidgin said: no way. This is how Oracle will run their open source projects like MySQL, Lustre.

“If you don’t like how the project is going – fork.” – Gilles Gravier.

Two reasons to fork: proactively (worried) or because they are unhappy with how it’s going (how it’s going or not going).

People in the audience are afraid that a lot of times a company acquires an open source project and then closes it down.

“When you acquire a company and its projects, you have two options if you don’t want the project. Drop it or kill it. Kill it does not work for open source.” – Gilles Gravier.

Openoffice is not dropped yet. Lots of other options. Fork and work on closed source (like Grid Engine). Drop it and stop working on it. Drop it and “talk to the community”.

No info about Lustre – when asked about it Oracle did not want to comment. Asked to e-mail gilles.gravier@oracle.com for more information.

Will Oracle port debconf to Oracle Linux? Oracle will take a look.

There was a lot of angst against Oracle that surfaced, but Oracle handled it quite well and had good answers.

From one of the Oracles: “Allow me to be a bit provocative: If Oracle’s prices were lower; would you consider buying an Oracle product?”

“It takes 25 years to make a good reputation, 5 minutes to lose it.” – CERN employee.
“SUN used to make hardware and give away software for free; Oracle is .. the other way around.” – Lenz Grimmer
“Laughter” – Audience.

European Open File System SCE

  • http://www.eofs.org
  • one repository of lustre
  • hpcfs.org is another lustre open source – this will merge with opensfs.org. Both are American.
  • Close work together with eofs.org – the two above have agreed on a set of improvements.
  • 2.1 lustre will be released by Whamcloud in summer 2011.
  • LUG – lustre user group – reports and interviews at http://insidehpc.com

 

Next Day:
Day 5

Previous Days:
Day 3
Day 2
Day 1

HEPIX Spring 2011 – Day 3

Day 3 woop!

An evaluation of Gluster: it uses distributed metadata, so there is no bottleneck that comes with a metadata server, and it can or will do some replication/snapshots.

Virtualization of mass storage (tapes). Using IBM’s TSM (Tivoli Storage Manager) and ERMM. Where ERMM manages the libraries, so that TSM only sees the link to the ERMM. No need to set up specific paths from each agent to each tape drive in each library.
They were also using Oracle/SUN’s T10000C tape drives that go all the way up to 5TB – which is quite far ahead of the LTO consortium’s LTO-5 that only goes to 1.5/3TB per tape. Some talk about buffered tape marks, which speed up tape operations significantly.

Lustre success story at GSI. They have 105 servers that provide 1.2PB of storage and max throughput seen is 160Gb/s. Some problems with the Adaptec 5401: it takes longer to boot than the entire Linux system. Not very nice to administrate. Controller complains about high temps – and missing fans of non-existing enclosures. Filter out e-mails with level “ERROR” and look at the ones with “WARNING” instead.

Benchmarking storage with trace/replay. Using strace (comes by default with most Linux distributions) to record some operations and then ioreplay to replay them. Proven to give very similar workloads. Especially great for when you have special applications.

IPv6 – running out of IPv4 addresses, when/will there be sites that are IPv6? Maybe if a new one comes up? What to do? Maybe collect/share IPv4 addresses?

Presentations about the evolution needed at two data centers to accommodate requirements for more resources/computing power.

Implementing ITIL with Service-Now (SNOW) at CERN.

Scientific Linux presentation. Live CD can be found here: www.livecd.ethz.ch. They might port NFS 4.1 that comes with Linux Kernel 2.6.38 to work with SL5. There aren’t many differences between RHEL and SL but in SL there is a tool called Revisor, which can be used to create your own Linux distributions/CDs quite easily.

 

Errata is a term – this means security fixes.

Dinner later today!

 

Next Days:
Day 5
Day 4

Previous Days:
Day 2
Day 1

HEPIX Spring 2011 – Day 2

Guten Abend!

Darmstadt is a very beautiful city. It’s quite old and there are lots of parks and eh, cool, houses.

A person from the UK said yesterday (in the pub Ratkeller) something like this: “A particle physicist’s raison d’être is to find complexities, they wouldn’t turn away from one if their life depended on it. These are the people we provide IT for.”

So no wonder that their IT systems/infrastructure is a little bit complex too!

Today’s topics are: Site Reports, IT Infrastructure (Drupal, Indico, Invenio, Fair 3D cube) and Computing (OpenMP, CMS and Batch nodes).

Site reports

Some of these institutions have a synchrotron which is a cyclic particle accelerator – looks quite cool on the pictures. Some use cfengine for managing the clusters – as in they want to avoid logging on to each node and doing configuration but instead do it from a tool. One such tool that is quite common (Puppet) can also be used for Desktops.

Not many use HP storage stuff, DDN is quite common. Nexsan, bluearc

One site had big problems with their Dell servers – caused by misapplied cooling paste on the CPUs – Dell replaced 90% of the heatsinks and fixed this.


One also had disk failures during high load. They ran the HS06 – HEP-SPEC06 – test and while running that, disks dropped off. The disk failures were traced to anomalously high cooling fan vibration. After replacing all components, and then moving fans to another machine, they saw the error.

IT Infrastructure

CERN is working on moving to Drupal for their web sites. Investigating Varnish (good for ddos, caching and load balancing). Drupal is hard to learn.

Then there were some sessions about programming – CMS 64-bit and OpenMP.
One thought here: is it possible to discern the properties of an Intel/AMD CPU based on the name? Like E5530? Maybe this link on intel.com can be of some assistance.

Fair 3D Tier-0 Green-IT Cube

Quite cool concept (patented) that they are very soon starting to build here in Germany.
Using water vaporization with outside air (and fans in summer) to cool the air, and also water based heat exchangers in each rack to push warm air (by pressure built up by fans, so the racks need to be quite air tight) from the back of the servers through the heat exchanger that cools the air, and then pushes it over the aisle to the next row of racks. They managed to get down to a PUE of 1.062 at best.
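To put that number in perspective: PUE (Power Usage Effectiveness) is the total facility power divided by the power that goes to the IT equipment, so everything above 1.0 is overhead for cooling, power distribution and so on. A small sketch of the arithmetic follows; the 1000 kW IT load is an invented figure for illustration, not something from the talk.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_kw(pue_value, it_equipment_kw):
    """Power spent on everything that is not IT equipment (cooling etc.)."""
    return (pue_value - 1.0) * it_equipment_kw


# Hypothetical example: a 1000 kW IT load at the quoted best-case PUE of 1.062
# leaves roughly 62 kW for cooling and other overhead.
print(round(overhead_kw(1.062, 1000.0), 1))
```

At a PUE of 2.0 the same hypothetical load would need another 1000 kW on top, which is why a value this close to 1.0 is impressive.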

Next Days:
Day 5
Day 4
Day 3

Previous Day:
Day 1

 

Next Generation EVA – P6000

On theregister there was recently a post saying that HP may announce the new EVAs in Vegas in June – the P6000.

I just opened a discussion on the ITRC forum, maybe we’ll see some more posts there. I myself am hoping for some thin provisioning or why not SAS backends instead of the quite notorious FC loops. How about native iSCSI or FCoE?

*** 2011-05-05 Update, HP has now officially announced some information about it. For example that it will have the SAS-backend! Whoop! No more loop problems :)

*** 2011-05-05 Another one! Now it came – Thin Provisioning!
http://www.theregister.co.uk/2011/05/05/hp_p6000_data/

HEPIX Spring 2011 – Day 1

Morning.
Got in last night at around 2140 local time.
I should’ve done a little more exact research for how to find my hotel. Had to walk some 30 minutes (parts of it the wrong way) to get to it. But at least I made it to see some ice hockey... too bad Detroit lost.

Today’s another day though!

First stop: breakfast.

Wow. What a day, and it’s not over yet! So much cool stuff talked about.

Site Reports

The first half of the day was site reports from various places.

GSI here in Darmstadt (which is where some of the heaviest elements have been discovered). They have started an initiative to keep Lustre alive – as apparently Oracle is only going to develop this for their own services and hardware. They are running some SM – SuperMicro servers that have InfiniBand on board – and not like the HP ones I’ve seen that have the Mellanox card as an additional mezzanine card. Nice. They were also running some really cool water cooling racks that use the pressure in some way to push the hot air out of the racks. They found that their SM file servers had much stronger fans at the back, and not optimized airflow inside the servers, so they had to tape over some (holes?) over the PCI slots on the back of the server to make it work properly for them. They were also running the servers at around 30C – altogether they got a PUE of around 1.1 which is quite impressive.

Other reports: Fermilab (lots of storage, their Enstore has for example 26PB of data on tape), KIT, Nikhef (moved to ManageEngine for patch and OS deployment, and Brocade for IP routers), CERN (lots of hard drives had to be replaced.. around 7000.. what vendor? HP, Dell, SM?), DESY (replaced Cisco routers with Juniper for better performance), RAL (problem with LSI controllers, replaced with Adaptec), SLAC (FUDForum for communication).

 

Rest of the day was about:

Messaging

Some talk about messaging – for signing and encrypting messages. Could be used for sending commands to servers but also for other stuff. I’ve seen ActiveMQ in EyeOS and it’s also elsewhere as well. Sounds quite nice but apparently not many use it, instead they use ssh scripts to run things like that.

Security

About various threats that have been in the news lately, and also a presentation of some rootkits and a nice demo of a TTY hack. Basically the last one consists of one client/Linux computer that has been hacked, then from this computer a person with access to a server sshs there. And then the TTY hack kicks in and gives the hacker access to the remote host. Not easy to defend against.

There was also a lengthier (longest of the day) 1h-1.5h presentation of a French site that went through how they went ahead when replacing their home-grown Batch management system with SGE(now Oracle Grid Engine).

*** Updated the post with links to some of the things. Maybe the TTY hack has another name that’s more public.

Next Days:

Day 5
Day 4
Day 3
Day 2

HEPIX Spring 2011

I’m heading to Hepix this whole week!

Looks like there’s some really interesting topics like:

Lustre, Gluster, IPv6, stuff about the CERN IT facilities, Scientific Linux report, cloud/grid virtualization, Oracle Linux.

I’ll sure be doing a bit of blogging about what’s going down.

Learning Storage

** 2011-08-18 Just updated the link to the HP forum.

I also wrote a primer to Data Storage.

Just listened to this podcast episode #96 of Infosmack on theregister (about storage).
Very interesting to hear what some really experienced people say about storage and the future. Like hybrid disk drives becoming more and more predominant in the future and maybe encrypted drives?

One participant in the talk was Larry Freeman, who wrote a book called Storage Brain that NetApp apparently uses for introducing new hires to storage. They also have a page with some other cool storage stuff:

http://www.storage-brain.com/2011/04/my-infosmack-prognostications/

Book looks pretty cheap – might be quite interesting to read. The only other one I’ve read was HP’s Introduction to SAN – and evidently this is extremely HP specific and it is quite “high level” like the theory and intentions for a SAN. But it does go through the basics. Sometimes it is like reading a brochure.

I wonder if the Storage Brain is then focused on Netapp products?

Some more tips can be found in this HP forum thread if you are interested.

For example I really recommend Brocade’s FC 101 training. Excellent start for SAN – the networking part of storage. But there is a lot more: disk arrays, tape libraries, host side stuff like multipathing and why not disaster recovery, redundancy or replication.

King in ITRC!

Just got the Royal rank – 2500 points :)

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1477054&admit=109447626+1302808876390+28353475

Woop woop!

HP’s ITRC forum is a great place to learn more, help out :)

For me the primary reason is to remember the things I learnt while working in support. Because I do not work with hardware in that respect anymore, it’s incredible how fast this kind of stuff is forgotten, at least the specifics :/

I hope the ITRC won’t change into something crap after the move to another forum system now in the summer.

WD buys Hitachi GST

Linky to the register

“Western Digital is buying Hitachi Global Storage Technologies for $4.25bn in a friendly takeover …”

Hitachi GST has quite a few home-based disk products which do look interesting.

Hitachi GST’s page is very slow at the moment (2011-03-07 22:06 EET) – it even stopped responding. Maybe that’s because it’s getting slashdotted (/.ed) or dugg or something like that :)

Ye reckon this will make the drives cheaper? Sure hope so. I would like to get a little more, only have 2x500GB in my desktop at the moment :/
But might also be nice to get an SSD disk, but then again I don’t see why. I don’t really have any performance problems..

VSP / VMAX / VAAI

Lots of acronyms :)

VMAX – EMC
VSP – HDS
VAAI – VMWare’s Storage API

I really liked a post by the storage anarchist as it quite eloquently explains what is happening and a great benefit of VAAI. Granted, the post primarily speaks about the erase part and I’m not knowledgeable enough about VAAI itself to tell if there are any other features – but there are, at least according to this article on vmetc.com. That article does not focus on the same two features that the anarchist mentioned. Maybe the storage anarchist did not want to mention all of these other features because they actually do work quite well for arrays that are behind the VSP.

Brocade SAN Switch Firmware Upgrades

Overview

This is my guide/template to upgrading Fabric OS (FOS) – Firmware – on the Brocade SAN Switches. If you have any additions, comments or questions, please go ahead and comment, or you can find my e-mail on https://guldmyr.com. The post has been updated over 188 times according to my WordPress revisions, first update in January 2011.

This article was originally built from my experience with HP branded Brocade SAN Switches – not with any other OEM or pure Brocade switches. I have however since beginning this document gotten experience with other vendors.
I do not think others are different except for licenses and some default fabric.ops.
I made a comparison of two downloads of the 6.3.1b Fabric OS Firmware (one via IBM and one from HP). You can find a link to the “IBM” firmware and release notes after 6.x in that article too. I found that they are very similar and the HP firmware works on the IBM switch and vice versa. Another example is that firmware gotten from HDS works on an HP branded Brocade switch.

When you see 7.2.x this means any version in the Fabric OS 7.2.x series. For upgrades, this would generally mean the latest available in that series (like 7.2.1g for 7.2.x or 8.0.2d for 8.0.x) unless of course there is a problem with the latest. Sign up for your vendor’s security and update alerts to get notified about new releases.
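The "latest in the series" idea can be sketched in code. This is only an illustration in Python, assuming version strings of the form used in this article (three numbers plus an optional single letter, like 7.2.1g) and assuming that a later letter means a later release; the function names are my own and not part of any Brocade tool.

```python
import re

# Matches strings such as "7.2.1", "7.2.1g" or "v8.0.2d".
_FOS_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)([a-z]?)$")


def parse_fos(version):
    """Split a Fabric OS version string into a sortable tuple."""
    match = _FOS_VERSION.match(version.strip().lower())
    if not match:
        raise ValueError("unrecognised Fabric OS version: %r" % version)
    major, minor, patch, letter = match.groups()
    return (int(major), int(minor), int(patch), letter)


def latest_in_series(series, available):
    """Return the newest version in e.g. "7.2.x" from a list, or None."""
    major, minor = series.split(".")[:2]
    wanted = (int(major), int(minor))
    matches = [v for v in available if parse_fos(v)[:2] == wanted]
    if not matches:
        return None
    # A version without a letter sorts before the same version with one.
    return max(matches, key=parse_fos)
```

For example, `latest_in_series("7.2.x", ["7.2.0b", "7.2.1d", "7.2.1g", "8.0.2d"])` picks 7.2.1g. Whether the newest release is actually the one to install is still something to check against the release notes.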

Carefully plan the upgrade, it takes time but it is rewarding and worth it.

Updates in this article:

Old Updates

2011-02-22: Updated links because the release notes I had before to 6.1.x and above did not work anymore. Also changed the sub-versions in 6.1.x and above to the latest released one by HP.
2011-02-24: Found link to 5.2.x and 6.0.x FOS on HP.com with the help of an ITRC thread.
2011-04-21: Added links for correlating Brocade Product name, model number and HP name. Latest in 6.4.x series is now 6.4.1b
2011-05-05: Added link to Web Tools for 6.2.x with reference to how to upgrade Firmware via the web tools.
2011-05-15: A note added about compatibility regarding the ‘HP’ firmware files and other vendors – as far as I can tell the ones downloaded from HP will work on other non-HP switches. Also posted a new blog post about that. Added link to IBM.com – for correlating product names and for getting (all Fabric OS) firmwares. EMC also has Brocade products.
2011-05-18: Added a link to a post on HP’s support forum where the post helped a bit. Also made post a little easier, wrote a little about the release.plist confusion.
2011-05-24: Added example to show that driver updates are important. Some more restructuring of the article.
2011-07-12: Added FOS 7.0.0a
2011-07-14: Added link to HP knowledge base and updated a link to an ITRC forum thread to point to the new forum.
2011-09-29: Added FOS 7.0.0b and section about CF cards.
2011-10-19: Wrote a bit about firmware upgrade order.
2011-10-24: The HP links to 6.0.0c and 5.3.x seem to not work anymore. I could not find either of these for download on HP’s website. The IBM one still has 6.0.0c (release notes anyway).
2011-12-05: Went through all links to make sure they worked. Re-wrote some of the steps and re-ordered so that ‘decide’ is before ‘prepare’. Added output from switch when doing the firmware upgrade via CLI.
2011-12-10: Added table of contents via a plugin.
2012-01-02: Added FOS 7.0.0c
2012-01-09: Added EMC branded switches default pw
2012-02-14: Added HP’s link to FOS 5.x. firmware.
2012-02-15: Added IBM’s link to FOS 7 info and downloads.
2012-02-21: Some notes about which switches can do which firmware. Re-wrote a part of the upgrade order section.
2012-02-27: Note about licenses.
2012-02-29: Added note about 5.1.x to 5.3.x, made upgrade path clearer. Also made how to find 5.3.x and 6.x firmwares a little clearer for HP’s page.
2012-03-01: Added 6.2.2f and 7.0.1 and note about plist/ftp for 5.1.x
2012-04-03: Added 6.3.2e
2012-04-24: Added 7.0.1a
2012-04-27: Rewrote some part of the upgrade section.
2012-06-07: Added 6.4.3 and 7.0.1b
2012-06-14: Added link to Brocade FOS Target Path in decide section.
2012-10-27: Some grammar updates and 16G FOS 7.x requirement. 6.4.3b and 7.0.2.
2012-11-05: Updated links to release notes. Perhaps it’s time to condense the updates list. Notes about passive/active ftp, ifmodeshow|ipaddrshow and java version required (listed in release notes).

2013-03-10: 7.0.2b and 6.4.3c added some notes about compatibility. Improved list of which FOS works with which FC speeds.
2013-03-29: Added 7.1.0a and 7.0.2c. Only HP is out with 7.1.0a as of now. Brocade may have it non-publicly, at least I cannot see it in my brocade. Other minor updates.
2013-04-04: Added link to 6.4.3d
2013-05-02: Updated link to FOS Target Path.
2013-06-23: Changed some ftp:// links to http://
2013-07-16: Added link to IBM’s pdf with pictures for firmware upgrade.
2013-08-03: 6.4.3e by IBM – not available by HP yet. Disruptive upgrades are OK from 6.2 to 6.4.
2013-08-05: Added 7.1.1, updated some links to release notes.
2013-10-03: Made it a bit clearer regarding which is the earliest firmware you can upgrade from. Newer revisions of some Brocade release notes. 7.0.2d out and 6.4.3e link to hp.com
2013-11-14: Removed comment that B300 does not support 6.4.x – it does! It should have been the 200E! Thanks Eugene :)

2014-02-01: As of 2014-02-01 HP does not allow anybody without a valid support agreement to download firmwares. Release notes and at least some firmware links appear to still be working. Expect difficulty and broken links while hunting for firmwares. Fabric OS firmwares downloaded from IBM’s site work on HP switches too, but there might be some differences (although I couldn’t find any important ones when I compared 6.3.1b). So far it seems this restriction of access to firmwares only applies to HP servers.

2014-02-07: Added new link to HP’s page for FOS 5.2 to 6.3  Thanks Leo R!

2014-02-11: Added 7.2.0b and 7.1.1c (HP have 7.1.1c release notes up but IBM does not – to find Brocade version go to IBM’s download the firmware page that’s on Brocade.com and get the release notes from there). Also 3 years anniversary on this post on 2014-02-01!
2014-02-17: But be careful with 7.2.0b – IBM has a note on their 7.x page about 7.2.0b saying: “IBM recommends that customers not deploy FOS 7.2.0b if virtual switch capability is needed. Virtual switch users should migrate to an earlier version as soon as possible.”
2014-03-17: The problem with 7.2.0b was likely DEFECT000491192 fixed in 7.2.0c and later also DEFECT000494570 was fixed in 7.2.0d. 7.2.0x seems a bit unstable at the moment. 7.2.1 is currently available for download via HP’s pages but not via IBM/Brocade’s. Also no release notes available. Archived 2013 updates.
2014-04-18: Added 6.4.3f and 7.1.1c and 7.1.2 is out. Updated migration paths a bit. The Brocade release notes of 7.1.x actually have a decent list of the migration path needed now. See the section “Recommended Migration Paths to FOS v7.1.2”.
2014-08-06: 7.2.1a, 7.1.2a, 7.0.2e. Updated some links that had gone bad (FOS Target Path) and made the “Show/Hide” Updates work again.
2014-09-28: 7.1.2b and 7.2.1b is out.
2014-10-01:  7.3.0a is out but can’t find any release notes for it.
2014-10-25: 7.2.1c
2015-01-18: 7.2.1d, 7.3.0c. “HP’s” release notes are too hard to find.. Added a note about FileZilla being a good ftp server. Thanks Harry Redl!
2015-02-20: 6.2.2g, 6.4.3g
2015-04-06: 7.3.1a from HP
2015-04-21: 7.3.1a from IBM
2015-06-19: Note about FileZilla being hosted off sourceforge – installer might contain malware.
2015-07-07: 7.4.0a, 7.3.1b, 7.2.1e. Removed link to HP’s webkey license page. Doesn’t work anymore. Note about BNA version required to manage Fabric OS v7.4 switches.
2015-08-31: 6.4.3g and 7.3.1b new revision of release notes. 7.2.1f, 7.4.0b
2015-09-01: 7.3.1c
2015-09-16: New versions of release notes.
2015-10-25: 7.4.1 is out for the brave and 6.4.3h (Fixes to OpenSSL CVE-2015-0286, CVE-2015-0288, CVE-2015-0289, CVE-2015-0292)
2015-12-05: 7.2.1g and 7.3.1d. Updated some links. Need to go through a lot of the HP ones here to point to HPE..
2016-02-24: 7.4.1b
2016-03-15: 7.3.2
2016-05-17: 7.4.1c
2016-08-13: 7.3.2a, 7.4.1d
2016-12-19: 7.3.2b, 7.4.1e

New Updates

2017-03-10: 8.x has been out for a while
2017-04-29: new links to Brocade FOS target path and better links for where to fetch firmwares
2017-05-03: 8.0.2b and added links to Upgrade Guides for 8.0.0 and 7.4.0
2017-05-13: 7.4.2
2017-08-07: 8.1.0c
2017-11-09: 8.1.1a
2017-12-05: 8.0.2c
2018-01-01: 8.1.2a and 7.4.2b
2018-02-08: 7.4.2c and 8.0.2d
2018-06-15: 8.2.0a and 8.1.2d and 8.0.2e
2018-10-11: fixing some links, Brocade is now Broadcom so some links are not working anymore, surprise. Some HP links no longer work so removed those too.
2018-12-07: 7.4.2d and 8.0.2f and 8.1.2f
2019-04-18: FOS Target Path on Broadcom.  From a reader got a link to NetApp's Brocadeassist portal where newer firmware can be found than on IBM's. Some more link updates and link to 8.2.1b
2019-04-19: More link fixes.  FOS 6.x - 8.x firmwares can all be downloaded from the brocadeassist portal.
2019-04-20: 8.1.2g and 8.2.1b on IBM release notes so updated links
2019-07-17: BlueChris in a comment found a nice HPE link to older Firmwares!
2019-08-23: 8.2.1c is on NetApp's link but there are no release notes - I'd pass
2019-11-21: all links to firmware downloads don't work right now. Only one I have found is to HPE, but there you need support contract. Found a public link? Help out other people in the comments please. I have a cache myself up to 7.4.x but I'm not sure about the legalities.
2020-05-10: people are sharing links in the comments. Many thanks! Also some new releases, 8.2.2 is out.
2020-09-27:  9.0.0a got released in August

[/su_spoiler]

Steps

  1. decide
  2. prepare
  3. upgrade

Decide

Upgrades after 5.2.x must be done one major release at a time; see the release notes section below for details.

If you have to upgrade in many steps, upgrade to the latest release in each series (or, if that one is very new, it is probably safest to go with the second newest; just check the release notes of the newest to make sure it doesn't fix anything that affects you).

If the switch is on 5.1.x you can go directly to 5.3.x.

What I usually recommend is this path:
5.0.1d -> 5.2.3 -> 5.3.2c -> 6.0.1a -> 6.1.2c -> 6.2.2g -> 6.3.2e -> 6.4.3h -> 7.0.2e -> 7.1.2b -> 7.2.1g -> 7.3.2a -> 7.4.2f -> 8.0.2f -> 8.1.2g -> 8.2.2b -> 9.0.0a
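
To make the path above a bit more concrete, here is a minimal Python sketch (not an official tool) that lists which releases from that path would still have to be installed, in order. The names RECOMMENDED_PATH, series and remaining_hops are made up for this example. It only compares the major.minor series, so it knows nothing about shortcuts such as going directly from 5.1.x to 5.3.x, or about the disruptive multi-hop upgrades.

```python
import re

# The recommended path, copied from the line above.
RECOMMENDED_PATH = [
    "5.0.1d", "5.2.3", "5.3.2c", "6.0.1a", "6.1.2c", "6.2.2g", "6.3.2e",
    "6.4.3h", "7.0.2e", "7.1.2b", "7.2.1g", "7.3.2a", "7.4.2f", "8.0.2f",
    "8.1.2g", "8.2.2b", "9.0.0a",
]


def series(version):
    """Return the (major, minor) series of a version string such as 'v6.4.1b'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a Fabric OS version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def remaining_hops(current, target):
    """List the path entries to install, in order, to get from current to target."""
    cur, tgt = series(current), series(target)
    if tgt < cur:
        raise ValueError("downgrades are not covered by this sketch")
    # Keep every entry in a series newer than the current one, up to the target series.
    return [v for v in RECOMMENDED_PATH if cur < series(v) <= tgt]


if __name__ == "__main__":
    print(" -> ".join(remaining_hops("v6.2.2e", "7.0.2e")))
```

The release notes stay the authority for each hop; this is only a way to keep track of where you are on the list.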

It's also possible to upgrade from a version earlier than 6.4.1a to 7.0.x, or from 7.0.x to 7.2.x, but this is a disruptive upgrade (meaning ports will go offline/online during the upgrade).

Brocade now has a document that describes a process of determining the 'ideal' version of Fabric OS you should be running. It is called Brocade FOS Target Path.

Yet one more official document to help is the Brocade Fabric OS Features and Standards Support Matrix, 8.2.x

There is also a section (Recommended Migration Paths to FOS) in the release notes describing how to get to the release you're reading notes for. In addition to these, there are Upgrade Guides from Brocade, at least for newer Fabric OS (7.4.0 and 8.0.0).

New releases come out every now and then, in several series at the same time. You can think of it as releasing updates for Windows XP and 7 at the same time.
For example, in February 2011 6.4.1a and 6.2.2e were released by HP. You can see this on HP's site if you look at the date next to the download. Quite often the OEMs do not release a Fabric OS version at the same time: for example, HP released 7.1.0a ("Customer Notice of 7.1.0a release 25th of March 2013") before IBM.

Which is the recommended one? Usually it's the latest one in the highest series that the switch supports. If you have storage from more than one vendor you may want to check with all of them and see if they all support the version you want to upgrade to. Vendors certify their equipment with different firmware versions. If you have a tape library, ask the vendor if they have a recommended version or a list of certified versions.

HP: HP B-series Connectivity stream (available in HP SPOCK).
Brocade: "Brocade FOS Target Path"
Other: Contact them for their compatibility matrices, for example IBM, HDS, EMC, Fujitsu.
Brocade also has their own "Brocade Fabric OS 7.x Compatibility Matrix" which lists compatibility with other vendors.

You could in principle also say that (some blades in directors are excepted from these generalizations):

2G cannot upgrade to Fabric OS 6.x
4G and 8G can be on Fabric OS 6.x
All 4G except some 4/8 & 4/16 (that's 200E) and HP's P- and C-class 4G blade switches (4012 & 4024) can run 6.4.x
8G can run Fabric OS 6.4.x
8G and above can run Fabric OS 7.x
16G (Gen5) needs to be on Fabric OS 7.x or Fabric OS 8.x
32G (Gen6) and 64G (Gen7) need to be on Fabric OS 8.x or Fabric OS 9.x
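
The rules of thumb above can be written down as a small lookup table. This is only a sketch in Python with made-up names (MAX_MAJOR, MIN_MAJOR, may_run). The upper limits for 4G and 8G are my reading of the list (4G stops at 6.x, 8G at 7.x) and, like the list itself, it ignores the exceptions for individual models and blades.

```python
# Highest major Fabric OS series per port-speed generation, as read from the
# list above. The 4G and 8G limits are an interpretation, not a quote.
MAX_MAJOR = {"2G": 5, "4G": 6, "8G": 7, "16G": 8, "32G": 9, "64G": 9}

# Lowest major series, only where the list says "needs to be on".
MIN_MAJOR = {"16G": 7, "32G": 8, "64G": 8}


def may_run(generation, version):
    """Rough check: can a switch of this generation run this Fabric OS version?"""
    major = int(version.lstrip("v").split(".")[0])
    return MIN_MAJOR.get(generation, 0) <= major <= MAX_MAJOR[generation]


if __name__ == "__main__":
    for gen, fos in [("2G", "6.0.0c"), ("8G", "v7.4.2f"), ("16G", "6.4.3h")]:
        print(gen, fos, "ok" if may_run(gen, fos) else "not supported")
```

Treat a "yes" from this as "worth checking the release notes", nothing more.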

Do you want to use the latest one in each series? Probably.
Do check for published advisories and the release notes in the firmwares.
Some models or blades may work on 7.0.x and not on 7.1.x or vice versa.
Fabric OS 7.3.x supports all hardware that supports 7.2.x.
Basically you need to read the release notes for at least the version you are upgrading to, to confirm that it supports your switch.

Download firmware links:

If you go to the downloads for HP's 4/16 there is a link that also takes you to the older FOS firmware. If you don't click through, the page only has the firmware that this switch supports, so the latest on there at the moment is 6.2.2f.

On the link above you can also download HP's branded NA (Network Advisor, previously known as DCFM - Data Center Fabric Manager), see notes about that below.

If you click on manuals on the left side you will also be able to download release notes and other guides and references.

5.0.x firmware can also be found at http://ftp.hp.com/pub/softlib/software12/COL22074/co-86832-6/FOS-Drawer_Statement.htm

6.x, 7.x and 8.x can be found via the IBM and NetApp links.

Firmware Upgrade Order

You also probably want to decide on an order to upgrade the firmware on the switches.
It's possible to do it via DCFM (now called Network Advisor, used to be something else) one switch at a time or even in parallel. I'd advise against doing it in parallel: one switch at a time and one step at a time seems the most cautious approach. It's not too bad to run a SAN with switches on different firmware versions. One idea is to have all switches of one model on the same firmware. If you need to upgrade in several steps, do one step at a time.

Also think about switches of higher importance, like the Principal Switch, Core Switches or Seed Switches for DCFM/NA. Should you start with these, or perhaps start with a less important switch to make sure the upgrade goes smoothly?

With more recent firmwares (6.4 and 7.x) it's possible to jump more than one hop - if you are ok with disruptions in the network. Nice if you need to upgrade switches that aren't in production.

Release notes:

FOS 7.x IBM link & 8.x IBM link

Brocade release notes in .pdf

5.2.3
5.3.1c
6.0.0c
6.1.2c
6.2.2g
6.3.2e
6.4.3h
7.0.2e
7.1.2b
7.2.1g
7.3.2b
7.4.2d
8.0.2f
8.1.2g
8.2.1a
8.2.2b
9.0.0a (hp release notes, thin on information)

Notes from the release notes:

Upgrading from Fabric OS 5.0.x to 5.2.3 is supported
Upgrading from Fabric OS 5.1.x to 5.3.1a is supported, but upgrading from Fabric OS 5.0.x or a previous release directly to 5.3.1a is not.
Upgrading to Fabric OS 6.0.0b is only allowed from Fabric OS 5.3.x. (6.0.0c is a special upgrade version, only meant to be used in between firmware upgrades)
Upgrading to Fabric OS 6.1.2c is allowed only from Fabric OS 6.0.0b
Upgrading to Fabric OS 6.2.2f is allowed only from Fabric OS 6.1.0a or later.
Upgrading to Fabric OS 6.3.2e is allowed only from Fabric OS 6.2.0a or later.
Upgrading to Fabric OS 6.4.3f is allowed only from Fabric OS 6.3.x. You can upgrade non-disruptively from 6.2
Upgrading to Fabric OS 7.0.2 can be done non-disruptively from Fabric OS 6.4.1a or later.
Upgrading to Fabric OS 7.1.2 can be done non-disruptively from 7.0.x and 7.1.x, with caveats: for example, any previously existing error log entries with FOS v7.1.0 will be permanently lost once upgraded to FOS v7.1.2.
Upgrading to Fabric OS 7.2.x can be done non-disruptively from 7.1.x. Disruptively from 7.0.x is supported.
Upgrading to Fabric OS 7.3.x can be done non-disruptively from 7.2.x. Disruptively from 7.1.x is supported (see the FOS_UpgradeGuide_v730.pdf and the Brocade Release notes).
Upgrading to Fabric OS 7.4.x can be done non-disruptively from 7.3.x. From 6.4.x with firmwarecleaninstall
Upgrading to Fabric OS 8.0.x can be done non-disruptively from any Brocade 16G (Gen 5) platform and all blades in the Supported blades table running any FOS v7.4 firmware. From 7.3.0 with "firmwaredownload -s"
Upgrading to Fabric OS 8.1.x can be done non-disruptively from Brocade platform running 8.0.2 or later. From 7.4.x disruptively with "firmwaredownload -s".
Upgrading to Fabric OS 8.2.x can be done non-disruptively from Brocade platform running 8.1.0a or later. From 8.0.x disruptively with "firmwaredownload -s".
Any Brocade platform listed in the Supported Device section running any FOS 8.2 version can be non-disruptively upgraded to FOS 9.0.0
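
As a rough illustration, the 7.2 and later notes above fit in a small table keyed on the target series. This is a Python sketch with invented names (UPGRADE_RULES, classify); it looks only at the major.minor series and ignores the patch-level conditions ("8.0.2 or later", "8.1.0a or later") and the platform conditions, so anything it cannot answer is sent back to the release notes. Counting 7.3 to 8.0 as disruptive is my reading of the "firmwaredownload -s" remark.

```python
# target series -> (sources that upgrade non-disruptively, sources that are disruptive)
UPGRADE_RULES = {
    "7.2": ({"7.1"}, {"7.0"}),
    "7.3": ({"7.2"}, {"7.1"}),
    "7.4": ({"7.3"}, set()),
    "8.0": ({"7.4"}, {"7.3"}),  # 7.3 -> 8.0 assumed disruptive (firmwaredownload -s)
    "8.1": ({"8.0"}, {"7.4"}),
    "8.2": ({"8.1"}, {"8.0"}),
    "9.0": ({"8.2"}, set()),
}


def series(version):
    """'v7.1.2b' -> '7.1'"""
    major, minor = version.lstrip("v").split(".")[:2]
    return f"{major}.{minor}"


def classify(current, target):
    """Return 'non-disruptive', 'disruptive' or 'check release notes'."""
    rules = UPGRADE_RULES.get(series(target))
    if rules is None:
        return "check release notes"
    non_disruptive, disruptive = rules
    source = series(current)
    if source in non_disruptive:
        return "non-disruptive"
    if source in disruptive:
        return "disruptive"
    return "check release notes"


if __name__ == "__main__":
    print(classify("7.0.2e", "7.2.1g"))
    print(classify("8.2.2b", "9.0.0a"))
```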

About non-disruptively: you can also go to 7.0.x from a version earlier than 6.4.1a, but then the upgrade is disruptive and ports will go offline during the upgrade.
See the release notes or Upgrade Guides for more details.

DCFM: Data Center Fabric Manager / BNA: Brocade Network Advisor.

From 6.2.2a release notes:

With the introduction of Fabric OS 6.1.1, certain features and functions were removed from Web Tools (resident in the firmware) and migrated to the DCFM management application. HP recommends that, before you upgrade to Fabric OS 6.1.1x or later, if DCFM is not running on your fabric, you review the Web Tools functionality moved to DCFM, page 29 in these release notes and take note of what has changed so you can assess the impact on your fabric.

Fabric OS 7.x cannot be managed by DCFM 10.4 or BNA 11.0. You need BNA 11.1.0, see the release notes for 7.x.

Brocade Network Advisor 12.4.0 or later is required to manage switches running FOS 7.4.0 or later.
Brocade Network Advisor 14.0.1 or later is required to manage switches running Fabric OS 8.0.1 or later
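
The three version requirements above can be checked with a few lines of Python. This is a sketch with hypothetical names (BNA_REQUIREMENTS, minimum_bna); it only knows the three thresholds mentioned in this section, so for anything in between (8.0.0, for example) the release notes are still the place to look.

```python
import re

# (minimum Fabric OS, minimum Network Advisor), highest threshold first.
BNA_REQUIREMENTS = [
    ((8, 0, 1), (14, 0, 1)),
    ((7, 4, 0), (12, 4, 0)),
    ((7, 0, 0), (11, 1, 0)),
]


def parse(version):
    """'v7.4.0a' -> (7, 4, 0); patch letters are ignored, missing parts become 0."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def minimum_bna(fos_version):
    """Return the lowest BNA version named above for this Fabric OS, or None."""
    fos = parse(fos_version)
    for min_fos, min_bna in BNA_REQUIREMENTS:
        if fos >= min_fos:
            return ".".join(str(n) for n in min_bna)
    return None  # older than 7.x: not covered by these three notes


if __name__ == "__main__":
    for fos in ("6.4.3h", "7.1.2b", "7.4.2f", "v8.2.2b"):
        print(fos, "->", minimum_bna(fos))
```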

Updates to documents

Sometimes Brocade releases updates to the manuals without actually updating the manuals. On HP's page you can find them as "Documentation Updates", "Fabric OS Administrator's Guide Update".

Fabric Watch and MAPS with FOS v7.3

Users running Fabric Watch for switch monitoring in FOS v7.3 are advised to convert to MAPS monitoring before upgrading to FOS v7.4. If you don't, Fabric Watch will stop working.

Also, APM has been replaced by Fabric/Flow Vision.

Interoperability

See the release notes of the firmware for the specifics. For example, Fabric OS 8.0.2 cannot be in the same fabric as HP C-Class 4/12 FC switches (4024); one must use Fibre Channel Routing.

Prepare

Download old Brocade Fabric OS Firmware.

Basically, you need to update in steps.

To get FOS 5.2.1b and 6.0.0c firmware: Contact OEM Vendor or Brocade. I've found that two vendors have the firmware available online for free: HP and IBM, see below:
Eventually, after looking around on HP's old pages, we found http://ftp.hp.com/pub/softlib/software12/COL22074/co-86832-6/FOS-Drawer_Statement.htm (this link sometimes changes).

Link to IBM's page for downloading FOS 6 firmwares. This has firmwares going back all the way to FOS 2.6, it even has Fabric OS 6.0.0c and 5.2.3. On the page they have listed release notes and a little further down there is a link called "Release 6 Firmware".
Actually, if you click on 'Release 6 Firmware' you are taken to a page on brocade.com where you can find many different firmwares, including 5.x and 7.x
IBM also have a link about FOS 7.x and FOS 8.x

Also note that some features do not exist/work on older Fabric OS. For example, on Fabric OS 5.1.x DHCP and SCP may not work (which forces you to use a static IP and FTP).

Equivalent Product Names

Page with the equivalent Brocade and HP product names.
Page with the model number as seen in switchshow and HP's model and Brocade's model. This is a good one.
Page for correlating IBM and Brocade product names.

Recommendations

HP recommends that you upgrade one fabric and one switch at a time.
Waiting a week or at least a couple of days after you upgrade the first fabric is a good idea - gives you time to see if anything went wrong, if you can fix it and if you can do anything different next time.
See HP SPOCK for more details in regards to compatibility and interop modes.
The HP B-series Connectivity Stream lists the recommended firmware and all the supported ones for each switch model. It also has a list of the supported SFPs. Find it by clicking on "Switches" in the left-hand navigation pane under the "Other Hardware" section. The Connectivity Stream is great and it is updated often so I will not link directly to it. You need an HP Passport to log on to HP SPOCK - it is free to create and you do not need a contract or product in warranty.

Other vendors have similar matrices. HP for example does not have a list stating which Fabric OS firmware is supported with which HP P6000 firmware. The idea is that you go with the general recommendation of Fabric OS firmware.

Do read the release notes for the firmware(s) you decide on: for example, not all 4Gb SAN switches can run the 6.4.x FOS. The 8- and 16-port 4Gbps switches (Brocade 200E) do not run 6.4.x or 6.3.x.
Only 8Gb and 16Gb switches can run the 7.x.x FOS.
The release notes also have the fixes, enhancements, upgrade paths and supported switches.
Generally the Brocade versions of the release notes are more verbose when it comes to fixes, but if you have an HP-branded switch it might be easier to use the HP one, as that has the HP names of the products. Also, it might be hard to find the Brocade release notes if you do not have a contract with Brocade. Other vendors (like IBM/Fujitsu/HDS) provide you with the Brocade version of the release notes. You can find the release notes on their support pages.

Do consider updating OS patches, HBA drivers/firmware, management software and storage drivers/firmware. For example, QLogic had driver updates that prevent HP blades from getting stuck in G_port after a reboot. Another fix for QLogic FC cards was to not write a partition table at 2TB on the LUN on Dell servers (not so nice for > 2TB disks).

Upgrading Tools

SANLoader is an unofficial HP tool to upgrade firmware. With this you do not have to set up an FTP server etc. Contact HP Support, they may give this to you.
This is meant to be used when the other ways do not work, but it helps out a lot as you do not have to set up an FTP/SCP server.

SANLoader used to (winter 2010) not work well on Windows 7 and may not work flawlessly with pre-6.x firmware.

Other ways:

  • Set up a ftp/scp server and upgrade via the CLI (command line interface).
  • Use DCFM (Data Center Fabric Manager, now called Network Advisor) to upgrade firmware.
  • Firmware can also be upgraded through the web interface (click on switch admin and then on firmware download). You will still need an FTP/SCP server for this though. See the web tools admin guide page 73-74 (FOS 6.2.x but it hasn't moved).

FileZilla is a free FTP server that works well. There are many alternatives around, but unfortunately some don't always work (not 100% sure, but probably a combination of an older FOS with an older FTP client and an FTP server that couldn't handle that client), as listed in the comments thread of this post. FileZilla is however still on SourceForge, so you may want to be careful about installing it: it might contain malware. Storing the firmware files on a Synology NAS works - thanks Henny!

For FTP clients:

  • /usr/bin/ftp in Ubuntu (also in Ubuntu on Windows)
  • WinSCP for a free opensource Windows alternative that does both ftp and SCP (and more).

For SCP any machine with Linux and sshd running should work. You can also get an SCP server running on Windows; OpenSSH would work. Both protocols are old; SCP is safer, while FTP sends data in clear text.

Personally I like doing this via the CLI. The Network Advisor way gives you the possibility to upgrade in parallel, but that's also risky. If you use a Linux server to provide the firmware via SCP, don't forget to let the switches in via the firewall or TCP wrappers (/etc/hosts.allow). If you do the upgrade via FTP, make sure that both passive and active FTP work.

How to access the SAN-switch

The most common way to access the CLI of a Brocade switch is to connect to its IP with an ssh or telnet client. PuTTY is a free Windows client. If you are comfortable with the CLI, Windows 10 has WSL (with Bash) and a good ssh and scp client built in, which in my opinion is much nicer to use than PuTTY. Telnet is unsafe, so do use ssh if at all possible.
It's also possible to access the switch CLI via a serial cable; however, as the firmware files are several hundred MB (approaching 1GB for 6.4.x), that's not really viable when upgrading firmware. HyperTerminal is a free Windows tool that comes by default with some Windows versions. You can also use PuTTY for serial access.
To access the web interface, just point the web browser to the IP address of the switch. It requires Java. The Java version that's supported is listed in the release notes of the Fabric OS.

Upgrade

Here on HP's Support Forum are some more notes about v6.x. Basic steps:

Note: version 6 does not require you to specify the exact SWBDxx folder location: it just needs the root folder containing the "install" file

1) Unpack the downloaded firmware in the FTP or SCP download directory
2) Start the FTP/SCP Server and allow access
3) Connect to the CLI of the switch via telnet or ssh
4) Type this in the CLI: firmwaredownload
5) Answer all questions: when it asks for File Name be sure to write /v6.4.1b, that is the folder under which you find all the SWBDxx folders. Failing to do so makes it impossible to download the firmware
6) Wait for reboot of the switch and reconnect, check the firmware version with the "version" command

More notes about the upgrade

The CLI command to start the update process is firmwaredownload. Run without arguments it starts the interactive version; it is also possible to specify user, directory and host directly on the command line. See the Command Reference Guide for details. There are reference guides for each major Fabric OS release.

Specifying Directory

Please use forward slashes when specifying directories.

For example when you unzip the firmware file and it creates a sub-folder in the FTP-root that is called v5.3.1a then you need to specify /v5.3.1a as the directory.

For firmwares prior to 5.3.x you have to specify the release.plist - /v5.2.2a/release.plist.
However it says in the release notes for 5.2.3 that release.plist is no longer needed.

In some cases you may have to specify the sub directory.
For example the 4/16 HP Switch is a Brocade 200E with switchtype 34. So you would then use directory SWBD34 - /v5.3.1a/SWBD34. You can also try with /v5.3.1a/release.plist, /v5.3.1a/SWBD34/release.plist or /v5.3.1a/install. However with 5.3.1a you should not have to, so /v5.3.1a should be enough.
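
If you end up guessing at the directory, it can help to write the candidates down in the order you want to try them. A small Python sketch (candidate_paths is a made-up helper; it assumes the firmware was unpacked directly in the root of the FTP/SCP server, and the switchtype number is whatever your switch reports):

```python
def candidate_paths(version, switch_type):
    """Return the paths to try at the 'File Name' prompt, most likely first."""
    root = "/" + version if version.startswith("v") else "/v" + version
    swbd = f"{root}/SWBD{switch_type}"
    return [
        root,                       # usually enough from 5.3.x onwards
        swbd,                       # per-model sub directory
        f"{root}/release.plist",    # needed on firmware prior to 5.3.x
        f"{swbd}/release.plist",
        f"{root}/install",
    ]


if __name__ == "__main__":
    for path in candidate_paths("5.3.1a", 34):
        print(path)
```

Note the forward slashes, as mentioned above.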

firmwaredownload example:

switch:admin> firmwaredownload
Server Name or IP Address: IP.TO.SCP.SERVER
User Name: username
File Name: /path/to/v6.2.2e
Network Protocol(1-auto-select, 2-FTP, 3-SCP) [1]: 3
Password:
Server IP: IP.TO.SCP.SERVER, Protocol IPv4
Checking system settings for firmwaredownload...
System settings check passed.
You can run firmwaredownloadstatus to get the status
 of this command.
This command will cause a warm/non-disruptive boot on the switch,
 but will require that existing telnet, secure telnet or SSH sessions
 be restarted.
Do you want to continue [Y]: y
 Firmware is being downloaded to the switch. This step may take up to 30 minutes.
 Preparing for firmwaredownload...
 Start to install packages...
 dir ##################################################
 [[lots of these for all packets]] ##################################################
 [[also stuff like these are seen many times:]]
 warning: /etc/fabos/pki/switch.0.rootcrt created as /etc/fabos/pki/switch.0.rootcrt.rpmnew
 kernel-module-ipsec ##################################################
 Removing unneeded files, please wait ...
 Finished removing unneeded files.
All packages have been downloaded successfully.
 Firmware has been downloaded to the secondary partition of the switch.
 HA Rebooting ...

Transfer Protocol and Connectivity

If you are using SCP and that does not work, please try with FTP. If neither works, see if something else can log on to the FTP/SCP server. And of course, make sure the right permissions/root directory are set on the FTP server. If your FTP/SCP server has log files, check them. If it works from one client but not from the switch, check the logs and see if there's a difference. Sometimes, if SCP doesn't work via the CLI, it might work when you start the SCP transfer from Web Tools instead (thanks Eric in the comments for this!).

If you are logged on as root on the SAN-switch you can use the scp- or ssh-client on the switch to confirm connectivity, like this:

ssh username@server ls /tmp/v6.0.1a to list the /tmp/v6.0.1a on the SCP server.

You need to be root to run the above command.

If that also does not work, you have some kind of networking problem - you can try direct connecting a laptop to the LAN interface of the switch. To see the network settings on the switch: ifmodeshow and ipaddrshow
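
Before blaming the switch, it is worth confirming from another machine on the same network that the FTP/SCP server is actually listening. A minimal Python sketch using only the standard library (port_open is a made-up helper name, and the address in the example is a placeholder for your own server; 21 and 22 are the standard FTP and SSH/SCP ports):

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, name not resolvable...
        return False


if __name__ == "__main__":
    server = "192.0.2.10"  # placeholder address, replace with your FTP/SCP server
    for name, port in (("FTP", 21), ("SSH/SCP", 22)):
        state = "open" if port_open(server, port) else "not reachable"
        print(f"{name} (tcp/{port}): {state}")
```

An open port only tells you the service is reachable from where you ran the script; a firewall between the switch and the server can still be in the way.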

Passwords

Sometimes when upgrading from 6.1.1d to 6.2.2 we have seen that the passwords have gotten reset.

Default password is then "password" or "fibranne".

You can reset the password on the admin account with the CLI command "passwd admin".

If you forget all passwords, it might be possible to reset them via the serial cable interface while booting the switch.

On EMC branded switches the default password might be: Serv4EMC

CF Cards

If your switch is out of warranty/contract and it's still working, I'd suggest making a copy of the CF card (with dd in Linux, for example). Then if the CF card decides to fail you can just get a new one from a random electronics store and dd the contents back onto it.
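
If you'd rather script the copy than use dd, the same idea in Python looks roughly like this. It is a sketch: copy_image and sha256_of are made-up helper names, the device path in the example is hypothetical (use whatever your card reader shows up as) and reading a raw device normally needs root. The checksum is there so you can confirm that the image on disk matches what was read from the card.

```python
import hashlib


def copy_image(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks and return the SHA-256 of the bytes copied."""
    digest = hashlib.sha256()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # /dev/sdX is a hypothetical device name for the CF card in a card reader.
    copied = copy_image("/dev/sdX", "cf-card.img")
    print("image ok" if copied == sha256_of("cf-card.img") else "MISMATCH")
```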

Licenses

When replacing a switch make sure that the licenses are correct.
If for example you have a switch with 'power pack' - then for HP there is a special spare part number for a switch with power pack and one without. Power pack is a grouping of licenses; which licenses are in the pack differs between models.

Google Interview – Data Center IT Technician

* Updated 2011-01-29
* Updated 2011-03-08 Dtek contributed some more questions from the first interview.
* Updated 2011-03-23 – Added question about PCI/PCIe and DOS partitions. Also I have been asking the few who have commented to add their input but not so much feedback yet. Feel free to drop me an e-mail or put in a comment :)
* Updated 2011-03-24 – Wrote a little more about the ‘100 broken computers’ question.
* Updated 2011-05-21 – Added some more detail/discussion about the questions in the first section.

A little while ago I had an interview for a position with Google as a Data Center IT Technician. I never signed an NDA, so it should be safe to put the questions up here :)

However, if you want to play it safe I’d refrain from posting here until after the interviews are over.

I didn’t get the job, and they never answered why I didn’t pass when I wrote back asking for some feedback (besides the template e-mail).

If you read this go ahead and comment, maybe we can figure out a better way to approach the questions :)

First Interview

First there was one interview which had 20 questions (I don’t remember all) around basic (older – like no SATA) PC stuff:
What are all the components in a PC or Server?
PC: chassis, system board, psu, cpu, ram, hdd, fans, cables, graphics card, dvd, monitor, keyboard/mouse
Server: same, minus the extra graphics card (there is one on the system board), plus an hdd controller, possibly a backplane, no cd/dvd, extra nic, double cpu, ram, psu, fans, remote monitoring/console.

What protocol is used by ping? ICMP (this is a sneaky question – an obvious fault is to go for TCP or UDP)
How many IDE devices can you have in a PC?
– two per channel (usually 4)
How many can you have on each channel? What are they called? – 2 / master and slave
What is the resolution in Windows 2000 safe mode? – 800×600 or 640×480?- see this link on mydigitallife or this post on tom’s hardware.
What is a MAC address? Media Access Control – a unique identifier for network devices. Used by many protocols.
What is the name of the Ethernet plug? RJ45
How do you recognize a broken hard drive without software or removing it from the machine? – 1) Noise (tick tack sound of the arm getting stuck/hitting something) 2) Any leds on the disk, system board, controller 3) Any vibration or anything from the disk?
How do you find the first disk in a Linux OS? – Look under /dev/ for a disk like /dev/sda (SATA) or /dev/hda (PATA). Then /dev/sda1 is the first partition.
Name two devices needed to make a network: switch and router (well, network card (NIC) and router should do it, or a switch and network card.. depends how big you want to make it, really i guess you can have a network with a crossover cable and two nics).
What is BIOS? Basic input/output system. Responsible for initializing hardware, POST/startup diagnostics, booting the OS and various hardware settings.
What is the bit rate of a serial interface of a network device? – the default apparently in hyperterminal – 9600 (this might be tricky, in my experience this varies between the devices – max is probably 115200bps). Maybe what they are asking for is what is the default speed of a Cisco switch’s aux or console port? If so, the answer is 9600.
What is the port used for HTTP? – 80
What is the difference between PCIe and PCI? – PCIe is newer than PCI and PCI-X. The slots look different and they are not compatible.
How many primary partitions can you have in DOS?
– Four primary and maximum one active per disk. See this link for some explanations. Unsure at this stage what the exact question was.
What did you do in your previous jobs?
Would you be able to re-locate?

What does HTTP stand for? Hypertext Transfer Protocol
What controls GPU CPU Mem at boot up?
What is ROM?
Read Only Memory – Used for storing data that you do not want somebody to write to.

Length of cat5 transmission? 100m
What does NIC stand for? 
Network Interface Card

What is the standard type of filesystems now? disk filesystem and network filesystems
How would you create an EXT4 filesystem on the first partition of the first SCSI drive? 
mkfs.ext4 /dev/sda1
How would you install lilo? 
How do you get IP info from Linux or Windows? “ip addr” or “ifconfig” in Linux and “ipconfig” in Windows.
What is the subnet mask for a Class B Network?
255.255.0.0
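
For the subnet mask question, the classful defaults are easy to check with Python's ipaddress module. A small sketch (classful_netmask is a name I made up; it only covers the old class A/B/C defaults, which is what the question is about):

```python
import ipaddress


def classful_netmask(address):
    """Return the default (classful) netmask for an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        prefix = 8       # class A
    elif first_octet < 192:
        prefix = 16      # class B
    elif first_octet < 224:
        prefix = 24      # class C
    else:
        raise ValueError("class D/E addresses have no default netmask")
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix}").netmask)


if __name__ == "__main__":
    for addr in ("10.1.2.3", "172.16.5.4", "192.168.1.1"):
        print(addr, classful_netmask(addr))
```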

Second Interview

Then there was a second one that was done by a person who worked in a data center and had questions like these:

You have 100 broken computers, how do you proceed.
Personally I didn’t do very well on this question I think.
There are lots of ways to approach this one, how far can you take it?
One way to make this a little easier/visible would be to make a tree or mindmap and keep on expanding it.

But thinking about this afterwards I would probably approach it something like this, (not necessarily in this order):

  1. Get an overview – where are they, what kind of hardware, importance, severity of problem, the wider you can make this the better. To have this done automagically is important and speeds up troubleshooting immensely. Consider monitoring software like Nagios, Ganglia, Cacti. They can monitor not only hardware but also services.
  2. What’s the status of the central components? Network, power, etc.
  3. Hopefully not all have the same problem, try to find certain groups of them that have the same obvious error.
  4. Maybe there is more than one underlying issue, but they appear to be the same or give the same symptoms.
  5. Maybe there is one problem on one computer that is causing problems for all the rest. For example a bad ethernet/fibre channel card or cable can cause network interruptions on the whole network or SAN.
  6. Maybe it is a service, and something in that software on one node is causing the issue. Like a job that runs on many machines but broke on one machine, and that caused the rest to break.
  7. Look in logs of the systems/services.
  8. Run a diagnostic CD on computers like ultimateboot CD to look for hardware errors. Server vendors may have their own diagnostic tools. memtest86 is a good boot CD for memory testing (probably best to test memory that way, the least amount of memory locked by the OS)
  9. For severe hardware problems you can look in the POST of the machine, check leds on them, but for 100 machines this might be more of a last resort.
  10. If you suspect the problem is SW – again try to find something they have in common – same manufacturer, same softwares/patches installed. Maybe this software has a monitoring part that can tell you more. Check the logs.
  11. When did the problem start? At the same time as a power outage, after a patch deployment, etc.
  12. Are they all physically close? Anything else gone down?

How do you see what happens during boot of the OS in Linux?
Answer: output of command dmesg and also in /var/log/syslog
Where do you find the logs in Linux?

Answer: /var/log
How do you mount a disk?

Answer: mount
Every boot?

Answer: fstab
How do you see what version of the kernel is running?
Answer: uname -r gives the kernel release (2.6.x etc); uname -v gives the build version string
How do you put an image on a pc?
Answer: pxeboot as an example
How do you turn a room into a data center?

Maybe something like this? If you have any additions please go ahead and let me know :)

floor strong enough to hold the weight of all the equipment?
physical security – bar windows, access control(keys), cameras, guards
ventilation – perforated tiles?
cooling
anti-fire
racks
UPS – electrical work

What is the difference between a switch and a router
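I said that the difference is that switches are closer to the hosts and routers are further away, in the core. (A more standard answer: a switch forwards frames within a network based on MAC addresses, layer 2, while a router forwards packets between networks based on IP addresses, layer 3.)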
Did you have experience writing documents – kb?
Worst job you ever had?
What do you expect from your colleagues and your boss?
What do you do outside work?
Do you have any questions?

EMC – Symmetrix and CLARiiON 3

http://thestorageanarchist.typepad.com/weblog/2011/01/3017-vmax-2011-edition-powerful-trusted-smartest.html

Whoa, EMC released a bunch of stuff last night :)

Updating this post as we go today :p

***

http://finland.emc.com/ some kind of bicycle jump on in 5h and 45mins! Or 1330 UK Time, 1430 Central European, 1530 Finnish.

***

damn I missed the jump!
something else going on, a kid replacing a disk and watching the status on an ipad?
after one of the guys wanted to hi-five the kid but he missed ;)

now 26 people inside a mini cooper :)

maybe i haven’t missed it :p

***

Quite in depth :)

EMC – Symmetrix and CLARiiON – 2

So we reserved two green old chairs.
Just might need a nut and a metal plate for a bolt that’s missing one then they’ll do nicely!

***

Also been reading up on the CLARiiON now so far as to what I’ve found.

Hardware base is quite similar to the EVAs with the backend loops, CTS on the CX4 and loop io modules on the CX3. With a max of 120 disks per loop.
The number in the name of the CX4 (maybe the others too) is the max number of disks. CX4-960 is 960 disks and 8 loops (120 per loop then?). Loop pairs I hope.

Turns out Dell are selling these.

Hot sparing is used and a raid group is like an EVA disk group.
Quite similar to the way HP’s XP is creating the parity groups.
Navisphere for management on the CLARiiON and System Management Console on the Symmetrix.

Interesting stuff this is!

Symmetrix is high end and CLARiiON is mid range.

So Symmetrix has even more similarities to the XP. For example the blades with the directors(host ports, device ports, disks, memory, cache) and assemblies.

Both appear to have more ways of configuring them than the EVA – the admin interface looks more complex anyway and you can tune the cache which is neat ;)

http://en.wikipedia.org/wiki/EMC_Symmetrix
http://en.wikipedia.org/wiki/EMC_Clariion

Both of these are quite extensive but especially the Symmetrix article looks a lot like an advertisement.

box

EMC – Symmetrix and CLARiiON

box

storage

So got two calls this week about a storage specialist job.

I suspect both are for the same job but for both top of the list was EMC – Symmetrix and CLARiiON knowledge.

So!

Today I will be googling around for some information that describes them.

Just read a little about the tiering and that looks really awesome. Having data that is frequently accessed on SSD drives and when they move down in use they can be moved down to another tier with cheaper/larger disks. Just brilliant.

***

So updating inside the post.

Found this article: http://storagenerve.com/2008/12/19/emc-symmetrix-management-console/ – looks pretty nice. Would like to find a user guide that shows how to install and how to use it :)

***

Maybe those aren’t supposed to be public. Hard to find them on EMC’s website anyway ;/

***

Going to a flea market today, need some chairs for the upcoming move-in party :)

***

http://finland.emc.com/collateral/demos/microsites/mediaplayer-video/application-view-provisioning-storage-vmax.htm

.. which stops after 5 seconds in FF4b9 and IE8 :/
same goes for videos. sucky website.

***

Crappy image yeah I know but I’ll try to get better ;)