Monthly Archives: January 2012

BCFD – SAN Design Best Practices

This is a post in my series on studying for the BCFD – Brocade Certified Fabric Designer exam, with my comments on the document SAN Design Best Practices. Apparently this document is planned to be updated. The one I have is version 2.1. To find the latest, go to My.brocade.com, Documentation, Best Practices Guides. There’s also a “SAN Migration” guide there, but it’s from 2003, so it’s irrelevant when it comes to anything specific, though the ideas, reasons and methods might still be valuable.

OK. I thought about doing something similar for this document as for the previous ones. But I just don’t feel like it; that’s basically just rewriting things in different wording so that they stick in my brain. No, instead I’ll post the questions that popped into my brain while reading it.

For starters, I printed this .pdf. OK, it’s not so environmentally friendly, but it’s nice to have a break from the screen. One thing though: it took me a lot longer to read this than the course modules for BCFD. The SAN Design Best Practices is a first-class pdf, at least in my opinion. I mean it’s both general and specific. It needs to be general because there are a lot of reasons behind designing things. Also, I don’t have any actual previous experience designing a SAN, so this is all new to me, and it brings up a side of Storage and Storage Networking that I just haven’t bothered much with before. Hopefully I have learned and will keep learning a lot.

Links

This paper refers to a lot of documents.

The “Brocade Scalability Guidelines” is not updated with 16G products (Only goes to FOS 6.3.0).

Latency

On page 10 it says “hop count is not a concern if the total switching latency is less than the disk I/O timeout value”.

Every switch hop adds latency (frame needs to be put in ASIC, processed then sent on its way).

Switch latency is measured in microseconds.
Disk I/O – is that the same as multipathing timeout? So 60 seconds for MPIO default in Windows?

How are these latencies measured?
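To get a feeling for the orders of magnitude, here is a rough sketch in Python. The per-hop latency of 2 microseconds and the hop counts are made-up illustration values, not figures from the paper, and the 60 second timeout is just the MPIO number I guessed at above.

```python
# Rough comparison of total switching latency against a disk I/O timeout.
# PER_HOP_LATENCY_US is an assumed illustration value, not a Brocade spec.
PER_HOP_LATENCY_US = 2.0
IO_TIMEOUT_S = 60.0  # the Windows MPIO figure questioned above, unconfirmed


def total_switch_latency_us(hops: int, per_hop_us: float = PER_HOP_LATENCY_US) -> float:
    """Sum of the switching latency over all hops, in microseconds."""
    return hops * per_hop_us


def hop_count_is_a_concern(hops: int, timeout_s: float = IO_TIMEOUT_S) -> bool:
    """True if switching latency alone would reach the I/O timeout."""
    return total_switch_latency_us(hops) >= timeout_s * 1_000_000


for hops in (1, 3, 7):
    print(hops, total_switch_latency_us(hops), hop_count_is_a_concern(hops))
```

With numbers like these, microseconds per hop are many orders of magnitude below a timeout measured in seconds, which is probably why the paper says hop count is not a concern.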

Redundancy and Resiliency

Two fairly similar words. One indicates something has a replica or a duplicate to fall back on. The other indicates the strength, can it by itself handle a problem.

Core switches should be equal or higher perf compared to edge switches.
Highest performing switch should be the principal switch.
Redundant links should be placed on different blades/ASICs or at least different port groups.

EHT – edge hold time

New timeout value that can discard blocked frames earlier than the 500ms default (down to 100ms). An I/O retry will still happen for each dropped frame.

This is a new feature in FOS 7 (to be confirmed) and it is ASIC dependent, meaning that ports in one port group are not affected by the EHT set in another port group.

EHT applies to all F_Ports on a switch and all the E_ports that share ASIC with F_Ports.

Intended for initiators only.

ICL

An ICL between directors is not considered a hop in FICON. Is it in Open Systems?
Are the links uni-directional?

ICL cables should all have the same length.
ISL can be a bit different, max 30m in difference.
Don’t have ISL and ICL to the same switch/domain.

Links

Hyper-Scale Fabrics: Scale-out Architecture with Brocade DCX 8510 Feature Brief.

Small Fabrics

Page 15: Brocade recommends core-edge as primary SAN design, or mesh for small fabrics (under 2000). !!! That’s pretty big..
On page 16 it says use full-mesh under 1500 ports.

Fan-In and Fan-Out and Oversubscription

Host ports to Target Ports

Device to ISL

Fan In : number of device ports that need to share a single port, be it target or ISL.

Consider: port queue depth, iops and throughput.

Example: If you have 4 devices with one 8G FC port each (32Gbps in total) and they connect over an ISL of 2x8G to another switch, and from there to a storage array that also has 2x8G, then there is a 2:1 oversubscription, both on the ISL and on the target ports.
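The same arithmetic as a small Python sketch. The function name and parameters are my own, made up for illustration. It only compares raw port bandwidth and ignores queue depth and IOPS.

```python
def oversubscription(device_ports: int, device_gbps: float,
                     shared_links: int, link_gbps: float) -> float:
    """Ratio of offered device bandwidth to the bandwidth of the shared links."""
    return (device_ports * device_gbps) / (shared_links * link_gbps)


# 4 hosts with one 8G port each, sharing a 2x8G ISL (or 2x8G target ports)
ratio = oversubscription(device_ports=4, device_gbps=8, shared_links=2, link_gbps=8)
print(f"{ratio:g}:1")  # prints 2:1
```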

Bottleneck Detection

BD consumes switch memory, don’t monitor more than 100 ports on a 48k (no limit on DCX).

Start monitoring a small number of storage ports.

Fabric Watch

Thresholds and actions are generally different between initiators and targets. Thus place these on different switches.

FW Administrator’s Guide 7.0.0

Monitor Class 3 frame discards (C3TX_TO), they are an indicator of high-latency devices.

Fabric Watch Classes

This is a wide grouping of similar devices.
For example, temperature is a part of the class Environment.

Long Distance

Buffer Allocation

Number of credits: 6+ ((link speed Gb/s * Distance in km) / frame size in KB)

On 8510 4K buffers are available per ASIC to drive 16Gbps to 500km at 2KB frame size. With credit linking, buffers can be borrowed from a neighboring ASIC to extend distance.
Details about ‘credit linking’? Not many hits about this on google.
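A quick sanity check of the credit formula in Python, as a sketch only. The function name is my own, and the formula is the one quoted above, nothing more precise than that.

```python
def buffer_credits(speed_gbps: float, distance_km: float, frame_kb: float = 2.0) -> float:
    """Credits = 6 + (link speed in Gb/s * distance in km) / frame size in KB."""
    return 6 + (speed_gbps * distance_km) / frame_kb


# 16Gbps over 500km with 2KB frames, the 8510 example above
print(buffer_credits(16, 500))  # 4006.0
```

That lands right around the 4K buffers per ASIC mentioned for the 8510, so the numbers are at least consistent with each other.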

You can connect DWDMs in pass-thru mode where the switch is providing all the buffering.

FCIP

FCIP adds a small latency (35 microseconds). This is without the underlying TCP/IP delays.

Use QoS to give FCIP traffic highest priority.
Use CAR (committed access rate) to limit other traffic.
Use ARL (adaptive rate limiting) and set the limit to the remaining bandwidth.

FCIP traffic believes it is the only one using the bandwidth it has available, so other traffic will suffer if it shares that bandwidth.

Use rate limiting on the FCIP on the Brocade systems, don’t limit it on the IP network.

MLX

This is mentioned for when extending mainframe/FICON extension over FCIP.

MLX is a Brocade Router.

OC

OC1 =~ 52Mbps or without overhead ~50Mbps
OC12 = 12*52 or about 622Mbps
OC48 = 48*52 or about 2488Mbps
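For more exact numbers: the OC-1 line rate is 51.84 Mbps and OC-n is simply n times that. A tiny Python sketch, where the function name is my own:

```python
OC1_MBPS = 51.84  # OC-1 line rate, including overhead


def oc_rate_mbps(n: int) -> float:
    """Line rate of an OC-n circuit in Mbps."""
    return n * OC1_MBPS


for n in (1, 12, 48):
    print(f"OC{n}: {oc_rate_mbps(n):.2f} Mbps")
```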

OC12 is recommended for Compression Mode 3 (GZIP/software only)
OC48 is recommended for Compression Mode 2 (SW with HW assist)

Neither of those is recommended for synchronous replication. Mode 1 is recommended for that, and it is HW-only compression (mode 0 means compression is off).

Gaussian or Normal Distribution

http://en.wikipedia.org/wiki/Normal_distribution

Have fun.

Virtualization

There’s quite a bit about new Virtualization Engines in this paper. It basically means a device that has other disk arrays behind it, and then this device presents disks to servers. The danger is told to be that the engine can send a lot of small control frames, using up the buffer credits without using all the available bandwidth.

APM and Fabric Watch can apparently be used to monitor for excessive levels of SCSI reservations. How? No specific details found, but the threshold is apparently configurable in Fabric Watch.

NPIV

Fewer domains means reduced:

  • inter-switch zone transfers
  • name server synchronizations
  • RSCN processing

Dynamic Fabric Provisioning (DFP)

Only on Brocade HBAs and 16G.

Dynamically provision switch-generated virtual WWN.

Can be user-generated as well.

WWN stays the same even after HBA replacement.

In practice this means you can set up zoning and QoS even before the HBA is online and before you know what the WWN of the new device is.

Brocade Certification – BCFD – Fabric Designer – Preparation

BCFD exam is going into Beta testing in January as well!

This post will be updated as I move along through the different objectives / documents.

// Update 2012-01-15: Added the Knowledge Assessment Test.
// Update 2012-01-28: Went through each .pdf and updated some in here.

The link to the Brocade page where it tells you how to register and where to get the material: http://community.brocade.com/docs/DOC-2379

# Note: This link no longer works

When are these available?
On Thursday 12/01/2012 at 0728 EET it was not available.
On Thursday 12/01/2012 at 0803 EET it was available.

So, that would indicate that the time Pearson follows is GMT-6 or Central Time.
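The time zone guess can be checked with a few lines of Python. This is only a sketch: I use fixed UTC offsets (EET is UTC+2 in January, Central standard time is UTC-6) instead of a real time zone database.

```python
from datetime import datetime, timedelta, timezone

EET = timezone(timedelta(hours=2), "EET")         # winter time in January
CENTRAL = timezone(timedelta(hours=-6), "GMT-6")  # US Central standard time

not_yet = datetime(2012, 1, 12, 7, 28, tzinfo=EET)    # not available
available = datetime(2012, 1, 12, 8, 3, tzinfo=EET)   # available

print(not_yet.astimezone(CENTRAL))    # 2012-01-11 23:28:00-06:00
print(available.astimezone(CENTRAL))  # 2012-01-12 00:03:00-06:00
```

Midnight GMT-6 falls between the two observations, which fits with the registration opening when the date rolled over to the 12th in Central Time.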

On top of that, the only available dates for me were the 23rd and 24th of January :(
Time to study!
// update, that was changed, it was a mistake so now I get some more time to study :)

Exam Study Resources (page numbers are document page numbers, NOT the PDF page)

As I see it, the importance of each document could be arranged like this:

  1. CFD 200 Modules 3-7
  2. SAN Design Best Practices
  3. FOS Administrator’s Guide
  4. The rest.

With 1/2 sharing the top spot. I haven’t gone through the modules yet but I presume they all complement each other.
The reason for them sharing the top spot is because for this Beta Exam, the CFD200 material is for 8Gbps (and it has quite a lot of details about the M-series McData switches, which the 16Gbps BCFD did not include).

There is also a Knowledge Assessment on my.brocade.com ‘education’ page.
It’s called “CFD 201 8 Gbit/sec BCFD Knowledge Assessment”. Again, this is for 8G, so beware that some stuff may not be up to date if you are doing the Beta for BCFD 16G. But the actual type of questions is something that is useful. It mentions EFCM or Fabric Manager sometimes (these are the previous names of DCFM, or what’s now called Brocade Network Advisor).

There is a nutshell guide for BCFD, but it is from November 2008, making it possibly even more outdated than the CFD200 material. Because most of the topics are still valid it would still work as a refresher, but you can’t use it for anything specific.

I am doubtful that the M-series will be included in the BCFD 16G exam, but as it’s still in the objectives for the 8G it’s probably wise not to skip that part completely. In the 1.5 years (half of 2009 and all of 2010) when I did SAN support, I only had one call about a McData switch.

Exam Study Resources with my comments:

CFD 200 BCFD Design Course Modules 3-7

  • Obviously these are the most important. I’ll go through these at a later stage.

Brocade DCX 8510 Backbone Family Datasheet

(GA-DS-1564-01)

  • Lots of details about the system specs.

SAN Design Best Practices

(GA-BP-329-02-02)

  • Pages 2,5-16,19-26,31,32-36,40-45,51-53,55,58-62,66,67,72

Fabric OS Administrators Guide v7.0

(53-1002148-03)

  • Pages 37,43,66-70,102,142,151,153,157,196,199,241,273-286,301,314,315,320,372,383,395-398,402-406,414,417,425,429,437,438-443,449,454-461,464,503,504
  • topics
    • 256-area addressing
    • WWN-based PID assignment
    • enabling/disabling a port and port decommissioning
    • gateway links, ICL,
    • RADIUS/LDAP authentication
    • fddcfg / DCC/SCC policies
    • device authentication
    • ipfilter
    • firmwaredownload
    • advanced zoning (regular, broadcast, frame redirection, lsan, qos, ti)
    • traffic isolation zoning (and VF considerations for TI zones)
    • bottleneck detection
    • in-flight encryption and compression (technologies, enabling/disabling)
    • licensing (enable 10GbE, 7800, QoS, FCIP Extension, FICON acceleration, etc, etc, etc)
    • advanced performance monitoring (top talker, frame monitor, end-to-end)
    • adaptive networking (ingress rate limiting)
    • QoS prioritization (SID/DID or CS_CTL – class specific control)
    • trunking (ISL, ICL, EX_Port, F_Port)
    • Long Distance (buffer credit allocation, max distance, credit recovery)
    • FC-FC Routing (support platforms)
    • interoperability (FOS vs M-EOS)

Fabric OS Command Reference v7.0

(53-1002147-01)

  • Pages 302,695,716,721,957
  • commands
    • fcrconfigure  /  fcredgeshow
    • portcfgspeed
    • portdporttest
    • portfencing
      • Why is the test for “Invalid Word Transmission” called ITW?
      • Ah, on portThConfig it is called “Invalid Transmission Word”.
    • supportshow

Fabric OS FCIP Administrators Guide v7.0

(53-1002155-01)

  • Pages 1,6
  • topics
    • FCIP platforms and supported features
      • 7800, FX8-24 and FR4-18i
      • FCIP Trunking
      • Adaptive Rate Limiting
      • 10GbE
      • 8G FC Ports
      • Compression (LZ and Deflate)
      • Acceleration (FCIP Fastwrite, OSTP)
      • QoS
      • VLAN Tagging
      • FICON
      • IPSEC
      • VEX
      • IPv6
      • Jumbo Frames
    • 7800 switch hardware overview
    • FX8-24 has support for all features above, except: Jumbo frames (only FR4-18i supports those), IPv6 addresses for FCIP tunnels or IPsec, or 3rd-party WAN optimization hardware (the others do support this pre FOS 7)

 

Monitoring and Diagnostic Testing in Today’s High Speed High Density Networks

  • Pages 2-4
  • topics
    • powerpoint presentation of four pages in total
    • fc cable lengths
    • measuring loss
    • embedded diagnostics (bottleneck detection, fabric watch, frame monitoring, port fencing)
      • fmmonitor is a CLI that you can use to set up frame monitoring, for example SCSI reservations and aborts.

Brocade Network Advisor SAN User Manual

(53-1002355-01)

  • Pages 12,164,186,255,596,770,794,796
  • topics
    • “Connectivity Map Toolbar” & “Product List”
    • Call Home Feature
    • Copying and Deleting Views
    • SAN Device Configuration (configuration repository management)
    • LSAN Zoning
    • Performance Overview
    • Bottleneck detection

Why dB Loss Matters for Building Reliable Stable Networks

(GA-TN-048-01)

  • Pages 2,3
  • topics
    • total 8 pages
    • link lengths and link loss budgets

Brocade 6505 Hardware Reference

(53-1002449-01)

  • Pages 13,15
  • topics
    • ISL trunking
    • switchstatuspolicy
    • fos native and AG modes

Brocade Access Gateway Administrator’s Guide

(53-1002156-01)

  • Pages xiv,72
  • topics
    • supported hardware and software (which switches and FOS)
    • enabling NPIV on M-EOS and Cisco switches
      • CISCO: config t; npiv enable
      • MEOS:
    • new features: F_Port static mapping, APM, B6510, Target Aggregation, Direct target attachment, N_Port monitoring

“You can run the agshow command to display Access Gateway information registered with the fabric. When an Access Gateway is exclusively connected to non-Fabric-OS-based switches, it will not show up in the agshow output on other Brocade switches in the fabric.”

CEE Admin Guide

(53-1002163-02)

  • Page xviii
  • topics
    • Supported Hardware: Standalone switch B8000 and the blade FCOE10-24
    • IGMP configuring (IGMP is used in multicast, ethernet)
    • Replacing the B8000
      • configdownload
      • and copy running config and stuff! Looks very similar to the Cisco CLI.

Brocade Adapters Admin Guide

(53-1002143-01)

  • Pages 3,13
  • topics
    • AnyIO technology on the 1860 Fabric Adapter: just change the SFP and set the mode with bcu port --mode or bcu adapter --mode.
      • HBA or FC mode (FC)
      • Ethernet or NIC mode (GbE)
      • CNA mode (FCoE)
    • Adapter Support (OS + description of adapters)

The New Data Center 1st Edition

ISBN: 978-1-4507-0195-2

  • Pages 65,66,78
  • topics
    • Fabric Based Disaster Recovery (64-67)
      • An overview of some of the extension technologies and reasons behind them.
    • Network Security (77) + Power, Space and Cooling Efficiency (78)
      • Network Security is not FC related.