Tag Archives: google

wtangy.se – site rename and automatic deployments!

This is a good one!

Previous entries in this series: http://www.guldmyr.com/blog/wasthereannhlgamelastnight-com-now-using-object-storage/ and  http://www.guldmyr.com/blog/wasthereannhlgamelastnight-appspot-com-fixed-working-again/

Renamed to wtangy.se

First things first! The website has been renamed to wtangy.se! Nobody in their right mind would type out wasthereannhlgamelastnight.com, so the new name is an acronym of “was there an NHL game yesterday”: wtangy.se. It uses Sweden’s .se top-level domain because there was an offer that made it really cheap :)

 

Automatic testing and deployment

The second important update is that we now do some automatic testing and deployment.

This is done with travis-ci.org, where one can view the builds; the configuration lives in this file.

In Google Cloud there are different versions of the app deployed. If we don’t promote a version it will not be accessible from wtangy.se (or wasthereannhlgamelastnight.appspot.com), only via some other URL.

Right now the testing happens like this on every commit:

  1. deploy the code to a testing version (which we don’t promote)
  2. then we run some scripts:
    1. pylint on the python scripts
    2. an end to end test which tries to visit the website.
  3. if the above succeeds we do deploy to master (which we do promote)
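
The end-to-end check in step 2 can be as small as a function that fetches the front page and reports whether it answered. A minimal sketch, assuming that an HTTP 200 with a non-empty body counts as success; the function name and the `fetch` parameter are made up for illustration and are not taken from the actual test script:

```python
import urllib.request


def site_is_up(url, fetch=urllib.request.urlopen):
    """Return True if the page answers with HTTP 200 and a non-empty body.

    `fetch` is injectable so the check can be exercised without a network.
    """
    try:
        with fetch(url, timeout=10) as response:
            return response.status == 200 and len(response.read()) > 0
    except OSError:
        # urllib's URLError and HTTPError are both subclasses of OSError.
        return False
```

In a CI job one would typically exit with a non-zero status when `site_is_up(...)` returns False, so that the final promoted deploy never runs.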

wasthereannhlgamelastnight.com – now using object storage!

This continues the series of blog posts about the awesome https://wasthereannhlgamelastnight.appspot.com/WINGS web site, where you can see if there was, in fact, an NHL game last night :)

Some background: First I had a python script that scraped the nhl.com website, and later I changed that to just grab the data from the JSON REST API of nhl.com – much nicer. But it was still outputting the result to stdout as a set and a dictionary, and the application would then import this file to get the schedule. This was quite hacky and ugly :) But hey, it worked.

As of this commit it now uses Google’s Cloud Object Storage:

  • there’s a special URL (one has to be an admin to be able to access it)
  • there’s a cronjob which calls this URL once a day (22:00 in some time zone)
  • when this URL is called a python script runs which:
    • checks what year it is and composes the URL to the API so that we only grab this season’s games (to be a bit nicer to the API)
    • does some sanity checking – that the fetched data is not empty
    • extracts the dates and teams as before and writes two variables,
      • one list which has the dates when there’s a game
      • one dictionary which has the dates and all the games on each date
        • probably the last would be enough ;)
    • finally always overwrites the schedule
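
The steps in the list above could look roughly like this. It is only a sketch: the season boundaries (1 September to 30 June) and the JSON layout (`dates` → `games` → `teams`) are assumptions made for illustration, and the function names are made up.

```python
import datetime


def season_window(today):
    """Return (start, end) dates of the season that `today` belongs to.

    Assumption: a season runs from 1 September to 30 June of the next year.
    """
    first_year = today.year if today.month >= 7 else today.year - 1
    return datetime.date(first_year, 9, 1), datetime.date(first_year + 1, 6, 30)


def build_schedule(api_data):
    """Turn the fetched JSON into the two variables described above."""
    days = api_data.get("dates", [])
    if not days:
        # Sanity check: never overwrite a good schedule with an empty one.
        raise ValueError("fetched schedule is empty")
    games_by_date = {}
    for day in days:
        games_by_date[day["date"]] = [
            (game["teams"]["away"]["team"]["name"],
             game["teams"]["home"]["team"]["name"])
            for game in day["games"]
        ]
    # The list of dates is just the dictionary's keys, which is why the
    # dictionary alone would probably be enough.
    game_dates = sorted(games_by_date)
    return game_dates, games_by_date
```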

 

Updating it only when there are changes would be cool, as then I could notify myself (and possibly others) when the schedule changes. But it would mean that the JSON dict has to be ordered, which dicts aren’t by default, so I’d have to change some stuff. The GCSFileStat has checksum-like metadata for each file called the ETag. But it would probably be best to first compute a checksum of the generated JSON and then add that as extra metadata on the object, as the ETag is probably implemented differently between providers.
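
One way around the ordering problem is to serialise the JSON with sorted keys before hashing it, so that two dicts with the same content always give the same checksum. A minimal sketch of that idea (the function names are made up, and storing the checksum as object metadata is left out):

```python
import hashlib
import json


def schedule_checksum(schedule):
    """Return a SHA-256 hex digest that does not depend on dict key order."""
    # sort_keys sorts the keys of nested dicts too; list order is preserved.
    canonical = json.dumps(schedule, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(new_schedule, stored_checksum):
    """Compare a freshly generated schedule against the stored checksum."""
    return schedule_checksum(new_schedule) != stored_checksum
```

Because the checksum is computed from the generated data itself, it does not depend on how any particular storage provider implements its ETag.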

 

wasthereannhlgamelastnight.appspot.com – fixed – working again!


With the NHL 2017-2018 season coming up and some extra spare time on my hands, I thought: why not finally fix this great website again :)

The NHL changed the layout of their schedule page about two seasons ago: these days it uses “infinite scrolling”, or whatever it’s called when the page only loads what you see on the screen. This makes it a bit difficult to scrape the page (but not impossible).

Lately I’ve been using REST APIs and JSON data for quite a lot of things, and after a short search I managed to find this hidden gem: https://statsapi.web.nhl.com/api/v1/schedule?startDate=2016-01-31&endDate=2016-02-05&expand=schedule.teams,schedule.linescore,schedule.broadcasts,schedule.ticket,schedule.game.content.media.epg&leaderCategories=&site=en_nhl&teamId=

Now that’s a link to an API provided by the NHL where you get the schedule and can filter it. I’m not sure what all the parameters do, and they’re not all needed: you just need startDate and endDate. The API also has standings and results. I have not managed to find any documentation for it; the best so far seems to be this blog post. So I’m not sure whether it’s OK to use it or whether there are any restrictions.
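
Since only startDate and endDate are needed, the URL can be composed from just those two parameters. A small sketch (the function name is made up; it only builds the URL and does not call the API, which is undocumented and may change):

```python
from urllib.parse import urlencode

# Base address taken from the link above.
SCHEDULE_API = "https://statsapi.web.nhl.com/api/v1/schedule"


def schedule_url(start_date, end_date):
    """Build a schedule URL for dates given as YYYY-MM-DD strings."""
    query = urlencode({"startDate": start_date, "endDate": end_date})
    return f"{SCHEDULE_API}?{query}"
```

For example, `schedule_url("2016-01-31", "2016-02-05")` gives the same date range as the long link above, minus the optional parameters.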

p.s. – there is a shorter URL to the main page: https://rix.fi/nhl – but the commands – like https://wasthereannhlgamelastnight.appspot.com/MTL – do not work.


Trick with labels in gmail – how to find out who is selling your e-mail address

I actually read about this in some comment field somewhere else, but here is a great explanation of my intention:

http://g04.com/misc/GmailTipsComplete.html#Tip-03
At first I thought that I could just send e-mail to address+label@gmail.com and it would automagically be filed under that label. That’s not the case: you still need to create a label and set up a filter. But it is still quite usable, for example if you want to send notes to yourself.
Another cool thing: let’s say you register on a website. Then don’t register with your normal e-mail address; register with address+facebook@gmail.com instead.

Then all those e-mails from facebook will come to you – with the above address in the to: field.

Quite neat huh?

In case you want to find out who is selling your e-mail address you can set this up for each place you register and then when you get a spam message you can click on more details and see where it was sent to!
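
If you ever want to do this check in bulk instead of by clicking, pulling the label out of the To: address is simple string handling. A small sketch (the function name is made up):

```python
def plus_label(address):
    """Return the label in 'user+label@domain', or None if there is none."""
    local_part, _, _domain = address.rpartition("@")
    _user, plus, label = local_part.partition("+")
    return label if plus and label else None
```

With the example above, `plus_label("address+facebook@gmail.com")` returns `"facebook"`, which tells you where the sender got the address from.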

Firefox 4.0 is here! – Or – I went with Google-Chrome instead

Firefox download links:

Windows
http://releases.mozilla.org/pub/mozilla.org/firefox/releases/4.0/win32/en-US/Firefox%20Setup%204.0.exe

Linux
http://releases.mozilla.org/pub/mozilla.org/firefox/releases/4.0/linux-i686/en-US/firefox-4.0.tar.bz2

Mac
http://releases.mozilla.org/pub/mozilla.org/firefox/releases/4.0/mac/en-US/Firefox%204.0.dmg

I’m going to test this on my Windows machine as soon as I get home.

On my RHEL6 laptop, however, I couldn’t just unpack the Linux version and run ./firefox.
I also couldn’t find an installation guide. It complains about this:

./firefox-bin: error while loading shared libraries: libgtk-x11-2.0.so.0: cannot open shared object file: No such file or directory

But sudo yum install gtk2 gives: Package gtk2-2.18.9-4.el6.x86_64 already installed and latest version.

And after a ‘find /’ I found the file here:

cat ~/find.all | grep libgtk-x11-2.0.so.0
/usr/lib64/libgtk-x11-2.0.so.0.1800.9
/usr/lib64/libgtk-x11-2.0.so.0

How do I proceed? I did not find anything online quickly enough that would help me. The other requirements I could also find on my system.
I tried to run ./firefox-bin, which complained about libxul.so, which I also have on my system.
I tried running it with sudo: no difference.

If anybody reads this and has some ideas or so – please let me know :)

So I tried Google Chrome instead (haven’t tried this before) and wow, compared to Firefox 3.6.x which is the default one on RHEL6 it is really fast!

This is the link I used to install it and it worked perfectly:

http://www.if-not-true-then-false.com/2010/install-google-chrome-with-yum-on-fedora-red-hat-rhel/

  1. Add this to /etc/yum.repos.d/google.repo
[google64]
name=Google - x86_64
baseurl=http://dl.google.com/linux/rpm/stable/x86_64
enabled=1
gpgcheck=1
gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub
  2. yum install google-chrome-unstable
  3. start google-chrome with: google-chrome

How to get an update when a blog has gotten a new post

Hello and welcome to my blog :)

If you want to get an update whenever there is a new post you can subscribe (for free of course) to an RSS feed. This feed is a small page in .XML format that gets updated whenever a post is added on guldmyr.com.

Does it sound a little complicated? – It does not have to be!

What I found amazing about this is that most webpages have a feed, so you can put all your favorite sites in an RSS-reader, and then you just load this reader and it will tell you if any of your pages has any new posts!

Feeds of my categories – I think this might be a good idea because I tend to write about many things.

The storage category is focusing on storage, SAN and data networks.
The Finland category is about anything that is not about IT.
The IT category is about the rest of IT :)

http://www.guldmyr.com/blog/category/finland/feed/
http://www.guldmyr.com/blog/category/storage/feed/
http://www.guldmyr.com/blog/category/IT/feed/
http://www.guldmyr.com/blog/feed/ –  This one has all included.
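
For the curious, this is roughly what an RSS reader does with such a feed: it parses the XML and lists the items. A minimal sketch using Python’s standard library; the miniature feed below is made up purely to show the shape of RSS 2.0 and is not the real content of any of the feeds above:

```python
import xml.etree.ElementTree as ET

# Made-up miniature feed, only to illustrate the RSS 2.0 structure.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def list_posts(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS document."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]
```

A reader remembers which items it has already shown you, and that is how it knows how many posts are new.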

Google Reader
Personally I use Google Reader – http://reader.google.com – to view the RSS.
Pros: Works from my android phone from the built-in web browser, works from any computer with Internet access, all it takes is for me to log in to the address above with my gmail account. Automatic translations of feeds, tags, sharing and more.
Cons: I haven’t actually tried any other. :)

Steps:

  1. Point your web browser to: http://www.google.com/reader/
  2. Click on “Add a subscription”
  3. Add one of my RSS feeds – for example http://www.guldmyr.com/blog/feed/ – and click on add.
  4. Done!

Now every time you log on you will see a number next to the name of the feed which shows how many new posts there have been since last time. To read them just click on them.
It is possible to use keyboard shortcuts in google reader too:

j -> next post
k -> previous post
u -> enables/disables the menu on the left part of the screen

There are more shortcuts but those are the only ones I use.

Firefox
To find the RSS feed faster, on many pages you can go to your bookmarks menu and click on “subscribe” – this will open your RSS reader and attempt to put the RSS feed in there – remember, it doesn’t cost you anything :)

Google Interview – Data Center IT Technician

* Updated 2011-01-29
* Updated 2011-03-08 Dtek contributed some more questions from the first interview.
* Updated 2011-03-23 – Added question about PCI/PCIe and DOS partitions. Also I have been asking the few who have commented to add their input but not so much feedback yet. Feel free to drop me an e-mail or put in a comment :)
* Updated 2011-03-24 – Wrote a little more about the ‘100 broken computers’ question.
* Updated 2011-05-21 – Added some more detail/discussion about the questions in the first section.

A little while ago I had an interview for a position with Google as a Data Center IT Technician. I never signed an NDA, so it should be safe to put the questions up here :)

However, if you want to play it safe I’d refrain from posting here until after the interviews are over.

I didn’t get the job, and they never said why I didn’t pass when I replied asking for some feedback (besides the template e-mail).

If you read this go ahead and comment, maybe we can figure out a better way to approach the questions :)

First Interview

First there was one interview which had 20 questions (I don’t remember all) around basic (older – like no SATA) PC stuff:
What are all the components in a PC or Server?
PC: chassis, system board, PSU, CPU, RAM, HDD, fans, cables, graphics card, DVD, monitor, keyboard/mouse
Server: the same minus the extra graphics card (there is one on the system board), plus an HDD controller, possibly a backplane, no CD/DVD, extra NIC, double CPU, RAM, PSU and fans, and remote monitoring/console.

What protocol is used by ping? ICMP (this is a sneaky question – an obvious mistake is to go for TCP or UDP)
How many IDE devices can you have in a PC?
– two per channel (usually 4)
How many can you have on each channel? What are they called? – 2 / master and slave
What is the resolution in Windows 2000 safe mode? – 800×600 or 640×480?- see this link on mydigitallife or this post on tom’s hardware.
What is a MAC address? Media Access Control – a unique identifier for network devices. Used by many protocols.
What is the name of the Ethernet plug? RJ45
How do you recognize a broken hard drive without software or removing it from the machine? – 1) Noise (tick tack sound of the arm getting stuck/hitting something) 2) Any leds on the disk, system board, controller 3) Any vibration or anything from the disk?
How do you find the first disk in a linux OS? – Look under /dev/ for a disk like /dev/sda (SATA) or /dev/hda (PATA). Then /dev/sda1 is the first partition.
Name two devices needed to make a network: switch and router (well, network card (NIC) and router should do it, or a switch and network card.. depends how big you want to make it, really i guess you can have a network with a crossover cable and two nics).
What is BIOS? Basic input/output system. Responsible for initializing hardware, POST/startup diagnostics, booting the OS and various hardware settings.
What is the bit rate of a serial interface of a network device? – the default apparently in hyperterminal – 9600 (this might be tricky, in my experience this varies between the devices – max is probably 115200bps). Maybe what they are asking for is what is the default speed of a Cisco switch’s aux or console port? If so, the answer is 9600.
What is the port used for HTTP? – 80
What is the difference between PCIe and PCI? – PCI-e is newer than PCI and PCI-x. The slots look different and they are not compatible.
How many primary partitions can you have in DOS?
– Four primary and maximum one active per disk. See this link for some explanations. Unsure at this stage what the exact question was.
What did you do in your previous jobs?
Would you be able to re-locate?

What does HTTP stand for? Hypertext Transfer Protocol
What controls GPU CPU Mem at boot up?
What is ROM?
Read Only Memory – Used for storing data that you do not want somebody to write to.

Length of cat5 transmission? 100m
What does NIC stand for? 
Network Interface Card

What is the standard type of filesystems now? disk filesystem and network filesystems
How would you create an EXT4 filesystem on the first partition of the first SCSI drive? 
mkfs.ext4 /dev/sda1
How would you install lilo? 
How do you get IP info from Linux or Windows? “ip addr” or “ifconfig” in Linux and “ipconfig” in Windows.
What is the subnet mask for a Class B Network?
255.255.0.0
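
If you want to double-check subnet masks like the one above, Python’s standard ipaddress module can derive the mask from the prefix length. A small sketch (the function name is made up), using the classful defaults of /8, /16 and /24 for class A, B and C:

```python
import ipaddress


def netmask_for_prefix(prefix_length):
    """Return the dotted-decimal netmask for an IPv4 prefix length."""
    network = ipaddress.ip_network(f"0.0.0.0/{prefix_length}")
    return str(network.netmask)
```

A class B network has a 16-bit network part, so `netmask_for_prefix(16)` gives 255.255.0.0.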

Second Interview

Then there was a second one that was done by a person who worked in a data center and had questions like these:

You have 100 broken computers, how do you proceed.
Personally I didn’t do very well on this question I think.
There are lots of ways to approach this one, how far can you take it?
One way to make this a little easier/visible would be to make a tree or mindmap and keep on expanding it.

But thinking about this afterwards I would probably approach it something like this, (not necessarily in this order):

  1. Get an overview – where are they, what kind of hardware, importance, severity of problem; the wider you can make this the better. Having this done automagically is important and speeds up troubleshooting immensely. Consider monitoring software like Nagios, Ganglia or Cacti. They can monitor not only hardware but also services.
  2. What’s the status of the central components? Network, power, etc.
  3. Hopefully not all have the same problem, try to find certain groups of them that have the same obvious error.
  4. Maybe there is more than one underlying issue, but they appear to be the same, or give the same symptoms.
  5. Maybe there is a problem on one computer that is causing problems for all the rest. For example, a bad ethernet/fibre channel card or cable can cause interruptions on the whole network or SAN.
  6. Maybe it is a service, and something in that software on one node is causing the issue. Like a job that runs on many machines but broke on one machine, and that caused the rest to break.
  7. Look in logs of the systems/services.
  8. Run a diagnostic CD such as Ultimate Boot CD on the computers to look for hardware errors. Server vendors may have their own diagnostic tools. memtest86 is a good boot CD for memory testing (booting from it is probably the best way to test memory, since the least amount of memory is locked by the OS)
  9. For severe hardware problems you can look in the POST of the machine, check leds on them, but for 100 machines this might be more of a last resort.
  10. If you suspect the problem is software, again try to find something they have in common: same manufacturer, same software/patches installed. Maybe this software has a monitoring part that can tell you more. Check the logs.
  11. When did the problem start? At the same time as a power outage, after a patch deployment, etc.
  12. Are they all physically close? Anything else gone down?

How do you see what happens during boot of the OS in Linux?
Answer: output of command dmesg and also in /var/log/syslog
Where do you find the logs in Linux?

Answer: /var/log
How do you mount a disk?

Answer: mount
Every boot?

Answer: fstab
How do you see what version of the kernel is running?
Answer: uname -r gives the kernel release (2.6.x etc); uname -v gives the kernel version string (build number and date)
How do you put an image on a pc?
Answer: pxeboot as an example
How do you turn a room into a data center?

Maybe something like this? If you have any additions please go ahead and let me know :)

floor strong enough to hold the weight of all the equipment?
physical security – bar windows, access control(keys), cameras, guards
ventilation – perforated tiles?
cooling
anti-fire
racks
UPS – electrical work

What is the difference between a switch and a router
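I said that the difference is that switches are closer to the hosts and routers are further away, in the core. (The textbook answer: a switch forwards frames by MAC address within one network, at layer 2, while a router forwards packets by IP address between networks, at layer 3.)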
Did you have experience writing documents – kb?
Worst job you ever had?
What do you expect from your colleagues and your boss?
What do you do outside work?
Do you have any questions?