
wasthereannhlgamelastnight.com

wasthereannhlgamelastnight.com – now using object storage!

This continues the series of blog posts about the awesome https://wasthereannhlgamelastnight.appspot.com/WINGS web site, where you can see whether there was, in fact, an NHL game last night :)

Some background: at first I had a Python script that scraped nhl.com, and later I changed it to just grab the data from nhl.com’s JSON REST API, which is much nicer. But it was still printing the result to stdout as a set and a dictionary, and the application would then import that file to get the schedule. This was quite hacky and ugly :) But hey, it worked.

As of this commit it uses Google Cloud Storage, Google’s object storage. It works like this:

  • there’s a special URL (one has to be an admin to access it)
  • a cron job calls this URL once a day (22:00 in some time zone)
  • when this URL is called, a Python script runs which:
    • checks what year it is and composes the API URL so that we only grab this season’s games (to be a bit nicer to the API)
    • does some sanity checking: that the fetched data is not empty
    • extracts the dates and teams as before and writes two variables:
      • a list of the dates when there’s a game
      • a dictionary of the dates and all the games on each date
        • the dictionary alone would probably be enough ;)
    • finally, always overwrites the schedule
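The steps in the list above could look roughly like the sketch below. This is not the actual code from the commit: the function names, the season boundaries (1 September to 31 July) and the shape of the API response (`dates` → `games` → `teams`) are all assumptions made for illustration, and fetching and uploading are left out.

```python
import datetime

# Assumption: the schedule endpoint of the NHL stats API.
API_URL = "https://statsapi.web.nhl.com/api/v1/schedule"


def season_range(today):
    """Return (start, end) dates of the season that `today` belongs to.

    Assumption: a season runs from 1 September to 31 July of the next
    year, and from August onwards we look at the upcoming season.
    """
    start_year = today.year if today.month >= 8 else today.year - 1
    return datetime.date(start_year, 9, 1), datetime.date(start_year + 1, 7, 31)


def compose_url(today):
    """Build a URL that only asks for this season's games."""
    start, end = season_range(today)
    return "{}?startDate={}&endDate={}".format(
        API_URL, start.isoformat(), end.isoformat())


def extract_schedule(data):
    """Turn the decoded JSON response into (dates, games_by_date)."""
    # Sanity check: never overwrite a good schedule with an empty one.
    if not data or not data.get("dates"):
        raise ValueError("fetched schedule is empty")
    games_by_date = {}
    for day in data["dates"]:
        # Assumed response layout: each game lists an away and a home team.
        games_by_date[day["date"]] = [
            [game["teams"]["away"]["team"]["name"],
             game["teams"]["home"]["team"]["name"]]
            for game in day["games"]
        ]
    # The list of dates is derived from the dictionary keys.
    return sorted(games_by_date), games_by_date
```

Since the list of dates is just the sorted keys of the dictionary, storing only the dictionary would indeed be enough.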

 

Updating it only when there are changes would be cool, as I could then notify myself (and possibly others) when the schedule changes. But that would require the JSON dict to be serialized in a stable order, which it isn’t by default, so I’d have to change some stuff. The GCSFileStat has checksum-like metadata for the file, called the ETag. But it would probably be best to compute a checksum of the generated JSON myself and add it as extra metadata on the object, as the ETag is probably implemented differently between providers.
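A minimal sketch of that checksum idea, using only the standard library: serializing with `sort_keys=True` makes the output independent of dict ordering, so the same schedule always gives the same digest. The function names are hypothetical, and storing the digest as metadata on the object is left out here.

```python
import hashlib
import json


def schedule_checksum(schedule):
    """Return a SHA-256 hex digest that does not depend on dict ordering."""
    # sort_keys applies to nested dicts too; fixed separators avoid
    # whitespace differences between runs.
    canonical = json.dumps(schedule, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(new_schedule, stored_checksum):
    """True if the new schedule differs from the one already stored."""
    return schedule_checksum(new_schedule) != stored_checksum
```

Note that lists keep their order, so the games within a date would have to be written in a stable order for this to work.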

 

wasthereannhlgamelastnight.appspot.com – fixed – working again!


With the NHL 2017-2018 season coming up and some extra spare time on my hands, I thought: why not finally fix this great website again :)

The NHL changed the layout of their schedule page about two seasons ago: these days it uses “infinite scrolling”, or whatever it’s called when the page only loads what you see on the screen. This makes the page a bit difficult (but not impossible) to scrape.

Lately I’ve been using REST APIs and JSON data for quite a lot of things, and after a short search I managed to find this hidden gem: https://statsapi.web.nhl.com/api/v1/schedule?startDate=2016-01-31&endDate=2016-02-05&expand=schedule.teams,schedule.linescore,schedule.broadcasts,schedule.ticket,schedule.game.content.media.epg&leaderCategories=&site=en_nhl&teamId=

That’s a link to an API provided by the NHL where you get the schedule and can filter it. I’m not sure what all the parameters do, and they’re not all needed: you just need startDate and endDate. The API also has standings and results. I have not managed to find any documentation for it; the best source so far seems to be this blog post. So I’m not sure whether it’s OK to use it or whether there are any restrictions.
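To illustrate that only the two date parameters are needed, here is a small sketch that trims the long URL down to startDate and endDate. The helper name is made up for this example, and it only rewrites the URL; it does not call the API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# The long URL quoted in this post, split over several lines for readability.
LONG_URL = (
    "https://statsapi.web.nhl.com/api/v1/schedule"
    "?startDate=2016-01-31&endDate=2016-02-05"
    "&expand=schedule.teams,schedule.linescore,schedule.broadcasts,"
    "schedule.ticket,schedule.game.content.media.epg"
    "&leaderCategories=&site=en_nhl&teamId="
)


def minimal_schedule_url(url, keep=("startDate", "endDate")):
    """Return `url` with every query parameter except those in `keep` removed."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # blank values are dropped by default
    kept = [(name, query[name][0]) for name in keep if name in query]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(minimal_schedule_url(LONG_URL))
```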

p.s. there is a shorter URL to the main page, https://rix.fi/nhl, but the commands, like https://wasthereannhlgamelastnight.appspot.com/MTL, do not work there.

Was there an NHL game last night?