Jon Udell is senior technical evangelist for Microsoft and a long-time technology writer and analyst. David Weinberger introduces him with a story about pitching his company to Udell at Byte magazine in 1990, a reminder to Weinberger, Udell and anyone else who remembers Byte magazine that we’re all getting pretty old.
Udell’s talk is titled “Rethinking the community calendar: A case study in learning and teaching Fourth R principles”. He begins by pointing us to the webpage for the event, and tells us that the page fulfills one of its two functions. It enables human beings to find out about the event, but it doesn’t provide the data that would allow us to syndicate this event to other calendars or other automated systems. If we look at the data feed that’s available – the RSS feed – we get pretty frustrated. There are machine-readable tags in that feed, but they don’t include clearly specified event location and time data – you’d need to dig it out of the human-readable text fields.
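To make the gap concrete, here is a minimal sketch of what a program sees when it reads an event announcement from an RSS feed. The feed item below is invented for illustration (it is not the actual feed for this talk), and `item_fields` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed with one event announcement (invented data).
RSS = """<rss version="2.0"><channel>
<title>Example events</title>
<item>
<title>Rethinking the community calendar</title>
<link>http://example.org/events/udell</link>
<pubDate>Mon, 01 Mar 2010 09:00:00 GMT</pubDate>
<description>Join us Tuesday at 12:30 pm in the conference room.</description>
</item>
</channel></rss>"""


def item_fields(rss_text):
    """Return the tag -> text pairs of the first <item> in the feed."""
    root = ET.fromstring(rss_text)
    item = root.find("./channel/item")
    return {child.tag: (child.text or "").strip() for child in item}


fields = item_fields(RSS)
for tag, text in fields.items():
    print(f"{tag}: {text}")
```

The only timestamp a program can rely on is `pubDate`, which records when the announcement was posted, not when the event happens. The actual time and place exist only as prose inside `description`.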
This is partly the fault of technologists, who’ve told people “thou shalt publish RSS feeds”. Website publishers responded: “we hear and obey”. And now users wonder why we can’t make sense of calendars. We did the right thing the wrong way – the feed is right, but the data is wrong.
RSS is mostly a newsreader format, which means it’s mostly in the human-readable data space. It’s the key protocol that makes possible a powerful ecosystem around blogs, where people can publish, aggregate and subscribe to blogs. Some tools do just one thing – WordPress lets you publish. Others are more complex – Google Reader both aggregates and acts as a reader. With these tools and the underlying protocol, feeds can be connected in arbitrary topologies.
Jon notes that one of the key ideas that makes RSS work is the idea that the feed is you, authoritatively representing yourself to the web. The idea that I control this data and publish it at a URL, bound to my identity, in a usable format makes syndication work as a concept, not just for blogs but for any form of data syndication.
If RSS is the key protocol for the blog space, iCalendar is the protocol for calendaring. The protocol is 12 years old, and all major calendaring systems accept .ics files. But the ecosystem hasn’t really emerged yet. There are publishers – systems like Eventful and Eventbrite… and, moving down the long tail, iCal. And there are groups that publish calendars, like the Harvard Gazette or Wicked Local Cambridge. But we don’t yet have an aggregation engine. That means that, if you’re promoting an event in Cambridge – Jon’s talk, for instance – you need to contact each publisher, feed in your data in their format, etc. To avoid this repetition, we need an aggregator – a hub that can route between publishing and subscribing calendars.
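For readers who haven’t looked inside an .ics file, here is a rough sketch of writing and reading a single iCalendar event. The property names (`DTSTART`, `DTEND`, `LOCATION`, `SUMMARY`, `UID`) come from the iCalendar format; the event details, the `example.org` identifier, and the helper names `make_event` and `parse_events` are assumptions made up for this example. The parser is deliberately naive: it ignores line folding, property parameters and text escaping, which a real calendar library would handle.

```python
from datetime import datetime

STAMP_FORMAT = "%Y%m%dT%H%M%SZ"  # UTC date-time form used by iCalendar


def make_event(uid, summary, location, start, end):
    """Serialize one event as a minimal iCalendar document.

    Assumes start/end are already in UTC and that the text values
    contain no commas, semicolons or newlines (no escaping is done).
    """
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//sketch//EN",  # hypothetical product id
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{start.strftime(STAMP_FORMAT)}",  # simplification: reuse start
        f"DTSTART:{start.strftime(STAMP_FORMAT)}",
        f"DTEND:{end.strftime(STAMP_FORMAT)}",
        f"SUMMARY:{summary}",
        f"LOCATION:{location}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    return "\r\n".join(lines) + "\r\n"  # iCalendar lines end in CRLF


def parse_events(ics_text):
    """Naively collect the properties of each VEVENT into a dict."""
    events, current = [], None
    for line in ics_text.splitlines():
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT":
            events.append(current)
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            current[name] = value
    return events


# Invented event data, for illustration only.
ics = make_event(
    uid="udell-talk@example.org",
    summary="Rethinking the community calendar",
    location="Conference room",
    start=datetime(2010, 3, 2, 17, 30),
    end=datetime(2010, 3, 2, 18, 30),
)
events = parse_events(ics)
```

Unlike the RSS item, the time and place here are separate, named fields, so any calendar program that reads .ics can pick them up without guessing.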
Jon’s project, Elmcity, is that hub. It’s a free service, running on Microsoft’s Azure cloud. It uses open source code and open data, and focuses on routing the data between publishers and subscribers. It’s a useful tool, but Jon tells us, it’s also a way to invite people to think about how we collectively build the data web. We are all authoritative sources for one sort of data or another – when we are those authoritative sources, we should be able to put that online.
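The hub idea can be sketched in a few lines. This is not Elmcity’s code, just an illustration of the routing role under simple assumptions: each publisher’s feed has already been parsed into dicts with `uid`, `start` and `title` keys (names chosen for this example), and events that share a `uid` are the same event.

```python
from datetime import datetime


def aggregate(feeds):
    """Merge events from several publishers into one ordered list.

    feeds maps a publisher name to a list of event dicts. The first
    publisher to list a given uid wins; later copies are dropped.
    """
    merged = {}
    for source, events in feeds.items():
        for event in events:
            merged.setdefault(event["uid"], {**event, "source": source})
    return sorted(merged.values(), key=lambda e: e["start"])


# Invented feeds: two publishers, one overlapping event.
feeds = {
    "venue": [
        {"uid": "talk-1", "start": datetime(2010, 3, 2, 12, 30), "title": "Udell talk"},
    ],
    "city-paper": [
        {"uid": "talk-1", "start": datetime(2010, 3, 2, 12, 30), "title": "Udell talk"},
        {"uid": "market-7", "start": datetime(2010, 3, 1, 9, 0), "title": "Farmers market"},
    ],
}

combined = aggregate(feeds)
for event in combined:
    print(event["start"].isoformat(), event["title"], "via", event["source"])
```

The promoter publishes once, at the source, and every subscriber to the hub gets the event, which is the repetition the aggregator is meant to remove.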
This is hard for people to understand. They have a hard time distinguishing between putting something online as a .pdf or as an XML feed. He references a high school principal, who asked him “We posted weekly.pdf to the website – isn’t that good enough?” People have a hard time distinguishing between data that people can read and data that computers can read. And people are very bad at understanding that transformations of data are not necessarily reversible – turning structured data into a formatted web page is a one-way transformation (unless the formatting rules are well known…).
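A small sketch shows why the transformation is one-way. The `render` function and the two events are invented for this example; the point is only that two different structured records can produce identical human-readable text, so the text alone cannot be turned back into the data.

```python
from datetime import datetime


def render(event):
    """Format an event the way a human-facing page might."""
    when = event["start"].strftime("%A at %H:%M")
    return f'{event["title"]}: {when}, {event["location"]}'


# Two distinct events, one week apart (invented data).
this_week = {
    "title": "Staff meeting",
    "start": datetime(2010, 3, 2, 12, 30),
    "location": "Room 12",
}
next_week = {
    "title": "Staff meeting",
    "start": datetime(2010, 3, 9, 12, 30),
    "location": "Room 12",
}

print(render(this_week))
print(render(next_week))
print(render(this_week) == render(next_week))  # same text, different data
```

A person reading the page fills in the missing date from context. A program reading it has nothing to go on.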
Jon references computer scientist Jeannette Wing, who has argued that patterns associated with computing are broadly generalizable and should be broadly taught. She refers to this as “computational thinking”, and suggests that this form of thinking be taught as a fourth essential set of skills alongside reading, writing and arithmetic – “the fourth R”. Educators are thinking about this, Jon tells us – these ideas manifest as digital literacy or systems thinking. But we may need to get very deep into this space. Developer and entrepreneur Phil Libin suggests that “the basics of asymmetric cryptography are fundamental concepts that any member of society who wants to understand how the world works” must understand.
What are the concepts that might make up this fourth R? Jon suggests some ideas focused on data:
- Structured data can be represented many ways
- Some representations are best for people, others best for computers
- Machine friendly data can syndicate without loss of fidelity
- Data feeds have globally unique names – URLs
- URLs enable the “small pieces loosely joined” effect
- URLs pass by reference, not by value – their contents can change
- When data syndicates from a URL, the publisher controls it
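The “by reference, not by value” point in the list above can be sketched without touching the network. Here an in-memory dict stands in for the web, and the URL, the calendar text and the `fetch` helper are all invented for illustration.

```python
# A dict standing in for the web: URL -> current contents (invented data).
web = {"http://example.org/calendar.ics": "SUMMARY:Talk at noon"}

URL = "http://example.org/calendar.ics"


def fetch(url):
    """Return whatever the publisher currently serves at this URL."""
    return web[url]


snapshot = fetch(URL)  # by value: a copy taken once
subscription = URL     # by reference: just the name of the feed

# The publisher changes the event at the source.
web[URL] = "SUMMARY:Talk moved to 1 pm"

print(snapshot)             # the stale copy
print(fetch(subscription))  # the publisher's current version
```

Whoever copied the data now has an out-of-date calendar, while whoever kept the URL sees the correction as soon as they look again. That is the sense in which the publisher stays in control.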
We may think that “digital natives” automatically understand these ideas – Jon argues they don’t. These ideas need to be taught. And we need to teach them because these principles don’t just underlie calendars – they lie beneath science 2.0, library 2.0, government 2.0, edu 2.0 and identity 2.0. When we think of government data in terms of “government has data – release it and we’ll use it to do good things”, we’re missing the point that this data is something we’re jointly contributing. We all shed data as we move through the world, and we are allowing others to own and control it. If we better understood the fact that we were the authoritative sources for this data, we might handle identity issues differently.
Jon’s talk is followed by a lively debate about the idea of this fourth R – will individuals actually need to be literate about these low-level principles, or, in a hundred years, will this conversation seem as naive as demanding that everyone understand the basics of electric motors? Is the failure to create collective calendars actually a reflection of our poor understanding of data syndication, or the product of habit? Is it fair to beat up the school principal for using a PDF – perhaps he’s serving his users’ needs? Good questions, though the resolution to them is unclear.