APS2: Rethinking the Login

	Announcer: This is one in a series of podcasts from the HighEdWeb Conference in Austin 2011. Mark Heiman: Welcome. Thank you all for coming. It's great to be here. I've never been to Austin before. It's kind of a fabulous place. I feel like I'm in the Austin weird bubble, though, and I'm protected from the rest of Texas. [Laughter] Mark Heiman: So we're going to have real cowboys all the time in my presentation today. My name, as they said, is Mark Heiman. I'm the Senior Web Developer at Carleton College in Northfield, Minnesota. Carleton is the best small college that you've never heard of. We've got about 2,000 students at our top-ranked Liberal Arts college and we offer special scholarships for students from places where it doesn't snow. Don't tell them that I said that. As I get started, I want to give you a little bit of context here. I'm going to be treading a very fine line in terms of the technical level of this talk, because the topics that I want to cover are inherently somewhat technical, but I'd like to make this accessible to folks who may not have encountered these technologies before. So if I start to veer one way or the other, please forgive me. If you get lost, just put your head down and I'll call on you.
01:13	I wanted to spend a few minutes talking about what the context is that I'm coming out of, the problems that we were trying to solve at Carleton, so that you can hopefully get a sense of where I'm coming from on this particular topic and maybe see some similarities with your own situation that will help you apply this a little bit better. The background is that we began to reconsider our approach to authentication when our Alumni Relations office came to us and said, "We are spending far too much time supporting alumni who are calling up with account problems." Alumni are the most significant secondary audience along with the parents, prospects, and friends and other secondary or tertiary audiences depending on how we feel about them at the moment. We've got 26,000 active alumni and they've all been issued accounts, user names and passwords, that they can use to log in to our site to conduct various bits of business with us.
02:05	So what's the problem? The problem is that they can't remember how to log in. They can't remember their passwords. They can't remember their user names. And although we have some online support tools to help them, a lot of the calls still come filtering through to our alumni office. Now, I can't put the blame entirely on our alumni because we've set them up in a difficult situation. They've got a genuine right to complain. Most of them don't log in more than a few times a year, so there's no particular reason why they should remember the credentials that we've given them. And we've pre-created all of the accounts, so we've assigned them a user name which they're not necessarily in a position to remember. It's not a difficult user name, it's just their name and class year, but, again, if you don't use an account, there's no reason for you to remember those credentials. Finally, these folks are not with us on campus, so getting support is not an easy task. There's a lot that we can do through our website, and we do, but for folks who can't or won't use the online recovery mechanisms that we have, they have to have recourse to the telephone, and that's so 20th century.
03:12	Now, there are a few ways that we could address these problems. We could, I suppose, require people to log in more, but that's probably not going to make us popular. We've talked about allowing people to choose their own login credentials, which does make some sense. It's a better solution, but it's not one that's technically easy with our identity management infrastructure. It also seems as though passwords are at least as much of a problem as user names. People forget the passwords, they forget the user names, they forget everything. We could try to increase the options that we provide for people to go through some sort of automated account recovery, but we haven't found any models that are obviously better than what we're currently doing. So we looked in a different direction entirely. This is one of Google's earliest data centers. We recognized that everyone who needs to log in with us probably has at least one account somewhere that they log in to all the time everyday, whether it's their email or their Facebook or their Twitter or some combination of all of those things.
04:10	The reason that that's important is that some of those accounts allow us to use them for remote authentication. We can use a Google account as a means to log in to our site. When we looked at our numbers, it was really pretty surprising. Of our 26,000 active alumni, we've got 19,000 email addresses in our alumni database. That's almost three-quarters of alumni for whom we have an email address on file. Of those 19,000 email addresses, 66% of them fall into one of these four email providers. It's not real surprising; Gmail's huge, 30% of our population, and you've got Yahoo!, Hotmail, AOL in there. The other 44% is an extremely long tail. There's a lot of folks for whom they're the only ones who have an address in that domain, but I really strongly suspect that a lot of those 44% who have given us some work email address or something, they've also got a Gmail account, they've also got an AOL account, they've also got some other way to log in that falls into the category of the majors.
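As a rough sketch of the kind of tally described here, the snippet below groups a list of email addresses by provider domain and reports each provider's share. The domain-to-provider mapping and the sample addresses are assumptions made up for illustration; the talk names the providers but not the exact domains or the tooling Carleton used.

```python
from collections import Counter

# Hypothetical domain groupings; the talk names the providers, not the domains.
MAJOR_PROVIDERS = {
    "gmail.com": "Gmail",
    "yahoo.com": "Yahoo!",
    "hotmail.com": "Hotmail",
    "aol.com": "AOL",
}


def provider_shares(addresses):
    """Return {provider: fraction of addresses}, lumping all other domains into 'other'."""
    counts = Counter()
    for address in addresses:
        # Take everything after the last '@' and normalise case.
        domain = address.rsplit("@", 1)[-1].strip().lower()
        counts[MAJOR_PROVIDERS.get(domain, "other")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


# Made-up sample data, not Carleton's.
sample = ["sue@gmail.com", "slim@yahoo.com", "pat@GMAIL.com", "lee@smallfirm.example"]
shares = provider_shares(sample)
```

Run against a real address export, the "other" bucket would correspond to the long tail of single-user domains mentioned above.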
05:15	So if we only rely on email addresses that we have on file, at least 66% of our alumni could log in to our site using those credentials with no additional steps. And that right there is a huge win for a single project. If we can reduce the account-related call traffic to our alumni office by 66%, they'll have more time for planning big parties involving large quantities of beer, and that means more big checks for the college, and everyone's happy. So what are we talking about, technically speaking? Unfortunately, there's not a lot of consensus about what we should call this cluster of technologies, even though they've been around awhile. You'll see the term 'distributed auth', which is the one I like, so I'm going to stick with that today. I think it pretty well describes the notion of an authentication system where the sources of identity are distributed, in the cloud, essentially.
06:04	Some folks would call this 'federated auth', but I don't think that's actually a very accurate term because there's no federation, there's no agreement between the parties that are involved, and 'federated auth' really refers to other things but the term gets used loosely. It's probably better applied to some of the Shibboleth federations, if you're familiar with that. If you've never heard of Shibboleth, it's OK. I'll give you the secret handshake after the show. There's a more recent term that's cropped up called 'Social Auth' and you see that a lot among the companies that are selling this as a service. Some of the commercial services are talking about social auth, and that reflects the fact that we're using a lot of social services like Facebook and Twitter to do this kind of authentication. But it doesn't capture the whole breadth of what we're talking about, so I'm going to set that one aside as well. I will be talking about distributed authentication as we go forward. So what exactly are we talking about? It's really a bundle of related technologies, and I'd like to pull that bundle apart piece by piece and talk in turn about the three components. Those three components are OpenID, OpenAuth or OAuth, and then some vendor-specific technologies.
07:13	I would like to get you all to a point where you understand each of those pieces, not at a really deep technical level, but you understand what they are and what they're for. And then we'll pull that bundle back together again and we'll talk about how we would implement that as a login solution on our site. Let's start with OpenID. This is the granddaddy of distributed auth. It's been around for about seven years. That's 14 years in dog years, I think. OpenID is built around the notion that you should control your own identity, that you should be able to decide who it is that can provide basically a reference for you. Anyone that runs a website can pretty easily set themselves up to accept or provide some kind of OpenID authentication. In reality, the big players are the folks like the Googles and the Yahoo!s of the net, but it's really a very democratic system. Anybody can play.
08:05	What is OpenID? How does OpenID actually work? All right, hang on to your saddles. In two minutes I guarantee you're going to be able to walk out into the hall and explain OpenID to somebody. They may look at you funny, but you're going to understand this. This is Cowgirl Sue. You can see by her outfit that she's a cowgirl. Actually, she's not a real cowgirl, but she has an outfit, and if you've got an outfit, you could be a cowgirl, too. [Laughter] Mark Heiman: Sue thinks that it will improve her cowgirl cred if she signs into slims-dude-ranch.com, which is the leading social networking site for cow persons. That's Slim. Slim says that she can log in, but only if she presents a letter of reference from somebody who can vouch for her identity, because Slim accepts OpenID authentication.
09:00	Slim tells Cowgirl Sue that Google, where she has her email address, can provide her with that letter of reference, meaning Google is an OpenID identity provider. So Sue asks Google for a reference which she can give to Slim, and Google provides that. Technically, what happens behind the scenes is that Cowgirl Sue is sent to Google's login page, and she logs in on that Google-hosted page. And she doesn't actually have to understand what's going on. She just has to do the login process, and then she gets bounced back to Slim's site, which passes him that letter of reference. When Slim receives that letter of reference, then he's got the basic information he needs to allow Cowgirl Sue to log in. It doesn't provide him with a lot. It's typically just an email address, maybe her real name, maybe a couple of other pieces of data, but it's not a lot. But it's enough. He can allow Sue to log in, and he can create an account for her on the fly.
10:03	In the short term, as long as the timestamp on that letter of reference hasn't expired and Cowgirl Sue is still logged into her Google account, she can keep coming back to Slim's site and she won't have to log in again. It will persist. Once that reference expires or she logs out of her Google account, then the process resets and she has to start over again. That's how OpenID works. Does that make sense? Everybody understand? You're all rooting for Cowgirl Sue. All right. Now there's a lot of technical jiggery-pokery that makes all this work under the surface, but you don't really want to look at that too closely unless you're that sort of geek. But it works, and it works pretty well. One more note about OpenID before we move on. In a pure OpenID implementation, you would log in using something called an OpenID identifier, which is some sort of a user name for OpenID. Now an OpenID identifier is actually in the form of a URL that identifies the site that's providing the identity, say, Google or Yahoo! or whomever, and in many cases it actually embeds some additional identity information.
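For the geeks who do want a peek under the surface, here is a toy model of the "letter of reference": the provider signs a short-lived statement about the user, and the relying site accepts it only if the signature checks out and the timestamp has not expired. This is a sketch under stated assumptions, not the OpenID wire protocol, which works through browser redirects and its own message format. `SHARED_SECRET`, `issue_reference`, and `verify_reference` are names invented for this illustration.

```python
import hashlib
import hmac
import time

# Toy model only. The secret and both function names are hypothetical;
# real OpenID negotiates its own keys and message fields.
SHARED_SECRET = b"established-between-slim-and-the-provider"


def issue_reference(email, lifetime_seconds, now=None):
    """Provider side: after the user logs in, sign a short-lived statement about her."""
    start = now if now is not None else time.time()
    expires = int(start + lifetime_seconds)
    payload = f"{email}|{expires}"
    signature = hmac.new(SHARED_SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"email": email, "expires": expires, "signature": signature}


def verify_reference(reference, now=None):
    """Relying-site side: return the email if the reference is authentic and unexpired."""
    payload = f"{reference['email']}|{reference['expires']}"
    expected = hmac.new(SHARED_SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, reference["signature"]):
        return None  # forged or altered letter
    current = now if now is not None else time.time()
    if current >= reference["expires"]:
        return None  # letter has expired; the user must start over
    return reference["email"]


# Sue's letter, issued at an arbitrary clock value of 1000 and valid for 60 seconds.
reference = issue_reference("sue@example.com", lifetime_seconds=60, now=1000)
```

The explicit `now` parameter exists only so the expiry behavior can be exercised without waiting on a real clock.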
11:10	I've got a couple of examples up here: openid.aol.com/whatever-your-username or username-whatever.wordpress.com. That's the actual identifier that OpenID is using to make that connection and tie all these pieces together. As a total aside, one of the cool things you can do because these are URLs is that you can set up a web page anywhere and insert some special HTML tags that contain one or more of these valid OpenID identifiers, and then you can use the URL of that web page as the OpenID identifier. And when you provide that to somebody who accepts OpenID, they will then follow that chain of references and go to whatever providers you've got listed so you can change those out at any time. But not a lot of people do that. It's really geeky. The fact is, these OpenID identifiers are one of the downsides of the technology. Unless you use this identifier a whole lot, it's really no better than an arbitrary user name that somebody else assigns because it's hard to remember, it's hard to type, it's long.
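The "special HTML tags" trick can be sketched as follows. OpenID's HTML-based discovery looks for `<link>` elements in the page whose `rel` is `openid2.provider` and `openid2.local_id` (or `openid.server` and `openid.delegate` in the older 1.1 style). The parser below only extracts those links from a page; it is a minimal illustration, not a full discovery implementation, and the `provider.example` URLs are placeholders.

```python
from html.parser import HTMLParser


class OpenIDLinkParser(HTMLParser):
    """Collect the href of each OpenID-related <link> element in a page."""

    RELS = {"openid2.provider", "openid2.local_id", "openid.server", "openid.delegate"}

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # A single rel attribute may list several space-separated values.
        for rel in (attributes.get("rel") or "").split():
            if rel in self.RELS and href:
                self.found.setdefault(rel, href)


def discover(html_text):
    """Return {rel: href} for the OpenID links found in html_text."""
    parser = OpenIDLinkParser()
    parser.feed(html_text)
    return parser.found


# A personal page that delegates to a placeholder provider.
page = """
<html><head>
<link rel="openid2.provider" href="https://provider.example/endpoint">
<link rel="openid2.local_id" href="https://provider.example/users/sue">
</head><body>Sue's home page</body></html>
"""
links = discover(page)
```

Swapping providers then means editing two `href` values on a page you control, while the URL you hand out as your identifier stays the same.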
12:09	Average users are not going to understand this. It's completely foreign. It's designed by geeks for geeks, and it probably works really well that way. So while sites that accept OpenID typically have some place where you can type in an identifier and use that to start the process, that's not typically the primary interface. We'll look at how that works in just a bit. But I want to move on right now to open auth or OAuth, as it is typically displayed. Now OAuth is sometimes presented as an alternative to OpenID, but it's not. It's a complementary technology. OpenID is a way of answering the question, 'Who are you?' We go to Google, we ask, 'Who is this person?' we get a response back. OAuth is about answering the question, 'What can I know about you?'
13:03	So we go to Facebook and we say, 'What information can I get about this person? Can I get this person's contacts? Can I get this person's posting activity? Can I get this person's photostream?' OAuth provides a mechanism for allowing the user to allow other services to get that kind of information. It's an important distinction. OpenID is about authentication, which is asking, 'Who are you?' and OAuth is about authorization, which is, 'What rights do I have?' Lots of services like Google and Facebook have a lot of data about their users and they've provided interfaces for programmers to query that data and do interesting things with it. OAuth provides the mechanism for the user to say, 'OK, I want to give you permission to look at that data that is available about me for some limited amount of time, and I just want to expose these pieces.' So OAuth is not an authentication technology. It's not designed to be used for logging in. But it can be, and is used, as a pseudo-authentication.
14:10	The assumption is that if I can give Google permission to share my data, I must have already identified myself to Google in such a way that Google feels comforted in allowing me to make that decision. So I can use the existence of that authorization as a sort of de facto authentication. That sound you hear is security geeks cringing. But it works. So let me explain OAuth to you in terms which have now been burned into your brain. Cowgirl Sue again and slims-dude-ranch.com, but this time, Slim is relying on OAuth instead of OpenID. So we're going to see how this is a different process. When Cowgirl Sue asks to log in, Slim has a new question for her. He doesn't want her identity. He wants the full package of the rich information that is available about her.
15:04	Now Sue doesn't carry her papers around with her, of course. She keeps them in her steamer trunk, and she keeps her steamer trunk at Google. So Cowgirl Sue asks Google to generate a key that she can give to Slim that would give him access to the trunk where her papers are stored. Again, in real life, she's been bounced to a Google login page where she logs in. She's probably given a choice to say what kinds of data she wants to expose in this particular interaction and for how long. When she is bounced back after that login back to slims-dude-ranch.com, Slim receives that key, and then his application makes a call out to Google to ask what kind of information is available using this key. And then Google generates that. Again, that could be all kinds of stuff. It really depends on what service you're querying, what kinds of data you're getting back.
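The steamer-trunk key can be modeled in a few lines: the provider hands out a random key tied to one user, a limited set of data ("scopes"), and an expiry time, and later answers requests only within those limits. This is a toy stand-in, not a real OAuth implementation; `ToyProvider` and its method names are invented for the illustration, and the real protocol adds redirects, client registration, and signed or TLS-protected calls.

```python
import secrets
import time


class ToyProvider:
    """Stands in for the service holding the user's data. Hypothetical, not real OAuth."""

    def __init__(self, user_data):
        self._user_data = user_data  # {user: {scope: data}}
        self._grants = {}            # key -> (user, allowed scopes, expiry time)

    def grant(self, user, scopes, lifetime_seconds, now=None):
        """Called after the user logs in and chooses what to expose, and for how long."""
        key = secrets.token_urlsafe(16)
        start = now if now is not None else time.time()
        self._grants[key] = (user, frozenset(scopes), start + lifetime_seconds)
        return key

    def fetch(self, key, scope, now=None):
        """Called by the relying site, presenting the key it received."""
        grant = self._grants.get(key)
        if grant is None:
            raise PermissionError("unknown key")
        user, scopes, expires = grant
        current = now if now is not None else time.time()
        if current >= expires:
            raise PermissionError("key has expired")
        if scope not in scopes:
            raise PermissionError(f"key does not cover {scope!r}")
        return self._user_data[user][scope]


# Sue exposes only her email address, for one hour, starting at clock value 0.
provider = ToyProvider({"sue": {"email": "sue@example.com", "contacts": ["slim"]}})
key = provider.grant("sue", scopes=["email"], lifetime_seconds=3600, now=0)
```

Note that nothing here tells Slim who presented the key; he only learns what the key unlocks, which is why using it as a login is pseudo-authentication.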
16:04	Once Slim has what he's looking for, then he can give Cowgirl Sue whatever access seems to be appropriate, given the information that he has gotten back. But, again, this is pseudo-authentication. We're using the implication that Cowgirl Sue has the right to this account because she's been able to give us the authorization to access that data. That may seem like a subtle distinction, and it is a subtle distinction, but I would be remiss if I did not point it out. So you understand OAuth. This is going to be really useful for future cocktail party conversations. All right. There is a third category of technologies, and this sits somewhere between and among OpenID and OAuth, and essentially they're private interfaces that are provided by various vendors that do all or some of the same kinds of things in vendor-specific ways.
17:01	Facebook Connect is probably the best-known example. It allows you to do both authentication, asking, 'Who are you?' and authorization, giving you access to different sets of data and allows you to work with various sets of Facebook data that an individual has given you permission to access. Now, I'm not going to go real deeply in any of these specific vendor technologies because the principles are really all the same, and the differences between them would require going into a level of technical depth that many of you would probably not appreciate, but understanding what you do now about OpenID and about OAuth, you can basically make the leap to some of these vendor-specific technologies and you'll be pretty accurate. So let's move on. Let's say that you're in a situation like Carleton where we have this alumni population who's having trouble with their passwords and you want to use these technologies to solve that particular problem.
18:00	Looking at the list of email providers that I had up on the chart earlier that are used by our alumni, the first thing that we realize is that some of those, like Google, are offering an OpenID interface, others of them, like Microsoft, are only offering an OAuth interface, and there may be others that are vendor-specific that we would need to bring in as well to extend this further. In order to implement this project in a way that actually covers our users, we're going to need to be implementing OpenID and OAuth and Facebook Connect. Suddenly it doesn't seem like such a small project anymore to roll all that stuff into a bundle that's usable. What's the solution? The solution is, don't do it yourself. There are a number of libraries and services out there that can handle all of the technical details, so you get a big step up, and really all you have to worry about are the details of your particular implementation.
19:00	There's still going to be some coding involved. You're going to have to invoke the developers on your team to do some hookup, but the extent of that integration is going to vary depending on what set of tools that you choose. Some of them are really kind of 'plug and play'. What these services do is that they wrap up OpenID and OAuth and Facebook Connect and all the rest of these things into a single, easily integrated package. And the nice thing is you can pay as much or as little as you want. Here's a real shortlist of some of the leading open source and commercial implementations. You don't need to write this down; it will all be up on the website. I don't promise that this list is a comprehensive list. These are the packages that I've found mentioned most frequently. It's not a real huge market yet, so I don't think there are a lot more of these. But if you've got a favorite that I've neglected, let me know and I'll stick it up before this presentation goes up. I'm not going to go in a lot of depth again about the features of each of these. It's a really pretty short list so you can look at them yourself. It's also a really fast-evolving space. Even though the underlying technologies have been around for a while, the notion of combining them into packages like this is pretty new. So these things are somewhat of a moving target.
20:14	There are some things, however, that you should think about as you are considering one of these or another one and trying to imagine what an implementation would look like. Some of these packages focus primarily on the authentication side of things, again asking the question, 'Who are you?' while others provide built-in access to the richer set of metadata about people that's available through OAuth or through the various vendor interfaces. And depending on your needs, you may want to look in one direction or the other. If you're really just worried about handling that login piece, then you don't need all the rest of this stuff. But if you want to do some richer things and draw on the social data that you might get out of this relationship, then you may want to look at some of the packages that implement that more fully.
21:00	You're also going to want to review which standards each of these various packages supports. Now everybody does OpenID, everybody does OAuth, so it's really a question of what's added on beyond that, what are the other pieces that the particular developer has decided to sweeten that package with. Again, it's really a question of what problems you're trying to solve and what service providers you need to be able to support in order to solve those problems. However, user interface in this realm is really critical. The open source solutions typically expect you to implement your user interface on your own using coding against their interfaces. The commercial solutions typically will provide you with some kind of user interface that you just can plug in, and frankly, some of those are really awful. So you want to look at that closely. And that leads me to the next point, which is, for the commercial solutions in particular, what programming interfaces or what customization options are available there? If that default user interface is really awful, what are your options for making it better? How easy is it to integrate with your existing systems? What kinds of platforms are required, what kinds of skills are needed to make those connections to do that integration, to do that customization?
22:14	Finally, the last question really is, how well-supported does this thing seem to be? Again, OpenID and OAuth are not changing, but the services that provide them and the other vendor interfaces are changing, and new services are cropping up all the time, and if the folks who are maintaining a particular package are not really on top of things, you could get left behind. We've chosen the open source HybridAuth package for Carleton's implementation for a variety of reasons. It works well in our environment in particular. But recently, for example, one of the connectors to one of the services stopped working because the vendor had changed something. I popped a note off to the developer and 24 hours later he had it fixed. That's good support for a free package.
23:02	So you want to think about those kinds of questions. Now one of the questions that should be niggling at the back of your mind is bound to be about security. Is it safe to outsource authentication for institutional users to arbitrary identity providers out in the cloud? That is absolutely a question that you should be asking. I think part of the answer has to do with your audience. Now, I don't know if this would be a really good choice for all your students or your faculty or your upper administrators, not because of any inherent insecurity in these technologies, but just because of all the added uncertainties it would add to an environment, but that's a decision you'd have to make. And maybe it is OK. Maybe this would be a huge win for you with some of those primary audiences. One guideline that we've used in order to think about what are the security implications of this for us is this: if we allow users currently to reset their passwords via a message sent to an external email provider, so they come to our page and they say, 'I lost my password. Send me a link to fix it,' and we send that to Gmail, if we do that now, we're not adding any additional risk if we allow them to use that Gmail account to establish their identity, because if that Gmail account is compromised, we've already got a compromise on our accounts as well.
24:28	So if you do that, then I think you're pretty much in the same state security-wise. If you don't currently do that, then this may be changing your profile a little bit. What we've chosen not to do is to accept identities from providers that haven't already been verified with us through some route. That's where it comes back to those 19,000 email addresses in our alumni database. Those are the ones we're going to use to verify identities, and if it's not in that database, then we're not going to use it. I'm going to talk a little bit more about that in a few minutes.
25:04	But the bottom line here from the security standpoint is that these technologies are relatively mature. There are no unaddressed exploits out there against these. It's really a question of shifting your thinking about what constitutes proof of identity for your institution. What is enough proof of identity to allow someone to log in? Now, there is one criticism that has repeatedly been raised about OpenID in particular, and you'll find this readily if you go Googling around, and I would be remiss if I didn't spend a few minutes on it. It has to do with phishing. And it goes like this: if we train people to type in their Google credentials, or Yahoo! or Facebook or whatever, whenever that login screen pops up in order to get access to stuff, that makes it all the easier for somebody to trick our users into typing their Google password or their Facebook password or whatever into a fake Google login screen, and then all of the places that have been tied to that identity become compromised.
26:16	The argument goes, therefore, it's better security-wise to retain the system that we've had all the while of having a separate user name and password for every single service that we use and keep those all separate, so that everything is compartmentalized and there's no danger of security being breached across those boundaries. That's an absolute perfectly valid argument, and I can't dismiss it. I think what it points to, though, is that eternal struggle between security and usability where 'absolutely secure and completely unusable' is over here and 'absolutely usable and completely insecure' is over here, and we all have to find our own way somewhere in the middle of this and figure out where we are comfortable being on that continuum because we can't be at either end.
27:07	I think the phishing argument is good and valid. What we have discovered at Carleton is that it's very difficult to change behaviors with regard to phishing at all, even with logins that are limited to our own site. We actually did some interesting usability testing a few weeks ago where we sat people down with a login screen and the login screen had a little note on it that said, 'Make sure, when you type your password in, that this is a login page that's being hosted at carleton.edu.' It's right there on the page for half of the people; the other half of the people we did this with didn't have that on there. And then we made those pages look as though they were hosted at some Russian URL.
28:02	Absolutely everybody typed their user name and password in to that page! So we have a phishing problem. I don't know how we address it. OpenID has a phishing problem. I don't know that it's any different than the phishing problem that we have everywhere else. It's a question of how wide the exposure is. So I have to let you make that decision on your own, but you need to be thinking about it. What are the downsides of a distributed authentication system, or the other downsides, depending on how you feel about the phishing problem? The biggest one I think is that the technologies have not really become widespread enough outside the techie community for the average user to really have any idea what the heck is going on. We did some usability testing with an early draft of our login process and we encountered some really surprising misconceptions. People were asking things like, "How do you know my Yahoo! password?" or "Can Google see what I'm doing when I use that to log in on your site? I don't like that," or even more interesting, "Can anyone with an AOL account log in to this site?" That's sort of the implication: if I can use my AOL to log in and view private alumni data, what's to keep anybody else out in the world with an AOL account from doing that too?
29:17	So there are all these misconceptions which get created when you add this distributed authentication into the mix. It's really a framing problem in my mind. Sites that support distributed authentication really need to find some better ways of conveying to people what it is that's happening, of explaining this, not simply assuming that people can figure it out. Directly related to that framing question is the fact that most of the existing user interfaces for using distributed auth are really quite confusing. As I mentioned before, the native OpenID identifiers, those URLs, they're really not at all memorable. So most sites that accept OpenID have moved to a model where the user chooses the identity provider from a long list of providers, and then that URL identifier is generated invisibly in the back.
30:09	What this leads to, though, is what's been called the 'OpenID NASCAR Problem', where you get windows packed with logos that just leave your head spinning. This example from Janrain is actually one of the better cases, but you can see there's more pages of logos back there. Some of these implementations are just packed with dozens and dozens of different logos and it's just insane. It is not a welcoming interface for users. I wish I could say that Carleton has solved all these problems perfectly, but I can't. What we have done instead, I think, is to define the problem space a little bit differently, which has allowed us to take some approaches that might not be suitable for a general public site that has to accept just any identifier from anybody at all but I think does work well in situations like ours, and maybe like yours where you have a more constrained population. I'd like to move on to a few screenshots.
31:00	These are the latest mockups of our central login page provided by our designer, Matt Ryan. Oh, the resolution is not good. I'll tell you what it says up there. User name and password boxes. Oh, why, they've vanished entirely. There's a little link there that says 'User name/Password Help'. One of the biggest challenges that we have is that we're not providing distributed authentication for all of our users, so the user name and password route is still the primary method for logging in to our site. But that's OK, because if you know your user name and password, you don't need help. Everything's good. You're not one of the problem cases that we're trying to solve. Distributed auth in that case is a bonus if it's something that's available to you. The cases that we're trying to solve are the people who don't remember their credentials. And what are they going to do when they see this page? There's only one thing they can do. They're going to click on that 'User name/Password Help' link. So that's our gateway. Well, what about folks who are already using distributed auth? They come to this and they still have to find and search for this stuff? No. I'll show you that in a moment.
32:05	Behind the 'Help' link is a branch between our two categories of audiences, those that can use distributed auth, the alumni and parents, and those that can't, the students, faculty and staff, currently. If we choose the alumni and parents path, here's the meat. So you've got two choices at this point. If you're a 'user name/password' kind of person, you can do that. You can initiate a process to have your password reset right here through the email address that we have on file for you, and we'll go through that process. But if you're feeling adventurous, you can try the distributed auth. It's right here. Now, we've dodged the NASCAR problem by recognizing that at least half of our users are on one of three email providers. So we can give a really small set upfront because we know that that space is already limited and we can make it work for most people. And if people aren't in that particular population, there's a little tiny link that says 'Other account providers' underneath there that they can click through. There's also a link there that says 'Why this is safe'. But let's say we click on Google. You get bounced to Google's login page, and on a successful login, you get bounced back to Carleton's site.
33:08	Now here's where it gets tricky. If the Gmail address that you've just logged in with is one that we have in our alumni database associated with a particular alum, we're golden. We log you in as that person, we forward you directly to the page you were trying to get to in the first place when you hit the login. However, if we have no record of that address, then we've got a problem. This is not a conventional OpenID implementation where we'll just create an account for anybody as long as they can provide an identity. We've got a predefined realm of valid users and we need to somehow verify that you're in that particular set and which person in that set you are before we can give you access to anything. So what happens if you're in that situation is that we give you a choice. We give you several choices. What you can do here, you've got three different ways you can associate that Gmail account with an existing Carleton account.
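A minimal sketch of that lookup, in Python, may make the branch easier to follow. It is not the Reason CMS code: the `ALUMNI_EMAILS` dictionary stands in for the alumni database, the username `ssmith88` is invented for illustration, and lowercasing the address before matching is an assumption about how the comparison might be done.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LoginResult:
    status: str                     # "logged_in" or "needs_association"
    username: Optional[str] = None  # existing Carleton account, if matched


# Hypothetical stand-in for the alumni database:
# email address on file -> existing Carleton username.
ALUMNI_EMAILS: Dict[str, str] = {"cowgirlsue@gmail.com": "ssmith88"}


def resolve_external_login(verified_email: str,
                           emails: Dict[str, str] = ALUMNI_EMAILS) -> LoginResult:
    """Map an address the provider has verified onto an existing account.

    No account is ever created here: an unknown address only means the
    visitor must go through one of the association paths.
    """
    key = verified_email.strip().lower()  # assumption: case-insensitive match
    username = emails.get(key)
    if username is None:
        return LoginResult("needs_association")
    return LoginResult("logged_in", username)
```

An address on file logs the visitor in as the matching account; anything else returns `needs_association`, which is where the choices described next come in.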
34:02	One is basically the password recovery process where you answer some questions about yourself, we send you an email, you confirm, and you're back, and we make the linkage. Alternatively, if you actually know your user name and password and you were just goofing around, you can enter your user name and password, we make the connection, your Gmail link is registered, and you can use distributed auth. Or you can call the alumni office and they can make the connection for you, and things are good. In any of those cases, as soon as that process is done, you're logged in and you can continue with your business. Finally, if you have another email account that you think we might have on file other than the one you just tried, you can try that, too, and see if that one works. This seems really complicated, but the beauty is you only have to do it once for any person. And once you've made that association between the external identity and your Carleton identity, everything works fine the next time you show up. And what happens the next time you show up? You get a button right at the top that says 'Sign in with this Google account', because we know you signed in with your Google account last time, so do it again. You don't have to do any extra steps. You click the button, you log in, you're good.
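The returning-visitor shortcut could be sketched roughly as follows. The talk does not say how the site remembers the previous provider, so the cookie-like dictionary, the `last_provider` key, and both function names are assumptions made for illustration. Only Google is named in the talk, so it is the only provider listed.

```python
from typing import Dict, Optional

# Providers offered on the login page. Only Google is named in the talk;
# any further entries would be site-specific.
KNOWN_PROVIDERS: Dict[str, str] = {"google": "Google"}


def remember_provider(cookies: Dict[str, str], provider: str) -> None:
    """Record the provider after a successful, linked sign-in."""
    if provider in KNOWN_PROVIDERS:
        cookies["last_provider"] = provider


def shortcut_provider(cookies: Dict[str, str]) -> Optional[str]:
    """Return the provider name for the one-click button, or None on a first visit."""
    return KNOWN_PROVIDERS.get(cookies.get("last_provider", ""))
```

A first-time visitor gets `None` and sees the full page; after one linked sign-in the page can lead with a single button for the remembered provider.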
35:09	It's probably becoming clearer a little bit what we're doing under the hood here. We're really just using that external identity as a shared key with our existing user database. So when we log you in, we're not logging you in with cowgirlsue@gmail.com. We're logging you in with the account you already have on our site, just as we would do if you had typed your user name and password. The beauty of this is that we can associate multiple identities with a single user. So if in the next phase of this process we want to start accepting Facebook IDs, we can do that just by adding a place in our alumni database where we store those IDs alongside the email addresses, and then you can use any or all of them. If we really wanted to get crazy, we could log you in to Facebook, we could check the verified email address that Facebook has associated with that Facebook account, and if it matches an email address in our alumni database, we will log you in. If you trust Facebook's email verification, it's just as good. But we probably won't do that.
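One way to picture the shared-key idea is a table of links from external identities to existing accounts. This is only a sketch: the `IdentityLinks` class, the provider names used as keys, the sample Facebook ID, and the username `ssmith88` are all hypothetical, and a real implementation would store the links in the alumni database rather than in memory.

```python
from typing import Dict, Optional, Tuple


class IdentityLinks:
    """Maps (provider, external ID) pairs onto existing local usernames."""

    def __init__(self) -> None:
        self._links: Dict[Tuple[str, str], str] = {}

    def link(self, provider: str, external_id: str, username: str) -> None:
        # Several external identities may point at the same local account.
        self._links[(provider, external_id.lower())] = username

    def resolve(self, provider: str, external_id: str) -> Optional[str]:
        # The session is opened for the local account, never for the
        # external identity itself; unknown identities resolve to None.
        return self._links.get((provider, external_id.lower()))


links = IdentityLinks()
links.link("google", "cowgirlsue@gmail.com", "ssmith88")
links.link("facebook", "1234567890", "ssmith88")  # hypothetical Facebook ID
```

Both links resolve to the same account, so accepting a new provider means adding rows rather than creating users.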
36:02	So a couple of last notes. This login infrastructure is being built on the Reason CMS framework, which will support this kind of authentication out of the box as soon as we work out the kinks. And that's all I'm going to say about Reason, except to affirm that it is indeed 'cuter'n a possum.' But anybody who'd like to talk about CMSs later, we can do that. Reason's also part of the CMS cage match in Session 4 of this track. So a few questions for you to take away. Do you have any audiences for whom this kind of approach would really be useful? Again, the characteristics that I think indicate such a solution would be really secondary or tertiary audiences who don't log in a whole lot, for whom support is remote, who have trouble with user names and passwords. And if so, do you have some data, say, in an alumni database, that you can use to bootstrap this process, essentially to make distributed login work for a lot of people without having to have a lot of initial signups because you already have the data to start this process? Finally, I didn't talk a lot about the rich metadata that's available using OAuth and some of the vendor APIs. There's all kinds of social media content that folks can choose to make available to you if they log in in those ways. We haven't really explored that ourselves, but it seems like there must be some really interesting ways that we could use that social data to increase the interactions between these audiences and our institutions.
37:21	But that is a whole 'nother presentation. So thank you for taking some time. Let's have a few questions. Yes? Audience 1: Can you tell me more about how you're branching out? Mark Heiman: Absolutely. So somebody tries to log in with an email address and we don't have that email address in our database, that's the question you're asking? Yeah, OK. Again, we've got three things that they can do at that point. They can go through a short set of forms where they give us some information about themselves, in this case it's birth date and name.
38:04	If they have no email address on file with us, then we're kind of up a creek because we have no way to make a linkage, and for that population right now, they're going to have to give us a call. And that's true right now with our current password recovery process with the regular accounts is that they don't have any email address on file with us, we have no way to verify that identity, so you've got to go through a person. I don't see any way around that, unless you really want to either expose a lot of data so that people have to answer a lot of questions about themselves and you hope that they give you the same answers as they gave you the last time, or you have to have a person involved. That's where we are. It's not perfect, but again, because we've got three-quarters of our folks with email addresses, we're in a pretty good state. Yeah? Audience 2: How did you guys, what was your internal discussion on the trustworthiness of the set of email addresses where you would find everything? For example, in our alumni database, we have email addresses for half our alums. We're a small private, about the same size as you guys.
39:08	But this is just, anybody could've walked up to our table in an event, scrawled their email address down, they've got data entered, and we're building a whole lot of trust on that. How do you guys confront that? Did you just accept that risk? And going forward as you folks know what the value and what the repercussions are, how did you guys address your comfort level with that? Mark Heiman: That's a great question. It really comes down to what your internal processes are for collecting that data. I think we are fairly confident about that data because we either collect it through in-person interactions or we collect it through some process where there has been some verification already so that they're filling out a form that they've gotten to by logging in somehow.
40:11	I guess we've got a fairly good level of trust that that data is as good as other data that we have about the person that we use to verify their identity. So if somebody calls up and says, 'I am Joe Smith from the Class of 1988 and my birth date is this,' and a couple other questions, we'll give them access to stuff. Somebody could be spoofing that information, too, as you well know. I don't think we're setting ourselves in a different position relative to the amount of access we're giving based on the data. Again, with the secondary audiences, we have to work that way. We all do it, because we don't have those in-person connections that we can really verify every single identity as we go forward.
41:03	This was driven by our alumni office, so that's our first implementation. I think after we've been running it for a while, we may reconsider. But it wasn't part of the initial scope, so we haven't really had those conversations at a deep level. We may. Yeah? Audience 4: We talked about doing this a little bit, and one of the suggestions was use that metadata to help keep records up-to-date. So they have to give their address, their phone number. Mark Heiman: Sure. Absolutely. Audience 4: So why not verify, 'Oh, is this where you have moved to?' Mark Heiman: That's a great idea. Audience 4: Based on your experience with doing it in a small scale, do users think that's creepy? Mark Heiman: You were in my presentation last year, weren't you? I talked a lot about creepy last year. [Laughter]
42:01	Mark Heiman: We haven't actually started to do that, so I can't answer that question. Right now we're really focusing just on the authentication, just on the identity piece. We would like to explore using some of that metadata to improve our data, but, again, that's outside our initial scope. Ask me next year and I may have some better data. I think we're on the creepy continuum, but I'm not sure where. Other questions? Yes. Audience 5: Do you have any numbers? Mark Heiman: Right. Sure. That's a fabulous question. What you need to know about me is that every time I propose a topic for HighEdWeb, the project that that proposal is based on gets sidetracked and doesn't actually get implemented on time. So we don't actually have any data yet.
43:05	Absolutely. So please, fill out your evals. This is the only chance you have to stop my madness. Thank you all for coming! [Applause]

Mark Heiman Senior Web Application Developer, Carleton College
