TPR9: Mission: Impossible - Content Management

Jason Pitoniak 
Web Services Technical Team Lead, Rochester Institute of Technology


The audio for this podcast can be downloaded at http://2011.highedweb.org/presentations/TPR9.mp3


This is one of a series of webcasts from the HighEdWeb Conference in Austin, 2011.

Jason Pitoniak: Your mission, should you choose to accept it: deploy an enterprise-class content management solution that runs entirely on your existing web environment, supports multiple sites, maintains existing user access, security, and disk quota protections, is easy for non-technical contributors to learn and use while being flexible enough to handle a wide range of sites, and is relatively simple for IT to maintain, while you're still expected to do everything else that you do.

Of course we also have no budget for this, so this pretty much started off as a... I wouldn't say a project that was meant to fail, but it didn't really look all that promising from the beginning. So, I'm Jason Pitoniak, as Jason just introduced me, and this is a true story. So, the web at RIT: this is probably where I should have put up the LOLcat saying "You're doing it wrong."

01:05

We have no central marketing department. Our University Publications office oversees our homepage and some of the second-level content on our site. But beyond that, it's all individual departments, and there is really no structure.

We have some guidelines that University Publications publishes on how you're supposed to set things up, but there is really nothing in place for anyone to enforce them or anything like that. So it's really all over the place. And no one actually has a web budget either.

So departments are kind of left to their own discretion to figure out how to pay for their websites, which leaves everyone with basically four development options: they can build internally, they can hire a student, they can hire on-campus resources that are available, or they can hire an off-campus developer.

02:04

Build internally: that usually means that you buy a staff assistant a copy of Dreamweaver, and if she's really lucky you also go out and spend $20 on a web template somewhere and give her that as a starting point.

A few colleges do have dedicated web staffs, but the other ones kind of see it as, why should I hire a web person when I can be hiring... All right, still can't hear me?

03:07

Hello? Am I on? Can you hear me? All right, I have a ton of content here, so, I'm going to just keep going whether you can hear me or not, whether they're recording me or not.

[Laughter]

All right, so the other option: hire a student. We have really talented students, obviously; our name's Rochester Institute of Technology, we are a tech school. We have students in information technology that are taking web design as a course of study.

And they have some great ideas. We also have a co-op program: most of our programs have some sort of experiential component to them, where you have to actually go out and get a job somewhere as part of getting your degree.

04:05

So a lot of departments will hire a student as a co-op for 10-11 weeks to build a website for them. And it's pretty cost effective because you pay $10 an hour, it works out to be about $5,000 for the quarter, and you get a brand-new, bright, shiny website.

But the student then leaves, and then what happens to your site? Sometimes nothing; sometimes nothing for many, many years, including content. So then, on-campus resources: our library has a group that does web design. They do it on a chargeback basis.

Their prices are probably pretty consistent with what you would pay if you went out and hired a co-op to do it. They are actually using co-ops to do their own work. I have another microphone coming.

05:03

OK now. All right. The advantage of going with ETC, this group, over just hiring a student is that you know ETC is going to follow the standards. That's not necessarily something that's important to the departments right away, but that's kind of important to the rest of us.

But they do have a very high demand and they are often quite backlogged, so it can take a while before they get anything done. Then you can go out and hire an off-campus developer.

I've seen everything from freelancers to some of the bigger agencies in town being used by some departments. But that's generally going to be your least cost-effective option. So, that leaves us with sites like that.

06:02

In the interest of full disclosure, this was actually a site that my team developed, and we just hired a kick-ass designer that did this for us. So, too fat. So, content management at RIT: in early 2003, RIT actually went out and purchased a proprietary system.

I'm not going to say what system that was, but it was very expensive. It was a full content management system, not just a web CMS, so it did a lot more than we actually bought it to do, which made it kind of needlessly complex and difficult to use.

ITS didn't have dedicated resources to run it, so it was just kind of, someone was like, "Hey, you're the CMS guy." I'm still not saying the name of the system. So, in the end it didn't really get used. It never really got pushed out and evangelized by ITS.

07:03

So, people didn't really know it was out there, those that did didn't have the time to learn it and try to get going with it. So, it never really went anywhere.

Somewhere around 2007-2008 we replaced our Digital computing environment. If you remember, DEC had probably already been gone for about 10 years by then, so it was a little bit overdue. We added a lot of new features that people wanted, like PHP 5; we were giving them PHP 3 before, and you couldn't really do anything with it.

MySQL, too. And we also switched from this crazy URL structure that we had, which came about in the early '90s when the web was brand new, where we just took your department number, put www at the end of it, and that was your web address.

So we actually went to a clean URL structure that is search-engine friendly and whatnot. And we also introduced staging and production environments, so you could build your site on one server, do all your testing and everything, and not affect the site that everyone's out there using.

08:09

So, about 2008-2009, University Publications came out with their new web guidelines, and two representatives from there are sitting right there right now; if you have any questions about that, I'm sure they would be happy to answer them.

And around the same time our web advisory committee said, "Hey, you know, it's kind of time we start thinking about a CMS again," giving people that don't really want to go out and spend money to build a website some tools so that they can get a halfway decent working site together.

And so we started our CMS selection. We looked at what was already being used in the environment, what people were installing. And we also did a survey, and unfortunately my boss at the time, when we did this, put together the survey in this really cool survey tool that we have on campus.

09:03

He's no longer head of our IT, and I had read access to the survey, but it's gone. They must have purged it when they purged his account or something like that, because I could not find the survey.

It had some interesting questions and I wanted to get that in here. But once we got the results, we kind of went through them to figure out where we were going to go. Of course we had no budget, so our solution had to be free and had to run on our current servers, because we could not go out and buy any new ones.

The solution also had to be pretty easy to use, because we didn't have any money to get trained on how to use it. But we were also told that we had to consider the system that we already had, just to see how that would weigh out. It was pretty much set up so that there was no way that was going to happen because, well, A, it ran on Windows, and we're using Solaris in our environment. So right there we had a little problem.

So, the contenders: we looked at Joomla, we looked at Drupal, and we looked at the commercial CMS, named for the colored shapes that it used in its UI, that we are still not going to name.

10:10

So Joomla was already being used by Undergraduate Admissions, and Undergraduate Admissions has a big role in our web advisory committee. So that was one that we definitely wanted to look at.

It is open source, so it was free. It’s pretty flexible and has a lot of good extensions out there. If you’re trying to do something, you'll probably figure out a way to do it with Joomla. It runs on PHP and MySQL, so it would run in our environment just fine.

But it did not have multi-site support, and to this day I don’t believe it does. Drupal was already very popular; between that and WordPress, they were like the number one and number two things that people had installed in our environment. It's very extensible; you can find modules for anything you want.

11:01

There’s also a huge developer community, so it’s not going anywhere anytime soon, but it does tend to be kind of slow and bloated, and somewhat complex if you’re a new user going into it. It’s a little daunting.

Crimson Circle CMS: out-of-the-box support for multiple sites, but it was a lot more than just a website manager, and that made it difficult to use. It ran on Windows, so that was a problem, although we already did have servers provisioned for it. And the UI really worked well in IE and not so much in anything else.

We have a pretty high population of Mac users, so that was going to cause some problems for them. And then of course there was the price, which we had no money to pay. And in order to use it effectively, you really needed to get trained, and again we had no money for that.

12:02

So the winner was Drupal. Why? Well, it’s easy enough for non-technical users to pick up with a little bit of training. It's flexible enough to power our more complex websites, our e-commerce sites and that sort of thing; we can do all that in Drupal.

E-commerce solutions are available. Students that are doing a lot of our site development, they have already been using it, they know it. So that made it a good fit for us. So our implementation team from ITS, we had a technical lead, we had a senior programmer analyst, that was me and a project manager, and then from University Publications we had two web developers that helped us out with it. 

And this core team was pretty much responsible for all the decision making and the implementation, the evangelism and training for the general community. 

13:03

We also called in some resources from other areas, mainly areas that had professional web developers in them already. We used them for things like planning and requirements gathering; we wanted to make sure that it was something that our professional teams were going to want to use, and that way, once they started using it, it would trickle out to the less technical people.

Our library was using Drupal pretty extensively already, so we tapped into one of their developers and then eventually stole him away, but we used him for a lot of training for us in the core team, for all the questions like how do we set up Views, how do we implement this module, what’s the best way to do this.

And then we also used those teams to do a couple of pilot sites to make sure we’re on the right track and make sure that everything was working before we went to the full campus. 

14:02

So now, getting to the challenges. I’d like to say we just went out and ran install.php and everything was good, but I can’t really say that. So the first challenge: our technical lead went off and became an FBI agent.

Yeah, he had applied for this job before he even came to RIT, and then got the call one day, like, 'Hey, guess what, we picked you; you’re one of two in a hundred people we were considering.' I really can’t make that up.

So that forced me to take on a lot of additional responsibilities for the project, also forced me to take on three other projects that he was working on. So pretty much I did not see home ever, but we made it through. 

Challenge two: we had to fit Drupal into our existing structure. The way that we’re set up, at the time and still today, every site that we set up has its own Unix account. And we have probably the most complicated web environment ever.

15:09

We’re using mod_rewrite. There’s a feature called a RewriteMap, and basically it’s a text file that matches a key and redirects you to whatever you have on the other side.

So we’re using all these RewriteMap files so that when you go to rit.edu/science, or I think it's cos, College of Science, it goes and redirects you internally to /~w-cos or something like that.
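The lookup described here can be sketched in a few lines of Python. This is only a model of the idea, not RIT's configuration: the map contents and the "/~account/" form of the internal path are assumptions made for illustration.

```python
# Hypothetical map contents: "key value" pairs, one per line.
# The real map files are not shown in the talk.
REWRITE_MAP_TEXT = """\
# public key    Unix account
science         w-cos
cos             w-cos
"""


def parse_rewrite_map(text):
    """Parse whitespace-separated key/value lines, skipping comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split(None, 1)
        mapping[key] = value.strip()
    return mapping


def internal_target(url_path, mapping):
    """Map a public path such as /science/x to an internal account path.

    Returns None when the first path segment is not a key in the map.
    The "/~account/" layout is an assumption for this sketch.
    """
    parts = url_path.lstrip("/").split("/", 1)
    key = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    account = mapping.get(key)
    if account is None:
        return None
    return "/~" + account + "/" + rest
```

Because the map is a plain lookup table, adding a site means adding a line to the file rather than writing a new rewrite rule.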

It’s all transparent to the end user, but it provides a challenge for us trying to get Drupal working. And using this setup, we can maintain quotas, and we can take sites down pretty easily if there is a security issue or anything like that.

16:01

So it’s fairly secure. I won’t say perfectly secure, but a fairly secure setup, and we wanted to try to keep that. So Drupal would ideally be installed once and we could set up multiple sites within one installation of Drupal, but we just had to figure out where that was going to be.

We considered putting it in the account that runs the homepage, but we were not planning on rolling out Drupal for the homepage content, and we still haven’t. And while it would be possible to mix the static homepage content that we had out there with Drupal, we figured that would get really hard to manage.

And especially since ITS maintains Drupal but University Publications maintains the content, we’d be stepping on each other’s toes all the time. That probably wasn’t the best way to go. So what if I just go in and start adding sites, or adding URLs, into the RewriteMap files manually?

17:05

I tried that and it actually worked pretty well, except that when I went and created a new account, Webman, the tool that we use to create accounts and everything, goes and rewrites these files every time we set up a new account.

So I lost all the Drupal stuff that I added. Reworking Webman, with me being the only developer available to do it, wasn’t an option at the time. We had way too much stuff going on to try to do that.

So what if I just created new RewriteMap files and stuck them in with the other ones? That actually worked really well, but it meant that everything in Drupal ended up in one account. So we lost the ability to control quotas, and it became more difficult to manage things like MySQL and whatnot, because we have this whole system set up to do it one way and we were kind of changing that.

18:02

And it also made it a little bit more difficult to take down a site if we had to, although it wouldn’t be impossible. So really what we came up with was that we needed to find a way to continue using the system we have in place now.

The way I did that was I came up with an .htaccess file that just redirected everything for Drupal into the other account, using the same process that we’re using with the RewriteMaps; everything is internal and no one knows that it’s happening. It allowed us to continue using our existing controls on our sites, and it made retrofitting content easier because we could stick Drupal on the site, but all the old content could still be there while the developers were working on moving things over.

So this is basically the .htaccess that we came up with, and if you’re familiar with Drupal, all of these are the typical files and directories that you see in a new Drupal install, for the most part; there’s a couple of other things we had to throw in for bug fixes and whatnot.

19:02

And then that just redirects to the w-drupal account where Drupal happens to be installed. And then we do have down here, it’s looking for things that don’t exist when you request something. And if there isn’t really a file there then it just sends you off to Drupal. 
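The routing decision described for that per-site file can be modeled roughly as follows. This is a Python sketch of the logic only, not the actual Apache rules from the slide: the list of core entries is the usual top level of a Drupal 6 install, and the "/~w-drupal/" path form and the `file_exists` hook are assumptions for illustration.

```python
import posixpath

DRUPAL_ACCOUNT = "w-drupal"

# Top-level files and directories of a stock Drupal 6 install (illustrative;
# the slide's exact list, including the extra bug-fix entries, is not known).
DRUPAL_CORE_ENTRIES = {
    "index.php", "cron.php", "update.php", "xmlrpc.php",
    "includes", "misc", "modules", "profiles", "scripts", "sites", "themes",
}


def route(request_path, site_root, file_exists=posixpath.exists):
    """Decide where a request inside a site account should be served from."""
    rel = request_path.lstrip("/")
    first = rel.split("/", 1)[0]

    # 1. Drupal's own files and directories live in the shared account.
    if first in DRUPAL_CORE_ENTRIES:
        return "/~%s/%s" % (DRUPAL_ACCOUNT, rel)

    # 2. Legacy content that still exists on disk is served untouched.
    if rel and file_exists(posixpath.join(site_root, rel)):
        return request_path

    # 3. Anything else is handed to Drupal's front controller,
    #    using the q= query parameter Drupal 6 uses for clean URLs.
    return "/~%s/index.php?q=%s" % (DRUPAL_ACCOUNT, rel)
```

Serving existing files first is what lets old static pages keep working while a site is migrated page by page.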

And then there’s some additional stuff there for handling when you just go to slash, without a path or a file or whatever you want to call it in your URL. On the Drupal side, we use a kind of little-known feature of PHP: there’s an auto_prepend_file directive. It's in the slide, so I don't remember what it’s called.

What that does is, you put that directive in your PHP configuration and set it to the path to a file, and every time PHP executes a script, it will actually include that file first before it does anything in the script.

20:04

So you can do all kinds of stuff if you want to tweak the environment variables or anything like that. So if you go back here, you see right up here at the top we have got this requested site, and that’s just setting an environment variable that says what URL you are coming from, because when you do the internal redirects, PHP actually sees the request coming from /w-drupal and it doesn’t know how to handle it. So we’re going the wrong way.

So here’s my prepend PHP file. I had to go searching, because Apache renames that requested-site variable when it does a redirect, putting "REDIRECT_" sometimes two or three times in front of it. So I go and I find that variable in whatever form it’s in at the time. And then I just changed SCRIPT_NAME and PHP_SELF: I did a string replace for w-drupal and replaced it with whatever that environment variable was.
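The prepend-file logic can be illustrated with a short sketch. The real file is PHP and is not reproduced in the transcript; this Python model assumes the variable is spelled REQUESTED_SITE and that Apache's REDIRECT_ prefix may be stacked several times, and the example values are hypothetical.

```python
def find_requested_site(env, name="REQUESTED_SITE", prefix="REDIRECT_", max_depth=5):
    """Look for the variable under 0..max_depth stacked REDIRECT_ prefixes."""
    for depth in range(max_depth + 1):
        key = prefix * depth + name
        if key in env:
            return env[key]
    return None


def fix_script_paths(server, drupal_account="w-drupal"):
    """Return a copy of the server variables with the shared account name
    replaced by the originally requested site in SCRIPT_NAME and PHP_SELF."""
    fixed = dict(server)
    site = find_requested_site(server)
    if site is None:
        return fixed  # not an internally redirected request; leave as-is
    for key in ("SCRIPT_NAME", "PHP_SELF"):
        if key in fixed:
            fixed[key] = fixed[key].replace(drupal_account, site)
    return fixed
```

For example, with a doubly prefixed variable set to "science", a SCRIPT_NAME of "/w-drupal/index.php" becomes "/science/index.php", so Drupal builds links for the site the visitor actually asked for.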

21:00

It worked great, and that’s what we’re continuing to use now. So the next challenge was how we protect users from themselves. We wanted to give as much control to people as we can. One of the issues is that we consider ourselves an innovation university, so we need to encourage people to be innovative, and that kind of goes against being restrictive.

As much as some of us would love to lock down our web environment and not have to worry about security vulnerabilities and all that sort of stuff that comes with having an open system, it doesn’t really fit with the mission of RIT.

History also tells us what happens if we get too restrictive: back when we had that old DEC system that didn’t do anything that people wanted to do, they were going out and buying servers and sticking them under their desks, and that created a lot more problems for us than we wanted to have.

22:00

So we don’t want to go back down that road again. So we need to keep things fairly open.  However, we knew that if we gave out too much control people would break things, possibly on purpose, possibly just accidentally. They could turn off security settings, they could change file paths to locations that don’t work. 

And then there’s always the line on ITS: why does ITS need to be in my site? I don’t want them messing with that. Well, we need to have access to the site so we can do updates and things like that. So the solution actually came from talking to the developer that replaced me at my old job at our National Technical Institute for the Deaf; it’s one of our colleges.

And they had actually created a system for this. Drupal has the concept of user one, which is kind of like root in Unix, and they just blocked anyone from being able to do anything to user one unless you were user one. So I took their code and I kind of expanded that, added some code in there that does the same kind of thing: it just hides form fields on anything that I don’t want someone messing with.

23:10

So I can control what modules people can turn on and off.  I hide other form fields that I don’t want them changing settings and stuff like that on.  And it’s working out pretty well for us, but then we ran into this other issue. 
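The two protections described here, guarding user one and hiding form fields, can be modeled in a few lines. This is a hedged Python sketch of the idea, not the NTID or RIT module code (which would be PHP inside Drupal); the field names in PROTECTED_FIELDS are hypothetical.

```python
SUPER_USER_UID = 1

# Hypothetical names for settings that site maintainers should not change.
PROTECTED_FIELDS = {"file_directory_path", "protected_module_toggle"}


def may_touch_account(current_uid, target_uid):
    """Only user one may view or edit the user-one account.

    Other accounts fall through to whatever normal permission checks apply.
    """
    if target_uid == SUPER_USER_UID:
        return current_uid == SUPER_USER_UID
    return True


def filter_form(form, current_uid, protected=PROTECTED_FIELDS):
    """Return a copy of the form with protected fields removed,
    unless the current user is user one."""
    if current_uid == SUPER_USER_UID:
        return dict(form)
    return {name: field for name, field in form.items() if name not in protected}
```

Hiding a field rather than denying the whole form keeps the rest of the settings page usable for site maintainers.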

Like I’ve said before, Drupal kind of expects to be sitting in your web root with all your sites underneath it. And we’ve got this horizontal thing where we’ve got all these accounts that we’re trying to run Drupal in from one account.

So I figured that the easy fix for that would be to just use symbolic links; we’re on Solaris, it should work fine. So this is basically what I set up: I’d create a directory in the account called Drupal files.

24:01

And where the Drupal files directory should be for a site, I just linked it to that, did some permission settings and stuff, and got it all working. Then I got a call from one of our backup operators the day after I set this up. He has no idea that we’re doing anything with Drupal, or what Drupal is, for that matter.

And I get this call and he’s like, there’s some account in the web environment and you’re the owner of the account, it’s w-drupal, and it’s causing an endless loop in the backup system. So he’s like, I had to kick it out. So we’re not backing that account up anymore. That’s not an option.

So I immediately said, it must be the symlinks. So we decided that we needed to get rid of them. So Drupal has this concept of public and private files.

25:02

Private files are actually loaded by Drupal so you have to go through the whole Drupal bootstrap and everything but it loads the file and it pushes it out to the client instead of letting Apache do that. 

So we switched all the sites, we had like three at the time, we switched them all over to use private files.  And we thought we are good, but the private files weren’t sending out proper cache headers.  And so nothing was getting cached and people started complaining their sites were really slow, the three sites that we had. 

So the headers were being sent as they should be, but they were set to expire. So we thought about this for a while.  And then one day I said, hey we’ve got that code in there that says if this doesn’t exist as a file then redirect to Drupal. 

26:02

Drupal uses /system/files for its private file URLs. What if I actually created the directory system/files and we put all the files there? So we switched to that, and basically Drupal thought we were using private files, but Apache saw the files and just served them as it normally would.

And so that worked great, almost.  We still needed to give people access to themes and there was no way of changing that themes directory.  Again, going back to the, if we don’t let people do things the way that they want to do them, they’re going to go off and do it their own way some other way. 

We needed this to work for Drupal to be successful. So I just happened to be in a meeting with one of our system administrators one day, about something really unrelated.

27:03

Why are we having this problem, can we fix this somehow? And so we did some testing and we found out that there really wasn’t a problem. I don’t know what caused the backup issue in the first place, I still haven’t figured that one out, but it wasn’t the symlinks. So we went back and put everything back in with symlinks, and we’re still using that to this day, except that the symlinks don’t push to production.

The way that our environment works, we don’t give any of our users direct access to the production environment, they have to use a webapp to move their files across. And when you try to move a symlink, you get an error. And I know exactly why that happens, I don’t really know how to fix it just because of the way that the system is implemented.  So yeah, I already said that. 

28:00

So I created a quick script, no real UI or anything, just a form where I could say: set up the site, click production, and it did everything for me. We’re actually still using a very similar system to that and it’s working great, but it was just one small step in a very complicated installation process which took a lot of time to work through, about 15 steps to install a site.

I had to create the account, create database, copy MySQL passwords from the account, we have a system for including your passwords so that you don’t actually have to have it included in your code and things like that. 

We had to copy all that into the Drupal account, setup all the stuff for Drupal and then copy database over. It was taking me about 30 to 45 minutes to do one site and that was with experience and I still was making mistakes.

29:00

And we really wanted to turn this over either to our student employees or to our service desk, so that we could just automate it: hey, I want a Drupal site, click a box, and you have Drupal.

We wanted them to be able to do it that way. So I wrote drupalizer, which was, for the front end, a little script. Version 1 was just a simple script that created an .htaccess file; that was all it did, and there was still a lot of manual stuff that I had to do.

In version 1.0 we added a few options; we could have it create directories and that sort of thing. I have a feeling I’m going to end up on the homepage of the conference site next year, as I always seem to.

[Laughter]

And then finally the version we’re using right now is a wizard type interface that has a couple of steps. 

30:03

You’d set everything up in the account and then it moves over to Drupal and sets up everything in Drupal, really got us down to six steps, create the account, upload drupalizer, run drupalizer, copy the database, and then just go in and set the site specific stuff that needs to be set, and then add user access of course. 

So I’m going to do a real quick demo to show you how easy this is. I’ve already gone and copied the database and everything. So here is my account, I have drupalizer in there, I had Internet I guess. So everything pretty much gets pulled out of environment variables.

So really all you have to do is click install, and this does the create-.htaccess-file step and all that, then finish installation to do the back end stuff in the Drupal account.

31:10

And that already exists. I forgot to clear that out. I’m setting up and tearing down sites in that account all the time, testing stuff. But anyway, now if I go back to the account, it’s just in idds/jason, instead of seeing that link to drupalizer, you now see Drupal.

And normally it would be an actual working copy of Drupal, but again, like I said, I’m always tearing things down and setting things up, so it's obviously not working at the moment, which is kind of a bummer, but that’s essentially how long it takes to set up a site now.

32:03

I have too many slides in this presentation and I’m down to 10 minutes. All right, so where are we now? Just over a year and a half since we launched, we’ve got 65 sites in Drupal right now, which is about 10 percent of all the sites that we have on the web environment. There’s no requirement that people switch to Drupal; it’s just an option that we’re making available.

So I would say that’s pretty good for a year and a half. About this time last year we had about 20 sites, so we’re moving along pretty well. A similar setup for the National Technical Institute for the Deaf has another 32 sites; that’s a team that does pretty much everything for their sites.

33:02

They wanted to switch to Drupal right after I left. They hired their new developer and a new manager, and somewhere in there they said, hey, we should be doing things a little differently, and started looking at Drupal. So they helped us out a lot and then we helped them out; we set up a multi-site setup the same way that we’re doing it. So they’ve got 32 sites in there.

One college has converted all of their individual department accounts, or at least most of them at this point, to Drupal. I guess we have two other sites that are starting to talk about it. Every college has at least one site that’s in Drupal right now. So I think those are pretty impressive numbers. We’re also working to convert the 30-plus Finance and Administration division sites that my group actually maintains, converting them all, at the direction of our vice president, into Drupal by early next year.

34:07

So, next steps: we want to get upgraded to Drupal 7. We’re on 6 right now, and we know we’ve got to do the update sometime soon; we’re trying. We’ve got some technical issues that we need to get through, but we really want to do that. We also now have some money available. We just got approved to get actual servers.

So we’re going to be looking to move Drupal off of our main web environment, do some proxy stuff, and have a little more robust environment that’s a little more locked down for the sites that are running Drupal. We’re also looking to eliminate any single points of failure, like MySQL, that we have currently.

And we want to put a reverse proxy cache like Varnish in front of it to just speed things up. And I want to do some more improvements to drupalizer, to make the whole process automated so that, like I said, we can turn it over to the service desk.

35:04

Someone calls up and says, hey, I want a Drupal site, and they can click a button and create a Drupal site. I also want to look into this idea that I have, Drupal sudo. Right now we’ve got students that are maintaining sites for us, plus myself and another full-time developer doing a lot of Drupal work, and Drupal has this user one, the one administrative account that is like superuser for everything.

So, kind of going off of the idea in Unix, how can I create something like that? The way I’m envisioning it is this: you would log in to a website somewhere, and it would set a cookie that says you’re a superuser. And then when you log in to a Drupal site, it would see that cookie and do some sort of a web service lookup to make sure that you actually legitimately have that cookie. And that would just promote you to be user one.
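The envisioned flow is only an idea at this point in the talk, so the following is a speculative Python sketch of its shape rather than anything that was built. The cookie name and the `verify_token` callable, which stands in for the web service lookup, are both hypothetical.

```python
SUPER_USER_UID = 1


def effective_uid(logged_in_uid, cookies, verify_token, cookie_name="drupal_sudo"):
    """Promote a logged-in user to user one only if a sudo cookie is present
    AND the (hypothetical) central service confirms it belongs to that user.

    verify_token(token, uid) stands in for the web service lookup; it should
    return True only for a token the central site actually issued to uid.
    """
    token = cookies.get(cookie_name)
    if token and verify_token(token, logged_in_uid):
        return SUPER_USER_UID
    return logged_in_uid
```

The server-side check is the important part of the design: the cookie alone is never trusted, since a visitor could forge it.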

36:00

I don’t know if that’s going to work or not but that’s something I want to look into. And I’m also kind of doing a side project right now, it’s part of our conversion from quarters to semesters, I have a student that’s looking at some enterprise search stuff. 

He is specifically looking to find all the places where content needs to be updated, but we don’t have a very good search for campus right now. So I’m kind of hoping that I can piggyback on the work that he’s doing to say, hey, we need to set up some decent search servers, whether it’s a Google Search Appliance or something else. He’s using an open source or free product,

I don’t know if it’s open source, called SearchBlox to do his work. I want to integrate a search so that we can do site-specific searching but also have kind of an enterprise-wide search that's much better than what we have right now. So that’s essentially what we’re doing.

37:01

Hopefully you found it useful to know some of our trials and tribulations. I like red staplers; just keep that in mind as you’re filling out your evaluation. And if you have any questions, I’d be happy to answer them. Yes? Yeah. Yeah, about that.

We didn’t consider it. I don’t really know why we didn’t. I think more because it was still more of a blog platform and would be a little bit more difficult for end users to figure out for doing more straight content, page-based types of sites. Any more questions?

38:00

Yeah, I guess I did.

[Laughter] 

Yes? Yes, it is. No, you don’t. Yeah, with RewriteMaps, you don’t. If you’re just doing straight rewrites in conf files, you do have to restart Apache, but not for RewriteMaps, and I think that’s why.

That was actually implemented before I took this job, so I didn’t have any real part in that, but I have done a lot of research into how they work and all that. And I think that was why: because we could stand up accounts and not have to go and restart Apache every time. Yes?

39:15

This was my main work for probably about a little less than a year and a half, was my main focus. Like I said I had four or five other projects going on at the same time but this was the one that took up the most of my time. 

And it did take us quite a while to figure it all out and get it working. We launched officially with our first open forum for the whole campus, to kind of launch Drupal and say, hey, it’s available. That was March 24th, 2010. And I think it was late 2008 when we started doing the work. So it took us quite a while.

40:07

Okay, well if there’s no more questions, if you do have questions flag me down, I’ll be here through, well I’m doing post conference workshops.  So I’ll be here right up until the end. 

Flag me down, or I tend to kind of track the tag; it’s TPR9. So if you tag that on Twitter or something like that, I’ll be kind of watching that, comments and stuff like that. So ask questions, and I’ll hopefully find them and answer them for you.