Last week, I was able to catch up with my good friend and colleague Rebecca Fraimow. Rebecca is an NDS Resident in Boston at WGBH. I’ve included some transcript excerpts from our conversation below:
Julia: It’s been about 3 months! You’re about a third of the way through your project! How do you feel?
Rebecca: That’s pretty difficult to believe.
Julia: How do you feel? Do you have a good grasp of what’s going to happen come May? What do you think?
Rebecca: Mostly I feel like I have a better sense as far as my project goes. What happens when it’s May…I have no idea! As far as doing my work right now, I feel like what these last three months have really prepped me to do is figure out what kind of questions to ask my mentors.
Julia: I know a big part of what you’ve been talking about is your LTO [Linear Tape-Open] failure? How does that fit in with your project and how are you exploring that as a research project? How are you framing that question?
Rebecca: Well, that’s a question… It’s sort of a part of my project that’s come up. It wasn’t originally designed as a part of my project because, obviously, WGBH didn’t know they were going to have a big failure. And this is not necessarily a failure of the LTO drives specifically, but a failure in transferring the material off of an LTO automated storage system over a network to a local system.
Julia: It was because people were trying to access the material that [this error was discovered]?
Rebecca: Exactly, it was because they were trying to pull the material off the network. The storage material was storage for all of WGBH, it wasn’t storage specifically for the archive. Which is sort of what motivated the change within the archives to have their own Hydra DAM system to maintain local LTO machines with direct connections to local computers.
Julia: Ok, it was because of these failures that you decided to…make the archive your own and no longer trust the institution-wide workflow?
Rebecca: Pretty much.
Julia: That’s a good idea probably.
Rebecca: It’s pretty good to have the direct access and know you can just walk down to storage and pull out the tape and have control of that process all the way through.
Julia: Was that failure discovered after you came onto the project?
Rebecca: Not after, but shortly before. It was during the summer when all these massive amounts of material were transferred to Crawford [the vendor storage company]. So I first heard about it when I first visited WGBH, I guess it was the beginning of August… And they started talking about it at AMIA and it became clear there was a lot of interest… And more than that, other people had these kinds of problems, but maybe were not as open with them. It’s unclear if the failures are with the transfers, the transfer protocols, or the LTO tapes, but these are things that people ought to know about and can learn from.
Julia: So in terms of working with this as a research question? Are you studying and acquiring all the brands of LTO tapes you had? How are you replicating older system setups? It’s not like you can replicate that network? What steps are you taking?
Rebecca: Good question. Even now, it’s still a question whether the failures were with the storage of the files earlier. We probably won’t be able to replicate exactly what led to the failures. What we are trying to do is narrow down exactly where the point of failure is. So we got the files, or about half the files, from before they were originally transferred, along with their checksums. We can run some file analysis tools on them. We can try to characterize them using tools like FFMPEG and compare the results we get.
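[Editor’s sketch: the checksum comparison Rebecca mentions could look something like the Python below. It is a minimal illustration, not WGBH’s actual tooling. The choice of MD5, the in-memory `manifest` dictionary, and the function names `md5_of` and `compare_to_manifest` are all assumptions made for this example; the interview does not say which checksum algorithm or manifest format was used.]

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """Stream a file through MD5 so large video files never sit fully in memory."""
    digest = hashlib.md5()  # assumption: the stored checksums are MD5
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_to_manifest(manifest, root):
    """Check files under `root` against a {filename: expected_hex_digest} manifest.

    Returns {filename: status}, where status is "ok", "mismatch", or "missing".
    """
    results = {}
    for name, expected in manifest.items():
        path = Path(root) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == expected.lower():
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

[A "mismatch" only says that the bytes changed somewhere between the two checksum runs. It does not say where or why, which is the question the rest of this conversation turns to.]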
Julia: But what kind of failures were these failures?
Rebecca: That’s a really good question. There were actually three different types of failures. So the files… they looked like files when they came. The files transferred all the way, but the file types were unrecognizable. Let me backtrack to before I came onto the project. When these files were initially pulled, there was just such a hurry to transfer them. WGBH was sending these drives of born-digital material created at WGBH to Crawford, the vendor partner on the American Archive. WGBH pulled a whole bunch of their own institutional files through their network and sent the hard drives to Crawford. When Crawford received some of the video files, they started to see large and unfortunate failure rates. In that first batch there was a 57% failure rate: 693 files failed in an initial FFMPEG analysis. That means that when FFMPEG tried to characterize them, it couldn’t characterize them as video files. 394 files failed QC, meaning they had issues that made the file unusable: for example, a greenscreen with no audio, or audio that is digital noise only. And then 108 failed in transcoding, where the software could not recognize the initial format well enough to rewrite it in a different format. Some of those problems are a little more straightforward than other problems.
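[Editor’s sketch: one way to approximate the first kind of check, whether a tool in the FFMPEG family can characterize a file as video at all, is to ask `ffprobe` for the file’s streams and look for a video stream. This is an illustration under assumptions, not the analysis Crawford or WGBH ran. The labels "video", "no-video-stream", and "unrecognized" and the function names are invented for this example, and `ffprobe` must be installed for `run_ffprobe` to work.]

```python
import json
import subprocess


def run_ffprobe(path):
    """Ask ffprobe for stream metadata as JSON.

    Returns the JSON text, or None if ffprobe exits with an error
    (for example, because it cannot recognize the container format).
    Raises FileNotFoundError if ffprobe is not installed.
    """
    completed = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_streams", str(path)],
        capture_output=True,
        text=True,
    )
    if completed.returncode != 0:
        return None
    return completed.stdout


def classify(probe_output):
    """Turn ffprobe's JSON output into one of three illustrative labels."""
    if not probe_output:
        return "unrecognized"
    try:
        streams = json.loads(probe_output).get("streams", [])
    except json.JSONDecodeError:
        return "unrecognized"
    kinds = {stream.get("codec_type") for stream in streams}
    return "video" if "video" in kinds else "no-video-stream"
```

[Calling `classify(run_ffprobe("some_file.mov"))` on each file would give a rough tally of files that cannot be characterized. A check like this would not catch the QC failures Rebecca describes, such as a greenscreen with no audio, because those files can still look structurally valid.]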
Julia: Interesting! So, I’m trying to understand, are they all utter failures? Are some of these failures much worse than others?
Rebecca: That’s part of what I’m trying to discover: why these files were showing problems. So I’ll be doing some testing on the files that failed with FFMPEG and other file analysis tools. Maybe more importantly, the gist of my investigation there is going to be discovering exactly where these failures occurred: whether they happened on the LTO tapes [themselves], in being transferred to the LTO [from WGBH], or whether they were damaged in transfer coming off the LTO tapes [to the hard drives]. That is, whether it was pulling them off the network that damaged them and the transfer protocol itself.
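[Editor’s sketch: if a checksum for the same file is available at more than one point along the way, finding where the damage first shows up amounts to walking the stages in order and looking for the first checksum that differs from the original. The stage names below are hypothetical labels for the points Rebecca lists, and the function is an illustration rather than her actual method.]

```python
def first_divergence(reference, stage_checksums):
    """Return the first stage whose checksum differs from `reference`.

    `stage_checksums` is an ordered list of (stage_name, checksum_or_None).
    A stage with no recorded checksum (None) cannot be judged and is skipped,
    so the damage happened at the returned stage or at a skipped stage
    just before it. Returns None if no recorded checksum differs.
    """
    for stage, checksum in stage_checksums:
        if checksum is None:
            continue
        if checksum != reference:
            return stage
    return None


# Hypothetical example: the file is intact on tape but altered after the network pull.
stages = [
    ("on_lto_tape", "abc123"),
    ("after_network_pull", "ffee00"),
    ("on_shipped_drive", "ffee00"),
]
damaged_at = first_divergence("abc123", stages)
```

[In practice the hard part is the missing data. Without a checksum recorded at each stage, the most a comparison like this can do is narrow the failure down to a range of stages.]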
Julia: Do you think your project is elastic enough that this can be absorbed into your project? You can just talk to your mentors and this is a clear priority with your mentors as well?
Rebecca: Oh, yeah. This is kind of their idea anyway – to have me incorporate this into the stage when I was going to work on the American Archive data anyway. Casey initially proposed it and I thought that was a fantastic idea and really interesting research to do. My project mentors are amazing; my project is amazing. There’s a lot of flexibility built in, which is kind of as it should be because you don’t really know what kind of problems you’ll see until you’re on the ground.
I think this is one of the keys to the NDSR Program. It has to have structure, it has to have a clearly defined product, but at the same time it has to have a lot of flexibility. An archive has to work flexibly, and you don’t really know what you’re going to find until you start doing it, see the systems that you have, and hit some major bumps in the road.
Julia: True, but at the same time at the end of our 9 months, we have very clear deliverables to define whether or not our project is a success.
Rebecca: Well…I think for all my fellow NDS residents in Boston, we’ve all seen our projects shift since we’ve hit the ground. For me, again, since my project is structured a little differently, in chunks rather than one overarching thing, it provides for more of that flexibility. My end deliverable is supposed to be an instructional webinar on everything I’ve learned in the archive, so obviously everything I do will go into that no matter what. But for a lot of my peers who’ve had large projects to work on throughout the course of 9 months, the deliverables changed and end goals changed when they got there.
Julia: It could also be a part of the design of the projects. The NY NDS projects are not diverging much…this may change, however. My project is going at a fast clip, but we also mostly work in “chunks” and phases. That could be something.
So, we’ve talked a lot about your projects and different interesting problems, but I’m interested in hearing you describe more of your day-to-day schedule. What do you do in a week? How do you organize your time?
Rebecca: It kind of shifts day-to-day. We’re all in one big room, so I see everyone every day. It’s a very, very friendly and comfortable environment. People can walk over to each other’s cubicles to ask questions, or pop up like gophers and shout across the way.
Julia: Paint the picture. Do you all get coffee and chat in the morning? Do you have regular meetings with your mentor?
Rebecca: Well, my main mentor is Casey [Davis], but during the first stage I’ve met more with Leah [Weisse]. Everyone gets in at different times, so meetings have to be in the middle of the day, like 11-3pm to not interfere with people’s lunches.
Julia: So it’s pretty friendly and informal. Do you have a dress code? That was one of the big questions when we [in New York] all started: what were the official dress codes of our respective institutions and what were typical days like.
Rebecca: No, not any official one, but I don’t wear ripped jeans to work. Or at least no stated written dress code.
Julia: How often do you see your mentors? Do you mostly interact with the 5 or so people in your department? It sounds like Casey has been really great at promoting your work within the entire organization, but I’m curious how you work with other people in your organization and department. Well for me, for example, tomorrow I’ll go in at 10, prep a workstation for an intern coming in at 11. I’ll prep for a meeting with another department and answer emails, check in with Don…Lay it out for me.
Rebecca: So, for example right now, I’m at a point of transition between projects, so it’s more meeting-heavy. I usually have 3-4 meetings a week, but tomorrow I have 3 meetings. So, I came in in the morning to set up for meetings and I set up my LTO operations because transferring data takes a long time, which you well know.
Julia: That’s similar to my schedule. Do any of these workflows that you’re creating interact or are they discrete?
Rebecca: They do somewhat intersect: the American Archive workflow will serve as a test case for the larger WGBH workflow, but at the same time they have different focus and energy. Which I like because I can take the morning to figure out what project I want to work on. Do I want to work on the workflow document? Do I want to put my headphones on to work on improving shell scripts? The specifics of how I spend my day are up to me. It really varies pretty widely.
Julia: I want to loop back to a previous conversation we had about your preliminary workflow examination and the realization that there were partial remnants of old workflows… because it wasn’t really questioned because it’s such a large, complex workflow with different inputs? Maybe no one understands the full scope of it?
Rebecca: Well, I wouldn’t say it wasn’t questioned. I work with a lot of really smart people here who at one point or another would say, “well this doesn’t need to be here.” I would say that unless you have someone dedicated to making sure change happens, it’s a lot easier to keep going with the old workflow because you don’t have to go through the work of explaining the new workflow to a large number of people. You have to keep up and you don’t have time to stop even if you know you have some issues. So that’s the benefit of an NDS Resident!
Julia: Yeah, we’re here for a very finite amount of time with a clear mandate. So you’re a beaming endorsement of the project? You really enjoy it, you find it fulfilling, you get along with your colleagues…
Rebecca: Only good things to say!
Julia: Any advice to individuals thinking about applying to the next NDSR rounds?
Rebecca: I say go for it! It’s a cool way to get a lot of different kinds of experience and to really have the opportunity to push yourself to learn; that’s the other really nice benefit. For all the work you are doing at your institutions, you are also really encouraged and in fact required to spend a large amount of time taking advantage of professional development opportunities: to continue taking classes, go to conferences, take webinars. For example, if I want to take the afternoon off to write a blog post, I don’t have to feel guilty about that; I’m supposed to be doing that! Or if I want to apply for a copyright class, I can take some time to do that because it’s related to my projects. There are some downsides, but I would definitely recommend it.
Julia: Totally. It’s the best part. Any surprises?
Rebecca: I thought that I was going to be the only resident in my cohort that wasn’t going to be creating a workflow diagram. It’s not true. There will always be a workflow! You can never escape a workflow.
Julia: Especially with these large-scale projects. I think everybody, whether creating technical writing documentation or a diagram, is essentially creating workflows.
Rebecca: You learn to love it. You learn to embrace it. How did I ever live without the workflow? I want to create a workflow for everything.
Julia: It’s strange to design workflows for phantom individuals and departments, albeit in my case, I know that the documentation will lay the groundwork for newly created positions for a new department that is now forming.
Rebecca: I thought about that too, but I spent a huge amount of time documenting an inordinately complex document. Leah came in and said she wanted to show it to everyone to show the scope of the work we do.
Julia: Workflow as an advocacy tool! Anything to help advocate. That’s great! Well, thanks so much Rebecca! I really enjoyed getting to know more about your experience so far!
Rebecca: Thanks Julia!