Podcast audio transcript

DrupalEasy Podcast S15E2 - Rosie Le Faive - Islandora (Drupal-powered Document Asset Management system)

[0:06] Welcome back to the DrupalEasy Podcast. This is season 15, episode two.

In today's episode, I'll be talking with Rosie Le Faive about the Drupal-powered Islandora document asset management system.

Before we get to that interview, let me tell you a little bit about DrupalEasy's long-form training courses.

Our beginner-focused Drupal Career Online course is now in its 12th year, and we have graduated hundreds of students during that time frame.

Class sizes are limited to no more than 12 students.

That class runs twice a week for 12 weeks starting August 28th.

You can learn more about that at DrupalEasy.com/DCO.

Our second long-form Drupal training course is our Professional Module Development course.

The full version of the course is 90 hours over 15 weeks or we have a light version which is 60 hours over 10 weeks with both versions meeting twice a week.

In this course, we cover things like developer tools, dependency injection, custom plugins, caching, and PHPUnit tests.

[1:09] During the course, we focus on writing two custom modules that make external API calls, use and define an event subscriber, a custom service class, and more.

We have received some great feedback about this course. One student from last semester said, and I quote, this course provided tons of hands-on practice with many Drupal APIs and best practices like writing automated tests.

Class begins August 2nd for the full version and a few weeks later for the light version. To learn more, just go to DrupalEasy.com/PMD.

[1:48] Welcome, Rosie Le Faive, to the DrupalEasy Podcast. Thank you for joining me today. Thank you for having me.

So, Rosie, we're gonna be talking about Islandora, which is a digital asset management system built in Drupal.

Yeah, a lot of times you see the abbreviation DAM, or a DAM system, I guess. Is that the proper way of saying it?

Yeah, that's what we call it. Um A lot of times we use the word repository, but that can be a bit misleading because obviously lots of people have code repositories in other places and things like that.

But uh yeah, technically, we are digital asset management, or often what you create with Islandora is a digital repository, right?

For documents, where a document is meant in the most generic form? Absolutely.

Any binary? Really? OK.

All right. Before we get to that, because we are, I want to kind of define what a document asset management system is a little bit more than, than what you just did.

Um You are a librarian at the University of Prince Edward Island.

Do I have that? Right. That's right.

All right. A member of the Drupal community, and I do want to mention, so Islandora is, uh, it's built on Drupal 9 currently.

Um, I imagine there are efforts to move that to Drupal 10.

Absolutely. OK. We'll say it's on the way. I don't want to get into the weeds with any of that stuff. Uh, but let's take a step back.

[3:09] And kind of define, because I know for me, I've never actually implemented a DAM.

Um I've read about them a bunch. I've seen them in action, but let's kind of define like, not only what it is but what are some of the more common use cases like who would need a document asset management system?

Sure, digital asset management system.

It's probably easiest to give some examples. So at the university where I work, we've digitized close to 100 years of an old newspaper, the newspaper of record on PEI, The Guardian.

And so you can go and browse year by year and by day, all the different, you know, all the, all the content that's in that newspaper.

So people love this for doing local history stuff um for reading the obituaries, other things like that.

And one of the cool things about Islandora is you get some great stuff for free.

So when you upload the scan, we have systems that send it out and uh it does optical character recognition or OCR.

So once it comes back, your newspaper has selectable text and you can just uh you know, copy and paste that or uh search within the newspaper for what you're looking for.

Yeah, I want to get to that bit in a minute because that's the part that kind of blows my mind about these systems.

[4:24] So you and I met in person at DrupalCon Pittsburgh a few weeks ago to talk about this, and you mentioned an acronym which I had never heard, which was GLAM.

Oh, yeah. I guess GLAM organizations often have a need for a digital asset management system. So then, define GLAM.

GLAM stands for galleries, libraries, archives and museums. So you can also say cultural heritage institutions.

Um a lot of times if you're working there, you'll want to digitize some stuff, put it online, make some sort of exhibit either, you know, highly curated, like with a little walkthrough or even just a whole pile of, of thousands or hundreds of thousands of items that people can peruse at their, at their leisure.

So we need something that's scalable.

Um We need something that can look really nice.

[5:18] Um And that's where I think Drupal comes in, because all of the theming and stuff that works with Drupal can work with Islandora, and we need something that can handle any type of file that you throw at it.

So we're focusing on, you know, audio video, but you can build, you can pretty much build any functionality with any type of file that you want to, to store in there.

So it would be, we don't have it right now, but it would be fairly straightforward to build a viewer for 3D objects and use that to show, you know, 3D models in Islandora.

I think when I first started learning about DAMs, I assumed, somewhat incorrectly (I know this now, looking back), that they were basically focused around revisions. You would upload a document, a PDF, let's say, and the DAM would basically keep track of revisions.

Like every time there's a new revision to that PDF, you can upload a new revision and, and maybe leave a note about it and it would kind of keep track of what the revisions were.

Kind of not totally unlike what the, you know, the Drupal core revision system does. But where things really kicked the door open for me (and I want to dig into this part a bit, and I'm excited about it, so I don't wanna get there too fast) is derivatives: what happens when the file gets uploaded and the things that you can do with it.

[6:34] Um which is really interesting, and what Islandora gives you kind of built in.

It is pretty amazing, but that's a tease. So before we get to that, I do want to talk about a few of the modules that Islandora uses.

One of them that I was really surprised to see in a Drupal 9 context was the Context module.

Oh, really? That's hilarious. Well, for those of us who've been around for a while and have built, you know, pre-Drupal 8 sites for a long time, the Context module,

that was a main go-to in Drupal 7 and before.

I don't know if I've implemented it at all for modern Drupal sites, you know, Drupal 8, 9, and 10.

[7:17] But it looks to be a pretty important part of Islandora.

Well, we use contexts to run all of our if-this-then-that kind of logic.

And I don't remember the precise reasons why it was chosen rather than Rules.

Um But, you know, we looked at both of those, I think, as we were moving to Drupal 8, and I think that around Drupal 8 was when Context was chosen.

And so we've written a lot of code, extensions and plugins for Context, and we've written some logic that uses hooks to make certain contexts execute when they normally wouldn't.

[7:53] And so it's because of all of that inertia that we've put into Context that we haven't moved to something else.

I think ECA might be the one that's uh that's hot right now.

But I can definitely see us transitioning to that, as I think Context might be one of our sticking points in moving towards Drupal 10.

Well, I did notice that there's a stable release for Drupal 9, which I'm assuming is the one that Islandora is currently using.

And there is a, I don't know if it's an alpha or a beta release of Context for Drupal 10, but it kind of warmed my heart a bit to know that Context still lives and that there's still active development apparently going on with it.

Absolutely. Yeah. And it's funny, a lot of the modules that I'm looking at, especially for, um, for Drupal 10 compatibility, are in alpha.

And I feel like a few years ago people used to have releases a lot more often of their Drupal modules, and things would be green, covered by that security shield.

But I'm seeing that less and less these days, which is interesting.

I feel like sometimes in the Drupal community, we go through these phases where people are almost afraid to make a full release because then they are somehow more responsible, more beholden.

[9:04] Yeah. Yeah. So the other module I want to talk about real quick is Flysystem.

Sure. So I have a lot of familiarity with the S3FS module, which is kind of like a specific use case, right?

That basically allows your Drupal site to use Amazon S3 for its files directory.

Flysystem is a more kind of open model, where Flysystem can provide that same service for Amazon S3, but also for a number of other file systems.

Yeah. And I think I know of people within the Islandora community who are using Flysystem to store things in S3.

Um and other places rather than the Fedora repository that we have as default, right?

And we were talking before we recorded, and I'm going to correct you since you corrected me: Fedora Commons repository, that's the technical name for it, I guess. Yeah. OK.

Maybe I'm not so much correcting you as just trying to show off my, my newfound knowledge that, that you recently bestowed upon me.

[10:08] All right. So let's get into like the crux of what Islandora does.

So let's use the example that you brought up of, you know, scanned images of newspaper pages from 50 years ago or how however long.

[10:24] So one would take a scanned JPEG or whatever it is, upload it to Islandora as a node or as a media item on a node.

I think that's probably how we do it. Yeah, we uh we first make a node for the kind of conceptual object of this page and its metadata, right?

And then you add a media, and that's where you upload your file, your big TIFF file.

It's like 40 megabytes or so. And then magic happens. Yeah.

[10:50] And the magic and I'll just introduce the magic and I'll let you talk about it.

Um The magic is in this: the site can be configured to automatically create derivatives of that uploaded document, that TIFF as you're saying, and those derivatives can provide additional metadata about that document.

And I guess the obvious one for a TIFF file of a newspaper page would be to automatically read the text on that and save that text into something that's more searchable, using an OCR plugin or something like that.

Yeah. So talk about that and take us through some

of the available plugins. Well, maybe plugins isn't the best word, but some examples of derivatives that are created from different file types.

Sure. I'll try to stay top level and not get too much into the weeds here right off the bat.

But we have a system of microservices, and because Islandora came from Prince Edward Island, our mascot is a lobster, and so we've named things in certain ways.

So Crayfish is our suite of microservices, and they're all just PHP programs.

[11:59] In order to talk to them in a scalable way, um we've got a queue that kind of sits in between, and then to connect that queue to the microservices,

we have an Apache Camel-based middleware service that we call Alpaca.

It is basically the one that takes those, you know, so once you upload your TIFF and you save your media, during that save hook, a context can be executed.

And if that context includes an action such as "OCR my file", then a message will be put into the queue, and then Alpaca reads from the queue and sends it to the appropriate microservice.
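
As a rough illustration of the flow just described (save hook, queue, middleware, microservice), here is a minimal in-memory Python sketch. It is not Islandora's actual code: the action names, the message shape, and the `ffmpeg-wrapper` service name are assumptions made up for the example, and a plain `deque` stands in for the real message queue.

```python
from collections import deque

# Hypothetical routing table: action name -> microservice name.
# "hypercube" is the OCR service named in the interview; the rest is made up.
ROUTES = {"ocr": "hypercube", "video_derivative": "ffmpeg-wrapper"}


def on_media_save(media, actions, queue):
    """Stand-in for the save hook: enqueue one message per configured action."""
    for action in actions:
        queue.append({"action": action, "source": media["file"]})


def drain(queue):
    """Stand-in for the middleware: read each message and pick a service."""
    dispatched = []
    while queue:
        message = queue.popleft()
        service = ROUTES[message["action"]]
        dispatched.append((service, message["source"]))
    return dispatched


queue = deque()
on_media_save({"file": "page-001.tiff"}, ["ocr"], queue)
result = drain(queue)
print(result)
```

The point of the queue in the middle is that the save itself stays fast; the slow derivative work happens later, wherever the service happens to run.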

So we've got a few, I'll try to remember all of them.

Hypercube is the name of our... whatever, these names.

I always get confused because we've got these special names for the programs that run.

So Tesseract, of course, is the OCR software that we use. It's pretty well known.

And so for the one that does that, the fourth-dimension jump, we've got Hypercube, as we call it. I'll try not to go into the rest.

Um Yeah, we've got one that wraps FFmpeg, and we use that for our audio and our video derivatives.

So I'm gonna ask a question about that because I was wondering about OK, so let's say a video is uploaded.

So is FFmpeg used to extract audio, to make the audio available separate from the video, or...? Oh, that's a great idea.

Do you want to implement that? We would definitely take a pull request.

[13:24] We don't do that yet. Um What we've done is we transcode it down to a web-friendly MP4.

So if you're sending up something that hasn't already been compressed, you know, like a raw WAV file, or, you know, if you want um your preservation object to be uh an uncompressed video, then you can let FFmpeg do that compression for you so that people can stream it online.

How about text extraction from a video? That we haven't put in yet.

But it wasn't very long ago that there was this um a new system was announced for like for audio files.

So we'd have to probably extract the audio, get an audio and put that up there.

But yeah, some pretty great leaps have been made in text extraction from audio that I think it would be really great if we could implement.

But we haven't done that yet. Let's go back to the example, the newspaper.

[14:20] Yeah. Yeah. Extract the text from a page.

Um so that becomes searchable and, and part of the, the, the node metadata, you gave me another example when we were talking in Pittsburgh.

That was text extraction with uh like the, the placement of that text on the page for me.

Sure. So the schema that we use is called hOCR. I'm not sure what the H stands for, but it is an XML format, and every element includes placement information.

And so it's pretty straightforward to have a viewer display that text kind of as an overlay on top of the image, and all the text will line up with what was read underneath.
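
To make the "placement information" concrete: in hOCR, each recognized element carries a bounding box inside its `title` attribute, in the form `bbox x0 y0 x1 y1`. The sketch below, in Python, pulls words and their boxes out of a tiny hand-written sample. The sample markup and the `word_boxes` helper are illustrative assumptions, not output from an Islandora site.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written, simplified hOCR fragment (real files carry more structure).
SAMPLE = """<div class="ocr_page" title="bbox 0 0 2000 3000">
  <span class="ocrx_word" title="bbox 120 80 310 130; x_wconf 93">Charlottetown</span>
  <span class="ocrx_word" title="bbox 330 80 450 130; x_wconf 90">Guardian</span>
</div>"""

BBOX = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


def word_boxes(hocr):
    """Return (text, (x0, y0, x1, y1)) for every word element in the markup."""
    root = ET.fromstring(hocr)
    words = []
    for element in root.iter("span"):
        if element.get("class") != "ocrx_word":
            continue
        match = BBOX.search(element.get("title", ""))
        if match:
            box = tuple(int(n) for n in match.groups())
            words.append((element.text, box))
    return words


boxes = word_boxes(SAMPLE)
for text, box in boxes:
    print(text, box)
```

A viewer can use those pixel coordinates to position invisible, selectable text over the scanned image.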

[15:02] And so that allows somebody to come in. If they're using a screen reader, they can read the text as it is. They can also select the text and copy it.

And we can also, if we wanted to, feed that into the search system. We use Solr.

So I know that ImageMagick is used as well.

I'm assuming that is just for re-encoding images in a different, more web-friendly format.

Yeah, for sure. Um Or for a more easily downloadable format, like we might make a JPEG of your big TIFF file or something like that.

[15:35] Um And it also makes thumbnails, which is useful.

Although Drupal already makes thumbnails out of images, unfortunately it's still necessary for us to do this for things like TIFFs, like the large TIFF files, or JPEG 2000s, because they don't work with Drupal's basic image library.

Yeah, you can't make an image field and add a TIFF or JP2 to it.

It's just not a compatible file type.

And then another one in there is PDF-to-text, which is pretty obvious what that does as well. Yeah. Yeah, for sure.

So what I found really interesting about this is that, if we go back to the top, a new node is created, metadata is entered, and the document, whatever that digital document is, gets uploaded.

And then as these derivatives are created, they are basically created as and correct me if I'm wrong here, but additional media items on that node, right?

So when you go to look at one of those nodes there, there is a listing that basically shows the original uploaded item.

And then here is a more web-friendly version of that item, whether it be a video or an image.

Here is the text that has been extracted from this item.

And so it kind of shows you all of the derivatives that have been created for this, for this item.

And then in addition, we talked about again in Pittsburgh the fact that nodes can be related to one another.

[17:05] And I guess using your newspaper example, I'm just gonna take a guess here.

But you could have like a, you know, January 31st, 1950 newspaper.

Yes. Yes.

And then you probably put your January 30 issue into some sort of collection that is all of the issues of that newspaper that you have, right?

Oh OK. So it goes, it goes bigger too. Yeah. Yeah. All right.

That's, that's very cool.

One other bit I want to mention about these document nodes is you showed me that the site can automatically generate citations.

[17:50] Yes. So talk about that for a moment because I thought that was super valuable. Sure.

Um So this is a kind of contributed module by our friends at the University of Toronto Scarborough, and it uses the Bibcite library from the Bibcite module and creates a mapping so that you can say, for this content type, whether it's article, page, or something else, here's how all my fields in that node map over to the citation parameters.

So that's author, publisher date, things like that.

[18:23] So once you've made that mapping, we've got a little block that we've placed and you can select one of several different citation styles.

So whether you want MLA or APA or Chicago or something else, you can upload the CSL style sheets into Bibcite's way of doing that.

And then it, yeah, it displays a copy citation for you and they're, they're usually pretty good.

That's pretty cool. I, I do want to mention a couple of other contributed modules.

Well, actually not specific contributed modules. I think, hopefully, you'll be able to tell me which ones, but this is just from my notes from when we talked.

Uh I thought it was really interesting the way that Islandora can handle fuzzy dates.

Yes. So is that strictly contributed modules, or some custom stuff as well? Or maybe give some examples as well? Yeah.

No, that's a module called Controlled Access Terms, which is a bit of library jargon.

Um Essentially, the module focuses on the idea of using taxonomy terms as what we in the library world call controlled vocabularies.

So it creates a few different file types, or sorry, field types.

[19:33] A couple of them are more focused on managing those taxonomies, but one of them is called EDTF date.

Um And EDTF, Extended Date/Time Format, is a specification from the Library of Congress that was designed for people like me, librarians, to look at a book and code the date on it.

And so it could be like 1922. You know, in Drupal, to have a date field, you'd need to say 1922-01-01.

But if all you have is that year, then we need a way to just enter that year.

So it's in a sense like Partial Date. I don't know if that module is still around in Drupal, but it was in Drupal 7.

And what EDTF date allows you to do is not only, you know, truncate your date at some point, so you could have a year or a year and a month, but you can also kind of fuzz out some digits. So you could say like

[20:24] 192X. And that would be sometime within the 1920s.

You can also indicate with some symbols, like a tilde or, I believe, a question mark (I might be wrong about that), that a date is uncertain or approximate.

And so the EDTF module is really great because it allows you to save all this complicated schema, and it validates it.

So it makes sure that you're using all these indicators correctly.

And then it also has a plugin for Solr so that you can have a date facet that's based on these kind of fuzzy dates.

And uh it will negotiate some of the uncertainty.

So if you've got like 192X, it will have, you know, all dates within the 1920s or something.

So that's already understood by the indexer. So you can make yourself a date facet, which is of course something that libraries love to have displayed on our content. You know, which date range do you want to see from?
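
Here is a small Python sketch of how a fuzzy year like 192X can be turned into a searchable range. It handles only a simplified subset of EDTF (a four-character year with optional X digits and one trailing qualifier); the function name and return shape are assumptions for illustration, and this is not the module's own parser. In the EDTF specification, `?` marks a date as uncertain, `~` as approximate, and `%` as both.

```python
import re

QUALIFIERS = {
    "": None,
    "?": "uncertain",
    "~": "approximate",
    "%": "uncertain and approximate",
}

# A four-character year whose later digits may be replaced by X.
YEAR = re.compile(r"(\d[\dX]{3})([?~%]?)")


def edtf_year_range(value):
    """Return (earliest_year, latest_year, qualifier) for a simple EDTF year."""
    match = YEAR.fullmatch(value)
    if match is None:
        raise ValueError(f"not a simple EDTF year: {value!r}")
    digits, flag = match.groups()
    earliest = int(digits.replace("X", "0"))  # every unknown digit at its lowest
    latest = int(digits.replace("X", "9"))    # every unknown digit at its highest
    return earliest, latest, QUALIFIERS[flag]


print(edtf_year_range("192X"))
print(edtf_year_range("1922~"))
```

An indexer that stores the earliest and latest years can then answer a date-range facet without knowing the exact date.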

That is wild. We also talked about enhanced reference fields.

[21:25] So reference fields with some additional metadata on it.

Yeah, we call it a typed relation. I think this is what we were talking about where, you know, when you're talking about a book or something, you might want to record the author and the editor and the publisher and an illustrator and any number of people in any number of roles.

But to create enough fields in drupal to hold all of those would have been insane.

So instead of doing that, we put a little dropdown beside the place where you record the name, and the dropdown contains what we call the relationship or the relation.
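
The idea of one reference field plus a relationship dropdown can be sketched as a small data structure. The Python below is only an illustration: the class name, the four relation types, and the sample names are assumptions, and a real site would configure its own list of relationships.

```python
from dataclasses import dataclass

# Hypothetical allowed relationship types; a real site defines its own list.
RELATION_TYPES = {"author", "editor", "publisher", "illustrator"}


@dataclass(frozen=True)
class TypedRelation:
    """One value of a single multi-valued field: a name plus its role."""
    rel_type: str
    name: str

    def __post_init__(self):
        if self.rel_type not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.rel_type}")


contributors = [
    TypedRelation("author", "A. Writer"),
    TypedRelation("illustrator", "B. Artist"),
]
authors = [c.name for c in contributors if c.rel_type == "author"]
print(authors)
```

One field holding (role, name) pairs replaces what would otherwise be a separate field per role.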

And so what we, you know, we have a big list that includes all those ones that I mentioned as well as a whole bunch of others and some legal stuff and you know, creator of the index and stuff like that.

Is that a contributed module, or is that custom? This field type is part of Controlled Access Terms.

And when you're implementing the site, you can set up what you want the options to be for the relationships.

Yeah. And this is one of those things where, you know, at least I would have to play with it to fully grasp it.

[22:32] But yeah, I did want to ask about those two things.

So I thought those two things were super interesting.

So another contributed module, or maybe it's a set of contributed modules, that Islandora takes advantage of is Workbench.

Yeah. So you wanna talk about that and the workflow stuff a little bit?

All right, I guess I would be remiss if I didn't talk about one of the big workhorses that goes with Islandora.

Um And that is a Python script called Islandora Workbench.

And it's developed by a member of the Islandora community who actually happens to be the chair of the board.

And it is a tool for ingesting large amounts of data into Drupal.

So we know everyone's familiar with Drupal Migrate and how fun it is to write YAML files and pipelines.

[23:16] And this bypasses all of that. So you can just take a CSV, a spreadsheet of your values.

It could be, you know, a CSV file or it can be a Google Sheet, and then you set up some config options and you run Workbench.

It first checks all of your data to make sure that it's OK, you know, that it conforms to whatever field type you're trying to put it in, and then it goes and runs and, as we say, ingests all your content.

And the reason that we have this is that, like I was mentioning, a lot of the time in the GLAM community, we've got tons and tons of data to do.

So, adding content one piece at a time would be really ridiculous.

And we don't always have the kind of developers on hand who are ready to help guide you through migration.

So Islandora Workbench, it's Python, you just download it onto your computer, or if you want, put it on a server that's a little bit closer, and it will bring in all of your files, it'll attach them to your nodes properly, and it'll add all the metadata that you want to your nodes.

[24:19] So this has been an incredibly powerful tool and it's really, well, it's very responsive.

So if you have a bug or run into something, the creator of this module just like, seems like every night he'll like fix the things that came up that day with it.

So we are extremely lucky to have it. And it is, yeah, it's a pretty fantastic tool that's, you know, outside of Drupal, but really works with Drupal and integrates well with Drupal's field types and other things.

So, is there like a sample

[24:50] CSV file, kind of like a template for folks to use to get started with it? Yeah, you can actually build one out of your own site.

I mean, you can look at the the documentation and it might provide a few examples.

The main, I guess the main thing is that when you are creating your CSV, you need a column for ID, and then the other columns um take the field name in Drupal.

So field_body or field_date_issued, and then there are also a few reserved fields that do special things within Drupal and Islandora.
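
A minimal Python sketch of the CSV shape just described: an `id` column plus columns named after Drupal fields. The sample rows and the `check_headers` helper are assumptions for illustration, not Islandora Workbench's own validation, and the check deliberately ignores the reserved columns mentioned above.

```python
import csv
import io

# Made-up sample input: an id column plus two Drupal field-name columns.
SAMPLE_CSV = """id,field_body,field_date_issued
001,First page of the issue,1950-01-31
002,Second page of the issue,1950-01-31
"""


def check_headers(csv_text):
    """Return a list of problems with the header row (empty list means OK)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    problems = []
    if "id" not in headers:
        problems.append("missing id column")
    for header in headers:
        # Simplification: anything other than id must look like a field name.
        if header != "id" and not header.startswith("field_"):
            problems.append(f"unexpected column: {header}")
    return problems


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(check_headers(SAMPLE_CSV), len(rows))
```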

So with Workbench... I mean, Workbench is, you know, it's used often for like the editorial process, right?

So this is a different thing. Islandora Workbench is no relation to the Drupal Workbench.

Oh OK.

My confusion there. No, that's OK. It's hard. There aren't enough words.

[25:48] OK. Let's take a step back. I actually have clients, and I'm sure a lot of people listening have clients, where this could be potentially very useful. How would one get started?

Because it seems like there's a lot of pieces here.

I know that if you go to, you know, drupal.org/project/islandora, it kind of directs you over to the GitHub.

Sure. Sure. But even if, you know, you clone the project locally, um you mentioned a bunch of microservices. Like, what is involved in getting kind of the basic install up and running?

[26:22] You know, are the, are the content types already there for you kind of fleshed out and maybe just need, you know, do they?

Is it one of those things where the set that you get out of the box is a pretty useful set for most applications, or kind of what does that look like?

Well, we try to make it that way. I guess if you're starting out, start at islandora.ca. Our project pages on drupal.org or on GitHub are probably not the best places to start, because they're really deep into the weeds with things.

So from our islandora.ca website, you can go to get a demo.

Get Islandora is the big button.

If you want to just play with it, there's a live demo.

So we have an online sandbox, or if you want a copy on your own machine, there is a desktop Docker demo that is the least-code way to install Islandora.

It uses Docker, it uses a Docker plugin called Portainer, and then you get kind of an all-in-one.

There's your Islandora, but it's not very easy to code with.

So if you're planning on going in there and installing modules and stuff, I'd suggest using either our Ansible playbook or our Docker, what we call ISLE, the Docker installation.

[27:42] And it's a little bit more thorough. You do need to open the command line in order to run ISLE, but it launches your Islandora, your Drupal, your Fedora and all your microservices as containers that are all connected to each other, right?

That's what I was. That was the question I was about to ask. Ok.

Yeah. Yeah. So all of that stuff, all of those microservices are out of the box designed to run on the same server, or you don't have to. I mean, the idea of going for microservices in the first place was that they could be remote. From a development standpoint, right?

Like if someone wanted to play with this, they could set this up.

Yeah, on their machine, have all the, you know, have all the microservices available and just have everything work. Yeah.

And then are there any like video walkthrough introductory level that you would recommend for folks if they, maybe they're not ready to install, but they do want to see like how it works and what it looks like.

And if you answer no, you've probably got a week or two before this comes out. Great question.

I know. I'm like, I've got until the podcast to get one done.

[28:52] I'll do that and I'll collect them. I mean, we've been making video content and recording screencasts and doing other things for years and years, and so they're just all in various locations right now.

So I can probably find something for you. But I'll have to check how out of date it is.

Otherwise, of course, I could just make some. Well, because I think for folks new to digital asset management systems or even microservices,

[29:19] you know, there's like a, not a fear of the unknown, but a "how much time is this going to actually take me to get up and running on my local" unknown.

Right? And I think it's very helpful for folks to be able to just see, see the process.

Like, oh, all I need to do to get started is, you know, these six steps, and I can watch the screencast of someone doing them.

So I can kind of take the fear and the unknown out of it.

Yeah, that's a great idea. Um That's kind of what I'm going for: how can people who listen to this podcast and are intrigued by what Islandora can do get started?

I mean, the live sandbox is fantastic, but, you know, if you're going to implement this for a client, you kind of need to be a little bit more comfortable with it than using a live sandbox obviously.

So how do we get over that, that, that second hurdle, so to speak.

Um Well, join the community I think would be my, my answer for that.

So we have had a very active Slack workspace and we have a mailing list on Google Groups.

Is that in the Drupal community Slack as well?

No, we don't. I mean, we cross-pollinate sometimes, but we have our own Islandora Slack.

OK. And we can find the URL to that and stuff on islandora.ca.

I assume that's right. Yep. All right. Fantastic.

[30:39] Uh, did we miss anything? I mean, I, I feel, I feel pretty good about, you know, that we covered everything.

I was hoping we were going to cover and hopefully we've gotten folks the information they need to know if they're interested in something like this or not.

Sure. Well, I guess, yeah, I do want to mention that we've got a few interest groups that are pretty active within Islandora: one for documentation, one for metadata, and one for institutional repositories, or IRs.

Uh So an IR is often where a university has a place to put all their digital theses, so dissertations and other stuff. Sometimes they might include publications by faculty members as well.

So for these,

[31:23] it's really important that they can get harvested by something like a broader national organization and that they have all that citation information ready to use.

So we've got, you know, working groups who've been talking for a while about making a lot of progress with um like the kind of content type and metadata that we have out of the box for Islandora, because right now we don't have some of the fields that they would be needing.

Another thing that has been developed within the community is modules for embargoing content.

So a lot of times, you know, you cannot release this until six months from now, one year from now whatever.

And so you will keep it private or hidden somehow embargoed and then release at a certain date.

Another way that embargoes can be active is if you only want to allow a certain IP range access to your content.
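
Both kinds of embargo described here, a release date and an allowed IP range, come down to a simple access check. The Python sketch below assumes a hypothetical `is_visible` function and uses documentation-only IP ranges; it is not the embargo module's actual logic.

```python
from datetime import date
from ipaddress import ip_address, ip_network


def is_visible(today, client_ip, release_date=None, allowed_range=None):
    """Apply whichever embargo rules are set; every rule that is set must pass."""
    if release_date is not None and today < release_date:
        return False  # still under a date embargo
    if allowed_range is not None:
        if ip_address(client_ip) not in ip_network(allowed_range):
            return False  # visitor is outside the permitted network
    return True


# A thesis embargoed until a release date.
print(is_visible(date(2023, 7, 3), "198.51.100.7", release_date=date(2024, 1, 1)))
# Content restricted to a (made-up) campus network.
print(is_visible(date(2023, 7, 3), "192.0.2.10", allowed_range="192.0.2.0/24"))
```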

And so there's a kind of contributed module that can do that. It's not part of core Islandora, but it's developed by the Islandora community.

So for fear of opening up a whole can of worms, something just popped into my head.

So I'm trying to contain this conversation. It sounds like, with the harvesting of data that you mentioned a second ago, I couldn't help but think about schema.org and the Schema.org Blueprints module, right?

And it seems like there could be some cross-pollination here.

Where... Oh my gosh, yeah. Where Islandora...

[32:50] So I guess my first question is: when the Islandora community created these content types and these media types with all of their various metadata fields,

[33:05] was the schema derived from, or with the influence of, schema.org at all?

It was actually devised... what we call the Islandora Repository Item,

um it's our content type with the full, like, big suite of metadata.

Um It was devised so that people could migrate stuff from Islandora 7, which used a different, like a library-based metadata format called MODS, M-O-D-S.

[33:34] Because of that, there are a lot of what look like, you know, repeated fields.

We've got about five different date fields, whether it's date created, date, issued, copyright date, et cetera.

And so they don't necessarily map to the schema.org stuff.

But what we do have is an RDF mapping that takes them into JSON-LD.

OK. So basically the same. So we are eventually putting out JSON-LD.

It might not be in the same way. So that JSON-LD then can populate a Fedora repository, could populate a triplestore if you have Blazegraph or something else set up, and can be read and harvested.
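
As a loose illustration of an RDF mapping that ends in JSON-LD, the Python sketch below maps a few Drupal-style field names to Dublin Core terms and emits a JSON-LD document. The field names, the mapping, and the node URL are assumptions for the example; Islandora's real mappings are configured per site and may use different vocabularies.

```python
import json

# Hypothetical mapping from field names to Dublin Core terms.
FIELD_TO_PREDICATE = {
    "title": "dcterms:title",
    "field_date_created": "dcterms:created",
    "field_date_issued": "dcterms:issued",
    "field_copyright_date": "dcterms:dateCopyrighted",
}


def node_to_jsonld(node_url, fields):
    """Build a JSON-LD document from a node's field values."""
    doc = {
        "@context": {"dcterms": "http://purl.org/dc/terms/"},
        "@id": node_url,
    }
    for field, value in fields.items():
        predicate = FIELD_TO_PREDICATE.get(field)
        if predicate is not None:  # unmapped fields are simply left out
            doc[predicate] = value
    return doc


doc = node_to_jsonld(
    "https://repository.example.edu/node/1",
    {"title": "The Guardian", "field_date_issued": "1950-01-31", "field_internal_note": "skip me"},
)
print(json.dumps(doc, indent=2))
```

Several distinct date fields can coexist because each maps to its own predicate.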

And we also have a module, again developed by somebody in the Islandora community but not owned by Islandora, that's called REST OAI-PMH.

So OAI-PMH is a protocol for harvesting that I know, like, Library and Archives Canada uses to get the theses that we put in our institutional repository.

And so we can use either a template- and Views-based mapping or this RDF mapping uh in order to populate the material that can be seen through the OAI-PMH protocol.
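
For a sense of what a harvester sends, OAI-PMH requests are plain HTTP requests with a `verb` parameter, such as `ListRecords`, and a `metadataPrefix`, such as `oai_dc` for Dublin Core. The Python sketch below only builds such a URL; the base URL is a made-up placeholder, and the endpoint path on a real Islandora site may differ.

```python
from urllib.parse import urlencode

# Placeholder endpoint; a real repository publishes its own base URL.
BASE_URL = "https://repository.example.edu/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # harvest only records changed since this date
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL, from_date="2023-01-01")
print(url)
```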

OK. Wow. OK.

[34:55] So the bottom line is, you know, there is a way to make this data easily harvestable.

Oh yeah, like, we make it to share. That's why most libraries, archives, galleries and museums are in this.

How many sites... do you have an idea of how many organizations are using it?

Oh, I want to say like 150 sites. I'm not sure if that counts like institutions having multiple sites.

I mean, I don't think there's, I don't know, yeah, there are many worldwide.

Yeah, we have a contingent in Europe as well and we used to do so we used to do in person camps about twice a year, one in North America and one in Europe, we kind of stopped doing that during the pandemic.

And we're trying to figure out how to start up again with what kinds of, what kinds of events we want, digital or in person.

And, well, this has been great, very enlightening for me personally.

I've learned a lot. Great. Well, thank you.

Yeah, I appreciate you coming on and sharing your knowledge and the story behind Islandora.

And I hope that we get some folks more involved in the Islandora community. Great.

Well, thanks for having me on here. This has been really fun.

Thanks, Rosie. Take care.

[36:11] Thank you for listening to the DrupalEasy Podcast. Don't forget to check out all of our long-form Drupal training courses at DrupalEasy.com, and stay tuned for the next episode of the podcast, where I will be talking with Matt Glaman about PHPStan and some other really cool stuff that he's been working on.

[36:29] Music.

July 03, 2023