Daniel Horowitz was one of the many genealogists I got to speak to while at RootsTech. He's a board member and the webmaster for IAJGS, the International Association of Jewish Genealogical Societies, which is the umbrella organization for the various Jewish genealogy groups throughout the world. He's also an employee of MyHeritage, one of the major genealogy vendors. And I even ran into him on the final night of the conference, at genealogy blogger , where we ended up at the same table.

Daniel interviewed me about RootsTech and LeafSeek on the floor of the Expo Hall. The interview was shared in a recent IAJGS webinar and was just posted on their website. In it, I explain what LeafSeek is, and how it came about. Here's the video!

Added March 24th, 2012: And here's a transcription of the video!

Daniel: Hello. I'm here with Brooke Ganz, the winner of the second prize on the RootsTech technology contest. So Brooke, if you can tell us a little bit about the contest and what you did.

Brooke: Okay. I entered the RootsTech Developer Challenge, which was announced a month or two before, sometime before the actual conference, and they were looking for new solutions to existing technological problems in the genealogical area. Anyone was able to submit an idea and some code. So I had been working on some code for the genealogy group Gesher Galicia, which concentrates on historical records from the former Austro-Hungarian province of Galicia, which is now split up between Ukraine and Poland. So the records ended up everywhere, and some of them are in Austria, some of them are in Poland, some of them are in Ukraine, some of them are in New York, some of them are in Israel. It just scattered. So the group exists. It was founded in 1993 to bring all the records together in an easier way for people to research their ancestors who were from Galicia or Galitzianer, which is the majority of my family, which is why I was interested.
So the old problem used to be that we had no way of getting these records because they were stored all over the place. Bit by bit now we're getting records. We have copies of old phone books that have been transcribed. We have copies of old tax lists from certain towns. We have some birth records for certain towns, Jewish birth records. We have landsmanshaften records, which were from YIVO Archives from New York City. People who came to New York and founded membership societies for people from certain Galician towns. We have all these different forms of data coming in, which is great.

The problem was we had no way to share the data in a consistent way with our members, and that was a problem because we couldn't just really send around spreadsheets. So I went looking several months ago for a way to sort of create an online database for all these things, like an All Galicia Database, which we had talked about for years doing and it just sort of never came together for various technical reasons and other reasons. I couldn't find a technological solution that was working really well. Steve Morse has wonderful One-Step tools to make your own one-step database, and they work really well if you have smaller datasets or you just have a couple databases. But we had something like 190,000 records in 60 different datasets, and more and more are coming in every few months. We're adding like 10 more in the next few months. So it was going to be too much trouble for us to have many individual, separate, unconnected databases without an overarching schema.

So I found something online after doing a lot of research called Apache Solr. Apache Solr is a free open source search platform which is available through the Apache Foundation. They're more famous for doing the Apache web server, but they also do Apache Solr. The more I learned about Solr, the more I thought this is great. This is a great way to form an integrated database that searches everything at once.
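To make the idea of one search across many differently shaped datasets concrete, here is a minimal Python sketch. It is not LeafSeek's code and it does not use Solr; it only illustrates the general approach of mapping each dataset's own columns onto shared field types and then running a single wildcard query over all of them. The dataset names, column names, and sample names below are all invented for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from each dataset's own column names to shared field types.
# Columns that are not listed here (e.g. "town") are simply not indexed by type.
FIELD_TYPES = {
    "tax_list": {"surname": "surname", "given_name": "given_name"},
    "birth_records": {
        "child_surname": "surname",
        "mother_maiden_name": "surname",
        "child_given_name": "given_name",
    },
}


def index_record(dataset, row):
    """Turn one spreadsheet row into a document keyed by shared field type."""
    doc = {"dataset": dataset}
    for column, value in row.items():
        field_type = FIELD_TYPES[dataset].get(column)
        if field_type and value:
            doc.setdefault(field_type, []).append(value)
    return doc


def search_surname(index, pattern):
    """Return every document with at least one surname matching a wildcard pattern."""
    pattern = pattern.lower()
    return [
        doc
        for doc in index
        if any(fnmatchcase(name.lower(), pattern) for name in doc.get("surname", []))
    ]


# Invented sample rows from two differently shaped datasets.
index = [
    index_record("tax_list", {"surname": "Adler", "given_name": "Moses", "town": "Sampletown"}),
    index_record(
        "birth_records",
        {"child_surname": "Roth", "mother_maiden_name": "Adlerstein", "child_given_name": "Leah"},
    ),
    index_record(
        "birth_records",
        {"child_surname": "Weiss", "mother_maiden_name": "Katz", "child_given_name": "Chaim"},
    ),
]

hits = search_surname(index, "adler*")
for doc in hits:
    print(doc["dataset"], doc["surname"])
```

In this sketch the query finds the tax list entry by its surname column and the first birth record by the mother's maiden name, because both columns were mapped to the same shared surname type. A real search platform adds much more on top of this, such as phonetic matching and ranking.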
It understands, say, a mother's maiden name and a child's name on a birth record and a name on a tax list -- they're all a surname and they're all a surname type even if there are multiple surname types in the records. You can define a given name type, and you can do different things based on what that type of data is. You can form one search interface for the entire record set.

Daniel: So you create a search engine for all the records that you have.

Brooke: Yes, a search engine for the entire set of records.

Daniel: Okay.

Brooke: Even if the individual record sets are very differently shaped. Tax lists might only have like four columns of data in the spreadsheet. But a marriage record set could have fifteen columns of data: the groom's first name, the groom's surname, the bride's first, the bride's surname, the congregational number, the house number, the date. You have a million things.

Daniel: Okay.

Brooke: So we needed this overarching thing and Solr was it. So I customized Apache Solr, with the addition of a front end built in PHP on a platform called Solarium, and I packaged it all together and I put it together so that it worked for genealogy records and historical records. It knew what these things were and sort of smartly handled all the different problems we had, like wildcard searching and phonetic matching and alternate names -- especially since Jews have tons of alternate names, from Hebrew versions and Yiddish versions to secular versions. I figured all these things out, and then I decided I should submit this to the RootsTech Developer Challenge.

Daniel: And you win the second prize.

Brooke: Yeah, there's that too.

Daniel: Yeah. So what was the prize exactly? What did you win?

Brooke: I won second prize, which was really exciting. There were a lot of great entrants. I won a very nice paperweight, which has my name on it, and I also won $3,000.

Daniel: Okay, great. That's very good for you.
Brooke: Some of which I'm going to be giving back to genealogical societies and to groups like the Free Software Foundation and the Electronic Frontier Foundation, which promote open source and free open records access on the Internet, which is a cause near and dear to my heart.

Daniel: Very nice from you.

Brooke: And some of it I'm keeping for myself.

Daniel: You should, you should, of course.

Brooke: So I released it, and I needed to find a name, because I had to call it something other than 'that thing I built for that database and I think other people could use this too and I don't know what to call it'. So I came up with the name LeafSeek, because you would search for a leaf on a tree. This is not a method of publishing your family tree. There are a million ways to publish your family tree. In fact, I hear MyHeritage has a very nice way of publishing your family tree. Isn't that right? But this is a way to publish the data, the forest, all the data for an area that you're interested in or an ethnic group or a section or a historical time period or whatever. And it's free! So I packaged it all up, and I made a website explaining how to install it and all the feature sets, and it's on LeafSeek.com. You can go and download it, and there's installation instructions. You can use it for free, open source.

Daniel: So there is a note for all the webmasters that have a lot of data...

Brooke: Yes.

Daniel: ...and lists of people that they have already a source and a code that they can use and do a really nice search engine for them.

Brooke: Right. Yeah. I don't want other people to have to reinvent the wheel. So there are other groups, whether you're a formal JGS or whether you're a Special Interest Group or whether you're just a researcher with an interest in a certain little area of the world or a certain something, something that's important to you and you need to search all the data no matter how big the data is, this is the way you can do it.
Daniel: And that is one of the goals of RootsTech. Besides the prize, of course, how was the conference for you?

Brooke: RootsTech has been really great. I really wanted to go last year when I heard about the first year of the conference, but I had just had a baby. I was at home, and so I followed as much as I could online and I read Twitter. I promised myself next year I'm going to go. I'm just going to get myself there, and then maybe I'll enter that Developer Challenge thing too. So I did enter and I did make it here. My husband is at home with the kids right now. It's been a really great experience here at RootsTech. I'm really glad I came. I've met some wonderful people and had a great time here in Salt Lake City. I would definitely come back next year, and I would encourage members from the Jewish genealogy community especially to come to RootsTech and be present and see the wide range of offerings not just from vendors and companies but the amazing talks that are here.

There have been two tracks of talks. There is the user track, which is more for people who are just using technology, instruction on how to use Google for your genealogy, how to use tagging, how to use Twitter, and how to use all sorts of things to help you with your genealogy. But technology to help you with genealogy. But there is also, which is more interesting to me, the developer track, which is people who actually make these tools on a daily basis, who are interested in building APIs. Talks on LeafSeek, or open source GEDCOM parsers, which we've never had an open source for them before. That was something that came out here. An open source common synonym name finder. All these great things that people are coming here and sharing and talking about. And it's been so great to meet people and have this experience. I'm really glad I came.

Daniel: Okay. Thank you very much Brooke. So now we know that RootsTech has something for everybody related to genealogy.
It doesn't matter if he's a techie or not, and he's involved or not with technology programming or companies.

Brooke: I would encourage JGS webmasters and especially JGS presidents to come here, because I think there's been a lot of discussion among people who have similar challenges about how to deal with technology on their websites, how to make things useful to their members and provide value to the members but also keep records open. There have been a lot of people talking about that here, and so I think here is where you'd want to be to get ideas and to get results from other people and to get the idea about best practices.

Daniel: Okay. Thank you very much Brooke, and again congratulations.

Brooke: Thank you.

