Audio, Image and video keywording. By people and machines.


Tagging.tech interview with Jason Chicola

Listen and subscribe to Tagging.tech on Apple Podcasts, AudioBoom, CastBox, Google Play, RadioPublic or TuneIn.



Henrik de Gyor: This is Tagging.tech. I’m Henrik de Gyor. Today I’m speaking with Jason Chicola.

Jason, how are you?

Jason Chicola: Doing great, Henrik. Thanks a lot for taking the time.

Henrik de Gyor: Jason, who are you and what do you do?

Jason Chicola: I’m the founder of Rev.com. Rev is building the world’s largest platform for work-from-home jobs. Our mission is to create millions of work-from-home jobs. Today, we have people working on five types of work, jobs they could do in their pajamas. The main ones are audio transcription and closed captioning. Several of my co-founders and I were early employees at Upwork, which is the largest marketplace for work-at-home jobs. Rev takes a different approach than Upwork. With Rev, we guarantee quality, which means that the task of managing a remote freelancer and hiring the right one is something that our platform excels at. What that means is our customers have a very easy-to-consume service. You can think of us as Uber for work-at-home jobs. So if you wanted to come to us to get, for example, this call transcribed, as a customer all you have to do is upload an audio file to a website, and then a couple of hours later you’ll get back the transcript. Now, behind the scenes, there is an army of freelancers doing that work, and we have built our technology to make their lives easier and make them productive. If I zoom out from all of this, I look at the world and see a lot of people who are sitting in cubicles who probably shouldn’t have to, who are stuck in traffic and shouldn’t have to, and I look at all the kinds of jobs people do today at a computer. How many of those jobs need to be done in a cube farm? How many of them could be done from home? We think many of them can be done from home. Our mission is to give more people the opportunity and the freedom to work from home, which allows them not only to choose the location but also gives people more control over their lives, because they can decide whether they want to work in the morning versus the early afternoon. It means you’re not tied to a single boss or employer.
It means that you can work on one skill on Monday and a different skill on Tuesday, and go surfing or hiking on Wednesday if you feel like it. So that’s how we think about our business: it is really centered on giving people the freedom that comes when they can be their own boss and work from home. And as a segue to some of your next questions, which you and I discussed in the past: as we got deeper and deeper into creating jobs for transcriptionists, we have invested in technology to make their jobs easier and to make them more productive. That has led us to develop some competency and familiarity with what you’re calling here AI transcription, which means using a computer to transcribe audio. That is what I would call a relatively new area for us, and an important area, especially in light of people being familiar with Amazon Alexa and Apple’s Siri. So that’s a new, small business, but the core is giving people work they can do with a computer. Most of that work today is listening and typing.

Henrik de Gyor: Jason, what are the biggest challenges and successes you’ve seen with AI Transcription?

Jason Chicola: It’s really early to judge that. I can give you a specific example in a moment. But it’s a little bit like asking someone today what the biggest challenges and successes of self-driving cars are. The answer is, I think, that the business cases to date have been small, but the possible successes in the future could be massive. I really believe that we’re not even in the first inning. Maybe we’re in the warm-up for the first inning of this game, and I think it is going to be a pretty exciting decade ahead of us as computers have gotten better, as more audio is captured in digital formats, and as companies like Rev are innovating in a bunch of areas. We have had success in this area, but it’s been at the fringes of our business. I’ll give you a specific example. When Rev transcriptionists type out an audio file, as somebody might for this phone call, some customers request timestamps, and part of the human’s job is to go in and note, for example, at the end of every minute: this event occurred at three minutes, this event occurred at four minutes, and so forth. That was an additional task they performed manually while they did their job. We automated that using what you could call AI transcription. So now not only are timestamps inserted automatically, but every single word is marked by the AI with when it was said. So literally for every single word, we know this word occurred at 4:38 and that word occurred at 5:02. So that’s something we’ve done that automated something previously done manually, and it actually made for a much better experience for the customer because the timestamps are more accurate. That’s something we already have today. The challenge list is longer. The biggest challenge to be aware of when it comes to automated transcription is that it’s garbage in, garbage out.
Other people say you can’t make chicken salad out of chicken [****]. If you go to Starbucks and sit outside by a noisy street and record an interview with someone you’re talking to for a book, and you submit it to some automated engine, you’re not going to get back anything that is very good. I mean, it’s obvious why that is, but the quality of speech recognition depends, I would say, on three or four key factors other than the quality of the [speech recognition] engine itself. One is background noise. The less the better. Another is accents. The less the better. Another is how clearly the person is speaking. Are they enunciating? Are they slurring their words together? Are they speaking really quickly? Those tend to be the major factors. There is probably another one related to background noise, which comes down to the quality of your microphone and how far you are from the microphone. You are a podcaster, so you probably know far more about how to record clear audio than most people do. Most people throw an iPhone onto a table next to somebody else who’s eating a bag of Doritos. [laugh] So you have great audio of someone eating a bag of Doritos, which causes problems downstream, and some of those people, because they don’t think about it, will say, “Hey, I’m really annoyed. You didn’t get this word right.” And that’s because somebody was eating a bag of Doritos at the time that word was said. So as we try to get better at helping people transcribe quickly and cheaply, part of our job is to help customers understand that you need to record good audio if you want to get a good outcome.
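
The per-word timing Jason describes earlier in this answer can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not Rev's implementation: the `TimedWord` type, the function names, and the idea that an aligner hands back a start time in seconds for each word are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    text: str
    start_seconds: float  # assumed to come from some speech-to-text aligner


def format_clock(seconds: float) -> str:
    """Render a time in seconds as M:SS."""
    total = int(seconds)
    return f"{total // 60}:{total % 60:02d}"


def insert_minute_stamps(words: list[TimedWord]) -> str:
    """Join words into text, adding a [M:SS] marker whenever a new minute starts."""
    pieces: list[str] = []
    last_minute = -1
    for word in words:
        minute = int(word.start_seconds) // 60
        if minute != last_minute:
            # First word of a new minute: emit the marker before it.
            pieces.append(f"[{format_clock(minute * 60)}]")
            last_minute = minute
        pieces.append(word.text)
    return " ".join(pieces)
```

Because every word carries its own time, the once-a-minute markers that used to be typed by hand fall out of the data for free, and finer-grained markers would only need a different grouping rule.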

Henrik de Gyor: Jason, as of January 2018, how much of the transcription work is completed by people versus machines?

Jason Chicola: Are you talking about the work that Rev does?

Henrik de Gyor: Sure.

Jason Chicola: Depends on how you slice it, but I’ll say 99 percent people, 1 percent machine.

Henrik de Gyor: Fair.

Jason Chicola: We actually have… I’ll be a little more clear on that. We recently released a new service called Temi, at Temi.com. That is an automated transcription service where people are not doing the work; machines are. Our core service, Rev.com, is done basically entirely by people. We believe that that’s required to deliver the right level of accuracy. I don’t want to answer your next questions, but we clearly see these two blending and merging a little more over time. Today, though, if you want to get good accuracy, you need people to do it. To give you the external context: if an earnings call is being transcribed for Wall Street analysts and a machine does it and makes a mistake on, you know, a key number, or on whether the CFO said that something happened or didn’t happen, that’s a big problem. Or if a movie is captioned for HBO. Game of Thrones is captioned by HBO. Those captions need to be right. So for any use case where people want a transcript that is accurate today, they need to have people in the loop.

Henrik de Gyor: Makes sense. Jason, as of January 2018, how do you see AI transcription changing?

Jason Chicola: I think the most obvious change that people expect, and one that has probably been slower in coming than people expect, is that the machine is going to help the man in this proverbial battle of man versus machine. Wasn’t there a kids’ story about John Henry, the guy that fought the train? And you know, there is this narrative in popular culture that robots are going to take our jobs. And there are sectors where that has been a big problem for people. I think that there’s a broader trend in more industries where technology seeps into our day-to-day lives in little ways that help us eliminate the parts of our jobs that suck: printing photocopies, running to Office Depot, and, you know, changing formats for documents. Those things go away. What I expect to see is more transcription happening in a two-step process, where first the machine takes a cut, then the human tweaks it.

Jason Chicola: There are some companies that have tried this in the past that by and large didn’t do well because their quality sucked. There are some companies that do this well in the medical transcription space. But the trend that I would encourage your listeners to look for is one that is not my idea. There’s a book called Race Against the Machine. It’s written by a couple of academics out of the MIT Sloan School, and they talk about the rise of automation and AI and how it’s going to affect the economy and jobs broadly. In the last chapter, they conclude that rather than having a showdown against the machines, the best companies are going to be the ones that find a way to, in their words, “race with the machines”: the machines should help people do their jobs better. So I would look for examples where software can make people better at their jobs. I think that is the trend to keep an eye on. There are people who say, “AI is going to do everything.” Well, to those people, I would say: are you saying that quality is going to stop mattering? Is HBO going to stop caring about the quality of the captions? Is the Wall Street trader who is reading earnings calls as they come in, so he can decide whether to buy or sell a given stock, going to want more mistakes in these documents? I don’t think so. I think these people want accuracy, and I think humans are needed to deliver accuracy.

Henrik de Gyor: Jason, what advice would you like to share with people looking into AI transcription?

Jason Chicola: Well, it depends on what their objective is. But if somebody has audio recordings or meetings or dictation that they want to use productively, I would certainly recommend that they try our service, Temi.com. In fact, right now, if you download our mobile app, which is available on both iOS and Android, all transcriptions submitted through our mobile app are free for a limited time. I want to repeat that. You can get free unlimited transcription for a limited time through the Temi mobile app on iOS and Android. That’s a good place to start because it doesn’t cost you much. Beyond that, and that was a self-serving comment, there are transcription engines available today from a variety of companies, some of them well known and large. For example, Google has one under the Google Cloud products. You can play with that. Amazon has announced a couple of products related to transcription. They have one called Amazon Transcribe, which I don’t believe has formally launched at scale. It might be in a private beta, but that’s one to keep an eye on. They also have a product called Amazon Lex. If you were a software developer and wanted to build an Alexa-like app where you control the voice commands, Amazon Lex is designed to help you with that. There are some smaller companies in the space as well. If you google, you’ll find them, but I would give those companies as good reference points for people trying to figure out the category.

Henrik de Gyor: Excellent. And Jason, where can we find out more information about AI transcription?

Jason Chicola: The Temi blog has some good information. If you go to Temi.com and click on the blog link in the footer, there’s a bunch of articles that address topics that we think are interesting. Beyond that, googling is great. You know, there are some more specialized publications in the speech world. Most of them are too technical for a general audience. There is a conference called SpeechTEK that is pretty good. We’ve been a couple of times. It’s for those who are really serious about it. But I think between those resources and googling, somebody is probably in pretty good shape. If folks have large needs to transcribe a lot of audio, contacting Rev/Temi is a good idea because we can often point you in the right direction.

Henrik de Gyor: Well, thanks, Jason.

Jason Chicola: It’s really a pleasure to chat today. I really believe that 2018 is going to be marked as the first year that transcription started to be used at an important scale. Everybody that I know has probably a couple of these listening devices in their home. Everybody I know is really struggling with Siri, and people are starting to think about how to use voice differently. We talked today about transcription, and that’s how we framed the conversation. I think that is a fine framing, but it’s a bit backward-looking. If I look into the future, I think that there are whole new behaviors that are likely to happen. So when I or a colleague of mine is driving to the office and has, you know, an important meeting or presentation or board meeting later in the week, shouldn’t I be effectively dictating notes to self that I can use later in that presentation? Shouldn’t I be trying to talk more than I type, and use an app to build notes, knowledge, and insights?

Jason Chicola: I think that transcription implies an existing recording off the shelf, whereas using voice to be more productive is, I think, going to be a major behavior change that we’re likely to see in the next couple of years, and we’re trying to accelerate that with our products. Clearly, there are other companies out there as well, and we wish everyone luck. I think it’s a big space, but I’m glad that we were able to have this conversation because, hopefully, when we listen to this conversation a couple of years from now, we’ll have gotten a couple of things right.

Henrik de Gyor: Awesome. Well thank you for leading this voice first revolution and thanks again.

Jason Chicola: Thank you, Henrik.

Henrik de Gyor: For more information, visit Tagging.tech.

Thanks again.



Tagging.tech interview with Kirsti O’Sullivan

Tagging.tech presents an audio interview with Kirsti O’Sullivan about keywording services









Henrik de Gyor:  This is Tagging.tech. I’m Henrik de Gyor. Today, I’m speaking with Kirsti O’Sullivan. Kirsti, how are you?

Kirsti O’Sullivan:  I’m good, thanks for having me.

Henrik:  Kirsti, who are you and what do you do?

Kirsti:  [laughs] I am the Managing Director of Keywording.com. Keywording.com offers outsourced keywording services to, in large part, the stock photography industry, but we do have people with internal collections that we do keywording for.

We do tagging for stills and for video. We’ve done illustrations and graphic designs, but basically we do the outsourced keywording for anybody who needs it. Depending on their target audience, we’ll get them what they need.

Henrik:  Kirsti, what are the biggest challenges and successes you’ve seen with keywording services?

Kirsti:  I think one of the biggest things that we see, that could be better from the client side perspective, is to have someone who owns the keywording piece within your company. I know oftentimes these companies aren’t very large, and it’s quite difficult to have one person who owns one job.

With keywording, the consistency and continuity is huge, and so your main goal is to always drive yourself towards consistency. How can you put things in place that make it the most consistent possible?

Oftentimes, you’re having someone come in temporarily to help you do the keywording, or that person doesn’t stay very long, and so you’ve got turnover in terms of who’s actually doing it.

You want to put processes or documents in place that give you as much quality control along the way as possible. That’s the biggest drawback that we see. We will start off with one person, who then leaves, rather abruptly sometimes, and we’re with a new person who probably doesn’t know much about keywording and we are starting over.

That’s one of our biggest challenges when taking on a new client. The successes are that having a keywording service do all of your keywording so one source, or one person for that matter, and we can just call it a source. Having one source do virtually all of your keywording makes your collection that much more valuable.

You’ve got a consistently tagged asset versus the biggest challenge we just talked about, which is having any number of people having done it any number of different ways. You’ve basically got a collection that’s hit or miss when you do a keyword search.

Thinking about what you want to do with your collection down the line ‑‑ and I think it’s even fair to even say 10–15 years down the line ‑‑ the more that you can point at this and say, “This is keyworded consistently 100 percent from beginning to end” is really a huge value when you are trying to sell your collection, frankly.

We do definitely ask clients, “What have you done in the past?” A lot of times they have not thought through how they want to categorize their collections. They don’t have an answer to give me, but we do want them to have thought about like, “How do you categorize these different types of pictures?” depending on the collection that’s coming to us.

Obviously, if you have a wildlife collection, you probably have thought much more thoroughly through how to categorize your collection. It’s a very straightforward thing to actually think about.

If you look at a lifestyle collection, trying to get categories out of something that’s so broad and could be any number of topics, is more difficult. If they have that, we absolutely incorporate it. If they don’t, we have our own that we institute.

We also try to ‑‑ and this is also quite difficult ‑‑ we would like to be involved in what were you thinking when you were designing this shoot? What is your intent in creating these pictures?

When the creative director is thinking about putting this together, what are the concepts and the things that you are trying to illustrate?

We try to get that information to filter down to us and/or to whoever is going to be doing the keywording for you, because that’s your intent. That’s super important, but it is actually quite difficult to make that happen in the workflow.

Re‑keywording is probably our biggest challenge, and one we actually have chosen to say no to. Oftentimes people will come to us and say, “I’ve got 10,000 images I’ve keyworded and we need you to fix them.”

There’s some things that we can do to fix them. We can run through and analyze the words across the collection and say, “We can fix the misspellings easy enough,” and we can say, “This isn’t the right term,” or we can take out the plurals. It’s a much harder job to fix bad keywording than it is to start over.
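
A collection-wide pass of the kind Kirsti describes here (counting terms, correcting known misspellings, spotting plurals) could be sketched in Python roughly as follows. Everything in it is an assumption for illustration: the `audit_keywords` name, the shape of the inputs, the `corrections` map, and the deliberately naive trailing-"s" plural check are hypothetical and not Keywording.com's tooling.

```python
from collections import Counter


def audit_keywords(collection, corrections):
    """Apply known spelling corrections and report plural/singular pairs.

    collection:  list of keyword lists, one list per image.
    corrections: hypothetical map of misspelling -> correct spelling.
    """
    # Lowercase every keyword and swap in the corrected spelling if one is known.
    cleaned = [
        [corrections.get(k.lower(), k.lower()) for k in keywords]
        for keywords in collection
    ]
    # Count how often each term is used across the whole collection.
    counts = Counter(k for keywords in cleaned for k in keywords)
    # Naive plural check: flag "dogs" only when "dog" is also in use.
    plural_pairs = sorted(
        (k, k[:-1]) for k in counts if k.endswith("s") and k[:-1] in counts
    )
    return cleaned, counts, plural_pairs
```

A pass like this only catches mechanical problems. It cannot tell whether a keyword actually describes the picture.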

Basically, we’ll have a look to see if there’s anything we can do, but 100 percent our answer is, “It’s going to be cheaper for me to do this again the right way.” Our advice is, do it once and do it right, whether it’s for yourself ‑‑ because it’s certainly possible ‑‑ or have someone do it for you right the first time.

The other piece of that is you need to watch the store. I think we are the best at what we do in the industry, but that doesn’t mean you turn a blind eye to what it is we are doing. If you don’t give us feedback, we have to proceed like we’re doing it exactly the way you want.

If you are not telling us this is good, or this is not so good, or I want this different, we have no way to know that we’re not doing what you want. We had one client who was really angry, who had said nothing to us for a whole year, that we weren’t putting this particular keyword on these images. No one was looking at what we sent them. What do you do?

My advice to people who do take on keywording services is, don’t assume that they are doing good work for you. Make sure that someone’s checking that work. We strive really hard to check our work. Nothing goes straight from the keyworder to the client. There’s always someone with a second pair of eyes looking at it.

As someone who spends so much money creating that image, make sure someone’s looking at the keywords. That’s the only way you are going to sell them. If you take it back from me, or any other provider, and you don’t look at that, you are doing yourself a disservice.

Henrik:  Kirsti, as of late February 2016, how much of the keywording work is completed by people versus machines?

Kirsti:  I can only speak for myself. For us, it’s 100 percent human performed. We do have a few things that happen on the back end where we do a first pass of the core keywords. Then we have a system that will add in ancillary terms.

We just have to say, “woman,” but our system will add in “female”, and “lady”, and “human being”. We don’t have to do that, so that piece of it is machine‑driven. In terms of describing the picture, that’s 100 percent human for us.
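
The ancillary-term step can be illustrated with a minimal Python sketch. The `ANCILLARY_TERMS` table and the `expand_keywords` function are hypothetical names invented for this example. Only the "woman" entry comes from the interview, and nothing here is claimed to match how Keywording.com's system actually works.

```python
# Hypothetical lookup from a core keyword to the ancillary terms added for it.
ANCILLARY_TERMS = {
    "woman": ["female", "lady", "human being"],
}


def expand_keywords(core_keywords, ancillary=ANCILLARY_TERMS):
    """Return the core keywords plus their ancillary terms, in order, without duplicates."""
    expanded = []
    seen = set()
    for keyword in core_keywords:
        # Each core keyword is followed by whatever ancillary terms it maps to.
        for term in [keyword, *ancillary.get(keyword.lower(), [])]:
            key = term.lower()
            if key not in seen:
                seen.add(key)
                expanded.append(term)
    return expanded
```

The human still chooses the core keyword. The lookup only saves retyping terms that always travel together.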

The more you don’t watch what’s happening, the more trouble you can get into. I think what I’ve found when I hear something about, “Oh, we can analyze your photograph and tell you what ethnicity people are”, and “how many people are in the picture”, and “you can jump-start your keywording that way”, what I found is that it’s not 100 percent.

It’s back to that whole issue of re‑keywording is harder than keywording right the first time. When I’ve tried these services, because I’m absolutely not opposed to saving money in any way, shape, or form, but when using those kind of things like visual recognition, it causes me more problems than it solves.

I’ve yet to come across a service that I felt like I could put a picture in there, and hit a button, and be comfortable that I don’t have to go look at…I’m still going to check it, but I haven’t gotten to a point where I feel like I’m not super worried about what’s in there for every single picture.

If I have to worry about every single picture, and read through it, and remove what’s wrong, then I am going to have to go back and add what was right. It could be across any number of different categories, so I’m back to the, “It’s easier to do it right by hand than to use these technologies.”

Maybe someday it will get to the point where you just have to read through it and maybe there might be a tiny tweak at the end, but I honestly don’t see that for a while.

Henrik:  What advice would you like to share with people looking into keywording services?

Kirsti:  I think the thing to remember when you’re looking for a keywording service is that it is so, so easy to do an amazing job on 10 images. Anybody can do it. You can get these samples back and look at them, and you can look at one company versus another company versus the next company and go, “Wow! They did an amazing job.”

Well, they sat down for half an hour and did 10 images, and they’re trying to get your work. The thing to remember is that the job that you see when you pick a company needs to be the job that you see six months down the line. Doing a great sample isn’t everything.

You need to check references, and I think that gets missed, that piece of the thought process is, “Just because I send you this.” Actually, when we do samples we try very hard not to go the extra mile. I don’t want you to be surprised when you pick me and then you send me your first job and it looks very different from the sample.

I could go to town and just put gazillions of words on an image and you go, “Wow,” but I can’t keep that up for the money you are going to pay me, so I try very hard to make sure that we give you the job that you will see six months down the line. That’s the big thing, to be very mindful and check references. It’s the longevity that’s important, not the sample.

Henrik:  Where can we find more information about keywording services?

Kirsti:  That’s a really good one. We rely heavily on being…and we have a good URL, knock wood, so we’re easy to find. I know that a lot of times places like Alamy will offer information on who’s out there doing keywording.

I would do searches for keywording and then see in what forums people are talking about what’s going on, and other places to go to. If you’re lucky enough to go to PACA or CEPIC, oftentimes there’ll be keywording services there. Word of mouth, I think, has been the best for us. If you have friends in the industry, ask around: who do they use, who do they recommend?

I think those are probably…because it is. There’s maybe six companies in the world that do it on any kind of scale, so there’s just not a lot. I think, depending on who your audience is for this podcast, by and large our competition is the photo creator. We’re not really competing against another company, per se.

My job is to convince them I can do it better than you can, and I can do it cheaper than you can. I think everyone should choose a keywording service. I think if you are trying to build a collection on a shoestring, then you are going to have to do it for yourself.

I don’t know if you want to add an extra question, because I’ve got some advice to give for people who need to do it for themselves, but if you are going to go out looking for a keywording service, I think that that’s the best money you can spend.

I have no idea why keywording is the last thing anyone thinks of, because keywording is the only thing that’s going to mean you get your money back. You can do everything right, but if you leave off the keywords then you’ve thrown your money down the toilet.

I don’t understand why it’s not until those pictures are all ready to go that they start to think about keywording then. Then they want to do it as cheaply as possible.

They wouldn’t in a million years put a crappy lens on their camera, but that’s kind of what they are doing when they leave keywording as an afterthought and as a “Let’s see how we can economize on this piece of this image creation or video creation.” It’s the thing that means you make money.

That piece I don’t understand, so I would really encourage everybody who has got a collection, or thinking about building a collection, to have that be your first thought. How are you going to handle this? Because this is the thing that means you maximize your investment.

If you’re going to do it for yourself, I do have advice about that. If you’re going to look for a service, you want to have one that’s been around for a while, that people tell you is consistent, that has a good reputation, that is going to be responsive when you call them. If they make a mistake, they’re going to fix it. That would be my advice. [laughs]

Henrik:  Great. I think to your point, I think it comes down to search, because people don’t realize that the keywords power the search. They think that the images…they just post them on there with their magic titles, and that will be enough. You and I both know that’s not the case.

Kirsti:  No. One big thing that we get a lot, that I think is important is, they say, “You have too many keywords on this image.” That’s the wrong question. The [right] question is, “Do I have any inaccurate keywords on this picture?”

Every single word that’s accurate is one more avenue to reaching your client. Do you want to leave off an accurate keyword? You can’t predict how someone will look for a picture. The more accurate keywords you have on your image, the more opportunities you have to get that picture in front of someone who you can’t predict how they’ll search.

The only question that matters is, “Are all these words accurate?” and if the answer is “No,” then they need to come off. If the answer is “Yes, they are all accurate,” the more you have of those; you maximize your opportunity to sell your pictures.

My advice for someone who can’t afford a keywording service is, like we talked about in the beginning a little bit, to think about your collection and make a category list, and really think about how you want to tag these pictures, and write it down, because you’ll have that piece of paper to pass on to the next person.

The next thing I would recommend is to invest in something like TextExpander. I don’t know if there’s other software that does that, but I work on the Mac, so I know TextExpander. Basically what it does is, you can give it a three-letter code and it will add in whatever you want it to.

Like I can’t stand typing Latin American and Hispanic ethnicity over and over again, so I have LAE and if I type LAE with a comma, it just pops in that word for me, but you can also have it do any number of keywords.

If you have, take the example of a wildlife collection, you’re probably going to have core set of keywords for just about every picture ‑‑ outdoors, daytime, nature, wildlife, color image. Those you might be typing for every single image in your collection. You can just type a code and in pop all those words.

You don’t have typos, you don’t have to remember all of them, and you could have any number of word collections that let you be more consistent. You don’t have to remember every time.

If you’re doing business and the other guy is on a cell phone, you can have it add in words like “communications, cell phone, technology, wireless technology”, and you don’t have to type all those words.
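
The code-to-keywords idea can be sketched in a few lines of Python. This shows the general technique only and does not represent how TextExpander itself works. The `SNIPPETS` table and `expand_snippets` function are hypothetical, and apart from "LAE", which Kirsti mentions, the codes ("wild", "cell") are made up for the example.

```python
# Hypothetical snippet table: short code -> keywords it stands for.
SNIPPETS = {
    "lae": ["Latin American and Hispanic ethnicity"],
    "wild": ["outdoors", "daytime", "nature", "wildlife", "color image"],
    "cell": ["communications", "cell phone", "technology", "wireless technology"],
}


def expand_snippets(typed, snippets=SNIPPETS):
    """Expand a comma-separated entry, replacing known codes with their keywords."""
    keywords = []
    for entry in typed.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Unknown entries are kept as typed; known codes expand to their full list.
        for keyword in snippets.get(entry.lower(), [entry]):
            if keyword not in keywords:
                keywords.append(keyword)
    return keywords
```

For instance, typing `LAE, businessman, cell` would come out as the full ethnicity term, then "businessman", then the four communications keywords, spelled the same way every time.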

For us, we use that. I highly recommend that to anyone doing their own keywording, and I’d say really put some thought into how you want your collection keyworded and documented. Then look for those kinds of tools that can make the job a whole lot more consistent.

Henrik:  Thanks, Kirsti.

Kirsti:  Welcome.

Henrik:  For more on this, visit Tagging.tech.

Thanks again.


For a book about this, visit keywordingnow.com