tagging.tech

Audio, Image and video keywording. By people and machines.



Tagging.tech interview with Jason Chicola

Listen and subscribe to Tagging.tech on Apple Podcasts, AudioBoom, CastBox, Google Play, RadioPublic or TuneIn.

 

Transcript:

Henrik de Gyor: This is Tagging.tech. I’m Henrik de Gyor. Today I’m speaking with Jason Chicola.

Jason, how are you?

Jason Chicola: Doing great, Henrik. Thanks a lot for taking the time.

Henrik de Gyor: Jason, who are you and what do you do?

Jason Chicola: I'm the founder of Rev.com. Rev is building the world's largest platform for work-from-home jobs. Our mission is to create millions of work-from-home jobs. Today, we have people working on five types of work, jobs they could do in their pajamas, and the main ones are audio transcription and closed captioning. Several of my co-founders and I were early employees at Upwork, which is the largest marketplace for work-at-home jobs. Rev takes a different approach than Upwork. With Rev, we guarantee quality, which means that the task of managing a remote freelancer, of hiring the right one, is something our platform excels at. What that means is our customers have a very easy-to-consume service. You can think of us as Uber for work-at-home jobs. So if you came to us to get, say, this call transcribed, as a customer all you have to do is upload an audio file to a website, and a couple of hours later you'll get back the transcript. Behind the scenes, there is an army of freelancers doing that work, and we have built our technology to make their lives easier and make them productive. If I zoom out from all of this, I look at the world and see a lot of people sitting in cubicles who probably shouldn't have to, a lot of people sitting in traffic who shouldn't have to, and I look at all the kinds of jobs people do today at a computer. How many of those jobs need to be done in a cube farm? How many of them could be done from home? We think many of them can. Our mission is to give more people the opportunity and the freedom to work from home, which not only lets them choose their location but also gives them more control over their lives, because they can decide whether they want to work in the morning versus the early afternoon. It means you're not tied to a single boss or employer. It means you can work on one skill on Monday and a different skill on Tuesday, and go surfing or hiking on Wednesday if you feel like it. So that's how we think about our business: it's really centered on giving people the freedom that comes when they can be their own boss and work from home. As a segue to some of your next questions, which you and I have discussed in the past, as we got deeper and deeper into creating jobs for transcriptionists, we invested in technology to make their jobs easier and make them more productive. That has led us to develop some competency and familiarity with what you're calling here AI transcription, which means using a computer to transcribe audio. That's a relatively new area for us, and an important one, especially in light of people being familiar with Amazon Alexa and Apple's Siri. So that's a new, small business, but the core is giving people work they can do with a computer. Most of that work today is listening and typing.

Henrik de Gyor: Jason, what are the biggest challenges and successes you’ve seen with AI Transcription?

Jason Chicola: It's really early to judge that. I can give you a specific example in a moment, but it's a little bit like asking someone today what the biggest challenges and successes of self-driving cars are. The answer, I think, is that the successes have been small, but the possible successes in the future could be massive. I really believe we're not even in the first inning. Maybe we're in the warm-up for the first inning of this game, and I think it's going to be a pretty exciting decade ahead of us as computers have gotten better, as more audio is captured in digital formats, and as companies like Rev innovate in a bunch of areas. Our success today in this area has been at the fringes of our business, so I'll give you a specific example: when Rev transcriptionists type out an audio file, like somebody might do for this phone call, some customers request timestamps, and part of the humans' job was to go in and note, for example at the end of every minute, that this event occurred at three minutes, this event occurred at four minutes, and so forth. That was an additional task they performed manually while they did their job. We automated that using what you could call AI transcription. So now not only are timestamps inserted automatically, but every single word is marked by the AI as to when it was said. Literally, for every single word we know this word occurred at 4:38 and that word occurred at 5:02. That's something we've done that automated something previously done manually, and it actually made for a much better experience for the customer because the timestamps are more accurate. That's something we already have today. The challenge list is longer. The biggest challenge to be aware of when it comes to automated transcription is that it's garbage in, garbage out. Other people say you can't make chicken salad out of chicken [****]. If you go to Starbucks and sit outside by a noisy street and record an interview with someone you're talking to for a book, and you submit it to some automated engine, you're not going to get back anything that is very good. It's obvious why that is, but the quality of speech recognition depends, I would say, on three or four key factors other than the quality of the [speech recognition] engine itself. One is background noise: the less, the better. Another is accents: the less, the better. Another is how clearly the person is speaking. Are they enunciating? Are they slurring their words together? Are they speaking really quickly? Those tend to be the major factors. There is probably another one related to background noise, which comes down to the quality of your microphone and how far you are from the microphone. You are a podcaster, so you probably know far more about how to record clear audio than most people do. Most people throw an iPhone onto a table next to somebody who's eating a bag of Doritos. [laugh] So you have great audio of someone eating a bag of Doritos, which causes problems downstream, and some of those people, because they don't think about it, will say, "Hey, I'm really annoyed. You didn't get this word right." And that's because somebody was eating a bag of Doritos during the time that word was said. So part of our job, as we try to get better at helping people transcribe quickly and cheaply, is to help customers understand that you need to record good audio if you want to get a good outcome.
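
To make the word-level timestamping concrete, here is a minimal Python sketch of how per-minute markers could be derived from word-level output as an ASR engine might return it. The data structure and field names are illustrative assumptions, not Rev's actual format.

```python
# Minimal sketch: insert per-minute markers into a transcript using
# hypothetical word-level timestamps (word, start time in seconds).
# The field names are illustrative only, not any vendor's real schema.

words = [
    {"word": "Today", "start": 0.4},
    {"word": "we", "start": 0.9},
    {"word": "discuss", "start": 1.3},
    # ... one entry per recognized word ...
    {"word": "transcription", "start": 61.2},
]

def render_with_minute_marks(words):
    out, last_minute = [], -1
    for w in words:
        minute = int(w["start"] // 60)
        if minute != last_minute:
            out.append(f"\n[{minute:02d}:00]")  # marker at each new minute
            last_minute = minute
        out.append(w["word"])
    return " ".join(out)

print(render_with_minute_marks(words))
```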

Henrik de Gyor: Jason, as of January 2018, how much of the transcription work is completed by people versus machines?

Jason Chicola: Are you talking about the work that Rev does?

Henrik de Gyor: Sure.

Jason Chicola: Depends on how you slice it, but I'll say 99% people, 1% machine.

Henrik de Gyor: Fair.

Jason Chicola: Actually, I'll be a little more clear on that. We recently released a new service called Temi, at Temi.com. That is an automated transcription service where people are not doing the work; machines are. Our core service, Rev.com, is done basically entirely by people. We believe that's required to deliver the right level of accuracy. This starts to get into your next question, but we clearly see these two blending and merging a little more over time. Today, though, if you want good accuracy, you need people to do it. To give you some external context: an earnings call is transcribed for Wall Street analysts, and if a machine does it and makes a mistake on, you know, a key number, or on whether the CFO said that something happened or didn't happen, that's a big problem. Or if a movie is captioned for HBO. Game of Thrones is captioned by HBO, and those captions need to be right. So for any use case where people want an accurate transcript today, they need to have people in the loop.

Henrik de Gyor: Makes sense. Jason, as of January 2018, how do you see AI transcription changing?

Jason Chicola: I think the most obvious change that people expect, and that has probably been slower coming than people expect, is that the machine is going to help the man in this proverbial battle of man versus machine. Wasn't there a kid's story about John Henry, the guy that fought the train? You know, there is this narrative in popular culture that robots are going to take our jobs, and there are sectors where that has been a big problem for people. I think there's a broader trend across more industries where technology seeps into our day-to-day lives in little ways that help us eliminate the parts of our jobs that suck: printing photocopies, running to Office Depot, changing formats for documents. Those things go away. What I expect to see is more transcription happening in a two-step process, where first the machine takes a cut and then the human tweaks it.

Jason Chicola: There are some companies that have tried this in the past and, by and large, didn't do well because their quality sucked. There are some companies that do this well in the medical transcription space. But the trend that I would encourage your listeners to look for is not my idea. There's a book called Race Against the Machine, written by a couple of academics out of the MIT Sloan School, and in the final chapter they talk about the rise of automation and AI and how it's going to affect the economy and jobs broadly. They concluded that rather than having a showdown against the machines, the best companies were going to be the ones that found a way to, in their words, "race with the machines": the machines should help people do their jobs better. So I would look for examples where software can make people better at their jobs. I think that is the trend to keep an eye on. There are people who say "AI is going to do everything." To those people, I would say: are you saying that quality is going to stop mattering? Is HBO going to stop caring about the quality of the captions? Is the Wall Street trader who is reading earnings calls as they come in, so he can decide whether to buy or sell a given stock, going to want more mistakes in these documents? I don't think so. I think these people want accuracy, and I think humans are needed to deliver accuracy.

Henrik de Gyor: Jason, what advice would you like to share with people looking into AI transcription?

Jason Chicola: Well, it depends on what their objective is. But if somebody has audio recordings of meetings or dictation that they want to use productively, I would certainly recommend that they try our service, Temi.com. In fact, right now, if you download our mobile app, which is available on both iOS and Android, all transcriptions submitted through the app are free for a limited time. I want to repeat that: you can get free unlimited transcription for a limited time through the Temi mobile app on iOS and Android. That's a good place to start because it doesn't cost you much. Beyond that, and that was a self-serving comment, there are transcription engines available today from a variety of companies, some of them well known and large. For example, Google has one among the Google Cloud products; you can play with that. Amazon has announced a couple of products related to transcription. They have one called Amazon Transcribe, which I don't believe has formally launched at scale, it might be in a private beta, but it's one to keep an eye on. They also have a product called Amazon Lex. If you were a software developer who wanted to build an Alexa-like app where you control the voice commands, Amazon Lex is designed to help you with that. There are some smaller companies in the space as well; if you google, you'll find them. But I would give those companies as good reference points for people trying to figure out the category.
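
As a concrete starting point for the Google engine mentioned above, here is a rough sketch using the google-cloud-speech Python client. The bucket URI is a placeholder, and the API surface can differ between library versions, so treat this as an illustration rather than the definitive usage.

```python
# Rough sketch of automated transcription with Google Cloud Speech-to-Text.
# Assumes the google-cloud-speech client library and a GCS URI you control;
# consult the current documentation, as details change between versions.
from google.cloud import speech

client = speech.SpeechClient()

audio = speech.RecognitionAudio(uri="gs://your-bucket/interview.flac")  # placeholder URI
config = speech.RecognitionConfig(
    language_code="en-US",
    enable_word_time_offsets=True,  # word-level timestamps, as discussed above
)

# recognize() is for short clips; longer files typically go through
# long_running_recognize() instead.
response = client.recognize(config=config, audio=audio)
for result in response.results:
    alt = result.alternatives[0]
    print(alt.transcript)
    for w in alt.words:
        # start_time is returned as a duration; exact type may vary by version
        print(w.word, w.start_time.total_seconds())
```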

Henrik de Gyor: Excellent. And Jason, where can we find out more information about AI transcription?

Jason Chicola: The Temi blog has some good information. So if you go to Temi.com and click on the blog link in the footer, there's a bunch of articles that address topics we think are interesting. Beyond that, googling is great. There are some more specialized publications in the speech world, but most of them are too technical for a general audience. There is a conference called SpeechTek that is pretty good. We've been a couple of times, for those who are really serious about it. But I think between those resources and googling, somebody is probably in pretty good shape. If folks have large needs to transcribe a lot of audio, contacting Rev/Temi is a good idea because we can often point you in the right direction.

Henrik de Gyor: Well, thanks, Jason.

Jason Chicola: It's really been a pleasure to chat today. I really believe that 2018 is going to be marked as the first year that transcription starts to rise on the importance scale. Everybody I know probably has a couple of these listening devices in their home. Everybody I know is really struggling with Siri, and people are starting to think about how to use voice differently. We talked today about transcription, and that's how we framed the conversation. I think that's a fine framing, but it's a bit backward-looking. If I look into the future, I think there is a whole set of new behaviors that are likely to happen. So when I or a colleague of mine is driving to the office and I have an important meeting or presentation or board meeting later in the week, shouldn't I be effectively dictating notes to myself that I can use later in that presentation? Shouldn't I be trying to talk more than I type, and use an app to build notes, knowledge, and insights?

Jason Chicola: I think transcription implies an existing recording sitting on a shelf, whereas using voice to be more productive is going to be a major behavior change that we're likely to see in the next couple of years, and we're trying to accelerate that with our products. Clearly, there are other companies out there as well, and we wish everyone luck. I think it's a big space, but I'm glad we were able to have this conversation because hopefully, when we listen to it a couple of years from now, we'll have gotten a couple of things right.

Henrik de Gyor: Awesome. Well thank you for leading this voice first revolution and thanks again.

Jason Chicola: Thank you, Henrik.

Henrik de Gyor: For more information, visit tagging.tech.

Thanks again.

 



Tagging.tech interview with Jonas Dahl

Tagging.tech presents an audio interview with Jonas Dahl about image recognition

Listen and subscribe to Tagging.tech on Apple Podcasts, AudioBoom, CastBox, Google Play, RadioPublic or TuneIn.


Keywording Now: Practical Advice on using Image Recognition and Keywording Services

Now available

keywordingnow.com

Transcript:

Henrik de Gyor:  This is Tagging.tech. I’m Henrik de Gyor. Today I’m speaking with Jonas Dahl. Jonas, how are you?
Jonas Dahl:  Good. How are you?

Henrik:  Good. Jonas, who are you and what do you do?

Jonas:  Yeah, so I'm a product manager with Adobe Experience Manager. And I primarily look after our machine learning and big data features across all AEM products, so basically working with deep learning, graph-based methods, NLP, etc.

Henrik:  Jonas, what are the biggest challenges and successes you’ve seen with image recognition?

Jonas:  Yes. Well, deep learning is basically what happened; it's what defines before and after. Basically, in 2012 there was a confluence of the data piece, primarily enabled by the Internet, with large amounts of well-labeled images that could drive these huge deep learning networks, plus the deep learning technology itself and, obviously, the availability of raw computing power. That's basically what happened. And with that we saw accuracy increase tremendously, and now it's basically rivaling human performance, right? So both the accuracy and the breadth of labeling and classification you can do have increased and improved tremendously in the last few years.

In terms of challenges, I really see this as a path we're going down, where the first step is kind of generic tagging of images, right? What's in an image? Are there people in it? What are the emotions? Stuff like that that's pretty generic. And that's kind of the era we're in right now, where we see a lot of success and where we can really automate these tedious tagging tasks at scale pretty convincingly.

I think the challenge right now is to move to kind of the next step, which is to personalize these tags. So, basically, provide tags that are relevant not just to anyone but to your particular company. If you're a car manufacturer, you may want to be able to classify different car models. If you're a retailer, you may want to be able to do fine-grained classification of different products. So that's the big challenge I see now, and that's definitely where we are headed and what we're focusing on.
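
To make this concrete, below is a minimal sketch of how company-specific, fine-grained tagging is commonly approached today via transfer learning on a pretrained network. This is a generic illustration, not Adobe's implementation; the dataset path and class count are placeholder assumptions.

```python
# Hedged sketch: adapting a pretrained image classifier to company-specific
# tags (e.g. your own car models or products) via transfer learning.
import tensorflow as tf

NUM_CLASSES = 12  # e.g. twelve of your own product categories (assumption)

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/your_products/train", image_size=(224, 224), batch_size=32)

base = tf.keras.applications.MobileNetV2(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False  # reuse generic features learned from ImageNet

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNetV2 expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```

The design choice here is the one Jonas alludes to: the generic tagging problem is already solved by the pretrained base, so only a small head is trained on your own curated examples.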

Henrik:  And, as of November 2016, how do you see image recognition changing?

Jonas:  Well, really where I see it changing is, as I said, it’s going to be more specific to the individual customer’s assets. It’s going to be able to learn from your guidance. So, basically, how it works now is that you have a large repository of already-tagged images, then you train networks to do classification. What’s going to happen is that we’re going to add a piece that makes this much more personalized, much more relevant to you, and where the system learns from your existing metadata and your guidance, basically, as you curate the proposed tags.

Another thing I see is video; it's going to be more important. And video has that temporal component, which makes segmentation important, and that's how it differs from images. So there's that, and also the much larger scale we're looking at in terms of processing and storage when we're talking about video. Basically, video is just a series of images, so when we develop technologies to handle images, those can be transferred to the video piece as well.
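
Since video is essentially a series of images, one simple way to reuse an existing image tagger is to sample frames at a fixed interval and tag each frame. Below is a hedged sketch using OpenCV; `tag_image` is a placeholder for whatever image-recognition model you already run.

```python
# Sketch: sample one frame per second from a video and reuse an existing
# image tagger on each frame. tag_image() is a placeholder for whatever
# image-recognition model you already run on still images.
import cv2

def sample_and_tag(video_path, tag_image, every_n_seconds=1.0):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0       # fall back if FPS is unknown
    step = max(1, int(fps * every_n_seconds))     # frames to skip between samples
    tags_by_time, index = {}, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            timestamp = index / fps
            tags_by_time[timestamp] = tag_image(frame)  # e.g. a list of labels
        index += 1
    cap.release()
    return tags_by_time
```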

Henrik:  Jonas, what advice would you like to share with people looking at image recognition?

Jonas:  Well, I would say start using it. Start doing small POCs [proofs of concept] to get a sense of how well it works for your use case, define small challenges and small successes you want to achieve, and just get into it. This is something that is evolving really fast these days, so by getting in and seeing how it performs now, you'll be able to provide valuable feedback to companies like Adobe, and you can basically impact the direction this is going in. It's something we value a lot. It's really valuable to us when we run beta programs, for instance, that people come to us and say, "You know, this is where this worked really well. These are the concrete examples where it didn't work that well," or, "These are specific use cases that we really wish this technology could solve for us."

So now is a really good time to get in there and see how well it works. And also, I’d say, just stay on top of it. Stay in touch because, as I said, this evolves so fast that you may try it today and then a year from now things can look completely different, and things can have improved tremendously.

So that’s my advice. Now is a good time. I think the technologies have matured enough that you can get real solid value out of them. So this is a good time to see what can these technologies do for you.

Henrik:  Jonas, where can we find more information?

Jonas:  Yeah, so at Adobe we just launched what we call Adobe Sensei, which is the collection of all the AI and machine learning efforts we have at Adobe. Just Google that and go to the website; it will be updated with all the exciting things we are doing in that space. And I would recommend that you keep an eye on it, because it's going to really evolve over the next few years.

Henrik:  Great. Well, thanks, Jonas.

Jonas:  Yeah, you’re welcome.

Henrik:  For more on this, visit Tagging.tech.

Thanks again.


 

For a book about this, visit keywordingnow.com



Tagging.tech interview with Nikolai Buwalda

Tagging.tech presents an audio interview with Nikolai Buwalda about image recognition

 

Listen and subscribe to Tagging.tech on Apple Podcasts, AudioBoom, CastBox, Google Play, RadioPublic or TuneIn.


Keywording Now: Practical Advice on using Image Recognition and Keywording Services

Now available

keywordingnow.com

 

Transcript:

 

Henrik de Gyor:  This is Tagging.tech. I’m Henrik de Gyor. Today, I’m speaking with Nikolai Buwalda. Nikolai, who are you, and what do you do?

Nikolai Buwalda:  I support organizations with product strategy, and I've been doing that for the last 15 years. My primary focus is products that have social networking components, and whenever you have social networking and user-generated content, there is a lot of content moderation that's part of that workflow.

Recently, I've been working with a French company that has launched a large social network in Europe, and as part of that, we've spun up a startup I founded called moderatecontent.com, which uses artificial intelligence to handle some of the edge cases when moderating content.

Henrik:  Nikolai, what are the biggest challenges and successes you’ve seen with image recognition?

Nikolai:  2015 was really an amazing year for image recognition. A lot of forces really came to maturity, and so you've seen a lot of organizations deploy products and feature sets in the cloud that use or depend heavily on image recognition. It probably started about 20 years ago with experiments using neural networks.

In 2012, a team from the University of Toronto came forward with a really radical development in how neural networks are used for image recognition. Based on that, there were quite a few open source projects, a lot of video card makers developed hardware that supported it, and in 2014 you saw another big leap by Google in image recognition.

Those products really matured in 2015, and that's allowed a lot of enterprises a very cost-effective way to integrate image recognition into the work that they do. So as of 2015, for around $1,000 you could buy a video card, use an open source platform, and very quickly have image recognition technology available in your workflow.

In terms of challenges, I continue to see the same two challenges in the industry. One is the risk to a company's brand, and that still continues.

Even though image recognition is widely accepted as a technology that can surpass humans in a lot of cases for detecting patterns and understanding content, when you go back to your legal and to your privacy departments, they still want to have an element of humans reviewing content in the process.

It really helps them with their audits and their ability to represent the organization when an incident does occur. Despite companies like Google going with an image-recognition-first approach that passes the Turing test, you still end up with these parts of the organization that want human review.

I think it’s still another five years before these groups are going to be swayed to have an artificial intelligence machine‑learning first approach.

The second major issue is context. Machine learning or image recognition is really great at matching patterns in content and understanding all the different elements that make up a piece of content, but it is not great at understanding the context ‑‑ the metadata that goes along with a piece of content ‑‑ and making assumptions about how all the elements work together.

To illustrate this, there's a very good use case that's commonly talked about, which is a person pouring a glass of wine. In different contexts, this content could be recognized as something you don't want associated with your brand, or as no issue at all.

Think about somebody pouring a glass of wine at a cafe in France versus somebody pouring a glass of wine in Saudi Arabia. Between the two, there's very different context, but it's very difficult for a machine to draw a conclusion about the appropriateness of that.

Another very common edge case people like to use as an example is the bicycle example. Machines are great at detecting bicycles; they can do amazing things, far surpassing the ability of people to detect this type of object. But if that bicycle was a few seconds away from being in some sort of accident, machines have a very difficult time detecting that.

That's where human review ‑‑ human escalation ‑‑ comes into play for these types of issues, and it still represents a large portion of the workflow and the cost of moderating content. Mitigating risk within your organization still means having some sort of human review of content.

Human review and really understanding the context are the two things that I think will be solved by artificial intelligence in the next five years, and that will really put these challenges for image recognition behind us.

Henrik:  As of March 2016, how much of image recognition is completed by people versus machines?

Nikolai:  This is a natural stat to ask about but, with all the advancements in 2015, I really like to talk about a different stat. Right now, anybody developing a platform that has user-generated content has gone with a computer vision and machine learning approach first.

They'll have 100 percent of their content initially reviewed with this technology, and then, depending on the use case and the risk profile, a certain percentage gets flagged and moved on to a human workflow. I really like to think about it in terms of, "What is the number of people globally working in the industry?"

We know today that about 100,000 to 200,000 people worldwide are working at terminals moderating content. That’s a pretty large cost and a pretty staggering human cost. We know these jobs are quite stressful. We know they have high turnover and have long‑term effects on the people doing these jobs.

The stat I like to think about is, "How do we reduce the number of people who have to do this and move that task over to computers?" We also know that it's about a thousand times less expensive to use a computer to moderate this: it's about a tenth of a cent per piece of content, versus about 10 cents per piece of content to have it reviewed with human escalation.

In terms of really understanding how far we’ve advanced, I think the best metric to keep is how we can reduce the number of people who are involved in manual reconciliation.
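
As a minimal sketch of the "machine first, human escalation" routing Nikolai describes, the snippet below uses the round per-item costs quoted in the interview; the confidence threshold is an illustrative assumption, not a recommendation, and real systems would tune it to their own risk profile.

```python
# Sketch of machine-first moderation with threshold-based human escalation.
# Costs are the round numbers from the interview; the threshold is a
# placeholder assumption.
MACHINE_COST = 0.001   # ~a tenth of a cent per item reviewed by a model
HUMAN_COST = 0.10      # ~10 cents per item escalated to a human

def route(safe_score, risk_threshold=0.8):
    """safe_score: model confidence (0..1) that the content is acceptable."""
    if safe_score >= risk_threshold:
        return "auto_approve"
    return "human_review"   # flagged into the manual workflow

def estimated_cost(safe_scores, risk_threshold=0.8):
    escalated = sum(1 for s in safe_scores if s < risk_threshold)
    return len(safe_scores) * MACHINE_COST + escalated * HUMAN_COST

# Example: 1,000 items, 5% escalated -> 1000 * $0.001 + 50 * $0.10 = $6.00
print(estimated_cost([0.95] * 950 + [0.5] * 50))
```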

Henrik:  Nikolai, what advice would you like to share with people looking into image recognition?

Nikolai:  My advice, and it's something that people have probably heard quite a bit, is that it's really important to understand your requirements and to gain consensus within your organization about the business function you want image recognition to perform.

It's great to get excited about the technology and to see where the business function can help, but it's the edge cases that can really hurt your organization. You have to gather all the requirements around them.

That means meeting with legal, privacy, and security, and understanding the use case you want to use image recognition for, and then the edge cases that may pose some risk to your organization. You really have to think about all the different feature sets that go into making a project with image recognition successful.

One thing that is important is how it integrates with your existing content management system. A lot of image recognition platforms use third parties, and they can be offshore in countries like the Philippines and India. Understanding your requirements for sending content over there, together with your infosec department, is really important for knowing how that integrates.

Having escalation and approval workflows is really going to protect you in the edge cases where there is a need for human review. That needs to be quite seamless, as there is still a significant amount of content that gets moderated and approved this way.

Language and cultural support: global companies really have to consider the cultural impact of content from one region versus another. Having features, and an understanding built into your image recognition, that can adapt to that is very important.

Crisis management: this is something all the big social platforms have playbooks ready for. It's very important because even one image in a million that gets classified poorly can have a dramatic impact in the media, or even legally, for you. You want to be able to get ahead of it very quickly.

A lot of third parties provide these types of playbooks, and it's a feature set they offer along with their resources. Then there's the usual feature set you have to think about ‑‑ language filters and image, video, and chat protection. One edge case that has a lot of business rules associated with it is the protection of children and social-media filtering.

You might want to have a wider band of guardrails to protect you there. On response rate and throughput, a lot of services have different types of offerings: some will moderate content over 72 hours, while with others you get response rates within a minute.

Understanding the throughput and response rate you require is very important and really impacts the cost of the offering you are looking to provide. Third-party list support: a lot of companies will provide business rule guidance and support for the different rule sets that apply to different regions around the world.

It's important to understand which ones you need and how to support them within your business process. Also important for demonstrating control of your content is having user flags. Giving the people who consume your content the ability to flag content into a workflow demonstrates one of the controls you often need to have in place for the edge cases.

The edge cases are where media and legal really have a lot of traction and are looking for companies to provide really good controls to protect themselves. Things like suicide prevention, bullying, and hate speech: just one case can have a significant impact on your brand.

The last item is that a lot of organizations, for a lot of different reasons, have their content moderation done within their own organization. They have human review in-house, so having training for that staff for some of the stressful portions of the job, and training for HR, is very important. It is something to consider when building out these workflows.

Henrik:  Nikolai, where can we find more information about image recognition?

Nikolai:  The leading research for image recognition really starts at the ImageNet competition hosted at Stanford. If you Google ImageNet and Stanford you'll find it; the URL isn't that great, and officially it's called the ImageNet Large Scale Visual Recognition Challenge. This is where all the top organizations, all the top research teams in image recognition, compete to have the best algorithms, the best tools, and the best techniques.

This is where all the breakthroughs in 2012 and 2014 happened. Right now, Google is the leader, but it's very close, and image recognition at that competition is certainly at a level where these teams are far exceeding the capability of humans. From there, you get to see all the tools and techniques the leading organizations are using, and what's amazing is that the same tools and techniques they use on their platforms exist for integrating within your own organization.

On top of that, the competition between video card providers, between AMD and NVIDIA, has really produced hardware that allows for real-time image recognition in a very cost-effective manner. The tools they talk about at this competition leverage that hardware, so it's a great starting place to understand what the latest techniques are and how you might implement them within your own organization.

Another great site is opencv.org, or Open Computer Vision. They have built up a framework that takes all the latest tools, techniques, and algorithms and packages them in a really easy-to-deploy toolset. It has been around for a long time, so they have a lot of examples and a lot of background about how to implement these types of techniques.

If you are hoping to get an experiment going very quickly, using some of the open source platforms from ImageNet competitions and using OpenCV together you can really get something up very quickly.
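
As one hedged example of getting such an experiment going, the sketch below uses OpenCV's dnn module with a pretrained ImageNet-era Caffe model such as BVLC GoogLeNet from the Caffe model zoo. The file paths are placeholders for the model, weights, and label files you download yourself.

```python
# Sketch: ImageNet-style classification with OpenCV's dnn module and a
# pretrained Caffe model (e.g. BVLC GoogLeNet). File paths are placeholders.
import cv2
import numpy as np

net = cv2.dnn.readNetFromCaffe("bvlc_googlenet.prototxt",
                               "bvlc_googlenet.caffemodel")
labels = [line.strip() for line in open("synset_words.txt")]

image = cv2.imread("example.jpg")
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(224, 224),
                             mean=(104, 117, 123))  # GoogLeNet channel means
net.setInput(blob)
scores = net.forward().flatten()

# Print the five highest-scoring ImageNet labels
for i in np.argsort(scores)[::-1][:5]:
    print(labels[i], float(scores[i]))
```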

On top of that, when you're building out these types of workflows, you need to work closely with a lot of the nonprofits that have great guidance on the rule sets and the guardrails you need to have in place to protect your users and your organization.

Facebook has really been a leader in this area, and they have spun up partnerships with a bunch of different organizations ‑‑ the National Cyber Security Alliance, Childnet International, connectsafely.org ‑‑ and there are a lot of region-specific organizations you can work with. I definitely recommend using their guardrails as a great starting point for a framework for understanding how image recognition can moderate your content and how it can be used in an ethical and legal manner.

In terms of content moderation, it's a very crowded space right now. Some of the big partners don't talk a lot about their statistics, but they are doing a very large volume of moderation. Companies like WebPurify, Crisp Thinking, and crowdsource.com all have an element of machine learning and computer-human interaction.

Cloud platforms like AWS and Azure have offerings on the machine learning side. Adobe, as a content management platform, has a great integrated software package if you use that platform.

Another aspect, which is quite important, is that a lot of companies do their content moderation internally, so having training for that staff and for your HR department is very important. But all in all, there are a lot of resources and a lot of open source platforms that make it really easy to get started.

TensorFlow is an open source project from Google, and they use it across their platform. The last I checked, about 40 different product offerings use the TensorFlow platform. It is a neural-network-based image recognition technology. It's very visual and very easy to understand, and it can really help reduce the time it takes to get some of this technology to production.
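
For comparison with the OpenCV sketch above, here is a short hedged example of the same kind of off-the-shelf classification using TensorFlow's Keras API and a model pretrained on ImageNet. The image path is a placeholder, and exact utility names can vary by TensorFlow version.

```python
# Hedged sketch: classify one image with a pretrained ImageNet model via
# TensorFlow/Keras. "example.jpg" is a placeholder path.
import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")

img = tf.keras.utils.load_img("example.jpg", target_size=(224, 224))
x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]
x = tf.keras.applications.mobilenet_v2.preprocess_input(x)

preds = model.predict(x)
# decode_predictions maps class indices back to human-readable ImageNet labels
for _, label, score in tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=5)[0]:
    print(label, float(score))
```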

Other open source projects, if you don't want to be attached to Google, include Caffe, Torch, and Theano, and NVIDIA has a great offering tied to their technology.

Henrik:  Well, thanks Nikolai.

Nikolai:  Thank you, Henrik. I'm excited about content moderation. It's a topic that's not talked about a lot, but it's really important, and I think in the next five years we are really going to see the computer side of content moderation and image recognition take over, understand the context of these items, and really reduce the dependency on people to do this type of work.

Henrik: For more on this, visit Tagging.tech. Thanks again.


 

For a book about this, visit keywordingnow.com