When you are working with high-profile people, from C-level company execs to influential celebrities, access is difficult, especially after a recording when you need a slight edit to fix a mistake or add a clarification.
That was the past! Today technology can take their voice and make it say anything you want. It can even take the recording of that executive or celebrity and translate it so they are saying the same words in a different language.
Audio and podcast expert James Cridland shares his first-hand experience including:
- The best automated transcription services
- How to edit audio using a word processor
- Adding in extra words synthesised by the word processor
- How to completely automate the voice of a CEO or celebrity
- How to audio translate someone’s speech so they are speaking in another language
- Stunning demonstrations of the above
Our guest James Cridland is a man who has worked at the leading edge of radio and new media for many years. In the UK he was one of the first to launch podcasts with Virgin Radio around 2006, before serving as the Head of Future Media & Technology with the BBC. He is an internationally recognised thought leader in this space and speaks regularly at the largest conferences worldwide. Today, he is acknowledged as a leader in the podcasting world, and as the founder of PodNews.net he pretty much knows everybody and is the first to hear what is going on, including the developments we’re talking about today.
Is there a question we can answer in the next podcast? Send an email to firstname.lastname@example.org and we’ll chase down your answer from the best in the business.
For your convenience, we have included a 90% accurate machine transcript.
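To put "90% accurate" in perspective, here is a minimal Python sketch of the usual way transcript accuracy is scored: word error rate, meaning the number of word substitutions, deletions and insertions divided by the number of words actually spoken. The function names and the 150-words-per-minute speaking rate are assumptions made for illustration; they do not come from any of the services mentioned in this episode.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def expected_wrong_words(total_words: int, accuracy: float) -> int:
    """Rough count of wrong words in a transcript at a given accuracy."""
    return round(total_words * (1 - accuracy))


# Assumption: a 30-minute episode spoken at roughly 150 words per minute.
words_in_episode = 30 * 150
print(expected_wrong_words(words_in_episode, 0.90))  # 450
```

In other words, a half-hour conversation at 90% accuracy still leaves a few hundred words for a human to correct by hand.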
Dusty Rhodes 0:37
Welcome to How To Build A Podcast For Your Brand from DustPod.ie. My name is Dusty Rhodes. Today we’re looking at two possibilities. The first: working with high-profile people, from a very senior executive in a company to an influential celebrity, where difficult access to these people can make it hard if something needs a slight edit or a clarification. Can technology really take their voice and make it say what you want? Then we’re going to go one step further, where you can take the recording of that executive or celebrity and then translate it so that they’re saying the same words in a different language. If you think I’m kidding, we’re going to give you a stunning example of this in action during the podcast today. Joining us to chat about this is a man who has worked at the leading edge of radio and new media for many years. In the UK, he was one of the first to launch podcasts with Virgin Radio around 2006, before serving as the Head of Future Media and Technology with the BBC. He’s an internationally recognized thought leader in this space and speaks regularly at the largest conferences worldwide. Today, he is acknowledged as a leader in the podcasting world, and as the founder of PodNews.net he pretty much knows everybody and is the first to hear what’s going on, including the developments we’re talking about today. It’s a pleasure to welcome James Cridland. How are you?
James Cridland 2:00
It’s a great pleasure to be here. How are you, Dusty? You’re good?
Dusty Rhodes 2:03
Very, very good. So, we know each other a long, long, long, long time. I would have to say you’re probably the most English person I know, accent-wise.
James Cridland 2:12
Oh, right. Well, is that good or bad?
Dusty Rhodes 2:18
It’s very good. And I’m only kind of pointing that out so that people will listen to your voice during this. And then when we do our demonstration later, they’ll go, “Oh, my God.” Listen, let’s start with, I suppose, audio and text. Transcription is kind of what I’m thinking of. It’s quite easy these days to get something transcribed relatively well. You upload an MP3 file or you speak directly into your phone and, bop, bop, bop, out it comes. What would you say are the best services that you know online for transcription?
James Cridland 2:55
Yeah, I mean, there are a fair amount of these. Rev.com is relatively well thought of. Amazon has a version of it, as does Google. And actually there are various things that you can even get on your phone, which are free, that you can basically sit there and train on something and make it give you a decent transcript. In fact, I think even if you use Google Docs online, or you use Microsoft Word Online, then there’s a little button even in there to end up doing that. So one easy way of transcribing is to play back a file in a quiet room, leave your computer in there, and it will do about an hour, you know, prior to it getting a little bit upset. So yeah, you can do all sorts of things. And they all do a relatively good job. I mean, if you say that something is about 90% accurate, then that sounds good until you realize that that means that one word out of every 10 is going to be wrong. So actually, it’s not that good, really. So what typically you end up having to do is a bit of manual human curation afterwards. But you know, it’s still a pretty good start. And it certainly saves an awful lot of time.
Dusty Rhodes 4:17
One of the services that we’re going to talk about today is Descript, because they are really marrying transcription and audio files. And one of the things they have is that if you upload an audio file, it will transcribe it, but then you can edit the audio like you would a Word document. You’ve used this. What have you used it for?
James Cridland 4:41
Yeah, so the service is called Descript. And it works really well. So you basically use it as a recorder. You record into it and, as you say, it pretty well turns what you’ve said into a Word document. And if there’s an “um” or an “er”, then it will automatically get rid of those if you want it to. So you can get rid of filler words such as that. But you can also go, okay, I want that sentence to be over here. And so you can copy and paste it, you can cut and paste it and move it around, and it will do all of the audio editing for you. And it does a really good job. It’s really good, quick and simple. You can still hear a few of the edits if you’re an audio nerd, like you and me, but I’m guessing that, you know, most of the edits you wouldn’t notice. And it’s a smart tool that basically allows almost anybody to begin to edit audio without having to worry about waveforms or, indeed, you know, in my day it was pieces of magnetic tape and chinagraph pencils and stuff like that.
Dusty Rhodes 5:53
And razor blades [unclear]. I think we’re showing our age there, yes. But this service now has kind of gone one further, because editing audio and taking stuff out or rearranging stuff is one thing. Then quite often you realize, oh, I’ve actually missed a word, or I’ve mispronounced a word, or that was the wrong word, and I need to change it. Is that possible as well?
James Cridland 6:22
Indeed. So it’s got two different things in there. So the first thing that it’s got is a bunch of actors who will say things. So if you want to use this tool, or others like it, you can type in whatever it is you would like an American woman to say, and it will automatically have an American woman saying that, and that’s all good, and that’s all clever. Or if you do some training, you can train it with your own voice. And that training is not a quick process, it has to be said. It took me about half an hour, and you end up reading a story. And because I do a podcast where I sound as if I’m reading the news, I was there reading the story in the style of a news report. And it sounded a bit strange. And it turns out it was the story of The Wizard of Oz, actually, because it’s out of copyright, and so therefore they were just using that. But that text is enough to then train the artificial intelligence to work out what your voice sounds like. And so if you use that to make an entire podcast, it’ll still sound a little bit robotic. But if, for example, you said that we’re currently 10,000 kilometers apart, and then you realize in the edit that it’s not kilometers, it’s actually miles, then you could very easily change that without having to ring up the talent you’ve had and say, can you go into a studio and say the word “miles” for me? So you can end up doing quite a lot of those, you know, really nice little pieces of just sort of polish. And it’s very difficult to spot when you’ve ended up doing this.
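As a rough illustration of the word-processor-style editing described above, here is a minimal Python sketch. It assumes the transcription step returns a start and end time for every word; deleting a word from the text then means skipping that stretch of audio, and swapping "kilometers" for "miles" means finding the time span that a synthesised word would need to fill. This is not how Descript works internally, and every name in it (`Word`, `keep_segments`, `span_to_replace`, the filler list, the timings) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


# Hypothetical filler list; a real tool would let you configure this.
FILLERS = {"um", "uh", "er"}


def keep_segments(words, drop=FILLERS):
    """Return the (start, end) spans of audio left after deleting words."""
    segments = []
    previous_kept = False
    for word in words:
        if word.text.lower().strip(".,!?") in drop:
            previous_kept = False
            continue
        if previous_kept:
            # Extend the current span so natural pauses between words survive.
            start, _ = segments[-1]
            segments[-1] = (start, word.end)
        else:
            segments.append((word.start, word.end))
        previous_kept = True
    return segments


def span_to_replace(words, target):
    """Find the time span of the first occurrence of a word to be re-voiced."""
    for word in words:
        if word.text.lower() == target.lower():
            return (word.start, word.end)
    return None


# Made-up timings for the example sentence from the conversation.
sentence = [
    Word("we're", 0.0, 0.3),
    Word("um", 0.3, 0.7),
    Word("10,000", 0.7, 1.4),
    Word("kilometers", 1.4, 2.1),
    Word("apart", 2.1, 2.6),
]
print(keep_segments(sentence))                  # [(0.0, 0.3), (0.7, 2.6)]
print(span_to_replace(sentence, "kilometers"))  # (1.4, 2.1)
```

An audio tool would then stitch the kept spans together, and drop the synthesised word into the replaced span, which is where the audible joins James mentions can creep in.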
Dusty Rhodes 8:12
Am I correct in thinking that you did a very extreme experiment with this? Because you do PodNews.net, and it’s a daily update by email for free. Just go to the website and sign up for it, highly recommended. But James, as he said, also does a podcast version of this. Now, is it true that you actually used Descript to read the news one day?
James Cridland 8:39
Yeah, I did. So one day, I decided that I would use this particular tool because it was brand new. And after twisting the arm of the PR person, who thought that I was after a freebie, which I kind of was, I was able to use the tool to basically speak my show. And that was very interesting. You know, it wasn’t perfect. I gave it the minimum of half an hour’s worth of The Wizard of Oz, so we didn’t get very far in the story. But nevertheless, it was good enough to give something that sounded like me. It was obviously, recognizably me. It wasn’t necessarily as emotive as, you know, me in real life is, but it’s certainly a good start. And that’s why I say it can be really useful for small corrections. It’s probably not the sort of thing that you would feed an entire newspaper article into and hope that that would, you know, work, but it still works really, really well. Yeah.
Dusty Rhodes 9:47
Well, that’s what I’m thinking of, because if you’re working with a celebrity or you’re working with your CEO, I mean, these people are not easy to get to, and you see a mistake like that. You could actually have the AI trained with the CEO’s voice and edit it, because nobody wants, you know, some high executive at that company to come out saying the wrong thing.
James Cridland 10:08
No, no, indeed. And you also have an issue where actually it’s one thing finding the time for that person to come back into the studio, but then you have to make sure that the microphone is in exactly the same place. You need to make sure that, you know, if they had a sore throat from the previous night’s football, then they should shout a bit first, and all of that, because otherwise you can still hear the edit. Yeah, so it’s actually really helpful to be able to have a bank of their voice. And by the way, I should point out that they’ve worked very hard on the ethics as well, both of the companies that we’re going to talk about. Descript has worked very hard on the ethics, to make sure that the first thing that you have to say is, “I agree that this company can take a voiceprint and can use my voice,” and blah, blah, blah. If you don’t say that, and that is checked, then this particular voice won’t be available for you. So there’s a lot of talk about deepfakes, and there’s a lot of talk about [unclear]. And all of this, you know, to a point gets rid of some of the concerns and some of the worries there.
Dusty Rhodes 11:20
Now, everything we’ve spoken about to this point is child’s play compared to what we’re about to mention. And what you were saying there about voiceprints and everything is really good, because there’s a new service called Marvel AI, and they’re just taking it to new levels. I mean, we were just talking about if you have a CEO or a celebrity you’re working with, and you need to change it from kilometers to miles or whatever. What this service is doing, tell me about that, because it just blows my mind.
James Cridland 11:53
So one of the things that they have pointed out is that there are, of course, a lot of people listening to podcasts all over the world. And it may come as a surprise, but there are some parts of the world where they don’t speak English and where people would much rather hear their podcast in a language that they understand. And so what Marvel AI, which is owned by Veritone, have done is that they have married a synthetic voice with a translating tool as well. And they actually use both machine learning translation and human beings, because there are certain words and things that you just want to make sure work correctly. So you end up with something which you may have recorded in English. It will turn that into text in English, and then it will translate it into, let’s say, Spanish. So you already have a nice transcription in Spanish, which is a great help. But then what it will do is, with your voiceprint, it will take your voice and make you speak a different language, which is amazing. Would you like to hear a little bit of me speaking Spanish, Dusty?
Dusty Rhodes 13:09
Absolutely. Bring it on!
James Cridland 13:12
[Clip plays: James’s synthetic voice speaking Spanish.] Now, I should say that I do not speak a word of Spanish. I have no idea what I just said. Isn’t that the weirdest thing to say? I have no idea what I just said. But that is me speaking Spanish, with Spanish mouth sounds, which I don’t actually know how to do. And I’m told by people who speak Spanish that it is a bit recognizably synthetic, but it’s also absolutely fine. And it’s much better for them to listen to a Spanish voice than it is to try and listen to me speaking in my hoity-toity English accent. It is far easier for them to understand. Isn’t that fantastic?
Dusty Rhodes 14:17
And this is why I was asking at the start of the podcast to listen to your accent, because, you know, it’s very identifiably English. But then listening to that translated Spanish clip, at first it doesn’t sound like you, but then it’s kind of, oh no, hang on, yes, there are just certain qualities about that where I know that it is your voice. What I think is quite clever, and this is a real thing in Spain, a friend of mine moved there, and he said one of the culture things in Spain is that, especially if you’re a man, to establish your dominance and so on, men will typically speak in a lower tone. Right? Just “I am the man”, that kind of stupid carry-on. Yeah. But it’s done that with your voice. So we’re used to translation services like Google Translate, where you can put in English and you get Spanish text. What you have just done is with your audio: you recorded something in English, and then it gave you your voice speaking in Spanish,
James Cridland 15:19
speaking in Spanish, which is just fascinating. And you can see how useful this would be. So what PodNews does is I do a daily newsletter all about podcasting news. But Bryan Barletta, who works with me, does a weekly podcast and a weekly newsletter all about ad tech in podcasting, very, very deeply specialist. But, you know, clearly, particularly in the US, there’s an awful lot of Spanish speakers who would be interested in that sort of thing. And so Bryan ended up working with them. And this needs far more training. So they asked for three hours of my voice, which is quite a lot, it turns out. But what was interesting is they basically said, we just want three hours. Maybe you’ve got some podcasts, maybe you’ve got some other things that you have already recorded. Can we have those, but just your end, please? And it just so happens that I’ve got quite a few bits of audio like that. And so I was able to give them a great big dump of my voice. And that works, you know, really well for them. But there are real opportunities, I think, if you have a message. If you are a company in particular, and you are podcasting, and you have a message about the products that you have or the service that you offer, then your CEO could be giving that message in a different language to the language that your CEO actually speaks. And that is a tremendously exciting thing.
Dusty Rhodes 17:02
I think globally, it’s important, because, I mean, English is widely spoken, but Chinese is a massive language, Arabic is another massive language, Spanish is another massive language. And then when you look more locally, like around Europe, you get German, Italian, French, all of those are different, and it is a little bit of a struggle when working on that international level. But for any kind of video presentation or for a podcast or whatever, to be able to have that person’s voice in another language is just stunning. But then it gets a little bit more nuanced as well, because it can translate your voice from English into American English.
James Cridland 17:48
Very strange thing, because it is an American company. Yes. What they’ve ended up doing, of course, is that they’ve ingested my voice with an English accent. And there are some things that I say in English, obviously, in a British English accent. But there are different vowels and different voice sounds if you’re talking in American English. And so what you can do is you can ask this system to end up speaking in American as well. And so this is what I sound like, apparently, speaking in American. Are you ready for this? Here we go. “Spend any time with any podcast directory, and you’ll see a wild amount of English-language shows, but English is the primary language for less than 5% of the global population. Thanks to some clever new tech, podcasts could be made available in multiple languages without sounding like a robot.” Now, that is exactly the same thing as I just said in Spanish earlier on. But isn’t that the weirdest thing?
Dusty Rhodes 18:52
That is so
James Cridland 18:57
that’s me speaking in a Texan accent.
Dusty Rhodes 19:01
One of the big pitches of this company, marvel.ai, and that’s the website, marvel.ai, is that they’re kind of saying, if you’re working with a high-end celebrity, and I think Tim Cook was one of the examples that they had on the video on their website, you can do the three hours of voice training with a celebrity, whoever it is, and then they never have to come to a studio again to do any kind of recording for a radio commercial or for a TV commercial or video presentation or whatever it is. Do you think it’s capable of doing that?
James Cridland 19:37
Whether it’s capable of doing that at scale... so, you know, I don’t think you could narrate an audiobook by doing that, but I would certainly have thought that, once you’ve got the voiceprint in the system, it’s possible to do, you know, lines for commercials and that sort of thing. Maybe, if you were doing, I don’t know, a betting commercial, perhaps you could automatically update it with the latest odds. Or you could, you know, give what the weather forecast is going to be, and all of that sort of thing. And I’m sure that that is very, very possible. Now, I do understand that there have been some movies where they have used a little bit of this technology. You know, if you’re doing a biopic about somebody, and you would really have liked them to have said something on tape when they were younger, then you can do a little bit of that sort of work as well. So I think it’s certainly capable of a lot of really interesting things. And I think the concern, obviously, is that, well, you could just get anybody to say anything, couldn’t you? And of course you could. And you should have a look at this: there’s a YouTube video from a good comedian who was trying this sort of technology out. And he was given a piece of text that he recognized as being a piece of text from David Attenborough’s Life on Earth. And so he thought, right, instead of me reading this out, I’m going to just play the DVD in, of David Attenborough, so that I’ve got a fake David Attenborough. 
So he’s ended up doing a deepfake of David Attenborough, where, you know, David Attenborough is saying, “I don’t know why you bother, we’re all going to hell in a handcart,” and all this kind of stuff, which is very, very clever. But, you know, that’s sort of going one step further, I think.
Dusty Rhodes 21:54
It is insane. What they can do with deepfakes in video is unbelievable. And then when you match it up with what they’re able to do with this voiceprint, I mean, how can you actually tell the real thing from somebody else? Have you any idea what kind of protections are built in? I mean, you were saying that you start off by saying, “I agree to my voice being voiceprinted,” or whatever, and that’s all well and good. But how does somebody like Tim Cook protect himself from having some computer-generated version of his voice telling people to buy Microsoft computers?
James Cridland 22:29
Yeah, well, I mean, I think partially, the training really only works if you have enough of the right type of quality voice, you know, in the right thing. So I think partially it is up to that legal thing of getting a voice to agree on tape that this is okay for them to do. I ended up signing, it was two pages of quite densely written words, which was basically all of the things that this company could do with my voice, and wouldn’t do with my voice, and what I could do if I changed my mind. Because, of course, it’s fine me signing this today, but if all of a sudden the company, you know, goes mad and does all kinds of things that I don’t like, then I would quite like to be able to say, no, you can’t use my voiceprint anymore. And so, of course, there’s that kind of side of it as well. But I think, as with all new technology, there are ways of using it for good, and there are ways of using it that you probably don’t want. But, you know, the amount of deepfake stuff that you can do with video, and the amount of things that you can do with spoofed email, I’m sure that we’ve all had a fair amount of that, a fair amount of texts that look as if they’re coming from the post office that really aren’t coming from the post office, and then all of a sudden they’ve taken, you know, 20 euro away from you. All of those sorts of things are very possible now. So I think it’s just something else for us to be a little bit cautious of.
Dusty Rhodes 24:10
Let’s get back to podcasting: good old-fashioned human beings speaking in front of microphones. Yes. And podcasting, I mean, this is your area, and you’re well up on it, and you know everything that’s going on. Now, part of how you described yourself, and I don’t know if you came up with the word, but it’s a brilliant word, was when you were talking about radio, and we were very involved with DAB and digital radio once upon a time. You described yourself as a futurologist. I love that word. Jules Verne would have been very proud of you. And if you were to look into your crystal ball just at podcasting, and where it is now, and let’s say give it a decent piece of time, four years, so it’ll be 2025, where do you reckon podcasting is going to be?
James Cridland 25:02
I think when you look at audio consumption as a whole, you look at both radio and you look at podcasting, and you look at streaming music, you can very clearly see that younger people, and that’s people under 35, so a long time ago for both of us, are consuming a lot of on-demand audio, a lot of on-demand content. And you look at mobile phones: nobody’s really using the mobile phone to listen to live radio. They’re using the mobile phone for on-demand content, to go and find, you know, the latest episode of something that they want to have listened to, or maybe they’ll ask their Google Assistant for the latest news update or whatever it might end up being. And that’s what they’re using it for. And so therefore, on-demand audio in all of its shapes and guises is very clearly going to be the future of audio. And live radio won’t go away, but very much, you know, focused on on-demand audio in the future. So I think that gives us a couple of different changes. It gives us one change, which is that we need to be thinking about on-demand audio first, not live radio first. And for many broadcasters, that’s quite a step change. And I think secondly, it means that we just need to really focus on how people can find this great audio: how they’re going to search for it, how they’re going to call it up, how they’re going to play it back, what devices they’re going to play it back on. All of that kind of stuff is really important too. But certainly, when you have a look at Europe, for example, podcasting is tremendously popular in places like Sweden, Norway and Denmark, and I think one of the reasons for that is that all of the people in those countries basically speak English very well, as well as their own local languages. 
And so therefore, they’re actually seeing that they’ve got the best of both worlds. They’ve got lots of local content that they can go and have a listen to in Swedish, or in Danish, or in Norwegian, but then they’ve got all of the American stuff, all of the British stuff. So they’ve got a wide range of content. When you start looking at countries like France, or Spain, or Germany, where English is spoken less, then actually you see a proportional, you know, sort of catch-up in terms of podcasting. And podcasting isn’t as popular yet in these countries, because there’s much less content available to a typical person. But I think all of that is changing too. And certainly when you have a look at France and Germany in particular, they are doing fantastically in terms of growth of the podcasting space, in terms of what is available. And, you know, one of the top 10 paid-for podcasting channels in Apple Podcasts right now is a German one, which I think really goes to show how quickly the German podcasting landscape is growing, and how much podcast consumption is going on there as well. So I think it is certainly going to be something which will continue to grow. Around about a quarter of everybody in most parts of Europe listens to a podcast every week. That figure’s about a third in places like the US or here in Australia. But I think that those figures are going to continue going up, and to a degree replace a bit of radio, but also supplement radio as well.
Dusty Rhodes 28:50
I think the more radio and television start offering on-demand audio, the more popular it’s going to become with people in general. But I think the other side of on-demand then is niche content, because it’s like Netflix, or whatever. It’s kind of like, okay, well, I want to watch a very specific type of program, and it’s there as well as the big blockbusters. And I think that’s great for companies who want to get into podcasting, because there will be an audience for widgets, or whatever it is that your company or your particular brand does.
James Cridland 29:25
Yeah, absolutely. And, you know, I talk about this an awful lot: the old-fashioned world of radio, as was, used to look at communities, but they were local communities, because that’s how radio works. You know, you have a local transmitter in a field, and it reaches a certain amount of people. And that’s basically how radio stations work. Whereas podcasting is global. And you’re not reaching local communities, although you can. You’re reaching communities of common interest. Yes, that’s what podcasting is all about. And you end up with a lot of niche communities who are enjoying what you are doing. I mean, you know, there are podcasts on all kinds of things. There are podcasts if you like coffee, there are podcasts if you write CSS code for a website. There are, I think, three or four podcasts that I’ve been on around expats from the UK who now live in Australia. You know, there’s all of this sort of thing. And I think it’s testament to the openness and the level playing field that podcasting has that anybody can be in those platforms alongside the Joe Rogans of this world and the Michael Barbaros. You know, there they are with their own show. And if people know what to search for, or if you’ve made it really obvious, then it’s a great place to reach new communities that share the common interests that you do.
Dusty Rhodes 30:51
Well, James, it’s been an absolute delight and a very ear-opening experience listening to you and everything that can be done with artificial intelligence and voices and transcription into different languages and stuff like that. So thank you very much for your time. I highly, highly recommend, actually, that you visit James’s website, PodNews.net, which gives you a free daily briefing email about what’s happening in podcasting. And it’s also available as a podcast, of course, and who knows, soon maybe in Spanish as well. Just search your podcast player for PodNews. Of course, if you’d like to chat about any of the topics that we’ve discussed today, or you have any questions you’d like answered either directly by myself or on the podcast, just email me at email@example.com. And of course, you’ll find tons about podcasting specifically for companies and brands on our website at DustPod.ie. Until next time, from myself, Dusty Rhodes, thank you very much for listening. Take care. Open the pod bay doors. This conversation can serve no purpose anymore.