1 00:00:07,360 --> 00:00:10,560 EASIT - Easy access for social inclusion training 2 00:00:12,880 --> 00:00:16,023 Welcome to Unit 4 - The profession 3 00:00:16,023 --> 00:00:18,200 Element 1 - Personal skills 4 00:00:18,200 --> 00:00:24,000 Interviews with professionals: Audio description - part 2 5 00:00:24,400 --> 00:00:27,640 The video was produced by the University of Trieste 6 00:00:27,640 --> 00:00:31,218 in cooperation with Zavod Risa. 7 00:00:32,800 --> 00:00:34,977 This is the second part of the video 8 00:00:34,977 --> 00:00:38,401 featuring interviews with esteemed experts 9 00:00:38,401 --> 00:00:40,003 in the field of audio description: 10 00:00:40,003 --> 00:00:42,000 Bernd Benecke 11 00:00:42,000 --> 00:00:43,852 Louise Fryer 12 00:00:43,852 --> 00:00:45,653 Joel Snyder 13 00:00:45,653 --> 00:00:47,594 and Christopher Taylor. 14 00:00:47,680 --> 00:00:49,572 The interviews were conducted 15 00:00:49,572 --> 00:00:51,464 by the University of Trieste. 16 00:00:52,400 --> 00:00:56,960 The experts explain how they see  the future of their profession   17 00:00:56,960 --> 00:01:00,823 and whether they think that implementing easy to understand 18 00:01:00,823 --> 00:01:02,807 strategies in audio description 19 00:01:02,807 --> 00:01:05,880 would become a part of their profession. 20 00:01:10,789 --> 00:01:13,963 Audio description is going to increase. 21 00:01:13,963 --> 00:01:15,109 We see this already with 22 00:01:15,109 --> 00:01:17,744 all the streaming services coming up. 23 00:01:17,744 --> 00:01:19,698 In this time of pandemic of course 24 00:01:19,698 --> 00:01:22,087 everything gets streamed. 25 00:01:22,179 --> 00:01:24,818 Conferences are streamed 26 00:01:24,818 --> 00:01:26,567 and lectures at the universities 27 00:01:26,567 --> 00:01:27,919 are streamed. 28 00:01:28,178 --> 00:01:30,607 All this is material that should be 29 00:01:30,607 --> 00:01:31,619 audio described. 
30 00:01:31,619 --> 00:01:35,331 So, I think audio description is really going up. 31 00:01:35,706 --> 00:01:37,201 How do I see the future? 32 00:01:37,330 --> 00:01:39,003 The need for trained audio describers 33 00:01:39,003 --> 00:01:40,468 is currently increasing, 34 00:01:40,468 --> 00:01:42,302 especially as more and more content 35 00:01:42,302 --> 00:01:43,366 is streamed. 36 00:01:43,550 --> 00:01:45,545 I expect this to continue. 37 00:01:45,637 --> 00:01:46,358 There are moves to introduce 38 00:01:46,488 --> 00:01:48,520 automatic visual recognition. 39 00:01:48,744 --> 00:01:50,185 Progress is slow because AI 40 00:01:50,185 --> 00:01:53,845 can identify static images quite well 41 00:01:53,845 --> 00:01:54,849 but it can't yet link 42 00:01:54,849 --> 00:01:57,131 continuing characters or locations. 43 00:01:57,242 --> 00:01:59,157 It might say there's a woman on a beach 44 00:01:59,157 --> 00:02:01,968 but it won't be able to identify 45 00:02:01,968 --> 00:02:05,240 the same woman on the same beach in the next shot. 46 00:02:05,326 --> 00:02:06,892 Humans are very good at picking up 47 00:02:06,892 --> 00:02:09,363 contextual clues to make a coherent whole. 48 00:02:09,492 --> 00:02:11,142 I expect workflows will be designed 49 00:02:11,142 --> 00:02:13,192 where AI can do some of the work 50 00:02:13,339 --> 00:02:15,075 and a human describer does the rest 51 00:02:15,075 --> 00:02:17,190 but it's still a long way off. 52 00:02:18,039 --> 00:02:23,619 I often talk about two aspects 53 00:02:23,729 --> 00:02:26,371 that are on the horizon for audio description 54 00:02:26,463 --> 00:02:28,956 that I think will be very important. 55 00:02:29,177 --> 00:02:32,292 First is the technical aspect. 56 00:02:32,292 --> 00:02:34,506 Soon enough the use of smartphones, 57 00:02:34,506 --> 00:02:36,210 if it isn't already, 58 00:02:36,302 --> 00:02:37,197 will be ubiquitous. 59 00:02:37,197 --> 00:02:39,210 Everyone will have a smartphone.
60 00:02:39,357 --> 00:02:42,037 It will be like a home telephone 61 00:02:42,176 --> 00:02:44,727 or having a home address. 62 00:02:44,820 --> 00:02:48,567 I think apps now, and there are half a dozen of them 63 00:02:48,567 --> 00:02:49,658 around the world, 64 00:02:49,767 --> 00:02:52,020 apps that you download 65 00:02:52,204 --> 00:02:54,958 and then that app allows you to 66 00:02:54,958 --> 00:02:58,777 pair a downloaded audio description script 67 00:02:58,870 --> 00:03:01,326 with whatever is on your television right then... 68 00:03:01,326 --> 00:03:03,662 Or in the movie theater, it listens 69 00:03:03,662 --> 00:03:06,426 and it automatically syncs the description. 70 00:03:06,513 --> 00:03:08,813 I think this is significant. 71 00:03:08,967 --> 00:03:11,562 Some of those apps have the capability 72 00:03:11,562 --> 00:03:13,732 of including captions, 73 00:03:13,732 --> 00:03:16,136 including sign interpretation, 74 00:03:16,136 --> 00:03:20,213 enhancing sound for people who are hard of hearing... 75 00:03:20,213 --> 00:03:23,499 not just turning up the volume 76 00:03:23,573 --> 00:03:27,140 but also download of an alternate language track 77 00:03:27,140 --> 00:03:29,765 so that grandma, who only speaks Spanish 78 00:03:29,839 --> 00:03:33,130 can go with the family to see a film 79 00:03:33,130 --> 00:03:34,782 in an American movie theater. 80 00:03:34,782 --> 00:03:37,146 The film is playing in English but she's using that app 81 00:03:37,220 --> 00:03:39,491 and hearing the dub in Spanish. 82 00:03:39,602 --> 00:03:43,569 Or perhaps reading subtitles. 83 00:03:43,661 --> 00:03:48,275 So that's the technical aspect that has already begun to take hold 84 00:03:48,404 --> 00:03:50,333 but it's still in its infancy. 85 00:03:50,333 --> 00:03:52,814 I think it will become huge 86 00:03:52,888 --> 00:03:54,808 within the next 10 years. 87 00:03:54,900 --> 00:03:55,883 That's my prediction.
88 00:03:55,957 --> 00:03:58,730 The other aspect 89 00:03:58,860 --> 00:04:01,389 is more of a programmatic aspect. 90 00:04:01,518 --> 00:04:05,380 I think that sooner or later, to a certain extent... 91 00:04:05,565 --> 00:04:07,426 ...not totally by any means... 92 00:04:07,518 --> 00:04:10,160 audio describers are gonna go out of business. 93 00:04:10,160 --> 00:04:12,701 What I mean by that is that 94 00:04:12,701 --> 00:04:15,288 more filmmakers,  more videographers, 95 00:04:15,399 --> 00:04:19,959 more people dealing with the arts that are described 96 00:04:19,959 --> 00:04:23,740 will become aware of description and start using it 97 00:04:23,832 --> 00:04:25,761 as an aesthetic innovation, 98 00:04:25,761 --> 00:04:28,202 building it in from the get-go. 99 00:04:28,202 --> 00:04:31,064 Maybe there's a narrator character 100 00:04:31,175 --> 00:04:34,284 who speaks descriptively, 101 00:04:34,358 --> 00:04:37,988 actually describes as the program continues 102 00:04:38,172 --> 00:04:41,657 in a way that is not jarring. 103 00:04:41,657 --> 00:04:44,629 It seems as though it's part of the program. 104 00:04:44,722 --> 00:04:46,487 It is built in from the beginning 105 00:04:46,487 --> 00:04:48,814 so that it's no longer description... 106 00:04:48,814 --> 00:04:51,427 is no longer considered some sort of post-production activity 107 00:04:51,427 --> 00:04:55,421 that we don't need to worry about that we're making a film 108 00:04:55,421 --> 00:04:58,974 and the localization folks will deal with that. 109 00:04:58,974 --> 00:05:01,805 No, it can be used as an aesthetic innovation. 110 00:05:01,805 --> 00:05:04,233 It can be used in ways that 111 00:05:04,547 --> 00:05:10,708 will make for more significant  aesthetics in the film production, 112 00:05:10,800 --> 00:05:12,182 production of an idiom. 113 00:05:13,012 --> 00:05:15,229 Firstly, technological. 
114 00:05:15,229 --> 00:05:19,806 The personalization revolution 115 00:05:19,899 --> 00:05:22,041 will continue to break new ground 116 00:05:22,576 --> 00:05:27,395 and depending on market demand and awareness of the involved stakeholders 117 00:05:27,395 --> 00:05:31,206 it will bring so far unknown possibilities 118 00:05:31,206 --> 00:05:34,702 for the blind and sight-impaired community. 119 00:05:34,702 --> 00:05:38,138 Secondly, the question of accessibility... 120 00:05:38,138 --> 00:05:40,942 and this movement towards universal design... 121 00:05:40,942 --> 00:05:43,465 This will see a change in attitudes 122 00:05:43,465 --> 00:05:46,347 about how audio description is considered 123 00:05:46,347 --> 00:05:48,302 among the public at large 124 00:05:48,302 --> 00:05:50,317 and among persons with sight loss. 125 00:05:50,317 --> 00:05:52,085 As regards screen AD 126 00:05:52,085 --> 00:05:56,908 the introduction of audio description and the assistance of end-users 127 00:05:56,908 --> 00:06:00,038 at the beginning of the filmmaking process 128 00:06:00,038 --> 00:06:01,330 will turn AD from being 129 00:06:01,330 --> 00:06:03,106 an addition to a film product to 130 00:06:03,106 --> 00:06:06,017 being an integral part. 131 00:06:06,017 --> 00:06:07,815 The same logic applies to the museum 132 00:06:07,815 --> 00:06:10,477 where layout and directionality 133 00:06:10,477 --> 00:06:14,271 will form part of the museum design process. 134 00:06:14,436 --> 00:06:17,063 So as museums change their role in society... 135 00:06:17,063 --> 00:06:18,733 which they have been doing for some time now... 136 00:06:18,880 --> 00:06:20,305 through hands-on exhibits... 137 00:06:20,305 --> 00:06:23,784 attractive layouts, technological enhancement... 138 00:06:24,000 --> 00:06:26,381 the provision of AD will be considered 139 00:06:26,455 --> 00:06:30,000 just a normal part of the plan. 140 00:06:30,111 --> 00:06:34,348 Thirdly, there are some ethical considerations. 
141 00:06:34,478 --> 00:06:37,436 A recent publication on diversity 142 00:06:37,436 --> 00:06:40,790 points out how even the most well-meaning describers 143 00:06:40,790 --> 00:06:44,885 fall foul of political incorrectness. 144 00:06:44,885 --> 00:06:47,273 So this manifests itself for example 145 00:06:47,273 --> 00:06:48,526 in the description of characters 146 00:06:48,526 --> 00:06:53,175 in terms of race, gender, age, disability... 147 00:06:53,292 --> 00:06:55,934 This publication refers specifically 148 00:06:55,934 --> 00:06:57,308 to theater AD, 149 00:06:57,308 --> 00:06:59,471 though its recommendations can equally apply 150 00:06:59,471 --> 00:07:03,153 to all kinds of screen products as well. 151 00:07:03,284 --> 00:07:06,035 For example, it was noted that 152 00:07:06,126 --> 00:07:08,858 white as a skin color 153 00:07:08,858 --> 00:07:10,503 was never highlighted 154 00:07:10,503 --> 00:07:11,913 whereas black was. 155 00:07:11,913 --> 00:07:13,719 Women's appearance came 156 00:07:13,719 --> 00:07:15,759 in for much more description than men's... 157 00:07:15,759 --> 00:07:19,238 So, we can thus expect 158 00:07:19,238 --> 00:07:20,937 that describers in the future 159 00:07:21,031 --> 00:07:24,325 will be coached on how to handle description 160 00:07:24,325 --> 00:07:26,325 in a non-discriminatory way. 161 00:07:35,687 --> 00:07:38,050 Easy to understand strategies 162 00:07:38,180 --> 00:07:39,955 could be useful for some types of audience 163 00:07:39,955 --> 00:07:42,555 and some types of content. 164 00:07:42,663 --> 00:07:44,232 I'm not ruling it out 165 00:07:44,232 --> 00:07:45,944 but some blind or partially blind people 166 00:07:45,944 --> 00:07:47,623 revel in word pictures 167 00:07:47,623 --> 00:07:48,624 and I wouldn't want to 168 00:07:48,624 --> 00:07:50,119 deprive them of that.
169 00:07:50,250 --> 00:07:53,684 Finally, do you think that implementing 170 00:07:53,684 --> 00:07:55,529 easy to understand strategies in description 171 00:07:55,529 --> 00:07:57,389 will become a part of the profession? 172 00:07:57,389 --> 00:07:59,576 I think so. 173 00:07:59,576 --> 00:08:00,934 You know, listenability... 174 00:08:00,934 --> 00:08:02,716 easy to understand... 175 00:08:02,716 --> 00:08:05,180 that's going to be a plus 176 00:08:05,422 --> 00:08:06,918 for really anybody 177 00:08:06,918 --> 00:08:08,227 listening to description. 178 00:08:08,227 --> 00:08:09,328 And again... 179 00:08:09,328 --> 00:08:10,615 it can be a sighted person 180 00:08:10,615 --> 00:08:13,565 who is experiencing an audio movie. 181 00:08:13,565 --> 00:08:16,234 They love that film they saw last week 182 00:08:16,308 --> 00:08:17,743 and they're going on a long car trip... 183 00:08:17,743 --> 00:08:19,572 they want to experience the film again... 184 00:08:19,572 --> 00:08:21,600 They're not going to have a television playing 185 00:08:21,600 --> 00:08:22,746 or a screen, hopefully, 186 00:08:22,893 --> 00:08:24,113 while they're driving 187 00:08:24,113 --> 00:08:24,939 but if they hear 188 00:08:24,939 --> 00:08:27,465 the original audio track with description 189 00:08:27,465 --> 00:08:30,800 and it's done in a way that is easy to listen to... 190 00:08:30,800 --> 00:08:33,597 I think that makes all the difference in the world 191 00:08:33,597 --> 00:08:35,664 and allows them to vividly 192 00:08:35,664 --> 00:08:38,379 experience the movie once again. 193 00:08:38,379 --> 00:08:44,936 It increases the reach of audio description 194 00:08:44,936 --> 00:08:47,771 and that's going to be good for all of us. 195 00:08:47,863 --> 00:08:49,322 Well, they should... 
196 00:08:49,396 --> 00:08:51,025 because the aim is always to provide 197 00:08:51,025 --> 00:08:53,906 the best possible description 198 00:08:53,906 --> 00:08:57,450 to be understandable by most people 199 00:08:57,450 --> 00:08:59,362 and the highest number of people. 200 00:08:59,436 --> 00:09:03,026 If concision and clarity 201 00:09:03,026 --> 00:09:05,000 are the name of the game... 202 00:09:05,000 --> 00:09:08,002 and what you're doing with EASIT... 203 00:09:08,002 --> 00:09:09,627 should definitely become 204 00:09:09,627 --> 00:09:11,718 a part of the logic 205 00:09:11,718 --> 00:09:14,184 behind any other description. 206 00:09:14,184 --> 00:09:15,656 Well, I don't know. 207 00:09:15,656 --> 00:09:18,000 What I do not think is that we get 208 00:09:18,000 --> 00:09:20,362 an easy to understand description 209 00:09:20,362 --> 00:09:23,844 in addition to the normal audio description 210 00:09:23,844 --> 00:09:26,553 because that would mean you would need more money 211 00:09:26,553 --> 00:09:29,026 and there is no money in the field. 212 00:09:29,026 --> 00:09:32,454 But if this project can offer us 213 00:09:32,454 --> 00:09:34,127 some strategies on how to make 214 00:09:34,127 --> 00:09:37,360 audio description easier to understand... 215 00:09:37,360 --> 00:09:40,754 I would be grateful to use this.