Software development and screen readers at 450 words per minute (vincit.fi)
416 points by mieky on Aug 28, 2017 | 106 comments


I wonder how many times in a regular day a blind person nearly throws the computer out the window because some application randomly does something inaccessible and they get stuck.

Some tray app from Dell popping up a dialog about your trial backup subscription expiring, with no standard/accessible UI? Things like that.

This is what surprises me most about not having a screen. I get that a blind person doesn't NEED the screen, but when you want to ask someone about some code, or need help from a sighted person, the screen might help... I admit it does look cool though.


I'm a blind developer and I've found Windows 10 to be pretty good about not throwing up popups if you configure it well. This is because we have a good IT group who builds machines from scratch, so there's no trial software. I build my personal computers from scratch, so I install the OS myself and avoid junk software. When Windows does have notifications it's pretty good about putting them in the notification center, so I can look at them when I want instead of having to deal with a notification dialog right away. I had the IT group find me an old external monitor. Although I don't need it, it's much easier to have the screen at eye height when a co-worker is standing than requiring them to hunch over and look at a 13 inch laptop screen.


I think the D programming language site is accessible to the blind, but I don't know what inadvertent accessibility problems there might be with it. If you could please have a look and let me know about any difficulties, I'll get them taken care of.

https://dlang.org/


The Khan Academy people wrote a JS script called tota11y that will let you inspect your site: http://khan.github.io/tota11y/
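If you just want a quick look at a page you're testing, a minimal sketch is to inject it from the browser console (the script path below is hypothetical; point it at wherever you've downloaded the tota11y build):

    // Minimal sketch: inject a downloaded copy of tota11y into the current page.
    // The path is an assumption -- adjust it to your local or hosted build.
    const s = document.createElement('script');
    s.src = '/assets/tota11y.min.js';
    s.onload = () => console.log('tota11y loaded; look for its toolbar button');
    document.body.appendChild(s);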


The site appears to generally be accessible, with the exception of the interactive tutorial. I'm not a web accessibility expert, so I don't know the details. There are two major issues. The biggest one is my inability to arrow through the code in an edit field. I can read the text outside of the edit field, but it's difficult since line numbers are interspersed with the text. My assumption is that you're using an open source editor which is inaccessible, but I have not had the time to look at the tutorial on GitHub. The second issue is that there are buttons that my screen reader reads as just "button". While run, edit, and format are read properly, several other buttons are not.


Thank you! This is just what I was looking for. I filed a bug report:

  https://issues.dlang.org/show_bug.cgi?id=17794
and we'll get it fixed.

I'm a little unsure what you mean by arrowing through the edit field. The cursor moves around as I press the arrow keys, as it would in a text editor.


> have a look


> > have a look

Please don't do this, on the Internet or anywhere else. Blind people don't need you to speak in ghetto terms or start self-moderating the things you say to any greater degree than you (probably) normally would.


I agree, but don't you think you are contradicting yourself by jumping in to defend all blind people?


I also "watch movies".


> This is what surprises me most about not having a screen.

That thing on the right of the photo is a closed laptop. This person does have a screen, it's just not constantly in use. (Not "in sight", as the article puts it.) They can open up the laptop to show stuff to others when needed. This is how my blind colleague does it.


I've always wondered: is performance an issue with your colleague? Does he need to put in extra hours or effort to finish some jobs? Or maybe he can be faster than the average developer at other tasks?


No, they get the things done that they are assigned. Watching them work, yes, some things are clearly less efficient than for someone who can see the whole screen at once. But overall it works well for our particular setting.


If your screenreader is any good, it'll be a piece of proprietary software that knows how to OCR the framebuffer and doesn't give a shit about "standard" UI.


It's still very difficult to navigate a custom-drawn UI, even if you can read what the text on the form says.

Typically these crap apps have custom-drawn widgets and buttons with no tab order and no way to click without using the mouse, etc. The screen reading bit is pretty easy (fall back to OCR, as you mentioned). But once the screen reader says "please click this button" and there are no (known) buttons on the form, you are stuck.


Synthetic mouse clicks.

These are pretty much solved problems. I know because Nuance solved them. Dragon Naturally Speaking both OCRs the screen and sends synthetic mouse clicks to windows that use custom UIs, such as Microsoft Office, when you say something like "click OK".

Implementing this in an accessibility suite for the blind is definitely doable. It would be tedious and take a lot of testing to get right, but that's why the best solutions are proprietary.


I'm wondering if they use screen sharing. It'd be dead simple to share your screen to someone and have them look at it, you don't even need your own monitor.


As some sort of comparison, imagine having to call an IT support guy to come restart your computer every 15-30 minutes... I'd lose my sanity before a single day was out.

I imagine you very quickly identify software / features that cause such interruptions and ruthlessly prune them from your environment.


I wonder what the pair-programming experience is like when you have one vision impaired programmer and one sighted developer. Anyone here have experience with this?


450 wpm. I can't even decipher 2 words out of that. It is amazing if he can understand speech that fast. Did his brain just rewire the visual cortex to process audio?

Makes me wonder what a human would be with double the neurons in the brain.


I've been making a habit of listening to audiobooks, podcasts, and YouTube videos at increasingly faster speeds, and after listening to the clip two or three times can understand about 80% of it. My comprehension problem with it right now is the style of the voice, not the speed it's talking.

Like anything, I think understanding this computer voice at 450 wpm is a skill anyone could learn with practice.


Though I'm not blind, I would imagine it is like speed reading. Occasionally skipping words has no effect on the main idea of a sentence because the mind fills in the blanks. Anyway, disability-related tech news is very cool stuff and I hope to see a lot more of it on HN.


I read the article without listening to the provided clips and WOW. I'm not sure how people manage that.

For anyone interested in UI/UX design for blind people Kevin Jones did a talk on it https://www.youtube.com/watch?v=lHBuQLNIs1c

https://twitter.com/kevinrj?ref_src=twsrc%5Egoogle%7Ctwcamp%...

He also has done some really cool stuff with the BrainPort: http://www.channel3000.com/news/local-news/madison-made-devi...


There's an episode of TNG with aliens called the "Bynars", and their speech sounds exactly like the screen reader. It always amazes me how prescient TNG was sometimes (although accidentally in this case).

http://memory-alpha.wikia.com/wiki/11001001_(episode)


The use of the visual cortex for audio is a theory: https://www.scientificamerican.com/article/why-can-some-blin...


I was able to understand it reasonably well. I could understand it about as well as normal Spanish speaking (I took a few years of Spanish in high school); I can pick out the verbs, some nouns and guess the rest. With practice, I think I could understand it perfectly.


Your comment provoked the intriguing thought that a visually impaired hacker marine biologist might be the demographic that's best prepared to understand orcas.


After two or three listens, the only bit I couldn't make out was "(through a separate braille display) or synthetic speech". I imagine I could build up to that speed if I had to.


> A screen reader intercepts what's happening on the screen and presents that information via braille (through a separate braille display) or synthetic speech. And it's not the kind of synthetic speech you hear in today's smart assistants. I use a robotic-sounding voice which speaks at around 450 words per minute.

After listening to the English sample... maybe I'm being naive, but is it typical to be able to understand computer speech at that rate? I could barely make out a handful of words and probably wouldn't have believed there was a text to understand in there if the words weren't above it.


You work your way up to it. Open up a podcast app and start listening at 1.5x. After a while (days to weeks), that becomes comfortable. Then work your way up to 2.0x. Once you're at that point, listening to familiar shows (familiar people/voices), at 1.0x makes them sound drunk. They seem. to. speak. so. slooow.

Some podcasts I listen to at 3x, which feels near my limit of comfort (comprehension drops off at speeds higher than that), but I suspect nearly any normal person can get used to 2x speed. Once you are, you can absorb twice as much information in the same amount of time.

(The above applies to talk type podcasts. Music is best at 1x, and dramatizations and such are often best left at 1x to get correct pacing).


I've tried going faster but found that I had to concentrate too hard on listening to leave time for reflection, which is the whole point for me of listening to podcasts. I wonder if a blind person struggles with similar issues, where they have to moderate the speed to leave themselves room to actually think about what they heard.


Instead of listening to a podcast/audiobook once on normal speed, you could try listening to it twice on double speed.

* Audiobooks especially generally have only one core idea, so even if you miss a part, you're not missing much.

* By listening twice, you will 'revisit' the topic, increasing comprehension.

* Most podcasts/audiobooks actually don't have a lot of new ideas (especially after you've heard a lot of them), so in those cases you're only wasting half the time.


When you listen to a podcast at 3x do you focus solely on the podcast or are you able to do something else too? For example cooking? Or driving?


I pretty routinely listen to podcasts and audiobooks at 2-2.5x, and I could understand the English audio just fine. Not sure that I'd _want_ to listen to it that quickly, but I can understand why if that were your interface with your computer that you would.


I wish my Roku box would provide a speedup option, as well as my Comcast DVR.


You are comparing your first attempt at digesting rapid speech to someone who has done it for years. Yes, it's typical.

A new screen-reader user would start such an endeavour at normal speed, then gradually increase the speed. It's not much different from people listening to podcasts and audiobooks at 1.5x speed. The robotic voice is more of a help than a hindrance because of its consistency.


It may be typical if you are blind. It isn't if you are not (though obviously, you could train for it). Keep also in mind that the human brain has incredible capabilities when it's highly focused. Not having visual input removes a lot of distraction for our reptilian brain always scanning for threats.

In college, I had a friend who is blind and her screen reader speed setting was insane. I couldn't understand most of it despite being someone who watches videos and listens to audiobooks at 2x.

Her typing speed was also amazing. I have never seen anyone type that fast before or since.


The speech becomes surprisingly clear if you listen to it and read the paragraph at the same time.


Sure but that's obviously not how it's used. Also, for the first one which is the one he's actually using I couldn't make out a single word even when reading at the same time.


It's like learning a foreign language. At first, native speakers seem to speak so fast that all words flow into each other without distinction.

As you improve, you begin noticing words you already know, and when someone speaks deliberately slow, you can puzzle together sentences.

Eventually, you can understand speech at a normal conversational speed. Most people probably stop at that level, because they don't need more.

But e.g. when watching recorded lectures, I tend to crank up the speed to 1.5x or even 2x until I need to think more about something. The sped-up speech is mostly still quite understandable.

There is no reason to suppose you can't improve your hearing beyond the limits on humans' ability to form coherent thoughts from scratch and move their mouth in the right pattern. Just keep increasing the speed, attune yourself to the artificial voice and then increase it even more until you're at a level that makes untrained people feel like non-native speakers.


Yes, actually; having worked in speech synthesis for a while, I was surprised to find that's how frequent users generally do use it. Although IIRC 450 wpm is still on the high side - I think maybe 200-300 wpm was more common, albeit that's still pretty unintelligible to most of us. I imagine he can distinguish higher speech rates through quality headphones better than through, say, a phone speaker, though.


I wonder the same. I often listen to videos at 1.5-1.8x. I can listen faster, but after about 2-2.5x it stops being enjoyable, as I need full concentration to parse the speech and miss out on the visual experience. Perhaps being blind heightens the audio processing, or the visual distraction is simply absent, allowing full concentration on the audio?


I would imagine that the consistent robotic voice helps too, allowing large chunks of text and whole sentences to be processed as a single pattern, like when speed reading.


You're not alone, I have overheard many hours of this type of speech and never figured out how to listen to it. I think it is something you have to work up to a bit at a time.


It's gibberish to me too, but surely it's something that can be learned.


I downloaded it and set the tempo 60 percent lower, and it's still difficult. It's not just the speed...the quality of the synthesis is pretty bad. I bet it takes quite a bit of time to get good at it.


Aren't both clips English, just with different accents?


Finnish has highly regular pronunciation, with almost 1-to-1 correspondence between letters and sounds. In the first clip, English is read as if it were Finnish, using Finnish pronunciation rules. It works pretty well, since any actual Finnish would be pronounced correctly, and for a Finnish speaker it's quite easy to catch Finnish pronunciations of foreign words.


No. Without reopening the article, I thought one was Finnish read with an English accent. This was due to what the author had grown accustomed to - in his youth, screen readers did not have as many options?


>I'm reading English with a Finnish speech synthesizer

I think it's English but the robot voice is geared for Finnish speech patterns.


I think it was the opposite - English read with a Finnish accent.


It's not just an accent. It's reading the words as if they were Finnish, so totally different.


I would totally put a screen on his desk hooked up to a separate computer that just displays nonsense, so that people who don't know he's blind think he's just hanging out in the office watching the entirety of Strawberry Shortcake or something.


Totally should have The Matrix screensaver playing at all times.


Yeah or the front page of HN.


I was curious about the braille display mentioned, so I had a look: https://www.google.com/search?q=braille+display&source=lnms&...

These things are priced from $1k up to $12k. I understand it's complicated since there are tons of actuated pins needed, but wow.


They're actuated by piezoelectric strips. The PZT ceramic is low power and robust, but heavy-ish (lead), large-ish, and shockingly expensive.

History: https://nfb.org/images/nfb/publications/bm/bm00/bm0001/bm000...


> Windows is the most accessible operating system there is

I was under the impression macOS and iOS are the preferred platforms for people with disabilities?


I'm totally blind and bought a Mac several years ago hoping to learn iOS programming. As far as I can tell it's impossible to create a simple app without using drag and drop in Xcode. While it's technically possible to do drag and drop with VoiceOver, I can never get it to work correctly. Apple has not put out any documentation that I've been able to find on how to use Xcode when blind. This is disappointing since I find iOS to be much easier to use than Android. I also find VoiceOver on the Mac does not have quite as many shortcut keys as my Windows screen reader when reading the web; for example, I cannot jump to a heading at a specific level or move by region. While the easy availability of Unix tools used to be a point in macOS's favor, that's no longer unique: I find WSL to be quite good and accessible. Luckily the 11 inch MacBook Air was the best ultrabook I could get at the time, so I don't feel like I wasted my money even though I boot into Windows 95% of the time. For general use macOS is pretty good. I find the screen reader to be quite intuitive and there is a good tutorial to help you learn it. VoiceOver just doesn't have the breadth of shortcut keys or customizability of JAWS for Windows, which is my preferred screen reader despite its cost.


If you're discussing storyboards in Xcode (given you mentioned drag and drop), you actually don't need that feature and can lay out both iOS screens and the flows between them programmatically. I actually prefer writing views programmatically for iOS--it gives you readable diffs, which is especially helpful for code reviews.


I have not found an introductory tutorial that takes you from no iOS knowledge to a working app with just code instead of storyboards. Do you know of a tutorial that does this?


Here's one that might help for Objective C: https://medium.com/@danstepanov/your-first-ios-app-100-progr...

Alternatively this is the same tutorial using Swift: https://medium.com/@danstepanov/your-first-ios-app-100-progr...

Sadly the images in those articles don't have alt tags, but the images are well described by the surrounding text.

Googling "build iOS views programmatically no storyboard" should bring up some more resources. The rest of iOS development will be normal/consistent with other tutorials--the only thing that's different is learning to describe the view itself in code.


I haven't had a chance to actually run through the tutorial but looking at it I don't think I'll have problems following it. It's well written and the graphics are not required to understand what's going on.


There's a note in the article about the author's distaste for macOS and iOS because of VoiceOver's slow release schedule and how it doesn't work for his style of interacting.


I worked with a blind dev for years and he mostly used the command line.

He said this is the only truly accessible interface, where you get 100% control.


This holds true for everyone


I'd love to read more specifically how coding style makes things easier or harder. It'd be really interesting to know if there are some general rules in this space, and if so how they may or may not help sighted programmers read code. Really interesting perspective on a topic which is usually just a mosh pit of opinions.


I'm not blind, but I found this comment by a blind programmer really interesting. Block comments are way more important when using a screen reader: https://github.com/rust-lang-nursery/fmt-rfcs/issues/17#issu...


That would be interesting. I wonder, for example, if underscores are better or worse than camelcase. Or if screen readers know when to say "dot" vs "period". Or if screen readers are reasonable for odd stuff like regex syntax and multi character constructs like ->.


On a related note, I wish YouTube had an option to play a video faster than 2x (other than downloading the video to play in a local viewer like VLC).

Even as a sighted individual, listening to conference talks at high speed is convenient (and you can still pause if you want to look deeper into something).


I have the same wish. Though not ideal, you can work around it in the console quite easily:

document.querySelector('video').playbackRate = 3


Just create a bookmark with that code prefixed by "javascript:", then it becomes 1-click to activate:

    javascript:(function(){document.querySelector('video').playbackRate = 3;})();


There's a chrome plugin called video speed controller that can speed up any html5 video (including YouTube, vimeo, etc) to whatever speed you want. It's great!
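The core of such an extension is tiny. A rough sketch of the idea (the key bindings here are arbitrary, not the plugin's actual ones):

    // Rough sketch of what a speed-control extension does: listen for key
    // presses and nudge the playbackRate of every HTML5 video on the page.
    document.addEventListener('keydown', (e) => {
      const delta = e.key === ']' ? 0.25 : e.key === '[' ? -0.25 : 0;
      if (delta === 0) return;
      document.querySelectorAll('video').forEach((v) => {
        v.playbackRate = Math.max(0.25, v.playbackRate + delta);
      });
    });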



Yeah, I love the speed option on Youtube, but it often could go faster. Though I usually end up watching at 1.5x because when I'm watching coding tutorial videos, I can't type fast enough and also process what I'm hearing/seeing to be able to keep up at 2x.

I tried understanding the reader in the article and found I could only catch about every third or fourth word, even though I do often watch videos at 2x (and I did actually used to use VLC for speeding up videos in the pre-YouTube speed option days). I'd need a lot more practice to grok speech at that speed.


I was thinking about what I need for programming and assumed that having no hands (or only one hand) and no vision would be a complete deal breaker. But by now I've seen people programming using voice, and now this guy without sight. It's amazing what they can do just by being dedicated enough.


This is amazing and a scenario I had never really considered before. I know it has been said that the best programmers carry around small programs in their heads and map things out mentally — this is kind of the same idea taken to the extreme.

I thought the portion on frontend development was really interesting. Something I had never considered, but to think of it from the angle in which it is presented just kind of blew my mind. This piece made my morning.

edit: I guess this post also just shows that if you want to learn to program and be an engineer, there are certainly ways to make that happen, regardless of specific elements of your situation (to a point, obviously). Really, really neat.


I worked with an engineer who was partially sighted and could only see a tiny fraction of the screen at a time; he had essentially memorized most of the code base.

What struck me was that when I started working, while not the same, that was how you got anything done: API docs were on paper, we had stacks of books on our desks, code navigation tools were laughable, there was no IntelliSense, and screens were small (14" 640x480). When I was regularly writing Win16/32 code I would rarely need to look anything up; the cost of doing so naturally trains you to remember.

I don't miss it tbh


What about Notepad++ makes it work with screen readers while Sublime and Atom don't?


From https://notepad-plus-plus.org/

..., Notepad++ is written in C++ and uses pure Win32 API and STL which ensures a higher execution speed and smaller program size.

And which lets it piggy-back on Windows' accessibility features, which seem to be quite good.

I guess Sublime Text and Atom draw their own GUIs to get more control over the look-and-feel, but don't add the necessary hints to make screen readers work.


You're right about the advantage of using standard Win32 controls. But you omitted an important part of that sentence you quoted from the website: "Based on the powerful editing component Scintilla". Scintilla is indeed a custom control, and it doesn't implement either of the Windows accessibility APIs (UI Automation or Microsoft Active Accessibility, let alone MSAA's unofficial extension IAccessible2). So why does it work with a screen reader?

First of all, Scintilla doesn't work with all screen readers. I know for sure that it doesn't work with Narrator. So why does it work with NVDA, the OP's choice of screen reader?

The answer is basically an accident of Windows's history. Going back to the very beginning, the main way that an application queried or manipulated a Windows control was by sending it window messages. Because early versions of Windows had cooperative multitasking and no security, any application could send messages to any other application's windows. Presumably for backward compatibility, or perhaps because nobody thought to do it differently in the 90s, this openness was carried forward into Win32, including Windows NT. Some limitations have been added for security, particularly in Vista, but basically, if any two apps are running under the same user account, they can send window messages to each other.

It turns out that this capability is very useful for screen readers and other assistive technologies. Even after Microsoft introduced the Active Accessibility API sometime between Windows 95 and Windows 98, that API had some gaps, particularly when it came to working with editable text controls and multi-column list views. Screen readers worked around these deficiencies by using the window messages provided by these standard controls.

So, Scintilla exposes its own window messages, and NVDA uses those to make it accessible. I do wonder why Scintilla took this route, especially since it's multi-platform. It could have simply exposed a C API. In that case, it would be much less accessible (completely inaccessible if it uses something newer than GDI to render its text), unless it implemented UI Automation or IAccessible2. So, really, screen reader users just got lucky in this case.


This is correct. By sticking to pure Win32 controls, it allows screen readers to do their job properly. While making your own controls or using third-party UI libraries will give you a nicer-looking UI, it makes the screen reader's job much harder, if not impossible.


There must be an API for it though, right? That was my first reaction, but after thinking about it again... well, a custom UI that is both accessible and cross-platform is a hard problem. Maybe a semantic web site IS the holy grail in that regard? ARIA [1] seems more detailed than Qt Quick's accessibility support [2], at least.

[1]: https://developer.mozilla.org/en-US/docs/Web/Accessibility/A... [2]: http://doc.qt.io/qt-5/qml-qtquick-accessible.html
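For what it's worth, a minimal sketch of the ARIA approach (the element id, label, and handler are made up for illustration): take a custom-drawn div that acts as a button and add the hints a screen reader and keyboard user need:

    // Hypothetical custom-drawn "button" that is really just a styled <div>.
    const fakeButton = document.getElementById('save-widget');

    fakeButton.setAttribute('role', 'button');      // announced as a button
    fakeButton.setAttribute('aria-label', 'Save');  // accessible name
    fakeButton.setAttribute('tabindex', '0');       // reachable with Tab

    // Real buttons activate on Enter and Space, not just mouse clicks.
    fakeButton.addEventListener('keydown', (e) => {
      if (e.key === 'Enter' || e.key === ' ') {
        e.preventDefault();
        fakeButton.click();
      }
    });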



Wouldn't a proper structured editor be more suitable for blind coding? The market isn't large but I can't imagine blindly navigating through a sea of characters. Sounds terrifying.


People once coded in ed, and it is kinda like coding blind. You don't have a display; instead you move to specific line numbers and print or replace them. It sounds impossible at first, but you get used to it if you use only ed.


I've been thinking that doing a "30 day command-line challenge" might be cool, the idea being that you'd be limited to command-line based tools only. No fancy visual editors or ncurses browsers. Just pure line-oriented bliss (or more likely, hell). ex would be a likely candidate for the text editor in such an experiment.


What options even exist outside of ex in that case?


If you want something even more bare-bones than ex or ed, you could use cat as a write-only editor. In combination with head and tail, replacing parts of a file would be possible as well.


I suppose you could also use TECO: http://almy.us/teco.html


Or sed, which for many editing purposes is probably easier to use.


Sure, but sed is basically just ed in non-interactive mode, so not very different from ex. And ex on my system is basically just vim without instant redraw after each change. And then you're back in terminal-GUI land.

It's interesting how the ancestors of modern editors are still around as living fossils ...


I'm not blind, but I also hate side-by-side diffs. When resolving conflicts, I prefer opening the raw text file with the conflict segments and manually deleting and copy-pasting stuff; it's so much faster and more flexible.


There is a fairly popular thought that software which is accessible tends to be better for "normal" users as well.


FYI for anyone interested: There is a google group called Blind Dev Works. It is run by a blind developer. (I am co-admin, but it is his group.)

https://groups.google.com/forum/#!forum/blind-dev-works


I'd be interested in learning more about how a blind person does software development. I already do most of my coding work in the terminal with keyboard only where possible, and it'd be an interesting exercise to try and get rid of the monitor altogether.



It's about coding for a blind person.

But under what circumstance would typing speed really matter? Most of my coding is a creative process, with tons of autocomplete via the IDE, as well as copy and paste for boilerplate that can't be inherited.


How to spot the guy that didn't read the article.


It's a disingenuous title. The article is cool, but not what I was led to believe.


It really is. There is nothing "software development" about the speed of his text-to-speech other than he uses it for reading code in addition to reading normal sentences.

The more accurate title would be "Software developer reads at 450WPM". Still interesting enough to get people to click to find out how, but "development at 450wpm" really really does sound like he's typing at 450wpm.

Reading code is not "software development", and reading is all he does at 450wpm.


A very skilled reader can read text at around 400wpm with good comprehension, so I can believe he can listen to code at those speeds.

A professional typist is somewhere around 80wpm, and a stenographer can reach up to 250wpm. 450 would have to represent an enormous innovation in typing methods.

Frankly, even if you could, 450wpm would be an irresponsible speed to code at. As mediocre as my own typing speed is, it's never the limiting factor when I code. There's just too much to consider.


I skimmed

Skimmed really fast looking for something related to the title


It's interesting; none of the 'speed reading' wisdom has worked for me. When I 'skim' I seem to be way slower than most people who 'skim' - even though I'm quite competitive when 'reading' - but I also seem to usually have some idea of what the text was about, while I find it is quite common for people skimming to not have absorbed much at all from the text. That's interesting too, because I find that in 'reading' my reading comprehension is not better if you compare me to others who have the level of education implied by my profession.

When I skim, my focus is on reading faster, pushing myself through the document, and I let my unconscious deal with what information is dropped. It does move things along much faster than 'reading' at regular speed, and it does lower comprehension, but it does these things in a sort of incremental way.

Most of the time when I've read about 'speed reading' it talks a lot about consciously managing that information drop, and managing where your eyes go, which for me, slows me down quite a lot, and usually ends up making my retention all but useless.


The 450 words is not about his typing speed, but about the speed of his screen reader (it reads to him at 450 words a minute).

Check out the samples he gave; it's pretty impressive that he is able to understand anything.


Check out the audio he presents in the article. His screen reader reads him text at 450 words per minute.

I wonder how quickly he can finish reading non-technical books.



