The Great Microsoft Scrollbar-Dragging Behavior Debate Continues

Well, it finally happened. Someone sent a mail outright defending Microsoft's scrollbar behavior — and did a depressingly fine job of it, too.

I'm quite sure I'll never be so certain again.

From: Daniel Beardsmore <email.address@obscured.by.request>
To: Karl Fogel
Subject: Microsoft Scrollbar-Dragging Behavior
Date: Sun, 21 Jan 2007 23:36:51 +0000

Hm, I think there are some cases you've not considered. For me
personally at least, I find it frequently necessary to review part of a
window's contents that are distant from where I am presently looking. It
might be further down a Web page, further down a document, further up a
page of source code. By your belief, once I have scrolled down to it,
there is no guaranteed way to return to where I was.

Thanks to Microsoft (or whoever's idea this was, since Macs tend to
behave the same way), I only need pull the cursor away from the scroll
bar and I am instantly returned to my former position.

Pure text based applications for which the caret was exactly located at
my previous view position, when following your desired behaviour, would
let me press left or right arrow to refocus the window content. This
does not work in other cases, such as conversation history in a chat
client or, of course, a Web browser.

In fact, the Macintosh trend during any drag operation -- be it dragging
a window or a file -- was to drag the cursor over the menu bar to
cancel, since that's an invalid destination. They've slowly been moving
to the Microsoft approach of pressing escape to cancel. However, none of
XUL, Windows, GTK+ and Mac OS 9 (I don't have X booted right now)
allow escape to cancel scrolling, since scrolling is in fact
asynchronous in most window APIs and toolkits (not the case of OS 9
where the whole system is frozen, but escape is still ignored).

Escape is the best answer, I suppose, except it violates the concept of
sticking to a single input device -- it's never good to have to reach
for a different input device to complete an operation. In this case, a
wrist flick will cancel a scroll with convenience far beyond having to
go looking for the escape key.

Various applications and UIs do not implement snap-back when scrolling,
including iCab, GTK+ and EIKON. On a tiny little palmtop screen you most
want this feature since with longer documents it's very hard to find
your way back. I lament the lack of snapback with any program that
doesn't provide it as it makes my experience all the more difficult.

Yes, it took more effort for Microsoft (and Apple) to write, but
generally intelligence is complicated. Take a counter-example: submenus.

When you move the mouse over a parent menu item, a submenu pops open. If
you aim the cursor diagonally at the destination item, your path first
crosses the next item down on the higher-level menu, causing the submenu
to snap shut. This was the Amiga behaviour, as well as Windows 3 I
imagine, and is still how DHTML tends to work.

Mac users never have this problem: submenus are smooth and predictable.
Why? Because before the Mac was even released, the developers realised
this problem and solved it. If the cursor is sensed to be moving
diagonally towards the submenu, the Mac assumes you want it left open
and it does indeed leave the submenu up for you even if the cursor has
to cross over the menu item below the parent item to get to it.
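
(Illustration, not part of the original message: the "moving diagonally
towards the submenu" test can be sketched as a point-in-triangle check.
This is one commonly described way to get the effect, not a confirmed
account of how the Mac does it. The function names and the coordinates
are hypothetical. The triangle is spanned by the previous cursor
position and the two near corners of the open submenu; while the cursor
stays inside it, the submenu is kept open.)

```python
Point = tuple[float, float]


def _cross(origin: Point, a: Point, b: Point) -> float:
    """Z component of the cross product of (a - origin) and (b - origin)."""
    return (a[0] - origin[0]) * (b[1] - origin[1]) - (a[1] - origin[1]) * (b[0] - origin[0])


def heading_for_submenu(previous: Point, current: Point,
                        submenu_top: Point, submenu_bottom: Point) -> bool:
    """Return True if `current` lies inside the triangle formed by the
    previous cursor position and the submenu's two near corners."""
    d1 = _cross(previous, submenu_top, current)
    d2 = _cross(submenu_top, submenu_bottom, current)
    d3 = _cross(submenu_bottom, previous, current)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    # Inside (or on an edge) when all three signs agree.
    return not (has_negative and has_positive)


# Hypothetical layout: cursor starts on the parent item at (50, 100);
# the submenu's near edge runs from (200, 100) down to (200, 300).
keep_open = heading_for_submenu((50, 100), (80, 120), (200, 100), (200, 300))
close_it = heading_for_submenu((50, 100), (50, 130), (200, 100), (200, 300))
```

(In this layout a diagonal move towards the submenu keeps it open,
while a move straight down onto the next parent item does not.)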

Not the simple approach, but it's the logical one. Microsoft instead put
in a very annoying delay in both opening and closing submenus that makes
using them slow and clunky, solving one problem by creating another.
Thanks to TweakUI I can remove that delay so that I don't have to wait
around, but I still have to guide the mouse carefully so as to not cause
the submenu to snap shut.

I have, I will add, witnessed scroll bar snapback causing a problem: it
confuses my father. Now, he's left handed and due to an "unfortunate"
set-up at home, has the mouse to his right. He also fails to learn how
his actions affect the system, and either panics or gets angry. But I
don't recall anyone else ever being bothered by the behaviour. (What I
do despise, and this victimises him too, is that clicking too fast --
the time between mouse down and mouse up being too short -- causes
Windows to focus the control on which you clicked as if you tabbed to
it, but not deliver the click itself. MS Paint is worst affected
somehow, as I tend to think fast and rapidly click the tool palette
buttons and have it ignore me entirely over and over. I have never heard
anyone offering, or found, a single explanation for this piece of
ridiculous behaviour. It does even affect Mac OS on rare occasions,
suggesting a flaw in the underlying event model where perhaps mouseup is
eclipsing the preceding mousedown?)

It's tricky to decide which design decisions to have as options, and how
hard to hide them from people, and balance smart people suffering at the
expense of those who can't and won't learn. But Mac OS and Windows both
have this as the standard behavior.

Of course, you could argue -- and many I imagine would -- that the
navigation model itself is completely flawed and needs reinventing. I
was surprised to see that the iPhone uses a zooming GUI -- possibly the
first ever commercial example of it. It would be ironic if this were
true, as Apple were the people who brought the WIMP GUI to the public
and may have now brought the public what some consider the WIMP GUI's
replacement.

Me, I've tried Jef Raskin's Archy and I find it an abomination. I can't
touch-type, or even close, but I use enough fingers that I find trying
to type commands with my pinky anchored to caps lock, very hard -- the
rest of my hand is now misplaced. (And that's only the tip of the
problem for me.) Since the iPhone is not Archy, and presumably not
modeless either, it remains to be seen how nice that is to use, not that
I'll be buying one :P (I don't have a mobile phone to start with).

But no, I bash Microsoft yet I still agree with many decisions about the
Windows GUI, enough that ultimately it's faster, clearer and more
productive than Mac OS X by far for me (and I'm still a Mac OS 9
die-hard). And as far as scrolling in Mac OS and Windows goes, it's
fiddly but wins out by far in its aid to navigation.

From: Karl Fogel
To: Daniel Beardsmore
Subject: Re: Microsoft Scrollbar-Dragging Behavior
Date: Sun, 21 Jan 2007 19:04:06 -0600

On 1/21/07, Daniel Beardsmore wrote:
> [...]

Thanks for this very articulate defense of the scrollbar behavior!
You are quite correct, I hadn't considered this case.

I still think it's a bad UI decision -- I have seen others get confused
by it, and in general feel that any "invisible wall" in an interface is
human-unfriendly -- but you've made as good an argument for it
as can be made, and I can see how once you get used to it, it is
a help in return navigation.

I often discuss with friends the inherent tension in UI design between
"friendly for newcomers" and "friendly for experts": what works well for
a light user of the system may turn out to be very frustrating for those
who use it hour after hour every day.  Usually I prefer systems designed
for users who are going to become experts -- systems with a steep and
tall learning curve, but a rewarding pot of gold once you get to the top.
The canonical example of such a system is my favorite text editor,
GNU Emacs.  There's no point using Emacs unless you're going to
use it for the next six years.  On the other hand, if you're going to use
a text editor for six years, there's no point using anything but Emacs.
(Or something similarly extensible, like modern versions of 'vi'.  I don't
mean to start an editor war here.  The point is whether the interface is
tuned toward people who will have the time to learn it, or toward people
who will not -- and who therefore need things to work along simple and
familiar metaphors drawn from the physical world, however inappropriate
those may be for the actual task of editing.)

But ironically, I have to admit the most consistent way to analyze the
scrollbar situation would be to say that I'm advocating an interface designed
for newcomers, and you're advocating one designed for experts!  :-)

I have to admit I've never had a problem returning to where I was
in a web page, but that may be because I've grown used to what
you would consider a clumsy workaround: I just scroll back roughly
to where the scrollbar thumb was before, and use the page's overall
visual appearance to find the exact location once I get close enough.
It's not as fast as flicking the wrist though.  If I thought I could ever
get comfortable with that "invisible wall" phenomenon, I'd set my
applications to use your preferred scrollbar behavior... But I know I
couldn't: my skin would always be crawling with tension at whether
I was about to cross that line or not :-).

(On the other hand, in GNU Emacs, I use the push-mark and pop-mark
commands all the time, and they are exactly the kind of location stack you
are describing -- in fact, a more fully general one, in that you can push an
arbitrary number of locations and pop back to them when you're done.)
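
(Illustration, not part of the original message: a minimal sketch of
the "location stack" idea, assuming positions are plain integer offsets
into a document. The class and method names are hypothetical, and this
is not Emacs's own implementation.)

```python
class LocationStack:
    """Remembers any number of positions and returns to them in
    last-in, first-out order."""

    def __init__(self) -> None:
        self._marks: list[int] = []

    def push(self, position: int) -> None:
        """Remember `position` so it can be returned to later."""
        self._marks.append(position)

    def pop(self, current: int) -> int:
        """Return the most recently remembered position, or stay at
        `current` if nothing is remembered."""
        return self._marks.pop() if self._marks else current


stack = LocationStack()
stack.push(120)   # remember where I was reading
stack.push(480)   # remember a second spot further down
```

(Unlike scrollbar snap-back, which remembers only the position where
the drag began, a stack like this can hold several places at once.)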

Your point that good UI design doesn't necessarily follow simple rules is
one I completely agree with, by the way.  I wasn't arguing against the
scrollbar behavior on the grounds that it was either more complex to
describe or more complex to implement; my implementation comments
were grounded on the assumption that the behavior itself is bad, and that
therefore writing extra code to achieve it would be worse than having a bad
behavior resulting from laziness (i.e., from *not* writing extra code to get
something right).  If we take away the assumption that the behavior is bad,
then of course the extra code is no longer objectionable.

So as you probably guessed, I'd like to add our correspondence to the
web page, as I did with previous correspondence on this topic.  May I
publish your email there?

Thank you,
-Karl

From: Daniel Beardsmore
To: Karl Fogel
Subject: Re: Microsoft Scrollbar-Dragging Behavior
Date: Mon, 22 Jan 2007 03:25:19 +0000

Karl Fogel wrote:
> I still think it's a bad UI decision -- I have seen others get confused
> by it, and in general feel that any "invisible wall" in an interface is
> human-unfriendly -- but you've made as good an argument for it
> as can be made, and I can see how once you get used to it, it is
> a help in return navigation.

Hm, I find the idea of an "invisible wall" curious. I just measured it
-- in Windows, on my 17" CRT at 1152x864, it's 1.23 inches (measured as
pixels, ~103). On rare occasions I fall foul of it -- maybe I grabbed
the mouse funny or some such -- but otherwise it's hard to get it wrong
and the margin is adequately wide.
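
(Illustration, not part of the original message: a minimal sketch of
snap-back for a vertical scrollbar, using the ~103 pixel margin
measured above as the threshold. The class and field names are
hypothetical, and for simplicity one pixel of mouse travel is assumed
to equal one unit of scrolling. Real toolkits differ in both the
threshold and the details.)

```python
from dataclasses import dataclass

SNAP_BACK_MARGIN_PX = 103  # margin measured above; other applications differ


@dataclass
class ScrollbarDrag:
    """State for one drag of a vertical scrollbar thumb (hypothetical model)."""
    bar_left: int        # x of the scrollbar's left edge, in pixels
    bar_right: int       # x of the scrollbar's right edge, in pixels
    start_position: int  # scroll position when the drag began
    start_mouse_y: int   # mouse y when the drag began

    def position_for(self, mouse_x: int, mouse_y: int) -> int:
        """Return the scroll position to show for the current mouse location."""
        # Horizontal distance from the cursor to the nearest edge of the bar
        # (zero while the cursor is over the bar itself).
        distance = max(self.bar_left - mouse_x, mouse_x - self.bar_right, 0)
        if distance > SNAP_BACK_MARGIN_PX:
            # Past the "invisible wall": show the position the drag started from.
            return self.start_position
        # Within the margin: follow the mouse vertically, never above the top.
        return max(0, self.start_position + (mouse_y - self.start_mouse_y))


drag = ScrollbarDrag(bar_left=1000, bar_right=1016,
                     start_position=200, start_mouse_y=300)
```

(Because the drag keeps its starting position for as long as the button
is held, moving the cursor back inside the margin resumes scrolling
from wherever the mouse now is.)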

In my dad's case, it's something I blame more on using the mouse with
the wrong hand, as that leads to all sorts of problems, although it's
fun to try. #include "nerdy anecdote about school crush on left-handed
girl..."

> I often discuss with friends the inherent tension in UI design between
> "friendly for newcomers" and "friendly for experts": what works well for
> a light user of the system may turn out to be very frustrating for those
> who use it hour after hour every day.  Usually I prefer systems designed
> for users who are going to become experts -- systems with a steep and
> tall learning curve, but a rewarding pot of gold once you get to the top.

I think I'm middle ground in that sense: I want enough power that it's
not going to leave me unable to be efficient but not so steeped in
arcane ideas that I need to be an acolyte to some Free Software guru to
understand.

Choosing such a middle ground alone is hard and it does seem like there
is no such thing. Mac OS X looks like an attempt to take this route but
it seems that people are quite polar about Mac OS X too. It gets a lot
right and a lot wrong, and typically, I'm sat on the fence not sure
whether I like it or hate it. Generally, both.

The idea with X11 that all the choices are yours is a good route, but it
needs to be underlaid with standards for applications to rely on, which
it failed to provide, with rivalries such as Qt and GTK causing trouble.
And now I am terrified of a decision as simple as "use Linux" involving
a million options to choose.

I remember when all I had to do to replace Program Manager with the
alternative Wayfarer shell was put the EXE somewhere nice and tick a
box, or set SHELL=C:\PATH\TO\WAYFARER.EXE, and you know that at the
flick of a switch, you can go back. I don't have the same assurances
that I can toy with X11 desktops the same way.

> The canonical example of such a system is my favorite text editor,
> GNU Emacs.  There's no point using Emacs unless you're going to
> use it for the next six years.  On the other hand, if you're going to use
> a text editor for six years, there's no point using anything but Emacs.
> (Or something similarly extensible, like modern versions of 'vi'.  I don't
> mean to start an editor war here...

I guess it depends a lot. You have to be fairly competent for that. I've
just installed JujuEdit for Windows -- very nice, but the syntax
highlighting regular expressions are both brilliant and
incomprehensible. Being someone's hobby app, the defaults need attention
and of course, no editor ever ships with highlighting for every
language, but oh boy, trying to write highlighting using his RegEx
extensions ...

> The point is whether the interface is
> tuned toward people who will have the time to learn it, or toward people
> who will not -- and who therefore need things to work along simple and
> familiar metaphors drawn from the physical world, however inappropriate
> those may be for the actual tasks the users will be performing.)

I'm lazy and afraid, I want something that just works. But I also have
very strong ideas about how things should work (beyond demands like it
actually working at all, which is all too commonly a failing). For one,
it must not rely on me remembering anything, since I cannot. So it'd be
Gvim or XEmacs over the purely textual variants. But there are also nice
touches that make me smile, such as how JujuEdit has an Undo button in
the Replace dialog -- wonderful, since it may take me repeated efforts
to get a regex to work and this makes it so easy to keep undoing my
mistakes.

I've certainly tried both Emacs and XEmacs, but not vi. I settled on
Pico/Nano since it has that nice bar at the bottom reminding me what I'm
supposed to press, especially when I trip over my fingers and don't know
what I just pressed and how to get out of it :P

> But ironically, I have to admit the most consistent way to analyze the
> scrollbar situation would be to say that I'm advocating an interface designed
> for newcomers, and you're advocating one designed for experts!  :-)

Not experts, but people who don't have a problem learning effect from
cause without shrieking and panicking. If you panic you'll drop the mouse
and never see why it did what it did. If you can react without
panicking, you'll push the mouse back across and realise what happened.
I have no recollection of learning any of these things, it was probably
so insignificant that I never noticed.

(As far as the submenu thing goes: I never realised why Mac menus
worked, but Windows menus always drove me up the wall. I am so glad to
have learned that TweakUI offers improvements there ;)

> I have to admit I've never had a problem returning to where I was
> in a web page, but that may be because I've grown used to what
> you would consider a clumsy workaround: I just scroll back roughly
> to where the scrollbar thumb was before, and use the page's overall
> visual appearance to find the exact location once I get close enough.
> It's not as fast as flicking the wrist though.

I am used to this, because for years the only half-decent browser I had
was iCab and this was all I could do. But long passages of text have no
visual appearance to go by, so I tended to first line up the scrollbar
thumb with something and then return it to that position.

> If I thought I could ever
> get comfortable with that "invisible wall" phenomenon, I'd set my
> applications to use your preferred scrollbar behavior...

I've never been aware of any such setting. If iCab in 9 can behave the
"wrong" way -- not implement a wall -- it seems that the decision and
implementation is left to the application.

Testing this theory in Windows, for example: XUL applications (Firefox
and Thunderbird) have a narrower threshold than my earlier test of
Windows Explorer. FlashFXP matches the exact threshold of Explorer, but
SoulSeek and Paint are about 6 pixels narrower.

> But I know I
> couldn't: my skin would always be crawling with tension at whether
> I was about to cross that line or not :-).

I'm artistic, so I figure that I have to some degree greater wrist
control than others. For example, I am agnostic about a single,
Fitts-compliant menu bar (as in Mac OS) and menu bars in individual
windows. My dad for example can't grasp one menu bar per window, and
will quite readily click the menu bar in the window behind and start
getting angry that Print printed something totally different to what he
wanted. A single menu bar is a better point of focus -- you know where
the commands are -- but in some ways I find it visually disjointed (it's
nowhere near my window) and I don't feel any great problem reaching for
one in a window that's not "infinite" height.

So maybe there is another connection here with how readily you can
control the mouse -- not just whether you can hit a local menu but how
well you can keep the cursor within a strip at least one inch wide.

But it's yet deeper. For example, if the window is maximised, as most of
mine are, then the width of that strip is infinite. All I need do is
push the cursor a little to the right and I can't ever fall off. I think
I do do this too, without thinking. It's a little bit analogous to that
infinitely-high menu bar ...

> (On the other hand, in GNU Emacs, I use the push-mark and pop-mark
> commands all the time, and they are exactly the kind of location stack you
> are describing -- in fact, a more fully general one, in that you can push an
> arbitrary number of locations and pop back to them when you're done.)

They're fine if you're navigating via the keyboard or you are OK
switching input (keyboard to mark, mouse to scroll), or fiddling about
with the menu to mark a position and recall it.

I've never taken to navigating text I'm editing with the keyboard. I did
like ctrl-up/down to scroll a line without moving the caret, but that's
died out in anything I ever see now ... Good old DOS :P

> So as you probably guessed, I'd like to add our correspondence to the
> web page, as I did with previous correspondence on this topic.  May I
> publish your email there?

As long as you don't publish my e-mail address, you're welcome to.
Besides, I have a rare name and anyone who wants me can find me with
great ease ;)

2016-06-17 update: I just discovered that Google Chromium is implementing this feature on purpose, and it's present not just in the Windows versions but even on the version of Chromium packaged for my Debian GNU/Linux system. Starting here you can see discussion from irate GNU/Linux users who find this Windows-standard behavior confusing. Fun for the whole family.
