LKML Archive on lore.kernel.org
* UTF-8 and Alt key in the console
@ 2008-03-23 15:15 John T.
2008-03-23 15:29 ` Jan Engelhardt
0 siblings, 1 reply; 22+ messages in thread
From: John T. @ 2008-03-23 15:15 UTC (permalink / raw)
To: linux-kernel
Hello,
It is understood that although the Meta-key sequences
work in an xterm with vim on UTF-8, they don't on the
linux console.
That's because vim and xterm have an understanding about
how to function in UTF-8 regarding the Meta key. Xterm
translates the would-be ISO-8859 high-bit-char to its
UTF-8 representation, and vim catches that. This is the
way to move the traditional 8th-bit Meta convention from
single-byte encodings to UTF-8.
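As a rough sketch of the translation described above (not xterm's actual code), the 8th-bit Meta convention turns a 7-bit key into a code point in the range U+0080..U+00FF, which UTF-8 encodes as two bytes. The helper name `meta_to_utf8` is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: apply the "meta sets 8th bit" convention to a
 * 7-bit key and encode the resulting code point (U+0080..U+00FF) as
 * UTF-8. Returns the number of bytes written to out (always 2).
 */
static size_t meta_to_utf8(unsigned char key, unsigned char out[2])
{
	unsigned int cp;

	assert(out != NULL && key < 0x80);
	cp = key | 0x80u;                               /* set the 8th bit */
	out[0] = (unsigned char)(0xC0u | (cp >> 6));    /* lead byte 110xxxxx */
	out[1] = (unsigned char)(0x80u | (cp & 0x3Fu)); /* continuation 10xxxxxx */
	return 2;
}
```

For example, `meta_to_utf8('l', buf)` yields the bytes C3 AC (U+00EC), which is the same byte pair an ordinary typed "ì" would produce.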
The linux console could function that way too, so that the
Meta-key would be recognized for those not willing to make
Meta send an ESC prefix; this behavior can be toggled with
the setmetamode command. Seems it's quite a simple code
snippet.
I'd like to know whether it would be an accepted change.
Regards,
--
John
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-23 15:15 UTF-8 and Alt key in the console John T.
@ 2008-03-23 15:29 ` Jan Engelhardt
2008-03-23 15:46 ` John T.
0 siblings, 1 reply; 22+ messages in thread
From: Jan Engelhardt @ 2008-03-23 15:29 UTC (permalink / raw)
To: John T.; +Cc: linux-kernel
On Sunday 2008-03-23 16:15, John T. wrote:
>
> It is understood that although the Meta-key sequences
> work in an xterm with vim on UTF-8, they don't on the
> linux console.
They also seem to work on the console; I can use Alt-L in
mcedit to jump to a line.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-23 15:29 ` Jan Engelhardt
@ 2008-03-23 15:46 ` John T.
2008-03-23 16:54 ` H. Peter Anvin
0 siblings, 1 reply; 22+ messages in thread
From: John T. @ 2008-03-23 15:46 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: linux-kernel
--- Jan Engelhardt <jengelh@computergmbh.de> wrote:
>
> On Sunday 2008-03-23 16:15, John T. wrote:
> >
> > It is understood that although the Meta-key sequences
> > work in an xterm with vim on UTF-8, they don't on the
> > linux console.
>
> They also seem to work on the console; I can use Alt-L in
> mcedit to jump to a line.
>
That's because you are working in "meta sends ESC" mode.
Although this is OK for most applications, for some it isn't.
Thus there have always been two modes, "meta sends ESC"
and "meta sets 8th bit". (toggled with setmetamode on the
console)
Vim relies on "meta sets 8th bit". Unfortunately the code
for this option does not work in UTF-8 in the console. What
I'd like to do is make this a viable option in UTF-8.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-23 15:46 ` John T.
@ 2008-03-23 16:54 ` H. Peter Anvin
2008-03-23 17:47 ` John T.
0 siblings, 1 reply; 22+ messages in thread
From: H. Peter Anvin @ 2008-03-23 16:54 UTC (permalink / raw)
To: John T.; +Cc: Jan Engelhardt, linux-kernel
John T. wrote:
>
> That's because you are working in "meta sends ESC" mode.
> Although this is OK for most applications, for some it isn't.
> Thus there have always been two modes, "meta sends ESC"
> and "meta sets 8th bit". (toggled with setmetamode on the
> console)
>
> Vim relies on "meta sets 8th bit". Unfortunately the code
> for this option does not work in UTF-8 in the console. What
> I'd like to do is make this a viable option in UTF-8.
>
No, fix vim instead.
"Meta sets 8th bit" is so obviously and totally broken, since it maps
onto real characters, and has been doing so for at least 20 years.
Meta-L maps onto LATIN CAPITAL LETTER I WITH GRAVE, both in 8-bit mode
and in your proposed UTF-8 mode. It just becomes even more obvious how
unbelievably broken it is when you try to map it onto UTF-8.
Seriously, fix the crap.
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-23 16:54 ` H. Peter Anvin
@ 2008-03-23 17:47 ` John T.
2008-03-23 17:55 ` H. Peter Anvin
0 siblings, 1 reply; 22+ messages in thread
From: John T. @ 2008-03-23 17:47 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Jan Engelhardt, linux-kernel
--- "H. Peter Anvin" <hpa@zytor.com> wrote:
> John T. wrote:
> >
> > That's because you are working in "meta sends ESC" mode.
> > Although this is OK for most applications, for some it isn't.
> > Thus there have always been two modes, "meta sends ESC"
> > and "meta sets 8th bit". (toggled with setmetamode on the
> > console)
> >
> > Vim relies on "meta sets 8th bit". Unfortunately the code
> > for this option does not work in UTF-8 in the console. What
> > I'd like to do is make this a viable option in UTF-8.
> >
>
> No, fix vim instead.
>
> "Meta sets 8th bit" is so obviously and totally broken, since it maps
> onto real characters, and has been doing so for at least 20 years.
> Meta-L maps onto LATIN CAPITAL LETTER I WITH GRAVE, both in 8-bit mode
> and in your proposed UTF-8 mode. It just becomes even more obvious how
> unbelievably broken it is when you try to map it onto UTF-8.
>
> Seriously, fix the crap.
>
> -hpa
>
OK, let's see if I can answer this.
Vi has 32 years of ESC key use tradition which doesn't play
well with "meta sends ESC".
Even though "meta sets 8th bit" is "broken" in your point-of-view,
that didn't stop it from being used all these years. The fact
that it maps into real characters is not a problem if you can just
use a CTRL-V equivalent in bash or vim.
Furthermore, it is an _option_. No one is obliged to use it.
So it's a question of:
.. _forcing_ the end of "meta sets 8th bit"
.. leaving things the way they are, and have them keep working,
as xterm did.
So guess we should fix xterm too?
I think you're exaggerating.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-23 17:47 ` John T.
@ 2008-03-23 17:55 ` H. Peter Anvin
2008-03-23 18:13 ` John T.
0 siblings, 1 reply; 22+ messages in thread
From: H. Peter Anvin @ 2008-03-23 17:55 UTC (permalink / raw)
To: John T.; +Cc: Jan Engelhardt, linux-kernel
John T. wrote:
>
> OK, let's see if I can answer this.
>
> Vi has 32 years of ESC key use tradition which doesn't play
> well with "meta sends ESC".
>
> Even though "meta sets 8th bit" is "broken" in your point-of-view,
> that didn't stop it from being used all these years. The fact
> that it maps into real characters is not a problem if you can just
> use a CTRL-V equivalent in bash or vim.
>
> Furthermore, it is an _option_. No one is obliged to use it.
> So it's a question of:
>
> .. _forcing_ the end of "meta sets 8th bit"
> .. leaving things the way they are, and have them keep working,
> as xterm did.
>
> So guess we should fix xterm too?
>
> I think you're exaggerating.
>
Hardly. vim clearly can deal with the ESC-is-prefix issue anyway, since
otherwise it wouldn't be able to use arrow keys.
That being said, quite frankly, *both* Meta key conventions are
incredibly broken.
What I would much prefer to see is a brand new convention where
different keys (Ctrl, Meta, Super, Hyper, Alt or even in some cases
Shift) issue a unique prefix which doesn't conflict with anything else.
Emacs has tried to promote such a convention of the format
<CAN> @ <bucky> <keystroke> which is a lot better, although it's a bit
Emacs-centric (using <CAN> / ^X as the initial character is not really a
very good choice.)
The best probably would be to introduce an escape code, along the lines
of other escape codes in the terminal interface.
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-23 17:55 ` H. Peter Anvin
@ 2008-03-23 18:13 ` John T.
2008-03-23 18:46 ` Jan Engelhardt
0 siblings, 1 reply; 22+ messages in thread
From: John T. @ 2008-03-23 18:13 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Jan Engelhardt, linux-kernel
--- "H. Peter Anvin" <hpa@zytor.com> wrote:
> John T. wrote:
> >
> > OK, let's see if I can answer this.
> >
> > Vi has 32 years of ESC key use tradition which doesn't play
> > well with "meta sends ESC".
> >
> > Even though "meta sets 8th bit" is "broken" in your point-of-view,
> > that didn't stop it from being used all these years. The fact
> > that it maps into real characters is not a problem if you can just
> > use a CTRL-V equivalent in bash or vim.
> >
> > Furthermore, it is an _option_. No one is obliged to use it.
> > So it's a question of:
> >
> > .. _forcing_ the end of "meta sets 8th bit"
> > .. leaving things the way they are, and have them keep working,
> > as xterm did.
> >
> > So guess we should fix xterm too?
> >
> > I think you're exaggerating.
> >
>
> Hardly. vim clearly can deal with the ESC-is-prefix issue anyway, since
> otherwise it wouldn't be able to use arrow keys.
There's always the "timeout" hack. It is all right with the
arrow and function keys because the second character in these
cases (`[' usually) is not a commonly typed vim command.
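A minimal sketch of such a timeout hack, assuming a POSIX system and that an ESC byte (0x1B) has just been read from the terminal. The names `classify_esc` and `esc_kind` are made up for illustration, and this is not vim's implementation:

```c
#include <assert.h>
#include <poll.h>

enum esc_kind { ESC_ALONE, ESC_PREFIX, ESC_ERROR };

/*
 * Call right after reading an ESC byte from fd. If more input arrives
 * within timeout_ms, treat the ESC as the start of a key sequence
 * (arrow key, function key, "meta sends ESC"); otherwise treat it as
 * the ESC key pressed on its own.
 */
static enum esc_kind classify_esc(int fd, int timeout_ms)
{
	struct pollfd p = { .fd = fd, .events = POLLIN };
	int r;

	assert(timeout_ms >= 0);
	r = poll(&p, 1, timeout_ms);
	if (r < 0)
		return ESC_ERROR;
	return (r > 0 && (p.revents & POLLIN)) ? ESC_PREFIX : ESC_ALONE;
}
```

The weakness is visible in the code: the decision rests only on timing, so a slow link can split a real sequence and a fast typist can fake one.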
> That being said, quite frankly, *both* Meta key conventions are
> incredibly broken.
Indeed, I agree with you here.
> What I would much prefer to see is a brand new convention where
> different keys (Ctrl, Meta, Super, Hyper, Alt or even in some cases
> Shift) issue a unique prefix which doesn't conflict with anything else.
> Emacs has tried to promote such a convention of the format
> <CAN> @ <bucky> <keystroke> which is a lot better, although it's a bit
> Emacs-centric (using <CAN> / ^X as the initial character is not really a
> very good choice.)
>
> The best probably would be to introduce an escape code, along the lines
> of other escape codes in the terminal interface.
You're right.
Many say Unix is also broken compared to Plan 9.. sometimes it's
too late. The real fix for this issue seems like it'd be very
hard to accomplish. In the meantime, maybe we could do this easy
fix. Or not. But we have a situation.
> -hpa
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-23 18:13 ` John T.
@ 2008-03-23 18:46 ` Jan Engelhardt
2008-03-28 23:26 ` H. Peter Anvin
0 siblings, 1 reply; 22+ messages in thread
From: Jan Engelhardt @ 2008-03-23 18:46 UTC (permalink / raw)
To: John T.; +Cc: H. Peter Anvin, linux-kernel
On Sunday 2008-03-23 19:13, John T. wrote:
>> Hardly. vim clearly can deal with the ESC-is-prefix issue anyway, since
>> otherwise it wouldn't be able to use arrow keys.
>
> There's always the "timeout" hack. It is all right with the
> arrow and function keys because the second character in these
> cases (`[' usually) is not a commonly typed vim command.
>[...]
>> The best probably would be to introduce an escape code, along the lines
>> of other escape codes in the terminal interface.
>
> You're right.
>
> Many say Unix is also broken compared to Plan 9.. sometimes it's
> too late. The real fix for this issue seems like it'd be very
> hard to accomplish.
The idea of revamping the escape codes is not all that bad.
Thanks to terminfo, this should be easy. Change vt.c,
add corresponding terminfo entry and set TERM to something
that has not previously existed.
About the ESC key, I thought, would it suffice to replace its
current output of ^[ with ^[^[?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-23 18:46 ` Jan Engelhardt
@ 2008-03-28 23:26 ` H. Peter Anvin
2008-03-29 0:07 ` Jan Engelhardt
2008-04-06 8:46 ` Marko Macek
0 siblings, 2 replies; 22+ messages in thread
From: H. Peter Anvin @ 2008-03-28 23:26 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: John T., linux-kernel
Jan Engelhardt wrote:
>>> The best probably would be to introduce an escape code, along the lines
>>> of other escape codes in the terminal interface.
>>
>> You're right.
>>
>> Many say Unix is also broken compared to Plan 9.. sometimes it's
>> too late. The real fix for this issue seems like it'd be very
>> hard to accomplish.
>
> The idea of revamping the escape codes is not all that bad.
>
> Thanks to terminfo, this should be easy. Change vt.c,
> add corresponding terminfo entry and set TERM to something
> that has not previously existed.
>
> About the ESC key, I thought, would it suffice to replace its
> current output of ^[ with ^[^[?
It would be better to assign a CSI (ESC [) code to it, like other
function keys. Unfortunately, the terminal everyone tries to emulate
(Linux does so quite poorly due to its broken implementation of ISO
2022, but that's less of an issue with UTF-8), VT 220, had ESC on the
F11 key, so the CSI 2 3 ~ sequence it uses we use for the F11 key.
Doesn't mean we can't assign another one.
One would also like to distinguish, say, Backspace from Ctrl-H. This is
trickier, because the termios settings don't permit compound keys. The
most obvious way to deal with that is an escape code for Ctrl-H, but
that has the risk of breaking a lot of other things.
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-28 23:26 ` H. Peter Anvin
@ 2008-03-29 0:07 ` Jan Engelhardt
2008-03-29 0:23 ` H. Peter Anvin
2008-04-06 8:46 ` Marko Macek
1 sibling, 1 reply; 22+ messages in thread
From: Jan Engelhardt @ 2008-03-29 0:07 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: John T., linux-kernel
On Saturday 2008-03-29 00:26, H. Peter Anvin wrote:
>>
>> About the ESC key, I thought, would it suffice to replace its
>> current output of ^[ with ^[^[?
>
> It would be better to assign a CSI (ESC [) code to it, like other function
> keys. Unfortunately, the terminal everyone tries to emulate (Linux does so
> quite poorly due to its broken implementation of ISO 2022, but that's less of
> an issue with UTF-8), VT 220, had ESC on the F11 key, so the CSI 2 3 ~
> sequence it uses we use for the F11 key. Doesn't mean we can't assign another
> one.
Even so, the linux term is the least broken one of all. I often had
issues with remote login programs (largely Windows ones) that had a
different idea of VTxxx whenever you wished not to have it. Despite
TERM being vt100 and the local encoding being vt100 too, actual
escape sequences were different from what programs in the shell
expected. On one occasion, F keys worked but Ins/Home did not;
on another it was reversed, etc. As soon as I learnt of putty a
few years ago I was happy to have all the mess that windows ssh
programs cause solved because it implemented the "linux" term type
and that just seemed to work out-of-the-box. So it does not seem
as broken to me as VTxxx.
> One would also like to distinguish, say, Backspace from Ctrl-H. This is
> trickier, because the termios settings don't permit compound keys. The most
> obvious way to deal with that is an escape code for Ctrl-H, but that has the
> risk of breaking a lot of other things.
Like what? I know that ^H is abused for screen effects.. not much
you can do about it, but it is not that important anyway.
As for ^H, all that I think is needed is the generation of an
appropriate escape code for Ctrl-H and Backspace at the terminal
emulator level (read: a pure xterm thing what key gets translated
into what escape code), while the read side then interprets
"ESC CTRLH", "ESC BKSP" and the traditional "^H".
And while we are at it, I'd suggest a whole new set of escape
codes, the current sequences are particularly... bad for
stream synchronization. Right now one has to parse strings for
end-of-escape.. which is awkward. I'd like to just be able to
strchr(s, '^]') for example and know when the escape code
ends. (Compat should of course be honored where necessary.)
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-29 0:07 ` Jan Engelhardt
@ 2008-03-29 0:23 ` H. Peter Anvin
2008-03-29 0:44 ` Jan Engelhardt
0 siblings, 1 reply; 22+ messages in thread
From: H. Peter Anvin @ 2008-03-29 0:23 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: John T., linux-kernel
Jan Engelhardt wrote:
> And while we are at it, I'd suggest a whole new set of escape
> codes, the current sequences are particularly... bad for
> stream synchronization. Right now one has to parse strings for
> end-of-escape.. which is awkward. I'd like to just be able to
> strchr(s, '^]') for example and know when the escape code
> ends. (Compat should of course be honored where necessary.)
I think it would be a major lose to move away from ISO 6429 format; the
format is self-terminating and really isn't all that complex.
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-29 0:23 ` H. Peter Anvin
@ 2008-03-29 0:44 ` Jan Engelhardt
2008-03-29 1:07 ` H. Peter Anvin
2008-03-29 6:33 ` David Newall
0 siblings, 2 replies; 22+ messages in thread
From: Jan Engelhardt @ 2008-03-29 0:44 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: John T., linux-kernel
On Saturday 2008-03-29 01:23, H. Peter Anvin wrote:
>> And while we are at it, I'd suggest a whole new set of escape
>> codes, the current sequences are particularly... bad for
>> stream synchronization. Right now one has to parse strings for
>> end-of-escape.. which is awkward. I'd like to just be able to
>> strchr(s, '^]') for example and know when the escape code
>> ends. (Compat should of course be honored where necessary.)
>
> I think it would be a major lose to move away from ISO 6429 format; the format
> is self-terminating and really isn't all that complex.
What do you mean by self-terminating? There is no easy
synchronization like in UTF-8, given you are anywhere inside
a text stream, how do you know (a) you are already in an
escape sequence and (b) how to figure out the rebegin of
normal text.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-29 0:44 ` Jan Engelhardt
@ 2008-03-29 1:07 ` H. Peter Anvin
2008-03-29 6:33 ` David Newall
1 sibling, 0 replies; 22+ messages in thread
From: H. Peter Anvin @ 2008-03-29 1:07 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: John T., linux-kernel
Jan Engelhardt wrote:
>>
>> I think it would be a major lose to move away from ISO 6429 format;
>> the format is self-terminating and really isn't all that complex.
>
> What do you mean by self-terminating? There is no easy
> synchronization like in UTF-8, given you are anywhere inside
> a text stream, how do you know (a) you are already in an
> escape sequence and (b) how to figure out the rebegin of
> normal text.
(a) isn't readily supported (other than scanning backwards), but (b) is
pretty easy, see ISO 6429.
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-29 0:44 ` Jan Engelhardt
2008-03-29 1:07 ` H. Peter Anvin
@ 2008-03-29 6:33 ` David Newall
2008-03-29 17:05 ` H. Peter Anvin
1 sibling, 1 reply; 22+ messages in thread
From: David Newall @ 2008-03-29 6:33 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: H. Peter Anvin, John T., linux-kernel
Jan Engelhardt wrote:
> What do you mean by self-terminating? There is no easy
> synchronization like in UTF-8, given you are anywhere inside
> a text stream, how do you know (a) you are already in an
> escape sequence and (b) how to figure out the rebegin of
> normal text.
It's not very useful being able to tell you are inside an escape sequence
unless you see that sequence from the start. You do need the complete
sequence to make sense of it.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-29 6:33 ` David Newall
@ 2008-03-29 17:05 ` H. Peter Anvin
2008-04-01 20:13 ` Jan Engelhardt
0 siblings, 1 reply; 22+ messages in thread
From: H. Peter Anvin @ 2008-03-29 17:05 UTC (permalink / raw)
To: David Newall; +Cc: Jan Engelhardt, John T., linux-kernel
David Newall wrote:
> Jan Engelhardt wrote:
>> What do you mean by self-terminating? There is no easy
>> synchronization like in UTF-8, given you are anywhere inside
>> a text stream, how do you know (a) you are already in an
>> escape sequence and (b) how to figure out the rebegin of
>> normal text.
>
> It's not very useful being able to tell you are inside an escape sequence
> unless you see that sequence from the start. You do need the complete
> sequence to make sense of it.
I think what Jan is alluding to is the property of UTF-8 text that you
can start in the middle of a string and either skip an incomplete
character or find the beginning of it. If you can search backwards, you
can find the beginning of an escape sequence, too; the "skip incomplete"
functionality is missing, though, but as you say, isn't actually all
that useful in real life *for the applications which use these kinds of
escape sequences.*
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-29 17:05 ` H. Peter Anvin
@ 2008-04-01 20:13 ` Jan Engelhardt
2008-04-01 20:22 ` H. Peter Anvin
2008-04-02 0:02 ` David Newall
0 siblings, 2 replies; 22+ messages in thread
From: Jan Engelhardt @ 2008-04-01 20:13 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: David Newall, John T., linux-kernel
On Saturday 2008-03-29 18:05, H. Peter Anvin wrote:
> David Newall wrote:
>> Jan Engelhardt wrote:
>> > What do you mean by self-terminating? There is no easy
>> > synchronization like in UTF-8, given you are anywhere inside
>> > a text stream, how do you know (a) you are already in an
>> > escape sequence and (b) how to figure out the rebegin of
>> > normal text.
>>
>> It's not very useful being able to tell you are inside an escape sequence
>> unless you see that sequence from the start. You do need the complete
>> sequence to make sense of it.
>
> I think what Jan is alluding to is the property of UTF-8 text that you can
> start in the middle of a string and either skip an incomplete character or
> find the beginning of it. If you can search backwards, you can find the
> beginning of an escape sequence, too; the "skip incomplete" functionality is
> missing, though, but as you say, isn't actually all that useful in real life
> *for the applications which use these kinds of escape sequences.*
No backwards searching, just forwards.
In UTF-8 this is simple. You know you are in the middle of a character
when the highest two bits are 10, and you can skip bytes until the start
of the next character, whose highest bits are anything other than 10
(0x for ASCII, 11 for the lead byte of a multi-byte character).
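The forward skip described here can be sketched in a few lines. This is only an illustration, with the hypothetical name `utf8_resync`; it relies on the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx, so any byte with another pattern starts a character:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Starting anywhere in a UTF-8 byte stream, skip continuation bytes
 * (10xxxxxx) and return a pointer to the first byte that begins a
 * character, or end if none is left in the buffer.
 */
static const unsigned char *utf8_resync(const unsigned char *p,
					const unsigned char *end)
{
	assert(p != NULL && end != NULL && p <= end);
	while (p < end && (*p & 0xC0) == 0x80)
		p++;
	return p;
}
```

The check looks at one byte at a time and needs no state, which is the property the escape-code comparison below is about.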
With the VTxxx escape codes, this is hardly possible. Given a broken
copy of the code ^[[43m,
echo -e '\x1B[43m wonderful \x1B[0m' | cosmicrays | cat
3m wonderful ^[[0m
There is no way to check whether you are in the escape code. And there
is no way to find its end. If a heuristic were to be used (which is
certainly a possibility), you would end up killing text up until the
next ^[.
Hence the proposal of using definite start and end markers:
echo -e '\x1B43m\x1D wonderful \x1B0m\x1D' | cosmicrays | cat
3m^] wonderful ^[0m^]
Ok, finding out whether we are in an escape code is not as easy as with
UTF-8 (the latter of which looks at the current character only), but
still very viable.
Prerequisite to this simple model is that the user does not use an
overly long dumb escape sequence like ^[[43;43;43;43;43;43m, i.e.
that the end marker is in the buffer if we really are in an escape
sequence:
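#include <stdbool.h>
#include <string.h>

/* True if an end marker (0x1D) comes before the next start marker (0x1B). */
static bool in_an_escape_seq(const char *buf)
{
	const char *e = strchr(buf, 0x1D);
	const char *s = strchr(buf, 0x1B);
	return e != NULL && (s == NULL || e < s);
}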
If so, skipping parts of a faulty write() is easy:
/* Skip the tail of a truncated escape sequence, if buf starts inside one. */
static const char *get_out_of_esc(const char *buf)
{
	if (in_an_escape_seq(buf))
		return strchr(buf, 0x1D) + 1;
	return buf;
}
--
make boldconfig -- to boldly select what no one has selected before
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-04-01 20:13 ` Jan Engelhardt
@ 2008-04-01 20:22 ` H. Peter Anvin
2008-04-02 0:02 ` David Newall
1 sibling, 0 replies; 22+ messages in thread
From: H. Peter Anvin @ 2008-04-01 20:22 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: David Newall, John T., linux-kernel
Jan Engelhardt wrote:
>
> There is no way to check whether you are in the escape code. And there
> is no way to find its end.
Right, and wrong, respectively. Read the standard.
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-04-01 20:13 ` Jan Engelhardt
2008-04-01 20:22 ` H. Peter Anvin
@ 2008-04-02 0:02 ` David Newall
2008-04-02 0:38 ` H. Peter Anvin
1 sibling, 1 reply; 22+ messages in thread
From: David Newall @ 2008-04-02 0:02 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: H. Peter Anvin, John T., linux-kernel
Jan Engelhardt wrote:
> Hence the proposal of using definite start and end markers:
>
> echo -e '\x1B43m\x1D wonderful \x1B0m\x1D' | cosmicrays | cat
I see no merit in the idea. Most seriously, there isn't any real-world
problem being solved. In addition, it proposes creating yet another
type of terminal emulation. If there's something you don't like about
VT escape codes, use a different emulation. For example, Televideo
terminals used almost exclusively single-character control codes,
reducing the scope of being mid-sequence to, well much closer to zero.
You need to make quite clear that your proposal is to discontinue use of
VT terminal emulation.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-04-02 0:02 ` David Newall
@ 2008-04-02 0:38 ` H. Peter Anvin
0 siblings, 0 replies; 22+ messages in thread
From: H. Peter Anvin @ 2008-04-02 0:38 UTC (permalink / raw)
To: David Newall; +Cc: Jan Engelhardt, John T., linux-kernel
David Newall wrote:
> Jan Engelhardt wrote:
>> Hence the proposal of using definite start and end markers:
>>
>> echo -e '\x1B43m\x1D wonderful \x1B0m\x1D' | cosmicrays | cat
>
> I see no merit in the idea. Most seriously, there isn't any real-world
> problem being solved. In addition, it proposes creating yet another
> type of terminal emulation. If there's something you don't like about
> VT escape codes, use a different emulation. For example, Televideo
> terminals used almost exclusively single-character control codes,
> reducing the scope of being mid-sequence to, well much closer to zero.
>
> You need to make quite clear that your proposal is to discontinue use of
> VT terminal emulation.
Okay, let's put this to rest once and for all:
*** ISO 6429 sequences are self-terminating. ***
No, you can't tell you're inside one if you miss the leading CSI, but as
has been pointed out, there really isn't a huge case for it.
The standard is available for free under the name ECMA-48:
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf
It references ISO 2022, a.k.a. ECMA-35:
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-035.pdf
These standards use a decimalized hexadecimal notation, so if you see
"05/10" it means 0x5a. A "column" refers to a 16-character set, so
"column 4" refers to bytes 0x40 to 0x4f.
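The column/row notation maps to a byte value with a shift and an OR. A small sketch, with `bitcomb` as a made-up name:

```c
#include <assert.h>

/*
 * Convert the ECMA-48 "column/row" notation (e.g. 05/10) to the byte
 * it denotes: the column is the high nibble, the row the low nibble.
 */
static unsigned char bitcomb(unsigned int col, unsigned int row)
{
	assert(col <= 15 && row <= 15);
	return (unsigned char)((col << 4) | row);
}
```

So `bitcomb(5, 10)` is 0x5a, `bitcomb(1, 11)` is ESC (0x1b), and `bitcomb(9, 11)` is the 8-bit CSI (0x9b).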
The structure defined in section 5.4 of ISO 6429/ECMA-48:
-----------
5.4 Control sequences
A control sequence is a string of bit combinations starting with the
control function CONTROL SEQUENCE INTRODUCER (CSI) followed by one or
more bit combinations representing parameters, if any, and by one or
more bit combinations identifying the control function. The control
function CSI itself is an element of the C1 set.
The format of a control sequence is
CSI P ... P I ... I F
where
a) CSI is represented by bit combinations 01/11 (representing ESC) and
05/11 in a 7-bit code or by bit combination 09/11 in an 8-bit code, see 5.3;
b) P ... P are Parameter Bytes, which, if present, consist of bit
combinations from 03/00 to 03/15;
c) I ... I are Intermediate Bytes, which, if present, consist of bit
combinations from 02/00 to 02/15. Together with the Final Byte F, they
identify the control function;
NOTE The number of Intermediate Bytes is not limited by this Standard;
in practice, one Intermediate Byte will be sufficient since with sixteen
different bit combinations available for the Intermediate Byte over one
thousand control functions may be identified.
d) F is the Final Byte; it consists of a bit combination from 04/00 to
07/14; it terminates the control sequence and together with the
Intermediate Bytes, if present, identifies the control function. Bit
combinations 07/00 to 07/14 are available as Final Bytes of control
sequences for private (or experimental) use.
-----------
Note: DEC added nonstandard control sequences initiated with SS3 (ESC O)
as well as CSI (ESC [); otherwise they use the same format.
The Final Byte is easy enough to spot, as is writing a generic parser
which can pick this apart, including parameter handling.
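A minimal sketch of such a forward scan, following the byte ranges quoted above and assuming a NUL-terminated buffer. The name `csi_skip` is hypothetical, and parameter values are skipped rather than parsed:

```c
#include <assert.h>
#include <stddef.h>

/*
 * s points just past the CSI introducer (ESC [ or 0x9b). Skip the
 * Parameter Bytes (03/00..03/15) and Intermediate Bytes (02/00..02/15),
 * then expect one Final Byte (04/00..07/14). Returns a pointer to the
 * first byte after the sequence, or NULL if the sequence is malformed
 * or the buffer ends before the Final Byte.
 */
static const char *csi_skip(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;

	assert(s != NULL);
	while (*p >= 0x30 && *p <= 0x3F)	/* parameter bytes */
		p++;
	while (*p >= 0x20 && *p <= 0x2F)	/* intermediate bytes */
		p++;
	if (*p >= 0x40 && *p <= 0x7E)		/* final byte */
		return (const char *)p + 1;
	return NULL;
}
```

Applied to the truncated remnant "3m wonderful" from earlier in the thread, the same scan stops after the "m". What it cannot do is tell whether the bytes it started on were part of a sequence at all, which is the point conceded above.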
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-03-28 23:26 ` H. Peter Anvin
2008-03-29 0:07 ` Jan Engelhardt
@ 2008-04-06 8:46 ` Marko Macek
2008-04-06 10:14 ` David Newall
2008-04-06 16:37 ` H. Peter Anvin
1 sibling, 2 replies; 22+ messages in thread
From: Marko Macek @ 2008-04-06 8:46 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Jan Engelhardt, John T., linux-kernel
H. Peter Anvin wrote:
> One would also like to distinguish, say, Backspace from Ctrl-H. This is
> trickier, because the termios settings don't permit compound keys. The
> most obvious way to deal with that is an escape code for Ctrl-H, but
> that has the risk of breaking a lot of other things.
Backspace is not a problem, since it generates ^? (DEL/127) on Linux
since the early days.
It would be really nice to be able to get arbitrary modifier combinations for all keys
and a separate combination for the escape key.
Mark
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-04-06 8:46 ` Marko Macek
@ 2008-04-06 10:14 ` David Newall
2008-04-06 16:37 ` H. Peter Anvin
1 sibling, 0 replies; 22+ messages in thread
From: David Newall @ 2008-04-06 10:14 UTC (permalink / raw)
To: Marko Macek; +Cc: H. Peter Anvin, Jan Engelhardt, John T., linux-kernel
Marko Macek wrote:
> H. Peter Anvin wrote:
>
>> One would also like to distinguish, say, Backspace from Ctrl-H. This
>> is trickier, because the termios settings don't permit compound
>> keys. The most obvious way to deal with that is an escape code for
>> Ctrl-H, but that has the risk of breaking a lot of other things.
>
> Backspace is not a problem, since it generates ^? (DEL/127) on Linux
> since the early days.
And yet, Ctrl/H *is* backspace. Look it up in any ASCII chart. Let's
not make a virtue out of ignoring or breaking standards.
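For readers without an ASCII chart at hand, the caret notation used throughout this thread maps to byte values by flipping bit 6. A small sketch, with `caret_to_byte` as a made-up name:

```c
#include <assert.h>

/*
 * Map the character after the caret in notation such as ^H or ^? to
 * the byte it denotes. Valid for '?' and for '@' through '_'.
 */
static unsigned char caret_to_byte(char c)
{
	assert(c >= '?' && c <= '_');
	return (unsigned char)(c ^ 0x40);
}
```

This gives 0x08 (BS) for ^H, 0x7f (DEL) for ^?, 0x1b (ESC) for ^[ and 0x1d for ^], so the two candidate Backspace codes are distinct bytes; the disagreement is over which one the Backspace key should send.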
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: UTF-8 and Alt key in the console
2008-04-06 8:46 ` Marko Macek
2008-04-06 10:14 ` David Newall
@ 2008-04-06 16:37 ` H. Peter Anvin
1 sibling, 0 replies; 22+ messages in thread
From: H. Peter Anvin @ 2008-04-06 16:37 UTC (permalink / raw)
To: Marko Macek; +Cc: Jan Engelhardt, John T., linux-kernel
Marko Macek wrote:
> H. Peter Anvin wrote:
>
>> One would also like to distinguish, say, Backspace from Ctrl-H. This
>> is trickier, because the termios settings don't permit compound keys.
>> The most obvious way to deal with that is an escape code for Ctrl-H,
>> but that has the risk of breaking a lot of other things.
>
> Backspace is not a problem, since it generates ^? (DEL/127) on Linux
> since the early days.
>
> It would be really nice to be able to get arbitrary modifier combinations
> for all keys and a separate combination for the escape key.
>
Yes; this probably needs to be modal, but we can probably live with that.
-hpa
^ permalink raw reply [flat|nested] 22+ messages in thread