From: Rolf Eike Beer <eb@emlix.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Junio C Hamano <gitster@pobox.com>
Cc: Git List Mailing <git@vger.kernel.org>,
	Tobias Ulmer <tu@emlix.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: data loss when doing ls-remote and piped to command
Date: Fri, 17 Sep 2021 08:59:07 +0200
Message-ID: <2722184.bRktqFsmb4@devpool47>
In-Reply-To: <xmqq7dfgtfpt.fsf@gitster.g>

On Thursday, 16 September 2021, 22:42:22 CEST, Junio C Hamano wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> > On Thu, Sep 16, 2021 at 5:17 AM Rolf Eike Beer <eb@emlix.com> wrote:
> >> On Thursday, 16 September 2021, 12:12:48 CEST, Tobias Ulmer wrote:
> >> > > The redirection seems to be an important part of it. I now did:
> >> > > 
> >> > > git ... 2>&1 | sha256sum
> >> > 
> >> > I've tried to reproduce this since yesterday, but couldn't until now:
> >> > 
> >> > 2>&1 made all the difference, took less than a minute.
> > 
> > So if that redirection is what matters, and what causes problems, I
> > can almost guarantee that the reason is very simple:
> > ...
> > Anyway. That was a long email just to tell people it's almost
> > certainly user error, not the kernel.
> 
> Yes, 2>&1 will mix messages from the standard error stream at random
> places in the output, which explains the checksum quite well.

If there were any errors, yes. The point is: if I run the command a hundred
times with ">/dev/null", so that only stderr reaches the terminal, there is
never any output on stderr at all. If I redirect stderr into a file, the file
is empty after all of this (yes, I did append, not overwrite).
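
The check described above can be sketched roughly as follows. This is only an
illustration, not the exact procedure used: the function name, the log path
and the placeholder command are assumptions, and the real ls-remote invocation
has to be substituted by hand.

```python
import os
import subprocess
import sys


def stderr_log_size(cmd, log_path, runs=100):
    """Run cmd `runs` times with stdout discarded and stderr appended
    to log_path; return the size of the log in bytes afterwards."""
    for _ in range(runs):
        # "ab" appends, so output from earlier runs is never overwritten.
        with open(log_path, "ab") as log:
            subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=log,
                           check=False)
    return os.path.getsize(log_path)


if __name__ == "__main__":
    # Hypothetical stand-in; replace with the real command under test.
    placeholder = [sys.executable, "-c", "print('refs')"]
    print(stderr_log_size(placeholder, "stderr.log", runs=5))
```

A result of 0 means that nothing was written to stderr in any of the runs.
Note that in this variant fd 2 is a regular file and not a terminal.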

That the particular construct in this case is sort of nonsense is granted, I 
just hit it because some tool here used some very similar construct and 
suddenly started failing. "less" isn't the original reproducer, it was just 
something I started testing with to be able to easily visually inspect the 
output.

What you need is a _fast_ git server. kernel.org or github.com seem to be too 
slow for this if you don't sit somewhere in their datacenter. Use something in 
your local network, a Xeon E5 with lots of RAM and connected with 1 Gbit/s 
Ethernet in my case.

And the reader must be "somewhat" slow. Using sha256sum works reliably for me. 
Using "wc -l" does not; md5sum and sha1sum also seem to be too fast.
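
A test loop along these lines might look like the sketch below. It is an
assumption-laden stand-in, not the original reproducer: the reader is a Python
loop with an optional artificial delay instead of sha256sum, so its timing
differs, and the function name, chunk size and delay are made up for
illustration. It merges stderr into stdout (like "2>&1"), hashes what arrives
and collects the distinct digests over several runs.

```python
import hashlib
import subprocess
import sys
import time


def digests_over_runs(cmd, runs=10, chunk_size=4096, delay=0.0):
    """Run cmd `runs` times with stderr merged into stdout and return
    the set of SHA-256 hex digests of the combined output."""
    seen = set()
    for _ in range(runs):
        with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT) as proc:
            digest = hashlib.sha256()
            while True:
                chunk = proc.stdout.read(chunk_size)
                if not chunk:  # empty read means EOF on the pipe
                    break
                digest.update(chunk)
                if delay:
                    time.sleep(delay)  # artificially slow reader
        seen.add(digest.hexdigest())
    return seen


if __name__ == "__main__":
    # Hypothetical stand-in; replace with the real command under test.
    placeholder = [sys.executable, "-c", "print('refs')"]
    print(len(digests_over_runs(placeholder, runs=5, delay=0.001)))
```

More than one digest in the returned set means that the output differed
between runs.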

When I run the whole thing with strace I can't see the effect, which isn't 
really surprising. But the traces do differ in one place. With the 
redirection "2>&1":

ioctl(2, TCGETS, 0x7ffd6f119b10)        = -1 ENOTTY (Inappropriate ioctl for device)

and without:

ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0

AFAICT this is the only place where fd 2 is used at all during the whole time.
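
For reference, a TCGETS ioctl on a descriptor is what a tcgetattr()-based
terminal check (such as isatty() in glibc) issues on Linux. Whether that is
where the call above comes from is an assumption; the trace does not say so.
A minimal sketch of such a probe, with a hypothetical helper name:

```python
import errno
import os
import termios


def tty_probe(fd):
    """Return (True, None) if fd refers to a terminal, otherwise
    (False, errno_value) as reported by tcgetattr()."""
    try:
        termios.tcgetattr(fd)  # issues the terminal-attributes ioctl
    except termios.error as exc:
        return False, exc.args[0]
    return True, None


if __name__ == "__main__":
    # fd 2 run from a terminal versus with stderr redirected to a pipe.
    print("fd 2:", tty_probe(2))
    r, w = os.pipe()
    print("pipe:", tty_probe(w), "ENOTTY =", errno.ENOTTY)
    os.close(r)
    os.close(w)
```

With "2>&1 | ..." fd 2 is the pipe, so the probe fails with ENOTTY, which
matches the first trace line; without the redirection fd 2 is the terminal
and the call succeeds.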

Regards,

Eike
-- 
Rolf Eike Beer, emlix GmbH, http://www.emlix.com
Phone +49 551 30664-0, Fax +49 551 30664-11
Gothaer Platz 3, 37083 Göttingen, Germany
Registered office: Göttingen, Göttingen District Court HR B 3160
Management: Heike Jordan, Dr. Uwe Kracke – VAT ID: DE 205 198 055

emlix - smart embedded open source
