Request for volunteers to vet sed error case

markfilipak
Level 5
Posts: 979
Joined: Sun Mar 10, 2013 8:08 pm

Request for volunteers to vet sed error case

Post by markfilipak »

Below is an email I'm preparing to send to bug-sed@gnu.org.
Before I send it, I need someone to vet the error, preferably someone who is not running Mint in a VM.
Any volunteers?

Of course, your ideas are welcome.

Thanks,
Mark.
=====
Kindly expedite this and contact me for any reason.

The file "1,073,709,056 bytes" provokes an error (and produces no output), but only when the xxd output is piped through 'tr' and only for one particular pattern: /00000100/.
The file "1,073,739,776 bytes" succeeds with identical parameters.
The pipe through 'tr' therefore does not appear to be the problem in itself.

$ sed --version
sed (GNU sed) 4.2.2
...
$ xxd -p -u "1,073,709,056 bytes" | tr -d '\n' | sed -r 's/00000100/\x0D\x0A&/g' > foo.txt
sed: couldn't re-allocate memory
$ xxd -p -u "1,073,709,056 bytes" | tr -d '\n' | sed -r 's/000001/\x0D\x0A&/g' > foo.txt
$ xxd -p -u "1,073,709,056 bytes" | sed -r 's/00000100/\x0D\x0A&/g' > foo.txt
$ xxd -p -u "1,073,739,776 bytes" | tr -d '\n' | sed -r 's/00000100/\x0D\x0A&/g' > foo.txt
$

You probably want the two source files, or perhaps only the source file that provokes the error. Kindly let me know how I can send them to you.
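
For anyone vetting this without the original files, here is a minimal sketch of what the pipeline is meant to do, run on a tiny made-up sample. The 12-byte sample and the helper name hex_stream are assumptions for illustration only; the helper falls back to od when xxd is not installed. It does not reproduce the error, it only shows the intended output.

```shell
#!/bin/sh
# hex_stream (hypothetical helper): hex-dump a file as one unbroken
# line of uppercase hex digits, like `xxd -p -u FILE | tr -d '\n'`.
hex_stream() {
    if command -v xxd >/dev/null 2>&1; then
        xxd -p -u "$1" | tr -d '\n'
    else
        # Stand-in when xxd is missing: od prints lowercase hex with spaces.
        od -An -v -tx1 "$1" | tr -d ' \n' | tr 'a-f' 'A-F'
    fi
}

# Made-up 12-byte sample containing the bytes 00 00 01 00 twice.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '\377\000\000\001\000\252\000\000\001\000\273\314' > "$sample"

# Same sed expression as in the report (GNU sed): insert CR LF
# in front of every occurrence of 00000100 in the hex stream.
result=$(hex_stream "$sample" | sed -r 's/00000100/\x0D\x0A&/g')
printf '%s\n' "$result"
```

On the sample, each occurrence of 00000100 should start a new CR LF terminated line in the output.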
xenopeek
Level 24
Posts: 24983
Joined: Wed Jul 06, 2011 3:58 am
Location: The Netherlands

Re: Request for volunteers to vet sed error case

Post by xenopeek »

I'm curious, what does this answer:

wc -l "1,073,709,056 bytes"
wc -l "1,073,739,776 bytes"
From your description, and sed being line based, I suppose the smaller file has more lines?
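
Since sed works line by line, it may also help to look at the longest line it is actually handed, not only at the line count. A minimal sketch, assuming a POSIX awk; the function name longest_line is made up for illustration:

```shell
#!/bin/sh
# longest_line (hypothetical helper): print the length of the
# longest line read from standard input.
longest_line() {
    awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }'
}

# Small demonstration: two short lines, then the same text with the
# newlines deleted, which turns it into a single longer line.
printf 'ab\ncdef\n' | longest_line
printf 'ab\ncdef\n' | tr -d '\n' | longest_line
```

With the real data this would be something like xxd -p -u FILE | longest_line. By default xxd -p wraps its output at 60 hex digits per line, whereas after tr -d '\n' the whole dump is one line of two hex digits per input byte, so for the joined stream a plain wc -c gives the line length just as well.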
markfilipak
Level 5
Posts: 979
Joined: Sun Mar 10, 2013 8:08 pm

Re: Request for volunteers to vet sed error case

Post by markfilipak »

xenopeek wrote:
Wed Oct 14, 2020 5:02 pm
I'm curious, what does this answer:

wc -l "1,073,709,056 bytes"
wc -l "1,073,739,776 bytes"
From your description, and sed being line based, I suppose the smaller file has more lines?

$ wc -l "1,073,709,056 bytes"
4758494 1,073,709,056 bytes
$ wc -l "1,073,739,776 bytes"
4647236 1,073,739,776 bytes
They are both binary files (i.e. DVD 'VOB's).
xenopeek
Level 24
Posts: 24983
Joined: Wed Jul 06, 2011 3:58 am
Location: The Netherlands

Re: Request for volunteers to vet sed error case

Post by xenopeek »

I can't say whether that is the cause, but the smaller file does, as I suspected, have more lines. 111,258 extra lines is only about 2.4% of the total number of lines, but perhaps that is what pushes sed's memory allocation over the limit somehow.

Thinking some more on it, what happens if you split this into two steps instead of using a pipe?
1. xxd -p -u "1,073,709,056 bytes" | tr -d '\n' > temp
2. sed -r 's/00000100/\x0D\x0A&/g' temp > foo.txt
Does that give the same error on step 2?
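
The two steps could be scripted roughly like this, recording sed's exit status and the output size so that an "error and zero output" result is easy to spot. This is only a sketch: the tiny hex string written in step 1 is a made-up stand-in, and the real step 1 is shown in the comment.

```shell
#!/bin/sh
temp=$(mktemp)
out=$(mktemp)
trap 'rm -f "$temp" "$out"' EXIT

# Step 1 (stand-in): write a small hex stream to a temporary file.
# With the real data this would be:
#   xxd -p -u "1,073,709,056 bytes" | tr -d '\n' > "$temp"
printf 'FF00000100AA00000100BBCC' > "$temp"

# Step 2: run sed (GNU sed, as in the report) on the file instead of
# on a pipe, and keep its exit status.
sed -r 's/00000100/\x0D\x0A&/g' "$temp" > "$out"
status=$?

echo "input bytes: $(wc -c < "$temp")"
echo "sed exit status: $status"
echo "output bytes: $(wc -c < "$out")"
```

A non-zero exit status together with an output size of 0 would match the behaviour described in the first post.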

I suppose there is one more thing to test. If you make a copy of the larger file, replace 111,258 characters in it with newlines (0x0A), and then retest, does the larger file now give the same error?
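
One way to make such a copy, as a rough sketch: copy the file and overwrite its first N bytes with newlines using dd. Which bytes get replaced is not specified above, so using the first N is purely an assumption, and the function name add_newlines and the file name copy.vob are made up for illustration.

```shell
#!/bin/sh
# add_newlines (hypothetical helper)
#   $1 = source file, $2 = copy to create, $3 = number of bytes to overwrite
# Copies the source and overwrites the first $3 bytes of the copy with
# 0x0A. The source file is left untouched. $3 should not exceed the
# file size, otherwise the copy grows.
add_newlines() {
    cp -- "$1" "$2" || return 1
    head -c "$3" /dev/zero | tr '\0' '\n' |
        dd of="$2" conv=notrunc 2>/dev/null   # notrunc: keep the rest of the copy
}
```

Usage would be along the lines of add_newlines "1,073,739,776 bytes" copy.vob 111258, followed by wc -l copy.vob to confirm the new line count before rerunning the sed pipeline on the copy.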

BTW, I'm lost as to what the output in foo.txt would be for.