I have a couple of log files that are each around 100MB in size. Working with such large files is inconvenient. I know that the log lines I'm interested in span only 200 to 400 lines.
What would be a suitable technique for extracting the relevant log lines from these files, i.e. simply piping a given range of line numbers to another file?
Here is an example of the inputs:
Filename: MyHugeLogFile.log
Starting line number: 38438
Ending line number: 39276
Is there a Cygwin command that will cat out only that range of the file? I know that if I can display that range on stdout, I can also redirect it to an output file.
Note: I've added the Linux tag to increase visibility, but I'm looking for a solution that will work in Cygwin. (In most cases, Linux commands work in Cygwin.)
Asked by bits
This appears to be a job for sed:
sed -n '8,12p' yourfile
Lines 8 through 12 of your file will be sent to standard out.
If you also want each line prefixed with its line number, run cat -n first:
cat -n yourfile | sed -n '8,12p'
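For the range in the question, the same sed approach might look like the sketch below. The file is generated with seq purely as a stand-in for the real log, the variable names are illustrative, and the extra q command is an optional addition (not part of the answer above) that makes sed quit once the last wanted line has been printed.

```shell
# Stand-in for the real log: a 40,000-line file where each line is its own number.
workdir=$(mktemp -d)
seq 1 40000 > "$workdir/MyHugeLogFile.log"

start=38438   # first line to keep (from the question)
end=39276     # last line to keep (from the question)

# -n suppresses the default output; "p" prints lines start..end;
# "q" quits at line $end so sed does not scan the remaining lines.
sed -n "${start},${end}p;${end}q" "$workdir/MyHugeLogFile.log" > "$workdir/excerpt.log"

# Number of extracted lines (both endpoints are included).
wc -l < "$workdir/excerpt.log"
```

On a 100MB file the early quit mostly matters when the range sits near the top; for a range near the end, sed still has to read everything before it.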
Answered by Johnsyweb
To determine the total number of lines, use wc -l.
Then you can combine head and tail to get the desired range. Assume the log has 40,000 lines and you want lines 38438 through 39276: take the last 1563 lines, then keep the first 839 of those. So:
tail -n 1563 MyHugeLogFile.log | head -n 839 | ....
There’s surely a simpler method to do it with sed or awk.
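The head and tail counts can be derived from the line numbers rather than worked out by hand. A minimal sketch, assuming the range from the question and a seq-generated stand-in file (the variable names are illustrative):

```shell
workdir=$(mktemp -d)
log="$workdir/MyHugeLogFile.log"
seq 1 40000 > "$log"                # stand-in for the real log

start=38438
end=39276

total=$(wc -l < "$log")             # total number of lines in the file
from_end=$((total - start + 1))     # lines from $start to the end of the file
count=$((end - start + 1))          # lines in the wanted range, endpoints included

# Keep the tail starting at $start, then only the first $count lines of it.
tail -n "$from_end" "$log" | head -n "$count" > "$workdir/excerpt.log"

# Show the counts that were computed.
echo "tail -n $from_end | head -n $count"
```

The +1 in both expressions is there because the start and end lines are themselves part of the range.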
Answered by David
I came across this topic while trying to split a file into files of 100,000 lines each. For that, split is a better option than sed:
split -l 100000 database.sql database-
With the default two-letter suffix, it will produce files such as:
database-aa database-ab database-ac ...
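To see what split does before running it on a real dump, you can try it on generated data. This is only a sketch: the 250,000-line file made with seq is a stand-in for database.sql, and it runs in a scratch directory so nothing else is touched.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real dump: 250,000 numbered lines.
seq 1 250000 > database.sql

# Write consecutive 100,000-line chunks whose names start with "database-".
split -l 100000 database.sql database-

# Show how many lines ended up in each chunk.
wc -l database-*
```

The last chunk simply holds whatever is left over, and concatenating the chunks in name order (cat database-*) gives back the original file.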
Answered by Dorian
If you just want to clip a section of a file, say from line 26 to 142, and append it to a new file:
sed -n '26,142p' file-to-cut.txt >> new-file.txt
Answered by Marc Perrin-Pelletier
How about this:
$ seq 1 100000 | tail -n +10000 | head -n 10
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009
It uses tail to output everything from line 10,000 onwards, and then head to keep only the first 10 of those lines.
With sed, you get nearly the same result (one extra line, because the range 10000,10010 includes both endpoints):
$ seq 1 100000 | sed -n '10000,10010p'
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009
10010
This one has the benefit of allowing you to directly input the line range.
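If you prefer the tail and head pipeline but still want to give the range as a start and an end line number, the count for head can be computed from the two. A sketch, where print_range is a hypothetical helper name (not a standard command) and the input file is generated with seq:

```shell
# print_range FILE START END
# Hypothetical helper: prints lines START..END of FILE, endpoints included.
print_range() {
    # tail -n +START begins output at line START; head keeps END-START+1 lines.
    tail -n +"$2" "$1" | head -n "$(( $3 - $2 + 1 ))"
}

workdir=$(mktemp -d)
seq 1 100000 > "$workdir/numbers.txt"   # stand-in input file

# Print lines 10000 through 10009.
print_range "$workdir/numbers.txt" 10000 10009
```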
Answered by thkala
Post is based on https://stackoverflow.com/questions/5683367/how-to-cropcut-text-files-based-on-starting-and-ending-line-numbers-in-cygwin