Sorting, joining, shuffling, skipping and numbering lines on Linux


Whenever you need to work with lists that are stored as text files on Linux – especially long ones – you can take advantage of some simple commands that make manipulating them a lot easier. Any text file can be sorted, but you can also randomly arrange the lines, number them or join files when two share an initial common field. In fact, if you only want to see every other line or every fifth line in a file, you can do that too. This post runs through the commands to do all of these things.

Sorting files

The sort command makes sorting text files very easy. To view the contents of a text file in sorted order, all you need to do is type a command like this:

$ sort myfile

If you want to save the output or add it to the bottom of another file, one of the commands below will do that for you.

$ sort myfile > sortedfile
$ sort myfile >> otherfile

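Once you add lines to an existing file as shown in the second command above, you may need to sort it again. Redirecting the output of sort straight back into the same file would empty the file before sort reads it, so the commands below write the sorted output to a new file and then rename it to the original name.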

$ sort otherfile > otherfile.new
$ mv otherfile.new otherfile
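
As a possible shortcut for the two commands above, the sort command's -o option names an output file, and that file can be the input file itself. The sketch below is only an illustration; the throwaway directory and the file contents are made up for the demonstration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up, unsorted contents; the name otherfile follows the example above.
printf 'pear\napple\nmango\n' > otherfile

# -o names the output file. sort reads all of its input before writing,
# so the output file can safely be the same as the input file.
sort -o otherfile otherfile

inplace_result=$(cat otherfile)
printf '%s\n' "$inplace_result"
```

The file keeps its original name, and no temporary copy has to be renamed afterward.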

The sort command also has quite a few options. For example, if you have a file of dates that start with abbreviated month names, you can display them in calendar-month order (January first, regardless of year) with the -M option, as in the command on the right below:

$ cat birthdays             $ sort -M birthdays
Jan 4, 1972                 Jan 4, 1972
Mar 18, 1949                Jan 8, 1954
May 1, 1976                 Mar 18, 1949
Jan 8, 1954                 May 1, 1976
Sep 23, 1979                Aug 6, 1956
Aug 6, 1956                 Sep 23, 1979
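
Note that -M compares only the month names, so the years play no part in the order shown above. If a fully chronological listing is wanted instead (an assumption about the goal, not something this post requires), one approach is to give sort several keys. The sketch below reuses the birthdays data in a throwaway directory.

```shell
# Recreate the birthdays file from the listing above in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Jan 4, 1972' 'Mar 18, 1949' 'May 1, 1976' \
              'Jan 8, 1954' 'Sep 23, 1979' 'Aug 6, 1956' > birthdays

# LC_ALL=C makes sort recognize the English month abbreviations.
# Keys, in priority order: field 3 (year) numerically,
# field 1 (month) by month name, field 2 (day) numerically.
by_date=$(LC_ALL=C sort -k3,3n -k1,1M -k2,2n birthdays)
printf '%s\n' "$by_date"
```

With this data the oldest date, Mar 18, 1949, comes first and Sep 23, 1979 comes last.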

To sort a long list of colors and display them in columns, use a command like this one:

$ sort colors | column
Aqua            Brown           Gold            Navy blue       Purple          Tomato          Yellow
Azure           Chocolate       Green           Navy blue       Red             Turquoise
Black           Cyan            Grey            Olive           Salmon          Violet
Blue            Cyan            Lime            Orange          Silver          Wheat
Bronze          Dark blue       Maroon          Pink            Teal            White
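
The sorted list above shows Cyan and Navy blue twice because the file contains repeats. If duplicates are not wanted, the -u (unique) option of sort is one way to drop them. This is a minimal sketch that uses a short, made-up excerpt rather than the full colors file.

```shell
# Work in a throwaway directory with a made-up excerpt that contains repeats.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Cyan' 'Navy blue' 'Gold' 'Cyan' 'Navy blue' > colors

# -u keeps only one copy of each set of identical lines.
unique_colors=$(sort -u colors)

# Pipe this through column, as above, for a multi-column display.
printf '%s\n' "$unique_colors"
```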

Shuffling lines

To randomly arrange the lines in a text file, use the shuf (shuffle) command. For example, if you want to shuffle a list of friends each month to randomly select who to take out to lunch, you could use a command like this:

$ shuf friends | head -2
Sam
Patty

Run the command a few times in a row and you should get a different listing each time.
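
The shuf command can also limit its own output with the -n option, which makes the pipe to head unnecessary. The sketch below assumes a made-up friends file created in a throwaway directory.

```shell
# Work in a throwaway directory with a made-up list of names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Sam' 'Patty' 'Lee' 'Dana' 'Chris' > friends

# -n 2 asks shuf to output at most two randomly chosen lines.
picks=$(shuf -n 2 friends)
printf '%s\n' "$picks"
```

Because the selection is random, the two names printed will vary from run to run.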

Sorting by number or text

If you want to sort the lines of a file numerically (assuming they are not listed numerically to begin with), use the -n option of the sort command. Remember, however, that any lines that don’t start with a number are treated as zero, so they will appear before lines that start with positive numbers.

$ sort -n story | head -5
1       Once upon a time
2       There was a Linux elf
3       who liked to surprise
4       users by introducing
5       new commands.
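
To see why -n matters, it can help to compare a plain sort with a numeric sort on the same input. The file below is made up for the comparison, and LC_ALL=C is set only so the text ordering is the same on every system.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up file: three numbered lines plus one line with no leading number.
printf '%s\n' '10 ten' '9 nine' 'header' '2 two' > numbers

# A plain sort compares character by character, so "10" lands before "2".
text_order=$(LC_ALL=C sort numbers)

# -n compares leading numbers; "header" has none, counts as 0 and comes first.
numeric_order=$(LC_ALL=C sort -n numbers)

printf 'text order:\n%s\n\nnumeric order:\n%s\n' "$text_order" "$numeric_order"
```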

Displaying every Nth line from a text file

The awk command provides a way to view every other, third, fourth or Nth line in a file by testing its built-in NR (record number) variable, as shown in the commands below. The first command ensures that only the 2nd, 4th, 6th, etc. lines are displayed. The second would display every 7th line. Think of the first as saying “if the line number divided by 2 leaves a remainder of 0, then display it.”

$ awk 'NR % 2 == 0' filename
$ awk 'NR % 7 == 0' filename

Here are two examples, one displaying every second line and the other every third line of a file. The file being used has numbered lines to make what’s happening clearer.

$ awk 'NR % 2 == 0' myfile | head -6
2       There was a Linux elf
4       users by introducing
6
8       didn't know much about
10      line. As a result, none
12      tried actually worked
$ awk 'NR % 3 == 0' myfile | head -6
3       who liked to surprise
6
9       working on the command
12      tried actually worked
15      That's all we know about
18      command "cheat sheet" and

You can do the same thing with a list of colors, although the output lines will not be numbered. This command displays the 13th and 26th lines in the colors file.

$ awk 'NR % 13 == 0' colors
Turquoise
Chocolate
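
The interval does not have to be typed into the awk program itself. As a minimal sketch, the -v option can pass it in as a variable (the name n is just a choice made for this example), and comparing the remainder to 1 instead of 0 starts the selection at the first line. The seq command supplies numbered stand-in input.

```shell
# seq 10 prints the numbers 1 through 10, one per line, as stand-in input.
# -v n=3 passes the interval to awk as a variable instead of hard-coding it.
every_third=$(seq 10 | awk -v n=3 'NR % n == 0')

# A remainder of 1 selects the 1st line, then every 3rd line after it.
from_first=$(seq 10 | awk -v n=3 'NR % n == 1')

# The unquoted expansions join the selected lines with spaces for display.
printf 'every 3rd line: %s\n' "$(echo $every_third)"
printf 'starting at line 1: %s\n' "$(echo $from_first)"
```

The first selection contains lines 3, 6 and 9; the second contains lines 1, 4, 7 and 10.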

Numbering lines

To number lines in a text file, use the nl (number lines) command. In the example below, the command adds line numbers to the colors file.

$ nl colors
     1  Black
     2  Grey
     3  Red
     4  Blue
     5  Orange
     6  White
     7  Brown
     8  Pink
     9  Yellow
    10  Green
    11  Purple
    12  Maroon
    13  Turquoise
    14  Cyan
    15  Navy blue
    16  Gold
    17  Tomato
    18  Teal
    19  Lime
    20  Cyan
    21  Wheat
    22  Salmon
    23  Olive
    24  Aqua
    25  Violet
    26  Chocolate
    27  Azure
    28  Silver
    29  Bronze
    30  Dark blue
    31  Navy blue
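
One behavior worth knowing is that nl, by default, does not number empty lines. If every line should be counted, the -b a option (number all body lines) is one way to ask for that. The sketch below uses a made-up three-line file whose middle line is empty.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up file: two colors separated by an empty line.
printf 'Black\n\nGrey\n' > sample

# By default nl leaves the empty line unnumbered, so Grey gets number 2.
default_last=$(nl sample | tail -n 1 | awk '{print $1}')

# -b a numbers every line, including the empty one, so Grey gets number 3.
all_last=$(nl -b a sample | tail -n 1 | awk '{print $1}')

printf 'default: %s, with -b a: %s\n' "$default_last" "$all_last"
```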

Wrap-up

As you can see, Linux provides a lot of handy commands for manipulating the content of text files. The man pages for the commands explained in this post will offer additional insights into how these commands work.

Copyright © 2023 IDG Communications, Inc.


