Random identity generation in Linux
If you need to generate a list of names and addresses to test an application or a script that you’re working on, Linux can make that surprisingly easy. There’s a command called “rig” that will create name, address and phone number listings. As far as I can tell, out of the box, it only works with U.S. addresses and area codes. If that is indeed the case, you might still be able to work around the limitation.
To use the rig command, you can just type “rig” on the command line, and a single name and address will be generated. You will see something like this:
$ rig
Mavis English
1015 Tulip St
Anderson, IN 46018
(317) xxx-xxxx
To generate a list with many addresses, use the -c option and specify the number of addresses that you want to see.
$ rig -c 3
Curt Rhodes
750 Orrand Dr
Kinston, NC 28501
(919) xxx-xxxx

Glenna Sheppard
531 Buncaneer Dr
Seattle, WA 98109
(206) xxx-xxxx

Georgina Burke
840 Plinfate St
Orlando, FL 32802
(407) xxx-xxxx

You’ve probably noticed that the phone numbers in these identity records have an area code, but only a series of x’s in place of the remaining digits. Later in this post, I’ll demonstrate one way that you can get beyond this.
If, for some reason, you need only male or female names in your generated list, you can use the -m (male) or -f (female) option.
$ rig -c 3 -m             $ rig -f -c 3
Eduardo Mathis            Alicia Lara
183 Kennel Ln             853 Willow Rd
Appleton, WI 54911        Roanoke, VA 24022
(414) xxx-xxxx            (703) xxx-xxxx

Tristan Mckee             Mindy Romero
608 Lake Dr               846 Burnet Dr
Miami, FL 33152           Emporia, KS 66801
(305) xxx-xxxx            (316) xxx-xxxx

Randy Chavez              Ina Morris
654 Bourg St              556 Cedarwood Ln
Spokane, WA 99210         Passadena, CA 91109 <== oops!
(509) xxx-xxxx            (818) xxx-xxxx
It’s easy to redirect the output to a file to save it for your intended use.
$ rig -c 100 > IDs
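If the application you’re testing wants one record per line rather than four, the saved output can be flattened with awk. This is only a sketch: it assumes each record is four lines long and that records are separated by a blank line, as in the listings above. The records_to_tsv name is my own invention and not part of rig.

```shell
#!/bin/bash
# Sketch: flatten rig-style records (four lines each, blank line between
# records) into one tab-separated line per record.
# records_to_tsv is a hypothetical helper name, not part of rig.
records_to_tsv() {
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = "\t" }
         { $1 = $1; print }    # reassigning $1 rebuilds the record with OFS
        '
}

# Usage (assumes rig is installed):
#   rig -c 100 | records_to_tsv > IDs.tsv
```

Tabs are used as the separator here because the city line already contains a comma.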
Putting your rig command into a script might make it a little easier to use, though it doesn’t add much to the command. In this gen_random_IDs script, we prompt the user for the number of identity records to be generated (unless it was supplied as an argument) and redirect the output into a file. It appends the script’s process ID ($$) to the file name (e.g., IDs.3255) to lessen the likelihood that a file with the same name already exists.

#!/bin/bash

# use the first argument as the count, or prompt for one
if [ $# -eq 0 ]; then
    echo -n "number of records to generate> "
    read -r num
else
    num=$1
fi

rig -c "$num" > IDs.$$
echo "$num identity records are in the IDs.$$ file"
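A process ID makes a name collision unlikely, but not impossible. If that matters, mktemp can pick the file name instead. The following is a minimal sketch of that alternative, not a change to the script above, and new_ids_file is a name I made up for illustration.

```shell
#!/bin/bash
# Sketch: let mktemp choose an unused file name instead of relying on the PID.
# new_ids_file is a hypothetical helper name.
new_ids_file() {
    # the trailing X's are replaced with random characters, and the file is
    # created by mktemp itself, so an existing file is never overwritten
    mktemp IDs.XXXXXX
}

# Usage (assumes rig is installed and num holds the record count):
#   outfile=$(new_ids_file) && rig -c "$num" > "$outfile"
```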
You could also turn your rig commands into an easy bash alias:
alias genIDs="rig -c 1000 > IDs"
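An alias can’t take arguments, so the record count and file name above are fixed. If you’d like to vary them, a shell function is one option. This sketch uses a hypothetical function name, gen_ids, chosen so that it doesn’t clash with the alias, and it keeps the alias’s values as defaults.

```shell
#!/bin/bash
# Sketch: a shell function in place of the alias, so the record count and
# output file can be passed as arguments. gen_ids is a hypothetical name.
gen_ids() {
    local count=${1:-1000}    # default to 1000 records, as in the alias
    local outfile=${2:-IDs}   # default to the IDs file
    rig -c "$count" > "$outfile"
}

# Usage (assumes rig is installed):
#   gen_ids              # 1000 records in IDs
#   gen_ids 50 testIDs   # 50 records in testIDs
```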
Adding phone numbers
If you would prefer to see phone numbers in place of all those xxx-xxxx strings, a little more work will make that happen. You can create random fictitious phone numbers to go along with your fictitious identities. In this next script, I use a built-in bash variable called RANDOM, which yields a different integer between 0 and 32767 each time it is referenced, to create the digits that replace the xxx-xxxx strings that rig provides. Adding a fixed offset to a bounded random value ensures that we get numeric strings with exactly 3 and 4 digits.
The script generates the list of identities using the rig command and then runs back through the list to replace the xxx-xxxx strings with the generated phone numbers.
#!/bin/bash

# use the first argument as the count, or prompt for one
if [ $# -eq 0 ]; then
    echo -n "number of IDs to generate> "
    read -r num
else
    num=$1
fi

# remove any IDs file left over from an earlier run
rm -f IDs

rig -c "$num" > IDs.$$

while IFS= read -r line
do
    if [[ $line == *"xxx-xxxx" ]]; then
        # keep the area code, e.g. "(317)"
        areacode=$(echo "$line" | cut -c1-5)
        echo -n "$areacode " >> IDs
        # 100-999 and 1000-9999, so always 3 and 4 digits
        echo $((100 + RANDOM % 900))-$((1000 + RANDOM % 9000)) >> IDs
    else
        echo "$line" >> IDs
    fi
done < IDs.$$

# remove temp file
rm IDs.$$

echo "Your generated identities are in the IDs file"
In this second version of the gen_random_IDs script, the rig output is written to the IDs.$$ file, and the revised (final) identity records are written to the IDs file. Any file by that name that exists when the script is started is simply removed. You are, of course, welcome to change any of this behavior to adjust the script to your preferences.
Output from that last script will look like this. Keep in mind that the phone numbers are completely random and are unlikely to resemble real phone numbers in the cities shown, though the area codes are likely OK.

$ cat IDs
Silvia Frederick
163 Shalton Dr
Beloit, WI 53511
(608) 776-7085

Mildred Joyner
116 Spring County Blvd
Albany, NY 12212
(518) 491-5250
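The same replacement can also be done as a filter, which avoids the temporary file altogether. This is a sketch of that variation rather than a drop-in replacement for the script, and fill_phones is a name I made up for it.

```shell
#!/bin/bash
# Sketch: replace rig's xxx-xxxx placeholders while streaming, so no
# temporary file is needed. fill_phones is a hypothetical helper name.
fill_phones() {
    local line
    while IFS= read -r line; do
        if [[ $line == *"xxx-xxxx" ]]; then
            # strip the placeholder, then append 3 random digits (100-999)
            # and 4 random digits (1000-9999)
            printf '%s%d-%d\n' "${line%xxx-xxxx}" \
                $((100 + RANDOM % 900)) $((1000 + RANDOM % 9000))
        else
            printf '%s\n' "$line"
        fi
    done
}

# Usage (assumes rig is installed):
#   rig -c 100 | fill_phones > IDs
```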
Going international
The rig command gets the information that it provides from files in /usr/share/rig. If you want it to generate names and addresses that resemble those in another country, you might get away with replacing the content of these files. Your success, however, will probably depend on how closely the new addresses match the format of the current content. The rig command doesn’t seem to deal well with city names that have more than one word in them, like “San Francisco” or “New York”. It is also unlikely to deal well with area codes that have more than one component.
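Rather than overwriting the system files, you could experiment on a private copy. The rig man page that I’m familiar with lists a -d option for pointing the command at an alternate data directory, but treat that as an assumption and check the man page for your version. The copy_rig_data function and the ~/rig-data location in this sketch are hypothetical names.

```shell
#!/bin/bash
# Sketch: copy rig's data files to a private directory for editing.
# copy_rig_data and the default destination are hypothetical names.
copy_rig_data() {
    local src=${1:-/usr/share/rig}
    local dst=${2:-$HOME/rig-data}
    mkdir -p "$dst" && cp "$src"/*.idx "$dst"/
}

# Usage (assumes your rig supports -d datadir; check its man page):
#   copy_rig_data
#   ... edit the files in ~/rig-data ...
#   rig -d ~/rig-data -c 3
```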
Adding data
The data files that rig uses have as many as 1,000 entries for some of the fields. The counts on my system show:
$ cd /usr/share/rig
$ wc -l *
 1000 fnames.idx    <== 1,000 first names for women
 1000 lnames.idx    <== 1,000 last names
   61 locdata.idx   <== 61 cities and states
 1000 mnames.idx    <== 1,000 first names for men
   60 street.idx    <== 60 street names
 3121 total

That means it can generate as many as 2 million different names (2,000 first names times 1,000 last names). There’s no reason you can’t add more if you’re so inclined. Just follow the format of the existing entries.
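If you do add entries, the arithmetic behind that figure is easy to recheck. This sketch assumes one entry per line in the files listed above. The count_names function is a hypothetical helper of my own.

```shell
#!/bin/bash
# Sketch: count the possible full names from rig-style data files,
# assuming one entry per line. count_names is a hypothetical helper.
count_names() {
    local dir=${1:-/usr/share/rig}
    local f m l
    f=$(wc -l < "$dir/fnames.idx")   # first names for women
    m=$(wc -l < "$dir/mnames.idx")   # first names for men
    l=$(wc -l < "$dir/lnames.idx")   # last names
    # (female + male first names) x last names
    echo $(( (f + m) * l ))
}

# Usage:
#   count_names                # uses /usr/share/rig
#   count_names ~/my-rig-data  # any directory holding the same files
```

With the counts shown above, that works out to (1000 + 1000) x 1000 = 2,000,000.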
Copyright © 2021 IDG Communications, Inc.