Unicage Commands

The following is a list of all Unicage commands and a description of their function. You can jump straight to a function type by clicking on the links below.

Database | Date/Time | Formatting | Input/Output | Mathematical | String Functions | System Functions | Statistical
Database
ajoin1 Match on the longest key field
and Outputs the field values that two files share in common in the first field Outputs to stdout the values that the two files passed as arguments have in common in the first field. Both files passed as arguments must be sorted on the first field.
block_getlast Within each block of records that share the same value in their key field, outputs all of the records at the end of the block that share the same value in the reference field. Within a block of records whose <key> field shares the same value, outputs all of the records at the end of the block that share the same value in the <ref> field. <key> and <ref> can be specified as ranges (Example: 1/3 2@5) or using "NF". If <file> is not specified or specified as "-" then the command reads from standard input.
ccnt Counts the columns Counts the number of columns in the text file passed as an argument or from standard input. If the -d option is used, the number of columns in the text string <string> is counted instead.
cjoin0 Sequential key matching select Selects only the rows in <tran> where the <n>th field in <tran> matches the first field in <master> (sorted list of unique values) and outputs the matching rows to standard output. If the "+ng" option is used then matching rows are sent to standard output and non-matching rows are sent to the file descriptor specified in <fd>. If <fd> is omitted, non-matching rows are output to standard error. The <n>th field of <tran> does not need to be sorted. If "-" is specified for <master> then the command expects input on standard input. If "-" is specified for <tran> or if <tran> is omitted then the command expects input on standard input. <key> can be specified as: key=3 The third field in the transaction file key=3/5 The third, fourth and fifth field in the transaction file (the first, second and third field in the master file) key=4@3 The fourth and third field in the transaction file (the second and first field in the master file) key=NF Last field in the transaction file (first field in the master file) key=NF-3 Third to last field in the transaction file (first field in the master file) In all cases, the master must be sorted for the specified key. (The transaction file need not be sorted.) cjoin0 reads the entire <master> file into memory, so if <master> is extremely large a memory allocation error can occur. Conversely, when <master> is relatively small and <tran> is extremely large, cjoin0 is the most efficient choice.
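For illustration, a minimal cjoin0 sketch based on the description above (data values are made up and the output spacing is illustrative):
$ cat master
0001
0003
$ cat tran
0002 Osaka
0001 Tokyo
0003 Nagoya
$ cjoin0 key=1 master tran
0001 Tokyo
0003 Nagoya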
cjoin1 Sequential key matching join Selects the rows in <tran> where the <n>th field in <tran> matches the first field in <master> (sorted list of unique values) and then inserts all the fields in <master> starting with the second field into the matching rows in <tran> beginning after the <n>th field. If the "+ng" option is used then matching rows are sent to standard output and non-matching rows are sent to the file descriptor specified in <fd>. If <fd> is omitted, non-matching rows are output to standard error. The <n>th field of <tran> does not need to be sorted. If "-" is specified for <master> then the command expects input on standard input. If "-" is specified for <tran> or if <tran> is omitted then the command expects input on standard input. <key> can be specified as: key=3 The third field in the transaction file key=3/5 The third, fourth and fifth field in the transaction file (the first, second and third field in the master file) key=4@3 The fourth and third field in the transaction file (the second and first field in the master file) key=NF Last field in the transaction file (first field in the master file) key=NF-3 Third to last field in the transaction file (first field in the master file) In all cases, the master must be sorted for the specified key. (The transaction file need not be sorted.) cjoin1 reads the entire <master> file into memory, so if <master> is extremely large a memory allocation error can occur.
cjoin2 Sequential key matching join Selects the rows in <tran> where the <n>th field in <tran> matches the first field in <master> (sorted list of unique values) and then inserts all the fields in <master> starting with the second field into the matching rows in <tran> beginning after the <n>th field. For non-matching rows in <tran>, dummy data is inserted for the number of fields specified in "key". The dummy data is taken from the +<string> option. If this option is omitted, then the underscore character ("_") is used. If "-" is specified for <master> then the command expects input on standard input. If <master> is an empty file (0 bytes) an error is returned. If "-" is specified for <tran> or if <tran> is omitted then the command expects input on standard input. <key> can be specified as: key=3 The third field in the transaction file key=3/5 The third, fourth and fifth field in the transaction file (the first, second and third field in the master file) key=4@3 The fourth and third field in the transaction file (the second and first field in the master file) key=NF Last field in the transaction file (first field in the master file) key=NF-3 Third to last field in the transaction file (first field in the master file) In all cases, the master must be sorted for the specified key. (The transaction file need not be sorted.) cjoin2 reads the entire <master> file into memory, so if <master> is extremely large a memory allocation error can occur.
count Count the number of rows that contain the same key This tool outputs the number of rows (records) where the key field is the same value as the value specified in the file passed as an argument or via standard input. The key fields are specified using <k1> as the first field and <k2> as the last field.
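A hedged count sketch based on the description above, assuming the positional form count <k1> <k2> <file> and that the count is appended after the key (the exact output layout may differ):
$ cat data
A 1
A 2
B 5
$ count 1 1 data
A 2
B 1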
csum Returns a cumulative sum Adds the specified field (val=) from each row of the specified file (or standard input) and inserts the cumulative sum of all previous rows as a new field immediately to the right of the summed field. This is used to return a cumulative sum.
delf Remove the specified field delf=Delete Field Removes the specified field from <file>. (The opposite of "self".) If the file name is omitted or "-" then the command expects input on standard input.
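For illustration, a hedged delf sketch assuming the field is given by position (data values are made up):
$ cat data
0001 Tokyo 100
0002 Osaka 200
$ delf 2 data
0001 100
0002 200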
delkey Deletes the key field from consecutive records. Using field 1 to field <n> as the key, convert the key field in consecutive records of <file> that share the same key to spaces, starting with the second record sharing the same key. You can specify a different padding character with the -d option. The -i option specifies a character to replace the key field.
dmerge Merges files based on a key Merges the specified files <file1> <file2>… based on the key field specified as key=<key>. Each file must already be sorted on the specified field. Examples of Specifying Keys: key=3 : Third field is the key key=1/4 : Fields 1 through 4 are the key key=3@1@2 : Fields 3, 1 and 2 are the key key=NF : The last field is the key key=1n : The first field is the numeric key key=3n : The third field is the numeric key
fldcmp Compares files field-by-field Compares <file1> to <file2> field-by-field.
fsed Replaces characters in a field Converts string <org> to string <dst> within field <n> in specified file <file>. The -e option allows you to use regular expressions. You can specify multiple search/replace pairs.
getfirst Outputs the first row with a matching key. If there are multiple rows within the file passed as an argument or standard input where the specified key field contains the same value, only the first row is output.
getlast Outputs the last row with a matching key. If there are multiple rows within the file passed as an argument or standard input where the value matches the specified key field, only the last row is output.
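A hedged getlast sketch, assuming the positional form getlast <k1> <k2> <file> with records grouped by key (values are illustrative):
$ cat data
0001 10
0001 20
0002 30
$ getlast 1 1 data
0001 20
0002 30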
getno Return a sequential number Outputs to standard output <num> sequential numbers starting with the number stored in <file>. The final number output the last time the command was run is stored in <file>, and this is overwritten with the last number generated each time the command is run. (<file> is locked while the command is running.) The numbers that are output will be of fixed length based on the number of digits in the value recorded in <file> (zeroes will be prepended as necessary). When the largest number possible for the given number of digits has been output, no more numbers are output and the command exits with an error. Whenever the command exits with an error, the value of the last number generated as recorded in <file> is reset to 1. * If option [-s] is specified, when the largest possible number is reached, all of the digits in the number in <file> are reset to zero (loop) and sequential numbers continue to be output. * If the option [-i] is specified (requires option [-s]), when the largest possible number is reached, the number in <file> is set to <ini> and sequential numbers continue to be output (Sets starting value when looping). The number of digits when looping remains the same even if the value of <ini> has less digits than the number in <file>. If the number of digits in <ini> is larger than that in <file> then the command exits with an error.
hsum Horizontal sum. This tool sums up all of the fields in each record of <file> and adds the sum as a field to the end of each record. It sums all fields starting with field "num=<n>" to the last field in the record, and adds the sum as a field at the end of the record. If "num=<n>" is omitted, "num=0" is assumed.
interlace Merges records alternating rows Merges records from <file1> <file2> … one record at a time in round robin style. If the end of any file has been reached, that file is skipped on subsequent rounds and the merge continues until all rows in all files have been merged.
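For illustration, a minimal interlace sketch based on the description above:
$ cat file1
a1
a2
a3
$ cat file2
b1
b2
$ interlace file1 file2
a1
b1
a2
b2
a3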
join0 Selects the lines from Master with a matching key field Selects all lines from <tran> where the <n>th field matches the first field (key field) of <master>. The first field of <master> and the <n>th field of <tran> MUST be sorted in ascending order. Also, the key field (first field) in <master> must only contain unique values (the same value cannot be repeated in the first field). <tran> does not have this requirement; it can have the same value repeated in the key field (<n>th field). You can specify multiple key fields.
join1 Joins a master file to a transaction file. (Only rows with matching key fields are selected) Only those rows in the text file "tran" where the key field "key=<n>" of "tran" matches the first field (key field) of "master" are selected, then joined with the fields in "master" and output. The join occurs by adding the fields from "master" immediately after the key field in "tran". It is a requirement for the first field of "master" and the <n>th field of "tran" to be sorted in ascending order. Also, the key field (first field) in "master" must consist entirely of unique values. (You cannot have two rows in "master" with the same value in the first field.) This restriction does not apply to "tran", which can have unlimited rows that share the same value in the key field (<n>th field). You can specify multiple key fields.
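A minimal join1 sketch based on the description above (both files sorted on their key fields; values are made up and output spacing is illustrative):
$ cat master
0001 Tokyo
0002 Osaka
$ cat tran
0001 100
0001 300
0002 200
$ join1 key=1 master tran
0001 Tokyo 100
0001 Tokyo 300
0002 Osaka 200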
join1x Joins two files where more than one master record can share the same key value. This tool joins records from the transaction file <tran> and the master file <master> where the key field of <tran> has the same value as the first field of <master>. However, the difference between this command and join1 is that more than one record in Master can share the same key value. (join1, by contrast, requires the key fields in Master to be unique.) The files are joined for every combination of Master and Transaction records where the key fields have the same value. Both files must be sorted ("master" on the first field and "tran" on the key field).
join2 Joins a master file to a transaction file. (For rows that match, joins data from master; for rows that do not match, joins dummy data.) Only those rows in the text file "tran" where the key field "key=<n>" of "tran" matches the first field (key field) of "master" are selected, then joined with the fields in "master" and output. For rows that do not match, dummy data "_" is joined for the number of fields in master. It is also possible to specify different dummy data. It is a requirement for the first field of "master" and the <n>th field of "tran" to be sorted in ascending order. Also, the key field (first field) in "master" must consist entirely of unique values. (You cannot have two rows in "master" with the same value in the first field.) This restriction does not apply to "tran", which can have unlimited rows that share the same value in the key field (<n>th field). If <master> is an empty file (0 bytes) an error is generated. It is possible to specify multiple key fields.
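For illustration, a hedged join2 sketch showing the default dummy value (values are made up):
$ cat master
0001 Tokyo
0003 Nagoya
$ cat tran
0001 100
0002 200
0003 300
$ join2 key=1 master tran
0001 Tokyo 100
0002 _ 200
0003 Nagoya 300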
join2x Joins two files where more than one master record can share the same key value. This tool selects records from the text file <tran> where the key field of <tran> ("key=<n>") matches the first field (key field) of <master>, then joins those records with the fields in <master> and outputs those records. However, the difference between this command and join2 is that more than one record in Master can share the same key value. (join2, by contrast, requires the key fields in Master to be unique.) The files are joined for every combination of Master and Transaction records where the key fields have the same value. Both files must be sorted ("master" on the first field and "tran" on the key field). If <master> is an empty file (0 bytes) an error occurs.
joinx Perform a complete join with all possible combinations. Joins the records of two files in all possible combinations. The join is performed in the order the arguments are specified. In other words, each record in <file1> is joined with every record in <file2> in order.
keycut Split a file based on a key field (Key must be sorted) Reads in <file> and splits it into multiple files where the key field specified in <filename> has the same values. For example, if you want to split the file into multiple files where the 2nd field contains the same value, specify the filename as "data.%2". The names of the output files will be data.(2nd field value). The key field must be sorted to use keycut. (Files are output when the value of the field changes.) The key field specified in <filename> is written as "%(field number)", but you can also specify substrings such as "%5.2" or "%5.1.3".
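A hedged keycut sketch based on the description above (the input is sorted on field 2, the key; records are assumed to be written out unchanged):
$ cat data
0001 Osaka 100
0002 Tokyo 200
0003 Tokyo 300
$ keycut data.%2 data
$ cat data.Tokyo
0002 Tokyo 200
0003 Tokyo 300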
lineup Returns the unique data in the specified field in ascending order Reads the data in the specified field of <file> and outputs it as a "lineup" in ascending order. If <file> is omitted or specified as "-" the command reads from standard input.
loopj Join all records from multiple files. loopj = loop of join3 The specified text files are joined together using fields 1 through <n> as the key field. If a file has no record matching a given key, its fields are padded with "0" in the output. The files to be joined must be greater than zero bytes in size, and the key field must contain only unique values sorted in ascending order. If <file> is a zero byte file, an error is generated.
loopx Join by all possible combinations. Joins each record in multiple files by creating rows containing all possible combinations of all records. The output is in the order that the files were specified. In other words, all records in <file1> are combined with all records in <file2> and the resulting records are combined with <file3> and so on.
msort In-Memory Sort Sorts a file on key=<pos>. <pos> designates the field position. msort key=2 file msort key=2/5 file msort key=3@1@2 file There is no limit on the length of the key field or on the number of key fields. The key field can also contain multi-byte characters such as Japanese. If you specify "n" after the field position, that field’s values will be sorted as numbers. If you specify "N" after the field, the values will be sorted in descending order as numbers. If you specify "n" or "N" before or after the "/", you must use the same specification for all fields. msort key=2n/5n file OK msort key=2n/5N file Error msort key=2n/5 file Error If the file name is omitted or if it is specified as "-" then the command will read from standard input. If you specify the -p<n> option, the command will run in <n> parallel processes. There is no limit to the number of parallel processes you can specify, but if you specify more processes than there are cores available on your CPU then performance will be degraded. If you are sorting less than 10,000 lines, even if you specify the -p<n> option it will only run in one process.
order Returns the order of the record within a block of records that share the same key value. Inserts a number at the beginning of each record representing the record’s position within a block of records that share the same key field. The key field is defined as fields <k1> through <k2>. If you omit <k1> and <k2>, all records are considered to share the same key value. In this case the result is to add a line number to the beginning of all records. The -h option causes each key field’s order to be displayed hierarchically. If <file> is not specified or specified as "-" then the command reads from standard input.
perlsql Executes an SQL command Executes <sql_command> and, if it is a SELECT statement, outputs the results to standard output. <data_source> : Data Source Ex) DBI:mysql:test:localhost:3306 DBI <- If using DBI, specify "DBI" mysql <- DBD driver name Parameters following the DBD driver name depend on the particular driver For MySQL: test <- Database Name localhost <- Server Name (optional) 3306 <- DB Port Number (optional) <username> : Specify a user who has access privileges to the database <password> : User’s password <sql_command> : SQL command to execute To use DBI you must have the Perl DBI module installed. Also, you must have the DBD driver corresponding to your database server.
psort Partial Sort Sorts the <key> field for all records that have the same value in the <ref> field. It sorts in small chunks so it is much faster than sorting the entire file. If <file> is not specified or specified as "-" then the command reads from standard input.
rank Add a rank Adds line numbers to "file". If a reference key field is specified with "ref=<ref>", then the line numbers are restarted at 1 whenever the value in the reference key field changes. If a value field is specified with "key=<key>" then the same line number is assigned when the value is the same.
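For illustration, a hedged rank sketch with ref= (the rank is shown prepended to each record; values are made up):
$ cat data
A 10
A 20
B 30
$ rank ref=1 data
1 A 10
2 A 20
1 B 30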
self Outputs the data from the specified field self=select field Outputs the data from the specified field in <file>. If <file> is not specified or is "-" then the command reads from standard input.
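A minimal self sketch based on the description above, assuming fields are selected by position:
$ cat data
0001 Tokyo 100
0002 Osaka 200
$ self 1 3 data
0001 100
0002 200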
selr Selects rows that match exactly Displays rows from <file> (or stdin) where the specified field <field> matches the specified string <str> exactly.
sm2 Sum values by key Sums each field of records that have the same key within <file>. The range of key fields begins with "k1" and ends with "k2", and every field in the range from "s1" to "s2" is summed. Records with the same key are summed in a single line and output. Fields not specified as key or sum fields are not output.
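A hedged sm2 sketch, assuming the positional form sm2 <k1> <k2> <s1> <s2> <file> (not spelled out above) and input sorted on the key field:
$ cat data
A 1 10
A 2 20
B 3 30
$ sm2 1 1 2 3 data
A 3 30
B 3 30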
sm3 Sum values by key (No Sort) Sums each field of records that have the same key within <file>. There are two types of sum keys. The fields starting with "k1" up to and including "k2" are the sorted fields, and the fields starting with "n1" up to and including "n2" do not need to be sorted. However, the fields must obey this relationship: k1<=k2<n1<=n2 The fields beginning with "f1" up to and including "f2" will be summed. Records with the same key are summed in a single line and output. Fields not specified as key or sum fields are not output.
sm4 Inserts subtotal lines. Inserts a "total line" for records in <file> that have the same key. Fields from <k1> to <k2> are key fields while fields from <d1> to <d2> are dummy fields (they are not keys nor are they summed). Fields from <s1> to <s2> are the summed fields. These fields are summed for rows with the same key field and a total line is inserted after the last line with the same key. A "@" character is inserted in the dummy fields of the total row. If you use sm4 repeatedly to insert subtotals and sub-subtotals then any record where there is at least one field equal to "@" will be excluded. (Records that include "@" such as "usp@usp-lab.com" are not excluded.) The "+h" option treats the first row as a header row and excludes it from the sum. Fields must be specified following these rules: k1=1 k1<=k2 d1=k2+1 d1<=d2 s1=d2+1 s1<=s2 s2=NF(last field) If you specify "x" for d1 or d2, then these rules must be followed: k1=1 k1<=k2 s1=k2+1 s1<=s2 s2=NF(last field)
sm5 Output the grand total This tool adds a grand total line (total of all lines) to <file>. The range of key fields begins with "k1" and ends with "k2", and every field in the range from "f1" to "f2" are summed fields. A record is added at the end of the list containing the grand totals of all summed fields that are not key fields. The fields in the grand total row that are key fields are padded with "@". When a file that has subtotal lines from the sm4 command is processed, the subtotal lines that were added with sm4 are ignored when calculating the grand total.
sorter Split a file based on a key (The file doesn’t need to be sorted on the key) Reads in <file> and then writes records that share the same value in the key field to separate files using <filename>. For example, to write records with the same value in field 2 to separate files, specify "data.%2" as <filename>. The output files will be named data.<value of field 2>. Unlike keycut, sorter does not require the input file to be sorted on the key field. Also, records are written to each output file in the order they appear in the input file, so the output files are not sorted. The key field is specified in <filename> using %<field number>, but you can also specify substrings such as %5.2 or %5.1.3.
tagcjoin0 Outputs tag data that exists in the tag master. Records that don’t exist are discarded. Matches the tag field specified as key=<tag> in tag-formatted file <master> with the tag-formatted file <tran> and outputs the matching rows in tag format to standard output. Non-matching rows are discarded, however the +ng option allows you to output them in tag format to standard error. If <master> is specified as "-" then it is read from standard input. If <tran> is omitted or specified as "-" then it is read from standard input. If either <master> or <tran> is a zero-byte file, an error occurs. If <master> or <tran> is a null file, the command processes normally. The <tag> file can be a combination of any of the four formats below: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format) 4. NF-4 (NF refers to the last field in <tran>)
tagcjoin1 Joins together tag data that exists in the tag master. Records that do not exist are discarded. Matches the tag field specified by key=<tag> in tag-formatted file <master> with tag-formatted file <tran> and joins together the matching rows then outputs them to standard output. Non-matching rows are discarded, however the +ng option allows you to output them in tag format to standard error. If <master> is specified as "-" then it is read from standard input. If <tran> is omitted or specified as "-" then it is read from standard input. If either <master> or <tran> is a zero-byte file, an error occurs. If <master> or <tran> is a null file, the command processes normally. The <tag> file can be a combination of any of the four formats below: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format) 4. NF-4 (NF refers to the last field in <tran>)
tagcjoin2 Joins together tag data that exists in the tag master. Records that do not exist are padded. Matches the tag field specified by key=<tag> in tag-formatted file <master> with tag-formatted file <tran> and joins together the matching rows then outputs them to standard output. Non-matching rows are padded with dummy data ("*"), however the +<string> option allows you to specify <string> as the dummy data which will replace each non-matching field. If <master> is specified as "-" then it is read from standard input. If <tran> is omitted or specified as "-" then it is read from standard input. If either <master> or <tran> is a zero-byte file, an error occurs. If <master> or <tran> is a null file, the command processes normally. The <tag> file can be a combination of any of the four formats below: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format) 4. NF-4 (NF refers to the last field in <tran>)
tagcount Count the number of rows that contain the same key This tool outputs the number of rows (records) where the key field is the same value as the value specified in the file passed as an argument or via standard input. The key fields are specified using <k1> as the first field and <k2> as the last field.
tagdelf Remove the specified field from a tag file and output the result Outputs <tagfile> with the fields specified as <tag1> <tag2> .. removed. The output is in tag file format. If <tagfile> is not specified or specified as "-" then the command reads from standard input.
taggetfirst Outputs the first row with a matching key. If there are multiple rows within the file passed as an argument or standard input where the key fields specified as <tag1> to <tag2> contain the same value, only the first row is output.
taggetlast Outputs the last row with a matching key. If there are multiple rows within the file passed as an argument or standard input where the value matches the key fields specified by <tag1> to <tag2>, only the last row is output.
tagjoin0 Outputs tag data whose tags exist in the Tag Master Records that don’t exist are discarded. Matches the tag fields specified by key=<tag> in the tag-formatted file <master> with the tag-formatted file <tran> and outputs the matching rows in tag format to standard output. Non-matching rows are discarded, however if you specify the +ng option then these rows are output in tag format to standard error. If you specify "-" as <master> then it will be read from standard input. If you omit <tran> or specify "-" then it will be read from standard input. If either <master> or <tran> are zero byte files, an error occurs. If either <master> or <tran> are null files (only contain the tag row) the command processes normally. The <tag> file can be in any of the following three formats: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format)
tagjoin1 Joins and outputs tag data whose tags exist in the Tag Master. Records that don’t exist are discarded. Matches the tag fields specified by key=<tag> in the tag-formatted file <master> with the tag-formatted file <tran>, joins the matching rows in tag format and outputs them to standard output. Non-matching rows are discarded, however if you specify the +ng option then these rows are output in tag format to standard error. If you specify "-" as <master> then it will be read from standard input. If you omit <tran> or specify "-" then it will be read from standard input. If either <master> or <tran> are zero byte files, an error occurs. If either <master> or <tran> are null files (only contain the tag row) the command processes normally. <tag> can be in any of the following three formats: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format)
tagjoin1x Joins together tag data that exists in the tag master in all possible combinations. Records that do not exist are discarded. Matches the key field as specified by key=<tag> in tag-formatted file <master> with tag-formatted file <tran>, joins matching lines together in tag format and outputs them to standard output. The difference between this command and tagjoin1 is that more than one record in <master> can have a key field with the same value. If more than one record with the same key field value exists, then the records are joined in all possible combinations. Non-matching rows are discarded, however if you specify the +ng option then these rows are output in tag format to standard error. If you specify "-" as <master> then it will be read from standard input. If you omit <tran> or specify "-" then it will be read from standard input. If either <master> or <tran> are zero byte files, an error occurs. If either <master> or <tran> are null files (only contain the tag row) the command processes normally. <tag> can be in any of the following three formats: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format)
tagjoin2 Joins and outputs tag data whose tags exist in the Tag Master. Records that don’t exist are joined with dummy data. Matches the tag fields specified by key=<tag> in the tag-formatted file <master> with the tag-formatted file <tran>, joins the matching rows in tag format and outputs them to standard output. Non-matching rows are padded with "*". The padding character can be changed with the -d option. If you specify "-" as <master> then it will be read from standard input. If you omit <tran> or specify "-" then it will be read from standard input. If either <master> or <tran> are zero byte files, an error occurs. If either <master> or <tran> are null files (only contain the tag row) the command processes normally. The <tag> file can be in any of the following three formats: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format)
tagjoin2x Joins together tag data that exists in the tag master in all possible combinations. Records that do not exist are padded with dummy data. Matches the key field as specified by key=<tag> in tag-formatted file <master> with tag-formatted file <tran>, joins matching lines together in tag format and outputs them to standard output. The difference between this command and tagjoin2 is that more than one record in <master> can have a key field with the same value. If more than one record with the same key field value exists, then the records are joined in all possible combinations. Non-matching rows are padded with "*". The padding character can be changed with the -d option. If you specify "-" as <master> then it will be read from standard input. If you omit <tran> or specify "-" then it will be read from standard input. If either <master> or <tran> are zero byte files, an error occurs. If either <master> or <tran> are null files (only contain the tag row) the command processes normally. <tag> can be in any of the following three formats: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format)
tagjoinx Joins tag data in all possible combinations Joins all records in tag-formatted files <file1> and <file2> in all possible combinations.
taglineup Returns the unique data in the specified field in ascending order Reads the data in the specified field of <file> and outputs unique values as a "lineup" in ascending order. If <file> is omitted or specified as "-" the command reads from standard input.
tagloopj loopj command for tag files The specified tag files <tagfile1> <tagfile2> … are processed with loopj. The output is also tag formatted. If <tagfile> is not specified or specified as "-" then it is read from standard input. The -d option allows you to change the default dummy character ("0"). If <file> is a zero byte file, an error is generated. If <file> is a null file (contains only a tag line) the command processes normally. The <tag> file can be in any of the following three formats: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format)
tagloopx Join by all possible combinations. Joins each record in multiple files by creating rows containing all possible combinations of all records. The output is in the order that the files were specified. In other words, all records in <file1> are combined with all records in <file2> and the resulting records are combined with <file3> and so on.
tagmsort In-Memory Sort for tag files Sorts a file on key=<pos>. <pos> designates the field position as a tag name. tagmsort key=TAG2 file tagmsort key=TAG2/TAG5 file tagmsort key=TAG3@TAG1@TAG2 file There is no limit on the length of the key field or on the number of key fields. The key field can also contain multi-byte characters such as Japanese. If you specify ":n" after the field position, that field’s values will be sorted as numbers. If you specify ":N" after the field, the values will be sorted in descending order as numbers. If you specify ":n" or ":N" before or after the "/", you must use the same specification for all fields. tagmsort key=TAG2:n/TAG5:n file OK tagmsort key=TAG2:n/TAG5:N file Error tagmsort key=TAG2:n/TAG5 file Error If the file name is omitted or if it is specified as "-" then the command will read from standard input. If you specify the -p<n> option, the command will run in <n> parallel processes. There is no limit to the number of parallel processes you can specify, but if you specify more processes than there are cores available on your CPU then performance will be degraded. If you are sorting less than 10,000 lines, even if you specify the -p<n> option it will only run in one process.
tagpsort Partial Sort for tag files Sorts the <key> field for all records that have the same value in the <ref> field. It sorts in small chunks so it is much faster than sorting the entire file. If <file> is not specified or specified as "-" then the command reads from standard input.
tagself Outputs data from the specified field of a tag file. Extracts data from the specified field of tag-formatted file <tagfile> and outputs it. If <tagfile> is not specified or specified as "-" then the command reads from standard input. If you specify a tag name that doesn’t exist, an error occurs. However, if you specify the --ngthrough option, then a field is created for that tag and the data is padded with "_". Use the -d option to change the padding character to another character. The output is a tag-formatted file.
tagsm2 Sum values in a tag file by key Sums each field of records that have the same key value within <file>. The range of key fields begins with "k1" and ends with "k2", and every field in the range from "s1" to "s2" is summed. Records with the same key are summed in a single line and output. Fields not specified as key or sum fields are not output.
tagsm3 Sum values by key (No Sort) Sums each field of records that have the same key within <file>. There are two types of sum keys. The fields starting with "k1" up to and including "k2" are the sorted fields, and the fields starting with "n1" up to and including "n2" do not need to be sorted. However, the fields must obey this relationship: k1<=k2<n1<=n2 The fields beginning with "s1" up to and including "s2" will be summed. Records with the same key are summed in a single line and output. Fields not specified as key or sum fields are not output.
tagsm4 Inserts subtotal lines. Inserts a "total line" for records in <file> that have the same key. Tags from <k1> to <k2> are key fields while tags from <d1> to <d2> are dummy fields (they are not keys nor are they summed). Tags from <s1> to <s2> are the summed fields. These fields are summed for rows with the same key field and a "total line" is inserted after the last line with the same key. If you do not specify <s1> and <s2> then all fields starting with the field following field <d2> are summed. A "@" character is inserted in the dummy fields of the total row. If you use tagsm4 repeatedly to insert subtotals and sub-subtotals then any record where there is at least one field equal to dummy value "@" will be excluded.
tagsm5 Output the grand total This tool adds a grand total line (total of all lines) to <file>. The range of fields from tag <d1> to tag <d2> are dummy fields, and every field in the range from tag <s1> to tag <s2> is a summed field. A record is added at the end of the list containing the grand totals of all summed fields. The dummy fields in the grand total record are padded with "@". When a file that has subtotal lines from the tagsm4 command is processed, the subtotal lines that were added with tagsm4 are ignored when calculating the grand total.
tagupl upl for tag-formatted data Compares the specified tag field <key> in tag-formatted file <master_file> with the tag field in tag-formatted file <tran_file>, and if the key values are the same the record in <master_file> is replaced by the record in <tran_file>. If the key tag field only exists in one of the files <master_file> or <tran_file>, then records from both files are output. If <master_file> is specified as "-" then the command reads from standard input. If <tran_file> is not specified or specified as "-" then the command reads from standard input. If either <master_file> or <tran_file> are zero byte files, an error occurs. If either <master_file> or <tran_file> are null files (only have a tag row) then the command processes normally. <tag> can be a combination of any of the four formats below: 1. TAG1 (Normal Format) 2. TAG1/TAG4 (Range Format: Tags must be contiguous) 3. TAG3@TAG1 (Individual Format) 4. NF-4 (NF refers to the last field in <tran_file>)
underlay Overlay fields one record at a time. Using the first <n> fields of <file> as key fields, overlays each field in order in records that contain the same key. The "@" character is the NULL character by default; fields that contain this value are not overlaid. If <file> is not specified or specified as "-" then the command reads from standard input. Use option -d to change the NULL character.
up3 Merges two files on the same key field The key field specified as key=<key> of each record of the transaction file passed as argument <tran> (or standard input) is compared to the same field of the master file <master>, and where the keys match the records from the transaction file are inserted below the matching record in the master file. The key fields in <master> and <tran> do not need to be sorted.
upl Merges two files on the same key field and then extracts the final line of a group of lines that share the same key value. Selects all rows of file <tran_file> (or standard input) where the key field specified as key=<key> matches the same key field in <master_file>, then merges those rows below the row in <master_file> and finally extracts the last record in the group of rows that share the same key value. Both <master_file> and <tran_file> must be sorted. * Performs the same function as up3 + getlast.
ychange Creates a record showing changes from the previous record Creates "change records" in the form "before" "after" based on the contents of the history file <file> using fields 1-<n> as the key. If <file> is not specified or specified as "-" then the command reads from standard input.
Date/Time
calclock Calculates date and time (calendar clock) This tool takes a date string passed as an argument or a specified field in a file read in from standard input, and replaces it or returns it as the number of seconds since 01/01/1900. Used to convert dates or times to decimal in order to perform calculations on them. The original date string must be in one of the following three formats: YYYYMMDD (8 digits), YYYYMMDDhhmm (12 digits) or YYYYMMDDhhmmss (14 digits). The converted value can be further converted to minutes by dividing by 60 or to days by dividing by 86400.
dayslash Conversion filter for date/time formatting
isdate Checks an 8-digit date Checks if the 8-digit date passed as <date> is a valid date. If it is, the command exits normally, otherwise the command exits with an error (status 1).
mdate Performs date calculations
time-excel Convert a Date and/or time to Excel date/time format Converts a date and/or time (YYYYMMDD, YYYYMMDDHHMMSS, HHMMSS) to Excel date/time format "Integer.Decimal". Use the -r option to convert in the reverse direction. The fields must be specified in ascending order. You can also specify consecutive fields such as "1/3".
Formatting
1x Remove leading zeroes Removes the leading zeroes from the specified field or specified string within the specified file. (Also removes trailing zeroes after the decimal point.)
bb BashBeautifier A filter that beautifies Bash code. Performs the following: adds indents and lines up all of the pipes. Use it as a filter in vi; specifically, running :%!bb in vi will execute the filter.
bedit Binary Editor bedit dumps the specified file in hexadecimal code, then after you have edited it in vi, it converts it back to text.
calc Insert data or formula immediately after the specified field. wrapper for "awk" command Inserts data or formula specified as an <awk_exp> (following the grammar of the awk command) immediately after the <f>th field in <file>, then executes awk. Makes it easy to insert values in between fields that would be difficult to write with awk.
calsed sed "light" calsed is a "light" version of the "sed" string replacement function. Replaces the specified search string with the specified replacement string. Unlike "sed", you cannot use regular expressions. If a file name is not specified or if "-" is used, then calsed expects a file on standard input. If the replacement string is "@" then the string is converted to NULL. To actually convert to "@" use the -n option. The -s option specifies the character which is to be converted to a space character within the replacement string.
cap Converts roman alphabet letters to upper case. Converts all roman alphabet letters in the specified fields to upper case. Characters that cannot be converted (non-roman, numbers, symbols, multi-byte) are not converted and output as-is.
charsplit Cuts a field into two fields of fixed length Splits the specified <field> into two fields of length <length>. The width is display width (single-byte character = 1). If a multi-byte character will be split in the middle then the field is split at the previous character. If you specify a length so that the field cannot be split, then one field is output as "_". The -b option uses byte width instead of display width.
comma Formats the specified field with commas every 3rd (or 4th) digit. Inserts commas after every third (or fourth) digit in the numeric text data specified in the argument or input on standard input. The field to be formatted is specified as an argument.
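A minimal comma sketch based on the description above (field 2 is the field to be formatted; data is read from standard input):
$ echo "A 1234567" | comma 2
A 1,234,567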
ctail Delete the last n rows of a file Outputs the file specified as the argument or standard input removing the last <n> rows ( n >= 0 ). If specified as -<n>c then the last <n> bytes are removed. The switch "-n" can be omitted. ctail -n 3 ctail -3 ctail 3 all behave identically.
distribute Writes standard input into separate files one line at a time. Writes the contents of standard input into the files specified one line at a time.
drawline Draws a line Inserts a horizontal line between rows (records) in the specified file. The line is drawn with "-" characters, the length of the line is the same as the length of the first row. If the file is omitted or specified as "-" this command will read from standard input.
$ cat data
01 Massachusetts 01 Boston 91 59 20 76 54
01 Massachusetts 02 Worcester 46 39 8 5 21
01 Massachusetts 03 Springfield 82 0 23 84 10
02 New_York 04 Manhattan 30 50 71 36 30
02 New_York 05 Brooklyn 78 13 44 28 51
02 New_York 06 Queens 58 71 20 10 6
02 New_York 07 Bronx 39 22 13 76 08
02 New_York 08 Albany 82 79 16 21 80
02 New_York 09 Buffalo 50 2 33 15 62
03 New_Jersey 10 Newark 52 91 44 9 0
03 New_Jersey 11 Trenton 60 89 33 18 6
03 New_Jersey 12 Moorestown 95 60 35 93 76
14 Texas 13 Philadelphia 92 56 83 96 75
14 Texas 14 Austin 30 12 32 44 19
14 Texas 15 Lancaster 48 66 23 71 24
14 Texas 16 Hershey 45 21 24 39 03
fcols Fix (align) the columns in a text file This tool aligns the column widths of all of the fields in the specified file or standard input. The tool can automatically calculate the column width of each field or you can specify the column widths manually.
filemrg Merge a file into a template Merge <file> into the lines within <template> that contain <label>. If there are multiple lines with <label> then multiple <files> will be merged.
formfix Fixes (makes read only) all forms in an HTML file Converts all input tags (text radio checkbox reset button submit), textarea tags and select tags in file <HTML> to read only. The class name class="readonly" can be changed using the --class option. The attributes added to each tag are as follows:
input type="text" -> readonly="readonly" class="readonly" tabindex="-1"
input type="radio" -> class="readonly" disabled="disabled"
input type="checkbox" -> class="readonly" disabled="disabled"
input type="image" -> disabled="disabled"
input type="reset" -> disabled="disabled"
input type="button" -> disabled="disabled"
input type="submit" -> disabled="disabled"
textarea -> readonly="readonly" class="readonly" tabindex="-1"
select -> disabled="disabled"
For <select> tags, the option with attribute selected="selected" will have a tag of the format <input type="hidden" name="name" value="value" /> inserted immediately after it. The same is done for <input type="radio" /> and <input type="checkbox" /> tags with attribute checked="checked". If multiple options are selected, multiple hidden tags are generated.
formmerge Merges characters into an HTML template Inserts the values from the <data> file (First field: tag name, Subsequent fields: values) into the input tags (text radio checkbox hidden), textarea tags and select tags of the <HTML_template> file.
fromcsv Filter Converts a CSV file to a space-delimited file. If the file name is not specified or is "-" then this command expects input on standard input.
hcat Concatenates files horizontally Horizontally concatenates the multiple files passed as arguments. The file format is not changed and the resulting file looks as if the files have been lined up horizontally.
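For illustration, a hedged hcat sketch (field spacing in the output is illustrative):
$ cat file1
a b
c d
$ cat file2
1 2
3 4
$ hcat file1 file2
a b 1 2
c d 3 4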
head2 Replacement for head command When reading a file from standard input, the head command exits immediately after reading the first <n> lines, however the head2 command does not exit but instead reads in the entire data (and ignores it). Because of this, when using pipes such as in the examples below: $ cat bigfile | head $ cat bigfile | head2 The first example causes the cat command to exit with an error (the head command exits before cat is finished, so the standard input becomes "jammed"). However the second example exits normally. This is the same whether or not you use the -n option. head2 -n 5 head2 -5 head2 5 The above all produce the same result.
lcat Repeated output of file (looping cat) The lcat command "cat"s the specified file <n> times.
linecut Splits a file into the specified number of lines Normal: Splits <file> into multiple files of <n> lines. The resulting files use the name <filename>. You can specify %05d in order to append a sequential index to the name of the file. The -m <max> option limits the number of resulting files to <max> files. All remaining rows are added to the final file. -f <infofile> option: <infofile> is a text file consisting of two fields as follows: 1:max number of files 2:file name When <file> is split into files of <n> number of rows, the command follows the settings in <infofile> and creates only max number of files. If there are extra rows remaining, these are added to the end of the last file. The linecut command outputs the names of the resulting files. If <file> is not specified or specified as "-" then the command reads from standard input.
linefeed Adds a linefeed to the end of a file. If <file> has no linefeed code at the end of the file, this command adds a linefeed to the end of the file. If <file> already has a linefeed at the end, this command does nothing. If <file> is not specified or specified as "-" then the command reads from standard input.
makec Converts row data into columns. Reorganizes data by creating separate records from a single record using as a key the first field up until the field specified as "num=<n>".
maker Converts columnar data into rows. Reorganizes data by combining separate records from multiple rows using as a key the first field up until the field specified as "num=<n>".
map Converts a list to a matrix with row and column headings Re-formats the specified file or standard input into a matrix consisting of <n> row key fields, <m> column key fields (m=1 when omitted) and the rest of the fields as data fields. When formatting as a matrix, any missing data is padded with zeroes. You can change the padding character with the -m option.
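For illustration, a hedged map sketch with one row key field (num=1); the corner cell of the heading row is shown as "*" purely for illustration, and missing data is padded with zeroes as described:
$ cat data
A X 1
A Y 2
B X 3
$ map num=1 data
* X Y
A 1 2
B 3 0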
name-tag Converts a file in name format to tag format name-tag converts a file in name format to tag format.
overlay Overlay one file on top of another. Overlay a text file "overfile" on top of a text file "basefile" and output as a single file. This command is often used to create reports by first moving the text of one file using the "txtmv" tool to open up space and then overlaying another file to make the report.
pad0 Add zeros to the beginning Pads the specified fields in the specified file or standard input with zeros to the specified number of digits.
sb_pipealign Aligns the pipes in a bash script sb_pipealign=shellscript beautifier for pipe alignment Reads all or part of a shell script from standard input and aligns all pipes. The pipe position is set automatically.
sb_rtrim Removes whitespace from the right of lines in a bash script sb_rtrim = shell beautifier for right side trimming Reads all or part of a shell script from standard input and removes all spaces (single-byte and multi-byte) and tabs from the right of lines.
strmerge Merge characters into a template Reads data from "data" or an environment variable and merges it into the file specified as "template". Has the following three usages: 1. Merge all fields in "data" in order (normal) 2. Merges entire records at a time from "data" into the labels specified in "template" (by row) 3. If "data" contains hierarchical data, the data is merged into the labels in "template" according to the structure. (hierarchical data)
tagawk A wrapper for awk that allows tag names to be used The tagawk command allows you to use tag names (%TagName) instead of field variables ($1, $2) in the awk script "<PATTERN>{<ACTION>}". When the tagawk command detects a tag-name variable (%TagName), it looks up the field position of the tag in <file> and converts the tag name variable to a field variable ($1, $2, etc.) before executing awk. Arguments starting with "-" immediately after tagawk are considered to be options. You can only specify one <file>. If <file> is not specified or specified as "-" then the command reads from standard input. If <file> is a zero-byte file, an error occurs. If <file> is a null file (tag line only) the command executes properly.
tagcat cat multiple tag files Runs cat on tag files <file1> <file2> … Output is a tag file. If the tags are different in each file, the sum of the tags becomes a new tag. In this case, the fields for tags that don’t exist in a file are padded with the underscore ("_"). You can change the padding character with the -d option. If <file> is not specified or specified as "-" then the command reads from standard input.
tagcomma Add commas to the specified field every three (or four) digits Inserts commas after every third (or fourth) digit in the numeric text data specified in the argument or input on standard input. The field to be formatted is specified as an argument.
tagcond A wrapper for awk that allows tag names to be used The tagcond command allows you to use tag names (%TagName) instead of field variables ($1, $2) in the awk script "<PATTERN>{<ACTION>}". tagcond ‘<PATTERN>’ <file> is equivalent to: awk ‘NR==1{print}NR>1&&(<PATTERN>){print}’ <file> When the tagcond command detects a tag-name variable (%TagName), it looks up the field position of the tag in <file> and converts the tag name variable to a field variable ($1, $2, etc.) before executing awk. Arguments starting with "-" immediately after tagcond are considered to be options. You can only specify one <file>. If <file> is not specified or specified as "-" then the command reads from standard input. If <file> is a zero-byte file, an error occurs. If <file> is a null file (tag line only) the command executes properly.
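For illustration, a minimal tagcond sketch grounded in the awk equivalence above (the tag line is the first row; data values are made up):
$ cat data
NAME AGE
Tanaka 30
Suzuki 25
$ tagcond '%AGE>=30' data
NAME AGE
Tanaka 30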
tagmap Converts a tag file list to a matrix with row and column headings Re-formats the specified file or standard input into a matrix consisting of <n> row key fields, <m> column key fields (m=1 when omitted) and the rest of the fields as data fields. When formatting as a matrix, any missing data is padded with zeroes. You can change the padding character with the -m option.
tagmerge Merge characters into a template Merges fields from <tagdata> into the specified labels within <template> one row at a time. When the pattern "###TagName###" appears in the label, the value from <tagdata> with the same tag name is inserted. If no label names are specified, all data is merged with <template>.
tagunmap Restore a file that was tagmapped to its original format 1. Reverts a file mapped with "tagmap num=<n>x<m>" back to its original form. 2. Reverts a file mapped with "tagmap +maker num=<n>x<m>" back to its original form.
tdconnect merges <td></td> blocks Horizontally merges <td> elements within a <tr> block surrounded by a <label>. The number of <td></td> elements merged is <n>. The values within the <td></td> blocks are checked from left to right in order, and if consecutive values are the same then the blocks are merged. The attribute colspan="<number merged>" is added to the <td> element and the <td> blocks that were removed are commented out. <th> elements are also merged in this way.
tocsv Converts a space delimited file to a CSV file Converts a space delimited file to a CSV file. The fields specified as f1, f2, f3…(f1<f2<f3) are considered "String 1" and are enclosed in quotes when converted. Fields not specified are considered "String 0" and are not enclosed in quotes. If the file name is omitted or specified as "-" then the command reads from standard input.
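A hedged tocsv sketch based on the description above (field 2 is specified, so only it is quoted):
$ cat data
0001 Tokyo 100
$ tocsv 2 data
0001,"Tokyo",100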
transpose Transposes rows and columns Transposes rows in <file> into columns. If <file> is omitted or specified as "-" then the command reads from standard input.
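A minimal transpose sketch based on the description above:
$ cat data
1 2 3
4 5 6
$ transpose data
1 4
2 5
3 6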
trconnect merges <tr> elements Vertically merges <td> elements within <tr> elements surrounded by <label>. For <tr> elements generated repeatedly with strmerge -l, the first <n> keys specified by num=<n> are checked, and if they have the same value then the table cell is hierarchically merged vertically. For example, if there are five key cells in a row with the same value, then the five cells are vertically merged. The attribute rowspan="5" is added to the first <td> element, and the remaining four <td> elements are commented out.
txtmv Moves the position of text. Moves an entire text file to a specified position. Arguments <x> and <y> are the horizontal and vertical start positions for the text to be moved. Position is specified as a number of characters. This command is used mostly together with the "overlay" command to generate reports by combining multiple formatted text files.
unmap Converts a mapped file back to its original form. 1. Reverts a file mapped with "map num=<n>x<m>" back to its original form. 2. Reverts a file mapped with "map +maker num=<n>x<m>" back to its original form.
unsco Remove leading and trailing underscores from fields Removes leading and trailing underscores from all fields in the file passed as the argument (or standard input).
vcat Concatenates files vertically Vertically concatenates multiple files specified as arguments.
wexcel Merge data into an Excel report template Merges data in field format into the specified position of the specified worksheet of an Excel file report template. The merged data can be string, date or numeric type. Cell formatting and borders, etc. must be set ahead of time in the template Excel file.
wexcelx Merge data into an Excel 2007-2010 report template (xlsx,xlsm) Pastes field-formatted data into the Excel worksheet template and creates an Excel file.
Input/Output
cgi-name Converts data received using the CGI POST method to "name" format. cgi-name converts data received using the web server CGI POST method to "name" format.
cgi-read Reads data that was passed using the CGI POST method cgi-read reads data that was passed from a web server using the CGI POST method.
cgi-tag Converts data passed using the CGI POST method to tag format. cgi-tag converts data passed using the CGI POST method to tag format.
check_attr_name Checks the attributes of data in name format Checks the data in <name_file> against the tag names, string length and attributes defined in <check_file>. If the attribute is specified in uppercase then the string length must match exactly; if the attribute is specified in lowercase then the string length must be less than or equal to the specified length. If the tag in the <name_file> is specified as TagName_Number, then the "_Number" part is removed before checking. If there is an error, the command exits with an error and the tag name, string length and attribute are output.
check_date_name Validates the date/time data in name format Checks to make sure the day/week/month data in <name_file> is correct according to the tag names and date patterns (D/W/M) in <check_file>. If there is an error, the command exits with an error and outputs the tag name and check pattern to standard output.
check_dble_name Checks for duplicate data in name format Conducts a duplicate check of the values in <name_file> based on the tag names specified in <check_file>. If there is an error, the command exits with an error and outputs the tag name to standard output.
check_inlist_name Checks data in name format to see if values are on a specified list of values. Checks if values in <name_file> appear in the list specified in <check_file>. If there is an error, the command exits with an error and outputs the tag name and list name to standard output.
check_need_name Checks if a particular value exists in data in name format Checks <name_file> to see if the tags in <check_file> have values. If the tag doesn’t exist or has no value set, the command exits with an error and outputs the tag name to standard output. The <check_file> can be formatted as TagName + Any string other than ("_").
email SMTP client (mail sending client)
htable Extracts strings from a table in an HTML file. Outputs the text strings from within the table tag inside the file specified as <html> or standard input (if the file name is omitted or "-"). Text is output as rows and columns. Any other tags within <td></td> (for example, <a></a>) are not output. </br> is converted to "\n". Whitespace and tabs are not output. Line breaks are converted to "_". If you use the -a option, then the href target xxxx within an <a> tag (for example, <a href="xxxx">) is also output. If you use the -i option, empty cells (<td></td>) will be replaced with <string>. By default empty cells are converted to "_".
mime-name Converts MIME format data to name format mime-name converts MIME format data to name format. Files that are not images are output normally (-i initializes null data, -d converts empty data), but for data with "Content-Type: image/XXX" the value of <name> is output as "/tmp/<name>.XXX" and the actual image file is written to "/tmp/<name>.XXX". For data with "Content-Type: application/vnd.ms-excel" the value of <name> is output as "/tmp/<name>.xls" and the actual file is written to "/tmp/<name>.xls". If you want to change the output path of these files, use the "--path" option to change the "/tmp/" path. If <file> is not specified or specified as "-" then the command reads from standard input.
mime-read Reads a MIME formatted file Searches for name="<name>" within each section of MIME-formatted file <MIME-file> and outputs the data for that part. If you specify -n <str> then when the data is null, <str> will be output. If you specify -s <char> then all spaces in the data will be converted to <char>. If you specify -v then the command searches for all name="…" and outputs a list of Names.
mnameread Outputs multiple name formatted files (in tag format) Given a <tagname_file> containing a list of tag names and a <filename_file> containing a list of name-format data files, this command extracts the values of the tags listed in <tagname_file> from each file listed in <filename_file> and outputs each line of data in tag format.
nameread Read a file that is in "name" format Reads a value from a file in "name" format by specifying the name. If <namefile> is omitted or specified as "-" then the command reads from standard input. The -l option outputs the name as well as the value. The -e option allows you to use a regular expression. The -s option specifies the character used to replace whitespace (if not specified, whitespace is deleted). The -n option initializes null data.
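A minimal sketch, assuming a name-format file holds one "name value" pair per line and that the name to read is passed as the first argument (the file name and contents are hypothetical):
    $ cat user.name
    NAME taro
    AGE 25
    $ nameread AGE user.name
    25
    $ nameread -l AGE user.name
    AGE 25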
rexcel Converts an Excel file to text Outputs the specified range of the specified sheet in an Excel file as a table. Empty cells are padded with "@", spaces are converted to "_" and linefeeds are converted to "\n".
rexcelx Converts an Excel file to text Outputs the specified range of the specified sheet of an Excel file as a text-based table. Empty cells are padded with "@" and spaces are converted to "_".
wpdf Create a PDF Processes a script file containing character, line and JPEG drawing commands and outputs a PDF file to standard output.
xdump Displays a hexadecimal dump of a file The xdump command displays a hexadecimal dump of the specified file.
Mathematical
check_cmp_name Compares data in name format Compares tag vs. tag or tag vs. value in <name_file> according to <argument>. If there is an error, the command exits with an error and outputs the tag name to standard output. <argument> is specified as "Val1 operator Val2". Val1 and Val2 can be tag names or values. The operator can be any of the following six: -EQ or -eq (=, equal), -NE or -ne (!=, not equal), -GE or -ge (>=, greater or equal), -GT or -gt (>, greater), -LE or -le (<=, less or equal), -LT or -lt (<, less).
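An illustrative sketch; the file, tag names and the exact way the comparison is passed (here as a single quoted "Val1 operator Val2" argument before the file) are assumptions, not confirmed syntax:
    $ cat order.name
    QTY 5
    LIMIT 10
    $ check_cmp_name 'QTY -le LIMIT' order.name && echo "within limit"
    within limit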
divk Divide the specified field by 1000 Divides the numeric text data passed as the argument or via standard input by 1000 and outputs the answer. Used when creating a report in units of 1000s. The fields to be divided by 1000 are specified sequentially as arguments. The resulting value is rounded to the nearest whole number. If you use the "-s" option the resulting value is not rounded and the result is output including any decimals.
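For example (file and values hypothetical; it is assumed that the field numbers are passed directly as arguments and that the divided value replaces the original field in the output):
    $ cat sales
    tokyo 1234567
    osaka 2345678
    $ divk 2 sales
    tokyo 1235
    osaka 2346
    $ divk -s 2 sales
    tokyo 1234.567
    osaka 2345.678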
lcalc Perform precision floating point calculation (18 integer digits + 18 decimal places) Performs a mathematical calculation as described in <script> on each field of the specified file or standard input. In <script>, you can specify equations, fields and field ranges by themselves or several separated by commas. The mathematical calculations are carried out with 36-digit precision (18 integer digits and 18 decimal places). Within the equations you can use constants, NF (total Number of Fields), NR (total Number of Records), binary operators (+ - * / %), unary operators (- $), rounding functions (round(), roundup(), rounddown()) and parentheses. The "$" operator represents the field that was input. Its operand is truncated to an integer, and the truncated result must be within the range 1 to NF. The rounding functions round(), roundup() and rounddown() round the first argument to the number of decimal places specified in the second argument, which must be an integer constant. If the second argument starts with zero, rounding is performed on the integer part; if it begins with anything other than zero, rounding is performed on the decimal part. Example: rounddown(123.456, 01) = 120.000, rounddown(123.456, 1) = 123.400. The output of an equation is in the format "Integer_Portion.Decimal_Portion". The integer portion is displayed to the necessary number of digits (1 to 18) but the decimal portion is always displayed to 18 digits. However, when the result of the calculation is an integer, no decimal point or decimal digits are displayed. If overflow occurs in an equation, lcalc is aborted. If you specify the --overflow option, the calculation of that equation is aborted, the text string specified in the option is output, and execution of lcalc continues. If you omit the text string in the --overflow option, the string "ovflw" is output by default. For division and modulo, if the right operand is zero then the result of the calculation is zero. If you specify the --divzero option, the equation containing that operation is discarded and the text string specified in the option is output. If you omit the text string in the --divzero option, the string "div/0" is output by default. Fields are written as $<formula> and indicate the input field. The difference from the "$" operator is whether or not the value is resolved to a number. For example, if the script is "$1, ($2)" and the input is: 123.456 123.456 Then the output is: 123.456 123.456000000000000000 A range of fields can be specified using the format $[<formula>:<formula>], which indicates multiple contiguous input fields. Any amount of whitespace (including linefeeds) may appear between the elements of the script. Everything from "#" up until the end of the line is a comment. Comments can be placed anywhere whitespace can be placed. If you specify the -d option, the file is not read and the script is executed directly. In this case, you cannot specify the "$" operator, fields or ranges of fields.
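Two small sketches using the -d option described above (passing the script as a single quoted argument is an assumption; the output shown follows the formatting rules stated above):
    $ lcalc -d '1 + 2'
    3
    $ lcalc -d 'rounddown(123.456, 1)'
    123.400000000000000000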
plus Adds the arguments Outputs the value v1 + v2 + v3 …
proportion Returns a proportion Within a group of records (rows) in <file> (or stdin) that have the same value in the <key> field, calculates the proportion that each row makes up out of the total of all rows in the group, and inserts the proportion as a new field immediately following the key field.
rand Generates random numbers Generates <num> random numbers. Outputs uniform random numbers from 1 to 65536. If you specify option -m <max> the command generates random numbers from 1 to <max>. If you specify the -a option then <num> alphabetic lowercase characters are generated. The -A option generates uppercase characters.
ratio Returns a ratio Calculates the ratio of the field specified as "val=" to the sum of that field in all the records (lines) of the specified file or standard input, then inserts the result in the next field.
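An illustrative sketch (file and values hypothetical; the val= form follows the description above, and the number of decimal places shown is an assumption):
    $ cat sales
    tokyo 100
    osaka 300
    $ ratio val=2 sales
    tokyo 100 25.0
    osaka 300 75.0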
round Round, round-up or truncate Rounds, rounds up or truncates the specified field of the file (or standard input) to the specified number of digits and outputs the result. If you specify the decimal places as "0" (n.0) then the field is output as an integer. If you specify the decimal places as "0+num" then the integer value is rounded, rounded up or rounded down to that number of decimal places.
taground Round, round-up or truncate Rounds, rounds up or truncates the specified field of the file (or standard input) to the specified number of digits and outputs the result. If you specify the decimal places as "0" (n.0) then the field is output as an integer. If you specify the decimal places as "0+num" then the integer value is rounded, rounded up or rounded down to that number of decimal places.
String
iscode Checks the number of digits in a numeric code. If <string> is a number with <n> digits, the command exits normally, otherwise it exits with an error (status 1).
isnum Checks if the argument is a number This command exits normally if <num> is a number, otherwise it exits with an error (status 1).
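Because the result is reported through the exit status, isnum can be used as a guard in shell scripts, for example:
    $ isnum 123 && echo "number" || echo "not a number"
    number
    $ isnum 12a && echo "number" || echo "not a number"
    not a number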
kana Converts Japanese "Katakana" to/from "Hiragana" -htok : Converts hiragana to katakana -ktoh : Converts katakana to hiragana
full Convert to multi-byte (zenkaku) characters. Converts all single-byte (hankaku) katakana and alphanumeric characters in <file> or standard input to multi-byte (zenkaku) characters. (<-> half)
mcrypt MD5 encrypt MD5 encrypts <string> using <salt>. The default for <salt> is "UspLab". Please use a string of length 8 or less for <salt>.
half Convert to single-byte (hankaku) character Converts all multi-byte (zenkaku) katakana and alphanumeric characters in <file> or standard input to single-byte (hankaku) characters. (<-> full)
scrlen Outputs the display length of the specified string Outputs the display length of the UTF-8 string <string>.
sml Converts roman alphabet letters to lower case. Converts all roman alphabet letters in the specified fields to lower case. Characters that cannot be converted (non-roman, numbers, symbols, multi-byte) are not converted and output as-is.
uconv UTF-8 <=> Shift JIS / EUC-JP Code Conversion Converts character codes to and from UTF-8 and Shift JIS / EUC-JP. The options and their conversions are shown below: -utos UTF-8 to Shift JIS -stou Shift JIS to UTF-8 -utos UTF-8 to EUC-JP -stou EUC-JP to UTF-8 If characters that cannot be converted or characters containing invalid character codes are detected: They are converted to a default substitution string. If the -d<string> option is specified, they are converted to <string>. If the -e option is specified, conversion is aborted. If the -Lu option is specified, Unix line endings are used (\n). If the -Lw option is specified, Windows line endings are used (\r\n). If "Gaiji" conversion is required, you can specify the gaiji conversion table using the option --gaiji GAIJI_TABLE. The gaiji conversion table contains the character code before conversion, the character code after conversion and a comment, separated by whitespace. Comments are optional. Lines beginning with # or empty lines are ignored.
ugrep Wrapper for GNU grep ugrep is a wrapper for GNU grep. Even if the specified pattern is not found in the file, the error status is 0 and not 1. This prevents disruption of shell scripts.
utf8nude Removes non-displayable ASCII characters and improper UTF-8 bytes After converting POSTed data with cgi-name, this command deletes improper bytes. "Improper bytes" refers to: * Non-displayable bytes within the ASCII range * Bytes that do not fall within ASCII code or UTF-8 code * 5- and 6-byte UTF-8 sequences Whitespace, tabs, linefeeds and CR (0x0d) are allowed.
width Returns the display width of a file Returns the display width of <file>, counting single-byte characters as width 1 and multi-byte characters as width 2. You can specify multiple files.
System
corenum Outputs the number of physical cores in the server Returns the number of physical cores in the server. On Linux, this command reads the core id and physical id from /proc/cpuinfo and then calculates the number of cores. On FreeBSD, this command executes "sysctl -n kern.smp.cpus". The rules for calculating cores on Linux are as follows: * If multiple processor entries share the same core id and physical id, they are logical cores created by hyperthreading and are counted as a single physical core. * If there are no core id entries, the number of processors is counted instead.
exist Checks if one or more files exist Exits normally if every file <file1> <file2> … exists, otherwise it exits with an error.
extname Returns the file extension from a pathname extname returns the file extension of the specified <pathname>.
getip Returns the IP address of the main machine when you specify the MSCTRLTABLE job group. Returns the IP address of the main machine when you specify the MSCTRLTABLE job group. If the msflg of the machine group of the specified job group is "B" (load balance), then the IP address of one random machine is returned. If there is no load balancing in the machine group, then the IP address of the machine with msflg set to "M" (main) is returned. If no IP address is detected, or multiple IP addresses are detected, the command exits with an error.
gtouch Creates a null file in gzip format. If the specified files <file>… have content (are larger than zero bytes), this command does nothing. If they are zero bytes or do not exist, then a null file (20 bytes) in gzip format is created. When this null file is unzipped, the result is zero bytes.
ismime Checks if a file is in MIME format This command exits normally if <file> is in MIME format, otherwise it exits with an error (status 1).
itouch Initializes a file If the specified file is not found or is zero bytes, then its contents are initialized with <string>. If -<n> is specified as a number, then the file is initialized with <n> lines of <string>. You can specify multiple files. If the file exists and is larger than zero bytes, the command does nothing. If the file specified is "-" then the command expects input on standard input and will output to standard output.
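For example (file name hypothetical; the argument order <string> then <file> is an assumption based on the description above):
    $ rm -f flag
    $ itouch "0" flag
    $ cat flag
    0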
lcnt Counts lines (rows) Counts the number of rows (records) in the text data input from the file specified in the argument or standard input.
man2 Display the manual page for a USP command Displays the manual page for a USP command.
msctrl Extracts information from the MSCTRLTABLE Extracts rows from MSCTRLTABLE according to the Select_Conditions and outputs the information specified in the Output_Conditions. Information is output in the order in which Output_Conditions are specified.
msec Displays microseconds msec displays the number of microseconds since 1970/01/01.
rootname Removes the extension from a path name The rootname command outputs <pathname> with the file extension removed.
semwait Wait until the specified files exist If all files <file1> <file2> … exist, the command exits normally. If at least one file doesn't exist, the command waits, checking once per second whether the files exist. The -t option allows you to specify a time limit in any of the following four syntaxes: HHMM: Hours and Minutes HHMMSS: Hours, Minutes and Seconds YYYYMMDDHHMM: Year, Month, Day, Hours and Minutes YYYYMMDDHHMMSS: Year, Month, Day, Hours, Minutes and Seconds If the specified time limit passes before all files exist, the command exits with an error.
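A typical pattern (file names hypothetical) is to block a downstream job until upstream jobs have created their completion files, with a time limit in the HHMM form described above:
    $ semwait -t 2330 job1.done job2.done && echo "inputs ready"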
ulock Exclusive access control command Syntax 1 creates an absolutely exclusive zone. In this case, ulock creates <lock-file> exclusively. Syntax 2 creates a read/write lock. When using -w (write lock), ulock creates <lock-file> exclusively and waits until the link count on <counter-file> becomes 1. When using -r (read lock), ulock attempts to increment the link count of <counter-file> by one until it is successful, at which time <command> is executed and the link count of <counter-file> is decremented by one. The --timeout option specifies the maximum time to wait for <command> to run. If you specify -1 it will wait indefinitely. The default is -1 (wait indefinitely). The --invalid option deletes old lock files. The default is 60 seconds.
Statistical
abc ABC (Pareto) Analysis
ceil Ceiling Function
floor Floor Function
coefvariation Coefficient of Variation
correl Correlation Coefficient (Pearson's product-moment correlation coefficient)
covariance Covariance
entropy Entropy
fft Fourier Transform
ifft Inverse Fourier Transform
freqdist Frequency Distribution
gauss-density Density Function of the Gauss Distribution
lsfit Least-Squares Fit
markov-chain Markov Chain Simulation
mat-calc Matrix Calculation
mat-minus Matrix Subtraction
mat-multi Matrix Multiplication
mat-plus Matrix Addition
median Arithmetic Median
mode Mode
multireg Multiple Regression
quariledev Quartile Deviation
range Range Distribution
sampling Weighted Sampling of Data
stddev Standard Deviation
stdscore Deviation Score
variance Variance