SEDIT
Edits a string and writes it to a results parameter
<ISTR>  Input string to be edited (String results label or literal string)
<OSTR>  Output results label
<FUNC>  Edit operation to perform (see below for LIST)
<P1>    Edit parameter as defined for each operation
<P2>    Edit parameter as defined for each operation

NOTE: SEDIT can now handle MULTIPLE operations per line (entering just ONE
output result name):

  nM> SEDIT <input> <output result> FUNC1 [P1] [P2] ... FUNCn [P1n] [P2n]

This command performs a few basic string editing functions that may be
required by some macros with a sophisticated user interface or command line
alteration functions. The supported operations (CAPS indicate minimum
required name) are:

  APPEND, BETWeen, BSEArch, CLEAN, ELEMent, EXT, GSUBstitute, HEAD, JOIN,
  KEY, LENgth, LOCAse, NELem, NFORM, PADL, PADR, PADB, PARSE, PARSEALl,
  PARSEARgs, PARSEDInd, PATH, PREPEND, RANGE, ROOT, SEARch, SELect, SPLIT,
  STRIM, SUBstitute, TAIL, TRIM, UPCAse, WORD

NOTE: As of NeXtMidas 3.1.2, the behaviour of the BETWeen, RANGE, and SELect
operations is deprecated and will change in a future release. These changes
are to bring SEDIT in line with X-Midas SEDIT behaviour. These changes are:

  BETWeen: Negative numbers will be treated as offsets from the last
           character in the input string.
  LENgth:  The index passed to LENgth will be one-based, since it is
           counting elements.
  RANGE:   Like BETWeen, negative numbers will be treated as offsets from
           the last character in the input string. The net effect of this
           is that (in zero index mode) a range of -5 str.length-1 will
           return the last 5 characters of the string, where previously it
           would return the last 6 characters. Also, leaving off the second
           index will equate to a value of 0, NOT the end of the input
           string.
  SELect:  This operation will be 1-index based only, since it is returning
           an element number, not an index.

These changes can be utilized as of NeXtMidas 3.1.2 using the /LEGACY=FALSE
switch. In a future release, this will be the default setting.
Until that time, users of the above operations not using /LEGACY=FALSE will
get deprecation warnings. There is also an equivalent switch /seditLegacy
added as a convenience so that users can use the switch at the top of a macro
without interfering with other primitives that use /legacy (e.g. LIST2).

By default SEDIT is "offset" or "zero-based". The /OB or /ONEBASE switch may
be used to FORCE one-based processing. When using functions that return an
index (SEARCH, BSEARCH, PARSEDINDEX, ...) one should check for < 0 for
invalid or not found. A return value of zero (0) is NO LONGER SUPPORTED!
Zero is a valid index in Java and should not be used as an error indicator.

NOTE 1: In NeXtMidas a string result has access to all of the methods of the
String class. Use the QUERY command to show the available methods. For
instance, we can use the raw Java toUpperCase method as follows:

  nM> res str "This is a string"
  nM> res str2 str.touppercase()
  nM> res str2
  16S: STR2 = THIS IS A STRING

NOTE 2: In NeXtMidas null strings, "", are differentiated from blank spaces,
" ". Thus, there is no "special string" "<SPACE>" as in the X-Midas version
of SEDIT. For example,

  nM> res orig "Underscores_to_spaces"
  nM> sedit orig str gsubs "_" " "
  21S: STR = "Underscores to spaces"

FUNCTIONS:

  APPend    - Append <P1> to <string>

  BETWeen   - ZERO-BASED: extracts any possible substring from <string>
              between <P1> and <P2>. It doesn't matter whether <P1> is
              "less than" <P2> or not; BETWEEN returns what is *between*
              the two indices, inclusive. A negative number means that many
              characters from the other index. In general, BETWEEN will
              give as much of the string as possible if at least one index
              is within the string. If both indices are out of range, a
              blank string will be returned.

              ONE-BASED (/OB) ONLY: A zero means the end of the string. The
              zero is a special case; if the other index is within range,
              it returns the indicated substring; if not, it returns blank.

              NOTE: As of NeXtMidas 3.1.2, the BETWeen function will have
              the following change to behaviour when using /LEGACY=FALSE.
              This will be the default behaviour in a later release.

              Negative numbers will represent an index from the end of the
              string. So, in zero-based, "-1" is the last character in the
              string, and in one-based, "0" is the last character. Like
              RANGE, BETWeen will support leaving off the last index to
              mean the end of the string. See the examples below.

              Examples (Zero-Based):

              * Get the first 7 characters of a string
                  nM> SEDIT "This is a string" str "BETWEEN" 0 6
                      str = "This is"
                  nM> SEDIT "This is a string" str "BETWEEN" 6 0
                      str = "This is"

              * Get the last 6 characters of a string
                  nM> res instr "This is a string"
                  nM> SEDIT instr str "BETWEEN" ^instr.length -5
                      str = "string"
                  nM> SEDIT/LEGACY=FALSE instr str "BETWEEN" -1 -6
                      str = "string"

              * Get the characters OUT-OF-RANGE of a string
                  nM> SEDIT "This is a string" str "BETWEEN" 100 200
                      str = ""

              * Get the entire string
                  nM> SEDIT "This is a string" str "BETWEEN" 0 200
                      str = "This is a string"

              * Get five (5) characters from character 4 to the beginning
                of the string
                  nM> SEDIT "This is a string" str "BETWEEN" 4 -100
                      str = "This "

              * The INCORRECT way to get the last character of the string
                  nM> res instr "This is a string"
                  nM> SEDIT instr str "BETWEEN" ^instr.length ^instr.length
                      str = ""

              * The CORRECT way to get the last character of the string
                  nM> res instr "This is a string"
                  nM> SEDIT instr str "BETWEEN" ^instr.length-1 ^instr.length-1
                      str = "g"
                  nM> SEDIT/LEGACY=FALSE instr str "BETWEEN" -1 -1
                      str = "g"

              * Get the first character of the string
                  nM> SEDIT instr str "BETWEEN" 0 0
                      str = "T"

              * Get the last six characters in a string
                  nM> SEDIT/LEGACY=FALSE instr str "BETWEEN" -6
                      str = "string"

              Examples (One-Based):

                  SEDIT/OB "This is a string" str "BETWEEN" 1 7
                      str = "This is"
                  SEDIT/OB "This is a string" str "BETWEEN" 7 1
                      str = "This is"
                  res instr "This is a string"
                  SEDIT/OB instr str "BETWEEN" ^instr.length -5
                      str = "string"
                  SEDIT/OB "This is a string" str "BETWEEN" 100 200
                      str = ""
                  SEDIT/OB "This is a string" str "BETWEEN" 4 -100
                      str = "This"
                  res instr "This is a string"
                  SEDIT/OB instr str "BETWEEN" ^instr.length ^instr.length
                      str = "g"
                  SEDIT/OB instr str "BETWEEN" 0 0
                      str = ""

  BSEArch   - Searches for the substring <P1> in <string> starting from the
              back and returns the index of the start of the substring in
              <label>. Returns -1 if not found.

              Examples:
                  SEDIT "This is a string" idx "BSEARCH" "string"
                      idx = 11
                  SEDIT "This is a string" idx "BSEARCH" "STRING"
                      idx = -1

  CLEAN     - Returns the CLEANed version of <string> in <label>
              (see nxm.sys.lib.Parser)

              Example:
                  SEDIT "This is a string" str CLEAN
                      str = THIS,IS,A,STRING

  ELEMent   - Extracts the nth delimited word where n = <P1> and the
              delimiter = <P2>. The string is not CLEANED as in PARSE.
              <P2> can be more than 1 character in length.

              Examples:
                  SEDIT "This/is/a//string" str "ELEM" 2 "/"
                      str = is
                  SEDIT "Bill and Bob and Jebediah" str "ELEM" 3 " and "
                      str = Jebediah

  ENDS      - Returns true if <string> ends with <P1>, false otherwise.

              Examples:
                  nM> SEDIT "This is a string" x "ENDS" "string"
                  z: X = true
                  nM> SEDIT "This is a string" x "ENDS" "This"
                  z: X = false

  EXT       - Returns the filename extension (see ROOT, TAIL, PATH)

  GSUBstitute - Substitutes string <P2> for every instance of <P1>

              Examples:
                  SEDIT "This is a string" str "GSUBS" "is" "at"
                      str = That at a string
                  SEDIT "This is a string" str "GSUBS" " " "xx"
                      str = Thisxxisxxaxxstring
                  SEDIT "This is a string" str "GSUBS" "is" ""
                      12S: str = Th  a string
                  SEDIT "This is a string" str "GSUBS" "is" " "
                      14S: str = Th    a string

  HEAD      - Alias for PATH (see ROOT, TAIL, EXT)

  JOIN      - Joins a string array into a String.

              Examples:
                  nM> SEDIT "A string array to join" word SPLIT " "
                  nM> SEDIT word str JOIN " "
                  17S: STR = A string array to join

  KEY       - Gets the value from a string containing TAG/VALUE pairs,
              where <P1> is the tag name and <P2> is the delimiter
              (= is DEFAULT).
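
The tag/value lookup that KEY describes can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the NeXtMidas implementation: the function name `key_value` is hypothetical, pairs are assumed to be separated by whitespace, tag matching is assumed to be case-sensitive, and returning `None` for a missing tag is a choice made here, since the text does not say what KEY returns in that case.

```python
def key_value(text, tag, delim="="):
    """Hypothetical model of KEY: return the value paired with `tag`.

    Assumptions (not confirmed by the SEDIT documentation):
      * TAG/VALUE pairs are separated by whitespace
      * tag matching is case-sensitive
      * a missing tag yields None
    """
    for pair in text.split():
        # partition() splits on the first delimiter only, so values may
        # themselves contain the delimiter character.
        name, sep, value = pair.partition(delim)
        if sep and name == tag:
            return value
    return None
```

With these assumptions, `key_value("NAME=homer CITY=SPRINGFIELD", "NAME")` returns `"homer"`.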

              Examples:
                  nM> SEDIT "NAME=homer CITY=SPRINGFIELD" myname KEY "NAME"

  LENgth    - Calculates the length of string through the Nth naturally
              delimited entry where <P1> = N. Use N=-1 (default) to
              calculate the length of the entire string.

              Note: The default behaviour of LENgth will change from
              zero-based to one-based in a future release, since length is
              counting elements.

              Examples (Zero-Based):
                  SEDIT "This is a string" lstr "LEN" 0
                      lstr = 4
                  SEDIT "This is a string" lstr "LEN" 2
                      lstr = 9
                  SEDIT "This is a string" lstr "LEN"
                      lstr = 16

              Examples (One-Based):
                  SEDIT/OB "This is a string" lstr "LEN" 0
                      lstr = 16
                  SEDIT/OB "This is a string" lstr "LEN" 3
                      lstr = 9

              Note that you can, for example, extract the first three words
              of a string by using LENgth followed by RANGE:

                  RESULT string "This is one powerful utility!"
                  SEDIT/OB string lstr length 3
                  SEDIT/OB string substr RANGE 1 lstr
                      substr = This is one

  LOCAse    - Converts alphabetic characters to lower case

              Examples:
                  SEDIT "This is a string" str "LOCA"
                      str = this is a string

  MASK      - Calls Parser.mask to build an integer bit mask of enabled
              items from a list.

              Examples:
                  SEDIT "AA,BB,CC,DD" mask "MASK" "A|DD"
                  L: mask = 0x9

  NELem     - Counts the number of elements in string delimited by <P1>.
              The default delimiter is a comma (,). Only an empty string or
              the RESERVED word "NULL" has zero (0) elements.

              Examples:
                  SEDIT "" nels "NELEM"
                  L: NELS = 0
                  SEDIT "NULL" nels "NELEM"
                  L: NELS = 0
                  SEDIT " " nels "NELEM"
                  L: NELS = 1
                  SEDIT "," nels "NELEM"
                  L: NELS = 2
                  SEDIT "x,y" nels "NELEM"
                  L: NELS = 2
                  SEDIT "blank separated string" nels "NELEM" " "
                  L: NELS = 3
                  SEDIT "blanks and commas,separated string" nels "NELEM" " "
                  L: NELS = 4
                  SEDIT "    " nels "NELEM" " "   ! Four blanks with blank as delimiter
                  L: NELS = 5
                  SEDIT "Bill and Bob and Jebediah" nels "NELEM" " and "   ! String delimiter
                  L: NELS = 3

  NFORM     - Format numbers according to given mask.
              Besides all of the standard Java format strings, Fortran
              format strings can be applied by surrounding the Fortran
              format string with parentheses like (F12.2). The following
              format keywords are also supported:

                GEN   - X-Midas GENeral format (no exponent if between 1E-3, 1E15).
                VIS   - X-Midas VISual format (no exponent if between 1E-3, 1E7).
                SCI   - SCIentific notation.
                ENG   - ENGineering notation (exponent is a multiple of 3).
                MAN   - MANtissa notation (no exponents).
                DMS   - Deg-Min-Sec angular format.        ddd'mm'ss
                LAT   - Deg-Min-Sec format for latitude.   ddd'mm'ssN
                LON   - Deg-Min-Sec format for longitude.  ddd'mm'ssE
                STD   - STanDard time code format.         [yy]yy:mm:dd::hh:mm:ss
                ACQ   - ACQuisition time code format.      [yy.ddd]:[hh:mm:ss]
                EPOCH - EPOCH quadwords for time code.     [yy]yy:sec_in_year
                NORAD - NORAD timecode format.             yyddd.frac_of_day
                TCR   - TimeCode Reader format.            ddd:hh:mm:ss
                VAX   - VAX time format.                   dd-MMM-yy[yy]:hh:mm:ss
                HMS   - Hour-Min-Sec time format.          hh:mm:ss.ffff
                YMD   - Year-Month-Day format.             yyyy:mm:dd
                NET   - 32-bit integer formatted as URL.   127.0.0.1

              Examples:
                  SEDIT 1 str "NFORM" "#"
                      str = 1
                  SEDIT 1.12345678 str "NFORM" "#.###"
                      str = 1.123
                  SEDIT 1.12345678 str "NFORM" "00.000"
                      str = 01.123
                  SEDIT 1 str NFORM "0.00"
                      str = 1.00
                  SEDIT 1 str NFORM "(F3.2)"
                      str = 1.00
                  SEDIT 123.456 str NFORM "DMS"
                      STR = 123'27'22

  NSEA      - Returns the index of the start of the <P2> occurrence of
              <P1>. If <P1> is not found, or if the specified occurrence of
              <P1> is not found, returns -1. (Since 3.5.0)

              Examples:
                  nM> sedit "1 and 2 and 3 and" index NSEA "and" 3
                  L: INDEX = 14
                  nM> sedit "1 and 2 and 3 and" index NSEA "and" 10
                  L: INDEX = -1
                  nM> sedit "1 and 2 and 3 and" index NSEA "missing" 3
                  L: INDEX = -1

  PADx      - PADRight, PADLeft and PADBoth. Default pad character is a
              space. You can specify a length, or a relative length by
              prepending a + before the number. When padding BOTH sides
              requires an odd number of pad characters, the extra character
              goes on the right.

              Examples:
                  nM> sedit "a string" str PADL 20
                  20S: STR = "            a string"
                  nM> sedit "a string" str PADR 20 "."
                  20S: STR = "a string............"
                  nM> sedit "a string" str PADB 20
                  20S: STR = "      a string      "

  PARSE     - Extracts the nth naturally delimited word where n = <P1>.
              <P2> is an optional delimiter; if not set, commas or spaces
              are used. <label> will be in all uppercase letters unless
              /CLEAN=FALSE.

              Examples:
                  nM> SEDIT "This is a string" str "PARSE" 2
                      str = IS
                  nM> SEDIT "This/is/a/string" str "^func" 4 "/" /clean=f
                  nM> res str
                  6S: STR = string

  PARSEALl  - Parses <string> into naturally delimited words and puts the
              parsed words into a results array named <label>. <P1>
              specifies the number of elements in the array. If <P1> is not
              specified or is 0, the number of elements in the array
              defaults to the number of words in the string. If <P1> is
              less than the number of words, only the first <P1> words will
              be placed into the array. If <P1> is greater than the number
              of words, the remaining elements of the array are null. <P2>
              is an optional delimiter; if not set, commas or spaces are
              used. All results will be in all uppercase letters unless
              /CLEAN=FALSE.

              Examples:
              *   nM> SEDIT "This is a string" word "PARSEALL" 4
                  nM> res word(0)
                  4S: WORD(0) = THIS
                  nM> res word(2)
                  1S: WORD(2) = A
                  nM> res word(4)
                  ERROR: java.lang.IllegalArgumentException:
                  KeyObject.getIndexed(): Error in accessing element 4 from
                  [Ljava.lang.String;@33b121:
                  java.lang.ArrayIndexOutOfBoundsException

              *   nM> SEDIT "This is a string" word "PARSEALL" 4 /clean=f
                  nM> res word(0)
                  4S: WORD(0) = This

  PARSEARgs - Parses the arguments of a command. The returned object is of
              type nxm.sys.lib.Args, and as such has access to the Args
              class methods.

              Examples:
                  nM> sedit "PATH,FUNC=SET,SYS" args PARSEARGS
                  nM> res L:size args.size
                  nM> res size
                  L: SIZE = 2

  PARSEDIndex - Parses <string> into naturally delimited words and returns
              the index of <P1>. Returns -1 if not found.
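
As an illustration of the index convention PARSEDINDEX follows (zero-based by default, one-based with /OB, and -1 for "not found" in both modes), here is a minimal Python sketch. It is a model under assumptions, not the NeXtMidas code: the name `parsed_index` and the `one_based` and `clean` flags are hypothetical, and upper-casing both the string and the search word is only an assumed stand-in for what /CLEAN does.

```python
import re

def parsed_index(text, word, one_based=False, clean=True):
    """Hypothetical model of PARSEDINDEX.

    Returns the position of `word` among the naturally delimited
    (comma or space) words of `text`, or -1 if it is not present.
    """
    if clean:
        # Assumption: /CLEAN upper-cases the input, so the comparison is
        # effectively case-insensitive.
        text, word = text.upper(), word.upper()
    # Split on runs of commas and/or spaces and drop empty tokens.
    words = [w for w in re.split(r"[ ,]+", text) if w]
    if word not in words:
        return -1  # "not found" is always negative, never zero
    idx = words.index(word)  # zero-based position
    return idx + 1 if one_based else idx
```

Because "not found" is -1 in both modes, a caller can test for `< 0` without knowing whether /OB was in effect.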

              Examples (Zero-Based):
                  nM> SEDIT "This is a string" idx "PARSEDI" is
                      idx = 1
                  nM> SEDIT "This is a string" idx "PARSEDI" "not"
                      idx = -1

              Examples (One-Based):
                  nM> SEDIT/OB "This is a string" idx "PARSEDI" is
                      idx = 2
                  nM> SEDIT/OB "This is a string" idx "PARSEDI" "not"
                      idx = -1

  PATH      - Returns the path of a filename (see ROOT, TAIL, EXT)

  PREPend   - Prepend <P1> to <string>

  RANGE     - Extracts the substring from <string> between <P1> and <P2>.
              For either <P1> or <P2>, a negative number means that many
              characters from the end. The indices <P1> and <P2> are
              order-DEPENDENT; that is, <P1> must refer to a position in
              the string that is equal to or to the left of the position
              specified by <P2>. If either index is out of range, a blank
              string is returned. RANGE is therefore useful for catching
              errors: if it cannot give the entire range of characters, it
              won't give any. The BETWEEN operator, by contrast, is
              error-tolerant.

              Note: As of NeXtMidas 3.1.2, the RANGE function will have the
              following change to behaviour when using /LEGACY=FALSE. This
              will be the default behaviour in a later release.

              Negative numbers will count as an index from the end of the
              string. See the negative number examples below. Leaving off
              the second index value will equate to a value of zero, NOT
              the end of the string.
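
The strict, order-dependent behaviour of RANGE under /LEGACY=FALSE can be sketched in Python as follows. This is an illustrative model only, assuming zero-based mode with both indices supplied; the function name `sedit_range` is hypothetical, and the omitted-second-index case and the legacy negative-number handling are deliberately not modelled.

```python
def sedit_range(text, p1, p2):
    """Hypothetical model of zero-based RANGE with /LEGACY=FALSE.

    Both ends are inclusive. Negative values are offsets from the end of
    the string, so -1 refers to the last character.
    """
    n = len(text)
    start = p1 + n if p1 < 0 else p1
    stop = p2 + n if p2 < 0 else p2
    # RANGE is strict: an out-of-range index or a reversed pair gives a
    # blank string rather than a partial result.
    if not (0 <= start <= stop < n):
        return ""
    return text[start:stop + 1]
```

For the string "This is a string", `sedit_range(s, -6, -1)` gives `"string"` and `sedit_range(s, 6, 0)` gives `""`, which reflects the all-or-nothing behaviour that distinguishes RANGE from the error-tolerant BETWEEN.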

              Examples (Zero-Based):
                  nM> SEDIT "This is a string" str "RANGE" 0 6
                      str = "This is"
                  nM> SEDIT "This is a string" str "RANGE" 6 0
                      str = ""
                  nM> SEDIT "This is a string" str RANGE -5 ^inStr.length-1
                      str = "string"
                  nM> SEDIT/LEGACY=FALSE "This is a string" str RANGE -6 -1
                      str = "string"
                  nM> SEDIT "This is a string" str "RANGE" 100 200
                      str = ""
                  nM> SEDIT "This is a string" str "RANGE" 4 -100
                      str = ""
                  nM> SEDIT "This is a string" str "RANGE" -1 -2
                      str = ""
                  nM> SEDIT "This is a string" str "RANGE" -2 -1
                      str = "in"
                  nM> SEDIT/LEGACY=FALSE "This is a string" str "RANGE" -2 -1
                      str = "ng"
                  nM> res inStr "This is a string"
                  nM> SEDIT inStr str RANGE ^inStr.length ^inStr.length
                      str = ""
                  nM> SEDIT inStr str RANGE -1 -1
                      str = "n"
                  nM> SEDIT/LEGACY=FALSE inStr str RANGE -1 -1
                      str = "g"

              Examples (One-Based):
                  nM> SEDIT/OB "This is a string" str "RANGE" 1 7
                      str = "This is"
                  nM> SEDIT/OB "This is a string" str "RANGE" 7 1
                      str = ""
                  nM> res inStr "This is a string"
                  nM> SEDIT/OB inStr str RANGE -5 ^inStr.length
                      str = "string"
                  nM> SEDIT/OB "This is a string" str "RANGE" 100 200
                      str = ""
                  nM> SEDIT/OB "This is a string" str "RANGE" 4 -100
                      str = ""
                  nM> SEDIT/OB "This is a string" str "RANGE" -1 -2
                      str = ""
                  nM> SEDIT/OB "This is a string" str "RANGE" -2 -1
                      str = "in"
                  nM> res inStr "This is a string"
                  nM> SEDIT/OB inStr str RANGE ^inStr.length ^inStr.length
                      str = "g"

  ROOT      - Returns the filename without extension (see TAIL, PATH, EXT)

  SEARch    - Case-sensitive search for the substring <P1> in <string>;
              returns the index in <label>. Returns -1 if not found.

              Examples (Zero-Based):
                  nM> SEDIT "This is a string" idx SEARCH "string"
                      idx = 10
                  nM> SEDIT "This is a string" idx SEARCH "STRING"
                      idx = -1

              Examples (One-Based):
                  nM> SEDIT/OB "This is a string" idx SEARCH "string"
                      idx = 11
                  nM> SEDIT/OB "This is a string" idx SEARCH "STRING"
                      idx = -1

  SELect    - Finds the index of the first token that starts with <P1>.
              See nxm.sys.lib.Parser for documentation on the find method.

              Note: As of NeXtMidas 3.1.2, the SELect function will have
              the following change to behaviour when using /LEGACY=FALSE.
              This will be the default behaviour in a later release.

              The index returned will always be one-based, and using the
              /OB switch will have no effect.

              Example (/LEGACY=FALSE):
                  nM> SEDIT/LEGACY=FALSE "This,is,a,string" idx SELECT "str"
                      idx = 4

              Example (Zero-Based):
                  nM> SEDIT "This,is,a,string" idx SELECT "str"
                      idx = 3

              Example (One-Based):
                  nM> SEDIT/OB "This,is,a,string" idx SELECT "str"
                      idx = 4

  SPLIT     - Splits a string into an array (0 to N-1) of N tokens. Unlike
              PARSEALL, the delimiter is specified.

              Examples:
                  nM> SEDIT "A string to split" word SPLIT " "
                      word(0) = A
                      word(1) = string
                      word(2) = to
                      word(3) = split
                  nM> SEDIT "A,string,to,split" word SPLIT ","
                      word(0) = A
                      word(1) = string
                      word(2) = to
                      word(3) = split

  STARTS    - Returns true if <string> starts with <P1>, false otherwise.

              Examples:
                  nM> SEDIT "This is a string" x "STARTS" "This"
                  z: X = true
                  nM> SEDIT "This is a string" x "STARTS" "string"
                  z: X = false

  STRIM     - Trims all leading and trailing spaces off of <string>.

              Examples:
                  nM> SEDIT " stranded " str STRIM
                      str = stranded

  SUBstitute - Substitutes string <P2> for the first instance of <P1> only

              Examples:
                  nM> SEDIT "This is a string" str SUBS "is" "at"
                      str = That is a string
                  nM> SEDIT "This is a string" str SUBS "is" " "
                  15S: STR = Th  is a string     ! Note length
                  nM> SEDIT "This is a string" str SUBS "is" ""
                  14S: STR = Th is a string      ! Note length

  TAIL      - Returns the filename and extension (see ROOT, PATH, EXT)

  TRIM      - Trims off the string before <P1> and after <P2>. <label> will
              not contain <P1> or <P2>.

              Examples:
                  nM> SEDIT "ARRAY(FRAME;INDEX)+5" str TRIM "(" ")"
                      str = FRAME;INDEX
                  nM> SEDIT "FILENAME.EXT" str trim "."
                      str = EXT
                  nM> SEDIT "FILENAME.EXT" str trim ,, "."
                      str = FILENAME

  UPCAse    - Converts alphabetic characters to upper case.
              One may also use the Java method directly, for example:

                  nM> res str "my string"
                  nM> res str2 str.toUpperCase()
                  9S: STR2 = MY STRING

  WORD      - Extracts the naturally delimited word containing <P1>

              Examples:
                  nM> SEDIT "This is a string" str WORD "ri"
                      str = string

Performing Multiple Functions:
=============================
To perform multiple operations per line, simply chain the operations together
(entering just ONE output result name):

  nM> sedit "this is a string" out GSUBS "is" "at" GSUBS "at" "is"
  nM> res out
  16S: OUT = this is a string

When performing multiple operations per line the operations must "make
sense". For example, the following will generate an exception:

  nM> sedit "this is a string" out LEN GSUBS "is" "at"
  ERROR: Unable to convert GSUBS to type L

because SEDIT expects the parameter after LEN to be an integer.

SWITCHES:

  /CLEAN       - [DEFAULT=TRUE] Cleans the string before performing the
                 functions: PARSE, PARSEALL, PARSEDINDEX
  /DEBUG       - Turn on debug output
  /FORCE       - Sets <OSTR> in a readonly table. (Since 2.9.3)
  /LEGACY      - Use legacy behaviour of BETWEEN, LENGTH, RANGE and SELECT
                 functions. [DEF=TRUE], but will change to [DEF=FALSE] in a
                 future version. (Since 3.1.2)
  /OB          - Same as /ONEBASE.
  /ONEBASE     - Force 1-based indexing. Affects the BETWEEN, BSEARCH,
                 PARSEDINDEX, LENGTH, RANGE, SEARCH and SELECT functions.
  /SEDITLEGACY - Same as /LEGACY. (Since 3.1.2)
  /STRIP       - Remove the quotes from the string
  /ZEROBASE    - DEPRECATED -- This is now the default. Use /OB to override.

SEE ALSO: Query, nxm.sys.test.test_sedit.mm, nxm.sys.lib.Format,
          nxm.sys.lib.Parser