Class Sort

java.lang.Object
com.softwaremining.jcl.Sort

public class Sort extends Object
This class provides a function similar to the SORT function accessible from JCL scripts. It can be called from the command line / JVM by:
   java -Xmx400m com.softwaremining.jcl.Sort C:/IN.txt C:/OUT.txt sortFields=(00424,5,A,00178,5,A,00183,30,A,114,32,A)  true 12345 true
     where first 3 options represent: input-file, output-file, sort-Fields
           the final 3 optional parameters represent [duplicatesAreInOrder] [recordsize] [removeduplicates] 
          
    or 
     java com.softwaremining.jcl.Sort [parameters] (note - this option requires the SoftwareMining extra libraries)
   where parameters is a combination of:
      propertyfile=new-corect.property file
      in=input-file
      out=output-file
      xsumFilename= XSUM File 
      
      sortFields= standard mainframe format, i.e. the original JCL sort-fields, e.g. sortFields=(66,1,A,22,3,D) (from column 66 for 1 character ascending, then from column 22 for 3 characters descending)
      sumFields=  standard mainframe format, e.g  sumFields=(40,15,ZD,60,15,ZD,75,8,ZD,99,15,ZD,114,15,ZD)
      recordSize=number, e.g. recordSize=10 - Sometimes a record-line may contain CR/LF in the middle of the data. With this parameter the system reads "record-size" characters for each record, so the entire record is read whether or not it has a CR in the middle.
      dupinorder=true (or duplicatesareinorder=true) (default: false) - sets DuplicatesAreInOrder. This is the same as the "equals" parameter.
      equals=true or false (default: false) - sets the "DuplicatesAreInOrder" parameter. On the mainframe, EQUALS tells the process to preserve the original order of records whose sort keys are equal. In Java the system automatically preserves the order.
      removeDuplicates=true or false (default: false)
      saveAsLineSequential=true or false (default: true). Saves as RECORD-Sequential when false.
      convert=true or false (default: true). Converts the output to fixed length file (same as saveAsLineSequential=true)
      
      inRec=Reformats the input records before they are sorted, merged, or copied. Standard mainframe format, e.g. inRec=(9,6,11:6,3,16:15,9,ZD,EDIT=(IIIIIIIITS),SIGNS=(,,,-),131:C' ')
      outRec= Reformats the input records after they are sorted, merged, or copied. Standard mainframe format, e.g. outRec=(9,6,11:6,3,16:15,9,ZD,EDIT=(IIIIIIIITS),SIGNS=(,,,-),131:C' ')
      outRecSize= number - size of output record e.g. outRecSize=120
      overlay= OUTREC Overlay defined in standard mainframe format, e.g. overlay=(1:1,230,231:C'VV')
      include= standard mainframe format, e.g. include=(1,10,CH,NE,11,10,CH,AND,21,10,CH,NE,C'          ')
      omit= standard mainframe format, e.g. omit=(256,10,EQ,C'0000000000',OR,256,10,EQ,C'          ')
      noDetail=true or false (default: false). Tells DFSORT not to write the data lines, so if TRAILER1 is defined, the output contains only TRAILER1 and no other records.
      symnames=file1;file2 - file1 and file2 are picked up from DATA_DIR. The first line of each file is expected to contain a comma-separated key/value pair such as "MAXYEAR,C'2018'".
         The system replaces the key with the value in the include parameter.
      
      sections= standard mainframe format, e.g. sections='(33,3,SKIP=3L,TRAILER3=(1:'BRAND TOTAL ',40:TOTAL=(22,8,ZD,EDIT=(III,III,IITS),SIGNS=(,,,-))))'
      header1=  standard mainframe format, e.g. header1=('MONTHLY USAGE REPORT')
      header2=standard mainframe format,   e.g. header2=('MONTHLY USAGE REPORT')
      header3=standard mainframe format,   e.g. header3=('MONTHLY USAGE REPORT')
      trailer1=standard mainframe format,  e.g. trailer1=('END MONTHLY USAGE REPORT')
      trailer2=standard mainframe format,  e.g. trailer2=('END MONTHLY USAGE REPORT')
      trailer3=standard mainframe format,  e.g. trailer3=('END MONTHLY USAGE REPORT')

      removecc=true or false (default: false) - tells DFSORT to remove the ANSI carriage control characters. 
         Without REMOVECC, a record in SORTOUT may look like this for an input file 
            1bbbbbbbb27
         when REMOVECC is set to true, the initial "1" is removed to produce 
            bbbbbbbb27
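
The symnames substitution described above can be pictured with a minimal sketch. It is not the library's implementation: the class and method names are hypothetical, and it assumes the key is replaced wherever it appears as a whole word in the include condition.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of com.softwaremining.jcl.
class SymNamesSketch {
    // Splits a symbol line such as MAXYEAR,C'2018' at the first comma.
    static String[] parseSymbol(String line) {
        int comma = line.indexOf(',');
        if (comma < 0) throw new IllegalArgumentException("expected KEY,VALUE: " + line);
        return new String[] { line.substring(0, comma).trim(), line.substring(comma + 1).trim() };
    }

    // Replaces whole-word occurrences of the key in the include condition with the value.
    static String substitute(String include, String symbolLine) {
        String[] kv = parseSymbol(symbolLine);
        Pattern key = Pattern.compile("\\b" + Pattern.quote(kv[0]) + "\\b");
        return key.matcher(include).replaceAll(Matcher.quoteReplacement(kv[1]));
    }
}
```

For example, substitute("(1,4,CH,EQ,MAXYEAR)", "MAXYEAR,C'2018'") yields (1,4,CH,EQ,C'2018').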
            
            
       IF/When Support 
      Translation of INREC IFTHEN : e.g. IFTHEN=(WHEN=(01,01,CH,EQ,C'H'),OVERLAY=(17001:C'1'))
         IN_IF=(01,01,CH,EQ,C'H')
         IN_THEN=\"OVERLAY=(17001:C'1'))\"  // must contain surrounding quotation marks
         
      please use a numeric suffix starting from 1 when multiple IFTHEN are involved - e.g.
         IN_IF_1=(01,01,CH,EQ,C'H')
         IN_THEN_1=\"OVERLAY=(17001:C'1'))\"  // must contain surrounding quotation marks
         IN_IF_2=INIT
         IN_THEN_2=...
         
       IF/When Support 
      Translation of OUTREC IFTHEN : e.g. IFTHEN=(WHEN=(01,01,CH,EQ,C'H'),OVERLAY=(17001:C'1'))
         OUT_IF=(01,01,CH,EQ,C'H')
         OUT_THEN=\"OVERLAY=(17001:C'1'))\"  // must contain surrounding quotation marks
         
      please use a numeric suffix starting from 1 when multiple IFTHEN are involved - e.g.
         OUT_IF_1=(01,01,CH,EQ,C'H')
         OUT_THEN_1=\"OVERLAY=(17001:C'1'))\"  // must contain surrounding quotation marks
         OUT_IF_2=NONE
         OUT_THEN_2=...

       Join Support 
      Translation of JOIN e.g. 
      
        JOINKEYS FILE=F1,FIELDS=(1,10,A),SORTED,NOSEQCK
        JOINKEYS FILE=F2,FIELDS=(7,10,A),SORTED,NOSEQCK
        JOIN UNPAIRED,F1,F2
        REFORMAT FIELDS=(F1:1,14,F2:1,20,?)
         
     the above JCL functionality can be reproduced by:
        
           SORTOFA=F1ONLY
           buildA="(1,14)"
           includeA="(35,1,CH,EQ,C'1')"
           SORTOFB=F2ONLY
           buildB="(15,20)"
           includeB="(35,1,CH,EQ,C'2')"
           SORTOFC=BOTH
           buildC="(1,14,/,15,20)"
           includeC="(35,1,CH,EQ,C'B')"
           JOINKEYS_F1="F1=SORTJNF1,FIELDS=(1,10,A),SORTED,NOSEQCK"
           JOINKEYS_F2="F2=SORTJNF2,FIELDS=(7,10,A),SORTED,NOSEQCK"
           JOIN=UNPAIRED,F1,F2
           JOIN_OPTION=COPY
           JOIN_REFORMAT_FIELDS="(F1:1,14,F2:1,20,?)"
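
To make the record layout behind this configuration concrete, here is a small sketch of what JOIN UNPAIRED,F1,F2 with REFORMAT FIELDS=(F1:1,14,F2:1,20,?) produces. It is an illustration rather than the library's code: the class name is hypothetical, keys are assumed to be unique within each file, and the missing side of an unpaired record is assumed to be filled with blanks. The indicator in column 35 is '1' (F1 only), '2' (F2 only) or 'B' (both), which is what includeA, includeB and includeC test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration of the joined record layout; not the library's implementation.
class JoinSketch {
    // Extracts a 1-based fixed-width field; a missing or short record is padded with blanks.
    static String field(String rec, int start, int len) {
        StringBuilder sb = new StringBuilder(rec == null ? "" : rec);
        while (sb.length() < start - 1 + len) sb.append(' ');
        return sb.substring(start - 1, start - 1 + len);
    }

    // F1 key: columns 1-10. F2 key: columns 7-16. Output: F1:1,14 + F2:1,20 + indicator.
    static List<String> joinUnpaired(List<String> f1, List<String> f2) {
        Map<String, String> byKey1 = new TreeMap<>();
        Map<String, String> byKey2 = new TreeMap<>();
        for (String r : f1) byKey1.put(field(r, 1, 10), r);
        for (String r : f2) byKey2.put(field(r, 7, 10), r);
        TreeSet<String> keys = new TreeSet<>(byKey1.keySet());
        keys.addAll(byKey2.keySet());
        List<String> out = new ArrayList<>();
        for (String k : keys) {
            String a = byKey1.get(k);
            String b = byKey2.get(k);
            char indicator = (a != null && b != null) ? 'B' : (a != null ? '1' : '2');
            out.add(field(a, 1, 14) + field(b, 1, 20) + indicator);
        }
        return out;
    }
}
```

Each output record is 35 characters long, so a condition such as (35,1,CH,EQ,C'B') selects the paired records.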
  ==========================================================================================
  
 Alternatively, similar to the original JCL,  all the parameters may be passed in using Environment-Variables - e.g.
    export SORTIN=in-file1.txt 
    export SORTOUT=out-file1.txt 
    export sortFields=(66,1,A,22,3,D) 
 java -Xmx400m com.softwaremining.jcl.Sort
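
The same environment-variable style of invocation can be prepared from Java with the standard ProcessBuilder API. The sketch below is only an assumption about how a caller might wire it up: the class name SortEnvLauncherSketch is hypothetical, and the child JVM still needs the SoftwareMining libraries on its classpath.

```java
import java.util.Map;

// Hypothetical launcher, for illustration only.
class SortEnvLauncherSketch {
    // Builds (but does not start) a child JVM whose environment carries the sort parameters.
    static ProcessBuilder build(String in, String out, String sortFields) {
        ProcessBuilder pb = new ProcessBuilder("java", "-Xmx400m", "com.softwaremining.jcl.Sort");
        Map<String, String> env = pb.environment();
        env.put("SORTIN", in);
        env.put("SORTOUT", out);
        env.put("sortFields", sortFields);
        pb.inheritIO(); // forward the child's console output to the parent process
        return pb;
    }
}
```

A caller would then run build("in-file1.txt", "out-file1.txt", "(66,1,A,22,3,D)").start().waitFor() and inspect the exit code.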
   
   
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected static Log
    logger
     
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static void
    execute(String[] inputFileNames, String[] outputFileNames, String xsumFileName, String sortFields, String sumField, String mergrFields, int recordSize, InOutRec inRec, boolean duplicatesAreInOrder, boolean removeDuplicates, boolean useEquals_NOT_USED, boolean saveAsLineSequential, boolean noDetail, boolean removeCC, String omitOut, String[] omitConditions, String[] includeConditions, InOutRec[] outrecs, boolean[] outrecOverlays, String[] outrecBuilds, String[] outrecFindReps, int[] outRecSizes, int skipRec, String[] header1s, String[] header2s, String[] header3s, String[] trailer1s, String[] trailer2s, String[] trailer3s)
    main entry point
    static void
    execute(String[] inputFileNames, String outputFileName, String xsumFileName, String sortFields, String sumField, String outrec, int recordSize, boolean outrecOverlay, boolean duplicatesAreInOrder, boolean removeDuplicates, boolean useEquals_NOT_USED, boolean saveAsLineSequential)
    main entry point
    static int
    execute(String inputFileName, String outputFileName, String sortFields)
    entry point
    static int
    execute(String inputFileName, String outputFileName, String sortFields, boolean duplicatesAreInOrder)
    main entry point
    static void
    main(String[] args)
    main entry point from the command line
    static void
    processOutput(String[] outputFileNames, String[] inputFileNames, int recordSize, int numberOfCrLFChars, OutFileSortController outFileSortController, String xsumFileName, boolean saveAsLineSeq, boolean noDetail, boolean removeCC, boolean useEquals_NOT_USED, ArrayList<List<String>> inputFilesAsList, String[] includeConditions, String omitOut, List<String> xsumData_LinesExcludedFromOutFile, InOutRec[] outrecs, boolean[] outrecOverlays, String[] outrecBuilds, String[] outrecFindReps, int[] outRecSizes, boolean useMemoryStrategy, String[] header1s, String[] header2s, String[] header3s, String[] trailer1s, String[] trailer2s, String[] trailer3s)
    use process(String[] inputFile ...);
    static List<String>
    processOutputFile(OutFileSortController outFileSortController, String xsumFileName, boolean saveAsLineSeq, boolean noDetail, boolean removeCC, boolean useEquals_NOT_USED, List<String> theData, String includeCondition, String omitOut, List<String> xsumData_LinesExcludedFromOutFile, InOutRec outrec, boolean outrecOverlay, String outrecBuild, String outrecFindRep, int outRecSize, String header1, String header2, String header3, String trailer1, String trailer2, String trailer3)
     
    static String
    reportRelevantEnvironmentVars()
    report all the environment variables used in Sort/Copy commands
    (e.g. SORTOFA, SORTOFB, includeA, header1A, trailer2B ...)
    static void
    resetEnvironmentVars()
    reset all the environment variables used in Sort/Copy commands
    (e.g. SORTOFA, SORTOFB, includeA, header1A, trailer2B ...)

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • logger

      protected static transient Log logger
  • Constructor Details

    • Sort

      public Sort()
  • Method Details

    • execute

      public static int execute(String inputFileName, String outputFileName, String sortFields)
      entry point
      Parameters:
      inputFileName - : file to be processed. The system will read "record-size" characters for each record.
      outputFileName - : name of sorted file
      sortFields - : original JCL sort-fields - e.g (66,1,A,22,3,D) (from column 66 by 1 character Ascending. From Col 22 by 3 characters Descending)... Additionally the sortFields can be defined as: (66,1,CH,A,22,3,ZD,D,40,5,BI)
      - supported format qualifiers are: CH (Character), ZD (Zone Decimal - Numeric Display), BI (Binary COMPUTATIONAL) and PD (packed decimal - COMP-3)
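
As a way to read the sortFields notation, the following sketch turns a (start,length,order) list into a java.util.Comparator. It is not the library's parser: the class name is hypothetical, it handles only the three-item form without format qualifiers, and it compares every key as plain characters (so ZD, BI and PD semantics, and EBCDIC collation, are out of scope).

```java
import java.util.Comparator;

// Hypothetical illustration of the sortFields notation; not the library's parser.
class SortFieldsSketch {
    // Extracts a 1-based fixed-width field; short records are padded with blanks.
    static String field(String rec, int start, int len) {
        StringBuilder sb = new StringBuilder(rec);
        while (sb.length() < start - 1 + len) sb.append(' ');
        return sb.substring(start - 1, start - 1 + len);
    }

    // Turns "(start,len,A|D,...)" into a comparator that compares each key as characters.
    static Comparator<String> comparatorFor(String sortFields) {
        String[] p = sortFields.replace("(", "").replace(")", "").trim().split("\\s*,\\s*");
        if (p.length % 3 != 0) {
            throw new IllegalArgumentException("expected start,length,order triples");
        }
        Comparator<String> cmp = null;
        for (int i = 0; i < p.length; i += 3) {
            final int start = Integer.parseInt(p[i]);
            final int len = Integer.parseInt(p[i + 1]);
            Comparator<String> key = Comparator.comparing((String r) -> field(r, start, len));
            if (p[i + 2].equalsIgnoreCase("D")) key = key.reversed(); // D = descending
            cmp = (cmp == null) ? key : cmp.thenComparing(key);
        }
        return cmp;
    }
}
```

With sortFields "(1,1,A,2,1,D)", the records B2, A1, A3 sort to A3, A1, B2: column 1 ascending first, then column 2 descending.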
    • execute

      public static int execute(String inputFileName, String outputFileName, String sortFields, boolean duplicatesAreInOrder)
      main entry point
      Parameters:
      inputFileName - : file to be processed. The system will read one line (CR or CR-LF terminated) for each record.
      outputFileName - : name of sorted file
      sortFields - : original JCL sort-fields - e.g (66,1,A,22,3,D) (from column 66 by 1 character Ascending. From Col 22 by 3 characters Descending)... Additionally the sortFields can be defined as: (66,1,CH,A,22,3,ZD,D,40,5,BI)
      - supported format qualifiers are: CH (Character), ZD (Zone Decimal - Numeric Display), BI (Binary COMPUTATIONAL) and PD (packed decimal - COMP-3)
      duplicatesAreInOrder - when two records have equal sort keys, which one should go first? If set to true, the existing order is kept
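
The "keeps existing order" behaviour is what a stable sort provides. The sketch below is illustrative only, with a hypothetical class name; it relies on the documented guarantee that java.util.List.sort is stable, so records with equal keys keep their input order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of order preservation for equal keys.
class StableSortSketch {
    // Sorts on the first character only; ties keep their original relative order.
    static List<String> sortByFirstChar(List<String> records) {
        List<String> copy = new ArrayList<>(records);
        copy.sort(Comparator.comparing((String r) -> r.substring(0, 1)));
        return copy;
    }
}
```

Sorting B-first, A-first, B-second, A-second this way gives A-first, A-second, B-first, B-second.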
    • execute

      public static void execute(String[] inputFileNames, String outputFileName, String xsumFileName, String sortFields, String sumField, String outrec, int recordSize, boolean outrecOverlay, boolean duplicatesAreInOrder, boolean removeDuplicates, boolean useEquals_NOT_USED, boolean saveAsLineSequential)
      main entry point
      Parameters:
      inputFileNames - : file to be processed. The system will read "record-size" characters for each record.
      outputFileName - : name of sorted file
      sortFields - : original JCL sort-fields - e.g (66,1,A,22,3,D) (from column 66 by 1 character Ascending. From Col 22 by 3 characters Descending)... Additionally the sortFields can be defined as: (66,1,CH,A,22,3,ZD,D,40,5,BI)
      - supported format qualifiers are: CH (Character), ZD (Zone Decimal - Numeric Display), BI (Binary COMPUTATIONAL) and PD (packed decimal - COMP-3)
      sumField - : standard mainframe format, e.g sumFields=(40,15,ZD,60,15,ZD,75,8,ZD,99,15,ZD,114,15,ZD)
      outrec - : standard mainframe format, e.g. outRec=(9,6,11:6,3,16:15,9,ZD,EDIT=(IIIIIIIITS),SIGNS=(,,,-),131:C' ')
      recordSize - : Sometimes a record-line may contain CR/LF in the middle of the data. With this parameter the system reads "record-size" characters for each record, so the entire record is read whether or not it has a CR in the middle.
      outrecOverlay -
      duplicatesAreInOrder - when two records have equal sort keys, which one should go first? If set to true, the existing order is kept
      removeDuplicates - : true or false (default: false)
      saveAsLineSequential - : true or false (default: false)
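
The effect of recordSize can be sketched as a reader that takes a fixed number of characters per record instead of splitting on line ends. This is an assumption about the behaviour described above, not the library's code: the class name is hypothetical, and it assumes each full record may be followed by one CR, LF or CR-LF terminator.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of fixed-size record reading.
class FixedRecordReaderSketch {
    // Reads records of recordSize characters; CR/LF inside a record is kept as data.
    static List<String> readRecords(Reader source, int recordSize) {
        if (recordSize <= 0) throw new IllegalArgumentException("recordSize must be positive");
        List<String> records = new ArrayList<>();
        PushbackReader in = new PushbackReader(source, 1);
        char[] buf = new char[recordSize];
        try {
            while (true) {
                int filled = 0;
                while (filled < recordSize) {
                    int n = in.read(buf, filled, recordSize - filled);
                    if (n < 0) break; // end of input
                    filled += n;
                }
                if (filled == 0) break;
                records.add(new String(buf, 0, filled));
                if (filled < recordSize) break; // short final record
                // Skip one terminator (CR, LF or CR-LF) after the record, if present.
                int c = in.read();
                if (c == '\r') c = in.read();
                if (c != '\n' && c != -1) in.unread(c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

With recordSize 5, the input "AB\nCD\r\nEFGHI\nJK" is read as three records: "AB\nCD" (the embedded line feed stays in the data), "EFGHI" and the short final record "JK".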
    • execute

      public static void execute(String[] inputFileNames, String[] outputFileNames, String xsumFileName, String sortFields, String sumField, String mergrFields, int recordSize, InOutRec inRec, boolean duplicatesAreInOrder, boolean removeDuplicates, boolean useEquals_NOT_USED, boolean saveAsLineSequential, boolean noDetail, boolean removeCC, String omitOut, String[] omitConditions, String[] includeConditions, InOutRec[] outrecs, boolean[] outrecOverlays, String[] outrecBuilds, String[] outrecFindReps, int[] outRecSizes, int skipRec, String[] header1s, String[] header2s, String[] header3s, String[] trailer1s, String[] trailer2s, String[] trailer3s) throws Exception
      main entry point
      Parameters:
      inputFileNames - : file to be processed. The system will read "record-size" characters for each record.
      outputFileNames - : name of sorted file
      sortFields - : original JCL sort-fields - e.g (66,1,A,22,3,D) (from column 66 by 1 character Ascending. From Col 22 by 3 characters Descending)... Additionally the sortFields can be defined as: (66,1,CH,A,22,3,ZD,D,40,5,BI)
      - supported format qualifiers are: CH (Character), ZD (Zone Decimal - Numeric Display), BI (Binary COMPUTATIONAL) and PD (packed decimal - COMP-3)
      recordSize - : Sometimes a record-line may contain CR/LF in the middle of the data. With this parameter the system reads "record-size" characters for each record, so the entire record is read whether or not it has a CR in the middle.
      duplicatesAreInOrder - when two records have equal sort keys, which one should go first? If set to true, the existing order is kept
      Throws:
      Exception
    • main

      public static void main(String[] args)
      main entry point from the command line. It can be called from the command line / JVM by:
         java -Xmx400m com.softwaremining.jcl.Sort C:/IN.txt C:/OUT.txt sortFields=(00424,5,A,00178,5,A,00183,30,A,114,32,A)  true 12345 true
           where first 3 options represent: input-file, output-file, sort-Fields
                 the final 3 optional parameters represent [duplicatesAreInOrder] [recordsize] [removeduplicates] 
                
          or 
           java com.softwaremining.jcl.Sort [parameters] (note - this option requires the SoftwareMining extra libraries)
         where parameters is a combination of:
            -corectprop=new-corect.property file
            in=input-file
            out=output-file
            xsumFilename= XSUM File 
            
            sortFields= standard mainframe format, i.e. the original JCL sort-fields, e.g. sortFields=(66,1,A,22,3,D) (from column 66 for 1 character ascending, then from column 22 for 3 characters descending)
            sumFields=  standard mainframe format, e.g  sumFields=(40,15,ZD,60,15,ZD,75,8,ZD,99,15,ZD,114,15,ZD)
            recordSize=number, e.g. recordSize=10 - Sometimes a record-line may contain CR/LF in the middle of the data. With this parameter the system reads "record-size" characters for each record, so the entire record is read whether or not it has a CR in the middle.
            dupinorder=true or false (default: true)
            removeDuplicates=true or false (default: false)
            equals=true or false (default: false). This option is currently always set to "false" (a true value can be implemented on request).
            saveAsLineSequential=true or false (default: true). Saves as RECORD-Sequential when false.
            convert=true or false (default: true). Converts the output to fixed length file (same as saveAsLineSequential=true)
            
            inRec=standard mainframe format, e.g. inRec=(9,6,11:6,3,16:15,9,ZD,EDIT=(IIIIIIIITS),SIGNS=(,,,-),131:C' ')
            outRec= standard mainframe format, e.g. outRec=(9,6,11:6,3,16:15,9,ZD,EDIT=(IIIIIIIITS),SIGNS=(,,,-),131:C' ')
            outRecSize= number - size of output record e.g. outRecSize=120
            overlay= OUTREC Overlay defined in standard mainframe format, e.g. overlay=(1:1,230,231:C'VV')
            include= standard mainframe format, e.g. include=(1,10,CH,NE,11,10,CH,AND,21,10,CH,NE,C'          ')
            omit= standard mainframe format, e.g. omit=(256,10,EQ,C'0000000000',OR,256,10,EQ,C'          ')
            noDetail=true or false (default: false). Tells DFSORT not to write the data lines, so if TRAILER1 is defined, the output contains only TRAILER1 and no other records.
            
            sections= standard mainframe format, e.g. sections='(33,3,SKIP=3L,TRAILER3=(1:'BRAND TOTAL ',40:TOTAL=(22,8,ZD,EDIT=(III,III,IITS),SIGNS=(,,,-))))'
            header1=  standard mainframe format, e.g. header1=('MONTHLY USAGE REPORT')
            header2=standard mainframe format,   e.g. header2=('MONTHLY USAGE REPORT')
            header3=standard mainframe format,   e.g. header3=('MONTHLY USAGE REPORT')
            trailer1=standard mainframe format,  e.g. trailer1=('END MONTHLY USAGE REPORT')
            trailer2=standard mainframe format,  e.g. trailer2=('END MONTHLY USAGE REPORT')
            trailer3=standard mainframe format,  e.g. trailer3=('END MONTHLY USAGE REPORT')
      
            removecc=true or false (default: false) - tells DFSORT to remove the ANSI carriage control characters. 
               Without REMOVECC, a record in SORTOUT may look like this for an input file 
                  1bbbbbbbb27
               when REMOVECC is set to true, the initial "1" is removed to produce 
                  bbbbbbbb27
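
The removecc example above amounts to dropping the first character of each record. A minimal sketch, with a hypothetical class name and assuming the carriage control character always occupies column 1:

```java
// Hypothetical illustration of REMOVECC; not the library's implementation.
class RemoveCcSketch {
    // Drops the first character (the ANSI carriage control position) of a record.
    static String removeCc(String record) {
        return record.isEmpty() ? record : record.substring(1);
    }
}
```

Applied to the record 1bbbbbbbb27, this returns bbbbbbbb27.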
                  
                  
             IF/When Support 
            Translation of INREC IFTHEN : e.g. IFTHEN=(WHEN=(01,01,CH,EQ,C'H'),OVERLAY=(17001:C'1'))
               IN_IF=(01,01,CH,EQ,C'H')
               IN_THEN=\"OVERLAY=(17001:C'1'))\"  // must contain surrounding quotation marks
               
            please use a numeric suffix starting from 1 when multiple IFTHEN are involved - e.g.
               IN_IF_1=(01,01,CH,EQ,C'H')
               IN_THEN_1=\"OVERLAY=(17001:C'1'))\"  // must contain surrounding quotation marks
               IN_IF_2=INIT
               IN_THEN_2=...
               
             IF/When Support 
            Translation of OUTREC IFTHEN : e.g. IFTHEN=(WHEN=(01,01,CH,EQ,C'H'),OVERLAY=(17001:C'1'))
               OUT_IF=(01,01,CH,EQ,C'H')
               OUT_THEN=\"OVERLAY=(17001:C'1'))\"  // must contain surrounding quotation marks
               
            please use a numeric suffix starting from 1 when multiple IFTHEN are involved - e.g.
               OUT_IF_1=(01,01,CH,EQ,C'H')
               OUT_THEN_1=\"OVERLAY=(17001:C'1'))\"  // must contain surrounding quotation marks
               OUT_IF_2=NONE
               OUT_THEN_2=...
      
             Join Support 
            Translation of JOIN e.g. 
            
              JOINKEYS FILE=F1,FIELDS=(1,10,A),SORTED,NOSEQCK
              JOINKEYS FILE=F2,FIELDS=(7,10,A),SORTED,NOSEQCK
              JOIN UNPAIRED,F1,F2
              REFORMAT FIELDS=(F1:1,14,F2:1,20,?)
               
           the above JCL functionality can be reproduced by:
              
                 SORTOFA=F1ONLY
                 buildA="(1,14)"
                 includeA="(35,1,CH,EQ,C'1')"
                 SORTOFB=F2ONLY
                 buildB="(15,20)"
                 includeB="(35,1,CH,EQ,C'2')"
                 SORTOFC=BOTH
                 buildC="(1,14,/,15,20)"
                 includeC="(35,1,CH,EQ,C'B')"
                 JOINKEYS_F1="F1=SORTJNF1,FIELDS=(1,10,A),SORTED,NOSEQCK"
                 JOINKEYS_F2="F2=SORTJNF2,FIELDS=(7,10,A),SORTED,NOSEQCK"
                 JOIN=UNPAIRED,F1,F2
                 JOIN_OPTION=COPY
                 JOIN_REFORMAT_FIELDS="(F1:1,14,F2:1,20,?)"
                         
         
    • processOutput

      public static void processOutput(String[] outputFileNames, String[] inputFileNames, int recordSize, int numberOfCrLFChars, OutFileSortController outFileSortController, String xsumFileName, boolean saveAsLineSeq, boolean noDetail, boolean removeCC, boolean useEquals_NOT_USED, ArrayList<List<String>> inputFilesAsList, String[] includeConditions, String omitOut, List<String> xsumData_LinesExcludedFromOutFile, InOutRec[] outrecs, boolean[] outrecOverlays, String[] outrecBuilds, String[] outrecFindReps, int[] outRecSizes, boolean useMemoryStrategy, String[] header1s, String[] header2s, String[] header3s, String[] trailer1s, String[] trailer2s, String[] trailer3s) throws Exception
      use process(String[] inputFile ...);
      Throws:
      Exception
    • processOutputFile

      public static List<String> processOutputFile(OutFileSortController outFileSortController, String xsumFileName, boolean saveAsLineSeq, boolean noDetail, boolean removeCC, boolean useEquals_NOT_USED, List<String> theData, String includeCondition, String omitOut, List<String> xsumData_LinesExcludedFromOutFile, InOutRec outrec, boolean outrecOverlay, String outrecBuild, String outrecFindRep, int outRecSize, String header1, String header2, String header3, String trailer1, String trailer2, String trailer3) throws Exception
      Throws:
      Exception
    • resetEnvironmentVars

      public static void resetEnvironmentVars()
      reset all the environment variables used in Sort/Copy commands
      (e.g. SORTOFA, SORTOFB, includeA, header1A, trailer2B ...)
    • reportRelevantEnvironmentVars

      public static String reportRelevantEnvironmentVars()
      report all the environment variables used in Sort/Copy commands
      (e.g. SORTOFA, SORTOFB, includeA, header1A, trailer2B ...)