Error while extracting data from a text file using Java


I'm using Java to extract data from a text file. The program records the current iteration number whenever "Iteration X Starts" appears in the file (and likewise the phase number for "Phase X Starts") and keeps each of them as a cursor. With the iteration and phase cursors set, it looks for a "-" or "--" token, scans ahead until "Ends" is reached, and takes the words in between as the algorithm name. For example, for "--Test Cake Algorithm X Ends" the algorithm name is "Test Cake Algorithm X". It then keeps scanning the same line for "Elapsed Time (ms)", "Evaluations" and "Improvements" and stores the values in parallel lists. If a value is missing it should be stored as 0, so that the same index in every list always refers to the same algorithm. If the same algorithm name is captured again within the same phase and iteration, the three values are accumulated and that algorithm is output only once for that iteration and phase.
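The accumulation rule in the last sentence above can be sketched on its own, separately from the parsing. This is only an illustration under assumptions, not the program that follows: the class `AccumulateSketch`, the nested `Totals` type and the `"|"`-separated key are hypothetical names and choices, and the sketch uses one map entry per (iteration, phase, name) instead of parallel lists.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AccumulateSketch {
    /** Running totals for one algorithm within one iteration and phase. */
    static final class Totals {
        int elapsedMs;
        int evaluations;
        int improvements;
    }

    // LinkedHashMap keeps insertion order, so entries come out in the
    // order in which each (iteration, phase, name) first appeared.
    private final Map<String, Totals> totals = new LinkedHashMap<>();

    /** Hypothetical key format; assumes "|" never occurs in an algorithm name. */
    static String key(int iteration, int phase, String name) {
        return iteration + "|" + phase + "|" + name;
    }

    /** Adds one record; values absent from the log line are passed as 0. */
    void record(int iteration, int phase, String name,
                int elapsedMs, int evaluations, int improvements) {
        Totals t = totals.computeIfAbsent(key(iteration, phase, name), k -> new Totals());
        t.elapsedMs += elapsedMs;
        t.evaluations += evaluations;
        t.improvements += improvements;
    }

    Totals get(int iteration, int phase, String name) {
        return totals.get(key(iteration, phase, name));
    }

    int size() {
        return totals.size();
    }

    public static void main(String[] args) {
        AccumulateSketch acc = new AccumulateSketch();
        acc.record(0, 1, "Greedy Algorithm", 21, 0, 0);
        acc.record(0, 1, "Greedy Algorithm", 4, 23, 0); // same iteration and phase: merged
        acc.record(1, 1, "Greedy Algorithm", 5, 0, 0);  // new iteration: separate entry
        Totals first = acc.get(0, 1, "Greedy Algorithm");
        System.out.println(acc.size() + " entries, first elapsed = " + first.elapsedMs);
        // prints "2 entries, first elapsed = 25"
    }
}
```

Because the three totals live in one object, they cannot drift out of step with each other the way three separate lists can.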

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;

public class Main {
    public static void main(String[] args) {
        String filePath = "/home/user/file/src/Input.txt";

        int iterationcursor = 0;
        int phasecursor = 0; // iterationcursor and phasecursor are updated when "Iteration x Starts" and "Phase x Starts" are read

        ArrayList<Integer> IterationParalleltoData = new ArrayList<>(); // e.g {0, 0, 0, 1}
        ArrayList<Integer> PhaseParalleltoData = new ArrayList<>(); // e.g    {0, 1, 1, 2}
        ArrayList<String> algorithmName = new ArrayList<>(); //           {EjA, TstA, TstB}
        ArrayList<Integer> algoElapsedTime = new ArrayList<>(); //        {2|2|4, 5|3|1, 5|3|2} data will be stored in parallel arrays
        ArrayList<Integer> algoEvaluations = new ArrayList<>();
        ArrayList<Integer> algoImprovments = new ArrayList<>(); //
        ArrayList<String> algorithmNameTemp = new ArrayList<>();  // this will be dumped into the correct testsubject data


        try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
            ArrayList<String> bufferLine = new ArrayList<>(); // will store the line words / content in a array so we can iterate through them
            String line;
            while ((line = br.readLine()) != null) {
                // Split the line into individual words
                String[] words = line.split("\\s+"); // Split on runs of whitespace

                // Add each word to the buffer array
                for (int wordIndex = 0 ; wordIndex < words.length; wordIndex++) {
                    if (words[wordIndex].equals("Iteration")) { // Case we encounter "Iteration x Starts"
                        try {
                            if (words[wordIndex+2].equals("Starts")) {
                                if (wordIndex + 1 < words.length) {
                                    iterationcursor = Integer.parseInt(words[wordIndex + 1]);
                                }
                            }
                        } catch (IndexOutOfBoundsException e) {
                            // Skip if "Iteration" was the last thing on the line, or no "Starts" follows the number
                        } catch (NumberFormatException e) {
                            // Skip if the word after "Iteration" is not a number, e.g. "Iteration is something"
                        }
                    }

                    if (words[wordIndex].equals("Phase")){ // Case we encounter "Phase x: X Y Z Starts" since we can have X Y Z A B C ... Starts, we scan until Starts on the same line
                        try {
                            for (int restofLine = wordIndex ; restofLine < words.length; restofLine++){
                                if (words[restofLine].equals("Starts")){
                                    if (wordIndex + 1 < words.length) {
                                        String[] phaseRaw = words[wordIndex + 1].split(":"); // "Phase 1: X Y Z Starts" '1:' needs to be stripped to only 1
                                        phasecursor = Integer.parseInt(phaseRaw[0]);
                                    }
                                }
                            }
                            System.out.println(phasecursor);
                        } catch (IndexOutOfBoundsException e) {
                            // Skip if "Phase" was the last thing on the line and nothing follows it
                        } catch (NumberFormatException e) {
                            // Skip if the word after "Phase" is not a number, e.g. "Phase is something"
                        }

                    }

                    if (words[wordIndex].equals("-") || words[wordIndex].equals("--")){ // Case we encounter - or -- to capture Algorithm data

                        for (int restofLine = wordIndex + 1; restofLine < words.length; restofLine++){
                            algorithmNameTemp.add(words[restofLine]);
                            if (words[restofLine].equals("Ends")){
                                algorithmNameTemp.remove(algorithmNameTemp.size()-1); // remove Ends from algorithm name
                                String nameBufferBlock = ""; // join the words in the temp name list into a single string to store in algorithmName
                                for (String namecaptured : algorithmNameTemp){
                                    if (nameBufferBlock.equals("")) {
                                        nameBufferBlock = namecaptured;
                                    } else {
                                        nameBufferBlock += " " + namecaptured;
                                    }
                                }
                                boolean redundantAlgorithm = false;
                                for (String iterationHistory : algorithmName) {
                                    if (nameBufferBlock.equals(iterationHistory) &&
                                            iterationcursor == IterationParalleltoData.get(algorithmName.indexOf(nameBufferBlock)) &&
                                            phasecursor == PhaseParalleltoData.get(algorithmName.indexOf(nameBufferBlock))) {
                                        redundantAlgorithm = true;
                                        break;
                                    }
                                }
                                if (!redundantAlgorithm) {
                                    algorithmName.add(nameBufferBlock);
                                    IterationParalleltoData.add(iterationcursor);
                                    PhaseParalleltoData.add(phasecursor);
                                }
                                boolean elapsedExists = false;
                                boolean evaluationsExists = false;
                                boolean improvementsExists = false;
                                for (int lineAfterEnd = restofLine + 1; lineAfterEnd < words.length; lineAfterEnd++) {
                                    if (words[lineAfterEnd].equals("Elapsed")) {
                                        if (!redundantAlgorithm) {
                                            if (lineAfterEnd + 3 < words.length) {
                                                algoElapsedTime.add(Integer.parseInt(words[lineAfterEnd + 3]));
                                                elapsedExists = true;
                                            }

                                        } else {
                                            int index = 0;
                                            for (String iterationHistory : algorithmName) {
                                                //int index = algorithmName.indexOf(nameBufferBlock);
                                                if (nameBufferBlock.equals(iterationHistory) &&
                                                        iterationcursor == IterationParalleltoData.get(index) &&
                                                        phasecursor == PhaseParalleltoData.get(index)) {
                                                    if (lineAfterEnd + 3 < words.length) {
                                                        int newValue = Integer.parseInt(words[lineAfterEnd + 3]) + algoElapsedTime.get(index);
                                                        algoElapsedTime.set(index, newValue);
                                                    }
                                                    break;
                                                }
                                                index += 1;
                                            }
                                        }
                                    } else if (words[lineAfterEnd].equals("Evaluations:")) {
                                        if (!redundantAlgorithm) {
                                            if (lineAfterEnd + 1 < words.length) {
                                                algoEvaluations.add(Integer.parseInt(words[lineAfterEnd + 1]));
                                                evaluationsExists = true;
                                            }
                                        } else {
                                            int index = 0;
                                            for (String iterationHistory : algorithmName) {
                                                //int index = algorithmName.indexOf(nameBufferBlock);
                                                if (nameBufferBlock.equals(iterationHistory) &&
                                                        iterationcursor == IterationParalleltoData.get(index) &&
                                                        phasecursor == PhaseParalleltoData.get(index)) {
                                                    if (lineAfterEnd + 1 < words.length) {
                                                        int newValue = Integer.parseInt(words[lineAfterEnd + 1]) + algoEvaluations.get(index);
                                                        algoEvaluations.set(index, newValue);
                                                    }
                                                    break;
                                                }
                                                index += 1;
                                            }
                                        }
                                    } else if (words[lineAfterEnd].equals("Improvements:")) {
                                        if (!redundantAlgorithm) {
                                            if (lineAfterEnd + 1 < words.length) {
                                                algoImprovments.add(Integer.parseInt(words[lineAfterEnd + 1]));
                                                improvementsExists = true;
                                            }
                                        } else {
                                            int index = 0;
                                            for (String iterationHistory : algorithmName) {
                                                //int index = algorithmName.indexOf(nameBufferBlock);
                                                if (nameBufferBlock.equals(iterationHistory) &&
                                                        iterationcursor == IterationParalleltoData.get(index) &&
                                                        phasecursor == PhaseParalleltoData.get(index)) {
                                                    if (lineAfterEnd + 1 < words.length) {
                                                        int newValue = Integer.parseInt(words[lineAfterEnd + 1]) + algoImprovments.get(index);
                                                        algoImprovments.set(index, newValue);
                                                    }
                                                    break;
                                                }
                                                index += 1;
                                            }
                                        }
                                    }

                                }
                                if (!elapsedExists && !redundantAlgorithm){
                                    algoElapsedTime.add(0);
                                } else if (!evaluationsExists && !redundantAlgorithm){
                                    algoEvaluations.add(0);
                                } else if (!improvementsExists && !redundantAlgorithm) {
                                    algoImprovments.add(0);
                                }



                            }

                        }
                        algorithmNameTemp.clear();


                    }

                }

            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println(IterationParalleltoData.toString());
        System.out.println(PhaseParalleltoData.toString());
        System.out.println(algorithmName.toString());
        System.out.println(algoElapsedTime.toString());
        System.out.println(algoEvaluations.toString());
        System.out.println(algoImprovments.toString());

        String outputPath = "/file/path/to/Output.txt";
        try (PrintWriter writer = new PrintWriter(new FileWriter(outputPath))) {
            writer.println("Iteration,Phase,Algorithm,Elapsed Time (ms),Evaluations,Improvements");
            for (int i = 0; i < algorithmName.size(); i++) {
                writer.println(IterationParalleltoData.get(i) + "," +
                        PhaseParalleltoData.get(i) + "," +
                        algorithmName.get(i) + "," +
                        algoElapsedTime.get(i) + "," +
                        algoEvaluations.get(i) + "," +
                        algoImprovments.get(i));
            }
            writer.println("-1,-1,All," +
                    algoElapsedTime.stream().mapToInt(Integer::intValue).sum() + "," +
                    algoEvaluations.stream().mapToInt(Integer::intValue).sum() + "," +
                    algoImprovments.stream().mapToInt(Integer::intValue).sum());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
} 

However, my code fails with an error I cannot pinpoint when the Improvements value is not present in a record.
For example, this works:

- Ejection Algorithm Ends       Time 12: 30:53          Elapsed Time (ms): 4                Evaluations: 23                 Improvements: 0

This does not work:

- Greedy Algorithm Ends         Time 12: 30:43          Elapsed Time (ms): 21
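
For reference, here is a minimal sketch of how the two sample lines above would be read under the rule described at the top of this post, with every missing field defaulting to 0. It is an illustration rather than the program above: the class `EndsLineSketch`, its method names and the regular expressions are assumptions based only on these two sample lines.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EndsLineSketch {
    // Hypothetical patterns, inferred from the sample lines in this post.
    private static final Pattern NAME =
            Pattern.compile("^\\s*--?\\s*(.+?)\\s+Ends\\b");
    private static final Pattern ELAPSED =
            Pattern.compile("Elapsed Time\\s*\\(ms\\):\\s*(\\d+)");
    private static final Pattern EVALUATIONS =
            Pattern.compile("Evaluations:\\s*(\\d+)");
    private static final Pattern IMPROVEMENTS =
            Pattern.compile("Improvements:\\s*(\\d+)");

    /** Returns the number captured by the pattern, or 0 if the field is absent. */
    static int valueOrZero(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? Integer.parseInt(m.group(1)) : 0;
    }

    /** Returns the algorithm name, or null if the line is not an "Ends" record. */
    static String algorithmName(String line) {
        Matcher m = NAME.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    /** Returns {elapsed, evaluations, improvements}; always three values. */
    static int[] metrics(String line) {
        return new int[] {
            valueOrZero(ELAPSED, line),
            valueOrZero(EVALUATIONS, line),
            valueOrZero(IMPROVEMENTS, line)
        };
    }

    public static void main(String[] args) {
        String full = "- Ejection Algorithm Ends       Time 12: 30:53          "
                + "Elapsed Time (ms): 4                Evaluations: 23                 Improvements: 0";
        String partial = "- Greedy Algorithm Ends         Time 12: 30:43          Elapsed Time (ms): 21";

        System.out.println(algorithmName(full) + " " + Arrays.toString(metrics(full)));
        // prints "Ejection Algorithm [4, 23, 0]"
        System.out.println(algorithmName(partial) + " " + Arrays.toString(metrics(partial)));
        // prints "Greedy Algorithm [21, 0, 0]"
    }
}
```

Whatever the input, `metrics` returns exactly three values, so each record contributes one entry to each column.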

I get an error at line 209, in the part that writes the results to the output file. With a large test input file, I also get an error at this line:

if (lineAfterEnd + 1 < words.length) {
    int newValue = Integer.parseInt(words[lineAfterEnd + 1]) + algoImprovments.get(index);
    algoImprovments.set(index, newValue);
}

The error I get with large test input files:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 4 out of bounds for length 4
    at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
    at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
    at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
    at java.base/java.util.Objects.checkIndex(Objects.java:365)
    at java.base/java.util.ArrayList.get(ArrayList.java:428)
    at Main2.main(Main2.java:158)

And the error I get with the short example shown above:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0
    at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
    at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
    at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
    at java.base/java.util.Objects.checkIndex(Objects.java:365)
    at java.base/java.util.ArrayList.get(ArrayList.java:428)
    at Main2.main(Main2.java:209)

I don't understand why the first two values work whether or not they are present, while the last one causes problems when it is missing. I designed the logic for the first value, "Elapsed", then copied it for the other two and adapted it to the format of the text file.
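
One thing that stands out in the code as posted: the step that pads missing values with 0 is an `if / else if / else if` chain, so at most one of the three lists receives a 0 per record. For the "Greedy Algorithm" line, where both Evaluations and Improvements are absent, only the evaluations list would be padded, which would leave the improvements list one entry short. That is consistent with both stack traces, though it may not be the only issue. The sketch below isolates that step; `PaddingSketch`, `padChained` and `padIndependent` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PaddingSketch {
    /** Mirrors the chained padding in the posted code: at most one list gets a 0. */
    static void padChained(boolean elapsedExists, boolean evaluationsExists, boolean improvementsExists,
                           List<Integer> elapsed, List<Integer> evaluations, List<Integer> improvements) {
        if (!elapsedExists) {
            elapsed.add(0);
        } else if (!evaluationsExists) {
            evaluations.add(0);
        } else if (!improvementsExists) {
            improvements.add(0);
        }
    }

    /** Independent checks: every missing value gets its own 0. */
    static void padIndependent(boolean elapsedExists, boolean evaluationsExists, boolean improvementsExists,
                               List<Integer> elapsed, List<Integer> evaluations, List<Integer> improvements) {
        if (!elapsedExists) {
            elapsed.add(0);
        }
        if (!evaluationsExists) {
            evaluations.add(0);
        }
        if (!improvementsExists) {
            improvements.add(0);
        }
    }

    public static void main(String[] args) {
        // The "Greedy Algorithm" record: Elapsed is present (21), the other two are absent.
        List<Integer> e1 = new ArrayList<>();
        List<Integer> v1 = new ArrayList<>();
        List<Integer> i1 = new ArrayList<>();
        e1.add(21);
        padChained(true, false, false, e1, v1, i1);
        System.out.println(e1.size() + " " + v1.size() + " " + i1.size()); // prints "1 1 0"

        List<Integer> e2 = new ArrayList<>();
        List<Integer> v2 = new ArrayList<>();
        List<Integer> i2 = new ArrayList<>();
        e2.add(21);
        padIndependent(true, false, false, e2, v2, i2);
        System.out.println(e2.size() + " " + v2.size() + " " + i2.size()); // prints "1 1 1"
    }
}
```

With the chained version the lists end at sizes 1, 1 and 0, so reading index 0 of the improvements list fails with "Index 0 out of bounds for length 0". Once the lists have different lengths, the accumulation branch can fail the same way on a later record.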

Here is an example of a large test input file:
https://pastebin.com/AxarzxL5
