java.lang.OutOfMemoryError with the PDFBox library in Java when creating a large PDF with a large number of annotations


I'm using Apache PDFBox to create a PDF that contains a huge number of links. The document is supposed to contain a big table (1M rows), and some of the table's columns may be of type link.
I noticed that when I don't use links, memory usage is reasonable (a couple of hundred MB).

When using links, even 3 GB of heap memory isn't enough.

Is there a way to use less memory?

Please find below a unit test that reproduces the problem:

`import java.io.IOException;
import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.interactive.action.PDActionURI;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;
import org.junit.Test;

public class TestTable {


   @Test public void main1() throws IOException {
      try (PDDocument document = new PDDocument(MemoryUsageSetting.setupTempFileOnly())) {
         PDPage page = new PDPage(PDRectangle.A4);
         document.addPage(page);
         PDPageContentStream contentStream = new PDPageContentStream(document, page);
         float margin = 40;
         float yStart = page.getMediaBox().getHeight() - margin;
         float tableWidth = page.getMediaBox().getWidth() - 2 * margin;
         float yPosition = yStart;
         int rows = 1000000;
         float rowHeight = 20f;
         float cellMargin = 5f;
         float[] columnWidths = {50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f};
         int cols = columnWidths.length;

         drawTableHeader(contentStream, margin, yPosition, tableWidth, rowHeight, cellMargin, columnWidths);

         for (int i = 0; i < rows; i++) {
            if (isPageFull(page, yPosition)) {
               contentStream.close();
               yPosition = yStart;
               page = new PDPage(PDRectangle.A4);
               document.addPage(page);
               contentStream = new PDPageContentStream(document, page);
               drawTableHeader(contentStream, margin, yPosition, tableWidth, rowHeight, cellMargin, columnWidths);
            }

            yPosition -= rowHeight;
            drawTableRow(contentStream, margin, yPosition, tableWidth, rowHeight, cellMargin, columnWidths, page);

         }
         contentStream.close();
         document.save("tableBOX.pdf");
      } catch (IOException e) {
         e.printStackTrace();
      }
   }

   private static void drawTableHeader(PDPageContentStream contentStream, float xStart, float yStart, float tableWidth, float rowHeight, float cellMargin, float[] columnWidths) throws IOException {
      contentStream.setNonStrokingColor(150, 150, 150);
      contentStream.addRect(xStart, yStart, tableWidth, rowHeight);
      contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);

      // Set the text color to black
      contentStream.setNonStrokingColor(0, 0, 0);

      float yPosition = yStart + (rowHeight / 2);
      for (int i = 0; i < columnWidths.length; i++) {
         String headerText = "Column " + (i + 1);
         float width = columnWidths[i];
         float xPosition = xStart + (width / 2) - (headerText.length() / 2 * 4);
         contentStream.beginText();
         contentStream.newLineAtOffset(xPosition, yPosition);
         contentStream.showText(headerText);
         contentStream.endText();

         xStart += width;
      }
   }

   private static void drawTableRow(PDPageContentStream pDPageContentStream, float xStart, float yStart, float tableWidth, float rowHeight, float cellMargin, float[] columnWidths, PDPage page) throws IOException {
      pDPageContentStream.setNonStrokingColor(255, 255, 255);
      pDPageContentStream.addRect(xStart, yStart, tableWidth, rowHeight);
      pDPageContentStream.setFont(PDType1Font.TIMES_ROMAN, cellMargin);
      pDPageContentStream.setNonStrokingColor(0, 0, 0);

      float yPosition = yStart + (rowHeight / 2);
      for (int i = 0; i < columnWidths.length; i++) {
         String cellText = "Row " + (i + 1);
         float width = columnWidths[i];
         float xPosition = xStart + cellMargin;

         if (i % 2 == 0) {
            PDAnnotationLink txtLink = new PDAnnotationLink();
            PDRectangle position = new PDRectangle(xPosition, yPosition, 50, 20);
            PDActionURI action = new PDActionURI();
            action.setURI("www.google.com");
            txtLink.setAction(action);
            txtLink.setRectangle(position);
            page.getAnnotations().add(txtLink);
            page.setAnnotations(null);

         } else {
            pDPageContentStream.beginText();
            pDPageContentStream.newLineAtOffset(xPosition, yPosition);
            pDPageContentStream.showText(cellText);
            pDPageContentStream.endText();
         }
         xStart += width;
      }
   }

   private static boolean isPageFull(PDPage page, float yPosition) throws IOException {
      float threshold = 700;
      float remainingSpace = page.getMediaBox().getHeight() - yPosition;
      return remainingSpace > threshold;
   }
}`



I expect memory usage to stay low while I keep adding links. Is there a way to achieve this?
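
For a sense of scale, here is a rough, stdlib-only sketch (not PDFBox code) that replays the loop arithmetic of the test above and counts how many pages it starts and how many `PDAnnotationLink` objects it constructs. The class name `LinkCountEstimate`, the method `simulate`, and the constant `A4_HEIGHT_PT` are made up for illustration; the A4 height of about 841.89 pt is an assumption standing in for `PDRectangle.A4.getHeight()`.

```java
// Hypothetical helper, not part of PDFBox: it only mirrors the counting logic of the unit test.
class LinkCountEstimate {

    // Assumed A4 page height in points (297 mm), standing in for PDRectangle.A4.getHeight().
    static final float A4_HEIGHT_PT = 841.8898f;

    /** Returns {pages started, link annotations constructed} for the given table size. */
    static long[] simulate(int rows, int cols) {
        float margin = 40f;      // same margin as in the test
        float rowHeight = 20f;   // same row height as in the test
        float yStart = A4_HEIGHT_PT - margin;
        float yPosition = yStart;
        long pages = 1;          // the test adds one page before the loop
        long links = 0;

        for (int i = 0; i < rows; i++) {
            // Same condition as isPageFull(): more than 700 pt consumed from the top.
            if (A4_HEIGHT_PT - yPosition > 700f) {
                yPosition = yStart;
                pages++;
            }
            yPosition -= rowHeight;

            // drawTableRow() builds a link for every even column index.
            for (int c = 0; c < cols; c++) {
                if (c % 2 == 0) {
                    links++;
                }
            }
        }
        return new long[] {pages, links};
    }

    public static void main(String[] args) {
        long[] result = simulate(1_000_000, 15);
        System.out.println("pages started:     " + result[0]);
        System.out.println("links constructed: " + result[1]);
    }
}
```

With the parameters from the test (1,000,000 rows, 15 columns), this reports 29,412 pages and 8,000,000 link annotations, i.e. 34 rows and 272 links per full page. Each of those links also carries its own rectangle and URI action object, so the number of objects involved is a multiple of that figure.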

anna dev, new contributor