I'm using Apache PDFBox to create a PDF that contains a huge number of links. The document is supposed to contain a big table (1M rows), and some of the table's columns may be of type link.
I noticed that when I don't use links, memory usage is reasonable (a couple of hundred MB).
When using links, even 3 GB of heap memory isn't enough.
Is there a way to use less memory?
Please find below a unit test to reproduce the problem:
```java
import java.io.IOException;
import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.interactive.action.PDActionURI;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;
import org.junit.Test;

public class TestTable {

    @Test
    public void main1() throws IOException {
        try (PDDocument document = new PDDocument(MemoryUsageSetting.setupTempFileOnly())) {
            PDPage page = new PDPage(PDRectangle.A4);
            document.addPage(page);
            PDPageContentStream contentStream = new PDPageContentStream(document, page);
            float margin = 40;
            float yStart = page.getMediaBox().getHeight() - margin;
            float tableWidth = page.getMediaBox().getWidth() - 2 * margin;
            float yPosition = yStart;
            int rows = 1000000;
            float rowHeight = 20f;
            float cellMargin = 5f;
            float[] columnWidths = {50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f, 50f};
            int cols = columnWidths.length;
            drawTableHeader(contentStream, margin, yPosition, tableWidth, rowHeight, cellMargin, columnWidths);
            for (int i = 0; i < rows; i++) {
                if (isPageFull(page, yPosition)) {
                    contentStream.close();
                    yPosition = yStart;
                    page = new PDPage(PDRectangle.A4);
                    document.addPage(page);
                    contentStream = new PDPageContentStream(document, page);
                    drawTableHeader(contentStream, margin, yPosition, tableWidth, rowHeight, cellMargin, columnWidths);
                }
                yPosition -= rowHeight;
                drawTableRow(contentStream, margin, yPosition, tableWidth, rowHeight, cellMargin, columnWidths, page);
            }
            contentStream.close();
            document.save("tableBOX.pdf");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void drawTableHeader(PDPageContentStream contentStream, float xStart, float yStart, float tableWidth, float rowHeight, float cellMargin, float[] columnWidths) throws IOException {
        contentStream.setNonStrokingColor(150, 150, 150);
        contentStream.addRect(xStart, yStart, tableWidth, rowHeight);
        contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);
        // Set the text color to black
        contentStream.setNonStrokingColor(0, 0, 0);
        float yPosition = yStart + (rowHeight / 2);
        for (int i = 0; i < columnWidths.length; i++) {
            String headerText = "Column " + (i + 1);
            float width = columnWidths[i];
            float xPosition = xStart + (width / 2) - (headerText.length() / 2 * 4);
            contentStream.beginText();
            contentStream.newLineAtOffset(xPosition, yPosition);
            contentStream.showText(headerText);
            contentStream.endText();
            xStart += width;
        }
    }

    private static void drawTableRow(PDPageContentStream pDPageContentStream, float xStart, float yStart, float tableWidth, float rowHeight, float cellMargin, float[] columnWidths, PDPage page) throws IOException {
        pDPageContentStream.setNonStrokingColor(255, 255, 255);
        pDPageContentStream.addRect(xStart, yStart, tableWidth, rowHeight);
        pDPageContentStream.setFont(PDType1Font.TIMES_ROMAN, cellMargin);
        pDPageContentStream.setNonStrokingColor(0, 0, 0);
        float yPosition = yStart + (rowHeight / 2);
        for (int i = 0; i < columnWidths.length; i++) {
            String cellText = "Row " + (i + 1);
            float width = columnWidths[i];
            float xPosition = xStart + cellMargin;
            if (i % 2 == 0) {
                PDAnnotationLink txtLink = new PDAnnotationLink();
                PDRectangle position = new PDRectangle(xPosition, yPosition, 50, 20);
                PDActionURI action = new PDActionURI();
                action.setURI("www.google.com");
                txtLink.setAction(action);
                txtLink.setRectangle(position);
                page.getAnnotations().add(txtLink);
                page.setAnnotations(null);
            } else {
                pDPageContentStream.beginText();
                pDPageContentStream.newLineAtOffset(xPosition, yPosition);
                pDPageContentStream.showText(cellText);
                pDPageContentStream.endText();
            }
            xStart += width;
        }
    }

    private static boolean isPageFull(PDPage page, float yPosition) throws IOException {
        float threshold = 700;
        float remainingSpace = page.getMediaBox().getHeight() - yPosition;
        return remainingSpace > threshold;
    }
}
```
I expect to use less memory while I keep adding links. Is there a way to achieve this?
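For scale, here is a rough back-of-the-envelope count of how many link annotations the test above creates, and how many bytes of heap each one could occupy before a 3 GB heap is exhausted. It only uses plain Java arithmetic; the class and method names (`LinkCountEstimate`, `linkColumns`, `totalLinks`, `heapBudgetPerLink`) are made up for illustration, and it assumes "3 GB" means 3 × 1024³ bytes and that the whole heap is available to annotations, which it is not in practice.

```java
public class LinkCountEstimate {

    // The test adds a link when (i % 2 == 0), i.e. for column indexes 0, 2, 4, ...
    static int linkColumns(int cols) {
        return (cols + 1) / 2;
    }

    // One PDAnnotationLink (plus its PDActionURI and PDRectangle) is created per link cell.
    static long totalLinks(long rows, int cols) {
        return rows * linkColumns(cols);
    }

    // Upper bound on bytes per link if the entire heap were spent on link objects.
    // ASSUMPTION: this ignores everything else held on the heap.
    static long heapBudgetPerLink(long heapBytes, long links) {
        return heapBytes / links;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;                   // "rows" in the test
        int cols = 15;                            // columnWidths.length in the test
        long heapBytes = 3L * 1024 * 1024 * 1024; // assumed meaning of "3 GB"

        long links = totalLinks(rows, cols);
        System.out.println("link annotations created: " + links);
        System.out.println("heap budget per link (bytes): " + heapBudgetPerLink(heapBytes, links));
    }
}
```

So the test creates 8,000,000 link annotations, which leaves at most roughly 400 bytes of heap for each annotation together with its action and rectangle objects.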