Code works locally but gets positional indexers are out-of-bounds on AWS Lambda

  Kiến thức lập trình

I’m running a code to extract data from a Gsheets Sheet and loading it into a Postgres database. The code works perfectly with other sheets of the same workbook as AWS Docker and also when I run it locally on my computer. However, when I push the code via a Docker Container to AWS Lambda I always get the error “positional indexers are out-of-bounds”.
Does someone has an idea where that comes from?

Here’s my code (locally, on the AWS the same just as lamda_handler function)

from pathlib import Path
import pandas as pd
import gspread as gs
from datetime import datetime
from gspread_dataframe import get_as_dataframe
from upload_functions import push_df
import numpy as np

gsheets_name = "xxx"
gsheets_token = Path("xxx")

# Initialisierung des Google Sheets Clients mit der Service-Account-Credentials-Datei
gc = gs.service_account(filename=gsheets_token)

# Öffnen des spezifischen Google Sheets
sheet = gc.open(gsheets_name)

# Zugriff auf das Arbeitsblatt "xxx"
staffing_worksheet = sheet.worksheet("xxx")
staffing_df = get_as_dataframe(staffing_worksheet, evaluate_formulas=True, header=None, skiprows=95)
staffing_df = staffing_df[staffing_df[1].notna()]

# Extraktion der Spaltennamen und Datumsinformationen
columns_A_to_E = staffing_worksheet.row_values(95)[:5]
date_row = staffing_worksheet.row_values(2)
columns_dates = pd.to_datetime(date_row[6:], format='%d.%m.%Y', errors='coerce')  # Angepasstes Format
current_week, current_year = pd.Timestamp.now().isocalendar()[1], pd.Timestamp.now().isocalendar()[0]

# Filtern der Spalten basierend auf aktuellen oder zukünftigen Daten
valid_columns = [(date.isocalendar()[1] >= current_week and date.isocalendar()[0] == current_year) or date.isocalendar()[0] > current_year for date in columns_dates]
columns = columns_A_to_E + [date_row[i+6] for i, valid in enumerate(valid_columns) if valid]
staffing_df = staffing_df.iloc[:, np.r_[0:5, 6 + np.flatnonzero(valid_columns)]]
staffing_df.columns = columns

# Hinzufügen eines Timestamps
staffing_df['pulled_timestamp'] = datetime.now()

# Bereinigung und Transponierung des DataFrames für "xxx"
pivot_list_staffing = []
for index, row in staffing_df.iterrows():
    for i, valid in enumerate(valid_columns):
        if valid:
            # Sicherstellen, dass das Datum im korrekten String-Format für den Zugriff verwendet wird
            column_date = columns_dates[i].strftime('%d.%m.%Y')
            # Erstellen einer Serie für jeden gültigen Eintrag und Hinzufügen zur Pivot-Liste
            pivot_list_staffing.append(pd.Series(
                [row[dim] for dim in columns_A_to_E] + [column_date, row[column_date], row['pulled_timestamp']],
                index=list(columns_A_to_E) + ['day', 'days', 'pulled_timestamp']
            ))

pivot_df_staffing = pd.DataFrame(pivot_list_staffing)
pivot_df_staffing = pivot_df_staffing.dropna(subset=['days'])

pivot_df_staffing.columns = ['ID', 'user', 'project_name', 'project_type', 'sum', 'day', 'days', 'pulled_timestamp']
pivot_df_staffing = pivot_df_staffing.drop(columns=['sum'])

pivot_df_staffing['day'] = pd.to_datetime(pivot_df_staffing['day'], errors='coerce', dayfirst=True)

# Daten in Datenbank pushen
push_df(pivot_df_staffing, 'staffing_mc')

I’ve tested that “valid_columns” is not empty and “staffing_df” is not empty.

This is the error I see on CloudWatch

LAMBDA_WARNING: Unhandled exception. The most likely cause is an issue
in the function code. However, in rare cases, a Lambda runtime update 
can cause unexpected function behavior. For functions using managed 
runtimes, runtime updates can be triggered by a function change, or 
can be applied automatically. To determine if the runtime has been 
updated, check the runtime version in the INIT_START log entry. If 
this error correlates with a change in the runtime version, you may be 
able to mitigate this error by temporarily rolling back to the previous 
runtime version. For more information, see 
https://docs.aws.amazon.com/lambda/latest/dg/runtimes-update.html
[ERROR] IndexError: positional indexers are out-of-bounds
Traceback (most recent call last):
  File "/var/task/main.py", line 37, in lambda_handler
    staffing_df = staffing_df.iloc[:, np.r_[0:5, 6 + np.flatnonzero(valid_columns)]]
  File "/var/lang/lib/python3.11/site-packages/pandas/core/indexing.py", line 1184, in __getitem__
    return self._getitem_tuple(key)
  File "/var/lang/lib/python3.11/site-packages/pandas/core/indexing.py", line 1690, in _getitem_tuple
    tup = self._validate_tuple_indexer(tup)
  File "/var/lang/lib/python3.11/site-packages/pandas/core/indexing.py", line 966, in _validate_tuple_indexer
    self._validate_key(k, i)
  File "/var/lang/lib/python3.11/site-packages/pandas/core/indexing.py", line 1612, in _validate_key
    raise IndexError("positional indexers are out-of-bounds")

3

Theme wordpress giá rẻ Theme wordpress giá rẻ Thiết kế website Kho Theme wordpress Kho Theme WP Theme WP

LEAVE A COMMENT