I’m running a code to extract data from a Gsheets Sheet and loading it into a Postgres database. The code works perfectly with other sheets of the same workbook as AWS Docker and also when I run it locally on my computer. However, when I push the code via a Docker Container to AWS Lambda I always get the error “positional indexers are out-of-bounds”.
Does someone has an idea where that comes from?
Here’s my code (locally, on the AWS the same just as lamda_handler function)
from pathlib import Path
import pandas as pd
import gspread as gs
from datetime import datetime
from gspread_dataframe import get_as_dataframe
from upload_functions import push_df
import numpy as np
gsheets_name = "xxx"
gsheets_token = Path("xxx")
# Initialisierung des Google Sheets Clients mit der Service-Account-Credentials-Datei
gc = gs.service_account(filename=gsheets_token)
# Öffnen des spezifischen Google Sheets
sheet = gc.open(gsheets_name)
# Zugriff auf das Arbeitsblatt "xxx"
staffing_worksheet = sheet.worksheet("xxx")
staffing_df = get_as_dataframe(staffing_worksheet, evaluate_formulas=True, header=None, skiprows=95)
staffing_df = staffing_df[staffing_df[1].notna()]
# Extraktion der Spaltennamen und Datumsinformationen
columns_A_to_E = staffing_worksheet.row_values(95)[:5]
date_row = staffing_worksheet.row_values(2)
columns_dates = pd.to_datetime(date_row[6:], format='%d.%m.%Y', errors='coerce') # Angepasstes Format
current_week, current_year = pd.Timestamp.now().isocalendar()[1], pd.Timestamp.now().isocalendar()[0]
# Filtern der Spalten basierend auf aktuellen oder zukünftigen Daten
valid_columns = [(date.isocalendar()[1] >= current_week and date.isocalendar()[0] == current_year) or date.isocalendar()[0] > current_year for date in columns_dates]
columns = columns_A_to_E + [date_row[i+6] for i, valid in enumerate(valid_columns) if valid]
staffing_df = staffing_df.iloc[:, np.r_[0:5, 6 + np.flatnonzero(valid_columns)]]
staffing_df.columns = columns
# Hinzufügen eines Timestamps
staffing_df['pulled_timestamp'] = datetime.now()
# Bereinigung und Transponierung des DataFrames für "xxx"
pivot_list_staffing = []
for index, row in staffing_df.iterrows():
for i, valid in enumerate(valid_columns):
if valid:
# Sicherstellen, dass das Datum im korrekten String-Format für den Zugriff verwendet wird
column_date = columns_dates[i].strftime('%d.%m.%Y')
# Erstellen einer Serie für jeden gültigen Eintrag und Hinzufügen zur Pivot-Liste
pivot_list_staffing.append(pd.Series(
[row[dim] for dim in columns_A_to_E] + [column_date, row[column_date], row['pulled_timestamp']],
index=list(columns_A_to_E) + ['day', 'days', 'pulled_timestamp']
))
pivot_df_staffing = pd.DataFrame(pivot_list_staffing)
pivot_df_staffing = pivot_df_staffing.dropna(subset=['days'])
pivot_df_staffing.columns = ['ID', 'user', 'project_name', 'project_type', 'sum', 'day', 'days', 'pulled_timestamp']
pivot_df_staffing = pivot_df_staffing.drop(columns=['sum'])
pivot_df_staffing['day'] = pd.to_datetime(pivot_df_staffing['day'], errors='coerce', dayfirst=True)
# Daten in Datenbank pushen
push_df(pivot_df_staffing, 'staffing_mc')
I’ve tested that “valid_columns” is not empty and “staffing_df” is not empty.
This is the error I see on CloudWatch
LAMBDA_WARNING: Unhandled exception. The most likely cause is an issue
in the function code. However, in rare cases, a Lambda runtime update
can cause unexpected function behavior. For functions using managed
runtimes, runtime updates can be triggered by a function change, or
can be applied automatically. To determine if the runtime has been
updated, check the runtime version in the INIT_START log entry. If
this error correlates with a change in the runtime version, you may be
able to mitigate this error by temporarily rolling back to the previous
runtime version. For more information, see
https://docs.aws.amazon.com/lambda/latest/dg/runtimes-update.html
[ERROR] IndexError: positional indexers are out-of-bounds
Traceback (most recent call last):
File "/var/task/main.py", line 37, in lambda_handler
staffing_df = staffing_df.iloc[:, np.r_[0:5, 6 + np.flatnonzero(valid_columns)]]
File "/var/lang/lib/python3.11/site-packages/pandas/core/indexing.py", line 1184, in __getitem__
return self._getitem_tuple(key)
File "/var/lang/lib/python3.11/site-packages/pandas/core/indexing.py", line 1690, in _getitem_tuple
tup = self._validate_tuple_indexer(tup)
File "/var/lang/lib/python3.11/site-packages/pandas/core/indexing.py", line 966, in _validate_tuple_indexer
self._validate_key(k, i)
File "/var/lang/lib/python3.11/site-packages/pandas/core/indexing.py", line 1612, in _validate_key
raise IndexError("positional indexers are out-of-bounds")
3