Dictionary input in dataframe column


I’m trying to read a CSV file with multiple columns into a dataframe. One of the columns is named ‘answer’, and its values are strings that look like Python dictionaries.

Values include:

{'number': '2', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': []}

and

{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ['4-yard', '31-yard']}

When printed to the console, the value of the ‘answer’ column for one particular row is displayed as

'{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ["Barlow's 1-yard touchdown run", '2-yard touchdown run', 'by rookie RB Joseph Addai.']}'

In the CSV file, it looks like

{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ["Barlow's 1-yard touchdown run", '2-yard touchdown run', 'by rookie RB Joseph Addai.']}

After replacing the single quotes with double quotes, which I expected to produce a valid JSON string, it is printed to the console as

'{"number": "", "date": {"day": "", "month": "", "year": ""}, "spans": ["Barlow"s 1-yard touchdown run", "2-yard touchdown run", "by rookie RB Joseph Addai."]}'

To convert the strings to dictionaries, I tried using the json.loads(string) method.

Here is what I did:

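import json

for i in range(df.shape[0]):
    answer = df.iloc[i]['answer']
    # Replace single quotes with double quotes, intending to produce valid JSON
    answer = answer.replace("'", '"')
    answer = json.loads(answer)  # raises JSONDecodeError for the third example
    ans[i] = answer['number']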

The following error appears for the third example given above, but not the other two:

JSONDecodeError: Expecting ',' delimiter: line 1 column 80 (char 79)

It fails to convert the string into a dictionary for reasons unknown to me.
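The likely cause can be reproduced in isolation. The sketch below uses the third example string as it appears in the CSV; the variable names `raw`, `replaced`, and `error` are only illustrative. The blanket `replace` also turns the apostrophe in "Barlow's" into a double quote, which ends the JSON string early, so the parser sees a stray `s` where it expects a comma.

```python
import json

# The cell text exactly as it appears in the CSV (third example above).
raw = """{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ["Barlow's 1-yard touchdown run", '2-yard touchdown run', 'by rookie RB Joseph Addai.']}"""

# Blanket replacement: this also rewrites the apostrophe in "Barlow's".
replaced = raw.replace("'", '"')

try:
    json.loads(replaced)
    error = None
except json.JSONDecodeError as exc:
    error = exc

# Show the parser's complaint and the text just before the failing position.
print(error)
print(replaced[error.pos - 8 : error.pos + 1])
```

The slice printed at the end covers the characters leading up to the reported position, which is where the string `"Barlow"` closes prematurely.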

What can I do to rectify this error?
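One possible fix, assuming every cell holds a Python literal like the examples above: the text is Python dictionary syntax rather than JSON, so it can be parsed with the standard library's `ast.literal_eval` without any quote replacement. This is a minimal sketch on the problematic string, not a tested solution for the whole file; `raw` and `parsed` are illustrative names.

```python
import ast

# The problematic cell text, unchanged (no quote replacement needed).
raw = """{'number': '', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': ["Barlow's 1-yard touchdown run", '2-yard touchdown run', 'by rookie RB Joseph Addai.']}"""

# literal_eval only accepts Python literals (dicts, lists, strings, numbers, ...),
# so it does not execute arbitrary code the way eval() would.
parsed = ast.literal_eval(raw)

print(type(parsed))
print(parsed['number'])
print(parsed['spans'][0])
```

Because the mixed single and double quotes are already valid Python, the apostrophe in "Barlow's" survives intact.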

Is there any method to read the ‘answer’ column as a dictionary, instead of having to read it as a string and then convert said string to a dictionary?
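As for parsing the column while reading, `pandas.read_csv` accepts a `converters` mapping from column name to a function applied to each cell. The sketch below builds a small in-memory CSV as a stand-in for the real file (the rows and the names `rows`, `buffer`, and `numbers` are assumptions for illustration) and passes `ast.literal_eval` as the converter for ‘answer’.

```python
import ast
import io

import pandas as pd

# Stand-in data: two dictionaries shaped like the examples in the question.
rows = [
    {'number': '2', 'date': {'day': '', 'month': '', 'year': ''}, 'spans': []},
    {'number': '', 'date': {'day': '', 'month': '', 'year': ''},
     'spans': ["Barlow's 1-yard touchdown run", '2-yard touchdown run']},
]

# Write them out as strings, the way they would be stored in the CSV.
buffer = io.StringIO(pd.DataFrame({'answer': [str(r) for r in rows]}).to_csv(index=False))

# Parse the 'answer' column into dictionaries while reading.
df = pd.read_csv(buffer, converters={'answer': ast.literal_eval})

# Each cell is now a dict, so fields can be accessed directly.
numbers = df['answer'].apply(lambda d: d['number'])
print(numbers.tolist())
```

One caveat: `ast.literal_eval` raises an exception on an empty or malformed cell, so a real file with missing values would need a small wrapper function that handles those cases.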
