Is it correct to remove the escape character (backslash) when decoding bytes?

I have the following small dataframe that I am passing through a decoder that decodes some byte values in the dataframe:

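import pandas as pd
import numpy as np

# Raw byte values; b'\x86' is a single non-ASCII byte
data = {'trade_qualifier': [b'\x86', b' ', b' ', b'\x02']}

df = pd.DataFrame(data)

def decode_bytes(df):
    col = 'trade_qualifier'
    # Cast bytes -> NumPy str -> Python object; this is the line that raises
    df[col] = df[col].values.astype(np.str_).astype('O')

    return df

decode_bytes(df)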

The above breaks with:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x86 in position 0: ordinal not in range(128)

I can simply remove the backslashes in the bytes so that b'\x86' -> b'x86', for example.

But I am not sure if this is the correct thing to do?
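To make the difference concrete, here is a minimal sketch in plain Python comparing the two literals. The variable names are made up for illustration, and the choice of latin-1 is only an assumption: the real encoding of the data is not stated anywhere in this post.

```python
raw = b'\x86'       # one byte whose value is 0x86 (134)
stripped = b'x86'   # three ASCII bytes: 'x', '8', '6'

print(len(raw), list(raw))            # 1 [134]
print(len(stripped), list(stripped))  # 3 [120, 56, 54]

# The stripped version decodes as ASCII, but it is now different data
print(stripped.decode('ascii'))       # x86

# Assumption: if the bytes were latin-1, every byte maps to one character
decoded = raw.decode('latin-1')
print(repr(decoded))                  # '\x86'
```

So removing the backslash does not decode the byte; it replaces one byte with three unrelated characters, which is probably not what is wanted.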

I suppose a better question may be, what does this line do?:

df[col] = df[col].values.astype(np.str_).astype('O')

My understanding is that this line changes the type of each value in the column named by col to np.str_. Is that correct?
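As far as I can tell, the line can be broken into its two casts. The following is a rough sketch using NumPy only; the array here stands in for df[col].values, and the variable names are hypothetical. It assumes NumPy decodes bytes objects as ASCII during the cast, which is what the error message above suggests.

```python
import numpy as np

# Stand-in for df[col].values: an object array holding bytes
values = np.array([b' ', b'\x02'], dtype=object)

as_str = values.astype(np.str_)  # fixed-width unicode array (dtype kind 'U')
as_obj = as_str.astype('O')      # object array holding ordinary Python str

print(as_str.dtype.kind, type(as_obj[0]))

# A byte outside the ASCII range (>= 0x80) cannot go through the first cast
try:
    np.array([b'\x86'], dtype=object).astype(np.str_)
    failed = False
except UnicodeDecodeError:
    failed = True

print(failed)
```

If that reading is right, the values do end up as strings, but only when every byte is valid ASCII; anything else would need an explicit decode with a known encoding.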

I apologise if this is a very basic question; I am still new to working with bytes.
