Not able to format data so that 0.5 shows as 0.50


I am trying to format the data in a column. The values are monetary, but I need to match them against another dataset, so I need to format them before they go into Power BI.

I am trying to take the absolute value of each number and then convert it to a string with 2 decimal places.

Any idea where I am going wrong with this?

import pandas as pd

def extract_data(filename, sheet_name):
    # DataFrame containing data extracted from Excel sheet
    df = pd.read_excel(filename, sheet_name=sheet_name, skiprows=[0])

    # Drop rows and columns with all NaN values
    df.dropna(how='all', axis=0, inplace=True)
    df.dropna(how='all', axis=1, inplace=True)

    # Format columns based on their data types
    for col in df.columns:
        if df[col].dtype == 'object':  # Handle text or categorical data
            df[col] = df[col].astype(str).str.strip()  # Remove leading/trailing whitespace
        elif df[col].dtype == 'float64':  # Format numeric data
            # Format specific columns based on required use case
            if col == "Total Cost":  # Selecting column based on header value
                # Absolute value as a 2-decimal string; missing values become ""
                df[col] = df[col].apply(lambda x: '{:.2f}'.format(abs(x)) if pd.notnull(x) else "")

    return df

The last section, "# Format specific columns based on required use case", is the part that currently returns 0.5 as "0.5" rather than "0.50" as desired.
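For reference, the format expression on its own does pad to two decimals, which suggests the lambda itself may not be the culprit. The sketch below isolates it in a hypothetical helper, `format_cost` (not part of the original code), using only the standard library. If it behaves as expected, it may be worth checking whether the branch is reached at all, that is, whether the column's dtype is really float64 and the header is exactly "Total Cost".

```python
import math

def format_cost(x):
    """Hypothetical helper: absolute value as a 2-decimal string, "" for missing values."""
    # Treat None and NaN as missing, roughly mirroring pd.notnull for scalars
    if x is None or (isinstance(x, float) and math.isnan(x)):
        return ""
    return '{:.2f}'.format(abs(x))

print(format_cost(0.5))                  # 0.50
print(format_cost(-3))                   # 3.00
print(repr(format_cost(float('nan'))))   # ''
```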


Thanks for the responses. It looks like there was a formatting issue with pandas processing numbers that had too many decimal places (or something similar). I managed to fix this by applying a round function before any of the other column formatting. However, this caused rounding issues because Python's default is round-half-to-even, so I have since changed it to round half away from zero (ROUND_HALF_UP in the decimal module).
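To illustrate the difference between the two rounding modes, here is a small sketch using only the standard library. The helper names `round_half_even` and `round_half_up` are made up for this example, and the inputs are passed as strings so that the values are exact decimals.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal('0.01')

def round_half_even(value):
    # Ties go to the nearest even digit (the behaviour of Python's built-in round)
    return Decimal(value).quantize(CENT, rounding=ROUND_HALF_EVEN)

def round_half_up(value):
    # Ties go away from zero
    return Decimal(value).quantize(CENT, rounding=ROUND_HALF_UP)

print(round(2.5))                 # 2, because the built-in rounds ties to even
print(round_half_even('0.125'))   # 0.12
print(round_half_up('0.125'))     # 0.13
print(round_half_up('0.5'))       # 0.50, trailing zero is kept
```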

You have to import:

from decimal import Decimal, ROUND_HALF_UP

Then the decimal fix looks like this:

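for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        # Go through str() so the Decimal holds the printed value (e.g. 2.675), not the float's binary expansion
        df[col] = df[col].apply(lambda x: Decimal(str(abs(x))).quantize(Decimal('0.01'), rounding=ROUND_HALF_UP))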
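One caveat when building a Decimal from a float: `Decimal(x)` captures the float's exact binary value, which can sit just below the tie, so ROUND_HALF_UP may still round down. Converting through `str()` first is one common workaround. The sketch below uses a hypothetical helper, `to_money`, with an illustrative `via_str` switch to compare both paths.

```python
from decimal import Decimal, ROUND_HALF_UP

def to_money(x, via_str=True):
    """Hypothetical helper: absolute value rounded half-up to 2 decimals."""
    # via_str=True feeds Decimal the printed value; False feeds it the raw float
    source = str(abs(x)) if via_str else abs(x)
    return Decimal(source).quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)

print(Decimal(2.675))                   # the exact binary value, slightly below 2.675
print(to_money(2.675, via_str=False))   # 2.67
print(to_money(2.675))                  # 2.68
print(str(to_money(0.5)))               # 0.50
```

If the column has to match string values in another dataset, calling `str()` on the result gives the text form with the trailing zero preserved.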

The whole code for anybody who needs it in the future:

import pandas as pd
from decimal import Decimal, ROUND_HALF_UP
import os

#DEF: Extract Data
def extract_data(filename, sheet_name):

    df = pd.read_excel(filename, sheet_name=sheet_name, skiprows=[0])

    #Drop rows and columns with all NaN values
    df.dropna(how='all', axis=0, inplace=True)
    df.dropna(how='all', axis=1, inplace=True)

    #Set all numeric columns to 2 decimals (amend 0.01 to change decimal places)
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            #abs() drops the sign, so negative values come out positive;
            #str() makes Decimal see the printed value rather than the float's binary expansion
            df[col] = df[col].apply(lambda x: Decimal(str(abs(x))).quantize(Decimal('0.01'), rounding=ROUND_HALF_UP))

    #To apply it to a specific column instead, replace the loop above with
    #(change 'Column1' to the real header):
    #df['Column1'] = df['Column1'].apply(lambda x: Decimal(str(abs(x))).quantize(Decimal('0.01'), rounding=ROUND_HALF_UP))

    return df


#DEF: File location
username = os.path.basename(os.path.expanduser("~"))
filename = os.path.join(r'C:\Users', username, 'Documents', 'Book1.xlsx')
sheet_name = "Sheet1"

#PROCESS: Extract and Print/Save File
data = extract_data(filename, sheet_name)

print(data)

#TESTING: Export to desktop - uncomment the below and comment 'print' statement above
#output_file_path = os.path.join(r'C:\Users', username, 'Desktop', 'test1.csv')
#data.to_csv(output_file_path, index=False)
