How to change column type in Python pandas?

To change a column's type in pandas, use the astype() method or one of the pd.to_numeric(), pd.to_datetime(), and pd.to_timedelta() functions, depending on the target type. Here’s an overview of each approach:

  1. Using astype() method:
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'col1': ['1', '2', '3'],
                   'col2': [4.5, 5.6, 6.7]})

# Convert 'col1' to integer type
df['col1'] = df['col1'].astype(int)

# Convert 'col2' to string type
df['col2'] = df['col2'].astype(str)

In this example, we convert the 'col1' column to an integer type using astype(int). Similarly, we convert the 'col2' column to a string type using astype(str).
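When several columns need converting at once, astype() can also be called on the whole DataFrame with a column-to-dtype mapping. The following is a minimal sketch of that variant; the name `converted` and the choice of float32 for 'col2' are illustrative assumptions, not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['1', '2', '3'],
                   'col2': [4.5, 5.6, 6.7]})

# DataFrame.astype() accepts a {column: dtype} mapping and returns
# a new DataFrame; the original df is left unchanged.
converted = df.astype({'col1': 'int64', 'col2': 'float32'})

print(converted.dtypes)
```

Because the result is a new object, it has to be assigned to a name (or back to `df`) for the new types to take effect.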

  2. Using pd.to_numeric() function:
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'col1': ['1', '2', '3'],
                   'col2': [4.5, 5.6, 6.7]})

# Convert 'col1' to integer type
df['col1'] = pd.to_numeric(df['col1'])

# Downcast 'col2' to the smallest float dtype that can hold its values (float32 here)
df['col2'] = pd.to_numeric(df['col2'], downcast='float')

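In this example, we use pd.to_numeric() to convert the 'col1' column from strings to an integer type. By default (errors='raise'), it raises a ValueError when it meets a value it cannot parse; pass errors='coerce' to convert such values to NaN instead. The downcast parameter asks pandas to use the smallest numeric dtype that can hold the values, which can reduce memory usage.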
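The difference between the default error handling and errors='coerce' is easiest to see with a column that contains a bad value. This is a small sketch; the Series `s` and the flag `raised` are made up for illustration.

```python
import pandas as pd

s = pd.Series(['1', '2', 'abc'])

# Default behaviour (errors='raise'): an unparseable value raises ValueError.
try:
    pd.to_numeric(s)
    raised = False
except ValueError:
    raised = True

# errors='coerce' replaces unparseable values with NaN, so the
# result becomes a floating-point column instead of an integer one.
coerced = pd.to_numeric(s, errors='coerce')

print(raised)
print(coerced)
```

Coercing is convenient for messy input, but it silently hides bad rows, so it is usually worth checking `coerced.isna().sum()` afterwards.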

  3. Using pd.to_datetime() function:
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'date': ['2022-01-01', '2022-02-01', '2022-03-01'],
                   'value': [10, 20, 30]})

# Convert 'date' to datetime type
df['date'] = pd.to_datetime(df['date'])

In this example, we use pd.to_datetime() to convert the 'date' column to a datetime type.
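ISO-style strings like the ones above parse without extra hints. For other layouts it is often safer to state the format explicitly. The sketch below assumes day/month/year input and uses a hypothetical Series named `raw`; errors='coerce' is optional and turns unparseable entries into NaT.

```python
import pandas as pd

raw = pd.Series(['01/02/2022', '15/03/2022', 'not a date'])

# format= states the layout explicitly (day/month/year here);
# errors='coerce' turns entries that do not match into NaT.
dates = pd.to_datetime(raw, format='%d/%m/%Y', errors='coerce')

print(dates)
```

Without format=, a string such as '01/02/2022' is ambiguous between 1 February and 2 January, so spelling out the layout avoids surprises.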

  4. Using pd.to_timedelta() function:
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'duration': ['2 days', '5 days', '1 day'],
                   'value': [10, 20, 30]})

# Convert 'duration' to timedelta type
df['duration'] = pd.to_timedelta(df['duration'])

In this example, we use pd.to_timedelta() to convert the 'duration' column to a timedelta type.
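Once a column holds timedelta values, arithmetic and the .dt accessor become available. The sketch below reuses the 'duration' data from the example; the 'days' column, `total`, and `hours` are illustrative names only.

```python
import pandas as pd

df = pd.DataFrame({'duration': ['2 days', '5 days', '1 day']})
df['duration'] = pd.to_timedelta(df['duration'])

# With a timedelta column, the .dt accessor and aggregation work directly.
df['days'] = df['duration'].dt.days
total = df['duration'].sum()

# Numeric input needs a unit to say what the numbers mean ('h' = hours).
hours = pd.to_timedelta(pd.Series([1, 2, 3]), unit='h')

print(df)
print(total)
print(hours)
```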

None of these methods modify the DataFrame in place; each returns a new object, so assign the result back to the column to update its type. Also decide how invalid or missing values should be handled before converting, since astype() raises an error on values it cannot convert.
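One common case is an integer column with missing entries: plain int cannot represent NaN, so astype(int) fails. A possible approach, sketched here with a made-up Series `s`, is to coerce to numeric first and then use pandas' nullable integer dtype 'Int64'.

```python
import pandas as pd

s = pd.Series(['1', None, '3'])

# The missing entry becomes NaN, which forces a float result;
# calling astype(int) on it would raise because int cannot hold NaN.
numeric = pd.to_numeric(s, errors='coerce')

# 'Int64' (capital I) is pandas' nullable integer dtype:
# values stay integers and the missing entry becomes <NA>.
nullable = numeric.astype('Int64')

print(nullable)
```

Whether to keep missing values as <NA>, fill them with a default via fillna(), or drop them with dropna() depends on what the column means in your data.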
