How to SHA256 a Column in Python?


In Python, you can use the standard-library hashlib module to compute the SHA-256 hash of each value in a pandas dataframe column. Here's an example:

import pandas as pd
import hashlib

# Create sample dataframe
df = pd.DataFrame({
    'Name': ['John', 'Jane', 'Mike', 'Sara'],
    'Age': [30, 25, 35, 20],
    'Email': ['john@example.com', 'jane@example.com', 'mike@example.com', 'sara@example.com']
})

# Compute the SHA-256 hash of each value in the Email column
hashes = []
for email in df['Email']:
    email_bytes = email.encode('utf-8')
    hash_object = hashlib.sha256(email_bytes)
    hex_dig = hash_object.hexdigest()
    hashes.append(hex_dig)

# Add hashed column to dataframe
df['Email Hash'] = hashes

# Print the result
print(df)

In this example, we create a sample dataframe df with three columns: Name, Age, and Email. We then compute the SHA-256 hash of each value in the Email column using the hashlib library.

To compute each hash, we first encode the email string as bytes using the encode method with the UTF-8 encoding, because hashlib operates on bytes rather than on str objects. We then pass those bytes to the hashlib.sha256 function to create a hash object, and call its hexdigest method to get the hash as a 64-character hexadecimal string.
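The three steps described above can be wrapped in a small helper. This is a minimal sketch: the function name sha256_hex is a hypothetical name chosen for illustration, not something from hashlib. It also shows what happens if the encoding step is skipped.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal text."""
    data = text.encode("utf-8")        # str -> bytes
    hash_object = hashlib.sha256(data) # bytes -> hash object
    return hash_object.hexdigest()     # hash object -> hex string


digest = sha256_hex("abc")
print(digest)
print(len(digest))  # a SHA-256 hex digest is always 64 characters long

# Passing a str directly, without encoding it first, raises TypeError.
try:
    hashlib.sha256("abc")
    str_accepted = True
except TypeError:
    str_accepted = False
print(str_accepted)
```

The same input always produces the same digest, so the helper can be reused wherever a repeatable fingerprint of a string is needed.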

We store each hash in a list called hashes and then add the list as a new column to the dataframe using the df['Email Hash'] = hashes syntax.
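As an alternative to the explicit loop, the same result can probably be obtained more compactly with the pandas Series.map method. The sketch below is one possible variant, not the only way to do it. The helper name sha256_hex and the sample row with a missing email are assumptions added for illustration; na_action="ignore" tells map to leave missing values alone instead of passing them to the helper.

```python
import hashlib

import pandas as pd


def sha256_hex(text: str) -> str:
    """Hypothetical helper: SHA-256 hex digest of a UTF-8 encoded string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Sample data; the None entry is an illustrative missing value.
df = pd.DataFrame({
    "Name": ["John", "Jane", "Mike"],
    "Email": ["john@example.com", "jane@example.com", None],
})

# Apply the helper to every non-missing value in the column.
df["Email Hash"] = df["Email"].map(sha256_hex, na_action="ignore")

print(df)
```

Without na_action="ignore", the missing value would reach the helper and fail, because it has no encode method.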

Note that this example computes the SHA-256 hash for each email string separately. If you want to compute the hash for the entire column as a single string, you can join the email strings together before computing the hash, like this:

email_str = ''.join(df['Email'])
email_str_bytes = email_str.encode('utf-8')
hash_object = hashlib.sha256(email_str_bytes)
hex_dig = hash_object.hexdigest()

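This will give you a single SHA-256 hash for the entire column. Keep in mind that joining with an empty string discards the boundaries between values, so different columns can produce the same hash: ['ab', 'c'] and ['a', 'bc'] both join to 'abc'. The join also requires every value in the column to be a string.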
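One way to keep value boundaries in a whole-column hash is to feed the values into a single hash object one at a time, followed by a separator. The sketch below works on any iterable of values, including a dataframe column. The function name sha256_of_values and the choice of the ASCII unit separator character are assumptions made for illustration; the approach is only unambiguous as long as the separator never occurs inside a value.

```python
import hashlib


def sha256_of_values(values, sep="\x1f"):
    """Hash a sequence of values into one SHA-256 hex digest.

    Each value is converted to str, encoded as UTF-8, and followed by
    a separator so that value boundaries affect the result.
    """
    hasher = hashlib.sha256()
    sep_bytes = sep.encode("utf-8")
    for value in values:
        hasher.update(str(value).encode("utf-8"))
        hasher.update(sep_bytes)
    return hasher.hexdigest()


# Both lists join to 'abc', so a plain ''.join hash cannot tell them apart.
plain_a = hashlib.sha256("".join(["ab", "c"]).encode("utf-8")).hexdigest()
plain_b = hashlib.sha256("".join(["a", "bc"]).encode("utf-8")).hexdigest()
print(plain_a == plain_b)

# With a separator after each value, the two lists hash differently.
sep_a = sha256_of_values(["ab", "c"])
sep_b = sha256_of_values(["a", "bc"])
print(sep_a == sep_b)
```

Calling update repeatedly is equivalent to hashing the concatenation of all the pieces, so this also avoids building one large string in memory for long columns.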
