Displaying Full Dataframe Information in HTML with Pandas

When working with large datasets in Pandas, it’s often necessary to display the full contents of a dataframe in a human-readable format. One common way to do this is to convert the dataframe to an HTML table with the to_html method. However, by default, Pandas truncates long cell values in that output, and the rendered display in a notebook also shows only a limited number of rows and columns, which can make it difficult to view all the data.

To display full dataframe information in HTML, you need to adjust the display options in Pandas. The most important option is display.max_colwidth, which sets the maximum number of characters shown for a cell. By default, this option is set to 50, which means that any string longer than 50 characters is cut off and ends in an ellipsis (...).

To display full dataframe information, you can set display.max_colwidth to None. This will allow Pandas to display strings of any length without truncating them. Here’s an example:

import pandas as pd

# Create a sample dataframe
df = pd.DataFrame({
    'TEXT': ['This is a very long string that should not be truncated. It contains many words and characters.']
})

# Set the display option to show full column width
pd.set_option('display.max_colwidth', None)

# Convert the dataframe to HTML
html = df.to_html()

print(html)

In this example, the to_html function will generate an HTML table with a single column that contains the long string. Because display.max_colwidth is set to None, the entire string will be displayed without truncation.

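Another important option is display.max_columns, which controls how many columns Pandas shows when it renders a dataframe itself, for example as the output of a notebook cell or in the console. If you have a wide dataframe with many columns, you may need to adjust this option to see all of them. A direct call to to_html() writes every column unless you pass its max_cols argument, so the option matters mainly for rendered output. You can set display.max_columns to None to display all columns: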

pd.set_option('display.max_columns', None)
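If you want to check which columns actually end up in the generated markup, a quick experiment along these lines may be useful. It is a sketch under assumptions: the frame wide and its column names are invented for illustration, and it relies on the max_cols argument of to_html, which limits the number of columns written when it is passed.

```python
import pandas as pd

# Hypothetical wide frame: one row, 30 columns named col_0 ... col_29
wide = pd.DataFrame([list(range(30))], columns=[f'col_{i}' for i in range(30)])

# No max_cols argument: to_html() writes every column
html_all = wide.to_html()

# With max_cols, the output is limited to that many columns
html_cut = wide.to_html(max_cols=4)

# Look for a column from the middle of the frame in each version
print('col_15' in html_all)
print('col_15' in html_cut)
```

The middle column should appear only in the first version, since the second keeps just a few columns from each end.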

It’s worth noting that set_option changes these settings globally, so every dataframe displayed later in the same session is affected, and rendering a very large dataframe in full can be slow and produce a very large HTML document. Therefore, it’s a good idea to reset the options to their default values after displaying the dataframe.
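As a rough illustration of why resetting matters, the snippet below reads the option before and after changing it. The variable names are placeholders, and the snippet assumes a fresh session where the option has not been customized yet.

```python
import pandas as pd

# Value before any change (50 in a fresh session)
default_width = pd.get_option('display.max_colwidth')

# set_option is global: all later output in this session sees the new value
pd.set_option('display.max_colwidth', None)
changed = pd.get_option('display.max_colwidth')

# reset_option puts the Pandas default back
pd.reset_option('display.max_colwidth')
restored = pd.get_option('display.max_colwidth')

print(default_width, changed, restored)
```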

Here’s an example of how you can create a helper function to display full dataframe information without affecting the rest of your code:

def print_full(df):
    # Lift the display limits globally for the duration of this call
    pd.set_option('display.max_rows', None)
    pd.set_option('display.max_columns', None)
    pd.set_option('display.width', 2000)
    pd.set_option('display.float_format', '{:20,.2f}'.format)
    pd.set_option('display.max_colwidth', None)
    try:
        print(df.to_html())
    finally:
        # Runs even if to_html() raises; note that reset_option restores
        # the Pandas defaults, not any custom values set earlier
        pd.reset_option('display.max_rows')
        pd.reset_option('display.max_columns')
        pd.reset_option('display.width')
        pd.reset_option('display.float_format')
        pd.reset_option('display.max_colwidth')

# Usage
df = pd.DataFrame({
    'TEXT': ['This is a very long string that should not be truncated. It contains many words and characters.']
})
print_full(df)

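You can also use the option_context context manager to change display options temporarily: the settings apply only inside the with block, and the previous values are restored when it exits. In a Jupyter notebook, this works well with display: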

with pd.option_context('display.max_colwidth', None):
    display(df)
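The same idea works outside a notebook. The helper below is a minimal sketch of one way to combine option_context with to_html; the name to_full_html is hypothetical and not part of Pandas. Because the context manager restores the previous value on exit, the rest of your code keeps whatever setting it had before.

```python
import pandas as pd

def to_full_html(df):
    """Return df as an HTML table string without shortening cell text.

    Hypothetical helper for illustration, not a Pandas function.
    """
    # The option is lifted only inside the with block
    with pd.option_context('display.max_colwidth', None):
        return df.to_html()

# Sample frame with a 200-character cell
df = pd.DataFrame({'TEXT': ['x' * 200]})

before = pd.get_option('display.max_colwidth')
html = to_full_html(df)
after = pd.get_option('display.max_colwidth')

# The setting seen by the rest of the program is unchanged
print(before == after)
```

Returning the string instead of printing it also makes it easy to write the table to a file or embed it in a larger page.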

By following these tips and examples, you should be able to display full dataframe information in HTML with Pandas.
