Document Loaders
Context Loaders
- class ContextExtractor
Bases: object
Extracts a single column from a CSV file as a pandas Series or DataFrame.
The class provides one method for each return type. It is useful for data processing tasks that need only one column from a larger dataset.
- Method from_csv_as_series:
Reads a CSV file and returns the specified column as a pandas Series.
- Method from_csv_as_df:
Reads a CSV file and returns the specified column as a pandas DataFrame.
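The difference between the two return types can be illustrated with plain pandas. This is a minimal sketch, not the class's actual implementation; the in-memory CSV and the variable names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a file on disk.
csv_text = "id,context\n1,first passage\n2,second passage\n"
df = pd.read_csv(io.StringIO(csv_text))

# Single brackets select one column as a one-dimensional Series.
as_series = df["context"]

# Double brackets select the same column as a one-column DataFrame.
as_frame = df[["context"]]

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

A Series is usually the more convenient form for element-wise work on the values, while a one-column DataFrame keeps the column label and fits APIs that expect tabular input.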
- from_csv_as_df(csv_file_path, column_name='context', encoding='utf-8')
Reads a CSV file and returns the specified column as a pandas DataFrame.
This method extracts a single column from a CSV file and presents it as a pandas DataFrame. It’s particularly useful when only one column of data is needed for analysis or processing.
- Parameters:
csv_file_path (str) – The path to the CSV file.
column_name (str, optional) – The name of the column to extract. Default is “context”.
encoding (str, optional) – The encoding of the CSV file. Default is “utf-8”.
- Returns:
The specified column as a pandas DataFrame.
- Return type:
pandas.DataFrame
- Raises:
FileNotFoundError – If the CSV file does not exist.
ValueError – If the specified column does not exist in the CSV.
Exception – For any other exceptions that may occur.
- Example:
>>> extractor = ContextExtractor()
>>> dataframe = extractor.from_csv_as_df("data.csv", "context")
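A caller might handle the documented exceptions along the following lines. Because the import path of ContextExtractor is not given here, this sketch uses a hypothetical stand-in function, load_column_df, written with plain pandas to mimic the documented contract. It is not the library's code, and safe_load is likewise an illustrative helper.

```python
import os
import tempfile

import pandas as pd


def load_column_df(csv_file_path, column_name="context", encoding="utf-8"):
    """Hypothetical stand-in mimicking the documented from_csv_as_df contract."""
    # pandas.read_csv raises FileNotFoundError itself when the path is missing.
    df = pd.read_csv(csv_file_path, encoding=encoding)
    if column_name not in df.columns:
        raise ValueError(f"Column '{column_name}' not found in {csv_file_path}")
    return df[[column_name]]


def safe_load(csv_file_path, column_name="context"):
    """Return the column as a DataFrame, or None if a documented error occurs."""
    try:
        return load_column_df(csv_file_path, column_name)
    except FileNotFoundError:
        print(f"No such file: {csv_file_path}")
    except ValueError as err:
        print(f"Bad column: {err}")
    return None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("id,context\n1,hello\n")

    ok = safe_load(path, "context")
    missing_column = safe_load(path, "answer")
    missing_file = safe_load(os.path.join(tmp, "absent.csv"))
```

Catching FileNotFoundError and ValueError separately lets the caller distinguish a wrong path from a wrong column name, rather than relying on a broad Exception handler.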
- from_csv_as_series(csv_file_path, column_name='context', encoding='utf-8')
Reads a CSV file and returns the specified column as a pandas Series.
This method is designed to extract a single column from a CSV file and present it as a pandas Series, which can be useful for further data analysis or processing.
- Parameters:
csv_file_path (str) – The path to the CSV file.
column_name (str, optional) – The name of the column to extract. Default is “context”.
encoding (str, optional) – The encoding of the CSV file. Default is “utf-8”.
- Returns:
The specified column as a pandas Series.
- Return type:
pandas.Series
- Raises:
FileNotFoundError – If the CSV file does not exist.
ValueError – If the specified column does not exist in the CSV.
Exception – For any other exceptions that may occur.
- Example:
>>> extractor = ContextExtractor()
>>> series = extractor.from_csv_as_series("data.csv", "context")
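The encoding argument matters when the file was not written as UTF-8. The sketch below calls pandas.read_csv directly, on the assumption that the method forwards its encoding argument to it (the source does not show this). The file name and its contents are made up.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "latin1.csv")
    # Write a Latin-1 file whose "context" value contains a non-ASCII character.
    with open(path, "w", encoding="latin-1") as fh:
        fh.write("context\ncaf\u00e9\n")

    # Decoding Latin-1 bytes as UTF-8 fails on the byte for the accented letter.
    try:
        pd.read_csv(path, encoding="utf-8")
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True

    # Passing the matching encoding reads the column correctly.
    series = pd.read_csv(path, encoding="latin-1")["context"]

print(utf8_failed, series.iloc[0])
```

If a file's encoding is unknown, a decoding error like this one is a hint to retry with a different encoding value rather than a sign that the column is missing.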