Loader
A unified class for loading data from both S3 and local storage.
This class provides functionality to load data from either Amazon S3 or local storage based on whether a bucket name is specified. It inherits from S3Loader to handle S3-specific operations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| s3_package | str | The package to use for S3 connections ('s3fs' or 'boto3'). | 'boto3' |
Attributes:

| Name | Type | Description |
|---|---|---|
| s3 | object | The S3 connection object, initialized when needed. |
Examples:
Load data from S3:
>>> loader = Loader(s3_package='boto3')
>>> s3_data = loader.load(
... filepath='data/sales.csv',
... bucket='my-bucket',
... aws_access_key_id='YOUR_KEY',
... aws_secret_access_key='YOUR_SECRET'
... )
Load data from local storage (omit the bucket to read from the local filesystem):
>>> local_data = loader.load(filepath='data/sales.csv')
Notes:
- When loading from S3, AWS credentials can be provided either through environment variables or as parameters.
- For local loading, all standard formats are supported: CSV, Excel, JSON, Pickle, GeoJSON, and Parquet.
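The extension-based dispatch these notes describe can be sketched as follows. This is a minimal illustration, not the library's code: the real Loader uses pandas and geopandas readers, while this sketch wires up only the stdlib-backed formats and notes the rest in comments.

```python
import json
import os
import pickle

# Illustrative sketch of extension-based dispatch (hypothetical names,
# not from dashboard_template_database). Only stdlib-backed formats are
# wired up here; .csv/.xlsx/.geojson/.parquet would map to pandas or
# geopandas readers in the real implementation.
def _read_json(path, **opts):
    with open(path, 'r', **opts) as f:
        return json.load(f)

def _read_pickle(path, **opts):
    with open(path, 'rb') as f:
        return pickle.load(f)

READERS = {
    '.json': _read_json,
    '.pkl': _read_pickle,
}

def load_local(filepath, **kwargs):
    """Dispatch on file extension, raising ValueError when unsupported."""
    ext = os.path.splitext(filepath)[1].lower()
    if ext not in READERS:
        raise ValueError(f"Unsupported file extension: {ext}")
    return READERS[ext](filepath, **kwargs)
```

Format-specific keyword arguments (encoding, etc.) flow straight through to the chosen reader, which matches how the load method documents its **kwargs below.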
Methods:

| Name | Description |
|---|---|
| connect | Establish a connection to the S3 bucket. |
| load | Load data from either S3 or local storage. |
Source code in dashboard_template_database/storage/loader.py
connect
connect(**kwargs) -> None
Establish a connection to the S3 bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| **kwargs |  | Additional keyword arguments for establishing the connection. | {} |
Returns:

| Name | Type | Description |
|---|---|---|
| object | None | The established S3 connection. |
Example:

>>> s3_loader = S3Loader(package='boto3')
>>> s3_connection = s3_loader.connect(
...     aws_access_key_id='your_access_key',
...     aws_secret_access_key='your_secret_key'
... )
Source code in dashboard_template_database/storage/s3/loader.py
load
load(filepath: str, bucket: str = None, **kwargs) -> Any
Load data from either S3 or local storage.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| filepath | str | Path to the file. For S3, this is the key within the bucket. For local storage, this is the path on the filesystem. | required |
| bucket | str | S3 bucket name. If None, loads from local storage. | None |
| **kwargs |  | Additional arguments passed to the underlying loader. For S3: aws_access_key_id, aws_secret_access_key, aws_session_token, endpoint_url, verify. For both: file-format-specific options (encoding, separator, etc.). | {} |
Returns:

| Name | Type | Description |
|---|---|---|
| Any | Any | The loaded data, in a format based on the file extension: .csv, .xlsx, .xls -> pandas DataFrame; .json -> dict or pandas DataFrame; .pkl -> unpickled object; .geojson -> GeoDataFrame; .parquet -> pandas DataFrame |
Raises:

| Type | Description |
|---|---|
| ValueError | If the file extension is not supported. |
| FileNotFoundError | If the local file doesn't exist. |
| ClientError | If there are S3 access issues. |
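A caller handling these failure modes might look like the sketch below. `safe_load` is a hypothetical wrapper, not part of the library; for S3 loads you would also catch `botocore.exceptions.ClientError`, omitted here to keep the snippet dependency-free.

```python
# Hypothetical wrapper around Loader.load showing how the documented
# exceptions might be handled; `loader` is any object with a matching
# load() signature. botocore's ClientError would be caught the same way
# when loading from S3.
def safe_load(loader, filepath, bucket=None, **kwargs):
    try:
        return loader.load(filepath=filepath, bucket=bucket, **kwargs)
    except FileNotFoundError:
        print(f"Local file not found: {filepath}")
    except ValueError as exc:
        print(f"Unsupported format: {exc}")
    return None
```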
Examples:
Load CSV from S3:
>>> data = loader.load(
... filepath='data.csv',
... bucket='my-bucket',
... aws_access_key_id='KEY',
... aws_secret_access_key='SECRET'
... )
Load a local Excel file with specific options:
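The Excel example itself is missing from this page. The sketch below shows a plausible call shape against a stand-in loader so it stays self-contained; `sheet_name` and `skiprows` are pandas `read_excel` options and an assumption about what `load` forwards.

```python
# _StubLoader stands in for the real Loader so this snippet runs without
# the package installed; it simply echoes the forwarded format options.
class _StubLoader:
    def load(self, filepath, bucket=None, **kwargs):
        return {'filepath': filepath, 'bucket': bucket, 'options': kwargs}

loader = _StubLoader()
result = loader.load(filepath='report.xlsx', sheet_name='Sales', skiprows=1)
# With the real Loader and a local report.xlsx, a call of this shape
# would return a pandas DataFrame built from the requested sheet.
```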