AISee Package
Submodules
aisee.custom_datasets module
- class aisee.custom_datasets.DatasetFromDataFrame(data: Series | DataFrame, task: str = 'single_label', transform: Compose = None, class_to_idx: dict[str, int] = None)
Bases: Dataset
Image dataset built from a pandas DataFrame.
- Parameters:
data (pandas.DataFrame) –
If it is a multiclass problem: the dataframe must contain a path column with the full path of the image and, optionally, a label column with the label assigned to the image. This label can be a number or a string.
Example:
path                 label
"sample/cat1.png"    "cat"
"sample/cat2.png"    "cat"
"sample/dog1.png"    "dog"
"sample/cat3.png"    "cat"
or
path
"sample/cat1.png"
"sample/cat2.png"
"sample/dog1.png"
"sample/cat3.png"
If it is a multilabel problem: the dataframe must contain a "path" column with the full path of the image and one column for each class in the problem. The classes that belong to that image are indicated with a "1" and those that do not with a "0".
Example:
path                      car    motorbike    bus
"sample/vehicles1.png"    1      1            0
"sample/vehicles2.png"    0      0            1
"sample/vehicles3.png"    1      0            1
"sample/vehicles4.png"    1      0            0
task (str, default="single_label") –
Task to be resolved. Possible values: "single_label", "multi_label".
transform (torchvision.transforms.Compose) –
class_to_idx (dict[str, int], default=None) –
Equivalence between the label and the index of the neural net output. This parameter is equivalent to label2id of the transformers library. Example:
class_to_idx = {"cat": 0, "dog": 1}
neural_net_output = [0.2, 0.8]
Cat has probability 0.2.
Dog has probability 0.8.
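The two dataframe layouts described above can be sketched with pandas as follows. This is only an illustration of the expected columns; the variable names are hypothetical, and the sketch does not call AISee itself.

```python
import pandas as pd

# Single-label layout: a "path" column plus an optional "label" column.
single_label_df = pd.DataFrame(
    {
        "path": [
            "sample/cat1.png",
            "sample/cat2.png",
            "sample/dog1.png",
            "sample/cat3.png",
        ],
        "label": ["cat", "cat", "dog", "cat"],
    }
)

# Multi-label layout: a "path" column plus one 0/1 column per class.
multi_label_df = pd.DataFrame(
    {
        "path": [
            "sample/vehicles1.png",
            "sample/vehicles2.png",
            "sample/vehicles3.png",
            "sample/vehicles4.png",
        ],
        "car": [1, 0, 1, 1],
        "motorbike": [1, 0, 0, 0],
        "bus": [0, 1, 1, 0],
    }
)

# In the multi-label layout, every column except "path" is a class.
class_columns = [c for c in multi_label_df.columns if c != "path"]
print(class_columns)
print(multi_label_df[class_columns].sum().to_dict())
```

Either dataframe would then be passed as the data argument, with task set to "single_label" or "multi_label" respectively.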
- class aisee.custom_datasets.DatasetFromFolder(data: str, transform: Compose = None, class_to_idx: dict[str, int] = None)
Bases: ImageFolder
Image dataset built from a folder, which also returns the path of each image.
- Parameters:
data (str) –
A directory where each folder is a class, and inside that class are the images.
Example:
animals
├── cat
│   ├── cat1.jpg
│   └── cat2.jpg
└── dog
    ├── dog1.jpg
    └── dog2.jpg
transform (torchvision.transforms.Compose) –
class_to_idx (dict[str, int], default=None) –
Equivalence between the label and the index of the neural net output. This parameter is equivalent to label2id of the transformers library. Example:
class_to_idx = {"cat": 0, "dog": 1}
neural_net_output = [0.2, 0.8]
Cat has probability 0.2.
Dog has probability 0.8.
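As a rough sketch of the folder layout above, the snippet below creates the directory tree in a temporary location and derives a class_to_idx mapping from the sorted sub-folder names. The helper build_class_to_idx is hypothetical; sorted ordering mirrors the default of torchvision's ImageFolder, and whether DatasetFromFolder does the same when class_to_idx is None is an assumption.

```python
import tempfile
from pathlib import Path


def build_class_to_idx(root):
    """Map each class sub-folder name to an index, in sorted order (hypothetical helper)."""
    class_names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    return {name: idx for idx, name in enumerate(class_names)}


layout = {
    "cat": ["cat1.jpg", "cat2.jpg"],
    "dog": ["dog1.jpg", "dog2.jpg"],
}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "animals"
    for class_name, files in layout.items():
        (root / class_name).mkdir(parents=True)
        for file_name in files:
            # Empty placeholder files; real data would be actual images.
            (root / class_name / file_name).touch()
    class_to_idx = build_class_to_idx(root)

print(class_to_idx)  # {'cat': 0, 'dog': 1}
```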
- class aisee.custom_datasets.DatasetFromNumpy(data: ndarray, transform: Compose = None)
Bases: Dataset
Image dataset built from NumPy data.
This class only supports prediction: it returns np.nan as the label and 'N/A' as the path.
- Parameters:
data (np.ndarray) – Image as numpy object.
transform (torchvision.transforms.Compose) –
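A minimal sketch of building such an array, assuming the (nº images, height, width, channels) layout that the predict method documents also applies here. The 32x32 size and the uint8 dtype are illustrative assumptions.

```python
import numpy as np

# Two dummy RGB images; size and dtype are illustrative assumptions.
first = np.zeros((32, 32, 3), dtype=np.uint8)
second = np.full((32, 32, 3), 255, dtype=np.uint8)

# Stack along a new leading axis: (n_images, height, width, channels).
images = np.stack([first, second])
n_images, height, width, channels = images.shape
print(images.shape)  # (2, 32, 32, 3)
```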
aisee.trainer module
- class aisee.trainer.Trainer(base_model: VisionClassifier, data: DataFrame | str, output_dir: str = None, lr: float = 0.001, batch_size: int = 8, num_epochs: int = 5, checkpointing_metric: str = 'loss', verbose: int = 3, shuffle: bool = True, num_workers: int = 2, dict_data_transforms: dict = None, criterion: type[Loss] = None, optimizer: type[Optimizer] = None, optimer_kwargs: dict = None)
Bases: object
Train a model from VisionClassifier.
The Trainer class takes a VisionClassifier instance, trains its model with the data passed to the Trainer, and saves the best model weights according to the selected checkpointing_metric.
- Parameters:
base_model (VisionClassifier) – An instance of VisionClassifier.
data (pandas.DataFrame or str (only for single_label task)) –
A DataFrame or a string which contains the training data:
- If it is a dataframe:
If it is a multiclass problem: the dataframe must contain a path column with the full path of the image, a label column with the label assigned to the image and a fold column that indicates the 'train' and 'val' samples.
Example:
path                 label    fold
"sample/cat1.png"    "cat"    "train"
"sample/cat2.png"    "cat"    "val"
"sample/dog1.png"    "dog"    "train"
"sample/cat3.png"    "cat"    "val"
If it is a multilabel problem: the dataframe must contain a "path" column with the full path of the image, one column for each class in the problem and a fold column. The classes that belong to that image are indicated with a "1" and those that do not with a "0".
Example:
path                      car    motorbike    bus    fold
"sample/vehicles1.png"    1      1            0      "train"
"sample/vehicles2.png"    0      0            1      "train"
"sample/vehicles3.png"    1      0            1      "val"
"sample/vehicles4.png"    1      0            0      "val"
- If it is a string: it must be a directory containing training ('train') and validation ('val') subfolders, each of which contains one subfolder per label.
Example:
animals
├── train
│   ├── cat
│   │   ├── cat1.jpg
│   │   └── cat2.jpg
│   └── dog
│       ├── dog1.jpg
│       └── dog2.jpg
└── val
    ├── cat
    │   ├── cat3.jpg
    │   └── cat4.jpg
    └── dog
        ├── dog3.jpg
        └── dog4.jpg
output_dir (str, default=None) – File where the weights of the neural network will be saved. If None, output_dir is set to 'weights_model_name_time.pt'.
lr (float, default=0.001) – Learning rate used by the torch optimizer.
batch_size (int, default=8) – Number of training samples per batch.
num_epochs (int, default=5) – Number of training epochs.
checkpointing_metric (str, default="loss") –
Metric with which the best model will be saved. Possible values: "loss", "acc", "f1".
F1 is calculated as the macro-averaged F1 score.
verbose (int, default=3) – Controls the verbosity: the higher, the more messages.
shuffle (bool, default=True) – Whether or not to shuffle the data before splitting.
num_workers (int, default=2) – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process.
criterion (torch.nn, default=CrossEntropyLoss for single_label, BCELoss for multi_label) – A loss function from PyTorch. This criterion computes the loss between input logits and target.
dict_data_transforms (dict with 'train' and 'val' image transformations, default=None) – A function/transform that takes in a PIL image and returns a transformed version. If None, the defaults are: for train, resize, horizontal flip and normalize; for val, resize and normalize.
optimizer (torch.optim, default=None) – An optimizer from PyTorch. If None, Adam will be used.
optimer_kwargs (dict, default=None) – Optimizer parameters.
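For readers unfamiliar with the "f1" checkpointing metric, the following sketch shows what a macro-averaged F1 score computes: the F1 of each class is calculated separately and the results are averaged with equal weight. The function macro_f1 is a hypothetical pure-Python illustration, not the routine AISee uses internally.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


y_true = ["cat", "cat", "dog", "cat"]
y_pred = ["cat", "dog", "dog", "cat"]

# cat: F1 = 4/5, dog: F1 = 2/3, so the macro average is 11/15.
score = macro_f1(y_true, y_pred)
print(round(score, 4))  # 0.7333
```

Because every class contributes equally, a rare class that is predicted poorly lowers this metric more than it lowers accuracy.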
aisee.utils module
- aisee.utils.check_multilabel_df(df)
Check if the data has the structure needed for training on the multi_label task.
- Parameters:
df (pandas.DataFrame) – Multilabel image dataframe. It must contain the path and fold columns, where the fold column indicates the 'train' and 'val' samples. It must also include a column for each possible label.
- aisee.utils.check_single_label_data(data)
Check if the data has the structure needed for training on single_label task.
- Parameters:
data (str or pandas.DataFrame) – Directory path or dataframe. For directory, it must contain subfolders with training (‘train’) and validation (‘val’) samples. For data frame, it must contain the path, label, and fold columns where the fold column indicates the ‘train’ and ‘val’ samples.
- aisee.utils.get_data_split(data, fold)
Select the 'train' or 'val' split of the data.
- Parameters:
data (str or pandas.DataFrame) – Directory path or dataframe. For directory, it must contain subfolders with training (‘train’) and validation (‘val’) samples. For data frame, it must contain the path, label, and fold columns where the fold column indicates the ‘train’ and ‘val’ samples.
fold (str, "train" or "val") – Directory or data split to be selected.
- Returns:
d – If data is a pandas.DataFrame, the dataframe split ('train' or 'val'); if data is a str, the path to the directory of the split ('train' or 'val').
- Return type:
pandas.DataFrame or str
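The behaviour described above can be sketched as follows. The function select_split is a hypothetical stand-in written for illustration; it assumes the dataframe case filters on the fold column and the directory case appends the fold name to the path, which matches the description but is not taken from the AISee source.

```python
import os

import pandas as pd


def select_split(data, fold):
    """Return the rows (DataFrame) or the sub-directory (str) for one fold."""
    if isinstance(data, pd.DataFrame):
        # Keep only the rows whose fold column matches the requested split.
        return data[data["fold"] == fold].reset_index(drop=True)
    # For a directory, the split is assumed to live in a sub-folder named after it.
    return os.path.join(data, fold)


df = pd.DataFrame(
    {
        "path": [
            "sample/cat1.png",
            "sample/cat2.png",
            "sample/dog1.png",
            "sample/cat3.png",
        ],
        "label": ["cat", "cat", "dog", "cat"],
        "fold": ["train", "val", "train", "val"],
    }
)

train_df = select_split(df, "train")
val_dir = select_split("animals", "val")
print(train_df["path"].tolist())
print(val_dir)
```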
- aisee.utils.get_n_classes(data)
Get the number of classes.
- Parameters:
data (str or pandas.DataFrame) – Directory path or dataframe. For directory, it must contain subfolders with training (‘train’) and validation (‘val’) samples. For data frame, it must contain the path, label, and fold columns where the fold column indicates the ‘train’ and ‘val’ samples.
- Returns:
n – Number of classes.
- Return type:
int
- aisee.utils.get_n_classes_multilabel(df)
Get the number of classes in multilabel problems.
- Parameters:
df (pandas.DataFrame) – Multilabel image dataframe. It must contain the path and fold columns where the fold column indicates the ‘train’ and ‘val’ samples. Also, it must include a column for each possible label.
- Returns:
n – Number of classes.
- Return type:
int
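One plausible way to count classes in a multilabel dataframe is to count every column that is neither path nor fold. The helper count_multilabel_classes below is a hypothetical sketch under that assumption and is not the library's implementation.

```python
import pandas as pd


def count_multilabel_classes(df):
    """Count the label columns, assuming all columns except path and fold are labels."""
    return len([column for column in df.columns if column not in ("path", "fold")])


df = pd.DataFrame(
    {
        "path": ["sample/vehicles1.png", "sample/vehicles2.png"],
        "car": [1, 0],
        "motorbike": [1, 0],
        "bus": [0, 1],
        "fold": ["train", "val"],
    }
)

n_classes = count_multilabel_classes(df)
print(n_classes)  # 3
```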
- aisee.utils.numpy_image_from_jpg(path: str, rgb: bool = True, resize: tuple[int, int] = None) → ndarray
Get numpy array from image path.
- Parameters:
path (str) – Image path
rgb (bool, default=True) – Convert the image to RGB, if it is not.
resize (tuple[int, int], default=None) – Resize image.
- Returns:
n – Numerical representation of an image
- Return type:
np.ndarray
aisee.vision_classifier module
- class aisee.vision_classifier.VisionClassifier(model_name: str, num_classes: int, class_to_idx: dict[str, int] = None, weights_path: str = None, learning_method: str = 'freezed', extra_layer: int = None, dropout: float = None, task: str = 'single_label', device: str = 'cpu')
Bases: object
Instantiate and predict with a computer vision model from the timm library.
The VisionClassifier class allows loading and utilizing neural networks from timm library. The class provides methods for loading models with pre-trained or non pre-trained weights, or instantiating models with custom weights. It also allows training models with pre-trained weights (freezing some layers or training the entire network) and from scratch. Predictions can be made through a path to an image, a directory with images, or a dataframe. Additionally, it allows working with multiclass and multilabel problems.
- Parameters:
model_name (str) – Name of the model that will be obtained from the timm library.
num_classes (int) – Number of classes in the problem. The number of classes will be the number of outputs of the neural net.
class_to_idx (dict[str, int], default=None) –
Equivalence between the label and the index of the neural net output. This parameter is equivalent to label2id of the transformers library. Example:
class_to_idx = {"cat": 0, "dog": 1}
neural_net_output = [0.2, 0.8]
Cat has probability 0.2.
Dog has probability 0.8.
weights_path (str, default=None) – Directory where network weights are located. If the value is not None, pretrained weights from the timm library will be ignored.
learning_method (str, default="freezed") –
Possible values: from_scratch, freezed, and unfreezed:
from_scratch: The model will be trained from scratch, without using any pre-trained weights contained in the timm library.
freezed: The model will be trained using pre-trained weights from the timm library. For this training, all layers of the network will be frozen (weights will not be updated) except for the last layer, and the extra layer if it is added with the extra_layer parameter.
unfreezed: The model will be trained using pre-trained weights from the timm library. In this case, all layers of the network will be updated without exception.
Note that if custom weights are passed in the weights_path parameter, the network weights will be those, and the pre-trained weights from the timm library will be ignored.
extra_layer (int, default=None) – If value is different from None, a linear layer is added before the last layer with extra_layer number of neurons. If None, this does nothing.
dropout (float, default=None) – If dropout has a value different from None, dropout layer is added before the last layer. Otherwise this does nothing.
task (str, default="single_label") –
Task to be resolved. Possible values: "single_label", "multi_label".
device (str, default="cpu") –
Device where the neural network will be running.
Example: “cuda:1”, “cpu”
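To make the role of class_to_idx concrete, the sketch below reads the network output from the example above back into labelled probabilities. The variable names other than class_to_idx are illustrative, and taking the highest probability as the prediction is an assumption that applies to the single_label case.

```python
class_to_idx = {"cat": 0, "dog": 1}
neural_net_output = [0.2, 0.8]

# Invert the mapping so an output index can be translated back to its label.
idx_to_class = {idx: label for label, idx in class_to_idx.items()}

# Pair every output position with the label it stands for.
probabilities = {
    idx_to_class[idx]: probability
    for idx, probability in enumerate(neural_net_output)
}

# For a single-label task, the predicted class is the most probable one.
predicted_label = max(probabilities, key=probabilities.get)

print(probabilities)    # {'cat': 0.2, 'dog': 0.8}
print(predicted_label)  # dog
```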
- create_dataloader(data: Series | DataFrame | str | ndarray, num_workers: int = 2, data_transform: Compose = None, batch_size: int = 8, shuffle: bool = False) → DataLoader
Create a dataloader from data.
- Parameters:
data (pandas.DataFrame, str or numpy.ndarray) –
A DataFrame, a string, or a numpy array which contains the data. Numpy arrays of images are supported only for prediction.
- If it is a dataframe:
If it is a multiclass problem: the dataframe must contain a path column with the full path of the image and, optionally, a label column with the label assigned to the image. This label can be a number or a string.
Example:
path                 label
"sample/cat1.png"    "cat"
"sample/cat2.png"    "cat"
"sample/dog1.png"    "dog"
"sample/cat3.png"    "cat"
or
path
"sample/cat1.png"
"sample/cat2.png"
"sample/dog1.png"
"sample/cat3.png"
If it is a multilabel problem: the dataframe must contain a "path" column with the full path of the image and one column for each class in the problem. The classes that belong to that image are indicated with a "1" and those that do not with a "0".
Example:
path                      car    motorbike    bus
"sample/vehicles1.png"    1      1            0
"sample/vehicles2.png"    0      0            1
"sample/vehicles3.png"    1      0            1
"sample/vehicles4.png"    1      0            0
- If it is a string, it must be:
A path to an image.
Example: /data/cat.png
A directory where each folder is a class, and inside that class are the images.
Example:
animals
├── cat
│   ├── cat1.jpg
│   └── cat2.jpg
└── dog
    ├── dog1.jpg
    └── dog2.jpg
- If it is a numpy array, it must be a numpy representation of images:
np.array(nº images, height, width, channels)
num_workers (int, default=2) – Subprocesses to use for data loading.
data_transform (torchvision.transforms.Compose, default=None) –
batch_size (int, default=8) –
shuffle (bool, default=False) – Shuffle the data.
- Returns:
image_dataloader
- Return type:
torch.utils.data.DataLoader
- create_default_transform() → dict[str, torchvision.transforms.transforms.Compose]
Create the default transforms based on the timm model config.
- Returns:
dict_transform – A torchvision.transforms.Compose for train and another one for val/test. The dictionary keys are “train” and “val”. Both keys contain a torchvision.transforms.Compose that contains the following layers:
Resize
ToTensor
Normalize
- Return type:
dict[str, torchvision.transforms.Compose]
- evaluate(data: Series | DataFrame | str, metrics: list[Callable], metrics_kwargs: dict[str, dict[str, Any]], num_workers: int = 2, data_transform: Compose = None, batch_size: int = 8) → dict[str, Any]
Evaluate data over the metrics indicated in the metrics parameter.
- Parameters:
data (pandas.DataFrame or str) – A DataFrame or a string which contains the data to evaluate:
- If it is a dataframe:
If it is a multiclass problem: the dataframe must contain a path column with the full path of the image and, optionally, a label column with the label assigned to the image. This label can be a number or a string.
Example:
path                 label
"sample/cat1.png"    "cat"
"sample/cat2.png"    "cat"
"sample/dog1.png"    "dog"
"sample/cat3.png"    "cat"
or
path
"sample/cat1.png"
"sample/cat2.png"
"sample/dog1.png"
"sample/cat3.png"
If it is a multilabel problem: the dataframe must contain a "path" column with the full path of the image and one column for each class in the problem. The classes that belong to that image are indicated with a "1" and those that do not with a "0".
Example:
path                      car    motorbike    bus
"sample/vehicles1.png"    1      1            0
"sample/vehicles2.png"    0      0            1
"sample/vehicles3.png"    1      0            1
"sample/vehicles4.png"    1      0            0
- If it is a string, it must be:
A path to an image.
Example: /data/cat.png
A directory where each folder is a class, and inside that class are the images.
Example:
animals
├── cat
│   ├── cat1.jpg
│   └── cat2.jpg
└── dog
    ├── dog1.jpg
    └── dog2.jpg
metrics (List[Callable]) –
Each element of the list is a function that at least has the parameters y_pred and y_true, and each parameter accepts pure predictions as 1D array-like.
For example, you can import accuracy_score from sklearn.metrics.
metrics_kwargs (Dict[str, Dict[str, Any]]) –
Each key of the dictionary is the name of one of the functions given in the metrics parameter. The value is a dictionary with the arguments of that function.
For example, if your metrics parameter contains the function f1_score from sklearn.metrics and you want to set its average parameter to "micro", your kwargs will be:
metrics_kwargs = {"f1_score": {"average": "micro"}}
num_workers (int, default=2) – Subprocesses to use for data loading.
data_transform (torchvision.transforms.Compose, default=None) –
batch_size (int, default=8) –
- Returns:
evaluation_results – The resulting dictionary has a key for each function in the metrics parameter. The values are the results of each function.
- Return type:
Dict[str, Any]
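The relationship between metrics and metrics_kwargs can be sketched in plain Python. The helper run_metrics and the two hand-written metric functions are hypothetical; the sketch assumes that keys are matched against each function's __name__ and that the result is keyed the same way, which fits the description above but is not confirmed by it.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, pos_label):
    """Fraction of samples of pos_label that were predicted as pos_label."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == pos_label]
    hits = sum(p == pos_label for p in predictions_for_positives)
    return hits / len(predictions_for_positives)


def run_metrics(y_true, y_pred, metrics, metrics_kwargs):
    """Call every metric with its own keyword arguments, looked up by function name."""
    results = {}
    for metric in metrics:
        kwargs = metrics_kwargs.get(metric.__name__, {})
        results[metric.__name__] = metric(y_true=y_true, y_pred=y_pred, **kwargs)
    return results


y_true = ["cat", "cat", "dog", "cat"]
y_pred = ["cat", "dog", "dog", "cat"]

results = run_metrics(
    y_true,
    y_pred,
    metrics=[accuracy, recall],
    metrics_kwargs={"recall": {"pos_label": "cat"}},
)
print(results)
```

Metrics that need no extra arguments, such as accuracy here, simply have no entry in metrics_kwargs.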
- predict(data: DataFrame | str, num_workers: int = 2, data_transform: Compose = None, batch_size: int = 8) → list[collections.abc.Mapping[str, T]]
Predict images in data.
- Parameters:
data (pandas.DataFrame, str or numpy.ndarray) –
It must be a dataframe, a string or a numpy.ndarray:
- If it is a dataframe:
If it is a multiclass problem: the dataframe must contain a path column with the full path of the image and a label column with the label assigned to the image. This label can be a number or a string.
Example:
path                 label
"sample/cat1.png"    "cat"
"sample/cat2.png"    "cat"
"sample/dog1.png"    "dog"
"sample/cat3.png"    "cat"
If it is a multilabel problem: the dataframe must contain a "path" column with the full path of the image and one column for each class in the problem. The classes that belong to that image are indicated with a "1" and those that do not with a "0".
Example:
path                      car    motorbike    bus
"sample/vehicles1.png"    1      1            0
"sample/vehicles2.png"    0      0            1
"sample/vehicles3.png"    1      0            1
"sample/vehicles4.png"    1      0            0
- If it is a string, it must be:
A path to an image.
Example: /data/cat.png
A directory where each folder is a class, and inside that class are the images.
Example:
animals
├── cat
│   ├── cat1.jpg
│   └── cat2.jpg
└── dog
    ├── dog1.jpg
    └── dog2.jpg
- If it is a numpy array, it must be a numpy representation of images:
np.array(nº images, height, width, channels)
num_workers (int, default=2) – Subprocesses to use for data loading.
data_transform (torchvision.transforms.Compose, default=None) –
batch_size (int, default=8) –
- Returns:
result – Each position of the list contains a dictionary like this:
{
    "image_path": <path of the image>,
    "probabilities": <all classes probabilities>,
    "prediction": <class(es) predicted>,
    "real_label": <real label(s)>
}
- Return type:
list[Mapping[str, T]]
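A short sketch of how a result list with the documented keys might be consumed. The sample entries are made up for illustration, and the concrete value types (a list of floats for probabilities, a string for prediction and real_label) are assumptions rather than a guarantee of what predict returns.

```python
# Made-up results shaped like the dictionary described above.
results = [
    {"image_path": "sample/cat1.png", "probabilities": [0.9, 0.1], "prediction": "cat", "real_label": "cat"},
    {"image_path": "sample/cat2.png", "probabilities": [0.4, 0.6], "prediction": "dog", "real_label": "cat"},
    {"image_path": "sample/dog1.png", "probabilities": [0.2, 0.8], "prediction": "dog", "real_label": "dog"},
    {"image_path": "sample/cat3.png", "probabilities": [0.7, 0.3], "prediction": "cat", "real_label": "cat"},
]

# Collect the images whose prediction disagrees with the real label.
misclassified = [r["image_path"] for r in results if r["prediction"] != r["real_label"]]

# Share of images that were predicted correctly.
accuracy = (len(results) - len(misclassified)) / len(results)

print(misclassified)  # ['sample/cat2.png']
print(accuracy)       # 0.75
```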
- predict_loop(dataloader: DataLoader) → tuple[list, torch.Tensor, torch.Tensor, torch.Tensor]
Run the prediction loop over a dataloader.
- Parameters:
dataloader (torch.utils.data.DataLoader) –
- Returns:
results – The tuple contains:
- all_paths: Paths of the predicted images.
- all_probabilities: Probabilities for the predicted images.
- all_predictions: Predicted classes of the images.
- all_real_labels: Real labels of the predicted images.
- Return type:
tuple[list, torch.Tensor, torch.Tensor, torch.Tensor]