AISee Package

Submodules

aisee.custom_datasets module

class aisee.custom_datasets.DatasetFromDataFrame(data: Series | DataFrame, task: str = 'single_label', transform: Compose = None, class_to_idx: dict[str, int] = None)

Bases: Dataset

Image Dataset for Pandas Dataframe data.

Parameters:
  • data (pandas.DataFrame) –

    • If it is a multiclass problem: the dataframe must contain a path column with the full path of the image and a label column (optional) with the label assigned to the image. This label can be a number or a string.

    Example:

    path                 label
    "sample/cat1.png"    "cat"
    "sample/cat2.png"    "cat"
    "sample/dog1.png"    "dog"
    "sample/cat3.png"    "cat"

    or

    path
    "sample/cat1.png"
    "sample/cat2.png"
    "sample/dog1.png"
    "sample/cat3.png"

    • If it is a multilabel problem: the dataframe must contain a “path” column with the full path of the image and one column for each class in the problem. The classes that belong to that image will be indicated with a “1” and those that do not with a “0”.

    Example:

    path                      car    motorbike    bus
    "sample/vehicles1.png"    1      1            0
    "sample/vehicles2.png"    0      0            1
    "sample/vehicles3.png"    1      0            1
    "sample/vehicles4.png"    1      0            0

  • task (str) –

    Task to be resolved. Possible values:

    • "single_label"

    • "multi_label"

  • transform (torchvision.transforms.Compose) –

  • class_to_idx (dict[str, int], default=None) –

    Equivalence between the label and the index of the neural net output. This parameter is equivalent to label2id of the transformers library. Example:

    class_to_idx = {
        "cat": 0,
        "dog": 1
    }
    
    neural_net_output = [0.2, 0.8]
    
    • Cat has probability 0.2

    • Dog has probability 0.8
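
The relationship between class_to_idx and the network output can be illustrated with a small, library-free sketch. The names idx_to_class, probabilities and predicted_label are introduced here for illustration only and are not part of the AISee API.

```python
# Mapping between class labels and positions in the network output.
class_to_idx = {"cat": 0, "dog": 1}

# Invert the mapping so an output index can be turned back into a label.
idx_to_class = {idx: label for label, idx in class_to_idx.items()}

# Example network output taken from the description above: one value per index.
neural_net_output = [0.2, 0.8]

# Pair every label with the probability found at its index.
probabilities = {
    idx_to_class[idx]: prob for idx, prob in enumerate(neural_net_output)
}

# The predicted label is the one whose index holds the highest probability.
best_idx = max(range(len(neural_net_output)), key=neural_net_output.__getitem__)
predicted_label = idx_to_class[best_idx]

print(probabilities)    # {'cat': 0.2, 'dog': 0.8}
print(predicted_label)  # dog
```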

check_dataframe_columns() → None

Check if Pandas DataFrame is well constructed.

generate_class_to_idx() → dict[str, int]

Generate a dictionary with classes equivalences.

Example:

classes = ["cat", "dog"]

class_to_idx = {
    "cat": 0,
    "dog": 1
}
Returns:

class_to_idx

Return type:

dict[str, int]
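
One plausible way to build such a dictionary from a list of labels is sketched below. The helper build_class_to_idx is hypothetical, and assigning indices in sorted order is an assumption; the actual method may order classes differently.

```python
def build_class_to_idx(classes):
    """Hypothetical helper: map each distinct class name to an index.

    Indices follow the sorted order of the class names (an assumption).
    """
    unique_sorted = sorted(set(classes))
    return {name: idx for idx, name in enumerate(unique_sorted)}


# Labels as they might appear in the label column of the dataframe.
labels = ["cat", "cat", "dog", "cat"]
class_to_idx = build_class_to_idx(labels)

print(class_to_idx)  # {'cat': 0, 'dog': 1}
```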

class aisee.custom_datasets.DatasetFromFolder(data: str, transform: Compose = None, class_to_idx: dict[str, int] = None)

Bases: ImageFolder

Image Dataset that also returns the path of the image.

Parameters:
  • data (str) –

    A directory where each folder is a class, and inside that class are the images.

    Example:

    └── animals
        ├── cat
        │  ├── cat1.jpg
        │  └── cat2.jpg
        └── dog
            ├── dog1.jpg
            └── dog2.jpg
    

  • transform (torchvision.transforms.Compose) –

  • class_to_idx (dict[str, int], default=None) –

    Equivalence between the label and the index of the neural net output. This parameter is equivalent to label2id of the transformers library. Example:

    class_to_idx = {
        "cat": 0,
        "dog": 1
    }
    
    neural_net_output = [0.2, 0.8]
    
    • Cat has probability 0.2

    • Dog has probability 0.8
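
The folder layout described above can be reproduced and inspected with the standard library alone. The sketch below creates empty placeholder files in a temporary directory and derives (relative path, class index) pairs from the subfolder names. The helper classes_from_folder is hypothetical; sorted subfolder names mirror how torchvision's ImageFolder assigns indices, but this is a sketch and not the AISee implementation.

```python
import tempfile
from pathlib import Path


def classes_from_folder(root):
    """Hypothetical helper: use the sorted subfolder names as classes."""
    names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    return {name: idx for idx, name in enumerate(names)}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "animals"
    for rel in ("cat/cat1.jpg", "cat/cat2.jpg", "dog/dog1.jpg", "dog/dog2.jpg"):
        file_path = root / rel
        file_path.parent.mkdir(parents=True, exist_ok=True)
        file_path.touch()  # empty placeholder, not a real image

    class_to_idx = classes_from_folder(root)

    # Each sample is (path relative to the root, index of its parent folder).
    samples = sorted(
        (p.relative_to(root).as_posix(), class_to_idx[p.parent.name])
        for p in root.glob("*/*.jpg")
    )

print(class_to_idx)
print(samples)
```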

class aisee.custom_datasets.DatasetFromNumpy(data: ndarray, transform: Compose = None)

Bases: Dataset

Image Dataset for Numpy data.

This class only works for making predictions. It returns np.nan for the label and ‘N/A’ for the path.

Parameters:
  • data (np.ndarray) – Image as numpy object.

  • transform (torchvision.transforms.Compose) –

class aisee.custom_datasets.DatasetFromSingleImage(data: str, transform: Compose = None)

Bases: Dataset

Image Dataset for a single image.

Parameters:
  • data (str) –

    Image path

    Example:

    /data/cat.png
    

  • transform (torchvision.transforms.Compose) –

aisee.trainer module

class aisee.trainer.Trainer(base_model: VisionClassifier, data: DataFrame | str, output_dir: str = None, lr: float = 0.001, batch_size: int = 8, num_epochs: int = 5, checkpointing_metric: str = 'loss', verbose: int = 3, shuffle: bool = True, num_workers: int = 2, dict_data_transforms: dict = None, criterion: type[Loss] = None, optimizer: type[Optimizer] = None, optimer_kwargs: dict = None)

Bases: object

Train a model from VisionClassifier.

The Trainer class takes a VisionClassifier instance, trains its model with the data passed to the Trainer, and saves the best model weights according to the selected checkpointing_metric.

Parameters:
  • base_model (VisionClassifier) – An instance of VisionClassifier.

  • data (pandas.DataFrame or str (only for single_label task)) –

    A DataFrame or a string which contains the training data:

    • If it is a dataframe:
      • If it is a multiclass problem: the dataframe must contain a path column with the full path of the image, a label column with the label assigned to the image and a fold column that indicates the ‘train’ and ‘val’ samples.

      Example:

      path                 label    fold
      "sample/cat1.png"    "cat"    "train"
      "sample/cat2.png"    "cat"    "val"
      "sample/dog1.png"    "dog"    "train"
      "sample/cat3.png"    "cat"    "val"

      • If it is a multilabel problem: the dataframe must contain a “path” column with the full path of the image, one column for each class in the problem and a fold column. The classes that belong to that image will be indicated with a “1” and those that do not with a “0”.

      Example:

      path                      car    motorbike    bus    fold
      "sample/vehicles1.png"    1      1            0      "train"
      "sample/vehicles2.png"    0      0            1      "train"
      "sample/vehicles3.png"    1      0            1      "val"
      "sample/vehicles4.png"    1      0            0      "val"

    • If it is a string, it must be a directory which contains subfolders with training (‘train’) and validation (‘val’) samples, each holding a second level of subfolders named after the labels.

    Example:

    └── animals
        ├── train
        |   ├── cat
        |   │  ├── cat1.jpg
        |   │  └── cat2.jpg
        |   └── dog
        |       ├── dog1.jpg
        |       └── dog2.jpg
        └── val
            ├── cat
            │  ├── cat3.jpg
            │  └── cat4.jpg
            └── dog
                ├── dog3.jpg
                └── dog4.jpg
    

  • output_dir (str, default=None) – File where the weights of the neural network will be saved. If None, output_dir = ‘weights_model_name_time.pt’.

  • lr (float, default=0.001) – Learning rate used by the torch optimizer.

  • batch_size (int, default=8) – Number of training samples per batch.

  • num_epochs (int, default=5) – Number of training epochs.

  • checkpointing_metric (str, default="loss") –

    Metric with which the best model will be saved. Possible values:

    • "loss"

    • "acc"

    • "f1"

    F1 is calculated as ‘macro-averaged F1 score’

  • verbose (int, default=3) – Controls the verbosity: the higher, the more messages.

  • shuffle (bool, default=True) – Whether or not to shuffle the data before splitting.

  • num_workers (int, default=2) – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process.

  • criterion (torch.nn, default=CrossEntropyLoss for single_label, BCELoss for multi_label) – A loss function from pytorch. This criterion computes loss between input logits and target.

  • dict_data_transforms (dict with 'train' and 'val' image transformations, default=None) – A function/transform that takes in a PIL image and returns a transformed version. If None, the defaults are: for train, resize, horizontal flip and normalize; for val, resize and normalize.

  • optimizer (torch.optim, default=None) – Add an optimizer from pytorch. If None Adam will be used.

  • optimer_kwargs (dict, default=None) – Optimizer parameters.
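
The "f1" checkpointing metric is described above as the macro-averaged F1 score: F1 is computed per class and the per-class values are averaged with equal weight. A minimal, dependency-free sketch of that calculation follows. The function macro_f1 is written for this illustration and is not the code used by the Trainer.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Small made-up example: one "cat" image is wrongly predicted as "dog".
y_true = ["cat", "cat", "dog", "cat"]
y_pred = ["cat", "dog", "dog", "cat"]

# cat: F1 = 4/5, dog: F1 = 2/3, so the macro average is (4/5 + 2/3) / 2.
print(round(macro_f1(y_true, y_pred), 4))  # 0.7333
```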

load_data_dict() → dict[str, torch.utils.data.dataloader.DataLoader]

Create training and validation dataloaders.

Returns:

dataloaders_dict – dict with “train” and “val” torch class dataloader.

Return type:

dict[str, torch.utils.data.DataLoader]

train()

Train the base_model.

Train the base_model and keep the weights that achieve the best value of the checkpointing_metric.

Create the hist attribute with the training history.

Return type:

self
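
The idea of keeping the best weights according to checkpointing_metric can be sketched as follows. This assumes that "loss" is minimised while "acc" and "f1" are maximised, which is the usual convention but is not stated explicitly above. The helpers is_improvement and best_epoch and the history values are made up for illustration.

```python
def is_improvement(metric, current, best):
    """Return True when `current` beats `best` for the given metric."""
    if best is None:
        return True
    if metric == "loss":
        return current < best  # lower loss is better
    if metric in ("acc", "f1"):
        return current > best  # higher score is better
    raise ValueError(f"Unknown checkpointing metric: {metric!r}")


def best_epoch(history, metric):
    """Index of the epoch whose validation value would be checkpointed last."""
    best_value, best_idx = None, None
    for epoch, value in enumerate(history):
        if is_improvement(metric, value, best_value):
            best_value, best_idx = value, epoch
    return best_idx


# Made-up validation histories over five epochs.
val_loss = [0.90, 0.60, 0.65, 0.58, 0.70]
val_acc = [0.55, 0.70, 0.72, 0.71, 0.69]

print(best_epoch(val_loss, "loss"))  # 3
print(best_epoch(val_acc, "acc"))    # 2
```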

aisee.utils module

aisee.utils.check_multilabel_df(df)

Check if the data has the structure needed for training on multi_label task.

Parameters:

df (pandas.DataFrame) – Multilabel image dataframe. It must contain the path and fold columns where the fold column indicates the ‘train’ and ‘val’ samples. Also, it must include a column for each possible label.

aisee.utils.check_single_label_data(data)

Check if the data has the structure needed for training on single_label task.

Parameters:

data (str or pandas.DataFrame) – Directory path or dataframe. For directory, it must contain subfolders with training (‘train’) and validation (‘val’) samples. For data frame, it must contain the path, label, and fold columns where the fold column indicates the ‘train’ and ‘val’ samples.

aisee.utils.get_data_split(data, fold)

Select the train or val split of the data.

Parameters:
  • data (str or pandas.DataFrame) – Directory path or dataframe. For directory, it must contain subfolders with training (‘train’) and validation (‘val’) samples. For data frame, it must contain the path, label, and fold columns where the fold column indicates the ‘train’ and ‘val’ samples.

  • fold (str, "train" or "val") – Directory or data split to be selected.

Returns:

d – Dataframe split (‘train’ or ‘val’) if data is a pandas.DataFrame; path to the directory split (‘train’ or ‘val’) if data is a str.

Return type:

pandas.DataFrame or str
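
The two behaviours described above (directory path versus table) can be sketched without pandas. The helper select_split is hypothetical and uses a list of dictionaries in place of a DataFrame; the assumption that the directory split is simply the ‘train’ or ‘val’ subfolder follows from the layout described for the Trainer.

```python
import os


def select_split(data, fold):
    """Hypothetical stand-in: pick the 'train' or 'val' part of the data."""
    if fold not in ("train", "val"):
        raise ValueError("fold must be 'train' or 'val'")
    if isinstance(data, str):
        # Directory input: the split is the matching subfolder.
        return os.path.join(data, fold)
    # Tabular input: keep only the rows whose fold column matches.
    return [row for row in data if row["fold"] == fold]


rows = [
    {"path": "sample/cat1.png", "label": "cat", "fold": "train"},
    {"path": "sample/cat2.png", "label": "cat", "fold": "val"},
    {"path": "sample/dog1.png", "label": "dog", "fold": "train"},
    {"path": "sample/cat3.png", "label": "cat", "fold": "val"},
]

train_rows = select_split(rows, "train")
val_dir = select_split("animals", "val")

print([row["path"] for row in train_rows])
print(val_dir)
```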

aisee.utils.get_n_classes(data)

Get the number of classes.

Parameters:

data (str or pandas.DataFrame) – Directory path or dataframe. For directory, it must contain subfolders with training (‘train’) and validation (‘val’) samples. For data frame, it must contain the path, label, and fold columns where the fold column indicates the ‘train’ and ‘val’ samples.

Returns:

n – Number of classes.

Return type:

int

aisee.utils.get_n_classes_multilabel(df)

Get the number of classes in multilabel problems.

Parameters:

df (pandas.DataFrame) – Multilabel image dataframe. It must contain the path and fold columns where the fold column indicates the ‘train’ and ‘val’ samples. Also, it must include a column for each possible label.

Returns:

n – Number of classes.

Return type:

int
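
Given the multilabel layout (a path column, a fold column and one 0/1 column per class), the number of classes is presumably the number of remaining columns. The sketch below works on plain column names and a dictionary row rather than a DataFrame; label_columns and active_labels are hypothetical helpers, and treating every column other than path and fold as a class is an assumption.

```python
# Columns that describe the sample rather than a class (assumption).
RESERVED_COLUMNS = {"path", "fold"}


def label_columns(columns):
    """Every column that is not reserved is treated as one class."""
    return [name for name in columns if name not in RESERVED_COLUMNS]


def active_labels(row):
    """Classes marked with 1 for a single row of the table."""
    return [name for name in label_columns(row) if row[name] == 1]


columns = ["path", "car", "motorbike", "bus", "fold"]
row = {"path": "sample/vehicles3.png", "car": 1, "motorbike": 0, "bus": 1, "fold": "val"}

print(len(label_columns(columns)))  # 3
print(active_labels(row))           # ['car', 'bus']
```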

aisee.utils.numpy_image_from_jpg(path: str, rgb: bool = True, resize: tuple[int, int] = None) → ndarray

Get numpy array from image path.

Parameters:
  • path (str) – Image path

  • rgb (bool, default=True) – Convert the image to RGB, if it is not.

  • resize (tuple[int, int], default=None) – Resize image.

Returns:

n – Numerical representation of an image

Return type:

np.ndarray

aisee.vision_classifier module

class aisee.vision_classifier.VisionClassifier(model_name: str, num_classes: int, class_to_idx: dict[str, int] = None, weights_path: str = None, learning_method: str = 'freezed', extra_layer: int = None, dropout: float = None, task: str = 'single_label', device: str = 'cpu')

Bases: object

Instantiating and predicting with a computer vision model from timm library.

The VisionClassifier class allows loading and utilizing neural networks from timm library. The class provides methods for loading models with pre-trained or non pre-trained weights, or instantiating models with custom weights. It also allows training models with pre-trained weights (freezing some layers or training the entire network) and from scratch. Predictions can be made through a path to an image, a directory with images, or a dataframe. Additionally, it allows working with multiclass and multilabel problems.

Parameters:
  • model_name (str) – Name of the model that will be obtained from the timm library.

  • num_classes (int) – Number of classes in the problem. The number of classes will be the number of outputs of the neural net.

  • class_to_idx (dict[str, int], default=None) –

    Equivalence between the label and the index of the neural net output. This parameter is equivalent to label2id of the transformers library. Example:

    class_to_idx = {
        "cat": 0,
        "dog": 1
    }
    
    neural_net_output = [0.2, 0.8]
    
    • Cat has probability 0.2

    • Dog has probability 0.8

  • weights_path (str, default=None) – Directory where network weights are located. If the value is different from None, pretrained weights from the timm library will be ignored.

  • learning_method (str, default="freezed") –

    Possible values: from_scratch, freezed, and unfreezed:

    • from_scratch: The model will be trained from scratch, without using any pre-trained weights contained in the timm library.

    • freezed: The model will be trained using pre-trained weights from the timm library. For this training, all layers of the network will be frozen (weights will not be updated) except for the last layer, and the extra layer if it is added with the extra_layer parameter.

    • unfreezed: The model will be trained using pre-trained weights from the timm library. In this case, all layers of the network will be updated without exception.

    Note that if custom weights are passed in the weights_path parameter, the network weights will be those, and the pre-trained weights from the timm library will be ignored.

  • extra_layer (int, default=None) – If value is different from None, a linear layer is added before the last layer with extra_layer number of neurons. If None, this does nothing.

  • dropout (float, default=None) – If dropout has a value different from None, dropout layer is added before the last layer. Otherwise this does nothing.

  • task (str, default="single_label") –

    Task to be resolved. Possible values:

    • "single_label"

    • "multi_label"

  • device (str, default="cpu") –

    Device where the neural network will be running.

    Example: “cuda:1”, “cpu”

create_dataloader(data: Series | DataFrame | str | ndarray, num_workers: int = 2, data_transform: Compose = None, batch_size: int = 8, shuffle: bool = False) → DataLoader

Create dataloaders from data.

Parameters:
  • data (pandas.DataFrame, str or numpy.ndarray) –

    A DataFrame, a string, or an array which contains the data. Numpy array images are accepted only for predict.

    • If it is a dataframe:
      • If it is a multiclass problem: the dataframe must contain a path column with the full path of the image and a label column (optional) with the label assigned to the image. This label can be a number or a string.

      Example:

      path                 label
      "sample/cat1.png"    "cat"
      "sample/cat2.png"    "cat"
      "sample/dog1.png"    "dog"
      "sample/cat3.png"    "cat"

      or

      path
      "sample/cat1.png"
      "sample/cat2.png"
      "sample/dog1.png"
      "sample/cat3.png"

      • If it is a multilabel problem: the dataframe must contain a “path” column with the full path of the image and one column for each class in the problem. The classes that belong to that image will be indicated with a “1” and those that do not with a “0”.

      Example:

      path                      car    motorbike    bus
      "sample/vehicles1.png"    1      1            0
      "sample/vehicles2.png"    0      0            1
      "sample/vehicles3.png"    1      0            1
      "sample/vehicles4.png"    1      0            0

    • If it is a string, it must be:

      • A path to an image.

      Example:

      /data/cat.png

      • A directory where each folder is a class, and inside that class are the images.

      Example:

      └── animals
          ├── cat
          │  ├── cat1.jpg
          │  └── cat2.jpg
          └── dog
              ├── dog1.jpg
              └── dog2.jpg

    • If it is a numpy array, it must be a numpy representation of images:

    np.array(number of images, height, width, channels)

  • num_workers (int, default=2) – Subprocesses to use for data loading.

  • data_transform (torchvision.transforms.Compose, default=None) –

  • batch_size (int, default=8) –

  • shuffle (bool, default=False) – Shuffle the data.

Returns:

image_dataloader

Return type:

torch.utils.data.DataLoader

create_default_transform() → dict[str, torchvision.transforms.transforms.Compose]

Create default transform based on timm config model.

Returns:

dict_transform – A torchvision.transforms.Compose for train and another one for val/test. The dictionary keys are “train” and “val”. Both keys contain a torchvision.transforms.Compose that contains the following layers:

  • Resize

  • ToTensor

  • Normalize

Return type:

dict[str, torchvision.transforms.Compose]

evaluate(data: Series | DataFrame | str, metrics: list[Callable], metrics_kwargs: dict[str, dict[str, Any]], num_workers: int = 2, data_transform: Compose = None, batch_size: int = 8) → dict[str, Any]

Evaluate data over the metrics indicated in the metrics parameter.

Parameters:
  • data (pandas.DataFrame or str) –

    A DataFrame or a string which contains the data to evaluate:

    • If it is a dataframe:
      • If it is a multiclass problem: the dataframe must contain a path column with the full path of the image and a label column (optional) with the label assigned to the image. This label can be a number or a string.

      Example:

      path                 label
      "sample/cat1.png"    "cat"
      "sample/cat2.png"    "cat"
      "sample/dog1.png"    "dog"
      "sample/cat3.png"    "cat"

      or

      path
      "sample/cat1.png"
      "sample/cat2.png"
      "sample/dog1.png"
      "sample/cat3.png"

      • If it is a multilabel problem: the dataframe must contain a “path” column with the full path of the image and one column for each class in the problem. The classes that belong to that image will be indicated with a “1” and those that do not with a “0”.

      Example:

      path                      car    motorbike    bus
      "sample/vehicles1.png"    1      1            0
      "sample/vehicles2.png"    0      0            1
      "sample/vehicles3.png"    1      0            1
      "sample/vehicles4.png"    1      0            0

    • If it is a string, it must be:

      • A path to an image.

      Example:

      /data/cat.png

      • A directory where each folder is a class, and inside that class are the images.

      Example:

      └── animals
          ├── cat
          │  ├── cat1.jpg
          │  └── cat2.jpg
          └── dog
              ├── dog1.jpg
              └── dog2.jpg

  • metrics (List[Callable]) –

    Each element of the list is a function that at least has the parameters y_pred and y_true, and each parameter accepts pure predictions as 1D array-like.

    For example, you can import accuracy_score from sklearn.metrics.

  • metrics_kwargs (Dict[str, Dict[str, Any]]) –

    Each key of the dictionary represents the name of one of the functions indicated in the metrics parameter. The value is a dictionary with the arguments of that function.

    For example, if your metrics parameter has the function f1_score from sklearn.metrics and you want to use the parameter average with micro value, your kwargs will be:

    metrics_kwargs = {"f1_score": {"average": "micro"}}

  • num_workers (int, default=2) – Subprocesses to use for data loading.

  • data_transform (torchvision.transforms.Compose, default=None) –

  • batch_size (int, default=8) –

Returns:

evaluation_results – The resulting dictionary has a key for each function in the metrics parameter. The values are the results of each function.

Return type:

Dict[str, Any]
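
How metrics and metrics_kwargs fit together can be sketched with plain functions. The snippet assumes that extra arguments are looked up by the function's __name__ and that results are keyed the same way, which matches the description above but is not confirmed against the source. The functions accuracy, error_count and run_metrics are stand-ins written for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where prediction and truth agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def error_count(y_true, y_pred, normalize=False):
    """Number (or fraction, if normalize=True) of wrong predictions."""
    errors = sum(t != p for t, p in zip(y_true, y_pred))
    return errors / len(y_true) if normalize else errors


def run_metrics(y_true, y_pred, metrics, metrics_kwargs=None):
    """Call every metric, passing the kwargs registered under its name."""
    metrics_kwargs = metrics_kwargs or {}
    results = {}
    for metric in metrics:
        kwargs = metrics_kwargs.get(metric.__name__, {})
        results[metric.__name__] = metric(y_true=y_true, y_pred=y_pred, **kwargs)
    return results


y_true = [0, 0, 1, 0]
y_pred = [0, 1, 1, 0]

results = run_metrics(
    y_true,
    y_pred,
    metrics=[accuracy, error_count],
    metrics_kwargs={"error_count": {"normalize": True}},
)
print(results)  # {'accuracy': 0.75, 'error_count': 0.25}
```

Functions from sklearn.metrics such as accuracy_score and f1_score accept y_true and y_pred as keyword arguments, so they would slot into the same pattern.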

predict(data: DataFrame | str, num_workers: int = 2, data_transform: Compose = None, batch_size: int = 8) → list[collections.abc.Mapping[str, T]]

Predict images in data.

Parameters:
  • data (pandas.DataFrame, str or numpy.ndarray) –

    It must be a dataframe, a string or a numpy.ndarray:

    • If it is a dataframe:
      • If it is a multiclass problem: the dataframe must contain a path column with the full path of the image and a label column with the label assigned to the image. This label can be a number or a string.

      Example:

      path                 label
      "sample/cat1.png"    "cat"
      "sample/cat2.png"    "cat"
      "sample/dog1.png"    "dog"
      "sample/cat3.png"    "cat"

      • If it is a multilabel problem: the dataframe must contain a “path” column with the full path of the image and one column for each class in the problem. The classes that belong to that image will be indicated with a “1” and those that do not with a “0”.

      Example:

      path                      car    motorbike    bus
      "sample/vehicles1.png"    1      1            0
      "sample/vehicles2.png"    0      0            1
      "sample/vehicles3.png"    1      0            1
      "sample/vehicles4.png"    1      0            0

    • If it is a string, it must be:

      • A path to an image.

      Example:

      /data/cat.png

      • A directory where each folder is a class, and inside that class are the images.

      Example:

      └── animals
          ├── cat
          │  ├── cat1.jpg
          │  └── cat2.jpg
          └── dog
              ├── dog1.jpg
              └── dog2.jpg

    • If it is a numpy array, it must be a numpy representation of images:

    np.array(number of images, height, width, channels)

  • num_workers (int, default=2) – Subprocesses to use for data loading.

  • data_transform (torchvision.transforms.Compose, default=None) –

  • batch_size (int, default=8) –

Returns:

result – Each position of the list contains a dictionary like this:

{
    "image_path": <path of the image>,
    "probabilities": <all classes probabilities>,
    "prediction": <class(es) predicted>,
    "real_label": <real label(s)>
}

Return type:

list[Mapping[str, T]]
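
A list with this shape can be post-processed with ordinary Python. The sketch below uses hand-written entries that follow the documented keys. The concrete value types (lists of floats, string labels) and all values are assumptions for illustration, not real model output.

```python
# Hand-written stand-in for the list returned by predict (values are made up).
results = [
    {"image_path": "sample/cat1.png", "probabilities": [0.9, 0.1],
     "prediction": "cat", "real_label": "cat"},
    {"image_path": "sample/cat2.png", "probabilities": [0.7, 0.3],
     "prediction": "cat", "real_label": "cat"},
    {"image_path": "sample/dog1.png", "probabilities": [0.6, 0.4],
     "prediction": "cat", "real_label": "dog"},
    {"image_path": "sample/cat3.png", "probabilities": [0.8, 0.2],
     "prediction": "cat", "real_label": "cat"},
]

# Paths of the images whose prediction disagrees with the real label.
misclassified = [
    r["image_path"] for r in results if r["prediction"] != r["real_label"]
]

# Highest class probability per image, a rough indication of confidence.
confidence = {r["image_path"]: max(r["probabilities"]) for r in results}

print(misclassified)                  # ['sample/dog1.png']
print(confidence["sample/dog1.png"])  # 0.6
```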

predict_loop(dataloader: DataLoader) → tuple[list, torch.Tensor, torch.Tensor, torch.Tensor]

Run the prediction loop over a dataloader.

Parameters:

dataloader (torch.utils.data.DataLoader) –

Returns:

results – The tuple contains:

  • all_paths: Paths of the images predicted.

  • all_probabilities: Probabilities of the images predicted.

  • all_predictions: Classes predicted for the images.

  • all_real_labels: Real labels of the images predicted.

Return type:

tuple[list, torch.Tensor, torch.Tensor, torch.Tensor]

Module contents