
Step 00 Diagrams - Download Inputs

Overview of the download process

graph TB
    Start([Start Step 00]) --> Init[Initialize environment variables]
    Init --> Retry{"Attempt no.<br/>max_retries=5"}
    Retry --> Connect[SFTP connection<br/>timeout=120s<br/>keepalive=30s]
    Connect -->|Success| FindDir[get_remote_directory]
    Connect -->|Auth failure| AuthFail[Stop immediately<br/>Check credentials]
    Connect -->|Other errors| Wait[Wait delay seconds]
    FindDir --> Try1["Try: /data/to/startflow/<br/>{country}/{business_unit}/"]
    Try1 -->|Exists| Download
    Try1 -->|"Does not exist"| Try2["Try: /data/to/startflow/<br/>{country}/{business_unit with __}/"]
    Try2 -->|Exists| Download
    Try2 -->|"Does not exist"| NoDir[FileNotFoundError<br/>No valid directory]
    Download[Download all files] --> SaveLocal["Save to<br/>/app/inputs/{business_unit}/"]
    SaveLocal --> Close[Close SFTP connection]
    Close --> Success([Success])
    Wait --> Backoff["delay = min(delay * 2, max_delay)"]
    Backoff --> Increment["attempt += 1"]
    Increment --> CheckMax{"attempt < max_retries?"}
    CheckMax -->|Yes| Retry
    CheckMax -->|No| MaxFail[Failure after 5 attempts]

    style Success fill:#90EE90
    style AuthFail fill:#FFB6C1
    style NoDir fill:#FFB6C1
    style MaxFail fill:#FFB6C1

Retry mechanism with exponential backoff

sequenceDiagram
    participant Script as download_files_from_sftp()
    participant Transport as paramiko.Transport
    participant SFTP as SFTPClient
    participant FS as File system

    Note over Script: max_retries=5<br/>base_delay=5<br/>max_delay=60

    loop Retry Loop (attempt < max_retries)
        Script->>Script: log(f"Attempt {attempt+1} of {max_retries}")

        Script->>Transport: Transport((sftp_host, sftp_port))
        Note over Transport: banner_timeout=120<br/>auth_timeout=120<br/>keepalive=30

        Script->>Transport: connect(username, password)

        alt Authentication Success
            Transport-->>Script: Connected
            Script->>SFTP: SFTPClient.from_transport()

            Script->>Script: get_remote_directory(sftp, business_unit)
            Note over Script: Extract country from business_unit

            Script->>SFTP: stat(path1)
            alt Path exists
                SFTP-->>Script: Path found
            else Path not found
                Script->>SFTP: stat(path2 with __)
                SFTP-->>Script: Path found or FileNotFoundError
            end

            Script->>SFTP: listdir(remote_directory)
            SFTP-->>Script: files_list

            loop For each file
                Script->>Script: Skip if hidden or directory
                Script->>SFTP: get(remote_file, local_file)
                SFTP->>FS: Write file
                Script->>Script: log(f"Downloaded: {file}")
            end

            Script->>SFTP: close()
            Script->>Transport: close()
            Note over Script: Return success

        else Authentication Failed
            Transport-->>Script: AuthenticationException
            Script->>Script: log("Authentication failed")
            Note over Script: Break loop - no retry

        else Connection Error
            Transport-->>Script: SSHException/SFTPError/EOFError
            Script->>Script: log(f"Error: {err}")

            Note over Script: Clean up resources
            Script->>SFTP: close() if exists
            Script->>Transport: close() if exists

            Script->>Script: attempt += 1
            Script->>Script: time.sleep(delay)
            Script->>Script: delay = min(delay * 2, max_delay)
            Note over Script: Backoff: 5→10→20→40→60
        end
    end

    Script->>Script: log("Failed after multiple attempts")
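
The retry loop in the sequence diagram can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the script's actual code: `run_with_retry`, `attempt_fn`, `fatal`, `retryable`, and the injectable `sleep` are hypothetical names introduced for illustration. In the real script, `fatal` would presumably be `paramiko.AuthenticationException` and `retryable` would cover `SSHException`, `SFTPError`, and `EOFError`. The sketch returns immediately on a fatal error, which is one possible reading of "Break loop - no retry".

```python
import time
from typing import Callable, Tuple, Type


def run_with_retry(
    attempt_fn: Callable[[], None],
    fatal: Tuple[Type[BaseException], ...],
    retryable: Tuple[Type[BaseException], ...],
    max_retries: int = 5,
    base_delay: float = 5,
    max_delay: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run attempt_fn until it succeeds, hits a fatal error, or runs out of attempts."""
    attempt = 0
    delay = base_delay
    while attempt < max_retries:
        print(f"Attempt {attempt + 1} of {max_retries}")
        try:
            # Hypothetical callable: connect, resolve directory, download, close.
            attempt_fn()
            return True
        except fatal as err:
            # Checked first on purpose: paramiko's AuthenticationException
            # is a subclass of SSHException, so order matters.
            print(f"Authentication failed: {err}")
            return False
        except retryable as err:
            print(f"Error: {err}")
            attempt += 1
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
    print("Failed after multiple attempts")
    return False
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in production the default `time.sleep` applies.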

Remote path structure

flowchart LR
    subgraph "Lactalis SFTP server"
        Root["/data/to/startflow/"] 
        Italy["italy/"]
        France["france/"]
        Galbani1["italy_galbani/"]
        Galbani2["italy__galbani/<br/>(double-underscore variant)"]
        Parmalat1["italy_parmalat/"]
        Parmalat2["italy__parmalat/<br/>(double-underscore variant)"]
        Files1["PRODUCT-BASE.txt<br/>SELLIN-BASE.txt<br/>PROMOCALENDAR.txt<br/>xlsx config files"]
    end

    subgraph "Local"
        LocalDir["/app/inputs/{business_unit}/"]
        LocalFiles["All downloaded files"]
    end

    Root --> Italy
    Root --> France
    Italy --> Galbani1
    Italy --> Galbani2
    Italy --> Parmalat1
    Italy --> Parmalat2
    Galbani1 --> Files1
    LocalDir --> LocalFiles
    Files1 -.->|"download"| LocalFiles

    style Galbani2 stroke-dasharray: 5 5
    style Parmalat2 stroke-dasharray: 5 5
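
The two-candidate lookup shown above could look like the sketch below. It assumes the country is the part of `business_unit` before the first underscore (for example `italy_galbani` gives `italy`) and that the variant doubles only that first underscore; neither rule is confirmed by the diagrams. `candidate_directories` and `REMOTE_ROOT` are hypothetical names, and `sftp` is any object with a `stat(path)` method, such as a paramiko `SFTPClient`.

```python
from typing import List

REMOTE_ROOT = "/data/to/startflow"


def candidate_directories(business_unit: str) -> List[str]:
    """Return the standard path first, then the double-underscore variant."""
    # Assumption: the country is the prefix before the first underscore.
    country = business_unit.split("_", 1)[0]
    # Assumption: only the first underscore is doubled in the variant.
    double = business_unit.replace("_", "__", 1)
    return [
        f"{REMOTE_ROOT}/{country}/{business_unit}/",
        f"{REMOTE_ROOT}/{country}/{double}/",
    ]


def get_remote_directory(sftp, business_unit: str) -> str:
    """Return the first candidate directory that exists on the server."""
    for path in candidate_directories(business_unit):
        try:
            sftp.stat(path)  # raises FileNotFoundError if the path is missing
            return path
        except FileNotFoundError:
            continue
    raise FileNotFoundError(f"No valid directory for {business_unit}")
```

With paramiko, a missing path surfaces as an `IOError` carrying `ENOENT`, which Python 3 maps to `FileNotFoundError`, so the `except` clause above should apply.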

Error handling and states

stateDiagram-v2
    [*] --> Initialisation: Start

    Initialisation --> TentativeConnexion: attempt=0

    state TentativeConnexion {
        [*] --> Connexion
        Connexion --> Authentification

        Authentification --> RechercheRepertoire: Success
        Authentification --> ErreurAuth: AuthenticationException

        RechercheRepertoire --> CheminStandard: Check path
        CheminStandard --> CheminDouble: Not found
        CheminDouble --> ListeFichiers: Found
        CheminStandard --> ListeFichiers: Found
        CheminDouble --> ErreurChemin: Not found

        ListeFichiers --> Telechargement
        Telechargement --> FermetureConnexion
        FermetureConnexion --> [*]: Success

        Connexion --> ErreurReseau: SSHException/EOFError
        Telechargement --> ErreurReseau: SFTPError
    }

    ErreurAuth --> Echec: Stop (no retry)
    ErreurChemin --> Echec: FileNotFoundError

    ErreurReseau --> Nettoyage: Handle error
    Nettoyage --> Attente: time.sleep(delay)

    Attente --> CalculBackoff: delay = min(delay * 2, max_delay)
    CalculBackoff --> VerificationMax: attempt < 5?

    VerificationMax --> TentativeConnexion: Yes
    VerificationMax --> Echec: No (max attempts)

    FermetureConnexion --> Succes: All files downloaded

    Succes --> [*]
    Echec --> [*]

    note right of ErreurAuth
        No retry on
        authentication errors
    end note

    note right of CalculBackoff
        Exponential backoff:
        5s → 10s → 20s → 40s → 60s
        Capped at max_delay
    end note
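
The delay sequence in the note follows directly from `delay = min(delay * 2, max_delay)`. The short sketch below reproduces it; `backoff_delays` is a hypothetical helper used only to show the arithmetic, with defaults taken from the diagrams (`base_delay=5`, `max_delay=60`, five attempts).

```python
from typing import List


def backoff_delays(attempts: int = 5, base_delay: float = 5, max_delay: float = 60) -> List[float]:
    """List the wait applied after each failed attempt."""
    delays = []
    delay = base_delay
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, max_delay)  # double, but never exceed the cap
    return delays


print(backoff_delays())       # [5, 10, 20, 40, 60]
print(sum(backoff_delays()))  # 135
```

Without the cap the fifth value would be 80 s. Assuming the script sleeps after every failed attempt, including the last, the worst-case total wait is 135 s.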

Key implementation points

mindmap
  root((Step 00 - Download))
    Robustness
      Automatic retry
        Max 5 attempts
        Exponential backoff
        60-second cap
      Error handling
        Auth: stop
        Network: retry
        Path: stop
      Extended timeouts
        Banner: 120s
        Auth: 120s
        Keepalive: 30s
    Flexibility
      Multiple paths
        Standard: country_business
        Variant: country__business
    Debug
      Paramiko
        paramiko_debug.log
    Security
      Environment variables
        SFTP_HOST
        SFTP_PORT
        SFTP_USERNAME
        SFTP_PASSWORD
        BUSINESS_UNIT
    Resource cleanup
      Finally block
        Close SFTP
        Close Transport
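
Reading the five environment variables listed above might look like the following sketch. `SftpConfig` and `load_config` are hypothetical names, and failing fast on missing variables is an assumption about desired behavior rather than something the diagrams state.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("SFTP_HOST", "SFTP_PORT", "SFTP_USERNAME", "SFTP_PASSWORD", "BUSINESS_UNIT")


@dataclass(frozen=True)
class SftpConfig:
    host: str
    port: int
    username: str
    password: str
    business_unit: str


def load_config(env: Mapping[str, str] = os.environ) -> SftpConfig:
    """Build the configuration from environment variables, failing fast if any is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return SftpConfig(
        host=env["SFTP_HOST"],
        port=int(env["SFTP_PORT"]),  # the port arrives as a string
        username=env["SFTP_USERNAME"],
        password=env["SFTP_PASSWORD"],
        business_unit=env["BUSINESS_UNIT"],
    )
```

Keeping credentials in the environment rather than in the source means they never land in version control; the `env` parameter only exists so the function can be exercised with a plain dictionary.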

Data flow

graph TD
    subgraph "Inputs"
        ENV[Environment variables] --> |Configuration| FUNC
    end

    subgraph "Processing"
        FUNC[download_files_from_sftp] --> GRD[get_remote_directory]
        GRD --> |Determines path| PATH{Which path?}
        PATH -->|Standard| P1["{country}/{business_unit}"]
        PATH -->|Double underscore| P2["{country}/{business with __}"]
        FUNC --> CONN[SFTP connection]
        CONN --> LIST[List remote files]
        LIST --> FILTER["Filter files<br/>- Skip hidden (.)<br/>- Skip directories"]
        FILTER --> DL[Download each file]
    end

    subgraph "Outputs"
        DL --> LOCAL["/app/inputs/{business_unit}/<br/>All files"]
        FUNC --> LOG["Detailed logs<br/>- Attempts<br/>- Errors<br/>- Downloaded files"]
    end

    style ENV fill:#e1f5fe
    style LOCAL fill:#c8e6c9
    style LOG fill:#fff9c4
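
The filter-and-download stage of this flow can be sketched as below. `download_directory` is a hypothetical helper, not the script's function. It uses `listdir_attr` instead of the `listdir` call shown in the sequence diagram, a swapped-in choice that exposes each entry's mode without an extra `stat` per file. `sftp` is assumed to behave like a paramiko `SFTPClient`.

```python
import os
import stat
from typing import List


def download_directory(sftp, remote_directory: str, local_directory: str) -> List[str]:
    """Download every visible regular entry and return the downloaded names."""
    os.makedirs(local_directory, exist_ok=True)
    downloaded = []
    for entry in sftp.listdir_attr(remote_directory):
        name = entry.filename
        if name.startswith("."):
            continue  # skip hidden files
        if entry.st_mode is not None and stat.S_ISDIR(entry.st_mode):
            continue  # skip subdirectories
        remote_file = remote_directory.rstrip("/") + "/" + name
        local_file = os.path.join(local_directory, name)
        sftp.get(remote_file, local_file)
        print(f"Downloaded: {name}")
        downloaded.append(name)
    return downloaded
```

In the pipeline described here, `local_directory` would be `/app/inputs/{business_unit}/`.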