Download process overview
graph TB
Start([Start of Step 00]) --> Init[Initialize environment variables]
Init --> Retry{"Attempt no.<br/>max_retries=5"}
Retry --> Connect[SFTP connection<br/>timeout=120s<br/>keepalive=30s]
Connect -->|Success| FindDir[get_remote_directory]
Connect -->|Auth failure| AuthFail[Stop immediately<br/>Check credentials]
Connect -->|Other errors| Wait[Wait delay seconds]
FindDir --> Try1["Try: /data/to/startflow/<br/>{country}/{business_unit}/"]
Try1 -->|Exists| Download
Try1 -->|"Does not exist"| Try2["Try: /data/to/startflow/<br/>{country}/{business_unit with __}/"]
Try2 -->|Exists| Download
Try2 -->|"Does not exist"| NoDir[FileNotFoundError<br/>No valid directory]
Download[Download all files] --> SaveLocal["Save to<br/>/app/inputs/{business_unit}/"]
SaveLocal --> Close[Close SFTP connection]
Close --> Success([Success])
Wait --> Backoff["delay = min(delay * 2, max_delay)"]
Backoff --> Increment["attempt += 1"]
Increment --> CheckMax{"attempt < max_retries?"}
CheckMax -->|Yes| Retry
CheckMax -->|No| MaxFail[Failure after 5 attempts]
style Success fill:#90EE90
style AuthFail fill:#FFB6C1
style NoDir fill:#FFB6C1
style MaxFail fill:#FFB6C1
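The branches of the flowchart above can be sketched as a small retry wrapper. This is a minimal illustration, not the script's actual code: `run_with_retry`, the injected `task` and `sleep` callables, and the `AuthError` class (a stand-in for `paramiko.AuthenticationException`) are all hypothetical names. It also assumes a missing directory ends the run rather than triggering another attempt, as the `NoDir` node suggests.

```python
import time
from typing import Callable


class AuthError(Exception):
    """Hypothetical stand-in for paramiko.AuthenticationException."""


def run_with_retry(
    task: Callable[[], None],
    max_retries: int = 5,
    base_delay: float = 5,
    max_delay: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `task` until it succeeds or a terminal branch is reached."""
    delay = base_delay
    attempt = 0
    while attempt < max_retries:
        try:
            task()
            return True  # "Success" node
        except AuthError:
            return False  # "Stop immediately": bad credentials are not retried
        except FileNotFoundError:
            return False  # "No valid directory": assumed terminal as well
        except Exception:
            # "Other errors": wait, double the delay (capped), count the attempt
            sleep(delay)
            delay = min(delay * 2, max_delay)
            attempt += 1
    return False  # "Failure after 5 attempts"
```

Injecting `sleep` keeps the sketch testable without real waiting; a real caller would leave the default `time.sleep`.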
Retry mechanism with exponential backoff
sequenceDiagram
participant Script as download_files_from_sftp()
participant Transport as paramiko.Transport
participant SFTP as SFTPClient
participant FS as File system
Note over Script: max_retries=5<br/>base_delay=5<br/>max_delay=60
loop Retry Loop (attempt < max_retries)
Script->>Script: log(f"Attempt {attempt+1} of {max_retries}")
Script->>Transport: Transport((sftp_host, sftp_port))
Note over Transport: banner_timeout=120<br/>auth_timeout=120<br/>keepalive=30
Script->>Transport: connect(username, password)
alt Authentication Success
Transport-->>Script: Connected
Script->>SFTP: SFTPClient.from_transport()
Script->>Script: get_remote_directory(sftp, business_unit)
Note over Script: Extract country from business_unit
Script->>SFTP: stat(path1)
alt Path exists
SFTP-->>Script: Path found
else Path not found
Script->>SFTP: stat(path2 with __)
SFTP-->>Script: Path found or FileNotFoundError
end
Script->>SFTP: listdir(remote_directory)
SFTP-->>Script: files_list
loop For each file
Script->>Script: Skip if hidden or directory
Script->>SFTP: get(remote_file, local_file)
SFTP->>FS: Write file
Script->>Script: log(f"Downloaded: {file}")
end
Script->>SFTP: close()
Script->>Transport: close()
Note over Script: Return success
else Authentication Failed
Transport-->>Script: AuthenticationException
Script->>Script: log("Authentication failed")
Note over Script: Break loop - no retry
else Connection Error
Transport-->>Script: SSHException/SFTPError/EOFError
Script->>Script: log(f"Error: {err}")
Note over Script: Clean up resources
Script->>SFTP: close() if exists
Script->>Transport: close() if exists
Script->>Script: attempt += 1
Script->>Script: time.sleep(delay)
Script->>Script: delay = min(delay * 2, max_delay)
Note over Script: Backoff: 5→10→20→40→60
end
end
Script->>Script: log("Failed after multiple attempts")
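The `5→10→20→40→60` schedule noted in the diagram follows directly from `delay = min(delay * 2, max_delay)`. The generator below reproduces it as a sanity check; `backoff_delays` is a hypothetical helper name, and the defaults mirror the values shown in the diagram.

```python
from typing import Iterator


def backoff_delays(
    base_delay: float = 5, max_delay: float = 60, max_retries: int = 5
) -> Iterator[float]:
    """Yield the delay slept after each failed attempt."""
    delay = base_delay
    for _ in range(max_retries):
        yield delay
        # Double the delay for the next failure, never exceeding max_delay
        delay = min(delay * 2, max_delay)


print(list(backoff_delays()))  # [5, 10, 20, 40, 60]
```

If the script does sleep after every failed attempt, including the last one as the diagram suggests, the worst-case time spent waiting is 135 seconds, on top of any connection timeouts.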
Remote path structure
flowchart LR
subgraph "Lactalis SFTP server"
Root["/data/to/startflow/"]
Italy["italy/"]
France["france/"]
Galbani1["italy_galbani/"]
Galbani2["italy__galbani/<br/>(double-underscore variant)"]
Parmalat1["italy_parmalat/"]
Parmalat2["italy__parmalat/<br/>(double-underscore variant)"]
Files1["PRODUCT-BASE.txt<br/>SELLIN-BASE.txt<br/>PROMOCALENDAR.txt<br/>xlsx config files"]
end
subgraph "Local"
LocalDir["/app/inputs/{business_unit}/"]
LocalFiles["All downloaded files"]
end
Root --> Italy
Root --> France
Italy --> Galbani1
Italy --> Galbani2
Italy --> Parmalat1
Italy --> Parmalat2
Galbani1 --> Files1
LocalDir --> LocalFiles
Files1 -.->|"download"| LocalFiles
style Galbani2 stroke-dasharray: 5 5
style Parmalat2 stroke-dasharray: 5 5
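The lookup order implied by this layout can be sketched as follows. The helper names (`candidate_directories`, `resolve_directory`) and the injected `exists` callable are hypothetical. The sketch assumes the country is the part of `business_unit` before the first underscore, and that the variant path doubles that same underscore; the real `get_remote_directory` may derive them differently.

```python
from typing import Callable, List

REMOTE_ROOT = "/data/to/startflow"


def candidate_directories(business_unit: str) -> List[str]:
    """Return the standard path first, then the double-underscore variant."""
    # Assumption: "italy_galbani" -> country "italy"
    country = business_unit.split("_", 1)[0]
    # Assumption: only the first underscore is doubled
    variant = business_unit.replace("_", "__", 1)
    return [
        f"{REMOTE_ROOT}/{country}/{business_unit}/",
        f"{REMOTE_ROOT}/{country}/{variant}/",
    ]


def resolve_directory(business_unit: str, exists: Callable[[str], bool]) -> str:
    """Return the first candidate that exists, else raise FileNotFoundError."""
    for path in candidate_directories(business_unit):
        if exists(path):
            return path
    raise FileNotFoundError(f"No valid remote directory for {business_unit!r}")
```

Against a live server, `exists` would typically wrap an `sftp.stat(path)` call in a `try`/`except`, which matches the `stat(path1)` and `stat(path2 with __)` steps in the sequence diagram.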
Error handling and states
stateDiagram-v2
[*] --> Initialisation: Start
Initialisation --> TentativeConnexion: attempt=0
state TentativeConnexion {
[*] --> Connexion
Connexion --> Authentification
Authentification --> RechercheRepertoire: Success
Authentification --> ErreurAuth: AuthenticationException
RechercheRepertoire --> CheminStandard: Check path
CheminStandard --> CheminDouble: Not found
CheminDouble --> ListeFichiers: Found
CheminStandard --> ListeFichiers: Found
CheminDouble --> ErreurChemin: Not found
ListeFichiers --> Telechargement
Telechargement --> FermetureConnexion
FermetureConnexion --> [*]: Success
Connexion --> ErreurReseau: SSHException/EOFError
Telechargement --> ErreurReseau: SFTPError
}
ErreurAuth --> Echec: Stop (no retry)
ErreurChemin --> Echec: FileNotFoundError
ErreurReseau --> Nettoyage: Handle error
Nettoyage --> Attente: time.sleep(delay)
Attente --> CalculBackoff: delay = min(delay * 2, max_delay)
CalculBackoff --> VerificationMax: attempt < 5?
VerificationMax --> TentativeConnexion: Yes
VerificationMax --> Echec: No (max attempts)
FermetureConnexion --> Succes: All files downloaded
Succes --> [*]
Echec --> [*]
note right of ErreurAuth
No retry on
authentication errors
end note
note right of CalculBackoff
Exponential backoff:
5s → 10s → 20s → 40s → 60s
Capped at max_delay
end note
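The `Nettoyage` state (close whatever was opened before the next attempt) is commonly written as a `try`/`finally` around a single attempt. The sketch below illustrates that pattern under assumptions: `attempt_once` and its three injected callables are hypothetical, and any object with a `close()` method stands in for the Paramiko transport and SFTP client.

```python
from typing import Callable, Optional, Protocol


class Closable(Protocol):
    def close(self) -> None: ...


def attempt_once(
    open_transport: Callable[[], Closable],
    open_sftp: Callable[[Closable], Closable],
    work: Callable[[Closable], None],
) -> None:
    """Run one connection attempt and always release what was opened."""
    transport: Optional[Closable] = None
    sftp: Optional[Closable] = None
    try:
        transport = open_transport()
        sftp = open_sftp(transport)
        work(sftp)
    finally:
        # Close in reverse order of creation; skip anything never opened
        if sftp is not None:
            sftp.close()
        if transport is not None:
            transport.close()
```

Because the cleanup lives in `finally`, the exception still propagates to the caller, which can then decide between stopping and retrying.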
Key implementation points
mindmap
  root((Step 00 - Download))
    Robustness
      Automatic retry
        5 attempts max
        Exponential backoff
        60-second cap
      Error handling
        Auth stop
        Network retry
        Path stop
      Extended timeouts
        Banner: 120s
        Auth: 120s
        Keepalive: 30s
    Flexibility
      Multiple paths
        Standard: country_business
        Variant: country__business
      Debug
        Paramiko
        paramiko_debug.log
    Security
      Environment variables
        SFTP_HOST
        SFTP_PORT
        SFTP_USERNAME
        SFTP_PASSWORD
        BUSINESS_UNIT
      Resource cleanup
        Finally block
        Close SFTP
        Close Transport
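The five environment variables listed above could be gathered into a single validated object before any connection is attempted. This is a sketch only: `SftpConfig` and `load_config` are hypothetical names, and the choice to fail fast on a missing variable is an assumption rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("SFTP_HOST", "SFTP_PORT", "SFTP_USERNAME", "SFTP_PASSWORD", "BUSINESS_UNIT")


@dataclass(frozen=True)
class SftpConfig:
    host: str
    port: int
    username: str
    password: str
    business_unit: str


def load_config(env: Mapping[str, str] = os.environ) -> SftpConfig:
    """Build the configuration, failing fast if a variable is missing or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return SftpConfig(
        host=env["SFTP_HOST"],
        port=int(env["SFTP_PORT"]),  # raises ValueError if not numeric
        username=env["SFTP_USERNAME"],
        password=env["SFTP_PASSWORD"],
        business_unit=env["BUSINESS_UNIT"],
    )
```

Passing a plain dictionary as `env` makes the function easy to exercise in tests without touching the real process environment.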
Data flow
graph TD
subgraph "Inputs"
ENV[Environment variables] --> |Configuration| FUNC
end
subgraph "Processing"
FUNC[download_files_from_sftp] --> GRD[get_remote_directory]
GRD --> |Determines path| PATH{Which path?}
PATH -->|Standard| P1["{country}/{business_unit}"]
PATH -->|Double underscore| P2["{country}/{business_unit with __}"]
FUNC --> CONN[SFTP connection]
CONN --> LIST[List remote files]
LIST --> FILTER["Filter files<br/>- Skip hidden (.)<br/>- Skip directories"]
FILTER --> DL[Download each file]
end
subgraph "Outputs"
DL --> LOCAL["/app/inputs/{business_unit}/<br/>All files"]
FUNC --> LOG["Detailed logs<br/>- Attempts<br/>- Errors<br/>- Downloaded files"]
end
style ENV fill:#e1f5fe
style LOCAL fill:#c8e6c9
style LOG fill:#fff9c4
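The filter step in this flow (skip hidden entries, skip directories) can be illustrated with a pure function. The sketch assumes each remote entry exposes `filename` and `st_mode`, as the attribute objects returned by Paramiko's `SFTPClient.listdir_attr` do; the diagrams show `listdir`, so the actual script may test for directories another way. `Entry` and `downloadable` are hypothetical names.

```python
import stat
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    """Hypothetical stand-in for one remote directory entry."""
    filename: str
    st_mode: int


def downloadable(entries: Iterable[Entry]) -> List[str]:
    """Return the names of entries that should be downloaded."""
    return [
        entry.filename
        for entry in entries
        if not entry.filename.startswith(".")  # skip hidden files
        and not stat.S_ISDIR(entry.st_mode)    # skip directories
    ]
```

Each surviving name would then be fetched into `/app/inputs/{business_unit}/`, as the Outputs subgraph indicates.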