Step 09 Diagrams - Upload Outputs
Upload process overview
graph TB
Start([upload_files_to_sftp]) --> Init("Initialization<br/>max_retries=5<br/>base_delay=5<br/>max_delay=60")
Init --> Retry{"Attempt<br/>< max_retries?"}
Retry -->|Yes| Connect(SFTP connection)
Retry -->|No| Fail("❌ Failed after<br/>5 attempts")
Connect --> Auth{Authentication}
Auth -->|Success| GetDir(get_remote_directory)
Auth -->|Failure| AuthFail("❌ Auth failed<br/>No retry")
GetDir --> CheckPath{Path<br/>found?}
CheckPath -->|Yes| CreateDirs("Create directories<br/>outputs/ & logs/")
CheckPath -->|No| PathError(FileNotFoundError)
CreateDirs --> UploadOutputs(Upload output files)
UploadOutputs --> UploadLogs(Upload log files)
UploadLogs --> CloseConn(Close connection)
CloseConn --> Success("✅ Upload succeeded")
Connect -->|Network Error| HandleError(Handle error)
PathError --> HandleError
HandleError --> Cleanup(Clean up resources)
Cleanup --> Wait(Wait delay seconds)
Wait --> IncDelay("delay = min(delay*2, 60)")
IncDelay --> IncAttempt("attempt++")
IncAttempt --> Retry
style Success fill:#90EE90
style AuthFail fill:#FFB6C1
style Fail fill:#FFB6C1
Retry mechanism with exponential backoff
sequenceDiagram
participant Main as upload_files_to_sftp()
participant Transport as paramiko.Transport
participant SFTP as SFTPClient
participant Remote as SFTP server
Note over Main: Initial configuration<br/>attempt=0, delay=5
loop Retry Loop (max 5 times)
Main->>Main: Log attempt {attempt+1}/5
Main->>Transport: Create((host, port))
Note over Transport: banner_timeout=120<br/>auth_timeout=120<br/>keepalive=30
Main->>Transport: connect(user, pass)
alt Success
Transport->>Remote: SSH handshake
Remote-->>Transport: Connected
Transport-->>Main: Auth OK
Main->>SFTP: from_transport()
SFTP-->>Main: Client created
Main->>Main: get_remote_directory()
Note over Main: Paths tried:<br/>1. {country}/{bu}<br/>2. {country}/{bu__}
Main->>SFTP: stat(path)
SFTP->>Remote: Check exists
Remote-->>SFTP: Path status
SFTP-->>Main: Path found/not found
Main->>Main: Upload all files
Main->>SFTP: close()
Main->>Transport: close()
Note over Main: Return True
else Auth Failed
Transport-->>Main: AuthenticationException
Main->>Main: Log "Auth failed"
Note over Main: Break (no retry)
else Network Error
Transport-->>Main: SSHException/EOFError
Main->>Main: Log error + retry info
Note over Main: Cleanup
Main->>SFTP: close() if exists
Main->>Transport: close() if exists
Main->>Main: time.sleep(delay)
Main->>Main: delay = min(delay*2, 60)
Note over Main: 5→10→20→40→60
end
end
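The delay progression noted in the diagram (5→10→20→40→60) can be reproduced with a small helper. This is a minimal sketch that assumes the values shown above (base_delay=5, max_delay=60). The name `backoff_delays` is hypothetical and is not taken from the pipeline code.

```python
from typing import Iterator


def backoff_delays(base_delay: float = 5, max_delay: float = 60,
                   count: int = 5) -> Iterator[float]:
    """Yield `count` delays, doubling each time and capping at max_delay."""
    delay = base_delay
    for _ in range(count):
        yield delay
        delay = min(delay * 2, max_delay)


print(list(backoff_delays()))  # [5, 10, 20, 40, 60]
```

Whether the last value is ever slept depends on whether the real code waits after the final failed attempt, which the diagram does not settle.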
Remote path resolution
flowchart TD
subgraph "get_remote_directory"
Start(business_unit)
Extract("Extract country<br/>split('_')[0]")
Base("base_path =<br/>/data/to/startflow/{country}")
Path1("{base}/{business_unit}")
Path2("{base}/{business_unit with __}")
Check1{"sftp.stat(path1)<br/>exists?"}
Check2{"sftp.stat(path2)<br/>exists?"}
Found1(Return path1)
Found2(Return path2)
NotFound(FileNotFoundError)
end
Start --> Extract
Extract --> Base
Base --> Path1
Base --> Path2
Path1 --> Check1
Check1 -->|Yes| Found1
Check1 -->|No| Path2
Path2 --> Check2
Check2 -->|Yes| Found2
Check2 -->|No| NotFound
style Found1 fill:#90EE90
style Found2 fill:#90EE90
style NotFound fill:#FFB6C1
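The lookup above could be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `sftp` is any object with a `stat(path)` method that raises `IOError` for a missing path (as paramiko's `SFTPClient.stat` does), and the double-underscore variant is assumed to double only the first underscore of the business unit.

```python
BASE_ROOT = "/data/to/startflow"


def get_remote_directory(sftp, business_unit: str) -> str:
    """Return the first candidate directory that exists on the server."""
    country = business_unit.split("_")[0]
    base_path = f"{BASE_ROOT}/{country}"
    # Assumption: the variant doubles the first underscore only,
    # e.g. italy_galbani -> italy__galbani.
    variant = business_unit.replace("_", "__", 1)
    candidates = [f"{base_path}/{business_unit}", f"{base_path}/{variant}"]
    for path in candidates:
        try:
            sftp.stat(path)  # raises IOError when the path does not exist
            return path
        except IOError:
            continue
    raise FileNotFoundError(f"No remote directory found among: {candidates}")
```

Trying the candidates in order keeps the single-underscore path as the preferred one, which matches the order shown in the diagram.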
Remote directory structure
graph LR
subgraph "Lactalis SFTP server"
Root["/data/to/startflow/"]
subgraph "Countries"
Italy[italy/]
France[france/]
Spain[spain/]
end
subgraph "Business Units"
Galbani1[italy_galbani/]
Galbani2[italy__galbani/<br/>variant]
Parmalat[italy_parmalat/]
President[france_president/]
end
subgraph "Created structure"
Outputs[outputs/]
Logs[logs/]
Files1[📄 *.xlsx]
Files2[📄 *.log]
end
end
Root --> Italy
Root --> France
Root --> Spain
Italy --> Galbani1
Italy --> Galbani2
Italy --> Parmalat
France --> President
Galbani1 --> Outputs
Galbani1 --> Logs
Outputs --> Files1
Logs --> Files2
style Galbani2 stroke-dasharray: 5 5
style Files1 fill:#90EE90
style Files2 fill:#fff9c4
File upload process
flowchart LR
subgraph "Local"
L1("app/outputs/{bu}/")
L2("app/logs/{bu}/")
O1(Promotion_Level_Table.xlsx)
O2(Baseline_Table.xlsx)
O3(Other files)
LOG1("pipeline_*.log")
LOG2("step_*.log")
LOG3(Other logs)
end
subgraph "Process"
P1(List local files)
P2{Is a<br/>file?}
P3("sftp.put()")
P4(Log upload)
end
subgraph "Remote"
R1("{remote_dir}/outputs/")
R2("{remote_dir}/logs/")
RO1(Excel files)
RL1(Log files)
end
L1 --> O1
L1 --> O2
L1 --> O3
L2 --> LOG1
L2 --> LOG2
L2 --> LOG3
O1 --> P1
LOG1 --> P1
P1 --> P2
P2 -->|Yes| P3
P2 -->|No| P1
P3 --> P4
P4 --> P1
P3 --> R1
P3 --> R2
R1 --> RO1
R2 --> RL1
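The per-directory loop in the diagram (list, keep regular files only, `sftp.put()`, log) might look like the sketch below. The function name `upload_directory` and its return value are hypothetical; `sftp` is assumed to expose `put(localpath, remotepath)` as paramiko's `SFTPClient` does.

```python
import logging
import os

logger = logging.getLogger(__name__)


def upload_directory(sftp, local_dir: str, remote_dir: str) -> list:
    """Upload every regular file in local_dir; return the uploaded file names."""
    uploaded = []
    for name in sorted(os.listdir(local_dir)):
        local_path = os.path.join(local_dir, name)
        if not os.path.isfile(local_path):
            continue  # skip subdirectories and other non-file entries
        remote_path = f"{remote_dir}/{name}"  # SFTP paths use forward slashes
        sftp.put(local_path, remote_path)
        logger.info("Uploaded %s -> %s", local_path, remote_path)
        uploaded.append(name)
    return uploaded
```

The same function would be called twice, once for the outputs directory and once for the logs directory.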
Error handling and states
stateDiagram-v2
[*] --> Init: Start upload
Init --> TryConnect: attempt = 0
state TryConnect {
[*] --> CreateTransport
CreateTransport --> SetTimeouts
SetTimeouts --> Connect
Connect --> AuthCheck
AuthCheck --> Success: Auth OK
AuthCheck --> AuthError: Auth failed
Success --> GetRemoteDir
GetRemoteDir --> PathFound: Path exists
GetRemoteDir --> PathNotFound: No valid path
PathFound --> CreateOutputDir
CreateOutputDir --> UploadOutputs
UploadOutputs --> CreateLogDir
CreateLogDir --> UploadLogs
UploadLogs --> CloseConnection
CloseConnection --> [*]: Return True
Connect --> NetworkError: SSH/SFTP Error
CreateTransport --> NetworkError: Connection failed
UploadOutputs --> NetworkError: Transfer error
}
AuthError --> Failed: No retry for auth
PathNotFound --> HandleError: FileNotFoundError
NetworkError --> HandleError: Recoverable error
HandleError --> Cleanup: Close resources
Cleanup --> CheckRetries: attempt < 5?
CheckRetries --> Wait: Yes
CheckRetries --> Failed: No (max attempts)
Wait --> Backoff: sleep(delay)
Backoff --> IncDelay: delay = min(delay*2, 60)
IncDelay --> IncAttempt: attempt++
IncAttempt --> TryConnect
CloseConnection --> Done: Success
Failed --> Done: Failure
Done --> [*]
note right of AuthError
Authentication errors
don't trigger retry
end note
note right of Backoff
Exponential backoff:
5s → 10s → 20s → 40s → 60s
end note
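The state machine above can be condensed into a retry loop. The sketch below is illustrative only: `AuthError` is a hypothetical stand-in for `paramiko.AuthenticationException`, `attempt_upload` is a caller-supplied function that performs one full connect-and-upload attempt, and `sleep` is injectable so the loop can be exercised without waiting. It follows the state diagram in sleeping only when another attempt remains.

```python
import logging
import time

logger = logging.getLogger(__name__)


class AuthError(Exception):
    """Hypothetical stand-in for paramiko.AuthenticationException."""


# Assumption: real code would also list paramiko.SSHException here.
RECOVERABLE = (OSError, EOFError)  # OSError covers FileNotFoundError


def upload_with_retry(attempt_upload, max_retries=5, base_delay=5,
                      max_delay=60, sleep=time.sleep) -> bool:
    """Run attempt_upload() until success, auth failure, or retries run out."""
    delay = base_delay
    for attempt in range(max_retries):
        logger.info("Attempt %d/%d", attempt + 1, max_retries)
        try:
            attempt_upload()
            return True
        except AuthError:
            logger.error("Auth failed, not retrying")
            return False
        except RECOVERABLE as exc:
            logger.warning("Recoverable error: %s", exc)
            if attempt + 1 < max_retries:
                sleep(delay)
                delay = min(delay * 2, max_delay)
    return False
```

With paramiko, the authentication handler has to come before the generic SSH one, because `AuthenticationException` is a subclass of `SSHException`.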
Configuration and parameters
graph TD
subgraph "Environment variables"
E1[SFTP_HOST]
E2[SFTP_PORT]
E3[SFTP_USERNAME]
E4[SFTP_PASSWORD]
E5[BUSINESS_UNIT]
end
subgraph "Connection parameters"
C1[banner_timeout = 120s]
C2[auth_timeout = 120s]
C3[keepalive = 30s]
end
subgraph "Retry parameters"
R1[max_retries = 5]
R2[base_delay = 5s]
R3[max_delay = 60s]
end
subgraph "Transport configuration"
T1[Transport creation]
T2[Timeout settings]
T3[Keepalive activation]
T4[Authentication]
end
E1 --> T1
E2 --> T1
E3 --> T4
E4 --> T4
C1 --> T2
C2 --> T2
C3 --> T3
R1 --> RetryLogic[Retry<br/>logic]
R2 --> RetryLogic
R3 --> RetryLogic
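One way to gather these settings in a single place is a small dataclass populated from the environment. This is a sketch, assuming the five variable names shown in the diagram. The class `SftpConfig` and the function `load_config` are hypothetical names, and the defaults simply mirror the values in the diagram.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SftpConfig:
    host: str
    port: int
    username: str
    password: str
    business_unit: str
    banner_timeout: int = 120  # seconds
    auth_timeout: int = 120    # seconds
    keepalive: int = 30        # seconds
    max_retries: int = 5
    base_delay: int = 5        # seconds
    max_delay: int = 60        # seconds


def load_config(env=None) -> SftpConfig:
    """Build the config from a mapping (defaults to os.environ).

    A missing variable raises KeyError, so misconfiguration fails early.
    """
    env = os.environ if env is None else env
    return SftpConfig(
        host=env["SFTP_HOST"],
        port=int(env["SFTP_PORT"]),
        username=env["SFTP_USERNAME"],
        password=env["SFTP_PASSWORD"],
        business_unit=env["BUSINESS_UNIT"],
    )
```

Passing a plain dictionary as `env` makes the loader easy to test without touching the real environment.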
Remote directory creation
sequenceDiagram
participant Upload as upload_files_to_sftp()
participant SFTP as SFTPClient
participant Remote as Server
Note over Upload: Directories to create
Upload->>Upload: remote_output_directory = {remote_dir}/outputs
Upload->>Upload: remote_log_directory = {remote_dir}/logs
rect rgb(200, 230, 255)
Note right of Upload: Create outputs/
Upload->>SFTP: stat(remote_output_directory)
alt Exists
SFTP-->>Upload: Directory info
Upload->>Upload: Skip creation
else Does not exist
SFTP-->>Upload: IOError
Upload->>SFTP: mkdir(remote_output_directory)
SFTP->>Remote: Create directory
Remote-->>SFTP: OK
SFTP-->>Upload: Directory created
end
end
rect rgb(255, 230, 200)
Note right of Upload: Create logs/
Upload->>SFTP: stat(remote_log_directory)
alt Exists
SFTP-->>Upload: Directory info
Upload->>Upload: Skip creation
else Does not exist
SFTP-->>Upload: IOError
Upload->>SFTP: mkdir(remote_log_directory)
SFTP->>Remote: Create directory
Remote-->>SFTP: OK
SFTP-->>Upload: Directory created
end
end
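Both coloured blocks follow the same stat-then-mkdir pattern, which could be factored into one helper. This is a minimal sketch: the name `ensure_remote_dir` is hypothetical, and `sftp` is assumed to provide `stat(path)` and `mkdir(path)` with paramiko's `SFTPClient` semantics (`stat` raises `IOError` when the path is missing).

```python
def ensure_remote_dir(sftp, path: str) -> bool:
    """Create `path` on the server if it is missing.

    Returns True when the directory was created, False when it already existed.
    """
    try:
        sftp.stat(path)
        return False
    except IOError:
        sftp.mkdir(path)
        return True
```

Only the last path component is created, so the parent returned by the path lookup must already exist.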
Resource cleanup
flowchart TD
subgraph "Finally Block"
F1("finally block<br/>always executed")
C1{"'sftp' in<br/>locals()?"}
C2("sftp.close()")
C3{"'transport' in<br/>locals()?"}
C4("transport.close()")
E1(Exception cleanup)
L1(Log cleanup error)
end
F1 --> C1
C1 -->|Yes| C2
C1 -->|No| C3
C2 --> C3
C2 -->|Error| E1
C3 -->|Yes| C4
C3 -->|No| End(End of cleanup)
C4 --> End
C4 -->|Error| E1
E1 --> L1
L1 --> Continue(Continue anyway)
style F1 fill:#fff3cd
style E1 fill:#f8d7da
style L1 fill:#d1ecf1
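The cleanup flow can be sketched as a helper that closes whatever was opened and logs, rather than raises, any error from `close()`. Note that this sketch swaps the `locals()` check shown in the diagram for a different technique: both names are set to `None` before the `try`, and `None` entries are skipped. The name `close_quietly` is hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def close_quietly(*resources) -> int:
    """Close each non-None resource in order; return how many closed cleanly.

    Errors raised by close() are logged and swallowed so that one failing
    resource does not prevent the next one from being closed.
    """
    closed = 0
    for resource in resources:
        if resource is None:
            continue  # never opened during this attempt
        try:
            resource.close()
            closed += 1
        except Exception as exc:
            logger.warning("Cleanup error on %r: %s", resource, exc)
    return closed
```

It would be called from the `finally` block as `close_quietly(sftp, transport)`, which closes the SFTP client before the transport, the same order as in the diagram.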
Key maintenance and troubleshooting points
mindmap
  root((Step 09<br/>Upload<br/>Outputs))
    Configuration
      Env variables
        SFTP credentials
        Business unit
        Host and port
      Timeouts
        Banner 120s
        Auth 120s
        Keepalive 30s
      Retry params
        5 attempts max
        Backoff 5 to 60s
        No retry on auth failure
    Structure
      Local paths
        app outputs bu
        app logs bu
      Remote paths
        data to startflow
        country bu path
        Double underscore variant
      Created directories
        outputs folder
        logs folder
    Robustness
      Smart retry
        Network errors only
        Exponential backoff
        Resource cleanup
      Path handling
        Tries 2 variants
        FileNotFoundError
        Automatic directory creation
      Debug
        paramiko debug log
        Detailed logs
        Connection state
    Troubleshooting
      Auth failed
        Check credentials
        Account status
        IP restrictions
      Path not found
        Check structure
        SFTP permissions
        BU format
      Timeout
        Network stability
        File sizes
        Bandwidth