Incident Project Manager (M/F/NB) - D-EDGE



Reference: 74436
Date posted: 28/05/2022
Company: Accor
Description: Talents Handicap is a virtual employment, work-study and training forum dedicated to candidates with disabilities

Job description
Description: ABOUT THE TEAM:
Our goal is to design and develop the best possible products. Our R&D team (100 people based in Paris, Barcelona, Bordeaux and Nantes) has developed a culture in which agility and teamwork are at the service of innovation and excellence.

We continue to maintain a “startup” spirit with autonomous SCRUM teams (3 to 8 people), focused on a product theme, and with great freedom of innovation and experimentation.

We are at the heart of an ecosystem of several hundred partners with whom we develop and maintain connectivity via API.

Our technical environment is very rich: a distributed architecture of web services and applications on Service Bus, with demanding performance, reliability and security constraints. You will need autonomy, initiative, and adaptability.

ABOUT THE ROLE:
The Service Reliability Manager (SRM) is responsible for both proactive and reactive monitoring and communication for issues which negatively impact services and customer experience. This role acts as a liaison to coordinate internal efforts and drive issue resolution and effective communication for all issues that require customer contact or follow-up. The SRM is also responsible for developing, monitoring and reporting on trends and metrics to measure and improve platform stability, reliability and overall customer experience.
The position is based in Europe and reports to the SVP of Engineering.

Note that most of the work is done jointly with the Level 2 Support team, development teams, infrastructure teams and management.

MISSIONS:


Optimizing On-Call Rotations and Processes
Make sure application on-call schedules are well organized and all slots are filled.
Ensure that platform supervision is organized (one person each day, new developers onboarded with a buddy, people notified…).
Help to update tools and documentation to help prepare on-call teams for future incidents.
Ensure that the alerting board is maintained (alerts are relevant, no alerts are missing, documentation is in place, alerts fire at the right times and with the appropriate frequency…).
Make sure that alerts’ criticality is properly established.
Make sure that alerts are properly acknowledged and that the appropriate reactions (supervision, IT Ops intervention, dev intervention, cross-team intervention…) are triggered as soon as those alerts are raised.
Constantly work on ensuring we have the right organization to respond quickly and efficiently in case of an incident.
Ensure that high-stakes deployments done by the teams have been risk-assessed and that the necessary observability is in place.


Leading Incident Management


Lead the writing of internal and external incident reports and follow the validation process.
Participate actively in finding root causes of cross-team incidents.
Constantly work on improving the speed and clarity with which we communicate (within R&D and externally) around incidents.
You may have to dive deep into log investigation and work with other teams to make sure the source is identified.
Make sure the action plans are clear and integrated in reports.
During incidents, your involvement may exceptionally be required to assist other L2 members and help unblock situations (you are their escalation point).


Conducting Post-Incident Follow-up
Ensure that knowledge and lessons learned from incidents are shared across R&D so that the same incident doesn’t happen twice in different places.
Follow up with the dev and infra teams regarding the execution of their action plans, with a monthly report on progress.
Define incident KPIs and build a monthly report that is communicated internally.


Communicating

Exchange regularly with the Client Support team in order to have a general overview of the ongoing strategic matters and the topics that are at risk.
Share that knowledge and act accordingly with the L2 team.
Constantly build and share knowledge around the most efficient ways of determining when an issue is an incident, and ideally be able to point teams in the right direction when you call on them.
Set up strong daily communication with your peer in Asia as well as with the L2 team, in order to ensure a continuous workforce that can handle incidents.


Candidate profile
Experience: Entry level
Location: Île-de-France
Position(s) available: 1
Executive (cadre) status: No
Contract: CDI (permanent contract)
Start date: As soon as possible

Company
Company name: Forum Emploi-Formation-Alternance: Talents Handicap
Website: http://www.talents-handicap.com
Activity: Talents Handicap is a virtual employment, work-study and training forum dedicated to candidates with disabilities
Contact: Mr. Forum Talents Handicap
Address: *
***** *
France