When transport capacity cannot meet passenger demand during peak periods, urban rail transit faces safety problems, and a passenger flow control strategy is needed to regulate inbound passenger flow and relieve station congestion. This paper proposes a multi-station cooperative control model based on a deep Q-network (DQN), a reinforcement learning method, to optimize the number of inbound passengers at each station in a given period. The objective is to minimize a composite cost that combines platform overrun, average waiting time, and passenger flow control intensity. The method is verified by simulation on the Batong Line of the Beijing Metro. The results show that the model effectively reduces passenger waiting time and improves travel efficiency while keeping passenger flow control intensity low. The method thus helps alleviate station congestion without reducing passengers' travel efficiency.
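As an illustration only, the three-part objective named above could be sketched as a weighted sum whose negative serves as the reward signal for a DQN agent. The abstract does not give the exact formulation, so everything here is an assumption: the `StationState` fields, the weights `w_overrun`, `w_wait`, and `w_control`, and the definition of control intensity as the fraction of demand held back.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StationState:
    """Hypothetical per-station snapshot for one control period."""
    platform_count: float     # passengers on the platform
    platform_capacity: float  # assumed safe platform capacity
    avg_wait_min: float       # average waiting time, in minutes
    demand: float             # passengers wishing to enter
    admitted: float           # passengers allowed to enter


def composite_cost(
    stations: Sequence[StationState],
    w_overrun: float = 1.0,   # assumed weight, not from the paper
    w_wait: float = 1.0,      # assumed weight, not from the paper
    w_control: float = 1.0,   # assumed weight, not from the paper
) -> float:
    """Weighted sum of platform overrun, mean waiting time, and control intensity."""
    n = len(stations)
    # Overrun: passengers above the assumed platform capacity, summed over stations.
    overrun = sum(max(0.0, s.platform_count - s.platform_capacity) for s in stations)
    # Waiting time: mean of the per-station averages.
    wait = sum(s.avg_wait_min for s in stations) / n
    # Control intensity: mean fraction of demand that was held back.
    control = sum(
        (s.demand - s.admitted) / s.demand for s in stations if s.demand > 0
    ) / n
    return w_overrun * overrun + w_wait * wait + w_control * control


def reward(stations: Sequence[StationState], **weights: float) -> float:
    """Reward for the agent: the negative of the composite cost."""
    return -composite_cost(stations, **weights)


if __name__ == "__main__":
    line = [
        StationState(120, 100, 4.0, 200, 150),
        StationState(80, 100, 2.0, 100, 100),
    ]
    print(composite_cost(line), reward(line))
```

In a sketch like this, the agent's action would be the `admitted` value chosen for each station, and the weights set the trade-off between congestion relief and how strongly entry is restricted.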