Complex urban rail transit networks offer passengers multiple route choices, and several feasible paths may exist between a given origin-destination (OD) pair, which complicates passenger flow clearing. To estimate the probability that passengers choose each route between the same origin and destination stations, this paper takes smart-card transaction data from the automatic fare collection (AFC) system of Guangzhou Metro as its data source and proposes a novel semi-supervised clustering framework. First, a K-shortest-path search algorithm based on breadth-first search (BFS) identifies the set of effective paths between the origin and destination, which determines the number of clusters and their initial centers. Next, the weight of each effective path is calibrated in turn from features such as path distance and number of transfers; starting from these labeled data, a weighted semi-supervised scheme is used to strengthen the classification ability of the clustering algorithm. Finally, drawing on passenger flow survey results, the method is compared with the classical K-means algorithm and the naive Bayes classification algorithm. The case study shows that the proposed passenger flow assignment algorithm performs best among the compared methods, with a classification accuracy reaching 94%.
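The first two steps described above (path search, then clustering seeded by the paths found) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four-station network `TOY_NETWORK`, its link times in minutes, the observed trip times, and the function names are hypothetical. The search simply enumerates all loop-free paths breadth-first and keeps the K cheapest, which is only practical on a toy network. The clustering is plain seeded K-means on travel time alone, so it omits the weighting by path distance and transfer count that the paper describes.

```python
from collections import deque

# Hypothetical toy network: station -> {neighbour: in-vehicle time in minutes}.
TOY_NETWORK = {
    "A": {"B": 4, "C": 8},
    "B": {"A": 4, "C": 10, "D": 5},
    "C": {"A": 8, "B": 10, "D": 7},
    "D": {"B": 5, "C": 7},
}


def k_shortest_paths_bfs(graph, origin, dest, k):
    """Enumerate loop-free paths breadth-first and return the k cheapest.

    Exhaustive enumeration: fine for a toy graph, not for a real network.
    """
    queue = deque([(origin, [origin], 0.0)])
    found = []
    while queue:
        node, path, cost = queue.popleft()
        if node == dest:
            found.append((cost, path))
            continue
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:  # keep paths loop-free
                queue.append((nxt, path + [nxt], cost + weight))
    found.sort(key=lambda item: (item[0], item[1]))
    return found[:k]


def seeded_kmeans_1d(values, seeds, iterations=20):
    """K-means on scalar values, with initial centers given by the seeds."""
    centers = list(seeds)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each observation to the nearest current center.
        labels = [
            min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            for v in values
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# The effective path set fixes both the number of clusters and their seeds.
paths = k_shortest_paths_bfs(TOY_NETWORK, "A", "D", k=2)
seeds = [cost for cost, _ in paths]

# Hypothetical observed A-to-D travel times (minutes) from fare-gate records.
trip_times = [8.5, 9.2, 10.1, 14.0, 15.5, 16.2, 9.8]
labels, centers = seeded_kmeans_1d(trip_times, seeds)

# Share of trips assigned to each effective path.
shares = [labels.count(j) / len(labels) for j in range(len(paths))]
for (cost, path), share in zip(paths, shares):
    print("-".join(path), f"cost={cost:.0f} min", f"share={share:.2f}")
```

In this sketch the path costs serve only as starting centers; a weighted variant would additionally scale the distance to each center by a per-path weight, whose exact form the abstract does not specify.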