I am currently working on (1) uncovering interesting structures and patterns in high-dimensional biological data, (2) devising clustering and visualization algorithms that exploit them, and (3) formulating theoretical models and performance guarantees for these algorithms. Below are the main results:
CoreSPECT: Enhancing Clustering Algorithms via an Interplay of Density and Geometry
Chandra Sekhar Mukherjee*, Joonyoung Bae*, and Jiapeng Zhang (* Equal Contribution)
CoreSPECT (Core Space Projection-based Enhancement of Clustering Techniques) introduces a unified framework integrating density estimation, stable core extraction, and label propagation to robustly identify multi-scale structures in complex data. By leveraging relative centrality, it achieves interpretable and geometry-aware clustering performance across both biological and image datasets, with initial theoretical guarantees.
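The three ingredients named above (density estimation, core extraction, label propagation) can be illustrated with a toy pipeline. This is only a generic sketch under simplifying assumptions, not the CoreSPECT algorithm itself: the k-nearest-neighbor density proxy, the radius-graph clustering of the core, the nearest-labeled-point propagation rule, and all function names and parameters (`knn_density`, `cluster_core`, `propagate`, `n_core`, `radius`) are hypothetical choices made for illustration.

```python
import math

def knn_density(points, k):
    """Density proxy (an assumption): inverse distance to the k-th nearest neighbor."""
    dens = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        dens.append(1.0 / (d[k - 1] + 1e-12))
    return dens

def cluster_core(points, core, radius):
    """Label core points by connected components of a radius graph."""
    labels, next_label = {}, 0
    for s in core:
        if s in labels:
            continue
        labels[s] = next_label
        stack = [s]
        while stack:
            u = stack.pop()
            for v in core:
                if v not in labels and math.dist(points[u], points[v]) <= radius:
                    labels[v] = next_label
                    stack.append(v)
        next_label += 1
    return labels

def propagate(points, labels, order):
    """Visit unlabeled points from dense to sparse; copy the nearest labeled point's label."""
    labels = dict(labels)
    for i in order:
        if i not in labels:
            nearest = min(labels, key=lambda j: math.dist(points[i], points[j]))
            labels[i] = labels[nearest]
    return labels

# Toy data: two tight groups plus one peripheral point near each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (1.0, 1.0),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (4.0, 4.0)]
n_core = 8  # hypothetical: keep the 8 densest points as the core

dens = knn_density(points, k=2)
order = sorted(range(len(points)), key=lambda i: -dens[i])  # densest first
core = sorted(order[:n_core])
core_labels = cluster_core(points, core, radius=0.5)
labels = propagate(points, core_labels, order)
final_labels = [labels[i] for i in range(len(points))]
print(final_labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

In this toy run, the two peripheral points are excluded from the core, so they do not influence how the core is clustered; they only receive labels afterwards.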
CoreMAP: Visualization Algorithm for Non-Linear Manifold Data with Core-Periphery Structure
Chandra Sekhar Mukherjee, Joonyoung Bae, and Jiapeng Zhang
CoreMAP (Core-based Manifold Approximation and Projection) is a new visualization tool that enables hierarchical display of most-to-least separable points by incorporating a novel anchoring idea and clustering results on the core nodes into the attraction–repulsion dynamics of UMAP.
It is currently under development, but the prototype can be found in the cplearn Python package.
Justified Representation based Graph Centrality and Clustering
Joonyoung Bae, Anish Jayant, and Chandra Sekhar Mukherjee (in alphabetical order)
By viewing the nodes of a graph as voters and candidates and the edges as approval ballots, we devised a new graph centrality measure that guarantees balanced representation of clusters of varying sizes. It does so by applying committee selection algorithms that guarantee justified representation. We mainly used the Method of Equal Shares (MES), as it was the only such algorithm with a reasonable running time. In empirical tests, the measure showed promising results compared to existing centrality measures such as PageRank.
It was submitted as a final project for CSCI673: Structure and Dynamics of Networked Information, taught by Prof. David Kempe. The same idea also appeared independently in Papasotiropoulos et al., 2025, with theoretical guarantees and additional algorithms.