Crowd density estimation has recently attracted considerable research attention owing to the availability of low-cost cameras and communication bandwidth. In video surveillance applications, counting people and creating a temporal profile are of high interest. Surveillance systems have difficulty detecting motion in a scene because of varying environmental conditions and occlusion. Rather than detecting and tracking each individual person, density estimation counts people approximately, and this approximation is often more accurate than individual tracking in occluded scenes. In this work, a new technique for estimating crowd density is proposed. Block-based dense optical flow with spatial and temporal filtering is used to obtain velocities, from which the locations of objects in crowded scenes are inferred. Hierarchical clustering is then employed to group the objects using the Euclidean distance metric. The cophenetic correlation coefficient of the resulting clusters indicates that our preprocessing and localization of object movements yield well-structured hierarchical clusters with reasonable accuracy, without temporal post-processing.
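
To make the clustering and evaluation step concrete, the following is a minimal, dependency-free sketch of agglomerative clustering on Euclidean distances followed by the cophenetic correlation coefficient. It is not the implementation used in this work: the average-linkage rule, the function names, and the sample points are all assumptions chosen for illustration, and the points stand in for object locations that would in practice come from the filtered optical flow.

```python
import math
from itertools import combinations


def average_linkage_cophenetic(points):
    """Agglomerative clustering with average linkage (an assumed linkage rule).

    Returns two dicts keyed by (i, j) with i < j: the original Euclidean
    distances and the cophenetic distances (the merge height at which
    points i and j first end up in the same cluster).
    """
    n = len(points)
    dist = {(i, j): math.dist(points[i], points[j])
            for i, j in combinations(range(n), 2)}
    clusters = {i: [i] for i in range(n)}  # cluster id -> member point indices
    coph = {}
    next_id = n
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest mean inter-point distance.
        best = None
        for a, b in combinations(sorted(clusters), 2):
            total = sum(dist[min(i, j), max(i, j)]
                        for i in clusters[a] for j in clusters[b])
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Every cross pair is joined for the first time at height d.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[min(i, j), max(i, j)] = d
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return dist, coph


def cophenetic_correlation(points):
    """Pearson correlation between original and cophenetic distances."""
    dist, coph = average_linkage_cophenetic(points)
    keys = sorted(dist)
    x = [dist[k] for k in keys]
    y = [coph[k] for k in keys]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical object locations: two compact groups far apart.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(round(cophenetic_correlation(sample_points), 3))
```

A coefficient close to 1 means the dendrogram preserves the original pairwise distances well, which is the sense in which the clusters are described as well structured. For real data, library routines such as SciPy's `scipy.cluster.hierarchy.linkage` and `cophenet` would be the more practical choice.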